Receive notifications for image analysis with Amazon Rekognition Custom Labels and analyze predictions

Amazon Rekognition Custom Labels is a fully managed computer vision service that lets you build custom models to classify and identify objects in images that are specific and unique to your business.

Rekognition Custom Labels doesn’t require you to have any prior computer vision expertise. You can get started by simply uploading tens of images instead of thousands. If the images are already labeled, you can begin training a model in just a few clicks. If not, you can label them directly within the Rekognition Custom Labels console, or use Amazon SageMaker Ground Truth to label them. Rekognition Custom Labels uses transfer learning to automatically inspect the training data, select the right model framework and algorithm, optimize the hyperparameters, and train the model. When you’re satisfied with the model accuracy, you can start hosting the trained model with just one click.

However, if you’re a business user looking to solve a computer vision problem, visualize inference results of the custom model, and receive notifications when such inference results are available, you have to rely on your engineering team to build such an application. For example, an agricultural operations manager can be notified when a crop is found to have a disease, a winemaker can be notified when the grapes are ripe for harvesting, or a store manager can be notified when it’s time to restock inventories such as soft drinks in a vertical refrigerator.

In this post, we walk you through the process of building a solution that allows you to visualize the inference result and send notifications to subscribed users when specific labels are identified in images that are processed using models built by Rekognition Custom Labels.

Solution overview

The following diagram illustrates our solution architecture.

Architecture Diagram

This solution uses the following AWS services to implement a scalable and cost-effective architecture:

  • Amazon Athena – A serverless interactive query service that makes it easy to analyze data in Amazon S3 using standard SQL.
  • AWS Lambda – A serverless compute service that lets you run code in response to triggers such as changes in data, shifts in system state, or user actions. Because Amazon S3 can directly trigger a Lambda function, you can build a variety of real-time serverless data-processing systems.
  • Amazon QuickSight – A very fast, easy-to-use, cloud-powered business analytics service that makes it easy to build visualizations, perform ad hoc analysis, and quickly get business insights from the data.
  • Amazon Rekognition Custom Labels – Allows you to train a custom computer vision model to identify the objects and scenes in images that are specific to your business needs.
  • Amazon Simple Notification Service – Amazon SNS is a fully managed messaging service for both application-to-application (A2A) and application-to-person (A2P) communication.
  • Amazon Simple Queue Service – Amazon SQS is a fully managed message queuing service that enables you to decouple and scale microservices, distributed systems, and serverless applications.
  • Amazon Simple Storage Service – Amazon S3 serves as an object store for your documents and allows for central management with fine-tuned access controls.

The solution uses a serverless workflow that is triggered when an image is uploaded to the input S3 bucket. An SQS queue receives an event notification for object creation. The solution also creates dead-letter queues (DLQs) to set aside and isolate messages that can’t be processed correctly. A Lambda function consumes messages from the SQS queue and makes the DetectCustomLabels API call to detect all labels in the image. To scale this solution and keep the design loosely coupled, the Lambda function sends the prediction results to another SQS queue. This SQS queue triggers another Lambda function, which analyzes all the labels found in the predictions. Based on the user preference (configured during solution deployment), the function publishes a message to an SNS topic. The SNS topic is configured to deliver email notifications to the user. You can configure the Lambda function to add a URL to the message sent to Amazon SNS to access the image (using an Amazon S3 presigned URL). Finally, the Lambda function uploads a prediction result and image metadata to an S3 bucket. You can then use Athena and QuickSight to analyze and visualize the results from the S3 bucket.
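
To make the first hop of this workflow concrete, the following is a minimal sketch of how a Lambda function might unpack the S3 object-created notifications that arrive wrapped in SQS messages. The function name extract_s3_objects is hypothetical, and the code deployed by the stack may differ; the event shape follows the standard S3 event notification and SQS-to-Lambda formats.

```python
import json
from urllib.parse import unquote_plus


def extract_s3_objects(sqs_event):
    """Yield (bucket, key) pairs from an SQS event that wraps S3 notifications."""
    for record in sqs_event.get("Records", []):
        # Each SQS record carries the S3 notification as a JSON string.
        body = json.loads(record["body"])
        # Messages without "Records" (such as S3 test events) are skipped.
        for s3_record in body.get("Records", []):
            bucket = s3_record["s3"]["bucket"]["name"]
            # Object keys are URL-encoded in S3 event notifications.
            key = unquote_plus(s3_record["s3"]["object"]["key"])
            yield bucket, key
```

Each (bucket, key) pair could then be passed to the inference call, with the result forwarded to the second queue.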

Prerequisites

You need to have a model trained and running with Rekognition Custom Labels.

Rekognition Custom Labels lets you manage the machine learning model training process on the Amazon Rekognition console, which simplifies the end-to-end model development process. For this post, we use a classification model trained to detect plant leaf disease.

Deploy the solution

You deploy an AWS CloudFormation template to provision the necessary resources, including S3 buckets, SQS queues, an SNS topic, Lambda functions, and AWS Identity and Access Management (IAM) roles. The template creates the stack in the us-east-1 Region, but you can use the template to create your stack in any Region where the preceding AWS services are available.

  1. Launch the following CloudFormation template in the Region and AWS account where you deployed the Rekognition Custom Labels model:

  2. For Stack name, enter a stack name, such as rekognition-customlabels-analytics-and-notification.
  3. For CustomModelARN, enter the ARN of the Amazon Rekognition Custom Labels model that you want to use.

The Rekognition Custom Labels model needs to be deployed in the same AWS account.

  4. For EmailNotification, enter an email address where you want to receive notifications.
  5. For InputBucketName, enter a unique name for the S3 bucket the stack creates; for example, plant-leaf-disease-data-input.

This is where the incoming plant leaf images are stored.

  6. For LabelsofInterest, you can enter up to 10 different labels you want to be notified of, in comma-separated format. For our plant disease example, enter bacterial-leaf-blight,leaf-smut.
  7. For MinConfidence, enter the minimum confidence threshold to receive notification. Labels detected with a confidence below the value of MinConfidence aren’t returned in the response and don’t generate a notification.
  8. For OutputBucketName, enter a unique name for the S3 bucket the stack creates; for example, plant-leaf-disease-data-output.

The output bucket contains JSON files with image metadata (labels found and confidence score).

  9. Choose Next.

  10. On the Configure stack options page, set any additional parameters for the stack, including tags.
  11. Choose Next.
  12. In the Capabilities and transforms section, select the check box to acknowledge that AWS CloudFormation might create IAM resources.
  13. Choose Create stack.

The stack details page should show the status of the stack as CREATE_IN_PROGRESS. It can take up to 5 minutes for the status to change to CREATE_COMPLETE.

Amazon SNS will send a subscription confirmation message to the email address. You need to confirm the subscription.

Test the solution

Now that we have deployed the resources, we’re ready to test the solution. Make sure you start the model.

  1. On the Amazon S3 console, choose Buckets.
  2. Choose the input S3 bucket.

  3. Upload test images to the bucket.

In production, you can set up automated processes to deliver images to this bucket.

These images trigger the workflow. If the label confidence exceeds the specified threshold, you receive an email notification like the following.

You can also configure the SNS topic to deliver these notifications to any destinations supported by the service.
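
The notification decision described above can be sketched as a small filter over the detected labels. This is an illustration only: the function names are hypothetical, the input is assumed to be a list of dictionaries with Name and Confidence keys, and the actual Lambda code in the stack may be organized differently.

```python
def labels_to_notify(custom_labels, labels_of_interest, min_confidence):
    """Return detected labels that are of interest and meet the threshold.

    labels_of_interest is a comma-separated string, as entered for the
    LabelsofInterest stack parameter.
    """
    wanted = {name.strip().lower() for name in labels_of_interest.split(",") if name.strip()}
    return [
        label
        for label in custom_labels
        if label["Name"].lower() in wanted and label["Confidence"] >= min_confidence
    ]


def build_message(image_key, matches):
    """Format a plain-text notification body for the matched labels."""
    lines = [f"Labels of interest found in {image_key}:"]
    lines += [f"- {m['Name']} ({m['Confidence']:.1f}%)" for m in matches]
    return "\n".join(lines)
```

A message would be published only when labels_to_notify returns a non-empty list.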

Analyze the prediction results

After you test the solution, you can extend the solution to create a visual analysis for the predictions of processed images. For this purpose, we use Athena, an interactive query service that makes it easy to analyze data directly from Amazon S3 using standard SQL, and QuickSight to visualize the data.

Configure Athena

If you are not familiar with Amazon Athena, see this tutorial. On the Athena console, create a table in the Athena data catalog with the following code:

CREATE EXTERNAL TABLE IF NOT EXISTS `default`.`rekognition_customlabels_analytics` (
`Image` string,
`Label` string,
`Confidence` string
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
WITH SERDEPROPERTIES (
'serialization.format' = '1'
) LOCATION 's3://<<OUTPUT BUCKET NAME>>/'
TBLPROPERTIES ('has_encrypted_data'='false');

Replace the bucket placeholder in the LOCATION clause of the preceding statement with your output bucket name, such as plant-leaf-disease-data-output.

This code tells Athena how to interpret each row of the text in the S3 bucket.
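
The JSON SerDe used above reads one JSON object per line. As a rough sketch, records matching the table columns (Image, Label, Confidence) could be serialized as follows. The exact layout written by the solution’s Lambda function isn’t shown in this post, so the function name and the one-record-per-label layout are assumptions.

```python
import json


def to_json_lines(image_key, custom_labels):
    """Serialize one record per detected label, one JSON object per line."""
    return "\n".join(
        json.dumps(
            {
                "Image": image_key,
                "Label": label["Name"],
                # The table declares Confidence as a string column.
                "Confidence": str(label["Confidence"]),
            }
        )
        for label in custom_labels
    )
```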

You can now query the data:

SELECT * FROM "default"."rekognition_customlabels_analytics" limit 10;

Configure QuickSight

To configure QuickSight, complete the following steps:

  1. Open the QuickSight console.
  2. If you’re not signed up for QuickSight, you’re prompted with the option to sign up. Follow the steps to sign up to use QuickSight.
  3. After you log in to QuickSight, choose Manage QuickSight under your account.

  4. In the navigation pane, choose Security & permissions.
  5. Under QuickSight access to AWS services, choose Add or remove.

A page appears for enabling QuickSight access to AWS services.

  6. Select Amazon Athena.

  7. In the pop-up window, choose Next.

  8. On the S3 tab, select the necessary S3 buckets. For this post, we select the bucket that stores our Athena query results.
  9. For each bucket, also select Write permission for Athena Workgroup.
  10. Choose Finish.
  11. Choose Update.
  12. On the QuickSight console, choose New analysis.
  13. Choose New dataset.
  14. For Datasets, choose Athena.
  15. For Data source name, enter Athena-CustomLabels-analysis.
  16. For Athena workgroup, choose primary.
  17. Choose Create data source.

  18. For Database, choose default on the drop-down menu.
  19. For Tables, select the table rekognition_customlabels_analytics.
  20. Choose Select.

  21. Choose Visualize.

  22. On the Visualize page, under the Fields list, choose label and select the pie chart from Visual types.

You can add more visualizations in the dashboard. When your analysis is ready, you can choose Share to create a dashboard and share it within your organization.

Summary

In this post, we showed how you can create a solution to receive notifications for specific labels (such as bacterial leaf blight or leaf smut) found in processed images using Rekognition Custom Labels. In addition, we showed how you can create dashboards to visualize the results using Athena and QuickSight.

You can now easily share such visualization dashboards with business users and allow them to subscribe to notifications instead of having to rely on your engineering teams to build such an application.


About the Authors

Jay Rao is a Principal Solutions Architect at AWS. He enjoys providing technical and strategic guidance to customers and helping them design and implement solutions on AWS.

Pashmeen Mistry is the Senior Product Manager for Amazon Rekognition Custom Labels. Outside of work, Pashmeen enjoys adventurous hikes, photography, and spending time with his family.


Customize the Amazon SageMaker XGBoost algorithm container

The built-in Amazon SageMaker XGBoost algorithm provides a managed container to run the popular XGBoost machine learning (ML) framework, with the added convenience of supporting advanced training or inference features like distributed training, dataset sharding for large-scale datasets, A/B model testing, or multi-model inference endpoints. You can also extend this powerful algorithm to accommodate different requirements.

Packaging the code and dependencies in a single container is a convenient and robust approach for long-term code maintenance, reproducibility, and auditing purposes. Modifying the container directly keeps it faithful to the base container and avoids duplicating functions that the base container already supports. In this post, we review the inner workings of the SageMaker XGBoost algorithm container and provide pragmatic scripts to directly customize the container.

SageMaker XGBoost container structure

The SageMaker built-in XGBoost algorithm is packaged as a stand-alone container, available on GitHub, and can be extended under the developer-friendly Apache 2.0 open-source license. The container packages the open-source XGBoost algorithm and ancillary tools to run the algorithm in the SageMaker environment integrated with other AWS Cloud services. This allows you to train XGBoost models on a variety of data sources, make batch predictions on offline data, or host an inference endpoint in a real-time pipeline.

The container supports training and inference operations with different entry points. For inference mode, the entry point is the main function in the serving.py script. For real-time inference serving, the container runs a Flask-based web server that, when invoked, receives an HTTP-encoded request containing the data, decodes the data into XGBoost’s DMatrix format, loads the model, and returns an HTTP-encoded response. These methods are encapsulated under the ScoringService class, which can also be customized to a great extent through script mode (see the Appendix below).

The entry point for training mode (algorithm mode) is the main function in training.py. The main function sets up the training environment and calls the training job function. It’s flexible enough to allow for distributed or single-node training, or utilities like cross validation. The heart of the training process can be found in the train_job function.

Docker files packaging the container can be found in the GitHub repo. Note that the container is built in two steps: a base container is built first, followed by the final container on top.

Solution overview

You can modify and rebuild the container through the source code. However, this involves collecting and rebuilding all dependencies and packages from scratch. In this post, we discuss a more straightforward approach that applies the modifications directly on top of the already-built, publicly available SageMaker XGBoost algorithm container image.

In this approach, we pull a copy of the public SageMaker XGBoost image, modify the scripts or add packages, and rebuild the container on top. The modified container can be stored in a private repository. This way, we avoid rebuilding intermediary dependencies and instead build directly on top of the already-built libraries packaged in the official container.

The following figure shows an overview of the script used to pull the public base image, modify and rebuild the image, and upload it to a private Amazon Elastic Container Registry (Amazon ECR) repository. The bash script in the accompanying code of this post performs all the workflow steps shown in the diagram. The accompanying notebook shows an example where the URI of a specific version of the SageMaker XGBoost algorithm is first retrieved and passed to the bash script, which replaces two of the Python scripts in the image, rebuilds it, and pushes the modified image to a private Amazon ECR repository. You can modify the accompanying code to suit your needs.
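
To illustrate the "modify and rebuild on top" step, the following sketch renders a Dockerfile that starts from a base image and copies replacement scripts over it. It is not the bash script from the accompanying code: the function name, the image URI, and the destination paths inside the container are placeholders that you would need to replace with the real values for your image version.

```python
def render_dockerfile(base_image_uri, replacements):
    """Build Dockerfile text that layers replacement files on a base image.

    replacements maps a local file path to its destination path in the image.
    """
    lines = [f"FROM {base_image_uri}"]
    # Sort for a deterministic Dockerfile regardless of dict ordering.
    for local_path, container_path in sorted(replacements.items()):
        lines.append(f"COPY {local_path} {container_path}")
    return "\n".join(lines) + "\n"
```

The resulting text can be written to a Dockerfile and built with the usual Docker tooling before the image is pushed to a private repository.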


Prerequisites

The GitHub repository contains the code accompanying this post. You can run the sample notebook in your AWS account, or use the provided AWS CloudFormation stack to deploy the notebook using a SageMaker notebook. You need the following prerequisites:

  • An AWS account.
  • Necessary permissions to run SageMaker batch transform and training jobs, and Amazon ECR privileges. The CloudFormation template creates sample AWS Identity and Access Management (IAM) roles.

Deploy the solution

To create your solution resources using AWS CloudFormation, choose Launch Stack:

The stack deploys a SageMaker notebook preconfigured to clone the GitHub repository. The walkthrough notebook includes the steps to pull the public SageMaker XGBoost image for a given version, modify it, and push the custom container to a private Amazon ECR repository. The notebook uses the public Abalone dataset as a sample, trains a model using the SageMaker XGBoost built-in training mode, and reuses this model in the custom image to perform batch transform jobs that produce inference together with SHAP values.

Conclusion

SageMaker built-in algorithms provide a variety of features and functionalities, and can be extended further under the Apache 2.0 open-source license. In this post, we reviewed how to extend the production built-in container for the SageMaker XGBoost algorithm to meet production requirements like backward code and API compatibility.

The sample notebook and helper scripts provide a convenient starting point to customize the SageMaker XGBoost container image to suit your needs. Give it a try!

Appendix: Script mode

Script mode provides a way to modify many SageMaker built-in algorithms by providing an interface to replace the functions responsible for transforming the inputs and loading the model. Script mode isn’t as flexible as directly modifying the container, but it provides a completely Python-based route to customize the built-in algorithm with no need to work directly with Docker.

In script mode, a user module is provided to customize data decoding, loading of the model, and making predictions. The user module can define a transformer_fn that handles every step from processing the request to preparing the response. Alternatively, instead of defining transformer_fn, you can provide the custom methods model_fn, input_fn, predict_fn, and output_fn individually to customize loading the model, decoding the input, making predictions, and preparing the output. For a more thorough overview of script mode, see Bring Your Own Model with SageMaker Script Mode.
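
As a rough illustration of how the four functions fit together, the following sketch defines a user module in which a JSON file of linear weights stands in for a real XGBoost model. The argument names follow the common SageMaker convention and are an assumption here, as are the model.json file name and the text/csv content type; a real module would load a booster and build a DMatrix instead.

```python
import json
import os


def model_fn(model_dir):
    """Load the model; a JSON file of weights stands in for a real booster."""
    with open(os.path.join(model_dir, "model.json")) as f:
        return json.load(f)


def input_fn(request_body, content_type):
    """Decode a CSV payload into rows of floats."""
    if content_type != "text/csv":
        raise ValueError(f"Unsupported content type: {content_type}")
    if isinstance(request_body, bytes):
        request_body = request_body.decode("utf-8")
    return [
        [float(value) for value in line.split(",")]
        for line in request_body.strip().splitlines()
        if line.strip()
    ]


def predict_fn(input_data, model):
    """Score each row with the stand-in linear model."""
    weights, bias = model["weights"], model["bias"]
    return [sum(w * x for w, x in zip(weights, row)) + bias for row in input_data]


def output_fn(predictions, accept):
    """Encode predictions as one value per line."""
    return "\n".join(str(p) for p in predictions)
```

At serving time, the container would call these in order: model_fn once at startup, then input_fn, predict_fn, and output_fn for each request.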


About the Authors

Peyman Razaghi is a Data Scientist at AWS. He holds a PhD in information theory from the University of Toronto and was a post-doctoral research scientist at the University of Southern California (USC), Los Angeles. Before joining AWS, Peyman was a staff systems engineer at Qualcomm contributing to a number of notable international telecommunication standards. He has authored several peer-reviewed scientific research articles in the areas of statistics and systems engineering, and enjoys parenting and road cycling outside work.


Detect adversarial inputs using Amazon SageMaker Model Monitor and Amazon SageMaker Debugger

Research over the past few years has shown that machine learning (ML) models are vulnerable to adversarial inputs, where an adversary can craft inputs to strategically alter the model’s output (in image classification, speech recognition, or fraud detection). For example, imagine you have deployed a model that identifies your employees based on images of their faces. As demonstrated in the whitepaper Accessorize to a Crime: Real and Stealthy Attacks on State-of-the-Art Face Recognition, malicious employees may apply subtle but carefully designed modifications to their image and fool the model into authenticating them as other employees. Such adversarial inputs, especially in significant numbers, can have a devastating business impact.

Ideally, we want to detect each time an adversarial input is sent to the model to quantify how adversarial inputs are impacting the model and the business. To this end, a wide class of methods analyze individual model inputs to check for adversarial behavior. However, active research in adversarial ML has led to increasingly sophisticated adversarial inputs, many of which are known to make detection ineffective. The reason for this shortcoming is that it’s difficult to draw conclusions from an individual input as to whether it’s adversarial or not. A more recent class of methods therefore focuses on distribution-level checks that analyze multiple inputs at a time. The key idea behind these methods is that considering multiple inputs at a time enables more powerful statistical analysis that isn’t possible with individual inputs. However, in the face of a determined adversary with deep knowledge of the model, even these advanced detection methods can fail.

However, we can defeat even these determined adversaries by providing the defense methods with additional information. Specifically, instead of just analyzing the model inputs, analyzing the latent representations collected from the intermediate layers in a deep neural network significantly strengthens the defense.

In this post, we walk you through how to detect adversarial inputs using Amazon SageMaker Model Monitor and Amazon SageMaker Debugger for an image classification model hosted on Amazon SageMaker.

To reproduce the different steps and results listed in this post, clone the repository detecting-adversarial-samples-using-sagemaker into your Amazon SageMaker notebook instance and run the notebook.

Detecting adversarial inputs

We show you how to detect adversarial inputs using the representations collected from a deep neural network. The following four images show the original training image on the left (taken from the Tiny ImageNet dataset) and three images produced by the Projected Gradient Descent (PGD) attack [1] with different perturbation parameters ϵ. The model used here was ResNet18. The ϵ parameter defines the amount of adversarial noise added to the images. The original image (left) is correctly predicted as class 67 (goose). The adversarially modified images 2, 3, and 4 are incorrectly predicted as class 51 (mantis) by the ResNet18 model. We can also see that images generated with small ϵ are perceptually indistinguishable from the original input image.
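
To give a sense of how ϵ bounds the adversarial noise, the following sketch shows only the projection step that PGD-style attacks commonly apply after each gradient update. It assumes an L-infinity perturbation budget and pixel values scaled to [0, 1]; neither is stated in this post, and the function name is hypothetical.

```python
def project_linf(x_adv, x_orig, eps, lo=0.0, hi=1.0):
    """Clip x_adv so each value stays within eps of x_orig and inside [lo, hi]."""
    return [
        min(max(adv, orig - eps, lo), orig + eps, hi)
        for adv, orig in zip(x_adv, x_orig)
    ]
```

With a small eps, every pixel of the projected image differs from the original by at most eps, which is why such images can look unchanged to a human observer.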

Next, we create a set of normal and adversarial images and use t-Distributed Stochastic Neighbor Embedding (t-SNE [2]) to visually compare their distributions. t-SNE is a dimensionality reduction method that maps high-dimensional data into a 2- or 3-dimensional space. Each data point in the following image represents an input image. Orange data points represent the normal inputs taken from the test set, and blue data points represent the corresponding adversarial images generated with an epsilon of 0.003. If normal and adversarial inputs were distinguishable, we would expect separate clusters in the t-SNE visualization. Because both belong to the same cluster, a detection technique that focuses solely on changes in the model input distribution can’t distinguish these inputs.

Let’s take a closer look at the layer representations produced by different layers in the ResNet18 model. ResNet18 consists of 18 layers; in the following image, we visualize the t-SNE embeddings for the representations for six of these layers.

As the preceding figure shows, natural and adversarial inputs become more distinguishable for deeper layers of the ResNet18 model.

Based on these observations, we use a statistical method that measures distinguishability with hypothesis testing. The method consists of a two-sample test using maximum mean discrepancy (MMD). MMD is a kernel-based metric for measuring the similarity between two distributions generating the data. A two-sample test takes two sets that contain inputs drawn from two distributions, and determines whether these distributions are the same. We compare the distribution of inputs observed in the training data with the distribution of the inputs received during inference.
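
For reference, the squared MMD between two distributions P and Q under a kernel k is commonly written as follows. The notation is introduced here for illustration and isn’t taken from the accompanying code.

```latex
\mathrm{MMD}^2(P, Q)
  = \mathbb{E}_{x, x' \sim P}\left[k(x, x')\right]
  + \mathbb{E}_{y, y' \sim Q}\left[k(y, y')\right]
  - 2\,\mathbb{E}_{x \sim P,\; y \sim Q}\left[k(x, y)\right]
```

The value is zero when the kernel mean embeddings of P and Q coincide and grows as the two distributions diverge.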

Our method uses these inputs to estimate the p-value using MMD. If the p-value is less than a user-specified significance threshold (5% in our case), we conclude that the two distributions are different. The threshold tunes the trade-off between false positives and false negatives. A higher threshold, such as 10%, decreases the false negative rate (there are fewer cases when both distributions were different but the test failed to indicate that). However, it also results in more false positives (the test indicates both distributions are different even when that isn’t the case). On the other hand, a lower threshold, such as 1%, results in fewer false positives but more false negatives.
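
The following is a minimal sketch of an MMD two-sample test, written in plain Python for small samples. It is not the implementation used in this post: the RBF kernel, the median heuristic for the bandwidth, and the permutation-based p-value are assumptions chosen because they are common defaults, and the function name is hypothetical. It expects at least two points in total.

```python
import math
import random
import statistics


def _sq_dist(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def mmd_permutation_test(x, y, n_permutations=200, seed=0):
    """Return (mmd2, p_value) for samples x and y (lists of equal-length vectors)."""
    pooled = list(x) + list(y)
    n, total = len(x), len(x) + len(y)
    # Pairwise squared distances over the pooled sample.
    d2 = [[_sq_dist(pooled[i], pooled[j]) for j in range(total)] for i in range(total)]
    # Median heuristic for the RBF bandwidth (falls back to 1.0 if the median is 0).
    off_diag = [d2[i][j] for i in range(total) for j in range(i + 1, total)]
    sigma2 = statistics.median(off_diag) or 1.0
    k = [[math.exp(-d / (2.0 * sigma2)) for d in row] for row in d2]

    def mean_k(rows, cols):
        return sum(k[i][j] for i in rows for j in cols) / (len(rows) * len(cols))

    def mmd2(idx_x, idx_y):
        # Biased estimate of the squared MMD from the precomputed kernel matrix.
        return mean_k(idx_x, idx_x) + mean_k(idx_y, idx_y) - 2.0 * mean_k(idx_x, idx_y)

    indices = list(range(total))
    observed = mmd2(indices[:n], indices[n:])
    rng = random.Random(seed)
    exceed = 0
    for _ in range(n_permutations):
        # Reassign points to the two groups at random and recompute the statistic.
        rng.shuffle(indices)
        if mmd2(indices[:n], indices[n:]) >= observed:
            exceed += 1
    p_value = (exceed + 1) / (n_permutations + 1)
    return observed, p_value
```

A p-value below the chosen significance threshold (for example, 0.05) would be treated as evidence that the two samples come from different distributions.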

Instead of applying this method solely on the raw model inputs (images), we use the latent representations produced by the intermediate layers of our model. To account for its probabilistic nature, we apply the hypothesis test 100 times on 100 randomly selected natural inputs and 100 randomly selected adversarial inputs. Then we report the detection rate as the percentage of tests that resulted in a detection event according to our 5% significance threshold. A higher detection rate is a stronger indication that the two distributions are different. This procedure gives us the following detection rates:

  • Layer 1: 3%
  • Layer 4: 7%
  • Layer 8: 84%
  • Layer 12: 95%
  • Layer 14: 100%
  • Layer 15: 100%

In the initial layers, the detection rate is rather low (less than 10%), but increases to 100% in the deeper layers. Using the statistical test, the method can confidently detect adversarial inputs in deeper layers. It is often sufficient to simply use the representations generated by the penultimate layer (the last layer before the classification layer in a model). For more sophisticated adversarial inputs, it’s useful to use representations from other layers and aggregate the detection rates.
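
The repeated-test procedure behind these detection rates can be sketched as follows. The function takes any two-sample test as a callable that returns a p-value; the names detection_rate and p_value_fn are hypothetical, and the defaults mirror the numbers quoted above (100 trials, 100 samples per group, 5% threshold).

```python
import random


def detection_rate(p_value_fn, natural, adversarial,
                   n_trials=100, sample_size=100, alpha=0.05, seed=0):
    """Return the percentage of trials in which the test flags a difference."""
    rng = random.Random(seed)
    detections = 0
    for _ in range(n_trials):
        # Draw random subsets without replacement for each trial.
        a = rng.sample(natural, min(sample_size, len(natural)))
        b = rng.sample(adversarial, min(sample_size, len(adversarial)))
        if p_value_fn(a, b) < alpha:
            detections += 1
    return 100.0 * detections / n_trials
```

Running this once per layer, with that layer’s representations as the two inputs, would produce one detection rate per layer.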

Solution overview

In the previous section, we saw how to detect adversarial inputs using representations from the penultimate layer. Next, we show how to automate these tests on SageMaker by using Model Monitor and Debugger. For this example, we first train an image classification ResNet18 model on the Tiny ImageNet dataset. Next, we deploy the model on SageMaker and create a custom Model Monitor schedule that runs the statistical test. Afterwards, we run inference with normal and adversarial inputs to see how effective the method is.

Capture tensors using Debugger

During model training, we use Debugger to capture representations generated by the penultimate layer, which are used later on to derive information about the distribution of normal inputs. Debugger is a feature of SageMaker that enables you to capture and analyze information such as model parameters, gradients, and activations during model training. These parameter, gradient, and activation tensors are uploaded to Amazon Simple Storage Service (Amazon S3) while the training is in progress. You can configure rules that analyze these for issues such as overfitting and vanishing gradients. For our use case, we only want to capture the penultimate layer of the model (.*avgpool_output) and the model outputs (predictions). We specify a Debugger hook configuration that defines a regular expression for the layer representations to be collected. We also specify a save_interval that instructs Debugger to collect this data during the validation phase every 100 forward passes. See the following code:

from sagemaker.debugger import DebuggerHookConfig, CollectionConfig

debugger_hook_config = DebuggerHookConfig(
      collection_configs=[ 
          CollectionConfig(
                name="custom_collection",
                parameters={ "include_regex": ".*avgpool_output|.*ResNet_output",
                             "eval.save_interval": "100" })])

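To see which tensors a pattern like the one above would select, you can try it against candidate names with Python’s re module. The tensor names below are made up for illustration, and exactly how Debugger applies the pattern internally is not covered here; because each alternative starts with .*, matching from the start of the name and searching anywhere in it give the same result.

```python
import re

# Same pattern as in the collection configuration above.
pattern = re.compile(r".*avgpool_output|.*ResNet_output")


def is_captured(tensor_name):
    """Return True if the tensor name matches the collection pattern."""
    return pattern.match(tensor_name) is not None
```

For example, is_captured("avgpool_output_0") is true, whereas a name such as "layer1.0.conv1.weight" is not matched.
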
Run SageMaker training

We pass the Debugger configuration into the SageMaker estimator and start the training:

import sagemaker 
from sagemaker.pytorch import PyTorch

role = sagemaker.get_execution_role()

pytorch_estimator = PyTorch(entry_point='train.py',
                            source_dir='code',
                            role=role,
                            instance_type='ml.p3.2xlarge',
                            instance_count=1,
                            framework_version='1.8',
                            py_version='py3',
                            hyperparameters = {'epochs': 25, 
                                               'learning_rate': 0.001},
                            debugger_hook_config=debugger_hook_config
                           )
pytorch_estimator.fit()

Deploy an image classification model

After the model training is complete, we deploy the model as an endpoint on SageMaker. We specify an inference script that defines the model_fn and transform_fn functions. These functions specify how the model is loaded and how incoming data needs to be preprocessed to perform the model inference. For our use case, we enable Debugger to capture relevant data during inference. In the model_fn function, we specify a Debugger hook and a save_config that specifies that for each inference request, the model inputs (images), the model outputs (predictions), and the penultimate layer are recorded (.*avgpool_output). We then register the hook on the model. See the following code:

import os

import smdebug.pytorch as smd


def model_fn(model_dir):

    # create model (create_and_load_model is a helper that is not shown here)
    model = create_and_load_model(model_dir)

    # S3 location for the captured tensors, passed in as an environment variable
    tensors_output_s3uri = os.environ.get('tensors_output')

    # capture layers for every inference request
    save_config = smd.SaveConfig(mode_save_configs={
        smd.modes.PREDICT: smd.SaveConfigMode(save_interval=1),
    })

    # configure Debugger hook
    hook = smd.Hook(
        tensors_output_s3uri,
        save_config=save_config,
        include_regex='.*avgpool_output|.*ResNet_output_0|.*ResNet_input',
    )

    # register hook
    hook.register_module(model)

    # set mode
    hook.set_mode(smd.modes.PREDICT)

    return model

Now we deploy the model, which we can do from the notebook in two ways. We can either call pytorch_estimator.deploy() or create a PyTorch model that points to the model artifact files in Amazon S3 that have been created by the SageMaker training job. In this post, we do the latter. This allows us to pass in environment variables into the Docker container, which is created and deployed by SageMaker. We need the environment variable tensors_output to tell the script where to upload the tensors that are collected by SageMaker Debugger during inference. See the following code:

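import sagemaker
from sagemaker.pytorch import PyTorchModel

# session used to look up the default S3 bucket
sagemaker_session = sagemaker.Session()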

sagemaker_model = PyTorchModel(
    model_data=pytorch_estimator.model_data,
    role=role,
    source_dir='code',
    entry_point='inference.py',
    env={
          'tensors_output': f's3://{sagemaker_session.default_bucket()}/data_capture/inference',
        },
    framework_version='1.8',
    py_version='py3',
)

Next, we deploy the predictor on an ml.m5.xlarge instance type:

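from sagemaker.model_monitor import DataCaptureConfig

# data capture settings aren't shown in this post; these values are assumptions
data_capture_config = DataCaptureConfig(enable_capture=True, sampling_percentage=100)

predictor = sagemaker_model.deploy(
    initial_instance_count=1,
    instance_type='ml.m5.xlarge',
    data_capture_config=data_capture_config,
    deserializer=sagemaker.deserializers.JSONDeserializer(),
)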

Create a custom Model Monitor schedule

When the endpoint is up and running, we create a customized Model Monitor schedule. This is a SageMaker processing job that runs on a periodic interval (such as hourly or daily) and analyzes the inference data. Model Monitor provides a pre-configured container that analyzes and detects data drift. In our case, we want to customize it to fetch the Debugger data and run the MMD two-sample test on the retrieved layer representations.

To customize it, we first define the Model Monitor object, which specifies on which instance type these jobs are going to run and the location of our custom Model Monitor container:

from sagemaker.model_monitor import ModelMonitor

monitor = ModelMonitor(
    base_job_name='ladis-monitor',
    role=role,
    image_uri=processing_repository_uri,
    instance_count=1,
    instance_type='ml.m5.large',
    env={ 'training_data':f'{pytorch_estimator.latest_job_debugger_artifacts_path()}', 
          'inference_data': f's3://{sagemaker_session.default_bucket()}/data_capture/inference'},
)

We want to run this job on an hourly basis, so we specify CronExpressionGenerator.hourly() along with the output location where the analysis results are uploaded. For that, we define a ProcessingOutput for the SageMaker processing output and wrap it in a MonitoringOutput:

from sagemaker.model_monitor import CronExpressionGenerator, MonitoringOutput
from sagemaker.processing import ProcessingInput, ProcessingOutput

#inputs and outputs for scheduled monitoring job
destination = f's3://{sagemaker_session.default_bucket()}/data_capture/results'
processing_output = ProcessingOutput(
    output_name='result',
    source='/opt/ml/processing/results',
    destination=destination,
)
output = MonitoringOutput(source=processing_output.source, destination=processing_output.destination)

#create schedule
monitor.create_monitoring_schedule(
    output=output,
    endpoint_input=predictor.endpoint_name,
    schedule_cron_expression=CronExpressionGenerator.hourly(),
)

Let’s look closer at what our custom Model Monitor container is running. We create an evaluation script, which loads the data captured by Debugger. We also create a trial object, which enables us to access, query, and filter the data that Debugger saved. With the trial object, we can iterate over the steps saved during the inference and training phases by calling trial.steps(mode).

First, we fetch the model outputs (trial.tensor("ResNet_output_0")) as well as the penultimate layer (trial.tensor_names(regex=".*avgpool_output")). We do this for the validation phase of training and for the inference phase (modes.EVAL and modes.PREDICT, respectively). The tensors from the validation phase serve as an estimate of the distribution of normal (non-adversarial) data, against which we compare the distribution of the inference data. We created a class LADIS (Detecting Adversarial Input Distributions via Layerwise Statistics). This class provides the relevant functionality to perform the two-sample test. It takes the lists of tensors from the inference and validation phases and runs the two-sample test. It returns a detection rate, which is a value between 0% and 100%. The higher the value, the more likely it is that the inference data follows a different distribution. Furthermore, we compute a score for each sample that indicates how likely the sample is to be adversarial, and we record the top 100 samples so that users can inspect them further. See the following code:

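from collections import defaultdict

import numpy as np
from smdebug import modes
from smdebug.trials import create_trial

import LADIS
import sample_selection

#containers for the collected predictions, layer outputs, and per-sample scores
val_predictions, inference_predictions, stats = [], [], []
val_pen_layer, inference_pen_layer = defaultdict(list), defaultdict(list)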

#access tensors saved during training
trial = create_trial("s3://xxx/training/debug-output/")

#iterate over validation steps saved by Debugger during training
for step in trial.steps(mode=modes.EVAL):
       
   #get model outputs
   tensor = trial.tensor("ResNet_output_0").value(step, mode=modes.EVAL)
   prediction = np.argmax(tensor)
   val_predictions.append(prediction)
   
   #get outputs from penultimate layer 
   for layer in trial.tensor_names(regex=".*avgpool_output"):
      tensor = trial.tensor(layer).value(step, mode=modes.EVAL)
      val_pen_layer[layer].append(tensor)
      
#access tensors saved during inference
trial = create_trial("s3://xxx/data_capture/inference/")

#iterate over inference steps saved by Debugger
for step in trial.steps(mode=modes.PREDICT):
       
   #get model outputs
   tensor = trial.tensor("ResNet_output_0").value(step, mode=modes.PREDICT)
   prediction = np.argmax(tensor)
   inference_predictions.append(prediction)
    
   #get penultimate layer
   for layer in trial.tensor_names(regex=".*avgpool_output"):
      tensor = trial.tensor(layer).value(step, mode=modes.PREDICT)
      inference_pen_layer[layer].append(tensor)


#create LADIS object 
ladis = LADIS.LADIS(val_pen_layer, val_predictions, 
                    inference_pen_layer, inference_predictions)

#run MMD test
detection_rate = ladis.get_detection_rate(layers=[0], combine=True)

#determine how much each sample contributes to the detection
for index in range(len(inference_pen_layer['avgpool_output_0'])):

    stats.append(sample_selection.compute_ME_stat(val_pen_layer['avgpool_output_0'],
                            inference_pen_layer['avgpool_output_0'],
                            inference_pen_layer['avgpool_output_0'][index]))

#find top 100 samples that were the most impactful for detection
samples = sorted(stats)[:100]
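The internals of the LADIS class are not shown in this post. To give a feel for the kind of statistic a kernel two-sample test is built on, the following is a minimal sketch of a biased estimate of the squared maximum mean discrepancy (MMD) with a Gaussian kernel. The function names, the kernel choice, the gamma value, and the synthetic data are all assumptions made for illustration; this is not the LADIS implementation.

```python
import numpy as np


def rbf_kernel(x, y, gamma):
    """Gaussian (RBF) kernel matrix between the rows of x and the rows of y."""
    sq_dists = (
        np.sum(x ** 2, axis=1)[:, None]
        + np.sum(y ** 2, axis=1)[None, :]
        - 2.0 * x @ y.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))


def mmd_squared(x, y, gamma=1.0):
    """Biased estimate of the squared MMD between samples x and y."""
    k_xx = rbf_kernel(x, x, gamma).mean()
    k_yy = rbf_kernel(y, y, gamma).mean()
    k_xy = rbf_kernel(x, y, gamma).mean()
    return k_xx + k_yy - 2.0 * k_xy


# Synthetic stand-ins for layer activations (assumption: 8-dimensional features).
rng = np.random.default_rng(0)
validation = rng.normal(0.0, 1.0, size=(200, 8))
same_dist = rng.normal(0.0, 1.0, size=(200, 8))   # same distribution as validation
shifted = rng.normal(1.5, 1.0, size=(200, 8))     # shifted distribution

print("same distribution:   ", mmd_squared(validation, same_dist, gamma=0.1))
print("shifted distribution:", mmd_squared(validation, shifted, gamma=0.1))
```

In this toy setup, the estimate for the shifted sample is larger than the estimate for the sample drawn from the same distribution, which is the signal a two-sample test turns into an accept or reject decision.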

Test against adversarial inputs

Now that our custom Model Monitor schedule has been deployed, we can produce some inference results.

First, we run with data from the holdout set and then with adversarial inputs:

from torchvision import datasets

test_dataset = datasets.CIFAR10('data/cifar10', train=False, download=True, transform=None)

#run inference loop over holdout dataset
for index, (image, label) in enumerate(zip(test_dataset.data, test_dataset.targets)):

    #predict
    result = predictor.predict(image)

We can then check the Model Monitor display in Amazon SageMaker Studio or use Amazon CloudWatch logs to see if an issue was found.

Next, we use the adversarial inputs against the model hosted on SageMaker. We use the test dataset of the Tiny ImageNet dataset and apply the PGD attack, which introduces perturbations at the pixel level such that the model doesn’t recognize correct classes. In the following images, the left column shows two original test images, the middle column shows their adversarially perturbed versions, and the right column shows the difference between both images.

Now we can check the Model Monitor status and see that some of the inference images were drawn from a different distribution.

Results and user action

The custom Model Monitor job computes a score for each inference request, which indicates how likely the sample is to be adversarial according to the MMD test. These scores are gathered for all inference requests. Each score is recorded together with the corresponding Debugger step number in a JSON file, which is uploaded to Amazon S3. After the Model Monitoring job is complete, we download the JSON file, retrieve the step numbers, and use Debugger to retrieve the corresponding model inputs for these steps. This allows us to inspect the images that were detected as adversarial.
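As a rough illustration of how such a file could be assembled, the following sketch ranks per-step scores and keeps the highest-scoring entries. The build_violations helper and the example scores are hypothetical, and the sketch assumes that a higher score means a more suspicious sample. The violations, description, Step, and Score keys mirror the ones read by the plotting code in this section.

```python
import json


def build_violations(scores_by_step, top_k=100):
    """Rank steps by score (highest first) and keep the top_k entries."""
    ranked = sorted(scores_by_step.items(), key=lambda item: item[1], reverse=True)
    return {
        "violations": [
            {"description": {"Step": step, "Score": score}}
            for step, score in ranked[:top_k]
        ]
    }


# Hypothetical scores keyed by Debugger step number.
scores = {0: 0.12, 1: 0.87, 2: 0.05, 3: 0.64}
report = build_violations(scores, top_k=2)
print(json.dumps(report, indent=2))
```

Keeping the step number next to each score is what makes it possible to look up the matching input tensor later.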

The following code block plots the first two images that have been identified as the most likely to be adversarial:

#access inference data
trial = create_trial(f"s3://{sagemaker_session.default_bucket()}/data_capture/inference")
steps = trial.steps(mode=modes.PREDICT)

#load constraint_violations.json file generated by custom ModelMonitor
results = monitor.latest_monitoring_constraint_violations().body_dict

for index in range(2):
    # get results: step and score
    step = results['violations'][index]['description']['Step']
    score = round( results['violations'][index]['description']['Score'],3)
    
    # get input image
    image = trial.tensor('ResNet_input_0').value(step, mode=modes.PREDICT)[0,:,:,:]
    
    # get predicted class
    predicted = np.argmax(trial.tensor('ResNet_output_0').value(step, mode=modes.PREDICT))
    
    # visualize image 
    plot_image(image, predicted)

In our example test run, we get the following output. The jellyfish image was incorrectly predicted as an orange, and the camel image as a panda. Obviously, the model failed on these inputs and didn’t even predict a similar image class, such as goldfish or horse. For comparison, we also show the corresponding natural samples from the test set on the right side. We can observe that the random perturbations introduced by the attacker are very visible in the background of both images.

The custom Model Monitor job publishes the detection rate to CloudWatch, so we can investigate how this rate changed over time. A significant change between two data points may indicate that an adversary was trying to fool the model at a specific time frame. Additionally, you can also plot the number of inference requests being processed in each Model Monitor job and the baseline detection rate, which is computed over the validation dataset. The baseline rate is usually close to 0 and only serves as a comparison metric.
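One way a processing job could publish such a metric is through the CloudWatch PutMetricData API. The following is a minimal sketch, assuming a hypothetical namespace, metric name, and dimension; the RecordingClient class is a stand-in so that the sketch runs without AWS credentials. In a real job, you would pass boto3.client('cloudwatch') instead.

```python
from datetime import datetime, timezone


def publish_detection_rate(cloudwatch, detection_rate, endpoint_name,
                           namespace="Custom/LADIS"):
    """Send one detection-rate data point (in percent) to CloudWatch."""
    metric = {
        "MetricName": "detection_rate",          # hypothetical metric name
        "Dimensions": [{"Name": "Endpoint", "Value": endpoint_name}],
        "Timestamp": datetime.now(timezone.utc),
        "Value": float(detection_rate),
        "Unit": "Percent",
    }
    cloudwatch.put_metric_data(Namespace=namespace, MetricData=[metric])
    return metric


class RecordingClient:
    """Stand-in for boto3.client('cloudwatch') that only records calls."""

    def __init__(self):
        self.calls = []

    def put_metric_data(self, **kwargs):
        self.calls.append(kwargs)


client = RecordingClient()
sent = publish_detection_rate(client, 87.5, "my-endpoint")
print(client.calls[0]["Namespace"], sent["MetricName"], sent["Value"])
```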

The following screenshot shows the metrics generated by our test runs, which ran three Model Monitoring jobs over 3 hours. Each job processes approximately 200–300 inference requests at a time. The detection rate is 100% between 5:00 PM and 6:00 PM, and drops afterwards.

Furthermore, we can also inspect the distributions of representations generated by the intermediate layers of the model. With Debugger, we can access the data from the validation phase of the training job and the tensors from the inference phase, and use t-SNE to visualize their distribution for certain predicted classes. See the following code:

import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns
from sklearn.manifold import TSNE

#compute TSNE embeddings
tsne = TSNE(n_components=2, verbose=1, perplexity=40, n_iter=300)
embedding = tsne.fit_transform(np.concatenate((val_penultimate_layer, inference_penultimate_layer)))

# create the figure first so that the scatter plot is drawn on it
plt.figure(figsize=(10,5))
sns.scatterplot(x=embedding[:,0], y=embedding[:,1], hue=labels, alpha=0.6, palette=sns.color_palette(None, len(np.unique(labels))), legend="full")

In our test case, we get the following t-SNE visualization for the second image class. We can observe that the adversarial samples are clustered differently than the natural ones.

Summary

In this post, we showed how to use a two-sample test based on maximum mean discrepancy to detect adversarial inputs. We demonstrated how you can deploy such detection mechanisms using Debugger and Model Monitor. This workflow allows you to monitor your models hosted on SageMaker at scale and detect adversarial inputs automatically. To learn more about it, check out our GitHub repo.

References

[1] Aleksander Madry, Aleksandar Makelov, Ludwig Schmidt, Dimitris Tsipras, and Adrian Vladu. Towards deep learning models resistant to adversarial attacks. In International Conference on Learning Representations, 2018.

[2] Laurens van der Maaten and Geoffrey Hinton. Visualizing data using t-SNE. Journal of Machine Learning Research, 9:2579–2605, 2008. URL http://www.jmlr.org/papers/v9/vandermaaten08a.html.


About the Authors

Nathalie Rauschmayr is a Senior Applied Scientist at AWS, where she helps customers develop deep learning applications.

Yigitcan Kaya is a fifth-year PhD student at the University of Maryland and an applied scientist intern at AWS, working on security of machine learning and applications of machine learning for security.

Bilal Zafar is an Applied Scientist at AWS, working on Fairness, Explainability and Security in Machine Learning.

Sergul Aydore is a Senior Applied Scientist at AWS working on Privacy and Security in Machine Learning.


Build an MLOps sentiment analysis pipeline using Amazon SageMaker Ground Truth and Databricks MLflow

As more organizations move to machine learning (ML) to drive deeper insights, two key stumbling blocks they run into are labeling and lifecycle management. Labeling is the identification of data and adding labels to provide context so an ML model can learn from it. Labels might indicate a phrase in an audio file, a car in a photograph, or an organ in an MRI. Data labeling is necessary to enable ML models to work against the data. Lifecycle management has to do with the process of setting up an ML experiment and documenting the dataset, library, version, and model used to get results. A team might run hundreds of experiments before settling on one approach. Going back and recreating that approach can be difficult without records of the elements of that experiment.

Many ML examples and tutorials start with a dataset that includes a target value. However, real-world data doesn’t always have such a target value. For example, in sentiment analysis, a person can usually make a judgment on whether a review is positive, negative, or mixed. But reviews are made up of a collection of text with no judgment value attached to it. In order to create a supervised learning model to solve this problem, a high-quality labeled dataset is essential. Amazon SageMaker Ground Truth is a fully managed data labeling service that makes it easy to build highly accurate training datasets for ML.

For organizations that use Databricks as their data and analytics platform on AWS to perform extract, transform, and load (ETL) tasks, the ultimate goal is often training a supervised learning model. In this post, we show how Databricks integrates with Ground Truth and Amazon SageMaker for data labeling and model distribution.

Solution overview

Ground Truth is a fully managed data labeling service that makes it easy to build highly accurate training datasets for ML. Through the Ground Truth console, we can create custom or built-in data labeling workflows in minutes. These workflows support a variety of use cases, including 3D point clouds, video, images, and text. In addition, Ground Truth offers automatic data labeling, which uses an ML model to label our data.

We train our model on the publicly available Amazon Customer Reviews dataset. At a high level, the steps are as follows:

  1. Extract a raw dataset to be labeled and move it to Amazon Simple Storage Service (Amazon S3).
  2. Perform labeling by creating a labeling job in SageMaker.
  3. Build and train a simple Scikit-learn linear learner model to classify the sentiment of the review text on the Databricks platform using a sample notebook.
  4. Use MLflow components to create and perform MLOps and save the model artifacts.
  5. Deploy the model as a SageMaker endpoint using the MLflow SageMaker library for real-time inference.

The following diagram illustrates the labeling and ML journey using Ground Truth and MLflow.

Create a labeling job in SageMaker

From the Amazon Customer Reviews dataset, we extract the text portions only, because we’re building a sentiment analysis model. Once extracted, we put the text in an S3 bucket and then create a Ground Truth labeling job via the SageMaker console.

On the Create labeling job page, fill out all required fields. As part of this step, Ground Truth allows you to generate the job manifest file. Ground Truth uses the input manifest file to identify the number of files or objects in the labeling job so that the right number of tasks are created and sent to human (or machine) labelers. The file is automatically saved in the S3 bucket. The next step is to specify the task category and task selection. In this use case, we choose Text as the task category, and Text Classification with a single label for task selection, which means a review text will have a single sentiment: positive, negative, or neutral.

Next, we write simple, concise instructions for labelers on how to label the text data. The instructions are displayed in the labeling tool, and you can optionally review the annotator’s view at this time. Finally, we submit the job and monitor the progress on the console.

While the labeling job is in progress, we can also look at the labeled data on the Output tab. We can review each review text and its label, and see whether the labeling was done by a human or a machine. We can have 100% of the labeling done by humans or choose machine annotation, which speeds up the job and reduces labor costs.

When the job is complete, the labeling job summary contains links to the output manifest and the labeled dataset. We can also go to Amazon S3 and download both from our S3 bucket folder.

In the next steps, we use a Databricks notebook, MLflow, and datasets labeled by Ground Truth to build a Scikit-learn model.

Download a labeled dataset from Amazon S3

We start by downloading the labeled dataset from Amazon S3. The manifest is saved in JSON format and we load it into a Spark DataFrame in Databricks. For training the sentiment analysis model, we only need the review text and sentiment that was annotated by the Ground Truth labeling job. We use select() to extract those two features. Then we convert the dataset from a PySpark DataFrame to a Pandas DataFrame, because the Scikit-learn algorithm requires Pandas DataFrame format.
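The post performs this step with Spark. As a plain-Python illustration of what the extraction amounts to, the following sketch parses two records shaped like a Ground Truth text classification output manifest (one JSON object per line). The label attribute name sentiment-job and the sample reviews are hypothetical, and the record layout is an assumption about the manifest format rather than a copy of the manifest used in this post.

```python
import json

# Two example records; "sentiment-job" is a hypothetical label attribute name.
manifest_lines = [
    '{"source": "Great product, works as described.", "sentiment-job": 0, '
    '"sentiment-job-metadata": {"class-name": "positive", "human-annotated": "yes"}}',
    '{"source": "Stopped working after a week.", "sentiment-job": 1, '
    '"sentiment-job-metadata": {"class-name": "negative", "human-annotated": "no"}}',
]


def extract_text_and_label(line, label_attribute):
    """Return (review text, class name) from one JSON Lines manifest record."""
    record = json.loads(line)
    metadata = record[f"{label_attribute}-metadata"]
    return record["source"], metadata["class-name"]


rows = [extract_text_and_label(line, "sentiment-job") for line in manifest_lines]
for text, label in rows:
    print(f"{label:>8}: {text}")
```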

Next, we use Scikit-learn CountVectorizer to transform the review text into a bigram vector by setting the ngram_range max value to 2. CountVectorizer converts text into a matrix of token counts. Then we use TfidfTransformer to transform the bigram vector into a term frequency-inverse document frequency (TF-IDF) format.

We compare the accuracy scores for training done with a bigram vector vs. bigram with TF-IDF. TF-IDF is a statistical measure that evaluates how relevant a word is to a document in a collection of documents. Because the review text tends to be relatively short, we can observe how TF-IDF affects the performance of the predictive model.
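The two feature representations can be sketched as follows. The three sample reviews are made up, and ngram_range=(1, 2) is an assumption about the lower bound, because the post only states that the maximum value is 2.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer

# Made-up reviews for illustration only.
reviews = [
    "great product works well",
    "terrible product broke quickly",
    "works well great value",
]

# ngram_range=(1, 2) keeps single words and adds adjacent word pairs (bigrams).
vectorizer = CountVectorizer(ngram_range=(1, 2))
bigram_counts = vectorizer.fit_transform(reviews)

# Re-weight the raw counts so that terms appearing in many reviews count for less.
tfidf = TfidfTransformer().fit_transform(bigram_counts)

print(bigram_counts.shape, tfidf.shape)
print(sorted(vectorizer.vocabulary_)[:5])
```

Both matrices have one row per review and one column per unigram or bigram; only the values differ, which is what makes a side-by-side accuracy comparison straightforward.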

Set up an MLflow experiment

MLflow was developed by Databricks and is now an open-source project. MLflow manages the ML lifecycle, so you can track, recreate, and publish experiments easily.

To set up MLflow experiments, we use mlflow.sklearn.autolog() to enable auto logging of hyperparameters, metrics, and model artifacts whenever estimator.fit(), estimator.fit_predict(), and estimator.fit_transform() are called. Alternatively, you can do this manually by calling mlflow.log_param() and mlflow.log_metric().

We fit the transformed dataset to a linear classifier with Stochastic Gradient Descent (SGD) learning. With SGD, the gradient of the loss is estimated one sample at a time and the model is updated along the way with a decreasing strength schedule.
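A minimal sketch of this training step with Scikit-learn’s SGDClassifier follows. The toy reviews, the hinge loss, and the other hyperparameters are assumptions chosen for illustration; the post does not list the settings it used.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import SGDClassifier

# Toy labeled reviews for illustration only.
texts = [
    "great product works well", "love it highly recommend",
    "excellent quality very happy", "terrible product broke quickly",
    "awful waste of money", "very disappointed poor quality",
]
labels = ["positive", "positive", "positive", "negative", "negative", "negative"]

features = CountVectorizer(ngram_range=(1, 2)).fit_transform(texts)

# With the default 'optimal' learning rate schedule, the step size
# decreases as training proceeds.
classifier = SGDClassifier(loss="hinge", max_iter=1000, tol=1e-3, random_state=42)
classifier.fit(features, labels)

print("training accuracy:", classifier.score(features, labels))
```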

Those two datasets we prepared earlier are passed to the train_and_show_scores() function for training. After training, we need to register a model and save its artifacts. We use mlflow.sklearn.log_model() to do so.

Before deploying, we look at the experiment’s results and choose two experiments (one for bigram and the other for bigram with TF-IDF) to compare. In our use case, the second model trained with bigram TF-IDF performed slightly better, so we pick that model to deploy. After the model is registered, we deploy the model, changing the model stage to production. We can accomplish this on the MLflow UI, or in the code using transition_model_version_stage().

Deploy and test the model as a SageMaker endpoint

Before we deploy the trained model, we need to build a Docker container to host the model in SageMaker. We do this by running a simple MLflow command that builds and pushes the container to Amazon Elastic Container Registry (Amazon ECR) in our AWS account.

We can now find the image URI on the Amazon ECR console. We pass the image URI as an image_url parameter, and use DEPLOYMENT_MODE_CREATE for the mode parameter if this is a new deployment. If updating an existing endpoint with a new version, use DEPLOYMENT_MODE_REPLACE.

To test the SageMaker endpoint, we create a function that takes the endpoint name and input data as its parameters.
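Such a function might look like the following sketch, which calls the SageMaker runtime InvokeEndpoint API. The function name, the JSON payload shape, and the endpoint name are assumptions; the payload format that your endpoint expects depends on how the model was packaged. The FakeRuntime class is a stand-in so that the sketch can run without a deployed endpoint.

```python
import io
import json


def query_endpoint(endpoint_name, input_data, runtime_client=None):
    """Send input_data as JSON to a SageMaker endpoint and return the parsed reply."""
    if runtime_client is None:
        import boto3  # imported lazily so the sketch can be exercised offline
        runtime_client = boto3.client("sagemaker-runtime")

    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,
        Body=json.dumps(input_data),
        ContentType="application/json",
    )
    return json.loads(response["Body"].read().decode("utf-8"))


class FakeRuntime:
    """Stand-in for the SageMaker runtime client that returns a canned prediction."""

    def invoke_endpoint(self, EndpointName, Body, ContentType):
        return {"Body": io.BytesIO(json.dumps(["positive"]).encode("utf-8"))}


prediction = query_endpoint("sentiment-endpoint", {"text": ["great product"]}, FakeRuntime())
print(prediction)
```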

Conclusion

In this post, we showed you how to use Ground Truth to label a raw dataset, and then use the labeled data to train a simple linear classifier using Scikit-learn. In this example, we use MLflow to track hyperparameters and metrics, register a production-grade model, and deploy the trained model to SageMaker as an endpoint. Along with Databricks to process the data, you can automate this whole use case, so as new data is introduced, it can be labeled and processed into the model. By automating these pipelines and models, data science teams can focus on new use cases and uncover more insights instead of spending their time managing data updates on a day-to-day basis.

To get started, check out Use Amazon SageMaker Ground Truth to Label Data and sign up for a 14-day free trial of Databricks on AWS. To learn more about how Databricks integrates with SageMaker, as well as other AWS services like AWS Glue and Amazon Redshift, visit Databricks on AWS.

Additionally, check out the following resources used in this post:

Use the following notebook to get started.


About the Authors

Rumi Olsen is a Solutions Architect in the AWS Partner Program. She specializes in serverless and machine learning solutions in her current role, and has a background in natural language processing technologies. She spends most of her spare time with her daughter exploring the nature of the Pacific Northwest.

Igor Alekseev is a Partner Solution Architect at AWS in Data and Analytics. Igor works with strategic partners helping them build complex, AWS-optimized architectures. Prior to joining AWS, as a Data/Solution Architect, he implemented many projects in Big Data, including several data lakes in the Hadoop ecosystem. As a Data Engineer, he was involved in applying AI/ML to fraud detection and office automation. Igor’s projects were in a variety of industries including communications, finance, public safety, manufacturing, and healthcare. Earlier, Igor worked as a full stack engineer/tech lead.

Naseer Ahmed is a Sr. Partner Solutions Architect at Databricks supporting its AWS business. Naseer specializes in Data Warehousing, Business Intelligence, App development, Container, Serverless, and Machine Learning Architectures on AWS. He was voted 2021 SME of the year at Databricks and is an avid crypto enthusiast.


Enable Amazon Kendra search for a scanned or image-based text document

Amazon Kendra is an intelligent search service powered by machine learning (ML). Amazon Kendra reimagines search for your websites and applications so your employees and customers can easily find the content they’re looking for, even when it’s scattered across multiple locations and content repositories within your organization.

Amazon Kendra supports a variety of document formats, such as Microsoft Word, PDF, and text. While working with a leading Edtech customer, we were asked to build an enterprise search solution that also utilizes images and PPT files. This post focuses on extending the document support in Amazon Kendra so you can preprocess text images and scanned documents (JPEG, PNG, or PDF format) to make them searchable. The solution combines Amazon Textract for document preprocessing and optical character recognition (OCR), and Amazon Kendra for intelligent search.

With the new Custom Document Enrichment feature in Amazon Kendra, you can now preprocess your documents during ingestion and augment your documents with new metadata. Custom Document Enrichment allows you to call external services like Amazon Comprehend, Amazon Textract, and Amazon Transcribe to extract text from images, transcribe audio, and analyze video. For more information about using Custom Document Enrichment, refer to Enrich your content and metadata to enhance your search experience with custom document enrichment in Amazon Kendra.

In this post, we propose an alternate method of preprocessing the content prior to calling the ingestion process in Amazon Kendra.

Solution overview

Amazon Textract is an ML service that automatically extracts text, handwriting, and data from scanned documents and goes beyond basic OCR to identify, understand, and extract data from forms and tables. Today, many companies manually extract data from scanned documents like PDFs, images, tables, and forms, or use basic OCR software that requires manual configuration and often needs to be reconfigured when the form changes.

To overcome these manual and expensive processes, Amazon Textract uses machine learning to read and process a wide range of documents, accurately extracting text, handwriting, tables, and other data without any manual effort. You can quickly automate document processing and take action on the information extracted, whether it’s automating loans processing or extracting information from invoices and receipts.

Amazon Kendra is an easy-to-use enterprise search service that allows you to add search capabilities to your applications so that end-users can easily find information stored in different data sources within your company. This could include invoices, business documents, technical manuals, sales reports, corporate glossaries, internal websites, and more. You can harvest this information from storage solutions like Amazon Simple Storage Service (Amazon S3) and OneDrive; applications such as Salesforce, SharePoint, and ServiceNow; or relational databases like Amazon Relational Database Service (Amazon RDS).

The proposed solution enables you to unlock the search potential in scanned documents, extending the ability of Amazon Kendra to find accurate answers in a wider range of document types. The workflow includes the following steps:

  1. Upload a document (or documents of various types) to Amazon S3.
  2. The event triggers an AWS Lambda function that uses the synchronous Amazon Textract API (DetectDocumentText).
  3. Amazon Textract reads the document in Amazon S3, extracts the text from it, and returns the extracted text to the Lambda function.
  4. The Lambda function saves the extracted text as a new text file in Amazon S3, and the data source that contains the new text file needs to be reindexed.
  5. When reindexing is complete, you can search the new dataset either via the Amazon Kendra console or API.

The following diagram illustrates the solution architecture.

In the following sections, we demonstrate how to configure the Lambda function, create the event trigger, process a document, and then reindex the data.

Configure the Lambda function

To configure your Lambda function, add the following code to the function Python editor:

import urllib.parse
import boto3

textract = boto3.client('textract')
def handler(event, context):
	source_bucket = event['Records'][0]['s3']['bucket']['name']
	object_key = urllib.parse.unquote_plus(event['Records'][0]['s3']['object']['key'])
	
	textract_result = textract.detect_document_text(
		Document={
			'S3Object': {
				'Bucket': source_bucket,
				'Name': object_key
			}
		})
	page=""
	blocks = [x for x in textract_result['Blocks'] if x['BlockType'] == "LINE"]
	for block in blocks:
		page += " " + block['Text']
        	
	print(page)
	s3 = boto3.resource('s3')
	object = s3.Object('demo-kendra-test', 'text/apollo11-summary.txt')
	object.put(Body=page)

We use the DetectDocumentText API to extract the text from an image (JPEG or PNG) stored in Amazon S3.

Create an event trigger at Amazon S3

In this step, we create an event trigger to start the Lambda function when a new document is uploaded to a specific bucket. The following screenshot shows our new function on the Amazon S3 console.

You can also verify the event trigger on the Lambda console.

Process a document

To test the process, we upload an image to the S3 folder that we defined for the S3 event trigger. We use the following sample image.

When the Lambda function is complete, we can go to the Amazon CloudWatch console to check the output. The following screenshot shows the extracted text, which confirms that the Lambda function ran successfully.

Reindex the data with Amazon Kendra

We can now reindex our data.

  1. On the Amazon Kendra console, under Data management in the navigation pane, choose Data sources.
  2. Select the data source demo-s3-datasource.
  3. Choose Sync now.

The sync state changes to Syncing - crawling.

When the sync is complete, the sync status changes to Succeeded and the sync state changes to Idle.
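If you prefer to start the sync from code instead of the console, the Amazon Kendra StartDataSourceSyncJob API can be used. The following is a minimal sketch; the index ID and data source ID are placeholders, and the FakeKendra class is a stand-in so that the sketch runs without an AWS account. In practice, you would pass boto3.client('kendra').

```python
def start_sync(kendra_client, index_id, data_source_id):
    """Start a sync job for one Amazon Kendra data source and return its execution ID."""
    response = kendra_client.start_data_source_sync_job(
        Id=data_source_id,
        IndexId=index_id,
    )
    return response["ExecutionId"]


class FakeKendra:
    """Stand-in for boto3.client('kendra') that records the request it receives."""

    def __init__(self):
        self.requests = []

    def start_data_source_sync_job(self, Id, IndexId):
        self.requests.append({"Id": Id, "IndexId": IndexId})
        return {"ExecutionId": "example-execution-id"}


kendra = FakeKendra()
execution_id = start_sync(kendra, index_id="my-index-id", data_source_id="my-data-source-id")
print(execution_id)
```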

Now we can go back to the search console and see our faceted search in action.

  1. In the navigation pane, choose Search console.

We added metadata for a few items; two of them are the ML algorithms XGBoost and BlazingText.

  2. Let’s try searching for Sagemaker.

Our search was successful, and we got a list of results. Let’s see what we have for facets.

  3. Expand Filter search results.

We have the category and tags facets that were part of our item metadata.

  4. Choose BlazingText to filter results just for that algorithm.
  5. Now let’s perform the search on newly uploaded image files. The following screenshot shows the search on new preprocessed documents.

Conclusion

This solution can help improve the effectiveness of search results and the search experience. You can use Amazon Textract to extract text from scanned images, add it as metadata, and later use it as facets to interact with the search results. This is just an illustration of how you can use AWS native services to create a differentiated search experience for your users. This also helps in unlocking the full potential of your knowledge assets.

For a deeper dive into what you can achieve by combining other AWS services with Amazon Kendra, refer to Make your audio and video files searchable using Amazon Transcribe and Amazon Kendra, Build an intelligent search solution with automated content enrichment, and other posts on the Amazon Kendra blog.


About the Author

Sanjay Tiwary is a Specialist Solutions Architect AI/ML. He spends his time working with strategic customers to define business requirements, provide L300 sessions around specific use cases, and design ML applications and services that are scalable, reliable, and performant. He has helped launch and scale the AI/ML powered Amazon SageMaker service and has implemented several proofs of concept using Amazon AI services. He has also developed the advanced analytics platform as a part of the digital transformation journey.


Interpret caller input using grammar slot types in Amazon Lex

Customer service calls require customer agents to have the customer’s account information to process the caller’s request. For example, to provide a status on an insurance claim, the support agent needs policy holder information such as the policy ID and claim number. Such information is often collected in the interactive voice response (IVR) flow at the beginning of a customer support call. IVR systems have typically used grammars based on the Speech Recognition Grammar Specification (SRGS) format to define rules and parse caller information (policy ID, claim number). You can now use the same grammars in Amazon Lex to collect information in a speech conversation. You can also provide semantic interpretation rules using ECMAScript tags within the grammar files. The grammar support in Amazon Lex provides granular control for collecting and postprocessing user input so you can manage an effective dialog.

In this post, we review the grammar support in Amazon Lex and author a sample grammar for use in an Amazon Connect contact flow.

Use grammars to collect information in a conversation

You can author the grammar as a slot type in Amazon Lex. First, you provide a set of rules in the SRGS format to interpret user input. As an optional second step, you can write an ECMA script that transforms the information collected in the dialog. Lastly, you store the grammar as an XML file in an Amazon Simple Storage Service (Amazon S3) bucket and reference the link in your bot definition. SRGS grammars are specifically designed for voice and DTMF modality. We use the following sample conversations to model our bot:

Conversation 1

IVR: Hello! How can I help you today?

User: I want to check my account balance.

IVR: Sure. Which account should I pull up?

User: Checking.

IVR: What is the account number?

User: 1111 2222 3333 4444

IVR: For verification purposes, what is your date of birth?

User: Jan 1st 2000.

IVR: Thank you. The balance on your checking account is $123.

Conversation 2

IVR: Hello! How can I help you today?

User: I want to check my account balance.

IVR: Sure. Which account should I pull up?

User: Savings.

IVR: What is the account number?

User: I want to talk to an agent.

IVR: Ok. Let me transfer the call. An agent should be able to help you with your request.

In the sample conversations, the IVR requests the account type, account number, and date of birth to process the caller’s requests. In this post, we review how to use the grammars to collect the information and postprocess it with ECMA scripts. The grammars for account ID and date cover multiple ways to provide the information. We also review the grammar in case the caller can’t provide the requested details (for example, their savings account number) and instead opts to speak with an agent.
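To make the idea of interpreting caller input concrete, the following Python sketch mimics, outside of Amazon Lex, the kind of normalization the account ID grammar in this post performs: it accepts digits as well as the spoken forms "oh" and "null" for zero, and requires exactly 16 digits. The function and variable names are hypothetical, and this is only an analogy; Amazon Lex evaluates SRGS grammars and ECMAScript tags itself and does not run Python.

```python
# Spoken words that the sketch treats as the digit zero.
WORD_TO_DIGIT = {"oh": "0", "null": "0"}


def interpret_account_number(utterance, length=16):
    """Return the account number as a digit string, or None if the input doesn't match."""
    digits = []
    for token in utterance.lower().split():
        if token in WORD_TO_DIGIT:
            digits.append(WORD_TO_DIGIT[token])
        elif token.isascii() and token.isdigit():
            digits.extend(token)  # a group such as "1111" contributes four digits
        else:
            return None  # anything else, such as a request for an agent
    return "".join(digits) if len(digits) == length else None


print(interpret_account_number("1111 2222 3333 4444"))
print(interpret_account_number("I want to talk to an agent"))
```

A non-matching utterance returns None here; in the grammar, that case is covered by a separate rule that routes the caller to an agent.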

Build an Amazon Lex chatbot with grammars

We build an Amazon Lex bot with intents to perform common retail banking functions such as checking account balance, transferring funds, and ordering checks. The CheckAccountBalance intent collects details such as account type, account ID, and date of birth, and provides the balance amount. We use a grammar slot type to collect the account ID and date of birth. If the caller doesn’t know the information or asks for an agent, the call is transferred to a human agent. Let’s review the grammar for the account ID:

<grammar version="1.0" xmlns="http://www.w3.org/2001/06/grammar" xml:lang="en-US" tag-format="semantics/1.0" root="captureAccount"><!-- Header definition for US language and the root rule "captureAccount" to start with-->

	<rule id="captureAccount" scope="public">
		<tag> out=""</tag>
		<one-of>
			<item><ruleref uri="#digit"/><tag>out += rules.digit.accountNumber</tag></item><!--Call the subrule to capture 16 digits--> 
			<item><ruleref uri="#agent"/><tag>out =rules.agent;</tag></item><!--Exit point to route the caller to an agent--> 
		</one-of>
	</rule>

	<rule id="digit" scope="public"> <!-- Capture digits from 0 to 9 -->
		<tag>out.accountNumber=""</tag>
		<item repeat="16"><!-- Repeat the rule exactly 16 times -->
			<one-of>
				<item>1<tag>out.accountNumber+=1;</tag></item>
				<item>2<tag>out.accountNumber+=2;</tag></item>
				<item>3<tag>out.accountNumber+=3;</tag></item>
				<item>4<tag>out.accountNumber+=4;</tag></item>
				<item>5<tag>out.accountNumber+=5;</tag></item>
				<item>6<tag>out.accountNumber+=6;</tag></item>
				<item>7<tag>out.accountNumber+=7;</tag></item>
				<item>8<tag>out.accountNumber+=8;</tag></item>
				<item>9<tag>out.accountNumber+=9;</tag></item>
				<item>0<tag>out.accountNumber+=0;</tag></item>
				<item>oh<tag>out.accountNumber+=0</tag></item>
				<item>null<tag>out.accountNumber+=0;</tag></item>
			</one-of>
		</item>
	</rule>
	
	<rule id="agent" scope="public"><!-- Exit point to talk to an agent-->
		<item>
			<item repeat="0-1">i</item>
			<item repeat="0-1">want to</item>
			<one-of>
				<item repeat="0-1">speak</item>
				<item repeat="0-1">talk</item>
			</one-of>
			<one-of>
				<item repeat="0-1">to an</item>
				<item repeat="0-1">with an</item>
			</one-of>
			<one-of>
				<item>agent<tag>out="agent"</tag></item>
				<item>employee<tag>out="agent"</tag></item>
			</one-of>
		</item>
    </rule>
</grammar>

The grammar has a root rule (captureAccount) that delegates to two subrules to parse user input. The first subrule (digit) interprets the digits provided by the caller. These digits are appended to the output via an ECMAScript tag variable (out). The second subrule (agent) manages the dialog if the caller wants to talk to an agent. In this case, the out tag is populated with the word agent. After the rules are parsed, the out tag carries either the account number (taken from rules.digit.accountNumber) or the string agent. The downstream business logic can now use the out tag to handle the call.
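
To make the two outcomes concrete, the following Python sketch imitates what the grammar produces for a given utterance. It is only an illustration of the rules above, not how Amazon Lex evaluates SRGS internally; the function name interpret_account_utterance and the choice to split grouped digits such as "1111" into single digits are assumptions made for this example.

```python
import re

# Tokens accepted by the "digit" rule, mapped to the digit that is appended.
DIGIT_TOKENS = {str(d): str(d) for d in range(10)}
DIGIT_TOKENS.update({"oh": "0", "null": "0"})

# Mirrors the "agent" rule: every part is optional except "agent" or "employee".
AGENT_PATTERN = re.compile(
    r"^(i )?(want to )?((speak|talk) )?((to|with) an )?(agent|employee)$"
)


def interpret_account_utterance(utterance):
    """Return a 16-digit account number, "agent", or None if no rule matches."""
    # Lowercase and drop punctuation so "agent." and "Agent" are treated alike.
    text = re.sub(r"[^\w\s]", "", utterance.lower()).strip()
    if AGENT_PATTERN.match(text):
        return "agent"

    digits = []
    for word in text.split():
        if word in DIGIT_TOKENS:
            digits.append(DIGIT_TOKENS[word])
        elif word.isascii() and word.isdigit():
            # Assumption: a group such as "1111" counts as four single digits.
            digits.extend(word)
        else:
            return None
    # The grammar repeats the digit item exactly 16 times.
    return "".join(digits) if len(digits) == 16 else None
```

With this sketch, the account number from Conversation 1 yields the concatenated digits, and the request from Conversation 2 yields the string agent. Anything else returns None, which corresponds to the grammar not matching.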

Deploy the sample Amazon Lex bot

To create the sample bot and add the grammars, perform the following steps. This creates an Amazon Lex bot called BankingBot, and two grammar slot types (accountNumber, dateOfBirth).

  1. Download the Amazon Lex bot.
  2. On the Amazon Lex console, choose Actions, then choose Import.
  3. Choose the file BankingBot.zip that you downloaded, and choose Import. In the IAM Permissions section, for Runtime role, choose Create a new role with basic Amazon Lex permissions.
  4. Choose the bot BankingBot on the Amazon Lex console.
  5. Download the XML files for accountNumber and dateOfBirth. (Note: In some browsers, you have to choose “Save the link” to download the XML files.)
  6. On the Amazon S3 console, upload the XML files.
  7. Navigate to the slot types on the Amazon Lex console, and choose the accountNumber slot type.
  8. In the slot type grammar, select the S3 bucket with the XML file and provide the object key. Choose Save slot type.
  9. Navigate to the slot types on the Amazon Lex console, and choose the dateOfBirth slot type.
  10. In the slot type grammar, select the S3 bucket with the XML file and provide the object key. Choose Save slot type.
  11. After the grammars are saved, choose Build.
  12. Download the supporting AWS Lambda function code and navigate to the AWS Lambda console.
  13. On the Create function page, select Author from scratch. Under Basic information, provide the function name BankingBotEnglish and the runtime Python 3.8.
  14. Choose Create function. In the Code source section, open lambda_function.py and delete the existing code. Open the downloaded code in a text editor, then copy and paste it into the empty lambda_function.py tab.
  15. Choose Deploy.
  16. Navigate to the Amazon Lex console and select BankingBot. Choose Deployment, then Aliases, followed by TestBotAlias.
  17. On the Aliases page, select Languages and navigate to English (US).
  18. For Source, select BankingBotEnglish. For Lambda version or alias, select $LATEST.
  19. On the Amazon Connect console, choose Contact flows.
  20. Download the contact flow to integrate with the Amazon Lex bot.
  21. In the Amazon Lex section, select your Amazon Lex bot and make it available for use in the Amazon Connect contact flows.
  22. Select the contact flow to load it into the application.
  23. Make sure the right bot is configured in the “Get Customer Input” block. Add a phone number to the contact flow.
  24. Choose a queue in the “Set working queue” block.
  25. Test the IVR flow by calling in to the phone number.
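
The Lambda function attached to the bot alias in the steps above is where the out value from the grammar gets acted on. The downloadable code is not reproduced here. The following is a minimal sketch of how such a function might branch on the result, using the Amazon Lex V2 Lambda event and response layout. The slot name accountNumber, the session attribute transferToAgent, the message text, and the assumption that the grammar output arrives as the slot's interpretedValue are all illustrative and may differ from the sample bot.

```python
def get_slot_value(event, slot_name):
    """Return the interpreted value of a slot, or None if it is not filled."""
    slots = event["sessionState"]["intent"].get("slots") or {}
    slot = slots.get(slot_name)
    if not slot or not slot.get("value"):
        return None
    return slot["value"].get("interpretedValue")


def close(event, state, message, extra_attributes=None):
    """Build a Lex V2 response that ends the intent with the given state."""
    attributes = dict(event["sessionState"].get("sessionAttributes") or {})
    attributes.update(extra_attributes or {})
    intent = dict(event["sessionState"]["intent"])
    intent["state"] = state
    return {
        "sessionState": {
            "sessionAttributes": attributes,
            "dialogAction": {"type": "Close"},
            "intent": intent,
        },
        "messages": [{"contentType": "PlainText", "content": message}],
    }


def lambda_handler(event, context):
    # Assumption: the slot backed by the account grammar is named "accountNumber".
    account = get_slot_value(event, "accountNumber")

    if account == "agent":
        # Hypothetical flag that a contact flow could check to route the call.
        return close(event, "Failed", "Ok. Let me transfer the call.",
                     {"transferToAgent": "true"})

    if account and len(account) == 16 and account.isdigit():
        # A real function would look up the balance here.
        return close(event, "Fulfilled", "Thank you. I found your account.")

    return close(event, "Failed", "Sorry, I could not read the account number.")
```

Keeping the agent check ahead of the digit check means the handover path never depends on the account lookup succeeding.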

Test the solution

You can call in to the Amazon Connect phone number and interact with the bot. You can also test the solution directly on the Amazon Lex V2 console using voice and DTMF.

Conclusion

Custom grammar slots provide the ability to collect different types of information in a conversation. You have the flexibility to capture transitions such as handover to an agent. Additionally, you can postprocess the information before running the business logic. You can enable grammar slot types via the Amazon Lex V2 console or AWS SDK. The capability is available in all AWS Regions where Amazon Lex operates in the English (Australia), English (UK), and English (US) locales.
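
For the SDK route, the following sketch shows one way the request for a grammar slot type could be assembled in Python and passed to the create_slot_type operation of the boto3 lexv2-models client. The helper names, bot ID, bucket, and object key are placeholders invented for this example; the sketch also assumes the grammar XML file has already been uploaded to Amazon S3.

```python
def grammar_slot_type_request(bot_id, locale_id, slot_type_name,
                              bucket, key, bot_version="DRAFT"):
    """Build the arguments for creating a slot type backed by an SRGS file in S3."""
    return {
        "botId": bot_id,
        "botVersion": bot_version,
        "localeId": locale_id,
        "slotTypeName": slot_type_name,
        "externalSourceSetting": {
            "grammarSlotTypeSetting": {
                "source": {"s3BucketName": bucket, "s3ObjectKey": key}
            }
        },
    }


def create_grammar_slot_type(request):
    """Send the request to Amazon Lex V2 (requires boto3 and AWS credentials)."""
    import boto3  # imported here so the request builder works without boto3

    client = boto3.client("lexv2-models")
    return client.create_slot_type(**request)


# Placeholder values for illustration only.
request = grammar_slot_type_request(
    bot_id="EXAMPLEBOTID",
    locale_id="en_US",
    slot_type_name="accountNumber",
    bucket="example-grammar-bucket",
    key="grammars/accountNumber.xml",
)
```

Calling create_grammar_slot_type(request) would then create the slot type in the draft version of the bot, after which the locale needs to be built as in the console steps.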

To learn more, refer to Using a custom grammar slot type. You can also view the Amazon Lex documentation for SRGS or ECMAScript for more information.


About the Authors

Kai Loreck is a professional services Amazon Connect consultant. He works on designing and implementing scalable customer experience solutions. In his spare time, he can be found playing sports, snowboarding, or hiking in the mountains.

Harshal Pimpalkhute is a Product Manager on the Amazon Lex team. He spends his time trying to get machines to engage (nicely) with humans.


Whitepaper: Machine Learning Best Practices in Healthcare and Life Sciences

For customers looking to implement a GxP-compliant environment on AWS for artificial intelligence (AI) and machine learning (ML) systems, we have released a new whitepaper: Machine Learning Best Practices in Healthcare and Life Sciences.

This whitepaper provides an overview of security and good ML compliance practices and guidance on building GxP-regulated AI/ML systems using AWS services. We cover the points raised by the FDA discussion paper and Good Machine Learning Practices (GMLP) while also drawing from AWS resources: the whitepaper GxP Systems on AWS and the Machine Learning Lens from the AWS Well-Architected Framework. The whitepaper was developed based on our experience with and feedback from AWS pharmaceutical and medical device customers, as well as AWS partners, who are currently using AWS services to develop ML models.

Healthcare and life sciences (HCLS) customers are adopting AWS AI and ML services faster than ever before, but they also face the following regulatory challenges during implementation:

  • Building a secure infrastructure that complies with stringent regulatory processes for working on the public cloud and aligning to the FDA framework for AI and ML.
  • Supporting AI/ML-enabled solutions for GxP workloads covering the following:
    • Reproducibility
    • Traceability
    • Data integrity
  • Monitoring ML models with respect to various changes to parameters and data.
  • Handling model uncertainty and confidence calibration.

In our whitepaper, you learn about the following topics:

  • How AWS approaches ML in a regulated environment and provides guidance on Good Machine Learning Practices using AWS services.
  • Our organizational approach to security and compliance that supports GxP requirements as part of the shared responsibility model.
  • How to reproduce the workflow steps, track model and dataset lineage, and establish model governance and traceability.
  • How to monitor and maintain data integrity and quality checks to detect drifts in data and model quality.
  • Security and compliance best practices for managing AI/ML models on AWS.
  • Various AWS services for managing ML models in a regulated environment.

AWS is dedicated to helping you successfully use AWS services in regulated life science environments to accelerate your research, development, and delivery of the next generation of medical, health, and wellness solutions.

Contact us with questions about using AWS services for AI/ML in GxP systems. To learn more about compliance in the cloud, visit AWS Compliance.


About the Authors

Susant Mallick is an industry specialist and digital evangelist in AWS’ Global Healthcare and Life Sciences practice. He has over 20 years of experience in the life science industry working with biopharmaceutical and medical device companies across the North America, APAC, and EMEA regions. He has built many digital health platform and patient engagement solutions using mobile apps, AI/ML, IoT, and other technologies for customers in various therapeutic areas. He holds a B.Tech degree in Electrical Engineering and an MBA in Finance. His thought leadership and industry expertise have earned many accolades in pharma industry forums.

Sai Sharanya Nalla is a Sr. Data Scientist at AWS Professional Services. She works with customers to develop and implement AI/ML and HPC solutions on AWS. In her spare time, she enjoys listening to podcasts and audiobooks, taking long walks, and engaging in outreach activities.
