The Truck Stops Here: How AI Is Creating a New Kind of Commercial Vehicle

For many, the term “autonomous vehicles” conjures up images of self-driving cars. Autonomy, however, is transforming much more than personal transportation.

Autonomous trucks are commercial vehicles that use AI to automate everything from shipping yard operations to long-haul deliveries. Due to industry pressures from rising delivery demand and driver shortages, as well as straightforward operational domains such as highways, these intelligent trucks may be the first autonomous vehicles to hit public roads at scale.

This technology uses long-range, high-resolution sensors, a range of deep neural networks and high-performance, energy-efficient compute to improve safety and efficiency for everyday logistics.

With the rise of e-commerce and next-day delivery, trucking plays an increasingly vital role in moving the world forward. Trucks transport more than 70 percent of all freight in the U.S. Experts estimate that most essential businesses, such as grocery stores and gas stations, would run out of supplies within days without these vehicles.

These trends come as driver shortages accelerate. The American Trucking Associations reports the industry has struggled with driver supply over the past 15 years. It estimates the industry could be short 160,000 drivers by 2028 if current trends continue. Additionally, limits on the number of hours drivers can work consecutively restrict operations.

Autonomous driving can help ease the strain of trucking demand, as well as increase efficiency, by operating around the clock with lower requirements for human labor. In fact, a recent pilot run by self-driving trucking startup TuSimple and the U.S. Postal Service showed that autonomous trucks repeatedly arrived ahead of schedule on hub-to-hub routes.

And with hub-to-hub autonomous trucks constrained to fenced-in areas or highways, most autonomous trucks don’t have to deal with the challenges of urban traffic and neighborhood driving, removing roadblocks to widespread deployment.

This groundbreaking development is possible in part due to centralized, high-performance compute such as the NVIDIA DRIVE platform. With the capability to process the redundant and diverse deep neural networks necessary to operate without human supervision, these vehicles are poised to revolutionize delivery and logistics in the years to come.

Scalable Solutions for the Long Haul

Autonomous driving is a scalable technology. The Society of Automotive Engineers (SAE) defines levels of autonomy ranging from assisted driving, where the driver is still in control (Level 2), to full self-driving, where no human supervision is required (Levels 4 and 5). AI compute must also be able to scale with the capabilities of self-driving software.

In addition, the system must be able to handle the harsh environments of trucking. The average truck driver travels 100,000 miles a year, compared with the average motorist, who drives about 13,500 miles a year.

NVIDIA DRIVE is the only solution that easily scales from level 2 AI-assisted driving to fully autonomous operation while being designed to withstand the wear and tear of long-haul trucking.

This versatility and durability are already in development today. Companies such as Locomation are leveraging the compute platform for platooning pilots, where one driver operates a lead truck while a fully autonomous follower truck drives in tandem. Truck manufacturer FAW and startup PlusAI are jointly developing a large-scale autonomous trucking fleet. TuSimple uses NVIDIA DRIVE in its fleet.

On the Open Road

Beyond improving current trucking practices, autonomous driving technology is opening up entirely new possibilities for the industry.

Volvo Group, one of the largest truck makers in the world, is using NVIDIA DRIVE to train, test and deploy self-driving AI vehicles, targeting public transport, freight transport, refuse and recycling collection, construction, mining, forestry and more.

It’s even envisioning cab-less operation within shipping yards and on industrial roads with the Vera pilot truck.

Self-driving truck startup Einride is also developing cab-less vehicles. It recently announced the next generation of its Pod trucks, powered by NVIDIA DRIVE AGX Orin. These futuristic electric haulers will be able to scale from closed-facility operation to fully autonomous driving on backroads and highways.

With high-performance, energy-efficient AI compute at the core, autonomous trucks will push the limits of what’s possible in delivery and logistics, transforming industries around the world.



Machine Learning for Computer Architecture

Posted by Amir Yazdanbakhsh, Research Scientist, Google Research

One of the key contributors to recent machine learning (ML) advancements is the development of custom accelerators, such as Google TPUs and Edge TPUs, which significantly increase available compute power, unlocking capabilities such as AlphaGo, RankBrain, WaveNets, and conversational agents. This increase can lead to improved performance in neural network training and inference, enabling new possibilities in a broad range of applications, such as vision, language understanding, and self-driving cars.

To sustain these advances, the hardware accelerator ecosystem must continue to innovate in architecture design and acclimate to rapidly evolving ML models and applications. This requires the evaluation of many different accelerator design points, each of which may not only improve the compute power, but also unlock a new capability. These design points are generally parameterized by a variety of hardware and software factors (e.g., memory capacity, number of compute units at different levels, parallelism, interconnection networks, pipelining, software mapping, etc.). This is a daunting optimization task, because the search space is exponentially large¹ while the objective function (e.g., lower latency and/or higher energy efficiency) is computationally expensive to evaluate through simulations or synthesis, making the identification of feasible accelerator configurations challenging.

In “Apollo: Transferable Architecture Exploration”, we present the progress of our research on ML-driven design of custom accelerators. While recent work has demonstrated promising results in leveraging ML to improve the low-level floorplanning process (in which the hardware components are spatially laid out and connected in silicon), in this work we focus on blending ML into the high-level system specification and architectural design stage, in which the design elements that control the chip's high-level functionality are established, and which is a pivotal contributor to the chip's overall performance. Our research shows how ML algorithms can facilitate architecture exploration and suggest high-performing architectures across a range of deep neural networks, with domains spanning image classification, object detection, OCR, and semantic segmentation.

Architecture Search Space and Workloads
The objective in architecture exploration is to discover a set of feasible accelerator parameters for a set of workloads, such that a desired objective function (e.g., the weighted average of runtime) is minimized under an optional set of user-defined constraints. However, the manifold of architecture search generally contains many points for which there is no feasible mapping from software to hardware. Some of these design points are known a priori and can be bypassed by formulating them as optimization constraints by the user (e.g., in the case of an area budget² constraint, the total memory size must not exceed a predefined limit). However, due to the interplay of the architecture and compiler and the complexity of the search space, some of the constraints may not be properly formulated into the optimization, and so the compiler may not find a feasible software mapping for the target hardware. These infeasible points are not easy to formulate in the optimization problem, and are generally unknown until the whole compiler pass is performed. As such, one of the main challenges for architecture exploration is to effectively sidestep the infeasible points for efficient exploration of the search space with a minimum number of cycle-accurate architecture simulations.
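As a rough illustration (not Apollo's actual interface), sidestepping infeasible points can be modeled as a wrapper that assigns zero reward whenever the compiler pass fails to find a feasible mapping. All names below are hypothetical stand-ins:

```python
# Hypothetical sketch: wrap an expensive evaluation so that infeasible
# design points (no valid software mapping) receive zero reward instead
# of breaking the optimizer. All names here are illustrative.

def evaluate_design(config, simulate, check_mapping):
    """Return the reward for `config`, or 0.0 if no feasible mapping exists."""
    if not check_mapping(config):        # compiler pass fails -> infeasible
        return 0.0                       # zero reward, as in the t-SNE plots below
    latency = simulate(config)           # expensive cycle-accurate simulation
    return 1.0 / latency                 # higher reward = lower latency

# Toy stand-ins for the real compiler and simulator:
feasible = lambda cfg: cfg["pe_memory_kb"] * cfg["num_pes"] <= 4096
latency_model = lambda cfg: 100.0 / cfg["num_pes"]

print(evaluate_design({"num_pes": 16, "pe_memory_kb": 128}, latency_model, feasible))
```

In the real setting, the infeasibility check is only known after running the whole compiler pass, which is exactly why naive sampling wastes so many expensive evaluations.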

The following figure shows the overall architecture search space of a target ML accelerator. The accelerator contains a 2D array of processing elements (PE), each of which performs a set of arithmetic computations in a single instruction multiple data (SIMD) manner. The main architectural components of each PE are processing cores that include multiple compute lanes for SIMD operations. Each PE has memory (PE Memory) shared across all of its compute cores, which is mainly used to store model activations, partial results, and outputs, while individual cores feature memory that is mainly used for storing model parameters. Each core has multiple compute lanes with multi-way multiply-accumulate (MAC) units. The results of model computations at each cycle are either stored back in the PE memory for further computation or are offloaded to DRAM.

Overview of the template-based ML accelerator used for architecture exploration.
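To make the parameterization concrete, here is a hypothetical sketch of such a template's knobs in Python; the actual parameter names and value ranges used in the study differ, but the point is how quickly even a handful of discrete choices multiply:

```python
from dataclasses import dataclass
from itertools import product

# Hypothetical parameterization of a template-based accelerator like the
# one described above; these knob names are illustrative, not Apollo's.
@dataclass(frozen=True)
class AcceleratorConfig:
    pe_rows: int          # height of the 2D PE array
    pe_cols: int          # width of the 2D PE array
    lanes_per_core: int   # SIMD compute lanes per core
    pe_memory_kb: int     # shared activation memory per PE
    core_memory_kb: int   # per-core parameter memory

# Even a few discrete choices per knob show how the design space grows:
choices = {
    "pe_rows": [2, 4, 8],
    "pe_cols": [2, 4, 8],
    "lanes_per_core": [4, 8],
    "pe_memory_kb": [64, 128, 256],
    "core_memory_kb": [8, 16],
}
space = [AcceleratorConfig(*p) for p in product(*choices.values())]
print(len(space))  # 3 * 3 * 2 * 3 * 2 = 108 configurations
```

With realistic value ranges per knob, this multiplication is what drives the search space toward the hundreds of millions of points noted in the footnote.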

Optimization Strategies
In this study, we explored four optimization strategies in the context of architecture exploration:

  1. Random: Samples the architecture search space uniformly at random.
  2. Vizier: Uses Bayesian optimization for the exploration in the search space in which the evaluation of the objective function is expensive (e.g., hardware simulation, which can take hours to complete). Using a collection of sampled points from the search space, the Bayesian optimization forms a surrogate function, usually represented by a Gaussian process, that approximates the manifold of the search space. Guided by the value of the surrogate function, the Bayesian optimization algorithm decides, in an exploration and exploitation trade-off, whether to sample more from the promising regions in the manifold (exploitation) or sample more from the unseen regions in the search space (exploration). Then, the optimization algorithm uses these newly sampled points and further updates the surrogate function to better model the target search space. Vizier uses expected improvement as its core acquisition function.
  3. Evolutionary: Performs evolutionary search using a population of k individuals, where the genome of each individual corresponds to a sequence of discretized accelerator configurations. New individuals are generated by selecting for each individual two parents from the population using tournament selection, recombining their genomes with some crossover rate, and mutating the recombined genome with some probability.
  4. Population-based black-box optimization (P3BO): Uses an ensemble of optimization methods, including evolutionary and model-based, which has been shown to increase sample-efficiency and robustness. The sampled data are exchanged between optimization methods in the ensemble, and optimizers are weighted by their performance history to generate new configurations. In our study, we use a variant of P3BO in which the hyper-parameters of the optimizers are dynamically updated using evolutionary search.
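The evolutionary strategy in item 3 (tournament selection, crossover, and mutation over discretized configuration sequences) can be sketched in a few lines. This is a toy illustration with a made-up objective, not the implementation used in the study:

```python
import random

# Minimal sketch of the evolutionary strategy described above:
# tournament selection, single-point crossover, and per-gene mutation.
random.seed(0)

GENOME_LEN, POP_SIZE, N_CHOICES = 8, 20, 4

def fitness(genome):                      # toy objective: prefer larger genes
    return sum(genome)

def tournament(pop, k=3):                 # pick best of k random individuals
    return max(random.sample(pop, k), key=fitness)

def crossover(a, b):                      # single-point recombination
    cut = random.randrange(1, GENOME_LEN)
    return a[:cut] + b[cut:]

def mutate(genome, rate=0.1):             # randomly perturb some genes
    return [random.randrange(N_CHOICES) if random.random() < rate else g
            for g in genome]

pop = [[random.randrange(N_CHOICES) for _ in range(GENOME_LEN)]
       for _ in range(POP_SIZE)]
for _ in range(30):                       # one generation per iteration
    pop = [mutate(crossover(tournament(pop), tournament(pop)))
           for _ in range(POP_SIZE)]
best = max(pop, key=fitness)
print(fitness(best))
```

In Apollo, each gene would encode one discretized hardware knob, and the fitness would come from the cycle-accurate simulator rather than a toy sum.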

Accelerator Search Space Embeddings
To better visualize the effectiveness of each optimization strategy in navigating the accelerator search space, we use t-distributed stochastic neighbor embedding (t-SNE) to map the explored configurations into a two-dimensional space across the optimization horizon. The objective (reward) for all the experiments is defined as the throughput (inference/second) per accelerator area. In the figures below, the x and y axes indicate the t-SNE components (embedding 1 and embedding 2) of the embedding space. The star and circular markers show the infeasible (zero reward) and feasible design points, respectively, with the size of the feasible points corresponding to their reward.

As expected, the random strategy searches the space in a uniformly distributed way and eventually finds very few feasible points in the design space.

Visualization presenting the t-SNE of the explored design points (~4K) by random optimization strategy (max reward = 0.96). The maximum reward points (red cross markers) are highlighted at the last frame of the animation.

Compared to the random sampling approach, the Vizier default optimization strategy strikes a good balance between exploring the search space and finding the design points with higher rewards (1.14 vs. 0.96). However, this approach tends to get stuck in infeasible regions and, while it does find a few points with the maximum reward (indicated by the red cross markers), it finds few feasible points during the last iterations of exploration.

As above, with the Vizier (default) optimization strategy (max reward = 1.14). The maximum reward points (red cross markers) are highlighted at the last frame of the animation.

The evolutionary optimization strategy, on the other hand, finds feasible solutions very early in the optimization and assembles clusters of feasible points around them. As such, this approach mostly navigates the feasible regions (the green circles) and efficiently sidesteps the infeasible points. In addition, the evolutionary search is able to find more design options with maximum reward (the red crosses). This diversity in the solutions with high reward provides flexibility to the designer in exploring various architectures with different design trade-offs.

As above, with the evolutionary optimization strategy (max reward = 1.10). The maximum reward points (red cross markers) are highlighted at the last frame of the animation.

Finally, the population-based optimization method (P3BO) explores the design space in a more targeted way (regions with high reward points) in order to find optimal solutions. The P3BO strategy finds design points with the highest reward even in search spaces with tighter constraints (e.g., a larger number of infeasible points), showing its effectiveness at navigating such spaces.

As above, with the P3BO optimization strategy (max reward = 1.13). The maximum reward points (red cross markers) are highlighted at the last frame of the animation.

Architecture Exploration under Different Design Constraints
We also studied the benefits of each optimization strategy under different area budget constraints: 6.8 mm², 5.8 mm², and 4.8 mm². The following violin plots show the full distribution of the maximum achievable reward at the end of optimization (after ten runs each with 4K trials) across the studied optimization strategies. The wider sections represent a higher probability of observing feasible architecture configurations at a given reward. This implies that we favor the optimization algorithm that yields increased width at the points with higher reward (higher performance).

The two top-performing optimization strategies for architecture exploration are evolutionary and P3BO, both in terms of delivering solutions with high reward and robustness across multiple runs. Looking into different design constraints, we observe that as one tightens the area budget constraint, the P3BO optimization strategy yields more high-performing solutions. For example, when the area budget constraint is set to 5.8 mm², P3BO finds design points with a reward (throughput / accelerator area) of 1.25, outperforming all the other optimization strategies. The same trend is observed when the area budget constraint is set to 4.8 mm²: a slightly better reward is found with more robustness (less variability) across multiple runs.

Violin plot showing the full distribution of the maximum achievable reward in ten runs across the optimization strategies after 4K trial evaluations under an area budget of 6.8 mm². The P3BO and Evolutionary algorithm yield larger numbers of high-performing designs (wider sections). The x and y axes indicate the studied optimization algorithms and the geometric mean of speedup (reward) over the baseline accelerator, respectively.
As above, under an area budget of 5.8 mm².
As above, under an area budget of 4.8 mm².

Conclusion
While Apollo presents the first step towards better understanding of accelerator design space and building more efficient hardware, inventing accelerators with new capabilities is still an uncharted territory and a new frontier. We believe that this research is an exciting path forward to further explore ML-driven techniques for architecture design and co-optimization (e.g., compiler, mapping, and scheduling) across the computing stack to invent efficient accelerators with new capabilities for the next generation of applications.

Acknowledgments
This work was performed by Amir Yazdanbakhsh, Christof Angermueller, and Berkin Akin. We would like to also thank Milad Hashemi, Kevin Swersky, James Laudon, Herman Schmit, Cliff Young, Yanqi Zhou, Albin Jones, Satrajit Chatterjee, Ravi Narayanaswami, Ray (I-Jui) Sung, Suyog Gupta, Kiran Seshadri, Suvinay Subramanian, Matthew Denton, and the Vizier team for their help and support.


¹ In our target accelerator, the total number of design points is around 5 × 10⁸.

² The chip area is approximately the sum of the total hardware components on the chip, including on-chip storage, processing engines, controllers, and I/O pins.


Using genetic algorithms on AWS for optimization problems

Machine learning (ML)-based solutions are capable of solving complex problems, from voice recognition to finding and identifying faces in video clips or photographs. Usually, these solutions use large amounts of training data, which results in a model that processes input data and produces numeric output that can be interpreted as a word, face, or classification category. For many types of problems, this approach works very well.

But what if you have a problem that doesn’t have training data available, or doesn’t fit within the concept of a classification or regression? For example, what if you need to find an optimal ordering for a given set of worker tasks with a given set of conditions and constraints? How do you solve that, especially if the number of tasks is very large?

This post describes genetic algorithms (GAs) and demonstrates how to use them on AWS. GAs are unsupervised ML algorithms used to solve general types of optimization problems, including:

  • Optimal data orderings – Examples include creating work schedules, determining the best order to perform a set of tasks, or finding an optimal path through an environment
  • Optimal data subsets – Examples include finding the best subset of products to include in a shipment, or determining which financial instruments to include in a portfolio
  • Optimal data combinations – Examples include finding an optimal strategy for a task that is composed of many components, where each component is a choice of one of many options

For many optimization problems, the number of potential solutions (good and bad) is very large, so GAs are often considered a type of search algorithm, where the goal is to efficiently search through a huge solution space. GAs are especially advantageous when the fitness landscape is complex and non-convex, so that classical optimization methods such as gradient descent are an ineffective means to find a global solution. Finally, GAs are often referred to as heuristic search algorithms because they don’t guarantee finding the absolute best solution, but they do have a high probability of finding a sufficiently good solution to the problem in a short amount of time.

GAs use concepts from evolution such as survival of the fittest, genetic crossover, and genetic mutation to solve problems. Rather than trying to create a single solution, those evolutionary concepts are applied to a population of different problem solutions, each of which is initially random. The population goes through a number of generations, evolving solutions through mechanisms like reproduction (crossover) and mutation. After a number of generations of evolution, the best solution found across all the generations is chosen as the final problem solution.

As a prerequisite to using a GA, you must be able to do the following:

  • Represent each potential solution in a data structure.
  • Evaluate that data structure and return a numeric fitness score that accurately reflects the solution quality. For example, imagine a fitness score that measures the total time to perform a set of tasks. In that case, the goal would be to minimize that fitness score in order to perform the tasks as quickly as possible.

Each member of the population has a different solution stored in its data structure, so the fitness function must return a score that can be used to compare two candidates against each other. That’s the “survival of the fittest” part of the algorithm—one candidate is evaluated as better than another, and that fitter candidate’s information is passed on to future generations.

One note about terminology: because many of the ideas behind a genetic algorithm come from the field of genetics, the data representation that each member of a population uses is sometimes called a genome. That’s simply another way to refer to the data used to represent a particular solution.

Use case: Finding an optimal route for a delivery van

As an example, let’s say that you work for a company that ships lots of packages all over the world, and your job is focused on the final step, which is delivering a package by truck or van to its final destination.

A given delivery vehicle might have up to 100 packages at the start of a day, so you’d like to calculate the shortest route to deliver all the packages and return the truck to the main warehouse when done. This is a version of a classic optimization problem called The Travelling Salesman Problem, originally formulated in 1930. In the following visualization of the problem, displayed as a top-down map of a section of a city, the warehouse is shown as a yellow dot, and each delivery stop is shown as a red dot.


To keep things simple for this demonstration, we assume that when traveling from one delivery stop to another, there are no one-way roads. Under this assumption, the total distance traveled from one stop to the next is the absolute difference in X coordinates plus the absolute difference in Y coordinates, sometimes called the Manhattan distance.

If the problem had a slightly different form (like traveling via airplane rather than driving through city streets), we might instead calculate the straight-line Euclidean distance using the Pythagorean theorem: the square root of the squared difference in X coordinates plus the squared difference in Y coordinates. For this use case, however, we stick with the sum of the coordinate differences, because that matches how a truck travels through a grid of streets to deliver the packages, assuming two-way streets.
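The two distance measures can be written as small helper functions, sketched here in plain Python:

```python
import math

# The two distance measures discussed above: the Manhattan (taxicab)
# distance suits a truck on a street grid, while the Euclidean distance
# would suit straight-line (air) travel.
def manhattan(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def euclidean(a, b):
    return math.sqrt((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2)

print(manhattan((0, 0), (3, 4)))  # 7 blocks driven on a street grid
print(euclidean((0, 0), (3, 4)))  # 5.0 as the crow flies
```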

Next, let’s get a sense of how challenging this problem is. In other words, how many possible routes are there with 100 stops where you visit each stop only once? In this case, the math is simple: there are 100 possible first stops multiplied by 99 possible second stops, multiplied by 98 possible third stops, and so on—100 factorial (100!), in other words. That’s 9.3 × 10¹⁵⁷ possibilities, which definitely counts as a large solution space and rules out any thoughts of using a brute force approach. After all, with that volume of potential solutions, there really is no way to iterate through all the possible solutions in any reasonable amount of time.
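You can verify the size of that solution space with a couple of lines of Python:

```python
import math

# The size of the route search space discussed above: 100 stops,
# each visited exactly once, gives 100! possible orderings.
routes = math.factorial(100)
print(len(str(routes)))  # 158 digits, i.e. about 9.3 x 10^157
```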

Given that, it seems that a GA could be a good approach for this problem, because GAs are effective at finding good-quality solutions within very large solution spaces. Let’s develop a GA to see how that works.

Representation and a fitness function

As mentioned earlier, the first step in writing a GA is to determine the data structure for a solution. Suppose that we have a list of all 100 destinations with their associated locations. A useful data representation for this problem is to have each candidate store a list of 100 location indexes that represent the order the delivery van must visit each location. The X and Y coordinates found in the lookup table could be latitude and longitude coordinates or other real-world data.


To implement our package delivery solution, we use a Python script, although almost any modern computer language like Java or C# works well. Open-source packages like inspyred also create the general structure of a GA, allowing you to focus on just the parts that vary from project to project. However, for the purposes of introducing the ideas behind a GA, we write the code without relying on third-party libraries.

As a first step, we represent a potential solution as the following code:

class CandidateSolution(object):
    def __init__(self):
        self.fitness_score = 0
        num_stops = len(delivery_stop_locations) # a list of (X,Y) tuples
        self.path = list(range(num_stops))
        random.shuffle(self.path)

The class has a fitness_score field and a path field. The path is a list of indexes into delivery_stop_locations, which is a list of (X,Y) coordinates for each delivery stop. That list is loaded from a database elsewhere in the code. We also use random.shuffle(), which ensures that each potential solution is a randomly shuffled list of indexes into the delivery_stop_locations list. GAs always start with a population of completely random solutions, and then rely on evolution to home in on the best solution possible.
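Building the generation 0 population from this class might look like the following; the stop coordinates here are made-up stand-ins for the data loaded from the database elsewhere in the code:

```python
import random

# Stand-in for the delivery_stop_locations list loaded from the database.
delivery_stop_locations = [(random.random() * 100, random.random() * 100)
                           for _ in range(100)]

class CandidateSolution(object):
    def __init__(self):
        self.fitness_score = 0
        num_stops = len(delivery_stop_locations)
        self.path = list(range(num_stops))
        random.shuffle(self.path)   # generation 0 is completely random

# Generation 0: a population of random route orderings. The population
# size is a tunable choice, not a value prescribed by the post.
POPULATION_SIZE = 500
population = [CandidateSolution() for _ in range(POPULATION_SIZE)]
print(sorted(population[0].path) == list(range(100)))  # each stop appears once
```

Note that shuffling a list of indexes guarantees every candidate visits each stop exactly once, which is exactly the constraint a delivery route must satisfy.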

With this data structure, the fitness function is straightforward. We start at the warehouse, then travel to the first location in our list, then the second location, and so on until we’ve visited all delivery stop locations, and then we return to the warehouse. In the end, the fitness function simply totals the distance traveled over that entire trip. The goal of this GA is to minimize that distance, so the smaller the fitness score, the better the solution. We use the following code to implement the fitness function:

def dist(location_a, location_b):
    xdiff = abs(location_a['X'] - location_b['X'])
    ydiff = abs(location_a['Y'] - location_b['Y'])
    return xdiff + ydiff

def calc_score_for_candidate(candidate):
    # start with the distance from the warehouse to the first stop
    warehouse_location = {'X': STARTING_WAREHOUSE_X, 'Y': STARTING_WAREHOUSE_Y}
    total_distance = dist(warehouse_location, delivery_stop_locations[candidate.path[0]])

    # then travel to each stop
    for i in range(len(candidate.path) - 1):
        total_distance += dist(
            delivery_stop_locations[candidate.path[i]],
            delivery_stop_locations[candidate.path[i + 1]])

    # then travel back to the warehouse
    total_distance += dist(warehouse_location, delivery_stop_locations[candidate.path[-1]])
    return total_distance

Now that we have a representation and a fitness function, let’s look at the overall flow of a genetic algorithm.

Program flow for a genetic algorithm

When you have a data representation and a fitness function, you’re ready to create the rest of the GA. The standard program flow includes the following pseudocode:

  1. Generation 0 – Initialize the entire population with completely random solutions.
  2. Fitness – Calculate the fitness score for each member of the population.
  3. Completion check – Take one of the following actions:
    1. If the best fitness score found in the current generation is better than any seen before, save it as a potential solution.
    2. If you go through a certain number of generations without any improvement (no better solution has been found), then exit this loop, returning the best found to date.
  4. Elitism – Create a new generation, initially empty. Take a small percentage (like 5%) of the best-scoring candidates from the current generation and copy them unchanged into the new generation.
  5. Selection and crossover – To populate the remainder of the new generation, repeatedly select two good candidate solutions from the current generation and combine them to form a new child candidate that gets added to the next generation.
    1. Mutation – On rare occasions (like 2%, for example), mutate a newly created child candidate by randomly perturbing its data.
  6. Replace the current generation with the next generation and return to step 2.
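Putting the pseudocode together, a compact sketch of that flow might look like the following. To keep it self-contained, it is applied to a toy maximization problem (count the 1s in a bit string) rather than the delivery route, with rates chosen to mirror the percentages above:

```python
import random

# Sketch of the GA program flow above on a toy problem; illustrative only.
random.seed(1)
GENOME_LEN, POP_SIZE = 32, 60
ELITE_FRAC, MUTATION_RATE, PATIENCE = 0.05, 0.02, 20

def fitness(g):                                  # step 2: score a candidate
    return sum(g)

def select(pop):                                 # tournament of 3
    return max(random.sample(pop, 3), key=fitness)

def crossover(a, b):                             # single-point crossover
    cut = random.randrange(1, GENOME_LEN)
    return a[:cut] + b[cut:]

def mutate(g):                                   # rare random perturbation
    return [1 - bit if random.random() < MUTATION_RATE else bit for bit in g]

population = [[random.randrange(2) for _ in range(GENOME_LEN)]
              for _ in range(POP_SIZE)]          # step 1: random generation 0
best, stale = None, 0
while stale < PATIENCE:                          # step 3: completion check
    top = max(population, key=fitness)
    if best is None or fitness(top) > fitness(best):
        best, stale = top, 0                     # new all-time best found
    else:
        stale += 1                               # no improvement this round
    elites = sorted(population, key=fitness, reverse=True)
    nxt = elites[:int(ELITE_FRAC * POP_SIZE)]    # step 4: elitism
    while len(nxt) < POP_SIZE:                   # step 5: selection/crossover
        nxt.append(mutate(crossover(select(population), select(population))))
    population = nxt                             # step 6: replace generation
print(fitness(best))   # good solutions approach GENOME_LEN (all 1s)
```

The delivery GA inverts the comparison (lower distance is better), but the loop structure is the same.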

When the algorithm exits the main loop, the best solution found during that run is used as the problem’s final solution. However, it’s important to realize that because there is so much randomness in a GA—from the initially completely random candidates to randomized selection, crossover, and mutation—each time you run a GA, you almost certainly get a different result. Because of that randomness, a best practice when using a GA is to run it multiple times to solve the same problem, keeping the very best solutions found across all the runs.

Using a genetic algorithm on AWS via Amazon SageMaker Processing

Due to the inherent randomness that comes with a GA, it’s usually a good idea to run the code multiple times, using the best result found across those runs. This can be accomplished using Amazon SageMaker Processing, which is an Amazon SageMaker managed service for running data processing workloads. In this case, we use it to launch the GA so that multiple instances of the code run in parallel.

Before we start, we need to set up a couple of AWS resources that our project needs, like database tables to store the delivery stop locations and GA results, and an AWS Identity and Access Management (IAM) role to run the GA. Use the AWS CloudFormation template included in the associated GitHub repo to create these resources, and make a note of the resulting ARN of the IAM role. Detailed instructions are included in the README file found in the GitHub repo.

After you create the required resources, populate the Amazon DynamoDB table DeliveryStops (indicating the coordinates for each delivery stop) using the Python script create_delivery_stops.py, which is included in the code repo. You can run this code from a SageMaker notebook or directly from a desktop computer, assuming you have Python and Boto3 installed. See the README in the repo for detailed instructions on running this code.

We use DynamoDB for storing the delivery stops and the results. DynamoDB is a reasonable choice for this use case because it’s highly scalable and reliable, and doesn’t require any maintenance due to it being a fully managed service. DynamoDB can handle more than 10 trillion requests per day and can support peaks of more than 20 million requests per second, although this use case doesn’t require anywhere near that kind of volume.
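For illustration only, a DeliveryStops item can be pictured as a small key/coordinate record; the attribute names below are hypothetical, since the actual schema is defined by create_delivery_stops.py in the repo:

```python
import random

# Hypothetical shape of a DeliveryStops item; the real attribute names
# come from create_delivery_stops.py in the code repo.
def make_stop_item(stop_id, x, y):
    return {"StopId": stop_id, "X": x, "Y": y}

items = [make_stop_item(i, random.randrange(100), random.randrange(100))
         for i in range(100)]

# With boto3 installed and AWS credentials configured, each item would be
# written with something along the lines of:
#   boto3.resource("dynamodb").Table("DeliveryStops").put_item(Item=item)
print(len(items))
```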

After you create the IAM role and DynamoDB tables, we’re ready to set up the GA code and run it using SageMaker.

  1. To start, create a notebook in SageMaker.

Be sure to use a notebook instance rather than SageMaker Studio, because we need a kernel with Docker installed.

To use SageMaker Processing, we first need to create a Docker image that we use to provide a runtime environment for the GA.

  2. Upload Dockerfile and genetic_algorithm.py from the code repo into the root folder for your Jupyter notebook instance.
  3. Open Dockerfile and ensure that the ENV AWS_DEFAULT_REGION line refers to the AWS Region that you’re using.

The default Region in the file from the repo is us-east-2, but you can use any Region you wish.

  1. Create a cell in your notebook and enter the following code:
    import boto3
    
    print("Building container...")
    
    region = boto3.session.Session().region_name
    account_id = boto3.client('sts').get_caller_identity().get('Account')
    ecr_repository = 'sagemaker-processing-container-for-ga'
    tag = ':latest'
    base_uri = '{}.dkr.ecr.{}.amazonaws.com'.format(account_id, region)
    repo_uri = '{}/{}'.format(base_uri, ecr_repository + tag)
    
    # Create ECR repository and push docker image
    !docker build -t $ecr_repository docker
    !aws ecr get-login-password --region $region | docker login --username AWS --password-stdin $base_uri
    !aws ecr create-repository --repository-name $ecr_repository
    !docker tag {ecr_repository + tag} $repo_uri
    !docker push $repo_uri
    
    print("Container Build done")
    
    iam_role = 'ARN_FOR_THE_IAM_ROLE_CREATED_EARLIER'
    

Be sure to fill in the iam_role ARN, which is displayed on the Outputs page of the CloudFormation stack that you created earlier. You can also change the name of the Docker image if you wish, although the default value of sagemaker-processing-container-for-ga is reasonable.

Running that cell creates a Docker image that supports Python with the Boto3 package installed, and then registers it with Amazon Elastic Container Registry (Amazon ECR), a fully managed Docker registry that handles everything required to scale and manage the storage of Docker images.

Add a new cell to your notebook and enter and run the following code:

from sagemaker.processing import ScriptProcessor

processor = ScriptProcessor(image_uri=repo_uri,
     role=iam_role,
     command=['python3'],
     instance_count=1,
     instance_type="ml.m5.xlarge")

processor.run(code='./genetic_algorithm.py')

This image shows the job launched, and the results displayed below as the GA does its processing:

The ScriptProcessor class is used to create a container that the GA code runs in. We don’t include the code for the GA in the container itself because the ScriptProcessor class is designed to be used as a generic container (preloaded with all required software packages), and the run command chooses a Python file to run within that container. Although the GA Python code is located on your notebook instance, SageMaker Processing copies it to an Amazon Simple Storage Service (Amazon S3) bucket in your account so that it can be referenced by the processing job. Because of that, the IAM role we use must include a read-only permission policy for Amazon S3, along with other required permissions related to services like DynamoDB and Amazon ECR.

Calculating fitness scores can and should be done in parallel, because fitness calculations tend to be fairly slow and each candidate solution is independent of all the others. The GA code for this demonstration uses multiprocessing to calculate multiple fitness scores at the same time, which dramatically increases the speed at which the GA runs. We also specify the instance type in the ScriptProcessor constructor. In this case, we chose ml.m5.xlarge in order to use a processor with 4 vCPUs. Choosing an instance type with more vCPUs makes each run of the GA faster, at a higher price per hour. There is no benefit to using an instance type with GPUs for a GA, because all of the work is done on the CPU.

Finally, the ScriptProcessor constructor also specifies the number of instances to run. If you specify a number of instances greater than 1, the same code runs in parallel, which is exactly what we want for a GA. Each instance is a complete run of the GA, run in its own container. Because each instance is completely self-contained, we can run multiple instances at once, and each instance does its calculations and writes its results into the DynamoDB results table.

To review, we’re using two different forms of parallelism for the GA: one is through running multiple instances at once (one per container), and the other is through having each container instance use multiprocessing in order to effectively calculate fitness scores for multiple candidates at the same time.
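The in-container half of that parallelism can be sketched as follows; the fitness function here is a simple Euclidean path-length stand-in for illustration, not the repo’s exact implementation:

```python
import math
from multiprocessing import Pool

def path_length(path):
    """Fitness score: total Euclidean distance of a closed route through (x, y) stops."""
    return sum(math.dist(path[i], path[(i + 1) % len(path)])
               for i in range(len(path)))

def score_population(population, processes=4):
    """Score every candidate in parallel, ideally one worker per vCPU."""
    with Pool(processes=processes) as pool:
        return pool.map(path_length, population)
```

On an ml.m5.xlarge instance, processes=4 matches its 4 vCPUs, so all cores stay busy while fitness scores are computed.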

The following diagram illustrates the overall architecture of this approach.

The Docker image defines the runtime environment, which is stored in Amazon ECR. That image is combined with a Python script that runs the GA, and SageMaker Processing uses one or more containers to run the code. Each instance reads configuration data from DynamoDB and writes results into DynamoDB.

Genetic operations

Now that we know how to run a GA using SageMaker, let’s dive a little deeper into how we can apply a GA to our delivery problem.

Selection

When we select two parents for crossover, we want a balance between good quality and randomness, which can be thought of as genetic diversity. If we only pick candidates with the best fitness scores, we miss candidates that have elements that might eventually help find a great solution, even though the candidate’s current fitness score isn’t the best. On the other hand, if we completely ignore quality when selecting parents, the evolutionary process doesn’t work very well—we’re ignoring survival of the fittest.

There are a number of approaches for selection, but the simplest is called tournament selection. With a tournament of size 2, you randomly select two candidates from the population and keep the best one. The same applies to a tournament of size 3 or more—you simply use the one with the best fitness score. The larger the number you use, the better quality candidate you get, but at a cost of reduced genetic diversity.

The following code shows the implementation of tournament selection:

def tourney_select(population):
    selected = random.sample(population, TOURNEY_SIZE)
    best = min(selected, key=lambda c: c.fitness_score)
    return best

def select_parents(population):
    # using Tourney selection, get two candidates and make sure they're distinct
    while True:
        candidate1 = tourney_select(population)
        candidate2 = tourney_select(population)
        if candidate1 != candidate2:
            break
    return candidate1, candidate2

Crossover

After we select two candidates, how can we combine them to form one or two children? If both parents are simply lists of numbers and we can’t duplicate or leave out any numbers from the list, combining the two can be challenging.

One approach is called partially mapped crossover. It works as follows:

  1. Copy each parent, creating two children.
  2. Randomly select a starting and ending point for crossover within the genome. We use the same starting and ending points for both children.
  3. For each child, iterate from the starting crossover point to the ending crossover point and perform the following actions on each gene in the child at the current point:
    1. Find the corresponding gene in the other parent (the one that wasn’t copied into the current child), using the same crossover point. If that gene matches what’s already in the child at that point, continue to the next point, because no crossover is required for the gene.
    2. Otherwise, find the gene from the alternate parent and swap it with the current gene within the child.

The following diagram illustrates the first step, making copies of both parents.

Each child is crossed over with the alternate parent. The following diagram shows the randomly selected start and end points, with the thick arrow indicating which gene is crossed over next.

In the first swap position, the parent contributes the value 8. Because the current gene value in the child is 4, the 4 and 8 are swapped within the child.

That swap has the effect of taking the gene with value 8 from the parent and placing it within the child at the corresponding position. When the swap is complete, the large arrow moves to the next gene to cross over.

At this point, the sequence is repeated. In this case, both gene values in the current position are the same (6), so the crossover position advances to the next position.

The gene value from the parent is 7 in this case, so the swap occurs within the child.

The following diagram shows the final result, with the arrows indicating how the genes were crossed over.

Crossover isn’t a mandatory step, and most GAs use a crossover rate parameter to control how often crossover happens. If two parents are selected but crossover isn’t used, both parents are copied unchanged into the next generation.

We used the following code for the crossover in this solution:

def crossover_parents_to_create_children(parent_one, parent_two):
    child1 = copy.deepcopy(parent_one)
    child2 = copy.deepcopy(parent_two)

    # sometimes we don't cross over, so use copies of the parents
    if random.random() >= CROSSOVER_RATE:
        return child1, child2

    num_genes = len(parent_one.path)
    start_cross_at = random.randint(0, num_genes - 2)  # pick a point between 0 and the end - 2, so we can cross at least 1 stop
    num_remaining = num_genes - start_cross_at
    end_cross_at = random.randint(num_genes - num_remaining + 1, num_genes - 1)

    for index in range(start_cross_at, end_cross_at + 1):
        child1_stop = child1.path[index]
        child2_stop = child2.path[index]

        # if the same, skip it since there is no crossover needed at this gene
        if child1_stop == child2_stop:
            continue

        # find within child1 and swap
        first_found_at = child1.path.index(child1_stop)
        second_found_at = child1.path.index(child2_stop)
        child1.path[first_found_at], child1.path[second_found_at] = child1.path[second_found_at], child1.path[first_found_at]

        # and the same for the second child
        first_found_at = child2.path.index(child1_stop)
        second_found_at = child2.path.index(child2_stop)
        child2.path[first_found_at], child2.path[second_found_at] = child2.path[second_found_at], child2.path[first_found_at]

    return child1, child2

Mutation

Mutation is a way to add genetic diversity to a GA, which is often desirable. However, too much mutation causes the GA to lose its way, so it’s best to use it in moderation if it’s needed at all.

You can approach mutation for this problem in two different ways: swapping and displacement.

A swap mutation is just what it sounds like—two randomly selected locations (genes) are swapped within a genome (see the following diagram).

The following code performs the swap:

def swap_mutation(candidate):
    indexes = range(len(candidate.path))
    pos1, pos2 = random.sample(indexes, 2)
    candidate.path[pos1], candidate.path[pos2] = candidate.path[pos2], candidate.path[pos1]

A displacement mutation randomly selects a gene, randomly selects an insertion point, and moves the selected gene into the selected insertion point, shifting other genes as needed to make space (see the following diagram).

The following code performs the displacement:

def displacement_mutation(candidate):
    num_stops = len(candidate.path)
    stop_to_move = random.randint(0, num_stops - 1)
    insert_at = random.randint(0, num_stops - 1)
    # make sure it's moved to a new index within the path, so it's really different
    while insert_at == stop_to_move:
        insert_at = random.randint(0, num_stops - 1)
    stop_index = candidate.path[stop_to_move]
    del candidate.path[stop_to_move]
    candidate.path.insert(insert_at, stop_index)

Elitism

An optional part of any GA is elitism, which is done when populating a new generation of candidates. When used, elitism copies a certain percentage of the best-scoring candidates from the current generation into the next generation. Elitism is a method for ensuring that the very best candidates always remain in the population. See the following code:

num_elites = int(ELITISM_RATE * POPULATION_SIZE)
current_generation.sort(key=lambda c: c.fitness_score)
next_generation = [current_generation[i] for i in range(num_elites)]

Results

It’s helpful to compare the results from our GA to those from a baseline algorithm. One common non-GA approach to solving this problem is known as the Nearest Neighbor algorithm, which you can apply in this manner:

  1. Set our current location to be the warehouse.
  2. While there are unvisited delivery stops, perform the following:
    1. Find the unvisited delivery stop that is closest to our current location.
    2. Move to that stop, making it the current location.
  3. Return to the warehouse.
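The steps above can be sketched in a few lines of Python; the (x, y) coordinates and Euclidean distance metric here are illustrative assumptions:

```python
import math

def nearest_neighbor_route(warehouse, stops):
    """Greedy route: repeatedly visit the closest unvisited stop, then return home."""
    route = [warehouse]
    unvisited = list(stops)
    current = warehouse
    while unvisited:
        # Find the unvisited stop closest to the current location
        nearest = min(unvisited, key=lambda s: math.dist(current, s))
        unvisited.remove(nearest)
        route.append(nearest)
        current = nearest
    route.append(warehouse)  # head back to the warehouse
    return route
```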

The following table summarizes the head-to-head results, using varying numbers of stops.

# Delivery Stops    Nearest Neighbor Distance    Genetic Algorithm Distance
10                  142                          124
25                  202                          170
50                  268                          252
75                  370                          318
100                 346                          368

The Nearest Neighbor algorithm performs well in situations where many locations are clustered tightly together, but can perform poorly when dealing with locations that are more widely distributed. The path calculated for 75 delivery stops is significantly longer than the path calculated for 100 delivery stops—this is an example of how the results can vary widely depending on the data. We need a deeper statistical analysis using a broader set of sample data to thoroughly compare the results of the two algorithms.

On the other hand, for the majority of test cases, the GA solution finds the shorter path, even though it could admittedly be improved with tuning. Like other ML methodologies, genetic algorithms benefit from hyperparameter tuning. The following table summarizes the hyperparameters used in our runs, and we could further tune them to improve the GA’s performance.

Hyperparameter Value Used
Population size 5,000
Crossover rate 50%
Mutation rate 10%
Mutation method 50/50 split between swap and displacement
Elitism rate 10%
Tournament size 2
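Expressed as module-level constants, the table above maps onto the names used in the earlier snippets (POPULATION_SIZE, CROSSOVER_RATE, ELITISM_RATE, and TOURNEY_SIZE appear in the code shown; MUTATION_RATE is an assumed name, since the mutation code shown doesn’t reference its rate constant):

```python
POPULATION_SIZE = 5000   # candidates per generation
CROSSOVER_RATE = 0.50    # fraction of selected pairs that are crossed over
MUTATION_RATE = 0.10     # chance that a child is mutated (assumed constant name)
ELITISM_RATE = 0.10      # top 10% copied unchanged into the next generation
TOURNEY_SIZE = 2         # candidates drawn per tournament selection

# The mutation method is a 50/50 coin flip between swap and displacement,
# decided at mutation time. Elitism then keeps the best candidates:
num_elites = int(ELITISM_RATE * POPULATION_SIZE)
```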

Conclusion and resources

Genetic algorithms are a powerful tool to solve optimization problems, and running them using SageMaker Processing allows you to leverage the power of multiple containers at once. Additionally, you can select instance types that have useful characteristics, like multiple virtual CPUs to optimize running jobs.

If you’d like to learn more about GAs, see Genetic algorithm on Wikipedia, which contains a number of useful links. Although several GA frameworks exist, the code for a GA tends to be relatively simple (because there’s very little math) and you may be able to write the code yourself, or use the accompanying code in our GitHub repo, which includes the CloudFormation template that creates the required AWS infrastructure. Be sure to shut down the CloudFormation stack when you’re done, in order to avoid running up charges.

Although optimization problems are relatively rare compared to other ML applications like classification or regression, when you need to solve one, a genetic algorithm is usually a good option, and SageMaker Processing makes it easy.


About the Author

Greg Sommerville is a Prototyping Architect on the AWS Envision Engineering Americas Prototyping team, where he helps AWS customers implement innovative solutions to challenging problems with machine learning, IoT and serverless technologies. He lives in Ann Arbor, Michigan and enjoys practicing yoga, catering to his dogs, and playing poker.


Q&A with Clemson University’s Bart Knijnenburg, research award recipient for improving ad experiences

In this monthly interview series, we turn the spotlight on members of the academic community and the important research they do — as partners, collaborators, consultants, or independent contributors.

For February, we nominated Bart Knijnenburg, assistant professor at Clemson University. Knijnenburg is a 2019 UX-sponsored research award recipient in improving ad experiences, whose resulting research was nominated for Best Paper at the 54th Hawaii International Conference on System Sciences (HICSS). Knijnenburg has also been involved in the Facebook Fellowship Program as the adviser of two program alumni, Moses Namara and Daricia Wilkinson.

In this Q&A, Knijnenburg describes the work he does at Clemson, including his recently nominated research in improving ad experiences. He also tells us what inspired this research, what the results were, and where people can learn more.

Q: Tell us about your role at Clemson and the type of research you and your department specialize in.

Bart Knijnenburg: I am an assistant professor in the Human-Centered Computing division of the Clemson University School of Computing. Our division studies the human aspects of computing through user-centered design and user experiments, with faculty members who study virtual environments, online communities, adaptive user experiences, etc. My personal interest lies in helping people make better decisions online through adaptive consumer decision support. Within this broad area, I have specialized in usable recommender systems and privacy decision-making.

In the area of recommender systems, I focus on usable mechanisms for users of such systems to input their preferences, and novel means to display and explain the resulting recommendations to users. An important goal I have in this area is to build systems that don’t just show users items that reflect their preferences, but help users better understand what their preferences are to begin with — systems I call “recommender systems for self-actualization.”

In the area of privacy decision-making, I focus on systems that actively assist consumers in their privacy decision-making practices — a concept I have dubbed “user-tailored privacy.” These systems should help users translate their privacy preferences into settings, thereby reducing the users’ burden of control while at the same time respecting their inherent privacy preferences.

Q: What inspired you to pursue your recent research project in improving ad experiences?

BK: Despite recent efforts to improve the user experience around online ads, there is a rise of distrust and skepticism around the collection and use of personal data for advertising purposes. There are a number of reasons for this distrust, including a lack of transparency and control. This lack of transparency and control not only generates mistrust, but also makes it more likely that the user models created by ad personalization algorithms reflect users’ immediate desires rather than their longer-term goals. The presented ads, in turn, tend to reflect these short-term likes, ignoring users’ ambitions and their better selves.

As someone who has worked extensively on transparency and control in both the field of recommender systems and the field of privacy, I am excited to apply this work to the area of ad experiences. In this project, my team therefore aims to design, build, and evaluate intuitive explanations of the ad recommendation process and interaction mechanisms that allow users to control this process. We will build these mechanisms in line with the nascent concepts of recommender systems for self-actualization and user-tailored privacy. The ultimate goal of this effort is to make advertisements more aligned with users’ long-term goals and ambitions.

Q: What were the results of this research?

BK: The work on this project is still very much ongoing. Our first step has been to conduct a systematic literature review on ad explanations, covering existing research on how they are generated, presented, and perceived by users. Based on this review, we developed a classification scheme that categorizes the existing literature on ad explanations, offering insights into the reasoning behind the ad recommendation, the objective of the explanation, the content of the explanation, and how this content should be presented. This classification scheme offers a useful tool for researchers and practitioners to synthesize existing research on ad explanations and to identify paths for future research.

Our second step involves the development of a measurement instrument to evaluate ad experiences. The validation of this measurement instrument is still ongoing, but the end result will entail a carefully constructed set of questionnaires that can be used to measure users’ reactions toward online ads, including aspects of targeting accuracy, accountability, transparency, control, reliability, persuasiveness, and creepiness.

A third step involves a fundamental redesign of the ad experience on social networks, reimagining the very concept of advertising as a means to an end that serves the longer-term goals of the user. We are still in the very early stages of this activity, but we aim to explore the paradigm of recommendations, insights, and/or personal goals as a vehicle for this transformation of the ad experience.

Q: How has this research been received so far?

BK: Our paper on the literature review and the classification scheme of ad explanations was accepted to HICSS and was nominated as the Best Paper in the Social Media and e-Business Transformation minitrack. We are working on an interactive version of the classification scheme that provides a convenient overview of and direct access to the most relevant research in the area of ad explanations.

We are also working with Facebook researchers to make sure that our ad experience measurement instrument optimally serves their goal of creating a user-friendly ad experience.

Q: Where can people learn more about your research?

BK: You can find a project page about this research at www.usabart.nl/FBads. We will keep this page updated when new results become available!

The post Q&A with Clemson University’s Bart Knijnenburg, research award recipient for improving ad experiences appeared first on Facebook Research.


Achievement Unlocked: Celebrating Year One of GeForce NOW

It’s a celebration, gamers!

One year ago to the day we launched GeForce NOW, our cloud gaming service that transforms ordinary hardware into an extraordinarily powerful GeForce gaming PC. It’s the always-on gaming rig that never needs upgrading or patching and can instantly play your library of games.

We’ve been blown away by the passion and fandom of its members. Over 175 million hours have been streamed and more than 130 million moments have been captured with NVIDIA Highlights. It is the gamers who continue to push GeForce NOW forward.

Over the past year, we onboarded hundreds of games and are now supporting more than 800 titles, including 80 of the most-played free-to-play games, from over 300 publishers. Games like Cyberpunk 2077, Control and The Medium turned RTX ON to deliver real-time ray tracing and cinematic-quality graphics.

A glimpse into year one for GeForce NOW

We added new platforms, including Chromebook, iPhone and iPad, over the past few months. Today, GeForce NOW extends to even more PCs and Macs with Chrome browser support in beta. Additionally, Mac support expands to include new Apple M1-based hardware.

GeForce NOW has grown globally as well, with more than 65 countries now supported by our own service and more being added regularly by our GeForce NOW Alliance.

New Games, New Features — That’s GFN Thursday

Mark your calendars! GFN Thursday is our ongoing commitment to bringing great PC games and service updates to members each week. Check in every Thursday to discover what’s new in the cloud, including games, exclusive features and news on GeForce NOW.

Thirty new games will join the GeForce NOW library this month, including a number of day-and-date releases. Highlights include Apex Legends Season 8, Valheim, Werewolf: The Apocalypse – Earthblood and a demo for Square Enix’s highly anticipated Outriders game, coming to GeForce NOW on Feb. 25.

For the full list of games, including today’s GFN Thursday release, check out our latest blog.

Adding Even More Ways to Play

Switching between work and gaming is now just a Ctrl+Tab away.

Starting today, we’re adding beta support for Chrome browser on Windows PC and macOS, so members can access GeForce NOW instantly from more devices. The native applications on each platform still provide the best experience and features, but now gaming is even more convenient.

To get started, launch Chrome and head over to https://play.geforcenow.com. Then, simply log in and play.

Create desktop shortcuts by clicking on a game to open the details, and select +SHORTCUT to help launch your favorite games faster.

Members can also quickly and easily invite friends to play the same game. Click on a game to open the details, copy the URL shown in your browser, then share over social media, text or email.

Our latest client release also adds official support for Macs with the new Apple M1 chip via Rosetta 2. Apple products with the new chip will ask you to install Rosetta, if you haven’t previously, before installing the GeForce NOW app.

The Celebration Begins

All month long we’ll be celebrating GFN members with a series of rewards and giveaways. In the days ahead, Founders members can look forward to a unique offer in their inbox, while all members will have something special waiting as well.

And on social there will be weekly opportunities to win premium prizes, including Steel Series Arctis Pro wireless headphones, Razer Kishi controllers and more.

Now is the perfect time to encourage your friends to join you in the cloud by signing up for a free membership. Or upgrade to a Founders membership for priority access to gaming servers, extended session lengths and RTX ON for supported games.

GeForce NOW is available on nearly any PC, Mac, Chromebook, iPhone or iPad, and Android devices including NVIDIA SHIELD TV.

Follow us on Facebook and Twitter to get in on the fun, and subscribe to the GeForce NOW newsletter for game updates and the latest news.

The post Achievement Unlocked: Celebrating Year One of GeForce NOW appeared first on The Official NVIDIA Blog.


GFN Thursday — 30 Games Coming in February, 13 Available Today

Scientifically speaking, today is the best day of the week, because today is GFN Thursday. And that means more of the best PC games streaming right from the cloud across all of your devices.

This is a special GFN Thursday, too — not just because it’s the first Thursday of the month, which means learning about some of the games coming throughout February, but also because it’s GeForce NOW’s one-year anniversary. One year ago, we opened the cloud for any PC gamer whose rig needed a boost, and gave our Founders members RTX ON for supported games. Today, we offer over 800 games streaming instantly, with more on the way each week.

For now, let’s get into the best part of GFN Thursday: new games.

Let’s Play Today

The complete list of games joining GeForce NOW this week can be found below, but here are a few highlights:

Apex Legends Season 8 Mayhem on GeForce NOW

Apex Legends Season 8 (Origin and Steam)

Time to bring the boom in Season 8 – Mayhem. Meet a new Legend, Fuse, who doesn’t lack confidence, but often lacks a plan. He’s a blow-up-first ask-questions-later kinda guy.

Valheim on GeForce NOW

Valheim (day-and-date release on Steam)

A brutal exploration and survival game for 1-10 players, set in a procedurally generated purgatory inspired by Viking culture. Battle, build and conquer your way to a saga worthy of Odin’s patronage!

Werewolf: The Apocalypse - Earthblood on GeForce NOW

Werewolf: The Apocalypse – Earthblood (day-and-date release on Epic Games Store)

A unique experience full of savage combat and mystical adventures. You are Cahal, a powerful Garou who can transform into a wolf and a Crinos, a huge ferocious beast. Master your three forms and their powers to punish those who defile Gaia.

In addition to these highlights, members can look for the following:

  • Blue Fire (Steam)
  • Code2040 (Day-and-date release on Steam)
  • Curious Expedition 2 (Steam)
  • Magicka 2 (Steam)
  • Might & Magic Heroes V: Tribes of the East (Steam)
  • Mini Ninjas (Steam)
  • Order of Battle: World War II (Free to play on Steam)
  • Path of Wuxia (English language release on Steam)
  • Secret World Legends (Free to play on Steam)
  • Warhammer 40,000 Gladius Relics of War (Epic Games Store)

Looking Ahead to the Rest of February

Beyond this week, members can start getting excited about the following games:

Outriders Demo (day-and-date release on Steam)

Square Enix and People Can Fly present OUTRIDERS, a 1-3 player, drop-in-drop-out co-op shooter set in an original, dark and desperate sci-fi universe.

And here are even more titles joining GeForce NOW in February:

  • Art of Rally (Steam, Epic Games Store)
  • Darkest Hour: A Hearts of Iron Game (Steam)
  • Day of Infamy (Steam)
  • Everspace (Steam)
  • Farm Manager 2018 (Steam)
  • Farmer’s Dynasty (Steam)
  • Lara Croft and the Temple of Osiris (Steam)
  • The Legend of Heroes: Trails of Cold Steel III (Steam)
  • Lumberjack’s Dynasty (Steam)
  • Observer: System Redux (Steam)
  • Project Highrise (Steam)
  • Rise of Industry (Steam)
  • Sniper: Ghost Warrior 2 (Steam)
  • South Park: The Fractured But Whole (Steam)
  • South Park: The Stick of Truth (Steam)
  • Thea 2: The Shattering (Steam)

In Case You Missed It

In addition to the games we shared on the GeForce Forums, we had some surprise additions in January, including:

That’s a whole lotta gaming. What will you play in February? Let us know on Twitter or in the comments below.

The post GFN Thursday — 30 Games Coming in February, 13 Available Today appeared first on The Official NVIDIA Blog.


Creating a BankingBot on Amazon Lex V2 Console with support for English and Spanish

Amazon Lex is a service for building conversational interfaces into any application. The new Amazon Lex V2 Console and APIs make it easier to build, deploy, and manage bots. In this post, you will learn about the three main benefits of the Amazon Lex V2 Console and API, basic bot building concepts, and how to create a simple BankingBot on the Amazon Lex V2 Console.

The new Amazon Lex V2 Console and API have three main benefits:

  • You can add a new language to a bot at any time and manage all the languages through the lifecycle of design, test, and deployment as a single resource. The new console dashboard allows you to quickly move between different languages to compare and refine your conversations.
  • The Amazon Lex V2 API follows a simplified information architecture (IA) where intent and slot types are scoped to a specific language. Versioning is performed at the bot level so that resources such as intents and slot types don’t have to be versioned individually.
  • Amazon Lex V2 Console and API provides additional builder productivity tools and capabilities that give you more flexibility and control of your bot design process. For example, you can now save partially completed work as you script, test, and tune your configuration. You can also use the Conversation flow section to view the utterances and slot types for each intent.

You can access the new Amazon Lex V2 Console from the AWS Management Console, the AWS Command Line Interface (AWS CLI), or via the APIs. With the enhanced console and revised APIs, you can expedite building virtual agents, conversational IVR systems, self-service chatbots, or informational bots.

Basic bot concepts

Amazon Lex enables you to add self-service, natural language chatbots to your applications or devices. You can build bots to perform automated tasks such as scheduling an appointment or to find answers to frequent customer queries such as return policies. Depending on your user base, you can also configure your bot to converse in multiple languages.

In this post, you learn the basic concepts needed to create a simple BankingBot that can handle requests such as checking account balances, making bill payments, and transferring funds. When building conversational interfaces, you need to understand five main concepts:

  • Intents – An intent represents an action that the user wants to perform; it enables the bot to understand and classify what task a user is trying to accomplish. A bot can support one or more related intents, and intents are scoped to individual languages. For this post, our BankingBot is configured to understand intents in English and Spanish, such as CheckBalance, which lets users check the balance in their accounts, or TransferFunds, for transferring money between accounts.
  • Utterances – Utterances are phrases that are used to trigger your intent. Each intent can be trained by providing a set of sample utterances. Based on these utterances, Amazon Lex can identify and invoke an intent based on natural language user input.
  • Slots and slot types – Slots are input data that a bot needs to complete an action or fulfill an intent. For the CheckBalance intent, the bot needs to know which account to check, plus the user’s date of birth to verify their identity. This data is captured as slots, which are used to fulfill the intent. Amazon Lex has two types of slots:
    • Built-in slots – These slots provide a definition of how the data is recognized and handled. For example, Amazon Lex has the built-in slot type for AMAZON.DATE, which recognizes words or phrases that represent a date and converts them into a standard date format (for example, “tomorrow,” “the fifth of November,” or “22 December”).
    • Custom slots – These slots allow you to define and manage a custom catalog of items. You can define a custom slot by providing a list of values. Amazon Lex uses these values to train the natural language understanding model used for recognizing values for the slot. For example, you can define a slot type as accountType with values such as Checking, Savings, and Credit. You can also add synonyms for each value, such as defining Visa as a synonym for your Credit account.
  • Prompts and responses – These are bot messages that can be used to get information, acknowledge what the user said earlier, or confirm an action with the user before completing a transaction.
  • Fulfilling the user request – As part of fulfilling the user’s request, you can configure the bot to respond with a closing response. Optionally, you can enable code hooks such as AWS Lambda functions to run business logic.
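To make these concepts concrete, here’s how the CheckBalance intent built later in this post might be modeled as plain data. This is an illustrative sketch only; the field names are ours, not the Lex V2 API schema.

```python
# Illustrative model of one intent: utterances trigger it, slots capture
# the data it needs, and prompts drive the dialogue until every slot is filled.
check_balance = {
    "name": "CheckBalance",
    "sample_utterances": [
        "What's the balance in my account?",
        "How much do I have in {accountType}?",
    ],
    "slots": {
        "accountType": {"slot_type": "accountType",   # custom slot type
                        "prompt": "For which account would you like your balance?"},
        "dateofBirth": {"slot_type": "AMAZON.Date",   # built-in slot type
                        "prompt": "For verification purposes, what is your date of birth?"},
    },
    "fulfillment": "lambda",  # business logic runs in an AWS Lambda function
}

def missing_slots(intent, filled):
    """Return the slots the bot still needs to elicit, in declaration order."""
    return [name for name in intent["slots"] if filled.get(name) is None]

print(missing_slots(check_balance, {"accountType": "Checking"}))  # ['dateofBirth']
```

Until `missing_slots` returns an empty list, the bot keeps prompting; once every slot is filled, the intent can be fulfilled.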

Creating the bot

Now that you know about the basic building blocks of a bot, let’s get started. We configure the BankingBot to understand five intents in English and four intents in Spanish. We start off with a basic Welcome intent and then increase the complexity of the intents by adding custom slots, Lambda functions, and context management. The following table provides an overview of our intents.

Intent           | Built-in Slots | Custom Slots | Context | Prompts/Responses | App Integration
Welcome          |                |              |         | x                 |
CheckBalance     | x              | x            |         |                   | Lambda
FollowupBalance* | x              | x            | x       |                   | Lambda
TransferFunds    | x              | x            |         | x                 |
FallbackIntent   |                |              |         | x                 |

*As of this writing, context management is only supported in US English.

To create your bot, complete the following steps:

  1. On the Amazon Lex V2 Console, choose Bots. (If you’re in the Lex V1 console, choose Switch to the new Lex V2 Console in the left-hand menu.)
  2. Choose Create bot.
  3. For Creation method, select Create.
  4. For Bot name, enter BankingBot.
  5. Optionally, enter a description.
  6. For Runtime role, select Create a new role with basic Amazon Lex permissions.
  7. Because this bot is only for demo purposes, it’s not subject to COPPA, so select No.
  8. Leave the Idle session timeout and Advanced settings at their defaults.
  9. Choose Next.

Adding languages

This sample BankingBot is configured for both US English and US Spanish. Let’s first add US English.

  1. For Select language, choose English (US).

If you’re building a voice-based bot, Amazon Lex comes pre-integrated with the neural speech-to-text voices from Amazon Polly. Try them out and see what voice fits your bot.

  2. Choose Add another language.
  3. For Select language, choose Spanish (US).
  4. Choose Done.

Congratulations, you have successfully created your BankingBot! Now, let’s bring it to life.


Creating intents and slots

In this section, we walk you through how to create five intents and related slots for your bot.

Intent 1: Welcome

At this point, the console automatically takes you into the Intent editor page, where a NewIntent is ready for you to configure. The BankingBot is a friendly bot, so let’s start by creating a simple Welcome intent to greet users.

  1. In the Intent details section, for Intent name, replace NewIntent with Welcome.

  1. Under Sample utterances, choose the Plain Text tab and add the following:
    Hi
    Hello
    I need help
    Can you help me?
    

  1. Under Closing responses, for Message, enter:
    Hi! I’m BB, the BankingBot. How can I help you today?

  2. Choose Save intent.
  3. After your intent is saved, choose Build.
  4. Now that you’ve successfully built your first intent, choose Test and give it a try.

Intent 2: CheckBalance

Now let’s get a bit fancier with a CheckBalance intent. This intent allows a user to check an account balance. The bot first validates the user by requesting their date of birth and then asks which account they want to check. This intent requires you to create a custom slot type, set up the intent, and finally set up a Lambda function for fulfillment.

Creating a custom slot

Before creating the intent, you need a custom slot type that can capture a user’s account type, with valid values such as Checking, Savings, and Credit. To create a custom slot type, follow these steps:

  1. In the navigation pane, drill down to the English (US) version of your bot.
  2. Under English (US), choose Slot types.
  3. On the Add slot type menu, choose Add blank slot type.
  4. For Slot type name, enter accountType.
  5. Choose Add.

  1. For Slot value resolution, select Restrict to slot values.
  2. Under Slot type values, add values for Checking, Savings, and Credit.

You can also add synonyms in the second column to help the bot recognize additional references to the Credit slot, such as credit card, Visa, and Mastercard.

  3. Choose Save slot type.

Congratulations! You now have your first custom slot type.
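For intuition, slot value resolution with Restrict to slot values behaves roughly like the lookup below. This is a hypothetical re-implementation, not the actual Amazon Lex matching logic, which also handles fuzzier natural language input.

```python
# Canonical values and synonyms for the accountType custom slot.
ACCOUNT_TYPE = {
    "Checking": [],
    "Savings": [],
    "Credit": ["credit card", "visa", "mastercard"],
}

def resolve_account_type(raw):
    """Resolve a user's words to a canonical slot value, or None if no match."""
    text = raw.strip().lower()
    for value, synonyms in ACCOUNT_TYPE.items():
        if text == value.lower() or text in synonyms:
            return value
    return None  # with "Restrict to slot values", unmatched input is rejected

print(resolve_account_type("visa"))      # Credit
print(resolve_account_type("Checking"))  # Checking
```

Synonyms resolve to the canonical value, so your fulfillment code only ever sees Checking, Savings, or Credit.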

Creating the intent

Now let’s create the CheckBalance intent. This intent allows a user to check an account balance. The bot first validates the user by requesting their date of birth and then asks which account they want to check. This intent uses the accountType custom slot and a Lambda function for fulfillment.

  1. In the navigation pane, under English (US), choose Intents.
  2. Choose New intent.
  3. For Intent name, enter CheckBalance.
  4. Choose Add.


  5. Under Intent details, for Description, add a description.

  1. Under Sample utterances, choose the Plain Text tab and enter the following utterances:
    What’s the balance in my account?
    Check my account balance
    What’s the balance in {accountType}? 
    How much do I have in {accountType}?
    I want to check the balance
    Can you help me with account balance?
    Balance in {accountType}
    

  2. Choose Save intent.

In the chatbot lifecycle, this component can be leveraged to expand the chatbot’s understanding of its users by providing additional utterances. The phrases don’t need to be an exact match for user inputs, but should be representative of real-world natural language queries.

  1. Under Slots, choose Add slot.


For the CheckBalance intent, we set up two slots: account type and date of birth.

  1. For Name, enter accountType.
  2. For Slot type, choose accountType.
  3. For Prompts, enter:
    Sure. For which account would you like your balance?

  4. Choose Add.

  5. Choose Add slot.
  6. For Name, enter dateofBirth.
  7. For Slot type, choose AMAZON.Date.
  8. For Prompts, enter For verification purposes, what is your date of birth?
  9. Choose Add.
  10. Choose Save intent.

Preparing for Intent 3: FollowupBalance with context

Understanding the direction and context of an ever-evolving conversation is beneficial to building natural, human-like conversational interfaces. Being able to classify utterances as the conversation develops requires managing context across multiple turns. Consider when a user wants to follow up and check their account balance in a different account. You don’t want the bot to ask the user for their date of birth again. You want the bot to understand the context of the question and carry over the date of birth slot value from this intent into the follow-up intent.

To prepare for the third BankingBot intent, FollowupCheckBalance, you need to preserve this CheckBalance context as an output for future use.
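Conceptually, the carry-over works like this sketch: fulfilling CheckBalance publishes an output context holding its slot values, and the follow-up intent reads its dateofBirth default from that context instead of re-prompting. Amazon Lex manages contexts and their expiry for you; the simplified model below only illustrates the data flow.

```python
def fulfill_check_balance(session, date_of_birth, account_type):
    """After fulfilling CheckBalance, publish the output context with its slots."""
    session["contexts"]["contextCheckBalance"] = {
        "dateofBirth": date_of_birth,
        "accountType": account_type,
    }

def default_dateofbirth(session):
    """FollowupBalance's dateofBirth slot defaults to the value stored in the
    contextCheckBalance context, so the user isn't asked for it twice."""
    ctx = session["contexts"].get("contextCheckBalance")
    if ctx and ctx.get("dateofBirth"):
        return ctx["dateofBirth"]  # default filled from context: no prompt needed
    return None                    # context absent or expired: prompt the user

session = {"contexts": {}}
fulfill_check_balance(session, "1990-01-01", "Checking")
print(default_dateofbirth(session))  # 1990-01-01
```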

  1. Under Contexts, for Output contexts, choose New Context tag.
  2. For Context tag name, enter contextCheckBalance.
  3. Choose Add.

Now your context is stored for future use.

  1. Under Code hooks, select Use a Lambda function for fulfillment. To create the Lambda function, follow the instructions in Appendix B below.
  2. Choose Save intent.
  3. Choose Build.
  4. After the bot building process is complete, you can test the intent by choosing Test.

You can also use the Conversation flow section to view the current state of your conversation flow, with links that help you quickly get to a specific utterance, slot, or prompt.

You learn how to create an intent with prompts and closing responses in the fourth intent, TransferFunds.

Intent 3: FollowupBalance

Next, we create a FollowupBalance intent, where the user might ask what the balance is for a different account. With this intent, you want to use the context management feature and utilize the context that you set up earlier with the CheckBalance intent.

  1. On the Intent editor page, under Intents, choose Add.
  2. Choose Add empty intent.
  3. For Intent name, enter FollowupBalance.
  4. Choose Add.
  5. For Description, enter:
    Intent to provide detail of expenses made for an account over a period of time.

  6. In the Contexts section, for Input contexts, choose the context you created earlier in the CheckBalance intent.

  1. Under Sample utterances, on the Plain text tab, enter the following sample utterances:
    How about my {accountType} account
    What about {accountType}
    And in {accountType}?

  2. In the Slots section, choose Add slot.
  3. For Name, enter accountType.
  4. For Slot type, choose accountType.
  5. For Prompts, enter:
    You’d like the balance for which account?

  6. Choose Add.

Next, you create a second slot.

  1. Choose Add slot.
  2. For Name, enter dateofBirth.
  3. For Slot type, choose AMAZON.Date.
  4. For Prompts, enter:
    For verification purposes. What is your date of birth?

  5. Choose Add.
  6. In the Slots section, open the dateofBirth slot and choose Advanced options.
  7. Under Default values, enter the context and slot value for the CheckBalance intent
    (contextCheckBalance.dateofBirth).
  8. Choose Add default value.
  9. Choose Save.


  10. In the Code hooks section, select Use a Lambda function for fulfillment.
  11. Choose Save intent.
  12. Choose Build.
  13. When your BankingBot is built, choose Test, try the FollowupBalance intent, and see whether the dateofBirth slot from the CheckBalance intent is carried over.

Intent 4: TransferFunds

The TransferFunds intent offers the functionality of moving funds from one account to a target account. In this intent, you learn how to create two different slots using the same slot type and how to configure confirmation prompts and declines.

  1. On the Intent editor page, under Intents, choose Add.
  2. Choose Add empty intent.
  3. For Name, enter TransferFunds.
  4. Choose Add.
  5. For the intent description, enter:
    Help user transfer funds between bank accounts

  6. Under Sample Utterances, on the Plain Text tab, enter the following:
    I want to transfer funds
    Can I make a transfer?
    I want to make a transfer
    I'd like to transfer {transferAmount} from {sourceAccountType} to {targetAccountType}
    Can I transfer {transferAmount} to my {targetAccountType}
    Would you be able to help me with a transfer?
    Need to make a transfer
    

Next, we create the transferAmount slot.

  1. Choose Add slot.
  2. For Name, enter transferAmount.
  3. For Slot type, choose AMAZON.Number.
  4. For Prompts, enter:
    How much would you like to transfer?

  5. Choose Add.

Next, create the sourceAccountType slot.

  1. Choose Add slot.
  2. For Name, enter sourceAccountType.
  3. For Slot type¸ choose accountType.
  4. For Prompts¸ enter:
    Which account would you like to transfer from?

  5. Choose Add.

Next, create the targetAccountType slot.

  1. Choose Add slot.
  2. For Name, enter targetAccountType.
  3. For Slot type¸ choose accountType.
  4. For Prompts¸ enter:
    Which account are we transferring to?

  5. Choose Add.
  6. Under Prompts, for Confirmation prompts, enter:
    Got it. So we are transferring {transferAmount} from {sourceAccountType} to {targetAccountType}. Can I go ahead with the transfer?

  7. For Decline responses, enter:
    The transfer has been cancelled.


  1. Under Closing responses, for Message, enter:
    The transfer is complete. {transferAmount} should now be available in your {targetAccountType} account.

  2. Choose Save intent.
  3. Choose Build.
  4. Choose Test.

Intent 5: FallbackIntent

Your last intent is the fallback intent, which is used when the bot can’t understand or identify a specific intent. It serves as a catchall intent and can also be used to route the conversation to a human agent for more assistance.

  1. On the Intents list, choose FallbackIntent.
  2. Under Closing responses¸ for Message, enter:
    Sorry I am having trouble understanding. Can you describe what you'd like to do in a few words? I can help you find your account balance, transfer funds and make a payment.


  3. Choose Save intent.
  4. Choose Build.

Configuring the bot for Spanish

The Amazon Lex V2 Console also allows you to add multiple languages to a bot. Each language has an independent set of intents and slot types, and each intent follows the same structure as its English counterpart.

Intent 1: Welcome (Spanish)

To create the Welcome intent in Spanish, complete the following steps:

  1. In the navigation pane, under Spanish (US), choose Intents.
  2. Choose NewIntent.
  3. For Intent name, enter Welcome.
  4. Choose Add.

  1. Under Sample utterances, on the Plain text tab, enter the following:
    Hola
    Necesito ayuda
    Me podría ayudar?
    


  1. Under Closing responses, for Message, enter:
    Bienvenido! Puedo ayudarle con tareas como chequear balance o realizar un pago. Cómo puedo ayudarle hoy?


  1. Choose Save intent.
  2. Choose Build.

Intent 2: CheckBalance (Spanish)

To create the CheckBalance intent, you first need to create the accountType custom slot type as you did for the English bot.

  1. Under Spanish (US), choose Slot types.
  2. On the Add slot type menu, choose Add a blank slot type.


  1. For Slot type name, enter accountType.
  2. Choose Add.
  3. In the Slot value resolution section, select Restrict to slot values.
  4. Add the Spanish slot values Cheques (Checking), Ahorro (Savings), and Crédito (Credit).


  1. Choose Save slot type.

You have now created the accountType custom slot.

  1. In the navigation pane, under Spanish (US), choose Intents.
  2. On the Add intent menu, choose Add empty intent.


  1. For Intent name, enter CheckBalance.
  2. Choose Add.
  3. For Description, enter:
    Intent to check balance in the specified account

  4. Under Sample utterances, on the Plain text tab, enter the following:
    Cuál es el balance en mi cuenta?
    Verificar balance en mi cuenta
    Cuál es el balance en la cuenta {accountType}
    Cuál es el balance en {accountType}
    Cuánto hay en {accountType}
    Quiero verificar el balance
    Me podría ayudar con el balance de mi cuenta?
    Balance en {accountType}
    


  1. Choose Add slot.
  2. For Name, enter accountType.
  3. For Slot type, choose accountType.
  4. For Prompts, enter:
    Por supuesto. De qué cuenta le gustaría conocer el balance?

  5. Choose Add slot.
  6. For Name, enter dateofBirth.
  7. For Slot type, choose AMAZON.Date.
  8. For Prompts, enter:
    Por supuesto. Por motivos de verificación. Podría por favor compartir su fecha de nacimiento?

  9. Choose Add.
  10. Under Code hooks, select Use a Lambda function for fulfillment. The Spanish version of your bot needs its own Lambda function; to create it, follow the instructions in Appendix B below.
  11. Choose Save intent.
  12. Choose Build.

Intent 3: TransferFunds (Spanish)

Like the English version, the TransferFunds intent offers the functionality of moving funds from one account to another. This intent allows you to work with two slots of the same type and configure confirmations and prompts.

  1. Create a new intent and name it TransferFunds.
  2. For the intent description, enter:
    Intent to transfer funds between checking and savings accounts.

  3. Under Sample utterances, on the Plain Text tab, enter the following:
    Quisiera transferir fondos
    Puedo realizar una transferencia?
    Necesito hacer una transferencia.
    Quisiera transferir {transferAmount} desde {sourceAccountType} hacia {targetAccountType}
    Puedo transferir {transferAmount} hacia {targetAccountType} ?
    Necesito ayuda con una transferencia.
    Me ayudaría a realizar una transferencia?
    Necesito realizar una transferencia.
    

Next, create the transferAmount slot.

  1. Choose Add slot.
  2. For Name, enter transferAmount.
  3. For Slot Type, choose AMAZON.Number.
  4. For Prompts, enter:
    Qué monto desea transferir?

  5. Choose Add.

Next, create the sourceAccountType slot.

  1. Choose Add slot.
  2. For Name, enter sourceAccountType.
  3. For Slot type, choose accountType.
  4. For Prompts, enter
    Desde qué cuenta desea iniciar la transferencia?

  5. Choose Add.

Next, create the targetAccountType slot.

  1. Choose Add slot.
  2. For Name, enter targetAccountType.
  3. For Slot type, choose accountType.
  4. For Prompts, enter
    Hacia qué cuenta desea realizar la transferencia?

  5. Choose Add.
  6. Under Prompts, for Confirmation prompts, enter:
    Usted desea transferir {transferAmount} dólares desde la cuenta {sourceAccountType} hacia la cuenta {targetAccountType}. Puedo realizar la transferencia?

  7. For Decline responses, enter:
    No hay problema. La transferencia ha sido cancelada.

  8. Under Closing responses, for Message, enter:
    La transferencia ha sido realizada. {transferAmount} deberían estar disponibles en su cuenta {targetAccountType}.

  9. Choose Save intent.
  10. Choose Build.

Conclusion

Congratulations! You have successfully built a BankingBot that can check balances, transfer funds, and properly greet a customer. You’ve also seen how easy it is to add and manage new languages. Additionally, the Conversation flow section lets you view and jump to different parameters of the conversation as you build and refine the dialogue for each of your intents.

To learn more about Amazon Lex V2 Console and APIs, check out the following resources:

You can also give your bot the ability to answer natural language questions by integrating it with Amazon Kendra. For more information, see Integrate Amazon Kendra and Amazon Lex using a search intent.

Appendix A: Bot configuration

This example bot contains five intents that allow a user to interact with the financial institution and perform the following tasks:

  • Welcome – Intent to greet users
  • CheckBalance – Intent to check balance in the specified account
  • FollowupBalance – Intent to provide detail of expenses made for an account over a period of time
  • TransferFunds – Intent to transfer funds between checking and savings accounts
  • FallbackIntent – Default intent to respond when no other intent matches user input

Intents details: English

  1. Welcome configuration
    1. Description: Intent to greet users
    2. Sample utterances:
      • Hi
      • Hello
      • I need help
      • Can you help me?
    3. Closing response: Hi! I’m BB, the BankingBot. How can I help you today?
  2. CheckBalance configuration
    1. Description: Intent to check balance in the specified account
    2. Sample utterances:
      • What’s the balance in my account?
      • Check my account balance
      • What’s the balance in {accountType}?
      • How much do I have in {accountType}?
      • I want to check the balance
      • Can you help me with account balance?
      • Balance in {accountType}
    3. Slots:
      • accountType:
        • Custom slot type: accountType
        • Prompt: For which account would you like to check the balance?
      • dateofBirth:
        • Built-in slot type: AMAZON.Date
        • Prompt: For verification purposes, what is your date of birth?
    4. Context tag:
      • contextCheckBalance
        • Output contexts
    5. Closing response: Response comes from the fulfillment by the Lambda function.
  3. FollowupBalance configuration
    1. Description: Intent to provide detail of expenses made for an account over a period of time.
    2. Sample utterances:
      • How about my {accountType} account
      • What about {accountType}
      • And in {accountType}?
      • how about {accountType}
    3. Slots:
      • dateofBirth:
        • Built-in slot type: AMAZON.Date (default: #contextCheckBalance.dateofBirth)
        • Prompt: For verification purposes. What is your date of birth?
      • accountType:
        • Custom slot type: accountType
        • Prompt: Which account do you need the balance details for?
    4. Closing response: Response comes from the fulfillment Lambda function.
  4. TransferFunds configuration
    1. Description: Intent to transfer funds between checking and savings accounts.
    2. Sample utterances:
      • I want to transfer funds
      • Can I make a transfer?
      • I want to make a transfer
      • I’d like to transfer {transferAmount} from {sourceAccountType} to {targetAccountType}
      • Can I transfer {transferAmount} to my {targetAccountType}
      • Would you be able to help me with a transfer?
      • Need to make a transfer
    3. Slots:
      • sourceAccountType:
        • Custom slot type: accountType
        • Prompt: Which account would you like to transfer from?
      • targetAccountType:
        • Custom slot type: accountType
        • Prompt: Which account are we transferring to?
      • transferAmount:
        • Built-in slot type: AMAZON.Number
        • Prompt: What amount are we transferring today?
    4. Confirmation prompt: Got it. So we are transferring {transferAmount} dollars from {sourceAccountType} to {targetAccountType}. Can I go ahead with the transfer?
    5. Decline response: Sure. The transfer has been cancelled.
    6. Closing response: The transfer is complete. {transferAmount} should now be available in your {targetAccountType} account.
  5. FallbackIntent configuration
    1. Description: Default intent to respond when no other intent matches user input.
    2. Closing response: Sorry I am having trouble understanding. Can you describe what you’d like to do in a few words? I can help you with account balance, transfer funds and payments.

Intent details: Spanish

  1. CheckBalance configuration
    1. Description: Intent to check balance in the specified account.
    2. Sample utterances:
      • Cuál es el balance en mi cuenta?
      • Verificar balance en mi cuenta
      • Cuál es el balance en la cuenta {accountType}
      • Cuál es el balance en {accountType}
      • Cuánto hay en {accountType}
      • Quiero verificar el balance
      • Me podría ayudar con el balance de mi cuenta?
      • Balance en {accountType}
    3. Slots:
      • accountType:
        • Custom slot type: Restrict values to Cheques, Ahorros, and Crédito
        • Prompt: Por supuesto. De qué cuenta le gustaría conocer el balance?
      • dateofBirth:
        • Built-in slot type: AMAZON.Date
        • Prompt: Por supuesto. Por motivos de verificación. Podría por favor compartir su fecha de nacimiento?
    4. Closing response: Response comes from the fulfillment Lambda function.
  2. TransferFunds configuration
    1. Description: Intent to transfer funds between checking and savings accounts.
    2. Sample utterances:
      • Quisiera transferir fondos
      • Puedo realizar una transferencia?
      • Necesito hacer una transferencia.
      • Quisiera transferir {transferAmount} desde {sourceAccountType} hacia {targetAccountType}
      • Puedo transferir {transferAmount} hacia {targetAccountType} ?
      • Necesito ayuda con una transferencia.
      • Me ayudaría a realizar una transferencia?
      • Necesito realizar una transferencia.
    3. Slots:
      • sourceAccountType:
        • Custom slot type: Restrict values to Cheques, Ahorro, and Crédito
        • Prompt: Desde qué cuenta desea iniciar la transferencia?
      • targetAccountType:
        • Custom slot type: Restrict values to Cheques, Ahorro, and Crédito
        • Prompt: Hacia qué cuenta desea realizar la transferencia?
      • transferAmount:
        • Built-in slot type: AMAZON.Number
        • Prompt: Qué monto desea transferir?
    4. Confirmation prompt: Entendido. Usted desea transferir {transferAmount} dólares desde la cuenta {sourceAccountType} hacia la cuenta {targetAccountType}. Puedo realizar la transferencia?
    5. Decline response: No hay problema. La transferencia ha sido cancelada.
    6. Closing response: La transferencia ha sido realizada. {transferAmount} deberían estar disponibles en su cuenta {targetAccountType}
  3. FallbackIntent configuration
    1. Description: Default intent to respond when no other intent matches user input
    2. Closing response: Lo siento, no he entendido. En pocas palabras, podría describir que necesita hacer? Puedo ayudarlo con balance de cuenta, transferir fondos y pagos.

Appendix B: Creating a Lambda function

In the Amazon Lex V2 Console, you use a single Lambda function per language as the fulfillment mechanism for all of that language’s intents. The function is attached on the alias’s language support page, so you need to create a separate Lambda function for each language you have specified for your bot. For this example bot, you create one function for English (US) and one for Spanish (US).

  1. In the top left corner, choose the Services drop-down menu and choose Lambda in the Compute section.
  2. On the Lambda console, choose Functions.
  3. Choose Create function.

  4. Select Author from scratch.
  5. For Function name, enter BankingBotEnglish for the English version or BankingBotSpanish for the Spanish version.
  6. For Runtime, choose Python 3.8.
  7. Choose Create function.
  8. In the Function code section, choose lambda_function.py.
  9. Download the BankingBotEnglish or BankingBotSpanish code for your language and open it in a text editor.
  10. Copy the code and replace the current function code with it.
  11. Choose Deploy.
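If you’d rather see what such a function does, a minimal Lex V2 fulfillment handler looks roughly like the sketch below. The event and response shapes follow the Lex V2 Lambda interface; the balance lookup is a placeholder, and the slot names match the CheckBalance intent from this post.

```python
def lambda_handler(event, context):
    """Minimal Lex V2 fulfillment hook sketch for the CheckBalance intent.

    Lex V2 passes the session state (including resolved slots) in the event
    and expects a sessionState with a Close dialog action in the response.
    """
    intent = event["sessionState"]["intent"]
    account = intent["slots"]["accountType"]["value"]["interpretedValue"]

    balance = 1234.56  # placeholder: look this up in your banking backend

    intent["state"] = "Fulfilled"
    return {
        "sessionState": {
            "dialogAction": {"type": "Close"},
            "intent": intent,
        },
        "messages": [{
            "contentType": "PlainText",
            "content": f"Your {account} account balance is ${balance:.2f}.",
        }],
    }

# Local smoke test with a hand-built event:
event = {"sessionState": {"intent": {
    "name": "CheckBalance", "state": "InProgress",
    "slots": {"accountType": {"value": {"interpretedValue": "Checking"}},
              "dateofBirth": {"value": {"interpretedValue": "1990-01-01"}}}}}}
resp = lambda_handler(event, None)
print(resp["messages"][0]["content"])  # Your Checking account balance is $1234.56.
```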

Adding the Lambda function to your language

Now that you have set up your Lambda function, attach it to your bot. In the Amazon Lex V2 Console, Lambda functions are defined at the bot alias level. Follow these steps to set up your bot to use a Lambda function:

  1. On the Amazon Lex V2 Console, in the navigation pane, under your bot, choose Aliases.
  2. Choose TestBotAlias.
  3. For Languages, select English (US) or Spanish (US), depending on which fulfillment Lambda function you are configuring.
  4. For Source, choose BankingBotEnglish or BankingBotSpanish, depending on which language you are configuring.
  5. For Lambda function version or alias, choose your function.
  6. Choose Save.

Now your Lambda function is ready to work with your BankingBot intents.


About the Author

Juan Pablo Bustos is an AI Services Specialist Solutions Architect at Amazon Web Services, based in Dallas, TX. Outside of work, he loves spending time writing and playing music as well as trying random restaurants with his family.


As a Product Manager on the Amazon Lex team, Harshal Pimpalkhute spends his time trying to get machines to engage (nicely) with humans.


Esther Lee is a Product Manager for AWS Language AI Services. She is passionate about the intersection of technology and education. Out of the office, Esther enjoys long walks along the beach, dinners with friends and friendly rounds of Mahjong.

Using Amazon Translate to provide language support to Amazon Kendra

Amazon Kendra is a highly accurate and easy-to-use intelligent search service powered by machine learning (ML). Amazon Kendra supports English. This post describes a set of techniques to provide non-English language support when using Amazon Kendra.

We demonstrate these techniques within the context of a question-answer chatbot use case (Q&A bot) where a user can submit a question in any language that Amazon Translate supports through the chatbot. Amazon Kendra searches across a number of documents and returns a result in the language of that query. Amazon Comprehend and Amazon Translate are essential to providing non-English language support.

Our Q&A bot implementation relies on Amazon Simple Storage Service (Amazon S3) to store the documents prior to their ingestion into Amazon Kendra, Amazon Comprehend to detect the query’s dominant language to enable proper query and response translation, Amazon Translate to translate the query and response to and from English, and Amazon Lex to build the conversational user interface and provide the conversational interactions.

All queries, except for English, are translated from their native language into English before being submitted to Amazon Kendra. The Amazon Kendra responses a user sees are also translated. We have stored predefined Spanish response translations while performing real-time translation on all other languages. We use metadata attributes associated with each ingested document to point to the predefined Spanish translations.

We use three use cases to illustrate these techniques and assume that all the languages needing to be translated are supported by Amazon Translate. First, for Spanish language users, each document (we use small documents for the Q&A bot scenario) is translated by Amazon Translate into Spanish and has human vetting. This pre-translation is relevant as a description for Amazon Kendra document ranking model results.

Second, on-the-fly translation of the reading comprehension model responses occurs for all language responses except for English. On-the-fly translation occurs for the document ranking model results except for English and Spanish. We go into more detail on how to implement on-the-fly translation for Amazon Kendra’s different models later in this post.

Third, for English speaking users, translation doesn’t occur, allowing both the query and Amazon Kendra’s responses to be passed to and from Amazon Kendra without change.
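Taken together, the three use cases amount to a small routing decision per detected language. The following sketch captures that decision; the function name and labels are ours, not part of the implementation:

```python
def translation_plan(language):
    """Decide how to produce the response text for a detected language code."""
    if language == "en":
        # English: query and response pass through unchanged.
        return "passthrough"
    if language == "es":
        # Spanish: use the vetted, pre-translated metadata attributes.
        return "pre-translated"
    # Any other supported language: translate on the fly with Amazon Translate.
    return "on-the-fly"
```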

The following exchange illustrates the three use cases. We start with English followed by Spanish, French, and Italian.

Translation considerations and prerequisites

We perform the following steps on the document:

  1. Run the document through Amazon Translate to get a Spanish language version of the document as well as the title.
  2. Manually review the translation and make any changes desired.
  3. Create a metadata file where one of the attributes is the Spanish translation of the document.
  4. Ingest the English language document and the associated metadata file into Amazon Kendra.

The following code is the metadata file for the document:

{
    "Attributes": {
        "_created_at": "2020-10-28T16:48:26.059730Z",
        "_source_uri": "https://aws.amazon.com/kendra/faqs/",
        "spanish_text": "R: Amazon Kendra es un servicio de búsqueda empresarial muy preciso y fácil de usar que funciona con Machine Learning.",
        "spanish_title": "P: ¿Qué es Amazon Kendra?"
    },
    "Title": "Q: What is Amazon Kendra?",
    "ContentType": "PLAIN_TEXT"
}

In this case, we have some predefined attributes, such as _created_at and _source_uri, as well as custom attributes such as spanish_text and spanish_title.

In the case of queries in Spanish, you use these attributes to build the response to send back to the user. The fact that the title of the document is in itself a possible user query allows you to have control over the translations.
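Generating these metadata files can be scripted once the Spanish translations have been vetted. The following is a sketch, not part of the post's code: it assumes the translated strings are already in hand and the helper name is ours.

```python
import json


def kendra_metadata(title, spanish_title, spanish_text, source_uri):
    """Assemble the Amazon Kendra metadata document for a pre-translated file."""
    return json.dumps({
        "Attributes": {
            "_source_uri": source_uri,
            "spanish_text": spanish_text,
            "spanish_title": spanish_title,
        },
        "Title": title,
        "ContentType": "PLAIN_TEXT",
    }, ensure_ascii=False, indent=4)
```

Writing the result to `<document-name>.metadata.json` alongside the document in Amazon S3 lets the S3 connector pick it up during ingestion.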

If your documents are in another language, you need to run Amazon Translate to translate the documents into English before ingestion into Amazon Kendra.

We have not tried translation in other scenarios where the document types and answers can vary widely. However, we believe that the techniques shown in this post allow you to try translation in other scenarios and evaluate the accuracy.

Amazon Kendra processing overview

Now that we have the documents squared away, we build a chatbot using Amazon Lex. The chatbot identifies the language using Amazon Comprehend, translates the query from the user’s language to English, submits a query to the Amazon Kendra index, and translates the result back to the language the query was in. You can apply this approach to any language that Amazon Translate supports.

We use the Amazon Kendra built-in Amazon S3 connector to ingest documents and the Amazon Kendra FAQ ingestion process for getting question-answer pairs into Amazon Kendra. The ingested documents are in English. We manually created a description of each document in Spanish and attached that Spanish description as a metadata attribute. Ideally, all the documents that you use are in English.

If these documents have an overview section, you can use Amazon Translate as the method of generating this metadata description attribute. The following diagram illustrates our architecture.

We use the Amazon Kendra built-in Amazon S3 connector to ingest documents. If you also have FAQ documents, you also use the Amazon Kendra FAQ ingestion process.

Setting up your resources

In this section, we discuss the steps needed to implement this solution. See the appendix for details on the specifics of these steps. The AWS Lambda function is critical in order to understand where and how to implement the translation. We go into further details on the translation specifics in the next section.

  1. Download the documents and metadata files, decompress the archive, and store them in an S3 bucket. You use this bucket as the source for your Amazon Kendra S3 connector.
  2. Set up Amazon Kendra:
    1. Create an Amazon Kendra index. For instructions, see Getting started with the Amazon Kendra S3 connector.
    2. Create an Amazon Kendra S3 data source.
    3. Add attributes.
    4. Ingest the example data source from Amazon S3 into Amazon Kendra.
  3. Set up the fulfillment Lambda function.
  4. Set up the chatbot.

Understanding translation in the fulfillment Lambda function

The Lambda function has been structured into three main sections to process and respond to the user’s query: language detection, submitting a query, and returning the translated result.

Language detection

In the first section, you use Amazon Comprehend to detect the dominant language. For this post, we obtain the user input from the inputTranscript key of the event submitted by Amazon Lex. If Amazon Comprehend doesn’t have enough confidence in the detected language, the function defaults to English. See the following code:

query = event['inputTranscript']
response = comprehend.detect_dominant_language(Text=query)
confidence = response["Languages"][0]['Score']
if confidence > 0.50:
    language = response["Languages"][0]['LanguageCode']
else:
    # Default to English if there isn't enough confidence
    language = "en"
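This confidence fallback is easy to isolate and unit-test. The helper below mirrors the response shape of detect_dominant_language; the function itself is our own sketch, not part of the post's code:

```python
def pick_language(comprehend_response, threshold=0.50):
    """Return the top detected language code, defaulting to English below the threshold."""
    top = comprehend_response["Languages"][0]
    return top['LanguageCode'] if top['Score'] > threshold else "en"
```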

Submitting a query

Amazon Kendra currently supports documents and queries in English, so in order to submit your query, you have to translate it.

In the provided example code, after identifying the dominant language, you translate the query to English if needed. A simple check for whether the language is English or not would suffice; for illustration purposes, we include separate branches for Spanish and for any other language.

if language == "en":
    pass
elif language == "es":
    translated_query = translate.translate_text(Text=query, SourceLanguageCode="es", TargetLanguageCode="en")
    query = translated_query['TranslatedText']
else:
    try:
        translated_query = translate.translate_text(Text=query, SourceLanguageCode=language, TargetLanguageCode="en")
        query = translated_query['TranslatedText']
    except Exception as e:
        return(str(e))

Now that your query is in English, you can submit the query to Amazon Kendra:

response = kendra.query(
    QueryText=query,
    IndexId=index_id)

There are several options on how to work with the result from Amazon Kendra. For more information, see Analyzing the results in the Amazon Kendra Essentials Workshop. As a chatbot use case, we only work with the first result.

If the first result is from the reading comprehension model (result type Answer) and the language code is different than en (English), you translate the DocumentExcerpt, which is the value to be returned. See the following code:

answer_text = query_result['DocumentExcerpt']['Text']
if language == "en":
    pass
else:
    result = translate.translate_text(Text=answer_text, SourceLanguageCode="en", TargetLanguageCode=language)
    answer_text = result['TranslatedText']

If the first result is from the document ranking model (result type Document), you might recall that in the introduction, we have pre-translated the Spanish language results and stored that in the document metadata for Spanish language documents.

The following code shows that:

  • If the language code is es (Spanish), the pre-translated content stored in the metadata attribute spanish_text is returned.
  • If the language code is en (English), the DocumentExcerpt value returned by Amazon Kendra is returned as is.
  • If the language code is neither es nor en, the content of DocumentExcerpt is translated into the detected language and returned.
    for key in query_result['DocumentAttributes']:
        if key['Key'] == 'spanish_text':
            synopsis = key['Value']['StringValue']
        elif key['Key'] == 'spanish_title':
            document_title = key['Value']['StringValue']
            print('Title: ' + document_title)
    if language == "es":
        answer_text = synopsis
    elif language == "en":
        document_title = query_result['DocumentTitle']['Text']
        answer_text = query_result['DocumentExcerpt']['Text']
    else:
        #Placeholder to translate the title if needed
        #document_title = query_result['DocumentTitle']['Text']
        #result = translate.translate_text(Text=document_title, SourceLanguageCode="en", TargetLanguageCode=language)
        #document_title = result['TranslatedText']
        answer_text = query_result['DocumentExcerpt']['Text']
        result = translate.translate_text(Text=answer_text, SourceLanguageCode="en", TargetLanguageCode=language)
        answer_text = result['TranslatedText']
    response = answer_text
    return response
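The attribute scan in this branch can be exercised without a live index, because Amazon Kendra returns document attributes as a list of Key/Value pairs. The standalone helper below is our own sketch, mirroring that shape:

```python
def spanish_fields(document_attributes):
    """Pull the pre-translated Spanish title and text out of Kendra document attributes.

    Returns (title, text); either may be None if the attribute is absent.
    """
    title, text = None, None
    for attr in document_attributes:
        if attr['Key'] == 'spanish_title':
            title = attr['Value']['StringValue']
        elif attr['Key'] == 'spanish_text':
            text = attr['Value']['StringValue']
    return title, text
```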

Returning the result

At this point, if you obtained a result, it should be in the language the question was asked in. The last portion of the Lambda function returns the result to Amazon Lex, which passes it on to the user’s conversational user interface:

if result == "":
    no_matches = "I'm sorry, I couldn't find matches for your query"
    result = translate.translate_text(Text=no_matches, SourceLanguageCode="en", TargetLanguageCode=language)
    result = result['TranslatedText']
else:
    #Truncate text
    if len(result) > 340:
        result = result[:340]
        result = result.rsplit(' ', 1)
        result = result[0] + "..."
response = {
    "dialogAction": {
        "type": "Close",
        "fulfillmentState": "Fulfilled",
        "message": {
            "contentType": "PlainText",
            "content": result
        },
    }
}
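The truncation branch cuts at the last space so the 340-character limit never splits a word. As a standalone helper (our own sketch, not the post's code) it might look like:

```python
def truncate_on_word(text, limit=340):
    """Trim text to at most `limit` characters, cutting at the last space
    and appending an ellipsis when truncation occurs."""
    if len(text) <= limit:
        return text
    return text[:limit].rsplit(' ', 1)[0] + "..."
```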

Conclusion

We have demonstrated a few techniques that you can use to enable Amazon Kendra to provide support for languages other than English. We recommend doing a small pilot and accuracy POC on ground truth questions and answers to determine if these techniques can enable your non-English language use cases.

To follow an interactive tutorial that can help you get started with Amazon Kendra, visit our Amazon Kendra Essentials+ Workshop. You can also visit the Amazon Kendra website to dive deep into features, connectors, videos, and more.

Appendix

In the sections above, we covered translation in Amazon Kendra for the reading comprehension and document ranking models. Below, we cover translation in Amazon Kendra for FAQ matching.

Translations for the FAQ model

For Amazon Kendra FAQ matching, you can use either real-time or pre-translated responses. Pre-translated responses with human vetting likely provide better results. For pre-translated responses, complete the following steps:

  1. Create one row per language desired for each question.
  2. Create a language attribute that specifies what language the answer is in.
  3. Place the pre-translated response into the FAQ answer column.
  4. Use the language attribute as a query filter.
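Step 4’s query filter can be expressed with the Amazon Kendra Query API’s AttributeFilter parameter. The attribute name `language` follows step 2 and is our assumption; the helper itself is a sketch:

```python
def language_filter(language_code):
    """Build a Kendra AttributeFilter that restricts FAQ results to one language."""
    return {
        "EqualsTo": {
            "Key": "language",
            "Value": {"StringValue": language_code}
        }
    }
```

The resulting dict can be passed as `kendra.query(QueryText=..., IndexId=..., AttributeFilter=language_filter("es"))`.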

Pre-translation considerations

This chatbot use case has documents with a small amount of text. This allows us to place the pre-translated document into an attribute. For larger files, we place pre-translated document summaries into the attribute instead. This allows us to return vetted summaries in the native language for each document ranking result. We can continue to use real-time translation for the reading comprehension model passages and suggested answers.

Pre-translation is only effective for the document ranking model and the FAQ model. The reading comprehension model doesn’t return associated attributes. The lack of attributes prevents the use of pre-translated content with the reading comprehension model and requires instead that you use on-the-fly translation for the reading comprehension model results.

Creating an Amazon Kendra data source and adding attributes

For this use case, we use two custom attributes that contain the revised translations to Spanish. These attributes are called spanish_title and spanish_text.

To add them into your index, follow these steps:

  1. On the Amazon Kendra console, on your new index, under Data management, choose Facet definition.
  2. Choose Add field.
  3. For Field name, enter spanish_text.
  4. For Data type, choose String.
  5. For Usage types, select Displayable.
  6. Choose Add.
  7. Repeat the process for the field spanish_title.

Ingesting the example dataset

Now that you have an Amazon Kendra index, the custom index fields, and the sample documents in your S3 bucket, you can create an S3 data source.

  1. On the Amazon Kendra console, on your new index, under Data management, choose Data sources.
  2. Choose Add connector.
  3. For My data source name, enter a name (for example, MyS3Connector).
  4. Choose Next.
  5. For Enter the data source location, enter the location of your S3 bucket.
  6. For IAM role, choose Create a new role.
  7. For Role name, enter a name for your role.
  8. For Frequency, choose Run on demand.
  9. Choose Next.
  10. Validate your settings and choose Add data source.
  11. When the process is complete, sync your data source by choosing Sync now.

At this point, you can test a sample query on the search console. For example, the following screenshot shows the results for the question “what is Amazon Kendra?”

Setting up the fulfillment Lambda function

For this use case, the multilingual chatbot requires a Lambda function to query the index as well as perform the translations if needed.

  1. On the Lambda console, choose Create function.
  2. Select Author from scratch.
  3. For Function name, enter a name.
  4. For Runtime, choose the latest Python version available.
  5. For Execution role, select Create a new role with basic Lambda permissions.
  6. Choose Create function.
  7. After creating the function, on the Permissions tab, choose your role to edit it.
  8. On the IAM console, choose Add inline policy.
  9. On the JSON tab, update the following policy to include your Amazon Kendra index ID (you can obtain it on the Amazon Kendra console in the Index section):
    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "KendraQueries",
                "Effect": "Allow",
                "Action": "kendra:Query",
                "Resource": "arn:aws:kendra:<YOUR_REGION>:<YOUR_AWS_ACCOUNT_ID>:index/<YOUR_AMAZON_KENDRA_INDEX_ID>"
            },
            {
                "Sid": "ComprehendTranslate",
                "Effect": "Allow",
                "Action": [
                    "comprehend:DetectDominantLanguage",
                    "translate:TranslateText"
                ],
                "Resource": "*"
            }
        ]
    }

  10. Choose Review policy.
  11. For Name, enter a name.
  12. Choose Create policy.
  13. In the Lambda configuration, enter the following code into the function code (update your index_id). The code is also available to download.
    """
    Lexbot Lambda handler.
    """
    from urllib.request import Request, urlopen
    import json
    import boto3
    
    
    kendra = boto3.client('kendra')
    #Define your Index ID
    index_id = "<YOUR_AMAZON_KENDRA_INDEX_ID>"
    region = 'us-east-1'
    
    translate = boto3.client(service_name='translate', region_name=region, use_ssl=True)
    comprehend = boto3.client(service_name='comprehend', region_name=region, use_ssl=True)
    
    
    def query_index(query, language):
        print("Query: "+query)
        if language == "en":
            pass
        elif language == "es":
            translated_query = translate.translate_text(Text=query, SourceLanguageCode="es", TargetLanguageCode="en")
            query = translated_query['TranslatedText']
        else:
            try:
                translated_query = translate.translate_text(Text=query, SourceLanguageCode=language, TargetLanguageCode="en")
                query = translated_query['TranslatedText']
            except Exception as e:
                return(str(e))     
        response=kendra.query(
            QueryText = query,
            IndexId = index_id)
        print(response)
        #Return just the first result
        for query_result in response['ResultItems']:
            #Reading comprehension result
            if query_result['Type']=='ANSWER':
                    url = query_result['DocumentURI']
                    answer_text = query_result['DocumentExcerpt']['Text']
                    if language == "en":
                        pass
                    else:
                        result = translate.translate_text(Text=answer_text, SourceLanguageCode="en", TargetLanguageCode=language)
                        answer_text = result['TranslatedText']
                    response = answer_text
                    return response
            #Document Ranking result    
            if query_result['Type']=='DOCUMENT':
                if query_result['ScoreAttributes']['ScoreConfidence'] == "LOW":
                    response = ""
                    return(response)
                else:
                    synopsis = ""
                    document_title = ""
                    answer_text = ""
                    for key in query_result['DocumentAttributes']:
                        if key['Key'] == 'spanish_text':
                            synopsis = key['Value']['StringValue']
                        elif key['Key'] == 'spanish_title':
                            document_title = key['Value']['StringValue']
                            print('Title: ' + document_title)
                    if language == "es":
                        answer_text = synopsis
                    elif language == "en":
                        document_title = query_result['DocumentTitle']['Text']
                        answer_text = query_result['DocumentExcerpt']['Text']
                    else:
                        #Placeholder to translate the title if needed
                        #document_title = query_result['DocumentTitle']['Text']
                        #result = translate.translate_text(Text=document_title, SourceLanguageCode="en", TargetLanguageCode=language)
                        #document_title = result['TranslatedText']
                        answer_text = query_result['DocumentExcerpt']['Text']
                        result = translate.translate_text(Text=answer_text, SourceLanguageCode="en", TargetLanguageCode=language)
                        answer_text = result['TranslatedText']
                    response = answer_text
                    return response
        #No result found
        return ""
    def lambda_handler(event, context):
        if(len(event['inputTranscript']) < 3):
            result = "Please try again"
        else:
            query = event['inputTranscript']
            response =  comprehend.detect_dominant_language(Text = query)
            confidence = response["Languages"][0]['Score']
            if confidence > 0.50:
                language = response["Languages"][0]['LanguageCode']
            else:
                #Default to english if there isn't enough confidence
                language = "en"
            result = query_index(query, language)
            if result == "":
                 no_matches = "I'm sorry, I couldn't find matches for your query"
                 result = translate.translate_text(Text=no_matches, SourceLanguageCode="en", TargetLanguageCode=language)
                 result = result['TranslatedText']
            else:
                #Truncate Text
                if len(result) > 340:
                    result = result[:340]
                    result = result.rsplit(' ', 1)
                    result = result[0]+"..."
        response = {
            "dialogAction": {
                "type": "Close",
                "fulfillmentState": "Fulfilled",
                "message": {
                  "contentType": "PlainText",
                  "content": result
                },
            }
        }
        print('result = ' + str(response))
        return response

  14. Choose Deploy.

Setting up the chatbot

The chatbot that you create for this use case uses Lambda to fulfill the requests. Essentially, you create a fallback intent and pass the user input to the Lambda function.

To set up a chatbot on the console, complete the following steps:

  1. On the Amazon Lex console, under Bots, choose Create.
  2. Choose Custom bot.
  3. For Bot name, enter a name.
  4. For Language, choose English (US).
  5. Leave the other options at their defaults.
  6. Choose Create.

For this post, we use the fallback intent to process the queries sent to Amazon Kendra. First, we need to create an intent.

  1. Choose Create intent.
  2. Enter a name for your intent and choose Add.

  3. Under Sample utterances, enter some sample utterances.
  4. Under Response, enter an example answer.
  5. Choose Save Intent.

Now you can build and test your bot (see the following screenshot).

  1. To import the fallback intent, next to Intents, choose the add icon.
  2. Choose Search existing intents.
  3. Search for and choose the built-in intent AMAZON.FallbackIntent.
  4. Enter a name.
  5. Choose Add.
  6. For Fulfillment, select AWS Lambda function.
  7. For Lambda function, choose the function you created.
  8. Choose Save Intent.

Now you disable the clarification questions so you can use the fallback intent on the first attempt.

  1. Under Error handling, deselect Clarification prompts.
  2. Choose Save.
  3. Choose Build.

Testing

After the bot building process is complete, you can test your bot directly on the Amazon Lex console.

Now we issue the same query in French (“Qu’est-ce qu’Amazon Kendra?”) and we get the response back in French.

If you want to test your chatbot as a standalone web application, see Sample Amazon Lex Web Interface on GitHub.

You can also test the Amazon Lex integration with Slack or Facebook Messenger.


About the Author

Juan Bustos is an AI Services Specialist Solutions Architect at Amazon Web Services, based in Dallas, TX. Outside of work, he loves spending time writing and playing music as well as trying random restaurants with his family.


David Shute is a Senior ML GTM Specialist at Amazon Web Services focused on Amazon Kendra. When not working, he enjoys hiking and walking on a beach.
