Amazon AWS – Page 145

How Light & Wonder built a predictive maintenance solution for gaming machines on AWS

June 22, 2023

by Aruna Abeyakoon Amazon AWS

This post is co-written with Aruna Abeyakoon and Denisse Colin from Light and Wonder (L&W).

Headquartered in Las Vegas, Light & Wonder, Inc. is the leading cross-platform global game company that provides gambling products and services. Working with AWS, Light & Wonder recently developed an industry-first secure solution, Light & Wonder Connect (LnW Connect), to stream telemetry and machine health data from roughly half a million electronic gaming machines distributed across its casino customer base globally when LnW Connect reaches its full potential. Over 500 machine events are monitored in near-real time to give a full picture of machine conditions and their operating environments. Utilizing data streamed through LnW Connect, L&W aims to create better gaming experience for their end-users as well as bring more value to their casino customers.

Light & Wonder teamed up with the Amazon ML Solutions Lab to use events data streamed from LnW Connect to enable machine learning (ML)-powered predictive maintenance for slot machines. Predictive maintenance is a common ML use case for businesses with physical equipment or machinery assets. With predictive maintenance, L&W can get advanced warning of machine breakdowns and proactively dispatch a service team to inspect the issue. This will reduce machine downtime and avoid significant revenue loss for casinos. With no remote diagnostic system in place, issue resolution by the Light & Wonder service team on the casino floor can be costly and inefficient, while severely degrading the customer gaming experience.

The nature of the project is highly exploratory—this is the first attempt at predictive maintenance in the gaming industry. The Amazon ML Solutions Lab and L&W team embarked on an end-to-end journey from formulating the ML problem and defining the evaluation metrics, to delivering a high-quality solution. The final ML model combines CNN and Transformer, which are the state-of-the-art neural network architectures for modeling sequential machine log data. The post presents a detailed description of this journey, and we hope you will enjoy it as much as we do!

In this post, we discuss the following:

How we formulated the predictive maintenance problem as an ML problem with a set of appropriate metrics for evaluation
How we prepared data for training and testing
Data preprocessing and feature engineering techniques we employed to obtain performant models
Performing a hyperparameter tuning step with Amazon SageMaker Automatic Model Tuning
Comparisons between the baseline model and the final CNN+Transformer model
Additional techniques we used to improve model performance, such as ensembling

Background

In this section, we discuss the issues that necessitated this solution.

Dataset

Slot machine environments are highly regulated and are deployed in an air-gapped environment. In LnW Connect, an encryption process was designed to provide a secure and reliable mechanism for the data to be brought into an AWS data lake for predictive modeling. The aggregated files are encrypted and the decryption key is only available in AWS Key Management Service (AWS KMS). A cellular-based private network into AWS is set up through which the files were uploaded into Amazon Simple Storage Service (Amazon S3).

LnW Connect streams a wide range of machine events, such as start of game, end of game, and more. The system collects over 500 different types of events. As shown in the following
, each event is recorded along with a timestamp of when it happened and the ID of the machine recording the event. LnW Connect also records when a machine enters a non-playable state, and it will be marked as a machine failure or breakdown if it doesn’t recover to a playable state within a sufficiently short time span.

Machine ID	Event Type ID	Timestamp
0	E1	2022-01-01 00:17:24
0	E3	2022-01-01 00:17:29
1000	E4	2022-01-01 00:17:33
114	E234	2022-01-01 00:17:34
222	E100	2022-01-01 00:17:37

In addition to dynamic machine events, static metadata about each machine is also available. This includes information such as machine unique identifier, cabinet type, location, operating system, software version, game theme, and more, as shown in the following table. (All the names in the table are anonymized to protect customer information.)

Machine ID	Cabinet Type	OS	Location	Game Theme
276	A	OS_Ver0	AA Resort & Casino	StormMaiden
167	B	OS_Ver1	BB Casino, Resort & Spa	UHMLIndia
13	C	OS_Ver0	CC Casino & Hotel	TerrificTiger
307	D	OS_Ver0	DD Casino Resort	NeptunesRealm
70	E	OS_Ver0	EE Resort & Casino	RLPMealTicket

Problem definition

We treat the predictive maintenance problem for slot machines as a binary classification problem. The ML model takes in the historical sequence of machine events and other metadata and predicts whether a machine will encounter a failure in a 6-hour future time window. If a machine will break down within 6 hours, it is deemed a high-priority machine for maintenance. Otherwise, it is low priority. The following figure gives examples of low-priority (top) and high-priority (bottom) samples. We use a fixed-length look-back time window to collect historical machine event data for prediction. Experiments show that longer look-back time windows improve model performance significantly (more details later in this post).

Modeling challenges

We faced a couple of challenges solving this problem:

We have a huge amount event logs that contain around 50 million events a month (from approximately 1,000 game samples). Careful optimization is needed in the data extraction and preprocessing stage.
Event sequence modeling was challenging due to the extremely uneven distribution of events over time. A 3-hour window can contain anywhere from tens to thousands of events.
Machines are in a good state most of the time and the high-priority maintenance is a rare class, which introduced a class imbalance issue.
New machines are added continuously to the system, so we had to make sure our model can handle prediction on new machines that have never been seen in training.

Data preprocessing and feature engineering

In this section, we discuss our methods for data preparation and feature engineering.

Feature engineering

Slot machine feeds are streams of unequally spaced time series events; for example, the number of events in a 3-hour window can range from tens to thousands. To handle this imbalance, we used event frequencies instead of the raw sequence data. A straightforward approach is aggregating the event frequency for the entire look-back window and feeding it into the model. However, when using this representation, the temporal information is lost, and the order of events is not preserved. We instead used temporal binning by dividing the time window into N equal sub-windows and calculating the event frequencies in each. The final features of a time window are the concatenation of all its sub-window features. Increasing the number of bins preserves more temporal information. The following figure illustrates temporal binning on a sample window.

First, the sample time window is split into two equal sub-windows (bins); we used only two bins here for simplicity for illustration. Then, the counts of the events E1, E2, E3, and E4 are calculated in each bin. Lastly, they are concatenated and used as features.

Along with the event frequency-based features, we used machine-specific features like software version, cabinet type, game theme, and game version. Additionally, we added features related to the timestamps to capture the seasonality, such as hour of the day and day of the week.

Data preparation

To extract data efficiently for training and testing, we utilize Amazon Athena and the AWS Glue Data Catalog. The events data is stored in Amazon S3 in Parquet format and partitioned according to day/month/hour. This facilitates efficient extraction of data samples within a specified time window. We use data from all machines in the latest month for testing and the rest of the data for training, which helps avoid potential data leakage.

ML methodology and model training

In this section, we discuss our baseline model with AutoGluon and how we built a customized neural network with SageMaker automatic model tuning.

Building a baseline model with AutoGluon

With any ML use case, it’s important to establish a baseline model to be used for comparison and iteration. We used AutoGluon to explore several classic ML algorithms. AutoGluon is easy-to-use AutoML tool that uses automatic data processing, hyperparameter tuning, and model ensemble. The best baseline was achieved with a weighted ensemble of gradient boosted decision tree models. The ease of use of AutoGluon helped us in the discovery stage to navigate quickly and efficiently through a wide range of possible data and ML modeling directions.

Building and tuning a customized neural network model with SageMaker automatic model tuning

After experimenting with different neural networks architectures, we built a customized deep learning model for predictive maintenance. Our model surpassed the AutoGluon baseline model by 121% in recall at 80% precision. The final model ingests historical machine event sequence data, time features such as hour of the day, and static machine metadata. We utilize SageMaker automatic model tuning jobs to search for the best hyperparameters and model architectures.

The following figure shows the model architecture. We first normalize the binned event sequence data by average frequencies of each event in the training set to remove the overwhelming effect of high-frequency events (start of game, end of game, and so on). The embeddings for individual events are learnable, while the temporal feature embeddings (day of the week, hour of the day) are extracted using the package GluonTS. Then we concatenate the event sequence data with the temporal feature embeddings as the input to the model. The model consists of the following layers:

Convolutional layers (CNN) – Each CNN layer consists of two 1-dimensional convolutional operations with residual connections. The output of each CNN layer has the same sequence length as the input to allow for easy stacking with other modules. The total number of CNN layers is a tunable hyperparameter.
Transformer encoder layers (TRANS) – The output of the CNN layers is fed together with the positional encoding to a multi-head self-attention structure. We use TRANS to directly capture temporal dependencies instead of using recurrent neural networks. Here, binning of the raw sequence data (reducing length from thousands to hundreds) helps alleviate the GPU memory bottlenecks, while keeping the chronological information to a tunable extent (the number of the bins is a tunable hyperparameter).
Aggregation layers (AGG) – The final layer combines the metadata information (game theme type, cabinet type, locations) to produce the priority level probability prediction. It consists of several pooling layers and fully connected layers for incremental dimension reduction. The multi-hot embeddings of metadata are also learnable, and don’t go through CNN and TRANS layers because they don’t contain sequential information.

We use the cross-entropy loss with class weights as tunable hyperparameters to adjust for the class imbalance issue. In addition, the numbers of CNN and TRANS layers are crucial hyperparameters with the possible values of 0, which means specific layers may not always exist in the model architecture. This way, we have a unified framework where the model architectures are searched along with other usual hyperparameters.

We utilize SageMaker automatic model tuning, also known as hyperparameter optimization (HPO), to efficiently explore model variations and the large search space of all hyperparameters. Automatic model tuning receives the customized algorithm, training data, and hyperparameter search space configurations, and searches for best hyperparameters using different strategies such as Bayesian, Hyperband, and more with multiple GPU instances in parallel. After evaluating on a hold-out validation set, we obtained the best model architecture with two layers of CNN, one layer of TRANS with four heads, and an AGG layer.

We used the following hyperparameter ranges to search for the best model architecture:

hyperparameter_ranges = {
# Learning Rate
"learning_rate": ContinuousParameter(5e-4, 1e-3, scaling_type="Logarithmic"),
# Class weights
"loss_weight": ContinuousParameter(0.1, 0.9),
# Number of input bins
"num_bins": CategoricalParameter([10, 40, 60, 120, 240]),
# Dropout rate
"dropout_rate": CategoricalParameter([0.1, 0.2, 0.3, 0.4, 0.5]),
# Model embedding dimension
"dim_model": CategoricalParameter([160,320,480,640]),
# Number of CNN layers
"num_cnn_layers": IntegerParameter(0,10),
# CNN kernel size
"cnn_kernel": CategoricalParameter([3,5,7,9]),
# Number of tranformer layers
"num_transformer_layers": IntegerParameter(0,4),
# Number of transformer attention heads
"num_heads": CategoricalParameter([4,8]),
#Number of RNN layers
"num_rnn_layers": IntegerParameter(0,10), # optional
# RNN input dimension size
"dim_rnn":CategoricalParameter([128,256])
}

To further improve model accuracy and reduce model variance, we trained the model with multiple independent random weight initializations, and aggregated the result with mean values as the final probability prediction. There is a trade-off between more computing resources and better model performance, and we observed that 5–10 should be a reasonable number in the current use case (results shown later in this post).

Model performance results

In this section, we present the model performance evaluation metrics and results.

Evaluation metrics

Precision is very important for this predictive maintenance use case. Low precision means reporting more false maintenance calls, which drives costs up through unnecessary maintenance. Because average precision (AP) doesn’t fully align with the high precision objective, we introduced a new metric named average recall at high precisions (ARHP). ARHP is equal to the average of recalls at 60%, 70%, and 80% precision points. We also used precision at top K% (K=1, 10), AUPR, and AUROC as additional metrics.

Results

The following table summarizes the results using the baseline and the customized neural network models, with 7/1/2022 as the train/test split point. Experiments show that increasing the window length and sample data size both improve the model performance, because they contain more historical information to help with the prediction. Regardless of the data settings, the neural network model outperforms AutoGluon in all metrics. For example, recall at the fixed 80% precision is increased by 121%, which enables you to quickly identify more malfunctioned machines if using the neural network model.

Model	Window length/Data size	AUROC	AUPR	ARHP	Recall@Prec0.6	Recall@Prec0.7	Recall@Prec0.8	Prec@top1%	Prec@top10%
AutoGluon baseline	12H/500k	66.5	36.1	9.5	12.7	9.3	6.5	85	42
Neural Network	12H/500k	74.7	46.5	18.5	25	18.1	12.3	89	55
AutoGluon baseline	48H/1mm	70.2	44.9	18.8	26.5	18.4	11.5	92	55
Neural Network	48H/1mm	75.2	53.1	32.4	39.3	32.6	25.4	94	65

The following figures illustrate the effect of using ensembles to boost the neural network model performance. All the evaluation metrics shown on the x-axis are improved, with higher mean (more accurate) and lower variance (more stable). Each box-plot is from 12 repeated experiments, from no ensembles to 10 models in ensembles (x-axis). Similar trends persist in all metrics besides the Prec@top1% and Recall@Prec80% shown.

After factoring in the computational cost, we observe that using 5–10 models in ensembles is suitable for Light & Wonder datasets.

Conclusion

Our collaboration has resulted in the creation of a groundbreaking predictive maintenance solution for the gaming industry, as well as a reusable framework that could be utilized in a variety of predictive maintenance scenarios. The adoption of AWS technologies such as SageMaker automatic model tuning facilitates Light & Wonder to navigate new opportunities using near-real-time data streams. Light & Wonder is starting the deployment on AWS.

If you would like help accelerating the use of ML in your products and services, please contact the Amazon ML Solutions Lab program.

About the authors

Aruna Abeyakoon is the Senior Director of Data Science & Analytics at Light & Wonder Land-based Gaming Division. Aruna leads the industry-first Light & Wonder Connect initiative and supports both casino partners and internal stakeholders with consumer behavior and product insights to make better games, optimize product offerings, manage assets, and health monitoring & predictive maintenance.

Denisse Colin is a Senior Data Science Manager at Light & Wonder, a leading cross-platform global game company. She is a member of the Gaming Data & Analytics team helping develop innovative solutions to improve product performance and customers’ experiences through Light & Wonder Connect.

Tesfagabir Meharizghi is a Data Scientist at the Amazon ML Solutions Lab where he helps AWS customers across various industries such as gaming, healthcare and life sciences, manufacturing, automotive, and sports and media, accelerate their use of machine learning and AWS cloud services to solve their business challenges.

Mohamad Aljazaery is an applied scientist at Amazon ML Solutions Lab. He helps AWS customers identify and build ML solutions to address their business challenges in areas such as logistics, personalization and recommendations, computer vision, fraud prevention, forecasting and supply chain optimization.

Yawei Wang is an Applied Scientist at the Amazon ML Solution Lab. He helps AWS business partners identify and build ML solutions to address their organization’s business challenges in a real-world scenario.

Yun Zhou is an Applied Scientist at the Amazon ML Solutions Lab, where he helps with research and development to ensure the success of AWS customers. He works on pioneering solutions for various industries using statistical modeling and machine learning techniques. His interest includes generative models and sequential data modeling.

Panpan Xu is a Applied Science Manager with the Amazon ML Solutions Lab at AWS. She is working on research and development of Machine Learning algorithms for high-impact customer applications in a variety of industrial verticals to accelerate their AI and cloud adoption. Her research interest includes model interpretability, causal analysis, human-in-the-loop AI and interactive data visualization.

Raj Salvaji leads Solutions Architecture in the Hospitality segment at AWS. He works with hospitality customers by providing strategic guidance, technical expertise to create solutions to complex business challenges. He draws on 25 years of experience in multiple engineering roles across Hospitality, Finance and Automotive industries.

Shane Rai is a Principal ML Strategist with the Amazon ML Solutions Lab at AWS. He works with customers across a diverse spectrum of industries to solve their most pressing and innovative business needs using AWS’s breadth of cloud-based AI/ML services.

Yash Kanoria wins SIGecom Test of Time Award

June 22, 2023

by Amazon AWS

Kanoria and coauthors honored for their paper narrowing the gap between theoretical understanding and practical experience in matching markets.Read More

How Ali Dashti helps advance the science behind marketing collections

June 21, 2023

by Amazon AWS

The senior applied science manager envisions machine learning as the path to a better experience for Amazon customers.Read More

Use the AWS CDK to deploy Amazon SageMaker Studio lifecycle configurations

June 21, 2023

by Cory Hairston Amazon AWS

Amazon SageMaker Studio is the first fully integrated development environment (IDE) for machine learning (ML). Studio provides a single web-based visual interface where you can perform all ML development steps required to prepare data, as well as build, train, and deploy models. Lifecycle configurations are shell scripts triggered by Studio lifecycle events, such as starting a new Studio notebook. You can use lifecycle configurations to automate customization for your Studio environment. This customization includes installing custom packages, configuring notebook extensions, preloading datasets, and setting up source code repositories. For example, as an administrator for a Studio domain, you may want to save costs by having notebook apps shut down automatically after long periods of inactivity.

The AWS Cloud Development Kit (AWS CDK) is a framework for defining cloud infrastructure through code and provisioning it through AWS CloudFormation stacks. A stack is a collection of AWS resources that can be programmatically updated, moved, or deleted. AWS CDK constructs are the building blocks of AWS CDK applications, representing the blueprint to define cloud architectures.

In this post, we show how to use the AWS CDK to set up Studio, use Studio lifecycle configurations, and enable its access for data scientists and developers in your organization.

Solution overview

The modularity of lifecycle configurations allows you to apply them to all users in a domain or to specific users. This way, you can set up lifecycle configurations and reference them in the Studio kernel gateway or Jupyter server quickly and consistently. The kernel gateway is the entry point to interact with a notebook instance, whereas the Jupyter server represents the Studio instance. This enables you to apply DevOps best practices and meet safety, compliance, and configuration standards across all AWS accounts and Regions. For this post, we use Python as the main language, but the code can be easily changed to other AWS CDK supported languages. For more information, refer to Working with the AWS CDK.

Prerequisites

To get started, make sure you have the following prerequisites:

The AWS Command Line Interface (AWS CLI) installed.
The AWS CDK installed. For more information, refer to Getting started with the AWS CDK and Working with the AWS CDK in Python.
An AWS profile with permissions to create AWS Identity and Access Management (IAM) roles, Studio domains, and Studio user profiles.
Python 3+.

Clone the GitHub repository

First, clone the GitHub repository.

As you clone the repository, you can observe that we have a classic AWS CDK project with the directory studio-lifecycle-config-construct, which contains the construct and resources required to create lifecycle configurations.

AWS CDK constructs

The file we want to inspect is aws_sagemaker_lifecycle.py. This file contains the SageMakerStudioLifeCycleConfig construct we use to set up and create lifecycle configurations.

The SageMakerStudioLifeCycleConfig construct provides the framework for building lifecycle configurations using a custom AWS Lambda function and shell code read in from a file. The construct contains the following parameters:

ID – The name of the current project.
studio_lifecycle_content – The base64 encoded content.
studio_lifecycle_tags – Labels you assign to organize Amazon resources. They are inputted as key-value pairs and are optional for this configuration.
studio_lifecycle_config_app_type – JupyterServer is for the unique server itself, and the KernelGateway app corresponds to a running SageMaker image container.

For more information on the Studio notebook architecture, refer to Dive deep into Amazon SageMaker Studio Notebooks architecture.

The following is a code snippet of the Studio lifecycle config construct (aws_sagemaker_lifecycle.py):

class SageMakerStudioLifeCycleConfig(Construct):
 def __init__(
 self,
 scope: Construct,
 id: str,
 studio_lifecycle_config_content: str,
 studio_lifecycle_config_app_type: str,
 studio_lifecycle_config_name: str,
 studio_lifecycle_config_arn: str,
 **kwargs,
 ):
 super().__init__(scope, id)
 self.studio_lifecycle_content = studio_lifecycle_content
 self.studio_lifecycle_config_name = studio_lifecycle_config_name
 self.studio_lifecycle_config_app_type = studio_lifecycle_config_app_type

 lifecycle_config_role = iam.Role(
 self,
 "SmStudioLifeCycleConfigRole",
 assumed_by=iam.ServicePrincipal("lambda.amazonaws.com"),
 )

 lifecycle_config_role.add_to_policy(
 iam.PolicyStatement(
 resources=[f"arn:aws:sagemaker:{scope.region}:{scope.account}:*"],
 actions=[
 "sagemaker:CreateStudioLifecycleConfig",
 "sagemaker:ListUserProfiles",
 "sagemaker:UpdateUserProfile",
 "sagemaker:DeleteStudioLifecycleConfig",
 "sagemaker:AddTags",
 ],
 )
 )

 create_lifecycle_script_lambda = lambda_.Function(
 self,
 "CreateLifeCycleConfigLambda",
 runtime=lambda_.Runtime.PYTHON_3_8,
 timeout=Duration.minutes(3),
 code=lambda_.Code.from_asset(
 "../mlsl-cdk-constructs-lib/src/studiolifecycleconfigconstruct"
 ),
 handler="onEvent.handler",
 role=lifecycle_config_role,
 environment={
 "studio_lifecycle_content": self.studio_lifecycle_content,
 "studio_lifecycle_config_name": self.studio_lifecycle_config_name,
 "studio_lifecycle_config_app_type": self.studio_lifecycle_config_app_type,
 },
 )

 config_custom_resource_provider = custom_resources.Provider(
 self,
 "ConfigCustomResourceProvider",
 on_event_handler=create_lifecycle_script_lambda,
 )

 studio_lifecyle_config_custom_resource = CustomResource(
 self,
 "LifeCycleCustomResource",
 service_token=config_custom_resource_provider.service_token,
 )
 self. studio_lifecycle_config_arn = studio_lifecycle_config_custom_resource.get_att("StudioLifecycleConfigArn")

After you import and install the construct, you can use it. The following code snippet shows how to create a lifecycle config using the construct in a stack either in app.py or another construct:

my_studio_lifecycle_config = SageMakerStudioLifeCycleConfig(
 self,
 "MLSLBlogPost",
 studio_lifecycle_config_content="base64content",
 studio_lifecycle_config_name="BlogPostTest",
 studio_lifecycle_config_app_type="JupyterServer",
 
 )

Deploy AWS CDK constructs

To deploy your AWS CDK stack, run the following commands in the location where you cloned the repository.

The command may be python instead of python3 depending on your path configurations.

Create a virtual environment:
1. For macOS/Linux, use python3 -m venv .cdk-venv.
2. For Windows, use python3 -m venv .cdk-venv.
Activate the virtual environment:
1. For macOS/Linux, use source .cdk-venvbinactivate.
2. For Windows, use .cdk-venv/Scripts/activate.bat.
3. For PowerShell, use .cdk-venv/Scripts/activate.ps1.
Install the required dependencies:
1. pip install -r requirements.txt
2. pip install -r requirements-dev.txt
At this point, you can optionally synthesize the CloudFormation template for this code:
```
cdk synth
```
Deploy the solution with the following commands:
1. aws configure
2. cdk bootstrap
3. cdk deploy

When the stack is successfully deployed, you should be able to view the stack on the CloudFormation console.

You will also be able to view the lifecycle configuration on the SageMaker console.

Choose the lifecycle configuration to view the shell code that runs as well as any tags you assigned.

Attach the Studio lifecycle configuration

There are multiple ways to attach a lifecycle configuration. In this section, we present two methods: using the AWS Management Console, and programmatically using the infrastructure provided.

Attach the lifecycle configuration using the console

To use the console, complete the following steps:

On the SageMaker console, choose Domains in the navigation pane.
Choose the domain name you’re using and the current user profile, then choose Edit.
Select the lifecycle configuration you want to use and choose Attach.

From here, you can also set it as default.

Attach the lifecycle configuration programmatically

You can also retrieve the ARN of the Studio lifecycle configuration created by the construct’s and attach it to the Studio construct programmatically. The following code shows the lifecycle configuration ARN being passed to a Studio construct:

default_user_settings=sagemaker.CfnDomain.UserSettingsProperty(
                execution_role=self.sagemaker_role.role_arn,
                jupyter_server_app_settings=sagemaker.CfnDomain.JupyterServerAppSettingsProperty(
                    default_resource_spec=sagemaker.CfnDomain.ResourceSpecProperty(
                        instance_type="system",
                        lifecycle_config_arn = my_studio_lifecycle_config.studio_lifeycycle_config_arn

                    )
                )

Clean up

Complete the steps in this section to clean up your resources.

Delete the Studio lifecycle configuration

To delete your lifecycle configuration, complete the following steps:

On the SageMaker console, choose Studio lifecycle configurations in the navigation pane.
Select the lifecycle configuration, then choose Delete.

Delete the AWS CDK stack

When you’re done with the resources you created, you can destroy your AWS CDK stack by running the following command in the location where you cloned the repository:

cdk destroy

When asked to confirm the deletion of the stack, enter yes.

You can also delete the stack on the AWS CloudFormation console with the following steps:

On the AWS CloudFormation console, choose Stacks in the navigation pane.
Choose the stack that you want to delete.
In the stack details pane, choose Delete.
Choose Delete stack when prompted.

If you run into any errors, you may have to manually delete some resources depending on your account configuration.

Conclusion

In this post, we discussed how Studio serves as an IDE for ML workloads. Studio offers lifecycle configuration support, which allows you to set up custom shell scripts to perform automated tasks, or set up development environments at launch. We used AWS CDK constructs to build the infrastructure for the custom resource and lifecycle configuration. Constructs are synthesized into CloudFormation stacks that are then deployed to create the custom resource and lifecycle script that is used in Studio and the notebook kernel.

For more information, visit Amazon SageMaker Studio.

About the Authors

Cory Hairston is a Software Engineer with the Amazon ML Solutions Lab. He currently works on providing reusable software solutions.

Alex Chirayath is a Senior Machine Learning Engineer at the Amazon ML Solutions Lab. He leads teams of data scientists and engineers to build AI applications to address business needs.

Gouri Pandeshwar is an Engineer Manager at the Amazon ML Solutions Lab. He and his team of engineers are working to build reusable solutions and frameworks that help accelerate adoption of AWS AI/ML services for customers’ business use cases.

Boost agent productivity with Salesforce integration for Live Call Analytics

June 21, 2023

by Kishore Dhamodaran Amazon AWS

As a contact center agent, would you rather focus on having productive customer conversations or get distracted by having to look up customer information and knowledge articles that could exist in various systems? We’ve all been there. Having a productive conversation while multitasking is challenging. A single negative experience may put a dent on a customer’s perception of your brand.

The Live Call Analytics with Agent Assist (LCA) open-source solution addresses these challenges by providing features such as AI-powered agent assistance, call transcription, call summarization, and much more. As part of our effort to meet the needs of your agents, we strive to add features based on your feedback and our own experience helping contact center operators.

One of the features we added is the ability to write your own AWS Lambda hooks for the start of call and post-call to custom process calls as they occur. This makes it easier to custom integrate with LCA architecture without complex modification to the original source code. It also lets you update LCA stack deployments more easily and quickly than if you were modifying the code directly.

Today, we are excited to announce a feature that lets you integrate LCA with your Customer Relationship Management (CRM) system, built on top of the pre- and post-call Lambda hooks.

In this post, we walk you through setting up the LCA/CRM integration with Salesforce.

Solution overview

LCA now has two additional Lambda hooks:

Start of call Lambda hook – The LCA Call Event/Transcript Processor invokes this hook at the beginning of each call. This function can implement custom logic that applies to the beginning of call processing, such as retrieving call summary details logged into a case in a CRM.
Post-call summary Lambda hook – The LCA Call Event/Transcript Processor invokes this hook after the call summary is processed. This function can implement custom logic that’s relevant to postprocessing, for example, updating the call summary to a CRM system.

The following diagram illustrates the start of call and post-call (summary) Lambda hooks that integrate with Salesforce to look up and update case records, respectively.

Here are the steps we walk you through:

Set up Salesforce to allow the custom Lambda hooks to look up or update the case records.
Deploy the LCA and Salesforce integration stacks.
Update the LCA stack with the Salesforce integration Lambda hooks and perform validations.

Prerequisites

You need the following prerequisites:

An existing Salesforce organization. Sign up for a free Salesforce Developer Edition organization, if you don’t have one.
An AWS account. If you don’t have one, sign up at https://aws.amazon.com.
The AWS Command Line Interface (AWS CLI) version 2 installed.
The AWS Serverless Application Model Command Line Interface (AWS SAM CLI), to build and deploy your SAM application.

Create a Salesforce connected app

To set up your Salesforce app, complete the following steps:

Log in to your Salesforce org and go to Setup.
Search for App Manager and choose App Manager.
Choose New Connected App.
For Connected App Name, enter a name.
For Contact Email, enter a valid email.
Select Enable OAuth Settings and enter a value for Callback URL.
Under Available OAuth Scopes, choose Manage user data via APIs (api).
Select Require Secret for Webserver Flow and Require Secret for Refresh Token Flow.
Choose Save.
Under API (Enable OAuth Settings), choose Manage Consumer Details.
Verify your identity if prompted.
Copy the consumer key and consumer secret.

You need these when deploying the AWS Serverless Application Model (AWS SAM) application.

Get your Salesforce access token

If you don’t already have an access token, you need to obtain one. Before doing this, make sure that you’re prepared to update any applications that are using an access token because this step creates a new one and may invalidate the prior tokens.

Find your personal information by choosing Settings from View profile on the top right.
Choose Reset My Security Token followed by Reset Security Token.
Make note of the new access token that you receive via email.

Create a Salesforce customer contact record for each caller

The Lambda function that performs case look-up and update matches the caller’s phone number with a contact record in Salesforce. To create a new contact, complete the following steps:

Log in to your Salesforce org.
Under App Launcher, search for and choose Service Console.
On the Service Console page, choose Contacts from the drop-down list, then choose New.
Enter a valid phone number under the Phone field of the New Contact page.
Enter other contact details and choose Save.
Repeat Steps 1–5 for any caller that makes a phone call and test the integration.

Deploy the LCA stack

Complete the following steps to deploy the LCA stack:

Follow the instructions under the Deploy the CloudFormation stack section of Live call analytics and agent assist for your contact center with Amazon language AI services.
Make sure that you choose ANTHROPIC, SAGEMAKER, or LAMBDA for the End of Call Transcript Summary parameter. See Transcript Summarization for more details.

The stacks take about 45 minutes to deploy.

After the main stack shows CREATE_COMPLETE, on the Outputs tab, make a note of the Kinesis data stream ARN (CallDataStreamArn).

Deploy the Salesforce integration stack

To deploy the Salesforce integration stack, complete the following steps:

Open a command-line terminal and run the following commands:

https://github.com/aws-samples/amazon-transcribe-live-call-analytics.git
cd amazon-transcribe-live-call-analytics/plugins/salesforce-integration
sam build
sam deploy —guided

Use the following table as a reference for parameter choices.

Parameter Name	Description
AWS Region	The Region where you have deployed the LCA solution
SalesforceUsername	The user name of your Salesforce organization that has permissions to read and create cases
SalesforcePassword	The password associated to your Salesforce user name
SalesforceAccessToken	The access token you obtained earlier
SalesforceConsumerKey	The consumer key you copied earlier
SalesforceConsumerSecret	The consumer secret you obtained earlier
SalesforceHostUrl	The login URL of your Salesforce organization
SalesforceAPIVersion	The Salesforce API version (choose default or v56.0)
LCACallDataStreamArn	The Kinesis data stream ARN (CallDataStreamArn) obtained earlier

After the stack successfully deploys, make a note of StartOfCallLambdaHookFunctionArn and PostCallSummaryLambdaHookFunctionArn from the outputs displayed on your terminal.

Update LCA Stack

Complete the following steps to update the LCA stack:

On the AWS CloudFormation console, update the main LCA stack.
Choose Use current template.
For Lambda Hook Function ARN for Custom Start of Call Processing (existing), provide the StartOfCallLambdaHookFunctionArn that you obtained earlier.
For Lambda Hook Function ARN for Custom Post Processing, after the Call Transcript Summary is processed (existing), provide the PostCallSummaryLambdaHookFunctionArn that you obtained earlier.
Make sure that End of Call Transcript Summary is not DISABLED.

Validate the integration

Make a test call and make sure you can see the beginning of call AGENT ASSIST and post-call AGENT ASSIST transcripts. Refer to the Explore live call analysis and agent assist features section of the Live call analytics and agent assist for your contact center with Amazon language AI services post for guidance.

Clean up

To avoid incurring charges, clean up your resources by following these instructions when you are finished experimenting with this solution:

On the AWS CloudFormation console, and delete the LCA stacks that you deployed. This deletes resources that were created by deploying the solution. The recording S3 buckets, DynamoDB table, and CloudWatch log groups are retained after the stack is deleted to avoid deleting your data.
On your terminal, run sam delete to delete the Salesforce integration Lambda functions.
Follow the instructions in Deactivate a Developer Edition Org to deactivate your Salesforce Developer org.

Conclusion

In this post, we demonstrated how the Live-Call Analytics sample project can accelerate your adoption of real-time contact center analytics and integration. Rather than building from scratch, we show how to use the existing code base with the pre-built integration points with the start of call and post-call Lambda hooks. This enhances agent productivity by integrating with Salesforce to look up and update case records. Explore our open-source project and enhance the CRM pre- and post-call Lambda hooks to accommodate your use case.

About the Authors

Kishore Dhamodaran is a Senior Solutions Architect at AWS.

Bob Strahan is a Principal Solutions Architect in the AWS Language AI Services team.

Christopher Lott is a Senior Solutions Architect in the AWS AI Language Services team. He has 20 years of enterprise software development experience. Chris lives in Sacramento, California and enjoys gardening, aerospace, and traveling the world.

Babu Srinivasan is a Sr. Specialist SA – Language AI services in the World Wide Specialist organization at AWS, with over 24 years of experience in IT and the last 6 years focused on the AWS Cloud. He is passionate about AI/ML. Outside of work, he enjoys woodworking and entertains friends and family (sometimes strangers) with sleight of hand card magic.

A quick guide to Amazon’s 20-plus papers at CVPR

June 21, 2023

by Amazon AWS

Image segmentation, multimodal models, and innovative machine learning methods are among the Amazon researchers’ areas of focus.Read More

Reduce energy consumption of your machine learning workloads by up to 90% with AWS purpose-built accelerators

June 20, 2023

by Karsten Schroer Amazon AWS

Machine learning (ML) engineers have traditionally focused on striking a balance between model training and deployment cost vs. performance. Increasingly, sustainability (energy efficiency) is becoming an additional objective for customers. This is important because training ML models and then using the trained models to make predictions (inference) can be highly energy-intensive tasks. In addition, more and more applications around us have become infused with ML, and new ML-powered applications are conceived every day. A popular example is OpenAI’s ChatGPT, which is powered by a state-of-the-art large language model (LMM). For reference, GPT-3, an earlier generation LLM has 175 billion parameters and requires months of non-stop training on a cluster of thousands of accelerated processors. The Carbontracker study estimates that training GPT-3 from scratch may emit up to 85 metric tons of CO2 equivalent, using clusters of specialized hardware accelerators.

There are several ways AWS is enabling ML practitioners to lower the environmental impact of their workloads. One way is through providing prescriptive guidance around architecting your AI/ML workloads for sustainability. Another way is by offering managed ML training and orchestration services such as Amazon SageMaker Studio, which automatically tears down and scales up ML resources when not in use, and provides a host of out-of-the-box tooling that saves cost and resources. Another major enabler is the development of energy efficient, high-performance, purpose-built accelerators for training and deploying ML models.

The focus of this post is on hardware as a lever for sustainable ML. We present the results of recent performance and power draw experiments conducted by AWS that quantify the energy efficiency benefits you can expect when migrating your deep learning workloads from other inference- and training-optimized accelerated Amazon Elastic Compute Cloud (Amazon EC2) instances to AWS Inferentia and AWS Trainium. Inferentia and Trainium are AWS’s recent addition to its portfolio of purpose-built accelerators specifically designed by Amazon’s Annapurna Labs for ML inference and training workloads.

AWS Inferentia and AWS Trainium for sustainable ML

To provide you with realistic numbers of the energy savings potential of AWS Inferentia and AWS Trainium in a real-world application, we have conducted several power draw benchmark experiments. We have designed these benchmarks with the following key criteria in mind:

First, we wanted to make sure that we captured direct energy consumption attributable to the test workload, including not just the ML accelerator but also the compute, memory, and network. Therefore, in our test setup, we measured power draw at that level.
Second, when running the training and inference workloads, we ensured that all instances were operating at their respective physical hardware limits and took measurements only after that limit was reached to ensure comparability.
Finally, we wanted to be certain that the energy savings reported in this post could be achieved in a practical real-world application. Therefore, we used common customer-inspired ML use cases for benchmarking and testing.

The results are reported in the following sections.

Inference experiment: Real-time document understanding with LayoutLM

Inference, as opposed to training, is a continuous, unbounded workload that doesn’t have a defined completion point. It therefore makes up a large portion of the lifetime resource consumption of an ML workload. Getting inference right is key to achieving high performance, low cost, and sustainability (better energy efficiency) along the full ML lifecycle. With inference tasks, customers are usually interested in achieving a certain inference rate to keep up with the ingest demand.

The experiment presented in this post is inspired by a real-time document understanding use case, which is a common application in industries like banking or insurance (for example, for claims or application form processing). Specifically, we select LayoutLM, a pre-trained transformer model used for document image processing and information extraction. We set a target SLA of 1,000,000 inferences per hour, a value often considered as real time, and then specify two hardware configurations capable of meeting this requirement: one using Amazon EC2 Inf1 instances, featuring AWS Inferentia, and one using comparable accelerated EC2 instances optimized for inference tasks. Throughout the experiment, we track several indicators to measure inference performance, cost, and energy efficiency of both hardware configurations. The results are presented in the following figure.

Performance, Cost and Energy Efficiency Results of Inference Benchmarks

AWS Inferentia delivers 6.3 times higher inference throughput. As a result, with Inferentia, you can run the same real-time LayoutLM-based document understanding workload on fewer instances (6 AWS Inferentia instances vs. 33 other inference-optimized accelerated EC2 instances, equivalent to an 82% reduction), use less than a tenth (-92%) of the energy in the process, all while achieving significantly lower cost per inference (USD 2 vs. USD 25 per million inferences, equivalent to a 91% cost reduction).

Training experiment: Training BERT Large from scratch

Training, as opposed to inference, is a finite process that is repeated much less frequently. ML engineers are typically interested in high cluster performance to reduce training time while keeping cost under control. Energy efficiency is a secondary (yet growing) concern. With AWS Trainium, there is no trade-off decision: ML engineers can benefit from high training performance while also optimizing for cost and reducing environmental impact.

To illustrate this, we select BERT Large, a popular language model used for natural language understanding use cases such as chatbot-based question answering and conversational response prediction. Training a well-performing BERT Large model from scratch typically requires 450 million sequences to be processed. We compare two cluster configurations, each with a fixed size of 16 instances and capable of training BERT Large from scratch (450 million sequences processed) in less than a day. The first uses traditional accelerated EC2 instances. The second setup uses Amazon EC2 Trn1 instances featuring AWS Trainium. Again, we benchmark both configurations in terms of training performance, cost, and environmental impact (energy efficiency). The results are shown in the following figure.

Performance, Cost and Energy Efficiency Results of Training Benchmarks

In the experiments, AWS Trainium-based instances outperformed the comparable training-optimized accelerated EC2 instances by a factor of 1.7 in terms of sequences processed per hour, cutting the total training time by 43% (2.3h versus 4h on comparable accelerated EC2 instances). As a result, when using a Trainium-based instance cluster, the total energy consumption for training BERT Large from scratch is approximately 29% lower compared to a same-sized cluster of comparable accelerated EC2 instances. Again, these performance and energy efficiency benefits also come with significant cost improvements: cost to train for the BERT ML workload is approximately 62% lower on Trainium instances (USD 787 versus USD 2091 per full training run).

Getting started with AWS purpose-built accelerators for ML

Although the experiments conducted here all use standard models from the natural language processing (NLP) domain, AWS Inferentia and AWS Trainium excel with many other complex model architectures including LLMs and the most challenging generative AI architectures that users are building (such as GPT-3). These accelerators do particularly well with models with over 10 billion parameters, or computer vision models like stable diffusion (see Model Architecture Fit Guidelines for more details). Indeed, many of our customers are already using Inferentia and Trainium for a wide variety of ML use cases.

To run your end-to-end deep learning workloads on AWS Inferentia- and AWS Trainium-based instances, you can use AWS Neuron. Neuron is an end-to-end software development kit (SDK) that includes a deep learning compiler, runtime, and tools that are natively integrated into the most popular ML frameworks like TensorFlow and PyTorch. You can use the Neuron SDK to easily port your existing TensorFlow or PyTorch deep learning ML workloads to Inferentia and Trainium and start building new models using the same well-known ML frameworks. For easier setup, use one of our Amazon Machine Images (AMIs) for deep learning, which come with many of the required packages and dependencies. Even simpler: you can use Amazon SageMaker Studio, which natively supports TensorFlow and PyTorch on Inferentia and Trainium (see the aws-samples GitHub repo for an example).

One final note: while Inferentia and Trainium are purpose built for deep learning workloads, many less complex ML algorithms can perform well on CPU-based instances (for example, XGBoost and LightGBM and even some CNNs). In these cases, a migration to AWS Graviton3 may significantly reduce the environmental impact of your ML workloads. AWS Graviton-based instances use up to 60% less energy for the same performance than comparable accelerated EC2 instances.

Conclusion

There is a common misconception that running ML workloads in a sustainable and energy-efficient fashion means sacrificing on performance or cost. With AWS purpose-built accelerators for machine learning, ML engineers don’t have to make that trade-off. Instead, they can run their deep learning workloads on highly specialized purpose-built deep learning hardware, such as AWS Inferentia and AWS Trainium, that significantly outperforms comparable accelerated EC2 instance types, delivering lower cost, higher performance, and better energy efficiency—up to 90%—all at the same time. To start running your ML workloads on Inferentia and Trainium, check out the AWS Neuron documentation or spin up one of the sample notebooks. You can also watch the AWS re:Invent 2022 talk on Sustainability and AWS silicon (SUS206), which covers many of the topics discussed in this post.

About the Authors

Karsten Schroer is a Solutions Architect at AWS. He supports customers in leveraging data and technology to drive sustainability of their IT infrastructure and build data-driven solutions that enable sustainable operations in their respective verticals. Karsten joined AWS following his PhD studies in applied machine learning & operations management. He is truly passionate about technology-enabled solutions to societal challenges and loves to dive deep into the methods and application architectures that underlie these solutions.

Kamran Khan is a Sr. Technical Product Manager at AWS Annapurna Labs. He works closely with AI/ML customers to shape the roadmap for AWS purpose-built silicon innovations coming out of Amazon’s Annapurna Labs. His specific focus is on accelerated deep-learning chips including AWS Trainium and AWS Inferentia. Kamran has 18 years of experience in the semiconductor industry. Kamran has over a decade of experience helping developers achieve their ML goals.

Onboard users to Amazon SageMaker Studio with Active Directory group-specific IAM roles

June 19, 2023

by Ram Vittal Amazon AWS

Amazon SageMaker Studio is a web-based integrated development environment (IDE) for machine learning (ML) that lets you build, train, debug, deploy, and monitor your ML models. For provisioning Studio in your AWS account and Region, you first need to create an Amazon SageMaker domain—a construct that encapsulates your ML environment. More concretely, a SageMaker domain consists of an associated Amazon Elastic File System (Amazon EFS) volume, a list of authorized users, and a variety of security, application, policy, and Amazon Virtual Private Cloud (Amazon VPC) configurations.

When creating your SageMaker domain, you can choose to use either AWS IAM Identity Center (successor to AWS Single Sign-On) or AWS Identity and Access Management (IAM) for user authentication methods. Both authentication methods have their own set of use cases; in this post, we focus on SageMaker domains with IAM Identity Center, or single sign-on (SSO) mode, as the authentication method.

With SSO mode, you set up an SSO user and group in IAM Identity Center and then grant access to either the SSO group or user from the Studio console. Currently, all SSO users in a domain inherit the domain’s execution role. This may not work for all organizations. For instance, administrators may want to set up IAM permissions for a Studio SSO user based on their Active Directory (AD) group membership. Furthermore, because administrators are required to manually grant SSO users access to Studio, the process may not scale when onboarding hundreds of users.

In this post, we provide prescriptive guidance for the solution to provision SSO users to Studio with least privilege permissions based on AD group membership. This guidance enables you to quickly scale for onboarding hundreds of users to Studio and achieve your security and compliance posture.

Solution overview

The following diagram illustrates the solution architecture.

The workflow to provision AD users in Studio includes the following steps:

Set up a Studio domain in SSO mode.
For each AD group:
1. Set up your Studio execution role with appropriate fine-grained IAM policies
2. Record an entry in the AD group-role mapping Amazon DynamoDB table.
Alternatively, you can adopt a naming standard for IAM role ARNs based on the AD group name and derive the IAM role ARN without needing to store the mapping in an external database.
Sync your AD users and groups and memberships to AWS Identity Center:
1. If you’re using an identity provider (IdP) that supports SCIM, use the SCIM API integration with IAM Identity Center.
2. If you are using self-managed AD, you may use AD Connector.
When the AD group is created in your corporate AD, complete the following steps:
1. Create a corresponding SSO group in IAM Identity Center.
2. Associate the SSO group to the Studio domain using the SageMaker console.
When an AD user is created in your corporate AD, a corresponding SSO user is created in IAM Identity Center.
When the AD user is assigned to an AD group, an IAM Identity Center API (CreateGroupMembership) is invoked, and SSO group membership is created.
The preceding event is logged in AWS CloudTrail with the name AddMemberToGroup.
An Amazon EventBridge rule listens to CloudTrail events and matches the AddMemberToGroup rule pattern.
The EventBridge rule triggers the target AWS Lambda function.
This Lambda function will call back IAM Identity Center APIs, get the SSO user and group information, and perform the following steps to create the Studio user profile (CreateUserProfile) for the SSO user:
1. Look up the DynamoDB table to fetch the IAM role corresponding to the AD group.
2. Create a user profile with the SSO user and the IAM role obtained from the lookup table.
3. The SSO user is granted access to Studio.
The SSO user is redirected to the Studio IDE via the Studio domain URL.

Note that, as of writing, Step 4b (associate the SSO group to the Studio domain) needs to be performed manually by an admin using the SageMaker console at the SageMaker domain level.

Set up a Lambda function to create the user profiles

The solution uses a Lambda function to create the Studio user profiles. We provide the following sample Lambda function that you can copy and modify to meet your needs for automating the creation of the Studio user profile. This function performs the following actions:

Receive the CloudTrail AddMemberToGroup event from EventBridge.
Retrieve the Studio DOMAIN_ID from the environment variable (you can alternatively hard-code the domain ID or use a DynamoDB table as well if you have multiple domains).
Read from a dummy markup table to match AD users to execution roles. You can change this to fetch from the DynamoDB table if you’re using a table-driven approach. If you use DynamoDB, your Lambda function’s execution role needs permissions to read from the table as well.
Retrieve the SSO user and AD group membership information from IAM Identity Center, based on the CloudTrail event data.
Create a Studio user profile for the SSO user, with the SSO details and the matching execution role.

import os
import json
import boto3
DOMAIN_ID = os.environ.get('DOMAIN_ID', 'd-xxxx')


def lambda_handler(event, context):
    
    print({"Event": event})

    client = boto3.client('identitystore')
    sm_client = boto3.client('sagemaker')
    
    event_detail = event['detail']
    group_response = client.describe_group(
        IdentityStoreId=event_detail['requestParameters']['identityStoreId'],
        GroupId=event_detail['requestParameters']['groupId'],
    )
    group_name = group_response['DisplayName']
    
    user_response = client.describe_user(
        IdentityStoreId=event_detail['requestParameters']['identityStoreId'],
        UserId=event_detail['requestParameters']['member']['memberId']
    )
    user_name = user_response['UserName']
    print(f"Event details: {user_name} has been added to {group_name}")
    
    mapping_dict = {
        "ad-group-1": "<execution-role-arn>",
        "ad-group-2": "<execution-role-arn>”
    }
    
    user_role = mapping_dict.get(group_name)
    
    if user_role:
        response = sm_client.create_user_profile(
            DomainId=DOMAIN_ID,
            SingleSignOnUserIdentifier="UserName",
            SingleSignOnUserValue=user_name,
            # if the SSO user_name value is an email, 
	  #  add logic to handle it since Studio user profiles don’t accept @ character
            UserProfileName=user_name, 
            UserSettings={
                "ExecutionRole": user_role
            }
        )
        print(response)
    else:
        response = "Group is not authorized to use SageMaker. Doing nothing."
        print(response)
    return {
        'statusCode': 200,
        'body': json.dumps(response)
    }

Note that by default, the Lambda execution role doesn’t have access to create user profiles or list SSO users. After you create the Lambda function, access the function’s execution role on IAM and attach the following policy as an inline policy after scoping down as needed based on your organization requirements.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": [
                "identitystore:DescribeGroup",
                "identitystore:DescribeUser"
            ],
            "Effect": "Allow",
            "Resource": "*"
        },
        {
            "Action": "sagemaker:CreateUserProfile",
            "Effect": "Allow",
            "Resource": "*"
        },
        {
            "Action": "iam:PassRole",
            "Effect": "Allow",
            "Resource": [
                "<list-of-studio-execution-roles>"
            ]
        }
    ]
}

Set up the EventBridge rule for the CloudTrail event

EventBridge is a serverless event bus service that you can use to connect your applications with data from a variety of sources. In this solution, we create a rule-based trigger: EventBridge listens to events and matches against the provided pattern and triggers a Lambda function if the pattern match is successful. As explained in the solution overview, we listen to the AddMemberToGroup event. To set it up, complete the following steps:

On the EventBridge console, choose Rules in the navigation pane.
Choose Create rule.
Provide a rule name, for example, AddUserToADGroup.
Optionally, enter a description.
Select default for the event bus.
Under Rule type, choose Rule with an event pattern, then choose Next.
On the Build event pattern page, choose Event source as AWS events or EventBridge partner events.

Under Event pattern, choose the Custom patterns (JSON editor) tab and enter the following pattern:

{
  "source": ["aws.sso-directory"],
  "detail-type": ["AWS API Call via CloudTrail"],
  "detail": {
    "eventSource": ["sso-directory.amazonaws.com"],
    "eventName": ["AddMemberToGroup"]
  }
}

Choose Next.
On the Select target(s) page, choose the AWS service for the target type, the Lambda function as the target, and the function you created earlier, then choose Next.
Choose Next on the Configure tags page, then choose Create rule on the Review and create page.

After you’ve set the Lambda function and the EventBridge rule, you can test out this solution. To do so, open your IdP and add a user to one of the AD groups with the Studio execution role mapped. Once you add the user, you can verify the Lambda function logs to inspect the event and also see the Studio user provisioned automatically. Additionally, you can use the DescribeUserProfile API call to verify that the user is created with appropriate permissions.

Supporting multiple Studio accounts

To support multiple Studio accounts with the preceding architecture, we recommend the following changes:

Set up an AD group mapped to each Studio account level.
Set up a group-level IAM role in each Studio account.
Set up or derive the group to IAM role mapping.
Set up a Lambda function to perform cross-account role assumption, based on the IAM role mapping ARN and created user profile.

Deprovisioning users

When a user is removed from their AD group, you should remove their access from the Studio domain as well. With SSO, when a user is removed, the user is disabled in IAM Identity Center automatically if the AD to IAM Identity Center sync is in place, and their Studio application access is immediately revoked.

However, the user profile on Studio still persists. You can add a similar workflow with CloudTrail and a Lambda function to remove the user profile from Studio. The EventBridge trigger should now listen for the DeleteGroupMembership event. In the Lambda function, complete the following steps:

Obtain the user profile name from the user and group ID.
List all running apps for the user profile using the ListApps API call, filtering by the UserProfileNameEquals parameter. Make sure to check for the paginated response, to list all apps for the user.
Delete all running apps for the user and wait until all apps are deleted. You can use the DescribeApp API to view the app’s status.
When all apps are in a Deleted state (or Failed), delete the user profile.

With this solution in place, ML platform administrators can maintain group memberships in one central location and automate the Studio user profile management through EventBridge and Lambda functions.

The following code shows a sample CloudTrail event:

"AddMemberToGroup": 
{
    "eventVersion": "1.08",
    "userIdentity": {
        "type": "Unknown",
        "accountId": "<account-id>",
        "accessKeyId": "30997fec-b566-4b8b-810b-60934abddaa2"
    },
    "eventTime": "2022-09-26T22:24:18Z",
    "eventSource": "sso-directory.amazonaws.com",
    "eventName": "AddMemberToGroup",
    "awsRegion": "us-east-1",
    "sourceIPAddress": "54.189.184.116",
    "userAgent": "Okta SCIM Client 1.0.0",
    "requestParameters": {
        "identityStoreId": "d-906716eb24",
        "groupId": "14f83478-a061-708f-8de4-a3a2b99e9d89",
        "member": {
            "memberId": "04c8e458-a021-702e-f9d1-7f430ff2c752"
        }
    },
    "responseElements": null,
    "requestID": "b24a123b-afb3-4fb6-8650-b0dc1f35ea3a",
    "eventID": "c2c0873b-5c49-404c-add7-f10d4a6bd40c",
    "readOnly": false,
    "eventType": "AwsApiCall",
    "managementEvent": true,
    "recipientAccountId": "<account-id>",
    "eventCategory": "Management",
    "tlsDetails": {
        "tlsVersion": "TLSv1.2",
        "cipherSuite": "ECDHE-RSA-AES128-GCM-SHA256",
        "clientProvidedHostHeader": "up.sso.us-east-1.amazonaws.com"
    }
}

The following code shows a sample Studio user profile API request:

create-user-profile \
--domain-id d-xxxxxx \
--user-profile-name ssouserid
--single-sign-on-user-identifier 'userName' \
--single-sign-on-user-value 'ssouserid‘ \
--user-settings ExecutionRole=arn:aws:iam::<account id>:role/name

Conclusion

In this post, we discussed how administrators can scale Studio onboarding for hundreds of users based on their AD group membership. We demonstrated an end-to-end solution architecture that organizations can adopt to automate and scale their onboarding process to meet their agility, security, and compliance needs. If you’re looking for a scalable solution to automate your user onboarding, try this solution, and leave you feedback below! For more information about onboarding to Studio, see Onboard to Amazon SageMaker Domain.

About the authors

Ram Vittal is an ML Specialist Solutions Architect at AWS. He has over 20 years of experience architecting and building distributed, hybrid, and cloud applications. He is passionate about building secure and scalable AI/ML and big data solutions to help enterprise customers with their cloud adoption and optimization journey to improve their business outcomes. In his spare time, he rides his motorcycle and walks with his 2-year-old sheep-a-doodle!

Durga Sury is an ML Solutions Architect in the Amazon SageMaker Service SA team. She is passionate about making machine learning accessible to everyone. In her 4 years at AWS, she has helped set up AI/ML platforms for enterprise customers. When she isn’t working, she loves motorcycle rides, mystery novels, and hiking with her 5-year-old husky.

Paper on graph database schemata wins best-industry-paper award

June 16, 2023

by Amazon AWS

SIGMOD paper by Amazon researchers and collaborators presents flexible data definition language that enables rapid development of complex graph databases.Read More

SambaSafety automates custom R workload, improving driver safety with Amazon SageMaker and AWS Step Functions

June 16, 2023

by Dan Ferguson Amazon AWS

At SambaSafety, their mission is to promote safer communities by reducing risk through data insights. Since 1998, SambaSafety has been the leading North American provider of cloud–based mobility risk management software for organizations with commercial and non–commercial drivers. SambaSafety serves more than 15,000 global employers and insurance carriers with driver risk and compliance monitoring, online training and deep risk analytics, as well as risk pricing solutions. Through the collection, correlation and analysis of driver record, telematics, corporate and other sensor data, SambaSafety not only helps employers better enforce safety policies and reduce claims, but also helps insurers make informed underwriting decisions and background screeners perform accurate, efficient pre–hire checks.

Not all drivers present the same risk profile. The more time spent behind the wheel, the higher your risk profile. SambaSafety’s team of data scientists has developed complex and propriety modeling solutions designed to accurately quantify this risk profile. However, they sought support to deploy this solution for batch and real-time inference in a consistent and reliable manner.

In this post, we discuss how SambaSafety used AWS machine learning (ML) and continuous integration and continuous delivery (CI/CD) tools to deploy their existing data science application for batch inference. SambaSafety worked with AWS Advanced Consulting Partner Firemind to deliver a solution that used AWS CodeStar, AWS Step Functions, and Amazon SageMaker for this workload. With AWS CI/CD and AI/ML products, SambaSafety’s data science team didn’t have to change their existing development workflow to take advantage of continuous model training and inference.

Customer use case

SambaSafety’s data science team had long been using the power of data to inform their business. They had several skilled engineers and scientists building insightful models that improved the quality of risk analysis on their platform. The challenges faced by this team were not related to data science. SambaSafety’s data science team needed help connecting their existing data science workflow to a continuous delivery solution.

SambaSafety’s data science team maintained several script-like artifacts as part of their development workflow. These scripts performed several tasks, including data preprocessing, feature engineering, model creation, model tuning, and model comparison and validation. These scripts were all run manually when new data arrived into their environment for training. Additionally, these scripts didn’t perform any model versioning or hosting for inference. SambaSafety’s data science team had developed manual workarounds to promote new models to production, but this process became time-consuming and labor-intensive.

To free up SambaSafety’s highly skilled data science team to innovate on new ML workloads, SambaSafety needed to automate the manual tasks associated with maintaining existing models. Furthermore, the solution needed to replicate the manual workflow used by SambaSafety’s data science team, and make decisions about proceeding based on the outcomes of these scripts. Finally, the solution had to integrate with their existing code base. The SambaSafety data science team used a code repository solution external to AWS; the final pipeline had to be intelligent enough to trigger based on updates to their code base, which was written primarily in R.

Solution overview

The following diagram illustrates the solution architecture, which was informed by one of the many open-source architectures maintained by SambaSafety’s delivery partner Firemind.

The solution delivered by Firemind for SambaSafety’s data science team was built around two ML pipelines. The first ML pipeline trains a model using SambaSafety’s custom data preprocessing, training, and testing scripts. The resulting model artifact is deployed for batch and real-time inference to model endpoints managed by SageMaker. The second ML pipeline facilitates the inference request to the hosted model. In this way, the pipeline for training is decoupled from the pipeline for inference.

One of the complexities in this project is replicating the manual steps taken by the SambaSafety data scientists. The team at Firemind used Step Functions and SageMaker Processing to complete this task. Step Functions allows you to run discrete tasks in AWS using AWS Lambda functions, Amazon Elastic Kubernetes Service (Amazon EKS) workers, or in this case SageMaker. SageMaker Processing allows you to define jobs that run on managed ML instances within the SageMaker ecosystem. Each run of a Step Function job maintains its own logs, run history, and details on the success or failure of the job.

The team used Step Functions and SageMaker, together with Lambda, to handle the automation of training, tuning, deployment, and inference workloads. The only remaining piece was the continuous integration of code changes to this deployment pipeline. Firemind implemented a CodeStar project that maintained a connection to SambaSafety’s existing code repository. When the industrious data science team at SambaSafety posts an update to a specific branch of their code base, CodeStar picks up the changes and triggers the automation.

Conclusion

SambaSafety’s new serverless MLOps pipeline had a significant impact on their capability to deliver. The integration of data science and software development enables their teams to work together seamlessly. Their automated model deployment solution reduced time to delivery by up to 70%.

SambaSafety also had the following to say:

“By automating our data science models and integrating them into their software development lifecycle, we have been able to achieve a new level of efficiency and accuracy in our services. This has enabled us to stay ahead of the competition and deliver innovative solutions to clients. Our clients will greatly benefit from this with the faster turnaround times and improved accuracy of our solutions.”

SambaSafety connected with AWS account teams with their problem. AWS account and solutions architecture teams worked to identify this solution by sourcing from our robust partner network. Connect with your AWS account team to identify similar transformative opportunities for your business.

About the Authors

Dan Ferguson is an AI/ML Specialist Solutions Architect (SA) on the Private Equity Solutions Architecture at Amazon Web Services. Dan helps Private Equity backed portfolio companies leverage AI/ML technologies to achieve their business objectives.

Khalil Adib is a Data Scientist at Firemind, driving the innovation Firemind can provide to their customers around the magical worlds of AI and ML. Khalil tinkers with the latest and greatest tech and models, ensuring that Firemind are always at the bleeding edge.

Jason Mathew is a Cloud Engineer at Firemind, leading the delivery of projects for customers end-to-end from writing pipelines with IaC, building out data engineering with Python, and pushing the boundaries of ML. Jason is also the key contributor to Firemind’s open source projects.

Background

Dataset

Problem definition

Modeling challenges

Data preprocessing and feature engineering

Feature engineering

Data preparation

ML methodology and model training

Building a baseline model with AutoGluon

Building and tuning a customized neural network model with SageMaker automatic model tuning

Model performance results

Evaluation metrics

Results

Conclusion

About the authors

Solution overview

Prerequisites

Clone the GitHub repository

AWS CDK constructs

Deploy AWS CDK constructs

Attach the Studio lifecycle configuration

Attach the lifecycle configuration using the console

Attach the lifecycle configuration programmatically

Clean up

Conclusion

About the Authors

Solution overview

Prerequisites

Create a Salesforce connected app

Get your Salesforce access token

Create a Salesforce customer contact record for each caller

Deploy the LCA stack

Deploy the Salesforce integration stack

Update LCA Stack

Validate the integration

Clean up

Conclusion

About the Authors

AWS Inferentia and AWS Trainium for sustainable ML

Inference experiment: Real-time document understanding with LayoutLM

Training experiment: Training BERT Large from scratch

Getting started with AWS purpose-built accelerators for ML

Conclusion

About the Authors

Solution overview

Set up a Lambda function to create the user profiles

Set up the EventBridge rule for the CloudTrail event

Supporting multiple Studio accounts

Deprovisioning users

Conclusion

About the authors

Customer use case

Solution overview

Conclusion

About the Authors

Navigation

GenAI Vision Endless Possibilities

"I'm interested in things that change the world or that affect the future and wondrous, new technology where you see it, and you're like, 'Wow, how did that even happen? How is that possible?'" -- Elon Musk

Copyright © 2019-2025 Vedere AI. All Rights Reserved.