ML inferencing at the edge with Amazon SageMaker Edge and Ambarella CV25

Ambarella builds computer vision SoCs (systems on chip) based on CVflow, a highly efficient AI chip architecture that provides the Deep Neural Network (DNN) processing required for edge inferencing use cases like intelligent home monitoring and smart surveillance cameras. Developers convert models trained with frameworks such as TensorFlow or MXNet to the Ambarella CVflow format so that they can run on edge devices. Amazon SageMaker Edge has integrated the Ambarella toolchain into its workflow, allowing you to easily convert and optimize your models for the platform.

In this post, we show how to set up model optimization and conversion with SageMaker Edge, add the model to your edge application, and deploy and test your new model in an Ambarella CV25 device to build a smart surveillance camera application running on the edge.

Smart camera use case

Smart security cameras have use case-specific machine learning (ML)-enabled features like detecting vehicles and animals, or identifying possible suspicious behavior, parking violations, or zone violations. These scenarios require ML models to run on the edge computing unit in the camera with the highest possible performance.

Ambarella’s CVx processors, based on the company’s proprietary CVflow architecture, provide high DNN inference performance at very low power. This combination of high performance and low power makes them ideal for devices that require intelligence at the edge. ML models need to be optimized and compiled for the target platform to run on the edge. SageMaker Edge plays a key role here by optimizing and converting ML models from the most popular frameworks so they can run on the edge device.

Solution overview

Our smart security camera solution implements ML model optimization and compilation configuration, runtime operation, inference testing, and evaluation on the edge device. SageMaker Edge provides model optimization and conversion for edge devices to run faster with no loss in accuracy. The ML model can be in any framework that SageMaker Edge supports. For more information, see Supported Frameworks, Devices, Systems, and Architectures.

The SageMaker Edge integration of Ambarella CVflow tools provides additional advantages to developers using Ambarella SoCs:

  • Developers don’t need to deal with updates and maintenance of the compiler toolchain, because the toolchain is integrated and opaque to the user
  • Layers that CVflow doesn’t support are automatically compiled to run on the ARM by the SageMaker Edge compiler

The following diagram illustrates the solution architecture:

The steps to implement the solution are as follows:

  1. Prepare the model package.
  2. Configure and start the model’s compilation job for Ambarella CV25.
  3. Place the packaged model artifacts on the device.
  4. Test the inference on the device.

Prepare the model package

For Ambarella targets, SageMaker Edge requires a model package that contains a model configuration file called amba_config.json, calibration images, and a trained ML model file. The model package is a compressed TAR file (*.tar.gz). You can use an Amazon SageMaker notebook instance to train and test ML models and to prepare the model package file. To create a notebook instance, complete the following steps:

  1. On the SageMaker console, under Notebook in the navigation pane, choose Notebook instances.
  2. Choose Create notebook instance.
  3. Enter a name for your instance and choose ml.t2.medium as the instance type.

This instance is enough for testing and model preparation purposes.

  4. For IAM role, create a new AWS Identity and Access Management (IAM) role to allow access to Amazon Simple Storage Service (Amazon S3) buckets, or choose an existing role.
  5. Keep the other configurations as default and choose Create notebook instance.

When the status is InService, you can start using your new SageMaker notebook instance.

  6. Choose Open JupyterLab to access your workspace.

For this post, we use a pre-trained TFLite model to compile and deploy to the edge device. The chosen model is an SSD object detection model from the TensorFlow model zoo, pre-trained on the COCO dataset.

  7. Download the converted TFLite model.

Now you’re ready to download, test, and prepare the model package.

  8. Create a new notebook with the conda_tensorflow2_p36 kernel from the Launcher view.
  9. Import the required libraries as follows:
    import cv2
    import numpy as np
    from tensorflow.lite.python.interpreter import Interpreter
    from IPython.display import Image  # used later to display the result image

  10. Save the following example image as street-frame.jpg, create a folder called calib_img in the workspace folder, and upload the image to that folder.
  11. Upload the downloaded model package contents to the current folder.
  12. Run the following command to load your pre-trained TFLite model and print its parameters, which we need to configure our model for compilation:
    interpreter = Interpreter(model_path='ssd_mobilenet_v1_coco_2018_01_28.tflite')
    interpreter.allocate_tensors()
    
    input_details = interpreter.get_input_details()
    output_details = interpreter.get_output_details()
    height = input_details[0]['shape'][1]
    width = input_details[0]['shape'][2]
    
    print("Input name: '{}'".format(input_details[0]['name']))
    print("Input Shape: {}".format(input_details[0]['shape'].tolist()))

    The output contains the input name and input shape:

    Input name: 'normalized_input_image_tensor'
    Input Shape: [1, 300, 300, 3]

  13. Use the following code to load the test image and run inference:
    image = cv2.imread("calib_img/street-frame.jpg")
    image_rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    imH, imW, _ = image.shape
    image_resized = cv2.resize(image_rgb, (width, height))
    input_data = np.expand_dims(image_resized, axis=0)
    
    input_data = (np.float32(input_data) - 127.5) / 127.5
    
    interpreter.set_tensor(input_details[0]['index'], input_data)
    interpreter.invoke()
    
    boxes = interpreter.get_tensor(output_details[0]['index'])[0]
    classes = interpreter.get_tensor(output_details[1]['index'])[0]
    scores = interpreter.get_tensor(output_details[2]['index'])[0]
    num = interpreter.get_tensor(output_details[3]['index'])[0]

  14. Use the following code to visualize the detected bounding boxes on the image and save the result image as street-frame_results.jpg:
    with open('labelmap.txt', 'r') as f:
        labels = [line.strip() for line in f.readlines()]
    
    for i in range(len(scores)):
        if ((scores[i] > 0.1) and (scores[i] <= 1.0)):
            ymin = int(max(1, (boxes[i][0] * imH)))
            xmin = int(max(1, (boxes[i][1] * imW)))
            ymax = int(min(imH, (boxes[i][2] * imH)))
            xmax = int(min(imW, (boxes[i][3] * imW)))
    
            cv2.rectangle(image, (xmin, ymin), (xmax, ymax), (10, 255, 0), 2)
    
            object_name = labels[int(classes[i])]
            label = '%s: %d%%' % (object_name, int(scores[i]*100))
            labelSize, baseLine = cv2.getTextSize(label, cv2.FONT_HERSHEY_SIMPLEX, 0.7, 2)
            label_ymin = max(ymin, labelSize[1] + 10)
            cv2.rectangle(image, (xmin, label_ymin-labelSize[1]-10), (xmin + labelSize[0], label_ymin+baseLine-10), (255, 255, 255), cv2.FILLED)
            cv2.putText(image, label, (xmin, label_ymin-7), cv2.FONT_HERSHEY_SIMPLEX, 0.7, (0, 0, 0), 2)
    
    cv2.imwrite('street-frame_results.jpg', image)

  15. Use the following command to show the result image:
    Image(filename='street-frame_results.jpg') 

You get an inference result like the following image.

Our pre-trained TFLite model detects the car object from a security camera frame.

We’re done with testing the model; now let’s package the model and configuration files that Amazon SageMaker Neo requires for Ambarella targets.

  16. Create a text file called amba_config.json with the following content:
    {
        "inputs": {
            "normalized_input_image_tensor": {
                "shape": "1, 300, 300, 3",
                "filepath": "calib_img/"
            }
        }
    }

This file is the compilation configuration file for Ambarella CV25. The filepath value inside amba_config.json should match the calib_img folder name; a mismatch may cause a failure.
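
If you prefer to generate this configuration from the notebook rather than creating the file by hand, a short sketch like the following writes the same content shown above:

    import json

    # Write the Ambarella compilation configuration shown above to amba_config.json
    amba_config = {
        "inputs": {
            "normalized_input_image_tensor": {
                "shape": "1, 300, 300, 3",
                "filepath": "calib_img/"
            }
        }
    }

    with open('amba_config.json', 'w') as f:
        json.dump(amba_config, f, indent=4)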

The model package contents are now ready.

  17. Use the following commands to compress the package as a .tar.gz file:
    import tarfile
    with tarfile.open('ssd_mobilenet_v1_coco_2018_01_28.tar.gz', 'w:gz') as f:
        f.add('calib_img/')
        f.add('amba_config.json')
        f.add('ssd_mobilenet_v1_coco_2018_01_28.tflite')

  18. Upload the file to the SageMaker auto-created S3 bucket (or your designated S3 bucket) to use in the compilation job:
    import sagemaker
    sess = sagemaker.Session()
    bucket = sess.default_bucket() 
    print("S3 bucket: "+bucket)
    prefix = 'raw-models'
    model_path = sess.upload_data(path='ssd_mobilenet_v1_coco_2018_01_28.tar.gz', key_prefix=prefix)
    print("S3 uploaded model path: "+model_path)

The model package file contains calibration images, the compilation config file, and model files. After you upload the file to Amazon S3, you’re ready to start the compilation job.

Compile the model for Ambarella CV25

To start the compilation job, complete the following steps:

  1. On the SageMaker console, under Inference in the navigation pane, choose Compilation jobs.
  2. Choose Create compilation job.
  3. For Job name, enter a name.
  4. For IAM role, create a role or choose an existing role to give Amazon S3 read and write permission for the model files.
  5. In the Input configuration section, for Location of model artifacts, enter the S3 path of your uploaded model package file.
  6. For Data input configuration, enter {"normalized_input_image_tensor":[1, 300, 300, 3]}, which is the model’s input data shape obtained in previous steps.
  7. For Machine learning framework, choose TFLite.
  8. In the Output configuration section, for Target device, choose your device (amba_cv25).
  9. For S3 Output location, enter a folder in your S3 bucket for the compiled model to be saved in.
  10. Choose Submit to start the compilation process.

The compilation time depends on your model size and architecture. When your compiled model is ready in Amazon S3, the Status column shows as COMPLETED.
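
If you prefer to script the compilation instead of using the console, a minimal boto3 sketch with the same settings might look like the following. The job name and output prefix are placeholders, and it assumes the bucket and model_path variables from the upload step plus an execution role with the S3 permissions described earlier:

    import boto3
    import sagemaker

    sm_client = boto3.client('sagemaker')

    response = sm_client.create_compilation_job(
        CompilationJobName='ssd-mobilenet-amba-cv25',          # placeholder job name
        RoleArn=sagemaker.get_execution_role(),                # role with S3 read/write access
        InputConfig={
            'S3Uri': model_path,                               # uploaded model package (.tar.gz)
            'DataInputConfig': '{"normalized_input_image_tensor": [1, 300, 300, 3]}',
            'Framework': 'TFLITE',
        },
        OutputConfig={
            'S3OutputLocation': 's3://%s/compiled-models/' % bucket,   # placeholder output prefix
            'TargetDevice': 'amba_cv25',
        },
        StoppingCondition={'MaxRuntimeInSeconds': 7200},
    )
    print(response['CompilationJobArn'])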

If the compilation status shows FAILED, refer to Troubleshoot Ambarella Errors to debug compilation errors.

Place the model artifacts on the device

When the compilation job is complete, Neo saves the compiled package to the provided output location in the S3 bucket. The compiled model package file contains the converted and optimized model files, their configuration, and runtime files.

On the Amazon S3 console, download the compiled model package, then extract and transfer the model artifacts to your device to start using it with your edge ML inferencing app.
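
As an alternative to downloading from the console, a short sketch like the following pulls the compiled package into your workspace before you copy the extracted artifacts to the device. The object key is a placeholder; use the output location you chose for your compilation job:

    import boto3
    import tarfile

    s3 = boto3.client('s3')

    # Placeholder key: adjust it to match your compilation job's S3 output location
    compiled_key = 'compiled-models/ssd_mobilenet_v1_coco_2018_01_28-amba_cv25.tar.gz'
    s3.download_file(bucket, compiled_key, 'compiled-model.tar.gz')

    # Extract the converted model files, configuration, and runtime files
    with tarfile.open('compiled-model.tar.gz', 'r:gz') as f:
        f.extractall('compiled-model')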

Test the ML inference on the device

Navigate to your Ambarella device’s terminal and run the inferencing application binary on the device. The compiled and optimized ML model runs for the specified video source. You can observe detected bounding boxes on the output stream, as shown in the following screenshot.

Conclusion

In this post, we accomplished ML model preparation and conversion to Ambarella targets with SageMaker Edge, which has integrated the Ambarella toolchain. Optimizing and deploying high-performance ML models to Ambarella’s low-power edge devices unlocks intelligent edge solutions like smart security cameras.

As a next step, you can get started with SageMaker Edge and Ambarella CV25 to enable ML for edge devices. You can extend this use case with SageMaker ML development features to build an end-to-end pipeline that includes edge processing and deployment.


About the Authors

Emir Ayar is an Edge Prototyping Lead Architect on the AWS Prototyping team. He specializes in helping customers build IoT, Edge AI, and Industry 4.0 solutions and implement architectural best practices. He lives in Luxembourg and enjoys playing synthesizers.

Dinesh Balasubramaniam is responsible for marketing and customer support for Ambarella’s family of security SoCs, with expertise in systems engineering, software development, video compression, and product design. He earned an MS EE degree from the University of Texas at Dallas with a focus on signal processing.


Anomaly detection with Amazon SageMaker Edge Manager using AWS IoT Greengrass V2

Deploying and managing machine learning (ML) models at the edge requires a different set of tools and skillsets as compared to the cloud. This is primarily due to the hardware, software, and networking restrictions at the edge sites. This makes deploying and managing these models more complex. An increasing number of applications, such as industrial automation, autonomous vehicles, and automated checkouts, require ML models that run on devices at the edge so predictions can be made in real time when new data is available.

Another common challenge you may face when dealing with computing applications at the edge is how to efficiently manage the fleet of devices at scale. This includes installing applications, deploying application updates, deploying new configurations, monitoring device performance, troubleshooting devices, authenticating and authorizing devices, and securing the data transmission. These are foundational features for any edge application, but creating the infrastructure needed to achieve a secure and scalable solution requires a lot of effort and time.

On a smaller scale, you can adopt solutions such as manually logging in to each device to run scripts, using automated tools such as Ansible, or building custom applications that rely on services such as AWS IoT Core. Although such custom solutions can provide the necessary scalability and reliability, building them comes at the cost of additional maintenance and requires specialized skills.

Amazon SageMaker, together with AWS IoT Greengrass, can help you overcome these challenges.

SageMaker provides Amazon SageMaker Neo, which is the easiest way to optimize ML models for edge devices, enabling you to train ML models one time in the cloud and run them on any device. As devices proliferate, you may have thousands of deployed models running across your fleets. Amazon SageMaker Edge Manager allows you to optimize, secure, monitor, and maintain ML models on fleets of smart cameras, robots, personal computers, and mobile devices.

This post shows how to train and deploy an anomaly detection ML model to a simulated fleet of wind turbines at the edge using features of SageMaker and AWS IoT Greengrass V2. It takes inspiration from Monitor and Manage Anomaly Detection Models on a fleet of Wind Turbines with Amazon SageMaker Edge Manager by introducing AWS IoT Greengrass for deploying and managing inference application and the model on the edge devices.

In the previous post, the author used custom code relying on AWS IoT services, such as AWS IoT Core and AWS IoT Device Management, to provide the remote management capabilities to the fleet of devices. Although that is a valid approach, developers need to spend a lot of time and effort to implement and maintain such solutions, which they could spend on solving the business problem of providing efficient, performant, and accurate anomaly detection logic for the wind turbines.

The previous post also used a real 3D-printed mini wind turbine and a Jetson Nano to act as the edge device running the application. Here, we use virtual wind turbines that run as Python threads within a SageMaker notebook. Also, instead of a Jetson Nano, we use Amazon Elastic Compute Cloud (Amazon EC2) instances to act as edge devices, running the AWS IoT Greengrass software and the application. A simulator generates measurements for the wind turbines, which are sent to the edge devices using MQTT; we also use the simulator for visualizations and for stopping or starting the turbines.

The previous post goes into more detail about the ML aspects of the solution, such as how to build and train the model, which we don’t cover here. We focus primarily on the integration of Edge Manager and AWS IoT Greengrass V2.

Before we go any further, let’s review what AWS IoT Greengrass is and the benefits of using it with Edge Manager.

What is AWS IoT Greengrass V2?

AWS IoT Greengrass is an Internet of Things (IoT) open-source edge runtime and cloud service that helps build, deploy, and manage device software. You can use AWS IoT Greengrass for your IoT applications on millions of devices in homes, factories, vehicles, and businesses. AWS IoT Greengrass V2 offers an open-source edge runtime, improved modularity, new local development tools, and improved fleet deployment features. It provides a component framework that manages dependencies, and allows you to reduce the size of deployments because you can choose to only deploy the components required for the application.

Let’s go through some of the concepts of AWS IoT Greengrass to understand how it works:

  • AWS IoT Greengrass core device – A device that runs the AWS IoT Greengrass Core software. The device is registered into the AWS IoT Core registry as an AWS IoT thing.
  • AWS IoT Greengrass component – A software module that is deployed to and runs on a core device. All software that is developed and deployed with AWS IoT Greengrass is modeled as a component.
  • Deployment – The process to send components and apply the desired component configuration to a destination target device, which can be a single core device or a group of core devices.
  • AWS IoT Greengrass core software – The set of all AWS IoT Greengrass software that you install on a core device.

To enable remote application management on a device (or thousands of them), we first install the core software. This software runs as a background process and listens for deployment configurations sent from the cloud.

To run specific applications on the devices, we model the application as one or more components. For example, we can have a component providing a database feature, another component providing a local UX, or we can use public components provided by AWS, such as LogManager, which pushes component logs to Amazon CloudWatch.

We then create a deployment containing the necessary components and their specific configuration and send it to the target devices, either on a device-by-device basis or as a fleet.

To learn more, refer to What is AWS IoT Greengrass?

Why use AWS IoT Greengrass with Edge Manager?

The post Monitor and Manage Anomaly Detection Models on a fleet of Wind Turbines with Amazon SageMaker Edge Manager already explains why we use Edge Manager to provide the ML model runtime for the application. But let’s understand why we should use AWS IoT Greengrass to deploy applications to edge devices:

  • With AWS IoT Greengrass, you can automate the tasks needed to deploy the Edge Manager software onto the devices and manage the ML models. AWS IoT Greengrass provides a SageMaker Edge Agent as an AWS IoT Greengrass component, which provides model management and data capture APIs on the edge. Without AWS IoT Greengrass, setting up devices and fleets to use Edge Manager requires you to manually copy the Edge Manager agent from an Amazon Simple Storage Service (Amazon S3) release bucket. The agent is used to make predictions with models loaded onto edge devices.
  • With AWS IoT Greengrass and Edge Manager integration, you use AWS IoT Greengrass components. Components are pre-built software modules that can connect edge devices to AWS services or third-party services via AWS IoT Greengrass.
  • The solution takes a modular approach in which the inference application, model, and any other business logic can be packaged as a component where the dependencies can also be specified. You can manage the lifecycle, updates, and reinstalls of each of the components independently rather than treat everything as a monolith.
  • To make it easier to maintain AWS Identity and Access Management (IAM) roles, Edge Manager allows you to reuse the existing AWS IoT Core role alias. If it doesn’t exist, Edge Manager generates a role alias as part of the Edge Manager packaging job. You no longer need to associate a role alias generated from the Edge Manager packaging job with an AWS IoT Core role. This simplifies the deployment process for existing AWS IoT Greengrass customers.
  • You can manage the models and other components with less code and configurations because AWS IoT Greengrass takes care of provisioning, updating, and stopping the components.

Solution overview

The following diagram is the architecture implemented for the solution:

We can broadly divide the architecture into the following phases:

  • Model training

    • Prepare the data and train an anomaly detection model using Amazon SageMaker Pipelines. SageMaker Pipelines helps orchestrate your training pipeline with your own custom code. It also outputs the Mean Absolute Error (MAE) and other threshold values used to calculate anomalies.
  • Compile and package the model

    • Compile the model using Neo, so that it can be optimized for the target hardware (in this case, an EC2 instance).
    • Use the SageMaker Edge packaging job API to package the model as an AWS IoT Greengrass component. The Edge Manager API has a native integration with AWS IoT Greengrass APIs.
  • Build and package the inference application

    • Build and package the inference application as an AWS IoT Greengrass component. This application uses the computed threshold, the model, and some custom code to accept the data coming from turbines, perform anomaly detection, and return results.
  • Set up AWS IoT Greengrass on edge devices

  • Deploy to edge devices

    • Deploy the following on each edge device:
      • An ML model packaged as an AWS IoT Greengrass component.
      • An inference application packaged as an AWS IoT Greengrass component. This also sets up the connection to AWS IoT Core MQTT.
      • The AWS-provided Edge Manager Greengrass component.
      • The AWS-provided AWS IoT Greengrass CLI component (only needed for development and debugging purposes).
  • Run the end-to-end solution

    • Run the simulator, which generates measurements for the wind turbines, which are sent to the edge devices using MQTT.
    • Because the notebook and the EC2 instances running AWS IoT Greengrass are on different networks, we use AWS IoT Core to relay MQTT messages between them. In a real scenario, the wind turbine would send the data to the anomaly detection device over local communication, for example via an AWS IoT Greengrass MQTT broker component.
    • The inference app and model running in the anomaly detection device predicts if the received data is anomalous or not, and sends the result to the monitoring application via MQTT through AWS IoT Core.
    • The application displays the data and anomaly signal on the simulator dashboard.

To learn more about how to deploy this solution architecture, refer to the GitHub repository related to this post.

In the following sections, we go deeper into the details of how to implement this solution.

Dataset

The solution uses raw turbine data collected from real wind turbines. The dataset is provided as part of the solution. It has the following features:

  • nanoId – ID of the edge device that collected the data
  • turbineId – ID of the turbine that produced this data
  • arduino_timestamp – Timestamp of the Arduino that was operating this turbine
  • nanoFreemem – Amount of free memory in bytes
  • eventTime – Timestamp of the row
  • rps – Rotation of the rotor in rotations per second
  • voltage – Voltage produced by the generator in millivolts
  • qw, qx, qy, qz – Quaternion angular acceleration
  • gx, gy, gz – Gravity acceleration
  • ax, ay, az – Linear acceleration
  • gearboxtemp – Internal temperature
  • ambtemp – External temperature
  • humidity – Air humidity
  • pressure – Air pressure
  • gas – Air quality
  • wind_speed_rps – Wind speed in rotations per second

For more information, refer to Monitor and Manage Anomaly Detection Models on a fleet of Wind Turbines with Amazon SageMaker Edge Manager.

Data preparation and training

The data preparation and training are performed using SageMaker Pipelines. Pipelines is the first purpose-built, easy-to-use continuous integration and continuous delivery (CI/CD) service for ML. With Pipelines, you can create, automate, and manage end-to-end ML workflows at scale. Because it’s purpose-built for ML, Pipelines helps automate different steps of the ML workflow, including data loading, data transformation, training and tuning, and deployment. For more information, refer to Amazon SageMaker Model Building Pipelines.

Model compilation

We use Neo for model compilation. It automatically optimizes ML models for inference on cloud instances and edge devices to run faster with no loss in accuracy. ML models are optimized for a target hardware platform, which can be a SageMaker hosting instance or an edge device, based on processor type and capabilities, for example whether or not a GPU is present. The compiler uses ML to apply the performance optimizations that extract the best available performance for your model on the cloud instance or edge device. For more information, see Compile and Deploy Models with Neo.

Model packaging

To use a compiled model with Edge Manager, you first need to package it. In this step, SageMaker creates an archive consisting of the compiled model and the Neo DLR runtime required to run it. It also signs the model for integrity verification. When you deploy the model via AWS IoT Greengrass, the create_edge_packaging_job API automatically creates an AWS IoT Greengrass component containing the model package, which is ready to be deployed to the devices.

The following code snippet shows how to invoke this API:

model_version = '1.0.0' # use this for semantic versioning the model. Must increment for every new model

model_name = 'WindTurbineAnomalyDetection'
edge_packaging_job_name='wind-turbine-anomaly-%d' % int(time.time()*1000)
component_name = 'aws.samples.windturbine.model'
component_version = model_version  

resp = sm_client.create_edge_packaging_job(
    EdgePackagingJobName=edge_packaging_job_name,
    CompilationJobName=compilation_job_name,
    ModelName=model_name,
    ModelVersion=model_version,
    RoleArn=role,
    OutputConfig={
        'S3OutputLocation': 's3://%s/%s/model/' % (bucket_name, prefix),
        "PresetDeploymentType": "GreengrassV2Component",
        "PresetDeploymentConfig": json.dumps(
            {"ComponentName": component_name, "ComponentVersion": component_version}
        ),
    }
)

To allow the API to create an AWS IoT Greengrass component, you must provide the following additional parameters under OutputConfig:

  • The PresetDeploymentType as GreengrassV2Component
  • PresetDeploymentConfig to provide the ComponentName and ComponentVersion that AWS IoT Greengrass uses to publish the component
  • The ComponentVersion and ModelVersion must be in major.minor.patch format

The model is then published as an AWS IoT Greengrass component.
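
The packaging job runs asynchronously; a small sketch like the following polls its status before you move on (it reuses sm_client and edge_packaging_job_name from the preceding snippet):

import time

# Poll the packaging job until it leaves the STARTING/INPROGRESS states
while True:
    job = sm_client.describe_edge_packaging_job(
        EdgePackagingJobName=edge_packaging_job_name
    )
    status = job['EdgePackagingJobStatus']
    if status not in ('STARTING', 'INPROGRESS'):
        break
    time.sleep(30)

print('Packaging job finished with status:', status)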

Create the inference application as an AWS IoT Greengrass component

Now we create an inference application component that we can deploy to the device. This application component loads the ML model, receives data from wind turbines, performs anomaly detections, and sends the result back to the simulator. This application can be a native application that receives the data locally on the edge devices from the turbines or any other client application over a gRPC interface.

To create a custom AWS IoT Greengrass component, perform the following steps:

  1. Provide the code for the application as single files or as an archive. The code needs to be uploaded to an S3 bucket in the same Region where we registered the AWS IoT Greengrass devices.
  2. Create a recipe file, which specifies the component’s configuration parameters, component dependencies, lifecycle, and platform compatibility.

The component lifecycle defines the commands that install, run, and shut down the component. For more information, see AWS IoT Greengrass component recipe reference. We can define the recipe either in JSON or YAML format. Because the inference application requires the model and Edge Manager agent to be available on the device, we need to specify dependencies to the ML model packaged as an AWS IoT Greengrass component and the Edge Manager Greengrass component.

  3. When the recipe file is ready, create the inference component by invoking the create_component_version API. See the following code:
    import boto3

    ggv2_client = boto3.client('greengrassv2')
    with open('recipes/aws.samples.windturbine.detector-recipe.json') as f:
        recipe = f.read()
    recipe = recipe.replace('_BUCKET_', bucket_name)
    ggv2_client.create_component_version(inlineRecipe=recipe)
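
For reference, the kind of recipe used in this step might look like the following sketch, expressed here as a Python dictionary. The dependency entries mirror the components used in this post, while the publisher, artifact URI, and lifecycle command are illustrative placeholders; the actual recipe is available in the GitHub repository:

import json

# Illustrative recipe sketch: declares dependencies on the model component and the
# SageMaker Edge Manager component, plus a lifecycle command that runs the application
detector_recipe = {
    "RecipeFormatVersion": "2020-01-25",
    "ComponentName": "aws.samples.windturbine.detector",
    "ComponentVersion": "1.0.0",
    "ComponentDescription": "Wind turbine anomaly detection inference application",
    "ComponentPublisher": "AWS Samples",                       # placeholder publisher
    "ComponentDependencies": {
        "aws.samples.windturbine.model": {
            "VersionRequirement": ">=1.0.0",
            "DependencyType": "HARD"
        },
        "aws.greengrass.SageMakerEdgeManager": {
            "VersionRequirement": ">=1.0.0",
            "DependencyType": "HARD"
        }
    },
    "Manifests": [
        {
            "Platform": {"os": "linux"},
            "Lifecycle": {
                # Placeholder entry point for the application code
                "Run": "python3 -u {artifacts:decompressedPath}/detector/main.py"
            },
            "Artifacts": [
                {
                    "Uri": f"s3://{bucket_name}/artifacts/detector.zip",   # placeholder artifact
                    "Unarchive": "ZIP"
                }
            ]
        }
    ]
}

# The dictionary can be serialized and passed to create_component_version, as in the step above
recipe_json = json.dumps(detector_recipe)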

Inference application

The inference application connects to AWS IoT Core to receive messages from the simulated wind turbine and send the prediction results to the simulator dashboard.

It publishes to the following topics:

  • wind-turbine/{turbine_id}/dashboard/update – Updates the simulator dashboard
  • wind-turbine/{turbine_id}/label/update – Updates the model loaded status on simulator
  • wind-turbine/{turbine_id}/anomalies – Publishes anomaly results to the simulator dashboard

It subscribes to the following topic:

  • wind-turbine/{turbine_id}/raw-data – Receives raw data from the turbine
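
Inside the component, publishing to these topics goes through the local AWS IoT Greengrass IPC interface. A minimal sketch of publishing an anomaly result might look like the following, assuming the component’s access control configuration grants it the aws.greengrass#PublishToIoTCore permission and that the payload fields are defined by your application:

import json

import awsiot.greengrasscoreipc
import awsiot.greengrasscoreipc.model as model

# Connect to the local Greengrass IPC service from within the component
ipc_client = awsiot.greengrasscoreipc.connect()

def publish_anomaly(turbine_id: str, result: dict) -> None:
    # Publish an anomaly detection result so the simulator dashboard can display it
    request = model.PublishToIoTCoreRequest(
        topic_name=f"wind-turbine/{turbine_id}/anomalies",
        qos=model.QOS.AT_LEAST_ONCE,
        payload=json.dumps(result).encode("utf-8"),
    )
    operation = ipc_client.new_publish_to_iot_core()
    operation.activate(request)
    operation.get_response().result(timeout=10.0)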

Set up AWS IoT Greengrass core devices

Next, we need to set up the devices that run the anomaly detection application by installing the AWS IoT Greengrass core software. For this post, we use five EC2 instances that act as the anomaly detection devices. We use AWS CloudFormation to launch the instances. To install the AWS IoT Greengrass core software, we provide a script in the instance UserData as shown in the following code:

      UserData:
        Fn::Base64: !Sub "#!/bin/bash
          
          wget -O- https://apt.corretto.aws/corretto.key | apt-key add - 
          add-apt-repository 'deb https://apt.corretto.aws stable main'
           
          apt-get update; apt-get install -y java-11-amazon-corretto-jdk
         
          apt install unzip -y
          apt install python3-pip -y
          apt-get install python3.8-venv -y

          ec2_region=$(curl http://169.254.169.254/latest/meta-data/placement/region)

          curl -s https://d2s8p88vqu9w66.cloudfront.net/releases/greengrass-nucleus-latest.zip > greengrass-nucleus-latest.zip  && unzip greengrass-nucleus-latest.zip -d GreengrassCore
          java -Droot="/greengrass/v2" -Dlog.store=FILE -jar ./GreengrassCore/lib/Greengrass.jar --aws-region $ec2_region  --thing-name edge-device-0 --thing-group-name ${ThingGroupName}  --tes-role-name SageMaker-WindturbinesStackTESRole --tes-role-alias-name SageMaker-WindturbinesStackTESRoleAlias  --component-default-user ggc_user:ggc_group --provision true --setup-system-service true --deploy-dev-tools true

                  "

Each EC2 instance is associated with a single virtual wind turbine. In a real scenario, multiple wind turbines could also communicate with a single device to reduce the solution costs.

To learn more about how to set up AWS IoT Greengrass software on a core device, refer to Install the AWS IoT Greengrass Core software. The complete CloudFormation template is available in the GitHub repository.

Create an AWS IoT Greengrass deployment

When the devices are up and running, we can deploy the application. We create a deployment with a configuration containing the following components:

  • ML model
  • Inference application
  • Edge Manager
  • AWS IoT Greengrass CLI (only needed for debugging purposes)

For each component, we must specify the component version. We can also provide additional configuration data, if necessary. We create the deployment by invoking the create_deployment API. See the following code:

ggv2_deployment = ggv2_client.create_deployment(
    targetArn=wind_turbine_thing_group_arn,
    deploymentName="Deployment for " + project_id,
    components={
        "aws.greengrass.Cli": {
            "componentVersion": "2.5.3"
            },
        "aws.greengrass.SageMakerEdgeManager": {
            "componentVersion": "1.1.0",
            "configurationUpdate": {
                "merge": json.dumps({"DeviceFleetName":wind_turbine_device_fleet_name,"BucketName":bucket_name})
            },
            "runWith": {}
        },
        "aws.samples.windturbine.detector": {
            "componentVersion": component_version
        },
        "aws.samples.windturbine.model": {
            "componentVersion": component_version
        }
        })

The targetArn argument defines where to run the deployment. The thing group ARN is specified to deploy this configuration to all devices belonging to the thing group. The thing group is created already as part of the setup of the solution architecture.

The aws.greengrass.SageMakerEdgeManager component is an AWS-provided AWS IoT Greengrass component. At the time of writing, the latest version is 1.1.0. You need to configure this component with the SageMaker edge device fleet name and the S3 bucket location. You can find these parameters on the Edge Manager console, where the fleet was created during the setup of the solution architecture.

aws.samples.windturbine.detector is the inference application component created earlier.

aws.samples.windturbine.model is the anomaly detection ML model component created earlier.
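
To confirm that the deployment was created and track its rollout, a short check like the following can be used (it reuses the ggv2_deployment response from the earlier call):

# Check the status of the deployment created above
deployment_id = ggv2_deployment['deploymentId']
deployment = ggv2_client.get_deployment(deploymentId=deployment_id)
print('Deployment %s is %s' % (deployment_id, deployment['deploymentStatus']))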

Run the simulator

Now that everything is in place, we can start the simulator. The simulator is run from a Python notebook and performs two tasks:

  1. Simulate the physical wind turbine and display a dashboard for each wind turbine.
  2. Exchange data with the devices via AWS IoT MQTT using the following topics:
    1. wind-turbine/{turbine_id}/raw-data – Publishes the raw turbine data.
    2. wind-turbine/{turbine_id}/label/update – Receives model loaded or not loaded status from the inference application.
    3. wind-turbine/{turbine_id}/anomalies – Receives anomalies published by inference application.
    4. wind-turbine/{turbine_id}/dashboard/update – Receives recent data buffered by the turbines.

We can use the simulator UI to start and stop the virtual wind turbine and inject noise in the Volt, Rot, and Vib measurements to simulate anomalies that are detected by the application running on the device. In the following screenshot, the simulator shows a virtual representation of five wind turbines that are currently running. We can choose Stop to stop any of the turbines, or choose Volt, Rot, or Vib to inject noise in the turbines. For example, if we choose Volt for turbine with ID 0, the Voltage status changes from a green check mark to a red x, denoting the voltage readings of the turbine are anomalous.

Conclusion

Securely and reliably maintaining the lifecycle of an ML model deployed across a fleet of devices isn’t an easy task. However, with Edge Manager and AWS IoT Greengrass, we can reduce the implementation effort and operational cost of such a solution. This solution also increases agility in experimenting with and optimizing the ML model, with full automation of the ML pipeline, from data acquisition and data preparation to model training, model validation, and deployment to the devices.

In addition to the benefits described, Edge Manager offers further benefits, like having access to a device fleet dashboard on the Edge Manager console, which can display near-real-time health of the devices by capturing heartbeat requests. You can use this inference data with Amazon SageMaker Model Monitor to check for data and model quality drift issues.

To build a solution for your own needs, get the code and artifacts from the GitHub repo. The repository shows two different ways of deploying the models:

  • Using IoT jobs
  • Using AWS IoT Greengrass (covered in this post)

Although this post focuses on deployment using AWS IoT Greengrass, interested readers can look at the solution using IoT jobs as well to better understand the differences.


About the Authors

Vikesh Pandey is a Machine Learning Specialist Solutions Architect at AWS, helping customers in the Nordics and wider EMEA region design and build ML solutions. Outside of work, Vikesh enjoys trying out different cuisines and playing outdoor sports.

Massimiliano Angelino is Lead Architect for the EMEA Prototyping team. For the last three and a half years, he has been an IoT Specialist Solutions Architect with a particular focus on edge computing, and he contributed to the launch of the AWS IoT Greengrass V2 service and its integration with Amazon SageMaker Edge Manager. Based in Stockholm, he enjoys skating on frozen lakes.


How Kustomer utilizes custom Docker images & Amazon SageMaker to build a text classification pipeline

This is a guest post by Kustomer’s Senior Software & Machine Learning Engineer, Ian Lantzy, and AWS team Umesh Kalaspurkar, Prasad Shetty, and Jonathan Greifenberger.

In Kustomer’s own words, “Kustomer is the omnichannel SaaS CRM platform reimagining enterprise customer service to deliver standout experiences. Built with intelligent automation, we scale to meet the needs of any contact center and business by unifying data from multiple sources and enabling companies to deliver effortless, consistent, and personalized service and support through a single timeline view.”


Kustomer wanted the ability to rapidly analyze large volumes of support communications for their business customers — customer experience and service organizations — and automate discovery of information such as the end-customer’s intent, customer service issue, and other relevant insights related to the consumer. Understanding these characteristics can help CX organizations manage thousands of in-bound support emails by automatically classifying and categorizing the content. Kustomer leverages Amazon SageMaker to manage the analysis of the incoming support communications via their AI based Kustomer IQ platform. Kustomer IQ’s Conversation Classification service is able to contextualize conversations and automate otherwise tedious and repetitive tasks, reducing agent distraction and the overall cost per contact. This and Kustomer’s other IQ services have increased productivity and automation for its business customers.

In this post, we talk about how Kustomer uses custom Docker images for SageMaker training and inference, which eases integration and streamlines the process. With this approach, Kustomer’s business customers are automatically classifying over 50k support emails each month with up to 70% accuracy.

Background and challenges

Kustomer uses a custom text classification pipeline for their Conversation Classification service. This helps them manage thousands of requests a day via automatic classification and categorization utilizing SageMaker’s training and inference orchestration. The Conversation Classification training engine uses custom Docker images to process data and train models using historical conversations, and then predicts the topics, categories, or other custom labels a particular agent needs in order to classify the conversations. The prediction engine then utilizes the trained models with another custom Docker image to categorize conversations, which organizations use to automate reporting or route conversations to a specific team based on topic.

The SageMaker categorization process starts by establishing a training and inference pipeline that can provide text classification and contextual recommendations. A typical setup would be implemented with serverless approaches like AWS Lambda for data preprocessing and postprocessing because it has a minimal provisioning requirement with an effective on-demand pricing model. However, using SageMaker with dependencies such as TensorFlow, NumPy, and Pandas can quickly increase the model package size, making the overall deployment process cumbersome and difficult to manage. Kustomer used custom Docker images to overcome these challenges.

Custom Docker images provide substantial advantages:

  • Allows for larger compressed package sizes (over 10 GB), which can contain popular machine learning (ML) frameworks such as TensorFlow, MXNet, PyTorch, or others.
  • Allows you to bring custom code or algorithms developed locally to Amazon SageMaker Studio notebooks for rapid iteration and model training.
  • Avoids preprocessing delays caused in Lambda while unpacking deployment packages.
  • Offers flexibility to integrate seamlessly with internal systems.
  • Improves future compatibility and scalability, because it’s easier to evolve a service built with Docker than one that requires packaging .zip files in a Lambda function.
  • Reduces the turnaround time for a CI/CD deployment pipeline.
  • Provides Docker familiarity within the team and ease of use.
  • Provides access to data stores via APIs and a backend runtime.
  • Offers better support for custom preprocessing and postprocessing steps, which with Lambda would require a separate compute service for each process (such as training or deployment).

Solution overview

Categorization and labeling of support emails is a critical step in the customer support process. It allows companies to route conversations to the right teams, and understand at a high level what their customers are contacting them about. Kustomer’s business customers handle thousands of conversations every day, so classifying at scale is a challenge. Automating this process helps agents be more effective and provide more cohesive support, and helps their customers by connecting them with the right people faster.

The following diagram illustrates the solution architecture:

The Conversation Classification process starts with the business customer giving Kustomer permission to set up a training and inference pipeline that can help them with text classification and contextual recommendations. Kustomer exposes a user interface to their customers to monitor the training and inference process, which is implemented using SageMaker along with TensorFlow models and custom Docker images. The process of building and utilizing a classifier is split into five main workflows, which are coordinated by a worker service running on Amazon ECS. To coordinate the pipeline events and trigger the training and deployment of the model, the worker uses an Amazon SQS queue and integrates directly with SageMaker using the AWS-provided Node.js SDK. The workflows are:

  • Data export
  • Data preprocessing
  • Training
  • Deployment
  • Inference

Data export

The data export process is run on demand and starts with an approval process from Kustomer’s business customer to confirm the use of email data for analysis. Data relevant to the classification process is captured via the initial email received from the end customer. For example, a support email typically contains the complete coherent thought of the problem with details about the issue. As part of the export process, the emails are collated from the data store (MongoDB and Amazon OpenSearch) and saved in Amazon Simple Storage Service (Amazon S3).

Data preprocessing

The data preprocessing stage cleans the dataset for training and inference workflows by stripping any HTML tags from customer emails and feeding them through multiple cleaning and sanitization steps to detect any malformed HTML. This process includes the use of Hugging Face tokenizers and transformers. When the cleansing process is complete, any additional custom tokens required for training are added to the output dataset.
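
Kustomer’s exact preprocessing code isn’t public, but a minimal sketch of this kind of HTML stripping and tokenization step might look like the following; the tokenizer checkpoint and truncation length are illustrative assumptions:

from bs4 import BeautifulSoup
from transformers import AutoTokenizer

# Illustrative tokenizer checkpoint; the production pipeline uses its own models and vocabulary
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")

def clean_and_tokenize(email_html: str):
    # Strip HTML tags and collapse whitespace before tokenizing
    text = BeautifulSoup(email_html, "html.parser").get_text(separator=" ")
    text = " ".join(text.split())
    return tokenizer(text, truncation=True, max_length=512)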

During the preprocessing stage, a Lambda function invokes a custom Docker image. This image consists of a Python 3.8 slim base, the AWS Lambda Python Runtime Interface Client, and dependencies such as NumPy and Pandas. The custom Docker image is stored on Amazon Elastic Container Registry (Amazon ECR) and then fed through the CI/CD pipeline for deployment. The deployed Lambda function samples the data to generate three distinct datasets per classifier:

  • Training – Used for the actual training process
  • Validation – Used for validation during the TensorFlow training process
  • Test – Used toward the end of the training process for metrics and model comparisons

The generated output datasets are Pandas pickle files, which are stored in Amazon S3 to be used by the training stage.

Training

Kustomer’s custom training image uses a TensorFlow 2.7 GPU-optimized Docker image as a base. Custom code, dependencies, and base models are included before the custom Docker training image is uploaded to Amazon ECR. P3 instance types are used for the training process, and a GPU-optimized base image helps make training as efficient as possible. SageMaker is used with this custom Docker image to train TensorFlow models, which are then stored in Amazon S3. Custom metrics are also computed and saved to support additional capabilities such as model comparisons and automatic retraining. Once the training stage is complete, the AI worker is notified and the business customer can start the deployment workflow.

Deployment

For the deployment workflow, a custom Docker inference image is created using a TensorFlow Serving base image (built specifically for fast inference). Additional code and dependencies such as NumPy, Pandas, and custom NLP code are included to provide additional functionality, such as formatting and cleaning inputs before inference. FastAPI is also included as part of the custom image and is used to provide the REST API endpoints for inference and health checks. SageMaker is then configured to deploy the TensorFlow models saved in Amazon S3, together with the inference image, onto compute-optimized ml.c5 instances to generate high-performance inference endpoints. Each endpoint is created for use by a single customer to isolate their models and data.
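
The details of Kustomer’s inference image aren’t public, but a minimal sketch of the FastAPI layer in such a container might look like the following. SageMaker expects the container to answer GET /ping health checks and POST /invocations requests on port 8080; the TensorFlow Serving URL, model name, and input format below are illustrative placeholders:

import os

import requests
from fastapi import FastAPI, Request

app = FastAPI()

# Assumption: the TensorFlow Serving process from the base image listens locally on its
# default REST port; the model name and endpoint here are placeholders
TF_SERVING_URL = os.environ.get(
    "TF_SERVING_URL", "http://localhost:8501/v1/models/classifier:predict"
)

@app.get("/ping")
def ping():
    # SageMaker health check endpoint
    return {"status": "healthy"}

@app.post("/invocations")
async def invocations(request: Request):
    payload = await request.json()
    # Placeholder preprocessing: the real image formats and cleans the text before inference
    instances = [payload["text"]]
    response = requests.post(TF_SERVING_URL, json={"instances": instances})
    return {"labels": response.json()["predictions"]}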

Inference

Once the deployment workflow is complete, the inference workflow takes over. All first inbound support emails are passed through the inference API for the deployed classifiers specific to that customer. The deployed classifiers then perform text classification on each of these emails, generating classification labels for the customer.

Possible enhancements and customizations

Kustomer is considering expanding the solution with the following enhancements:

  • Hugging Face DLCs – Kustomer currently uses TensorFlow’s base Docker images for the data preprocessing stage and plans to migrate to Hugging Face Deep Learning Containers (DLCs). This helps you start training models immediately, skipping the complicated process of building and optimizing your training environments from scratch. For more information, see Hugging Face on Amazon SageMaker.
  • Feedback loop – You can implement a feedback loop using active learning or reinforcement learning techniques to increase the overall efficiency of the model.
  • Integration with other internal systems – Kustomer wants the ability to integrate the text classification with other systems like Smart Suggestions, which is another Kustomer IQ service that looks through hundreds of shortcuts and suggests the ones most relevant to a customer query, improving agent response times and performance.

Conclusion

In this post, we discussed how Kustomer uses custom Docker images for SageMaker training and inference, which eases integration and streamlines the process. We demonstrated how Kustomer leverages Lambda and SageMaker with custom Docker images that help implement the text classification process with preprocessing and postprocessing workflows. This provides flexibility for using larger images for model creation, training, and inference. Container image support for Lambda allows you to customize your function even more, opening up many new use cases for serverless ML. The solution takes advantage of several AWS services, including SageMaker, Lambda, Docker images, Amazon ECR, Amazon ECS, Amazon SQS, and Amazon S3.

If you want to learn more about Kustomer, we encourage you to visit the Kustomer website and explore their case studies.

Click here to start your journey with Amazon SageMaker. For hands-on experience, you can reference the Amazon SageMaker workshop.


About the Authors

Umesh Kalaspurkar is a New York based Solutions Architect for AWS. He brings more than 20 years of experience in design and delivery of Digital Innovation and Transformation projects, across enterprises and startups. He is motivated by helping customers identify and overcome challenges. Outside of work, Umesh enjoys being a father, skiing, and traveling.

Ian Lantzy is a Senior Software & Machine Learning engineer for Kustomer and specializes in taking machine learning research tasks and turning them into production services.

Prasad Shetty is a Boston-based Solutions Architect for AWS. He has built software products and has led modernizing and digital innovation in product and services across enterprises for over 20 years. He is passionate about driving cloud strategy and adoption, and leveraging technology to create great customer experiences. In his leisure time, Prasad enjoys biking and traveling.

Jonathan Greifenberger is a New York based Senior Account Manager for AWS with 25 years of IT industry experience. Jonathan leads a team that assists clients from various industries and verticals on their cloud adoption and modernization journey.


Build, train, and deploy Amazon Lookout for Equipment models using the Python Toolbox

Predictive maintenance can be an effective way to prevent industrial machinery failures and expensive downtime by proactively monitoring the condition of your equipment, so you can be alerted to any anomalies before equipment failures occur. Installing sensors and the necessary infrastructure for data connectivity, storage, analytics, and alerting are the foundational elements for enabling predictive maintenance solutions. However, even after installing the ad hoc infrastructure, many companies use basic data analytics and simple modeling approaches that are often ineffective at detecting issues early enough to avoid downtime. Also, implementing a machine learning (ML) solution for your equipment can be difficult and time-consuming.

With Amazon Lookout for Equipment, you can automatically analyze sensor data for your industrial equipment to detect abnormal machine behavior—with no ML experience required. This means you can detect equipment abnormalities with speed and precision, quickly diagnose issues, and take action to reduce expensive downtime.

Lookout for Equipment analyzes the data from your sensors and systems, such as pressure, flow rate, RPMs, temperature, and power, to automatically train a model specific to your equipment based on your data. It uses your unique ML model to analyze incoming sensor data in real time and identifies early warning signs that could lead to machine failures. For each alert detected, Lookout for Equipment pinpoints which specific sensors are indicating the issue, and the magnitude of impact on the detected event.

With a mission to put ML in the hands of every developer, we want to present another add-on to Lookout for Equipment: an open-source Python toolbox that allows developers and data scientists to build, train, and deploy Lookout for Equipment models similarly to what you’re used to with Amazon SageMaker. This library is a wrapper on top of the Lookout for Equipment boto3 Python API and is provided to kick-start your journey with this service. Should you have any improvement suggestions or bugs to report, please file an issue against the toolbox GitHub repository.

In this post, we provide a step-by-step guide for using the Lookout for Equipment open-source Python toolbox from within a SageMaker notebook.

Environment setup

To use the open-source Lookout for Equipment toolbox from a SageMaker notebook, we need to grant the SageMaker notebook the necessary permissions for calling Lookout for Equipment APIs. For this post, we assume that you have already created a SageMaker notebook instance. For instructions, refer to Get Started with Amazon SageMaker Notebook Instances. The notebook instance is automatically associated with an execution role.

  1. To find the role that is attached to the instance, select the instance on the SageMaker console.
  2. On the next screen, scroll down to find the AWS Identity and Access Management (IAM) role attached to the instance in the Permissions and encryption section.
  3. Choose the role to open the IAM console.

Next, we attach an inline policy to our SageMaker IAM role.

  1. On the Permissions tab of the role you opened, choose Add inline policy.
  2. On the JSON tab, enter the following code. We use a wildcard action (lookoutequipment:*) for the service for demo purposes. For real use cases, provide only the required permissions to run the appropriate SDK API calls.
        {
            "Version": "2012-10-17",
            "Statement": [
                {
                    "Effect": "Allow",
                    "Action": [
                        "lookoutequipment:*"
                    ],
                    "Resource": "*"
                }
            ]
        }

  3. Choose Review policy.
  4. Provide a name for the policy and create the policy.

In addition to the preceding inline policy, on the same IAM role, we need to set up a trust relationship to allow Lookout for Equipment to assume this role. The SageMaker role already has the appropriate data access to Amazon Simple Storage Service (Amazon S3); allowing Lookout for Equipment to assume this role makes sure it has the same access to the data as your notebook. In your environment, you may already have a specific role ensuring Lookout for Equipment has access to your data, in which case you don’t need to adjust the trust relationship of this common role.

  1. Inside our SageMaker IAM role on the Trust relationships tab, choose Edit trust relationship.
  2. Under the policy document, replace the whole policy with the following code:
        {
            "Version": "2012-10-17",
            "Statement": [
                {
                    "Effect": "Allow",
                    "Principal": {
                        "Service": "lookoutequipment.amazonaws.com"
                    },
                    "Action": "sts:AssumeRole"
                }
            ]
        }

  3. Choose Update trust policy.

Now we’re all set to use the Lookout for Equipment toolbox in our SageMaker notebook environment. The Lookout for Equipment toolbox is an open-source Python package that allows data scientists and software developers to easily build and deploy time series anomaly detection models using Lookout for Equipment. Let’s look at what you can achieve more easily thanks to the toolbox!

Dependencies

At the time of writing, the toolbox needs the following installed:

After you satisfy these dependencies, you can install the Lookout for Equipment toolbox with the following command from a Jupyter terminal:

pip install lookoutequipment

The toolbox is now ready to use. In this post, we demonstrate how to use the toolbox by training and deploying an anomaly detection model. A typical ML development lifecycle consists of building the dataset for training, training the model, deploying the model, and performing inference on the model. The toolbox is quite comprehensive in terms of the functionalities it provides, but in this post, we focus on the following capabilities:

  • Prepare the dataset
  • Train an anomaly detection model using Lookout for Equipment
  • Build visualizations for your model evaluation
  • Configure and start an inference scheduler
  • Visualize scheduler inferences results

Let’s understand how we can use the toolbox for each of these capabilities.

Prepare the dataset

Lookout for Equipment requires a dataset to be created and ingested. To prepare the dataset, complete the following steps:

  1. Before creating the dataset, we need to load a sample dataset and upload it to an Amazon Simple Storage Service (Amazon S3) bucket. In this post, we use the expander dataset:
    from lookoutequipment import dataset
    
    data = dataset.load_dataset(dataset_name='expander', target_dir='expander-data')
    dataset.upload_dataset('expander-data', bucket, prefix)

The returned data object represents a dictionary containing the following:

    • A training data DataFrame
    • A labels DataFrame
    • The training start and end datetimes
    • The evaluation start and end datetimes
    • A tags description DataFrame

The training and label data are uploaded from the target directory to Amazon S3 at the bucket/prefix location.
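To sanity check what was loaded, you can inspect a few of these entries. The following is a minimal sketch; the key names are the ones used later in this post:

# Quick sanity check of the loaded sample data (key names as used later in this post)
print(data['data'].shape)                                # training time series DataFrame
print(data['training_start'], data['training_end'])      # training period
print(data['evaluation_start'], data['evaluation_end'])  # evaluation period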

  2. After uploading the dataset to Amazon S3, we create an object of the LookoutEquipmentDataset class that manages the dataset:
    lookout_dataset = dataset.LookoutEquipmentDataset(
        dataset_name='my_dataset',
        access_role_arn=role_arn,
        component_root_dir=f's3://{bucket}/{prefix}training-data'
    )
    
    # creates the dataset
    lookout_dataset.create()

The access_role_arn supplied must have access to the S3 bucket where the data is present. You can retrieve the role ARN of the SageMaker notebook instance from the previous Environment setup section and add an IAM policy to grant access to your S3 bucket. For more information, see Writing IAM Policies: How to Grant Access to an Amazon S3 Bucket.

The component_root_dir parameter should indicate the location in Amazon S3 where the training data is stored.

After these API calls complete, our dataset is created.
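If you want to confirm the dataset status independently of the toolbox, the following is a minimal sketch using the low-level boto3 client; the DescribeDataset call and its Status field come from the Lookout for Equipment API, and the example status values are assumptions:

import boto3

# Describe the dataset we just created (for example, CREATED before ingestion, ACTIVE afterwards)
lookout_client = boto3.client('lookoutequipment')
response = lookout_client.describe_dataset(DatasetName='my_dataset')
print(response['Status'])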

  3. Ingest the data into the dataset:
    response = lookout_dataset.ingest_data(bucket, prefix + 'training-data/')

Now that your data is available on Amazon S3, creating a dataset and ingesting the data into it is just a matter of three lines of code. You don’t need to build a lengthy JSON schema manually; the toolbox detects your file structure and builds it for you. After your data is ingested, it’s time to move to training!

Train an anomaly detection model

After the data has been ingested in the dataset, we can start the model training process. See the following code:

from lookoutequipment import model

lookout_model = model.LookoutEquipmentModel(model_name='my_model', dataset_name='my_dataset')

lookout_model.set_time_periods(
    data['evaluation_start'],
    data['evaluation_end'],
    data['training_start'],
    data['training_end']
)
lookout_model.set_label_data(bucket=bucket, prefix=prefix + 'label-data/', access_role_arn=role_arn)
lookout_model.set_target_sampling_rate(sampling_rate='PT5M')

# Trigger the training job
response = lookout_model.train()

# Poll the training job status every 5 minutes until it completes
lookout_model.poll_model_training(sleep_time=300)

Before we launch the training, we need to specify the training and evaluation periods within the dataset. We also set the location in Amazon S3 where the labeled data is stored and set the sampling rate to 5 minutes. After we launch the training, the poll_model_training method polls the training job status every 5 minutes until the training job completes successfully.

The training module of the Lookout for Equipment toolbox allows you to train a model with less than 10 lines of code. It builds all the lengthy creation requests needed by the low-level API on your behalf, removing the need for you to build long, error-prone JSON documents.

After the model is trained, we can either check the results over the evaluation period or configure an inference scheduler using the toolbox.

Evaluate a trained model

After a model is trained, the DescribeModel API from Lookout for Equipment records the metrics associated with the training. This API returns a JSON document with two fields of interest for plotting the evaluation results: labeled_ranges and predicted_ranges, which contain the known and predicted anomalies in the evaluation range, respectively. The toolbox provides utilities to load these into Pandas DataFrames instead:

import os

from lookoutequipment import evaluation

LookoutDiagnostics = evaluation.LookoutEquipmentAnalysis(model_name='my_model', tags_df=data['data'])

predicted_ranges = LookoutDiagnostics.get_predictions()
labels_fname = os.path.join('expander-data', 'labels.csv')
labeled_range = LookoutDiagnostics.get_labels(labels_fname)

The advantage of loading the ranges into DataFrames is that we can create nice visualizations by plotting one of the original time series signals and adding an overlay of the labeled and predicted anomalous events, using the TimeSeriesVisualization class of the toolbox:

from lookoutequipment import plot

TSViz = plot.TimeSeriesVisualization(timeseries_df=data['data'], data_format='tabular')
TSViz.add_signal(['signal-001'])
TSViz.add_labels(labeled_range)
TSViz.add_predictions([predicted_ranges])
TSViz.add_train_test_split(data['evaluation_start'])
TSViz.add_rolling_average(60*24)
TSViz.legend_format = {'loc': 'upper left', 'framealpha': 0.4, 'ncol': 3}
fig, axis = TSViz.plot()

These few lines of code generate a plot with the following features:

  • A line plot for the signal selected; the part used for training the model appears in blue while the evaluation part is in gray
  • The rolling average appears as a thin red line overlaid over the time series
  • The labels are shown in a green ribbon labelled “Known anomalies” (by default)
  • The predicted events are shown in a red ribbon labelled “Detected events”

The toolbox performs all the heavy lifting of locating, loading, and parsing the JSON files while providing ready-to-use visualizations that further reduce the time to get insights from your anomaly detection models. At this stage, the toolbox lets you focus on interpreting the results and taking actions to deliver direct business value to your end-users. In addition to these time series visualizations, the SDK provides other plots such as a histogram comparison of the values of your signals between normal and abnormal times. To learn more about the other visualization capabilities you can use right out of the box, see the Lookout for Equipment toolbox documentation.

Schedule inference

Let’s see how we can schedule inferences using the toolbox:

from lookoutequipment import scheduler

#prepare dummy inference data
dataset.prepare_inference_data(
    root_dir='expander-data',
    sample_data_dict=data,
    bucket=bucket,
    prefix=prefix
)

#setup the scheduler
lookout_scheduler = scheduler.LookoutEquipmentScheduler(
    scheduler_name='my_scheduler',
    model_name='my_model'
)

scheduler_params = {
    'input_bucket': bucket,
    'input_prefix': prefix + 'inference-data/input/',
    'output_bucket': bucket,
    'output_prefix': prefix + 'inference-data/output/',
    'role_arn': role_arn,
    'upload_frequency': 'PT5M',
    'delay_offset': None,
    'timezone_offset': '+00:00',
    'component_delimiter': '_',
    'timestamp_format': 'yyyyMMddHHmmss'
}

lookout_scheduler.set_parameters(**scheduler_params)
response = lookout_scheduler.create()

This code creates a scheduler that processes one file every 5 minutes (matching the upload frequency set when configuring the scheduler). After 15 minutes or so, we should have some results available. To get these results from the scheduler in a Pandas DataFrame, we just have to run the following command:

results_df = lookout_scheduler.get_predictions()

From here, we can also plot the feature importance for a prediction using the visualization APIs of the toolbox:

import pandas as pd

# Take the first prediction row (skipping the first column) and reshape it for plotting
event_details = pd.DataFrame(results_df.iloc[0, 1:]).reset_index()
fig, ax = plot.plot_event_barh(event_details)

It produces the following feature importance visualization on the sample data.

The toolbox also provides an API to stop the scheduler. See the following code snippet:

lookout_scheduler.stop()

Clean up

To delete all the artifacts created previously, we can call the delete_dataset API with the name of our dataset:

dataset.delete_dataset(dataset_name='my_dataset', delete_children=True, verbose=True)

Conclusion

When speaking to industrial and manufacturing customers, a common challenge we hear about taking advantage of AI and ML is the sheer amount of customization and specific development and data science work needed to obtain reliable and actionable results. Training anomaly detection models and getting actionable forewarning for many different types of industrial equipment is a prerequisite to reducing maintenance effort, reducing rework or waste, increasing product quality, and improving the overall equipment efficiency (OEE) of production lines. Until now, this required a massive amount of specific development work, which is hard to scale and maintain over time.

Amazon Applied AI services such as Lookout for Equipment enable manufacturers to build AI models without needing a versatile team of data scientists, data engineers, and process engineers. Now, with the Lookout for Equipment toolbox, your developers can further reduce the time needed to explore insights in your time series data and take action. This toolbox provides an easy-to-use, developer-friendly interface to quickly build anomaly detection models using Lookout for Equipment. The toolbox is open source, and all the SDK code can be found in the amazon-lookout-for-equipment-python-sdk GitHub repo. It’s also available as a PyPI package.

This post covers only a few of the most important APIs. Interested readers can check out the toolbox documentation for its more advanced capabilities. Give it a try, and let us know what you think in the comments!


About the Authors

Vikesh Pandey is a Machine Learning Specialist Solutions Architect at AWS, helping customers in the UK and wider EMEA region design and build ML solutions. Outside of work, Vikesh enjoys trying out different cuisines and playing outdoor sports.

Ioan Catana is an Artificial Intelligence and Machine Learning Specialist Solutions Architect at AWS. He helps customers develop and scale their ML solutions in the AWS Cloud. Ioan has over 20 years of experience, mostly in software architecture design and cloud engineering.

Michaël Hoarau is an AI/ML Specialist Solutions Architect at AWS who alternates between data scientist and machine learning architect, depending on the moment. He is passionate about bringing the power of AI/ML to the shop floors of his industrial customers and has worked on a wide range of ML use cases, ranging from anomaly detection to predictive product quality or manufacturing optimization. When not helping customers develop the next best machine learning experiences, he enjoys observing the stars, traveling, or playing the piano.

Read More

Choose the best data source for your Amazon SageMaker training job

Amazon SageMaker is a managed service that makes it easy to build, train, and deploy machine learning (ML) models. Data scientists use SageMaker training jobs to easily train ML models; you don’t have to worry about managing compute resources, and you pay only for the actual training time. Data ingestion is an integral part of any training pipeline, and SageMaker training jobs support a variety of data storage and input modes to suit a wide range of training workloads.

This post helps you choose the best data source for your SageMaker ML training use case. We introduce the data source options that SageMaker training jobs support natively. For each data source and input mode, we outline its ease of use, performance characteristics, cost, and limitations. To help you get started quickly, we provide a diagram with a sample decision flow that you can follow based on your key workload characteristics. Lastly, we perform several benchmarks for realistic training scenarios to demonstrate the practical implications on the overall training cost and performance.

Native SageMaker data sources and input modes

Reading training data easily and flexibly in a performant way is a common recurring concern for ML training. SageMaker simplifies data ingestion with a selection of efficient, high-throughput data ingestion mechanisms called data sources and their respective input modes. This allows you to decouple training code from the actual data source, automatically mount file systems, read with high performance, easily turn on data sharding between GPUs and instances to enable data parallelism, and auto shuffle data at the start of each epoch.

The SageMaker training ingestion mechanism natively integrates with three AWS managed storage services:

  • Amazon Simple Storage Service (Amazon S3) is an object storage service that offers industry-leading scalability, data availability, security, and performance.
  • Amazon FSx for Lustre is a fully managed shared storage with the scalability and performance of the popular Lustre file system. It’s usually linked to an existing S3 bucket.
  • Amazon Elastic File System (Amazon EFS) is a general purpose, scalable, and highly available shared file system with multiple price tiers. Amazon EFS is serverless and automatically grows and shrinks as you add and remove files.

SageMaker training allows your training script to access datasets stored on Amazon S3, FSx for Lustre, or Amazon EFS, as if they were available on a local file system (via a POSIX-compliant file system interface).

With Amazon S3 as a data source, you can choose between File mode, FastFile mode, and Pipe mode:

  • File mode – SageMaker copies a dataset from Amazon S3 to the ML instance storage, which is an attached Amazon Elastic Block Store (Amazon EBS) volume or NVMe SSD volume, before your training script starts.
  • FastFile mode – SageMaker exposes a dataset residing in Amazon S3 as a POSIX file system on the training instance. Dataset files are streamed from Amazon S3 on demand as your training script reads them.
  • Pipe mode – SageMaker streams a dataset residing in Amazon S3 to the ML training instance as a Unix pipe, which streams from Amazon S3 on demand as your training script reads the data from the pipe.

With FSx for Lustre or Amazon EFS as a data source, SageMaker mounts the file system before your training script starts.

Training input channels

When launching a SageMaker training job, you can specify up to 20 managed training input channels. You can think of a channel as an abstraction that tells the training job how and where to get the data, which is made available to the algorithm code to read from a file system path (for example, /opt/ml/input/data/input-channel-name) on the ML instance. The selected training channels are captured as part of the training job metadata to enable full model lineage tracking for use cases such as training job reproducibility or model governance.
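For example, with the SageMaker Python SDK, each key in the dictionary passed to fit() becomes a channel name, and the channel contents appear under /opt/ml/input/data/<channel-name> inside the training container. The following is a minimal sketch; the image URI, role, and S3 paths are placeholders:

from sagemaker.estimator import Estimator
from sagemaker.inputs import TrainingInput

# Placeholder estimator configuration
estimator = Estimator(
    image_uri='<training-image-uri>',
    role='<execution-role-arn>',
    instance_count=1,
    instance_type='ml.p3.2xlarge',
)

# Two channels: the training script reads them from
# /opt/ml/input/data/train and /opt/ml/input/data/validation
estimator.fit({
    'train': TrainingInput('s3://<my-bucket>/datasets/train/'),
    'validation': TrainingInput('s3://<my-bucket>/datasets/validation/'),
})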

To use Amazon S3 as your data source, you define a TrainingInput to specify the following:

  • Your input mode (File, FastFile, or Pipe mode)
  • Distribution and shuffling configuration
  • An S3DataType as one of three methods for specifying objects in Amazon S3 that make up your dataset: an S3 prefix (S3Prefix), a manifest file (ManifestFile), or an augmented manifest file (AugmentedManifestFile)

Alternatively, for FSx for Lustre or Amazon EFS, you define a FileSystemInput.
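As a rough sketch (parameter values are placeholders), the two input types look as follows; TrainingInput covers the Amazon S3 options described above, and FileSystemInput covers Amazon EFS and FSx for Lustre:

from sagemaker.inputs import TrainingInput, FileSystemInput

# Amazon S3 data source: choose the input mode, S3DataType, and distribution
s3_input = TrainingInput(
    s3_data='s3://<my-bucket>/datasets/train/',  # prefix, manifest, or augmented manifest location
    s3_data_type='S3Prefix',                     # or 'ManifestFile' / 'AugmentedManifestFile'
    input_mode='FastFile',                       # or 'File' / 'Pipe'
    distribution='FullyReplicated',              # or 'ShardedByS3Key'
)

# File system data source: Amazon EFS in this example ('FSxLustre' for FSx for Lustre)
efs_input = FileSystemInput(
    file_system_id='fs-0123456789abcdef0',       # placeholder file system ID
    file_system_type='EFS',
    directory_path='/train',
    file_system_access_mode='ro',
)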

The following diagram shows five training jobs, each configured with a different data source and input mode combination:

Data sources and input modes

The following sections provide a deep dive into the differences between Amazon S3 (File mode, FastFile mode, and Pipe mode), FSx for Lustre, and Amazon EFS as SageMaker ingestion mechanisms.

Amazon S3 File mode

File mode is the default input mode (if you didn’t explicitly specify one), and it’s the most straightforward to use. When you use this input option, SageMaker downloads the dataset from Amazon S3 into the ML training instance storage (Amazon EBS or local NVMe, depending on the instance type) on your behalf before launching model training, so that the training script can read the dataset from the local file system. In this case, the instance must have enough storage space to fit the entire dataset.

You configure the dataset for File mode by providing either an S3 prefix, manifest file, or augmented manifest file.

You should use an S3 prefix when all your dataset files are located within a common S3 prefix (subfolders are okay).

The manifest file lists the files comprising your dataset. You typically use a manifest when a data preprocessing job emits a manifest file, or when your dataset files are spread across multiple S3 prefixes. An augmented manifest is a JSON line file, where each line contains a list of attributes, such as a reference to a file in Amazon S3, alongside additional attributes, mostly labels. Its use cases are similar to that of a manifest.
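The sketch below illustrates the idea: a hypothetical manifest is a JSON array whose first entry sets a common prefix and whose remaining entries are relative object keys; once uploaded to Amazon S3, it’s referenced with the ManifestFile data type (bucket and key names are placeholders):

import json

from sagemaker.inputs import TrainingInput

# A hypothetical manifest: first entry is the common prefix, the rest are relative keys
manifest = [
    {"prefix": "s3://<my-bucket>/datasets/"},
    "part-a/img-0001.jpg",
    "part-b/img-0002.jpg",
]
with open('train.manifest', 'w') as f:
    json.dump(manifest, f)

# After uploading train.manifest to Amazon S3, point the channel at the manifest object
train_input = TrainingInput(
    s3_data='s3://<my-bucket>/manifests/train.manifest',
    s3_data_type='ManifestFile',
)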

File mode is compatible with SageMaker local mode (starting a SageMaker training container interactively in seconds). For distributed training, you can shard the dataset across multiple instances with the ShardedByS3Key option.
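For example, the following is a sketch of sharding a File mode channel across instances (the S3 path is a placeholder):

from sagemaker.inputs import TrainingInput

# Each training instance receives a distinct subset of the objects under the prefix,
# instead of a full replica of the dataset
sharded_input = TrainingInput(
    s3_data='s3://<my-bucket>/datasets/train/',
    input_mode='File',
    distribution='ShardedByS3Key',
)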

File mode download speed depends on dataset size, average file size, and number of files. For example, the larger the dataset is (or the more files it has), the longer the downloading stage is, during which the compute resource of the instance remains effectively idle. When training with Spot Instances, the dataset is downloaded each time the job resumes after a Spot interruption. Typically, data downloading takes place at approximately 200 MB/s for large files (for example, 5 minutes/50 GB). Whether this startup overhead is acceptable primarily depends on the overall duration of your training job, because a longer training phase means a proportionally smaller download phase.

Amazon S3 FastFile mode

FastFile mode exposes S3 objects via a POSIX-compliant file system interface, as if the files were available on the local disk of your training instance, and streams their content on demand when data is consumed by the training script. This means your dataset no longer needs to fit into the training instance storage space, and you don’t need to wait for the dataset to be downloaded to the training instance before training can start.

To facilitate this, SageMaker lists all the object metadata stored under the specified S3 prefix before your training script runs. This metadata is used to create a read-only FUSE (file system in userspace) that is available to your training script via /opt/ml/input/data/training-channel-name. Listing S3 objects runs as fast as 5,500 objects per second regardless of their size. This is much quicker than downloading files upfront, as is the case with File mode. While your training script is running, it can list or read files as if they were available locally. Each read operation is delegated to the FUSE service, which proxies GET requests to Amazon S3 in order to deliver the actual file content to the caller. Like a local file system, FastFile treats files as bytes, so it’s agnostic to file formats. FastFile mode can reach a throughput of more than 1 GB/s when reading large files sequentially using multiple workers. You can use FastFile to read small files or retrieve random byte ranges, but you should expect a lower throughput for such access patterns. You can optimize your read access pattern by serializing many small files into larger file containers, and reading them sequentially.

FastFile currently supports S3 prefixes only (no support for manifest and augmented manifest), and FastFile mode is compatible with SageMaker local mode.

Amazon S3 Pipe mode

Pipe mode is another streaming mode that is largely replaced by the newer and simpler-to-use FastFile mode.

With Pipe mode, data is pre-fetched from Amazon S3 at high concurrency and throughput, and streamed into Unix named FIFO pipes. Each pipe may only be read by a single process. A SageMaker-specific extension to TensorFlow conveniently integrates Pipe mode into the native TensorFlow data loader for streaming text, TFRecords, or RecordIO file formats. Pipe mode also supports managed sharding and shuffling of data.
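As an illustration, the following sketch assumes the sagemaker-tensorflow extension package is installed in the training container and that it exposes a PipeModeDataset; the parsing function and feature keys are placeholders:

import tensorflow as tf
from sagemaker_tensorflow import PipeModeDataset  # assumed extension package

def parse_example(serialized):
    # Placeholder parser for your TFRecord schema
    features = {'image': tf.io.FixedLenFeature([], tf.string),
                'label': tf.io.FixedLenFeature([], tf.int64)}
    return tf.io.parse_single_example(serialized, features)

# Stream TFRecord examples from the pipe attached to the 'train' channel
ds = PipeModeDataset(channel='train', record_format='TFRecord')
ds = ds.map(parse_example, num_parallel_calls=tf.data.AUTOTUNE)
ds = ds.batch(32).prefetch(1)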

FSx for Lustre

FSx for Lustre can scale to hundreds of GB/s of throughput and millions of IOPS with low-latency file retrieval.

When starting a training job, SageMaker mounts the FSx for Lustre file system to the training instance file system, then starts your training script. Mounting itself is a relatively fast operation that doesn’t depend on the size of the dataset stored in FSx for Lustre.

In many cases, you create an FSx for Lustre file system and link it to an S3 bucket and prefix. When linked to an S3 bucket as the source, files are lazy-loaded into the file system as your training script reads them. This means that right after the first epoch of your first training run, the entire dataset is copied from Amazon S3 to the FSx for Lustre storage (assuming an epoch is defined as a single full sweep through the training examples, and that the allocated FSx for Lustre storage is large enough). This enables low-latency file access for any subsequent epochs and training jobs with the same dataset.

You can also preload files into the file system before starting the training job, which alleviates the cold start due to lazy loading. It’s also possible to run multiple training jobs in parallel that are serviced by the same FSx for Lustre file system. To access FSx for Lustre, your training job must connect to a VPC (see VPCConfig settings), which requires DevOps setup and involvement. To avoid data transfer costs, the file system uses a single Availability Zone, and you need to specify this Availability Zone ID when running the training job. Because you’re using Amazon S3 as your long-term data storage, we recommend deploying your FSx for Lustre with Scratch 2 storage, as a cost-effective, short-term storage choice for high throughput, providing a baseline of 200 MB/s and a burst of up to 1300 MB/s per TB of provisioned storage.
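A sketch of wiring this up with the SageMaker Python SDK follows; the file system ID, mount path, subnet, and security group are placeholders, and the subnet should sit in the same Availability Zone as the file system:

from sagemaker.estimator import Estimator
from sagemaker.inputs import FileSystemInput

# The training job runs inside the VPC so it can reach the FSx for Lustre file system
estimator = Estimator(
    image_uri='<training-image-uri>',
    role='<execution-role-arn>',
    instance_count=1,
    instance_type='ml.p3.2xlarge',
    subnets=['subnet-0123456789abcdef0'],          # subnet in the file system's Availability Zone
    security_group_ids=['sg-0123456789abcdef0'],
)

fsx_input = FileSystemInput(
    file_system_id='fs-0123456789abcdef0',
    file_system_type='FSxLustre',
    directory_path='/<mount-name>/train',          # path inside the file system
    file_system_access_mode='ro',
)

estimator.fit({'train': fsx_input})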

With your FSx for Lustre file system constantly running, you can start new training jobs without waiting for a file system to be created, and don’t have to worry about the cold start during the very first epoch (because files could still be cached in the FSx for Lustre file system). The downside in this scenario is the extra cost associated with keeping the file system running. Alternatively, you could create and delete the file system before and after each training job (probably with scripted automation to help), but it takes time to initialize an FSx for Lustre file system, which is proportional to the number of files it holds (for example, it takes about an hour to index approximately 2 million objects from Amazon S3).

Amazon EFS

We recommend using Amazon EFS if your training data already resides in Amazon EFS due to use cases besides ML training. To use Amazon EFS as a data source, the data must already reside in Amazon EFS prior to training. SageMaker mounts the specified Amazon EFS file system to the training instance, then starts your training script. When configuring the Amazon EFS file system, you need to choose between the default General Purpose performance mode, which is optimized for latency (good for small files), and Max I/O performance mode, which can scale to higher levels of aggregate throughput and operations per second (better for training jobs with many I/O workers). To learn more, refer to Using the right performance mode.

Additionally, you can choose between two metered throughput options: bursting throughput, and provisioned throughput. Bursting throughput for a 1 TB file system provides a baseline of 150 MB/s, while being able to burst to 300 MB/s for a time period of 12 hours a day. If you need higher baseline throughput, or find yourself running out of burst credits too many times, you could either increase the size of the file system or switch to provisioned throughput. In provisioned throughput, you pay for the desired baseline throughput up to a maximum of 3072 MB/s read.

Your training job must connect to a VPC (see VPCConfig settings) to access Amazon EFS.

Choosing the best data source

The best data source for your training job depends on workload characteristics like dataset size, file format, average file size, training duration, sequential or random data loader read pattern, and how fast your model can consume the training data.

The following flowchart provides some guidelines to help you get started:

When to use Amazon EFS

If your dataset is primarily stored on Amazon EFS, you may have a preprocessing or annotations application that uses Amazon EFS for storage. You could easily run a training job configured with a data channel that points to the Amazon EFS file system (for more information, refer to Speed up training on Amazon SageMaker using Amazon FSx for Lustre and Amazon EFS file systems). If performance is not quite as good as you expected, check your optimization options with the Amazon EFS performance guide, or consider other input modes.

Use File mode for small datasets

If the dataset is stored on Amazon S3 and its overall volume is relatively small (for example, less than 50–100 GB), try using File mode. The overhead of downloading a dataset of 50 GB can vary based on the total number of files (for example, about 5 minutes if chunked into 100 MB shards). Whether this startup overhead is acceptable primarily depends on the overall duration of your training job, because a longer training phase means a proportionally smaller download phase.

Serializing many small files together

If your dataset size is small (less than 50–100 GB), but is made up of many small files (less than 50 MB), the File mode download overhead grows, because each file needs to be downloaded individually from Amazon S3 to the training instance volume. To reduce this overhead, and to speed up data traversal in general, consider serializing groups of smaller files into fewer larger file containers (such as 150 MB per file) by using file formats such as TFRecord for TensorFlow, WebDataset for PyTorch, or RecordIO for MXNet. These formats require your data loader to iterate through examples sequentially. You could still shuffle your data by randomly reordering the list of TFRecord files after each epoch, and by randomly sampling data from a local shuffle buffer (see the following TensorFlow example).
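For illustration, here is a minimal sketch that packs a directory of small image files into larger TFRecord shards; directory names, shard size, and feature keys are arbitrary:

import os
import tensorflow as tf

def write_tfrecord_shards(image_dir, output_prefix, files_per_shard=500):
    """Pack many small image files into fewer, larger TFRecord shards."""
    files = sorted(os.listdir(image_dir))
    for shard_id, start in enumerate(range(0, len(files), files_per_shard)):
        shard_path = f'{output_prefix}-{shard_id:05d}.tfrecord'
        with tf.io.TFRecordWriter(shard_path) as writer:
            for name in files[start:start + files_per_shard]:
                with open(os.path.join(image_dir, name), 'rb') as f:
                    image_bytes = f.read()
                # Store the raw image bytes plus the original file name
                example = tf.train.Example(features=tf.train.Features(feature={
                    'image': tf.train.Feature(bytes_list=tf.train.BytesList(value=[image_bytes])),
                    'filename': tf.train.Feature(bytes_list=tf.train.BytesList(value=[name.encode()])),
                }))
                writer.write(example.SerializeToString())

write_tfrecord_shards('train-images', 'train')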

When to use FastFile mode

For larger datasets with larger files (more than 50 MB), the first option is to try FastFile mode, which is more straightforward to use than FSx for Lustre because it doesn’t require creating a file system, or connecting to a VPC. FastFile mode is ideal for large file containers (more than 150 MB), and might also do well with files more than 50 MB. Because FastFile mode provides a POSIX interface, it supports random reads (reading non-sequential byte-ranges). However, this isn’t the ideal use case, and your throughput would probably be lower than with the sequential reads. However, if you have a relatively large and computationally intensive ML model, FastFile mode may still be able to saturate the effective bandwidth of the training pipeline and not result in an I/O bottleneck. You’ll need to experiment and see. Luckily, switching from File mode to FastFile (and back) is as easy as adding (or removing) the input_mode='FastFile' parameter while defining your input channel using the SageMaker Python SDK:

sagemaker.inputs.TrainingInput(S3_INPUT_FOLDER, input_mode='FastFile') 

No other code or configuration needs to change.

When to use FSx for Lustre

If your dataset is too large for File mode, or has many small files (which you can’t serialize easily), or you have a random read access pattern, FSx for Lustre is a good option to consider. Its file system scales to hundreds of GB/s of throughput and millions of IOPS, which is ideal when you have many small files. However, as already discussed earlier, be mindful of the cold start issues due to lazy loading, and the overhead of setting up and initializing the FSx for Lustre file system.

Cost considerations

For the majority of ML training jobs, especially jobs utilizing GPUs or purpose-built ML chips, most of the cost to train is the ML training instance’s billable seconds. Storage GB per month, API requests, and provisioned throughput are additional costs that are directly associated with the data sources you use.

Storage GB per month

Storage GB per month can be significant for larger datasets, such as videos, LiDAR sensor data, and AdTech real-time bidding logs. For example, storing 1 TB in the Amazon S3 Intelligent-Tiering Frequent Access Tier costs $23 per month. Adding the FSx for Lustre file system on top of Amazon S3 results in additional costs. For example, creating a 1.2 TB file system of SSD-backed Scratch 2 type with data compression disabled costs an additional $168 per month ($140/TB/month).

With Amazon S3 and Amazon EFS, you pay only for what you use, meaning that you’re charged according to the actual dataset size. With FSx for Lustre, you’re charged by the provisioned file system size (1.2 TB at minimum). When running ML instances with EBS volumes, Amazon EBS is charged independently of the ML instance. This is usually a much lower cost compared to the cost of running the instance. For example, running an ml.p3.2xlarge instance with a 100 GB EBS volume for 1 hour costs $3.825 for the instance and $0.02 for the EBS volume.

API requests and provisioned throughput cost

While your training job is crunching through the dataset, it lists and fetches files by dispatching Amazon S3 API requests. For example, each million GET requests is priced at $0.4 (with the Intelligent-Tiering class). You should expect no data transfer cost for bandwidth in and out of Amazon S3, because training takes place in a single Availability Zone.

When using an FSx for Lustre that is linked to an S3 bucket, you incur Amazon S3 API request costs for reading data that isn’t yet cached in the file system, because FSx For Lustre proxies the request to Amazon S3 (and caches the result). There are no direct request costs for FSx for Lustre itself. When you use an FSx for Lustre file system, avoid costs for cross-Availability Zone data transfer by running your training job connected to the same Availability Zone that you provisioned the file system in. Amazon EFS with provisioned throughput adds an extra cost to consdier beyond GB per month.

Performance case study

To demonstrate the training performance considerations mentioned earlier, we performed a series of benchmarks for a realistic use case in the computer vision domain. The benchmarks (and takeaways) from this section might not be applicable to all scenarios, and are affected by various predetermined factors we used, such as the DNN architecture. We ran tests for 12 combinations of the following:

  • Input modes – FSx for Lustre, File mode, FastFile mode
  • Dataset size – Smaller dataset (1 GB), larger dataset (54 GB)
  • File size – Smaller files (JPG, approximately 39 KB), larger files (TFRecord, approximately 110 MB)

For this case study, we chose the most widely used input modes, and therefore omitted Amazon EFS and Pipe mode.

The case study benchmarks were designed as end-to-end SageMaker TensorFlow training jobs on an ml.p3.2xlarge single-GPU instance. We chose the renowned ResNet-50 as our backbone model for the classification task and Caltech-256 as the smaller training dataset (which we replicated 50 times to create its larger dataset version). We performed the training for one epoch, defined as a single full sweep through the training examples.

The following graphs show the total billable time of the SageMaker training jobs for each benchmark scenario. The total job time itself comprises downloading, training, and other stages (such as container startup and uploading trained model artifacts to Amazon S3). Shorter billable times translate into faster and cheaper training jobs.

Let’s first discuss Scenario A and Scenario C, which conveniently demonstrate the performance difference between input modes when the dataset is comprised of many small files.

Scenario A (smaller files, smaller dataset) reveals that the training job with the FSx for Lustre file system has the smallest billable time. It has the shortest downloading phase, and its training stage is as fast as File mode, but faster than FastFile. FSx for Lustre is the winner in this single epoch test. Having said that, consider a similar workload but with multiple epochs—the relative overhead of File mode due to the downloading stage decreases as more epochs are added. In this case, we prefer File mode for its ease of use. Additionally, you might find that using File mode and paying for 100 extra billable seconds is a better choice than paying for and provisioning an FSx for Lustre file system.

Scenario C (smaller files, larger dataset) shows FSx for Lustre as the fastest mode, with only 5,000 seconds of total billable time. It also has the shortest downloading stage, because mounting the FSx for Lustre file system doesn’t depend on the number of files in the file system (1.5 million files in this case). The downloading overhead of FastFile is also small; it only fetches metadata of the files residing under the specified S3 bucket prefix, while the content of the files is read during the training stage. File mode is the slowest mode, spending 10,000 seconds to download the entire dataset upfront before starting training. When we look at the training stage, FSx for Lustre and File mode demonstrate similar excellent performance. As for FastFile mode, when streaming smaller files directly from Amazon S3, the overhead for dispatching a new GET request for each file becomes significant relative to the total duration of the file transfer (despite using a highly parallel data loader with prefetch buffer). This results in an overall lower throughput for FastFile mode, which creates an I/O bottleneck for the training job. FSx for Lustre is the clear winner in this scenario.

Scenarios B and D show the performance difference across input modes when the dataset is comprised of fewer larger files. Reading sequentially using larger files typically results in better I/O performance because it allows effective buffering and reduces the number of I/O operations.

Scenario B (larger files, smaller dataset) shows similar training stage time for all modes (testifying that the training isn’t I/O-bound). In this scenario, we prefer FastFile mode over File mode due to its shorter downloading stage, and prefer FastFile mode over FSx for Lustre because it’s easier to use.

Scenario D (larger files, larger dataset) shows relatively similar total billable times for all three modes. The downloading phase of File mode is longer than that of FSx for Lustre and FastFile. File mode downloads the entire dataset (54 GB) from Amazon S3 to the training instance before starting the training stage. All three modes spend similar time in the training phase, because all modes can fetch data fast enough and are GPU-bound. If we use ML instances with additional CPU or GPU resources, such as ml.p4d.24xlarge, the required data I/O throughput to saturate the compute resources grows. In these cases, we can expect FastFile and FSx for Lustre to successfully scale their throughput (however, FSx for Lustre throughput depends on provisioned file system size). The ability of File mode to scale its throughput depends on the throughput of the disk volume attached to the instance. For example, Amazon EBS-backed instances (like ml.p3.2xlarge, ml.p3.8xlarge, and ml.p3.16xlarge) are limited to a maximum throughput of 250 MB/s, whereas local NVMe-backed instances (like ml.g5.* or ml.p4d.24xlarge) can accommodate a much larger throughput.

To summarize, we believe FastFile is the winner for this scenario because it’s faster than File mode, and just as fast as FSx for Lustre, yet more straightforward to use, costs less, and can easily scale up its throughput as needed.

Additionally, if we had a much larger dataset (several TBs in size), File mode would spend many hours downloading the dataset before training could start, whereas FastFile could start training significantly more quickly.

Bring your own data ingestion

The native data sources of SageMaker fit most, but not all, possible ML training scenarios. The situations when you might need to look for other data ingestion options could include reading data directly from a third-party storage product (assuming an easy and timely export to Amazon S3 isn’t possible), or having a strong requirement for the same training script to run unchanged on both SageMaker and Amazon Elastic Compute Cloud (Amazon EC2) or Amazon Elastic Kubernetes Service (Amazon EKS). You can address these cases by implementing your data ingestion mechanism into the training script. This mechanism is responsible for reading datasets from external data sources into the training instance. For example, the TFRecordDataset of TensorFlow’s tf.data library can read directly from Amazon S3 storage.
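For example, here is a sketch of reading TFRecord shards directly by their S3 URIs; this assumes a TensorFlow build with S3 file system support, and the bucket and key names are placeholders:

import tensorflow as tf

# TFRecord shards referenced directly by S3 URI
filenames = [
    's3://<my-bucket>/datasets/train-00000.tfrecord',
    's3://<my-bucket>/datasets/train-00001.tfrecord',
]
dataset = tf.data.TFRecordDataset(filenames, num_parallel_reads=tf.data.AUTOTUNE)
dataset = dataset.batch(32).prefetch(1)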

If your data ingestion mechanism needs to call any AWS services, such as Amazon Relational Database Service (Amazon RDS), make sure that the AWS Identity and Access Management (IAM) role of your training job includes the relevant IAM policies. If the data source resides in Amazon Virtual Private Cloud (Amazon VPC), you need to run your training job connected to the same VPC.

When you’re managing dataset ingestion yourself, SageMaker lineage tracking can’t automatically log the datasets used during training. Therefore, consider alternative mechanisms, like training job tags or hyperparameters, to capture your relevant metadata.
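For example, the following is a sketch of recording the dataset name and version as training job tags; the tag keys and values are arbitrary, and the image URI and role are placeholders:

from sagemaker.estimator import Estimator

# Capture dataset provenance as training job tags, since automatic lineage
# tracking isn't available for self-managed ingestion
estimator = Estimator(
    image_uri='<training-image-uri>',
    role='<execution-role-arn>',
    instance_count=1,
    instance_type='ml.p3.2xlarge',
    tags=[
        {'Key': 'dataset-name', 'Value': 'my-dataset'},
        {'Key': 'dataset-version', 'Value': '2022-05-01'},
    ],
)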

Conclusion

Choosing the right SageMaker training data source could have a profound effect on the speed, ease of use, and cost of training ML models. Use the provided flowchart to get started quickly, observe the results, and experiment with additional configuration as needed. Keep in mind the pros, cons, and limitations of each data source, and how well they suit your training job’s individual requirements. Reach out to an AWS contact for further information and assistance.


About the Authors

Gili Nachum is a senior AI/ML Specialist Solutions Architect who works as part of the EMEA Amazon Machine Learning team. Gili is passionate about the challenges of training deep learning models, and how machine learning is changing the world as we know it. In his spare time, Gili enjoys playing table tennis.

Dr. Alexander Arzhanov is an AI/ML Specialist Solutions Architect based in Frankfurt, Germany. He helps AWS customers design and deploy their ML solutions across the EMEA region. Prior to joining AWS, Alexander researched the origins of heavy elements in our universe and grew passionate about ML after using it in his large-scale scientific calculations.

Read More

How InpharmD uses Amazon Kendra and Amazon Lex to drive evidence-based patient care

This is a guest post authored by Dr. Janhavi Punyarthi, Director of Brand Development at InpharmD.

The intersection of DI and AI: Drug information (DI) refers to the discovery, use, and management of healthcare and medical information. Healthcare providers face many challenges in drug information discovery, such as the significant time involved, limited accessibility, and difficulty finding reliable, accurate data. A typical clinical query requires a literature search that takes an average of 18.5 hours. In addition, drug information often lies in disparate information silos, behind paywalls and design walls, and quickly becomes stale.

InpharmD is a mobile-based, academic network of drug information centers that combines the power of artificial intelligence and pharmacy intelligence to provide curated, evidence-based responses to clinical inquiries. The goal at InpharmD is to deliver accurate drug information efficiently, so healthcare providers can make informed decisions quickly and provide optimal patient care.

To meet this goal, InpharmD built Sherlock, a prototype bot that reads and deciphers medical literature. Sherlock is based on AI services including Amazon Kendra, an intelligent search service, and Amazon Lex, a fully managed AI service for building conversational interfaces into any application. With Sherlock, healthcare providers can retrieve valuable clinical evidence, which allows them to make data-driven decisions and spend more time with patients. Sherlock has access to over 5,000 of InpharmD’s abstracts and 1,300 drug monographs from the American Society of Health System Pharmacists (ASHP). This data bank expands every day as more abstracts and monographs are uploaded and edited. Sherlock filters for relevancy and recency to quickly search through thousands of PDFs, studies, abstracts, and other documents, and provide responses with 94% accuracy when compared to humans.

The following is a preliminary textual similarity score and manual evaluation between a machine-generated summary and human summary.

InpharmD and AWS

AWS serves as an accelerator for InpharmD. AWS SDKs significantly reduce development time by providing common functionalities that allow InpharmD to focus on delivering quality results. AWS services like Amazon Kendra and Amazon Lex allow InpharmD to worry less about scaling, systems maintenance, and stability.

The following diagram illustrates the architecture of AWS services for Sherlock:

InpharmD would not have been able to build Sherlock without the help of AWS. At the core, InpharmD uses Amazon Kendra as the foundation of its machine learning (ML) initiatives to index InpharmD’s library of documents and provide smart answers using natural language processing. This is superior to traditional fuzzy search-based algorithms, and the result is better answers for user questions.

InpharmD then used Amazon Lex to create Sherlock, a chatbot service that delivers Amazon Kendra’s ML-powered search results through an easy-to-use conversational interface. Sherlock uses the natural language understanding capabilities of Amazon Lex to detect the intent and better understand the context of questions in order to find the best answers. This allows for more natural conversations regarding medical literature inquiries and responses.

In addition, InpharmD stores its drug information content in the cloud in Amazon Simple Storage Service (Amazon S3) buckets. AWS Lambda allows InpharmD to scale server logic and interact with various AWS services with ease, and is key in connecting Amazon Kendra to other services such as Amazon Lex.

AWS has been essential in accelerating the development of Sherlock. We don’t have to worry as much about scaling, systems maintenance, and stability because AWS takes care of it for us. With Amazon Kendra and Amazon Lex, we’re able to build the best version of Sherlock and reduce our development time by months. On top of that, we’re also able to decrease the time for each literature search by 16%.

– Tulasee Chintha, Chief Technological Officer and co-founder of InpharmD.

Impact

Trusted by a network of over 10,000 providers and eight health systems, InpharmD helps deliver evidence-based information that accelerates decision-making and saves time for clinicians. With the help of InpharmD services, the time for each literature search is decreased by 16%, saving approximately 3 hours per search. InpharmD also provides a comprehensive result, with approximately 12 journal article summaries for each literature search. With the implementation of Sherlock, InpharmD hopes to make the literature search process even more efficient, summarizing more studies in less time.

The Sherlock prototype is currently being beta tested and shared with providers to get user feedback.

Access to the InpharmD platform is very customizable. I was happy that the InpharmD team worked with me to meet my specific needs and the needs of my institution. I asked Sherlock about the safety of a drug and the product gave me a summary and literature to answer complex clinical questions fast. This product does a lot of the work that earlier involved a lot of clicking and searching and trying tons of different search vendors. For a busy physician, it works great. It saved me time and helped ensure I was using the most up-to-date research for my decision-making. This would’ve been a game changer when I was at an academic hospital doing clinical research, but even as a private physician it’s great to ensure you’re always up to date with the current evidence.

– Ghaith Ibrahim, MD at Wellstar Health System.

Conclusion

Our team at InpharmD is excited to build on the early success we have seen from deploying Sherlock with the help of Amazon Kendra and Amazon Lex. Our plan for Sherlock is to evolve it into an intelligent assistant that is available anytime, anywhere. In the future, we hope to integrate Sherlock with Amazon Alexa so providers can have immediate, contactless access to evidence, allowing them to make fast data-driven clinical decisions that ensure optimal patient care.


About the Author

Dr. Janhavi Punyarthi is an innovative pharmacist leading brand development and engagement at InpharmD. With a passion for creativity, Dr. Punyarthi enjoys combining her love for writing and evidence-based medicine to present clinical literature in engaging ways.

Disclaimer: AWS is not responsible for the content or accuracy of this post. The content and opinions in this post are solely those of the third-party author. It is each customer’s responsibility to determine whether they are subject to HIPAA, and if so, how best to comply with HIPAA and its implementing regulations. Before using AWS in connection with protected health information, customers must enter an AWS Business Associate Addendum (BAA) and follow its configuration requirements.

Read More