Automate Amazon Rekognition Custom Labels model training and deployment using AWS Step Functions

With Amazon Rekognition Custom Labels, you can have Amazon Rekognition train a custom model for object detection or image classification specific to your business needs. For example, Rekognition Custom Labels can find your logo in social media posts, identify your products on store shelves, classify machine parts in an assembly line, distinguish healthy and infected plants, or detect animated characters in videos.

Developing a custom model to analyze images is a significant undertaking that requires time, expertise, and resources, and it often takes months to complete. It also typically requires thousands or tens of thousands of hand-labeled images to give the model enough data to make accurate decisions. Gathering this data can take months and can require large teams of labelers to prepare it for use in machine learning (ML).

With Rekognition Custom Labels, we take care of the heavy lifting for you. Rekognition Custom Labels builds on the existing capabilities of Amazon Rekognition, which is already trained on tens of millions of images across many categories. Instead of thousands of images, you only need to upload a small set of training images (typically a few hundred or fewer) that are specific to your use case via our easy-to-use console. If your images are already labeled, Amazon Rekognition can begin training in just a few clicks. If not, you can label them directly within the Amazon Rekognition labeling interface, or use Amazon SageMaker Ground Truth to label them for you. After Amazon Rekognition begins training from your image set, it produces a custom image analysis model for you in just a few hours. Behind the scenes, Rekognition Custom Labels automatically loads and inspects the training data, selects the right ML algorithms, trains a model, and provides model performance metrics. You can then use your custom model via the Rekognition Custom Labels API and integrate it into your applications.

However, building a Rekognition Custom Labels model and hosting it for real-time predictions involves several steps: creating a project, creating the training and validation datasets, training the model, evaluating the model, and then creating an endpoint. After the model is deployed for inference, you might have to retrain the model when new data becomes available or if feedback is received from real-world inference. Automating the whole workflow can help reduce manual work.

In this post, we show how you can use AWS Step Functions to build and automate the workflow. Step Functions is a visual workflow service that helps developers use AWS services to build distributed applications, automate processes, orchestrate microservices, and create data and ML pipelines.

Solution overview

The Step Functions workflow is as follows:

  1. We first create an Amazon Rekognition project.
  2. In parallel, we create the training and the validation datasets using existing datasets. We can use the following methods:
    1. Import a folder structure from Amazon Simple Storage Service (Amazon S3) with the folders representing the labels.
    2. Use a local computer.
    3. Use Ground Truth.
    4. Create a dataset using an existing dataset with the AWS SDK.
    5. Create a dataset with a manifest file with the AWS SDK.
  3. After the datasets are created, we train a Custom Labels model using the CreateProjectVersion API. This could take from minutes to hours to complete.
  4. After the model is trained, we evaluate the model using the F1 score output from the previous step. We use the F1 score as our evaluation metric because it provides a balance between precision and recall. You can also use precision or recall as your model evaluation metrics. For more information on custom label evaluation metrics, refer to Metrics for evaluating your model.
  5. We then start to use the model for predictions if we are satisfied with the F1 score.
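The five steps above can be sketched in plain Python against the Amazon Rekognition API. This is a minimal sketch rather than the workflow's actual implementation: the function names, the 0.8 F1 threshold, and the `training-output` key prefix are assumptions made for illustration, and the datasets are created sequentially here instead of in parallel. The `client` argument is expected to behave like `boto3.client("rekognition")`.

```python
import time


def dataset_source_from_existing(dataset_arn):
    """DatasetSource that copies an existing Custom Labels dataset."""
    return {"DatasetArn": dataset_arn}


def dataset_source_from_manifest(bucket, key):
    """DatasetSource that reads a manifest file stored in Amazon S3."""
    return {"GroundTruthManifest": {"S3Object": {"Bucket": bucket, "Name": key}}}


def wait_for_dataset(client, dataset_arn, poll_seconds):
    """Poll until the dataset leaves CREATE_IN_PROGRESS; return the final status."""
    while True:
        description = client.describe_dataset(DatasetArn=dataset_arn)
        status = description["DatasetDescription"]["Status"]
        if status != "CREATE_IN_PROGRESS":
            return status
        time.sleep(poll_seconds)


def train_and_deploy(client, project_name, version_name, train_source,
                     test_source, output_bucket, min_f1=0.8, poll_seconds=60):
    """Return the model version ARN if the model was started, otherwise None."""
    # Step 1: create the project.
    project_arn = client.create_project(ProjectName=project_name)["ProjectArn"]

    # Step 2: create the training and validation (TEST) datasets.
    dataset_arns = [
        client.create_dataset(ProjectArn=project_arn, DatasetType=dataset_type,
                              DatasetSource=source)["DatasetArn"]
        for dataset_type, source in (("TRAIN", train_source), ("TEST", test_source))
    ]
    for dataset_arn in dataset_arns:
        if wait_for_dataset(client, dataset_arn, poll_seconds) != "CREATE_COMPLETE":
            return None

    # Step 3: train a model version from the project's datasets.
    version_arn = client.create_project_version(
        ProjectArn=project_arn,
        VersionName=version_name,
        OutputConfig={"S3Bucket": output_bucket, "S3KeyPrefix": "training-output"},
    )["ProjectVersionArn"]

    # Training can take minutes to hours, so poll until it finishes.
    while True:
        description = client.describe_project_versions(
            ProjectArn=project_arn, VersionNames=[version_name]
        )["ProjectVersionDescriptions"][0]
        if description["Status"] != "TRAINING_IN_PROGRESS":
            break
        time.sleep(poll_seconds)

    # Step 4: evaluate the model using the F1 score reported by training.
    if description["Status"] != "TRAINING_COMPLETED":
        return None
    if description["EvaluationResult"]["F1Score"] < min_f1:
        return None

    # Step 5: start the model so that it can serve predictions.
    client.start_project_version(ProjectVersionArn=version_arn, MinInferenceUnits=1)
    return version_arn
```

Passing the client in as an argument keeps the sketch easy to test with a stub. In Step Functions, the polling loops would typically be expressed as wait-and-retry states rather than `time.sleep` calls.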

The following diagram illustrates the Step Functions workflow.


Prerequisites

Before deploying the workflow, we need existing training and validation datasets for the workflow to use. Complete the following steps:

  1. First, create an Amazon Rekognition project.
  2. Then, create the training and validation datasets.
  3. Finally, install the AWS SAM CLI.

Deploy the workflow

To deploy the workflow, clone the GitHub repository:

git clone
cd rekognition-customlabels-automation-with-stepfunctions
sam build
sam deploy --guided

These commands build, package, and deploy your application to AWS, with a series of prompts as explained in the repository.

Run the workflow

To test the workflow, navigate to the deployed workflow on the Step Functions console, then choose Start execution.

The workflow could take a few minutes to a few hours to complete. If the model passes the evaluation criteria, an endpoint for the model is created in Amazon Rekognition. If the model doesn’t pass the evaluation criteria or the training failed, the workflow fails. You can check the status of the workflow on the Step Functions console. For more information, refer to Viewing and debugging executions on the Step Functions console.
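If you prefer to start and monitor the workflow from code instead of the console, the following minimal sketch uses the Step Functions `StartExecution` and `DescribeExecution` APIs. The function name and the empty default input are assumptions; the input the deployed state machine expects is defined by the repository. The `sfn` argument is expected to behave like `boto3.client("stepfunctions")`.

```python
import json
import time

# Execution statuses after which the workflow makes no further progress.
TERMINAL_STATES = {"SUCCEEDED", "FAILED", "TIMED_OUT", "ABORTED"}


def run_workflow(sfn, state_machine_arn, workflow_input=None, poll_seconds=30):
    """Start one execution and block until it reaches a terminal status."""
    execution_arn = sfn.start_execution(
        stateMachineArn=state_machine_arn,
        input=json.dumps(workflow_input or {}),  # the input must be a JSON string
    )["executionArn"]

    while True:
        status = sfn.describe_execution(executionArn=execution_arn)["status"]
        if status in TERMINAL_STATES:
            return status
        time.sleep(poll_seconds)
```

A return value of `SUCCEEDED` corresponds to the model passing the evaluation criteria, and `FAILED` covers both a failed training job and a model that did not pass.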

Perform model predictions

To perform predictions against the model, you can call the Amazon Rekognition DetectCustomLabels API. To invoke this API, the caller needs to have the necessary AWS Identity and Access Management (IAM) permissions. For more details on performing predictions using this API, refer to Analyzing an image with a trained model.
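As a minimal sketch of a direct call, the following function analyzes an image that is already stored in Amazon S3 and returns the detected labels sorted by confidence. The function name and the default confidence threshold of 85 are assumptions for illustration. The `client` argument is expected to behave like `boto3.client("rekognition")`, and the caller needs permission to invoke `DetectCustomLabels` on the model.

```python
def detect_labels_in_s3_image(client, project_version_arn, bucket, key,
                              min_confidence=85):
    """Return (label name, confidence) pairs, highest confidence first."""
    response = client.detect_custom_labels(
        ProjectVersionArn=project_version_arn,
        Image={"S3Object": {"Bucket": bucket, "Name": key}},
        MinConfidence=min_confidence,
    )
    labels = [(label["Name"], label["Confidence"])
              for label in response["CustomLabels"]]
    return sorted(labels, key=lambda item: item[1], reverse=True)
```

The model must be running (started) before this call succeeds.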

However, if you need to expose the DetectCustomLabels API publicly, you can front the DetectCustomLabels API with Amazon API Gateway. API Gateway is a fully managed service that makes it easy for developers to create, publish, maintain, monitor, and secure APIs at any scale. API Gateway acts as the front door for your DetectCustomLabels API, as shown in the following architecture diagram.

API Gateway forwards the user’s inference request to AWS Lambda. Lambda is a serverless, event-driven compute service that lets you run code for virtually any type of application or backend service without provisioning or managing servers. Lambda receives the API request and calls the Amazon Rekognition DetectCustomLabels API with the necessary IAM permissions. For more information on how to set up API Gateway with Lambda integration, refer to Set up Lambda proxy integrations in API Gateway.

The following is example Lambda function code that calls the DetectCustomLabels API:

import base64
import json
import os

import boto3

client = boto3.client('rekognition', region_name="us-east-1")

# ARN of the trained model version. Reading it from an environment variable
# named PROJECT_VERSION_ARN is an assumption; supply it however suits your setup.
PROJECT_VERSION_ARN = os.environ['PROJECT_VERSION_ARN']

def lambda_handler(event, context):
    # API Gateway base64 encodes the image sent in, so decode the body to raw bytes.
    # The SDK base64 encodes the bytes again automatically when it calls
    # Amazon Rekognition's detect_custom_labels API.
    base64_decoded_image = base64.b64decode(event['body'])

    min_confidence = 85

    # Call DetectCustomLabels
    response = client.detect_custom_labels(
        ProjectVersionArn=PROJECT_VERSION_ARN,
        Image={'Bytes': base64_decoded_image},
        MinConfidence=min_confidence,
    )

    status_code = response['ResponseMetadata']['HTTPStatusCode']
    predictions = {'Predictions': response['CustomLabels']}

    return {
        "statusCode": status_code,
        "body": json.dumps(predictions)
    }
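
The base64 handling in the handler can be checked locally before deploying. The following minimal sketch builds the part of an API Gateway proxy event that the handler reads and parses the response shape that the handler returns. The helper names are hypothetical, and the sketch assumes that API Gateway is configured to treat the image content type as binary, in which case it delivers the body base64 encoded.

```python
import base64
import json


def build_proxy_event(image_bytes):
    """Mimic the fields of an API Gateway proxy event that carry a binary body."""
    return {
        "body": base64.b64encode(image_bytes).decode("ascii"),
        "isBase64Encoded": True,
    }


def extract_image_bytes(event):
    """Recover the raw image bytes the same way the handler does."""
    return base64.b64decode(event["body"])


def parse_predictions(lambda_response):
    """Return the list of predictions from the handler's response."""
    if lambda_response["statusCode"] != 200:
        raise RuntimeError(f"Unexpected status: {lambda_response['statusCode']}")
    return json.loads(lambda_response["body"])["Predictions"]
```

Round-tripping a few bytes through `build_proxy_event` and `extract_image_bytes` confirms that the handler passes the original image, not its base64 text, to Amazon Rekognition.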

Clean up

To delete the workflow, use the AWS SAM CLI:

sam delete --stack-name <your sam project name>

To delete the Rekognition Custom Labels model, you can either use the Amazon Rekognition console or the AWS SDK. For more information, refer to Deleting an Amazon Rekognition Custom Labels model.
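
With the AWS SDK, a running model has to be stopped before it can be deleted. The following is a minimal sketch of that sequence. The function name and polling interval are assumptions, and the `client` argument is expected to behave like `boto3.client("rekognition")`.

```python
import time


def stop_and_delete_model(client, project_arn, version_name, poll_seconds=30):
    """Stop the model version if it is running, then delete it."""

    def describe():
        return client.describe_project_versions(
            ProjectArn=project_arn, VersionNames=[version_name]
        )["ProjectVersionDescriptions"][0]

    description = describe()
    version_arn = description["ProjectVersionArn"]

    # A running model must be stopped first.
    if description["Status"] == "RUNNING":
        client.stop_project_version(ProjectVersionArn=version_arn)
        description = describe()

    # Wait for the stop request to finish.
    while description["Status"] == "STOPPING":
        time.sleep(poll_seconds)
        description = describe()

    client.delete_project_version(ProjectVersionArn=version_arn)
    return description["Status"]
```

The returned value is the last status observed before deletion was requested.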


Conclusion

In this post, we walked through a Step Functions workflow to create a dataset and then train, evaluate, and use a Rekognition Custom Labels model. The workflow allows application developers and ML engineers to automate the custom label classification steps for any computer vision use case. The code for the workflow is open-sourced.

For more serverless learning resources, visit Serverless Land. To learn more about Rekognition Custom Labels, visit Amazon Rekognition Custom Labels.

About the Author

Veda Raman is a Senior Specialist Solutions Architect for machine learning based in Maryland. Veda works with customers to help them architect efficient, secure, and scalable machine learning applications. Veda is interested in helping customers leverage serverless technologies for machine learning.
