Training a custom single class object detection model with Amazon Rekognition Custom Labels

Customers often need to identify single objects in images; for example, to identify their company’s logo, find a specific industrial or agricultural defect, or locate a specific event, like hurricanes, in satellite scans. In this post, we showcase how to train a custom model to detect a single object using Amazon Rekognition Custom Labels.

Amazon Rekognition is a fully managed service that provides computer vision (CV) capabilities for analyzing images and video at scale, using deep learning technology without requiring machine learning (ML) expertise. Amazon Rekognition Custom Labels lets you extend the detection and classification capabilities of the Amazon Rekognition pre-trained APIs by using data to train a custom CV model specific to your business needs. With the latest update to support single object training, Amazon Rekognition Custom Labels now lets you create a custom object detection model with a single object class.

Solution overview

To show you how the single class object detection feature works, we create a custom model to detect pizza in images. Because we only care about finding pizza in our images, we don’t want to create labels for other food types or create a “not pizza” label.

To create our custom model, we follow these steps:

  1. Create a project in Amazon Rekognition Custom Labels.
  2. Create a dataset with images containing one or more pizzas.
  3. Label the images by applying bounding boxes on all pizzas in the images using the user interface provided by Amazon Rekognition Custom Labels.
  4. Train the model and evaluate the performance.
  5. Test the new custom model using the automatically generated API endpoint.

Amazon Rekognition Custom Labels lets you manage the ML model training process on the Amazon Rekognition console, which simplifies the end-to-end process.

Creating your project

To create your pizza-detection project, complete the following steps:

  1. On the Amazon Rekognition console, choose Custom Labels.
  2. Choose Get Started.
  3. For Project name, enter PizzaDetection.
  4. Choose Create project.

You can also create a project on the Projects page. You can access the Projects page via the left navigation pane.
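The project can also be created programmatically instead of on the console. The following is a minimal sketch, assuming the AWS SDK for Python (boto3) and its Rekognition `create_project` call. The helper name `create_pizza_project` is ours, not part of the SDK, and the client is passed in as an argument so the function can be exercised without AWS credentials.

```python
def create_pizza_project(client, project_name="PizzaDetection"):
    """Create a Custom Labels project and return its ARN.

    `client` is assumed to be a boto3 Rekognition client,
    for example boto3.client("rekognition").
    """
    response = client.create_project(ProjectName=project_name)
    # The service identifies the new project by its Amazon Resource Name.
    return response["ProjectArn"]
```

With credentials configured, calling `create_pizza_project(boto3.client("rekognition"))` should return the ARN of the new PizzaDetection project.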

Creating your dataset

To create your pizza model, you first need to create a dataset to train the model with. For this post, our dataset is composed of 39 images that contain pizza. We sourced our images from pexels.com.

To create your dataset:

  1. Choose Create dataset.
  2. Select Upload images from your computer.

  3. Choose Add Images.
  4. Upload your images. You can always add more images later.

Labeling the images with bounding boxes

You’re now ready to label the images by applying bounding boxes on all images with pizza.

  1. Add Pizza as a label to your dataset via the labels list on the left side of the gallery.
  2. Apply the label to the pizzas in the images by selecting all the images with pizza and choosing Draw Bounding Box.

You can use the Shift key to automatically select multiple images between the first and last selected images.

Make sure to draw a bounding box that covers the pizza as tightly as possible.

Training your model

After you label your images, you’re ready to train your model.

  1. Choose Train Model.
  2. For Choose project, choose your PizzaDetection project.
  3. For Choose training dataset, choose your PizzaImages dataset.

As part of the training, Amazon Rekognition Custom Labels requires a labeled test dataset. You use the test dataset to verify how well the trained model predicts the correct labels and to generate evaluation metrics. You don’t use the images in the test dataset to train your model; they should represent the types of images you want your model to analyze.

  4. For Create test set, choose how you want to provide your test dataset.

Amazon Rekognition Custom Labels provides three options:

  • Choose an existing test dataset
  • Create a new test dataset
  • Split training dataset

For this post, we select Split training dataset and let Amazon Rekognition hold back 20% of the images for testing and use the remaining 80% of the images to train the model.
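If you start training through the API rather than the console, the split option roughly corresponds to asking the service to create the test set for you. The sketch below assumes boto3's `create_project_version` call and a training manifest that already exists in Amazon S3. The helper name and every bucket, key, and version value you pass in are hypothetical placeholders, and projects that manage their datasets inside the service may not need the `TrainingData` and `TestingData` arguments at all.

```python
def build_training_request(project_arn, version_name, bucket, manifest_key, output_prefix):
    """Build keyword arguments for Rekognition's create_project_version call."""
    return {
        "ProjectArn": project_arn,
        "VersionName": version_name,
        # Where the service writes training results.
        "OutputConfig": {"S3Bucket": bucket, "S3KeyPrefix": output_prefix},
        "TrainingData": {
            "Assets": [
                {"GroundTruthManifest": {"S3Object": {"Bucket": bucket, "Name": manifest_key}}}
            ]
        },
        # AutoCreate asks the service to hold back part of the training data for testing.
        "TestingData": {"AutoCreate": True},
    }
```

The resulting dictionary can be passed to a boto3 Rekognition client as `client.create_project_version(**request)`.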

Our model took approximately 1 hour to train. The training time required for your model depends on many factors, including the number of images provided in the dataset and the complexity of the model.

After each training run completes, Amazon Rekognition Custom Labels reports key metrics, including F1 score, precision, recall, and the assumed threshold for each label. For more information about metrics, see Metrics for Evaluating Your Model.

Looking at our evaluation results, our model has a precision of 1.0, which means that no objects were mistakenly identified as pizza (false positives) in our test set. Our model did miss some pizzas in our test set (false negatives), which is reflected in our recall score of 0.81. You can often use the F1 score as an overall quality score because it takes both precision and recall into account. Finally, we see that our assumed threshold to generate the F1 score, precision, and recall metrics for Pizza is 0.61. By default, our model returns predictions above this assumed threshold. We can increase the recall for this model if we lower the confidence threshold. However, this would most likely cause a drop in precision.
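To make the relationship between these metrics concrete, the following sketch computes the F1 score as the harmonic mean of precision and recall. The function name is ours, and because the inputs are the rounded precision and recall quoted above, the result only approximates the value the console reports.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall from the evaluation results in this post.
print(round(f1_score(1.0, 0.81), 3))  # prints 0.895
```

Because the harmonic mean is pulled toward the lower of the two values, the missed pizzas (recall of 0.81) keep the F1 score well below the perfect precision.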

We can also choose View Test Results to see each test image and how our model performed. The following screenshot shows an example of a correctly identified image of pizza during the model testing (true positive).

Testing your model

Your custom pizza detection model is now ready for use. Amazon Rekognition Custom Labels provides the API calls for starting and using the model; you don’t need to deploy, provision, or manage any infrastructure. The following screenshot shows the API calls for using the model.
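As a rough illustration of what such a call looks like from Python, the sketch below wraps boto3's `detect_custom_labels`. The helper name `detect_pizzas` and its arguments are our own. It assumes the model version has already been started, and it passes the optional `MinConfidence` parameter (on a 0 to 100 scale) only when you want to override the assumed threshold.

```python
def detect_pizzas(client, model_arn, image_bytes, min_confidence=None):
    """Run the trained model on one image and return the detected labels.

    `client` is assumed to be a boto3 Rekognition client and `model_arn`
    the ARN of a model version that is already running.
    """
    request = {"ProjectVersionArn": model_arn, "Image": {"Bytes": image_bytes}}
    if min_confidence is not None:
        # Overrides the assumed threshold; lower values trade precision for recall.
        request["MinConfidence"] = min_confidence
    response = client.detect_custom_labels(**request)
    return response["CustomLabels"]
```

In boto3, a model version is started with `client.start_project_version(ProjectVersionArn=model_arn, MinInferenceUnits=1)` and stopped with `client.stop_project_version(ProjectVersionArn=model_arn)`; stopping the model when you finish testing avoids paying for idle inference time.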

By using the API, we tried our model on a new test set of images from pexels.com.

For example, the following image shows a pizza on a table with other objects.

The model detects the pizza with a confidence of 91.72% and a correct bounding box. The following code is the JSON response received by the API call:

{
    "CustomLabels": [
        {
            "Name": "Pizza",
            "Confidence": 91.7249984741211,
            "Geometry": {
                "BoundingBox": {
                    "Width": 0.7824199795722961,
                    "Height": 0.3644999861717224,
                    "Left": 0.11868999898433685,
                    "Top": 0.37672001123428345
                }
            }
        }
    ]
}
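The `BoundingBox` values in the response are ratios of the overall image width and height, so drawing the box requires converting them to pixels. The following sketch does that for the response above; the helper name and the 1000 x 800 pixel image size are assumptions made for illustration, because the post doesn't state the dimensions of the test image.

```python
def to_pixel_box(bounding_box, image_width, image_height):
    """Convert a ratio-based BoundingBox into integer pixel coordinates."""
    return {
        "left": round(bounding_box["Left"] * image_width),
        "top": round(bounding_box["Top"] * image_height),
        "width": round(bounding_box["Width"] * image_width),
        "height": round(bounding_box["Height"] * image_height),
    }


# BoundingBox copied from the JSON response above.
box = {
    "Width": 0.7824199795722961,
    "Height": 0.3644999861717224,
    "Left": 0.11868999898433685,
    "Top": 0.37672001123428345,
}

# Hypothetical image size of 1000 x 800 pixels.
print(to_pixel_box(box, image_width=1000, image_height=800))
# prints {'left': 119, 'top': 301, 'width': 782, 'height': 292}
```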

The following image has a confidence score of 98.40%.

The following image has a confidence score of 96.51%.

The following image has an empty JSON result, as expected, because the image doesn’t contain pizza.

The following image also has an empty JSON result.
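In application code, an empty result is simply the "no pizza" case. The sketch below assumes that an empty result is a response whose `CustomLabels` list is empty (or missing); the helper name `contains_label` is ours.

```python
def contains_label(response, name="Pizza"):
    """Return True if the response holds at least one detection of `name`."""
    labels = response.get("CustomLabels", [])
    return any(label.get("Name") == name for label in labels)
```

Because the model was trained on a single class, this check is all that is needed to separate images with pizza from images without it; no “not pizza” label is involved.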

In addition to using the API, you can also use the Custom Labels Demonstration. This AWS CloudFormation template enables you to set up a custom, password-protected UI where you can start and stop your models and run demonstration inferences.

Conclusion

In this post, we showed you how to create a single class object detection model with Amazon Rekognition Custom Labels. This feature makes it easy to train a custom model that can detect an object class without needing to specify other objects or losing accuracy in its results.

For more information about using custom labels, see What Is Amazon Rekognition Custom Labels?


About the Author

Woody Borraccino is a Senior AI Solutions Architect at AWS.