Plant diseases cause significant economic losses in farming by reducing both yield and produce quality. The health of a crop or plant is often assessed by the condition of its leaves, so it is crucial for farmers to identify symptoms early and control a disease before it spreads too far. However, manually determining whether a leaf is infected, the type of infection, and the required disease control solution is a hard problem. Current methods can be error prone and very costly.
This is where an automated machine learning (ML) solution for computer vision (CV) can help. Typically, building complex ML models requires hundreds of thousands of labeled images, along with expertise in data science. In this post, we showcase how you can build an end-to-end disease detection, identification, and resolution recommendation solution using Amazon Rekognition Custom Labels.
Amazon Rekognition is a fully managed service that provides CV capabilities for analyzing images and video at scale, using deep learning technology without requiring ML expertise. Amazon Rekognition Custom Labels, an automated ML feature of Amazon Rekognition, lets you quickly train custom CV models specific to your business needs, simply by bringing labeled images.
We create a custom model to detect the plant leaf disease. To create our custom model, we follow these steps:
- Create a project in Amazon Rekognition Custom Labels.
- Create a dataset with images containing multiple types of plant leaf diseases.
- Train the model and evaluate the performance.
- Test the new custom model using the automatically generated API endpoint.
Amazon Rekognition Custom Labels lets you manage the ML model training process on the Amazon Rekognition console, which simplifies the end-to-end model development and inference process.
Creating your project
To create your plant leaf disease detection project, complete the following steps:
- On the Amazon Rekognition console, choose Custom Labels.
- Choose Get Started.
- For Project name, enter plant-leaf-disease-detection.
- Choose Create project.
You can also create a project on the Projects page. You can access the Projects page via the navigation pane.
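If you prefer to script this step, the following minimal sketch shows one way to create the project, assuming the boto3 Rekognition client and its `create_project` call. The helper name `create_leaf_project` is hypothetical, and the client is passed in as an argument so the function can be tried without AWS credentials.

```python
def create_leaf_project(client, project_name="plant-leaf-disease-detection"):
    """Create a Custom Labels project and return its ARN.

    `client` is assumed to behave like boto3.client("rekognition").
    """
    response = client.create_project(ProjectName=project_name)
    return response["ProjectArn"]


# Usage (assumes AWS credentials and a Region are already configured):
#   import boto3
#   project_arn = create_leaf_project(boto3.client("rekognition"))
```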
Creating your dataset
To create your leaf disease detection model, you first need to create a dataset to train the model with. For this post, our dataset is composed of three categories of plant leaf disease images: bacterial leaf blight, brown spots, and leaf smut.
The following images show examples of bacterial leaf blight.
The following images show examples of brown spots.
The following images show examples of leaf smut.
We sourced our images from the UCI Machine Learning Repository (Dua, D. and Graff, C. (2019). UCI Machine Learning Repository [http://archive.ics.uci.edu/ml]. Irvine, CA: University of California, School of Information and Computer Science). The dataset is described in Prajapati HB, Shah JP, Dabhi VK. Detection and classification of rice plant diseases. Intelligent Decision Technologies. 2017 Jan 1;11(3):357-73, doi: 10.3233/IDT-170301.
To create your dataset, complete the following steps:
- Create an Amazon Simple Storage Service (Amazon S3) bucket.
For this post, we create an S3 bucket called plan-leaf-disease-data.
- Create three folders inside this bucket called Bacterial-Leaf-Blight, Brown-Spot, and Leaf-Smut to store images of each disease category.
- Upload the image files for each category to their respective folder.
- On the Amazon Rekognition console, under Datasets, choose Create dataset.
- Select Import images from Amazon S3 bucket.
- For S3 folder location, enter the S3 bucket path.
- For automatic labeling, select Automatically attach a label to my images based on the folder they’re stored in.
This labels each image with the name of the folder it is stored in.
You can now see the generated S3 bucket permissions policy.
- Copy the JSON policy.
- Navigate to the S3 bucket.
- On the Permission tab, under Bucket policy, choose Edit.
- Enter the JSON policy you copied.
- Choose Save changes.
- Choose Submit.
You can see that image labeling is organized based on the folder name.
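To make the folder-based labeling concrete, here is a small illustrative sketch that derives a label from an S3 object key by taking its immediate parent folder. This is not the service's own implementation, only an approximation of the behavior described above; the function names and the image file names in the usage comment are hypothetical.

```python
from collections import Counter
from pathlib import PurePosixPath


def label_from_key(key):
    """Return the immediate parent folder of an S3 object key, or None."""
    parent = PurePosixPath(key).parent.name
    return parent or None


def count_labels(keys):
    """Count how many keys fall under each folder-derived label."""
    labels = (label_from_key(key) for key in keys)
    return Counter(label for label in labels if label is not None)


# Usage with hypothetical object keys:
#   count_labels(["Brown-Spot/img1.jpg", "Leaf-Smut/img2.jpg"])
```

Counting images per label before training is a quick way to spot an empty or misnamed folder.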
Training your model
After you label your images, you’re ready to train your model.
- Choose Train Model.
- For Choose project, choose your project plant-leaf-disease-detection.
- For Choose training dataset, choose your dataset plant-leaf-disease-dataset.
As part of model training, Amazon Rekognition Custom Labels requires a labeled test dataset. Amazon Rekognition Custom Labels uses the test dataset to verify how well your trained model predicts the correct labels and to generate evaluation metrics. Images in the test dataset are not used to train your model and should represent the same types of images that you plan to analyze with your model.
- For Create test set, select how you want to create your test dataset.
Amazon Rekognition Custom Labels provides three options:
- Choose an existing test dataset
- Create a new test dataset
- Split training dataset
For this post, we select Split training dataset and let Amazon Rekognition hold back 20% of the images for testing and use the remaining 80% of the images to train the model.
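The idea behind an 80/20 split can be sketched in a few lines. The example below is only an illustration of holding back 20% of the images per label; it does not claim to reproduce how Amazon Rekognition performs the split internally, and the function name, the seed, and the per-label strategy are all assumptions.

```python
import random
from collections import defaultdict


def split_by_label(labeled_images, test_fraction=0.2, seed=0):
    """Split (image, label) pairs into train and test lists, label by label."""
    by_label = defaultdict(list)
    for image, label in labeled_images:
        by_label[label].append(image)

    rng = random.Random(seed)  # fixed seed so the split is repeatable
    train, test = [], []
    for label in sorted(by_label):
        images = sorted(by_label[label])
        rng.shuffle(images)
        n_test = round(len(images) * test_fraction)
        test += [(image, label) for image in images[:n_test]]
        train += [(image, label) for image in images[n_test:]]
    return train, test
```

Splitting within each label keeps every disease category represented in both sets, which matters when the dataset is small.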
Our model took approximately 1 hour to train. The training time required for your model depends on many factors, including the number of images provided in the dataset and the complexity of the model.
When training is complete, Amazon Rekognition Custom Labels outputs key quality metrics, including F1 score, precision, recall, and the assumed threshold for each label. For more information about metrics, see Metrics for Evaluating Your Model.
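Outside the console, the training status and overall F1 score can also be read through the API. The sketch below assumes the boto3 Rekognition client and its `describe_project_versions` call; the helper name `training_summary` is hypothetical, and the project ARN and version name are values you supply from your own training run.

```python
def training_summary(client, project_arn, version_name):
    """Return (status, f1_score) for one model version.

    `client` is assumed to behave like boto3.client("rekognition").
    f1_score is None until the service reports an evaluation result.
    """
    response = client.describe_project_versions(
        ProjectArn=project_arn,
        VersionNames=[version_name],
    )
    descriptions = response["ProjectVersionDescriptions"]
    if not descriptions:
        raise ValueError(f"No model version named {version_name!r} found")
    description = descriptions[0]
    f1_score = description.get("EvaluationResult", {}).get("F1Score")
    return description["Status"], f1_score
```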
Our evaluation results show that our model has a precision of 1.0 for Bacterial-Leaf-Blight and Brown-Spot, which means that the model produced no false positives for those labels in our test set. Our model also didn’t miss any positives in our test set (false negatives), which is reflected in our recall score of 1. You can often use the F1 score as an overall quality score because it takes both precision and recall into account. Finally, we see that the assumed thresholds used to generate the F1 score, precision, and recall metrics for each category are 0.62, 0.69, and 0.54 for Bacterial-Leaf-Blight, Brown-Spot, and Leaf-Smut, respectively. By default, our model returns predictions above this assumed threshold.
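For readers who want to see how these three metrics relate, the following sketch computes them from raw counts using the standard definitions. The function name and the counts in the usage comment are hypothetical and are not taken from our evaluation results.

```python
def precision_recall_f1(true_positives, false_positives, false_negatives):
    """Compute precision, recall, and F1 from per-label counts."""
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Usage with hypothetical counts: no false positives and no false negatives
# gives precision, recall, and F1 all equal to 1.0.
#   precision_recall_f1(10, 0, 0)
```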
We can also choose View test results to see how our model performed on each test image. The following screenshot shows an example of a correctly identified image of bacterial leaf blight during the model testing (true positive).
Testing your model
Your plant disease detection model is now ready for use. Amazon Rekognition Custom Labels provides the API calls for starting, using, and stopping your model; you don’t need to manage any infrastructure. For more information, see Starting or Stopping an Amazon Rekognition Custom Labels Model (Console).
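As a rough sketch of the start and stop calls, the example below assumes the boto3 Rekognition client with its `start_project_version` and `stop_project_version` operations. The helper names are hypothetical, and `model_arn` stands for the ARN of your trained model version.

```python
def start_model(client, model_arn, min_inference_units=1):
    """Request that the model start and return the reported status.

    `client` is assumed to behave like boto3.client("rekognition").
    """
    response = client.start_project_version(
        ProjectVersionArn=model_arn,
        MinInferenceUnits=min_inference_units,
    )
    return response["Status"]


def stop_model(client, model_arn):
    """Request that the model stop and return the reported status."""
    response = client.stop_project_version(ProjectVersionArn=model_arn)
    return response["Status"]


# Usage (assumes AWS credentials and a Region are already configured):
#   import boto3
#   rekognition = boto3.client("rekognition")
#   start_model(rekognition, model_arn)   # model_arn is your model version ARN
#   ...
#   stop_model(rekognition, model_arn)    # stop the model to end running charges
```

Starting is asynchronous, so the model is not ready for inference as soon as the call returns; check the model status before sending images.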
In addition to using the API, you can also use the Custom Labels Demonstration. This CloudFormation template enables you to set up a custom, password-protected UI where you can start and stop your models and run demonstration inferences.
After deployment, you can open the application in a web browser at the address given in the url output of the CloudFormation stack that was created when you deployed the solution.
- Choose Start the model.
- Enter the number of inference units required. For this example, enter 1.
You’re charged for the amount of time, in minutes, that the model is running. For more information, see Inference hours.
The model might take a while to start.
- Choose the model name.
- Choose Upload.
A window opens for you to choose the plant leaf image from your local drive.
The model detects the disease in the uploaded leaf image and returns a confidence score. The application also gives a pest control recommendation based on the type of disease.
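A comparable flow can be sketched directly against the API, assuming the boto3 Rekognition client and its `detect_custom_labels` call. The helper names are hypothetical, and the recommendation text is supplied by the caller as a placeholder; the sketch does not reproduce the demo application's actual recommendations.

```python
def detect_disease(client, model_arn, image_bytes, min_confidence=None):
    """Return (label, confidence) for the top prediction, or None.

    `client` is assumed to behave like boto3.client("rekognition").
    MinConfidence is sent only when the caller sets min_confidence.
    """
    request = {"ProjectVersionArn": model_arn, "Image": {"Bytes": image_bytes}}
    if min_confidence is not None:
        request["MinConfidence"] = min_confidence
    labels = client.detect_custom_labels(**request)["CustomLabels"]
    if not labels:
        return None
    top = max(labels, key=lambda label: label["Confidence"])
    return top["Name"], top["Confidence"]


def recommend(label, recommendations):
    """Look up caller-supplied recommendation text for a detected label."""
    return recommendations.get(label, "No recommendation available")


# Usage with placeholder recommendation text (hypothetical values):
#   advice = {"Brown-Spot": "<your recommendation for brown spot>"}
#   result = detect_disease(rekognition, model_arn, open("leaf.jpg", "rb").read())
#   if result:
#       print(result, recommend(result[0], advice))
```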
To avoid incurring unnecessary charges, delete the resources used in this walkthrough when not in use. For instructions, see the following:
In this post, we showed you how to create a plant leaf disease detection model with Amazon Rekognition Custom Labels. This feature makes it easy to train a custom model for your own labels by bringing labeled images, without requiring ML expertise.
For more information about using custom labels, see What Is Amazon Rekognition Custom Labels?
About the Authors
Dhiraj Thakur is a Solutions Architect with Amazon Web Services. He works with AWS customers and partners to provide guidance on enterprise cloud adoption, migration, and strategy. He is passionate about technology and enjoys building and experimenting in the analytics and AI/ML space.
Sameer Goel is a Solutions Architect in Seattle, who drives customer success by building prototypes on cutting-edge initiatives. Prior to joining AWS, Sameer graduated with a master’s degree from NEU Boston, with a concentration in data science. He enjoys building and experimenting with AI/ML projects on Raspberry Pi.