Computer vision, the automatic recognition and description of documents, images, and videos, has far-reaching applications, from identifying defects on high-speed assembly lines, to automating document processing workflows, to identifying products and people in social media. AWS computer vision services, including Amazon Lookout for Vision, AWS Panorama, Amazon Rekognition, and Amazon Textract, help developers automate image, video, and text analysis without requiring machine learning (ML) experience. As a result, you can implement solutions faster and decrease your time to value.
As customers continue to expand their use of computer vision, we have been investing in all of our services to make them easier to apply to use cases, easier to implement with fewer data requirements, and more cost-effective. Recently, AWS was named a Leader in the IDC MarketScape: Asia/Pacific (Excluding Japan) Vision AI Software Platform 2021 Vendor Assessment (Doc # AP47490521, October 2021). The IDC MarketScape evaluated our product functionality, service delivery, research and innovation strategy, and more for three vision AI use cases: productivity, end-user experience, and decision recommendation. They found that our offerings have a product-market fit for all three use cases. The IDC MarketScape recommends that computer vision decision-makers consider AWS for vision AI services when they need to centrally plan vision AI capabilities in a large-scope initiative, such as digital transformation (DX), or want flexible ways to control costs.
“Vision AI is one of the emerging technology markets,” says Christopher Lee Marshall, Associate Vice President, Artificial Intelligence and Analytics Strategies at IDC Asia Pacific. “AWS is placed in the Leader’s Category in IDC MarketScape: Asia/Pacific (Excluding Japan) Vision AI Software Platform 2021 Vendor Assessment. It’s critical to watch the major vendors and more mature market solutions, as the early movers tend to consolidate their strengths with greater access to training data, more iterations of algorithm variations, deeper understanding of the operational contexts, and more systematic approaches to work with solution partners in the ecosystem.”
A key service of focus in the report was Amazon Rekognition. We’re excited to announce several enhancements to make Amazon Rekognition more cost-effective, more accurate, and easier to implement. First, we’re lowering prices for image APIs. Next, we’re enriching Amazon Rekognition with new features for content moderation, text-in-image analysis, and automated machine learning (AutoML). The new capabilities enable more accurate content moderation workflows, optical character recognition for a broader range of scenarios, and simplified training and deployment of custom computer vision models.
These latest announcements add to the Amazon Textract innovations we introduced recently, where we added TIFF file support, lowered the latency of asynchronous operations by 50%, and reduced prices by up to 32% in eight AWS Regions. The Amazon Textract innovations make it easier, faster, and less expensive to process documents at scale using computer vision on AWS.
Let’s dive deeper into the Amazon Rekognition announcements and product improvements.
Up to 38% price reduction for Amazon Rekognition Image APIs
We want to help you get a better return on investment for computer vision workflows. Therefore, we’re lowering the price for all Amazon Rekognition Image APIs by up to 38%. This price reduction applies to all 14 Regions where the Amazon Rekognition service endpoints are available.
We offer four pricing tiers based on usage volume for Amazon Rekognition Image APIs today: up to 1 million, 1 – 10M, 10 – 100M, and above 100M images processed per month. The price points for these tiers are $0.001, $0.0008, $0.0006, and $0.0004 per image. With this price reduction, we lowered the API volumes that unlock lower prices:
- We lowered the upper volume limit of Tier 2 from 10 million to 5 million images per month. As a result, the lower Tier 3 price of $0.0006 per image now applies after 5 million images instead of 10 million.
- We lowered the Tier 4 threshold from 100 million images per month to 35 million images per month.
We summarize the volume threshold changes in the following table.
|Pricing tier||Old volume (images processed per month)||New volume (images processed per month)|
|Tier 1||First 1 million images||First 1 million images (unchanged)|
|Tier 2||Next 9 million images||Next 4 million images|
|Tier 3||Next 90 million images||Next 30 million images|
|Tier 4||Over 100 million images||Over 35 million images|
Finally, we’re lowering the price per image for the highest-volume tier from $0.0004 to $0.00025 for select APIs (Group 2 in the following table). The following table summarizes the new prices for the US East (N. Virginia) Region.
|Pricing tier||Volume (images per month)||Price per image, Group 1 APIs (CompareFaces, IndexFaces, SearchFacesByImage, and SearchFaces)||Price per image, Group 2 APIs (DetectFaces, DetectModerationLabels, DetectLabels, DetectText, and RecognizeCelebrities)|
|Tier 1||First 1 million images||$0.00100||$0.00100|
|Tier 2||Next 4 million images||$0.00080||$0.00080|
|Tier 3||Next 30 million images||$0.00060||$0.00060|
|Tier 4||Over 35 million images||$0.00040||$0.00025|
Your savings will vary based on your usage. The following table provides example savings for a few scenarios in the US East (N. Virginia) Region.
|API volume||Old price (Group 1 and 2 Image APIs)||Group 1 Image APIs: new price||Group 1 Image APIs: % reduction||Group 2 Image APIs: new price||Group 2 Image APIs: % reduction|
|12 Million in a month||$9,400||$8,400||-10.6%||$8,400||-10.6%|
|12M Annual (1M in a month)||$12,000||$12,000||0.0%||$12,000||0.0%|
|60M in a month||$38,200||$32,200||-15.7%||$28,450||-25.5%|
|60M Annual (5M in a month)||$50,400||$50,400||0.0%||$50,400||0.0%|
|120M in a month||$70,200||$56,200||-19.9%||$43,450||-38.1%|
|120M Annual (10M in a month)||$98,400||$86,400||-12.2%||$86,400||-12.2%|
|420M in a month||$190,200||$176,200||-7.4%||$118,450||-37.7%|
|420M Annual (35M in a month)||$278,400||$266,400||-4.3%||$266,400||-4.3%|
|1.2 Billion in a month||$502,200||$488,200||-2.8%||$313,450||-37.6%|
|1.2B Annual (100M in a month)||$746,400||$578,400||-22.5%||$461,400||-38.2%|
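The example savings follow directly from the tier thresholds and per-image prices. As a minimal sketch, the monthly cost can be computed by walking the tiers in order. The function and constant names here are illustrative and not part of any AWS SDK, and the prices are the US East (N. Virginia) figures from the tables above.

```python
# Illustrative cost calculator for the tiered pricing described above.
# All names are hypothetical; prices are US East (N. Virginia) figures.
M = 1_000_000

# Each tier is (tier size in images, price in USD per million images).
# None marks the open-ended top tier.
OLD_TIERS = [(1 * M, 1000), (9 * M, 800), (90 * M, 600), (None, 400)]
NEW_GROUP1_TIERS = [(1 * M, 1000), (4 * M, 800), (30 * M, 600), (None, 400)]
NEW_GROUP2_TIERS = [(1 * M, 1000), (4 * M, 800), (30 * M, 600), (None, 250)]


def monthly_cost(images, tiers):
    """Return the USD cost of processing `images` images in one month."""
    remaining = images
    micro_dollars = 0  # integer arithmetic avoids float rounding
    for size, usd_per_million in tiers:
        in_tier = remaining if size is None else min(remaining, size)
        micro_dollars += in_tier * usd_per_million
        remaining -= in_tier
        if remaining == 0:
            break
    return micro_dollars / M


def percent_reduction(old, new):
    """Return the price change in percent, rounded to one decimal place."""
    return round((new - old) / old * 100, 1)


if __name__ == "__main__":
    old = monthly_cost(120 * M, OLD_TIERS)          # 70200.0
    new = monthly_cost(120 * M, NEW_GROUP2_TIERS)   # 43450.0
    print(old, new, percent_reduction(old, new))    # 70200.0 43450.0 -38.1
```

Annual figures in the table are twelve times the monthly cost at the stated monthly volume, for example `12 * monthly_cost(100 * M, NEW_GROUP2_TIERS)` for the last row.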
Learn more about the price reduction by visiting the pricing page.
Accuracy improvements for content moderation
Organizations need a scalable way to make sure users aren’t exposed to inappropriate user-generated and third-party content in social media, ecommerce, and photo-sharing applications.
The Amazon Rekognition Content Moderation API helps you automatically detect inappropriate or unwanted content to streamline moderation workflows.
With the Amazon Rekognition Content Moderation API, you now get improved accuracy across all ten top-level categories (such as explicit nudity, violence, and tobacco) and all 35 subcategories.
The improvements to the image moderation model reduce false positive rates across all moderation categories. Lower false positive rates mean fewer images flagged for review by human moderators, which reduces their workload and improves efficiency. Combined with the price reduction for image APIs, this gives you more value from your content moderation solution at a lower price. Learn more about the improved Content Moderation API by visiting Moderating content.
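As a rough sketch of how a moderation check might be wired into a workflow, the following uses the `detect_moderation_labels` operation of the AWS SDK for Python (Boto3). The helper names, the 60% confidence threshold, and the set of blocked categories are assumptions made for illustration; check category names against the current moderation taxonomy before relying on them.

```python
# Sketch only: helper names, threshold, and category names are assumptions.
# `client` is expected to be a Rekognition client, e.g. boto3.client("rekognition").


def moderation_labels(client, bucket, key, min_confidence=60.0):
    """Return the moderation labels detected for an image stored in Amazon S3."""
    response = client.detect_moderation_labels(
        Image={"S3Object": {"Bucket": bucket, "Name": key}},
        MinConfidence=min_confidence,
    )
    return response["ModerationLabels"]


def needs_human_review(labels, blocked_categories):
    """Return True if any label falls under a blocked top-level category."""
    for label in labels:
        # Top-level labels have an empty ParentName; subcategories name their parent.
        top_level = label["ParentName"] or label["Name"]
        if top_level in blocked_categories:
            return True
    return False
```

A caller might route an image to human moderators only when `needs_human_review(labels, {"Explicit Nudity", "Violence"})` returns `True`, so that lower false positive rates translate directly into a smaller review queue.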
11 Street is an online shopping company. They’re using Amazon Rekognition to automate the review of images and videos. “As part of 11st’s interactive experience, and to empower our community to express themselves, we have a feature where users can submit a photo or video review of the product they have just purchased. For example, a user could submit a photo of themselves wearing the new makeup they just bought. To make sure that no images or videos contain content that is prohibited by our platform guidelines, we originally resorted to manual content moderation. We quickly found that this was costly, error-prone, and not scalable. We then turned to Amazon Rekognition for Content Moderation, and found that it was easy to test, deploy, and scale. We are now able to automate the review of more than 7,000 uploaded images and videos every day with Amazon Rekognition, saving us time and money. We look forward to the new model update that the Amazon Rekognition team is releasing soon.” – 11 Street Digital Transformation team
Flipboard is a content recommendation platform that enables publishers, creators, and curators to share stories with readers to help them stay up to date on their passions and interests. Says Anuj Ahooja, Senior Engineering Manager at Flipboard: “On average, Flipboard processes approximately 90 million images per day. To maintain a safe and inclusive environment and to confirm that all images comply with platform guidelines at scale, it is crucial to implement a content moderation workflow using ML. However, building models for this system internally was labor-intensive and lacked the accuracy necessary to meet the high-quality standards Flipboard users expect. This is where Amazon Rekognition became the right solution for our product. Amazon Rekognition is a highly accurate, easily deployed, and performant content moderation platform that provides a robust moderation taxonomy. Since putting Amazon Rekognition into our workflows, we’ve been catching approximately 63,000 images that violate our standards per day. Moreover, with frequent improvements like the latest content moderation model update, we can be confident that Amazon Rekognition will continue to help make Flipboard an even more inclusive and safe environment for our users over time.”
Yelp connects people with great local businesses. With unmatched local business information, photos, and review content, Yelp provides a one-stop local platform for consumers to discover, connect, and transact with local businesses of all sizes by making it easy to request a quote, join a waitlist, and make a reservation, appointment, or purchase. Says Alkis Zoupas, Head of Trust and Safety Engineering at Yelp: “Yelp’s mission is to connect people with great local businesses, and we take significant measures to give people access to reliable and useful information. As part of our multi-stage, multi-model approach to photo classification, we use Amazon Rekognition to tune our systems for various outcomes and levels of filtering. Amazon Rekognition has helped reduce development time, allowing us to be more effective with our resource utilization and better prioritize what our teams should focus on.”
Support for seven more languages and accuracy improvements for text analysis
Customers use the Amazon Rekognition text service for a variety of applications, such as ensuring compliance of images with corporate policies, analysis of marketing assets, and reading street signs. With the Amazon Rekognition DetectText API, you can detect text in images and check it against your list of inappropriate words and phrases. You can also enable content redaction by using the bounding box of the detected text to blur sensitive information.
The newest version of the DetectText API supports Arabic, French, German, Italian, Portuguese, Russian, and Spanish in addition to English. The DetectText API also provides improved accuracy for detecting curved and vertical text in images. With the expanded language support and higher accuracy for curved and vertical text, you can scale and improve your content moderation, text moderation, and other text detection workflows.
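The word-list check and bounding-box redaction described above can be sketched with the Boto3 `detect_text` operation. The function name and the blocklist are hypothetical. The sketch assumes you already know the image dimensions in pixels, because the API reports bounding boxes as ratios of the image width and height.

```python
# Sketch only: the function name and blocklist handling are assumptions.
# `client` is expected to be a Rekognition client, e.g. boto3.client("rekognition").


def find_blocked_words(client, image_bytes, blocked_words, image_width, image_height):
    """Return pixel boxes (left, top, right, bottom) for words on the blocklist."""
    response = client.detect_text(Image={"Bytes": image_bytes})
    blocked = {word.lower() for word in blocked_words}
    regions = []
    for detection in response["TextDetections"]:
        # Results contain both LINE and WORD entries; match on single words only.
        if detection["Type"] != "WORD":
            continue
        if detection["DetectedText"].lower() not in blocked:
            continue
        box = detection["Geometry"]["BoundingBox"]  # ratios of image size
        regions.append((
            int(box["Left"] * image_width),
            int(box["Top"] * image_height),
            int((box["Left"] + box["Width"]) * image_width),
            int((box["Top"] + box["Height"]) * image_height),
        ))
    return regions
```

The returned boxes can then be passed to an image library of your choice to blur or mask those regions. Rotated or curved text is described more precisely by the `Polygon` field of the same `Geometry` structure, which this sketch ignores.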
OLX Group is one of the world’s fastest-growing networks of trading platforms, with operations in over 30 countries and over 20 brands worldwide. Says Jaroslaw Szymczak, Data Science Manager at OLX Group: “As a leader in the classifieds marketplace sector, and to foster a safe, inclusive, and vibrant buying and selling community, it is paramount that we make sure that all products listed on our platforms comply with our rules for product display and authenticity. To do that, among other aspects of the ads, we have placed focus on analyzing the non-organic text featured on images uploaded by our users. We tested Amazon Rekognition’s text detection functionality for this purpose and found that it was highly accurate and augmented our in-house violations detection systems, helping us improve our moderation workflows. Using Amazon Rekognition for text detection, we were able to flag 350,000 policy violations last year. It has also helped us save significant amounts in development costs and has allowed us to refocus data science time on other projects. We are very excited about the upcoming text model update as it will even further expand our capabilities for text analysis.”
VidMob is a leading creative analytics platform that uses data to understand the audience, improve ads, and increase marketing performance. Says James Kupernick, Chief Technology Officer at VidMob: “At VidMob, our goal is to maximize ROI for our customers by leveraging real-time insights into creative content. We have been working with the Amazon Rekognition team for years to extract meaningful visual metadata from creative content, helping us drive data-driven outcomes for our customers. It is of the utmost importance that our customers get actionable data signals. In turn, we have used Amazon Rekognition’s text detection feature to determine when there is overlaid text in a creative and classify that text in a way that creates unique insights. We can scale this process using the Amazon Rekognition Text API, allowing our data science and engineers teams to create differentiated value. In turn, we are very excited about the new text model update and the addition of new languages so that we can better support our international clients.”
Simplicity and scalability for AutoML
Amazon Rekognition Custom Labels is an AutoML service that allows you to build custom computer vision models to detect objects and scenes in images specific to your business needs. For example, with Rekognition Custom Labels, you can develop solutions for detecting brand logos, proprietary machine parts, and items on store shelves without the need for in-depth ML expertise. This frees your ML experts to continue working on higher-value projects.
With the new capabilities in Rekognition Custom Labels, you can simplify and scale your workflows for custom computer vision models.
First, you can train your computer vision model in four simple steps with a few clicks. You get a guided step-by-step console experience with directions for creating projects, creating image datasets, annotating and labeling images, and training models.
Next, we improved our underlying ML algorithms. As a result, you can now build high-quality models with less training data, for example to detect vehicles, their make, or possible damage to vehicles.
Finally, we have introduced seven new APIs to make it even easier for you to build and train computer vision models programmatically. With the new APIs, you can do the following:
- Create, copy, or delete datasets
- List the contents and get details of the datasets
- Modify datasets and auto-split them to create a test dataset
For more information, visit the Rekognition Custom Labels Guide.
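As one possible illustration of the dataset operations listed above, the following sketch creates a training dataset from a manifest file, creates an empty test dataset, and auto-splits the entries between them. It uses the Boto3 `create_dataset`, `describe_dataset`, and `distribute_dataset_entries` operations. The helper names, polling interval, and retry limit are assumptions, and the project ARN, bucket, and manifest key are placeholders you would supply.

```python
import time

# Sketch only: helper names and polling parameters are assumptions.
# `client` is expected to be a Rekognition client, e.g. boto3.client("rekognition").


def wait_until_created(client, dataset_arn, poll_seconds=5.0, max_polls=60):
    """Poll describe_dataset until the dataset has finished being created."""
    for _ in range(max_polls):
        description = client.describe_dataset(DatasetArn=dataset_arn)
        status = description["DatasetDescription"]["Status"]
        if status == "CREATE_COMPLETE":
            return
        if status == "CREATE_FAILED":
            raise RuntimeError(f"Dataset creation failed: {dataset_arn}")
        time.sleep(poll_seconds)
    raise TimeoutError(f"Dataset not ready: {dataset_arn}")


def create_and_split(client, project_arn, bucket, manifest_key, poll_seconds=5.0):
    """Create a training dataset and an empty test dataset, then auto-split."""
    train_arn = client.create_dataset(
        ProjectArn=project_arn,
        DatasetType="TRAIN",
        DatasetSource={
            "GroundTruthManifest": {"S3Object": {"Bucket": bucket, "Name": manifest_key}}
        },
    )["DatasetArn"]
    # The test dataset is created empty so that entries can be distributed into it.
    test_arn = client.create_dataset(ProjectArn=project_arn, DatasetType="TEST")["DatasetArn"]

    # Dataset creation is asynchronous, so wait before splitting.
    for arn in (train_arn, test_arn):
        wait_until_created(client, arn, poll_seconds)

    client.distribute_dataset_entries(Datasets=[{"Arn": train_arn}, {"Arn": test_arn}])
    return train_arn, test_arn
```

Accepting the client as a parameter keeps the helpers easy to exercise with a stub in tests, without calling AWS.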
Prodege, LLC is a cutting-edge marketing and consumer insights platform that leverages its global audience of reward program members to power its business solutions. Prodege uses Rekognition Custom Labels to detect anomalies in store receipts. Says Arun Gupta, Director, Business Intelligence at Prodege: “By using Rekognition Custom Labels, Prodege was able to detect anomalies with high precision across store receipt images being uploaded by our valued members as part of our rewards program offerings. The best part of Rekognition Custom Labels is that it’s easy to set up and requires only a small set of pre-classified images (a couple of hundred in our case) to train the ML model for high confidence image detection. The model’s endpoints can be easily accessed using the API. Rekognition Custom Labels has been an extremely effective solution to enable the smooth functioning of our validated receipt scanning product and helped us save a lot of time and resources performing manual detection. The new console experience of Rekognition Custom Labels has made it even easier to build and train a model, especially with the added capability of updating and deleting an existing dataset. This will significantly improve our constant iteration of training models as we grow and add more data in the pursuit of enhancing our model performance. I can’t even thank the AWS Support Team enough, who has been diligently helping us with all aspects of the product through this journey.”
Says Arnav Gupta, Global AWS Practice Lead at Quantiphi: “As an advanced consulting partner for AWS, Quantiphi has been leveraging Amazon’s computer vision services such as Amazon Rekognition and Amazon Textract to solve some of our customer’s most pressing business challenges. The simplified and guided experience offered by the updated Rekognition Custom Labels console and the new APIs has made it easier for us to build and train computer vision models, significantly reducing the time to deliver solutions from months to weeks for our customers. We have also built our document processing solution Qdox on top of Amazon Textract, which has enabled us to provide our own industry-specific document processing solutions to customers.”
Get started with Amazon Rekognition
With the new features we’re announcing today, you can increase the accuracy of your content moderation workflows, deploy text moderation solutions across a broader range of scenarios and languages, and simplify your AutoML implementation. In addition, you can use the price reduction on the image APIs to analyze more images with your existing budget. Use one or more of the following options to get started today:
- Learn more through tutorials, workshops, and solution templates on Amazon Rekognition resources
- Sign up for the AWS Free Tier to explore Amazon Rekognition for free
- Let us know your feedback through your account teams or on the Amazon Rekognition Forum
- Start building on the Amazon Rekognition console
- Start building with the AWS Command Line Interface (AWS CLI) and the AWS Software Development Kit (AWS SDK) by checking out Set up the AWS CLI and AWS SDKs
About the Author
Roger Barga is the GM of Computer Vision at AWS.