Make your audio and video files searchable using Amazon Transcribe and Amazon Kendra

The demand for audio and video media content is growing at an unprecedented rate. Organizations are using media to engage with their audiences like never before. Product documentation is increasingly published in video form, and podcasts are often produced in place of blog posts. The recent explosion in the use of virtual workplaces has produced content encapsulated in recorded meetings, calls, and voicemails. Contact centers also generate media content, such as support calls, screen share recordings, and post-call surveys.

Amazon machine learning (ML) services help you find answers and extract valuable insights from the content of your audio and video files as well as your text files.

In this post, we introduce a new open-source solution, MediaSearch, built to make your media files searchable and consumable in search results. It uses Amazon Transcribe to convert media audio tracks to text, and Amazon Kendra to provide intelligent search. Your users can find the content they’re looking for, even when it’s embedded in the soundtrack of your audio or video files. The solution also provides an enhanced Amazon Kendra query application that lets users play the relevant section of the original media files directly from the search results page.

Solution overview

MediaSearch is easy to install and try out! Use it to enable your customers to find answers to their questions from your podcast recordings and presentations, or for your students to find answers from your educational videos or lecture recordings, in addition to text documents.

The MediaSearch solution has two components, as illustrated in the following diagram.

The first component, the MediaSearch indexer, finds and transcribes audio and video files stored in an Amazon Simple Storage Service (Amazon S3) bucket. It prepares the transcriptions by embedding time markers at the start of each sentence, and it indexes each prepared transcription in a new or existing Amazon Kendra index. It runs the first time when you install it, and subsequently runs on an interval that you specify, maintaining the index to reflect any new, modified, or deleted files.

The second component, the MediaSearch finder, is a sample web search client that you use to search for content in your Amazon Kendra index. It has all the features of a standard Amazon Kendra search page, but it also includes in-line embedded media players in the search result, so you can not only see the relevant section of the transcript, but also play the corresponding section from the original media without navigating away from the search page (see the following screenshot).

In the sections that follow, we discuss several topics:

  • How to deploy the solution to your AWS account
  • How to use it to index and search sample media files
  • How to use the solution with your own media files
  • How the solution works under the hood
  • The costs involved
  • How to monitor usage and troubleshoot problems
  • Options to customize and tune the solution
  • How to uninstall and clean up when you’re done experimenting

Deploy the MediaSearch solution

In this section, we walk through deploying the two solution components: the indexer and the finder. We use an AWS CloudFormation stack to deploy the necessary resources in the us-east-1 (N. Virginia) AWS Region.

The source code is available in our GitHub repository. Follow the directions in the README to deploy MediaSearch to additional Regions supported by Amazon Kendra.

Deploy the indexer component

To deploy the indexer component, complete the following steps:

  1. Choose Launch Stack:
  2. Change the stack name if required to ensure that it’s unique.
  3. For ExistingIndexId, leave the field blank to create a new Amazon Kendra index (Developer Edition). Otherwise, provide the IndexId (not the index name) of an existing index in your account and Region. Use Amazon Kendra Enterprise Edition for production workloads.
  4. For MediaBucket and MediaFolderPrefix, use the defaults initially to transcribe and index sample audio and video files.
  5. For now, use the default values for the other parameters.
  6. Select the acknowledgement check boxes, and choose Create stack.
  7. When the stack is created (after approximately 15 minutes), choose the Outputs tab, and copy the value of IndexId—you need it to deploy the finder component in the next step.

The newly installed indexer runs automatically to find, transcribe, and index the sample audio and video files. Later you can provide a different bucket name and prefix to index your own media files. If you have media files in multiple buckets, you can deploy multiple instances of the indexer, each with a unique stack name.

Deploy the finder component

To deploy the finder web application component, complete the following steps:

  1. Choose Launch Stack:
  2. For IndexId, use the Amazon Kendra index copied from the MediaSearch indexer stack outputs.
  3. For MediaBucketNames, use the default initially to allow the search page to access media files from the sample file bucket.
  4. When the stack is created (after approximately 5 minutes), choose the Outputs tab and use the link for MediaSearchFinderURL to open the new media search application page in your browser.

If the application isn’t ready when you first open the page, don’t worry! The initial application build and deployment (using AWS Amplify) takes about 10 minutes, so it will work when you try again a little later. If for any reason the application still doesn’t open, refer to the README in the GitHub repo for troubleshooting steps.

And that’s all there is to the deployment! Next, let’s run some search queries to see it in action.

Test with the sample media files

By the time the MediaSearch finder application is deployed and ready to use, the indexer should have completed processing the sample media files (selected AWS Podcast episodes and AWS Knowledge Center videos). You can now run your first MediaSearch query.

  1. Open the MediaSearch finder application in your browser as described in the previous section.
  2. In the query box, enter What’s an interface VPC Endpoint?

The query returns multiple results, sourced from the transcripts of the sample media files.

  3. Observe the time markers at the beginning of each sentence in the answer text. These indicate where each answer can be found in the original media file.
  4. Use the embedded video player to play the original video inline. Observe that playback starts at the relevant section of the video, based on the time marker.
  5. To play the video full screen in a new browser tab, use the Fullscreen menu option in the player, or choose the media file hyperlink shown above the answer text.
  6. Right-click the video file hyperlink, copy the URL, and paste it into a text editor. It looks something like the following:
https://mediasearchtest.s3.amazonaws.com/mediasamples/What_is_an_Interface_VPC_Endpoint_and_how_can_I_create_Interface_Endpoint_for_my_VPC_.mp4?AWSAccessKeyId=ASIAXMBGHMGZLSYWJHGD&Expires=1625526197&Signature=BYeOXOzT585ntoXLDoftkfS4dBU%3D&x-amz-security-token=.... #t=253.52

This is a presigned S3 URL that provides your browser with temporary read access to the media file referenced in the search result. Using presigned URLs means you don’t need to provide permanent public access to all of your indexed media files.
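For illustration, here’s how a client might combine a presigned URL with a transcript time marker, as in the URL above. This helper is a sketch, not the finder’s actual code; in a real application the presigned URL itself would come from an SDK call such as boto3’s `generate_presigned_url`.

```python
# Append a media fragment (#t=<seconds>) to a presigned S3 URL so the
# browser's player starts playback at that offset. Media fragments are
# interpreted client-side, so they don't invalidate the URL's signature.

def playback_url(presigned_url: str, start_seconds: float) -> str:
    return f"{presigned_url}#t={start_seconds:.2f}"

# Hypothetical presigned URL, truncated for readability
url = playback_url("https://example-bucket.s3.amazonaws.com/video.mp4?X-Amz-Signature=abc", 253.52)
print(url)
```

The same pattern works for both the audio and video players on the search page.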

  7. Scroll down the page, and observe that some search results are from audio (MP3) files, and some are from video (MP4) files.

You can mix and match media types in the same index. You could include other data source types as well, such as documents, webpages, and other file types supported by available Amazon Kendra data sources, and search across them all, allowing Amazon Kendra to find the best content to answer your query.

  8. Experiment with additional queries, such as What does a solutions architect do? or What is Kendra?, or try your own questions.

Index and search your own media files

To index media files stored in your own S3 bucket, replace the MediaBucket and MediaFolderPrefix parameters with your own bucket name and prefix when you install or update the indexer component stack, and modify the MediaBucketNames parameter with your own bucket name when you install or update the finder component stack.

  1. Create a new MediaSearch indexer stack using an existing Amazon Kendra IndexId to add files stored in the new location. To deploy a new indexer, follow the directions in the Deploy the indexer component section in this post, but this time replace the defaults to specify the media bucket name and prefix for your own media files.
  2. Alternatively, update an existing MediaSearch indexer stack to replace the previously indexed files with files from the new location:
    1. Select the stack on the CloudFormation console, choose Update, then Use current template, then Next.
    2. Modify the media bucket name and prefix parameter values as needed.
    3. Choose Next twice, select the acknowledgement check box, and choose Update stack.
  3. Update an existing MediaSearch finder stack to change bucket names or add additional bucket names to the MediaBucketNames parameter.

When the MediaSearch indexer stack is successfully created or updated, the indexer automatically finds, transcribes, and indexes the media files stored in your S3 bucket. When it’s complete, you can submit queries and find answers from the audio tracks of your own audio and video files.

You have the option to provide metadata for any or all of your media files. Use metadata to assign values to index attributes for sorting, filtering, and faceting your search results, or to specify access control lists to govern access to the files. Metadata files can be in the same S3 folder as your media files (default), or in a parallel folder structure specified by the optional indexer parameter MetadataFolderPrefix. For more information about how to create metadata files, see S3 document metadata.
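As a hypothetical example, a metadata file for a file named podcast.mp3 would by default be named podcast.mp3.metadata.json and stored alongside it. The attribute values below are invented for illustration; see the S3 document metadata documentation for the full schema.

```python
# Build and print a sample Amazon Kendra S3 document metadata file.
# Values are placeholders; _category and _created_at are reserved
# Amazon Kendra document attributes.
import json

metadata = {
    "Title": "Episode 42 - Interface VPC Endpoints",  # title shown in search results
    "Attributes": {
        "_category": "podcast",
        "_created_at": "2021-07-01T00:00:00Z",
    },
    # Govern who can see this document in search results
    "AccessControlList": [
        {"Name": "engineering", "Type": "GROUP", "Access": "ALLOW"}
    ],
}

print(json.dumps(metadata, indent=2))
```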

You can also provide customized transcription options for any or all of your media files. This allows you to take full advantage of Amazon Transcribe features such as custom vocabularies, automatic content redaction, and custom language models. For more information, refer to the README in the GitHub repo.

How the MediaSearch solution works

Let’s take a quick look under the hood to see how the solution works, as illustrated in the following diagram.

The MediaSearch solution has an event-driven serverless computing architecture with the following steps:

  1. You provide an S3 bucket containing the audio and video files you want to index and search.
  2. Amazon EventBridge generates events on a repeating interval (such as every 2 hours, every 6 hours, and so on).
  3. These events invoke an AWS Lambda function. The function runs initially when the CloudFormation stack is first deployed, and subsequently in response to the scheduled EventBridge events. It starts an Amazon Kendra data source sync job and lists all the supported media files (FLAC, MP3, MP4, Ogg, WebM, AMR, or WAV), along with any associated metadata and Transcribe options, stored in the user-provided S3 bucket.
  4. An Amazon DynamoDB tracking table holds a record for each media file, with attributes that track the transcription job name, status, and last modified timestamp. Each new file is added to the table and submitted for transcription by an Amazon Transcribe job. A previously transcribed file is submitted again only if it has been modified since it was last transcribed, or if its associated Transcribe options have been updated. Tracked files that no longer exist in the S3 bucket are removed from both the DynamoDB table and the Amazon Kendra index. If no new or updated files are discovered, the Amazon Kendra data source sync job is stopped immediately.
  5. As each Transcribe job completes, EventBridge generates a Job Complete event, which invokes an instance of another Lambda function.
  6. The Lambda function processes the transcription job output, generating a modified transcription that has a time marker inserted at the start of each sentence. This modified transcription is indexed in Amazon Kendra, and the job status for the file is updated in the DynamoDB table. When the last file has been transcribed and indexed, the Amazon Kendra data source sync job is stopped.
  7. The index is populated and kept in sync with the transcriptions of all the media files in the S3 bucket monitored by the MediaSearch indexer component, integrated with any additional content from any other provisioned data sources. The media transcriptions are used by Amazon Kendra’s intelligent query processing, which allows users to find content and answers to their questions.
  8. The sample finder client application enhances users’ search experience by embedding an inline media player with each Amazon Kendra answer that is based on a transcribed media file. The client uses the time markers embedded in the transcript to start media playback at the relevant section of the original media file.
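To make step 6 concrete, here’s a simplified sketch of inserting time markers while rebuilding a transcript from Amazon Transcribe output items. This is illustrative, not the indexer’s actual code, and the marker format may differ; Transcribe emits pronunciation items with start times and punctuation items without them.

```python
# Rebuild a transcript from Transcribe "results.items", inserting a
# [start_time] marker at the beginning of each sentence.

def mark_sentences(items):
    out, new_sentence = [], True
    for item in items:
        word = item["alternatives"][0]["content"]
        if item["type"] == "pronunciation":
            if new_sentence:
                out.append(f"[{float(item['start_time']):.2f}]")
                new_sentence = False
            out.append(word)
        else:
            # Punctuation attaches to the previous word; sentence-ending
            # punctuation triggers a marker before the next word.
            if out:
                out[-1] += word
            new_sentence = word in ".?!"
    return " ".join(out)

items = [
    {"type": "pronunciation", "start_time": "0.04", "alternatives": [{"content": "Hello"}]},
    {"type": "punctuation", "alternatives": [{"content": "."}]},
    {"type": "pronunciation", "start_time": "1.50", "alternatives": [{"content": "Welcome"}]},
]
print(mark_sentences(items))
```

The finder later parses these markers back out of the indexed text to seed the media player’s start offset.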

Estimate costs

In addition to the Amazon S3 costs associated with storing your media, the MediaSearch solution incurs usage costs from Amazon Kendra and Transcribe. The other services mentioned incur additional, usually insignificant, costs after free tier allowances have been used. For more information, see the pricing documentation for Amazon Kendra, Transcribe, Lambda, DynamoDB, and EventBridge.

Pricing example: Index the sample media files

The sample dataset has 25 media files (13 audio podcast episodes and 12 video files) containing a total of around 480 minutes, or roughly 29,000 seconds, of audio.

If you don’t provide an existing Amazon Kendra IndexId when you install MediaSearch, a new Amazon Kendra Developer Edition index is automatically created for you so you can test the solution. After you use your free tier allowance (up to 750 hours in the first 30 days), the index costs $1.125 per hour.

Transcribe pricing is based on the number of seconds of audio transcribed, with a free tier allowance of 60 minutes of audio per month for the first 12 months. After the free tier is used, the cost is $0.00040 for each second of audio transcribed. If you’re no longer free tier eligible, the cost to transcribe the sample files is as follows:

  • Total seconds of audio = 29,000
  • Transcription price per second = $0.00040
  • Total cost for Transcribe = [number of seconds] x [cost per second] = 29,000 x $0.00040 = $11.60
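The arithmetic above can be captured in a tiny helper, using the prices quoted in this post (check the Amazon Transcribe pricing page for current rates):

```python
# Estimate Amazon Transcribe cost after the free tier, at the per-second
# rate quoted in this post ($0.00040/second).

def transcribe_cost(total_seconds: int, price_per_second: float = 0.00040) -> float:
    return round(total_seconds * price_per_second, 2)

print(transcribe_cost(29_000))  # the sample dataset
```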

Monitor and troubleshoot

To see the details of each media file transcript job, navigate to the Transcription jobs page on the Transcribe console.

Each media file is transcribed only one time, unless the file is modified. Modified files are re-transcribed and re-indexed to reflect the changes.
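That “transcribe only when needed” check can be sketched as follows. The field names here are illustrative, not the solution’s actual DynamoDB schema.

```python
# Decide whether a media file needs (re-)transcription, given its S3
# last-modified timestamp and its tracking-table record (None if new).
from typing import Optional

def needs_transcription(s3_last_modified: str, record: Optional[dict]) -> bool:
    if record is None:
        return True  # never seen before: transcribe
    if s3_last_modified > record["last_modified"]:
        return True  # modified since last transcription: re-transcribe
    if record.get("transcribe_options_changed", False):
        return True  # Transcribe options updated: re-transcribe
    return False

# ISO 8601 timestamps compare correctly as strings
print(needs_transcription("2021-07-02T10:00:00Z", {"last_modified": "2021-07-01T09:00:00Z"}))
```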

Choose any transcription job to review the transcription and examine additional job details.

On the Indexes page of the Amazon Kendra console, choose the index used by MediaSearch to examine the index details.

Choose Data sources in the navigation pane to examine the MediaSearch indexer data source and observe its sync run history. The data source syncs each time the indexer runs, at the interval specified in the CloudFormation stack parameters when you deployed or last updated the solution.

On the DynamoDB console, choose Tables in the navigation pane. Use your MediaSearch stack name as a filter to display the MediaSearch DynamoDB table, and examine the items showing each indexed media file and corresponding status. The table has one record for each media file, and contains attributes with information about the file and its processing status.

On the Functions page of the Lambda console, use your MediaSearch stack name as a filter to list the two MediaSearch indexer functions described earlier.

Choose either of the functions to examine the function details, including environment variables, source code, and more. Choose Monitor, then View logs in CloudWatch, to examine the output of each function invocation and troubleshoot any issues.

Customize and enhance the solution

You can fork the MediaSearch GitHub repository, enhance the code, and send us pull requests so we can incorporate and share your improvements!

The following are a few suggestions for features you might want to implement:

Clean up

When you’re finished experimenting with this solution, clean up your resources by using the AWS CloudFormation console to delete the indexer and finder stacks that you deployed. This deletes all the resources, including any Amazon Kendra indexes that were created by deploying the solution. Pre-existing indexes aren’t deleted. However, media files that were indexed by the solution are removed from the pre-existing index when you delete the indexer stack.

Conclusion

The combination of Amazon Transcribe and Amazon Kendra enables a scalable, cost-effective solution to make your media files discoverable. You can use the content of your media files to find accurate answers to your users’ questions, whether they’re from text documents or media files, and consume them in their native format. In other words, this solution is a leap toward bringing media files on par with text documents as containers of information.

The sample MediaSearch application is provided as open source—use it as a starting point for your own solution, and help us make it better by contributing back fixes and features via GitHub pull requests. For expert assistance, AWS Professional Services and other Amazon partners are here to help.

We’d love to hear from you. Let us know what you think in the comments section, or using the issues forum in the MediaSearch GitHub repository.


About the Authors

Bob Strahan is a Principal Solutions Architect in the AWS Language AI Services team.

Abhinav Jawadekar is a Senior Partner Solutions Architect at Amazon Web Services. Abhinav works with AWS Partners to help them in their cloud journey.


Detect anomalies in operational metrics using Dynatrace and Amazon Lookout for Metrics

Organizations of all sizes and across all industries gather and analyze metrics or key performance indicators (KPIs) to help their businesses run effectively and efficiently. Operational metrics are used to evaluate performance, compare results, and track relevant data to improve business outcomes. For example, you can use operational metrics to determine application performance (the average time it takes to render a page for an end user) or application availability (the duration of time the application was operational). One challenge that most organizations face today is detecting anomalies in operational metrics, which are key in ensuring continuity of IT system operations.

Traditional rule-based methods are manual and look for data that falls outside of numerical ranges that have been arbitrarily defined. An example of this is an alert when transactions per hour fall below a certain number. This results in false alarms if the range is too narrow, or missed anomalies if the range is too broad. These ranges are also static. They don’t change based on evolving conditions like the time of the day, day of the week, seasons, or business cycles. When anomalies are detected, developers, analysts, and business owners can spend weeks trying to identify the root cause of the change before they can take action.
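A toy example (with invented numbers) shows why static ranges break down: one fixed threshold raises a false alarm on normal overnight traffic, yet misses a genuine daytime drop that a context-aware rule would catch.

```python
# Compare a static transactions-per-hour threshold against a rule that
# accounts for what's typical at each time of day.

THRESHOLD = 500                          # static rule: alert below 500 transactions/hour
typical = {"03:00": 430, "15:00": 900}   # invented "normal" traffic per hour of day

def static_alarm(tph: int) -> bool:
    return tph < THRESHOLD

def seasonal_alarm(tph: int, hour: str) -> bool:
    # Alert only when traffic falls well below what's typical for that hour
    return tph < 0.7 * typical[hour]

print(static_alarm(420), seasonal_alarm(420, "03:00"))  # quiet overnight hour
print(static_alarm(600), seasonal_alarm(600, "15:00"))  # real daytime drop
```

The static rule fires overnight (a false alarm) and stays silent during the daytime drop; the hour-aware rule does the opposite, which is the behavior an ML-based detector learns automatically.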

Amazon Lookout for Metrics uses machine learning (ML) to automatically detect and diagnose anomalies without any prior ML experience. In a couple of clicks, you can connect Lookout for Metrics to popular data stores like Amazon Simple Storage Service (Amazon S3), Amazon Redshift, and Amazon Relational Database Service (Amazon RDS), as well as third-party software as a service (SaaS) applications (such as Salesforce, Dynatrace, Marketo, Zendesk, and ServiceNow) via Amazon AppFlow and start monitoring metrics that are important to your business.

This post demonstrates how you can connect to your IT operational infrastructure monitored by Dynatrace using Amazon AppFlow and set up an accurate anomaly detector across metrics and dimensions using Lookout for Metrics. The solution allows you to set up a continuous anomaly detector and optionally set up alerts to receive notifications when anomalies occur.

Lookout for Metrics integrates seamlessly with Dynatrace to detect anomalies within your operational metrics. Once connected, Lookout for Metrics uses ML to start monitoring data and metrics for anomalies and deviations from the norm. Dynatrace enables monitoring of your entire infrastructure, including your hosts, processes, and network. You can perform log monitoring and view information such as the total traffic of your network, the CPU usage of your hosts, the response time of your processes, and more.

Amazon AppFlow is a fully managed service that provides integration capabilities by enabling you to transfer data between SaaS applications like Datadog, Salesforce, Marketo, and Slack and AWS services like Amazon S3 and Amazon Redshift. It provides capabilities to transform, filter, and validate data to generate enriched and usable data in a few easy steps.

Solution overview

In this post, we demonstrate how to integrate with an environment monitored by Dynatrace and detect anomalies in the operation metrics. We also determine how application availability and performance (resource contention) were impacted.

The source data is a cluster of Amazon Elastic Compute Cloud (Amazon EC2) instances that is monitored by Dynatrace. Each EC2 instance is installed with Dynatrace OneAgent to collect all monitored telemetry data (CPU utilization, memory, network utilization, and disk I/O). Amazon AppFlow enables you to securely integrate SaaS applications like Dynatrace and automate data flows, while providing options to configure and connect to such services natively from the AWS Management Console or via API. In this post, we focus on connecting to Dynatrace as our source and Lookout for Metrics as the target, both of which are natively supported applications in Amazon AppFlow.

The solution enables you to create an Amazon AppFlow data flow from Dynatrace to Lookout for Metrics. You can then use Lookout for Metrics to detect any anomalies in the telemetry data, as shown in the following diagram. Optionally, you can send automated anomaly alerts to AWS Lambda functions, webhooks, or Amazon Simple Notification Service (Amazon SNS) topics.

The following are the high-level steps to implement the solution:

  1. Set up Amazon AppFlow integration with Dynatrace.
  2. Create an anomaly detector with Lookout for Metrics.
  3. Add a dataset to the detector and integrate Dynatrace metrics.
  4. Activate the detector.
  5. Create an alert.
  6. Review the detector and data flow status.
  7. Review and analyze any anomalies.

Set up Amazon AppFlow integration with Dynatrace

To set up the data flow, complete the following steps:

  1. On the Amazon AppFlow console, choose Create flow.
  2. For Flow name, enter a name.
  3. For Flow description, enter an optional description.
  4. In the Data encryption section, you can choose or create an AWS Key Management Service (AWS KMS) key.
  5. Choose Next.
  6. For Source name, choose Dynatrace.
  7. For Choose Dynatrace Connection, choose the connection you created.
  8. For Choose Dynatrace object, choose Problems (this is the only object supported as of this writing).

For more information about Dynatrace problems, see Problem overview page.

  9. For Destination name, choose Amazon Lookout for Metrics.
  10. For API token, generate an API token from the Dynatrace console.
  11. For Subdomain, enter your Dynatrace portal URL address.
  12. For Data encryption, choose the AWS KMS key.
  13. For Connection Name, enter a name.
  14. Choose Connect.
  15. For Flow trigger, select Run flow on schedule.
  16. For Repeats, choose Minutes (alternatively, you can choose hourly or daily).
  17. Set the trigger to repeat every 5 minutes.
  18. Enter a starting time.
  19. Enter a start date.

Dynatrace requires a date range filter (a between condition) to be set.

  20. For Field name, choose Date range.
  21. For Condition, choose is between.
  22. For Criteria 1, choose your start date.
  23. For Criteria 2, choose your end date.
  24. Review your settings and choose Create flow.

Create an anomaly detector with Lookout for Metrics

To create your anomaly detector, complete the following steps:

  1. On the Lookout for Metrics console, choose Create detector.
  2. For Detector name, enter a name.
  3. For Description, enter an optional description.
  4. For Interval, choose the time between each analysis. This should match the interval set on the flow.
  5. For Encryption, create or choose an existing AWS KMS key.
  6. Choose Create.
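If you prefer the AWS SDK to the console, the same detector can be described with a request like the following. It’s shown as a plain dict so the shape is clear; the name and KMS key ARN are placeholders, and the commented-out call uses boto3’s Lookout for Metrics client.

```python
# Parameters for creating a Lookout for Metrics anomaly detector via the SDK.
# The 5-minute frequency (ISO 8601 duration PT5M) matches the AppFlow schedule.

detector_params = {
    "AnomalyDetectorName": "dynatrace-ops-detector",          # placeholder name
    "AnomalyDetectorDescription": "Detects anomalies in Dynatrace problem metrics",
    "AnomalyDetectorConfig": {
        "AnomalyDetectorFrequency": "PT5M",
    },
    "KmsKeyArn": "arn:aws:kms:us-east-1:111122223333:key/EXAMPLE",  # placeholder ARN
}

# import boto3
# boto3.client("lookoutmetrics").create_anomaly_detector(**detector_params)
print(detector_params["AnomalyDetectorConfig"]["AnomalyDetectorFrequency"])
```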

Add a dataset to the detector and integrate Dynatrace metrics

The next step in activating your anomaly detector is to add a dataset and integrate the Dynatrace metrics.

  1. On the detector details, choose Add a dataset.
  2. For Name, enter the data source name.
  3. For Description, enter an optional description.
  4. For Timezone, choose the time zone relevant to your dataset. This should match the time zone used in Amazon AppFlow (which picks up the time zone from your browser).
  5. For Datasource, choose Dynatrace.
  6. For Amazon AppFlow flow, choose the flow that you created.
  7. For Permissions, choose a service role.
  8. Choose Next.
  9. For Map fields, choose the measures for the detector to monitor (up to five); for this example, I choose impactLevel and hasRootCause.

The map fields are the primary fields that the detector monitors; choose fields that represent relevant operational KPIs.

  10. For Dimensions, choose the fields used to segment the measure values. For this post, I choose severityLevel.
  11. Review the settings and choose Save dataset.

Activate the detector

You’re now ready to activate the newly created detector.

Create an alert

You can create an alert to send automated anomaly alerts to Lambda functions; webhooks; cloud applications like Slack, PagerDuty, and Datadog; or SNS topics with subscribers that use SMS, email, or push notifications.

  1. On the detector details, choose Add alerts.
  2. For Alert Name, enter the name.
  3. For Sensitivity threshold, enter a threshold at which the detector sends anomaly alerts.
  4. For Channel, choose either Amazon SNS or Lambda as the notification method. For this post, I use Amazon SNS.
  5. For SNS topic, create or choose an existing SNS topic.
  6. For Service role, choose an execution role.
  7. Choose Add alert.
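As with the detector, the alert can also be created via the SDK. The dict below mirrors the request shape of boto3’s `create_alert`; all ARNs are placeholders.

```python
# Parameters for a Lookout for Metrics alert that notifies an SNS topic
# when an anomaly's severity score meets the sensitivity threshold.

alert_params = {
    "AlertName": "ops-anomaly-alert",
    "AlertSensitivityThreshold": 70,  # only anomalies scoring >= 70 trigger a notification
    "AnomalyDetectorArn": "arn:aws:lookoutmetrics:us-east-1:111122223333:AnomalyDetector:example",
    "Action": {
        "SNSConfiguration": {
            "RoleArn": "arn:aws:iam::111122223333:role/L4MAlertRole",        # placeholder
            "SnsTopicArn": "arn:aws:sns:us-east-1:111122223333:ops-alerts",  # placeholder
        }
    },
}

# import boto3
# boto3.client("lookoutmetrics").create_alert(**alert_params)
print(alert_params["AlertSensitivityThreshold"])
```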

Review the detector and flow status

On the Run history tab, you can confirm that the flows are running successfully for the interval chosen.

On the Detector log tab, you can confirm that the detector records the results after each interval.

Review and analyze any anomalies

On the main detector page, choose View anomalies to review and analyze any anomalies.

On the Anomalies page, you can adjust the severity score on the threshold dial to filter anomalies above a given score.

The following analysis shows the severity level and the impacted metrics. The graph shows anomalies detected by the detector, with availability and resource contention impacted. The anomaly was detected on June 28 at 14:30 PDT with a severity score of 98, indicating a high-severity anomaly that needs immediate attention.

Lookout for Metrics also allows you to provide real-time feedback on the relevance of the detected anomalies, which enables a powerful human-in-the-loop mechanism. This information is fed back to the anomaly detection model to improve its accuracy continuously, in near-real time.

Conclusion

Anomaly detection can be very useful in identifying anomalies that could signal potential issues within your operational environment. Timely detection of anomalies can aid in troubleshooting, help avoid loss in revenue, and help maintain your company’s reputation. Lookout for Metrics automatically inspects and prepares the data, selects the best-suited ML algorithm, begins detecting anomalies, groups related anomalies together, and summarizes potential root causes.

To get started with this capability, see Amazon Lookout for Metrics. You can use this capability in all Regions where Lookout for Metrics is publicly available. For more information about Region availability, see AWS Regional Services.


About the Author

Sumeeth Siriyur is a Solutions Architect based out of AWS, Sydney. He is passionate about infrastructure services and uses AI services to influence IT infrastructure observability and management. In his spare time, he likes binge-watching and works to continually improve his outdoor sports.


Accenture promotes machine learning growth with world’s largest private AWS DeepComposer Battle of the Bands League

Accenture is known for pioneering innovative solutions that achieve customer success using artificial intelligence (AI) and machine learning (ML) powered solutions with AWS services. To keep teams updated on the latest ML services, Accenture seeks to gamify hands-on learning. One such event, AWS DeepComposer Battle of the Bands, hosted by Accenture, is the world’s first and largest global league of its kind.

Accenture’s league spanned 16 global regions and 55 countries, with each location competing for global superstardom and a real-life gold record! With around 500 bands in the competition, Accenture employees with a range of skills and domain knowledge proved themselves as aspiring ML musicians, generating a playlist of 150 original songs using AWS DeepComposer. There was no shortage of fans either, with thousands of votes cast by supportive colleagues and teammates.

Why an AWS DeepComposer Battle of the Bands, and why now?

According to a recent Gartner report, “Despite the global impact of COVID-19, 47% of AI investments were unchanged since the start of the pandemic and 30% of organizations actually planned to increase such investments”. Additionally, there have been few opportunities in this pandemic to share a fun and enjoyable experience with our teammates, let alone colleagues around the globe.

Accenture and its Amazon Business Group are always looking for unique and exciting ways to help employees upskill in the latest and greatest tech. Inspired by their massively successful annual AWS DeepRacer Grand Prix, Accenture switched out the racetrack for the big stage and created their own Battle of the Bands using AWS DeepComposer.

This Battle of the Bands brought together fans and bands from around the globe, generating thousands of views, shares, votes, and opportunities to connect, laugh, and smile together.

Education was Accenture’s number one priority when crafting the competition. The goal was to expose those unfamiliar with AWS or ML to a fun and approachable experience that would increase their confidence with this technology and start them down a path of greater learning. According to registration metrics, around half of all participants were working with AWS and ML hands-on for the first time. Participants have shared that this competition inspired them to learn more about both AWS and ML. Some feedback received included:

“I enjoyed doing something creative and tackling music, which I had no experience with previously.”

“It was fun trying to make a song with the tool and to learn about other ML techniques.”

“I was able to feel like a musician even though I don’t know much about music composition.”

A hall of fame alliance

Accenture and AWS have a long-standing alliance. In 2019, Accenture hosted one of the world’s largest private AWS DeepRacer leagues. In 2021, multiple individuals and groups participated in the AWS DeepComposer Battle of the Bands League. These bands could create a video to go along with their song submission, allowing for more creative freedom and a chance to stand out from the crowd. Some bands made artistic music videos; others saw an opportunity to make something funny and share laughs around the world. Going above and beyond, one contestant turned their AWS DeepComposer competition entry into a sing-along training video for Accenture’s core values, while another dedicated their video to honoring “sheroes” and famous women in tech.

The dedication of Accenture’s bands to the spirit of the competition really showed in the array of pun-filled band names such as “Doggo-as-a-service,” “The Oracles,” “Anna and the AlgoRhythms,” and “#000000 Sabbath.”

AWS offers a portfolio of educational devices—AWS DeepLens, AWS DeepRacer, and AWS DeepComposer—designed for developers of all skill levels to learn the fundamentals of ML in fun and practical ways. The hands-on nature of AWS AI devices makes them great tools to engage and educate employees.

Accelerating innovation with the Accenture AWS Business Group

By working with the Accenture AWS Business Group (AABG), you can learn from the resources, technical expertise, and industry knowledge of two leading innovators, helping you accelerate the pace of innovation to deliver disruptive products and services. The AABG helps you ideate and innovate cloud solutions through rapid prototype development.

Connect with our team at accentureaws@amazon.com to learn how to use and accelerate ML in your products and services.

You can also organize your own event. To learn more about AWS DeepComposer events, see the AWS DeepRacer Community Blog, and check out the blog post How to run an AI-powered musical challenge: “AWS DeepComposer Got Talent” to learn how to host your first event with AWS DeepComposer.

About Accenture

Accenture is a global professional services company with leading capabilities in digital, cloud and security. Combining unmatched experience and specialized skills across more than 40 industries, we offer Strategy and Consulting, Interactive, Technology and Operations services — all powered by the world’s largest network of Advanced Technology and Intelligent Operations centers. Our 569,000 people deliver on the promise of technology and human ingenuity every day, serving clients in more than 120 countries. We embrace the power of change to create value and shared success for our clients, people, shareholders, partners and communities. Visit us at www.accenture.com.

Copyright © 2021 Accenture. All rights reserved. Accenture and its logo are trademarks of Accenture.

This document is produced by consultants at Accenture as general guidance. It is not intended to provide specific advice on your circumstances. If you require advice or further details on any matters referred to, please contact your Accenture representative.

This document makes descriptive reference to trademarks that may be owned by others. The use of such trademarks herein is not an assertion of ownership of such trademarks by Accenture and is not intended to represent or imply the existence of an association between Accenture and the lawful owners of such trademarks. No sponsorship, endorsement, or approval of this content by the owners of such trademarks is intended, expressed, or implied.

Accenture provides the information on an “as-is” basis without representation or warranty and accepts no liability for any action or failure to act taken in response to the information contained or referenced in this publication.


About the Authors

Marc DeMory is a senior emerging tech consultant with Accenture’s Chicago Liquid Studio, focusing on rapid-prototyping and cloud-native development in the fields of Machine Learning, Computer Vision, Automation, and Extended Reality.

 

 

 

Sameer Goel is a Sr. Solutions Architect in the Netherlands, who drives customer success by building prototypes on cutting-edge initiatives. Prior to joining AWS, Sameer graduated with a master’s degree from NEU Boston, with a concentration in data science. He enjoys building and experimenting with AI/ML projects on Raspberry Pi.

 

 

Maryam Rezapoor is a Senior Product Manager with the AWS DeepLabs team based in Santa Clara, CA. She works on developing products that put machine learning in the hands of everyone. She loves hiking through the US national parks and is currently training for a 1-day Grand Canyon Rim to Rim hike. She is a fan of Metallica and Evanescence. The drummer, Lars Ulrich, has inspired her to pick up the sticks and play drums while singing “Nothing Else Matters.”

Read More

Scale your Amazon Kendra index

Amazon Kendra is a fully managed, intelligent search service powered by machine learning. Amazon Kendra reimagines enterprise search for your websites and applications so your employees and customers can easily find the content they’re looking for. Using keyword or natural language queries, employees and customers can find the right content even when it’s scattered across multiple locations and content repositories within your organization.

Although Amazon Kendra is designed for large-scale search applications with millions of documents and thousands of queries per second, you can run smaller experiments to evaluate Amazon Kendra. You can run a proof of concept, or simply have a smaller workload and still use features that Amazon Kendra Enterprise Edition has to offer. On July 1, 2021, Amazon Kendra introduced new, smaller capacity units for smaller workloads. In addition, to promote experimentation, the price for Amazon Kendra Developer Edition was reduced by 55%.

Amazon Kendra Enterprise Edition capacity units

The base capacity for Amazon Kendra supports up to 100,000 documents and 8,000 searches per day, with adaptive bursting capability to better handle unpredictable query spikes. You can increase the query and the document capacity of your Amazon Kendra index through storage capacity units and query capacity units, and these can be updated independently from each other.

Storage capacity units offer scaling in increments of 100,000 documents (up to 30 GB of storage) each. For example, if you need to index 1 million documents, you need nine storage capacity units (100,000 documents with the base Amazon Kendra Enterprise Edition, and 900,000 additional documents from the storage capacity units).

Query capacity units (QCUs) offer scaling in increments of 8,000 searches per day, with built-in adaptive bursting. For example, if you need 16,000 queries per day (an average of 0.2 QPS), you can provision two units.
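The unit arithmetic for storage and query capacity can be sketched as a small helper. The function name and constants below are hypothetical, based only on the figures quoted in this post:

```python
import math

# Figures quoted in this post (Amazon Kendra Enterprise Edition base capacity).
DOCS_PER_STORAGE_UNIT = 100_000   # documents per storage capacity unit (and per base index)
QUERIES_PER_QCU = 8_000           # queries per day per query capacity unit (and per base index)

def additional_units_needed(required, per_unit):
    """Units to add on top of the one base unit included with the index."""
    extra = max(0, required - per_unit)
    return math.ceil(extra / per_unit)

# 1 million documents: the base covers 100,000, so 9 additional storage units.
print(additional_units_needed(1_000_000, DOCS_PER_STORAGE_UNIT))  # 9

# 16,000 queries per day: the base covers 8,000, so 1 additional QCU (2 units in total).
print(additional_units_needed(16_000, QUERIES_PER_QCU))  # 1
```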

For more information about the maximum number of storage capacity units and query capacity units available for a single index, see Quotas for Amazon Kendra.

About capacity bursting

Amazon Kendra has a provisioned base capacity of one query capacity unit. You can use up to 8,000 queries per day with a minimum throughput of 0.1 queries per second (per query capacity unit).

An adaptive approach to handling unexpected traffic beyond the provisioned throughput is to use the built-in adaptive query bursting feature in Amazon Kendra. This allows you to apply unused query capacity to handle unexpected traffic. Amazon Kendra accumulates your unused queries at your provisioned queries per second rate, every second, up to the maximum number of queries you’ve provisioned for your Amazon Kendra index. These accumulated queries are automatically used to help handle unexpected traffic spikes above the currently allocated QPS capacity.
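As a mental model, this accumulation behaves like a token bucket. The following simulation is purely illustrative, with made-up numbers; it is not a specification of how Amazon Kendra actually meters bursting:

```python
def simulate_burst(provisioned_qps, max_accumulated, traffic):
    """Toy token-bucket model: unused capacity accrues each second, up to a cap,
    and is spent to absorb traffic above the provisioned rate."""
    bucket = 0.0
    throttled = 0
    for qps in traffic:  # queries arriving in each one-second window
        bucket = min(max_accumulated, bucket + provisioned_qps)
        if qps <= bucket:
            bucket -= qps
        else:
            throttled += qps - int(bucket)
            bucket = 0.0
    return throttled

# With 0.5 QPS provisioned, a 10-query spike after 20 quiet seconds is absorbed,
# because unused capacity accumulated during the quiet period.
quiet_then_spike = [0] * 20 + [10]
print(simulate_burst(0.5, 100, quiet_then_spike))  # 0 throttled queries
```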

Optimal performance of adaptive query bursting can vary, depending on several factors such as your total index size, query complexity, accumulated unused queries, and overall load on your index. We recommend performing your own load tests to accurately measure bursting capacity.

Best practices

When dimensioning your Amazon Kendra index, you need to consider how many documents you’re indexing, how many queries you expect per day, how many queries per second you need to accommodate, and whether you have usage patterns that require additional capacity due to sustained usage. You might also experience short peaks, where you need to accommodate brief periods of additional QPS demand.

It’s therefore good practice to observe your query usage patterns for a few weeks, especially when the patterns are not easily predictable. This will allow you to define an optimal balance between using the built-in adaptive bursting capability for short, unsustained QPS peaks, and adding/removing capacity units to better handle longer, more sustained peaks and lows.

For information about visualizing and building a rough estimate of your usage patterns in Amazon Kendra, see Automatically scale Amazon Kendra query capacity units with Amazon EventBridge and AWS Lambda.

Amazon Kendra Enterprise Edition allows you to add document storage capacity in units of 100,000 documents, with a maximum of 30 GB of storage each. You can add and remove storage capacity at any time, but you can’t remove storage capacity beyond your used capacity (number of documents ingested or storage space used). We recommend estimating how often documents are added to your data sources in order to determine when to increase storage capacity in your Amazon Kendra index through storage capacity units. You can monitor the document count with Amazon CloudWatch or on the Amazon Kendra console.

Queries per second represent the number of concurrent queries your Amazon Kendra index receives at a given time. If you’re replacing a search solution with Amazon Kendra, you should be able to retrieve this information from query logs. If you exceed your provisioned and bursting capacity, your request may receive a 400 HTTP status code (client error) with the message ThrottlingException. For example, using the AWS SDK for Python (Boto3), you may receive an exception like the following:

ThrottlingException: An error occurred (ThrottlingException) when calling the Query operation (reached max retries: 4)

For cases like this, Boto3 includes the retries feature, which retries the query call (in this case to Amazon Kendra) after obtaining an exception. If you aren’t using an AWS SDK, you may need to implement an error handling mechanism that, for example, could use exponential backoff to handle this error.
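If you need to implement backoff yourself, the following sketch shows the usual pattern. `with_backoff` and `flaky_query` are illustrative names, and the stub stands in for a real Amazon Kendra query call:

```python
import random
import time

def with_backoff(call, max_attempts=5, base_delay=0.1, retryable=("ThrottlingException",)):
    """Retry `call` with exponential backoff and jitter when a retryable error is raised."""
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception as err:
            # Boto3 surfaces throttling as an error whose code is ThrottlingException;
            # for illustration we just match on the message text.
            if attempt == max_attempts - 1 or not any(code in str(err) for code in retryable):
                raise
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, base_delay))

# Stubbed example: the call fails twice with throttling, then succeeds on the third try.
attempts = {"n": 0}
def flaky_query():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RuntimeError("ThrottlingException: reached max retries")
    return {"ResultItems": []}

print(with_backoff(flaky_query, base_delay=0.01))  # {'ResultItems': []}
```

With Boto3 itself, you can also raise the SDK’s built-in retry budget through a client `Config` rather than hand-rolling a loop.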

You can monitor your Amazon Kendra index queries with CloudWatch metrics. For example, you could follow the metric IndexQueryCount, which represents the number of index queries per minute. If you want to use the IndexQueryCount metric, you should divide that number by 60 to obtain the average queries per second. Additionally, you can get a report of the queries per second on the Amazon Kendra console, as shown in the following screenshot.
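Because IndexQueryCount is a per-minute count, converting CloudWatch datapoints to QPS is a simple division. A minimal sketch (the sample values are made up):

```python
# IndexQueryCount samples are per-minute totals; divide by 60 for average QPS.
def to_qps(per_minute_counts):
    return [count / 60 for count in per_minute_counts]

samples = [150, 120, 90]           # hypothetical CloudWatch datapoints
qps = to_qps(samples)
print([round(q, 2) for q in qps])  # [2.5, 2.0, 1.5]
print(round(max(qps), 2))          # peak average QPS: 2.5
```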

The preceding graph shows three patterns:

  • Peaks of ~2.5 QPS during business hours, between 8 AM and 8 PM
  • Sustained usage above ~0.5 QPS and below 1 QPS between 8 PM and 8 AM
  • Less than 0.3 QPS usage on the weekend (Feb 7, 2021, was a Sunday and Feb 13, 2021, was a Saturday)

Taking into account these capacity requirements, you could start defining your Amazon Kendra index additional capacity units as follows:

  • For the high-usage times (8 AM to 8 PM, Monday through Friday), you add 24 QCUs (each query capacity unit provides capacity for at least 0.1 QPS), which, when added to the initial Amazon Kendra Enterprise Edition query capacity (0.1 QPS), can support 2.5 queries per second.
  • For the second usage pattern (Monday through Friday from 8 PM until 8 AM), you add four QCUs, which, when combined with your initial Amazon Kendra Enterprise Edition capacity (0.1 QPS), provides capacity for 0.5 QPS.
  • For the weekends, you add two QCUs, provisioning capacity for 0.3 QPS.

The following table summarizes this configuration.

Period                   Additional QCUs   Capacity (Without Bursting)
Mon – Fri, 8 AM – 8 PM   24                2.5 QPS
Mon – Fri, 8 PM – 8 AM   4                 0.5 QPS
Sat – Sun                2                 0.3 QPS
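This schedule maps naturally onto a small lookup function that a scheduled job (such as the EventBridge and Lambda approach referenced earlier) could consult before updating the index capacity. The function below is a hypothetical sketch of that mapping:

```python
from datetime import datetime

def additional_qcus(now: datetime) -> int:
    """Map a timestamp to the additional query capacity units from the schedule above."""
    if now.weekday() >= 5:      # Saturday or Sunday
        return 2
    if 8 <= now.hour < 20:      # Monday-Friday, 8 AM to 8 PM
        return 24
    return 4                    # Monday-Friday, 8 PM to 8 AM

print(additional_qcus(datetime(2021, 2, 8, 10, 0)))  # Monday 10 AM -> 24
print(additional_qcus(datetime(2021, 2, 7, 10, 0)))  # Sunday -> 2
```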

You can use this initial approach to define a baseline that needs to be reevaluated to ensure the right sizing of your Amazon Kendra resources.

It’s also important to keep in mind that query autocomplete capacity is derived from your query capacity. Query autocomplete capacity is calculated as five times the provisioned query capacity, with a base capacity of 2.5 calls per second. This means that if your Amazon Kendra index query capacity is at or below 0.5 QPS, you have 2.5 QPS for query autocomplete; if your index query capacity is above 0.5 QPS, your query autocomplete capacity is five times your current index query capacity.

Conclusion

In this post, you learned how to estimate capacity and scale your Amazon Kendra index.

Now it’s easier than ever to experience Amazon Kendra, with 750 hours of Free Tier usage and the newly reduced price for Amazon Kendra Developer Edition. To get started, visit our workshop or check out the AWS Machine Learning Blog.


About the Author

Dr. Andrew Kane is an AWS Principal Specialist Solutions Architect based out of London. He focuses on the AWS Language and Vision AI services, helping our customers architect multiple AI services into a single use-case driven solution. Before joining AWS at the beginning of 2015, Andrew spent two decades working in the fields of signal processing, financial payments systems, weapons tracking, and editorial and publishing systems. He is a keen karate enthusiast (just one belt away from Black Belt) and is also an avid home-brewer, using automated brewing hardware and other IoT sensors.

 

Tapodipta Ghosh is a Senior Architect. He leads the Content And Knowledge Engineering Machine Learning team that focuses on building models related to AWS Technical Content. He also helps our customers with AI/ML strategy and implementation using our AI Language services like Amazon Kendra.

 

 

Jean-Pierre Dodel leads product management for Amazon Kendra, a new ML-powered enterprise search service from AWS. He brings 15 years of Enterprise Search and ML solutions experience to the team, having worked at Autonomy, HP, and search startups for many years prior to joining Amazon four years ago. JP has led the Amazon Kendra team from its inception, defining vision, roadmaps, and delivering transformative semantic search capabilities to customers like Dow Jones, Liberty Mutual, 3M, and PwC.

 

Juan Bustos is an AI Services Specialist Solutions Architect at Amazon Web Services, based in Dallas, TX. Outside of work, he loves spending time writing and playing music as well as trying random restaurants with his family.

Read More

Reimagine knowledge discovery using Amazon Kendra’s Web Crawler

When you deploy intelligent search in your organization, two important factors to consider are access to the latest and most comprehensive information, and a contextual discovery mechanism. Many companies are still struggling to make their internal documents searchable in a way that allows employees to find relevant knowledge in a scalable, cost-effective manner. A 2018 International Data Corporation (IDC) study found that data professionals are losing 50% of their time every week—30% searching for, governing, and preparing data, plus 20% duplicating work. Amazon Kendra is purpose-built for addressing these challenges. Amazon Kendra is an intelligent search service that uses deep learning and reading comprehension to deliver more accurate search results.

The intelligent search capabilities of Amazon Kendra improve the search and discovery experience, but enterprises are still faced with the challenge of connecting troves of unstructured data and making that data accessible to search. Content is often unstructured and scattered across intranets and Wikis, making critical information hard to find and costing employees time and effort to track down the right answer.

Enterprises spend a lot of time and effort building complex extract, transform, and load (ETL) jobs that aggregate data sources. Amazon Kendra connectors allow you to quickly aggregate content as part of a single unified searchable index, without needing to copy or move data from an existing location to a new one. This reduces the time and effort typically associated with creating a new search solution.

With the recently launched Amazon Kendra web crawler, it’s now easier than ever to discover information stored within the vast amount of content spread across different websites and internal web portals. You can use the Amazon Kendra web crawler to quickly ingest and search content from your websites.

Sample use case

A common need is to reduce the complexity of searching across multiple data sources present in an organization. Most organizations have multiple departments, each having their own knowledge management and search systems. For example, the HR department may maintain a WordPress-based blog containing news and employee benefits-related articles, a Confluence site could contain internal knowledge bases maintained by engineering, sales may have sales plays stored on a custom content management system (CMS), and corporate office information could be stored in a Microsoft SharePoint Online site.

You can index all these types of webpages for search by using the Amazon Kendra web crawler. Specific connectors are also available to index documents directly from individual content data sources.

In this post, you learn how to ingest documents from a WordPress site using its sitemap with the Amazon Kendra web crawler.

Ingest documents with Amazon Kendra web crawler

For this post, we set up a WordPress site with information about AWS AI language services. To make the contents of the website searchable, we create a web crawler data source.

  1. On the Amazon Kendra console, choose Data sources in the navigation pane.
  2. Under WebCrawler, choose Add connector.
  3. For Data source name, enter a name for the data source.
  4. Add an optional description.
  5. Choose Next.

The web crawler allows you to define a series of source URLs or source sitemaps. WordPress generates a sitemap, which we use for this post.

  6. For Source, select Source sitemaps.
  7. For Source sitemaps, enter the sitemap URL.
  8. Add a web proxy or authentication if your host requires it.
  9. Create a new AWS Identity and Access Management (IAM) role.
  10. Choose Next.
  11. For this post, we set up the web crawler to crawl one page per second, so we set the Maximum throttling value to 60.

The maximum value that’s allowed is 300.

For this post, we exclude a blog entry that contains 2021/06/28/this-post-is-to-be-skipped/ in the URL, and also all content that has the term /feed/ in the URL. Keep in mind that excluded content isn’t ingested into your Amazon Kendra index, so your users won’t be able to search across those documents.

  12. In the Additional configuration section, add these patterns to the Exclude patterns list.
  13. For Sync run schedule, choose Run on demand.
  14. Choose Next.
  15. Review the settings and choose Create.
  16. When the data source creation process is complete, choose Sync now.
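If you prefer automation over the console, the same data source can be described as a CreateDataSource request payload. The sketch below only builds the payload, with placeholder identifiers and a placeholder sitemap URL; you would pass it to an AWS SDK client (for example, Boto3’s `kendra.create_data_source(**payload)`):

```python
# Request payload mirroring the console steps above.
# The index ID, role ARN, and sitemap URL are placeholders.
def web_crawler_data_source(index_id, role_arn, sitemap_url, exclusions):
    return {
        "IndexId": index_id,
        "Name": "wordpress-web-crawler",
        "Type": "WEBCRAWLER",
        "RoleArn": role_arn,
        "Configuration": {
            "WebCrawlerConfiguration": {
                "Urls": {"SiteMapsConfiguration": {"SiteMaps": [sitemap_url]}},
                "MaxUrlsPerMinuteCrawlRate": 60,   # one page per second, as above
                "UrlExclusionPatterns": exclusions,
            }
        },
    }

payload = web_crawler_data_source(
    "index-id",
    "arn:aws:iam::111122223333:role/kendra-role",
    "https://example.com/sitemap.xml",
    ["*/feed/*"])
print(payload["Type"])  # WEBCRAWLER
```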

When the sync job is complete, we can search the website content.

Conclusion

In this post, you saw how to set up the Amazon Kendra web crawler and how easy it is to ingest your websites into your Amazon Kendra index. If you’re just getting started with Amazon Kendra, you can build an index, ingest your website, and take advantage of intelligent search to provide better results to your users. To learn more about Amazon Kendra, refer to the Amazon Kendra Essentials workshop and deep dive into the Amazon Kendra blog.


About the Authors

Tapodipta Ghosh is a Senior Architect. He leads the Content And Knowledge Engineering Machine Learning team that focuses on building models related to AWS Technical Content. He also helps our customers with AI/ML strategy and implementation using our AI Language services like Amazon Kendra.

 

 

Vijai Gandikota is a Senior Product Manager at Amazon Web Services for Amazon Kendra.

 

 

 

 

Juan Bustos is an AI Services Specialist Solutions Architect at Amazon Web Services, based in Dallas, TX. Outside of work, he loves spending time writing and playing music as well as trying random restaurants with his family.

Read More

Enghouse EspialTV enables TV accessibility with Amazon Polly

This is a guest post by Mick McCluskey, the VP of Product Management at Enghouse EspialTV. Enghouse provides software solutions that power digital transformation for communications service operators. EspialTV is an Enghouse SaaS solution that transforms the delivery of TV services for these operators across Set Top Boxes (STBs), media players, and mobile devices.

A large audience of consumers use TV services, and several of these groups may have disabilities that make it more difficult for them to access these services. To ensure that TV services are accessible to the broadest possible audience, we need to consider accessibility as a key element of the user experience (UX) for the service. Additionally, because TV is viewed as a key service by governments, it’s often subject to regulatory requirements for accessibility, including talking interfaces for the visually impaired. In the US, the Twenty-First Century Communications and Video Accessibility Act (CVAA) mandates improved accessibility of visual interfaces for users with limited hearing and vision. The CVAA brings accessibility laws from the 1980s and 1990s up to date with modern technologies, including new digital, broadband, and mobile innovations.

This post describes how Enghouse uses Amazon Polly to significantly improve accessibility for EspialTV through talking interactive menu guides for visually impaired users while meeting regulatory requirements.

Challenges

A key challenge for visually impaired users is navigating TV menus to find the content they want to view. Most TV menus are designed for a 10-foot viewing experience, meaning that a consumer sitting 10 feet from the screen can easily see the menu items. For the visually impaired, these menu items aren’t easy to see and are therefore hard to navigate. To improve our UX for subscribers with limited vision, we sought to develop a mechanism to provide audible descriptions of the menu, allowing easier navigation of key functions such as the following:

  • Channel and program selection
  • Channel and program information
  • Setup configuration, closed-caption control and options, and video description control
  • Configuration information
  • Playback

Overview of the AWS talking menu solution

Hosted on AWS, EspialTV is offered to communications service providers in a software as a service (SaaS) model. It was important for Enghouse to have a solution that not only supported the navigation offered at the time of launch, but was also flexible enough to support changes and enhancements over time. This way, the voice assistance could continuously evolve and improve to accommodate new capabilities as new services and features were added to the menu. For this reason, the solution had to be driven by real-time API calls as opposed to hardcoded text-to-speech menu configurations.

To ensure CVAA compliance and accelerate deployment, Enghouse chose to use Amazon Polly to implement this talking menu solution for the following reasons:

  • We wanted a reliable and robust solution with minimal operational and management overhead
  • It permitted faster time to market by using ready-made text-to-speech APIs
  • The real-time API approach offered greater flexibility as we evolved the service over time

The following diagram illustrates the architecture of the talking menu solution.

Using the Amazon Polly text-to-speech API allowed us to build a simple solution that integrated with our current infrastructure and followed this flow:

  • Steps 1 and 2 – When TV users open the menu guide service, the client software running on the Set Top Box (STB) makes a call via the internet or a Data Over Cable Service Interface Specification (DOCSIS) cable modem, which is routed through the cable operator’s headend server to the Espial Guide service running on the AWS Cloud.
  • Step 3 – As TV users interact with the menu guide on the STBs, the client software running on the STBs sends the string containing the specific menu description highlighted by the customer.
  • Step 4 – The cable operator’s headend server routes the request to a local cache to verify whether the requested string’s text-to-speech is cached locally. If it is, the corresponding speech is sent back to the STB to be read out loud to the TV user.
  • Step 5 – Each cable operator has its own local cache. If the requested string isn’t cached locally in the cable operator’s environment, it’s sent to the EspialTV service in AWS, where it’s met by a secondary caching server. This secondary layer of caching, hosted in the Espial environment, ensures high availability and increases cache hit rates. For example, if the caching server in the cable operator’s environment is unavailable, the cache request can be resolved by the secondary caching system hosted in the Espial environment.
  • Steps 6 and 7 – If the requested string isn’t found in the caching server in the EspialTV service, it’s routed to the Amazon Polly API to be converted to speech, which is routed back through the cable operator’s headend server to the TV user’s STB to be read out loud.
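The lookup order in steps 4-7 can be sketched as two nested cache checks. Everything here is illustrative only, not Enghouse’s production code; `synthesize` stands in for the Amazon Polly SynthesizeSpeech call:

```python
def get_speech(text, operator_cache, shared_cache, synthesize):
    """Resolve text-to-speech through two cache layers before calling Amazon Polly."""
    if text in operator_cache:        # step 4: cable operator's local cache
        return operator_cache[text]
    if text in shared_cache:          # step 5: secondary cache in the Espial environment
        audio = shared_cache[text]
    else:                             # steps 6 and 7: fall through to Amazon Polly
        audio = synthesize(text)
        shared_cache[text] = audio
    operator_cache[text] = audio
    return audio

calls = []
fake_polly = lambda t: calls.append(t) or f"audio:{t}"  # stub for the Polly API
local, shared = {}, {}
get_speech("Settings menu", local, shared, fake_polly)
get_speech("Settings menu", local, shared, fake_polly)
print(len(calls))  # 1: the second request is served from cache
```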

This architecture has several key considerations. First, several layers of caching are implemented to minimize latency for the end user. This also supports the spiky nature of this workload and ensures that only requests not found in the respective caches are made to Amazon Polly.

The ready-made text-to-speech APIs provided by Amazon Polly enabled us to implement the service with just one engineer. We also reduced the expected delivery time by 75% compared to our estimates for building an in-house custom solution. The Amazon Polly documentation was very clear, and the ramp-up time was limited. Since implementation, this solution has reliably supported 40 cable operators, each with between 1,000–100,000 STBs.

Conclusion

EspialTV offers operators a TV solution that provides fast time to revenue, low startup costs, and scalability from small to very large operators. EspialTV offers providers and consumers a compelling and always relevant experience for their TV services. With Amazon Polly, we have ensured operators can offer a TV service to the broadest possible range of consumers and align with regulatory requirements for accessibility. To learn more about Amazon Polly, visit the product page.

The content and opinions in this post are those of the third-party author and AWS is not responsible for the content or accuracy of this post.


About the Author

Mick McCluskey is VP of Product Management at Enghouse, a leading provider of software solutions helping operators use digital transformation to drive profitability in fast-changing and emerging markets. In the area of video solutions, Mick has been pivotal in creating the EspialTV solution—a truly disruptive TVaaS solution run on the AWS Cloud that permits pay TV operators to manage transition while maintaining profitability in a rapidly changing market. He is currently working on solutions that help operators take advantage of key technology and industry trends like OTT video, IoT, and 5G. In addition to delivering cloud-based solutions, he continues his journey of learning how to play golf.

Read More

Upgrade your Amazon Polly voices to neural with one line of code

In 2019, Amazon Polly launched neural text-to-speech (NTTS) voices in US English and UK English. Neural voices use machine learning and provide a richer, more lifelike speech quality. Since the initial launch of NTTS, Amazon Polly has extended its neural offering by adding new voices in US Spanish, Brazilian Portuguese, Australian English, Canadian French, German, and Korean. Some of them are also available in a Newscaster speaking style tailored to the specific needs of publishers.

If you’ve been using the standard voices in Amazon Polly, upgrading to neural voices is easy. No matter which programming language you use, the upgrade process only requires a simple addition or modification of the Engine parameter wherever you use the SynthesizeSpeech or StartSpeechSynthesisTask operation in your code. In this post, you’ll learn about the benefits of neural voices and how to migrate your voices to NTTS.

Benefits of neural vs. standard

Because neural voices provide a more expressive, natural-sounding quality than standard, migrating to neural improves the user experience and boosts engagement.

“We rely on speech synthesis to drive dynamic narrations for our educational content,” says Paul S. Ziegler, Chief Executive Officer at Reflare. “The switch from Amazon Polly’s standard to neural voices has allowed us to create narrations that are so good as to consistently be indistinguishable from human speech to non-native speakers and to occasionally even fool native speakers.”

The following is an example of Joanna’s standard voice.

The following is an example of the same words, but using Joanna’s neural voice.

“Switching to neural voices is as easy as switching to other non-neural voices,” Ziegler says. “Since our systems were already set up to automatically generate voiceovers on the fly, implementing the changes took less than 5 minutes.”

Quick migration checklist

Not all SSML tags, Regions, and languages support neural voices. Before making the switch, use this checklist to verify that NTTS is available for your specific business needs:

  • Regional support – Verify that you’re making requests in Regions that support NTTS
  • Language and voice support – Verify that you’re making requests to voices and languages that support NTTS by checking the current list of voices and languages
  • SSML tag support – Verify that the SSML tags in your requests are supported by NTTS by checking SSML tag compatibility

Additional considerations

The following table summarizes additional considerations before you switch to NTTS.

                     Standard                         Neural
Cost                 $4 per million characters        $16 per million characters
Free Tier            5 million characters per month   1 million characters per month
Default Sample Rate  22 kHz                           24 kHz
Usage Quota          See Quotas in Amazon Polly       See Quotas in Amazon Polly

Code samples

If you’re already using Amazon Polly standard voices, the following samples demonstrate how to switch to neural in each SDK. The required change is the addition of the Engine parameter.

Go:

input := &polly.SynthesizeSpeechInput{
    OutputFormat: aws.String("mp3"),
    Text:         aws.String("Hello World!"),
    VoiceId:      aws.String("Joanna"),
    Engine:       aws.String("neural"),
}

Java:

SynthesizeSpeechRequest synthReq = SynthesizeSpeechRequest.builder()
    .text("Hello World!")
    .voiceId("Joanna")
    .outputFormat("mp3")
    .engine("neural")
    .build();
ResponseInputStream<SynthesizeSpeechResponse> synthRes = polly.synthesizeSpeech(synthReq);

Javascript:

polly.synthesizeSpeech({
    Text: "Hello World!",
    OutputFormat: "mp3",
    VoiceId: "Joanna",
    TextType: "text",
    Engine: "neural"
});

.NET:

var response = client.SynthesizeSpeech(new SynthesizeSpeechRequest
{
    Text = "Hello World!",
    OutputFormat = "mp3",
    VoiceId = "Joanna",
    Engine = "neural"
});

PHP:

$result = $client->synthesizeSpeech([
    'Text' => 'Hello world!',
    'OutputFormat' => 'mp3',
    'VoiceId' => 'Joanna',
    'Engine' => 'neural']);

Python:

polly.synthesize_speech(
    Text="Hello world!",
    OutputFormat="mp3",
    VoiceId="Joanna",
    Engine="neural")

Ruby:

resp = polly.synthesize_speech({
    text: "Hello World!",
    output_format: "mp3",
    voice_id: "Joanna",
    engine: "neural"
  })

Conclusion

You can start playing with neural voices immediately on the Amazon Polly console. If you have any questions or concerns, please post them to the AWS Forum for Amazon Polly, or contact your AWS Support team.


About the Author

Marta Smolarek is a Senior Program Manager on the Amazon Text-to-Speech team. Outside of work, she loves to go camping with her family.

Read More