Amazon scientists have shown that our latest text-to-speech (TTS) system, which uses a generative neural network, can learn to employ a newscaster style from just a few hours of training data.
Reducing Customer Friction through Skill Selection
This year, we’ve started to explore ways to make it easier for customers to find and engage with Alexa skills.
Amazon helps launch workshop on automatic fact verification
At the 2018 Conference on Empirical Methods in Natural Language Processing (EMNLP), Amazon researchers and their colleagues at the University of Sheffield and Imperial College London will host the first Workshop on Fact Extraction and Verification, which will explore how computer systems can learn to recognize false assertions online.
How an Echo Device Could Locate Snaps, Claps, and Taps
Microphone arrays like the ones in Echo devices are being widely investigated as a way to enable sound source localization. Sound reaches the separate microphones of the array at different times, and the differences can be used to calculate the direction of the sound’s source.
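The arrival-time idea above can be sketched for the simplest case, a two-microphone pair in the far field: the delay between the mics determines the source angle via an arcsine relation. The mic spacing, delay value, and speed-of-sound constant below are illustrative assumptions, not details of any Echo device's array.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s in air at room temperature (assumed)

def direction_of_arrival(delay_s: float, mic_spacing_m: float) -> float:
    """Estimate a sound source's angle (degrees from broadside, i.e.
    perpendicular to the mic pair) from the arrival-time difference
    between two microphones, assuming a far-field plane wave."""
    ratio = SPEED_OF_SOUND * delay_s / mic_spacing_m
    ratio = max(-1.0, min(1.0, ratio))  # clamp away numerical noise
    return math.degrees(math.asin(ratio))

# A sound arriving 0.1 ms later at one mic of a 10 cm pair
# comes from roughly 20 degrees off broadside.
angle = direction_of_arrival(1e-4, 0.10)
```

Real arrays use more than two microphones, so the same principle is applied across many mic pairs and the estimates are combined to resolve ambiguity.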
Identifying sounds in audio streams
On September 20, Amazon unveiled a host of new products and features, including Alexa Guard, a smart-home feature available on select Echo devices later this year. When activated, Alexa Guard can send customers alerts if it detects the sound of glass breaking or of smoke or carbon monoxide alarms in the home.
How Voice and Graphics Working Together Enhance the Alexa Experience
Last week, Amazon announced the release of both a redesigned Echo Show with a bigger screen and the Alexa Presentation Language, which enables third-party developers to build “multimodal” skills that coordinate Alexa’s natural-language-understanding systems with on-screen graphics.
Whisper to Alexa, and She’ll Whisper Back
If you’re in a room where a child has just fallen asleep, and someone else walks in, you might start speaking in a whisper, to indicate that you’re trying to keep the room quiet. The other person will probably start whispering, too.
Learning to Recognize the Irrelevant
A central task of natural-language-understanding systems, like the ones that power Alexa, is domain classification, or determining the general subject of a user’s utterances. Voice services must make finer-grained determinations, too, such as the particular actions that a customer wants executed. But domain classification makes those determinations much more efficient, by narrowing the range of possible interpretations.
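The narrowing step can be illustrated with a deliberately simplified sketch: classify the utterance into a domain first, then score only that domain's intents. The keyword matching here stands in for the statistical classifiers a production system would use, and all domain and intent names are hypothetical.

```python
# Toy illustration of domain classification as a funnel: the domain
# decision shrinks the set of intents the finer-grained models must
# consider. Keyword overlap stands in for a real statistical model.
DOMAIN_KEYWORDS = {
    "Music": {"play", "song", "album"},
    "Weather": {"weather", "forecast", "rain"},
}
DOMAIN_INTENTS = {
    "Music": ["PlaySong", "PauseMusic"],
    "Weather": ["GetForecast"],
}

def classify_domain(utterance: str) -> str:
    """Pick the domain whose keyword set best overlaps the utterance."""
    tokens = set(utterance.lower().split())
    scores = {d: len(kw & tokens) for d, kw in DOMAIN_KEYWORDS.items()}
    return max(scores, key=scores.get)

def candidate_intents(utterance: str) -> list:
    """Return only the intents of the classified domain, rather than
    every intent the service supports."""
    return DOMAIN_INTENTS[classify_domain(utterance)]
```

For an utterance like "play a song", the sketch routes to the Music domain, so downstream models never need to score weather intents at all.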
How Alexa Is Learning to Ignore TV, Radio, and Other Media Players
Echo devices have already attracted tens of millions of customers, but in the Alexa AI group, we’re constantly working to make Alexa’s speech recognition systems even more accurate.
Alexa at Interspeech 2018: How interaction histories can improve speech understanding
Alexa’s ability to act on spoken requests depends on statistical models that translate speech to text and text to actions. Historically, the models’ decisions were one-size-fits-all: the same utterance would produce the same action, regardless of context.