Contextual Clues Can Help Improve Alexa’s Speech Recognizers

Automatic speech recognition systems, which convert spoken words into text, are an important component of conversational agents such as Alexa. These systems generally comprise an acoustic model, a pronunciation model, and a statistical language model. The role of the statistical language model is to assign a probability to the next word in a sentence, given the previous ones. For instance, the phrases “Pulitzer Prize” and “pullet surprise” may have very similar acoustic profiles, but statistically, one is far more likely to conclude a question that begins “Alexa, what playwright just won a … ?”
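To make the idea concrete, here is a minimal sketch of how a statistical language model scores competing transcriptions, using a toy trigram model over a made-up corpus. The corpus, counts, and function names are illustrative assumptions for this sketch, not Alexa's actual models or data.

```python
from collections import Counter, defaultdict

# Toy corpus (an illustrative assumption, not real training data).
corpus = [
    "what playwright just won a pulitzer prize",
    "the novelist won a pulitzer prize",
    "the farmer cooked a pullet surprise",
]

# Count trigrams to estimate P(next_word | two previous words).
trigrams = defaultdict(Counter)
for sentence in corpus:
    words = sentence.split()
    for w1, w2, w3 in zip(words, words[1:], words[2:]):
        trigrams[(w1, w2)][w3] += 1

def prob(context, word):
    """Maximum-likelihood estimate of P(word | context)."""
    counts = trigrams[context]
    total = sum(counts.values())
    return counts[word] / total if total else 0.0

# Acoustically similar hypotheses get very different language-model scores:
print(prob(("won", "a"), "pulitzer"))  # 1.0 in this toy corpus
print(prob(("won", "a"), "pullet"))    # 0.0 in this toy corpus
```

In a real recognizer these language-model scores would be combined with acoustic-model scores to rank full transcription hypotheses.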

Measuring abstract reasoning in neural networks

Neural network-based models continue to achieve impressive results on longstanding machine learning problems, but establishing their capacity to reason about abstract concepts has proven difficult. Building on previous efforts to address this key capability of general-purpose learning systems, our latest paper sets out an approach for measuring abstract reasoning in learning machines, and reveals some important insights about the nature of generalisation itself.

DeepMind papers at ICML 2018

The 2018 International Conference on Machine Learning will take place in Stockholm, Sweden, from 10-15 July. For those attending and planning the week ahead, we are sharing a schedule of DeepMind presentations at ICML (you can download a PDF version here). We look forward to the many engaging discussions, ideas, and collaborations that are sure to arise from the conference!

Efficient Neural Audio Synthesis

Authors: Nal Kalchbrenner, Erich Elsen, Karen Simonyan, Seb Noury, Norman Casagrande, Edward Lockhart, Sander Dieleman, Aaron van den Oord, Koray Kavukcuoglu

Sequential models achieve state-of-the-art results in audio, visual and textual domains with respect to both estimating the data distribution and generating desired samples. Efficient sampling from this class of models with little to no loss in quality has, however, remained elusive. With a focus on text-to-speech synthesis, we show that compact recurrent architectures, a remarkably high degree of weight sparsification, and a novel reordering of the variables greatly reduce sampling latency while maintaining high audio fidelity. We first describe a compact single-layer recurrent neural network, the WaveRNN, with a novel dual softmax layer that matches the quality of the state-of-the-art WaveNet model.
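The dual softmax mentioned in the abstract factorises each 16-bit audio sample into two 8-bit halves, so each output distribution spans only 256 classes rather than 65,536. Below is a minimal sketch of that sampling step in Python/NumPy; the random logits stand in for the network's output heads, and in the actual WaveRNN the fine prediction is additionally conditioned on the sampled coarse value.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over a 1-D array of logits."""
    z = np.exp(logits - logits.max())
    return z / z.sum()

rng = np.random.default_rng(0)

# Stand-ins for the network's two output heads (assumed for illustration);
# each is a softmax over only 256 classes instead of one over 2^16.
coarse_logits = rng.normal(size=256)  # predicts the high 8 bits
fine_logits = rng.normal(size=256)    # predicts the low 8 bits (conditioned
                                      # on the sampled coarse value in WaveRNN)

coarse = rng.choice(256, p=softmax(coarse_logits))  # sample high byte
fine = rng.choice(256, p=softmax(fine_logits))      # sample low byte

# Reassemble the two bytes into one 16-bit audio sample.
sample_16bit = (coarse << 8) | fine
print(sample_16bit)
```

Splitting the output this way keeps the final layer small, which is one of the ingredients (alongside weight sparsification and variable reordering) that the paper credits for reduced sampling latency.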