Creating Interactive Agents with Imitation Learning
We show that imitation learning of human-human interactions in a simulated world, in conjunction with self-supervised learning, is sufficient to produce a multimodal interactive agent, which we call MIA, that successfully interacts with non-adversarial humans 75% of the time. We further identify architectural and algorithmic techniques that improve performance, such as hierarchical action selection.
Improving language models by retrieving from trillions of tokens
We explore an alternate path for improving language models: we augment transformers with retrieval over a database of text passages including web pages, books, news and code. We call our method RETRO, for “Retrieval Enhanced TRansfOrmers”.
Language modelling at scale: Gopher, ethical considerations, and retrieval
Language, and its role in demonstrating and facilitating comprehension – or intelligence – is a fundamental part of being human. It gives people the ability to communicate thoughts and concepts, express ideas, create memories, and build mutual understanding. These are foundational parts of social intelligence. It’s why our teams at DeepMind study aspects of language processing and communication, both in artificial agents and in humans.
On the Expressivity of Markov Reward
Our main results prove that while reward can express many tasks, there exist instances of each task type that no Markov reward function can capture. We then provide a set of polynomial-time algorithms that construct a reward function which allows an agent to optimize tasks of each of these three types, and correctly determine when no such reward function exists.
Exploring the beauty of pure mathematics in novel ways
Discovering new patterns in the fields of topology and representation theory with machine learning.
Exploring the beauty of pure mathematics in novel ways
More than a century ago, Srinivasa Ramanujan shocked the mathematical world with his extraordinary ability to see remarkable patterns in numbers that no one else could see. The self-taught mathematician from India described his insights as deeply intuitive and spiritual, and patterns often came to him in vivid dreams.
Unsupervised deep learning identifies semantic disentanglement in single inferotemporal face patch neurons
Our brain has an amazing ability to process visual information. We can take one glance at a complex scene, and within milliseconds be able to parse it into objects and their attributes, like colour or size, and use this information to describe the scene in simple language. Underlying this seemingly effortless ability is a complex computation performed by our visual cortex, which involves taking millions of neural impulses transmitted from the retina and transforming them into a more meaningful form that can be mapped to the simple language description. In order to fully understand how this process works in the brain, we need to figure out both how the semantically meaningful information is represented in the firing of neurons at the end of the visual processing hierarchy, and how such a representation may be learnt from largely untaught experience.