Is Curiosity All You Need? On the Utility of Emergent Behaviours from Curious Exploration
We argue that merely using curiosity for fast environment exploration or as a bonus reward for a specific task does not harness the full potential of this technique and misses useful skills. Instead, we propose to shift the focus towards retaining the behaviours which emerge during curiosity-based learning. We posit that these self-discovered behaviours serve as valuable skills in an agent’s repertoire to solve related tasks.
Challenges in Detoxifying Language Models
In our paper, we focus on LMs and their propensity to generate toxic language. We study the effectiveness of different methods to mitigate LM toxicity, and their side-effects, and we investigate the reliability and limits of classifier-based automatic toxicity evaluation.
Building architectures that can handle the world’s data
Perceiver IO, a more general version of the Perceiver architecture, can produce a wide variety of outputs from many different inputs.
Most architectures used by AI systems today are specialists. A 2D residual network may be a good choice for processing images, but at best it’s a loose fit for other kinds of data, such as the Lidar signals used in self-driving cars or the torques used in robotics. What’s more, standard architectures are often designed with only one task in mind, which leads engineers to bend over backwards to reshape, distort, or otherwise modify their inputs and outputs in hopes that a standard architecture can learn to handle their problem correctly. Dealing with more than one kind of data, like the sounds and images that make up videos, is even more complicated and usually involves complex, hand-tuned systems built from many different parts, even for simple tasks. As part of DeepMind’s mission of solving intelligence to advance science and humanity, we want to build systems that can solve problems that use many types of inputs and outputs, so we began to explore a more general and versatile architecture that can handle all types of data.
Generally capable agents emerge from open-ended play
In new work, algorithmic advances and new training environments lead to agents which exhibit general heuristic behaviours.
In recent years, artificial intelligence agents have succeeded in a range of complex game environments. For instance, AlphaZero beat world-champion programs in chess, shogi, and Go after starting out knowing no more than the basic rules of how to play. Through reinforcement learning (RL), this single system learnt by playing round after round of games through a repetitive process of trial and error. But AlphaZero still trained separately on each game, unable to simply learn another game or task without repeating the RL process from scratch. The same is true for other successes of RL, such as Atari, Capture the Flag, StarCraft II, Dota 2, and Hide-and-Seek. DeepMind’s mission of solving intelligence to advance science and humanity led us to explore how we could overcome this limitation to create AI agents with more general and adaptive behaviour. Instead of learning one game at a time, these agents would be able to react to completely new conditions and play a whole universe of games and tasks, including ones never seen before.
Enabling high-accuracy protein structure prediction at the proteome scale
Many novel machine learning innovations contribute to AlphaFold’s current level of accuracy. We give a high-level overview of the system below; for a technical description of the network architecture see our AlphaFold methods paper and especially its extensive Supplementary Information.
Putting the power of AlphaFold into the world’s hands
In partnership with EMBL-EBI, we’re incredibly proud to be launching the AlphaFold Protein Structure Database.