Safety-first AI for autonomous data centre cooling and industrial control

Many of societys most pressing problems have grown increasingly complex, so the search for solutions can feel overwhelming. At DeepMind and Google, we believe that if we can use AI as a tool to discover new knowledge, solutions will be easier to reach.In 2016, we jointly developed an AI-powered recommendation system to improve the energy efficiency of Googles already highly-optimised data centres. Our thinking was simple: even minor improvements would provide significant energy savings and reduce CO2 emissions to help combat climate change.Now were taking this system to the next level: instead of human-implemented recommendations, our AI system is directly controlling data centre cooling, while remaining under the expert supervision of our data centre operators. This first-of-its-kind cloud-based control system is now safely delivering energy savings in multiple Google data centres.Read More

A major milestone for the treatment of eye disease

We are delighted to announce the results of the first phase of our joint research partnership with Moorfields Eye Hospital, which could potentially transform the management of sight-threatening eye disease.The results, published online inNature Medicine(open access full text, see end of blog), show that our AI system can quickly interpret eye scans from routine clinical practice with unprecedented accuracy. It can correctly recommend how patients should be referred for treatment for over 50 sight-threatening eye diseases as accurately as world-leading expert doctors.These are early results, but they show that our system could handle the wide variety of patients found in routine clinical practice. In the long term, we hope this will help doctors quickly prioritise patients who need urgent treatment which could ultimately save sight.A more streamlined processCurrently, eyecare professionals use optical coherence tomography (OCT) scans to help diagnose eye conditions. These 3D images provide a detailed map of the back of the eye, but they are hard to read and need expert analysis to interpret.The time it takes to analyse these scans, combined with the sheer number of scans that healthcare professionals have to go through (over 1,000 a day at Moorfields alone), can lead to lengthy delays between scan and treatment even when someone needs urgent care.Read More

Objects that Sound

Visual and audio events tend to occur together: a musician plucking guitar strings and the resulting melody; a wine glass shattering and the accompanying crash; the roar of a motorcycle as it accelerates. These visual and audio stimuli are concurrent because they share a common cause. Understanding the relationship between visual events and their associated sounds is a fundamental way that we make sense of the world around us.In Look, Listen, and Learn and Objects that Sound (to appear at ECCV 2018), we explore this observation by asking: what can be learnt by looking at and listening to a large number of unlabelled videos? By constructing an audio-visual correspondence learning task that enables visual and audio networks to be jointly trained from scratch, we demonstrate that:the networks are able to learn useful semantic concepts;the two modalities can be used to search one another (e.g. to answer the question, Which sound fits well with this image?); andthe object making the sound can be localised.Limitations of previous cross-modal learning approachesLearning from multiple modalities is not new; historically, researchers have largely focused on image-text or audio-vision pairings.Read More