Learning about artificial intelligence: A hub of MIT resources for K-12 students

In light of the recent events surrounding Covid-19, learning for grades K-12 looks very different than it did a month ago. Parents and educators may be feeling overwhelmed about turning their homes into classrooms. 

With that in mind, a team led by Media Lab Associate Professor Cynthia Breazeal has launched aieducation.mit.edu to share a variety of online activities for K-12 students to learn about artificial intelligence, with a focus on how to design and use it responsibly. Learning resources provided on this website can help to address the needs of the millions of children, parents, and educators worldwide who are staying at home due to school closures caused by Covid-19, and are looking for free educational activities that support project-based STEM learning in an exciting and innovative area. 

The website is a collaboration between the Media Lab, MIT Stephen A. Schwarzman College of Computing, and MIT Open Learning, serving as a hub to highlight diverse work by faculty, staff, and students across the MIT community at the intersection of AI, learning, and education. 

“MIT is the birthplace of Constructionism under Seymour Papert. MIT has revolutionized how children learn computational thinking with hugely successful platforms such as Scratch and App Inventor. Now, we are bringing this rich tradition and deep expertise to how children learn about AI through project-based learning that dovetails technical concepts with ethical design and responsible use,” says Breazeal. 

The website will serve as a hub for MIT’s latest work in innovating learning and education in the era of AI. In addition to research, it features up-to-date project-based activities, learning units, child-friendly software tools, digital interactives, and other supporting materials drawn from MIT-developed educational research and collaborative outreach efforts across and beyond MIT. The site is intended for students, parents, teachers, and lifelong learners alike, with resources for children and adults at all learning levels and with varying levels of comfort with technology, across a range of artificial intelligence topics. The team has also gathered a variety of external resources to explore, such as Teachable Machine by Google, a browser-based platform that lets users train classifiers for their own image-recognition algorithms in a user-friendly way.
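
For readers curious what “training a classifier” involves, here is a minimal, hypothetical Python sketch of the general idea such tools build on: fit a simple classifier on labeled feature vectors extracted from example images. The random “embeddings” below are stand-ins for the features a real tool would compute with a pretrained network.

```python
# Minimal sketch of the idea behind browser-based classifier tools:
# fit a simple classifier on feature vectors from labeled examples.
# Feature extraction is stubbed out with random vectors here.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)

# Pretend embeddings for two user-defined classes ("cat", "dog"),
# 20 examples each, 128-dimensional feature vectors.
X = np.vstack([rng.normal(0.0, 1.0, (20, 128)),
               rng.normal(3.0, 1.0, (20, 128))])
y = ["cat"] * 20 + ["dog"] * 20

clf = KNeighborsClassifier(n_neighbors=3).fit(X, y)

# Classify a new (synthetic) example.
print(clf.predict(rng.normal(3.0, 1.0, (1, 128))))  # likely "dog"
```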

In the spirit of “mens et manus” — the MIT motto, meaning “mind and hand” — the vision of technology for learning at MIT is about empowering and inspiring learners of all ages in the pursuit of creative endeavors. The activities highlighted on the new website are designed in the tradition of constructionism: learning through project-based experiences in which learners build and share their work. The approach is also inspired by the idea of computational action, where children can design AI-enabled technologies to help others in their community.

“MIT has been a world leader in AI since the 1960s,” says MIT professor of computer science and engineering Hal Abelson, who has long been involved in MIT’s AI research and educational technology. “MIT’s approach to making machines intelligent has always been strongly linked with our work in K-12 education. That work is aimed at empowering young people through computational ideas that help them understand the world and computational actions that empower them to improve life for themselves and their communities.”

Research in computer science education and AI education highlights the importance of mixing “plugged” and “unplugged” learning approaches. Unplugged activities are kinesthetic or discussion-based activities that introduce children to concepts in AI and its societal impact without using a computer. Unplugged approaches have been found to be especially helpful for young children, and they are also accessible to learning environments (classrooms and homes) with limited access to technology. 

As computers automate more and more routine tasks, educational inequity remains a key barrier to future opportunity in a world where success depends increasingly on intellect, creativity, social skills, and specialized skills and knowledge. This accelerating change raises the critical question of how best to prepare students, from children to lifelong learners, to be successful and to flourish in the era of AI.

It is important to help prepare a diverse and inclusive citizenry to be responsible designers and conscientious users of AI. In that spirit, the activities on aieducation.mit.edu range from hands-on programming to paper prototyping, to Socratic seminars, and even creative writing about speculative fiction. The learning units and project-based activities are designed to be accessible to a wide audience with different backgrounds and comfort levels with technology. A number of these activities leverage learning about AI as a way to connect to the arts, humanities, and social sciences, too, offering a holistic view of how AI intersects with different interests and endeavors. 

The rising ubiquity of AI affects us all, but today a disproportionately small slice of the population has the skills or power to decide how AI is designed or implemented; worrying consequences have been seen in algorithmic bias and perpetuation of unjust systems. Democratizing AI through education, starting in K-12, will help to make it more accessible and diverse at all levels, ultimately helping to create a more inclusive, fair, and equitable future.

Read More

Computational thinking class enables students to engage in Covid-19 response

When an introductory computational science class, which is open to the general public, was repurposed to study the Covid-19 pandemic this spring, the instructors saw registration rise from 20 students to nearly 300.

Introduction to Computational Thinking (6.S083/18.S190), which applies data science, artificial intelligence, and mathematical models using the Julia programming language developed at MIT, was introduced in the fall as a pilot half-semester class. It was launched as part of the MIT Stephen A. Schwarzman College of Computing’s computational thinking program and spearheaded by Department of Mathematics Professor Alan Edelman and Visiting Professor David P. Sanders. They quickly fast-tracked the curriculum to focus on applications to the Covid-19 response; students were equally fast in jumping on board.

“Everyone at MIT wants to contribute,” says Edelman. “While we at the Julia Lab are doing research in building tools for scientists, Dave and I thought it would be valuable to teach the students about some of the fundamentals related to computation for drug development, disease models, and such.” 

The course is offered through MIT’s Department of Electrical Engineering and Computer Science and the Department of Mathematics. “This course opens a trove of opportunities to use computation to better understand and contain the Covid-19 pandemic,” says MIT Computer Science and Artificial Intelligence Laboratory Director Daniela Rus.

The fall version of the class had a maximum enrollment of 20 students, but the spring class ballooned to nearly 300 students within one weekend, almost all from MIT. “We’ve had a tremendous response,” Edelman says. “This definitely stressed the MIT sign-up systems in ways that I could not have imagined.”

Sophomore Shinjini Ghosh, majoring in computer science and linguistics, says she was initially drawn to the class to learn Julia, “but also to develop the skills to do further computational modeling and conduct research on the spread and possible control of Covid-19.”

“There’s been a lot of misinformation about the epidemiology and statistical modeling of the coronavirus,” adds sophomore Raj Movva, a computer science and biology major. “I think this class will help clarify some details, and give us a taste of how one might actually make predictions about the course of a pandemic.” 

Edelman says that he has always dreamed of an interdisciplinary modern class that would combine the machine learning and AI of a “data-driven” world, the software and systems possibilities that Julia allows, and the physical models, differential equations, and scientific machine learning of the “physical world.” 
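
As a concrete taste of the “physical world” side of that vision, here is a minimal sketch of the classic SIR epidemic model, the kind of differential-equation disease model such a course might cover. (The class itself uses Julia; this illustration is plain Python with simple Euler time steps.)

```python
# Minimal SIR epidemic model integrated with Euler steps.
# s, i, r are the susceptible, infected, and recovered fractions.
def sir(beta=0.3, gamma=0.1, s=0.99, i=0.01, r=0.0, days=160, dt=0.1):
    for _ in range(int(days / dt)):
        new_inf = beta * s * i * dt   # susceptible -> infected
        new_rec = gamma * i * dt      # infected -> recovered
        s, i, r = s - new_inf, i + new_inf - new_rec, r + new_rec
    return s, i, r

s, i, r = sir()
print(f"after 160 days: S={s:.3f} I={i:.3f} R={r:.3f}")
```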

He calls this class “a natural outgrowth of Julia Lab’s research, and that of the general cooperative open-source Julia community.” For years, this online community has collaborated to create tools to speed up the drug approval process, aid in scientific machine learning and differential equations, and predict infectious disease transmission. “The lectures are open to the world, following the great MIT tradition of open courses,” says Edelman.

So when MIT turned to virtual learning to de-densify campus, the transition to an online, remotely taught version of the class was not too difficult for Edelman and Sanders.

“Even though we have run open remote learning courses before, it’s never the same as being able to see the live audience in front of you,” says Edelman. “However, MIT students ask such great questions in the Zoom chat that it remains as intellectually invigorating as ever.”

Sanders, a Marcos Moshinsky research fellow currently on leave as a professor at the National University of Mexico, is working on techniques for accelerating global optimization. Involved with the Julia Lab since 2014, Sanders has worked with Edelman on various teaching, research, and outreach projects related to Julia, and his YouTube tutorials have reached over 100,000 views. “His videos have often been referred to as the best way to learn the Julia language,” says Edelman.

Edelman will also be enlisting some help from Philip, his family’s Corgi who until recently had been a frequent wanderer of MIT’s halls and classrooms. “Philip is a well-known Julia expert whose image has been classified many times by Julia’s AI Systems,” says Edelman. “Students are always happy when Philip participates in the online classes.”

Read More

Researching from home: Science stays social, even at a distance

With all but a skeleton crew staying home from each lab to minimize the spread of Covid-19, scores of Picower Institute researchers are immersing themselves in the considerable amount of scientific work that can be done away from the bench. With piles of data to analyze; plenty of manuscripts to write; new skills to acquire; and fresh ideas to conceive, share, and refine for the future, neuroscientists have full plates, even when they are away from their, well, plates. They are proving that science can remain social, even if socially distant.

Ever since the mandatory ramp-down of on-campus research took hold March 20, for example, teams of researchers in the lab of Troy Littleton, the Menicon Professor of Neuroscience, have sharpened their focus on two data-analysis projects that are every bit as essential to their science as acquiring the data in the lab in the first place. Research scientist Yulia Akbergenova and graduate student Karen Cunningham, for example, are poring over a huge amount of imaging data showing how the strength of connections between neurons, or synapses, matures and how that depends on the molecular components at the site. Another team, composed of Picower postdoc Suresh Jetti and graduate students Andres Crane and Nicole Aponte-Santiago, is analyzing another large dataset, this time of gene transcription, to learn what distinguishes two subclasses of motor neurons that form synapses of characteristically different strength.

Work is similarly continuing among researchers in the lab of Elly Nedivi, the William R. (1964) and Linda R. Young Professor of Neuroscience. Since heading home, Senior Research Support Associate Kendyll Burnell has been looking at microscope images tracking how inhibitory interneurons innervate the visual cortex of mice throughout their development. By studying the maturation of inhibition, the lab hopes to improve understanding of the role of inhibitory circuitry in the experience-dependent changes, or plasticity, and development of the visual cortex, she says. As she’s worked, her poodle Soma (named for the central body structure of a neuron) has been by her side.

Despite extra time with the comforts of home, though, it’s clear that nobody wanted this current mode of socially distant science. For every lab, it’s tremendously disruptive and costly. But labs are finding many ways to make progress nonetheless.

“Although we are certainly hurting because our lab work is at a standstill, the Miller lab is fortunate to have a large library of multiple-electrode neurophysiological data,” says Picower Professor Earl Miller. “The datasets are very rich. As our hypotheses and analytical tools develop, we can keep going back to old data to ask new questions. We are taking advantage of the wet lab downtime to analyze data and write papers. We have three under review and are writing at least three more right now.”

Miller is also inviting new collaborations despite the physical impediment of social distancing. A recent lab meeting held via the videoconferencing app Zoom included MIT Department of Brain and Cognitive Sciences Associate Professor Ila Fiete and her graduate student, Mikail Khona. The Miller lab has begun studying how neural rhythms move around the cortex and what that means for brain function. Khona presented models of how timing relationships affect those waves. While this kind of interaction between labs of the Picower Institute and the McGovern Institute for Brain Research would normally have taken place in person in MIT’s Building 46, neither lab let the pandemic get in the way.

Similarly, the lab of Li-Huei Tsai, Picower Professor and director of the Picower Institute, has teamed up with that of Manolis Kellis, professor in the MIT Computer Science and Artificial Intelligence Laboratory. They’re forming several small squads of experimenters and computational experts to launch analyses of gene expression and other data to illuminate the fate of individual cell types like interneurons or microglia in the context of the Alzheimer’s disease-afflicted brain. Other teams are focusing on analyses of questions such as how pathology varies in brain samples carrying different degrees of genetic risk factors. These analyses will prove useful for stages all along the scientific process, Tsai says, from forming new hypotheses to wrapping up papers that are well underway.

Remote collaboration and communication are proving crucial to researchers in other ways, too, showing that online interactions, though distant, can be quite personally fulfilling.

Nicholas DiNapoli, a research engineer in the lab of Associate Professor Kwanghun Chung, is making the best of time away from the bench by learning about the lab’s computational pipeline for processing the enormous amounts of imaging data it generates. He’s also taking advantage of a new program within the lab in which Senior Computer Scientist Lee Kamentsky is teaching Python computer programming principles to anyone in the lab who wants to learn. The training occurs via Zoom two days a week.

As part of a crowded calendar of Zoom meetings, or “Zeetings” as the lab has begun to call them, Newton Professor Mriganka Sur says he makes sure to have one-to-one meetings with everyone in the lab. The team also has organized into small subgroups around different themes of the lab’s research.

The lab has also maintained its cohesion by banding together informally to create novel work and social experiences.

Graduate student Ning Leow, for example, used Zoom to create a co-working session in which participants kept a video connection open for hours at a time, just to be in each other’s virtual presence while they worked. Among a group of Sur lab friends, she read a paper related to her thesis and did a substantial amount of data analysis. She also advised a colleague on an analysis technique via the connection.

“I’ve got to say that it worked out really well for me personally because I managed to get whatever I wanted to complete on my list done,” she says, “and there was also a sense of healthy accountability along with the sense of community.”

Whether in person or at an officially imposed distance, science is social. In that spirit, graduate student K. Guadalupe “Lupe” Cruz organized a collaborative art event via Zoom for female scientists in brain and cognitive sciences at MIT. She took a photo of Rosalind Franklin, the scientist whose work was essential for resolving the structure of DNA, and divided it into nine squares to distribute to the event attendees. Without knowing the full picture, everyone drew just their section, talking all the while about how the strange circumstances of Covid-19 have changed their lives. At the end, they stitched their squares together to reconstruct the image.

Examples abound of how Picower scientists, though mostly separate and apart, are still coming together to advance their research and to maintain the fabric of their shared experiences.

Read More

Making robots better co-workers

Nima Keivan wants to help Amazon’s robots and fulfillment center workers collaborate more effectively, so that robots can perform the more mundane tasks while the humans focus on higher-value jobs.

Read More

Towards understanding glasses with graph neural networks

Under a microscope, a pane of window glass doesn’t look like a collection of orderly molecules, as a crystal would, but rather a jumble with no discernible structure. Glass is made by starting with a glowing mixture of high-temperature melted sand and minerals. Once cooled, its viscosity (a measure of the friction in the fluid) increases a trillion-fold, and it becomes a solid, resisting tension from stretching or pulling. Yet the molecules in the glass remain in a seemingly disordered state, much like the original molten liquid, almost as though the disordered liquid state had been flash-frozen in place. The glass transition, then, first appears to be a dramatic arrest in the movement of the glass molecules. Whether this process corresponds to a structural phase transition (as in water freezing, or the superconducting transition) is a major open question in the field. Understanding the nature of the dynamics of glass is fundamental to understanding how the atomic-scale properties define the visible features of many solid materials.
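
As a rough illustration of the graph-network idea behind this line of work (not DeepMind’s actual model), the sketch below treats particles as graph nodes, links nearby pairs with edges, and runs a few rounds of message passing so that each node’s features come to reflect its local structural neighborhood. All weights and data are synthetic.

```python
# Toy message-passing sketch: particles as nodes, proximity as edges.
import numpy as np

rng = np.random.default_rng(1)
n, d = 50, 8                              # 50 particles, 8-dim node features
pos = rng.uniform(0, 10, (n, 3))          # synthetic particle positions
feat = rng.normal(size=(n, d))            # initial node features

# Connect particles closer than a cutoff radius (no self-edges).
dist = np.linalg.norm(pos[:, None] - pos[None, :], axis=-1)
adj = ((dist < 2.0) & (dist > 0)).astype(float)

W_msg = rng.normal(size=(d, d)) * 0.1     # toy stand-ins for learned weights
W_upd = rng.normal(size=(d, d)) * 0.1

for _ in range(3):                        # three rounds of message passing
    msgs = adj @ (feat @ W_msg)           # each node sums neighbors' messages
    feat = np.tanh(feat @ W_upd + msgs)   # update node states

# A trained model would read out per-particle predictions (e.g. mobility).
print(feat.shape)  # (50, 8)
```

Read More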

Accelerating data-driven discoveries

As technologies like single-cell genomic sequencing, enhanced biomedical imaging, and medical “internet of things” devices proliferate, key discoveries about human health are increasingly found within vast troves of complex life science and health data.

But drawing meaningful conclusions from that data is a difficult problem that can involve piecing together different data types and manipulating huge data sets in response to varying scientific inquiries. The problem is as much about computer science as it is about other areas of science. That’s where Paradigm4 comes in.

The company, founded by Marilyn Matz SM ’80 and Turing Award winner and MIT Professor Michael Stonebraker, helps pharmaceutical companies, research institutes, and biotech companies turn data into insights.

It accomplishes this with a computational database management system that’s built from the ground up to host the diverse, multifaceted data at the frontiers of life science research. That includes data from sources like national biobanks, clinical trials, the medical internet of things, human cell atlases, medical images, environmental factors, and multi-omics, a field that includes the study of genomes, microbiomes, metabolomes, and more.

On top of the system’s unique architecture, the company has also built data preparation, metadata management, and analytics tools to help users find the important patterns and correlations lurking within all those numbers.

In many instances, customers are exploring data sets the founders say are too large and complex to be represented effectively by traditional database management systems.

“We’re keen to enable scientists and data scientists to do things they couldn’t do before by making it easier for them to deal with large-scale computation and machine-learning on diverse data,” Matz says. “We’re helping scientists and bioinformaticists with collaborative, reproducible research to ask and answer hard questions faster.”

A new paradigm

Stonebraker has been a pioneer in the field of database management systems for decades. He has started nine companies, and his innovations have set standards for the way modern systems allow people to organize and access large data sets.

Much of Stonebraker’s career has focused on relational databases, which organize data into columns and rows. But in the mid-2000s, Stonebraker realized that a lot of data being generated would be better stored not in rows or columns but in multidimensional arrays.

For example, satellites break the Earth’s surface into large squares, and GPS systems track a person’s movement through those squares over time. That operation involves vertical, horizontal, and time measurements that aren’t easily grouped or otherwise manipulated for analysis in relational database systems.

Stonebraker recalls his scientific colleagues complaining that available database management systems were too slow to work with complex scientific datasets in fields like genomics, where researchers study the relationships between population-scale multi-omics data, phenotypic data, and medical records.

“[Relational database systems] scan either horizontally or vertically, but not both,” Stonebraker explains. “So you need a system that does both, and that requires a storage manager down at the bottom of the system which is capable of moving both horizontally and vertically through a very big array. That’s what Paradigm4 does.”
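
To make that intuition concrete, here is a toy NumPy illustration (not SciDB itself) of why array-native layout helps: with data stored as a multidimensional array, slicing “vertically” (one location across all times) and “horizontally” (all locations at one time) are equally natural operations.

```python
# Toy multidimensional-array access: latitude x longitude x time.
import numpy as np

rng = np.random.default_rng(2)
temps = rng.normal(15, 5, (90, 180, 365))  # coarse grid, synthetic data

one_place_all_year = temps[42, 100, :]   # time series for one grid cell
whole_world_one_day = temps[:, :, 200]   # global snapshot for one day

print(one_place_all_year.shape, whole_world_one_day.shape)  # (365,) (90, 180)
```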

In 2008, Stonebraker began developing a database management system at MIT that stored data in multidimensional arrays. He confirmed the approach offered major efficiency advantages, allowing analytical tools based on linear algebra, including many forms of machine learning and statistical data processing, to be applied to huge datasets in new ways.

Stonebraker decided to spin the project into a company in 2010, when he partnered with Matz, a successful entrepreneur who co-founded Cognex Corporation, a large industrial machine-vision company that went public in 1989. The founders and their team, including Alex Poliakov BS ’07, went to work building out key features of the system, including its distributed architecture that allows the system to run on low-cost servers, and its ability to automatically clean and organize data in useful ways for users.

The founders describe their database management system as a computational engine for scientific data, and they’ve named it SciDB. On top of SciDB, they developed an analytics platform, called the REVEAL discovery engine, based on users’ daily research activities and aspirations.

“If you’re a scientist or data scientist, Paradigm’s REVEAL and SciDB products take care of all the data wrangling and computational ‘plumbing and wiring,’ so you don’t have to worry about accessing data, moving data, or setting up parallel distributed computing,” Matz says. “Your data is science-ready. Just ask your scientific question and the platform orchestrates all of the data management and computation for you.”

SciDB is designed to be used by both scientists and developers, so users can interact with the system through graphical user interfaces or by leveraging statistical and programming languages like R and Python.

“It’s been very important to sell solutions, not building blocks,” Matz says. “A big part of our success in the life sciences with top pharmas and biotechs and research institutes is bringing them our REVEAL suite of application-specific solutions to problems. We’re not handing them an analytical platform that’s a set of Spark LEGO blocks; we’re giving them solutions that handle the data they deal with daily, and solutions that use their vocabulary and answer the questions they want to work on.”

Accelerating discovery

Today Paradigm4’s customers include some of the biggest pharmaceutical and biotech companies in the world as well as research labs at the National Institutes of Health, Stanford University, and elsewhere.

Customers can integrate genomic sequencing data, biometric measurements, data on environmental factors, and more into their inquiries to enable new discoveries across a range of life science fields.

Matz says SciDB did 1 billion linear regressions in less than an hour in a recent benchmark, and that it can scale well beyond that, which could speed up discoveries and lower costs for researchers who have traditionally had to extract their data from files and then rely on less efficient cloud-computing-based methods to apply algorithms at scale.
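
For a sense of what “many regressions at once” looks like in array-oriented computing, here is a hedged NumPy sketch (synthetic data, not Paradigm4’s benchmark code) that solves 100,000 small least-squares problems in a few vectorized calls rather than a loop:

```python
# Batched ordinary least squares via the normal equations.
import numpy as np

rng = np.random.default_rng(3)
m, n, p = 100_000, 30, 2          # 100k regressions, 30 samples, 2 parameters
X = rng.normal(size=(m, n, p))    # synthetic design matrices
beta_true = rng.normal(size=(m, p))
y = np.einsum("mnp,mp->mn", X, beta_true) + 0.1 * rng.normal(size=(m, n))

# beta = (X^T X)^{-1} X^T y, computed for every regression at once.
xtx = np.einsum("mnp,mnq->mpq", X, X)
xty = np.einsum("mnp,mn->mp", X, y)
beta = np.linalg.solve(xtx, xty)

print(beta.shape)                                        # (100000, 2)
print(f"mean abs error: {np.abs(beta - beta_true).mean():.4f}")
```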

“If researchers can run complex analytics in minutes that used to take days, that dramatically changes the number of hard questions you can ask and answer,” Matz says. “That is a force-multiplier that will transform research daily.”

Beyond life sciences, Paradigm4’s system holds promise for any industry dealing with multifaceted data, including earth sciences, where Matz says a NASA climatologist is already using the system, and industrial IoT, where data scientists consider large amounts of diverse data to understand complex manufacturing systems. Matz says the company will focus more on those industries next year.

In the life sciences, however, the founders believe they already have a revolutionary product that’s enabling a new world of discoveries. Down the line, they see SciDB and REVEAL contributing to national and worldwide health research that will allow doctors to provide the most informed, personalized care imaginable.

“The query that every doctor wants to run when you come into his or her office displaying a set of symptoms is: ‘Who in this national database has genetics that look like mine, symptoms that look like mine, lifestyle exposures that look like mine? And what was their diagnosis? What was their treatment? And what was their morbidity?’” Stonebraker explains. “This is cross-correlating you with everybody else to do very personalized medicine, and I think this is within our grasp.”
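
A minimal sketch of what such a “patients like me” query could look like, assuming patients are encoded as numeric feature vectors (all data and features below are synthetic and purely illustrative):

```python
# Nearest-neighbor search over a synthetic patient cohort.
import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(4)
cohort = rng.normal(size=(10_000, 50))   # 10k patients, 50 features each
me = rng.normal(size=(1, 50))            # the patient in the office

nn = NearestNeighbors(n_neighbors=5).fit(cohort)
dist, idx = nn.kneighbors(me)
print(idx[0])   # the five most similar patients; their diagnoses,
                # treatments, and outcomes would then be retrieved
```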

Read More

Robots Learning to Move like Animals

Quadruped robot learning locomotion skills by imitating a dog.

Whether it’s a dog chasing after a ball, or a monkey swinging through the trees, animals can effortlessly perform an incredibly rich repertoire of agile locomotion skills. But designing controllers that enable legged robots to replicate these agile behaviors can be a very challenging task. The superior agility seen in animals, as compared to robots, might lead one to wonder: can we create more agile robotic controllers with less effort by directly imitating animals?

In this work, we present a framework for learning robotic locomotion skills by imitating animals. Given a reference motion clip recorded from an animal (e.g. a dog), our framework uses reinforcement learning to train a control policy that enables a robot to imitate the motion in the real world. Then, by simply providing the system with different reference motions, we are able to train a quadruped robot to perform a diverse set of agile behaviors, ranging from fast walking gaits to dynamic hops and turns. The policies are trained primarily in simulation, and then transferred to the real world using a latent space adaptation technique, which is able to efficiently adapt a policy using only a few minutes of data from the real robot.
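
The per-timestep imitation objective can be sketched in a few lines: reward the policy for staying close to the reference pose, for example with an exponentiated negative tracking error. The snippet below is a simplified illustration with synthetic poses, not the framework’s actual reward function:

```python
# Pose-tracking reward: r_t = exp(-k * ||pose_robot - pose_reference||^2).
import numpy as np

def imitation_reward(robot_pose, ref_pose, k=2.0):
    err = np.sum((robot_pose - ref_pose) ** 2)
    return np.exp(-k * err)

rng = np.random.default_rng(5)
ref_motion = rng.normal(size=(100, 12))         # 100 frames, 12 joint angles
robot_traj = ref_motion + rng.normal(scale=0.1, size=ref_motion.shape)

rewards = [imitation_reward(r, m) for r, m in zip(robot_traj, ref_motion)]
print(f"mean tracking reward: {np.mean(rewards):.3f}")
```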

Q&A: Markus Buehler on setting coronavirus and AI-inspired proteins to music

The proteins that make up all living things are alive with music. Just ask Markus Buehler: The musician and MIT professor develops artificial intelligence models to design new proteins, sometimes by translating them into sound. His goal is to create new biological materials for sustainable, non-toxic applications. In a project with the MIT-IBM Watson AI Lab, Buehler is searching for a protein to extend the shelf-life of perishable food. In a new study in Extreme Mechanics Letters, he and his colleagues offer a promising candidate: a silk protein made by honeybees for use in hive building. 

In another recent study, in APL Bioengineering, he went a step further and used AI to discover an entirely new protein. As both studies went to print, the Covid-19 outbreak was surging in the United States, and Buehler turned his attention to the spike protein of SARS-CoV-2, the appendage that makes the novel coronavirus so contagious. He and his colleagues are trying to unpack its vibrational properties through molecular-based sound spectra, which could hold one key to stopping the virus. Buehler recently sat down to discuss the art and science of his work.

Q: Your work focuses on the alpha helix proteins found in skin and hair. What makes this protein so intriguing? 

A: Proteins are the bricks and mortar that make up our cells, organs, and body. Alpha helix proteins are especially important. Their spring-like structure gives them elasticity and resilience, which is why skin, hair, feathers, hooves, and even cell membranes are so durable. But they’re not just tough mechanically, they have built-in antimicrobial properties. With IBM, we’re trying to harness this biochemical trait to create a protein coating that can slow the spoilage of quick-to-rot foods like strawberries.

Q: How did you enlist AI to produce this silk protein?

A: We trained a deep learning model on the Protein Data Bank, which contains the amino acid sequences and three-dimensional shapes of about 120,000 proteins. We then fed the model a snippet of an amino acid chain for honeybee silk and asked it to predict the protein’s shape, atom-by-atom. We validated our work by synthesizing the protein for the first time in a lab — a first step toward developing a thin, antimicrobial, structurally durable coating that can be applied to food. My colleague, Benedetto Marelli, specializes in this part of the process. We also used the platform to predict the structure of proteins that don’t yet exist in nature. That’s how we designed our entirely new protein in the APL Bioengineering study. 

Q: How does your model improve on other protein prediction methods? 

A: We use end-to-end prediction. The model builds the protein’s structure directly from its sequence, translating amino acid patterns into three-dimensional geometries. It’s like translating a set of IKEA instructions into a built bookshelf, minus the frustration. Through this approach, the model effectively learns how to build a protein from the protein itself, via the language of its amino acids. Remarkably, our method can accurately predict protein structure without a template. It outperforms other folding methods and is significantly faster than physics-based modeling. Because the Protein Data Bank is limited to proteins found in nature, we needed a way to visualize new structures to make new proteins from scratch.
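
To show just the input/output shape of end-to-end prediction, here is a toy, untrained sketch: a function mapping an amino acid string directly to per-residue 3D coordinates. Real models are vastly larger and trained on the Protein Data Bank; every weight below is random.

```python
# Toy sequence-to-structure mapping: amino acid string -> (length, 3) coords.
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
rng = np.random.default_rng(6)
embed = rng.normal(size=(20, 16))        # one 16-dim vector per residue type
W = rng.normal(size=(16, 3)) * 0.1       # toy "learned" projection to xyz

def predict_structure(sequence):
    idx = [AMINO_ACIDS.index(a) for a in sequence]
    h = embed[idx]                        # (length, 16) residue features
    return h @ W                          # (length, 3) predicted coordinates

coords = predict_structure("MKTAYIAKQR")
print(coords.shape)  # (10, 3)
```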

Q: How could the model be used to design an actual protein?

A: We can build atom-by-atom models for sequences found in nature that haven’t yet been studied, as we did in the APL Bioengineering study using a different method. We can visualize the protein’s structure and use other computational methods to assess its function by analyzing its stability and the other proteins it binds to in cells. Our model could be used in drug design or to interfere with protein-mediated biochemical pathways in infectious disease.

Q: What’s the benefit of translating proteins into sound?

A: Our brains are great at processing sound! In one sweep, our ears pick up all of its hierarchical features: pitch, timbre, volume, melody, rhythm, and chords. We would need a high-powered microscope to see the equivalent detail in an image, and we could never see it all at once. Sound is such an elegant way to access the information stored in a protein. 

Typically, sound is made from vibrating a material, like a guitar string, and music is made by arranging sounds in hierarchical patterns. With AI we can combine these concepts, and use molecular vibrations and neural networks to construct new musical forms. We’ve been working on methods to turn protein structures into audible representations, and translate these representations into new materials. 
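
As a minimal, hypothetical example of sonification, the sketch below maps each amino acid to a pitch and renders a sequence as a series of sine tones. The pitch mapping is an arbitrary choice for illustration, not the mapping used in Buehler’s work.

```python
# Render an amino acid sequence as sine tones, one semitone per residue type.
import numpy as np
import wave

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
RATE = 44100

def tone(freq, dur=0.2):
    t = np.linspace(0, dur, int(RATE * dur), endpoint=False)
    return np.sin(2 * np.pi * freq * t)

def sonify(sequence, base=220.0):
    # Frequency for residue i: base * 2^(i/12), an equal-tempered scale.
    notes = [tone(base * 2 ** (AMINO_ACIDS.index(a) / 12)) for a in sequence]
    return np.concatenate(notes)

audio = sonify("MKTAYIAKQR")
with wave.open("protein.wav", "wb") as f:
    f.setnchannels(1)
    f.setsampwidth(2)
    f.setframerate(RATE)
    f.writeframes((audio * 32767).astype(np.int16).tobytes())
```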

Q: What can the sonification of SARS-CoV-2’s “spike” protein tell us?

A: Its protein spike contains three protein chains folded into an intriguing pattern. These structures are too small for the eye to see, but they can be heard. We represented the physical protein structure, with its entangled chains, as interwoven melodies that form a multi-layered composition. The spike protein’s amino acid sequence, its secondary structure patterns, and its intricate three-dimensional folds are all featured. The resulting piece is a form of counterpoint music, in which notes are played against notes. Like a symphony, the musical patterns reflect the protein’s intersecting geometry realized by materializing its DNA code.

Q: What did you learn?

A: The virus has an uncanny ability to deceive and exploit the host for its own multiplication. Its genome hijacks the host cell’s protein manufacturing machinery, and forces it to replicate the viral genome and produce viral proteins to make new viruses. As you listen, you may be surprised by the pleasant, even relaxing, tone of the music. But it tricks our ear in the same way the virus tricks our cells. It’s an invader disguised as a friendly visitor. Through music, we can see the SARS-CoV-2 spike from a new angle, and appreciate the urgent need to learn the language of proteins.  

Q: Can any of this address Covid-19, and the virus that causes it?

A: In the longer term, yes. Translating proteins into sound gives scientists another tool to understand and design proteins. Even a small mutation can limit or enhance the pathogenic power of SARS-CoV-2. Through sonification, we can also compare the biochemical processes of its spike protein with previous coronaviruses, like SARS or MERS. 

In the music we created, we analyzed the vibrational structure of the spike protein that infects the host. Understanding these vibrational patterns is critical for drug design and much more. Vibrations may change as temperatures warm, for example, and they may also tell us why the SARS-CoV-2 spike gravitates toward human cells more than other viruses. We’re exploring these questions in current, ongoing research with my graduate students. 

We might also use a compositional approach to design drugs to attack the virus. We could search for a new protein that matches the melody and rhythm of an antibody capable of binding to the spike protein, interfering with its ability to infect.

Q: How can music aid protein design?

A: You can think of music as an algorithmic reflection of structure. Bach’s Goldberg Variations, for example, are a brilliant realization of counterpoint, a principle we’ve also found in proteins. We can now hear this concept as nature composed it, and compare it to ideas in our imagination, or use AI to speak the language of protein design and let it imagine new structures. We believe that the analysis of sound and music can help us understand the material world better. Artistic expression is, after all, just a model of the world within us and around us.  

Co-authors of the study in Extreme Mechanics Letters are: Zhao Qin, Hui Sun, Eugene Lim and Benedetto Marelli at MIT; and Lingfei Wu, Siyu Huo, Tengfei Ma and Pin-Yu Chen at IBM Research. Co-author of the study in APL Bioengineering is Chi-Hua Yu. Buehler’s sonification work is supported by MIT’s Center for Art, Science and Technology (CAST) and the Mellon Foundation. 

Read More

Understanding a changing climate

AWS recently sponsored the Causality for Climate (C4C) competition at the 2019 NeurIPS (Neural Information Processing Systems) conference. The competition focused on causal discovery and the development of new ways to understand climate data.

Read More

Agent57: Outperforming the human Atari benchmark

The Atari57 suite of games is a long-standing benchmark to gauge agent performance across a wide range of tasks. We’ve developed Agent57, the first deep reinforcement learning agent to obtain a score that is above the human baseline on all 57 Atari 2600 games. Agent57 combines an algorithm for efficient exploration with a meta-controller that adapts the exploration and long- vs. short-term behaviour of the agent.
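
One way to picture that meta-controller is as a multi-armed bandit choosing among a few exploration settings according to their recent payoff. The sketch below is a generic UCB bandit on synthetic rewards, offered as a loose analogy rather than Agent57’s actual algorithm.

```python
# UCB bandit over three hypothetical exploration settings.
import numpy as np

rng = np.random.default_rng(7)
true_payoff = [0.2, 0.5, 0.8]           # hidden value of each setting
counts = np.ones(3)                      # pulls per arm (init avoids div by 0)
totals = np.ones(3)                      # accumulated reward per arm

for t in range(1, 1001):
    ucb = totals / counts + np.sqrt(2 * np.log(t) / counts)
    arm = int(np.argmax(ucb))            # choose the most promising setting
    reward = rng.random() < true_payoff[arm]
    counts[arm] += 1
    totals[arm] += reward

print(f"preferred setting: {np.argmax(counts)}")  # should converge to arm 2
```

Read More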