Q&A: More-sustainable concrete with machine learning

As a building material, concrete withstands the test of time. Its use dates back to early civilizations, and today it is the most popular composite choice in the world. However, it’s not without its faults. Production of its key ingredient, cement, contributes 8-9 percent of the global anthropogenic CO2 emissions and 2-3 percent of energy consumption, which is only projected to increase in the coming years. With aging United States infrastructure, the federal government recently passed a milestone bill to revitalize and upgrade it, along with a push to reduce greenhouse gas emissions where possible, putting concrete in the crosshairs for modernization, too.

Elsa Olivetti, the Esther and Harold E. Edgerton Associate Professor in the MIT Department of Materials Science and Engineering, and Jie Chen, MIT-IBM Watson AI Lab research scientist and manager, think artificial intelligence can help meet this need by designing and formulating new, more sustainable concrete mixtures, with lower costs and carbon dioxide emissions, while improving material performance and reusing manufacturing byproducts in the material itself. Olivetti’s research improves environmental and economic sustainability of materials, and Chen develops and optimizes machine learning and computational techniques, which he can apply to materials reformulation. Olivetti and Chen, along with their collaborators, have recently teamed up for an MIT-IBM Watson AI Lab project to make concrete more sustainable for the benefit of society, the climate, and the economy.

Q: What applications does concrete have, and what properties make it a preferred building material?

Olivetti: Concrete is the dominant building material globally with an annual consumption of 30 billion metric tons. That is over 20 times the next most produced material, steel, and the scale of its use leads to considerable environmental impact, approximately 5-8 percent of global greenhouse gas (GHG) emissions. It can be made locally, has a broad range of structural applications, and is cost-effective. Concrete is a mixture of fine and coarse aggregate, water, cement binder (the glue), and other additives.

Q: Why isn’t it sustainable, and what research problems are you trying to tackle with this project?

Olivetti: The community is working on several ways to reduce the impact of this material, including alternative fuels use for heating the cement mixture, increasing energy and materials efficiency and carbon sequestration at production facilities, but one important opportunity is to develop an alternative to the cement binder.

While cement is 10 percent of the concrete mass, it accounts for 80 percent of the GHG footprint. This impact is derived from the fuel burned to heat and run the chemical reaction required in manufacturing, but also the chemical reaction itself releases CO2 from the calcination of limestone. Therefore, partially replacing the input ingredients to cement (traditionally ordinary Portland cement or OPC) with alternative materials from waste and byproducts can reduce the GHG footprint. But use of these alternatives is not inherently more sustainable because wastes might have to travel long distances, which adds to fuel emissions and cost, or might require pretreatment processes. The optimal way to make use of these alternate materials will be situation-dependent. But because of the vast scale, we also need solutions that account for the huge volumes of concrete needed. This project is trying to develop novel concrete mixtures that will decrease the GHG impact of the cement and concrete, moving away from the trial-and-error processes towards those that are more predictive.

Chen: If we want to fight climate change and make our environment better, are there alternative ingredients or a reformulation we could use so that less greenhouse gas is emitted? We hope that through this project using machine learning we’ll be able to find a good answer.

Q: Why is this problem important to address now, at this point in history?

Olivetti: There is urgent need to address greenhouse gas emissions as aggressively as possible, and the road to doing so isn’t necessarily straightforward for all areas of industry. For transportation and electricity generation, there are paths that have been identified to decarbonize those sectors. We need to move much more aggressively to achieve those in the time needed; further, the technological approaches to achieve that are more clear. However, for tough-to-decarbonize sectors, such as industrial materials production, the pathways to decarbonization are not as mapped out.

Q: How are you planning to address this problem to produce better concrete?

Olivetti: The goal is to predict mixtures that will both meet performance criteria, such as strength and durability, with those that also balance economic and environmental impact. A key to this is to use industrial wastes in blended cements and concretes. To do this, we need to understand the glass and mineral reactivity of constituent materials. This reactivity not only determines the limit of the possible use in cement systems but also controls concrete processing, and the development of strength and pore structure, which ultimately control concrete durability and life-cycle CO2 emissions.

Chen: We investigate using waste materials to replace part of the cement component. This is something that we’ve hypothesized would be more sustainable and economic — actually waste materials are common, and they cost less. Because of the reduction in the use of cement, the final concrete product would be responsible for much less carbon dioxide production. Figuring out the right concrete mixture proportion that makes endurable concretes while achieving other goals is a very challenging problem. Machine learning is giving us an opportunity to explore the advancement of predictive modeling, uncertainty quantification, and optimization to solve the issue. What we are doing is exploring options using deep learning as well as multi-objective optimization techniques to find an answer. These efforts are now more feasible to carry out, and they will produce results with reliability estimates that we need to understand what makes a good concrete.

Q: What kinds of AI and computational techniques are you employing for this?

Olivetti: We use AI techniques to collect data on individual concrete ingredients, mix proportions, and concrete performance from the literature through natural language processing. We also add data obtained from industry and/or high throughput atomistic modeling and experiments to optimize the design of concrete mixtures. Then we use this information to develop insight into the reactivity of possible waste and byproduct materials as alternatives to cement materials for low-CO2 concrete. By incorporating generic information on concrete ingredients, the resulting concrete performance predictors are expected to be more reliable and transformative than existing AI models.

Chen: The final objective is to figure out what constituents, and how much of each, to put into the recipe for producing the concrete that optimizes the various factors: strength, cost, environmental impact, performance, etc. For each of the objectives, we need certain models: We need a model to predict the performance of the concrete (like, how long does it last and how much weight does it sustain?), a model to estimate the cost, and a model to estimate how much carbon dioxide is generated. We will need to build these models by using data from literature, from industry, and from lab experiments.

We are exploring Gaussian process models to predict the concrete strength, going forward into days and weeks. This model can give us an uncertainty estimate of the prediction as well. Such a model needs specification of parameters, for which we will use another model to calculate. At the same time, we also explore neural network models because we can inject domain knowledge from human experience into them. Some models are as simple as multi-layer perceptions, while some are more complex, like graph neural networks. The goal here is that we want to have a model that is not only accurate but also robust — the input data is noisy, and the model must embrace the noise, so that its prediction is still accurate and reliable for the multi-objective optimization.

Once we have built models that we are confident with, we will inject their predictions and uncertainty estimates into the optimization of multiple objectives, under constraints and under uncertainties.

Q: How do you balance cost-benefit trade-offs?

Chen: The multiple objectives we consider are not necessarily consistent, and sometimes they are at odds with each other. The goal is to identify scenarios where the values for our objectives cannot be further pushed simultaneously without compromising one or a few. For example, if you want to further reduce the cost, you probably have to suffer the performance or suffer the environmental impact. Eventually, we will give the results to policymakers and they will look into the results and weigh the options. For example, they may be able to tolerate a slightly higher cost under a significant reduction in greenhouse gas. Alternatively, if the cost varies little but the concrete performance changes drastically, say, doubles or triples, then this is definitely a favorable outcome.

Q: What kinds of challenges do you face in this work?

Chen: The data we get either from industry or from literature are very noisy; the concrete measurements can vary a lot, depending on where and when they are taken. There are also substantial missing data when we integrate them from different sources, so, we need to spend a lot of effort to organize and make the data usable for building and training machine learning models. We also explore imputation techniques that substitute missing features, as well as models that tolerate missing features, in our predictive modeling and uncertainty estimate.

Q: What do you hope to achieve through this work?

Chen: In the end, we are suggesting either one or a few concrete recipes, or a continuum of recipes, to manufacturers and policymakers. We hope that this will provide invaluable information for both the construction industry and for the effort of protecting our beloved Earth.

Olivetti: We’d like to develop a robust way to design cements that make use of waste materials to lower their CO2 footprint. Nobody is trying to make waste, so we can’t rely on one stream as a feedstock if we want this to be massively scalable. We have to be flexible and robust to shift with feedstocks changes, and for that we need improved understanding. Our approach to develop local, dynamic, and flexible alternatives is to learn what makes these wastes reactive, so we know how to optimize their use and do so as broadly as possible. We do that through predictive model development through software we have developed in my group to automatically extract data from literature on over 5 million texts and patents on various topics. We link this to the creative capabilities of our IBM collaborators to design methods that predict the final impact of new cements. If we are successful, we can lower the emissions of this ubiquitous material and play our part in achieving carbon emissions mitigation goals.

Other researchers involved with this project include Stefanie Jegelka, the X-Window Consortium Career Development Associate Professor in the MIT Department of Electrical Engineering and Computer Science; Richard Goodwin, IBM principal researcher; Soumya Ghosh, MIT-IBM Watson AI Lab research staff member; and Kristen Severson, former research staff member. Collaborators included Nghia Hoang, former research staff member with MIT-IBM Watson AI Lab and IBM Research, and Executive Director of MIT Climate & Sustainability Consortium Jeremy Gregory.​

This research is supported by the MIT-IBM Watson AI Lab.

Read More

Technique enables real-time rendering of scenes in 3D

Humans are pretty good at looking at a single two-dimensional image and understanding the full three-dimensional scene that it captures. Artificial intelligence agents are not.

Yet a machine that needs to interact with objects in the world — like a robot designed to harvest crops or assist with surgery — must be able to infer properties about a 3D scene from observations of the 2D images it’s trained on.     

While scientists have had success using neural networks to infer representations of 3D scenes from images, these machine learning methods aren’t fast enough to make them feasible for many real-world applications.

A new technique demonstrated by researchers at MIT and elsewhere is able to represent 3D scenes from images about 15,000 times faster than some existing models.

The method represents a scene as a 360-degree light field, which is a function that describes all the light rays in a 3D space, flowing through every point and in every direction. The light field is encoded into a neural network, which enables faster rendering of the underlying 3D scene from an image.

The light-field networks (LFNs) the researchers developed can reconstruct a light field after only a single observation of an image, and they are able to render 3D scenes at real-time frame rates.

“The big promise of these neural scene representations, at the end of the day, is to use them in vision tasks. I give you an image and from that image you create a representation of the scene, and then everything you want to reason about you do in the space of that 3D scene,” says Vincent Sitzmann, a postdoc in the Computer Science and Artificial Intelligence Laboratory (CSAIL) and co-lead author of the paper.

Sitzmann wrote the paper with co-lead author Semon Rezchikov, a postdoc at Harvard University; William T. Freeman, the Thomas and Gerd Perkins Professor of Electrical Engineering and Computer Science and a member of CSAIL; Joshua B. Tenenbaum, a professor of computational cognitive science in the Department of Brain and Cognitive Sciences and a member of CSAIL; and senior author Frédo Durand, a professor of electrical engineering and computer science and a member of CSAIL. The research will be presented at the Conference on Neural Information Processing Systems this month.

Mapping rays

In computer vision and computer graphics, rendering a 3D scene from an image involves mapping thousands or possibly millions of camera rays. Think of camera rays like laser beams shooting out from a camera lens and striking each pixel in an image, one ray per pixel. These computer models must determine the color of the pixel struck by each camera ray.

Many current methods accomplish this by taking hundreds of samples along the length of each camera ray as it moves through space, which is a computationally expensive process that can lead to slow rendering.

Instead, an LFN learns to represent the light field of a 3D scene and then directly maps each camera ray in the light field to the color that is observed by that ray. An LFN leverages the unique properties of light fields, which enable the rendering of a ray after only a single evaluation, so the LFN doesn’t need to stop along the length of a ray to run calculations.

“With other methods, when you do this rendering, you have to follow the ray until you find the surface. You have to do thousands of samples, because that is what it means to find a surface. And you’re not even done yet because there may be complex things like transparency or reflections. With a light field, once you have reconstructed the light field, which is a complicated problem, rendering a single ray just takes a single sample of the representation, because the representation directly maps a ray to its color,” Sitzmann says.      

The LFN classifies each camera ray using its “Plücker coordinates,” which represent a line in 3D space based on its direction and how far it is from its point of origin. The system computes the Plücker coordinates of each camera ray at the point where it hits a pixel to render an image.

By mapping each ray using Plücker coordinates, the LFN is also able to compute the geometry of the scene due to the parallax effect. Parallax is the difference in apparent position of an object when viewed from two different lines of sight. For instance, if you move your head, objects that are farther away seem to move less than objects that are closer. The LFN can tell the depth of objects in a scene due to parallax, and uses this information to encode a scene’s geometry as well as its appearance.

But to reconstruct light fields, the neural network must first learn about the structures of light fields, so the researchers trained their model with many images of simple scenes of cars and chairs.

“There is an intrinsic geometry of light fields, which is what our model is trying to learn. You might worry that light fields of cars and chairs are so different that you can’t learn some commonality between them. But it turns out, if you add more kinds of objects, as long as there is some homogeneity, you get a better and better sense of how light fields of general objects look, so you can generalize about classes,” Rezchikov says.

Once the model learns the structure of a light field, it can render a 3D scene from only one image as an input.

Rapid rendering

The researchers tested their model by reconstructing 360-degree light fields of several simple scenes. They found that LFNs were able to render scenes at more than 500 frames per second, about three orders of magnitude faster than other methods. In addition, the 3D objects rendered by LFNs were often crisper than those generated by other models.

An LFN is also less memory-intensive, requiring only about 1.6 megabytes of storage, as opposed to 146 megabytes for a popular baseline method.

“Light fields were proposed before, but back then they were intractable. Now, with these techniques that we used in this paper, for the first time you can both represent these light fields and work with these light fields. It is an interesting convergence of the mathematical models and the neural network models that we have developed coming together in this application of representing scenes so machines can reason about them,” Sitzmann says.

In the future, the researchers would like to make their model more robust so it could be used effectively for complex, real-world scenes. One way to drive LFNs forward is to focus only on reconstructing certain patches of the light field, which could enable the model to run faster and perform better in real-world environments, Sitzmann says.

“Neural rendering has recently enabled photorealistic rendering and editing of images from only a sparse set of input views. Unfortunately, all existing techniques are computationally very expensive, preventing applications that require real-time processing, like video conferencing. This project takes a big step toward a new generation of computationally efficient and mathematically elegant neural rendering algorithms,” says Gordon Wetzstein, an associate professor of electrical engineering at Stanford University, who was not involved in this research. “I anticipate that it will have widespread applications, in computer graphics, computer vision, and beyond.”

This work is supported by the National Science Foundation, the Office of Naval Research, Mitsubishi, the Defense Advanced Research Projects Agency, and the Singapore Defense Science and Technology Agency.

Read More

Generating a realistic 3D world

While standing in a kitchen, you push some metal bowls across the counter into the sink with a clang, and drape a towel over the back of a chair. In another room, it sounds like some precariously stacked wooden blocks fell over, and there’s an epic toy car crash. These interactions with our environment are just some of what humans experience on a daily basis at home, but while this world may seem real, it isn’t.

A new study from researchers at MIT, the MIT-IBM Watson AI Lab, Harvard University, and Stanford University is enabling a rich virtual world, very much like stepping into “The Matrix.” Their platform, called ThreeDWorld (TDW), simulates high-fidelity audio and visual environments, both indoor and outdoor, and allows users, objects, and mobile agents to interact like they would in real life and according to the laws of physics. Object orientations, physical characteristics, and velocities are calculated and executed for fluids, soft bodies, and rigid objects as interactions occur, producing accurate collisions and impact sounds.

TDW is unique in that it is designed to be flexible and generalizable, generating synthetic photo-realistic scenes and audio rendering in real time, which can be compiled into audio-visual datasets, modified through interactions within the scene, and adapted for human and neural network learning and prediction tests. Different types of robotic agents and avatars can also be spawned within the controlled simulation to perform, say, task planning and execution. And using virtual reality (VR), human attention and play behavior within the space can provide real-world data, for example.

“We are trying to build a general-purpose simulation platform that mimics the interactive richness of the real world for a variety of AI applications,” says study lead author Chuang Gan, MIT-IBM Watson AI Lab research scientist.

Creating realistic virtual worlds with which to investigate human behaviors and train robots has been a dream of AI and cognitive science researchers. “Most of AI right now is based on supervised learning, which relies on huge datasets of human-annotated images or sounds,” says Josh McDermott, associate professor in the Department of Brain and Cognitive Sciences (BCS) and an MIT-IBM Watson AI Lab project lead. These descriptions are expensive to compile, creating a bottleneck for research. And for physical properties of objects, like mass, which isn’t always readily apparent to human observers, labels may not be available at all. A simulator like TDW skirts this problem by generating scenes where all the parameters and annotations are known. Many competing simulations were motivated by this concern but were designed for specific applications; through its flexibility, TDW is intended to enable many applications that are poorly suited to other platforms.

Another advantage of TDW, McDermott notes, is that it provides a controlled setting for understanding the learning process and facilitating the improvement of AI robots. Robotic systems, which rely on trial and error, can be taught in an environment where they cannot cause physical harm. In addition, “many of us are excited about the doors that these sorts of virtual worlds open for doing experiments on humans to understand human perception and cognition. There’s the possibility of creating these very rich sensory scenarios, where you still have total control and complete knowledge of what is happening in the environment.”

McDermott, Gan, and their colleagues are presenting this research at the conference on Neural Information Processing Systems (NeurIPS) in December.

Behind the framework

The work began as a collaboration between a group of MIT professors along with Stanford and IBM researchers, tethered by individual research interests into hearing, vision, cognition, and perceptual intelligence. TDW brought these together in one platform. “We were all interested in the idea of building a virtual world for the purpose of training AI systems that we could actually use as models of the brain,” says McDermott, who studies human and machine hearing. “So, we thought that this sort of environment, where you can have objects that will interact with each other and then render realistic sensory data from them, would be a valuable way to start to study that.”

To achieve this, the researchers built TDW on a video game platform called Unity3D Engine and committed to incorporating both visual and auditory data rendering without any animation. The simulation consists of two components: the build, which renders images, synthesizes audio, and runs physics simulations; and the controller, which is a Python-based interface where the user sends commands to the build. Researchers construct and populate a scene by pulling from an extensive 3D model library of objects, like furniture pieces, animals, and vehicles. These models respond accurately to lighting changes, and their material composition and orientation in the scene dictate their physical behaviors in the space. Dynamic lighting models accurately simulate scene illumination, causing shadows and dimming that correspond to the appropriate time of day and sun angle. The team has also created furnished virtual floor plans that researchers can fill with agents and avatars. To synthesize true-to-life audio, TDW uses generative models of impact sounds that are triggered by collisions or other object interactions within the simulation. TDW also simulates noise attenuation and reverberation in accordance with the geometry of the space and the objects in it.

Two physics engines in TDW power deformations and reactions between interacting objects — one for rigid bodies, and another for soft objects and fluids. TDW performs instantaneous calculations regarding mass, volume, and density, as well as any friction or other forces acting upon the materials. This allows machine learning models to learn about how objects with different physical properties would behave together.

Users, agents, and avatars can bring the scenes to life in several ways. A researcher could directly apply a force to an object through controller commands, which could literally set a virtual ball in motion. Avatars can be empowered to act or behave in a certain way within the space — e.g., with articulated limbs capable of performing task experiments. Lastly, VR head and handsets can allow users to interact with the virtual environment, potentially to generate human behavioral data that machine learning models could learn from.

Richer AI experiences

To trial and demonstrate TDW’s unique features, capabilities, and applications, the team ran a battery of tests comparing datasets generated by TDW and other virtual simulations. The team found that neural networks trained on scene image snapshots with randomly placed camera angles from TDW outperformed other simulations’ snapshots in image classification tests and neared that of systems trained on real-world images. The researchers also generated and trained a material classification model on audio clips of small objects dropping onto surfaces in TDW and asked it to identify the types of materials that were interacting. They found that TDW produced significant gains over its competitor. Additional object-drop testing with neural networks trained on TDW revealed that the combination of audio and vision together is the best way to identify the physical properties of objects, motivating further study of audio-visual integration.

TDW is proving particularly useful for designing and testing systems that understand how the physical events in a scene will evolve over time. This includes facilitating benchmarks of how well a model or algorithm makes physical predictions of, for instance, the stability of stacks of objects, or the motion of objects following a collision — humans learn many of these concepts as children, but many machines need to demonstrate this capacity to be useful in the real world. TDW has also enabled comparisons of human curiosity and prediction against those of machine agents designed to evaluate social interactions within different scenarios.

Gan points out that these applications are only the tip of the iceberg. By expanding the physical simulation capabilities of TDW to depict the real world more accurately, “we are trying to create new benchmarks to advance AI technologies, and to use these benchmarks to open up many new problems that until now have been difficult to study.”

The research team on the paper also includes MIT engineers Jeremy Schwartz and Seth Alter, who are instrumental to the operation of TDW; BCS professors James DiCarlo and Joshua Tenenbaum; graduate students Aidan Curtis and Martin Schrimpf; and former postdocs James Traer (now an assistant professor at the University of Iowa) and Jonas Kubilius PhD ‘08. Their colleagues are IBM director of the MIT-IBM Watson AI Lab David Cox; research software engineer Abhishek Bhandwaldar; and research staff member Dan Gutfreund of IBM. Additional researchers co-authoring are Harvard University assistant professor Julian De Freitas; and from Stanford University, assistant professors Daniel L.K. Yamins (a TDW founder) and Nick Haber, postdoc Daniel M. Bear, and graduate students Megumi Sano, Kuno Kim, Elias Wang, Damian Mrowca, Kevin Feigelis, and Michael Lingelbach.

This research was supported by the MIT-IBM Watson AI Lab.

Read More

Taking some of the guesswork out of drug discovery

In their quest to discover effective new medicines, scientists search for drug-like molecules that can attach to disease-causing proteins and change their functionality. It is crucial that they know the 3D shape of a molecule to understand how it will attach to specific surfaces of the protein.

But a single molecule can fold in thousands of different ways, so solving that puzzle experimentally is a time consuming and expensive process akin to searching for a needle in a molecular haystack.

MIT researchers are using machine learning to streamline this complex task. They have created a deep learning model that predicts the 3D shapes of a molecule solely based on a graph in 2D of its molecular structure. Molecules are typically represented as small graphs.

Their system, GeoMol, processes molecules in only seconds and performs better than other machine learning models, including some commercial methods. GeoMol could help pharmaceutical companies accelerate the drug discovery process by narrowing down the number of molecules they need to test in lab experiments, says Octavian-Eugen Ganea, a postdoc in the Computer Science and Artificial Intelligence Laboratory (CSAIL) and co-lead author of the paper.

“When you are thinking about how these structures move in 3D space, there are really only certain parts of the molecule that are actually flexible, these rotatable bonds. One of the key innovations of our work is that we think about modeling the conformational flexibility like a chemical engineer would. It is really about trying to predict the potential distribution of rotatable bonds in the structure,” says Lagnajit Pattanaik, a graduate student in the Department of Chemical Engineering and co-lead author of the paper.

Other authors include Connor W. Coley, the Henri Slezynger Career Development Assistant Professor of Chemical Engineering; Regina Barzilay, the School of Engineering Distinguished Professor for AI and Health in CSAIL; Klavs F. Jensen, the Warren K. Lewis Professor of Chemical Engineering; William H. Green, the Hoyt C. Hottel Professor in Chemical Engineering; and senior author Tommi S. Jaakkola, the Thomas Siebel Professor of Electrical Engineering in CSAIL and a member of the Institute for Data, Systems, and Society. The research will be presented this week at the Conference on Neural Information Processing Systems.

Mapping a molecule

In a molecular graph, a molecule’s individual atoms are represented as nodes and the chemical bonds that connect them are edges. 

GeoMol leverages a recent tool in deep learning called a message passing neural network, which is specifically designed to operate on graphs. The researchers adapted a message passing neural network to predict specific elements of molecular geometry.

Given a molecular graph, GeoMol initially predicts the lengths of the chemical bonds between atoms and the angles of those individual bonds. The way the atoms are arranged and connected determines which bonds can rotate.

GeoMol then predicts the structure of each atom’s local neighborhood individually and assembles neighboring pairs of rotatable bonds by computing the torsion angles and then aligning them. A torsion angle determines the motion of three segments that are connected, in this case, three chemical bonds that connect four atoms.

“Here, the rotatable bonds can take a huge range of possible values. So, the use of these message passing neural networks allows us to capture a lot of the local and global environments that influences that prediction. The rotatable bond can take multiple values, and we want our prediction to be able to reflect that underlying distribution,” Pattanaik says.

Overcoming existing hurdles

One major challenge to predicting the 3D structure of molecules is to model chirality. A chiral molecule can’t be superimposed on its mirror image, like a pair of hands (no matter how you rotate your hands, there is no way their features exactly line up). If a molecule is chiral, its mirror image won’t interact with the environment in the same way.

This could cause medicines to interact with proteins incorrectly, which could result in dangerous side effects. Current machine learning methods often involve a long, complex optimization process to ensure chirality is correctly identified, Ganea says.

Because GeoMol determines the 3D structure of each bond individually, it explicitly defines chirality during the prediction process, eliminating the need for optimization after-the-fact.

After performing these predictions, GeoMol outputs a set of likely 3D structures for the molecule.

“What we can do now is take our model and connect it end-to-end with a model that predicts this attachment to specific protein surfaces. Our model is not a separate pipeline. It is very easy to integrate with other deep learning models,” Ganea says.

A “super-fast” model

The researchers tested their model using a dataset of molecules and the likely 3D shapes they could take, which was developed by Rafael Gomez-Bombarelli, the Jeffrey Cheah Career Development Chair in Engineering, and graduate student Simon Axelrod.

They evaluated how many of these likely 3D structures their model was able to capture, in comparison to machine learning models and other methods.

In nearly all instances, GeoMol outperformed the other models on all tested metrics.

“We found that our model is super-fast, which was really exciting to see. And importantly, as you add more rotatable bonds, you expect these algorithms to slow down significantly. But we didn’t really see that. The speed scales nicely with the number of rotatable bonds, which is promising for using these types of models down the line, especially for applications where you are trying to quickly predict the 3D structures inside these proteins,” Pattanaik says.

In the future, the researchers hope to apply GeoMol to the area of high-throughput virtual screening, using the model to determine small molecule structures that would interact with a specific protein. They also want to keep refining GeoMol with additional training data so it can more effectively predict the structure of long molecules with many flexible bonds.

“Conformational analysis is a key component of numerous tasks in computer-aided drug design, and an important component in advancing machine learning approaches in drug discovery,” says Pat Walters, senior vice president of computation at Relay Therapeutics, who was not involved in this research. “I’m excited by continuing advances in the field and thank MIT for contributing to broader learnings in this area.”

This research was funded by the Machine Learning for Pharmaceutical Discovery and Synthesis consortium.

Read More