Researching from home: Science stays social, even at a distance

With all but a skeleton crew staying home from each lab to minimize the spread of Covid-19, scores of Picower Institute researchers are immersing themselves in the considerable amount of scientific work that can done away from the bench. With piles of data to analyze; plenty of manuscripts to write; new skills to acquire; and fresh ideas to conceive, share, and refine for the future, neuroscientists have full plates, even when they are away from their, well, plates. They are proving that science can remain social, even if socially distant.

Ever since the mandatory ramp down of on-campus research took hold March 20, for example, teams of researchers in the lab of Troy Littleton, the Menicon Professor of Neuroscience, have sharpened their focus on two data-analysis projects that are every bit as essential to their science as acquiring the data in the lab in the first place. Research scientist Yulia Akbergenova and graduate student Karen Cunningham, for example, are poring over a huge amount of imaging data showing how the strength of connections between neurons, or synapses, mature and how that depends on the molecular components at the site. Another team, comprised of Picower postdoc Suresh Jetti and graduate students Andres Crane and Nicole Aponte-Santiago, is analyzing another large dataset, this time of gene transcription, to learn what distinguishes two subclasses of motor neurons that form synapses of characteristically different strength.

Work is similarly continuing among researchers in the lab of Elly Nedivi, the William R. (1964) and Linda R. Young Professor of Neuroscience. Since heading home, Senior Research Support Associate Kendyll Burnell has been looking at microscope images tracking how inhibitory interneurons innervate the visual cortex of mice throughout their development. By studying the maturation of inhibition, the lab hopes to improve understanding of the role of inhibitory circuitry in the experience-dependent changes, or plasticity, and development of the visual cortex, she says. As she’s worked, her poodle Soma (named for the central body structure of a neuron) has been by her side.

Despite extra time with comforts of home, though, it’s clear that nobody wanted this current mode of socially distant science. For every lab, it’s tremendously disruptive and costly. But labs are finding many ways to make progress nonetheless.

“Although we are certainly hurting because our lab work is at a standstill, the Miller lab is fortunate to have a large library of multiple-electrode neurophysiological data,” says Picower Professor Earl Miller. “The datasets are very rich. As our hypotheses and analytical tools develop, we can keep going back to old data to ask new questions. We are taking advantage of the wet lab downtime to analyze data and write papers. We have three under review and are writing at least three more right now.”

Miller is inviting new collaborations regardless of the physical impediment of social distancing. A recent lab meeting held via the videoconferencing app Zoom included MIT Department of Brain and Cognitive Sciences Associate Professor Ila Fiete and her graduate student, Mikail Khona. The Miller lab has begun studying how neural rhythms move around the cortex and what that means for brain function. Khona presented models of how timing relationships affect those waves. While this kind of an interaction between labs of the Picower Institute and the McGovern Institute for Brain Research would normally have taken place in person in MIT’s Building 46, neither lab let the pandemic get in the way.

Similarly, the lab of Li-Huei Tsai, Picower Professor and director of the Picower Institute, has teamed up with that of Manolis Kellis, professor in the MIT Computer Science and Artificial Intelligence Laboratory. They’re forming several small squads of experimenters and computational experts to launch analyses of gene expression and other data to illuminate the fate of individual cell types like interneurons or microglia in the context of the Alzheimer’s disease-afflicted brain. Other teams are focusing on analyses of questions such as how pathology varies in brain samples carrying different degrees of genetic risk factors. These analyses will prove useful for stages all along the scientific process, Tsai says, from forming new hypotheses to wrapping up papers that are well underway.

Remote collaboration and communication are proving crucial to researchers in other ways, too, proving that online interactions, though distant, can be quite personally fulfilling.

Nicholas DiNapoli, a research engineer in the lab of Associate Professor Kwanghun Chung, is making the best of time away from the bench by learning about the lab’s computational pipeline for processing the enormous amounts of imaging data it generates. He’s also taking advantage of a new program within the lab in which Senior Computer Scientist Lee Kamentsky is teaching Python computer programming principles to anyone in the lab who wants to learn. The training occurs via Zoom two days a week.

As part of a crowded calendar of Zoom meetings, or “Zeetings” as the lab has begun to call them, Newton Professor Mriganka Sur says he makes sure to have one-to-one meetings with everyone in the lab. The team also has organized into small subgroups around different themes of the lab’s research.

But also, the lab has continued to maintain its cohesion by banding together informally creating novel work and social experiences.

Graduate student Ning Leow, for example, used Zoom to create a co-working session in which participants kept a video connection open for hours at a time, just to be in each other’s virtual presence while they worked. Among a group of Sur lab friends, she read a paper related to her thesis and did a substantial amount of data analysis. She also advised a colleague on an analysis technique via the connection.

“I’ve got to say that it worked out really well for me personally because I managed to get whatever I wanted to complete on my list done,” she says, “and there was also a sense of healthy accountability along with the sense of community.”

Whether in person or via an officially imposed distance, science is social. In that spirit, graduate student K. Guadalupe “Lupe” Cruz organized a collaborative art event via Zoom for female scientists in brain and cognitive sciences at MIT. She took a photo of Rosalind Franklin, the scientist whose work was essential for resolving the structure of DNA, and divided it into nine squares to distribute to the event attendees. Without knowing the full picture, everyone drew just their section, talking all the while about how the strange circumstances of Covid-19 have changed their lives. At the end, they stitched their squares together to reconstruct the image.

Examples abound of how Picower scientists, though mostly separate and apart, are still coming together to advance their research and to maintain the fabric of their shared experiences.

Read More

Upcoming changes to TensorFlow.js

Upcoming changes to TensorFlow.js

Posted by Yannick Assogba, Software Engineer, Google Research

As TensorFlow.js is used more and more in production environments, our team recognizes the need for the community to be able to produce small, production optimized bundles for browsers that use TensorFlow.js. We have been laying out the groundwork for this and want to share our upcoming plans with you.

TensorFlow.js updatesOne primary goal we have for upcoming releases of TensorFlow.js is to make it more modular and more tree shakeable, while preserving ease of use for beginners. To that end, we are planning two major version releases to move us in that direction. We are releasing this work over two major versions in order to maintain semver as we make breaking changes.

TensorFlow.js 2.0

In TensorFlow.js 2.x, the only breaking change will be moving the CPU and WebGL backends from tfjs-core into their own NPM packages (tfjs-backend-cpu and tfjs-backend-webgl respectively). While today these are included by default, we want to make tfjs-core as modular and lean as possible.

What does this mean for me as a user?

If you are using the union package (i.e. @tensorflow/tfjs), you should see no impact to your code. If you are using @tensorflow/tfjs-core directly, you will need to import a package for each backend you want to use.

What benefit do I get?

If you are using @tensorflow/tfjs-core directly, you will now have the option of omitting any backend you do not want to use in your application. For example, if you only want the WebGL backend, you will be able to get modest savings by dropping the CPU backend. You will also be able to lazily load the CPU backend as a fallback if your build tooling/app supports that.

TensorFlow.js 3.0

In this release, we will have fully modularized all our ops and kernels (backend specific implementations of the math behind an operation). This will allow tree shakers in bundlers like WebPack, Rollup, and Closure Compiler to do better dead-code elimination and produce smaller bundles.

We will move to a dynamic gradient and kernel registration scheme as well as provide tooling to aid in creating custom bundles that only contain kernels for a given model or TensorFlow.js program.

We will also start shipping ES2017 bundles by default. Users who need to deploy to browsers that only support earlier versions can transpile down to their desired target.

What does this mean for me as a user?

If you are using the union package (i.e. @tensorflow/tfjs), we anticipate the changes will be minimal. In order to support ease of use in getting started with tfjs, we want the default use of the union package to remain close to what it is today.

For users who want smaller production oriented bundles, you will need to change your code to take advantage of ES2015 modules to import only the ops (and other functionality) you would like to end up in your bundle.

In addition, we will provide command-line tooling to enable builds that only load and register the kernels used by the models/programs you are deploying.

What benefit do I get?

Production oriented users will be able to opt into writing code that results in smaller more optimized builds. Other users will still be able to use the union package pretty much as is, but will not get the advantage of the smallest builds possible.

Dynamic gradient and kernel registration will make it easier to implement custom kernels and gradients for researchers and other advanced users.

FAQ

When will this be ready?

We plan to release TensorFlow.js 2.0 this month. We do not have a release date for Tensorflow 3.0 yet because of the magnitude of the change. Since we need to touch almost every file in tfjs-core, we are also taking the opportunity to clean up technical debt where we can.

Should I upgrade to TensorFlow.js 2.x or just wait for 3.x?

We recommend that you upgrade to TensorFlow 2.x if you are actively developing a TensorFlow.js project. It should be a relatively painless upgrade, and any future bug fixes will be on this release train. We do not yet have a release date for TensorFlow.js 3.x.

How do I migrate my app to 2.x or 3.x? Will there be a tutorial to follow?

As we release these versions, we will publish full release notes with instructions on how to upgrade. Separately, with the launch of 3.x, we will publish a guide on making production builds.

How much will I have to change my code to get smaller builds?

We’ll have more details as we get closer to the release of 3.x, but at a high level, we want to take advantage of the ES2015 module system to let you control what code gets into your bundle.

In general, you will need to do things like import {max, div, mul, depthToSpace} from @tensorflow/tjfs (rather than import * as tf from @tensorflow/tfjs) in order for our tooling to determine which kernels to register from the backends you have selected for deployment. We are even working on making the chaining API on the Tensor class opt-in when targeting production builds.

Will this make TensorFlow.js harder to use?

We do not want to make the barrier to entry higher for using TensorFlow.js so we are designing this in a way that only production oriented users need to do extra work to get optimized builds. For end-users developing applications using the union package (@tensorflow/tfjs) from either a hosted script or from NPM in concert with our collection of pre-trained models we expect there will be no changes as a result of these updatesRead More

Towards understanding glasses with graph neural networks

Under a microscope, a pane of window glass doesnt look like a collection of orderly molecules, as a crystal would, but rather a jumble with no discernable structure. Glass is made by starting with a glowing mixture of high-temperature melted sand and minerals. Once cooled, its viscosity (a measure of the friction in the fluid) increases a trillion-fold, and it becomes a solid, resisting tension from stretching or pulling. Yet the molecules in the glass remain in a seemingly disordered state, much like the original molten liquid almost as though the disordered liquid state had been flash-frozen in place. The glass transition, then, first appears to be a dramatic arrest in the movement of the glass molecules. Whether this process corresponds to a structural phase transition (as in water freezing, or the superconducting transition) is a major open question in the field. Understanding the nature of the dynamics of glass is fundamental to understanding how the atomic-scale properties define the visible features of many solid materials.Read More

Accelerating data-driven discoveries

As technologies like single-cell genomic sequencing, enhanced biomedical imaging, and medical “internet of things” devices proliferate, key discoveries about human health are increasingly found within vast troves of complex life science and health data.

But drawing meaningful conclusions from that data is a difficult problem that can involve piecing together different data types and manipulating huge data sets in response to varying scientific inquiries. The problem is as much about computer science as it is about other areas of science. That’s where Paradigm4 comes in.

The company, founded by Marilyn Matz SM ’80 and Turing Award winner and MIT Professor Michael Stonebraker, helps pharmaceutical companies, research institutes, and biotech companies turn data into insights.

It accomplishes this with a computational database management system that’s built from the ground up to host the diverse, multifaceted data at the frontiers of life science research. That includes data from sources like national biobanks, clinical trials, the medical internet of things, human cell atlases, medical images, environmental factors, and multi-omics, a field that includes the study of genomes, microbiomes, metabolomes, and more.

On top of the system’s unique architecture, the company has also built data preparation, metadata management, and analytics tools to help users find the important patterns and correlations lurking within all those numbers.

In many instances, customers are exploring data sets the founders say are too large and complex to be represented effectively by traditional database management systems.

“We’re keen to enable scientists and data scientists to do things they couldn’t do before by making it easier for them to deal with large-scale computation and machine-learning on diverse data,” Matz says. “We’re helping scientists and bioinformaticists with collaborative, reproducible research to ask and answer hard questions faster.”

A new paradigm

Stonebraker has been a pioneer in the field of database management systems for decades. He has started nine companies, and his innovations have set standards for the way modern systems allow people to organize and access large data sets.

Much of Stonebraker’s career has focused on relational databases, which organize data into columns and rows. But in the mid 2000s, Stonebraker realized that a lot of data being generated would be better stored not in rows or columns but in multidimensional arrays.

For example, satellites break the Earth’s surface into large squares, and GPS systems track a person’s movement through those squares over time. That operation involves vertical, horizontal, and time measurements that aren’t easily grouped or otherwise manipulated for analysis in relational database systems.

Stonebraker recalls his scientific colleagues complaining that available database management systems were too slow to work with complex scientific datasets in fields like genomics, where researchers study the relationships between population-scale multi-omics data, phenotypic data, and medical records.

“[Relational database systems] scan either horizontally or vertically, but not both,” Stonebraker explains. “So you need a system that does both, and that requires a storage manager down at the bottom of the system which is capable of moving both horizontally and vertically through a very big array. That’s what Paradigm4 does.”

In 2008, Stonebraker began developing a database management system at MIT that stored data in multidimensional arrays. He confirmed the approach offered major efficiency advantages, allowing analytical tools based on linear algebra, including many forms of machine learning and statistical data processing, to be applied to huge datasets in new ways.

Stonebraker decided to spin the project into a company in 2010, when he partnered with Matz, a successful entrepreneur who co-founded Cognex Corporation, a large industrial machine-vision company that went public in 1989. The founders and their team, including Alex Poliakov BS ’07, went to work building out key features of the system, including its distributed architecture that allows the system to run on low-cost servers, and its ability to automatically clean and organize data in useful ways for users.

The founders describe their database management system as a computational engine for scientific data, and they’ve named it SciDB. On top of SciDB, they developed an analytics platform, called the REVEAL discovery engine, based on users’ daily research activities and aspirations.

“If you’re a scientist or data scientist, Paradigm’s REVEAL and SciDB products take care of all the data wrangling and computational ‘plumbing and wiring,’ so you don’t have to worry about accessing data, moving data, or setting up parallel distributed computing,” Matz says. “Your data is science-ready. Just ask your scientific question and the platform orchestrates all of the data management and computation for you.”

SciDB is designed to be used by both scientists and developers, so users can interact with the system through graphical user interfaces or by leveraging statistical and programming languages like R and Python.

“It’s been very important to sell solutions, not building blocks,” Matz says. “A big part of our success in the life sciences with top pharmas and biotechs and research institutes is bringing them our REVEAL suite of application-specific solutions to problems. We’re not handing them an analytical platform that’s a set of Spark LEGO blocks; we’re giving them solutions that handle the data they deal with daily, and solutions that use their vocabulary and answer the questions they want to work on.”

Accelerating discovery

Today Paradigm4’s customers include some of the biggest pharmaceutical and biotech companies in the world as well as research labs at the National Institutes of Health, Stanford University, and elsewhere.

Customers can integrate genomic sequencing data, biometric measurements, data on environmental factors, and more into their inquiries to enable new discoveries across a range of life science fields.

Matz says SciDB did 1 billion linear regressions in less than an hour in a recent benchmark, and that it can scale well beyond that, which could speed up discoveries and lower costs for researchers who have traditionally had to extract their data from files and then rely on less efficient cloud-computing-based methods to apply algorithms at scale.

“If researchers can run complex analytics in minutes and that used to take days, that dramatically changes the number of hard questions you can ask and answer,” Matz says. “That is a force-multiplier that will transform research daily.”

Beyond life sciences, Paradigm4’s system holds promise for any industry dealing with multifaceted data, including earth sciences, where Matz says a NASA climatologist is already using the system, and industrial IoT, where data scientists consider large amounts of diverse data to understand complex manufacturing systems. Matz says the company will focus more on those industries next year.

In the life sciences, however, the founders believe they already have a revolutionary product that’s enabling a new world of discoveries. Down the line, they see SciDB and REVEAL contributing to national and worldwide health research that will allow doctors to provide the most informed, personalized care imaginable.

“The query that every doctor wants to run is, when you come into his or her office and display a set of symptoms, the doctor asks, ‘Who in this national database has genetics that look like mine, symptoms that look like mine, lifestyle exposures that look like mine? And what was their diagnosis? What was their treatment? And what was their morbidity?” Stonebraker explains. “This is cross correlating you with everybody else to do very personalized medicine, and I think this is within our grasp.”

Read More

Robots Learning to Move like Animals

Robots Learning to Move like Animals



Quadruped robot learning locomotion skills by imitating a dog.

Whether it’s a dog chasing after a ball, or a monkey swinging through the
trees, animals can effortlessly perform an incredibly rich repertoire of agile
locomotion skills. But designing controllers that enable legged robots to
replicate these agile behaviors can be a very challenging task. The superior
agility seen in animals, as compared to robots, might lead one to wonder: can
we create more agile robotic controllers with less effort by directly imitating
animals?

In this work, we present a framework for learning robotic locomotion skills by
imitating animals. Given a reference motion clip recorded from an animal (e.g.
a dog), our framework uses reinforcement learning to train a control policy
that enables a robot to imitate the motion in the real world. Then, by simply
providing the system with different reference motions, we are able to train a
quadruped robot to perform a diverse set of agile behaviors, ranging from fast
walking gaits to dynamic hops and turns. The policies are trained primarily in
simulation, and then transferred to the real world using a latent space
adaptation technique, which is able to efficiently adapt a policy using only a
few minutes of data from the real robot.

Q&A: Markus Buehler on setting coronavirus and AI-inspired proteins to music

The proteins that make up all living things are alive with music. Just ask Markus Buehler: The musician and MIT professor develops artificial intelligence models to design new proteins, sometimes by translating them into sound. His goal is to create new biological materials for sustainable, non-toxic applications. In a project with the MIT-IBM Watson AI Lab, Buehler is searching for a protein to extend the shelf-life of perishable food. In a new study in Extreme Mechanics Letters, he and his colleagues offer a promising candidate: a silk protein made by honeybees for use in hive building. 

In another recent study, in APL Bioengineering, he went a step further and used AI discover an entirely new protein. As both studies went to print, the Covid-19 outbreak was surging in the United States, and Buehler turned his attention to the spike protein of SARS-CoV-2, the appendage that makes the novel coronavirus so contagious. He and his colleagues are trying to unpack its vibrational properties through molecular-based sound spectra, which could hold one key to stopping the virus. Buehler recently sat down to discuss the art and science of his work.

Q: Your work focuses on the alpha helix proteins found in skin and hair. Why makes this protein so intriguing? 

A: Proteins are the bricks and mortar that make up our cells, organs, and body. Alpha helix proteins are especially important. Their spring-like structure gives them elasticity and resilience, which is why skin, hair, feathers, hooves, and even cell membranes are so durable. But they’re not just tough mechanically, they have built-in antimicrobial properties. With IBM, we’re trying to harness this biochemical trait to create a protein coating that can slow the spoilage of quick-to-rot foods like strawberries.

Q: How did you enlist AI to produce this silk protein?

A: We trained a deep learning model on the Protein Data Bank, which contains the amino acid sequences and three-dimensional shapes of about 120,000 proteins. We then fed the model a snippet of an amino acid chain for honeybee silk and asked it to predict the protein’s shape, atom-by-atom. We validated our work by synthesizing the protein for the first time in a lab — a first step toward developing a thin antimicrobial, structurally-durable coating that can be applied to food. My colleague, Benedetto Marelli, specializes in this part of the process. We also used the platform to predict the structure of proteins that don’t yet exist in nature. That’s how we designed our entirely new protein in the APL Bioengineering study. 

Q: How does your model improve on other protein prediction methods? 

A: We use end-to-end prediction. The model builds the protein’s structure directly from its sequence, translating amino acid patterns into three-dimensional geometries. It’s like translating a set of IKEA instructions into a built bookshelf, minus the frustration. Through this approach, the model effectively learns how to build a protein from the protein itself, via the language of its amino acids. Remarkably, our method can accurately predict protein structure without a template. It outperforms other folding methods and is significantly faster than physics-based modeling. Because the Protein Data Bank is limited to proteins found in nature, we needed a way to visualize new structures to make new proteins from scratch.

Q: How could the model be used to design an actual protein?

A: We can build atom-by-atom models for sequences found in nature that haven’t yet been studied, as we did in the APL Bioengineering study using a different method. We can visualize the protein’s structure and use other computational methods to assess its function by analyzing its stablity and the other proteins it binds to in cells. Our model could be used in drug design or to interfere with protein-mediated biochemical pathways in infectious disease.

Q: What’s the benefit of translating proteins into sound?

A: Our brains are great at processing sound! In one sweep, our ears pick up all of its hierarchical features: pitch, timbre, volume, melody, rhythm, and chords. We would need a high-powered microscope to see the equivalent detail in an image, and we could never see it all at once. Sound is such an elegant way to access the information stored in a protein. 

Typically, sound is made from vibrating a material, like a guitar string, and music is made by arranging sounds in hierarchical patterns. With AI we can combine these concepts, and use molecular vibrations and neural networks to construct new musical forms. We’ve been working on methods to turn protein structures into audible representations, and translate these representations into new materials. 

Q: What can the sonification of SARS-CoV-2’s “spike” protein tell us?

A: Its protein spike contains three protein chains folded into an intriguing pattern. These structures are too small for the eye to see, but they can be heard. We represented the physical protein structure, with its entangled chains, as interwoven melodies that form a multi-layered composition. The spike protein’s amino acid sequence, its secondary structure patterns, and its intricate three-dimensional folds are all featured. The resulting piece is a form of counterpoint music, in which notes are played against notes. Like a symphony, the musical patterns reflect the protein’s intersecting geometry realized by materializing its DNA code.

Q: What did you learn?

A: The virus has an uncanny ability to deceive and exploit the host for its own multiplication. Its genome hijacks the host cell’s protein manufacturing machinery, and forces it to replicate the viral genome and produce viral proteins to make new viruses. As you listen, you may be surprised by the pleasant, even relaxing, tone of the music. But it tricks our ear in the same way the virus tricks our cells. It’s an invader disguised as a friendly visitor. Through music, we can see the SARS-CoV-2 spike from a new angle, and appreciate the urgent need to learn the language of proteins.  

Q: Can any of this address Covid-19, and the virus that causes it?

A: In the longer term, yes. Translating proteins into sound gives scientists another tool to understand and design proteins. Even a small mutation can limit or enhance the pathogenic power of SARS-CoV-2. Through sonification, we can also compare the biochemical processes of its spike protein with previous coronaviruses, like SARS or MERS. 

In the music we created, we analyzed the vibrational structure of the spike protein that infects the host. Understanding these vibrational patterns is critical for drug design and much more. Vibrations may change as temperatures warm, for example, and they may also tell us why the SARS-CoV-2 spike gravitates toward human cells more than other viruses. We’re exploring these questions in current, ongoing research with my graduate students. 

We might also use a compositional approach to design drugs to attack the virus. We could search for a new protein that matches the melody and rhythm of an antibody capable of binding to the spike protein, interfering with its ability to infect.

Q: How can music aid protein design?

A: You can think of music as an algorithmic reflection of structure. Bach’s Goldberg Variations, for example, are a brilliant realization of counterpoint, a principle we’ve also found in proteins. We can now hear this concept as nature composed it, and compare it to ideas in our imagination, or use AI to speak the language of protein design and let it imagine new structures. We believe that the analysis of sound and music can help us understand the material world better. Artistic expression is, after all, just a model of the world within us and around us.  

Co-authors of the study in Extreme Mechanics Letters are: Zhao Qin, Hui Sun, Eugene Lim and Benedetto Marelli at MIT; and Lingfei Wu, Siyu Huo, Tengfei Ma and Pin-Yu Chen at IBM Research. Co-author of the study in APL Bioengineering is Chi-Hua Yu. Buehler’s sonification work is supported by MIT’s Center for Art, Science and Technology (CAST) and the Mellon Foundation. 

Read More

TensorFlow Lite Core ML delegate enables faster inference on iPhones and iPads

TensorFlow Lite Core ML delegate enables faster inference on iPhones and iPads

Posted by Tei Jeong and Karim Nosseir, Software Engineers

TensorFlow Lite offers options to delegate part of the model inference, or the entire model inference, to accelerators, such as the GPU, DSP, and/or NPU for efficient mobile inference. On Android, you can choose from several delegates: NNAPI, GPU, and the recently added Hexagon delegate. Previously, with Apple’s mobile devices — iPhones and iPads — the only option was the GPU delegate.

When Apple released its machine learning framework Core ML and Neural Engine (a neural processing unit (NPU) in Apple’s Bionic SoC) this allowed TensorFlow Lite to leverage Apple’s hardware.

Neural processing units (NPUs), similar to Google’s Edge TPU and Apple’s Neural Engine, are specialized hardware accelerators designed to accelerate machine learning applications. These chips are designed to speed up model inference on mobile or edge devices and use less power than running inference on the CPU or GPU.

Today, we are excited to announce a new TensorFlow Lite delegate that uses Apple’s Core ML API to run floating-point models faster on iPhones and iPads with the Neural Engine. We are able to see performance gains up to 14x (see details below) for models like MobileNet and Inception V3.

TFLite Core
Figure 1: High-level overview of how a delegate works at runtime. Supported portions of the graph run on the accelerator, while other operations run on the CPU via TensorFlow Lite kernels.

Which devices are supported?

This delegate runs on iOS devices with iOS version 12 or later (including iPadOS). However, to get true performance benefits, it should run on devices with Apple A12 SoC or later (for example, iPhone XS). For older iPhones, you should use the TensorFlow lite GPU delegate to get faster performance.

Which models are supported?

With this initial launch, 32-bit floating point models are supported. A few examples of supported models include, but not limited to, image classification, object detection, object segmentation, and pose estimation models. The delegate supports many compute-heavy ops such as convolutions, though there are certain constraints for some ops. These constraints are checked before delegation in runtime so that unsupported ops automatically fall back to the CPU. The complete list of ops and corresponding restrictions (if any) are in the delegate’s documentation.

Impacts on performance

We tested the delegate with two common float models, MobileNet V2 and Inception V3. Benchmarks were conducted on the iPhone 8+ (A11 SoC), iPhone XS (A12 SoC) and iPhone 11 Pro (A13 SoC), and tested for three delegate options: CPU only (no delegate), GPU, and Core ML delegate. As mentioned before, you can see the accelerated performance on models with A12 SoC and later, but on iPhone 8+ — where Neural Engine is not available for third parties — there is no observed performance gain when using the Core ML delegate with small models. For larger models, performance is similar to GPU delegate.

In addition to model inference latency, we also measured startup latency. Note that accelerated speed comes at a tradeoff with delayed startup. For the Core ML delegate, startup latency increases along with the model size. For example, on smaller models like MobileNet, we observed a startup latency of 200-400ms. On the other hand, for larger models, like Inception V3, the startup latency could be 2-4 seconds. We are working on reducing the startup latency. The delegate also has an impact on the binary size. Using the Core ML delegate may increase the binary size by up to 1MB.

Models

  • MobileNet V2 (1.0, 224, float) [download] : Image Classification
    • Small model. Entire graph runs on Core ML.
  • Inception V3 [download] : Image Classification
    • Large model. Entire graph runs on Core ML.

Devices

  • iPhone 8+ (Apple A11, iOS 13.2.3)
  • iPhone XS (Apple A12, iOS 13.2.3)
  • iPhone 11 Pro (Apple A13, iOS 13.2.3)

Latencies and speed-ups observed for MobileNet V2

Figure 2: Latencies and speed-ups observed for MobileNet V2. All versions use a floating-point model. CPU Baseline denotes two-threaded TensorFlow Lite kernels.
* GPU: Core ML uses CPU and GPU for inference. NPU: Core ML uses CPU and GPU, and NPU(Neural Engine) for inference.

Latencies and speed-ups observed for Inception V3

Figure 3: Latencies and speed-ups observed for Inception V3. All versions use a floating-point model. CPU Baseline denotes two-threaded TensorFlow Lite kernels.
* GPU: Core ML uses CPU and GPU for inference. NPU: Core ML uses CPU and GPU, and NPU(Neural Engine) for inference.

How do I use it?

You simply have to make a call on the TensorFlow Lite Interpreter with an instance of the new delegate. For a detailed explanation, please read the full documentation. You can use either the Swift API (example below) or the C++ API (shown in the documentation) to invoke the TensorFlow Lite delegate during inference.

Swift example

This is how invoking this delegate would look from a typical Swift application. All you have to do is create a new Core ML delegate and pass it to the original interpreter initialization code.

let coreMLDelegate = CoreMLDelegate()
let interpreter = try Interpreter(modelPath: modelPath,
delegates: [coreMLDelegate])

Future work

Over the coming months, we will improve upon the existing delegate with more op coverage and additional optimizations. Support for models trained with post-training float16 quantization is on the roadmap. This will allow acceleration of models with about half the model size and small accuracy loss.
Support for post-training weight quantization (also called dynamic quantization) is also on the roadmap.

Feedback

This was a common feature request that we got from our developers. We are excited to release this and are looking forward to hearing your thoughts. Share your use case directly or on Twitter with hashtags #TFLite and #PoweredByTF. For bugs or issues, please reach out to us on GitHub.

Acknowledgements

Thank you to Tei Jeong, Karim Nosseir, Sachin Joglekar, Jared Duke, Wei Wei, Khanh LeViet.
Note: Core ML, Neural Engine and Bionic SoCs (A12, A13) are products of Apple Inc.Read More