Six from MIT elected to American Academy of Arts and Sciences for 2020

Six MIT faculty members are among more than 250 leaders from academia, business, public affairs, the humanities, and the arts elected to the American Academy of Arts and Sciences, the academy announced Thursday.

One of the nation’s most prestigious honorary societies, the academy is also a leading center for independent policy research. Members contribute to academy publications, as well as studies of science and technology policy, energy and global security, social policy and American institutions, the humanities and culture, and education.

Those elected from MIT this year are:

  • Robert C. Armstrong, Chevron Professor in Chemical Engineering;
  • Dave L. Donaldson, professor of economics;
  • Catherine L. Drennan, professor of biology and chemistry;
  • Ronitt Rubinfeld, professor of electrical engineering and computer science;
  • Joshua B. Tenenbaum, professor of brain and cognitive sciences; and
  • Craig Steven Wilder, Barton L. Weller Professor of History.

“The members of the class of 2020 have excelled in laboratories and lecture halls, they have amazed on concert stages and in surgical suites, and they have led in board rooms and courtrooms,” said academy President David W. Oxtoby. “With today’s election announcement, these new members are united by a place in history and by an opportunity to shape the future through the academy’s work to advance the public good.”

Since its founding in 1780, the academy has elected leading thinkers from each generation, including George Washington and Benjamin Franklin in the 18th century, Maria Mitchell and Daniel Webster in the 19th century, and Toni Morrison and Albert Einstein in the 20th century. The current membership includes more than 250 Nobel and Pulitzer Prize winners.

Reporting tool aims to balance hospitals’ Covid-19 load

As cases of Covid-19 continue to climb in parts of the United States, the number of people seeking treatment is threatening to overwhelm many hospitals, forcing some facilities to ration their care and reserve ventilators, hospital beds, and other limited medical resources for the sickest patients. 

Having a handle on local hospitals’ capacity and resource availability could help balance the load of Covid-19 patients requiring hospitalization across a region, for instance allowing an EMT to send a patient to a facility where they are more likely to be treated quickly. But many states lack real-time data on their current capacity to treat Covid-19 patients. 

A group of researchers in MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL), working with the MIT spinoff Mobi Systems, are aiming to help level demand across the entire health care network by providing real-time updates of hospital resources, which they hope will help patients, EMTs, and physicians quickly decide which facility is best equipped to handle a new patient at any given time.

The team has developed a web app that is now publicly accessible. The interface allows users such as patients, nurses, and doctors to report a hospital’s current status in a number of metrics, from the average wait time (something that a patient may get a sense for as they spend time in a waiting room), to the number of ventilators and ICU beds, which doctors and nurses may be able to approximate.

EMTs can use the app as a map, zooming in by state, county, or city to quickly gauge hospital capacity, and decide which nearby hospitals have available beds where they can send a patient requiring hospitalization. The app can also generate a list of hospitals, prioritized by availability, time of travel, and most recently updated data.
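The prioritized list described above can be pictured as a multi-key sort. The following is a minimal sketch, not the app's actual code: the `HospitalStatus` fields, the tie-breaking order, and the sample values are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class HospitalStatus:
    name: str
    available_beds: int          # latest reported free beds (hypothetical field)
    travel_minutes: float        # estimated travel time from the patient
    minutes_since_update: float  # age of the most recent report


def rank_hospitals(hospitals):
    """Order hospitals: more free beds first, then shorter travel, then fresher data."""
    candidates = [h for h in hospitals if h.available_beds > 0]
    return sorted(
        candidates,
        key=lambda h: (-h.available_beds, h.travel_minutes, h.minutes_since_update),
    )


# Made-up statuses; hospital "A" has no free beds and is dropped from the list.
hospitals = [
    HospitalStatus("A", 0, 5.0, 10.0),
    HospitalStatus("B", 4, 12.0, 30.0),
    HospitalStatus("C", 4, 8.0, 90.0),
    HospitalStatus("D", 9, 25.0, 15.0),
]
ranked_names = [h.name for h in rank_hospitals(hospitals)]
print(ranked_names)  # ['D', 'C', 'B']
```

A real dispatcher would likely trade these criteria off against one another rather than sort strictly by one key at a time; the strict ordering here only keeps the example short.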

“We want to flatten the Covid curve by physical distancing over the course of months,” says MIT graduate Anna Jaffe ’07, CEO of Mobi Systems. “But there’s another curve to flatten, which is this real-time challenge of getting the right patient to the right hospital, in the right moment, to level the load on hospitals and health care workers.”

“Do something”

As the pandemic began to unfold around the world, Jaffe was intrigued by the results of a short hackathon that one Mobi member, Julius Pätzold, recently attended in Germany. The weekend challenge, sponsored by the German government, included a problem to match supply and demand, for instance in a hospital facing a surge in patient visits. 

His team mapped the German hospital infrastructure, including the status of individual hospitals’ capacity, then simulated dispatching patients to hospitals according to a hospital’s capacity, its relative location to a patient, and a patient’s medical needs. The real-time maps developed over this short time suggested such tools would have a positive impact on a patient’s quality of care, specifically in decreasing death rates.

“That intersected with my feeling that I think everyone wants to do something around Covid-19 in response to the current crisis, and not just be cooped up in our respective homes,” says Jaffe, whose company, Mobi Systems, develops tools for large-scale network optimization problems surrounding mobility and hospitality. 

Mobi originally grew out of CSAIL’s Model-based Embedded Robotic Systems group, led by MIT Professor Brian Williams, whose work involves developing autonomous planning tools to help individuals make complex, real-time decisions in the face of uncertainty and risk. 

Jaffe reached out to Williams to help develop a web-based reporting tool for hospitals, to similarly help patients and medical professionals make critical, real-time decisions of where best to send a patient, based on resource availability. 

“Our question was, how can the resources statewide or nationwide be used most effectively, in order to keep the most people healthy,” Williams says. “And for the individual, which hospital will meet their needs, and how do they get there. That’s the exercise we’re tackling here.”

Crowd power

The team’s app is heavily dependent on crowdsourced data, and the willingness of patients and medical professionals to report on various metrics, from a hospital’s current wait time to the approximate number of ICU beds and ventilators available. 

“The reporting options right now are very specific,” Jaffe says. “But what we really want to know is, can your hospital accept a patient right now?” 

A user can enter their role — patient, nurse, or physician — then report on, for instance, a hospital’s average wait time. With a sliding scale, they can rate their confidence in their report before submitting it. 

But what if those users are reporting false or inaccurate data, whether intentionally or not? 

Williams says in order to guard against such uncertainty, the team takes a probabilistic approach. For instance, the app assumes that one user’s reporting of a hospital’s status is one of low confidence, which is initially not weighed heavily in the overall estimation for that metric. They can then incorporate this one data point into all the other reports they’ve received for that metric. If most of those reports have also been rated with low confidence, but report the same result, that estimate, such as of wait time, is automatically weighed more heavily, and therefore rated at a higher confidence overall.  

Additionally, he says if the app receives reports from more trusted sources — for instance, if hospitals make in-house, aggregated data available to the app — those sources would “swamp out” or take higher priority over low-confidence reports of the same metric. 
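One way to picture this kind of probabilistic weighting is sketched below. It is an illustration under stated assumptions, not the team's algorithm: the odds-based weights, the noisy-OR rule for combining confidence, the `tolerance` parameter, and the sample reports are all hypothetical.

```python
def aggregate_reports(reports, tolerance):
    """Combine (value, confidence) reports; confidence must lie strictly in (0, 1).

    Returns (estimate, combined_confidence).
    """
    # Weight each report by its odds, c / (1 - c), so that a highly trusted
    # source dominates many low-confidence ones (an assumed weighting rule).
    weights = [c / (1.0 - c) for _, c in reports]
    estimate = sum(w * v for w, (v, _) in zip(weights, reports)) / sum(weights)

    # Noisy-OR over reports that agree with the estimate: several weak but
    # consistent reports raise the combined confidence above any single one.
    disbelief = 1.0
    for value, confidence in reports:
        if abs(value - estimate) <= tolerance:
            disbelief *= 1.0 - confidence
    return estimate, 1.0 - disbelief


# Hypothetical wait-time reports in minutes, each self-rated at low confidence.
crowd = [(30, 0.2), (32, 0.2), (29, 0.2), (31, 0.2)]
crowd_estimate, crowd_confidence = aggregate_reports(crowd, tolerance=5)

# Adding one trusted, hospital-supplied figure pulls the estimate toward it.
with_trusted = crowd + [(60, 0.99)]
trusted_estimate, trusted_confidence = aggregate_reports(with_trusted, tolerance=5)

print(round(crowd_estimate, 1), round(crowd_confidence, 2))
print(round(trusted_estimate, 1), round(trusted_confidence, 2))
```

In this toy run, four agreeing low-confidence reports yield a combined confidence well above 0.2, and a single high-confidence source moves the estimate almost all the way to its own value, which mirrors the "swamp out" behavior Williams describes.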

The team is testing the app with just such a trustworthy dataset, from the state of Pennsylvania, which for the last several years has had a system in place for hospitals to report resource availability, that is updated at least twice a day. The team has used data from the last week to track Covid-19 visits across the state’s hospital system.

“In this data, you can see that not all hospitals are overrun — there are clear differences in availability,” says MIT graduate Peng Yu SM ’13, PhD ’17, chief technology officer at Mobi, highlighting the potential for distributing patients across a region’s hospitals, to balance resources across a hospital network.

However, most states lack such aggregated, updated information. In most other states, for instance, EMTs either have a handful of default facilities where they typically send patients, or they have to call around to surrounding hospitals to check availability. 

“It’s really about word of mouth — who do you know, and who do you call up,” says Williams, whose nephew is an EMT who has worked in regions with varying decision-making practices. “We’re trying to aggregate that information, to make these recommendations much faster.”

The team is now reaching out to thousands of medical professionals to test-drive the reporting tool, in hopes of boosting the crowdsourcing component for the app, which is now available on any internet-enabled device. To address the pandemic, the team believes that data need to be made available at a faster rate than the virus’ spread. Their hope is that states will follow in Pennsylvania’s footsteps and, for instance, mandate that hospitals report resource data, and provide reporting tools such as the new app to doctors and EMTs. 

“This project is very much for the people, by the people, and will be kept open and free,” Williams says.  

“Unfortunately, it doesn’t feel like this is a flash pandemic,” Jaffe says. “Even in a recovery period, hospitals will have to resume normal care, concurrently with treating Covid-19 over time. Our app may help load balance in that way as well, so hospitals can more effectively predict how many floors they need to quarantine for Covid-19, so that the rest of the hospital can go back to things like having families around a mother giving birth. We aim to really understand how to bring things back to a more normal operational status, while still handling the crisis.”

Shedding light on complex power systems

Marija Ilic — a senior research scientist at the Laboratory for Information and Decision Systems, affiliate of the MIT Institute for Data, Systems, and Society, senior staff in MIT Lincoln Laboratory’s Energy Systems Group, and Carnegie Mellon University professor emerita — is a researcher on a mission: making electric energy systems future-ready.

Since the earliest days of streetcars and public utilities, electric power systems have had a fairly standard structure: for a given area, a few large generation plants produce and distribute electricity to customers. It is a one-directional structure, with the energy plants being the only source of power for many end users.

Today, however, electricity can be generated from many and varied sources — and move through the system in multiple directions. An electric power system may include stands of huge turbines capturing wild ocean winds, for instance. There might be solar farms of a hundred megawatts or more, or houses with solar panels on their roofs that some days make more electricity than occupants need, some days much less. And there are electric cars, their batteries hoarding stored energy overnight. Users may draw electricity from one source or another, or feed it back into the system, all at the same time. Add to that the trend toward open electricity markets, where end users like households can pick and choose the electricity services they buy depending on their needs. How should systems operators integrate all these while keeping the grid stable and ensuring power gets to where it is needed?

To explore this question, Ilic has developed a new way to model complex power systems.

Electric power systems, even traditional ones, are complex and heterogeneous to begin with. They cover wide geographical areas and have legal and political barriers to contend with, such as state borders and energy policies. In addition, all electric power systems have inherent physical limitations. For instance, power does not flow in a set path in an electric grid, but rather along all possible paths connecting supply to demand. To maintain grid stability and quality of service, then, the system must control for the impact of interconnections: a change in supply and demand at one point in a system changes supply and demand for the other points in the system. This means there is much more complexity to manage as new sources of energy (more interconnections) with sometimes unpredictable supply (such as wind or solar power) come into play. Ultimately, however, to maintain stability and quality of service, and to balance supply and demand within the system, it comes down to a relatively simple concept: the power consumed and the rate at which it is consumed (plus whatever is lost along the way), must always equal the power produced and the rate at which it is produced.

Using this simpler concept to manage the complexities and limitations of electric power systems, Ilic is taking a non-traditional approach: She models the systems using information about energy, power, and ramp rate (the rate at which power can increase over time) for each part of the system — distributing decision-making calculations into smaller operational chunks. Doing this streamlines the model but retains information about the system’s physical and temporal structure. “That’s the minimal information you need to exchange. It’s simple and technology-agnostic, but we don’t teach systems that way.”

She believes regulatory organizations such as the Federal Energy Regulatory Commission and North American Electric Reliability Corporation should have standard protocols for such information exchanges, just as internet protocols govern how data is exchanged on the internet. “If you were to [use a standard set of] specifications like: what is your capacity, how much does it vary over time, how much energy do you need and within what power range — the system operator could integrate different sources in a much simpler way than we are doing now.”
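To make the idea of a standard specification concrete, here is a minimal sketch of what such an exchange might contain and how an operator could use it. It is not a DyMonDS interface or any regulator's protocol: the `ResourceSpec` fields, the `can_serve` check, and every number are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class ResourceSpec:
    name: str
    min_power_mw: float      # lowest output; negative means it can absorb power
    max_power_mw: float      # capacity
    ramp_mw_per_min: float   # how fast output can change
    energy_mwh: float        # energy it can deliver over the planning horizon


def can_serve(specs, demand_mw, demand_ramp_mw_per_min, horizon_h):
    """Coarse feasibility check using only the exchanged specifications."""
    power_ok = (
        sum(s.min_power_mw for s in specs)
        <= demand_mw
        <= sum(s.max_power_mw for s in specs)
    )
    ramp_ok = sum(s.ramp_mw_per_min for s in specs) >= demand_ramp_mw_per_min
    energy_ok = sum(s.energy_mwh for s in specs) >= demand_mw * horizon_h
    return power_ok and ramp_ok and energy_ok


# Hypothetical fleet: a conventional plant, a solar farm, and a battery.
fleet = [
    ResourceSpec("plant", 0.0, 100.0, 2.0, 800.0),
    ResourceSpec("solar", 0.0, 30.0, 5.0, 120.0),
    ResourceSpec("battery", -10.0, 10.0, 10.0, 20.0),
]
feasible = can_serve(fleet, demand_mw=110.0, demand_ramp_mw_per_min=6.0, horizon_h=4.0)
overloaded = can_serve(fleet, demand_mw=150.0, demand_ramp_mw_per_min=6.0, horizon_h=4.0)
print(feasible, overloaded)  # True False
```

The check ignores network constraints and losses entirely; its only purpose is to show that energy, power, and ramp rate are enough for a first-pass, technology-agnostic comparison of supply against demand.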

Another important aspect of Ilic’s work is that her models lend themselves to controlling the system with a layer of sensor and communications technologies. This uses a framework she developed called the Dynamic Monitoring and Decision Systems framework, or DyMonDS. The data-enabled decision-making concept has been tested using real data from Portugal’s Azores Islands, and has since been applied to real-world challenges. After so many years, it appears that her new modeling approach fittingly supports DyMonDS design, including the systematic use of many theoretical concepts that the LIDS community uses in its research.

One such challenge included work on Puerto Rico’s power grid. Ilic was the technical lead on a Lincoln Laboratory project on designing future architectures and software to make Puerto Rico’s electric power grid more resilient without adding much more production capacity or cost. Typically, a power grid’s generation capacity is scheduled in a simple, brute-force way, based on weather forecasts and the hottest and coldest days of the year, that doesn’t respond sensitively to real-time needs. Making such a system more resilient would mean spending a lot more on generation and transmission and distribution capacity, whereas a more dynamic system that integrates distributed microgrids could tame the cost, Ilic says: “What we are trying to do is to have systematic frameworks for embedding intelligence into small microgrids serving communities, and having them interact with large-scale power grids. People are realizing that you can make many small microgrids to serve communities rather than relying only on large scale electrical power generation.”

Although this is one of Ilic’s most recent projects, her work on DyMonDS can be traced back four decades, to when she was a student at the University of Belgrade in the former country of Yugoslavia, which sent her to the United States to learn how to use computers to prevent blackouts.

She ended up at Washington University in St. Louis, Missouri, studying with applied mathematician John Zaborszky, a legend in the field who was originally chief engineer of Budapest’s municipal power system before moving to the United States. (“The legend goes that in the morning he would teach courses, and in the afternoon he would go and operate Hungarian power system protection by hand.”) Under Zaborszky, a systems and control expert, Ilic learned to think in abstract terms as well as in terms of physical power systems and technologies. She became fascinated by the question of how to model, simulate, monitor, and control power systems — and that’s where she’s been ever since. (Although, she admits as she uncoils to her full height from behind her desk, her first love was actually playing basketball.)

Ilic first arrived at MIT in 1987 to work with the late professor Fred Schweppe on connecting electricity technologies with electricity markets. She stayed on as a senior research scientist until 2002, when she moved to Carnegie Mellon University (CMU) to lead the multidisciplinary Electric Energy Systems Group there. In 2018, after her consulting work for Lincoln Lab ramped up, she retired from CMU to move back to the familiar environs of Cambridge, Massachusetts. CMU’s loss has been MIT’s gain: In fall 2019, Ilic taught a course in modeling, simulation, and control of electric energy systems, applying her work on streamlined models that use pared-down information.

Addressing the evolving needs of electric power systems has not been a “hot” topic, historically. Traditional power systems are often seen by the academic community as legacy technology with no fundamentally new developments. And yet when new software and systems are developed to help integrate distributed energy generation and storage, commercial systems operators regard them as untested and disruptive. “I’ve always been a bit on the sidelines from mainstream power and electrical engineering because I’m interested in some of these things,” she remarks.

However, Ilic’s work is becoming increasingly urgent. Much of today’s power system is physically very old and will need to be retired and replaced over the next decade. This presents an opportunity for innovation: the next generation of electric energy systems could be built to integrate renewable and distributed energy resources at scale — addressing the pressing challenge of climate change and making way for further progress.

“That’s why I’m still working, even though I should be retired.” She smiles. “It supports the evolution of the system to something better.”

Reducing the carbon footprint of artificial intelligence

Artificial intelligence has become a focus of certain ethical concerns, but it also has some major sustainability issues. 

Last June, researchers at the University of Massachusetts at Amherst released a startling report estimating that the power required to train and search a certain neural network architecture results in emissions of roughly 626,000 pounds of carbon dioxide. That’s equivalent to nearly five times the lifetime emissions of the average U.S. car, including its manufacturing.

This issue gets even more severe in the model deployment phase, where deep neural networks need to be deployed on diverse hardware platforms, each with different properties and computational resources. 

MIT researchers have developed a new automated AI system for training and running certain neural networks. Results indicate that, by improving the computational efficiency of the system in some key ways, the system can cut down the pounds of carbon emissions involved — in some cases, down to low triple digits. 

The researchers’ system, which they call a once-for-all network, trains one large neural network comprising many pretrained subnetworks of different sizes that can be tailored to diverse hardware platforms without retraining. This dramatically reduces the energy usually required to train each specialized neural network for new platforms — which can include billions of internet of things (IoT) devices. Using the system to train a computer-vision model, they estimated that the process required roughly 1/1,300 the carbon emissions compared to today’s state-of-the-art neural architecture search approaches, while reducing the inference time by 1.5-2.6 times. 
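As a rough check on the "low triple digits" figure, the arithmetic can be worked through directly. This sketch assumes that the 626,000-pound estimate cited earlier is the baseline for the 1/1,300 comparison, which the article implies but does not state outright.

```python
# Assumption: the UMass Amherst estimate is the baseline being compared against.
baseline_lbs_co2 = 626_000
reduction_factor = 1_300  # "roughly 1/1,300" of the baseline emissions

ofa_lbs_co2 = baseline_lbs_co2 / reduction_factor
print(f"{ofa_lbs_co2:.0f} lbs CO2")  # 482 lbs CO2
```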

“The aim is smaller, greener neural networks,” says Song Han, an assistant professor in the Department of Electrical Engineering and Computer Science. “Searching efficient neural network architectures has until now had a huge carbon footprint. But we reduced that footprint by orders of magnitude with these new methods.”

The work was carried out on Satori, an efficient computing cluster donated to MIT by IBM that is capable of performing 2 quadrillion calculations per second. The paper is being presented next week at the International Conference on Learning Representations. Joining Han on the paper are four undergraduate and graduate students from EECS, MIT-IBM Watson AI Lab, and Shanghai Jiao Tong University. 

Creating a “once-for-all” network

The researchers built the system on a recent AI advance called AutoML (for automatic machine learning), which eliminates manual network design. Neural networks automatically search massive design spaces for network architectures tailored, for instance, to specific hardware platforms. But there’s still a training efficiency issue: Each model has to be selected then trained from scratch for its platform architecture. 

“How do we train all those networks efficiently for such a broad spectrum of devices — from a $10 IoT device to a $600 smartphone? Given the diversity of IoT devices, the computation cost of neural architecture search will explode,” Han says.   

The researchers invented an AutoML system that trains only a single, large “once-for-all” (OFA) network that serves as a “mother” network, nesting an extremely high number of subnetworks that are sparsely activated from the mother network. OFA shares all its learned weights with all subnetworks — meaning they come essentially pretrained. Thus, each subnetwork can operate independently at inference time without retraining. 
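Weight sharing of this kind can be illustrated with a toy layer. The sketch below is not the OFA implementation: selecting the leading units of a shared weight table is an assumed, simplified scheme, and the weights are made up.

```python
# Shared "mother" weights for one layer: 4 output units x 4 input units.
shared_weights = [
    [0.1, 0.2, 0.3, 0.4],
    [0.5, 0.6, 0.7, 0.8],
    [0.9, 1.0, 1.1, 1.2],
    [1.3, 1.4, 1.5, 1.6],
]


def subnetwork_forward(inputs, width):
    """Apply only the first `width` units, reading weights from the shared table."""
    active_inputs = inputs[:width]
    return [
        sum(w * x for w, x in zip(row[:width], active_inputs))
        for row in shared_weights[:width]
    ]


inputs = [1.0, 1.0, 1.0, 1.0]
small_out = subnetwork_forward(inputs, width=2)  # a narrow subnetwork
full_out = subnetwork_forward(inputs, width=4)   # the full layer

# Updating a shared weight changes every subnetwork that uses it,
# because no subnetwork keeps a private copy.
shared_weights[0][0] = 1.0
small_out_after_update = subnetwork_forward(inputs, width=2)
print(len(small_out), len(full_out), small_out_after_update[0])
```

Because every subnetwork reads from the same table, training the large network also trains the small ones, which is the sense in which the subnetworks come "essentially pretrained."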

The team trained an OFA convolutional neural network (CNN) — commonly used for image-processing tasks — with versatile architectural configurations, including different numbers of layers and “neurons,” diverse filter sizes, and diverse input image resolutions. Given a specific platform, the system uses the OFA as the search space to find the best subnetwork based on the accuracy and latency tradeoffs that correlate to the platform’s power and speed limits. For an IoT device, for instance, the system will find a smaller subnetwork. For smartphones, it will select larger subnetworks, but with different structures depending on individual battery lifetimes and computation resources. OFA decouples model training and architecture search, and spreads the one-time training cost across many inference hardware platforms and resource constraints. 
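The platform-specific search can be thought of as a constrained selection: among subnetworks that meet a device's latency budget, pick the most accurate. The sketch below assumes accuracy and latency have already been estimated for each candidate; the candidate names and numbers are invented, and the real search space is far too large to enumerate this way.

```python
def pick_subnetwork(candidates, latency_limit_ms):
    """Return the most accurate candidate whose latency fits the budget, or None."""
    feasible = [c for c in candidates if c["latency_ms"] <= latency_limit_ms]
    if not feasible:
        return None
    return max(feasible, key=lambda c: c["accuracy"])


# Hypothetical subnetworks with made-up accuracy and latency estimates.
candidates = [
    {"name": "tiny", "accuracy": 0.70, "latency_ms": 8.0},
    {"name": "small", "accuracy": 0.74, "latency_ms": 15.0},
    {"name": "medium", "accuracy": 0.77, "latency_ms": 30.0},
    {"name": "large", "accuracy": 0.80, "latency_ms": 60.0},
]

iot_choice = pick_subnetwork(candidates, latency_limit_ms=10.0)
phone_choice = pick_subnetwork(candidates, latency_limit_ms=40.0)
print(iot_choice["name"], phone_choice["name"])  # tiny medium
```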

This relies on a “progressive shrinking” algorithm that efficiently trains the OFA network to support all of the subnetworks simultaneously. It starts with training the full network with the maximum size, then progressively shrinks the sizes of the network to include smaller subnetworks. Smaller subnetworks are trained with the help of large subnetworks to grow together. In the end, all of the subnetworks with different sizes are supported, allowing fast specialization based on the platform’s power and speed limits. It supports many hardware devices with zero training cost when adding a new device.

In total, one OFA, the researchers found, can comprise more than 10 quintillion — that’s a 1 followed by 19 zeroes — architectural settings, covering probably all platforms ever needed. But training the OFA and searching it ends up being far more efficient than spending hours training each neural network per platform. Moreover, OFA does not compromise accuracy or inference efficiency. Instead, it provides state-of-the-art ImageNet accuracy on mobile devices. And, compared with state-of-the-art industry-leading CNN models, the researchers say OFA provides 1.5-2.6 times speedup, with superior accuracy.

“That’s a breakthrough technology,” Han says. “If we want to run powerful AI on consumer devices, we have to figure out how to shrink AI down to size.”

“The model is really compact. I am very excited to see OFA can keep pushing the boundary of efficient deep learning on edge devices,” says Chuang Gan, a researcher at the MIT-IBM Watson AI Lab and co-author of the paper.

“If rapid progress in AI is to continue, we need to reduce its environmental impact,” says John Cohn, an IBM fellow and member of the MIT-IBM Watson AI Lab. “The upside of developing methods to make AI models smaller and more efficient is that the models may also perform better.”

Making Decision Trees Accurate Again: Explaining What Explainable AI Did Not

The interpretability of neural networks is becoming increasingly necessary, as
deep learning is being adopted in settings where accurate and justifiable
predictions are required. These applications range from finance to medical
imaging. However, deep neural networks are notorious for a lack of
justification. Explainable AI (XAI) attempts to bridge this divide between
accuracy and interpretability, but as we explain below, XAI justifies
decisions without interpreting the model directly.

Jim Collins receives funding to harness AI for drug discovery

Housed at TED and supported by leading social impact advisor The Bridgespan Group, The Audacious Project is a collaborative funding initiative that’s catalyzing social impact on a grand scale by convening funders and social entrepreneurs, with the goal of supporting bold solutions to the world’s most urgent challenges.

Among this year’s carefully selected change-makers are Jim Collins and a team at MIT’s Abdul Latif Jameel Clinic for Machine Learning in Health (Jameel Clinic), including co-principal investigator Regina Barzilay. The funding provided through The Audacious Project will support the response to the antibiotic resistance crisis through the development of new classes of antibiotics to protect patients against some of the world’s deadliest bacterial pathogens.

“The work of Jim Collins and his colleagues is more relevant now than ever before,” says Anantha P. Chandrakasan, dean of the MIT School of Engineering and the Vannevar Bush Professor of Electrical Engineering and Computer Science. “We are grateful for the commitment from The Audacious Project and its contributors, to both support and foster the research around AI and drug discovery, and to join our efforts in the School of Engineering to realize the potential global impact of this incredible work.” 

Collins’ and Barzilay’s Antibiotics-AI Project seeks to produce the first new classes of antibiotics society has seen in three decades, by calling in an interdisciplinary team of world-class bioengineers, microbiologists, computer scientists, and chemists.

Collins is the Termeer Professor of Medical Engineering and Science in MIT’s Institute for Medical Engineering and Science (IMES) and the Department of Biological Engineering, faculty co-lead of Jameel Clinic, faculty lead of the MIT-Takeda Program, and a member of the Harvard-MIT Health Sciences and Technology faculty. He is also a core founding faculty member of the Wyss Institute for Biologically Inspired Engineering at Harvard University and an Institute member of the Broad Institute of MIT and Harvard.

Barzilay is the Delta Electronics Professor in MIT’s Department of Electrical Engineering and Computer Science, faculty co-lead of Jameel Clinic, and a member of the Computer Science and Artificial Intelligence Laboratory at MIT.

Earlier this year, Collins and Barzilay, along with Tommi Jaakkola, the Thomas Siebel Professor of Electrical Engineering and Computer Science and the Institute for Data, Systems, and Society, and postdoc Jonathan Stokes, were part of a research team that successfully used a deep-learning model to identify a new antibiotic. Over the next seven years, The Audacious Project’s commitment will support Collins and Barzilay as they continue to use the same process to rapidly explore over a billion molecules to identify and design novel antibiotics.

With lidar and artificial intelligence, road status clears up after a disaster

Consider the days after a hurricane strikes. Trees and debris are blocking roads, bridges are destroyed, and sections of roadway are washed out. Emergency managers soon face a bevy of questions: How can supplies get delivered to certain areas? What’s the best route for evacuating survivors? Which roads are too damaged to remain open?

Without concrete data on the state of the road network, emergency managers often have to base their answers on incomplete information. The Humanitarian Assistance and Disaster Relief Systems Group at MIT Lincoln Laboratory hopes to use its airborne lidar platform, paired with artificial intelligence (AI) algorithms, to fill this information gap.  

“For a truly large-scale catastrophe, understanding the state of the transportation system as early as possible is critical,” says Chad Council, a researcher in the group. “With our particular approach, you can determine road viability, do optimal routing, and also get quantified road damage. You fly it, you run it, you’ve got everything.”

Since the 2017 hurricane season, the team has been flying its advanced lidar platform over stricken cities and towns. Lidar works by pulsing photons down over an area and measuring the time it takes for each photon to bounce back to the sensor. These time-of-arrival data points paint a 3D “point cloud” map of the landscape — every road, tree, and building — to within about a foot of accuracy.
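The time-of-flight principle reduces to a one-line formula: the range is the speed of light times the round-trip time, divided by two. The sketch below uses the vacuum speed of light and ignores atmospheric effects and sensor geometry, which a real processing chain would have to account for.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0  # vacuum value; air is slightly slower


def range_from_round_trip(round_trip_s):
    """Distance to the reflecting surface from a photon's round-trip time."""
    # The photon travels out and back, so the one-way distance is half the path.
    return SPEED_OF_LIGHT_M_PER_S * round_trip_s / 2.0


# A surface 1,000 meters away returns a photon after about 6.67 microseconds.
round_trip_s = 2_000.0 / SPEED_OF_LIGHT_M_PER_S
distance_m = range_from_round_trip(round_trip_s)
print(f"{round_trip_s * 1e6:.2f} microseconds -> {distance_m:.1f} m")
```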

To date, they’ve mapped huge swaths of the Carolinas, Florida, Texas, and all of Puerto Rico. In the immediate aftermath of hurricanes in those areas, the team manually sifted through the data to help the Federal Emergency Management Agency (FEMA) find and quantify damage to roads, among other tasks. The team’s focus now is on developing AI algorithms that can automate these processes and find ways to route around damage.

What’s the road status?

Information about the road network after a disaster comes to emergency managers in a “mosaic of different information streams,” Council says, namely satellite images, aerial photographs taken by the Civil Air Patrol, and crowdsourcing from vetted sources.

“These various efforts for acquiring data are important because every situation is different. There might be cases when crowdsourcing is fastest, and it’s good to have redundancy. But when you consider the scale of disasters like Hurricane Maria on Puerto Rico, these various streams can be overwhelming, incomplete, and difficult to coalesce,” he says.

During these times, lidar can act as an all-seeing eye, providing a big-picture map of an area and also granular details on road features. The laboratory’s platform is especially advanced because it uses Geiger-mode lidar, which is sensitive to a single photon. As such, its sensor can collect each of the millions of photons that trickle through openings in foliage as the system is flown overhead. This foliage can then be filtered out of the lidar map, revealing roads that would otherwise be hidden from aerial view.

To provide the status of the road network, the lidar map is first run through a neural network. This neural network is trained to find and extract the roads, and to determine their widths. Then, AI algorithms search these roads and flag anomalies that indicate the roads are impassable. For example, a cluster of lidar points extending up and across a road is likely a downed tree. A sudden drop in the elevation is likely a hole or washed out area in a road.
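The two examples above, a cluster of points rising above the road and a sudden drop in elevation, can be sketched as simple threshold tests on an elevation profile. This is an illustration only, not the laboratory's algorithm: it assumes a roughly level road segment, uses the median height as the road surface, and the thresholds and sample heights are made up.

```python
from statistics import median


def flag_anomalies(heights_m, drop_m=0.5, obstacle_m=0.5):
    """Label each sample of a road elevation profile.

    heights_m: highest lidar return in each cell along the road centerline.
    """
    reference = median(heights_m)  # assumed road surface for a level segment
    flags = []
    for height in heights_m:
        if height - reference > obstacle_m:
            flags.append("obstruction")  # e.g. a downed tree across the road
        elif reference - height > drop_m:
            flags.append("washout")      # e.g. a hole or washed-out section
        else:
            flags.append("clear")
    return flags


# Hypothetical profile in meters: one raised cluster and one sudden drop.
profile = [10.0, 10.1, 10.0, 12.5, 10.0, 8.2, 10.1]
flags = flag_anomalies(profile)
print(flags)
```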

The extracted road network, with its flagged anomalies, is then merged with an OpenStreetMap of the area (an open-access map similar to Google Maps). Emergency managers can use this system to plan routes, or in other cases to identify isolated communities — those that are cut off from the road network. The system will show them the most efficient route between two specified locations, finding detours around impassable roads. Users can also specify how important it is to stay on the road; on the basis of that input, the system provides routes through parking lots or fields.  
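
A minimal sketch of this kind of routing, assuming a small graph of road and off-road segments, is shown below. It uses Dijkstra's algorithm as a stand-in; the article does not say which routing method the team uses, and the edge format, the `plan_route` name, and the `offroad_penalty` parameter are hypothetical.

```python
import heapq


def plan_route(edges, start, goal, blocked=frozenset(), offroad_penalty=5.0):
    """Find the cheapest path from start to goal, skipping impassable segments.

    edges:           iterable of (node_a, node_b, length, kind), kind is
                     "road" or "offroad" (assumed format).
    blocked:         set of (node_a, node_b) pairs flagged as impassable.
    offroad_penalty: cost multiplier for off-road segments; larger values
                     mean staying on the road matters more.
    Returns the list of nodes on the route, or None if goal is cut off.
    """
    graph = {}
    for a, b, length, kind in edges:
        if (a, b) in blocked or (b, a) in blocked:
            continue  # segment flagged as impassable
        cost = length * (offroad_penalty if kind == "offroad" else 1.0)
        graph.setdefault(a, []).append((b, cost))
        graph.setdefault(b, []).append((a, cost))

    best = {start: 0.0}
    previous = {}
    queue = [(0.0, start)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == goal:
            break
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry
        for neighbor, step in graph.get(node, []):
            new_cost = cost + step
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                previous[neighbor] = node
                heapq.heappush(queue, (new_cost, neighbor))

    if goal not in best:
        return None  # goal is isolated from the usable network
    path = [goal]
    while path[-1] != start:
        path.append(previous[path[-1]])
    return path[::-1]


edges = [
    ("A", "B", 1.0, "road"), ("B", "C", 1.0, "road"),
    ("A", "D", 2.0, "road"), ("D", "C", 2.0, "road"),
    ("A", "C", 1.5, "offroad"),  # shortcut across a field
]
print(plan_route(edges, "A", "C"))                        # ['A', 'B', 'C']
print(plan_route(edges, "A", "C", blocked={("B", "C")}))  # ['A', 'D', 'C']
```

Lowering `offroad_penalty` to 1.0 in the second call would send the route across the field instead, which mirrors the user setting for how important it is to stay on the road.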

This process, from extracting roads to finding damage to planning routes, can be applied to the data at the scale of a single neighborhood or across an entire city.

How fast and how accurate?

To gain an idea of how fast this system works, consider that in a recent test, the team flew the lidar platform, processed the data, and got AI-based analytics in 36 hours. That sortie covered an area of 250 square miles, an area about the size of Chicago, Illinois.

But accuracy is as important as speed. “As we incorporate AI techniques into decision support, we’re developing metrics to characterize an algorithm’s performance,” Council says.

For finding roads, the algorithm determines if a point in the lidar point cloud is “road” or “not road.” The team ran a performance evaluation of the algorithm against 50,000 square meters of suburban data, and the resulting ROC curve indicated that the current algorithm provided an 87 percent true positive rate (that is, it correctly labeled a road point as “road”), with a 20 percent false positive rate (that is, it labeled a point as “road” that was not road). The false positives are typically areas that geometrically look like a road but aren’t.
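
For readers unfamiliar with these metrics, the sketch below shows how a true positive rate and a false positive rate are computed for a "road" versus "not road" classifier at one score threshold. Sweeping the threshold and plotting the resulting pairs traces out an ROC curve. The scores and labels are toy values made up for illustration, not the team's data, and `rates_at_threshold` is a hypothetical helper.

```python
def rates_at_threshold(scores, is_road, threshold):
    """Return (true_positive_rate, false_positive_rate) at one threshold.

    scores:    per-point "road" scores from a classifier (toy values here).
    is_road:   ground-truth labels, True where the point really is road.
    threshold: points scoring at or above this are predicted "road".
    """
    tp = fp = fn = tn = 0
    for score, truth in zip(scores, is_road):
        predicted = score >= threshold
        if predicted and truth:
            tp += 1  # road correctly labeled "road"
        elif predicted and not truth:
            fp += 1  # non-road labeled "road"
        elif truth:
            fn += 1  # road missed
        else:
            tn += 1  # non-road correctly rejected
    return tp / (tp + fn), fp / (fp + tn)


scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1, 0.05]
is_road = [True, True, True, True, False, False, False, False, False]
print(rates_at_threshold(scores, is_road, 0.5))  # (0.75, 0.2)
```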

“Because we have another data source for identifying the general location of roads, OpenStreetMaps, these false positives can be excluded, resulting in a highly accurate 3D point cloud representation of the road network,” says Dieter Schuldt, who has been leading the algorithm-testing efforts.

For the algorithm that detects road damage, the team is in the process of further aggregating ground truth data to evaluate its performance. In the meantime, preliminary results have been promising. Their damage-finding algorithm recently flagged for review a potentially blocked road in Bedford, Massachusetts, which appeared to be a hole measuring 10 meters wide by 7 meters long by 1 meter deep. The town’s public works department and a site visit confirmed that construction blocked the road.

“We actually didn’t go in expecting that this particular sortie would capture examples of blocked roads, and it was an interesting find,” says Bhavani Ananthabhotla, a contributor to this work. “With additional ground truth annotations, we hope to not only evaluate and improve performance, but also to better tailor future models to regional emergency management needs, including informing route planning and repair cost estimation.”

The team is continuing to test, train, and tweak their algorithms to improve accuracy. Their hope is that these techniques may soon be deployed to help answer important questions during disaster recovery.

“We picture lidar as a 3D scaffold that other data can be draped over and that can be trusted,” Council says. “The more trust, the more likely an emergency manager, and a community in general, will use it to make the best decisions they can.”


Professor Daniela Rus named to White House science council

This week the White House announced that MIT Professor Daniela Rus, director of MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL), has been selected to serve on the President’s Council of Advisors on Science and Technology (PCAST).

The council provides advice to the White House on topics critical to U.S. security and the economy, including policy recommendations on the future of work, American leadership in science and technology, and the support of U.S. research and development. 

PCAST operates under the aegis of the White House Office of Science and Technology Policy (OSTP), which was established in law in 1976. However, the council has existed more informally going back to Franklin Roosevelt’s Science Advisory Board in 1933.

“I’m grateful to be able to add my perspective as a computer scientist to this group at a time when so many issues involving AI and other aspects of computing raise important scientific and policy questions for the nation and the world,” says Rus.

Rus is the Andrew and Erna Viterbi Professor of Electrical Engineering and Computer Science at MIT and the deputy dean of research for the MIT Stephen A. Schwarzman College of Computing. Her research in robotics, artificial intelligence, and data science focuses primarily on developing the science and engineering of autonomy, with the long-term objective of enabling a future where machines are integrated into daily life to support both cognitive and physical tasks. The applications of her work are broad and include transportation, manufacturing, medicine, and urban planning.

More than a dozen MIT faculty and alumni have served on PCAST during past presidential administrations. These include former MIT president Charles Vest; Institute Professors Phillip Sharp and John Deutch; Ernest Moniz, professor of physics and former U.S. Secretary of Energy; and Eric Lander, director of the Broad Institute of MIT and Harvard and professor of biology, who co-chaired PCAST during the Obama administration. Previous councils have offered advice on topics ranging from data privacy and nanotechnology to job training and STEM education.


AI for Medicine Specialization featuring TensorFlow

Posted by Laurence Moroney, AI Advocate

I’m excited to share an important aspect of the TensorFlow community: educators and domain experts teaching developers how to use machine learning to solve important problems in a variety of fields, including health care. To this end, deeplearning.ai and Coursera have launched an “AI for Medicine” specialization using TensorFlow.

Nothing excites our team more than seeing how others use TensorFlow to solve real-world problems. With this three-course specialization, introduced by Andrew Ng and taught by Pranav Rajpurkar, we hope to widen access so that more people can understand the needs of medical machine learning problems. The first course, Machine Learning for Medical Diagnosis, takes you through some hypothetical machine learning scenarios for diagnosing medical issues. In the first week, you’ll explore scenarios like detecting skin cancer, eye disease, and histopathology. You’ll get hands-on experience writing TensorFlow code that uses convolutional neural networks to examine images, which can be used, for example, to identify different conditions in an X-ray.

The course does require some knowledge of TensorFlow, using techniques such as convolutional neural networks, transfer learning, natural language processing, and more. I recommend that you take the TensorFlow: In Practice specialization to understand the coding skills behind it, and the Deep Learning Specialization to go deeper into how the underlying technology works. Another great resource for learning the techniques used in this course is the book “Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow” by Aurélien Géron.

One of the things I really enjoyed about the course is the balance of medical terminology and using common machine learning techniques from TensorFlow, such as data augmentation, to improve your models. Note: all of the data used in the course is de-identified.

Exercises from Rajpurkar’s and Ng’s course: Using image augmentation to extend the effective size of a dataset.
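
To give a feel for what image augmentation does, here is a framework-free sketch in plain Python. It is not taken from the course, which uses TensorFlow; the helper names, the choice of transforms, and the tiny 2x3 "image" are assumptions for illustration only.

```python
def shift_right(image, pixels, fill=0):
    """Shift every row to the right, padding the left edge with `fill`."""
    return [[fill] * pixels + row[:len(row) - pixels] for row in image]


def scale_brightness(image, factor, max_value=255):
    """Multiply pixel intensities by `factor`, clipping at `max_value`."""
    return [[min(max_value, round(value * factor)) for value in row]
            for row in image]


def augment(image):
    """Return the original image plus two slightly altered copies."""
    return [image, shift_right(image, 1), scale_brightness(image, 1.2)]


image = [[0, 100, 200],
         [50, 150, 250]]
for variant in augment(image):
    print(variant)
```

Each variant keeps the same label as the original, so one annotated image yields several training examples. In medical imaging the transforms need to be chosen with care so that they do not alter the clinical meaning of the image.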

The course continues with evaluation metrics: how to pick out the key ones and how to interpret confidence intervals accurately.
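
As an illustration of what a confidence interval on an evaluation metric looks like, the sketch below computes a Wilson score interval for a proportion such as sensitivity. The post does not say which interval method the course teaches; the Wilson interval is one common choice, and the counts in the example are made up.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives roughly 95%).

    successes: e.g. number of diseased cases the model correctly flagged.
    total:     e.g. number of diseased cases in the test set.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z ** 2 / total
    center = (p + z ** 2 / (2 * total)) / denom
    half_width = z * math.sqrt(
        p * (1 - p) / total + z ** 2 / (4 * total ** 2)
    ) / denom
    return center - half_width, center + half_width


# Same observed sensitivity (90 percent), different test-set sizes.
print(wilson_interval(45, 50))
print(wilson_interval(450, 500))
```

The second interval is narrower than the first even though the observed sensitivity is identical, which is the main thing to keep in mind when comparing models evaluated on test sets of different sizes.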

The first course ends with another deep dive into image processing, this time using segmentation on MRI images, and finishes with a programming assignment on automatic brain tumor segmentation in MRIs.

The second course in the specialization will be on Machine Learning for Medical Prognosis, where you learn to build models to predict future patient health. You’ll learn techniques to extract data from reports, such as a patient’s health metrics, history, and demographics, to predict their risk of a major event such as a heart attack.

The third, and final, course will be on Machine Learning for Medical Treatment, where models may be used to assist in medical care to predict what the potential effect of a medical treatment might be on a patient. It will also go into using machine learning for text so that you can use NLP techniques to extract information from radiography reports to get labels or get the basis for a bot for answering medical questions.

In the words of Andrew Ng, “Even if your current work is not in medicine, I think you will find the application scenarios and the practice of these scenarios to be really useful, and maybe this specialization will inspire you to get more interested in medicine.”

The specialization is available at Coursera and, like all courses there, can be audited for free. You can learn more about the specialization and about TensorFlow on their respective websites.

PyTorch library updates including new model serving library

Along with the PyTorch 1.5 release, we are announcing new libraries for high-performance PyTorch model serving and tight integration with TorchElastic and Kubernetes. Additionally, we are releasing updated packages for torch_xla (Google Cloud TPUs), torchaudio, torchvision, and torchtext. All of these new libraries and enhanced capabilities are available today and accompany all of the core features released in PyTorch 1.5.

TorchServe (Experimental)

TorchServe is a flexible and easy-to-use library for serving PyTorch models in production with high performance at scale. It is cloud and environment agnostic and supports features such as multi-model serving, logging, metrics, and the creation of RESTful endpoints for application integration. TorchServe was jointly developed by engineers from Facebook and AWS with feedback and engagement from the broader PyTorch community. The experimental release of TorchServe is available today. Some of the highlights include:

  • Support for both Python-based and TorchScript-based models
  • Default handlers for common use cases (e.g., image segmentation, text classification) as well as the ability to write custom handlers for other use cases
  • Model versioning, the ability to run multiple versions of a model at the same time, and the ability to roll back to an earlier version
  • The ability to package a model, learned weights, and supporting files (e.g., class mappings, vocabularies) into a single, persistent artifact (a.k.a. the “model archive”)
  • Robust management capability, allowing full configuration of models, versions, and individual worker threads via command line, config file, or run-time API
  • Automatic batching of individual inferences across HTTP requests
  • Logging including common metrics, and the ability to incorporate custom metrics
  • Ready-made Dockerfile for easy deployment
  • HTTPS support for secure deployment
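
To illustrate the automatic batching idea from the list above, here is a simplified, single-threaded sketch that groups incoming requests into batches by size and by waiting time. It is not TorchServe's implementation; the `batch_requests` function and its `batch_size` and `max_delay_ms` parameters are hypothetical names chosen for this example.

```python
def batch_requests(requests, batch_size, max_delay_ms):
    """Group requests into batches for a single model call each.

    requests:     list of (arrival_ms, payload), sorted by arrival time.
    batch_size:   maximum number of payloads per batch.
    max_delay_ms: a batch is closed once a new request arrives more than
                  this long after the batch was opened.
    """
    batches = []
    current = []
    opened_at = None
    for arrival_ms, payload in requests:
        # Close a batch that has been waiting too long.
        if current and arrival_ms - opened_at > max_delay_ms:
            batches.append(current)
            current = []
        if not current:
            opened_at = arrival_ms
        current.append(payload)
        # Close a batch that is full.
        if len(current) == batch_size:
            batches.append(current)
            current = []
    if current:
        batches.append(current)
    return batches


requests = [(0, "a"), (2, "b"), (3, "c"), (50, "d"), (51, "e")]
print(batch_requests(requests, batch_size=2, max_delay_ms=10))
# [['a', 'b'], ['c'], ['d', 'e']]
```

The trade-off is the usual one: larger batches use the hardware more efficiently, while a shorter maximum delay keeps latency low for requests that arrive during quiet periods.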

To learn more about the APIs and the design of this feature, see the links below:

  • A full multi-node deployment reference architecture is also available.
  • The full documentation can be found here.

TorchElastic integration with Kubernetes (Experimental)

TorchElastic is a proven library for training large-scale deep neural networks within companies like Facebook, where the ability to dynamically adapt to server availability and scale as new compute resources come online is critical. Kubernetes enables customers using machine learning frameworks like PyTorch to run training jobs distributed across fleets of powerful GPU instances like the Amazon EC2 P3. Distributed training jobs, however, are not fault-tolerant, and a job cannot continue if a node failure or reclamation interrupts training. Further, jobs cannot start without acquiring all required resources, or scale up and down without being restarted. This lack of resiliency and flexibility results in increased training time and costs from idle resources. TorchElastic addresses these limitations by enabling distributed training jobs to be executed in a fault-tolerant and elastic manner. Until today, Kubernetes users needed to manually manage the Pods and Services required for TorchElastic training jobs.

Through the joint collaboration of engineers at Facebook and AWS, TorchElastic, adding elasticity and fault tolerance, is now supported using vanilla Kubernetes and through the managed EKS service from AWS.

To learn more, see the TorchElastic repo for the controller implementation and docs on how to use it.
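
The cost of a job that is not fault-tolerant can be illustrated with a toy simulation. The sketch below is not TorchElastic's API; `train_with_restarts` and its parameters are hypothetical, and it only counts how many training steps are executed when a simulated node failure forces a rollback to the last checkpoint.

```python
def train_with_restarts(total_steps, checkpoint_every, failures):
    """Count the steps executed to finish training despite failures.

    total_steps:      number of steps the job must complete.
    checkpoint_every: a checkpoint is saved after every this many steps.
    failures:         step numbers at which a simulated failure occurs
                      (each failure happens once).
    """
    pending = set(failures)
    checkpoint = 0  # last step with saved state
    executed = 0    # total work done, including work that was lost
    step = 0
    while step < total_steps:
        step += 1
        executed += 1
        if step in pending:
            pending.discard(step)
            step = checkpoint  # roll back to the last saved state
            continue
        if step % checkpoint_every == 0:
            checkpoint = step
    return executed


# One failure at step 7 of a 10-step job.
print(train_with_restarts(10, checkpoint_every=5, failures={7}))    # 12
print(train_with_restarts(10, checkpoint_every=100, failures={7}))  # 17
```

With a checkpoint at step 5, the failure costs two repeated steps. With no usable checkpoint, the job starts over from scratch, which is the wasted time and compute that elastic, fault-tolerant training aims to avoid.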

torch_xla 1.5 now available

torch_xla is a Python package that uses the XLA linear algebra compiler to accelerate the PyTorch deep learning framework on Cloud TPUs and Cloud TPU Pods. torch_xla aims to let PyTorch users do on Cloud TPUs everything they can do on GPUs, while minimizing changes to the user experience. The project began with a conversation at NeurIPS 2017 and gathered momentum in 2018 when teams from Facebook and Google came together to create a proof of concept. We announced this collaboration at PTDC 2018 and made the PyTorch/XLA integration broadly available at PTDC 2019. The project already has 28 contributors, nearly 2k commits, and a repo that has been forked more than 100 times.

This release of torch_xla is aligned and tested with PyTorch 1.5 to reduce friction for developers and to provide a stable and mature PyTorch/XLA stack for training models using Cloud TPU hardware. You can try it for free in your browser on an 8-core Cloud TPU device with Google Colab, and you can use it at a much larger scale on Google Cloud.

See the full torch_xla release notes here. Full docs and tutorials can be found here and here.

PyTorch Domain Libraries

torchaudio, torchvision, and torchtext complement PyTorch with common datasets, models, and transforms in each domain area. We’re excited to share new releases for all three domain libraries alongside PyTorch 1.5 and the rest of the library updates. For this release, all three domain libraries are removing support for Python 2 and will support Python 3 only.

torchaudio 0.5

The torchaudio 0.5 release includes new transforms, functionals, and datasets. Highlights for the release include:

  • Added the Griffin-Lim functional and transform, InverseMelScale and Vol transforms, and DB_to_amplitude.
  • Added support for allpass, fade, bandpass, bandreject, band, treble, deemph, and riaa filters and transformations.
  • New datasets added including LJSpeech and SpeechCommands datasets.
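
For readers new to audio processing, the arithmetic behind a decibel-to-amplitude conversion such as DB_to_amplitude can be sketched in plain Python. This is an illustration of the standard formula, not torchaudio's code; the function name and default values below are assumptions, and the input is assumed to be on a power-decibel scale.

```python
import math


def db_to_amplitude(db, ref=1.0, power=0.5):
    """Convert a decibel value back to a linear scale.

    db:    value in decibels (assumed to be a power-dB quantity).
    ref:   reference value that corresponds to 0 dB.
    power: 0.5 returns an amplitude, 1.0 returns a power value.
    """
    return ref * (10.0 ** (0.1 * db)) ** power


print(db_to_amplitude(20.0))             # about 10: 20 dB is 100x power, 10x amplitude
print(db_to_amplitude(20.0, power=1.0))  # about 100
assert math.isclose(db_to_amplitude(0.0), 1.0)
```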

See the full release notes here; full docs can be found here.

torchvision 0.6

The torchvision 0.6 release includes updates to datasets, models and a significant number of bug fixes. Highlights include:

  • Faster R-CNN now supports negative samples which allows the feeding of images without annotations at training time.
  • Added aligned flag to RoIAlign to match Detectron2.
  • Refactored abstractions for C++ video decoder.

See the full release notes here; full docs can be found here.

torchtext 0.6

The torchtext 0.6 release includes a number of bug fixes and improvements to documentation. Based on user feedback, the dataset abstractions are also being redesigned. Highlights for the release include:

  • Fixed an issue related to the SentencePiece dependency in conda package.
  • Added support for the experimental IMDB dataset to allow a custom vocab.
  • A number of documentation updates including adding a code of conduct and a deduplication of the docs on the torchtext site.

Your feedback and discussions on the experimental datasets API are welcome. You can send them to issue #664. We would also like to highlight the pull request here where the latest dataset abstraction is applied to the text classification datasets. Your feedback will help us finalize this abstraction.

See the full release notes here; full docs can be found here.

We’d like to thank the entire PyTorch team, the Amazon team and the community for all their contributions to this work.


Team PyTorch
