Using AI and old reports to understand new medical images

Getting a quick and accurate reading of an X-ray or other medical image can be vital to a patient’s health and might even save a life. Obtaining such an assessment depends on the availability of a skilled radiologist and, consequently, a rapid response is not always possible. For that reason, says Ruizhi “Ray” Liao, a postdoc and recent PhD graduate at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL), “we want to train machines that are capable of reproducing what radiologists do every day.” Liao is first author of a new paper, written with other researchers at MIT and Boston-area hospitals, that is being presented this fall at MICCAI 2021, an international conference on medical image computing.

Although the idea of using computers to interpret images is not new, the MIT-led group is drawing on an underused resource — the vast body of radiology reports that accompany medical images, written by radiologists in routine clinical practice — to improve the interpretive abilities of machine learning algorithms. The team also employs a concept from information theory called mutual information — a statistical measure of the interdependence of two variables — to boost the effectiveness of their approach.
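For readers who want the precise definition, the mutual information between two random variables $X$ and $Y$ (here, the image and report representations) is:

$$
I(X;Y) = \sum_{x}\sum_{y} p(x,y)\,\log\frac{p(x,y)}{p(x)\,p(y)}
$$

It equals zero when the two variables are independent and grows as each becomes more predictive of the other, which is exactly the property the training objective exploits.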

Here’s how it works: First, a neural network is trained to determine the extent of a disease, such as pulmonary edema, by being presented with numerous X-ray images of patients’ lungs, along with a doctor’s rating of the severity of each case. That information is encapsulated within a collection of numbers. A separate neural network does the same for text, representing its information in a different collection of numbers. A third neural network then integrates the information between images and text in a coordinated way that maximizes the mutual information between the two datasets. “When the mutual information between images and text is high, that means that images are highly predictive of the text and the text is highly predictive of the images,” explains MIT Professor Polina Golland, a principal investigator at CSAIL.
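In practice, mutual information between high-dimensional representations is maximized through a tractable lower bound rather than computed exactly. The following is a minimal sketch of one common choice, an InfoNCE-style contrastive objective; the function name, batch construction, and temperature are illustrative assumptions, not the exact formulation used in the paper.

```python
# Minimal PyTorch sketch: maximize a contrastive (InfoNCE-style) lower bound
# on the mutual information between paired image and text embeddings.
# Row i of image_emb and text_emb come from the same patient study; the other
# rows in the batch act as negatives. Names and hyperparameters are illustrative.
import torch
import torch.nn.functional as F

def mutual_information_loss(image_emb, text_emb, temperature=0.1):
    image_emb = F.normalize(image_emb, dim=-1)        # (batch, dim)
    text_emb = F.normalize(text_emb, dim=-1)          # (batch, dim)
    logits = image_emb @ text_emb.t() / temperature   # pairwise similarities
    targets = torch.arange(logits.size(0), device=logits.device)
    # Symmetric cross-entropy: each image should "retrieve" its own report
    # and each report its own image; minimizing this maximizes the bound.
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))
```

Minimizing such a loss pushes paired image and text embeddings toward each other while keeping unpaired ones apart, which is one standard way of making the two representations mutually predictive.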

Liao, Golland, and their colleagues have introduced another innovation that confers several advantages: Rather than working from entire images and radiology reports, they break the reports down to individual sentences and the portions of those images that the sentences pertain to. Doing things this way, Golland says, “estimates the severity of the disease more accurately than if you view the whole image and whole report. And because the model is examining smaller pieces of data, it can learn more readily and has more samples to train on.”

While Liao finds the computer science aspects of this project fascinating, a primary motivation for him is “to develop technology that is clinically meaningful and applicable to the real world.”

To that end, a pilot program is currently underway at the Beth Israel Deaconess Medical Center to see how MIT’s machine learning model could influence the way doctors managing heart failure patients make decisions, especially in an emergency room setting where speed is of the essence.

The model could have very broad applicability, according to Golland. “It could be used for any kind of imagery and associated text — inside or outside the medical realm. This general approach, moreover, could be applied beyond images and text, which is exciting to think about.”

Liao wrote the paper alongside MIT CSAIL postdoc Daniel Moyer and Golland; Miriam Cha and Keegan Quigley at MIT Lincoln Laboratory; William M. Wells at Harvard Medical School and MIT CSAIL; and clinical collaborators Seth Berkowitz and Steven Horng at Beth Israel Deaconess Medical Center.

The work was sponsored by the NIH NIBIB Neuroimaging Analysis Center, Wistron, MIT-IBM Watson AI Lab, MIT Deshpande Center for Technological Innovation, MIT Abdul Latif Jameel Clinic for Machine Learning in Health (J-Clinic), and MIT Lincoln Lab.


Announcing the winners of the 2021 Next-generation Data Infrastructure request for proposals

In April 2021, Facebook launched the Next-generation Data Infrastructure request for proposals (RFP). Today, we’re announcing the winners of this award.
The Facebook Core Data and Data Infra teams were interested in proposals that sought out innovative solutions to the challenges that still remain in the data management community. Areas of interest included, but were not limited to, the following topics:

  • Large-scale query processing
  • Physical layout and IO optimizations
  • Data management and processing at a global scale
  • Converged architectures for data wrangling, machine learning, and analytics
  • Advances in testing and verification for storage and processing systems

Read our Q&A with database researchers Stavros Harizopoulos and Shrikanth Shankar to learn more about database research at Facebook, the goal of this RFP, and the inspiration behind the RFP.

The team reviewed 109 high-quality proposals, and we are pleased to announce the 10 winning proposals and six finalists. Thank you to everyone who took the time to submit a proposal, and congratulations to the winners.

Research award recipients

Holistic optimization for parallel query processing
Paraschos Koutris (University of Wisconsin–Madison)

SCALER – SCalAbLe vEctor pRocessing of SPJG-Queries
Wolfgang Lehner, Dirk Habich (Technische Universität Dresden)

AnyScale transactions in the cloud
Natacha Crooks, Joe Hellerstein (University of California, Berkeley)

Proudi: Predictability on unpredictable data infrastructure
Haryadi S. Gunawi (University of Chicago)

Making irregular partitioning practical
Spyros Blanas (The Ohio State University)

Dynamic join processing pushdown in Presto
Daniel Abadi, Chujun Song (University of Maryland, College Park)

A learned persistent key-value store
Tim Kraska (Massachusetts Institute of Technology)

Building global-scale systems using a flexible consensus substrate
Faisal Nawab (University of California, Irvine)

Runtime-optimized analytics using compilation hints
Anastasia Ailamaki (Swiss Federal Institute of Technology Lausanne)

Flexible scheduling for machine learning data processing close to storage
Ana Klimovic, Damien Aymon (ETH Zurich)

Finalists

Next generation data provenance/data governance
Tim Kraska, Michael Cafarella, Michael Stonebraker (Massachusetts Institute of Technology)

Optimizing commitment latency for geo-distributed transactions
Xiangyao Yu (University of Wisconsin–Madison)

Semantic optimization of recursive queries
Dan Suciu (University of Washington)

Towards a disaggregated database for future data centers
Jianguo Wang (Purdue University)

Unified data systems for structured and unstructured data
Matei Zaharia, Christos Kozyrakis (Stanford University)

Unifying machine learning and analytics under a single data engine
Stratos Idreos (Harvard University)



Summarizing Books with Human Feedback


To safely deploy powerful, general-purpose artificial intelligence in the future, we need to ensure that machine learning models act in accordance with human intentions. This challenge has become known as the alignment problem.

A scalable solution to the alignment problem needs to work on tasks where model outputs are difficult or time-consuming for humans to evaluate. To test scalable alignment techniques, we trained a model to summarize entire books, as shown in the following samples.[1] Our model works by first summarizing small sections of a book, then summarizing those summaries into a higher-level summary, and so on.
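As a rough illustration of that recursive procedure, the sketch below splits a book into chunks, summarizes each chunk, and then recursively summarizes the concatenated summaries. The chunking by character count and the callable interface are simplifications for illustration, not the paper’s exact procedure.

```python
# Sketch of recursive task decomposition for book summarization.
# `summarize_passage` is a caller-supplied function (e.g., a wrapper around a
# fine-tuned summarization model) that handles text short enough to fit in a
# single model context; splitting on character count is a simplification.

def summarize_book(text, summarize_passage, max_chunk_chars=8000):
    if len(text) <= max_chunk_chars:
        return summarize_passage(text)
    chunks = [text[i:i + max_chunk_chars]
              for i in range(0, len(text), max_chunk_chars)]
    section_summaries = [summarize_passage(chunk) for chunk in chunks]
    # Recurse on the joined summaries until a single top-level summary remains.
    return summarize_book("\n".join(section_summaries),
                          summarize_passage, max_chunk_chars)
```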


Our best model is fine-tuned from GPT-3 and generates sensible summaries of entire books, sometimes even matching the average quality of human-written summaries: 5 percent of its summaries receive a 6/7 rating (similar to the average human-written summary) from humans who have read the book, and 15 percent receive a 5/7 rating. Our model also achieves state-of-the-art results on the BookSum dataset for book-length summarization. A zero-shot question-answering model can use our model’s summaries to obtain competitive results on the NarrativeQA dataset for book-length question answering.[2]

Our Approach: Combining Reinforcement Learning from Human Feedback and Recursive Task Decomposition

Consider the task of summarizing a piece of text. Large pretrained models aren’t very good at summarization. In the past we found that training a model with reinforcement learning from human feedback helped align model summaries with human preferences on short posts and articles. But judging summaries of entire books is difficult to do directly, since a human would need to read the entire book, which takes many hours.

To address this problem, we additionally make use of recursive task decomposition: we procedurally break up a difficult task into easier ones. In this case we break up summarizing a long piece of text into summarizing several shorter pieces. Compared to an end-to-end training procedure, recursive task decomposition has the following advantages:

  1. Decomposition allows humans to evaluate model summaries more quickly by using summaries of smaller parts of the book rather than reading the source text.
  2. It is easier to trace the summary-writing process. For example, you can trace where in the original text certain events from the summary happen. See for yourself on our summary explorer!
  3. Our method can be used to summarize books of unbounded length, unrestricted by the context length of the transformer models we use.

Why We Are Working on This

This work is part of our ongoing research into aligning advanced AI systems, which is key to our mission. As we train our models to do increasingly complex tasks, making informed evaluations of the models’ outputs will become increasingly difficult for humans. This makes it harder to detect subtle problems in model outputs that could lead to negative consequences when these models are deployed. Therefore we want our ability to evaluate our models to increase as their capabilities increase.

Our current approach to this problem is to empower humans to evaluate machine learning model outputs using assistance from other models. In this case, to evaluate book summaries we empower humans with individual chapter summaries written by our model, which saves them time when evaluating these summaries relative to reading the source text. Our progress on book summarization is the first large-scale empirical work on scaling alignment techniques.

Going forward, we are researching better ways to assist humans in evaluating model behavior, with the goal of finding techniques that scale to aligning artificial general intelligence.

We’re always looking for more talented people to join us, so if this work interests you, please apply to join our team!


Acknowledgments

We’d like to acknowledge our paper co-authors: Long Ouyang, Daniel Ziegler, Nisan Stiennon, and Paul Christiano.

Thanks to the following for feedback on this release: Steve Dowling, Hannah Wong, Miles Brundage, Gretchen Krueger, Ilya Sutskever, and Sam Altman.


Design
Justin Jay Wang


Footnotes

  1. These samples were selected from works in the public domain, and are part of GPT-3’s pretraining data. To control for this effect, and purely for research purposes, our paper evaluates summaries of books the model has never seen before. ↩︎

  2. We’ve amended our original claim about results on NarrativeQA after being made aware of prior work with better results than ours. ↩︎


Toward a smarter electronic health record

Electronic health records have been widely adopted with the hope they would save time and improve the quality of patient care. But due to fragmented interfaces and tedious data entry procedures, physicians often spend more time navigating these systems than they do interacting with patients.

Researchers at MIT and the Beth Israel Deaconess Medical Center are combining machine learning and human-computer interaction to create a better electronic health record (EHR). They developed MedKnowts, a system that unifies the processes of looking up medical records and documenting patient information into a single, interactive interface.

Driven by artificial intelligence, this “smart” EHR automatically displays customized, patient-specific medical records when a clinician needs them. MedKnowts also provides autocomplete for clinical terms and auto-populates fields with patient information to help doctors work more efficiently.

“In the origins of EHRs, there was this tremendous enthusiasm that getting all this information organized would be helpful to be able to track billing records, report statistics to the government, and provide data for scientific research. But few stopped to ask the deep questions around whether they would be of use for the clinician. I think a lot of clinicians feel they have had this burden of EHRs put on them for the benefit of bureaucracies and scientists and accountants. We came into this project asking how EHRs might actually benefit clinicians,” says David Karger, professor of computer science in the Computer Science and Artificial Intelligence Laboratory (CSAIL) and senior author of the paper.

The research was co-authored by CSAIL graduate students Luke Murray (the lead author), Divya Gopinath, and Monica Agrawal. Other authors include Steven Horng, an emergency medicine attending physician and clinical lead for machine learning at the Center for Healthcare Delivery Science of Beth Israel Deaconess Medical Center, and David Sontag, associate professor of electrical engineering and computer science at MIT, a member of CSAIL and the Institute for Medical Engineering and Science, and a principal investigator at the Abdul Latif Jameel Clinic for Machine Learning in Health. The work will be presented at the Association for Computing Machinery Symposium on User Interface Software and Technology next month.

A problem-oriented tool

To design an EHR that would benefit doctors, the researchers had to think like doctors.

They created a note-taking editor with a side panel that displays relevant information from the patient’s medical history. That historical information appears in the form of cards that are focused on particular problems or concepts.

For instance, if MedKnowts identifies the clinical term “diabetes” in the text as a clinician types, the system automatically displays a “diabetes card” containing medications, lab values, and snippets from past records that are relevant to diabetes treatment.
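One way to picture this behavior is a lookup keyed by recognized concepts: when a known term appears in the note, the system pulls together the pieces of the patient’s record that relate to it. The sketch below is a hypothetical illustration of that idea, not MedKnowts’ actual implementation; the concept index, record format, and example values are invented for illustration.

```python
# Hypothetical sketch: when a known clinical concept appears in the note,
# assemble a "card" of related history for the side panel. The data model,
# example terms, and matching logic are illustrative only.

CONCEPT_INDEX = {
    "diabetes": {
        "medications": {"metformin", "insulin glargine"},
        "lab_values": {"HbA1c", "fasting glucose"},
    },
}

def cards_for_note(note_text, patient_record):
    """patient_record is assumed to be a dict with a 'medications' list and
    a 'labs' dict mapping lab names to recent values."""
    cards = []
    text = note_text.lower()
    for concept, related in CONCEPT_INDEX.items():
        if concept in text:
            cards.append({
                "concept": concept,
                "medications": [m for m in patient_record["medications"]
                                if m in related["medications"]],
                "lab_values": {k: v for k, v in patient_record["labs"].items()
                               if k in related["lab_values"]},
            })
    return cards
```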

Most EHRs store historical information on separate pages and list medications or lab values alphabetically or chronologically, forcing the clinician to search through data to find the information they need, Murray says. MedKnowts only displays information relevant to the particular concept the clinician is writing about.

“This is a closer match to the way doctors think about information. A lot of times, doctors will do this subconsciously. They will look through a medications page and only focus on the medications that are relevant to the current conditions. We are helping to do that process automatically and hopefully move some things out of the doctor’s head so they have more time to think about the complex part, which is determining what is wrong with the patient and coming up with a treatment plan,” Murray says.

Pieces of interactive text called chips serve as links to related cards. As a physician types a note, the autocomplete system recognizes clinical terms, such as medications, lab values, or conditions, and transforms them into chips. Each chip is displayed as a word or phrase highlighted in a color that depends on its category (red for a medical condition, green for a medication, yellow for a procedure, and so on).
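To make the chip idea concrete, here is a small hypothetical sketch that tags recognized terms with a category and a corresponding display color. The color-per-category convention follows the description above; the term lexicon and matching are invented for illustration and are not MedKnowts’ implementation.

```python
# Hypothetical sketch: turn recognized clinical terms into colored "chips".
# Colors follow the article's convention (red = condition, green = medication,
# yellow = procedure); the term lexicon here is illustrative only.
import re

CATEGORY_COLOR = {"condition": "red", "medication": "green", "procedure": "yellow"}
TERM_CATEGORY = {"diabetes": "condition", "metformin": "medication",
                 "appendectomy": "procedure"}

def chips_in_note(note_text):
    chips = []
    for term, category in TERM_CATEGORY.items():
        for match in re.finditer(r"\b" + re.escape(term) + r"\b",
                                 note_text, flags=re.IGNORECASE):
            chips.append({"text": match.group(0),
                          "category": category,
                          "color": CATEGORY_COLOR[category],
                          "span": match.span()})
    # Return chips in the order they appear in the note.
    return sorted(chips, key=lambda c: c["span"][0])
```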

Through the use of autocomplete, structured data on the patient’s conditions, symptoms, and medication usage is collected with no additional effort from the physician.

Sontag says he hopes the advance will “change the paradigm of how to create large-scale health datasets for studying disease progression and assessing the real-world effectiveness of treatments.”

In practice

After a year-long iterative design process, the researchers tested MedKnowts by deploying the software in the emergency department at Beth Israel Deaconess Medical Center in Boston. They worked with an emergency physician and four hospital scribes who enter notes into the electronic health record.

Deploying the software in an emergency department, where doctors operate in a high-stress environment, involved a delicate balancing act, Agrawal says.

“One of the biggest challenges we faced was trying to get people to shift what they currently do. Doctors who have used the same system, and done the same dance of clicks so many times, form a sort of muscle memory. Whenever you are going to make a change, there is a question of is this worth it? And we definitely found that some features had greater usage than others,” she says.

The Covid-19 pandemic complicated the deployment, too. The researchers had been visiting the emergency department to get a sense of the workflow, but were forced to end those visits due to Covid-19 and were unable to be in the hospital while the system was being deployed.

Despite those initial challenges, MedKnowts became popular with the scribes over the course of the one-month deployment. They gave the system an average rating of 83.75 (out of 100) for usability.

Scribes found the autocomplete function especially useful for speeding up their work, according to survey results. Also, the color-coded chips helped them quickly scan notes for relevant information.

Those initial results are promising, but as the researchers consider the feedback and work on future iterations of MedKnowts, they plan to proceed with caution.

“What we are trying to do here is smooth the pathway for doctors and let them accelerate. There is some risk there. Part of the purpose of bureaucracy is to slow things down and make sure all the i’s are dotted and all the t’s are crossed. And if we have a computer dotting the i’s and crossing the t’s for doctors, that may actually be countering the goals of the bureaucracy, which is to force doctors to think twice before they make a decision. We have to be thinking about how to protect doctors and patients from the consequences of making the doctors more efficient,” Karger says.

A longer-term vision

The researchers plan to improve the machine learning algorithms that drive MedKnowts so the system can more effectively highlight parts of the medical record that are most relevant, Agrawal says.

They also want to consider the needs of different medical users. The researchers designed MedKnowts with an emergency department in mind — a setting where doctors are typically seeing patients for the first time. A primary care physician who knows their patients much better would likely have some different needs.

In the longer-term, the researchers envision creating an adaptive system that clinicians can contribute to. For example, perhaps a doctor realizes a certain cardiology term is missing from MedKnowts and adds that information to a card, which would update the system for all users.

The team is exploring commercialization as an avenue for further deployment.

“We want to build tools that let doctors create their own tools. We don’t expect doctors to learn to be programmers, but with the right support they might be able to radically customize whatever medical applications they are using to really suit their own needs and preferences,” Karger says.

This research was funded by the MIT Abdul Latif Jameel Clinic for Machine Learning in Health.


Googler Marian Croak is now in the Inventors Hall of Fame

Look around you right now and consider everything that was created by an inventor. The computer you’re reading this article on, the internet necessary to load this article, the electricity that powers the screen, even the coffee maker you used this morning. 

To recognize the incredible contributions of those inventors and the benefits they bring to our everyday life, the National Inventors Hall of Fame has inducted a new group of honorees every year since 1973. In this year’s combined inductee class of 2020/2021, Googler Marian Croak is being honored for her work in advancing VoIP (Voice over Internet Protocol) technology, which powers the online calls and video chats that have helped businesses and families stay connected through the COVID-19 pandemic. She holds more than 200 patents and was recently honored by the U.S. Patent and Trademark Office.

These days, Marian leads our Research Center for Responsible AI and Human Centered Technology, which is responsible for ensuring Google develops artificial intelligence responsibly and that it has a positive impact. We chatted over Google Meet to find out how plumbers and electricians sparked her interest in science, how her inventions have made life in a pandemic a tiny bit easier for everyone, and what the NIHF honor means to her.

When was the first time you realized you were interested in technology?

I was probably around 5 or 6. I know that we don’t usually think of things like plumbing or electricity as necessarily technology, but they are. I was very enchanted with plumbers and electricians who would come to our house and fix things. They would be dirty and greasy, but I would love the smell, you know? I felt like, Wow, what a miracle worker! I would follow them around, trying to figure out how they’d fix something. I still do that today! 

So when you have electricians come to your house, you’re still like, “Hey, how did you do that?”

There was a leak once, and I was asking the plumber all these questions, and he asked me to quiet down! Because he needed to listen to the invisible flow of water through the pipes to determine the problem. It was amazing to me how similar it was to network engineering!

You’ve had a few different roles at Google and Alphabet so far. How did you move to where you are today?

When I first came to Google, my first role was bringing the Internet to emerging markets. Laying fiber in Africa, building public Wi-Fi in railroad stations in India and then exploring the landscape in countries like Cuba and countries where there wasn’t an openness yet for the Internet. And that was a fascinating job. It was a merger of technology, policy and governmental affairs, combined with an understanding of communities and regions. 

Then I worked on bringing features and technology and Google’s products to the next billion users. And after I did that for a few years, I joined the Site Reliability Engineering organization to help enhance the performance of Google’s complex, integrated systems. Now my current role is leading the Research Center for Responsible AI and Human Centered Technology group. I’m inspired that my work has the potential to positively impact so many of our users. 

Today you’re being inducted into the National Inventors Hall of Fame for your work in advancing VoIP technology. What inspired you to work on VoIP, and can you describe that process of bringing the technology to life?

I have always been motivated by the desire to change the world, and to do that I try to change the world that I’m currently in. What I mean by that is I work on problems that I am aware of, and that I can tackle within the world that surrounds me. So when I began working on VoIP technology, it was at a time in the late ‘90s when there was a lot of change happening involving the internet. Netscape had put a user-friendly web browser in place and there was a lot of new activity beginning to bubble up all over the online world.

I was part of a team that was also very interested in doing testing and prototyping of voice communications over the internet. There were some existing technologies but they didn’t scale and they were proprietary in nature, so we were thinking of ways we could open it up, make it scalable, make it reliable and be able to support billions of daily calls. We started to work on this but had a lot of doubters telling us that this wouldn’t work, and that no one would ever use this “toy-like” technology. And at the time, they were right: It wasn’t working and it wasn’t reliable. But over time we were able to get it to a point where it started working very well. So much so that eventually the senior leaders within AT&T began to adopt the technology for their core network. It was challenging but an exciting thing for me to do because I like to bring change to things, especially when people doubt that it can happen.

What advice would you give to aspiring inventors? 

Most importantly, don’t give up, and during the process of creation, listen to your critics. I received so much criticism and in many ways it was valid. That type of feedback motivated me to improve the technology, and really address a variety of pain points that I hadn’t necessarily thought of. 

What does being inducted into the NIHF mean to you? 

Well, it’s humbling, and a great experience. At the time I never thought the work that I was doing was that significant or that it would lead to this, but I’m very grateful for the recognition.

What does it mean to be a part of a class that sees the first two Black women inducted into the NIHF?

I find that it inspires people when they see someone who looks like themselves on some dimension, and I’m proud to offer that type of representation. People also see that I’m just a normal person like themselves and I think that also inspires them to accomplish their goals. I want people to understand that it may be difficult but that they can overcome obstacles and that it will be so worth it.
