AI of the Storm: How We Built the Most Powerful Industrial Computer in the U.S. in Three Weeks During a Pandemic

In under a month amid the global pandemic, a small team assembled the world’s seventh-fastest computer.

Today that mega-system, called Selene, communicates with its operators on Slack, has its own robot attendant and is driving AI forward in automotive, healthcare and natural-language processing.

While many supercomputers tap exotic, proprietary designs that take months to commission, Selene is based on an open architecture NVIDIA shares with its customers.

The Argonne National Laboratory, outside Chicago, is using a system based on Selene’s DGX SuperPOD design to research ways to stop the coronavirus. The University of Florida will use the design to build the fastest AI computer in academia.

DGX SuperPODs are driving business results for companies like Continental in automotive, Lockheed Martin in aerospace and Microsoft in cloud-computing services.

Birth of an AI System

The story of how and why NVIDIA built Selene starts in 2015.

NVIDIA engineers started their first system-level design with two motivations. They wanted to build something both powerful enough to train the AI models their colleagues were building for autonomous vehicles and general purpose enough to serve the needs of any deep-learning researcher.

The result was the SATURNV cluster, born in 2016 and based on the NVIDIA Pascal GPU. When the more powerful NVIDIA Volta GPU debuted a year later, the budding systems group’s motivation and its designs expanded rapidly.

AI Jobs Grow Beyond the Accelerator

“We’re trying to anticipate what’s coming based on what we hear from researchers, building machines that serve multiple uses and have long lifetimes, packing as much processing, memory and storage as possible,” said Michael Houston, a chief architect who leads the systems team.

As early as 2017, “we were starting to see new apps drive the need for multi-node training, demanding very high-speed communications between systems and access to high-speed storage,” he said.

AI models were growing rapidly, requiring multiple GPUs to handle them. Workloads were demanding new computing styles, like model parallelism, to keep pace.

So, in fast succession, the team crafted ever larger clusters of V100-based NVIDIA DGX-2 systems, called DGX PODs. They used 32, then 64 DGX-2 nodes, culminating in a 96-node architecture dubbed the DGX SuperPOD.

They christened it Circe for the irresistible Greek goddess. It debuted in June 2019 at No. 22 on the TOP500 list of the world’s fastest supercomputers and currently holds No. 23.

Cutting Cables in a Computing Jungle

Along the way, the team learned lessons about networking, storage, power and thermals. Those learnings got baked into the latest NVIDIA DGX systems, reference architectures and today’s 280-node Selene.

In the race through ever larger clusters to get to Circe, some lessons were hard won.

“We tore everything out twice, we literally cut the cables out. It was the fastest way forward, but it still had a lot of downtime and cost. So we vowed to never do that again and set ease of expansion and incremental deployment as a fundamental design principle,” said Houston.

The team redesigned the overall network to simplify assembling the system.

They defined modules of 20 nodes connected by relatively simple “thin switches.” Each of these so-called scalable units could be laid down, cookie-cutter style, turned on and tested before the next one was added.
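
The cookie-cutter approach is easy to picture in code. Below is a minimal Python sketch of the deploy-then-validate loop this paragraph describes; the 20-node unit size and the 280-node total come from the article, while the function names and health checks are purely illustrative.

```python
# Illustrative sketch of the "scalable unit" rollout pattern: deploy one
# 20-node module at a time and validate it before adding the next.
# The 20-node unit size and 280-node total come from the article;
# the health-check logic and names are hypothetical.

NODES_PER_UNIT = 20

def node_healthy(node_id):
    """Placeholder per-node validation (cabling, link speed, burn-in)."""
    return True  # assume checks pass in this sketch

def deploy_unit(unit_index):
    """Rack, cable and power on one 20-node scalable unit."""
    first = unit_index * NODES_PER_UNIT
    return list(range(first, first + NODES_PER_UNIT))

def build_cluster(num_units):
    cluster = []
    for unit in range(num_units):
        nodes = deploy_unit(unit)
        # Test each unit before the next one is added, so a bad cable
        # never forces a full teardown.
        if not all(node_healthy(n) for n in nodes):
            raise RuntimeError("Scalable unit %d failed validation" % unit)
        cluster.extend(nodes)
    return cluster

print(len(build_cluster(num_units=14)))  # 14 x 20 = 280 nodes, Selene's size
```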

The design let engineers specify set lengths of cables that could be bundled together with Velcro at the factory. Racks could be labeled and mapped, radically simplifying the process of filling them with dozens of systems.

Doubling Up on InfiniBand

Early on, the team learned to split up compute, storage and management fabrics into independent planes, spreading them across more, faster network-interface cards.

The number of NICs per GPU doubled to two. So did their speeds, going from 100 Gbit/s InfiniBand in Circe to 200 Gbit/s HDR InfiniBand in Selene. The result was a 4x increase in the effective node bandwidth.
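
The 4x figure is straightforward arithmetic: twice the NICs times twice the per-link speed, as this quick check shows.

```python
# Quick check of the 4x effective node-bandwidth claim (per-GPU figures).
circe_gbps_per_gpu = 1 * 100    # one NIC per GPU at 100 Gbit/s
selene_gbps_per_gpu = 2 * 200   # two NICs per GPU at 200 Gbit/s HDR
print(selene_gbps_per_gpu / circe_gbps_per_gpu)  # 4.0
```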

Likewise, memory and storage links grew in capacity and throughput to handle jobs with hot, warm and cold storage needs. Four storage tiers spanned 100 TB/s memory links to 100 GB/s storage pools.

Power and thermals stayed within air-cooled limits. The default designs used 35kW racks typical in leased data centers, but they can stretch beyond 50kW for the most aggressive supercomputer centers and down to 7kW racks some telcos use.

Seeking the Big, Balanced System

The net result is a more balanced design that can handle today’s many different workloads. That flexibility also gives researchers the freedom to explore new directions in AI and high performance computing.

“To some extent HPC and AI both require max performance, but you have to look carefully at how you deliver that performance in terms of power, storage and networking as well as raw processing,” said Julie Bernauer, who leads an advanced development team that’s worked on all of NVIDIA’s large-scale systems.

A portrait of Selene by the numbers

Skeleton Crews on Strict Protocols

The gains paid off in early 2020.

Within days of the pandemic hitting, the first NVIDIA Ampere architecture GPUs arrived, and engineers faced the job of assembling the 280-node Selene.

In the best of times, it can take dozens of engineers a few months to assemble, test and commission a supercomputer-class system. NVIDIA had to get Selene running in a few weeks to participate in industry benchmarks and fulfill obligations to customers like Argonne.

And engineers had to stay well within public-health guidelines of the pandemic.

“We had skeleton crews with strict protocols to keep staff healthy,” said Bernauer.

“To unbox and rack systems, we used two-person teams that didn’t mix with the others — they even took vacation at the same time. And we did cabling with six-foot distances between people. That really changes how you build systems,” she said.

Even with the COVID restrictions, engineers racked up to 60 systems in a day, the maximum their loading dock could handle. Virtual log-ins let administrators validate cabling remotely, testing the 20-node modules as they were deployed.

Bernauer’s team put several layers of automation in place. That cut the need for people at the co-location facility where Selene was built, a block from NVIDIA’s Silicon Valley headquarters.

Slacking with a Supercomputer

Selene talks to staff over a Slack channel as if it were a co-worker, reporting loose cables and isolating malfunctioning hardware so the system can keep running.

“We don’t want to wake up in the night because the cluster has a problem,” Bernauer said.

It’s part of the automation customers can access if they follow the guidance in the DGX POD and SuperPOD architectures.
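
NVIDIA hasn’t published Selene’s automation code, but the basic pattern of a cluster posting its own health alerts to a chat channel can be sketched with Slack’s incoming webhooks. Everything below (the webhook URL, node names and check logic) is a placeholder, not the actual Selene tooling.

```python
# Minimal sketch of cluster-health alerts posted to Slack through an
# incoming webhook. The webhook URL, node names and check results are
# placeholders, not Selene's actual automation stack.
import requests

SLACK_WEBHOOK_URL = "https://hooks.slack.com/services/T000/B000/XXXX"  # placeholder

def post_alert(text):
    """Send a plain-text message to the ops channel."""
    requests.post(SLACK_WEBHOOK_URL, json={"text": text}, timeout=10)

def check_node(node):
    """Placeholder health check; a real system would query IPMI, NVML,
    InfiniBand counters and so on."""
    return {"node": node, "loose_cable": False, "gpu_errors": 0}

for node in ("selene-node-001", "selene-node-002"):
    status = check_node(node)
    if status["loose_cable"] or status["gpu_errors"]:
        post_alert(f":warning: {node} reports {status}; isolating it from the scheduler")
```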

Thanks to this approach, the University of Florida, for example, is expected to rack and power up a 140-node extension to its HiPerGator system, switching on the most powerful AI supercomputer in academia within as little as 10 days of receiving it.

As an added touch, the NVIDIA team bought a telepresence robot from Double Robotics so non-essential designers sheltering at home could maintain daily contact with Selene. Tongue-in-cheek, they dubbed it Trip, given early concerns that essential technicians on site might bump into it.

The fact that Trip is powered by an NVIDIA Jetson TX2 module was an added attraction for team members who imagined some day they might tinker with its programming.

Trip helped engineers inspect Selene while it was under construction.

Since late July, Trip’s been used regularly to let them virtually drive through Selene’s aisles, observing the system through the robot’s camera and microphone.

“Trip doesn’t replace a human operator, but if you are worried about something at 2 a.m., you can check it without driving to the data center,” she said.

Delivering HPC, AI Results at Scale

In the end, it’s all about the results, and they came fast.

In June, Selene hit No. 7 on the TOP500 list and No. 2 on the Green500 list of the most power-efficient systems. In July, it broke records in all eight systems tests for AI training performance in the latest MLPerf benchmarks.

“The big surprise for me was how smoothly everything came up given we were using new processors and boards, and I credit all the testing along the way,” said Houston. “To get this machine up and do a bunch of hard back-to-back benchmarks gave the team a huge lift,” he added.

The work pre-testing NGC containers and HPC software for Argonne was even more gratifying. The lab is already hammering on hard problems in protein docking and quantum chemistry to shine a light on the coronavirus.

Separately, Circe donates many of its free cycles to the Folding@Home initiative that fights COVID.

At the same time, NVIDIA’s own researchers are using Selene to train autonomous vehicles and refine conversational AI, nearing advances they’re expected to report soon. They are among more than a thousand jobs run, often simultaneously, on the system so far.

Meanwhile the team already has on the whiteboard ideas for what’s next. “Give performance-obsessed engineers enough horsepower and cables and they will figure out amazing things,” said Bernauer.

At top: An artist’s rendering of a portion of Selene.

How Abyss Solutions Helps Keep Offshore Rig Operators Afloat

As its evocative name suggests, Abyss Solutions is a company taking AI to places where humans can’t — or shouldn’t — go.

The brainchild of four University of Sydney scientists and engineers, the startup set out six years ago to improve the maintenance and observation of industrial equipment.

It began by developing advanced technology to inspect the most difficult to reach assets of urban water infrastructure systems, such as dams, reservoirs, canals, bridges and ship hulls. Later, it zeroed in on an industry that often operates literally in the dark: offshore oil and gas platforms.

Abyss Solutions Lantern Eye output.

A few years ago, Abyss CEO Nasir Ahsan and CTO Suchet Bargoti were demonstrating to a Houston-based platform operator the insights they could generate from the image data collected by its underwater Lantern Eye 3D camera. The camera’s sub-millimeter accuracy provides a “way to inspect objects as if you’re taking them out of water,” said Bargoti.

An employee of the operator interrupted the meeting to describe an ongoing problem with topside equipment that was decaying and couldn’t be repaired adequately. Once it was clear that Abyss could provide detailed insight into the problem and how to solve it, no more selling was needed.

“Every one of these companies is dreading the next Deepwater Horizon,” said Bargoti, referencing the 2010 incident in which BP spilled nearly 5 million barrels of oil into the Gulf of Mexico, killing 11 people and countless wildlife, and costing the company $65 billion in cleanup and fines. “What they wanted to know is, ‘Will your data analytics help us understand what to fix and when to fix it?’”

Today, Abyss’s combination of NVIDIA GPU-powered deep learning algorithms, unmanned vehicles and innovative underwater cameras is enabling platform operators to spot faults and anomalies such as corrosion on equipment above and below the water, and address them before the equipment fails, potentially saving millions of dollars and even a few human lives.

During the COVID-19 pandemic, the stakes have risen. Offshore rigs have emerged as hotbeds for the virus, forcing operators to adopt strict quarantine procedures that limit the number of people onsite to reduce spread and minimize interruptions.

Essentially, this has sped up the industry’s digital transformation push and fueled the urgency of Abyss’ work, said Bargoti. “They can’t afford to have these things happening,” he said.

Abyss Solutions corrosion detections.

Better Than Human Performance

Historically, inspection and maintenance of offshore platforms and equipment has been a costly, time-consuming and labor-intensive task for oil and gas companies. It often yields subjective findings that can result in missed repairs and unplanned shutdowns.

An independent audit found that Abyss’ semantic segmentation models are able to detect general corrosion with greater than 90 percent accuracy, while severe corrosion is identified with greater than 97 percent accuracy. Both are significant improvements over human performance, and the models also outperformed those of other AI companies in the audit.

What’s more, Abyss says that its oil and gas platform clients report reductions in operating costs by as much as 25 percent thanks to its technology.

Training of Abyss’s models, which rely on many terabytes of data (each platform generates about 1TB a day), occurs on AWS instances running NVIDIA T4 Tensor Core GPUs. The company also uses the latest versions of CUDA and cuDNN in conjunction with TensorFlow to power deep learning applications such as image and video segmentation and classification, and object detection.
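
Abyss’ production models aren’t public; as a rough illustration of the kind of TensorFlow setup described above, the sketch below wires up a tiny per-pixel corrosion segmenter with tf.keras. The architecture, input size and training data are assumptions.

```python
# Minimal tf.keras sketch of a binary corrosion-segmentation model.
# Architecture, input size and training data are illustrative only;
# Abyss' production models are not public.
import tensorflow as tf
from tensorflow.keras import layers

def tiny_segmenter(input_shape=(256, 256, 3)):
    inputs = layers.Input(shape=input_shape)
    x = layers.Conv2D(32, 3, activation="relu", padding="same")(inputs)
    x = layers.MaxPooling2D()(x)                        # downsample to 128x128
    x = layers.Conv2D(64, 3, activation="relu", padding="same")(x)
    x = layers.UpSampling2D()(x)                        # back to 256x256
    outputs = layers.Conv2D(1, 1, activation="sigmoid")(x)  # per-pixel corrosion mask
    return tf.keras.Model(inputs, outputs)

model = tiny_segmenter()
model.compile(optimizer="adam",
              loss="binary_crossentropy",
              metrics=["accuracy"])
# model.fit(train_images, train_masks, epochs=10)  # training data not shown
```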

Bargoti said the company also is working with the NVIDIA Jetson TX2 module and TensorRT software to condense its models so they can run on their unmanned vehicles in real time.

Most of the data can be processed in the cloud because of the slowness of the corrosion process, but there are times when real-time AI is needed onsite, such as when a robotic vehicle needs to make decisions on where to go next.
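
The article doesn’t spell out the conversion pipeline. One plausible route, sketched below, uses the TF-TRT converter that ships with recent TensorFlow releases to produce a TensorRT-optimized SavedModel for a Jetson-class device; the paths and the FP16 precision choice are assumptions.

```python
# Hedged sketch: condensing a TensorFlow SavedModel with TF-TRT so it can
# run faster on an embedded GPU such as Jetson TX2. Paths and precision
# are illustrative; Abyss' actual pipeline is not public.
from tensorflow.python.compiler.tensorrt import trt_convert as trt

converter = trt.TrtGraphConverterV2(
    input_saved_model_dir="corrosion_model/saved_model",
    precision_mode=trt.TrtPrecisionMode.FP16)  # FP16 trades a little accuracy for speed
converter.convert()                            # replace supported subgraphs with TensorRT ops
converter.save("corrosion_model/trt_fp16")     # deployable SavedModel for the edge device
```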

Taking Full Advantage of Inception

As a member of NVIDIA Inception, a program to help startups working in AI and data science get to market faster, Abyss has benefited from a try-before-you-buy approach to NVIDIA tech. That’s allowed it to experiment with technologies before making big investments.

It’s also getting valuable advice on what’s coming down the pipe and how to time its work with the release of new GPUs. Bargoti said NVIDIA’s regularly advancing technology is helping Abyss squeeze more data into each compute cycle, pushing it closer to its long-term vision.

“We want to be the intel in these unmanned systems that makes smart decisions and pushes the frontier of exploration,” said Bargoti. “It’s all leading to this better development of perception systems, better development of decision-making systems and better development of robotics systems.”

Abyss is taking a deep look at a number of additional markets where it believes its technology can help. The team is taking on growth capital and rapidly expanding globally.

“Continuous investment in R&D and innovation plays a critical role in ensuring Abyss can provide game-changing solutions to the industry,” he said.

Startup Lunit Uses AI to Help Doctors Prioritize Patients with COVID-19 Symptoms

Testing for COVID-19 has become more widespread, but addressing the pandemic will require quickly screening for and triaging patients who are experiencing symptoms.

Lunit, a South Korean medical imaging startup — its name is a portmanteau of “learning unit” — has created an AI-based system to detect pneumonia, often present in COVID-19 infected patients, within seconds.

The Lunit INSIGHT CXR system, which is CE marked, uses AI to quickly detect 10 different radiological findings on chest X-rays, including pneumonia and potentially cancerous lung nodules.

It overlays the results onto the X-ray image along with a probability score for the finding. The system also monitors progression of a patient’s condition, automatically tracking changes within a series of chest X-ray images taken over time.

Lunit has recently partnered with GE Healthcare, which launched its Thoracic Care Suite using Lunit INSIGHT CXR’s AI algorithms to flag abnormalities on chest X-rays for radiologists’ review. It’s one of the first collaborations to bring AI from a medical startup to an existing X-ray equipment manufacturer, making AI-based solutions commercially available.

For integration of its algorithms with GE Healthcare and other partners’ products, Lunit’s hardware is powered by NVIDIA Quadro P1000 GPUs, and its AI model is optimized on the NVIDIA Jetson TX2i module. For cloud-based deployment, the company uses NVIDIA drivers and GPUs.

Lunit is a premier member of NVIDIA Inception, a program that helps startups with go-to-market support, expertise and technology. Brandon Suh, CEO of Lunit, said being an Inception partner “has helped position the company as a leader in state-of-the-art technology for social impact.”

AI Opens New Doors in Medicine

The beauty of AI, according to Suh, is its ability to process vast amounts of data and discover patterns — augmenting human ability, in terms of time and energy.

The founders of Lunit, he said, started with nothing but a “crazy obsession with technology” and a vision to use AI to “open a new door for medical practice with increased survival rates and more affordable costs.”

Image courtesy of Lunit.

Initially, Lunit’s products were focused on detecting potentially cancerous nodules in a patient’s lungs or breasts, as well as analyzing pathology tissue slides. However, the COVID-19 outbreak provided an opportunity for the company to upgrade the algorithms being used to help alleviate the burdens of healthcare professionals on the frontlines of the pandemic.

“The definitive diagnosis for COVID-19 involves a polymerase chain reaction test to detect antigens, but the results take 1-2 days to be delivered,” said Suh. “In the meantime, the doctors are left without any clinical evidence that can help them make a decision on triaging the patients.”

With its newly refined algorithm, Lunit INSIGHT CXR can now single out pneumonia and identify it in a patient within seconds, helping doctors make immediate, actionable decisions for those in more urgent need of care.

The Lunit INSIGHT product line, which provides AI analysis for chest X-rays and mammograms, has been commercially deployed and tested in more than 130 sites in countries such as Brazil, France, Indonesia, Italy, Mexico, South Korea and Thailand.

“We feel fortunate to be able to play a part in the battle against COVID-19 with what we do best: developing medical AI solutions,” said Suh. “Though AI’s considered cutting-edge technology today, it could be a norm tomorrow, and we’d like everyone to benefit from a more accurate and efficient way of medical diagnosis and treatment.”

The team at Lunit is at work developing algorithms to use with 3D imaging, in addition to their current 2D ones. They’re also looking to create software that analyzes a tumor’s microenvironment to predict whether a patient would respond to immunotherapy.

Learn more about Lunit at NVIDIA’s healthcare AI startups solutions webinar on August 13. Register here.

Here Comes the Sun: NASA Scientists Talk Solar Physics

Michael Kirk and Raphael Attie, scientists at NASA’s Goddard Space Flight Center, regularly face terabytes of data in their quest to analyze images of the sun.

This computational challenge, which could take a year or more on a CPU, has been reduced to less than a week on Quadro RTX data science workstations. Kirk and Attie spoke to AI Podcast host Noah Kravitz about the workflow they follow to study these images, and what they hope to find.

The lessons they’ve learned are useful for those in both science and industry grappling with how to best put torrents of data to work.

The researchers study images captured by telescopes on satellites, such as the Solar Dynamics Observatory spacecraft, as well as those from ground-based observatories.

They study these images to identify particles in Earth’s orbit that could damage interplanetary spacecraft, and to track solar surface flows, which allow them to develop models predicting weather in space.

Currently, these images are taken in space and sent to Earth for computation. But Kirk and Attie aim to shoot for the stars in the future: the goal is the ultimate form of edge computing, putting high-performance computers in space.

Key Points From This Episode:

  • The primary instrument that Kirk and Attie use to see images of the sun is the Solar Dynamics Observatory, a spacecraft that has four telescopes to take images of the extreme ultraviolet light of the sun, as well as an additional instrument to measure its magnetic fields.
  • Researchers such as Kirk and Attie have developed machine learning algorithms for a variety of projects, such as creating synthetic images of the sun’s surface and its flow fields.

Tweetables:

“We take an image about once every 1.3 seconds of the sun … that entire data archive — we’re sitting at about 18 petabytes right now.” — Michael Kirk [6:50]

“What AI is really offering us is a way to crunch through terabytes of data that are very difficult to move back to Earth.” — Raphael Attie [34:34]

You Might Also Like

How the Breakthrough Listen Harnessed AI in the Search for Aliens

UC Berkeley’s Gerry Zhang talks about his work using deep learning to analyze signals from space for signs of intelligent extraterrestrial civilizations. And while we haven’t found aliens yet, the doctoral student has already made some extraordinary discoveries.

Forget Storming Area 51, AI’s Helping Astronomers Scour the Skies for Habitable Planets

Astronomer Olivier Guyon and professor Damien Gratadour speak about the quest to discover nearby habitable planets using GPU-powered extreme adaptive optics in very large telescopes.

Astronomers Turn to AI as New Telescopes Come Online 

To turn the vast quantities of data that will be pouring out of new telescopes into world-changing scientific discoveries, Brant Robertson, a visiting professor at the Institute for Advanced Study in Princeton and an associate professor of astronomy at UC Santa Cruz, is turning to AI.

Clarifying Training Time, Startup Launches AI-Assisted Data Annotation

Creating a labeled dataset for training an AI application can hit the brakes on a company’s speed to market. Clarifai, an image and text recognition startup, aims to put that obstacle in the rearview mirror.

The New York City-based company today announced the general availability of its AI-assisted data labeling service, dubbed Clarifai Labeler. The company offers data labeling as a service as well.

Founded in 2013, Clarifai entered the image-recognition market in its early days. Since that time, the number of companies exploiting unstructured data for business advantages has swelled, creating a wave of demand for data scientists. And with industry disruption from image and text recognition spanning agriculture, retail, banking, construction, insurance and beyond, much is at stake.

“High-quality AI models start with high-quality dataset annotation. We’re able to use AI to make labeling data an order of magnitude faster than some of the traditional technologies out there,” said Alfredo Ramos, a senior vice president at Clarifai.

Backed by NVIDIA GPU Ventures, Clarifai is gaining traction in retail, banking and insurance, as well as for applications in federal, state and local agencies, he said.

AI Labeling with Benefits

Clarifai’s Labeler shines at labeling video footage. The tool integrates a statistical method so that an annotated object — one with a bounding box around it — can be tracked as it moves throughout the video.

Since each second of video is made up of multiple frames of images, the tracking capabilities result in increased accuracy and huge improvements in the quantity of annotations per object, as well as a drastic reduction in the time to label large volumes of data.
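
Clarifai hasn’t disclosed the statistical method behind Labeler’s tracking, but the general idea of carrying a labeled bounding box from one frame to the next can be sketched with a simple intersection-over-union match. The code below is a generic baseline for illustration, not Clarifai’s algorithm.

```python
# Illustrative sketch of propagating annotations across video frames by
# matching detections to existing tracks with intersection over union (IoU).
# This is a generic baseline, not Clarifai's proprietary tracker.

def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / float(area_a + area_b - inter) if inter else 0.0

def update_tracks(tracks, detections, threshold=0.5):
    """Assign each new detection to the best-overlapping existing track,
    or start a new track if nothing overlaps enough."""
    next_id = max(tracks, default=-1) + 1
    for det in detections:
        best_id, best_iou = None, threshold
        for tid, box in tracks.items():
            overlap = iou(box, det)
            if overlap > best_iou:
                best_id, best_iou = tid, overlap
        if best_id is None:
            best_id, next_id = next_id, next_id + 1
        tracks[best_id] = det          # the label travels with the track ID
    return tracks

tracks = {0: (100, 100, 180, 200)}                       # frame N: one labeled object
tracks = update_tracks(tracks, [(105, 103, 186, 204)])   # frame N+1
print(tracks)
```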

The new Labeler was most recently used to annotate days of video footage to build a model to detect whether people were wearing face masks, which resulted in a million annotations in less than four days.

Traditionally, this would’ve taken a human workforce six weeks to label the individual frames. With Labeler, they created 1 million annotations 10 times faster, said Ramos.

Clarifai uses an array of NVIDIA V100 Tensor Core GPUs onsite for development of models, and it taps into NVIDIA T4 GPUs in the cloud for inference.

Star-Powered AI 

Ramos reports to one of AI’s academic champions. CEO and founder Matthew Zeiler took the industry by storm when his neural networks dominated the ImageNet Challenge in 2013. That became his launchpad for Clarifai.

Zeiler has since evolved his research into developer-friendly products that allow enterprises to quickly and easily integrate AI into their workflows and customer experiences. The company continues to attract new customers, most recently, with the release of its natural language processing product.

While much has changed in the industry, Clarifai’s focus on research hasn’t.

“We have a sizable team of researchers, and we have become adept at taking some of the best research out there in the academic world and very quickly deploying it for commercial use,” said Ramos.

 

Clarifai is a member of NVIDIA Inception, a virtual accelerator program that helps startups in AI and data science get to market faster.

Image credit: Chris Curry via Unsplash.

Mass General’s Martinos Center Adopts AI for COVID, Radiology Research

Academic medical centers worldwide are building new AI tools to battle COVID-19 —  including at Mass General, where one center is adopting NVIDIA DGX A100 AI systems to accelerate its work.

Researchers at the hospital’s Athinoula A. Martinos Center for Biomedical Imaging are working on models to segment and align multiple chest scans, calculate lung disease severity from X-ray images, and combine radiology data with other clinical variables to predict outcomes in COVID patients.

Built and tested using Mass General Brigham data, these models, once validated, could be used together in a hospital setting during and beyond the pandemic to bring radiology insights closer to the clinicians tracking patient progress and making treatment decisions.

“While helping hospitalists on the COVID-19 inpatient service, I realized that there’s a lot of information in radiologic images that’s not readily available to the folks making clinical decisions,” said Matthew D. Li, a radiology resident at Mass General and member of the Martinos Center’s QTIM Lab. “Using deep learning, we developed an algorithm to extract a lung disease severity score from chest X-rays that’s reproducible and scalable — something clinicians can track over time, along with other lab values like vital signs, pulse oximetry data and blood test results.”

The Martinos Center uses a variety of NVIDIA AI systems, including NVIDIA DGX-1, to accelerate its research. This summer, the center will install NVIDIA DGX A100 systems, each built with eight NVIDIA A100 Tensor Core GPUs and delivering 5 petaflops of AI performance.

“When we started working on COVID model development, it was all hands on deck. The quicker we could develop a model, the more immediately useful it would be,” said Jayashree Kalpathy-Cramer, director of the QTIM lab and the Center for Machine Learning at the Martinos Center. “If we didn’t have access to the sufficient computational resources, it would’ve been impossible to do.”

Comparing Notes: AI for Chest Imaging

COVID patients often get imaging studies — usually CT scans in Europe, and X-rays in the U.S. — to check for the disease’s impact on the lungs. Comparing a patient’s initial study with follow-ups can be a useful way to understand whether a patient is getting better or worse.

But segmenting and lining up two scans that have been taken in different body positions or from different angles, with distracting elements like wires in the image, is no easy feat.

Bruce Fischl, director of the Martinos Center’s Laboratory for Computational Neuroimaging, and Adrian Dalca, assistant professor in radiology at Harvard Medical School, took the underlying technology behind Dalca’s MRI comparison AI and applied it to chest X-rays, training the model on an NVIDIA DGX system.

“Radiologists spend a lot of time assessing if there is change or no change between two studies. This general technique can help with that,” Fischl said. “Our model labels 20 structures in a high-resolution X-ray and aligns them between two studies, taking less than a second for inference.”

This tool can be used in concert with Li and Kalpathy-Cramer’s research: a risk assessment model that analyzes a chest X-ray to assign a score for lung disease severity. The model can provide clinicians, researchers and infectious disease experts with a consistent, quantitative metric for lung impact, which is described subjectively in typical radiology reports.
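
The team’s published model is described in their paper; as a generic illustration of the pattern (a CNN backbone with a single regression output for severity), here is a minimal tf.keras sketch. The backbone, input size and loss are assumptions, not the authors’ exact design.

```python
# Minimal sketch of a CNN that maps a chest X-ray to a single lung-disease
# severity score. Backbone, input size and loss are illustrative choices,
# not the Martinos Center's published architecture.
import tensorflow as tf
from tensorflow.keras import layers

backbone = tf.keras.applications.DenseNet121(
    include_top=False, weights=None, input_shape=(320, 320, 3), pooling="avg")

score = layers.Dense(1, activation="linear", name="severity_score")(backbone.output)
model = tf.keras.Model(backbone.input, score)

model.compile(optimizer="adam", loss="mse", metrics=["mae"])
# model.fit(xrays, severity_labels, epochs=20)  # training data not shown
```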

Trained on a public dataset of over 150,000 chest X-rays, as well as a few hundred COVID-positive X-rays from Mass General, the severity score AI is being used for testing by four research groups at the hospital using the NVIDIA Clara Deploy SDK. Beyond the pandemic, the team plans to expand the model’s use to more conditions, like pulmonary edema, or wet lung.

Comparing the AI lung disease severity score, or PXS, between images taken at different stages can help clinicians track changes in a patient’s disease over time. (Image from the researchers’ paper in Radiology: Artificial Intelligence, available under open access.)

Foreseeing the Need for Ventilators

Chest imaging is just one variable in a COVID patient’s health. For the broader picture, the Martinos Center team is working with Brandon Westover, executive director of Mass General Brigham’s Clinical Data Animation Center.

Westover is developing AI models that predict clinical outcomes for both admitted patients and outpatient COVID cases, and Kalpathy-Cramer’s lung disease severity score could be integrated as one of the clinical variables for this tool.

The outpatient model analyzes 30 variables to create a risk score for each of hundreds of patients screened at the hospital network’s respiratory infection clinics — predicting the likelihood a patient will end up needing critical care or dying from COVID.

For patients already admitted to the hospital, a neural network predicts the hourly risk that a patient will require artificial breathing support in the next 12 hours, using variables including vital signs, age, pulse oximetry data and respiratory rate.
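
Neither model’s code is public; the sketch below shows the general shape of such a risk model, a small network mapping a vector of clinical variables to a probability of needing breathing support. The feature count and layer sizes are assumptions.

```python
# Hedged sketch of a risk model mapping clinical variables (vital signs, age,
# pulse oximetry, respiratory rate, ...) to the probability a patient will
# need breathing support in the next 12 hours. Features and layer sizes are
# assumptions, not the published models.
import tensorflow as tf
from tensorflow.keras import layers

NUM_FEATURES = 30   # illustrative; the article cites ~30 variables for the outpatient model

risk_model = tf.keras.Sequential([
    layers.Input(shape=(NUM_FEATURES,)),
    layers.Dense(64, activation="relu"),
    layers.Dense(32, activation="relu"),
    layers.Dense(1, activation="sigmoid"),   # hourly risk in [0, 1]
])
risk_model.compile(optimizer="adam",
                   loss="binary_crossentropy",
                   metrics=[tf.keras.metrics.AUC()])
# risk_model.fit(patient_features, needed_ventilation, epochs=30)  # data not shown
```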

“These variables can be very subtle, but in combination can provide a pretty strong indication that a patient is getting worse,” Westover said. Running on an NVIDIA Quadro RTX 8000 GPU, the model is accessible through a front-end portal clinicians can use to see who’s most at risk, and which variables are contributing most to the risk score.

Better, Faster, Stronger: Research on NVIDIA DGX

Fischl says NVIDIA DGX systems help Martinos Center researchers more quickly iterate, experimenting with different ways to improve their AI algorithms. DGX A100, with NVIDIA A100 GPUs based on the NVIDIA Ampere architecture, will further speed the team’s work with third-generation Tensor Core technology.

“Quantitative differences make a qualitative difference,” he said. “I can imagine five ways to improve our algorithm, each of which would take seven hours of training. If I can turn those seven hours into just an hour, it makes the development cycle so much more efficient.”

The Martinos Center will use NVIDIA Mellanox switches and VAST Data storage infrastructure, enabling its developers to use NVIDIA GPUDirect technology to bypass the CPU and move data directly into or out of GPU memory, achieving better performance and faster AI training.

“Having access to this high-capacity, high-speed storage will allow us to analyze raw multimodal data from our research MRI, PET and MEG scanners,” said Matthew Rosen, assistant professor in radiology at Harvard Medical School, who co-directs the Center for Machine Learning at the Martinos Center. “The VAST storage system, when linked with the new A100 GPUs, is going to offer an amazing opportunity to set a new standard for the future of intelligent imaging.”

To learn more about how AI and accelerated computing are helping healthcare institutions fight the pandemic, visit our COVID page.

Main image shows a chest X-ray and corresponding heat map, highlighting areas with lung disease. Image from the researchers’ paper in Radiology: Artificial Intelligence, available under open access.

Nerd Watching: GPU-Powered AI Helps Researchers Identify Individual Birds

Anyone can tell an eagle from an ostrich. It takes a skilled birdwatcher to tell a chipping sparrow from a house sparrow from an American tree sparrow.

Now researchers are using AI to take this to the next level — identifying individual birds.

André Ferreira, a Ph.D. student at France’s Centre for Functional and Evolutionary Ecology, harnessed an NVIDIA GeForce RTX 2070 to train a powerful AI that identifies individual birds within the same species.

It’s the latest example of how deep learning has become a powerful tool for wildlife biologists studying a wide range of animals.

Marine biologists with the U.S. National Oceanic and Atmospheric Administration use deep learning to identify and track the endangered North Atlantic right whale. Zoologist Dan Rubenstein uses deep learning to distinguish between individuals in herds of Grevy’s zebras.

The sociable weaver isn’t endangered. But understanding the role of an individual in a group is key to understanding how the birds, native to Southern Africa, work together to build their nests.

The problem: it’s hard to tell the small, rust-colored birds apart, especially when trying to capture their activities in the wild.

In a paper released last week, Ferreira detailed how he and a team of researchers trained a convolutional neural network to identify individual birds.

Ferreira built his model using Keras, a popular open-source neural network library, running on a GeForce RTX 2070 GPU.

He then teamed up with researchers at Germany’s Max Planck Institute of Animal Behavior. Together, they adapted the model to identify wild great tits and captive zebra finches, two other widely studied bird species.

To train their models — a crucial step towards building any modern deep-learning-based AI — researchers made feeders equipped with cameras.

The researchers fitted birds with electronic tags, which triggered sensors in the feeders that logged each bird’s identity.

This data gave the model a “ground truth” that it could check against for accuracy.
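
The paper’s full pipeline isn’t reproduced here, but the general shape of the approach, a Keras CNN classifying feeder photos by individual bird, can be sketched as follows. The directory layout, image size and architecture are illustrative assumptions; the labels come from the tag-triggered feeder photos described above.

```python
# Illustrative Keras sketch of a CNN that classifies feeder photos by
# individual bird. Directory layout, image size and architecture are
# assumptions; labels come from the tagged-feeder setup described above.
import tensorflow as tf
from tensorflow.keras import layers

train_ds = tf.keras.utils.image_dataset_from_directory(
    "feeder_photos/train",          # one subfolder per individual bird
    image_size=(224, 224), batch_size=32)

num_birds = len(train_ds.class_names)

model = tf.keras.Sequential([
    layers.Rescaling(1.0 / 255, input_shape=(224, 224, 3)),
    layers.Conv2D(32, 3, activation="relu"), layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"), layers.MaxPooling2D(),
    layers.Conv2D(128, 3, activation="relu"), layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dense(num_birds, activation="softmax"),   # one class per bird
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(train_ds, epochs=15)   # feeder images not included here
```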

The team’s AI was able to identify individual sociable weavers and wild great tits more than 90 percent of the time. And it identified captive zebra finches 87 percent of the time.

For bird researchers, the work promises several key benefits.

Using cameras and other sensors to track birds allows researchers to study bird behavior much less invasively.

With less need to put people in the field, the technique also lets researchers track bird behavior over longer periods.

Next: Ferreira and his colleagues are working to build AI that can recognize individual birds it has never seen before, and better track groups of birds.

Birdwatching may never be the same.

Featured image credit: Bernard DuPont, some rights reserved.

AI Goes Uptown: A Tour of Smart Cities Around the Globe 

There are as many ways to define a smart city as there are cities on the road to being smart.

From London and Singapore to Seat Pleasant, Maryland, they vary widely. Most share some common characteristics.

Every city wants to be smart about being a great place to live. So, many embrace broad initiatives for connecting their citizens to the latest 5G and fiber optic networks, expanding digital literacy and services.

Most agree that a big part of being smart means using technology to make their cities more self-aware, automated and efficient.

That’s why a smart city is typically a kind of municipal Internet of Things — a network of cameras and sensors that can see, hear and even smell. These sensors, especially video cameras, generate massive amounts of data that can serve many civic purposes like helping traffic flow smoothly.

Cities around the globe are turning to AI to sift through that data in real time for actionable insights. And, increasingly, smart cities build realistic 3D simulations of themselves, digital twins to test out ideas of what they might look like in the future.

“We define a smart city as a place applying advanced technology to improve the quality of life for people who live in it,” said Sokwoo Rhee, who’s worked on more than 200 smart city projects in 25 countries as an associate director for cyber-physical systems innovation at the U.S. National Institute of Standards and Technology.

U.S., London Issue Smart Cities Guidebooks

At NIST, Rhee oversees work on a guide for building smart cities. Eventually it will include reports on issues and case studies in more than two dozen areas from public safety to water management systems.

Across the pond, London describes its smart city efforts in a 60-page document that details many ambitious goals. Like smart cities from Dubai to San Jose in Silicon Valley, it’s a metro-sized work in progress.

An image from the Smart London guide.

“We are far from the ideal at the moment with a multitude of systems and a multitude of vendors making the smart city still somewhat complex and fragmented,” said Andrew Hudson-Smith, who is chair of digital urban systems at The Centre for Advanced Spatial Analysis at University College London and sits on a board that oversees London’s smart city efforts.

Living Labs for AI

In a way, smart cities are both kitchen sinks and living labs of technology.

They host everything from air-quality monitoring systems to repositories of data cleared for use in shared AI projects. The London Datastore, for example, already contains more than 700 publicly available datasets.

One market researcher tracks a basket of 13 broad areas that define a smart city, from smart streetlights to connected garbage cans. A smart-parking vendor in Stockholm took into account 24 factors — including the number of Wi-Fi hotspots and electric-vehicle charging stations — in its 2019 ranking of the world’s 100 smartest cities. (Its top five were all in Scandinavia.)

“It’s hard to pin it down to a limited set of technologies because everything finds its way into smart cities,” said Dominique Bonte, a managing director at market watcher ABI Research. Among popular use cases, he called out demand-response systems as “a huge application for AI because handling fluctuating demand for electricity and other services is a complex problem.”

Sweden’s EasyPark lists 24 factors that define a smart city.

Because it’s broad, it’s also big. Market watchers at Navigant Research expect the global market for smart-city gear to grow from $97.4 billion in annual revenue in 2019 to $265.4 billion by 2028 at a compound annual growth rate of 11.8 percent.

It’s still early days. In a January 2019 survey of nearly 40 U.S. local and state government managers, more than 80 percent thought a municipal Internet of Things would have a significant impact on their operations, but most were still in a planning phase and less than 10 percent had active projects.

Most smart cities are still under construction, according to a NIST survey.

“Smart cities mean many things to many people,” said Saurabh Jain, product manager of Metropolis, NVIDIA’s GPU software stack for vertical markets such as smart cities.

“Our focus is on building what we call the AI City with the real jobs that can be done today with deep learning, tapping into the massive video and sensor datasets cities generate,” he said.

For example, Verizon deployed video nodes based on the NVIDIA Jetson TX1 on existing streetlights in Boston and Sacramento to analyze and improve traffic flow, enhance pedestrian safety and optimize parking.

“Rollout is happening fast across the globe and cities are expanding their lighting infrastructure to become a smart-city platform … helping to create efficiency savings and a new variety of citizen services,” said David Tucker, head of product management in the Smart Communities Group at Verizon, in a 2018 article.

Smart Streetlights for Smart Cities

Streetlights will be an important part of the furniture of tomorrow’s smart city.

So far, only a few hundred are outfitted with various mixes of sensors and Wi-Fi and cellular base stations. The big wave is yet to come as the estimated 360 million light posts around the world slowly upgrade to energy-saving LED lights.

A European take on a smart streetlight.

In a related effort, the city of Bellevue, Washington, tested a computer vision system from Microsoft Research to improve traffic safety and reduce congestion. Researchers at the University of Wollongong recently described similar work using NVIDIA Jetson TX2 modules to track the flow of vehicles and pedestrians in Liverpool, Australia.

Airports, retail stores and warehouses are already using smart cameras and AI to run operations more efficiently. They are defining a new class of edge computing networks that smart cities can leverage.

For example, Seattle-Tacoma International Airport (SEA) will roll out an AI system from startup Assaia that uses NVIDIA GPUs to shorten the time needed to turn around flights.

“Video analytics is crucial in providing full visibility over turnaround activities as well as improving safety,” said an SEA manager in a May report.

Nashville, Zurich Explore the Simulated City

Some smart cities are building digital twins, 3D simulations that serve many purposes.

For example, both Zurich and Nashville will someday let citizens and city officials don goggles at virtual town halls to see simulated impacts of proposed developments.

“The more immersive and fun an experience, the more you increase engagement,” said Dominik Tarolli, director of smart cities at Esri, which is supplying simulation software that runs on NVIDIA GPUs for both cities.

Cities as far apart in geography and population as Singapore and Rennes, France, built digital twins using a service from Dassault Systèmes.

“We recently signed a partnership with Hong Kong and presented examples for a walkability study that required a 3D simulation of the city,” said Simon Huffeteau, a vice president working on smart cities for Dassault.

Europe Keeps an AI on Traffic

Many smart cities get started with traffic control. London uses digital signs to post speed limits that change to optimize traffic flow. It also uses license-plate recognition to charge tolls for entering a low-emission zone in the city center.

Cities in Belgium and France are considering similar systems.

“We think in the future cities will ban the most polluting vehicles to encourage people to use public transportation or buy electric vehicles,” said Bonte of ABI Research. “Singapore is testing autonomous shuttles on a 5.7-mile stretch of its streets,” he added.

Nearby, Jakarta uses a traffic-monitoring system from Nodeflux, a member of NVIDIA’s Inception program that nurtures AI startups. The software taps AI and the nearly 8,000 cameras already in place around Jakarta to recognize license plates of vehicles with unpaid taxes.

The system is one of more than 100 third-party applications that run on Metropolis, NVIDIA’s application framework for the Internet of Things.

Unsnarling Traffic in Israel and Kansas City

Traffic was the seminal app for a smart-city effort in Kansas City that started in 2015 with a $15 million smart streetcar. Today, residents can call up digital dashboards detailing current traffic conditions around town.

And in Israel, the city of Ashdod deployed AI software from viisights. It helps the city understand patterns in a traffic monitoring system powered by NVIDIA Metropolis, improving safety for citizens.

NVIDIA created the AI City Challenge to advance work on deep learning as a tool to unsnarl traffic. Now in its fourth year, it draws nearly 1,000 researchers competing in more than 300 teams that include members from multiple city and state traffic agencies.

The event spawned CityFlow, one of the world’s largest datasets for applying AI to traffic management. It consists of more than three hours of synchronized high-definition videos from 40 cameras at 10 intersections, creating 200,000 annotated bounding boxes around vehicles captured from different angles under various conditions.

Drones to the Rescue in Maryland

You don’t have to be a big city with lots of money to be smart. Seat Pleasant, Maryland, a Washington, D.C., suburb of less than 5,000 people, launched a digital hub for city services in August 2017.

Since then, it has installed intelligent lighting, connected waste cans, home health monitors and video analytics to save money, improve traffic safety and reduce crime. It’s also become the first U.S. city to use drones for public safety, including plans for life-saving delivery of emergency medicines.

The idea got its start when Mayor Eugene Grant, searching for ways to recover from the 2008 economic downturn, attended an event on innovation villages.

“Seat Pleasant would like to be a voice for small cities in America where 80 percent have less than 10,000 residents,” said Grant. “Look at these cities as test beds of innovation … living labs,” he added.

Mayor Grant of Seat Pleasant aims to set an example of how small towns can become smart cities.

Rhee of NIST agrees. “I’m seeing a lot of projects embracing a broadening set of emerging technologies, making smart cities like incubation programs for new businesses like air taxis and autonomous vehicles that can benefit citizens,” he said, noting that even rural communities will get into the act.

Simulating a New Generation of Smart Cities

When the work is done, go to the movies. Hollywood might provide a picture of the next horizon in the same way it sparked some of the current work.

Esri’s tools are used to simulate cities for movies as well as the real world.

Flicks including Blade Runner 2049, Cars, Guardians of the Galaxy and Zootopia used a program called City Engine from startup Procedural that enables a rule-based approach to constructing simulated cities.

Their work caught the eye of Esri, which acquired the company and bundled its program with its ArcGIS Urban planning tool, now a staple for hundreds of real cities worldwide.

“Games and movies make people expect more immersive experiences, and that requires more computing,” said Tarolli, a co-founder of Procedural and now Esri’s point person on smart cities.

Deep Learning on Tap: NVIDIA Engineer Turns to AI, GPU to Invent New Brew

Some dream of code. Others dream of beer. NVIDIA’s Eric Boucher does both at once, and the result couldn’t be more aptly named.

Full Nerd #1 is a crisp, light-bodied blonde ale perfect for summertime quaffing.

Eric, an engineer in the GPU systems software kernel driver team, went to sleep one night in May wrestling with two problems.

One, he had to wring key information from the often cryptic logs for the systems he oversees to help his team respond to issues faster.

The other: the veteran home brewer wanted a way to brew new kinds of beer.

“I woke up in the morning and I knew just what to do,” Boucher said. “Basically I got both done on one night’s broken sleep.”

Both solutions involved putting deep learning to work on an NVIDIA TITAN V GPU. Such powerful gear tends to encourage this sort of parallel processing, it seems.

Eric, a native of France now based near Sacramento, Calif., began homebrewing two decades ago, inspired by a friend and mentor at Sun Microsystems. He took a break from it when his children were first born.

Now that they’re older, he’s begun brewing again in earnest, using gear in both his garage and backyard, turning to AI for new recipes this spring.

Of course, AI has been used in the past to help humans analyze beer flavors, and even create wild new craft beer names. Eric’s project, however, is more ambitious, because it’s relying on AI to create new beer recipes.

You’ve Got Ale — GPU Speeds New Brew Ideas

For training data, Eric started with the all-grain ale recipes from MoreBeer, a hub for brewing enthusiasts, where he usually shops for recipe kits and ingredients.

Eric focused on ales because they’re relatively easy and quick to brew, and encompass a broad range of different styles, from hearty Irish stout to tangy and refreshing Kölsch.

He used wget — an open source program that retrieves content from the web — to save four index pages of ale recipes.

Then, using a Python script, he filtered the downloaded HTML pages and downloaded the linked recipe PDFs. He then converted the PDFs to plain text and used another Python script to interpret the text and generate recipes in a standardized format.
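
Eric’s exact scripts aren’t published. A rough sketch of that kind of pipeline, converting each downloaded recipe PDF to plain text with the pdftotext command-line tool and concatenating the results into one training file, might look like this; the paths and the choice of pdftotext are assumptions.

```python
# Rough sketch of a PDF-to-text prep pipeline like the one described above.
# Paths and the use of the `pdftotext` CLI (from poppler-utils) are assumptions;
# Eric's actual scripts are not published.
import pathlib
import subprocess

pdf_dir = pathlib.Path("downloads/ale_recipes")
out_file = pathlib.Path("recipes.txt")

with out_file.open("w", encoding="utf-8") as out:
    for pdf in sorted(pdf_dir.glob("*.pdf")):
        txt = pdf.with_suffix(".txt")
        # `pdftotext recipe.pdf recipe.txt` writes a plain-text version of the PDF
        subprocess.run(["pdftotext", str(pdf), str(txt)], check=True)
        out.write(txt.read_text(encoding="utf-8"))
        out.write("\n%%\n")   # simple recipe delimiter for later parsing
```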

He fed these 108 recipes — including one for Russian River Brewing’s legendary Pliny the Elder IPA — to textgenrnn, a library built on recurrent neural networks, which learn from a sequence of data to help guess what should come next.

And, because no one likes to wait for good beer, he ran it on an NVIDIA TITAN V GPU. Eric estimates it cuts the time to learn from the recipes database to seven minutes from one hour and 45 minutes using a CPU alone.
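
Since the article names textgenrnn, a minimal training-and-sampling run would look roughly like the sketch below. The epoch count, sampling temperature and file name are assumptions; the underlying Keras/TensorFlow stack picks up the GPU automatically when one is available.

```python
# Minimal textgenrnn sketch: train a character-level RNN on the standardized
# recipe text, then sample new recipes. Epochs, temperature and file names
# are illustrative; the GPU is used automatically by the Keras backend.
from textgenrnn import textgenrnn

textgen = textgenrnn()
textgen.train_from_file("recipes.txt", new_model=True, num_epochs=20)

# Lower temperature -> more conservative recipes; higher -> weirder ones.
new_recipes = textgen.generate(5, temperature=0.6, return_as_list=True)
for recipe in new_recipes:
    print(recipe, end="\n\n")
```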

After a little tuning, Eric generated 10 beer recipes. They ranged from dark stouts to yellowish ales, and in flavor from bitter to light.

To Eric’s surprise, most looked reasonable (though a few were “plain weird and impossible to brew,” like a recipe that instructed him to wait 45 days with hops in the wort, or unfermented beer, before adding the yeast).

Speed of Light (Beer)

With the approaching hot California summer in mind, Eric selected a blonde ale.

He was particularly intrigued because the recipe suggested adding Warrior, Cascade and Amarillo hops — the flowers of the herbaceous perennial Humulus lupulus that give good beer a range of flavors, from bitter to citrusy — on an “intriguing schedule.”

The result, Eric reports, was refreshing, “not too sweet, not too bitter,” with “a nice, fresh hops smell and a long, complex finish.”

He dubbed the result Full Nerd #1.

The AI-generated brew became the latest in a long line of brews with witty names Eric has produced, including a bourbon oak-infused beer named, appropriately enough, “The Groot Beer,” in honor of the tree-like creature from Marvel’s “Guardians of the Galaxy.”

Eric’s next AI brewing project: perhaps a dark stout, for winter, or a lager, a light, crisp beer that requires months of cold storage to mature.

For now, however, there’s plenty of good brew to drink. Perhaps too much. Eric usually shares his creations with his martial arts buddies. But with social distancing in place amidst the global COVID-19 pandemic, the five gallons, or forty pints, is more than the light drinker knows what to do with.

Eric, it seems, has found a problem deep learning can’t help him with. Bottoms up.

Fleet Dreams Are Made of These: TuSimple and Navistar to Build Autonomous Trucks Powered by NVIDIA DRIVE

Self-driving trucks are coming to an interstate near you.

Autonomous trucking startup TuSimple and truck maker Navistar recently announced they will build self-driving semi trucks, powered by the NVIDIA DRIVE AGX platform. The collaboration is one of the first to develop autonomous trucks, set to begin production in 2024.

Over the past decade, self-driving truck developers have relied on traditional trucks retrofitted with the sensors, hardware and software necessary for autonomous driving. Building these trucks from the ground up, however, allows companies to custom-build them for the needs of a self-driving system as well as take advantage of the infrastructure of a mass production truck manufacturer.

This transition is the first step from research to widespread deployment, said Chuck Price, chief product officer at TuSimple.

“Our technology, developed in partnership with NVIDIA, is ready to go to production with Navistar,” Price said. “This is a significant turning point for the industry.”

Tailor-Made Trucks

Developing a truck to drive on its own takes more than a software upgrade.

Autonomous driving relies on redundant and diverse deep neural networks, all running simultaneously to handle perception, planning and actuation. This requires massive amounts of compute.

The NVIDIA DRIVE AGX platform delivers high-performance, energy-efficient compute to enable AI-powered and autonomous driving capabilities. TuSimple has been using the platform in its test vehicles and pilots, such as its partnership with the United States Postal Service.

Building dedicated autonomous trucks makes it possible for TuSimple and Navistar to develop a centralized architecture optimized for the power and performance of the NVIDIA DRIVE AGX platform. The platform is also automotive grade, meaning it is built to withstand the wear and tear of years of driving on interstate highways.

Invaluable Infrastructure

In addition to a customized architecture, developing an autonomous truck in partnership with a manufacturer opens up valuable infrastructure.

Truck makers like Navistar provide nationwide support for their fleets, with local service centers and vehicle tracking. This network is crucial for deploying self-driving trucks that will criss-cross the country on long-haul routes, providing seamless and convenient service to maintain efficiency.

TuSimple is also building out an HD map network of the nation’s highways for the routes its vehicles will travel. Combined with the widespread fleet management network, this infrastructure makes its autonomous trucks appealing to a wide variety of partners — UPS, U.S. Xpress, Penske Truck Leasing and food service supply chain company McLane Inc., a Berkshire Hathaway company, have all signed on to this autonomous freight network.

And backed by the performance of NVIDIA DRIVE AGX, these vehicles will continue to improve, delivering safer, more efficient logistics across the country.

“We’re really excited as we move into production to have a partner like NVIDIA with us the whole way,” Price said.
