NVIDIA Maxine Hits the Scene to Create Real-Time Video Experiences

The next time you’re in a virtual meeting or streaming a game, live event or TV program, the star of the show may be NVIDIA Maxine, which took center stage at GTC today when NVIDIA CEO Jensen Huang announced the availability of the GPU-accelerated software development kit during his keynote address.

Developers from video conferencing, content creation and streaming providers are using the Maxine SDK to create real-time video-based experiences. And it’s easily deployed to PCs, data centers and the cloud.

Shift Towards Remote Work

Virtual collaboration continues to grow with 70 million hours of web meetings daily, and more global organizations are looking at technologies to support an increasingly remote workforce.

Pexip, a scalable video conferencing platform that enables interoperability between different video conferencing systems, was looking to push the boundaries of its video communications offering to meet this growing demand.

“We’re already using NVIDIA Maxine for audio noise removal and working on integrating virtual backgrounds to support premium video conferencing experiences for enterprises of all sizes,” said Giles Chamberlin, CTO and co-founder of Pexip.

Working with NVIDIA, Pexip aims to provide AI-powered video communications that make virtual meetings better than meeting in person.

It joins other companies in the video collaboration space like Avaya, which incorporated Maxine audio noise reduction into its Spaces app last October and has now implemented virtual backgrounds, which let presenters overlay their video on top of presentations.

Headroom uses AI to take distractions out of video conferencing so participants can focus on the meeting itself. Its features include flagging when people have questions, note taking, transcription and smart meeting summarization.

Seeing Face Value for Virtual Events

Research has shown that there are over 1 million virtual events yearly, and more event marketers plan to invest in them in the future. As a result, everyone from event organizers to visual effects artists is looking for faster, more efficient ways to create digital experiences.

Among them is Touchcast, which combines AI and mixed reality to reimagine virtual events. It’s using Maxine’s super-resolution features to upscale 1080p streams to 4K for delivery.

“NVIDIA Maxine is paving the future of video communications — a future where AI and neural networks enhance and enrich content in entirely new ways,” said Edo Segal, founder and CEO of Touchcast.

Another example is Notch, which creates tools that enable real-time visual effects and motion graphics for live events. Maxine provides it with real-time, AI-driven face and body tracking along with background removal.

Artists can track and mask performers in a live performance setting for a variety of creative use cases — all using a standard camera feed, eliminating the need for specialized hardware tracking solutions.

“The integration of the Maxine SDK was very easy and took just a few days to complete,” said Matt Swoboda, founder and director of Notch.

Field of Streams

With nearly 10 million content creators streaming on Twitch each month, becoming a live broadcaster has never been easier. Live streamers are looking for powerful yet easy-to-use features to excite their audiences.

BeLive, which provides a platform for live streaming user-generated talk shows, is using Maxine to process its video streams in the cloud so customers don’t have to invest in expensive equipment. By running Maxine in the cloud, users get high-quality background replacement regardless of the client hardware they’re on.

With BeLive, live interactive call-in talk shows can be produced easily and streamed to YouTube or Facebook Live, with participants calling in from around the world.

OBS, the leading platform for streaming and recording, is a free and open source software solution broadly used for game streaming and live production. Users with NVIDIA RTX GPUs can now take advantage of noise removal, improving the clarity of their audio during production.

Developers are using the Maxine SDK to build virtual collaboration and content creation applications.

A Look Into NVIDIA Maxine

NVIDIA Maxine includes three AI SDKs covering video effects, audio effects and augmented reality — each with pre-trained deep learning models, so developers can quickly build or enhance their real-time applications.

Starting with the NVIDIA Video Effects SDK, enterprises can apply AI effects to improve video quality without special cameras or other hardware. Features include super-resolution, which generates 720p output from 360p input live video, and artifact reduction, which removes defects for a crisper picture.

Video noise removal eliminates low-light camera noise introduced in the video capture process while preserving all of the details. To hide messy rooms or other visual distractions, the Video Effects SDK removes the background of a webcam feed in real time, so only a user’s face and body show up in a livestream.
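To make the integration pattern concrete, here’s a hedged sketch (our pseudocode, not NVIDIA’s API) of where such an effect sits in a standard webcam loop; apply_background_removal is a hypothetical stand-in for the SDK’s segmentation call:

import cv2

def apply_background_removal(frame):
    # Placeholder: a real implementation would run the SDK's GPU
    # segmentation model and composite the person over a clean backdrop.
    return frame

cap = cv2.VideoCapture(0)                    # a standard webcam, no special hardware
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    cv2.imshow("livestream", apply_background_removal(frame))  # per frame, in real time
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
cap.release()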

The NVIDIA Augmented Reality SDK enables real-time 3D face tracking using a standard web camera, delivering a more engaging virtual communication experience by automatically zooming into the face and keeping that face within view of the camera.

It’s now possible to detect human faces in images or video feeds, track the movement of facial expressions, create a 3D mesh representation of a person’s face, use video to track the movement of a human body in 3D space, simulate eye contact through gaze estimation and much more.

The NVIDIA Audio Effects SDK uses AI to remove distracting background noise from incoming and outgoing audio feeds, improving the clarity and quality of any conversation.

This includes the removal of unwanted background noises — like a dog barking or baby crying — to make conversations easier to understand. For meetings in large spaces, it’s also possible to remove room echoes from the background to make voices clearer.
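For a sense of the underlying task (Maxine’s approach is neural, whereas this toy NumPy sketch of ours is a simple spectral gate), background noise suppression can be thought of as estimating a noise floor and attenuating frequency bins beneath it:

import numpy as np

def denoise(audio, frame=512, floor_percentile=20):
    # Frame the signal, estimate a per-bin noise floor, then soft-gate each bin.
    frames = audio[: len(audio) // frame * frame].reshape(-1, frame)
    spec = np.fft.rfft(frames)
    mag = np.abs(spec)
    noise_floor = np.percentile(mag, floor_percentile, axis=0)
    gain = np.clip((mag - noise_floor) / (mag + 1e-9), 0.0, 1.0)
    return np.fft.irfft(spec * gain).reshape(-1)

cleaned = denoise(np.random.randn(48_000))   # stand-in for one second of mic audio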

Developers can add Maxine AI effects into their existing applications or develop new pipelines from scratch using NVIDIA DeepStream, an SDK for building intelligent video analytics applications, and NVIDIA Video Codec, an SDK for accelerated video encode and decode on Windows and Linux.

Maxine can also be used with NVIDIA Jarvis, a framework for building conversational AI applications, to offer world-class language-based capabilities such as transcription and translation.

Availability

Get started with NVIDIA Maxine.

And don’t let the curtain close on the opportunity to learn more about NVIDIA Maxine during GTC, running April 12-16. Registration is free.

A full list of Maxine-focused sessions can be found here. Be sure to watch Huang’s keynote address on-demand. And check out a demo (below) of Maxine.


Fast Track to Enterprise AI: New NVIDIA Workflow Lets Any User Choose, Adapt, Deploy Models Easily

AI is the most powerful new technology of our time, but it’s been a force that’s hard to harness for many enterprises — until now.

Many companies lack the specialized skills, access to large datasets or accelerated computing that deep learning requires. Others are realizing the benefits of AI and want to spread them quickly across more products and services.

For both, there’s a new roadmap to enterprise AI. It leverages technology that’s readily available, then simplifies the AI workflow with NVIDIA TAO and NVIDIA Fleet Command to make the trip shorter and less costly.

Grab and Go AI Models

The journey begins with pre-trained models. You don’t have to design and train a neural network from scratch in 2021. You can choose one of many available today in our NGC catalog.

We’ve curated models that deliver skills to advance your business. They span the spectrum of AI jobs, from computer vision and conversational AI to natural-language understanding and more.

Models Show Their AI Resumes

So users know what they’re getting, many models in the catalog come with credentials. They’re like the resume for a prospective hire.

Model credentials show you the domain the model was trained for, the dataset that trained it, how often the model was deployed and how it’s expected to perform. They provide transparency and confidence you’re picking the right model for your use case.

Leveraging a Massive Investment

NVIDIA invested hundreds of millions of GPU compute hours over more than five years refining these models. We did this work so you don’t have to.

Here are three quick examples of the R&D you can leverage:

  • For computer vision, we devoted 3,700 person-years to labeling 500 million objects from 45 million frames.
  • We used voice recordings to train our speech models on GPUs for more than a million hours.
  • A database of biomedical papers packing 6.1 billion words educated our models for natural-language processing.

Transfer Learning, Your AI Tailor

Once you choose a model, you can fine-tune it to fit your specific needs using NVIDIA TAO, the next stage of our expedited workflow for enterprise AI.

TAO enables transfer learning, a process that harvests features from an existing neural network and plants them in a new one using NVIDIA’s Transfer Learning Toolkit, an integrated part of TAO. It leverages small datasets users have on hand to give models a custom fit without the cost, time and massive datasets required to build and train a neural network from scratch.
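Here’s what that looks like in generic PyTorch terms (a sketch of the concept, not the Transfer Learning Toolkit’s API): freeze a pretrained backbone’s harvested features and train only a small new head on your own data.

import torch
import torchvision

model = torchvision.models.resnet50(pretrained=True)    # a pre-trained model
for param in model.parameters():
    param.requires_grad = False                         # keep harvested features fixed
model.fc = torch.nn.Linear(model.fc.in_features, 10)    # new head for, say, 10 custom classes

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
loss_fn = torch.nn.CrossEntropyLoss()

def train_step(images, labels):
    # Only the new head's weights receive gradient updates.
    optimizer.zero_grad()
    loss = loss_fn(model(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()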

Sometimes companies have an opportunity to further enhance models by training them across larger, more diverse datasets maintained by partners outside the walls of their data center.

TAO Lets Partners Collaborate with Privacy 

Federated learning, another part of TAO, lets different sites securely collaborate to refine a model for the highest accuracy. With this technique, users share components of models, such as partial weights. Datasets remain inside each company’s data center, so data privacy is preserved.
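In skeletal form, one round of the general technique (federated averaging; a sketch of the idea, not NVIDIA’s implementation) looks like this:

import numpy as np

def local_update(weights, private_data):
    # Placeholder for a site's local training pass on data that never leaves it.
    return weights + 0.01 * np.random.randn(*weights.shape)

def federated_round(global_weights, site_datasets):
    updates = [local_update(global_weights.copy(), d) for d in site_datasets]
    return np.mean(updates, axis=0)          # only model weights cross the network

weights = np.zeros(8)
site_datasets = [None] * 20                  # e.g., 20 collaborating research sites
for _ in range(5):
    weights = federated_round(weights, site_datasets)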

In one recent example, 20 research sites collaborated to raise the accuracy of the so-called EXAM model that predicts whether a patient has COVID-19. After applying federated learning, the model also could predict the severity of the infection and whether the patient would need supplemental oxygen. Patient data stayed safely behind the walls of each partner.

Taking Enterprise AI to Production

Once a model is fine-tuned, it needs to be optimized for deployment.

It’s a pruning process that makes models lean, yet robust, so they function efficiently on your target platform whether it’s an array of GPUs in a server or a Jetson-powered robot on the factory floor.

NVIDIA TensorRT, another part of TAO, tunes a model’s numerical precision and prunes it to an optimal balance of smallest size and highest accuracy for the system it will run on. It’s a crucial step, especially for real-time services like speech recognition or fraud detection that won’t tolerate system latency.
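As an illustration, building an optimized FP16 engine from an ONNX model with the TensorRT Python API looks roughly like this (a sketch assuming TensorRT 7-era calls; exact builder APIs vary by version, and model.onnx is a hypothetical file):

import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)

with open("model.onnx", "rb") as f:
    parser.parse(f.read())                   # import the trained model

config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.FP16)        # trade precision for size and speed
engine = builder.build_engine(network, config)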

Then, with the Triton Inference Server, users can select the optimal configuration to deploy, whatever the model’s architecture, the framework it uses, or the target CPU or GPU it will run on.

Once a model is optimized and ready for deployment, users can easily integrate it with whatever application framework fits their use case or industry. For example, it could be Jarvis for conversational AI, Clara for healthcare, Metropolis for video analytics or Isaac for robotics, to name just a few that NVIDIA provides.

Pre-trained models in NGC, along with TAO and Fleet Command, make for a simple but powerful AI workflow.

With the chosen application framework, users can launch NVIDIA Fleet Command to deploy and manage the AI application across a variety of GPU-powered devices. It’s the last key step in the journey.

Zero to AI in Minutes

Fleet Command connects NVIDIA-Certified servers deployed at the network’s edge to the cloud. With it, users can work from a browser to securely pair, orchestrate and manage millions of servers, deploy AI to any remote location and update software as needed.

Administrators monitor health and update systems with one click to simplify AI operations at scale.

Fleet Command uses end-to-end security protocols to ensure application data and intellectual property remain safe.

Data sent between the edge and the cloud is fully encrypted. And applications are scanned for malware and vulnerabilities before they are deployed.

An AI Workflow That’s on the Job

Fleet Command and elements of TAO are already in use in warehouses, in retail, in hospitals and on the factory floor. Users include companies such as Accenture, BMW and Siemens Digital Industries.

A demo (below) from the GTC keynote shows how the one-two-three combination of NGC models, TAO and Fleet Command can quickly tailor and deploy an application using multiple AI models.

You can sign up for Fleet Command today.

Core parts of TAO, such as the Transfer Learning Toolkit and federated learning, are available today. Apply now for early access to them all, fully integrated into TAO.


Dream State: Cybersecurity Vendors Detect Breaches in an Instant with NVIDIA Morpheus

In the geography of data center security, efforts have long focused on protecting north-south traffic — the data that passes between the data center and the rest of the network. But one of the greatest risks has become east-west traffic — network packets passing between servers within a data center.

That’s due to the growth of cloud-native applications built from microservices, whose connections across a data center are changing constantly. With a typical 1,000-server data center having over 1 billion network paths, it’s extremely difficult to write fixed rules that control the blast radius should a malicious actor get inside.

The new NVIDIA Morpheus AI application framework gives security teams complete visibility into security threats by bringing together unmatched AI processing and real-time monitoring of every packet moving through the data center. It lets them respond to anomalies and update policies immediately as threats are identified.

Combining the security superpowers of AI and NVIDIA BlueField data processing units (DPUs), Morpheus provides cybersecurity developers a highly optimized AI pipeline and pre-trained AI skills that, for the first time, allow them to instantaneously inspect all IP network communication through their data center fabric.

Bringing a new level of security to data centers, the framework provides the dynamic protection, real-time monitoring, adaptive policies and cyber defenses required to detect and remediate threats.

Continuous AI Analytics on Network Traffic

Morpheus combines event streaming from NVIDIA Cumulus NetQ with GPU-accelerated computing — RAPIDS data analytics pipelines, deep learning frameworks and the Triton Inference Server — and runs on mainstream NVIDIA-Certified enterprise servers. It simplifies the analysis of computer logs and helps detect and mitigate security threats. Pre-trained AI models help find leaked credentials, keys, passwords, credit card numbers and bank account numbers, and identify security policies that need to be hardened.
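As a toy illustration of the kind of inspection involved (simple regexes of ours, where Morpheus applies trained AI models at data center line rate), consider scanning streamed log lines for leaked secrets:

import re

PATTERNS = {
    "aws_access_key": re.compile(r"AKIA[0-9A-Z]{16}"),
    "credit_card":    re.compile(r"\b(?:\d[ -]*?){13,16}\b"),
    "private_key":    re.compile(r"-----BEGIN (?:RSA )?PRIVATE KEY-----"),
}

def scan(log_line):
    # Return the names of any sensitive patterns found in a log line.
    return [name for name, pattern in PATTERNS.items() if pattern.search(log_line)]

print(scan("user=svc token=AKIAABCDEFGHIJKLMNOP"))   # ['aws_access_key']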

Integrating the framework into a third-party cybersecurity offering brings the world’s best AI computing to communication networks. Morpheus can receive rich telemetry feeds from every NVIDIA BlueField DPU-accelerated server in the data center without impacting server performance. BlueField-2 DPUs act both as a sensor to collect real-time packet flows and as a policy enforcement point to limit communication between any microservice container or virtual machine in a data center.

By placing BlueField-2 DPUs in servers across the data center, Morpheus can automatically write and change policies to immediately remediate security threats — from changing the logs being collected and altering the volume of ingestion, to dynamically redirecting certain log events, blocking traffic newly identified as malicious, rewriting rules to enforce policy updates, and more.

Accelerate and Secure the Data Center with NVIDIA BlueField DPUs 

The NVIDIA BlueField-2 DPU, available today, enables true software-defined, hardware-accelerated data center infrastructure. By having software-defined networking policies and telemetry collection run on the BlueField DPU before entering the server, the DPU offloads, accelerates, and isolates critical data center functions without burdening the server’s CPU. The DPU also extends the simple static security logging model and implements sophisticated dynamic telemetry that evolves with new policies being determined and adjusted.

Learn more about NVIDIA Morpheus and apply for early access, currently available in the U.S. and Israel.


NVIDIA’s New CPU to ‘Grace’ World’s Most Powerful AI-Capable Supercomputer

NVIDIA’s new Grace CPU will power the world’s most powerful AI-capable supercomputer.

The Swiss National Supercomputing Centre’s (CSCS) new system, Alps, will use Grace, a revolutionary Arm-based data center CPU introduced by NVIDIA today, to enable breakthrough research in a wide range of fields.

From climate and weather to materials sciences, astrophysics, computational fluid dynamics, life sciences, molecular dynamics, quantum chemistry and particle physics, as well as domains like economics and social sciences, Alps will play a key role in advancing science throughout Europe and worldwide when it comes online in 2023.

“We are thrilled to announce the Swiss National Supercomputing Center will build a supercomputer powered by Grace and our next-generation GPU,” NVIDIA CEO Jensen Huang said Monday during his keynote at NVIDIA’s GPU Technology Conference.

Alps will be built by Hewlett Packard Enterprise using the new HPE Cray EX supercomputer product line as well as the NVIDIA HGX supercomputing platform, including NVIDIA GPUs and the NVIDIA HPC SDK as well as the new Grace CPU.

The Alps system will replace CSCS’s existing Piz Daint supercomputer.

AI: A New Kind of Supercomputing

Alps is one of the new generation of machines that are expanding supercomputing beyond traditional modeling and simulation by taking advantage of GPU-accelerated deep learning.

“Deep learning is just an incredibly powerful set of tools that we add to the toolbox,” said CSCS Director Thomas Schulthess.

Taking advantage of the tight coupling between NVIDIA CPUs and GPUs, Alps is expected to be able to train GPT-3, the world’s largest natural language processing model, in only two days — 7x faster than NVIDIA’s Selene supercomputer, which delivers 2.8 exaflops of AI performance and is currently recognized as the world’s leading supercomputer for AI by MLPerf.

CSCS users will be able to apply this incredible AI performance to a wide range of emerging scientific research that can benefit from natural language understanding.

This includes, for example, analyzing and understanding massive amounts of knowledge available in scientific papers and generating new molecules for drug discovery.

Soul of the New Machine

Based on the hyper-efficient Arm microarchitecture found in billions of smartphones and other edge computing devices, Grace will deliver 10x the performance of today’s fastest servers on the most complex AI and high-performance computing workloads.

Grace will support the next generation of NVIDIA’s coherent NVLink interconnect technology, allowing data to move more quickly between system memory, CPUs and GPUs.

And thanks to growing GPU support for data science acceleration at ever-larger scales, Alps will also be able to accelerate a bigger chunk of its users’ workflows, such as ingesting the vast quantities of data needed for modern supercomputing.

“The scientists will not only be able to carry out simulations, but also pre-process or post-process their data,” Schulthess said. “This makes the whole workflow more efficient for them.”

From Particle Physics to Weather Forecasts

CSCS has long supported scientists who are working at the cutting edge, particularly in materials science, weather forecasting and climate modeling, and understanding data streaming in from a new generation of scientific instruments.

CSCS designs and operates a dedicated system for numerical weather predictions (NWP) on behalf of MeteoSwiss, the Swiss meteorological service. This system has been running on GPUs since 2016.

That long-standing experience with operational NWP on GPUs will be key to future climate simulations as well — key not only to modeling long-term changes to climate, but to building models able to more accurately predict extreme weather events, saving lives.

One of that team’s goals is to run global climate models with a spatial resolution of 1 km that can map convective clouds such as thunderclouds.

The CSCS supercomputer is also used by Swiss scientists for the analysis of data from the Large Hadron Collider (LHC) at CERN, the European Organization for Nuclear Research. It is the Swiss Tier-2 system in the Worldwide LHC Computing Grid.

Based in Geneva, the LHC — at $9 billion, one of the most expensive scientific instruments ever built — generates 90 petabytes of data a year.

Alps uses a new software-defined infrastructure that can support a wide range of projects.

As a result, in the future, different teams, such as those from MeteoSwiss, will be able to use one or more partitions on a single, unified infrastructure, rather than different machines.

These can be virtual ad-hoc clusters for individual users or predefined clusters that research teams can put together with CSCS and then operate themselves.


What Is Quantum Computing?

Twenty-seven years before Steve Jobs unveiled a computer you could put in your pocket, physicist Paul Benioff published a paper showing it was theoretically possible to build a much more powerful system you could hide in a thimble — a quantum computer.

Named for the subatomic physics it aimed to harness, the concept Benioff described in 1980 still fuels research today, including efforts to build the next big thing in computing: a system that could make a PC look in some ways as quaint as an abacus.

Richard Feynman — a Nobel Prize winner whose wit-laced lectures brought physics to a broad audience — helped establish the field, sketching out how such systems could simulate quirky quantum phenomena more efficiently than traditional computers.

So, What Is Quantum Computing?

Quantum computing uses the physics that governs subatomic particles to perform sophisticated parallel calculations, replacing the simpler transistors in today’s computers.

Quantum computers calculate using qubits, computing units that can be on, off or any value between, instead of the bits in traditional computers that are either on or off, one or zero. The qubit’s ability to live in the in-between state — called superposition — adds a powerful capability to the computing equation, making quantum computers superior for some kinds of math.

What Does a Quantum Computer Do?

Using qubits, quantum computers could buzz through calculations that would take classical computers a loooong time — if they could even finish them.

For example, today’s computers use eight bits to represent any number between 0 and 255. Thanks to features like superposition, a quantum computer can use eight qubits to represent every number between 0 and 255, simultaneously.

It’s a feature like parallelism in computing: All possibilities are computed at once rather than sequentially, providing tremendous speedups.
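Concretely (an illustration of ours, in notation the post otherwise avoids), the state of an eight-qubit register is a weighted combination of all 256 basis states at once:

|ψ⟩ = a₀|0⟩ + a₁|1⟩ + … + a₂₅₅|255⟩,  with  |a₀|² + |a₁|² + … + |a₂₅₅|² = 1

A measurement returns just one of those values, so quantum algorithms are designed to make the amplitudes of wrong answers cancel and those of right answers reinforce.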

So, while a classical computer steps through long division calculations one at a time to factor a humongous number, a quantum computer can cut through the problem in exponentially fewer steps. Boom!

That means quantum computers could reshape whole fields, like cryptography, that are based on factoring what are today impossibly large numbers.

A Big Role for Tiny Simulations

That could be just the start. Some experts believe quantum computers will bust through limits that now hinder simulations in chemistry, materials science and anything involving worlds built on the nano-sized bricks of quantum mechanics.

Quantum computers could even extend the life of semiconductors by helping engineers create more refined simulations of the quantum effects they’re starting to find in today’s smallest transistors.

Indeed, experts say quantum computers ultimately won’t replace classical computers, they’ll complement them. And some predict quantum computers will be used as accelerators much as GPUs accelerate today’s computers.

How Does Quantum Computing Work?

Don’t expect to build your own quantum computer like a DIY PC with parts scavenged from discount bins at the local electronics shop.

The handful of systems operating today typically require refrigeration that creates working environments just north of absolute zero. They need that computing arctic to handle the fragile quantum states that power these systems.

In a sign of how hard constructing a quantum computer can be, one prototype suspends an atom between two lasers to create a qubit. Try that in your home workshop!

Quantum computing takes nano-Herculean muscles to create something called entanglement. That’s when two or more qubits exist in a single quantum state, a condition sometimes measured by electromagnetic waves just a millimeter wide.

Crank up that wave with a hair too much energy and you lose entanglement or superposition, or both. The result is a noisy state called decoherence, the equivalent in quantum computing of the blue screen of death.

What’s the Status of Quantum Computers?

A handful of companies such as Alibaba, Google, Honeywell, IBM, IonQ and Xanadu operate early versions of quantum computers today.

Today they provide tens of qubits. But qubits can be noisy, making them sometimes unreliable. To tackle real-world problems reliably, systems need tens or hundreds of thousands of qubits.

Experts believe it could be a couple decades before we get to a high-fidelity era when quantum computers are truly useful.

Quantum computers are slowly moving toward commercial use. (Source: ISSCC 2017 talk by Lieven Vandersypen.)

Predictions of when we’ll reach so-called quantum computing supremacy — the time when quantum computers execute tasks classical ones can’t — are a matter of lively debate in the industry.

Accelerating Quantum Circuit Simulations Today

The good news is the world of AI and machine learning has put a spotlight on accelerators like GPUs, which can perform many of the types of operations quantum computers would calculate with qubits.

So, classical computers are already finding ways to host quantum simulations with GPUs today. For example, NVIDIA ran a leading-edge quantum simulation on Selene, our in-house AI supercomputer.

NVIDIA announced in the GTC keynote the cuQuantum SDK to speed quantum circuit simulations running on GPUs. Early work suggests cuQuantum will be able to deliver orders of magnitude speedups.

The SDK takes an agnostic approach, providing a choice of tools users can pick to best fit their approach. For example, the state vector method provides high-fidelity results, but its memory requirements grow exponentially with the number of qubits.
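A few lines of NumPy (our toy example, not cuQuantum) show why: an n-qubit state vector holds 2^n complex amplitudes, so memory doubles with every added qubit.

import numpy as np

n = 20                                        # at n=50, the array would be petabytes
state = np.zeros(2**n, dtype=np.complex64)    # one amplitude per basis state
state[0] = 1.0                                # start in |00...0>
print(state.nbytes / 2**20, "MiB for", n, "qubits")

def apply_hadamard(state, k, n):
    # Apply a Hadamard gate to qubit k of an n-qubit state vector.
    h = np.array([[1, 1], [1, -1]], dtype=np.complex64) * np.complex64(2 ** -0.5)
    state = np.tensordot(h, state.reshape([2] * n), axes=([1], [k]))
    return np.moveaxis(state, 0, k).reshape(-1)

state = apply_hadamard(state, 0, n)           # a superposition over qubit 0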

This exponential growth creates a practical limit of roughly 50 qubits on today’s largest classical supercomputers. Nevertheless, we’ve seen great results (below) using cuQuantum to accelerate quantum circuit simulations that use this method.

State vector: 1,000 circuits, 36 qubits, depth m=10, complex 64 | CPU: Qiskit on dual AMD EPYC 7742 | GPU: Qgate on DGX A100

Researchers from the Jülich Supercomputing Centre will provide a deep dive on their work with the state vector method in session E31941 at GTC (free with registration).

A newer approach, tensor network simulation, uses less memory and more computation to perform similar work.

Using this method, NVIDIA and Caltech accelerated a state-of-the-art quantum circuit simulator with cuQuantum running on NVIDIA A100 Tensor Core GPUs. It generated a sample from a full-circuit simulation of the Google Sycamore circuit in 9.3 minutes on Selene, a task that 18 months ago experts thought would take days using millions of CPU cores.

Tensor network: 53 qubits, depth m=20 | CPU: Quimb on dual AMD EPYC 7742 (estimated) | GPU: Quimb on DGX A100

“Using the Cotengra/Quimb packages, NVIDIA’s newly announced cuQuantum SDK, and the Selene supercomputer, we’ve generated a sample of the Sycamore quantum circuit at depth m=20 in record time — less than 10 minutes,” said Johnnie Gray, a research scientist at Caltech.

“This sets the benchmark for quantum circuit simulation performance and will help advance the field of quantum computing by improving our ability to verify the behavior of quantum circuits,” said Garnet Chan, a chemistry professor at Caltech whose lab hosted the work.

NVIDIA expects the performance gains and ease of use of cuQuantum will make it a foundational element in every quantum computing framework and simulator at the cutting edge of this research.

Sign up to show early interest in cuQuantum here.


Drug Discovery Gets Jolt of AI via NVIDIA Collaborations with AstraZeneca, U of Florida Health

NVIDIA is collaborating with biopharmaceutical company AstraZeneca and the University of Florida’s academic health center, UF Health, on new AI research projects using breakthrough transformer neural networks.

Transformer-based neural network architectures — which have become available only in the last several years — allow researchers to leverage massive datasets using self-supervised training methods, avoiding the need for manually labeled examples during pre-training. These models, equally adept at learning the syntactic rules to describe chemistry as they are at learning the grammar of languages, are finding applications across research domains and modalities.

NVIDIA is collaborating with AstraZeneca on a transformer-based generative AI model for chemical structures used in drug discovery that will be among the very first projects to run on Cambridge-1, which is soon to go online as the UK’s largest supercomputer. The model will be open sourced, available to researchers and developers in the NVIDIA NGC software catalog, and deployable in the NVIDIA Clara Discovery platform for computational drug discovery.

Separately, UF Health is harnessing NVIDIA’s state-of-the-art Megatron framework and BioMegatron pre-trained model — available on NGC — to develop GatorTron, the largest clinical language model to date.

New NGC applications include AtacWorks, a deep learning model that identifies accessible regions of DNA, and MELD, a tool for inferring the structure of biomolecules from sparse, ambiguous or noisy data.

Megatron Model for Molecular Insights

The MegaMolBART drug discovery model being developed by NVIDIA and AstraZeneca is slated for use in reaction prediction, molecular optimization and de novo molecular generation. It’s based on AstraZeneca’s MolBART transformer model and is being trained on the ZINC chemical compound database — using NVIDIA’s Megatron framework to enable massively scaled-out training on supercomputing infrastructure.

The large ZINC database allows researchers to pretrain a model that understands chemical structure, bypassing the need for hand-labeled data. Armed with a statistical understanding of chemistry, the model will be specialized for a number of downstream tasks, including predicting how chemicals will react with each other and generating new molecular structures.
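A toy example (ours, not the MegaMolBART code) makes the self-supervised idea concrete: corrupt a SMILES string and ask the model to reconstruct it, so the molecule itself provides the supervision and no hand labeling is needed.

import random

def mask_smiles(smiles, mask_token="<m>", p=0.15, seed=0):
    # Naive character-level corruption; real tokenizers operate on chemical tokens.
    rng = random.Random(seed)
    corrupted = [mask_token if rng.random() < p else ch for ch in smiles]
    return "".join(corrupted), smiles        # (model input, reconstruction target)

print(mask_smiles("CC(=O)Oc1ccccc1C(=O)O"))  # aspirin, partially masked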

“Just as AI language models can learn the relationships between words in a sentence, our aim is that neural networks trained on molecular structure data will be able to learn the relationships between atoms in real-world molecules,” said Ola Engkvist, head of molecular AI, discovery sciences, and R&D at AstraZeneca. “Once developed, this NLP model will be open source, giving the scientific community a powerful tool for faster drug discovery.”

The model, trained using NVIDIA DGX SuperPOD, gives researchers ideas for molecules that don’t exist in databases but could be potential drug candidates. Computational methods, known as in-silico techniques, allow drug developers to search through more of the vast chemical space and optimize pharmacological properties before shifting to expensive and time-consuming lab testing.

This collaboration will use the NVIDIA DGX A100-powered Cambridge-1 and Selene supercomputers to run large workloads at scale. Cambridge-1 is the largest supercomputer in the U.K., ranking No. 3 on the Green500 and No. 29 on the TOP500 list of the world’s most powerful systems. NVIDIA’s Selene supercomputer topped the most recent Green500 and ranks fifth on the TOP500.

Language Models Speed Up Medical Innovation

UF Health’s GatorTron model — trained on records from more than 50 million interactions with 2 million patients — is a breakthrough that can help identify patients for lifesaving clinical trials, predict and alert health teams about life-threatening conditions, and provide clinical decision support to doctors.

“GatorTron leveraged over a decade of electronic medical records to develop a state-of-the-art model,” said Joseph Glover, provost at the University of Florida, which recently boosted its supercomputing facilities with NVIDIA DGX SuperPOD. “A tool of this scale will enable healthcare researchers to unlock insights and reveal previously inaccessible trends from clinical notes.”

Beyond clinical medicine, the model also accelerates drug discovery by making it easier to rapidly create patient cohorts for clinical trials and for studying the effect of a certain drug, treatment or vaccine.

It was created using BioMegatron, the largest biomedical transformer model ever trained, developed by NVIDIA’s applied deep learning research team using data from the PubMed corpus. BioMegatron is available on NGC through Clara NLP, a collection of NVIDIA Clara Discovery models pretrained on biomedical and clinical text.

“The GatorTron project is an exceptional example of the discoveries that happen when experts in academia and industry collaborate using leading-edge artificial intelligence and world-class computing resources,” said David R. Nelson, M.D., senior vice president for health affairs at UF and president of UF Health. “Our partnership with NVIDIA is crucial to UF emerging as a destination for artificial intelligence expertise and development.”

Powering Drug Discovery Platforms

NVIDIA Clara Discovery libraries and NVIDIA DGX systems have been adopted by computational drug discovery platforms, too, boosting pharmaceutical research.

  • Schrödinger, a leader in chemical simulation software development, today announced a strategic partnership with NVIDIA that includes research in scientific computing and machine learning, optimizing of Schrödinger applications on NVIDIA platforms, and a joint solution around NVIDIA DGX SuperPOD to evaluate billions of potential drug compounds within minutes.
  • Biotechnology company Recursion has installed BioHive-1, a supercomputer based on the NVIDIA DGX SuperPOD reference architecture that, as of January, is estimated to rank No. 58 on the TOP500 list of the world’s most powerful computer systems. BioHive-1 will allow Recursion to complete deep learning projects in a day that previously took a week on its existing cluster.
  • Insilico Medicine, a partner in the NVIDIA Inception accelerator program, recently announced the discovery of a novel preclinical candidate to treat idiopathic pulmonary fibrosis — the first example of an AI-designed molecule for a new disease target nominated for clinical trials. Compounds were generated on a system powered by NVIDIA Tensor Core GPUs, taking less than 18 months and under $2 million from target hypothesis to preclinical candidate selection.
  • Vyasa Analytics, a member of the NVIDIA Inception accelerator program, is using Clara NLP and NVIDIA DGX systems to give its users access to pretrained models for biomedical research. The company’s GPU-accelerated Vyasa Layar Data Fabric is powering solutions for multi-institutional cancer research, clinical trial analytics and biomedical data harmonization.

Learn more about NVIDIA’s work in healthcare at this week’s GPU Technology Conference, which kicks off with a keynote address by NVIDIA CEO Jensen Huang. Registration is free. The healthcare track includes 16 live webinars, 18 special events and over 100 recorded sessions.

Subscribe to NVIDIA healthcare news and follow NVIDIA Healthcare on Twitter.


An Engine of Innovation: Sony Levels Up for the AI Era

If you want to know what the next big thing will be, ask someone at a company that invents it time and again.

“AI is a key tool for the next era, so we are providing the computing resources our developers need to generate great AI results,” said Yuichi Kageyama, general manager of Tokyo Laboratory 16 in the R&D Center of Sony Group Corporation.

Called GAIA internally, the lab’s computing resources act as a digital engine serving all Sony Group companies. And it’s about to get a second fuel injection of accelerated computing for AI efforts across the corporation.

Sony’s engineers are packing machine-learning smarts into products ranging from its Xperia smartphones and entertainment robot, aibo, to a portfolio of imaging components for everything from professional and consumer cameras to factory automation and satellites. It’s even using AI to build the next generation of advanced imaging chips.

More Zip, Fewer Tolls

To move efficiently into the AI era, Sony is installing a cluster of NVIDIA DGX A100 systems linked on an NVIDIA Mellanox InfiniBand network. It expands an existing system now running at near full utilization with NVIDIA V100 Tensor Core GPUs, commissioned in October when the company brought AI training in house.

“When we were using cloud services, AI developers worried about the costs, but now they can focus on AI development on GAIA,” said Kageyama.

An in-house AI engine torques performance, too. One team designed a deep-learning model for delivering super-resolution images and trained it nearly 16x faster by adding more resources to the job, shortening a month’s workload to a day.

“With the computing power of the DGX A100, its expanded GPU memory and faster InfiniBand networking, we expect to see even greater performance on larger datasets,” said Yoshiki Tanaka, who oversees HPC and distributed deep learning technologies for Sony’s developers.

Powering an AI Pipeline

Sony posted fast speeds in deep learning back in 2018, accelerating its Neural Network Libraries on a system at Japan’s National Institute of Advanced Industrial Science and Technology. And it’s already rolling out products powered with machine learning, such as its Airpeak drone for professional filmmakers shown at CES this year.

There’s plenty more to come.

“We will see good results in our fiscal 2021 because we have collaborations with many business teams who have started some good projects,” Kageyama said.

NVIDIA is putting its shoulder to the wheel with software and services to “build a culture of using GPUs,” he added.

For example, Sony developers use NGC, NVIDIA’s online container registry, for all the software components they need to get an AI app up and running.

Sony even created a container of its own, now available on NGC, sporting its Neural Network Libraries and other utilities. It supplements NVIDIA’s containers for work in popular environments like PyTorch and TensorFlow.

Drivers Give a Thumbs Up

Developers tell Kageyama’s team that having their code in one place helps simplify and speed their work.

Some researchers use the system for high performance computing, tapping into NVIDIA’s CUDA software that accelerates a diverse set of technical applications including AI.

To keep it all running smoothly, NVIDIA provided a job scheduler as well as additions for Sony to NVIDIA’s libraries for scaling apps across multiple GPUs.

“Good management software is important for achieving fairness and high utilization on such a complex system,” said Masahiro Hara, who leads development of the GAIA system.

An Eye Toward Analytics

NVIDIA also helped Sony create training programs on how to use its software on GAIA.

Looking ahead, Sony is interested in expanding its work in data analytics and simulations. It’s evaluating RAPIDS, open-source software NVIDIA helped design to let Python programmers access the power of GPUs for data science.

At the end of a work-from-home day keeping Sony ahead of the pack in AI, Kageyama enjoys playing with his kids who keep their dad on his digital toes. “I’m a beginner in Minecraft, and they’re much better than me,” he said.


Introducing causal network motifs: A new approach to identifying heterogeneous spillover effects

This project is joint work with Yuan Yuan, PhD candidate at MIT, and the Facebook Core Data Science team. Learn more about CDS on the CDS team page.

What the research is:

Randomized experiments, or A/B tests, remain the gold standard for evaluating the causal effect of a policy intervention or product change. However, experimental settings, such as social networks, where users are interacting with and influencing one another, may violate conventional assumptions of no interference for credible causal inference. Existing solutions to the network setting include accounting for the fraction or count of treated neighbors in a user’s network, yet most current methods do not account for the local network structure beyond simply counting the number of neighbors.

Our study provides an approach that accounts for both the local structure in a user’s social network (via motifs) and the treatment assignment conditions of neighbors. We propose a two-part approach. We first introduce and employ causal network motifs, which are network motifs that characterize the assignment conditions in local ego networks. Then, we propose a tree-based algorithm for identifying different network interference conditions and estimating their average potential outcomes. Our approach can account for social network theories, such as structural diversity and echo chambers, and also can help specify network interference conditions that are suitable for each experiment. We test our method on a synthetic network setting and on a real-world experiment on a large-scale network, which highlight how accounting for local structures can better capture different interference patterns in networks.

As an example, Figure 1 illustrates four network interference conditions that could be captured by local network structure and treatment assignment. The first two are simply the cases where all neighbors are treated or untreated, followed by the conditions suggested by the structural diversity and echo chamber theories, respectively. In the structural diversity and echo chamber settings, the ego nodes in (c) and (d) each have half of their neighbors treated but exhibit very different local structures, so the ego’s outcome may differ between the two. We do not know in advance which factor drives most of the variance in the outcome.


Figure 1: Examples of network interference conditions across different local network structures. The star indicates a user and a circle represents a user’s friends. Solid circles indicate that a friend is in treatment and hollow circles indicate a friend is in control. For stars, the shaded indicates that it could be treated or control.

Given the large number of researcher degrees of freedom in existing approaches for network interference, such as choosing the threshold for an exposure condition, our approach provides a simple way to automatically specify exposure conditions. In this way, researchers no longer need to define exposure conditions a priori, and the exposure conditions generated by the algorithm are suitable for the given data and experiment. We believe that methodological innovation for addressing network interference concerns in A/B tests on networks will continue to be an important area for development, and accounting for network motifs with treatment assignment conditions provides a useful way to detect heterogeneous network interference effects.

How it works:

Our study provides a two-step solution to automatically identify different exposure conditions while overcoming selection bias concerns, as explained in more detail in the section after Figure 2. First, for an A/B test on a network, we construct network motif features with treatment assignment conditions (i.e., causal network motifs) to provide a fine-grained characterization of the local network structure and potential interference conditions. Second, using the network motif characterization as input, we develop a tree-based algorithm that performs clustering and defines the set of exposure conditions (the set D) automatically, rather than requiring practitioners to specify it by hand.

We introduce causal network motifs, which differ from conventional network motifs in two primary aspects. First, we focus on (1-hop) ego networks that include the ego node, with the methods generalizing to higher 𝑛-hop ego networks for 𝑛>1. Second, we consider the treatment assignment conditions of the user and their 𝑛-hop connections. We use the term “network motifs” to refer to conventional motifs without treatment assignment labels (or assignment conditions) and “causal network motifs” to refer to those with assignment conditions. Examples of network motifs are illustrated in Figure 2. We use these counts on an 𝑛-hop ego network to characterize the exposure condition of each observation.


Figure 2: Examples of causal network motifs. Stars represent egos and circles represent alters. Solid indicates the node being treated, hollow indicates control, and shaded indicates that it could be treated or control. The first pattern in each row is a conventional network motif without assignment conditions, or just a network motif, followed by the corresponding causal network motifs. Our interference vector is constructed by dividing the count of a causal network motif by the count of the corresponding network motif. The labels below each network motif indicate the naming: for example, an open triad where one neighbor is treated is named 3o-1.

After counting causal network motifs for each ego node in our network, our next step is to convert the counts to features, which will be used in the next section. Let X𝑖 denote an 𝑚-dimensional random vector, referred to as interference vector. The interference vector has an important requirement: Each element of the random vector is intervenable — that is, the random treatment assignment affects the value of each element of the vector. The requirement addresses the selection bias issue when we estimate the average potential outcomes.

We construct the interference vector in the following way. For each observation, we take the count of each causal network motif (e.g., 2-1, 2-0, …, 3o-2, 3o-1, …) and normalize it by the count of the corresponding network motif (e.g., dyads, open triads, closed triads, …). In this way, each element of X𝑖 is intervenable, and the support of each element is in [0, 1]. Note that for network motifs with many nodes, some observations may not contain the motif at all, so normalization cannot be performed. In these scenarios, we can either exclude that network motif from the interference vector or drop the affected observations if they make up a very small proportion of the sample. Please refer to Figure 3 for an illustration of constructing the interference vector, and to the toy counting sketch after it.


Figure 3: An example of ego network with treatment assignments and the corresponding interference vector. Stars represent egos and circles represent alters. Solid indicates the node being treated, hollow indicates control, and shaded indicates that it could be treated or control.
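A toy counting sketch (our code, following Figure 2’s naming; not the paper’s released implementation) shows the construction for a single ego’s 1-hop network:

import itertools

def interference_vector(alters, alter_edges, treated):
    # alters: list of neighbor ids; alter_edges: set of frozensets between alters;
    # treated: set of treated alter ids.
    n = len(alters)
    vec = {"2-1": sum(v in treated for v in alters) / n if n else 0.0}
    open_k, closed_k = [0, 0, 0], [0, 0, 0]   # indexed by number of treated alters
    for u, v in itertools.combinations(alters, 2):
        k = (u in treated) + (v in treated)
        if frozenset((u, v)) in alter_edges:
            closed_k[k] += 1                  # closed triad: the two alters are linked
        else:
            open_k[k] += 1                    # open triad
    # Normalize each causal motif count by its label-free parent motif count.
    for k in (1, 2):
        vec[f"3o-{k}"] = open_k[k] / sum(open_k) if sum(open_k) else 0.0
        vec[f"3c-{k}"] = closed_k[k] / sum(closed_k) if sum(closed_k) else 0.0
    return vec

print(interference_vector(["a", "b", "c"], {frozenset(("a", "b"))}, {"a"}))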

Then, our approach partitions [0, 1]^(m+1) and determines exposure conditions using decision tree regression. Decision trees can be used for clustering [1] and typically offer good interpretability in the decision-making process [2], making them a suitable algorithm for the partitioning problem. Each leaf of the decision tree corresponds to a unique exposure condition (partition). Compared with conventional decision tree regression, we make several revisions to accommodate honest splitting, positivity, and so on.
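For intuition, a simplified stand-in using scikit-learn’s stock regressor (the paper’s algorithm adds the honest splitting and positivity checks this sketch omits) might look like:

import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(size=(10_000, 5))            # interference vectors (2-1, 3o-1, ...)
y = 1.0 + 2.0 * (X[:, 1] > 0.5) + rng.normal(scale=0.1, size=10_000)

tree = DecisionTreeRegressor(max_leaf_nodes=4, min_samples_leaf=500).fit(X, y)
leaves = tree.apply(X)                        # each leaf = one exposure condition
for leaf in np.unique(leaves):
    print(leaf, round(y[leaves == leaf].mean(), 3))   # average potential outcome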

Why it matters:

Network interference is much more complicated than a single, homogeneous indirect effect. To examine and analyze the heterogeneity of indirect effects in experimental datasets, we provide a two-step solution. We first propose and employ causal network motifs to characterize network interference conditions, and then develop a tree-based algorithm for partitioning. Our tree-based algorithm is interpretable in that it highlights which exposure conditions are important for defining potential outcomes; it addresses selection bias and positivity issues; and it avoids incorrect standard errors via honest splitting.

Practitioners using our approach may obtain important insights. For example, they could understand how to utilize social contagion for product promotion when they have constraints on the number of promos. Researchers may identify important network interference conditions that are not theorized in certain experimental settings.

Read the full paper:

Causal network motifs: Identifying heterogeneous spillover effects in A/B tests

Learn more:

Check out our open source implementation on GitHub.

Watch our presentation at the Web Conference 2021.

[1] Bing Liu, Yiyuan Xia, and Philip S. Yu. 2000. Clustering through decision tree construction. In CIKM. 20–29.
[2] J. Ross Quinlan. 1986. Induction of decision trees. Machine Learning.


Detect abnormal equipment behavior and review predictions using Amazon Lookout for Equipment and Amazon A2I

Companies that operate and maintain a broad range of industrial machinery such as generators, compressors, and turbines are constantly working to improve operational efficiency and avoid unplanned downtime due to component failure. They invest heavily in physical sensors (tags), data connectivity, data storage, and data visualization to monitor the condition of their equipment and get real-time alerts for predictive maintenance.

With machine learning (ML), more powerful technologies have become available that can provide data-driven models that learn from an equipment’s historical data. However, implementing such ML solutions is time-consuming and expensive because it involves managing and setting up complex infrastructure and having the right ML skills. Furthermore, ML applications need human oversight to ensure accuracy with sensitive data, provide continuous improvement, and retrain models with updated predictions. Companies are often forced to choose between an ML-only and a human-only system, when what they’re looking for is the best of both worlds: ML systems integrated into their workflow, with a human eye on the results to achieve higher precision.

In this post, we show you how you can set up Amazon Lookout for Equipment to train an abnormal behavior detection model using a wind turbine dataset for predictive maintenance, use a human in the loop workflow to review the predictions using Amazon Augmented AI (Amazon A2I), and augment the dataset and retrain the model.

Solution overview

Amazon Lookout for Equipment analyzes the data from your sensors, such as pressure, flow rate, RPMs, temperature, and power, to automatically train a specific ML model based on your data, for your equipment, with no ML expertise required. Amazon Lookout for Equipment uses your unique ML model to analyze incoming sensor data in near-real time and accurately identify early warning signs that could lead to machine failures. This means you can detect equipment abnormalities with speed and precision, quickly diagnose issues, take action to reduce expensive downtime, and reduce false alerts.

Amazon A2I is an ML service that makes it easy to build the workflows required for human review. Amazon A2I brings human review to all developers, removing the undifferentiated heavy lifting associated with building human review systems or managing large numbers of human reviewers, whether running on AWS or not.

To get started with Amazon Lookout for Equipment, we create a dataset, ingest data, train a model, and run inference by setting up a scheduler. After going through these steps, we show you how you can quickly set up a human review process using Amazon A2I and retrain your model with augmented or human reviewed datasets.

In the accompanying Jupyter notebook, we walk you through the following steps:

  1. Create a dataset in Amazon Lookout for Equipment.
  2. Ingest data into the Amazon Lookout for Equipment dataset.
  3. Train a model in Amazon Lookout for Equipment.
  4. Run diagnostics on the trained model.
  5. Create an inference scheduler in Amazon Lookout for Equipment to send a simulated stream of real-time requests.
  6. Set up an Amazon A2I private human loop and review the predictions from Amazon Lookout for Equipment.
  7. Retrain your model based on augmented datasets from Amazon A2I.

Architecture overview

The following diagram illustrates our solution architecture.

 

The workflow contains the following steps:

  1. The architecture assumes that the inference pipeline is built and sensor data is periodically stored in the S3 path for inference inputs. These inputs are stored in CSV format with corresponding timestamps in the file name.
  2. Amazon Lookout for Equipment wakes up at a prescribed frequency and processes the most recent file from the inference inputs Amazon Simple Storage Service (Amazon S3) path.
  3. Inference results are stored in the inference outputs S3 path in JSON lines file format. The outputs also contain event diagnostics, which are used for root cause analysis.
  4. When Amazon Lookout for Equipment detects an anomaly, the inference input and outputs are presented to the private workforce for validation via Amazon A2I.
  5. A private workforce investigates and validates the detected anomalies and provides new anomaly labels. These labels are stored in a new S3 path.
  6. Training data is also updated, along with the corresponding new labels, and is staged for subsequent model retraining.
  7. After enough new labels are collected, a new Amazon Lookout for Equipment model is created, trained, and deployed. The retraining cycle can be repeated for continuous model retraining.

Prerequisites

Before you get started, complete the following steps to set up the Jupyter notebook:

  1. Create a notebook instance in Amazon SageMaker.

Make sure your SageMaker notebook has the necessary AWS Identity and Access Management (IAM) roles and permissions mentioned in the prerequisite section of the notebook.

  2. When the notebook is active, choose Open Jupyter.
  3. On the Jupyter dashboard, choose New, and choose Terminal.
  4. In the terminal, enter the following code:

cd SageMaker
git clone https://github.com/aws-samples/lookout-for-equipment-demo

  5. First, run the data preparation notebook – 1_data_preparation.ipynb.
  6. Then open the notebook for this post – 3_integrate_l4e_and_a2i.ipynb.

You’re now ready to run the following steps through the notebook cells. Run the setup environment step to set up the necessary Python SDKs and libraries that we use throughout the notebook.

  1. Provide an AWS Region, create an S3 bucket, and provide details of the bucket in the following code cell:
REGION_NAME = '<your region>'
BUCKET = '<your bucket name>'
PREFIX = 'data/wind-turbine'

Analyze the dataset and create component metadata

In this section, we walk you through how to preprocess the existing wind turbine data and ingest it into Amazon Lookout for Equipment. Make sure to run the data preparation notebook before the accompanying notebook for this post so you can follow all the steps. You need a data schema to use your existing historical data with Amazon Lookout for Equipment. The data schema tells Amazon Lookout for Equipment what the data means. Because a data schema describes the data, its structure mirrors that of the data files of the components it describes.

All components must be described in the data schema. The data for each component is contained in a separate CSV file structured as shown in the data schema.

You store the data for each asset’s component in a separate CSV file using the following folder structure:

S3 bucket > Asset_name > Component 1 > Component1.csv
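For illustration, a data schema for the structure above might look like the following Python dictionary. This is only a sketch using sensor tags that appear later in this post; the notebook generates the actual schema for you:

# Illustrative data schema: one component whose columns mirror its CSV file
schema = {
    "Components": [
        {
            "ComponentName": "Component1",
            "Columns": [
                {"Name": "Timestamp", "Type": "DATETIME"},
                {"Name": "Q_avg", "Type": "DOUBLE"},
                {"Name": "Ws1_avg", "Type": "DOUBLE"},
                {"Name": "Ot_avg", "Type": "DOUBLE"}
            ]
        }
    ]
}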

Go to the notebook section Pre-process and Load Datasets and run the following cell to inspect the data:

import pandas as pd

# Load the interim data for one turbine and inspect the first rows
turbine_id = 'R80711'
df = pd.read_csv(f'../data/wind-turbine/interim/{turbine_id}.csv', index_col='Timestamp')
df.head()

The following screenshot shows our output.

Now we create a component map to build the dataset structure that Amazon Lookout for Equipment expects for ingestion. Run the notebook cells under the section Create the Dataset Component Map to create a component map and generate a CSV file for ingest.
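As a rough sketch, the component map pairs each component name with the ordered list of columns in its CSV file. The exact map below is an assumption; the notebook cells build the real one for you:

# Assumed shape of the component map: component name -> columns of its CSV file
DATASET_COMPONENT_FIELDS_MAP = {
    'Component1': ['Timestamp', 'Q_avg', 'Ws1_avg', 'Ot_avg', 'Nf_avg', 'Ba_avg']
}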

Create the Amazon Lookout for Equipment dataset

We use Amazon Lookout for Equipment Create Dataset APIs to create a dataset and provide the component map we created in the previous step as an input. Run the following notebook cell to create a dataset:

ROLE_ARN = sagemaker.get_execution_role()
# REGION_NAME = boto3.session.Session().region_name
DATASET_NAME = 'wind-turbine-train-dsv2-PR'
MODEL_NAME = 'wind-turbine-PR-v1'

lookout_dataset = lookout.LookoutEquipmentDataset(
    dataset_name=DATASET_NAME,
    component_fields_map=DATASET_COMPONENT_FIELDS_MAP,
    region_name=REGION_NAME,
    access_role_arn=ROLE_ARN
)

pp = pprint.PrettyPrinter(depth=5)
pp.pprint(eval(lookout_dataset.dataset_schema))
lookout_dataset.create()

You get the following output:

Dataset "wind-turbine-train-dsv2-PR" does not exist, creating it...


{'DatasetName': 'wind-turbine-train-dsv2-PR',
 'DatasetArn': 'arn:aws:lookoutequipment:ap-northeast-2:<aws-account>:dataset/wind-turbine-train-dsv2-PR/8325802a-9bb7-48fb-804b-ab9f5b79f49d',
 'Status': 'CREATED',
 'ResponseMetadata': {'RequestId': '52dc754c-84da-4a8c-aaef-1908e4348837',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'x-amzn-requestid': '52dc754c-84da-4a8c-aaef-1908e4348837',
   'content-type': 'application/x-amz-json-1.0',
   'content-length': '203',
   'date': 'Thu, 25 Mar 2021 21:18:29 GMT'},
  'RetryAttempts': 0}}

Alternatively, you can go to the Amazon Lookout for Equipment console to view the dataset.

You can choose View under Data schema to view the schema of the dataset. You can choose Ingest new data to start ingesting data through the console, or you can use the APIs shown in the notebook to do the same using Python Boto3 APIs.
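If you prefer to call Boto3 directly instead of the notebook helper, the underlying ingestion call looks roughly like the following; the bucket and prefix are assumptions based on this post's setup:

import uuid
import boto3

l4e = boto3.client('lookoutequipment', region_name=REGION_NAME)
l4e.start_data_ingestion_job(
    DatasetName=DATASET_NAME,
    RoleArn=ROLE_ARN,
    IngestionInputConfiguration={
        'S3InputConfiguration': {
            'Bucket': BUCKET,                      # assumption: the bucket created earlier
            'Prefix': f'{PREFIX}/training_data/'   # assumption: your training data prefix
        }
    },
    ClientToken=str(uuid.uuid4())                  # makes the request idempotent
)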

Run the notebook cells to ingest the data. When ingestion is complete, you get the following response:

=====Polling Data Ingestion Status=====

2021-03-25 21:18:45 |  IN_PROGRESS
2021-03-25 21:19:46 |  IN_PROGRESS
2021-03-25 21:20:46 |  IN_PROGRESS
2021-03-25 21:21:46 |  IN_PROGRESS
2021-03-25 21:22:46 |  SUCCESS

Now that we have preprocessed the data and ingested it into Amazon Lookout for Equipment, we can move on to the training steps.

Label your dataset using the SageMaker labeling workforce

If you don’t have an existing labeled dataset available to directly use with Amazon Lookout for Equipment, create a custom labeling workflow. This may be relevant in a use case in which, for example, a company wants to build a remote operating facility where alerts from various operations are sent to the central facility for subject matter experts (SMEs) to review and update. For a sample crowd HTML template for your labeling UI, refer to our GitHub repository.

The following screenshot shows an example of what the sample labeling UI looks like.

For this post, we use the labels that came with the dataset for training. If you want to use the label file you created for your actual training in the next step, you need to copy the label file to an S3 bucket and provide the location in the training configuration.
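For reference, the label file is a headerless CSV of anomaly ranges, one start and end timestamp per row (you can see this format when we read the file back later in this post). A hypothetical two-row file could be written as follows:

import pandas as pd

# Hypothetical anomaly windows; real labels come from your SMEs or the dataset
labels_df = pd.DataFrame([
    ('2015-01-04 00:00:00', '2015-01-07 00:00:00'),
    ('2015-02-10 00:00:00', '2015-02-12 00:00:00')
])
labels_df.to_csv('labels.csv', header=False, index=False)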

Create a model in Amazon Lookout for Equipment

We walk you through the following steps in this section:

  • Prepare the model parameters and split data into test and train sets
  • Train the model using Amazon Lookout for Equipment APIs
  • Get diagnostics for the trained model

Prepare the model parameters and split the data

In this step, we split the dataset into train and test sets, prepare labels, and start the training using the notebook. Run the notebook code Split train and test to create an 80/20 split for training and testing, respectively. Then run the prepare labels code and move on to setting up the training config, as shown in the following code:

# Prepare the model parameters:
lookout_model = lookout.LookoutEquipmentModel(model_name=MODEL_NAME,
                                              dataset_name=DATASET_NAME,
                                              region_name=REGION_NAME)

# Set the training / evaluation split date:
lookout_model.set_time_periods(evaluation_start,
                               evaluation_end,
                               training_start,
                               training_end)

# Set the label data location:
lookout_model.set_label_data(bucket=BUCKET, 
                             prefix=PREFIX+'/labelled_data/',
                             access_role_arn=ROLE_ARN)

# This sets up the rate the service will resample the data before 
# training:
lookout_model.set_target_sampling_rate(sampling_rate='PT10M')

In the preceding code, we set up model training parameters such as time periods, label data, and target sampling rate for our model. For more information about these parameters, see CreateModel.

Train the model

After setting these model parameters, run the following API call to start training your model with your dataset and the training parameters:

lookout_model.train()

You get the following response:

{'ModelArn': 'arn:aws:lookoutequipment:ap-northeast-2:<accountid>:model/wind-turbine-PR-v1/fac217a9-8855-4931-95f9-dd47f0af1ec5',
 'Status': 'IN_PROGRESS',
 'ResponseMetadata': {'RequestId': '3d385895-c62e-4126-9622-38f0ebed9715',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'x-amzn-requestid': '3d385895-c62e-4126-9622-38f0ebed9715',
   'content-type': 'application/x-amz-json-1.0',
   'content-length': '152',
   'date': 'Thu, 25 Mar 2021 21:27:05 GMT'},
  'RetryAttempts': 0}}

Alternatively, you can go to the Amazon Lookout for Equipment console and monitor the training after you create the model.

The sample turbine dataset we provide in our example has millions of data points. Training takes approximately 2.5 hours.
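If you want to poll training progress yourself rather than through the notebook helper, a simple loop over the DescribeModel API works. This is a sketch; MODEL_NAME and REGION_NAME come from earlier cells:

import time
import boto3

l4e = boto3.client('lookoutequipment', region_name=REGION_NAME)
while True:
    status = l4e.describe_model(ModelName=MODEL_NAME)['Status']
    print(status)        # IN_PROGRESS, then SUCCESS or FAILED
    if status != 'IN_PROGRESS':
        break
    time.sleep(300)      # check every 5 minutes; training takes ~2.5 hours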

Evaluate the trained model

After a model is trained, Amazon Lookout for Equipment evaluates its performance and displays the results. It provides an overview of performance and detailed information about the abnormal equipment behavior events, as well as how well the model performed in detecting them. With the data and failure labels that you provided for training and evaluating the model, Amazon Lookout for Equipment reports how many of the model’s predictions were true positives and the average forewarning time across all true positives. It also reports the false positives generated by the model, along with the duration of each non-event.

For more information about performance evaluation, see Evaluating the output.

Review training diagnostics

Run the following code to generate the training diagnostics. Refer to the accompanying notebook for the complete code block to run for this step.

LookoutDiagnostics = lookout.LookoutEquipmentAnalysis(model_name=MODEL_NAME, tags_df=df, region_name=REGION_NAME)
LookoutDiagnostics.set_time_periods(evaluation_start, evaluation_end, training_start, training_end)
predicted_ranges = LookoutDiagnostics.get_predictions()

The results returned show the percentage contribution of each feature towards the abnormal equipment prediction for the corresponding date range.

Create an inference scheduler in Amazon Lookout for Equipment

In this step, we show you how the CreateInferenceScheduler API creates a scheduler and starts it. Note that the scheduler starts incurring charges right away. Scheduling inference sets up a continuous, real-time inference plan to analyze new measurement data. When setting up the scheduler, you provide an S3 bucket location for the input data, a delimiter between separate entries in the data, an offset delay if desired, and the frequency of inference. You must also provide an S3 bucket location for the output data. Run the following notebook section to create an inference scheduler for the model:

scheduler = lookout.LookoutEquipmentScheduler(
    scheduler_name=INFERENCE_SCHEDULER_NAME,
    model_name=MODEL_NAME_FOR_CREATING_INFERENCE_SCHEDULER,
    region_name=REGION_NAME
)

scheduler_params = {
    'input_bucket': INFERENCE_DATA_SOURCE_BUCKET,
    'input_prefix': INFERENCE_DATA_SOURCE_PREFIX,
    'output_bucket': INFERENCE_DATA_OUTPUT_BUCKET,
    'output_prefix': INFERENCE_DATA_OUTPUT_PREFIX,
    'role_arn': ROLE_ARN_FOR_INFERENCE,
    'upload_frequency': DATA_UPLOAD_FREQUENCY,
    'delay_offset': DATA_DELAY_OFFSET_IN_MINUTES,
    'timezone_offset': INPUT_TIMEZONE_OFFSET,
    'component_delimiter': COMPONENT_TIMESTAMP_DELIMITER,
    'timestamp_format': TIMESTAMP_FORMAT
}

scheduler.set_parameters(**scheduler_params)

After you create an inference scheduler, the next step is to create some sample datasets for inference.

Prepare the inference data

Run through the notebook steps to prepare the inference data. First, load the tags description; this dataset comes with a data description file, from which we can collect the list of components (the subsystem column) if required. We use the tag metadata from the data description as a point of reference for interpretation, and we use the tag names to construct the list that Amazon A2I uses. For more details, refer to the section Set up Amazon A2I to review predictions from Amazon Lookout for Equipment in this post.

To build our sample inference dataset, we extract the last 2 hours of data from the evaluation period of the original time series. Specifically, we create three CSV files containing simulated real-time tags for our turbine 10 minutes apart. These are all stored in Amazon S3 in the inference-a2i folder. Now that we’ve prepared the data, create the scheduler by running the following code:

create_scheduler_response = scheduler.create()

You get the following response:

===== Polling Inference Scheduler Status =====

Scheduler Status: PENDING
Scheduler Status: RUNNING

===== End of Polling Inference Scheduler Status =====

Alternatively, on the Amazon Lookout for Equipment console, go to the Inference schedule settings section of your trained model and set up a scheduler by providing the necessary parameters.

Get inference results

Run through the notebook steps List inference executions to get the run details from the schedule you created in the previous step. Wait 5–15 minutes for the scheduler to run its first inference. When it’s complete, we can use the ListInferenceExecution API for our current inference scheduler. The only mandatory parameter is the scheduler name.

You can also choose a time period for which you want to query inference runs. If you don’t specify it, all runs for an inference scheduler are listed. If you want to specify the time range, you can use the following code:

START_TIME_FOR_INFERENCE_EXECUTIONS = datetime.datetime(2010,1,3,0,0,0)
END_TIME_FOR_INFERENCE_EXECUTIONS = datetime.datetime(2010,1,5,0,0,0)

This code means that the runs after 2010-01-03 00:00:00 and before 2010-01-05 00:00:00 are listed.

You can also choose to query for runs in a particular status, such as IN_PROGRESS, SUCCESS, and FAILED:

START_TIME_FOR_INFERENCE_EXECUTIONS = None
END_TIME_FOR_INFERENCE_EXECUTIONS = None
EXECUTION_STATUS = None

execution_summaries = []

while len(execution_summaries) == 0:
    execution_summaries = scheduler.list_inference_executions(
        start_time=START_TIME_FOR_INFERENCE_EXECUTIONS,
        end_time=END_TIME_FOR_INFERENCE_EXECUTIONS,
        execution_status=EXECUTION_STATUS
    )
    if len(execution_summaries) == 0:
        print('WAITING FOR THE FIRST INFERENCE EXECUTION')
        time.sleep(60)
        
    else:
        print('FIRST INFERENCE EXECUTED\n')
        break
            
execution_summaries

You get the following response:

[{'ModelName': 'wind-turbine-PR-v1',
  'ModelArn': 'arn:aws:lookoutequipment:ap-northeast-2:<aws-account>:model/wind-turbine-PR-v1/fac217a9-8855-4931-95f9-dd47f0af1ec5',
  'InferenceSchedulerName': 'wind-turbine-scheduler-a2i-PR-v10',
  'InferenceSchedulerArn': 'arn:aws:lookoutequipment:ap-northeast-2:<aws-account>:inference-scheduler/wind-turbine-scheduler-a2i-PR-v10/e633c39d-a4f9-49f6-8248-7594349db2d0',
  'ScheduledStartTime': datetime.datetime(2021, 3, 29, 15, 35, tzinfo=tzlocal()),
  'DataStartTime': datetime.datetime(2021, 3, 29, 15, 30, tzinfo=tzlocal()),
  'DataEndTime': datetime.datetime(2021, 3, 29, 15, 35, tzinfo=tzlocal()),
  'DataInputConfiguration': {'S3InputConfiguration': {'Bucket': '<your s3 bucket>',
    'Prefix': 'data/wind-turbine/inference-a2i/input/'}},
  'DataOutputConfiguration': {'S3OutputConfiguration': {'Bucket': '<your s3 bucket>',
    'Prefix': 'data/wind-turbine/inference-a2i/output/'}},
  'CustomerResultObject': {'Bucket': '<your s3 bucket>',
   'Key': 'data/wind-turbine/inference-a2i/output/2021-03-29T15:30:00Z/results.jsonl'},
  'Status': 'SUCCESS'}]

Get actual prediction results

After each successful inference, a JSON file is created in the output location of your bucket. Each inference creates a new folder with a single results.jsonl file in it. You can run through this section in the notebook to read these files and display their content.

results_df

The following screenshot shows the results.
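If you want to load one of these files yourself, outside the notebook helpers, the following sketch reads the results.jsonl key from the execution summary above into a DataFrame; replace the bucket placeholder with your own:

import json
import boto3
import pandas as pd

s3 = boto3.client('s3')
obj = s3.get_object(
    Bucket='<your s3 bucket>',
    Key='data/wind-turbine/inference-a2i/output/2021-03-29T15:30:00Z/results.jsonl'
)
# Each line of the JSON Lines file is one prediction record
results_df = pd.DataFrame(
    json.loads(line) for line in obj['Body'].read().decode('utf-8').splitlines()
)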

Stop the inference scheduler

Make sure to stop the inference scheduler; we don’t need it for the rest of the steps in this post. However, as part of your solution, the inference scheduler should be running to ensure real-time inference for your equipment continues. Run through this notebook section to stop the inference scheduler.
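The notebook wraps this step for you, but the underlying Boto3 call is a one-liner if you need it elsewhere:

import boto3

l4e = boto3.client('lookoutequipment', region_name=REGION_NAME)
l4e.stop_inference_scheduler(InferenceSchedulerName=INFERENCE_SCHEDULER_NAME)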

Set up Amazon A2I to review predictions from Amazon Lookout for Equipment

Now that inference is complete, let’s look at how to set up a UI to review the inference results, update them, and send them back to Amazon Lookout for Equipment to retrain the model. In this section, we show how to use the Amazon A2I custom task type to integrate with Amazon Lookout for Equipment through the walkthrough notebook and set up a human-in-the-loop process. This includes the following steps:

  • Create a human task UI
  • Create a workflow definition
  • Send predictions to Amazon A2I human loops
  • Sign in to the worker portal and annotate Amazon Lookout for Equipment inference predictions

Follow the steps provided in the notebook to initialize Amazon A2I APIs. Make sure to set up the bucket name in the initialization block where you want your Amazon A2I output:

a2ibucket = '<your bucket>'

You also need to create a private workforce and provide a work team ARN in the initialize step.

On the SageMaker console, create a private workforce. After you create the private workforce, find the workforce ARN and enter the ARN in the notebook:

WORKTEAM_ARN = 'your private workforce team ARN'

Create the human task UI

You now create a human task UI resource with a UI template in Liquid HTML. You can download the provided template and customize it. This template is rendered to the human workers whenever a human loop is required. For over 70 prebuilt UIs, see the amazon-a2i-sample-task-uis GitHub repo. We also provide this template in our GitHub repo.

You can use this template to create a task UI either via the console or by running the following code in the notebook:

def create_task_ui():
    # Create a human task UI resource from the Liquid HTML template
    response = sagemaker_client.create_human_task_ui(
        HumanTaskUiName=taskUIName,
        UiTemplate={'Content': template})
    return response
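A typical way to call this function and capture the ARN for the next step (taskUIName and template are defined in earlier notebook cells) is:

# Create the task UI and keep its ARN for the workflow definition
human_task_ui_response = create_task_ui()
humanTaskUiArn = human_task_ui_response['HumanTaskUiArn']
print(humanTaskUiArn)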

Create a human review workflow definition

Workflow definitions allow you to specify the following:

  • The worker template or human task UI you created in the previous step.
  • The workforce that your tasks are sent to. For this post, it’s the private workforce you created in the prerequisite steps.
  • The instructions that your workforce receives.

This post uses the Create Flow Definition API to create a workflow definition. Run the following cell in the notebook:

create_workflow_definition_response = sagemaker_client.create_flow_definition(
    FlowDefinitionName=flowDefinitionName,
    RoleArn=role,
    HumanLoopConfig={
        "WorkteamArn": WORKTEAM_ARN,
        "HumanTaskUiArn": humanTaskUiArn,
        "TaskCount": 1,
        "TaskDescription": "Review the contents and select correct values as indicated",
        "TaskTitle": "Equipment Condition Review"
    },
    OutputConfig={
        "S3OutputPath": OUTPUT_PATH
    }
)
flowDefinitionArn = create_workflow_definition_response['FlowDefinitionArn']
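Flow definition creation is asynchronous; if you want to be sure it's ready before sending human loops, you can optionally poll DescribeFlowDefinition, as in this sketch:

import time

# Wait until the flow definition leaves the Initializing state
while True:
    status = sagemaker_client.describe_flow_definition(
        FlowDefinitionName=flowDefinitionName)['FlowDefinitionStatus']
    print(status)
    if status in ('Active', 'Failed'):
        break
    time.sleep(2)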

Send predictions to Amazon A2I human loops

We create an item list from the pandas DataFrame where we saved the Amazon Lookout for Equipment output. Run the following notebook cell to create a list of items to send for review:

NUM_TO_REVIEW = 5 # number of line items to review
dftimestamp = sig_full_df['Timestamp'].astype(str).to_list()
dfsig001 = sig_full_df['Q_avg'].astype(str).to_list()
dfsig002 = sig_full_df['Ws1_avg'].astype(str).to_list()
dfsig003 = sig_full_df['Ot_avg'].astype(str).to_list()
dfsig004 = sig_full_df['Nf_avg'].astype(str).to_list()
dfsig046 = sig_full_df['Ba_avg'].astype(str).to_list()
sig_list = [{'timestamp': dftimestamp[x], 'reactive_power': dfsig001[x], 'wind_speed_1': dfsig002[x], 'outdoor_temp': dfsig003[x], 'grid_frequency': dfsig004[x], 'pitch_angle': dfsig046[x]} for x in range(NUM_TO_REVIEW)]
sig_list

Run the following code to create a JSON input for the Amazon A2I loop. This contains the lists that are sent as input to the Amazon A2I UI displayed to the human reviewers (ano_list, the corresponding anomaly predictions, is built in an earlier notebook cell):

ip_content = {
    'signal': sig_list,
    'anomaly': ano_list
}

Run the following notebook cell to call the Amazon A2I API to start the human loop:

import json
humanLoopName = str(uuid.uuid4())

start_loop_response = a2i.start_human_loop(
    HumanLoopName=humanLoopName,
    FlowDefinitionArn=flowDefinitionArn,
    HumanLoopInput={
        "InputContent": json.dumps(ip_content)
    }
)

You can check the status of the human loop by running the next cell in the notebook.
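That status check boils down to a DescribeHumanLoop call; a minimal version looks like this:

# Poll the human loop; it stays InProgress until the reviewers submit
resp = a2i.describe_human_loop(HumanLoopName=humanLoopName)
print(resp['HumanLoopStatus'])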

Annotate the results via the worker portal

Run the following notebook cell to get a login link to navigate to the private workforce portal:

workteamName = WORKTEAM_ARN[WORKTEAM_ARN.rfind('/') + 1:]
print("Navigate to the private worker portal and do the tasks. Make sure you've invited yourself to your workteam!")
print('https://' + sagemaker_client.describe_workteam(WorkteamName=workteamName)['Workteam']['SubDomain'])

You’re redirected to the Amazon A2I console. Select the human review job and choose Start working. After you review the changes and make corrections, choose Submit.

You can evaluate the results stored in Amazon S3.

Evaluate the results

When the labeling work is complete, your results should be available in the S3 output path specified in the human review workflow definition. The human answers are returned and saved in the JSON file. Run the notebook cell to get the results from Amazon S3:

import re
import json
import pprint

pp = pprint.PrettyPrinter(indent=4)
json_output = ''
# completed_human_loops is populated in an earlier notebook cell
for resp in completed_human_loops:
    splitted_string = re.split('s3://' + a2ibucket + '/', resp['HumanLoopOutput']['OutputS3Uri'])
    print(splitted_string[1])
    output_bucket_key = splitted_string[1]
    response = s3.get_object(Bucket=a2ibucket, Key=output_bucket_key)
    content = response["Body"].read()
    json_output = json.loads(content)
    pp.pprint(json_output)
    print('\n')

You get a response with the human-reviewed answers and the flow definition. Refer to the notebook for the complete response.

Model retraining based on augmented datasets from Amazon A2I

Now we take the Amazon A2I output, process it, and send it back to Amazon Lookout for Equipment to retrain our model based on the human corrections. Refer to the accompanying notebook for all the steps to complete in this section. Let’s look at the last few entries of our original label file:

labels_df = pd.read_csv(os.path.join(LABEL_DATA, 'labels.csv'), header=None)
labels_df[0] = pd.to_datetime(labels_df[0])
labels_df[1] = pd.to_datetime(labels_df[1])
labels_df.columns = ['start', 'end']
labels_df.tail()

The following screenshot shows the labels file.

Update labels with new date ranges

Now let’s update our existing labels dataset with the new labels we received from the Amazon A2I human review process:

faulty = False
a2i_lbl_df = labels_df
x = json_output['humanAnswers'][0]
row_df = pd.DataFrame(columns=['rownr'])
tslist = {}

# Let's first check if the users mark equipment as faulty and if so get those row numbers into a dataframe            
for i in json_output['humanAnswers']:
    print("checking equipment review...")
    x = i['answerContent']
    for idx, key in enumerate(x):
        if "faulty" in key:
            if str(x.get(key)).split(':')[1].lstrip().strip('}') == "True": # faulty equipment selected
                faulty = True
                row_df.loc[len(row_df.index)] = [key.split('-')[1]]
                print("found faulty equipment in row: " + key.split('-')[1])


# Now we will get the date ranges for the faulty choices                     
for idx,k in row_df.iterrows():
    x = json_output['humanAnswers'][0]
    strchk = "TrueStart"+k['rownr']
    endchk = "TrueEnd"+k['rownr']
    for i in x['answerContent']:
        if i == strchk:
            tslist[i] = x['answerContent'].get(i)
        if i == endchk:
            tslist[i] = x['answerContent'].get(i)

            
# And finally let's add it to our new a2i labels dataset
for idx,k in row_df.iterrows():
    x = json_output['humanAnswers'][0]
    strchk = "TrueStart"+k['rownr']
    endchk = "TrueEnd"+k['rownr']
    a2i_lbl_df.loc[len(a2i_lbl_df.index)] = [tslist[strchk], tslist[endchk]]

You get the following response:

checking equipment review...
found faulty equipment in row: 1
found faulty equipment in row: 2

The following screenshot shows the updated labels file.

Let’s upload the updated labels data to a new augmented labels file:

a2i_label_s3_dest_path = f's3://{BUCKET}/{PREFIX}/augmented-labelled-data/labels.csv'
!aws s3 cp $a2i_label_src_fname $a2i_label_s3_dest_path
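In the preceding cell, a2i_label_src_fname is defined earlier in the notebook. Conceptually, it's the local, headerless CSV written from the updated label DataFrame, roughly like the following (the local path is an assumption):

# Assumed definition: write the augmented labels to a local, headerless CSV
a2i_label_src_fname = '../data/wind-turbine/interim/a2i-labels.csv'
a2i_lbl_df.to_csv(a2i_label_src_fname, header=False, index=False)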

Update the training dataset with new measurements

We now update our original training dataset with the new measurement range based on what we got back from Amazon A2I. Run the following code to load the original dataset to a new DataFrame that we use to append our augmented data. Refer to the accompanying notebook for all the steps required.

turbine_id = 'R80711'
file = '../data/wind-turbine/final/training-data/'+turbine_id+'/'+turbine_id+'.csv'
newdf = pd.read_csv(file, index_col='Timestamp')
newdf.head()

The following screenshot shows our original training dataset snapshot.

Now we combine the updated training dataset with the simulated inference data we created earlier, for which the human reviewers indicated that they found faulty equipment when running inference. Run the following code to modify the index of the simulated inference dataset to reflect a 10-minute duration for each reading:

sig_full_df = sig_full_df.set_index('Timestamp')
tm = pd.to_datetime('2021-04-05 20:30:00')
print(tm)
new_index = pd.date_range(
    start=tm,
    periods=sig_full_df.shape[0],
    freq='10min'
)
sig_full_df.index = new_index
sig_full_df.index.name = 'Timestamp'
sig_full_df = sig_full_df.reset_index()
sig_full_df['Timestamp'] = pd.to_datetime(sig_full_df['Timestamp'], errors='coerce')

Run the following code to append the simulated inference dataset to the original training dataset:

newdf = newdf.reset_index()
newdf = pd.concat([newdf,sig_full_df])


The simulated inference data with the recent timestamp is appended to the end of the training dataset. Now let’s create a CSV file and copy the data to the training channel in Amazon S3:

TRAIN_DATA_AUGMENTED = os.path.join(TRAIN_DATA,'augmented')
os.makedirs(TRAIN_DATA_AUGMENTED, exist_ok=True)
newdf.to_csv('../data/wind-turbine/final/training-data/augmented/'+turbine_id+'.csv')
!aws s3 sync $TRAIN_DATA_AUGMENTED s3://$BUCKET/$PREFIX/training_data/augmented

Now we update the component map with this augmented dataset, reload the data into Amazon Lookout for Equipment, and retrain the model on it. Refer to the accompanying notebook for the detailed steps, and see the sketch that follows for the shape of the retraining calls.
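At a high level, retraining reuses the same helper APIs from earlier in this post, pointed at the augmented data and labels. The following is a sketch only; the new model name and time periods are assumptions, and the notebook computes the real values:

# Sketch of retraining on the augmented dataset (model name and time
# periods are assumptions; see the notebook for the real values)
lookout_model_v2 = lookout.LookoutEquipmentModel(model_name='wind-turbine-PR-v2',
                                                 dataset_name=DATASET_NAME,
                                                 region_name=REGION_NAME)
lookout_model_v2.set_time_periods(evaluation_start, evaluation_end,
                                  training_start, training_end)
lookout_model_v2.set_label_data(bucket=BUCKET,
                                prefix=PREFIX + '/augmented-labelled-data/',
                                access_role_arn=ROLE_ARN)
lookout_model_v2.set_target_sampling_rate(sampling_rate='PT10M')
lookout_model_v2.train()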

Conclusion

In this post, we walked you through how to use Amazon Lookout for Equipment to train a model to detect abnormal equipment behavior with a wind turbine dataset, review diagnostics from the trained model, review the predictions from the model with a human in the loop using Amazon A2I, augment our original training dataset, and retrain our model with the feedback from the human reviews.

With Amazon Lookout for Equipment and Amazon A2I, you can set up a continuous prediction, review, train, and feedback loop to audit predictions and improve the accuracy of your models.

Please let us know what you think of this solution and how it applies to your industrial use case. Check out the GitHub repo for full resources to this post. Visit the webpages to learn more about Amazon Lookout for Equipment and Amazon Augmented AI. We look forward to hearing from you. Happy experimentation!


About the Authors 

Dastan Aitzhanov is a Solutions Architect in Applied AI with Amazon Web Services. He specializes in architecting and building scalable cloud-based platforms with an emphasis on machine learning, internet of things, and big data-driven applications. When not working, he enjoys camping, skiing, and spending time in the great outdoors with his family.

Prem Ranga is an Enterprise Solutions Architect based out of Atlanta, GA. He is part of the Machine Learning Technical Field Community and loves working with customers on their ML and AI journey. Prem is passionate about robotics, is an Autonomous Vehicles researcher, and also built the Alexa-controlled Beer Pours in Houston and other locations.

Mona Mona is a Senior AI/ML Specialist Solutions Architect based out of Arlington, VA. She works with public sector customers, and helps them adopt machine learning on a large scale. She is passionate about NLP and ML explainability areas in AI/ML.

Baris Yasin is a Solutions Architect at AWS. He’s passionate about AI/ML & Analytics technologies and helping startup customers solve challenging business and technical problems with AWS.
