AI Summit: US Energy Secretary Highlights AI’s Role in Science, Energy and Security

AI can help solve some of the world’s biggest challenges — whether climate change, cancer or national security — U.S. Secretary of Energy Jennifer Granholm emphasized today during her remarks at the AI for Science, Energy and Security session at the NVIDIA AI Summit, in Washington, D.C.

Granholm went on to highlight the pivotal role AI is playing in tackling major national challenges, from energy innovation to bolstering national security.

“We need to use AI for both offense and defense — offense to solve these big problems and defense to make sure the bad guys are not using AI for nefarious purposes,” she said.

Granholm, who calls the Department of Energy “America’s Solutions Department,” highlighted the agency’s focus on solving the world’s biggest problems.

“Yes, climate change, obviously, but a whole slew of other problems, too … quantum computing and all sorts of next-generation technologies,” she said, pointing out that AI is a driving force behind many of these advances.

“AI can really help to solve some of those huge problems — whether climate change, cancer or national security,” she said. “The possibilities of AI for good are awesome, awesome.”

Following Granholm’s 15-minute address, a panel of experts from government, academia and industry took the stage to further discuss how AI accelerates advancements in scientific discovery, national security and energy innovation.

“AI is going to be transformative to our mission space.… We’re going to see these big step changes in capabilities,” said Helena Fu, director of the Office of Critical and Emerging Technologies at the Department of Energy, underscoring AI’s potential in safeguarding critical infrastructure and addressing cyber threats.

During her remarks, Granholm also stressed that AI’s increasing energy demands must be met responsibly.

“We are going to see about a 15% increase in power demand on our electric grid as a result of the data centers that we want to be located in the United States,” she explained.

However, the DOE is taking steps to meet this demand with clean energy.

“This year, in 2024, the United States will have added 30 Hoover Dams’ worth of clean power to our electric grid,” Granholm announced, emphasizing that the clean energy revolution is well underway.

AI’s Impact on Scientific Discovery and National Security

The discussion then shifted to how AI is revolutionizing scientific research and national security.

Tanya Das, director of the Energy Program at the Bipartisan Policy Center, pointed out that “AI can accelerate every stage of the innovation pipeline in the energy sector … starting from scientific discovery at the very beginning … going through to deployment and permitting.”

Das also highlighted the growing interest in Congress to support AI innovations, adding, “Congress is paying attention to this issue, and, I think, very motivated to take action on updating what the national vision is for artificial intelligence.”

Fu reiterated the department’s comprehensive approach, stating, “We cross from open science through national security, and we do this at scale.… Whether they be around energy security, resilience, climate change or the national security challenges that we’re seeing every day emerging.”

She also touched on the DOE’s future goals: “Our scientific systems will need access to AI systems,” Fu said, emphasizing the need to bridge scientific reasoning with the new kinds of models that will need to be developed for AI.

Collaboration Across Sectors: Government, Academia and Industry

Karthik Duraisamy, director of the Michigan Institute for Computational Discovery and Engineering at the University of Michigan, highlighted the power of collaboration in advancing scientific research through AI.

“Think about the scientific endeavor as 5% creativity and innovation and 95% intense labor. AI amplifies that 5% by a bit, and then significantly accelerates the 95% part,” Duraisamy explained. “That is going to completely transform science.”

Duraisamy further elaborated on the role AI could play as a persistent collaborator, envisioning a future where AI can work alongside scientists over weeks, months and years, generating new ideas and following through on complex projects.

“Instead of replacing graduate students, I think graduate students can be smarter than the professors on day one,” he said, emphasizing the potential for AI to support long-term research and innovation.

Learn more about how this week’s AI Summit is showcasing the ways AI is shaping the future across industries, and how NVIDIA’s solutions are laying the groundwork for continued innovation.

Read More

What’s the ROI? Getting the Most Out of LLM Inference

Large language models and the applications they power enable unprecedented opportunities for organizations to get deeper insights from their data reservoirs and to build entirely new classes of applications.

But with opportunities often come challenges.

Both on premises and in the cloud, applications that are expected to run in real time place significant demands on data center infrastructure to simultaneously deliver high throughput and low latency with one platform investment.

To drive continuous performance improvements and improve the return on infrastructure investments, NVIDIA regularly optimizes state-of-the-art community models, including Meta’s Llama, Google’s Gemma, Microsoft’s Phi and our own NVLM-D-72B, released just a few weeks ago.

Relentless Improvements

Performance improvements let our customers and partners serve more complex models and reduce the infrastructure needed to host them. NVIDIA optimizes performance at every layer of the technology stack, including TensorRT-LLM, a purpose-built library to deliver state-of-the-art performance on the latest LLMs. On the open-source Llama 70B model, which delivers very high accuracy, we’ve already improved minimum-latency performance by 3.5x in less than a year.

We’re constantly improving our platform performance and regularly publish performance updates. Each week, improvements to NVIDIA software libraries are published, allowing customers to get more from the very same GPUs. For example, in just a few months’ time, we’ve improved our low-latency Llama 70B performance by 3.5x.

Over the past 10 months, NVIDIA has increased performance on the Llama 70B model by 3.5x through a combination of optimized kernels, multi-head attention techniques and a variety of parallelization techniques.
NVIDIA has increased performance on the Llama 70B model by 3.5x.

In the most recent round of MLPerf Inference 4.1, we made our first-ever submission with the Blackwell platform. It delivered 4x more performance than the previous generation.

This submission was also the first-ever MLPerf submission to use FP4 precision. Narrower precision formats, like FP4, reduce memory footprint and memory traffic, and also boost computational throughput. The process takes advantage of Blackwell’s second-generation Transformer Engine, and with advanced quantization techniques that are part of TensorRT Model Optimizer, the Blackwell submission met the strict accuracy targets of the MLPerf benchmark.
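
To see why narrower formats shrink memory footprint and traffic, here is a minimal, illustrative NumPy sketch of block-scaled 4-bit quantization. It is a simplified stand-in for FP4, not the actual Transformer Engine or TensorRT Model Optimizer recipe.

```python
# Illustrative only: simplified block-scaled 4-bit quantization, showing the
# memory savings behind narrow formats such as FP4. The production FP4 path in
# Blackwell's Transformer Engine / TensorRT Model Optimizer is more involved.
import numpy as np

def quantize_block_4bit(w: np.ndarray, block: int = 32):
    """Quantize an FP16 weight vector to signed 4-bit codes with one FP16 scale per block."""
    w = w.astype(np.float16).reshape(-1, block)
    scales = np.abs(w).max(axis=1, keepdims=True) / 7.0 + 1e-8
    codes = np.clip(np.round(w / scales), -8, 7).astype(np.int8)  # values fit in 4 bits
    return codes, scales.astype(np.float16)

def dequantize(codes, scales):
    return (codes.astype(np.float16) * scales).reshape(-1)

w = np.random.randn(4096 * 4096).astype(np.float16)
codes, scales = quantize_block_4bit(w)

fp16_bytes = w.nbytes
quant_bytes = codes.size // 2 + scales.size * 2  # two 4-bit codes per byte, one FP16 scale per block
print(f"FP16: {fp16_bytes / 1e6:.1f} MB, block-scaled 4-bit: {quant_bytes / 1e6:.1f} MB "
      f"({fp16_bytes / quant_bytes:.1f}x smaller)")
print("max abs reconstruction error:", float(np.abs(dequantize(codes, scales) - w).max()))
```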

MLPerf Inference v4.1 Closed, Data Center. Results retrieved from www.mlperf.org on August 28, 2024. Blackwell results measured on single GPU and retrieved from entry 4.1-0074 in the Closed, Preview category. H100 results from entry 4.1-0043 in the Closed, Available category on 8x H100 system and divided by GPU count for per GPU comparison. Per-GPU throughput is not a primary metric of MLPerf Inference. The MLPerf name and logo are registered and unregistered trademarks of MLCommons Association in the United States and other countries. All rights reserved. Unauthorized use strictly prohibited. See www.mlcommons.org for more information.
Blackwell B200 delivers up to 4x more performance versus previous generation on MLPerf Inference v4.1’s Llama 2 70B workload.

Improvements in Blackwell haven’t stopped the continued acceleration of Hopper. Over the last year, H100 performance on this MLPerf workload has increased 3.4x thanks to regular software advancements. This means that NVIDIA’s peak performance today, on Blackwell, is 10x faster than it was just one year ago on Hopper.

MLPerf Inference v4.1 Closed, Data Center. Results retrieved from www.mlperf.org from multiple dates and entries. The October 2023, December 2023, May 2024 and October 2024 data points are from internal measurements. The remaining data points are from official submissions. All results using eight accelerators. The MLPerf name and logo are registered and unregistered trademarks of MLCommons Association in the United States and other countries. All rights reserved. Unauthorized use strictly prohibited. See www.mlcommons.org for more information.
These results track progress on the MLPerf Inference Llama 2 70B Offline scenario over the past year.

Our ongoing work is incorporated into TensorRT-LLM, a purpose-built library for accelerating LLMs that contains state-of-the-art optimizations for performing inference efficiently on NVIDIA GPUs. TensorRT-LLM is built on top of the TensorRT Deep Learning Inference library and leverages much of TensorRT’s deep learning optimizations, with additional LLM-specific improvements.

Improving Llama in Leaps and Bounds

More recently, we’ve continued optimizing variants of Meta’s Llama models, including versions 3.1 and 3.2, as well as the 70B model size and the biggest model, 405B. These optimizations include custom quantization recipes and parallelization techniques that split the model more efficiently across multiple GPUs, leveraging NVIDIA NVLink and NVSwitch interconnect technologies. Cutting-edge LLMs like Llama 3.1 405B are very demanding and require the combined performance of multiple state-of-the-art GPUs for fast responses.

Parallelism techniques require a hardware platform with a robust GPU-to-GPU interconnect fabric to get maximum performance and avoid communication bottlenecks. Each NVIDIA H200 Tensor Core GPU features fourth-generation NVLink, which provides a whopping 900GB/s of GPU-to-GPU bandwidth. Every eight-GPU HGX H200 platform also ships with four NVLink Switches, enabling every H200 GPU to communicate with any other H200 GPU at 900GB/s, simultaneously.

Many LLM deployments use parallelism rather than keeping the workload on a single GPU, which can become a compute bottleneck. LLM serving seeks to balance low latency and high throughput, and the optimal parallelization technique depends on application requirements.

For instance, if lowest latency is the priority, tensor parallelism is critical, as the combined compute performance of multiple GPUs can be used to serve tokens to users more quickly. However, for use cases where peak throughput across all users is prioritized, pipeline parallelism can efficiently boost overall server throughput.

The table below shows that tensor parallelism can deliver over 5x more throughput in minimum latency scenarios, whereas pipeline parallelism brings 50% more performance for maximum throughput use cases.

For production deployments that seek to maximize throughput within a given latency budget, a platform needs to be able to combine both techniques effectively, as TensorRT-LLM does.
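
As a rough way to reason about that trade-off, the toy model below estimates per-token latency under tensor parallelism and aggregate throughput under pipeline parallelism. The constants are illustrative placeholders, not measured TensorRT-LLM or GPU numbers.

```python
# Toy model of the tensor-parallel (TP) vs. pipeline-parallel (PP) trade-off.
# All constants are illustrative placeholders, not measurements.

def tp_latency_ms(compute_ms: float, tp: int, comm_ms: float = 0.2) -> float:
    # TP splits each layer's math across `tp` GPUs, cutting compute time per
    # token, but adds collective communication that grows with the group size.
    return compute_ms / tp + comm_ms * tp

def pp_throughput_tokens_per_s(compute_ms: float, pp: int) -> float:
    # PP leaves per-token latency roughly unchanged, but its `pp` stages work
    # on different requests concurrently, raising aggregate server throughput.
    return 1000.0 / compute_ms * pp

base_compute_ms = 40.0  # hypothetical per-token compute time on a single GPU

for n in (1, 2, 4, 8):
    print(f"TP={n}: ~{tp_latency_ms(base_compute_ms, n):.1f} ms/token   "
          f"PP={n}: ~{pp_throughput_tokens_per_s(base_compute_ms, n):.0f} tokens/s aggregate")
```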

Read the technical blog on boosting Llama 3.1 405B throughput to learn more about these techniques.

Different scenarios have different requirements, and parallelism techniques bring optimal performance for each of these scenarios.

The Virtuous Cycle

Over the lifecycle of our architectures, we deliver significant performance gains from ongoing software tuning and optimization. These improvements translate into additional value for customers who train and deploy on our platforms. They’re able to create more capable models and applications and deploy their existing models using less infrastructure, enhancing their ROI.

As new LLMs and other generative AI models continue to come to market, NVIDIA will continue to run them optimally on its platforms and make them easier to deploy with technologies like NIM microservices and NIM Agent Blueprints.

Read More

Flux and Furious: New Image Generation Model Runs Fastest on RTX AI PCs and Workstations

Editor’s note: This post is part of the AI Decoded series, which demystifies AI by making the technology more accessible, and showcases new hardware, software, tools and accelerations for GeForce RTX PC and NVIDIA RTX workstation users.

Image generation models — a popular subset of generative AI — can parse and understand written language, then translate words into images in almost any style.

Representing the cutting edge of what’s possible in image generation, a new series of models from Black Forest Labs — now available to try on PCs and workstations — runs fastest on GeForce RTX and NVIDIA RTX GPUs.

Fluxible Capabilities

FLUX.1 is a text-to-image generation model suite developed by Black Forest Labs. The models are built on the diffusion transformer (DiT) architecture, which allows models with a high parameter count to remain efficient. The FLUX.1 models have 12 billion parameters for high-quality image generation.

DiT models are efficient for their size but still computationally intensive — and NVIDIA RTX GPUs are essential for handling these new models, the largest of which can’t run on non-RTX GPUs without significant tweaking. Flux models now support the NVIDIA TensorRT software development kit, which improves their performance by up to 20%. Users can try Flux and other models with TensorRT in ComfyUI.

Prompt: “A magazine photo of a monkey bathing in a hot spring in a snowstorm with steam coming off the water.” Source: NVIDIA

Flux Appeal

FLUX.1 excels in generating high-quality, diverse images with exceptional prompt adherence, which refers to how accurately the AI interprets and executes instructions. High prompt adherence means the generated image closely matches the text prompt’s described elements, style and mood. Low prompt adherence results in images that may partially or completely deviate from given instructions.

FLUX.1 is noted for its ability to render the human anatomy accurately, including for challenging, intricate features like hands and faces. FLUX.1 also significantly improves the generation of legible text within images, addressing another common challenge in text-to-image models. This makes FLUX.1 models suitable for applications that require precise text representation, such as promotional materials and book covers.

FLUX.1 is available in three variants, offering users choices to best fit their workflows without sacrificing quality:

  • FLUX.1 pro: State-of-the-art quality for enterprise users; accessible through an application programming interface.
  • FLUX.1 dev: A distilled, free version of FLUX.1 pro that still provides high quality.
  • FLUX.1 schnell: The fastest model, ideal for local development and personal use; has a permissive Apache 2.0 license.

The dev and schnell models are open source, and Black Forest Labs provides access to its weights on the popular platform Hugging Face. This encourages innovation and collaboration within the image generation community by allowing researchers and developers to build upon and enhance the models.
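
For readers who want to experiment locally, here is a minimal sketch using the Hugging Face diffusers library. It assumes a recent diffusers release that includes FluxPipeline, access to the black-forest-labs/FLUX.1-schnell checkpoint on Hugging Face and an RTX GPU with enough memory; adjust precision and offloading for smaller cards.

```python
# Minimal sketch: generating an image with FLUX.1 [schnell] via Hugging Face
# diffusers. Assumes a recent diffusers release with FluxPipeline and an
# NVIDIA GPU with sufficient VRAM.
import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-schnell", torch_dtype=torch.bfloat16
)
pipe.enable_model_cpu_offload()  # optional: trade some speed for lower VRAM use

image = pipe(
    prompt="a magazine photo of a monkey bathing in a hot spring in a snowstorm",
    num_inference_steps=4,   # schnell is distilled for few-step sampling
    guidance_scale=0.0,      # schnell is typically run without classifier-free guidance
    height=768,
    width=1024,
).images[0]
image.save("flux_schnell_sample.png")
```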

Embraced by the Community

The Flux models’ dev and schnell variants were downloaded more than 2 million times on Hugging Face within three weeks of their launch.

Users have praised FLUX.1 for its abilities to produce visually stunning images with exceptional detail and realism, as well as to process complex prompts without requiring extensive parameter adjustments.

Prompt: “A highly detailed professional close-up photo of an animorphic Bengal tiger wearing a white, ribbed tank top, sunglasses and headphones around his neck as a DJ with its paws on the turntable on stage at an outdoor electronic dance music concert in Ibiza at night; party atmosphere, wispy smoke with caustic lighting.” Source: NVIDIA

Prompt: “A photographic-quality image of a bustling city street during a rainy evening with a yellow taxi cab parked at the curb with its headlights on, reflecting off the wet pavement. A woman in a red coat is standing under a bright green umbrella, looking at her smartphone. On the left, there is a coffee shop with a neon sign that reads ‘Café Mocha’ in blue letters. The shop has large windows, through which people can be seen enjoying their drinks. Streetlights illuminate the area, casting a warm glow over the scene, while raindrops create a misty effect in the air. In the background, a tall building with a large digital clock displays the time as 8:45 p.m.” Source: NVIDIA

In addition, FLUX.1’s versatility in handling various artistic styles and its efficiency in quickly generating images make it a valuable tool for both personal and professional projects.

Get Started

Users can access FLUX.1 using popular community webpages like ComfyUI. The community-run ComfyUI Wiki includes step-by-step instructions for getting started.

Many YouTube creators, such as MDMZ, also offer video tutorials on Flux models.

Share your generated images on social media using the hashtag #fluxRTX for a chance to be featured on NVIDIA AI’s channels.

Generative AI is transforming gaming, videoconferencing and interactive experiences of all kinds. Make sense of what’s new and what’s next by subscribing to the AI Decoded newsletter.

Read More

NVIDIA AI Summit Highlights Game-Changing Energy Efficiency and AI-Driven Innovation

Accelerated computing is sustainable computing, Bob Pette, NVIDIA’s vice president and general manager of enterprise platforms, explained in a keynote at the NVIDIA AI Summit on Tuesday in Washington, D.C.

NVIDIA’s accelerated computing isn’t just efficient. It’s critical to the next wave of industrial, scientific and healthcare transformations.

“We are in the dawn of a new industrial revolution,” Pette told an audience of policymakers, press, developers and entrepreneurs gathered for the event. “I’m just here to tell you that we’re designing our systems with not just performance in mind, but with energy efficiency in mind.”

NVIDIA’s Blackwell platform has achieved groundbreaking energy efficiency in AI computing, reducing energy consumption by up to 2,000x over the past decade for training models like GPT-4.

NVIDIA accelerated computing is cutting energy use for token generation — the output from AI models — by 100,000x, underscoring the value of accelerated computing for sustainability amid the rapid adoption of AI worldwide.

“These AI factories produce product. Those products are tokens, tokens are intelligence, and intelligence is money,” Pette said. That’s what “will revolutionize every industry on this planet.”

NVIDIA’s CUDA libraries, which have been fundamental in enabling breakthroughs across industries, now power over 4,000 accelerated applications, Pette explained.

“CUDA enables acceleration…. It also turns out to be one of the most impressive ways to reduce energy consumption,” Pette said.

These libraries are central to the company’s energy-efficient AI innovations driving significant performance gains while minimizing power consumption.

Pette also detailed how NVIDIA’s AI software helps organizations deploy AI solutions quickly and efficiently, enabling businesses to innovate faster and solve complex problems across sectors.

Pette discussed the concept of agentic AI, which goes beyond traditional AI by enabling intelligent agents to perceive, reason and act autonomously.

Agentic AI is capable of “reasoning, of learning, and taking action,” Pette said.

These AI agents are transforming industries by automating complex tasks and accelerating innovation in sectors like manufacturing, customer service and healthcare, he explained.

He also described how AI agents empower businesses to drive innovation in healthcare, manufacturing, scientific research and climate modeling.

With agentic AI, “you can do in minutes what used to take days,” Pette said.

NVIDIA, in collaboration with its partners, is tackling some of the world’s greatest challenges, including improving diagnostics and healthcare delivery, advancing climate modeling efforts and even helping find signs of life beyond our planet.

NVIDIA is collaborating with SETI to conduct real-time AI searches for fast radio bursts from distant galaxies, helping continue the exploration of space, Pette said.

Pette emphasized that NVIDIA is unlocking a $10 trillion opportunity in healthcare.

Through AI, NVIDIA is accelerating innovations in diagnostics, drug discovery and medical imaging, helping transform patient care worldwide.

Solutions like the NVIDIA Clara medical imaging platform are revolutionizing diagnostics, Parabricks is enabling breakthroughs in genomics research and the MONAI AI framework is advancing medical imaging capabilities.

Pette highlighted partnerships with leading institutions, including Carnegie Mellon University and the University of Pittsburgh, fostering AI innovation and development.

Pette also described how NVIDIA’s collaboration with federal agencies illustrates the importance of public-private partnerships in advancing AI-driven solutions in healthcare, climate modeling and national security.

Pette also announced a new NVIDIA NIM Agent Blueprint for cybersecurity, a powerful tool that enables organizations to safeguard critical infrastructure with AI-driven, real-time threat detection and analysis.

This blueprint reduces threat response times from days to seconds, representing a significant leap forward in protecting industries.

“Agentic systems can access tools and reason through full lines of thought to provide instant one-click assessments,” Pette said. “This boosts productivity by allowing security analysts to focus on the most critical tasks while AI handles the heavy lifting of analysis, delivering fast and actionable insights.”

NVIDIA’s accelerated computing solutions are advancing climate research by enabling more accurate and faster climate modeling. This technology is helping scientists tackle some of the most urgent environmental challenges, from monitoring global temperatures to predicting natural disasters.

Pette described how the NVIDIA Earth-2 platform enables climate experts to import data from multiple sources, fusing them together for analysis using NVIDIA Omniverse. “NVIDIA Earth-2 brings together the power of simulation, AI and visualization to empower the climate tech ecosystem,” Pette said.

NVIDIA’s Greg Estes on Building the AI Workforce of the Future

Following Pette’s keynote, Greg Estes, NVIDIA’s vice president of corporate marketing and developer programs, underscored the company’s dedication to workforce training through initiatives like the NVIDIA AI Tech Community.

And through its Deep Learning Institute, NVIDIA has already trained more than 600,000 people worldwide, equipping the next generation with the critical skills to navigate and lead in the AI-driven future.

Exploring AI’s Role in Cybersecurity and Sustainability

Throughout the week, industry leaders are exploring AI’s role in solving critical issues in fields like cybersecurity and sustainability.

Upcoming sessions will feature U.S. Secretary of Energy Jennifer Granholm, who will discuss how AI is advancing energy innovation and scientific discovery.

Other speakers will address AI’s role in climate monitoring and environmental management, further showcasing the technology’s ability to address global sustainability challenges.

Learn more about how this week’s AI Summit is showcasing the ways AI is shaping the future across industries, and how NVIDIA’s solutions are laying the groundwork for continued innovation.

Read More

US Healthcare System Deploys AI Agents, From Research to Rounds

The U.S. healthcare system is adopting digital health agents to harness AI across the board, from research laboratories to clinical settings.

The latest AI-accelerated tools — on display at the NVIDIA AI Summit taking place this week in Washington, D.C. — include NVIDIA NIM, a collection of cloud-native microservices that support AI model deployment and execution, and NVIDIA NIM Agent Blueprints, a catalog of pretrained, customizable workflows. 

These technologies are already in use in the public sector to advance the analysis of medical images, aid the search for new therapeutics and extract information from massive PDF databases containing text, tables and graphs. 

For example, researchers at the National Cancer Institute, part of the National Institutes of Health (NIH), are using several AI models built with NVIDIA MONAI for medical imaging — including the VISTA-3D NIM foundation model for segmenting and annotating 3D CT images. A team at NIH’s National Center for Advancing Translational Sciences (NCATS) is using the NIM Agent Blueprint for generative AI-based virtual screening to reduce the time and cost of developing novel drug molecules.

With NVIDIA NIM and NIM Agent Blueprints, medical researchers across the public sector can jump-start their adoption of state-of-the-art, optimized AI models to accelerate their work. The pretrained models are customizable based on an organization’s own data and can be continually refined based on user feedback.

NIM microservices and NIM Agent Blueprints are available at ai.nvidia.com and accessible through a wide variety of cloud service providers, global system integrators and technology solutions providers. 
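
As a flavor of how these microservices are typically consumed, the sketch below calls a hosted NIM through its OpenAI-compatible REST API. The endpoint, model name and environment variable are examples rather than a prescription; check the catalog entry for the exact endpoint and schema of the microservice you deploy.

```python
# Illustrative call to a hosted NIM microservice via its OpenAI-compatible REST
# API. The endpoint URL, model identifier and API-key variable are examples;
# substitute the values listed for the specific NIM you are using.
import os
import requests

api_key = os.environ["NVIDIA_API_KEY"]  # your key from the NVIDIA API catalog

response = requests.post(
    "https://integrate.api.nvidia.com/v1/chat/completions",
    headers={"Authorization": f"Bearer {api_key}"},
    json={
        "model": "meta/llama-3.1-70b-instruct",  # example model identifier
        "messages": [
            {"role": "user",
             "content": "Summarize how 3D CT segmentation can support radiology research."}
        ],
        "max_tokens": 256,
        "temperature": 0.2,
    },
    timeout=60,
)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])
```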

Building With NIM Agent Blueprints

Dozens of NIM microservices and a growing set of NIM Agent Blueprints are available for developers to experience and download for free. They can be deployed in production with the NVIDIA AI Enterprise software platform.

  • The blueprint for generative virtual screening for drug discovery brings together three NIM microservices to help researchers search and optimize libraries of small molecules to identify promising candidates that bind to a target protein.
  • The multimodal PDF data extraction blueprint uses NVIDIA NeMo Retriever NIM microservices to extract insights from enterprise documents, helping developers build powerful AI agents and chatbots.
  • The digital human blueprint supports the creation of interactive, AI-powered avatars for customer service. These avatars have potential applications in telehealth and nonclinical aspects of patient care, such as scheduling appointments, filling out intake forms and managing prescriptions.

Two new NIM microservices for drug discovery are now available on ai.nvidia.com to help researchers understand how proteins bind to target molecules, a crucial step in drug design. By conducting more of this preclinical research digitally, scientists can narrow down their pool of drug candidates before testing in the lab — making the discovery process more efficient and less expensive. 

With the AlphaFold2-Multimer NIM microservice, researchers can accurately predict protein structures from their sequences in minutes, reducing the need for time-consuming tests in the lab. The RFdiffusion NIM microservice uses generative AI to design novel proteins that are promising drug candidates because they’re likely to bind with a target molecule.

NCATS Accelerates Drug Discovery Research

ASPIRE, a research laboratory at NCATS, is evaluating the NIM Agent Blueprint for virtual screening and is using RAPIDS, a suite of open-source software libraries for GPU-accelerated data science, to accelerate its drug discovery research. Using the cuGraph library for graph data analytics and cuDF library for accelerating data frames, the lab’s researchers can map chemical reactions across the vast unknown chemical space. 

The NCATS informatics team reported that with NVIDIA AI, processes that used to take hours on CPU-based infrastructure are now done in seconds.
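
For context, the snippet below is a generic example of the kind of GPU-accelerated dataframe and graph work RAPIDS enables, not the NCATS pipeline itself. The file name and columns are hypothetical, but the cuDF and cuGraph calls mirror the familiar pandas-style APIs.

```python
# Generic RAPIDS example (not the actual NCATS workflow): aggregate a table of
# chemical reactions on the GPU with cuDF, then build a reaction graph with
# cuGraph. The CSV file and its columns are hypothetical placeholders.
import cudf
import cugraph

reactions = cudf.read_csv("reactions.csv")  # columns: reactant, product, yield

# Dataframe analytics on the GPU with a pandas-like API.
top_reactants = (
    reactions.groupby("reactant")["yield"]
    .mean()
    .sort_values(ascending=False)
    .head(10)
)
print(top_reactants)

# Graph analytics on the same GPU-resident data.
edges = reactions.rename(columns={"reactant": "src", "product": "dst"})[["src", "dst"]]
G = cugraph.Graph()
G.from_cudf_edgelist(edges, source="src", destination="dst")
print("nodes:", G.number_of_vertices(), "edges:", G.number_of_edges())
```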

Massive quantities of healthcare data — including research papers, radiology reports and patient records — are unstructured and locked in PDF documents, making it difficult for researchers to quickly search for information. 

The Genetic and Rare Diseases Information Center, also run by NCATS, is exploring using the PDF data extraction blueprint to develop generative AI tools that enhance the center’s ability to glean information from previously unsearchable databases. These tools will help answer questions from those affected by rare diseases.

“The center analyzes data sources spanning the National Library of Medicine, the Orphanet database and other institutes and centers within the NIH to answer patient questions,” said Sam Michael, chief information officer of NCATS. “AI-powered PDF data extraction can make it massively easier to extract valuable information from previously unsearchable databases.”  

Mi-NIM-al Effort, Maximum Benefit: Getting Started With NIM 

A growing number of startups, cloud service providers and global systems integrators include NVIDIA NIM microservices and NIM Agent Blueprints as part of their platforms and services, making it easy for federal healthcare researchers to get started.   

Abridge, an NVIDIA Inception startup and NVentures portfolio company, was recently awarded a contract from the U.S. Department of Veterans Affairs to help transcribe and summarize clinical appointments, reducing the burden on doctors to document each patient interaction.

The company uses NVIDIA TensorRT-LLM to accelerate AI inference and NVIDIA Triton Inference Server for deploying its audio-to-text and content summarization models at scale, some of the same technologies that power NIM microservices.

The NIM Agent Blueprint for virtual screening is now available through AWS HealthOmics, a purpose-built service that helps customers orchestrate biological data analyses. 

Amazon Web Services (AWS) is a partner of the NIH Science and Technology Research Infrastructure for Discovery, Experimentation, and Sustainability Initiative, aka STRIDES Initiative, which aims to modernize the biomedical research ecosystem by reducing economic and process barriers to accessing commercial cloud services. NVIDIA and AWS are collaborating to make NIM Agent Blueprints broadly accessible to the biomedical research community. 

ConcertAI, another NVIDIA Inception member, is an oncology AI technology company focused on research and clinical standard-of-care solutions. The company is integrating NIM microservices, NVIDIA CUDA-X microservices and the NVIDIA NeMo platform into its suite of AI solutions for large-scale clinical data processing, multi-agent models and clinical foundation models. 

NVIDIA NIM microservices are supporting ConcertAI’s high-performance, low-latency AI models through its CARA AI platform. Use cases include clinical trial design, optimization and patient matching — as well as solutions that can help boost the standard of care and augment clinical decision-making.

Global systems integrator Deloitte is bringing the NIM Agent Blueprint for virtual screening to its customers worldwide. With Deloitte Atlas AI, the company can help clients at federal health agencies easily use NIM to adopt and deploy the latest generative AI pipelines for drug discovery. 

Experience NVIDIA NIM microservices and NIM Agent Blueprints today.

NVIDIA AI Summit Highlights Healthcare Innovation

At the NVIDIA AI Summit in Washington, NVIDIA leaders, customers and partners are presenting over 50 sessions highlighting impactful work in the public sector. 

Register for a free virtual pass to hear how healthcare researchers are accelerating innovation with NVIDIA-powered AI.

Watch the AI Summit special address by Bob Pette, vice president of enterprise platforms at NVIDIA.

See notice regarding software product information.

Read More

Accelerated Computing Key to Yale’s Quantum Research

A recently released joint research paper by Yale, Moderna and NVIDIA reviews how techniques from quantum machine learning (QML) may enhance drug discovery methods by better predicting molecular properties.

Ultimately, this could lead to the more efficient generation of new pharmaceutical therapies.

The review also emphasizes that the key tool for exploring these methods is GPU-accelerated simulation of quantum algorithms.

The study focuses on how future quantum neural networks can use quantum computing to enhance existing AI techniques.

Applied to the pharmaceutical industry, these advances offer researchers the ability to streamline complex tasks in drug discovery.

Researching how such quantum neural networks impact real-world use cases like drug discovery requires intensive, large-scale simulations of future noiseless quantum processing units (QPUs).

This is just one example of how, as quantum computing scales up, an increasing number of challenges are only approachable with GPU-accelerated supercomputing.

The review article explores how NVIDIA’s CUDA-Q quantum development platform provides a unique tool for running such multi-GPU accelerated simulations of QML workloads.

The study also highlights CUDA-Q’s ability to simulate multiple QPUs in parallel. This is a key capability for studying realistic large-scale devices, and in this particular study it also enabled the exploration of quantum machine learning tasks that batch training data.

Many of the QML techniques covered by the review — such as hybrid quantum convolutional neural networks — also rely on CUDA-Q’s ability to write programs that interweave classical and quantum resources.
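
A minimal CUDA-Q sketch of that hybrid pattern is below: a parameterized quantum kernel is simulated on the GPU-accelerated target and its expectation value drives a simple classical parameter sweep. It is a generic illustration, not one of the circuits from the Yale, Moderna and NVIDIA study.

```python
# Generic hybrid quantum-classical loop in CUDA-Q (not the circuits from the
# study): a parameterized kernel runs on the GPU-accelerated simulator and a
# classical outer loop searches over the variational parameter.
import cudaq
from cudaq import spin

cudaq.set_target("nvidia")  # GPU-accelerated statevector simulation

@cudaq.kernel
def ansatz(theta: float):
    q = cudaq.qvector(2)
    x(q[0])
    ry(theta, q[1])
    x.ctrl(q[1], q[0])

# Example two-qubit Hamiltonian whose ground-state energy we want to estimate.
hamiltonian = (5.907 - 2.1433 * spin.x(0) * spin.x(1)
               - 2.1433 * spin.y(0) * spin.y(1)
               + 0.21829 * spin.z(0) - 6.125 * spin.z(1))

# Classical step: scan the parameter and keep the minimum expectation value.
best_energy, best_theta = min(
    (cudaq.observe(ansatz, hamiltonian, t / 10.0).expectation(), t / 10.0)
    for t in range(-30, 31)
)
print(f"minimum energy ~{best_energy:.4f} at theta = {best_theta:.1f}")
```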

The increased reliance on GPU supercomputing demonstrated in this work is the latest example of NVIDIA’s growing involvement in developing useful quantum computers.

NVIDIA plans to further highlight its role in the future of quantum computing at the SC24 conference, Nov. 17-22 in Atlanta.

Read More

A Not-So-Secret Agent: NVIDIA Unveils NIM Blueprint for Cybersecurity

Artificial intelligence is transforming cybersecurity with new generative AI tools and capabilities that were once the stuff of science fiction. And like many of the heroes in science fiction, they’re arriving just in time.

AI-enhanced cybersecurity can detect and respond to potential threats in real time — often before human analysts even become aware of them. It can analyze vast amounts of data to identify patterns and anomalies that might indicate a breach. And AI agents can automate routine security tasks, freeing up human experts to focus on more complex challenges.

All of these capabilities start with software, so NVIDIA has introduced an NVIDIA NIM Agent Blueprint for container security that developers can adapt to meet their own application requirements.

The blueprint uses NVIDIA NIM microservices, the NVIDIA Morpheus cybersecurity AI framework, NVIDIA cuVS and NVIDIA RAPIDS accelerated data analytics to help accelerate analysis of common vulnerabilities and exposures (CVEs) at enterprise scale — from days to just seconds.

All of this is included in NVIDIA AI Enterprise, a cloud-native software platform for developing and deploying secure, supported production AI applications.

Deloitte Secures Software With NVIDIA AI

Deloitte is among the first to use the NVIDIA NIM Agent Blueprint for container security in its cybersecurity solutions. The blueprint supports agentic analysis of open-source software to help enterprises build secure AI, and it can help them enhance and simplify cybersecurity by improving efficiency and reducing the time needed to identify threats and potential adversarial activity.

“Cybersecurity has emerged as a critical pillar in protecting digital infrastructure in the U.S. and around the world,” said Mike Morris, managing director, Deloitte & Touche LLP. “By incorporating NVIDIA’s NIM Agent Blueprint into our cybersecurity solutions, we’re able to offer our clients improved speed and accuracy in identifying and mitigating potential security threats.”

Securing Software With Generative AI

Vulnerability detection and resolution is a top use case for generative AI in software delivery, according to IDC(1).

The NIM Agent Blueprint for container security includes everything an enterprise developer needs to build and deploy customized generative AI applications for rapid vulnerability analysis of software containers.

Software containers incorporate large numbers of packages and releases, some of which may be subject to security vulnerabilities. Traditionally, security analysts would need to review each of these packages to understand potential security exploits across any software deployment.

These manual processes are tedious, time-consuming and error-prone. They’re also difficult to automate effectively because of the complexity of aligning software packages, dependencies, configurations and the operating environment.

With generative AI, cybersecurity applications can rapidly digest and decipher information across a wide range of data sources, including natural language, to better understand the context in which potential vulnerabilities could be exploited.

Enterprises can then create cybersecurity AI agents that take action on this generative AI intelligence. The NIM Agent Blueprint for container security enables quick, automatic and actionable CVE risk analysis using large language models and retrieval-augmented generation for agentic AI applications. It helps developers and security teams use AI to protect software, enhancing accuracy and efficiency and streamlining the issues that human analysts need to investigate.
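
To make that workflow concrete, here is a deliberately simplified sketch of the retrieve-then-reason loop such an agent performs. The three helper functions are stubs standing in for a real SBOM scanner, a retrieval step over threat intelligence and an LLM endpoint such as a NIM microservice; they are not the blueprint’s actual interfaces.

```python
# Deliberately simplified sketch of agentic CVE triage with retrieval-augmented
# generation. The helpers below are stubs, not the Morpheus or NIM interfaces.

def read_sbom(image: str) -> list[str]:
    # Stub: a real implementation scans the container's software bill of materials.
    return ["openssl 1.1.1k", "glibc 2.31"]

def search_intel(cve_id: str, packages: list[str]) -> str:
    # Stub: a real implementation retrieves advisories, patches and exploit data.
    return f"{cve_id} affects openssl versions below 1.1.1l; a patch is available."

def llm(prompt: str) -> str:
    # Stub: a real implementation calls a large language model endpoint.
    return "Exploitable: yes. Recommended action: upgrade openssl and rebuild the image."

def triage_cve(cve_id: str, image: str) -> dict:
    packages = read_sbom(image)             # what is actually in the container
    intel = search_intel(cve_id, packages)  # retrieval step (RAG)
    prompt = (
        f"CVE: {cve_id}\nPackages: {packages}\nIntelligence: {intel}\n"
        "Is this container exploitable? Justify the decision and recommend "
        "patch, mitigate or accept."
    )
    return {"cve": cve_id, "image": image, "verdict": llm(prompt)}  # reasoning step

print(triage_cve("CVE-2021-3711", "registry.example.com/app:1.4"))
```

A human analyst then reviews only the containers the agent flags as exploitable, rather than every package in every image.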

Blueprints for Cybersecurity Success

The new NVIDIA NIM Agent Blueprint for container security includes the NVIDIA Morpheus cybersecurity AI framework to reduce the time and cost associated with identifying, capturing and acting on threats. This brings a new level of security to the data center, cloud and edge.

The GPU-accelerated, end-to-end AI framework enables developers to create optimized applications for filtering, processing and classifying large volumes of streaming cybersecurity data.

Built on NVIDIA RAPIDS software, Morpheus accelerates data processing workloads at enterprise scale. It uses the power of RAPIDS cuDF for fast and efficient data operations, ensuring downstream pipelines harness all available GPU cores for complex agentic AI tasks.

Morpheus also extends human analysts’ capabilities by automating real-time analysis and responses, producing synthetic data to train AI models that identify risks accurately and to run what-if scenarios.

The NVIDIA NIM Agent Blueprint for container security is available now. Learn more in the NVIDIA AI Summit DC special address.

(1) Source: IDC, GenAI Awareness, Readiness, and Commitment: 2024 Outlook — GenAI Plans and Implications for External Services Providers, AI-Ready Infrastructure, AI Platforms, and GenAI Applications US52023824, April 2024

Read More

From Concept to Compliance, MITRE Digital Proving Ground Will Accelerate Validation of Autonomous Vehicles

The path to safe, widespread autonomous vehicles is going digital.

MITRE — a government-sponsored nonprofit research organization — today announced its partnership with Mcity at the University of Michigan to develop a virtual and physical autonomous vehicle (AV) validation platform for industry deployment.

As part of this collaboration, announced during the NVIDIA AI Summit in Washington, D.C., MITRE will use Mcity’s simulation tools and a digital twin of the Mcity Test Facility, a real-world AV test environment, in its Digital Proving Ground (DPG). The joint platform will deliver physically based sensor simulation enabled by NVIDIA Omniverse Cloud Sensor RTX APIs.

By combining these simulation capabilities with the MITRE DPG reporting framework, developers will be able to perform exhaustive testing in a simulated world to safely validate AVs before real-world deployment.

The current regulatory environment for AVs is highly fragmented, posing significant challenges for widespread deployment. Today, companies navigate regulations at the city, state and federal levels without a clear path to large-scale deployment. MITRE and Mcity aim to address this ambiguity with comprehensive validation resources open to the entire industry.

Mcity currently operates a 32-acre mock city for automakers and researchers to test their technology. Mcity is also building a digital framework around its physical proving ground to provide developers with AV data and simulation tools.

Raising Safety Standards

One of the largest gaps in the regulatory framework is the absence of universally accepted safety standards that the industry and regulators can rely on.

The lack of common standards leaves regulators with limited tools to verify AV performance and safety in a repeatable manner, while companies struggle to demonstrate the maturity of their AV technology. The ability to do so is crucial in the wake of public road incidents, where AV developers need to demonstrate the reliability of their software in a way that is acceptable to both industry and regulators.

Efforts like the National Highway Traffic Safety Administration’s New Car Assessment Program (NCAP) have been instrumental in setting benchmarks for vehicle safety in traditional automotive development. However, NCAP is insufficient for AV evaluation, where measures of safety go beyond crash tests to the complexity of real-time decision-making in dynamic environments.

Additionally, traditional road testing presents inherent limitations, as it exposes vehicles to real-world conditions but lacks the scalability needed to prove safety across a wide variety of edge cases. It’s particularly difficult to test rare and dangerous scenarios on public roads without significant risk.

By providing both physical and digital resources to validate AVs, MITRE and Mcity will be able to offer a safe, universally accessible solution that addresses the complexity of verifying autonomy.

Physically Based Sensor Simulation

A core piece of this collaboration is sensor simulation, which models the physics and behavior of cameras, lidars, radars and ultrasonic sensors on a physical vehicle, as well as how these sensors interact with their surroundings.

Sensor simulation enables developers to train against and test rare and dangerous scenarios — such as extreme weather conditions, sudden pedestrian crossings or unpredictable driver behavior — safely in virtual settings.

In collaboration with regulators, AV companies can use sensor simulation to recreate a real-world event, analyze their system’s response and evaluate how their vehicle performed — accelerating the validation process.

Moreover, simulation tests are repeatable, meaning developers can track improvements or regressions in the AV stack over time. This means AV companies can provide quantitative evidence to regulators to show that their system is evolving and addressing safety concerns.

Bridging Industry and Regulators

MITRE and its ecosystem are actively developing the Digital Proving Ground platform to facilitate industry-wide standards and regulations.

The platform will be an open and accessible national resource for accelerating safe AV development and deployment, providing a trusted simulation test environment.

Mcity will contribute simulation infrastructure, a digital twin and the ability to seamlessly connect virtual and physical worlds with NVIDIA Omniverse, an open platform enabling system developers to build physical AI and robotic system simulation applications. By integrating this virtual proving ground into DPG, the collaboration will also accelerate the development and use of advanced digital engineering and simulation for AV safety assurance.

Mcity’s simulation tools will connect to Omniverse Cloud Sensor RTX APIs and render a Universal Scene Description (USD) model of Mcity’s physical proving ground. DPG will be able to access this environment, simulate the behavior of vehicles and pedestrians in a realistic test environment and use the DPG reporting framework to explain how the AV performed.
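
To give a sense of what working with that USD model involves, here is a small sketch using the open-source OpenUSD (pxr) Python bindings. The file path is a hypothetical placeholder, not the actual Mcity asset, and a real sensor-simulation pipeline would attach cameras, lidars and radars through the Omniverse Cloud Sensor RTX APIs rather than plain USD traversal.

```python
# Small OpenUSD sketch: open a digital-twin stage and count the kinds of prims
# a sensor-simulation pipeline would care about. The stage path is hypothetical.
from pxr import Usd, UsdGeom

stage = Usd.Stage.Open("mcity_digital_twin.usd")

meshes, cameras = 0, 0
for prim in stage.Traverse():
    if prim.IsA(UsdGeom.Mesh):
        meshes += 1    # roads, buildings, signage and other scene geometry
    elif prim.IsA(UsdGeom.Camera):
        cameras += 1   # candidate viewpoints for simulated sensors

print(f"{meshes} meshes and {cameras} cameras in the digital twin")
```

Because scenario variations such as weather, traffic and pedestrian behavior can be composed on top of the same base stage, each simulated test can be rerun exactly, which is what makes the regression tracking described above possible.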

This testing will then be replicated on the physical Mcity proving ground to create a comprehensive feedback loop.

The Road Ahead

As developers, automakers and regulators continue to collaborate, the industry is moving closer to a future where AVs can operate safely and at scale. The establishment of a repeatable testbed for validating safety — across real and simulated environments — will be critical to gaining public trust and regulatory approval, bringing the promise of AVs closer to reality.

Read More

SETI Institute Researchers Engage in World’s First Real-Time AI Search for Fast Radio Bursts

This summer, scientists supercharged their tools in the hunt for signs of life beyond Earth.

Researchers at the SETI Institute became the first to apply AI to the real-time direct detection of faint radio signals from space. Their advances in radio astronomy are available for any field that applies accelerated computing and AI.

“We’re on the cusp of a fundamentally different way of analyzing streaming astronomical data, and the kinds of things we’ll be able to discover with it will be quite amazing,” said Andrew Siemion, Bernard M. Oliver Chair for SETI at the SETI Institute, a group formed in 1984 that now includes more than 120 scientists.

The SETI Institute operates the Allen Telescope Array (pictured above) in Northern California. It’s a cutting-edge telescope used in the search for extraterrestrial intelligence (SETI) as well as for the study of intriguing transient astronomical events such as fast radio bursts.

Germinating AI

The seed of the latest project was planted more than a decade ago. Siemion attended a talk at the University of California, Berkeley, about an early version of machine learning, a classifier that analyzed radio signals like the ones his team gathered from deep space.

“I was really impressed, and realized the ways SETI researchers detected signals at the time were rather naive,” said Siemion, who earned his Ph.D. in astrophysics at Berkeley.

The researchers started connecting with radio experts in conferences outside the field of astronomy. There, they met Adam Thompson, who leads a group of developers at NVIDIA.

“We explained our challenges searching the extremely wide bandwidth of signals from space at high data rates,” Siemion said.

SETI Institute researchers had been using NVIDIA GPUs for years to accelerate the algorithms that separate signals from background noise. Now they thought there was potential to do more.

A Demo Leads to a Pilot

It took time — in part due to the coronavirus pandemic — but earlier this year, Thompson showed Siemion’s team a new product, NVIDIA Holoscan, a sensor processing platform for working with real-time data from scientific instruments.

Siemion’s team decided to build a trial application with Holoscan on the NVIDIA IGX edge computing platform that, if successful, could radically change the way the SETI Institute worked.

The institute collaborates with Breakthrough Listen, another SETI research program, headquartered at the University of Oxford, that uses dozens of radio telescopes to collect and store mountains of data, later analyzed in separate processes using GPUs. Each telescope and analysis employs separate, custom-built programs.

“We wanted to create something that would really push our capabilities forward,” Siemion said. “We envisioned a streaming solution that in a more general way takes real-time data from telescopes and brings it directly into the GPUs to do AI inference on it.”

Pointing at the Stars

In a team effort, Luigi Cruz, a staff engineer at the SETI Institute, developed the real-time data reception and inference pipeline using the Holoscan SDK, while Peter Ma, a Breakthrough Listen collaborator, built and trained an AI model to detect fast radio bursts, one of many radio phenomena tracked by astronomers. Wael Farah, Allen Telescope Array project scientist, provided key contributions to the scientific aspects of the study.
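
The overall shape of such a Holoscan application — a source operator streaming blocks of samples into a downstream operator that runs detection — looks roughly like the sketch below. The operators are skeletal placeholders with random data and a threshold check standing in for the trained fast-radio-burst model; this is not the SETI Institute’s actual pipeline.

```python
# Skeletal Holoscan SDK application (not the SETI Institute's pipeline): a
# source operator emits blocks of spectrometer samples and a detector operator
# applies a placeholder check where the trained model would run.
import numpy as np
from holoscan.conditions import CountCondition
from holoscan.core import Application, Operator, OperatorSpec

class AntennaSource(Operator):
    def setup(self, spec: OperatorSpec):
        spec.output("samples")

    def compute(self, op_input, op_output, context):
        # Placeholder for packets arriving from the telescope's network interface.
        op_output.emit(np.random.randn(1024, 192).astype(np.float32), "samples")

class BurstDetector(Operator):
    def setup(self, spec: OperatorSpec):
        spec.input("samples")

    def compute(self, op_input, op_output, context):
        block = op_input.receive("samples")
        # Placeholder for AI inference on the GPU; here, a simple threshold.
        if block.max() > 4.0:
            print("candidate burst detected")

class FRBApp(Application):
    def compose(self):
        src = AntennaSource(self, CountCondition(self, 10), name="source")
        det = BurstDetector(self, name="detector")
        self.add_flow(src, det, {("samples", "samples")})

FRBApp().run()
```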

They linked the combined real-time Holoscan pipeline, running on an NVIDIA IGX Orin platform, to 28 antennas pointed at the Crab Nebula. Over 15 hours, they gathered more than 90 billion data packets on signals across a spectrum of 5GHz.

Their system captured and analyzed in real time nearly the full 100Gbps of data from the experiment, twice the previous speed the astronomers had achieved. What’s more, they saw how the same code could be used with any telescope to detect all sorts of signals.

‘It’s Like a Magic Wand’

The test was “fantastically successful,” said Siemion. “It’s hard to overstate the transformative potential of Holoscan for radio astronomy because it’s like we’ve been given a magic wand to get all our data from telescopes into accelerated computers that are ideally suited for AI.”

He called the direct memory access in NVIDIA GPUs “a game changer.”

Rather than throw away some of its data to enable more efficient processing — as it did in the past — institute researchers can keep and analyze all of it, fast.

“It’s a profound change in how radio astronomy is done,” he said. “Now we have a viable path to a very different way of using telescopes with smart AI software, and if we do that in a scalable way the opportunities for discovery will be legion.”

Scaling Up the Pilot

The team plans to scale up its pilot software and deploy it in all the radio telescopes it currently uses across a dozen sites. It also aims to share the capability in collaborations with astronomers worldwide.

“Our intent is to bring this to larger international observatories with thousands of users and uses,” Siemion said.

The partnerships extend to globally distributed arrays of telescopes now under construction that promise to increase by an order of magnitude the kinds of signals space researchers can detect.

Sharing the Technology Broadly

Collaboration has been a huge theme for Siemion since 2015, when he became principal investigator for Breakthrough Listen.

“We voraciously collaborate with anyone we can find,” he said in a video interview from the Netherlands, where he was meeting local astronomers.

Work with NVIDIA was just one part of efforts that involve companies and governments across technical and scientific disciplines.

“The engineering talent at NVIDIA is world class … I can’t say enough about Adam and the Holoscan team,” he said.

The software opens a big door to technical collaborations.

“Holoscan lets us tap into a developer community far larger than those in astronomy with complementary skills,” he said. “It will be exciting to see if, say, a cancer algorithm could be repurposed to look for a novel astronomical source and vice versa.”

It’s one more way NVIDIA and its customers are advancing AI for the benefit of all.

Read More

TSMC and NVIDIA Transform Semiconductor Manufacturing With Accelerated Computing

TSMC, the world leader in semiconductor manufacturing, is moving to production with NVIDIA’s computational lithography platform, called cuLitho, to accelerate manufacturing and push the limits of physics for the next generation of advanced semiconductor chips.

A critical step in the manufacture of computer chips, computational lithography is involved in the transfer of circuitry onto silicon. It requires complex computation — involving electromagnetic physics, photochemistry, computational geometry, iterative optimization and distributed computing. A typical foundry dedicates massive data centers for this computation, and yet this step has traditionally been a bottleneck in bringing new technology nodes and computer architectures to market.

Computational lithography is also the most compute-intensive workload in the entire semiconductor design and manufacturing process. It consumes tens of billions of hours per year on CPUs in the leading-edge foundries. A typical mask set for a chip can take 30 million or more hours of CPU compute time, necessitating large data centers within semiconductor foundries. With accelerated computing, 350 NVIDIA H100 Tensor Core GPU-based systems can now replace 40,000 CPU systems, accelerating production time, while reducing costs, space and power.
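
Taken at face value, those figures imply a dramatic consolidation; the quick arithmetic below simply restates them as ratios and is not an additional benchmark.

```python
# Back-of-the-envelope arithmetic on the figures quoted above; illustrative
# only, since actual power, cost and throughput depend on the systems involved.
cpu_systems, gpu_systems = 40_000, 350
print(f"~{cpu_systems / gpu_systems:.0f}x fewer systems for the same lithography workload")

mask_set_cpu_hours = 30_000_000
print(f"A 30M-CPU-hour mask set is roughly {mask_set_cpu_hours / 8_760:.0f} CPU-years of compute")
```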

NVIDIA cuLitho brings accelerated computing to the field of computational lithography. Moving cuLitho to production is enabling TSMC to accelerate the development of next-generation chip technology, just as current production processes are nearing the limits of what physics makes possible.

“Our work with NVIDIA to integrate GPU-accelerated computing in the TSMC workflow has resulted in great leaps in performance, dramatic throughput improvement, shortened cycle time and reduced power requirements,” said Dr. C.C. Wei, CEO of TSMC, at the GTC conference earlier this year.

NVIDIA has also developed algorithms to apply generative AI to enhance the value of the cuLitho platform. A new generative AI workflow has been shown to deliver an additional 2x speedup on top of the accelerated processes enabled through cuLitho.

The application of generative AI enables creation of a near-perfect inverse mask or inverse solution to account for diffraction of light involved in computational lithography. The final mask is then derived by traditional and physically rigorous methods, speeding up the overall optical proximity correction process by 2x.

The use of optical proximity correction in semiconductor lithography is now three decades old. While the field has benefited from numerous contributions over this period, rarely has it seen a transformation quite as rapid as the one provided by the twin technologies of accelerated computing and AI. These together allow for the more accurate simulation of physics and the realization of mathematical techniques that were once prohibitively resource-intensive.

This enormous speedup of computational lithography accelerates the creation of every single mask in the fab, which speeds the total cycle time for developing a new technology node. More importantly, it makes possible new calculations that were previously impractical.

For example, while inverse lithography techniques have been described in the scientific literature for two decades, an accurate realization at full chip scale has been largely precluded because the computation takes too long. With cuLitho, that’s no longer the case. Leading-edge foundries will use it to ramp up inverse and curvilinear solutions that will help create the next generation of powerful semiconductors.

Image courtesy of TSMC.

Read More