AI Will Drive Scientific Breakthroughs, NVIDIA CEO Says at SC24

NVIDIA kicked off SC24 in Atlanta with a wave of AI and supercomputing tools set to revolutionize industries like biopharma and climate science.

The announcements, delivered by NVIDIA founder and CEO Jensen Huang and Vice President of Accelerated Computing Ian Buck, are rooted in the company’s deep history in transforming computing.

“Supercomputers are among humanity’s most vital instruments, driving scientific breakthroughs and expanding the frontiers of knowledge,” Huang said. “Twenty-five years after creating the first GPU, we have reinvented computing and sparked a new industrial revolution.”

NVIDIA’s journey in accelerated computing began with CUDA in 2006 and the first GPU for scientific computing, Huang said.

Milestones like Tokyo Tech’s Tsubame supercomputer in 2008, the Oak Ridge National Laboratory’s Titan supercomputer in 2012 and the AI-focused NVIDIA DGX-1 delivered to OpenAI in 2016 highlight NVIDIA’s transformative role in the field.

“Since CUDA’s inception, we’ve driven down the cost of computing by a millionfold,” Huang said. “For some, NVIDIA is a computational microscope, allowing them to see the impossibly small. For others, it’s a telescope exploring the unimaginably distant. And for many, it’s a time machine, letting them do their life’s work within their lifetime.”

At SC24, NVIDIA’s announcements spanned tools for next-generation drug discovery, real-time climate forecasting and quantum simulations.

Central to the company’s advancements are CUDA-X libraries, described by Huang as “the engines of accelerated computing,” which power everything from AI-driven healthcare breakthroughs to quantum circuit simulations.

Huang and Buck highlighted examples of real-world impact, including Nobel Prize-winning breakthroughs in neural networks and protein prediction, powered by NVIDIA technology.

“AI will accelerate scientific discovery, transforming industries and revolutionizing every one of the world’s $100 trillion markets,” Huang said.

CUDA-X Libraries Power New Frontiers

At SC24, NVIDIA announced the new cuPyNumeric library, a GPU-accelerated implementation of NumPy, designed to supercharge applications in data science, machine learning and numerical computing.

With over 400 CUDA-X libraries, including cuDNN for deep learning and cuQuantum for quantum circuit simulations, NVIDIA continues to lead in enhancing computing capabilities across various industries.

Real-Time Digital Twins With Omniverse Blueprint

NVIDIA unveiled the NVIDIA Omniverse Blueprint for real-time computer-aided engineering digital twins, a reference workflow designed to help developers create interactive digital twins for industries like aerospace, automotive, energy and manufacturing.

Built on NVIDIA acceleration libraries, physics-AI frameworks and interactive, physically based rendering, the blueprint accelerates simulations by up to 1,200x, setting a new standard for real-time interactivity.

Early adopters, including Siemens, Altair, Ansys and Cadence, are already using the blueprint to optimize workflows, cut costs and bring products to market faster.

Quantum Leap With CUDA-Q

NVIDIA’s focus on real-time, interactive technologies extends across fields, from engineering to quantum simulations.

In partnership with Google, NVIDIA’s CUDA-Q now powers detailed dynamical simulations of quantum processors, reducing weeks-long calculations to minutes.

Buck explained that with CUDA-Q, developers of all quantum processors can perform larger simulations and explore more scalable qubit designs.

AI Breakthroughs in Drug Discovery and Chemistry

With the open-source release of BioNeMo Framework, NVIDIA is advancing AI-driven drug discovery as researchers gain powerful tools tailored specifically for pharmaceutical applications.

BioNeMo accelerates training by 2x compared to other AI software, enabling faster development of lifesaving therapies.

NVIDIA also unveiled DiffDock 2.0, a breakthrough tool for predicting how drugs bind to target proteins — critical for drug discovery.

Powered by the new cuEquivariance library, DiffDock 2.0 is 6x faster than before, enabling researchers to screen millions of molecules with unprecedented speed and accuracy.

And with the NVIDIA ALCHEMI NIM microservice, NVIDIA introduces generative AI to chemistry, allowing researchers to design and evaluate novel materials with incredible speed.

Scientists start by defining the properties they want — like strength, conductivity, low toxicity or even color, Buck explained.

A generative model suggests thousands of potential candidates with the desired properties. Then the ALCHEMI NIM sorts candidate compounds for stability by solving for their lowest energy states using NVIDIA Warp.
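The generate-then-rank pattern described above can be sketched in miniature. This is a toy illustration, not the ALCHEMI API: `toy_energy` is a stand-in for the physics-based energy evaluation that the microservice performs with NVIDIA Warp, and the random candidates stand in for a generative model's proposals.

```python
import random

# Toy sketch of the generate-then-rank screening pattern (hypothetical,
# not the actual ALCHEMI NIM API): propose candidate compounds, score
# each with a stand-in energy function, keep the most stable ones.
random.seed(0)

def toy_energy(candidate):
    # Placeholder for a real lowest-energy-state solve (e.g. via NVIDIA Warp)
    return sum(candidate) / len(candidate)

# Stand-in for a generative model proposing thousands of candidates
candidates = [[random.uniform(-1, 1) for _ in range(8)] for _ in range(1000)]

# Sort by stability: lowest energy first
ranked = sorted(candidates, key=toy_energy)
most_stable = ranked[:10]
```

The key design point is the split of roles: a fast generative step proposes broadly, and a physics-based scoring step filters for stability.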

This microservice is a game-changer for materials discovery, helping developers tackle challenges in renewable energy and beyond.

These innovations demonstrate how NVIDIA is harnessing AI to drive breakthroughs in science, transforming industries and enabling faster solutions to global challenges.

Earth-2 NIM Microservices: Redefining Climate Forecasts in Real Time

Buck also announced two new microservices — CorrDiff NIM and FourCastNet NIM — to accelerate climate change modeling and simulation results by up to 500x in the NVIDIA Earth-2 platform.

Earth-2, a digital twin for simulating and visualizing weather and climate conditions, is designed to empower weather technology companies with advanced generative AI-driven capabilities.

These tools deliver higher-resolution and more accurate predictions, enabling the forecasting of extreme weather events with unprecedented speed and energy efficiency.

With natural disasters causing $62 billion in insured losses in the first half of this year — 70% higher than the 10-year average — NVIDIA’s innovations address a growing need for precise, real-time climate forecasting. These tools highlight NVIDIA’s commitment to leveraging AI for societal resilience and climate preparedness.

Expanding Production With Foxconn Collaboration

As demand for AI systems like the Blackwell supercomputer grows, NVIDIA is scaling production through new Foxconn facilities in the U.S., Mexico and Taiwan.

Foxconn is building the production and testing facilities using NVIDIA Omniverse to bring up the factories as fast as possible.

Scaling New Heights With Hopper

NVIDIA also announced the general availability of the NVIDIA H200 NVL, a PCIe GPU based on the NVIDIA Hopper architecture optimized for low-power, air-cooled data centers.

The H200 NVL offers up to 1.7x faster large language model inference and 1.3x more performance on HPC applications, making it ideal for flexible data center configurations.

It supports a variety of AI and HPC workloads, enhancing performance while optimizing existing infrastructure.

And the GB200 Grace Blackwell NVL4 Superchip integrates four NVIDIA NVLink-connected Blackwell GPUs unified with two Grace CPUs over NVLink-C2C, Buck said. It provides up to 2x performance for scientific computing, training and inference applications over the prior generation.

The GB200 NVL4 superchip will be available in the second half of 2025.

The talk wrapped up with an invitation to attendees to visit NVIDIA’s booth at SC24 to interact with various demos, including James, NVIDIA’s digital human; the world’s first real-time interactive wind tunnel; and the Earth-2 NIM microservices for climate modeling.

Learn more about how NVIDIA’s innovations are shaping the future of science at SC24.


Read More

Faster Forecasts: NVIDIA Launches Earth-2 NIM Microservices for 500x Speedup in Delivering Higher-Resolution Simulations

NVIDIA today at SC24 announced two new NVIDIA NIM microservices that can accelerate climate change modeling simulation results by 500x in NVIDIA Earth-2.

Earth-2 is a digital twin platform for simulating and visualizing weather and climate conditions. The new NIM microservices offer climate technology application providers advanced generative AI-driven capabilities to assist in forecasting extreme weather events.

NVIDIA NIM microservices help accelerate the deployment of foundation models while keeping data secure.

Extreme weather incidents are increasing in frequency, raising concerns over disaster safety and preparedness, and possible financial impacts.

Natural disasters were responsible for roughly $62 billion of insured losses during the first half of this year. That’s about 70% more than the 10-year average, according to a report in Bloomberg.

NVIDIA is releasing the CorrDiff NIM and FourCastNet NIM microservices to help weather technology companies more quickly develop higher-resolution and more accurate predictions. The NIM microservices also deliver leading energy efficiency compared with traditional systems.

New CorrDiff NIM Microservices for Higher-Resolution Modeling

NVIDIA CorrDiff is a generative AI model for kilometer-scale super resolution. Its capability to super-resolve typhoons over Taiwan was recently shown at GTC 2024. CorrDiff was trained on the Weather Research and Forecasting (WRF) model’s numerical simulations to generate weather patterns at 12x higher resolution.

High-resolution forecasts that resolve weather down to a few kilometers are essential to meteorologists and industries. The insurance and reinsurance industries rely on detailed weather data for assessing risk profiles. But achieving this level of detail using traditional numerical weather prediction models like WRF or High-Resolution Rapid Refresh is often too costly and time-consuming to be practical.

The CorrDiff NIM microservice is 500x faster and 10,000x more energy-efficient than traditional high-resolution numerical weather prediction using CPUs. CorrDiff is also now operating at 300x larger scale. It is super-resolving — or increasing the resolution of lower-resolution images or videos — for the entire United States and predicting precipitation events, including snow, ice and hail, at kilometer-scale resolution.
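To make the input/output relationship of super resolution concrete, here is a toy illustration of the 12x upscaling factor using nearest-neighbor replication. This shows only the shapes involved; CorrDiff is a generative model that synthesizes fine-scale detail rather than simply replicating coarse values.

```python
# Toy shape illustration of 12x super resolution (nearest-neighbor
# replication, NOT the CorrDiff model): each coarse grid cell becomes
# a 12x12 block in the fine grid.
def upsample(field, factor):
    out = []
    for row in field:
        wide = [v for v in row for _ in range(factor)]  # repeat columns
        out.extend([wide] * factor)                     # repeat rows
    return out

coarse = [[1.0, 2.0],
          [3.0, 4.0]]
fine = upsample(coarse, 12)  # 2x2 coarse grid -> 24x24 fine grid
```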

Enabling Large Sets of Forecasts With New FourCastNet NIM Microservice

Not every use case requires high-resolution forecasts. Some applications benefit more from larger sets of forecasts at coarser resolution.

State-of-the-art numerical models like IFS and GFS are limited to 50 and 20 sets of forecasts, respectively, due to computational constraints.

The FourCastNet NIM microservice, available today, offers global, medium-range coarse forecasts. By using the initial assimilated state from operational weather centers such as the European Centre for Medium-Range Weather Forecasts or the National Oceanic and Atmospheric Administration, providers can generate forecasts for the next two weeks 5,000x faster than traditional numerical weather models.

This opens new opportunities for climate tech providers to estimate risks related to extreme weather at a different scale, enabling them to predict the likelihood of low-probability events that current computational pipelines overlook.

Learn more about CorrDiff and FourCastNet NIM microservices on ai.nvidia.com.

Read More

NVIDIA Releases cuPyNumeric, Enabling Scientists to Harness GPU Acceleration at Cluster Scale

Whether they’re looking at nanoscale electron behaviors or starry galaxies colliding millions of light years away, many scientists share a common challenge — they must comb through petabytes of data to extract insights that can advance their fields.

With the NVIDIA cuPyNumeric accelerated computing library, researchers can now take their data-crunching Python code and effortlessly run it on CPU-based laptops and GPU-accelerated workstations, cloud servers or massive supercomputers. The faster they can work through their data, the quicker they can make decisions about promising data points, trends worth investigating and adjustments to their experiments.

To make the leap to accelerated computing, researchers don’t need expertise in computer science. They can simply write code using the familiar NumPy interface or apply cuPyNumeric to existing code, following best practices for performance and scalability.

Once cuPyNumeric is applied, they can run their code on one or thousands of GPUs with zero code changes.
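The drop-in pattern amounts to changing a single import. A minimal sketch, with a fallback so the same script runs where cuPyNumeric is not installed:

```python
# Minimal sketch of the drop-in pattern: only the import line changes.
# Falls back to standard NumPy when cuPyNumeric isn't available.
try:
    import cupynumeric as np  # GPU-accelerated, scales across GPUs/nodes
except ImportError:
    import numpy as np        # identical code runs unmodified on CPU

a = np.arange(6).reshape(2, 3)
result = float((a @ a.T).sum())  # ordinary NumPy-style array math
```

Everything after the import is unmodified NumPy-style code, which is what lets existing analyses scale without rewrites.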

The latest version of cuPyNumeric, now available on Conda and GitHub, offers support for the NVIDIA GH200 Grace Hopper Superchip, automatic resource configuration at run time and improved memory scaling. It also supports HDF5, a popular file format in the scientific community that helps efficiently manage large, complex data.

Researchers at the SLAC National Accelerator Laboratory, Los Alamos National Laboratory, Australian National University, UMass Boston, the Center for Turbulence Research at Stanford University and the National Payments Corporation of India are among those who have integrated cuPyNumeric to achieve significant improvements in their data analysis workflows.

Less Is More: Limitless GPU Scalability Without Code Changes

Python is the most common programming language for data science, machine learning and numerical computing, used by millions of researchers in scientific fields including astronomy, drug discovery, materials science and nuclear physics. Tens of thousands of packages on GitHub depend on the NumPy math and matrix library, which had over 300 million downloads last month. All of these applications could benefit from accelerated computing with cuPyNumeric.

Many of these scientists build programs that use NumPy and run on a single CPU-only node — limiting the throughput of their algorithms to crunch through increasingly large datasets collected by instruments like electron microscopes, particle colliders and radio telescopes.

cuPyNumeric helps researchers keep pace with the growing size and complexity of their datasets by providing a drop-in replacement for NumPy that can scale to thousands of GPUs. cuPyNumeric doesn’t require code changes when scaling from a single GPU to a whole supercomputer. This makes it easy for researchers to run their analyses on accelerated computing systems of any size.

Solving the Big Data Problem, Accelerating Scientific Discovery

Researchers at SLAC National Accelerator Laboratory, a U.S. Department of Energy lab operated by Stanford University, have found that cuPyNumeric helps them speed up X-ray experiments conducted at the Linac Coherent Light Source.

A SLAC team focused on materials science discovery for semiconductors found that cuPyNumeric accelerated its data analysis application by 6x, decreasing run time from minutes to seconds. This speedup allows the team to run important analyses in parallel when conducting experiments at this highly specialized facility.

By using experiment hours more efficiently, the team anticipates it will be able to discover new material properties, share results and publish work more quickly.

Other institutions using cuPyNumeric include: 

  • Australian National University, where researchers used cuPyNumeric to scale the Levenberg-Marquardt optimization algorithm to run on multi-GPU systems at the country’s National Computational Infrastructure. While the algorithm can be used for many applications, the researchers’ initial target is large-scale climate and weather models.
  • Los Alamos National Laboratory, where researchers are applying cuPyNumeric to accelerate data science, computational science and machine learning algorithms. cuPyNumeric will provide them with additional tools to effectively use the recently launched Venado supercomputer, which features over 2,500 NVIDIA GH200 Grace Hopper Superchips.
  • Stanford University’s Center for Turbulence Research, where researchers are developing Python-based computational fluid dynamics solvers that can run at scale on large accelerated computing clusters using cuPyNumeric. These solvers can seamlessly integrate large collections of fluid simulations with popular machine learning libraries like PyTorch, enabling complex applications including online training and reinforcement learning.
  • UMass Boston, where a research team is accelerating linear algebra calculations to analyze microscopy videos and determine the energy dissipated by active materials. The team used cuPyNumeric to decompose a matrix of 16 million rows and 4,000 columns.
  • National Payments Corporation of India, the organization behind a real-time digital payment system used by around 250 million Indians daily and expanding globally. NPCI uses complex matrix calculations to track transaction paths between payers and payees. With current methods, it takes about 5 hours to process data for a one-week transaction window on CPU systems. A trial showed that applying cuPyNumeric to accelerate the calculations on multi-node NVIDIA DGX systems could speed up matrix multiplication by 50x, enabling NPCI to process larger transaction windows in less than an hour and detect suspected money laundering in near real time.
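The NPCI use case rests on a classic piece of linear algebra: if transfers between accounts are encoded as an adjacency matrix, repeated matrix multiplication counts multi-hop transaction paths. A toy sketch of that idea, in plain Python (NPCI's actual pipeline is not public; this only illustrates why the workload reduces to matrix multiplication, the operation cuPyNumeric accelerates):

```python
# Toy sketch: payer->payee transfers as an adjacency matrix; squaring
# the matrix counts two-hop transaction paths between accounts.
# (Illustrative only, not NPCI's actual pipeline.)
def matmul(a, b):
    n, m, p = len(a), len(b), len(b[0])
    return [[sum(a[i][k] * b[k][j] for k in range(m)) for j in range(p)]
            for i in range(n)]

# transfers[i][j] = 1 if account i sent money to account j
transfers = [
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]
two_hop = matmul(transfers, transfers)  # two_hop[i][j] = # of 2-hop paths i -> j
```

At NPCI's scale the matrices are enormous, which is why moving this multiplication onto multi-node GPU systems yields the reported 50x speedup.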

To learn more about cuPyNumeric, see a live demo in the NVIDIA booth at the Supercomputing 2024 conference in Atlanta, join the theater talk in the expo hall and participate in the cuPyNumeric workshop.   

Watch the NVIDIA special address at SC24.

Read More

How InsuranceDekho transformed insurance agent interactions using Amazon Bedrock and generative AI

This post is co-authored with Nishant Gupta from InsuranceDekho.

The insurance industry is complex and overwhelming, with numerous options that can be hard for consumers to understand. This complexity hinders customers from making informed decisions. As a result, customers face challenges in selecting the right insurance coverage, while insurance aggregators and agents struggle to provide clear and accurate information.

InsuranceDekho is a leading InsurTech service that offers a wide range of insurance products from over 49 insurance companies in India. The service operates through a vast network of 150,000 point of sale person (POSP) agents and direct-to-customer channels. InsuranceDekho uses cutting-edge technology to simplify the insurance purchase process for all users. The company’s mission is to make insurance transparent, accessible, and hassle-free for all customers through tech-driven solutions.

In this post, we explain how InsuranceDekho harnessed the power of generative AI using Amazon Bedrock and Anthropic’s Claude to provide responses to customer queries on policy coverages, exclusions, and more. This let our customer care agents and POSPs confidently help our customers understand the policies without reaching out to insurance subject matter experts (SMEs) or memorizing complex plans while providing sales and after-sales services. The use of this solution has improved sales, cross-selling, and overall customer service experience.

Amazon Bedrock provided the flexibility to explore various leading LLM models using a single API, reducing the undifferentiated heavy lifting associated with hosting third-party models. Leveraging this, InsuranceDekho developed the industry’s first Health Pro Genie with the most efficient engine. It facilitates the insurance agents to choose the right plan for the end customer from the pool of over 125 health plans from 21 different health insurers available on the InsuranceDekho platform.

– Ish Babbar, Co-Founder and CTO, InsuranceDekho

The challenge

InsuranceDekho faced a significant challenge in responding to customer queries on insurance products in a timely manner. For a given lead, the insurance advisors, particularly those who are new to insurance, would often reach out to SMEs to inquire about policy or product-specific queries. The added step of SME consultation resulted in a process slowdown, requiring advisors to await expert input before responding to customers, introducing delays of a few minutes. Additionally, although SMEs can provide valuable guidance and expertise, their involvement introduces additional costs.

This delay not only affects the customer’s experience but also results in lost prospects, because potential customers may abandon the purchase and explore competing services that offer clearer information. The existing process was inefficient, and InsuranceDekho needed a solution to empower its agents to respond to customer queries confidently and efficiently, without requiring excessive memorization.

The following figure depicts a common scenario where an SME receives multiple calls from insurance advisors, resulting in delays for the customers. Because SMEs can handle one call at a time, the advisors are left waiting for a response. This further prolongs the time it takes for customers to get clarity on the insurance product and decide on which product they want to purchase.

Solution overview

To overcome the limitations of relying on SMEs, a generative AI-based chat assistant was developed to autonomously resolve agent queries with accuracy. One of the key considerations while designing the chat assistant was to avoid responses from the default large language model (LLM) trained on generic data and only use the insurance policy documents. To generate such high-quality responses, we decided to go with the Retrieval Augmented Generation (RAG) approach using Amazon Bedrock and Anthropic’s Claude Haiku.

Amazon Bedrock

We conducted a thorough evaluation of several generative AI model providers and selected Amazon Bedrock as our primary provider for our foundation model (FM) needs. The key reasons that influenced this decision were:

  • Managed service – Amazon Bedrock is a fully serverless offering that provides a choice of industry-leading FMs without provisioning infrastructure, procuring GPUs around the clock, or configuring ML frameworks. As a result, it significantly reduces development and deployment overhead and total cost of ownership, while enhancing efficiency and accelerating innovation in disruptive technologies like generative AI.
  • Continuous model enhancements – Amazon Bedrock provides access to a vast and continuously expanding set of FMs through a single API. The continuous additions and updates to its model portfolio facilitate access to the latest advancements and improvements in AI technology, enabling us to evaluate upcoming LLMs and optimize output quality, latency, and cost by selecting the most suitable LLM for each specific task or application. We experienced this flexibility firsthand when we seamlessly transitioned from Anthropic’s Claude Instant to Anthropic’s Claude Haiku with the advent of Anthropic’s Claude 3, without requiring code changes.
  • Performance – Amazon Bedrock provides options to achieve high-performance, low-latency, and scalable inference capabilities through on-demand and provisioned throughput options depending on the requirements.
  • Secure model access – Secure, private model access using AWS PrivateLink gives controlled data transfer for inference without traversing the public internet, maintaining data privacy and helping to adhere to compliance requirements.

Retrieval Augmented Generation

RAG is a process in which LLMs access external documents or knowledge bases, promoting accurate and relevant responses. By referencing authoritative sources beyond their training data, RAG helps LLMs generate high-quality responses and overcome common pitfalls such as outdated or misleading information. RAG can be applied to various applications, including improving customer service, enhancing research capabilities, and streamlining business processes.

Solution building blocks

To begin designing the solution, we identified the key components needed, including the generative AI service, LLMs, vector databases, and caching engines. In this section, we delve into the key building blocks used in the solution, highlighting their importance in achieving optimal accuracy, cost-effectiveness, and performance:

  • LLMs – After a thorough evaluation of various LLMs and benchmarking, we chose Anthropic’s Claude Haiku for its exceptional performance. The benchmarking results demonstrated unparalleled speed and affordability in its category. Additionally, it delivered rapid and accurate responses while handling straightforward queries or complex requests, making it an ideal choice for our use case.
  • Embedding model – An embedding model is a type of machine learning (ML) model that maps discrete objects, such as words, phrases, or entities, into dense vector representations in a continuous embedding space. These vector representations, called embeddings, capture the semantic and syntactic relationships between the objects, allowing the model to reason about their similarities and differences. For our use case, we used a third-party embedding model.
  • Vector database – For the vector database, we chose Amazon OpenSearch Service because of its scalability, high-performance search capabilities, and cost-effectiveness. Additionally, OpenSearch Service’s flexible data model and integration with other features make it an ideal choice for our use case.
  • Caching – To enhance the performance, efficiency, and cost-effectiveness of our chat assistant, we used Redis on Amazon ElastiCache to cache frequently accessed responses. This approach enables the chat assistant to rapidly retrieve cached responses, minimizing latency and computational load and resulting in a significantly improved user experience and reduced cost.

Implementation details

The following diagram illustrates the workflow of the current solution. Overall, the workflow can be divided into two workflows: the ingestion workflow and the response generation workflow.

Ingestion workflow

The ingestion workflow serves as the foundation that fuels the entire response generation workflow by keeping the knowledge base up to date with the latest information. This process is crucial for making sure that the system can provide accurate and relevant responses based on the most recent insurance policy documents. The ingestion workflow involves three key components: policy documents, embedding model, and OpenSearch Service as a vector database.

  1. The policy documents contain the insurance policy information that needs to be ingested into the knowledge base.
  2. These documents are processed by the embedding model, which converts the textual content into high-dimensional vector representations, capturing the semantic meaning of the text. After the embedding model generates the vector representations of the policy documents, these embeddings are stored in OpenSearch Service. This ingestion workflow enables the chat assistant to provide responses based on the latest policy information available.
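The two ingestion steps above can be sketched with standard-library stand-ins. This is a hypothetical illustration, not InsuranceDekho's pipeline: `toy_embed` is a hash-based placeholder for the third-party embedding model, and a plain dict plays the role of the OpenSearch Service vector index.

```python
import hashlib
import math

# Toy ingestion sketch (hypothetical): embed each policy document and
# store the vector alongside the text, as a vector database would.
def toy_embed(text, dim=8):
    # Hash-based placeholder for a real embedding model
    vec = [0.0] * dim
    for token in text.lower().split():
        h = int(hashlib.md5(token.encode()).hexdigest(), 16)
        vec[h % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]  # unit-normalized embedding

vector_index = {}  # stand-in for the OpenSearch Service index
policy_docs = {
    "policy-1": "room rent covered up to 2 percent of sum insured",
    "policy-2": "pre-existing diseases excluded for 36 months",
}
for doc_id, text in policy_docs.items():
    vector_index[doc_id] = (toy_embed(text), text)
```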

Response generation workflow

The response generation workflow is the core of our chat assistant solution. Insurance advisors use it to provide comprehensive responses to customers’ queries regarding policy coverage, exclusions, and other related topics.

  1. To initiate this workflow, our chatbot serves as the entry point, facilitating seamless interaction between the insurance advisors and the underlying response generation system.
  2. This solution incorporates a caching mechanism that uses semantic search to check if a query has been recently processed and answered. If a match is found in the cache (Redis), the chat assistant retrieves and returns the corresponding response, bypassing the full response generation workflow for redundant queries, thereby enhancing system performance.
  3. If no match is found in the cache, the query goes to the intent classifier powered by Anthropic’s Claude Haiku. It analyzes the query to understand the user’s intent and classify it accordingly. This enables dynamic prompting and tailored processing based on the query type. For generic or common queries, the intent classifier can provide the final response independently, bypassing the full RAG workflow, thereby optimizing efficiency and response times.
  4. For queries requiring the full RAG workflow, the intent classifier passes the query to the retrieval step, where a semantic search is performed on a vector database containing insurance policy documents to find the most relevant information, that is, the context based on the query.
  5. After the retrieval step, the retrieved context is integrated with the query and prompt, and this augmented information is fed into the generation process. This augmentation is the core component that enables the enhancement of the generation.
  6. In the final generation step, the actual response to the query is produced based on the external knowledge base of policy documents.
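The six steps above fit together as a short control flow. The following is a minimal sketch with standard-library stand-ins, not the production system: a dict plays the role of the Redis cache, keyword overlap stands in for semantic search, and stub functions mark where Anthropic's Claude Haiku would be invoked.

```python
# Hypothetical end-to-end sketch of the response generation workflow.
cache = {}  # stand-in for Redis on Amazon ElastiCache

def retrieve(query, knowledge_base):
    # Stand-in for semantic search over the policy-document vector index
    words = set(query.lower().split())
    return max(knowledge_base,
               key=lambda doc: len(words & set(doc.lower().split())))

def generate(query, context):
    # Stand-in for the LLM generation call (Claude Haiku in the article)
    return f"Answer to '{query}' based on: {context}"

def answer(query, knowledge_base):
    if query in cache:                        # step 2: cache hit short-circuits
        return cache[query]
    context = retrieve(query, knowledge_base)  # steps 3-4: classify + retrieve
    response = generate(query, context)        # steps 5-6: augment + generate
    cache[query] = response                    # populate cache for next time
    return response

docs = ["room rent covered up to 2 percent of sum insured",
        "pre-existing diseases excluded for 36 months"]
first = answer("is room rent covered", docs)
second = answer("is room rent covered", docs)  # served from cache
```

The intent-classification branch is collapsed into a comment here; its purpose in the real system is to let generic queries skip retrieval entirely.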

Results

The implementation of the generative AI-powered RAG chat assistant solution has yielded impressive results for InsuranceDekho. By using this solution, insurance advisors can now confidently and efficiently address customer queries autonomously, without the constant need for SME involvement. Additionally, the implementation has significantly reduced response times: InsuranceDekho has witnessed a remarkable 80% decrease in the time taken to respond to customer queries about plan features, inclusions, and exclusions.

InsuranceDekho’s adoption of this generative AI-powered solution has streamlined the customer service process, making sure that customers receive precise and trustworthy responses to their inquiries in a timely manner.

Conclusion

In this post, we discussed how InsuranceDekho harnessed the power of generative AI to equip its insurance advisors with the tools to efficiently respond to customer queries regarding various insurance policies. By implementing a RAG-based chat assistant using Amazon Bedrock and OpenSearch Service, InsuranceDekho empowered its insurance advisors to deliver exceptional service. This solution minimized the reliance on SMEs and significantly reduced response times so advisors could address customer inquiries promptly and accurately.


About the Authors

Vishal Gupta is a Senior Solutions Architect at AWS India, based in Delhi. In his current role at AWS, he works with digital native business customers and enables them to design, architect, and innovate highly scalable, resilient, and cost-effective cloud architectures. An avid blogger and speaker, Vishal loves to share his knowledge with the tech community. Outside of work, he enjoys traveling to new destinations and spending time with his family.

Nishant Gupta is working as Vice President, Engineering at InsuranceDekho with 14 years of experience. He is passionate about building highly scalable, reliable, and cost-optimized solutions that can handle massive amounts of data efficiently.

Read More

Hopper Scales New Heights, Accelerating AI and HPC Applications for Mainstream Enterprise Servers

Since its introduction, the NVIDIA Hopper architecture has transformed the AI and high-performance computing (HPC) landscape, helping enterprises, researchers and developers tackle the world’s most complex challenges with higher performance and greater energy efficiency.

During the Supercomputing 2024 conference, NVIDIA announced the availability of the NVIDIA H200 NVL PCIe GPU — the latest addition to the Hopper family. H200 NVL is ideal for organizations with data centers looking for lower-power, air-cooled enterprise rack designs with flexible configurations to deliver acceleration for every AI and HPC workload, regardless of size.

According to a recent survey, roughly 70% of enterprise racks are 20kW and below and use air cooling. This makes PCIe GPUs essential, as they provide granularity of node deployment, whether using one, two, four or eight GPUs, enabling data centers to pack more computing power into smaller spaces. Companies can then use their existing racks and select the number of GPUs that best suits their needs.

Enterprises can use H200 NVL to accelerate AI and HPC applications, while also improving energy efficiency through reduced power consumption. With a 1.5x memory increase and 1.2x bandwidth increase over NVIDIA H100 NVL, companies can use H200 NVL to fine-tune LLMs within a few hours and deliver up to 1.7x faster inference performance. For HPC workloads, performance is boosted up to 1.3x over H100 NVL and 2.5x over the NVIDIA Ampere architecture generation. 

Complementing the raw power of the H200 NVL is NVIDIA NVLink technology. The latest generation of NVLink provides GPU-to-GPU communication 7x faster than fifth-generation PCIe — delivering higher performance to meet the needs of HPC, large language model inference and fine-tuning. 
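As a rough consistency check on that 7x figure: a PCIe Gen5 x16 link delivers roughly 64 GB/s per direction (a publicly known figure, not one quoted in this article), so a 7x multiple works out to roughly 450 GB/s per direction:

```python
# Sanity-check the "7x faster than fifth-generation PCIe" claim.
# The PCIe figure is the commonly cited ~64 GB/s per direction for a x16 Gen5 link.
pcie_gen5_x16 = 64                      # GB/s, per direction
nvlink_per_direction = 7 * pcie_gen5_x16
print(f"~{nvlink_per_direction} GB/s per direction, "
      f"~{2 * nvlink_per_direction} GB/s bidirectional")
```

That lands near the ballpark of 900 GB/s of bidirectional NVLink bandwidth commonly cited for Hopper-class GPUs, which is consistent with the 7x claim.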

The NVIDIA H200 NVL is paired with powerful software tools that enable enterprises to accelerate applications from AI to HPC. It comes with a five-year subscription for NVIDIA AI Enterprise, a cloud-native software platform for the development and deployment of production AI. NVIDIA AI Enterprise includes NVIDIA NIM microservices for the secure, reliable deployment of high-performance AI model inference. 

Companies Tapping Into Power of H200 NVL

With H200 NVL, NVIDIA provides enterprises with a full-stack platform to develop and deploy their AI and HPC workloads. 

Customers are seeing significant impact for multiple AI and HPC use cases across industries, such as visual AI agents and chatbots for customer service, trading algorithms for finance, medical imaging for improved anomaly detection in healthcare, pattern recognition for manufacturing, and seismic imaging for federal science organizations. 

Dropbox is harnessing NVIDIA accelerated computing for its services and infrastructure.

“Dropbox handles large amounts of content, requiring advanced AI and machine learning capabilities,” said Ali Zafar, VP of Infrastructure at Dropbox. “We’re exploring H200 NVL to continually improve our services and bring more value to our customers.”

The University of New Mexico has been using NVIDIA accelerated computing in various research and academic applications. 

“As a public research university, our commitment to AI enables the university to be on the forefront of scientific and technological advancements,” said Prof. Patrick Bridges, director of the UNM Center for Advanced Research Computing. “As we shift to H200 NVL, we’ll be able to accelerate a variety of applications, including data science initiatives, bioinformatics and genomics research, physics and astronomy simulations, climate modeling and more.”

H200 NVL Available Across Ecosystem

Dell Technologies, Hewlett Packard Enterprise, Lenovo and Supermicro are expected to deliver a wide range of configurations supporting H200 NVL. 

Additionally, H200 NVL will be available in platforms from Aivres, ASRock Rack, ASUS, GIGABYTE, Ingrasys, Inventec, MSI, Pegatron, QCT, Wistron and Wiwynn.

Some systems are based on the NVIDIA MGX modular architecture, which enables computer makers to quickly and cost-effectively build a vast array of data center infrastructure designs.

Platforms with H200 NVL will be available from NVIDIA’s global systems partners beginning in December. To complement availability from leading global partners, NVIDIA is also developing an Enterprise Reference Architecture for H200 NVL systems. 

The reference architecture will incorporate NVIDIA’s expertise and design principles, so partners and customers can design and deploy high-performance AI infrastructure based on H200 NVL at scale. This includes full-stack hardware and software recommendations, with detailed guidance on optimal server, cluster and network configurations. Networking is optimized for the highest performance with the NVIDIA Spectrum-X Ethernet platform.

NVIDIA technologies will be showcased on the showroom floor at SC24, taking place at the Georgia World Congress Center through Nov. 22. To learn more, watch NVIDIA’s special address.

See notice regarding software product information.


Foxconn Expands Blackwell Testing and Production With New Factories in U.S., Mexico and Taiwan

To meet demand for Blackwell, now in full production, Foxconn, the world’s largest electronics manufacturer, is using NVIDIA Omniverse. The platform for developing industrial AI simulation applications is helping bring facilities in the U.S., Mexico and Taiwan online faster than ever.

Foxconn uses NVIDIA Omniverse to virtually integrate its facility and equipment layouts, NVIDIA Isaac Sim to test and simulate autonomous robots, and NVIDIA Metropolis for vision AI.

Omniverse enables industrial developers to maximize efficiency through testing and optimization in a digital twin before deploying costly change orders to the physical world. Foxconn expects its Mexico facility alone to deliver significant cost savings and a reduction in kilowatt-hour usage of more than 30% annually.

World’s Largest Electronics Maker Plans With Omniverse and AI

To meet demand, Foxconn’s factory planners are building physical AI-powered robotic factories with Omniverse and NVIDIA AI.

The company has built digital twins with Omniverse that allow its teams to virtually integrate facility and equipment information from leading industry applications, such as Siemens Teamcenter X and Autodesk Revit. Floor plan layouts are optimized first in the digital twin, and planners can locate optimal camera positions that help measure and identify ways to streamline operations with Metropolis visual AI agents.

In the construction process, the Foxconn teams use the Omniverse digital twin as the source of truth to communicate and validate the accurate layout and placement of equipment.

Virtual integration on Omniverse offers significant advantages, potentially saving factory planners millions by reducing costly change orders in real-world operations.

Delivering Robotics for Manufacturing With Omniverse Digital Twin

Once the digital twin of the factory is built, it becomes a virtual gym for Foxconn’s fleets of autonomous robots, including industrial manipulators and autonomous mobile robots. Foxconn’s robot developers can simulate, test and validate their AI robot models in NVIDIA Isaac Sim before deploying them to real-world robots.

Using Omniverse, Foxconn can simulate robot AIs before deploying to NVIDIA Jetson-driven autonomous mobile robots.

On assembly lines, teams can use Isaac Manipulator libraries and AI models to simulate automated optical inspection, object identification, defect detection and trajectory planning.

Omniverse also enables facility planners to test and optimize intelligent camera placement before installation in the physical world, ensuring complete coverage of the factory floor to support worker safety and provide the foundation for visual AI agent frameworks.

Creating Efficiencies While Building Resilient Supply Chains

Using NVIDIA Omniverse and AI, Foxconn plans to replicate its precision production lines across the world. This will enable it to quickly deploy high-quality production facilities that meet unified standards, increasing the company’s competitive edge and adaptability in the market.

Foxconn’s ability to rapidly replicate will accelerate its global deployments and enhance its supply chain resilience in the face of disruptions, since it can quickly adjust production strategies and reallocate resources to ensure continuity and stability amid changing demands.

Foxconn’s Mexico facility will begin production early next year and the Taiwan location will begin production in December.

Learn more about Blackwell and Omniverse.


From Algorithms to Atoms: NVIDIA ALCHEMI NIM Catalyzes Sustainable Materials Research for EV Batteries, Solar Panels and More

More than 96% of all manufactured goods — ranging from everyday products, like laundry detergent and food packaging, to advanced industrial components, such as semiconductors, batteries and solar panels — rely on chemicals that cannot be replaced with alternative materials.

With AI and the latest technological advancements, researchers and developers are studying ways to create novel materials that could address the world’s toughest challenges, such as energy storage and environmental remediation.

Announced today at the Supercomputing 2024 conference in Atlanta, the NVIDIA ALCHEMI NIM microservice accelerates such research by optimizing AI inference for chemical simulations that could lead to more efficient and sustainable materials to support the renewable energy transition.

It’s one of the many ways NVIDIA is helping researchers, developers and enterprises boost energy and resource efficiency in their workflows, including meeting requirements aligned with the global Net Zero Initiative.

NVIDIA ALCHEMI for Material and Chemical Simulations

Exploring the universe of potential materials, drawing on nearly infinite combinations of chemicals, each with unique characteristics, can be extremely complex and time-consuming. Novel materials are typically discovered through laborious, trial-and-error synthesis and testing in a traditional lab.

Many of today’s plastics, for example, are still based on material discoveries made in the mid-1900s.

More recently, AI has emerged as a promising accelerant for chemicals and materials innovation.

With the new ALCHEMI NIM microservice, researchers can test chemical compounds and material stability in simulation, in a virtual AI lab, which reduces costs, energy consumption and time to discovery.

For example, running MACE-MP-0, a pretrained foundation model for materials chemistry, on an NVIDIA H100 Tensor Core GPU, the new NIM microservice speeds evaluations of a potential composition’s simulated long-term stability by 100x. This comes from a 25x speedup using the NVIDIA Warp Python framework for high-performance simulation, followed by a 4x speedup from in-flight batching. All in all, evaluating 16 million structures would have taken months — with the NIM microservice, it can be done in just hours.
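The compounding of those two optimizations can be sketched with simple arithmetic; the 90-day baseline below is an assumption chosen to match the months-to-hours framing, not a published benchmark:

```python
# How a 25x kernel speedup and a 4x batching speedup compound to ~100x.
warp_speedup = 25        # from the NVIDIA Warp high-performance framework
batching_speedup = 4     # from in-flight batching
total_speedup = warp_speedup * batching_speedup   # multiplicative, not additive

# Illustrative timing for evaluating 16 million candidate structures.
baseline_days = 90                                 # assumed "months" baseline
accelerated_hours = baseline_days * 24 / total_speedup
print(f"{total_speedup}x overall: ~{baseline_days} days -> "
      f"~{accelerated_hours:.0f} hours")
```

The key point is that independent optimizations at different layers of the stack multiply rather than add, which is how a months-long screening campaign compresses into hours.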

By letting scientists examine more structures in less time, the NIM microservice can boost research on materials for use with solar and electric batteries, for example, to bolster the renewable energy transition.

NVIDIA also plans to release NIM microservices that can be used to simulate the manufacturability of novel materials — to determine how they might be brought from test tubes into the real world in the form of batteries, solar panels, fertilizers, pesticides and other essential products that can contribute to a healthier, greener planet.

SES AI, a leading developer of lithium-metal batteries, is using the NVIDIA ALCHEMI NIM microservice with the AIMNet2 model to accelerate the identification of electrolyte materials used for electric vehicles.

“SES AI is dedicated to advancing lithium battery technology through AI-accelerated material discovery, using our Molecular Universe Project to explore and identify promising candidates for lithium metal electrolyte discovery,” said Qichao Hu, CEO of SES AI. “Using the ALCHEMI NIM microservice with AIMNet2 could drastically improve our ability to map molecular properties, reducing time and costs significantly and accelerating innovation.”

SES AI recently mapped 100,000 molecules in half a day, with the potential to achieve this in under an hour using ALCHEMI, a sign of the transformative impact the microservice could have on material-screening efficiency.

Looking ahead, SES AI aims to map the properties of up to 10 billion molecules within the next couple of years, pushing the boundaries of AI-driven, high-throughput discovery.
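The scale of that goal can be estimated from the throughput figures above; treating “half a day” as 12 hours and “under an hour” as one hour (both assumptions), the gap between today’s rate and 10 billion molecules becomes clear:

```python
# Rough throughput scaling for SES AI's molecular mapping figures.
molecules = 100_000
hours_before = 12        # "half a day", assumed to mean 12 hours
hours_after = 1          # "under an hour", taken as a full hour for a lower bound

speedup = hours_before / hours_after          # at least 12x
rate_after = molecules / hours_after          # molecules mapped per hour
target = 10_000_000_000                       # 10 billion molecules
years_for_target = target / rate_after / 8760 # 8,760 hours in a year
print(f"Speedup: >= {speedup:.0f}x; 10B molecules at "
      f"{rate_after:,.0f}/hour: ~{years_for_target:.0f} years on one pipeline")
```

Even at the accelerated rate, a single pipeline would need on the order of a decade, which is why hitting the 10-billion-molecule goal within a couple of years implies scaling out many such pipelines in parallel.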

The new microservice will soon be available for researchers to test for free through the NVIDIA NGC catalog, where users can sign up to be notified of ALCHEMI’s launch. It will also be downloadable from build.nvidia.com, and the production-grade NIM microservice will be offered through the NVIDIA AI Enterprise software platform.

Learn more about the NVIDIA ALCHEMI NIM microservice, and hear the latest on how AI and supercomputing are supercharging researchers and developers’ workflows by joining NVIDIA at SC24, running through Friday, Nov. 22.

See notice regarding software product information.
