Chill Factor: NVIDIA Blackwell Platform Boosts Water Efficiency by Over 300x

Traditionally, data centers have relied on air cooling — where mechanical chillers circulate chilled air to absorb heat from servers, keeping them at optimal operating conditions. But as AI models grow in size and the use of AI reasoning models rises, maintaining those conditions is becoming not only harder and more expensive, but also more energy-intensive.

While data centers once operated at 20 kW per rack, today’s hyperscale facilities can support over 135 kW per rack, making it an order of magnitude harder to dissipate the heat generated by high-density racks. To keep AI servers running at peak performance, a new approach is needed for efficiency and scalability.

One key solution is liquid cooling — by reducing dependence on chillers and enabling more efficient heat rejection, liquid cooling is driving the next generation of high-performance, energy-efficient AI infrastructure.

The NVIDIA GB200 NVL72 and the NVIDIA GB300 NVL72 are rack-scale, liquid-cooled systems designed to handle the demanding tasks of trillion-parameter large language model inference. Their architecture is also specifically optimized for test-time scaling accuracy and performance, making them an ideal choice for running AI reasoning models while efficiently managing energy costs and heat.

Liquid-cooled NVIDIA Blackwell compute tray.

Driving Unprecedented Water Efficiency and Cost Savings in AI Data Centers

Historically, cooling alone has accounted for up to 40% of a data center’s electricity consumption, making it one of the most significant areas where efficiency improvements can drive down both operational expenses and energy demands.

Liquid cooling helps mitigate costs and energy use by capturing heat directly at the source. Instead of relying on air as an intermediary, direct-to-chip liquid cooling transfers heat in a technology cooling system loop. That heat is then cycled through a coolant distribution unit via a liquid-to-liquid heat exchanger, and ultimately transferred to a facility cooling loop. Because of the higher efficiency of this heat transfer, data centers and AI factories can operate effectively with warmer water temperatures — reducing or eliminating the need for mechanical chillers in a wide range of climates.

The NVIDIA GB200 NVL72 rack-scale, liquid-cooled system, built on the NVIDIA Blackwell platform, offers exceptional performance while balancing energy costs and heat. It packs unprecedented compute density into each server rack, delivering 40x higher revenue potential, 30x higher throughput, 25x more energy efficiency and 300x more water efficiency than traditional air-cooled architectures. Newer NVIDIA GB300 NVL72 systems built on the Blackwell Ultra platform boast a 50x higher revenue potential and 35x higher throughput with 30x more energy efficiency.

Data centers spend an estimated $1.9-2.8M per megawatt (MW) per year, of which nearly $500,000 goes to cooling-related energy and water costs. By deploying the liquid-cooled GB200 NVL72 system, hyperscale data centers and AI factories can achieve up to 25x cost savings, leading to over $4 million in annual savings for a 50 MW hyperscale data center.

For data center and AI factory operators, this means lower operational costs, enhanced energy efficiency metrics and a future-proof infrastructure that scales AI workloads efficiently — without the unsustainable water footprint of legacy cooling methods.

Moving Heat Outside the Data Center

As compute density rises and AI workloads drive unprecedented thermal loads, data centers and AI factories must rethink how they remove heat from their infrastructure. The traditional methods of heat rejection that supported predictable CPU-based scaling are no longer sufficient on their own. Today, there are multiple options for moving heat outside the facility, but four major categories dominate current and emerging deployments.

Key Cooling Methods in a Changing Landscape

  • Mechanical Chillers: Mechanical chillers use a vapor compression cycle to cool water, which is then circulated through the data center to absorb heat. These systems are typically air-cooled or water-cooled, with the latter often paired with cooling towers to reject heat. While chillers are reliable and effective across diverse climates, they are also highly energy-intensive. In AI-scale facilities where power consumption and sustainability are top priorities, reliance on chillers can significantly impact both operational costs and carbon footprint.
  • Evaporative Cooling: Evaporative cooling uses the evaporation of water to absorb and remove heat. This can be achieved through direct or indirect systems, or hybrid designs. These systems are much more energy-efficient than chillers but come with high water consumption. In large facilities, they can consume millions of gallons of water per megawatt annually (a rough worked estimate follows this list). Their performance is also climate-dependent, making them less effective in humid or water-restricted regions.
  • Dry Coolers: Dry coolers remove heat by transferring it from a closed liquid loop to the ambient air using large finned coils, much like an automotive radiator. These systems don’t rely on water and are ideal for facilities aiming to reduce water usage or operate in dry climates. However, their effectiveness depends heavily on the temperature of the surrounding air. In warmer environments, they may struggle to keep up with high-density cooling demands unless paired with liquid-cooled IT systems that can tolerate higher operating temperatures.
  • Pumped Refrigerant Systems: Pumped refrigerant systems use liquid refrigerants to move heat from the data center to outdoor heat exchangers. Unlike chillers, these systems don’t rely on large compressors inside the facility and they operate without the use of water. This method offers a thermodynamically efficient, compact and scalable solution that works especially well for edge deployments and water-constrained environments. Proper refrigerant handling and monitoring are required, but the benefits in power and water savings are significant.
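
To put that water figure in perspective, here is a rough, back-of-the-envelope estimate (not from the article) based only on the latent heat of vaporization of water. Real cooling towers also lose water to drift and blowdown, so actual consumption is typically higher.

```python
# Rough lower-bound estimate of evaporative cooling water use per megawatt of
# rejected heat, assuming all heat leaves via evaporation (latent heat only).

HEAT_LOAD_W = 1_000_000          # 1 MW of heat to reject
LATENT_HEAT_J_PER_KG = 2.26e6    # latent heat of vaporization of water (~2.26 MJ/kg)
SECONDS_PER_YEAR = 365 * 24 * 3600
LITERS_PER_GALLON = 3.785

evaporation_kg_per_s = HEAT_LOAD_W / LATENT_HEAT_J_PER_KG   # ~0.44 kg/s
kg_per_year = evaporation_kg_per_s * SECONDS_PER_YEAR        # ~14 million kg
gallons_per_year = kg_per_year / LITERS_PER_GALLON           # 1 kg of water is ~1 liter

print(f"Evaporation rate: {evaporation_kg_per_s:.2f} kg/s")
print(f"Water use: ~{gallons_per_year / 1e6:.1f} million gallons per MW per year")
```

Under these assumptions the estimate lands at roughly 3.7 million gallons per megawatt per year, consistent with the “millions of gallons” figure above.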

Each of these methods offers different advantages depending on factors like climate, rack density, facility design and sustainability goals. As liquid cooling becomes more common and servers are designed to operate with warmer water, the door opens to more efficient and environmentally friendly cooling strategies — reducing both energy and water use while enabling higher compute performance.

Optimizing Data Centers for AI Infrastructure

As AI workloads grow exponentially, operators are reimagining data center design with infrastructure built specifically for high-performance AI and energy efficiency. Whether they’re transforming their entire setup into dedicated AI factories or upgrading modular components, optimizing inference performance is crucial for managing costs and operational efficiency.

To get the best performance, GPUs with high compute capacity aren’t enough on their own — they need to be able to communicate with each other at lightning speed.

NVIDIA NVLink boosts communication, enabling GPUs to operate as a massive, tightly integrated processing unit for maximum performance with a full-rack power density of 120 kW. This tight, high-speed communication is crucial for today’s AI tasks, where every second saved on transferring data can mean more tokens per second and more efficient AI models.

Traditional air cooling struggles at these power levels. To keep up, data center air would need to be either cooled to below-freezing temperatures or flow at near-gale speeds to carry the heat away, making it increasingly impractical to cool dense racks with air alone.

At nearly 1,000x the density of air, liquid cooling excels at carrying heat away thanks to its superior heat capacity and thermal conductivity. By efficiently transferring heat away from high-performance GPUs, liquid cooling reduces reliance on energy-intensive and noisy cooling fans, allowing more power to be allocated to computation rather than cooling overhead.
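
As a rough illustration (not from the article), the sensible-heat relation Q = mass flow x specific heat x temperature rise shows how much fluid must move through a 120 kW rack to carry its heat away at an assumed 15°C coolant temperature rise, using standard textbook properties for air and water:

```python
# Compare the flow needed to remove 120 kW of heat with air vs. water,
# assuming a 15 C coolant temperature rise (Q = mass_flow * cp * dT).
# Fluid properties are approximate textbook values, not vendor specifications.

Q_W = 120_000            # rack heat load, watts
DT_K = 15.0              # allowed coolant temperature rise, kelvin

AIR = {"cp": 1005.0, "density": 1.2}      # J/(kg*K), kg/m^3
WATER = {"cp": 4186.0, "density": 997.0}  # J/(kg*K), kg/m^3

for name, props in (("air", AIR), ("water", WATER)):
    mass_flow = Q_W / (props["cp"] * DT_K)       # kg/s
    volume_flow = mass_flow / props["density"]   # m^3/s
    print(f"{name:5s}: {mass_flow:6.2f} kg/s  ({volume_flow * 1000:8.2f} L/s)")

# air  :   7.96 kg/s  ( 6633.50 L/s) -- a near-gale of airflow through one rack
# water:   1.91 kg/s  (    1.92 L/s) -- a modest pumped-liquid flow
```

The roughly 3,500x difference in volumetric flow under these assumptions is why dense racks are far more practical to cool with liquid than with air.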

Liquid Cooling in Action

Innovators across the industry are leveraging liquid cooling to slash energy costs, improve density and drive AI efficiency:

Cloud service providers are adopting cutting-edge cooling and power innovations. Next-generation AWS data centers, featuring jointly developed liquid cooling solutions, increase compute power by 12% while reducing energy consumption by up to 46% — all while maintaining water efficiency.

Cooling the AI Infrastructure of the Future

As AI continues to push the limits of computational scale, innovations in cooling will be essential to meeting the thermal management challenges of the post-Moore’s law era.

NVIDIA is leading this transformation through initiatives like the COOLERCHIPS program, a U.S. Department of Energy-backed effort to develop modular data centers with next-generation cooling systems that are projected to reduce costs by at least 5% and improve efficiency by 20% over traditional air-cooled designs.

Looking ahead, data centers must evolve not only to support AI’s growing demands but do so sustainably — maximizing energy and water efficiency while minimizing environmental impact. By embracing high-density architectures and advanced liquid cooling, the industry is paving the way for a more efficient AI-powered future.

Learn more about breakthrough solutions for data center energy and water efficiency presented at NVIDIA GTC 2025 and discover how accelerated computing is driving a more efficient future with NVIDIA Blackwell.

Keeping AI on the Planet: NVIDIA Technologies Make Every Day About Earth Day

Whether at sea, on land, in the sky or even in outer space, NVIDIA technology is helping research scientists and developers alike explore and understand oceans, wildlife, the climate and far-out existential risks like asteroids.

These increasingly intelligent tools help analyze environmental pollutants, habitat damage and natural disaster risks at an accelerated pace. That, in turn, enables partnerships with local governments to take climate mitigation steps like pollution prevention and proactive planting.

Sailing the Seas of AI

Amphitrite, based in France, uses satellite data with AI to simulate and predict ocean currents and weather. Its AI models, driven by the NVIDIA AI and Earth-2 platforms, offer insights for positioning vessels to best harness the power of ocean currents. This helps determine when it’s best to travel, as well as the optimal course, reducing travel times, fuel consumption and carbon emissions. Amphitrite is a member of the NVIDIA Inception program for cutting-edge startups.

Watching Over Wildlife With AI

Munich, Germany-based OroraTech monitors animal poaching and wildfires with NVIDIA CUDA and Jetson. The NVIDIA Inception program member uses the EarthRanger platform to offer a wildfire detection and monitoring service that relies on satellite imagery and AI to safeguard the environment and prevent poaching.

Keeping AI on the Weather

Weather agencies and climate scientists worldwide are using NVIDIA CorrDiff, a generative AI weather model enabling kilometer-scale forecasts of wind, temperature and precipitation type and amount. CorrDiff is part of the NVIDIA Earth-2 platform for simulating weather and climate conditions. It’s available as an easy-to-deploy NVIDIA NIM microservice.

In another climate effort, NVIDIA Research announced a new generative AI model, called StormCast, for reliable weather prediction at a scale larger than storms.

The model, outlined in a paper, can help with disaster planning and mitigation, saving lives.

Avoiding Mass Extinction Events

Researchers reported in Nature how a new method was able to spot 10-meter asteroids within the main asteroid belt, located between Mars and Jupiter. Such space rocks can range from bus-sized to several Costco stores in width and can deliver destruction to cities. The method drew on views of these asteroids captured by NASA’s James Webb Space Telescope (JWST) for previous research, with processing enabled by NVIDIA accelerated computing.

Boosting Energy Efficiency With Liquid-Cooled Blackwell

NVIDIA GB200 NVL72 rack-scale, liquid-cooled systems, built on the Blackwell platform, offer exceptional performance while balancing energy costs and heat. They deliver 40x higher revenue potential, 30x higher throughput, 25x more energy efficiency and 300x more water efficiency than air-cooled architectures. NVIDIA GB300 NVL72 systems built on the Blackwell Ultra platform offer 50x higher revenue potential and 35x higher throughput with 30x more energy efficiency.

Learn more about NVIDIA Earth-2 and NVIDIA Blackwell.

AI Bites Back: Researchers Develop Model to Detect Malaria Amid Venezuelan Gold Rush

Gold prospecting in Venezuela has led to a malaria resurgence, but researchers have developed AI to take a bite out of the problem.

In Venezuela’s Bolivar state, deforestation for gold mining along waterways has disturbed mosquito populations, which are biting miners and infecting them with the deadly parasite.

Venezuela was certified as malaria-free in 1961 by the World Health Organization. Worldwide, there were an estimated 263 million cases of malaria and 597,000 deaths in 2023, according to the WHO.

In the Venezuelan outbreak, the area affected is rural and has limited access to medical clinics, so detection with microscopy by trained professionals is lacking.

But researchers at the intersection of medicine and technology have tapped AI and NVIDIA GPUs to come up with a solution. They recently published a paper in Nature, describing the development of a convolutional neural network (CNN) for automatically detecting malaria parasites in blood samples.

“At some point in Venezuela, malaria was almost eradicated,” said 25-year-old Diego Ramos-Briceño, who has a bachelor’s in engineering that he earned while also pursuing a doctorate in medicine. “I believe it was around 135,000 cases last year.”

Identifying Malaria Parasites in Blood Samples

The researchers — Ramos-Briceño, Alessandro Flammia-D’Aleo, Gerardo Fernández-López, Fhabián Carrión-Nessi and David Forero-Peña — used the CNN to identify Plasmodium falciparum and Plasmodium vivax in thick blood smears, achieving 99.51% accuracy.

To develop the model, the team acquired a dataset of 5,941 labeled thick blood smear microscope images from Chittagong Medical College Hospital in Bangladesh. They processed this dataset to create nearly 190,000 labeled images.

“What we wanted for the neural network to learn is the morphology of the parasite, so from out of the nearly 6,000 microscope level images, we extracted every single parasite, and from all that data augmentation and segmentation, we ended up having almost 190,000 images for model training,” said Ramos-Briceño.

The model comes as traditional microscopy methods are also challenged by limitations in accuracy and consistency, according to the research paper.

Harnessing Gaming GPUs and CUDA for Model Training, Inference

To run model training, the malaria paper’s team tapped into an RTX 3060 GPU from a computer science teacher mentoring their research.

“We used PyTorch Lightning with NVIDIA CUDA acceleration that enabled us to do efficient parallel computation that significantly sped up the matrix operations and the preparations of the neural network compared with what a CPU would have done,” said Ramos-Briceño.
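
For readers curious what such a setup looks like, here is a minimal PyTorch Lightning training sketch in the spirit of the approach described above: a small CNN image classifier trained on labeled parasite crops with GPU acceleration. It is not the researchers’ published code; the architecture, dataset layout, class count and hyperparameters are placeholder assumptions.

```python
# Minimal sketch of a CNN parasite classifier trained with PyTorch Lightning on
# an NVIDIA GPU. Illustrative only: the architecture, dataset path and
# hyperparameters are assumptions, not the authors' published implementation.
import torch
from torch import nn
from torch.utils.data import DataLoader
import pytorch_lightning as pl
from torchvision import datasets, transforms

class ParasiteCNN(pl.LightningModule):
    def __init__(self, num_classes: int = 3, lr: float = 1e-3):
        super().__init__()
        self.lr = lr
        self.net = nn.Sequential(                      # small CNN backbone
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
            nn.Flatten(), nn.Linear(128, num_classes),
        )
        self.loss_fn = nn.CrossEntropyLoss()

    def forward(self, x):
        return self.net(x)

    def training_step(self, batch, batch_idx):
        images, labels = batch
        loss = self.loss_fn(self(images), labels)
        self.log("train_loss", loss)
        return loss

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=self.lr)

if __name__ == "__main__":
    # Assumes crops organized as data/train/<class_name>/*.png (placeholder path).
    transform = transforms.Compose([
        transforms.Resize((64, 64)),
        transforms.RandomHorizontalFlip(),             # simple augmentation
        transforms.ToTensor(),
    ])
    train_ds = datasets.ImageFolder("data/train", transform=transform)
    train_loader = DataLoader(train_ds, batch_size=64, shuffle=True, num_workers=4)

    model = ParasiteCNN(num_classes=len(train_ds.classes))
    trainer = pl.Trainer(max_epochs=10, accelerator="gpu", devices=1)  # CUDA-accelerated
    trainer.fit(model, train_loader)
```

Setting `accelerator="gpu"` in the Lightning `Trainer` is what hands the training loop to CUDA, which is where the speedup over CPU-only training comes from.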

For inference, he said, such GPUs can deliver malaria determinations from blood samples within several seconds.

Clinics lacking trained microscopists could use the model and introduce their own data for transfer learning so that the model performs optimally with the types of images they submit, handling the lighting conditions and other factors, he said.

“For communities that are far away from the urban setting, where there’s more access to resources, this could be a way to approach the malaria problem,” said Ramos-Briceño.

Spring Into Action With 11 New Games on GeForce NOW

As the days grow longer and the flowers bloom, GFN Thursday brings a fresh lineup of games to brighten the week.

Dive into thrilling hunts and dark fantasy adventures with the arrivals of titles like Hunt: Showdown 1896 — now available on Xbox and PC Game Pass — and Mandragora: Whispers of the Witch Tree on GeForce NOW. Whether chasing bounties in the Colorado Rockies or battling chaos in a cursed land, players will gain unforgettable experiences with these games in the cloud.

Plus, roll with the punches in Capcom’s MARVEL vs. CAPCOM Fighting Collection: Arcade Classics, one of 11 games GeForce NOW is adding to its cloud gaming library — which features over 2,000 titles playable with GeForce RTX 4080 performance.

Spring Into Gaming Anywhere

With the arrivals of Hunt: Showdown 1896 and Mandragora: Whispers of the Witch Tree in the cloud, GeForce NOW members can take their gaming journeys anywhere, from the wild frontiers of the American West to the shadowy forests of a dark fantasy realm.

Hunt Showdown 1896 on GeForce NOW
It’s the wild, wild west.

Hunt: Showdown 1896 transports players to the untamed Rockies, where danger lurks behind every pine and in every abandoned mine. PC Game Pass members — and those who own the game on Xbox — can stream the action instantly. Whether players are tracking monstrous bounties solo or teaming with friends, the game’s tense player vs. player vs. environment action and new map, Mammon’s Gulch, are ideal for springtime exploration.

Jump into the hunt from the living room, in the backyard or even on the go — no high-end PC required with GeForce NOW.

Mandragora on GeForce NOW
Every whisper is a warning.

Step into a beautifully hand-painted world teetering on the edge of chaos in Mandragora: Whispers of the Witch Tree. As an Inquisitor, battle nightmarish creatures and uncover secrets beneath the budding canopies of Faelduum. With deep role-playing game mechanics and challenging combat, Mandragora is ideal for players seeking a fresh adventure this season. GeForce NOW members can continue their quest wherever spring takes them — including on their laptops, tablets and smartphones.

Time for New Games

Marvel VS. Capcom on GeForce NOW
Everyone’s shouting from the excitement of being in the cloud.

Catch MARVEL vs. CAPCOM Fighting Collection: Arcade Classics in the cloud this week. In this legendary collection of arcade classics from the fan-favorite Marvel and Capcom crossover games, dive into an action-packed lineup of seven titles, including heavy hitters X-MEN vs. STREET FIGHTER and MARVEL vs. CAPCOM 2 New Age of Heroes, as well as THE PUNISHER. 

Each game in the collection can be played online or in co-op mode. Whether new to the series or returning from their arcade days, players of all levels can enjoy these timeless classics together in the cloud.

Look for the following games available to stream in the cloud this week:

  • Forever Skies (New release on Steam, available April 14)
  • Night Is Coming (New release on Steam, available April 14)
  • Hunt: Showdown 1896 (New release on Xbox, available on PC Game Pass April 15)
  • Crime Scene Cleaner (New release on Xbox, available on PC Game Pass April 17)
  • Mandragora: Whispers of the Witch Tree (New release on Steam, available April 17)
  • Tempest Rising (New release on Steam, Advanced Access starts April 17)
  • Aimlabs (Steam)
  • Blue Prince (Steam, Xbox)
  • ContractVille (Steam)
  • Gedonia 2 (Steam) 
  • MARVEL vs. CAPCOM Fighting Collection: Arcade Classics (Steam)
  • Path of Exile 2 (Epic Games Store)

What are you planning to play this weekend? Let us know on X or in the comments below.

Isomorphic Labs Rethinks Drug Discovery With AI

Isomorphic Labs is reimagining the drug discovery process with an AI-first approach. At the heart of this work is a new way of thinking about biology.

Max Jaderberg, chief AI officer, and Sergei Yakneen, chief technology officer at Isomorphic Labs, joined the AI Podcast to explain why they look at biology as an information processing system.

“We’re building generalizable AI models capable of learning from the entire universe of protein and chemical interactions,” Jaderberg said. “This fundamentally breaks from the target-specific, siloed approach of conventional drug development.”

Isomorphic isn’t just working to optimize existing drug design workflows but completely rethinking how drugs are discovered — moving away from traditional methods that have historically been slow and inefficient.

By modeling cellular processes with AI, Isomorphic’s teams can predict molecular interactions with exceptional accuracy. Their advanced AI models enable scientists to computationally simulate how potential therapeutics interact with their targets in complex biological systems. Using AI to reduce dependence on wet lab experiments accelerates the drug discovery pipeline and creates possibilities for addressing previously untreatable conditions.

And that’s just the beginning.

Isomorphic Labs envisions a future of precision medicine, where treatments are tailored to an individual’s unique molecular and genetic makeup. While regulatory hurdles and technical challenges remain, Jaderberg and Yakneen are optimistic and devoted to balancing ambitious innovation with scientific rigor.

“We’re committed to proving our technology through real-world pharmaceutical breakthroughs,” said Jaderberg.

Time Stamps

1:14 – How AI is boosting the drug discovery process.

17:25 – Biology as a computational system.

19:50 – Applications of AlphaFold 3 in pharmaceutical research.

23:05 – The future of precision and preventative medicine.

You Might Also Like… 

NVIDIA’s Jacob Liberman on Bringing Agentic AI to Enterprises

Agentic AI enables developers to create intelligent multi-agent systems that reason, act and execute complex tasks with a degree of autonomy. Jacob Liberman, director of product management at NVIDIA, joined the NVIDIA AI Podcast to explain how agentic AI bridges the gap between powerful AI models and practical enterprise applications.

Roboflow Helps Unlock Computer Vision for Every Kind of AI Builder

Roboflow’s mission is to make the world programmable through computer vision. By simplifying computer vision development, the company helps bridge the gap between AI and the people looking to harness it. Cofounder and CEO Joseph Nelson discusses how Roboflow empowers users in manufacturing, healthcare and automotive to solve complex problems with visual AI.

How World Foundation Models Will Advance Physical AI With NVIDIA’s Ming-Yu Liu

AI models that can accurately simulate and predict outcomes in physical, real-world environments will enable the next generation of physical AI systems. Ming-Yu Liu, vice president of research at NVIDIA and an IEEE Fellow, explains the significance of world foundation models — powerful neural networks that can simulate physical environments.

Into the Omniverse: How Digital Twins Are Scaling Industrial AI

Editor’s note: This post is part of Into the Omniverse, a series focused on how developers, 3D practitioners, and enterprises can transform their workflows using the latest advances in OpenUSD and NVIDIA Omniverse.

As industrial and physical AI streamline workflows, businesses are looking for ways to most effectively harness these technologies.

Scaling AI in industrial settings — like factories and other manufacturing facilities — presents unique challenges, such as fragmented data pipelines, siloed tools and the need for real-time, high-fidelity simulations.

The Mega NVIDIA Omniverse Blueprint — available in preview on build.nvidia.com — helps address these challenges by providing a scalable reference workflow for simulating multi-robot fleets in industrial facility digital twins, including those built with the NVIDIA Omniverse platform.

Industrial AI leaders — including Accenture, Foxconn, Kenmec, KION and Pegatron — are now using the blueprint to accelerate physical AI adoption and build autonomous systems that efficiently perform actions in industrial settings.

Built on the Universal Scene Description (OpenUSD) framework, the blueprint enables seamless data interoperability, real-time collaboration and AI-driven decision-making by unifying diverse data sources and improving simulation fidelity.
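
To give a flavor of what OpenUSD interoperability looks like in practice, here is a small illustrative snippet (not part of the blueprint itself) using the open-source `pxr` Python API to compose a facility stage that references a robot asset authored in its own layer. File paths and prim names are placeholders.

```python
# Illustrative OpenUSD composition: build a small facility stage that references
# a robot asset as its own layer. Paths and prim names are placeholders, not
# assets shipped with the Mega blueprint.
from pxr import Usd, UsdGeom

# Create a new stage for the facility digital twin
stage = Usd.Stage.CreateNew("facility.usda")
UsdGeom.SetStageUpAxis(stage, UsdGeom.Tokens.z)
UsdGeom.SetStageMetersPerUnit(stage, 1.0)

# Root transforms for the scene and the warehouse
UsdGeom.Xform.Define(stage, "/World")
UsdGeom.Xform.Define(stage, "/World/Warehouse")

# Reference an externally authored robot asset so teams can iterate on it
# independently -- the core idea behind OpenUSD's layered, non-destructive workflow.
robot = UsdGeom.Xform.Define(stage, "/World/Warehouse/Robot_01")
robot.GetPrim().GetReferences().AddReference("assets/amr_robot.usd")

# Place the robot at its starting position on the facility floor
robot.AddTranslateOp().Set((4.0, 2.5, 0.0))

stage.GetRootLayer().Save()
print(stage.GetRootLayer().ExportToString())
```

Because every asset lives in its own layer, simulation, robotics and facility teams can iterate on the same digital twin concurrently without overwriting one another’s work, which is the kind of data interoperability the blueprint builds on.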

Industrial Leaders Adopt the Mega Blueprint

At Hannover Messe, the world’s largest industrial trade show, which took place in Germany earlier this month, Accenture and Schaeffler, a leading motion technology company, showcased their adoption of the Mega blueprint to simulate Digit, a humanoid robot from Agility Robotics, performing material handling in kitting and commissioning areas.

Video courtesy of Schaeffler, Accenture and Agility Robotics

KION, a supply chain solutions company, and Accenture are now using Mega to optimize warehouse and distribution processes.

At the NVIDIA GTC global AI conference in March, Accenture and Foxconn representatives discussed the impacts of introducing Mega into their industrial AI workflows.

Accelerating Industrial AI With Mega 

Mega NVIDIA Omniverse Blueprint architecture diagram

With the Mega blueprint, developers can accelerate physical AI workflows through:

  • Robot Fleet Simulation: Test and train diverse robot fleets in a safe, virtual environment to ensure they work seamlessly together.
  • Digital Twins: Use digital twins to simulate and optimize autonomous systems before physical deployment.
  • Sensor Simulation and Synthetic Data Generation: Generate realistic sensor data to ensure robots can accurately perceive and respond to their real-world environment.
  • Facility and Fleet Management Systems Integration: Connect robot fleets with management systems for efficient coordination and optimization.
  • Robot Brains as Containers: Use portable, plug-and-play modules for consistent robot performance and easier management.
  • World Simulator With OpenUSD: Simulate industrial facilities in highly realistic virtual environments using NVIDIA Omniverse and OpenUSD.
  • Omniverse Cloud Sensor RTX APIs: Ensure accurate sensor simulation with NVIDIA Omniverse Cloud application programming interfaces to create detailed virtual replicas of industrial facilities.
  • Scheduler: Manage complex tasks and data dependencies with a built-in scheduler for smooth and efficient operations.
  • Video Analytics AI Agents: Integrate AI agents built with the NVIDIA AI Blueprint for video search and summarization (VSS), leveraging NVIDIA Metropolis, to enhance operational insights.

Dive deeper into the Mega blueprint architecture on the NVIDIA Technical Blog.

Industrial AI is also being accelerated by the latest Omniverse Kit SDK 107 release, including major updates for robotics application development and enhanced simulation capabilities such as RTX Real-Time 2.0.

Get Plugged Into the World of OpenUSD

Learn more about OpenUSD and industrial AI by watching sessions from GTC, now available on demand, and by watching how ecosystem partners like Pegatron and others are pushing their industrial automation further, faster.

Join NVIDIA at COMPUTEX, running May 19-23 in Taipei, to discover the latest breakthroughs in AI. Watch NVIDIA founder and CEO Jensen Huang’s keynote on Sunday, May 18, at 8:00 p.m. PT.

Discover why developers and 3D practitioners are using OpenUSD and learn how to optimize 3D workflows with the new self-paced “Learn OpenUSD” curriculum for 3D developers and practitioners, available for free through the NVIDIA Deep Learning Institute.

For more resources on OpenUSD, explore the Alliance for OpenUSD forum and the AOUSD website.

Plus, tune in to the “OpenUSD Insiders” livestream taking place today at 11:00 a.m. PT to hear more about the Mega NVIDIA Omniverse Blueprint. And don’t miss next week’s livestream on April 26 at 11:00 a.m. PT, when Accenture will discuss how it’s using the blueprint to build Omniverse digital twins for training and testing industrial AI robot brains.

Stay up to date by subscribing to NVIDIA news, joining the community and following NVIDIA Omniverse on Instagram, LinkedIn, Medium and X.

Featured image courtesy of:

Left and Top Right: Accenture, KION Group

Middle: Accenture, Agility Robotics, Schaeffler

Bottom Right: Foxconn

Thousands of NVIDIA Grace Blackwell GPUs Now Live at CoreWeave, Propelling Development for AI Pioneers

CoreWeave today became one of the first cloud providers to bring NVIDIA GB200 NVL72 systems online for customers at scale, and AI frontier companies Cohere, IBM and Mistral AI are already using them to train and deploy next-generation AI models and applications.

CoreWeave, the first cloud provider to make NVIDIA Grace Blackwell generally available, has already shown incredible results in MLPerf benchmarks with NVIDIA GB200 NVL72 — a powerful rack-scale accelerated computing platform designed for reasoning and AI agents. Now, CoreWeave customers are gaining access to thousands of NVIDIA Blackwell GPUs.

“We work closely with NVIDIA to quickly deliver to customers the latest and most powerful solutions for training AI models and serving inference,” said Mike Intrator, CEO of CoreWeave. “With new Grace Blackwell rack-scale systems in hand, many of our customers will be the first to see the benefits and performance of AI innovators operating at scale.”

Thousands of NVIDIA Blackwell GPUs are now turning raw data into intelligence at unprecedented speed, with many more coming online soon.

The ramp-up for customers of cloud providers like CoreWeave is underway. Systems built on NVIDIA Grace Blackwell are in full production, transforming cloud data centers into AI factories that manufacture intelligence at scale and convert raw data into real-time insights with speed, accuracy and efficiency.

Leading AI companies around the world are now putting GB200 NVL72’s capabilities to work for AI applications, agentic AI and cutting-edge model development.

Personalized AI Agents

Cohere is using its Grace Blackwell Superchips to help develop secure enterprise AI applications powered by leading-edge research and model development techniques. Its enterprise AI platform, North, enables teams to build personalized AI agents to securely automate enterprise workflows, surface real-time insights and more.

With NVIDIA GB200 NVL72 on CoreWeave, Cohere is already experiencing up to 3x more performance in training for 100 billion-parameter models compared with previous-generation NVIDIA Hopper GPUs — even without Blackwell-specific optimizations.

With further optimizations taking advantage of GB200 NVL72’s large unified memory, FP4 precision and a 72-GPU NVIDIA NVLink domain — where every GPU is connected to operate in concert — Cohere is getting dramatically higher throughput with shorter time to first and subsequent tokens for more performant, cost-effective inference.

“With access to some of the first NVIDIA GB200 NVL72 systems in the cloud, we are pleased with how easily our workloads port to the NVIDIA Grace Blackwell architecture,” said Autumn Moulder, vice president of engineering at Cohere. “This unlocks incredible performance efficiency across our stack — from our vertically integrated North application running on a single Blackwell GPU to scaling training jobs across thousands of them. We’re looking forward to achieving even greater performance with additional optimizations soon.”

AI Models for Enterprise 

IBM is using one of the first deployments of NVIDIA GB200 NVL72 systems, scaling to thousands of Blackwell GPUs on CoreWeave, to train its next-generation Granite models, a series of open-source, enterprise-ready AI models. Granite models deliver state-of-the-art performance while maximizing safety, speed and cost efficiency. The Granite model family is supported by a robust partner ecosystem that includes leading software companies embedding large language models into their technologies.

Granite models provide the foundation for solutions like IBM watsonx Orchestrate, which enables enterprises to build and deploy powerful AI agents that automate and accelerate workflows across the enterprise.

CoreWeave’s NVIDIA GB200 NVL72 deployment for IBM also harnesses the IBM Storage Scale System, which delivers exceptional high-performance storage for AI. CoreWeave customers can access the IBM Storage platform within CoreWeave’s dedicated environments and AI cloud platform.

“We are excited to see the acceleration that NVIDIA GB200 NVL72 can bring to training our Granite family of models,” said Sriram Raghavan, vice president of AI at IBM Research. “This collaboration with CoreWeave will augment IBM’s capabilities to help build advanced, high-performance and cost-efficient models for powering enterprise and agentic AI applications with IBM watsonx.”

Compute Resources at Scale

Mistral AI is now getting its first thousand Blackwell GPUs to build the next generation of open-source AI models.

Mistral AI, a Paris-based leader in open-source AI, is using CoreWeave’s infrastructure, now equipped with GB200 NVL72, to speed up the development of its language models. With models like Mistral Large delivering strong reasoning capabilities, Mistral needs fast computing resources at scale.

To train and deploy these models effectively, Mistral AI requires a cloud provider that offers large, high-performance GPU clusters with NVIDIA Quantum InfiniBand networking and reliable infrastructure management. CoreWeave’s experience standing up NVIDIA GPUs at scale with industry-leading reliability and resiliency through tools such as CoreWeave Mission Control met these requirements.

“Right out of the box and without any further optimizations, we saw a 2x improvement in performance for dense model training,” said Thimothee Lacroix, cofounder and chief technology officer at Mistral AI. “What’s exciting about NVIDIA GB200 NVL72 is the new possibilities it opens up for model development and inference.”

A Growing Number of Blackwell Instances

In addition to long-term customer solutions, CoreWeave offers instances with rack-scale NVIDIA NVLink across 72 NVIDIA Blackwell GPUs and 36 NVIDIA Grace CPUs, scaling to up to 110,000 GPUs with NVIDIA Quantum-2 InfiniBand networking.

These instances, accelerated by the NVIDIA GB200 NVL72 rack-scale accelerated computing platform, provide the scale and performance needed to build and deploy the next generation of AI reasoning models and agents.

Everywhere, All at Once: NVIDIA Drives the Next Phase of AI Growth

Every company and country wants to grow and create economic opportunity — but they need virtually limitless intelligence to do so. Working with its ecosystem partners, NVIDIA this week is underscoring its work advancing reasoning, AI models and compute infrastructure to manufacture intelligence in AI factories — driving the next phase of growth in the U.S. and around the world.

Yesterday, NVIDIA announced it will manufacture AI supercomputers in the U.S. for the first time. Within the next four years, the company and its partners plan to produce up to half a trillion dollars’ worth of AI infrastructure in the U.S.

Building NVIDIA AI supercomputers in the U.S. for American AI factories is expected to create opportunities for hundreds of thousands of people and drive trillions of dollars in growth over the coming decades. Some of the NVIDIA Blackwell compute engines at the heart of those AI supercomputers are already being produced at TSMC fabs in Arizona.

NVIDIA announced today that NVIDIA Blackwell GB200 NVL72 rack-scale systems are now available from CoreWeave for customers to train next-generation AI models and run applications at scale. CoreWeave has thousands of NVIDIA Grace Blackwell processors available now to train and deploy the next wave of AI.

Beyond hardware innovation, NVIDIA also pioneers AI software to create more efficient and intelligent models.

Marking the latest in those advances, the NVIDIA Llama Nemotron Ultra model was recognized today by Artificial Analysis as the world’s most accurate open-source reasoning model for scientific and complex coding tasks. It’s also now ranked among the top reasoning models in the world.

NVIDIA’s engineering feats serve as the foundation of it all. A team of NVIDIA engineers won first place in the AI Mathematical Olympiad, competing against 2,200 teams to solve complex mathematical reasoning problems, which are key to advancing scientific discovery across disciplines and domains. The same post-training techniques and open datasets from NVIDIA’s winning effort in the math reasoning competition were applied in training the Llama Nemotron Ultra model.

The world’s need for intelligence is virtually limitless, and NVIDIA’s AI platform is helping meet that need — everywhere, all at once.

Math Test? No Problems: NVIDIA Team Scores Kaggle Win With Reasoning Model

The final days of the AI Mathematical Olympiad’s latest competition were a transcontinental relay for team NVIDIA.

Every evening, two team members on opposite ends of the U.S. would submit an AI reasoning model to Kaggle — the online Olympics of data science and machine learning. They’d wait a tense five hours before learning how well the model tackled a sample set of 50 complex math problems.

After seeing the results, the U.S. team would pass the baton to teammates waking up in Armenia, Finland, Germany and Northern Ireland, who would spend their day testing, modifying and optimizing different model versions.

“Every night I’d be so disappointed in our score, but then I’d wake up and see the messages that came in overnight from teammates in Europe,” said Igor Gitman, senior applied scientist. “My hopes would go up and we’d try again.”

While the team was disheartened by their lack of improvement on the public dataset during the competition’s final days, the real test of an AI model is how well it can generalize to unseen data. That’s where their reasoning model leapt to the top of the leaderboard — correctly answering 34 out of 50 Olympiad questions within a five-hour time limit using a cluster of four NVIDIA L4 GPUs.

“We got the magic in the end,” said Northern Ireland-based team member Darragh Hanley, a Kaggle grandmaster and senior large language model (LLM) technologist.

Building a Winning Equation

The NVIDIA team competed under the name NemoSkills — a nod to their use of the NeMo-Skills collection of pipelines for accelerated LLM training, evaluation and inference. The seven members each contributed different areas of expertise, spanning LLM training, model distillation and inference optimization.

For the Kaggle challenge, over 2,200 participating teams submitted AI models tasked with solving 50 math questions — complex problems at the National Olympiad level, spanning algebra, geometry, combinatorics and number theory — within five hours.

The team’s winning model uses a combination of natural language reasoning and Python code execution.

To complete this inference challenge on the small cluster of NVIDIA L4 GPUs available via Kaggle, the NemoSkills team had to get creative.

Their winning model used Qwen2.5-14B-Base, a foundation model with chain-of-thought reasoning capabilities, which the team fine-tuned on millions of synthetically generated solutions to math problems.

These synthetic solutions were primarily generated by two larger reasoning models — DeepSeek-R1 and QwQ-32B — and used to teach the team’s foundation model via a form of knowledge distillation. The end result was a smaller, faster, long-thinking model capable of tackling complex problems using a combination of natural language reasoning and Python code execution.
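
For readers curious what distillation by fine-tuning can look like in code, below is a heavily simplified sketch (not the team’s actual NeMo-Skills pipeline): supervised fine-tuning of a base causal language model on problem/solution pairs generated by stronger teacher models, using Hugging Face Transformers. The dataset file, prompt format and hyperparameters are assumptions.

```python
# Illustrative sketch of distillation via supervised fine-tuning: train a base
# causal LM on solutions generated by stronger "teacher" reasoning models.
# Not the team's NeMo-Skills pipeline; dataset path and formatting are placeholders.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

MODEL_NAME = "Qwen/Qwen2.5-14B"   # base model (a smaller model works for a quick test)
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME, torch_dtype="bfloat16")

# Each record holds a math problem and a teacher-generated solution, e.g.
# {"problem": "...", "solution": "...step-by-step reasoning and final answer..."}
dataset = load_dataset("json", data_files="synthetic_math_solutions.jsonl", split="train")

def to_features(example):
    text = f"Problem: {example['problem']}\nSolution: {example['solution']}"
    return tokenizer(text, truncation=True, max_length=4096)

tokenized = dataset.map(to_features, remove_columns=dataset.column_names)
collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)  # standard causal-LM loss

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="qwen-math-distilled",
        per_device_train_batch_size=1,
        gradient_accumulation_steps=16,
        num_train_epochs=1,
        learning_rate=1e-5,
        bf16=True,
        logging_steps=50,
    ),
    train_dataset=tokenized,
    data_collator=collator,
)
trainer.train()
```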

To further boost performance, the team’s solution reasons through multiple long-thinking responses in parallel before determining a final answer. To optimize this process and meet the competition’s time limit, the team also used an innovative early-stopping technique.

A reasoning model might, for example, be set to answer a math problem 12 different times before picking the most common response. Using the asynchronous processing capabilities of NeMo-Skills and NVIDIA TensorRT-LLM, the team was able to monitor generations and exit inference early once the model had already converged on the same answer four or more times.
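
Here is a small illustrative sketch of that early-stopping idea (an assumption-laden stand-in, not the team’s NeMo-Skills code): several solution attempts run concurrently, final answers are tallied as they arrive, and remaining work is cancelled once any answer has been seen a threshold number of times. The `solve_once` coroutine is a placeholder for a real asynchronous call to an inference server.

```python
# Illustrative early-stopping majority vote over parallel solution attempts.
# solve_once() is a placeholder for an async call to a real inference backend.
import asyncio
import random
from collections import Counter

async def solve_once(problem: str) -> str:
    """Placeholder: generate one long-form solution and return its final answer."""
    await asyncio.sleep(random.uniform(0.1, 0.5))      # simulate generation latency
    return random.choice(["42", "42", "42", "17"])      # noisy answers for the demo

async def solve_with_early_stop(problem: str, attempts: int = 12, threshold: int = 4) -> str:
    counts: Counter[str] = Counter()
    tasks = [asyncio.create_task(solve_once(problem)) for _ in range(attempts)]

    try:
        for finished in asyncio.as_completed(tasks):
            answer = await finished
            counts[answer] += 1
            if counts[answer] >= threshold:              # consensus reached: stop early
                return answer
    finally:
        for t in tasks:                                   # cancel any still-running attempts
            t.cancel()

    return counts.most_common(1)[0][0]                    # fall back to plain majority vote

if __name__ == "__main__":
    print(asyncio.run(solve_with_early_stop("What is 6 * 7?")))
```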

TensorRT-LLM also enabled the team to harness FP8 quantization, a compression method that resulted in a 1.5x speedup over the more commonly used FP16 format. ReDrafter, a speculative decoding technique developed by Apple, was used for a further 1.8x speedup.

The final model performed even better on the competition’s unseen final dataset than it did on the public dataset — a sign that the team successfully built a generalizable model and avoided overfitting their LLM to the sample data.

“Even without the Kaggle competition, we’d still be working to improve AI reasoning models for math,” said Gitman. “But Kaggle gives us the opportunity to benchmark and discover how well our models generalize to a third-party dataset.”

Sharing the Wealth 

The team will soon release a technical report detailing the techniques used in their winning solution — and plans to share their dataset and a series of models on Hugging Face. The advancements and optimizations they made over the course of the competition have been integrated into NeMo-Skills pipelines available on GitHub.

Key data, technology and insights from this pipeline were also used to train the just-released NVIDIA Llama Nemotron Ultra model.

“Throughout this collaboration, we used tools across the NVIDIA software stack,” said Christof Henkel, a member of the Kaggle Grandmasters of NVIDIA, known as KGMON. “By working closely with our LLM research and development teams, we’re able to take what we learn from the competition on a day-to-day basis and push those optimizations into NVIDIA’s open-source libraries.”

After the competition win, Henkel regained the title of Kaggle World Champion — ranking No. 1 among the platform’s over 23 million users. Another teammate, Finland-based Ivan Sorokin, earned the Kaggle Grandmaster title, held by just over 350 people around the world.

For their first-place win, the group also won a $262,144 prize that they’re directing to the NVIDIA Foundation to support charitable organizations.

Meet the full team — Igor Gitman, Darragh Hanley, Christof Henkel, Ivan Moshkov, Benedikt Schifferer, Ivan Sorokin and Shubham Toshniwal — in the video below:

Sample math questions in the featured visual above are from the 2025 American Invitational Mathematics Examination. Find the full set of questions and solutions on the Art of Problem Solving wiki.

NVIDIA to Manufacture American-Made AI Supercomputers in US for First Time

NVIDIA is working with its manufacturing partners to design and build factories that, for the first time, will produce NVIDIA AI supercomputers entirely in the U.S.

Together with leading manufacturing partners, the company has commissioned more than a million square feet of manufacturing space to build and test NVIDIA Blackwell chips in Arizona and AI supercomputers in Texas.

NVIDIA Blackwell chips have started production at TSMC’s chip plants in Phoenix, Arizona. NVIDIA is building supercomputer manufacturing plants in Texas, with Foxconn in Houston and with Wistron in Dallas. Mass production at both plants is expected to ramp up in the next 12-15 months.

The AI chip and supercomputer supply chain is complex and demands the most advanced manufacturing, packaging, assembly and test technologies. NVIDIA is partnering with Amkor and SPIL for packaging and testing operations in Arizona.

Within the next four years, NVIDIA plans to produce up to half a trillion dollars of AI infrastructure in the United States through partnerships with TSMC, Foxconn, Wistron, Amkor and SPIL. These world-leading companies are deepening their partnership with NVIDIA, growing their businesses while expanding their global footprint and hardening supply chain resilience.

NVIDIA AI supercomputers are the engines of a new type of data center created for the sole purpose of processing artificial intelligence — AI factories that are the infrastructure powering a new AI industry. Tens of “gigawatt AI factories” are expected to be built in the coming years. Manufacturing NVIDIA AI chips and supercomputers for American AI factories is expected to create hundreds of thousands of jobs and drive trillions of dollars in economic security over the coming decades.

“The engines of the world’s AI infrastructure are being built in the United States for the first time,” said Jensen Huang, founder and CEO of NVIDIA. “Adding American manufacturing helps us better meet the incredible and growing demand for AI chips and supercomputers, strengthens our supply chain and boosts our resiliency.”

The company will utilize its advanced AI, robotics and digital twin technologies to design and operate the facilities, including NVIDIA Omniverse to create digital twins of factories and NVIDIA Isaac GR00T to build robots to automate manufacturing.
