Innovation to Impact: How NVIDIA Research Fuels Transformative Work in AI, Graphics and Beyond

The roots of many of NVIDIA’s landmark innovations — the foundational technology that powers AI, accelerated computing, real-time ray tracing and seamlessly connected data centers — can be found in the company’s research organization, a global team of around 400 experts in fields including computer architecture, generative AI, graphics and robotics.

Established in 2006 and led since 2009 by Bill Dally, former chair of Stanford University’s computer science department, NVIDIA Research is unique among corporate research organizations — set up with a mission to pursue complex technological challenges while having a profound impact on the company and the world.

“We make a deliberate effort to do great research while being relevant to the company,” said Dally, chief scientist and senior vice president of NVIDIA Research. “It’s easy to do one or the other. It’s hard to do both.”

Dally is among NVIDIA Research leaders sharing the group’s innovations at NVIDIA GTC, the premier developer conference at the heart of AI, taking place this week in San Jose, California.

“We make a deliberate effort to do great research while being relevant to the company.” — Bill Dally, chief scientist and senior vice president

While many research organizations may describe their mission as pursuing projects with a longer time horizon than those of a product team, NVIDIA researchers seek out projects with a larger “risk horizon” — and a huge potential payoff if they succeed.

“Our mission is to do the right thing for the company. It’s not about building a trophy case of best paper awards or a museum of famous researchers,” said David Luebke, vice president of graphics research and NVIDIA’s first researcher. “We are a small group of people who are privileged to be able to work on ideas that could fail. And so it is incumbent upon us to not waste that opportunity and to do our best on projects that, if they succeed, will make a big difference.”

Innovating as One Team

One of NVIDIA’s core values is “one team” — a deep commitment to collaboration that helps researchers work closely with product teams and industry stakeholders to transform their ideas into real-world impact.

“Everybody at NVIDIA is incentivized to figure out how to work together because the accelerated computing work that NVIDIA does requires full-stack optimization,” said Bryan Catanzaro, vice president of applied deep learning research at NVIDIA. “You can’t do that if each piece of technology exists in isolation and everybody’s staying in silos. You have to work together as one team to achieve acceleration.”

When evaluating potential projects, NVIDIA researchers consider whether the challenge is a better fit for a research or product team, whether the work merits publication at a top conference, and whether there’s a clear potential benefit to NVIDIA. If they decide to pursue the project, they do so while engaging with key stakeholders.

“We are a small group of people who are privileged to be able to work on ideas that could fail. And so it is incumbent upon us to not waste that opportunity.” — David Luebke, vice president of graphics research

“We work with people to make something real, and often, in the process, we discover that the great ideas we had in the lab don’t actually work in the real world,” Catanzaro said. “It’s a tight collaboration where the research team needs to be humble enough to learn from the rest of the company what they need to do to make their ideas work.”

The team shares much of its work through papers, technical conferences and open-source platforms like GitHub and Hugging Face. But its focus remains on industry impact.

“We think of publishing as a really important side effect of what we do, but it’s not the point of what we do,” Luebke said.

NVIDIA Research’s first effort was focused on ray tracing, which after a decade of sustained work led directly to the launch of NVIDIA RTX and redefined real-time computer graphics. The organization now includes teams specializing in chip design, networking, programming systems, large language models, physics-based simulation, climate science, humanoid robotics and self-driving cars — and continues expanding to tackle additional areas of study and tap expertise across the globe.

“You have to work together as one team to achieve acceleration.” — Bryan Catanzaro, vice president of applied deep learning research

Transforming NVIDIA — and the Industry

NVIDIA Research didn’t just lay the groundwork for some of the company’s most well-known products — its innovations have propelled and enabled today’s era of AI and accelerated computing.

It began with CUDA, a parallel computing software platform and programming model that enables researchers to tap GPU acceleration for myriad applications. Launched in 2006, CUDA made it easy for developers to harness the parallel processing power of GPUs to speed up scientific simulations, gaming applications and the creation of AI models.
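The core idea of that programming model can be sketched in plain Python: write the per-element computation once (the "kernel") and launch it across every index. This is a conceptual toy, not CUDA source; on a GPU, the per-index invocations run in parallel rather than in a loop.

```python
# Conceptual sketch of CUDA's execution model: one lightweight thread per
# data element. The "kernel" below computes SAXPY (out = a*x + y) for a
# single index i, and the "launch" maps it across all n elements.

def saxpy_kernel(i, a, x, y, out):
    # Body of what a CUDA kernel would run for thread index i.
    out[i] = a * x[i] + y[i]

def launch(kernel, n, *args):
    # Stand-in for a kernel launch: on a GPU these n calls run in parallel.
    for i in range(n):
        kernel(i, *args)

x = [1.0, 2.0, 3.0, 4.0]
y = [10.0, 20.0, 30.0, 40.0]
out = [0.0] * len(x)
launch(saxpy_kernel, len(x), 2.0, x, y, out)
print(out)  # [12.0, 24.0, 36.0, 48.0]
```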

“Developing CUDA was the single most transformative thing for NVIDIA,” Luebke said. “It happened before we had a formal research group, but it happened because we hired top researchers and had them work with top architects.”

Making Ray Tracing a Reality

Once NVIDIA Research was founded, its members began working on GPU-accelerated ray tracing, spending years developing the algorithms and the hardware to make it possible. In 2009, the project — led by the late Steven Parker, a real-time ray tracing pioneer who was vice president of professional graphics at NVIDIA — reached the product stage with the NVIDIA OptiX application framework, detailed in a 2010 SIGGRAPH paper.

The researchers’ work expanded and, in collaboration with NVIDIA’s architecture group, eventually led to the development of NVIDIA RTX ray-tracing technology, including RT Cores that enabled real-time ray tracing for gamers and professional creators.

Unveiled in 2018, NVIDIA RTX also marked the launch of another NVIDIA Research innovation: NVIDIA DLSS, or Deep Learning Super Sampling. With DLSS, the graphics pipeline no longer needs to draw all the pixels in a video. Instead, it draws a fraction of the pixels and gives an AI pipeline the information needed to create the image in crisp, high resolution.
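The principle behind DLSS can be illustrated with a toy upscaler. This is a conceptual sketch only: DLSS replaces the naive nearest-neighbor fill below with a trained neural network that also uses motion data, but the pipeline shape is the same, render few pixels, reconstruct many.

```python
# Toy illustration: render at low resolution, reconstruct a full-resolution
# frame. DLSS does the reconstruction with a neural network, not this fill.

def upscale_2x(frame):
    """Nearest-neighbor 2x upscale of a 2D grid of pixel values."""
    out = []
    for row in frame:
        wide = [p for p in row for _ in (0, 1)]  # duplicate pixels horizontally
        out.append(wide)
        out.append(list(wide))                   # duplicate rows vertically
    return out

low_res = [[1, 2],
           [3, 4]]          # the GPU draws only a fraction of the pixels
high_res = upscale_2x(low_res)
print(high_res)  # a 4x4 frame reconstructed from a 2x2 render
```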

Accelerating AI for Virtually Any Application

NVIDIA’s research contributions in AI software kicked off with the NVIDIA cuDNN library for GPU-accelerated neural networks, which was developed as a research project when the deep learning field was still in its initial stages — then released as a product in 2014.

As deep learning soared in popularity and evolved into generative AI, NVIDIA Research was at the forefront — exemplified by NVIDIA StyleGAN, a groundbreaking visual generative AI model that demonstrated how neural networks could rapidly generate photorealistic imagery.

While generative adversarial networks, or GANs, were first introduced in 2014, “StyleGAN was the first model to generate visuals that could completely pass muster as a photograph,” Luebke said. “It was a watershed moment.”

NVIDIA StyleGAN

NVIDIA researchers introduced a slew of popular GAN models such as the AI painting tool GauGAN, which later developed into the NVIDIA Canvas application. And with the rise of diffusion models, neural radiance fields and Gaussian splatting, they’re still advancing visual generative AI — including in 3D with recent models like Edify 3D and 3DGUT.

NVIDIA GauGAN

In the field of large language models, Megatron-LM was an applied research initiative that enabled the efficient training and inference of massive LLMs for language-based tasks such as content generation, translation and conversational AI. It’s integrated into the NVIDIA NeMo platform for developing custom generative AI, which also features speech recognition and speech synthesis models that originated in NVIDIA Research.

Achieving Breakthroughs in Chip Design, Networking, Quantum and More

AI and graphics are only some of the fields NVIDIA Research tackles — several teams are achieving breakthroughs in chip architecture, electronic design automation, programming systems, quantum computing and more.

In 2012, Dally submitted a research proposal to the U.S. Department of Energy for a project that would become NVIDIA NVLink and NVSwitch, the high-speed interconnect that enables rapid communication between GPU and CPU processors in accelerated computing systems.

NVLink Switch tray

In 2013, the circuit research team published work on chip-to-chip links that introduced a signaling system co-designed with the interconnect to enable a high-speed, low-area and low-power link between dies. The project eventually became the link between the NVIDIA Grace CPU and NVIDIA Hopper GPU.

In 2021, the ASIC and VLSI Research group developed a software-hardware codesign technique for AI accelerators called VS-Quant that enabled many machine learning models to run with 4-bit weights and 4-bit activations at high accuracy. Their work influenced the development of FP4 precision support in the NVIDIA Blackwell architecture.

And unveiled this year at the CES trade show was NVIDIA Cosmos, a platform created by NVIDIA Research to accelerate the development of physical AI for next-generation robots and autonomous vehicles. Read the research paper and check out the AI Podcast episode on Cosmos for details.

Learn more about NVIDIA Research at GTC. Watch the keynote by NVIDIA founder and CEO Jensen Huang below:

See notice regarding software product information.

NVIDIA Honors Americas Partners Advancing Agentic and Physical AI

NVIDIA this week recognized 14 partners leading the way across the Americas for their work advancing agentic and physical AI across industries.

The 2025 Americas NVIDIA Partner Network awards — announced at the GTC 2025 global AI conference — represent key efforts by industry leaders to help customers become experts in using AI to solve many of today’s greatest challenges. The awards honor the diverse contributions of NPN members fostering AI-driven innovation and growth.

This year, NPN introduced three new award categories that reflect how AI is driving economic growth and opportunities, including:

  • Trailblazer, which honors a visionary partner spearheading AI adoption and setting new industry standards.
  • Rising Star, which celebrates an emerging talent helping industries harness AI to drive transformation.
  • Innovation, which recognizes a partner that’s demonstrated exceptional creativity and forward thinking.

This year’s NPN ecosystem winners have helped companies across industries use AI to adapt to new challenges and prioritize energy-efficient accelerated computing. NPN partners help customers implement a broad range of AI technologies, including NVIDIA-accelerated AI factories, as well as large language models and generative AI chatbots, to transform business operations.

The 2025 NPN award winners for the Americas are:

  • Global Consulting Partner of the Year — Accenture is recognized for its impact and depth of engineering with its AI Refinery platform for industries, simulation and robotics, marketing and sovereignty, which helps organizations enhance innovation and growth with custom-built approaches to AI-driven enterprise reinvention.
  • Trailblazer Partner of the Year — Advizex is recognized for its commitment to driving innovation in AI and high-performance computing, helping industries like healthcare, manufacturing, retail and government seamlessly integrate advanced AI technologies into existing business frameworks. This enables organizations to achieve significant operational efficiencies, enhanced decision-making and accelerated digital transformation.
  • Rising Star Partner of the Year — AHEAD is recognized for its leadership, technical expertise and deployment of NVIDIA software, NVIDIA DGX systems, NVIDIA HGX and networking technologies to advance AI, benefitting customers across healthcare, financial services, life sciences and higher education.
  • Networking Partner of the Year — Computacenter is recognized for advancing high-performance computing and data centers with NVIDIA networking technologies. The company achieved this by using the NVIDIA AI Enterprise software platform, DGX platforms and NVIDIA networking to drive innovation and growth throughout industries with efficient, accelerated data centers.
  • Solution Integration Partner of the Year — EXXACT is recognized for its efforts in helping research institutions and businesses tap into generative AI, large language models and high-performance computing. The company harnesses NVIDIA GPUs and networking technologies to deliver powerful computing platforms that accelerate innovation and tackle complex computational challenges across various industries.
  • Enterprise Partner of the Year — World Wide Technology (WWT) is recognized for its leadership in advancing AI adoption for customers across industry verticals worldwide. The company expanded its end-to-end AI capabilities by integrating NVIDIA Blueprints into its AI Proving Ground and has made a $500 million commitment to AI development over three years to help speed enterprise generative AI deployments.
  • Software Partner of the Year — Mark III is recognized for the work of its cross-functional team spanning data scientists, developers, 3D artists, systems engineers, and HPC and AI architects, as well as its close collaborations with enterprises and institutions, to deploy NVIDIA software, including NVIDIA AI Enterprise and NVIDIA Omniverse, across industries. These efforts have helped many customers build software-powered pipelines and data flywheels with machine learning, generative AI, high-performance computing and digital twins.
  • Higher Education Research Partner of the Year — Mark III is recognized for its close engagement with universities, academic institutions and research organizations to cultivate the next generation of leaders across AI, machine learning, generative AI, high-performance computing and digital twins.
  • Healthcare Partner of the Year — Lambda is recognized for empowering healthcare and biotech organizations with AI training, fine-tuning and inferencing solutions to speed innovation and drive breakthroughs in AI-driven drug discovery. The company provides AI training, fine-tuning and inferencing solutions at every scale — from individual workstations to comprehensive AI factories — that help healthcare providers seamlessly integrate NVIDIA accelerated computing and software into their infrastructure.
  • Financial Services Partner of the Year — WWT is recognized for driving the digital transformation of the world’s largest banks and financial institutions. The company harnesses NVIDIA AI technologies to optimize data management, enhance cybersecurity and deliver transformative generative AI solutions, helping financial services clients navigate rapid technological changes and evolving customer expectations.
  • Innovation Partner of the Year — Cambridge Computer is recognized for supporting customers deploying transformative technologies, including NVIDIA Grace Hopper, NVIDIA Blackwell and the NVIDIA Omniverse platform for physical AI.
  • Service Delivery Partner of the Year — SoftServe is recognized for its impact in driving enterprise adoption of NVIDIA AI and Omniverse with custom NVIDIA Blueprints that tap into NVIDIA NIM microservices and NVIDIA NeMo and Riva software. SoftServe helps customers create generative AI services for industries spanning manufacturing, retail, financial services, auto, healthcare and life sciences.
  • Distribution Partner of the Year — TD SYNNEX has been recognized for the second consecutive year for supporting customers in accelerating AI growth through rapid delivery of NVIDIA accelerated computing and software, as part of its Destination AI initiative.
  • Rising Star Consulting Partner of the Year — Tata Consultancy Services (TCS) is recognized for its growth and commitment to providing industry-specific solutions that help customers adopt AI faster and at scale. Through its recently launched business unit and center of excellence built on NVIDIA AI Enterprise and Omniverse, TCS is poised to accelerate adoption of agentic AI and physical AI solutions to speed innovation for customers worldwide.
  • Canadian Partner of the Year — Hypertec is recognized for its advancement of high-performance computing and generative AI across Canada. The company has employed the full-stack NVIDIA platform to accelerate AI for financial services, higher education and research.
  • Public Sector Partner of the Year — Government Acquisitions (GAI) is recognized for its rapid AI deployment and robust customer relationships, helping serve the unique needs of the federal government by adding AI to operations to improve public safety and efficiency.

Learn more about the NPN program.

NVIDIA Blackwell Powers Real-Time AI for Entertainment Workflows

AI has been shaping the media and entertainment industry for decades, from early recommendation engines to AI-driven editing and visual effects automation. Real-time AI — which lets companies actively drive content creation, personalize viewing experiences and rapidly deliver data insights — marks the next wave of that transformation.

With the NVIDIA RTX PRO Blackwell GPU series, announced yesterday at the NVIDIA GTC global AI conference, media companies can now harness real-time AI for media workflows with unprecedented speed, efficiency and creative potential.

NVIDIA Blackwell serves as the foundation of NVIDIA Media2, an initiative that enables real-time AI by bringing together NVIDIA technologies — including NVIDIA NIM microservices, NVIDIA AI Blueprints, accelerated computing platforms and generative AI software — to transform all aspects of production workflows and experiences, starting with content creation, streaming and live media.

Powering Intelligent Content Creation

Accelerated computing enables AI-driven workflows to process massive datasets in real time, unlocking faster rendering, simulation and content generation.

The NVIDIA RTX PRO Blackwell GPU series includes new features that enable unprecedented graphics and AI performance. The NVIDIA Streaming Multiprocessor offers up to 1.5x faster throughput over the NVIDIA Ada generation, and new neural shaders integrate AI inside of programmable shaders for advanced content creation.

Fourth-generation RT Cores deliver up to 2x the performance of the previous generation, enabling the creation of massive photoreal and physically accurate animated scenes. Fifth-generation Tensor Cores deliver up to 4,000 trillion AI operations per second and add support for FP4 precision. And up to 96GB of GDDR7 memory boosts GPU bandwidth and capacity, allowing applications to run faster and work with larger, more complex datasets for massive 3D and AI projects, large-scale virtual-reality environments and more.

Elio © Disney/Pixar

“One of the most exciting aspects of new technology is how it empowers our artists with tools to enhance their creative workflows,” said Steve May, chief technology officer of Pixar Animation Studios. “With Pixar’s next-generation renderer, RenderMan XPU — optimized for the NVIDIA Blackwell platform — 99% of Pixar shots can now fit within the 96GB of memory on the NVIDIA RTX PRO 6000 Blackwell GPUs. This breakthrough will fundamentally improve the way we make movies.”

© Lucasfilm Ltd.

“Our artists were frequently maxing out our 48GB cards with ILM StageCraft environments and having to battle performance issues on set for 6K and 8K real-time renders,” said Stephen Hill, principal rendering engineer at Lucasfilm. “The new NVIDIA RTX PRO 6000 Blackwell Max-Q Workstation Edition GPU lifts these limitations — we’re seeing upwards of a 2.5x performance increase over our current production GPUs, and with 96GB of VRAM we now have twice as much memory to play with.”

In addition, neural rendering with NVIDIA RTX Kit brings cinematic-quality ray tracing and AI-enhanced graphics to real-time engines, elevating visual fidelity in film, TV and interactive media. RTX Kit is a suite of neural rendering technologies, including neural texture compression, neural shaders, RTX Global Illumination and Mega Geometry, that enhances graphics for games, animation, virtual production scenes and immersive experiences.

Fueling the Future of Streaming and Data Analytics

Data analytics is transforming raw audience insights into actionable intelligence faster than ever. NVIDIA accelerated computing and AI-powered frameworks enable studios to analyze viewer behavior, predict engagement patterns and optimize content in real time, driving hyper-personalized experiences and smarter creative decisions.

With the new GPUs, users can achieve real-time ingestion and data transformation with GPU-accelerated data loading and cleansing at scale.

The NVIDIA technologies accelerating streaming and data analytics include a suite of NVIDIA CUDA-X data processing libraries that enable immediate insights from continuous data streams and reduce latency, such as:

  • NVIDIA cuML: Enables GPU-accelerated training and inference for recommendation models using scikit-learn algorithms, providing real-time personalization capabilities and up-to-date relevant content recommendations that boost viewer engagement while reducing churn.
  • NVIDIA cuDF: Offers pandas DataFrame operations on GPUs, enabling faster and more efficient NVIDIA-accelerated extract, transform and load operations and analytics. cuDF helps optimize content delivery by analyzing user data to predict demand and adjust content distribution in real time, improving overall user experiences.
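As a rough sketch of the cuDF workflow described above, the following is ordinary pandas code (with hypothetical viewing data) that aggregates watch time per title to estimate demand. Under cuDF's pandas accelerator mode, the same script is intended to run GPU-accelerated without code changes, for example via `python -m cudf.pandas script.py`; that invocation is noted here as an assumption based on cuDF's documented zero-code-change mode.

```python
# Standard pandas analytics; cuDF's pandas accelerator aims to run this
# unchanged on the GPU. Data below is hypothetical, for illustration only.
import pandas as pd

# Hypothetical viewing log: which titles are watched, and for how long.
df = pd.DataFrame({
    "title":   ["A", "B", "A", "C", "B", "A"],
    "minutes": [30, 45, 25, 60, 15, 50],
})

# A typical ETL/analytics step: total watch time per title, highest first.
demand = df.groupby("title")["minutes"].sum().sort_values(ascending=False)
print(demand.index[0])  # "A" leads with 105 total minutes
```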

Along with cuML and cuDF, accelerated data science libraries provide seamless integration with the open-source Dask library for multi-GPU or multi-node clusters. The large memory of NVIDIA RTX PRO Blackwell GPUs can further assist with handling massive datasets and spikes in usage without sacrificing performance.

In addition, the video search and summarization blueprint integrates vision language models and large language models, providing cloud-native building blocks to build video analytics, search and summarization applications.

Breathing Life Into Live Media 

With NVIDIA RTX PRO Blackwell GPUs, broadcasters can achieve higher performance than ever in high-resolution video processing, real-time augmented reality and AI-driven content production and video analytics.

New features include:

  • Ninth-Generation NVIDIA NVENC: Adds support for 4:2:2 encoding, accelerating video encoding speed and improving quality for broadcast and live media applications while reducing costs of storing uncompressed video.
  • Sixth-Generation NVIDIA NVDEC: Provides up to double the H.264 decoding throughput and offers support for 4:2:2 H.264 and HEVC decode. Professionals can benefit from high-quality video playback, accelerated video data ingestion and advanced AI-powered video editing features.
  • Fifth-Generation PCIe: Provides double the bandwidth over the previous generation, improving data transfer speeds from CPU memory and unlocking faster performance for data-intensive tasks.
  • DisplayPort 2.1: Drives high-resolution displays at up to 8K at 240Hz and 16K at 60Hz. Increased bandwidth enables seamless multi-monitor setups, while high dynamic range and higher color depth support deliver more precise color accuracy for tasks like video editing and live broadcasting.

“The NVIDIA RTX PRO 6000 Blackwell Max-Q Workstation Edition GPU is a transformative force in Cosm’s mission to redefine immersive entertainment,” said Devin Poolman, chief product and technology officer at Cosm, a global immersive technology, media and entertainment company. “With its unparalleled performance, we can push the boundaries of real-time rendering, unlocking the ultra-high resolution and fluid frame rates needed to make our live, immersive experiences feel nearly indistinguishable from reality.”

As a key component of Cosm’s CX System 12K LED dome displays, RTX PRO 6000 Max-Q enables seamless merging of the physical and digital worlds to deliver shared reality experiences, enabling audiences to engage with sports, live events and cinematic content in entirely new ways.

Cosm’s shared reality experience, featuring its 87-foot-diameter LED dome display in stunning 12K resolution, with millions of pixels shining 10x brighter than the brightest cinematic display. Image courtesy of Cosm.

To learn more about NVIDIA Media2, watch the GTC keynote and register to attend sessions from NVIDIA and industry leaders at the show, which runs through Friday, March 21. 

Try NVIDIA NIM microservices and AI Blueprints on build.nvidia.com.

Accelerating AI Development With NVIDIA RTX PRO Blackwell Series GPUs and NVIDIA NIM Microservices for RTX

As generative AI capabilities expand, NVIDIA is equipping developers with the tools to seamlessly integrate AI into creative projects, applications and games to unlock groundbreaking experiences on NVIDIA RTX AI PCs and workstations.

At the NVIDIA GTC global AI conference this week, NVIDIA introduced the NVIDIA RTX PRO Blackwell series, a new generation of workstation and server GPUs built for complex AI-driven workloads, technical computing and high-performance graphics.

Alongside the new hardware, NVIDIA announced a suite of AI-powered tools, libraries and software development kits designed to accelerate AI development on PCs and workstations. With NVIDIA CUDA-X libraries for data science, developers can significantly accelerate data processing and machine learning tasks, enabling faster exploratory data analysis, feature engineering and model development with zero code changes. And with NVIDIA NIM microservices, developers can more seamlessly build AI assistants, productivity plug-ins and advanced content-creation workflows with peak performance.

AI at the Speed of NIM With RTX PRO Series GPUs

The RTX PRO Blackwell series is built to handle the most demanding AI-driven workflows, powering applications like AI agents, simulation, extended reality, 3D design and high-end visual effects. Whether for designing and engineering complex systems or creating sophisticated and immersive content, RTX PRO GPUs deliver the performance, efficiency and scalability professionals need.

The new lineup includes:

  • Desktop GPUs: NVIDIA RTX PRO 6000 Blackwell Workstation Edition, NVIDIA RTX PRO 6000 Blackwell Max-Q Workstation Edition, NVIDIA RTX PRO 5000 Blackwell, NVIDIA RTX PRO 4500 Blackwell and NVIDIA RTX PRO 4000 Blackwell
  • Laptop GPUs: NVIDIA RTX PRO 5000 Blackwell, NVIDIA RTX PRO 4000 Blackwell, NVIDIA RTX PRO 3000 Blackwell, NVIDIA RTX PRO 2000 Blackwell, NVIDIA RTX PRO 1000 Blackwell and NVIDIA RTX PRO 500 Blackwell Laptop GPUs
  • Data center GPU: NVIDIA RTX PRO 6000 Blackwell Server Edition

As AI and data science evolve, the ability to rapidly process and analyze massive datasets will become a key differentiator to enable breakthroughs across industries.

NVIDIA CUDA-X, built on CUDA, is a collection of libraries that deliver dramatically higher performance compared with CPU-only alternatives. With cuML 25.02 — now available in open beta — data scientists and researchers can accelerate scikit-learn, UMAP and HDBSCAN algorithms with zero code changes, unlocking new levels of performance and efficiency in machine learning tasks. This release extends the zero-code-change acceleration paradigm established by cuDF-pandas for DataFrame operations to machine learning, reducing iteration times from hours to seconds.
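To make the zero-code-change idea concrete, here is ordinary scikit-learn code with no GPU-specific lines. Per the cuML release described above, running such a script under the accelerator (for example, `python -m cuml.accel script.py`, an assumed invocation based on RAPIDS documentation) is what moves the work to the GPU; the code itself stays the same.

```python
# Unmodified scikit-learn: no GPU-specific code. cuML's accelerator mode is
# designed to intercept and accelerate this transparently. Toy data only.
from sklearn.cluster import KMeans

# Two well-separated groups of points.
X = [[0.0, 0.0], [0.1, 0.2], [5.0, 5.1], [5.2, 4.9]]

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

# The two tight groups should land in different clusters.
print(km.labels_[0] != km.labels_[2])  # True
```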

Optimized AI software unlocks even greater possibilities. NVIDIA NIM microservices are prepackaged, high-performance AI models optimized across NVIDIA GPUs, from RTX-powered PCs and workstations to the cloud. Developers can use NIM microservices to build AI-powered app assistants, productivity tools and content-creation workflows that seamlessly integrate with RTX PRO GPUs. This makes AI more accessible and powerful than ever.

NIM microservices integrate top community and NVIDIA-built models, spanning capabilities and modalities important for PC and workstation use cases, including large language models (LLMs), images, speech and retrieval-augmented generation (RAG).

Announced at the CES trade show in January, NVIDIA AI Blueprints are advanced AI reference workflows built on NVIDIA NIM. With AI Blueprints, developers can create podcasts from PDF documents, generate stunning 4K images controlled and guided by 3D scenes, and incorporate digital humans into AI-powered use cases.

Coming soon to build.nvidia.com, the blueprints are extensible and provide everything needed to build and customize them for different use cases. These resources include source code, sample data, a demo application and documentation.

From cutting-edge hardware to optimized AI models and reference workflows, the RTX PRO series is redefining AI-powered computing — enabling professionals to push the limits of creativity, productivity and innovation. Learn about all the GTC announcements and the RTX PRO Blackwell series GPUs for laptops and workstations.

Create NIMble AI Chatbots With ChatRTX

AI-powered chatbots are changing how people interact with their content.

ChatRTX is a demo app that personalizes an LLM connected to a user’s content, whether documents, notes, images or other data. Using RAG, the NVIDIA TensorRT-LLM library and RTX acceleration, users can query a custom chatbot to get contextually relevant answers. And because it all runs locally on Windows RTX PCs or RTX PRO workstations, they get fast, private results.
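The retrieval step at the heart of RAG can be sketched with a toy keyword scorer. This is a conceptual illustration, not ChatRTX internals (which use learned embeddings rather than word overlap); the file names and contents below are hypothetical.

```python
# Minimal RAG retrieval sketch: score local documents against a query, then
# hand the best match to the LLM as context for a grounded answer.

def score(query, doc):
    """Count overlapping words between query and document (toy relevance)."""
    return len(set(query.lower().split()) & set(doc.lower().split()))

docs = {
    "notes.txt":  "meeting notes budget review q3 targets",
    "recipe.txt": "pasta recipe tomato basil garlic",
    "trip.txt":   "travel itinerary flights hotel tokyo",
}

query = "what were the q3 budget targets"
best = max(docs, key=lambda name: score(query, docs[name]))
print(best)  # notes.txt
```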

Today, the latest version of ChatRTX introduces support for NVIDIA NIM microservices, giving users access to new foundation models. NIM microservices are expected to be available soon in additional top AI ecosystem apps. Download ChatRTX today.

Game On

Half-Life 2 owners can now download a free Half-Life 2 RTX demo from Steam, built with RTX Remix and featuring the latest neural rendering enhancements. RTX Remix supports a host of AI tools, including NVIDIA DLSS 4, RTX Neural Radiance Cache and the new community-published AI model PBRFusion 3, which upscales textures and generates high-quality normal, roughness and height maps for physically based materials.

The March NVIDIA Studio Driver is also now available for download, supporting recent app updates including last week’s RTX Remix launch. For automatic Studio Driver notifications, download the NVIDIA app.

In addition, NVIDIA RTX Kit, a suite of neural rendering technologies for game developers, is receiving major updates with Unreal Engine 5 support for the RTX Mega Geometry and RTX Hair features.

Learn more about the NVIDIA RTX PRO Blackwell GPUs by watching a replay of NVIDIA founder and CEO Jensen Huang’s GTC keynote and register to attend sessions from NVIDIA and industry leaders at the show, which runs through March 21.

Follow NVIDIA AI PC on Facebook, Instagram, TikTok and X — and stay informed by subscribing to the RTX AI PC newsletter.

Follow NVIDIA Workstation on LinkedIn and X.

AI Factories Are Redefining Data Centers and Enabling the Next Era of AI

AI is fueling a new industrial revolution — one driven by AI factories.

Unlike traditional data centers, AI factories do more than store and process data — they manufacture intelligence at scale, transforming raw data into real-time insights. For enterprises and countries around the world, this means dramatically faster time to value — turning AI from a long-term investment into an immediate driver of competitive advantage. Companies that invest in purpose-built AI factories today will lead in innovation, efficiency and market differentiation tomorrow.

While a traditional data center typically handles diverse workloads and is built for general-purpose computing, AI factories are optimized to create value from AI. They orchestrate the entire AI lifecycle — from data ingestion to training, fine-tuning and, most critically, high-volume inference.

For AI factories, intelligence isn't a byproduct but the primary product. This intelligence is measured by AI token throughput: the real-time predictions that drive decisions, automation and entirely new services.

While traditional data centers aren’t disappearing anytime soon, whether they evolve into AI factories or connect to them depends on the enterprise business model.

Regardless of how enterprises choose to adapt, AI factories powered by NVIDIA are already manufacturing intelligence at scale, transforming how AI is built, refined and deployed.

The Scaling Laws Driving Compute Demand

Over the past few years, AI has revolved around training large models. But with the recent proliferation of AI reasoning models, inference has become the main driver of AI economics. Three key scaling laws highlight why:

  • Pretraining scaling: Larger datasets and model parameters yield predictable intelligence gains, but reaching this stage demands significant investment in skilled experts, data curation and compute resources. Over the last five years, pretraining scaling has increased compute requirements by 50 million times. However, once a model is trained, it significantly lowers the barrier for others to build on top of it.
  • Post-training scaling: Fine-tuning AI models for specific real-world applications requires 30x more compute during AI inference than pretraining. As organizations adapt existing models for their unique needs, cumulative demand for AI infrastructure skyrockets.
  • Test-time scaling (aka long thinking): Advanced AI applications such as agentic AI or physical AI require iterative reasoning, where models explore multiple possible responses before selecting the best one. This consumes up to 100x more compute than traditional inference.
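The compounding effect of these multipliers on inference economics can be sketched with simple arithmetic. The snippet below is a hypothetical illustration using only the factors quoted above; `relative_inference_demand` is an invented helper for this sketch, not an NVIDIA tool or sizing model.

```python
# Illustrative arithmetic only: the multipliers below are the figures
# quoted in the text, not an NVIDIA sizing model.

POST_TRAINING_FACTOR = 30   # fine-tuning compute vs. pretraining (per the text)
TEST_TIME_FACTOR = 100      # "long thinking" vs. traditional inference (per the text)

def relative_inference_demand(queries: int, reasoning: bool) -> int:
    """Relative compute units to serve `queries`, with one traditional
    single-pass inference query defined as 1 unit."""
    per_query = TEST_TIME_FACTOR if reasoning else 1
    return queries * per_query

# Serving the same 1,000 queries with a reasoning model consumes
# up to 100x the compute of traditional inference:
print(relative_inference_demand(1_000, reasoning=False))  # → 1000
print(relative_inference_demand(1_000, reasoning=True))   # → 100000
```

This is why test-time scaling, not training, increasingly dominates the compute bill: the multiplier applies to every query served, for the lifetime of the deployment.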

Traditional data centers aren’t designed for this new era of AI. AI factories are purpose-built to optimize and sustain this massive demand for compute, providing an ideal path forward for AI inference and deployment.

Reshaping Industries and Economies With Tokens

Across the world, governments and enterprises are racing to build AI factories to spur economic growth, innovation and efficiency.

The European High Performance Computing Joint Undertaking recently announced plans to build seven AI factories in collaboration with 17 European Union member nations.

This follows a wave of AI factory investments worldwide, as enterprises and countries accelerate AI-driven economic growth across every industry and region.

These initiatives underscore a global reality: AI factories are quickly becoming essential national infrastructure, on par with telecommunications and energy.

Inside an AI Factory: Where Intelligence Is Manufactured

Foundation models, secure customer data and AI tools provide the raw materials for fueling AI factories, where inference serving, prototyping and fine-tuning shape powerful, customized models ready to be put into production.

As these models are deployed into real-world applications, they continuously learn from new data, which is stored, refined and fed back into the system using a data flywheel. This cycle of optimization ensures AI remains adaptive, efficient and always improving — driving enterprise intelligence at an unprecedented scale.

AI factories powered by NVIDIA for manufacturing enterprise intelligence at scale.

An AI Factory Advantage With Full-Stack NVIDIA AI

NVIDIA delivers a complete, integrated AI factory stack where every layer — from the silicon to the software — is optimized for training, fine-tuning, and inference at scale. This full-stack approach ensures enterprises can deploy AI factories that are cost effective, high-performing and future-proofed for the exponential growth of AI.

With its ecosystem partners, NVIDIA has created building blocks for the full-stack AI factory, offering:

  • Powerful compute performance
  • Advanced networking
  • Infrastructure management and workload orchestration
  • The largest AI inference ecosystem
  • Storage and data platforms
  • Blueprints for design and optimization
  • Reference architectures
  • Flexible deployment for every enterprise

Powerful Compute Performance

The heart of any AI factory is its compute power. From NVIDIA Hopper to NVIDIA Blackwell, NVIDIA provides the world’s most powerful accelerated computing for this new industrial revolution. With the NVIDIA Blackwell Ultra-based GB300 NVL72 rack-scale solution, AI factories can achieve up to 50x the output for AI reasoning, setting a new standard for efficiency and scale.

The NVIDIA DGX SuperPOD is the exemplar turnkey AI factory for enterprises, integrating the best of NVIDIA accelerated computing. NVIDIA DGX Cloud delivers an AI factory with high-performance NVIDIA accelerated computing in the cloud.

Global systems partners are building full-stack AI factories for their customers based on NVIDIA accelerated computing — now including the NVIDIA GB200 NVL72 and GB300 NVL72 rack-scale solutions.

Advanced Networking 

Moving intelligence at scale requires seamless, high-performance connectivity across the entire AI factory stack. NVIDIA NVLink and NVLink Switch enable high-speed, multi-GPU communication, accelerating data movement within and across nodes.

AI factories also demand a robust network backbone. The NVIDIA Quantum InfiniBand, NVIDIA Spectrum-X Ethernet, and NVIDIA BlueField networking platforms reduce bottlenecks, ensuring efficient, high-throughput data exchange across massive GPU clusters. This end-to-end integration is essential for scaling out AI workloads to million-GPU levels, enabling breakthrough performance in training and inference.

Infrastructure Management and Workload Orchestration

Businesses need a way to harness the power of AI infrastructure with the agility, efficiency and scale of a hyperscaler, but without the burdens of cost, complexity and expertise placed on IT.

With NVIDIA Run:ai, organizations can benefit from seamless AI workload orchestration and GPU management, optimizing resource utilization while accelerating AI experimentation and scaling workloads. NVIDIA Mission Control software, which includes NVIDIA Run:ai technology, streamlines AI factory operations from workloads to infrastructure while providing full-stack intelligence that delivers world-class infrastructure resiliency.

NVIDIA Mission Control streamlines workflows across the AI factory stack.

The Largest AI Inference Ecosystem

AI factories need the right tools to turn data into intelligence. The NVIDIA AI inference platform, spanning the NVIDIA TensorRT ecosystem, NVIDIA Dynamo and NVIDIA NIM microservices — all part (or soon to be part) of the NVIDIA AI Enterprise software platform — provides the industry’s most comprehensive suite of AI acceleration libraries and optimized software. It delivers maximum inference performance, ultra-low latency and high throughput.

Storage and Data Platforms

Data fuels AI applications, but the rapidly growing scale and complexity of enterprise data often make it too costly and time-consuming to harness effectively. To thrive in the AI era, enterprises must unlock the full potential of their data.

The NVIDIA AI Data Platform is a customizable reference design to build a new class of AI infrastructure for demanding AI inference workloads. NVIDIA-Certified Storage partners are collaborating with NVIDIA to create customized AI data platforms that can harness enterprise data to reason and respond to complex queries.

Blueprints for Design and Optimization

To design and optimize AI factories, teams can use the NVIDIA Omniverse Blueprint for AI factory design and operations. The blueprint enables engineers to design, test and optimize AI factory infrastructure before deployment using digital twins. By reducing risk and uncertainty, the blueprint helps prevent costly downtime — a critical factor for AI factory operators.

For a 1 gigawatt-scale AI factory, every day of downtime can cost over $100 million. By solving complexity upfront and enabling siloed teams in IT, mechanical, electrical, power and network engineering to work in parallel, the blueprint accelerates deployment and ensures operational resilience.

Reference Architectures

NVIDIA Enterprise Reference Architectures and NVIDIA Cloud Partner Reference Architectures provide a roadmap for partners designing and deploying AI factories. They help enterprises and cloud providers build scalable, high-performance and secure AI infrastructure based on NVIDIA-Certified Systems with the NVIDIA AI software stack and partner ecosystem.

NVIDIA full-stack AI factories, built on NVIDIA reference architectures. (*NVIS = NVIDIA infrastructure specialists)

Every layer of the AI factory stack relies on efficient computing to meet growing AI demands. NVIDIA accelerated computing serves as the foundation across the stack, delivering the highest performance per watt to ensure AI factories operate at peak energy efficiency. With energy-efficient architecture and liquid cooling, businesses can scale AI while keeping energy costs in check.

Flexible Deployment for Every Enterprise

With NVIDIA’s full-stack technologies, enterprises can easily build and deploy AI factories, aligning with customers’ preferred IT consumption models and operational needs.

Some organizations opt for on-premises AI factories to maintain full control over data and performance, while others use cloud-based solutions for scalability and flexibility. Many also turn to their trusted global systems partners for pre-integrated solutions that accelerate deployment.

The NVIDIA DGX GB300 is the highest-performing, largest-scale AI factory infrastructure available for enterprises, built for the era of AI reasoning.

On Premises

NVIDIA DGX SuperPOD is a turnkey AI factory infrastructure solution that provides accelerated infrastructure with scalable performance for the most demanding AI training and inference workloads. It features a design-optimized combination of AI compute, network fabric, storage and NVIDIA Mission Control software, empowering enterprises to get AI factories up and running in weeks instead of months — and with best-in-class uptime, resiliency and utilization.

AI factory solutions are also offered through the NVIDIA global ecosystem of enterprise technology partners with NVIDIA-Certified Systems. They deliver leading hardware and software technology, combined with data center systems expertise and liquid-cooling innovations, to help enterprises de-risk their AI endeavors and accelerate the return on investment of their AI factory implementations.

These global systems partners are providing full-stack solutions based on NVIDIA reference architectures — integrated with NVIDIA accelerated computing, high-performance networking and AI software — to help customers successfully deploy AI factories and manufacture intelligence at scale.

In the Cloud

For enterprises looking to use a cloud-based solution for their AI factory, NVIDIA DGX Cloud delivers a unified platform on leading clouds to build, customize and deploy AI applications. Every layer of DGX Cloud is optimized and fully managed by NVIDIA, offering the best of NVIDIA AI in the cloud. It features enterprise-grade software and large-scale, contiguous clusters on leading cloud providers, providing scalable compute resources ideal for even the most demanding AI training workloads.

DGX Cloud also includes a dynamic and scalable serverless inference platform that delivers high throughput for AI tokens across hybrid and multi-cloud environments, significantly reducing infrastructure complexity and operational overhead.

By providing a full-stack platform that integrates hardware, software, ecosystem partners and reference architectures, NVIDIA is helping enterprises build AI factories that are cost effective, scalable and high-performing — equipping them to meet the next industrial revolution.

Learn more about NVIDIA AI factories.

See notice regarding software product information.


NVIDIA Accelerated Quantum Research Center to Bring Quantum Computing Closer

As quantum computers continue to develop, they will integrate with AI supercomputers to form accelerated quantum supercomputers capable of solving some of the world’s hardest problems.

Integrating quantum processing units (QPUs) into AI supercomputers is key for developing new applications, helping unlock breakthroughs critical to running future quantum hardware and enabling developments in quantum error correction and device control.

The NVIDIA Accelerated Quantum Research Center, or NVAQC, announced today at the NVIDIA GTC global AI conference, is where these developments will happen. With an NVIDIA GB200 NVL72 system and the NVIDIA Quantum-2 InfiniBand networking platform, the facility will house a supercomputer with 576 NVIDIA Blackwell GPUs dedicated to quantum computing research.

“The NVAQC draws on much-needed and long-sought-after tools for scaling quantum computing to next-generation devices,” said Tim Costa, senior director of computer-aided engineering, quantum and CUDA-X at NVIDIA. “The center will be a place for large-scale simulations of quantum algorithms and hardware, tight integration of quantum processors, and both training and deployment of AI models for quantum.”

The NVAQC will host a GB200 NVL72 system.

Quantum computing innovators like Quantinuum, QuEra and Quantum Machines, along with academic partners from the Harvard Quantum Initiative and the Engineering Quantum Systems group at the MIT Center for Quantum Engineering, will work on projects with NVIDIA at the center to explore how AI supercomputing can accelerate the path toward quantum computing.

“The NVAQC is a powerful tool that will be instrumental in ushering in the next generation of research across the entire quantum ecosystem,” said William Oliver, professor of electrical engineering and computer science, and of physics, leader of the EQuS group and director of the MIT Center for Quantum Engineering. “NVIDIA is a critical partner for realizing useful quantum computing.”

There are several key quantum computing challenges where the NVAQC is already set to have a dramatic impact.

Protecting Qubits With AI Supercomputing

Qubit interactions are a double-edged sword. While qubits must interact with their surroundings to be controlled and measured, these same interactions are also a source of noise — unwanted disturbances that affect the accuracy of quantum calculations. Quantum algorithms can only work if the resulting noise is kept in check.

Quantum error correction provides a solution, encoding noiseless, logical qubits within many noisy, physical qubits. By processing the outputs from repeated measurements on these noisy qubits, it’s possible to identify, track and correct qubit errors — all without destroying the delicate quantum information needed by a computation.

The process of figuring out where errors occurred and what corrections to apply is called decoding. Decoding is an extremely difficult task that must be performed by a conventional computer within a narrow time frame to prevent noise from snowballing out of control.
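As a deliberately tiny illustration of what a decoder does, consider the classical three-bit repetition code: one logical bit is encoded as three physical bits, then recovered by majority vote even after a single bit flip. Real quantum decoders operate on measured error syndromes at far larger scales and under hard latency deadlines, but the inference task is analogous. This sketch is illustrative only; `encode` and `decode` are hypothetical helpers, not part of any NVIDIA or quantum-hardware API.

```python
# Toy illustration of decoding with a 3-bit repetition code, a classical
# stand-in for the simplest quantum bit-flip code. Real quantum decoders
# infer errors from measured syndromes rather than raw data bits, but the
# core task is the same: recover the most likely logical state from
# redundant, noisy measurements, fast enough to apply a correction.

def encode(bit: int) -> list[int]:
    """Encode one logical bit into three physical bits."""
    return [bit, bit, bit]

def decode(bits: list[int]) -> int:
    """Majority-vote decoder: tolerates any single bit-flip error."""
    return 1 if sum(bits) >= 2 else 0

codeword = encode(1)
codeword[0] ^= 1               # inject a single bit-flip error
assert decode(codeword) == 1   # the logical bit is still recovered
```

Production decoders for codes like the surface code face the same structure at vastly greater scale, which is why the NVAQC focuses on parallelized, AI-enhanced decoding on GPUs.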

A key goal of the NVAQC will be exploring how AI supercomputing can accelerate decoding. Studying how to collocate quantum hardware within the center will allow the development of low-latency, parallelized and AI-enhanced decoders, running on NVIDIA GB200 Grace Blackwell Superchips.

The NVAQC will also tackle other challenges in quantum error correction. QuEra will work with NVIDIA to accelerate its search for new, improved quantum error correction codes, assessing the performance of candidate codes through demanding simulations of complex quantum circuits.

“The NVAQC will be an essential tool for discovering, testing and refining new quantum error correction codes and decoders capable of bringing the whole industry closer to useful quantum computing,” said Mikhail Lukin, Joshua and Beth Friedman University Professor at Harvard and a codirector of the Harvard Quantum Initiative.

Developing Applications for Accelerated Quantum Supercomputers

The majority of useful quantum algorithms draw equally from classical and quantum computing resources, ultimately requiring an accelerated quantum supercomputer that unifies both kinds of hardware.

For example, the output of classical supercomputers is often needed to prime quantum computations. The NVAQC provides the heterogeneous compute infrastructure needed for research on developing and improving such hybrid algorithms.

Accelerated quantum supercomputers will connect quantum and classical processors to execute hybrid algorithms.

New AI-based compilation techniques will also be explored at the NVAQC, with the potential to accelerate the runtime of all quantum algorithms, including through work with Quantinuum. Quantinuum will build on its previous integration work with NVIDIA, offering its hardware and emulators through the NVIDIA CUDA-Q platform. Users of CUDA-Q are currently offered unrestricted access to Quantinuum’s QNTM H1-1 hardware and emulator for 90 days.

“We’re excited to deepen our work with NVIDIA via this center,” said Rajeeb Hazra, president and CEO of Quantinuum. “By combining Quantinuum’s powerful quantum systems with NVIDIA’s cutting-edge accelerated computing, we’re pushing the boundaries of hybrid quantum-classical computing and unlocking exciting new possibilities.”

QPU Integration

Integrating quantum hardware with AI supercomputing is one of the major remaining hurdles on the path to running useful quantum hardware.

The requirements of such an integration can be extremely demanding. The decoding required by quantum error correction can only function if data from millions of qubits can be sent between quantum and classical hardware at ultralow latencies.

Quantum Machines will work with NVIDIA at the NVAQC to develop and hone new controller technologies supporting rapid, high-bandwidth interfaces between quantum processors and GB200 superchips.

“We’re excited to see NVIDIA’s growing commitment to accelerating the realization of useful quantum computers, providing researchers with the most advanced infrastructure to push the boundaries of quantum-classical computing,” said Itamar Sivan, CEO of Quantum Machines.

The NVIDIA DGX Quantum system comprises an NVIDIA GH200 superchip coupled with Quantum Machines’ OPX1000 control system.

Key to integrating quantum and classical hardware is a platform that lets researchers and developers quickly shift context between these two disparate computing paradigms within a single application. The NVIDIA CUDA-Q platform will be the entry point for researchers to harness the NVAQC’s quantum-classical integration.

Building on tools like NVIDIA DGX Quantum — a reference architecture for integrating quantum and classical hardware — and CUDA-Q, the NVAQC is set to be an epicenter for next-generation developments in quantum computing, seeding the evolution of qubits into impactful quantum computers.

Learn more about NVIDIA quantum computing.


Full Steam Ahead: NVIDIA-Certified Program Expands to Enterprise Storage for Faster AI Factory Deployment

AI deployments thrive on speed, data and scale. That’s why NVIDIA is expanding NVIDIA-Certified Systems to include enterprise storage certification — for streamlined AI factory deployments in the enterprise with accelerated computing, networking, software and storage.

As enterprises build AI factories, access to high-quality data is imperative to ensure optimal performance and reliability for AI models. The new NVIDIA-Certified Storage program announced today at the NVIDIA GTC global AI conference validates that enterprise storage systems meet stringent performance and scalability data requirements for AI and high-performance computing workloads.

Leading enterprise data platform and storage providers are already onboard, ensuring businesses have trusted options from day one. These include DDN, Dell Technologies, Hewlett Packard Enterprise, Hitachi Vantara, IBM, NetApp, Nutanix, Pure Storage, VAST Data and WEKA.

Building Blocks for a New Class of Enterprise Infrastructure

At GTC, NVIDIA also announced the NVIDIA AI Data Platform, a customizable reference design to build a new class of enterprise infrastructure for demanding agentic AI workloads.

The NVIDIA-Certified Storage designation is a prerequisite for partners developing agentic AI infrastructure solutions built on the NVIDIA AI Data Platform. Each of these NVIDIA-Certified Storage partners will deliver customized AI data platforms, in collaboration with NVIDIA, that can harness enterprise data to reason and respond to complex queries.

NVIDIA-Certified was created more than four years ago as the industry’s first certification program dedicated to tuning and optimizing AI systems to ensure optimal performance, manageability and scalability. Each NVIDIA-Certified system is rigorously tested and validated to deliver enterprise-grade AI performance.

There are now 50+ partners providing 500+ NVIDIA-Certified systems, helping enterprises reduce time, cost and complexity by giving them a wide selection of performance-optimized systems to power their accelerated computing workloads.

NVIDIA Enterprise Reference Architectures (RAs) were introduced last fall to provide partners with AI infrastructure best practices and configuration guidance for deploying NVIDIA-Certified servers, NVIDIA Spectrum-X networking and NVIDIA AI Enterprise software.

Solutions based on NVIDIA Enterprise RAs are available from the world’s leading systems providers to reduce the time, cost and complexity of enterprise AI deployments. Enterprise RAs are now available for a wide range of NVIDIA Hopper and NVIDIA Blackwell platforms, including NVIDIA HGX B200 systems and the new NVIDIA RTX PRO 6000 Blackwell Server Edition GPU.

These NVIDIA technologies and partner solutions are the building blocks for enterprise AI factories, representing a new class of enterprise infrastructure for high-performance AI deployments at scale.

Enterprise AI Needs Scalable Storage

As the pace of AI innovation and adoption accelerates, secure and reliable access to high-quality enterprise data is becoming more important than ever. Data is the fuel for the AI factory. With enterprise data creation projected to reach 317 zettabytes annually by 2028*, AI workloads require storage architectures built to handle massive, unstructured and multimodal datasets.

NVIDIA’s expanded storage certification program is designed to meet this need and help enterprises build AI factories with a foundation of high-performance, reliable data storage solutions. The program includes performance testing as well as validation that partner storage systems adhere to design best practices, optimizing performance and scalability for enterprise AI workloads.

NVIDIA-Certified Storage will be incorporated into NVIDIA Enterprise RAs, providing enterprise-grade data storage for AI factory deployments with full-stack solutions from global systems partners.

Certified Storage for Every Deployment

This certification builds on existing NVIDIA DGX systems and NVIDIA Cloud Partner (NCP) storage programs, expanding the data ecosystem for AI infrastructure.

These storage certification programs are aligned with their deployment models and architectures:

  • NVIDIA DGX BasePOD and DGX SuperPOD Storage Certification — designed for enterprise AI factory deployments with NVIDIA DGX systems.
  • NCP Storage Certification — designed for large-scale NCP Reference Architecture AI factory deployments with cloud providers.
  • NVIDIA-Certified Storage Certification — designed for enterprise AI factory deployments with NVIDIA-Certified servers available from global partners, based on NVIDIA Enterprise RA guidelines.

With this framework, organizations of all sizes — from cloud hyperscalers to enterprises — can build AI factories that process massive amounts of data, train models faster and drive more accurate, reliable AI outcomes.

Learn more about how NVIDIA-Certified Systems deliver seamless, high-speed performance and attend related sessions at GTC.

*Source: IDC, Worldwide IDC Global DataSphere Forecast, 2024–2028: AI Everywhere, But Upsurge in Data Will Take Time, doc #US52076424, May 2024


From AT&T to the United Nations, AI Agents Redefine Work With NVIDIA AI Enterprise

AI agents are transforming work, delivering time and cost savings by helping people resolve complex challenges in new ways.

Whether developed for humanitarian aid, customer service or healthcare, AI agents built with the NVIDIA AI Enterprise software platform make up a new digital workforce helping professionals accomplish their goals faster — at lower costs and for greater impact.

AI Agents Enable Growth and Education

AI can instantly translate, summarize and process multimodal content in hundreds of languages. Integrated into agentic systems, the technology enables international organizations to engage and educate global stakeholders more efficiently.

The United Nations (UN) is working with Accenture to develop a multilingual research agent, supporting over 150 languages, to promote local economic sustainability. The agent will act like a researcher, answering questions about the UN’s Sustainable Development Goals and fostering awareness and engagement toward its agenda of global peace and prosperity.

Mercy Corps, in collaboration with Cloudera, has deployed an AI-driven Methods Matcher tool that supports humanitarian aid experts in more than 40 countries by providing research, summaries, best-practice guidelines and data-driven crisis responses, providing faster aid delivery in disaster situations.

Wikimedia Deutschland, using the DataStax AI Platform, built with NVIDIA AI, can process and embed 10 million Wikidata items in just three days, with 30x faster ingestion performance.

AI Agents Provide Tailored Customer Service Across Industries

Agentic AI enhances customer service with real-time, highly accurate insights for more effective user experiences. AI agents provide 24/7 support, handling common inquiries with more personalized responses while freeing human agents to address more complex issues.

Intelligent-routing capabilities categorize and prioritize requests so customers can be quickly directed to the right specialists. Plus, AI agents’ predictive-analytics capabilities enable proactive support by anticipating issues and empowering human agents with data-driven insights.

Companies across industries including telecommunications, finance, healthcare and sports are already tapping into AI agents to achieve massive benefits.

AT&T, in collaboration with Quantiphi, developed and deployed a new Ask AT&T AI agent to its call center, leading to an 84% decrease in call center analytics costs.

Southern California Edison, working with WWT, is driving Project Orca to enhance data processing and predictions for 100,000+ network assets using agents to reduce downtime, enhance network reliability and enable faster, more efficient ticket resolution.

With the adoption of ServiceNow Dispute Management, built with Visa, banks can use AI agents with the solution to achieve up to a 28% reduction in call center volumes and a 30% decrease in time to resolution.

The Ottawa Hospital, working with Deloitte, deployed a team of 24/7 patient-care agents to provide preoperative support and answer patient questions regarding upcoming procedures for over 1.2 million people in eastern Ontario, Canada.

With the VAST Data Platform, the National Hockey League can unlock over 550,000 hours of historical game footage. This supports sponsorship analysis, helps video producers quickly create broadcast clips and enhances personalized fan content.

State-of-the-Art AI Agents Built With NVIDIA AI Enterprise

AI agents have emerged as versatile tools that can be adapted and adopted across a wide range of industries. These agents connect to organizational knowledge bases to understand the business context they’re deployed in. Their core functionalities — such as question-answering, translation, data processing, predictive analytics and automation — can be tailored by any organization, in any industry, to improve productivity and save time and costs.

NVIDIA AI Enterprise provides the building blocks for enterprise AI agents. It includes NVIDIA NIM microservices for efficient inference of state-of-the-art models — including the new NVIDIA Llama Nemotron reasoning model family — and NVIDIA NeMo tools to streamline data processing, model customization, system evaluation, retrieval-augmented generation and guardrailing.

NVIDIA Blueprints are reference workflows that showcase best practices for developing high-performance agentic systems. With the AI-Q NVIDIA AI Blueprint, developers can build AI agents into larger agentic systems that can reason, then connect these systems to enterprise data to tackle complex problems, harness other tools, collaborate and operate with greater autonomy.

Learn more about AI agent development by watching the NVIDIA GTC keynote and register for sessions from NVIDIA and industry leaders at the show, which runs through March 21.

See notice regarding software product information.


NVIDIA Aerial Expands With New Tools for Building AI-Native Wireless Networks

The telecom industry is increasingly embracing AI to deliver seamless connections — even in conditions of poor signal strength — while maximizing sustainability and spectral efficiency, the amount of information that can be transmitted per unit of bandwidth.

Advancements in AI-RAN technology have set the course toward AI-native wireless networks for 6G, built using AI and accelerated computing from the start, to meet the demands of billions of AI-enabled connected devices, sensors, robots, cameras and autonomous vehicles.

To help developers and telecom leaders pioneer these networks, NVIDIA today unveiled new tools in the NVIDIA Aerial Research portfolio.

The expanded portfolio of solutions includes the Aerial Omniverse Digital Twin on NVIDIA DGX Cloud, the Aerial Commercial Test Bed on NVIDIA MGX, the NVIDIA Sionna 1.0 open-source library and the Sionna Research Kit on NVIDIA Jetson, helping accelerate AI-RAN and 6G research.

Industry leaders like Amdocs, Ansys, Capgemini, DeepSig, Fujitsu, Keysight, Kyocera, MathWorks, MediaTek, Samsung Research, SoftBank and VIAVI Solutions, along with more than 150 higher education and research institutions from the U.S. and around the world — including Northeastern University, Rice University, The University of Texas at Austin, ETH Zurich, the Fraunhofer Institute for Telecommunications, Heinrich Hertz Institute (HHI), Singapore University of Technology and Design, and the University of Oulu — are harnessing the NVIDIA Aerial Research portfolio to develop, train, simulate and deploy groundbreaking AI-native wireless innovations.

New Tools for Research and Development

The Aerial Research portfolio provides exceptional flexibility and ease of use for developers at every stage of their research — from early experimentation to commercial deployment. Its offerings include:

  • Aerial Omniverse Digital Twin (AODT): A simulation platform for testing and fine-tuning algorithms in physically precise digital replicas of entire wireless systems, now available on NVIDIA DGX Cloud. Developers can access AODT anywhere: on premises, on laptops, via the public cloud or on an NVIDIA cloud service.
  • Aerial Commercial Test Bed (aka ARC-OTA): A full-stack AI-RAN deployment system that enables developers to deploy new AI models over the air and test them in real time, now available on NVIDIA MGX and available through manufacturers including Supermicro or as a managed offering via Sterling Skywave. ARC-OTA integrates commercial-grade Aerial CUDA-accelerated RAN software with open-source L2+ and 5G core from OpenAirInterface (OAI) and O-RAN-compliant 7.2 split open radio units from WNC and LITEON Technology to enable an end-to-end system for AI-RAN commercial testing.
  • Sionna 1.0: The most widely used GPU-accelerated open-source library for research in communication systems, with more than 135,000 downloads. The latest release of Sionna features a lightning-fast ray tracer for radio propagation, a versatile link-level simulator and new system-level simulation capabilities.
  • Sionna Research Kit: Powered by the NVIDIA Jetson platform, it integrates accelerated computing for AI and machine learning workloads and a software-defined RAN built on OAI. With the kit, researchers can connect 5G equipment and begin prototyping AI-RAN algorithms for next-generation wireless networks in just a few hours.
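To make the link-level simulation idea concrete, here is a toy Monte Carlo bit-error-rate estimate for BPSK over an AWGN channel, in plain Python. This is an illustrative sketch of the kind of loop Sionna accelerates, not Sionna's actual API; the real library runs such simulations vectorized on GPUs with far richer channel and transceiver models.

```python
import math
import random

def bpsk_awgn_ber(ebno_db, n_bits=100_000, seed=0):
    """Estimate BPSK bit-error rate over an AWGN channel by Monte Carlo."""
    rng = random.Random(seed)
    ebno = 10 ** (ebno_db / 10)          # Eb/N0 from dB to linear
    sigma = math.sqrt(1 / (2 * ebno))    # noise std for unit-energy symbols
    errors = 0
    for _ in range(n_bits):
        bit = rng.randint(0, 1)
        symbol = 1.0 if bit else -1.0    # BPSK mapping: 1 -> +1, 0 -> -1
        received = symbol + rng.gauss(0, sigma)
        errors += (received > 0) != bool(bit)  # hard-decision error count
    return errors / n_bits
```

At 4 dB Eb/N0 the estimate should land near the theoretical value Q(sqrt(2 Eb/N0)), roughly 0.0125.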

NVIDIA Aerial Research Ecosystem for AI-RAN and 6G

The NVIDIA Aerial Research portfolio includes the NVIDIA 6G Developer Program, an open community that serves more than 2,000 members, representing leading technology companies, academia, research institutions and telecom operators using NVIDIA technologies to complement their AI-RAN and 6G research.

Testing and simulation will play an essential role in developing AI-native wireless networks. Companies such as Amdocs, Ansys, Keysight, MathWorks and VIAVI are enhancing their simulation solutions with NVIDIA AODT, while operators have created digital twins of their radio access networks to optimize performance with changing traffic scenarios.

Nine of the 10 demonstrations chosen by the AI-RAN Alliance for Mobile World Congress were developed using the NVIDIA Aerial Research portfolio, leading to breakthrough results.

SoftBank and Fujitsu demonstrated an up to 50% throughput gain in poor radio environments using AI-based uplink channel interpolation.
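For background, the classical baseline that AI-based channel interpolation improves on is simple interpolation of channel estimates between pilot subcarriers. The sketch below is a toy illustration of that baseline, not the SoftBank/Fujitsu implementation, whose details are not public here:

```python
def interpolate_channel(pilot_pos, pilot_est, n_subcarriers):
    """Linearly interpolate channel estimates from pilot subcarriers to all
    subcarriers. An AI-based interpolator replaces this rule with a learned
    model that copes better with noisy, fast-fading channels."""
    est = []
    for k in range(n_subcarriers):
        if k <= pilot_pos[0]:
            est.append(pilot_est[0])      # hold flat below the first pilot
        elif k >= pilot_pos[-1]:
            est.append(pilot_est[-1])     # hold flat above the last pilot
        else:
            # Find the pilot interval [lo, hi] containing subcarrier k.
            i = max(j for j in range(len(pilot_pos)) if pilot_pos[j] <= k)
            lo, hi = pilot_pos[i], pilot_pos[i + 1]
            t = (k - lo) / (hi - lo)
            est.append((1 - t) * pilot_est[i] + t * pilot_est[i + 1])
    return est
```

Real channel estimates are complex-valued; the same function works unchanged with Python complex numbers. Replacing the linear rule with a model trained on measured channels is where gains of the kind reported above come from.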

DeepSig developed OmniPHY, an AI-native air interface that eliminates traditional pilot overhead, harnessing neural networks to achieve up to 70% throughput gains in certain scenarios. Using the NVIDIA AI Aerial platform, OmniPHY integrates machine learning into modulation, reception and demodulation to optimize spectral efficiency, reduce power consumption and enhance wireless network performance.

“AI-native signal processing is transforming wireless networks, delivering real-world results,” said Jim Shea, cofounder and CEO of DeepSig. “By integrating deep learning to the air interface and leveraging NVIDIA’s tools, we’re redefining how AI-native wireless networks are designed and built.”

Beyond the Aerial Research portfolio, developers can tap the open ecosystem of NVIDIA CUDA-X libraries, built on CUDA, to create applications that deliver dramatically higher performance.

Join the NVIDIA 6G Developer Program to access NVIDIA Aerial Research platform tools.

See notice regarding software product information.

Telecom Leaders Call Up Agentic AI to Improve Network Operations

Global telecommunications networks can support millions of user connections per day, generating more than 3,800 terabytes of data per minute on average.

That massive, continuous flow of data generated by base stations, routers, switches and data centers — including network traffic information, performance metrics, configuration and topology — is unstructured and complex. Not surprisingly, traditional automation tools have often fallen short in handling massive, real-time workloads involving such data.

To help address this challenge, NVIDIA today announced at the GTC global AI conference that its partners are developing new large telco models (LTMs) and AI agents custom-built for the telco industry using NVIDIA NIM and NeMo microservices within the NVIDIA AI Enterprise software platform. These LTMs and AI agents enable the next generation of AI in network operations.

LTMs — customized, multimodal large language models (LLMs) trained specifically on telco network data — are core elements in the development of network AI agents, which automate complex decision-making workflows, improve operational efficiency, boost employee productivity and enhance network performance.

SoftBank and Tech Mahindra have built new LTMs and AI agents, while Amdocs, BubbleRAN and ServiceNow are dialing up their network operations and optimization with new AI agents, all using NVIDIA AI Enterprise.

It’s important work at a time when 40% of respondents in a recent NVIDIA-run telecom survey said they’re deploying AI in their network planning and operations.

LTMs Understand the Language of Networks

Just as LLMs understand and generate human language, and NVIDIA BioNeMo NIM microservices understand the language of biological data for drug discovery, LTMs now enable AI agents to master the language of telecom networks.

The new partner-developed LTMs powered by NVIDIA AI Enterprise are:

  • Specialized in network intelligence — the LTMs can understand real-time network events, predict failures and automate resolutions.
  • Optimized for telco workloads — tapping into NVIDIA NIM microservices, the LTMs are optimized for efficiency, accuracy and low latency.
  • Suited for continuous learning and adaptation — with post-training scalability, the LTMs can use NVIDIA NeMo to learn from new events, alerts and anomalies to enhance future performance.

NVIDIA AI Enterprise provides additional tools and blueprints to build AI agents that simplify network operations and deliver cost savings and operational efficiency, while improving network key performance indicators (KPIs), such as:

  • Reduced downtime — AI agents can predict failures before they happen, delivering network resilience.
  • Improved customer experiences — AI-driven optimizations lead to faster networks, fewer outages and seamless connectivity.
  • Enhanced security — by continuously scanning for threats, AI can help mitigate cyber risks in real time.
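As a rough illustration of the failure-prediction idea in the list above, a minimal statistical baseline flags any KPI sample that deviates sharply from its recent history. This rolling z-score sketch is a stand-in for the learned models an LTM-driven agent would actually use; all names here are invented for the example:

```python
from statistics import mean, stdev

def flag_anomalies(kpi_series, window=10, threshold=3.0):
    """Return indices whose value lies more than `threshold` standard
    deviations from the mean of the preceding `window` samples."""
    anomalies = []
    for i in range(window, len(kpi_series)):
        history = kpi_series[i - window:i]
        mu, sigma = mean(history), stdev(history)
        if sigma > 0 and abs(kpi_series[i] - mu) > threshold * sigma:
            anomalies.append(i)
    return anomalies

# Example: a latency spike after ten ordinary samples is flagged at index 10.
print(flag_anomalies([1, 2] * 5 + [50]))  # prints [10]
```

An agent built on an LTM goes well beyond this: it correlates anomalies across metrics, predicts failures before thresholds are crossed and proposes or executes a resolution.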

Industry Leaders Launch LTMs and AI Agents

Leading companies across telecommunications are using NVIDIA AI Enterprise to advance their latest technologies.

SoftBank has developed a new LTM based on a large-scale LLM base model, trained on its own network data. Initially focused on network configuration, the model — which is available as an NVIDIA NIM microservice — can automatically reconfigure the network to adapt to changes in network traffic, including during mass events at stadiums and other venues. SoftBank is also introducing network agent blueprints to help accelerate AI adoption across telco operations.

Tech Mahindra has developed an LTM with NVIDIA agentic AI tools to help address critical network operations. Tapping into this LTM, the company’s Adaptive Network Insights Studio provides a 360-degree view of network issues, generating automated reports at various levels of detail to inform and assist IT teams, network engineers and company executives.

In addition, Tech Mahindra’s Proactive Network Anomaly Resolution Hub is powered by the LTM to automatically resolve a significant portion of its network events, lightening engineers’ workloads and enhancing their productivity.

Amdocs’ Network Assurance Agent, powered by amAIz Agents, automates repetitive tasks such as fault prediction. It also conducts impact analysis and prevention methods for network issues, providing step-by-step guidance on resolving any problems that occur. Plus, the company’s Network Deployment Agent simplifies open radio access network (RAN) adoption by automating integration, deployment tasks and interoperability testing, and providing insights to network engineers.

BubbleRAN is developing an autonomous multi-agent RAN intelligence platform on a cloud-native infrastructure, where LTMs can observe the network state, configuration, availability and KPIs to facilitate monitoring and troubleshooting. The platform also automates the process of network reconfiguration and policy enforcement through a high-level set of action tools. The company’s AI agents satisfy user needs by tapping into advanced retrieval-augmented generation pipelines and telco-specific application programming interfaces, answering real-time, 5G deployment-specific questions.

ServiceNow’s AI agents in telecom — built with NVIDIA AI Enterprise on NVIDIA DGX Cloud — drive productivity by generating resolution playbooks and predicting potential network disruptions before they occur. This helps communications service providers reduce resolution time and improve customer satisfaction. The new, ready-to-use AI agents also analyze network incidents, identifying root causes of disruptions so they can be resolved faster and avoided in the future.

Learn more about the latest agentic AI advancements at NVIDIA GTC, running through Friday, March 21, in San Jose, California.
