Ready to Reason: Storage Leaders Build Infrastructure to Fuel AI Agents With NVIDIA AI Data Platform

The world’s leading storage and server manufacturers are combining their design and engineering expertise with the NVIDIA AI Data Platform — a customizable reference design for building a new class of AI infrastructure — to provide systems that enable a new generation of agentic AI applications and tools.

The reference design is now being harnessed by storage system leaders globally to support AI reasoning agents and unlock the value of information stored in the millions of documents, videos and PDFs enterprises use.

NVIDIA-Certified Storage partners DDN, Dell Technologies, Hewlett Packard Enterprise, Hitachi Vantara, IBM, NetApp, Nutanix, Pure Storage, VAST Data and WEKA are introducing products and solutions built on the NVIDIA AI Data Platform, which includes NVIDIA accelerated computing, networking and software.

In addition, AIC, ASUS, Foxconn, Quanta Cloud Technology, Supermicro, Wistron and other original design manufacturers (ODMs) are developing new storage and server hardware platforms that support the NVIDIA reference design. These platforms feature NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs, NVIDIA BlueField DPUs and NVIDIA Spectrum-X Ethernet networking, and are optimized to run NVIDIA AI Enterprise software.

Such integrations allow enterprises across industries to quickly deploy storage and data platforms that scan, index, classify and retrieve large stores of private and public documents in real time. This augments AI agents as they reason and plan to solve complex, multistep problems.

Building agentic AI infrastructure with these new AI Data Platform-based solutions can help enterprises turn data into actionable knowledge using retrieval-augmented generation (RAG) software, including NVIDIA NeMo Retriever microservices and the AI-Q NVIDIA Blueprint.
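To make that flow concrete, below is a minimal retrieval-augmented generation sketch in Python. It assumes OpenAI-compatible embedding and chat endpoints, such as those exposed by locally hosted NIM microservices; the URLs, model names, sample documents and helper functions are illustrative assumptions rather than details from this article.

```python
# Minimal RAG sketch: embed documents, retrieve the best match, ground the answer.
# Endpoint URLs and model names are illustrative assumptions, not from the article.
import numpy as np
from openai import OpenAI

embed_client = OpenAI(base_url="http://localhost:8001/v1", api_key="none")  # hypothetical embedding NIM
chat_client = OpenAI(base_url="http://localhost:8000/v1", api_key="none")   # hypothetical LLM NIM

docs = [
    "Q3 revenue grew 12% year over year, driven by data center sales.",
    "The warranty policy covers hardware defects for three years.",
]

def embed(texts):
    # Returns one embedding vector per input text.
    resp = embed_client.embeddings.create(model="nvidia/nv-embedqa-e5-v5", input=texts)
    return np.array([d.embedding for d in resp.data])

doc_vecs = embed(docs)

def answer(question):
    q_vec = embed([question])[0]
    # Cosine similarity picks the most relevant document as context.
    scores = doc_vecs @ q_vec / (np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q_vec))
    context = docs[int(scores.argmax())]
    resp = chat_client.chat.completions.create(
        model="meta/llama-3.1-70b-instruct",
        messages=[
            {"role": "system", "content": f"Answer using only this context: {context}"},
            {"role": "user", "content": question},
        ],
    )
    return resp.choices[0].message.content

print(answer("How long is the warranty?"))
```

In a production deployment the retrieval step would be handled by indexing and retrieval services rather than an in-memory loop, but the shape of the pipeline is the same: embed, retrieve, then generate against the retrieved context.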

Storage systems built with the NVIDIA AI Data Platform reference design turn data into knowledge, boosting agentic AI accuracy across many use cases. This can help AI agents and customer service representatives provide quicker, more accurate responses.

With more access to data, agents can also generate interactive summaries of complex documents — and even videos — for researchers of all kinds. Plus, they can assist cybersecurity teams in keeping software secure.

Leading Storage Providers Showcase AI Data Platform to Power Agentic AI

Storage system leaders play a critical role in providing the AI infrastructure that runs AI agents.

Embedding NVIDIA GPUs, networking and NIM microservices closer to storage enhances AI queries by bringing compute closer to critical content. Storage providers can integrate their document-security and access-control expertise into content-indexing and retrieval processes, improving security and data privacy compliance for AI inference.

Data platform leaders such as IBM, NetApp and VAST Data are using the NVIDIA reference design to scale their AI technologies.

IBM Fusion, a hybrid cloud platform for running virtual machines, Kubernetes and AI workloads on Red Hat OpenShift, offers content-aware storage services that unlock the meaning of unstructured enterprise data, enhancing inferencing so AI assistants and agents can deliver better, more relevant answers. Content-aware storage enables faster time to insights for AI applications using RAG when combined with NVIDIA GPUs, NVIDIA networking, the AI-Q NVIDIA Blueprint and NVIDIA NeMo Retriever microservices — all part of the NVIDIA AI Data Platform.

NetApp is advancing enterprise storage for agentic AI with the NetApp AIPod solution built with the NVIDIA reference design. NetApp incorporates NVIDIA GPUs in data compute nodes to run NVIDIA NeMo Retriever microservices and connects these nodes to scalable storage with NVIDIA networking.

VAST Data is embedding NVIDIA AI-Q with the VAST Data Platform to deliver a unified, AI-native infrastructure for building and scaling intelligent multi-agent systems. With high-speed data access, enterprise-grade security and continuous learning loops, organizations can now operationalize agentic AI systems that drive smarter decisions, automate complex workflows and unlock new levels of productivity.

ODMs Innovate on AI Data Platform Hardware

Offering their extensive experience with server and storage design and manufacturing, ODMs are working with storage system leaders to more quickly bring innovative AI Data Platform hardware to enterprises.

ODMs provide the chassis design, GPU integration, cooling innovation and storage media connections needed to build AI Data Platform servers that are reliable, compact, energy efficient and affordable.

Manufacturers based or colocated in Taiwan account for a large share of the ODM industry, making the region a crucial hub for the hardware that runs scalable agentic AI, inference and AI reasoning.

AIC, based in Taoyuan City, Taiwan, is building flash storage servers, powered by NVIDIA BlueField DPUs, that enable higher throughput and greater power efficiency than traditional storage designs. These arrays are deployed in many AI Data Platform-based designs.

ASUS partnered with WEKA and IBM to showcase a next-generation unified storage system for AI and high-performance computing workloads, addressing a broad spectrum of storage needs. The RS501A-E12-RS12U, a WEKA-certified software-defined storage solution, overcomes traditional hardware limitations to deliver exceptional flexibility — supporting file, object and block storage, as well as all-flash, tiering and backup capabilities.

Foxconn, based in New Taipei City, builds many of the manufacturing industry’s accelerated servers and storage platforms used for AI Data Platform solutions. Its subsidiary Ingrasys offers NVIDIA-accelerated GPU servers that support the AI Data Platform.

Supermicro is using the reference design to build its intelligent all-flash storage arrays powered by the NVIDIA Grace CPU Superchip or BlueField-3 DPU. The Supermicro Petascale JBOF and Petascale All-Flash Array Storage Server deliver high performance and power efficiency when paired with software-defined storage vendors' software, and support AI Data Platform solutions.

Quanta Cloud Technology, also based in Taiwan, is designing and building accelerated server and storage appliances that include NVIDIA GPUs and networking. They’re well-suited to run NVIDIA AI Enterprise software and support AI Data Platform solutions.

Taipei-based Wistron and Wiwynn offer innovative hardware designs compatible with the AI Data Platform, incorporating NVIDIA GPUs, NVIDIA BlueField DPUs and NVIDIA Ethernet SuperNICs for accelerated compute and data movement.

Learn more about the latest agentic AI advancements at NVIDIA GTC Taipei, running May 21-22 at COMPUTEX.

Exploring the Revenue-Generating Potential of AI Factories

AI is creating value for everyone — from researchers in drug discovery to quantitative analysts navigating financial market changes.

The faster an AI system can produce tokens, the units of data used to string together outputs, the greater its impact. That’s why AI factories are key: they provide the most efficient path from “time to first token” to “time to first value.”

AI factories are redefining the economics of modern infrastructure. They produce intelligence by transforming data into valuable outputs — whether tokens, predictions, images, proteins or other forms — at massive scale.

They help enhance three key aspects of the AI journey — data ingestion, model training and high-volume inference. AI factories are being built to generate tokens faster and more accurately, using three critical technology stacks: AI models, accelerated computing infrastructure and enterprise-grade software.

Read on to learn how AI factories are helping enterprises and organizations around the world convert the most valuable digital commodity — data — into revenue potential.

From Inference Economics to Value Creation

Before building an AI factory, it’s important to understand the economics of inference — how to balance costs, energy efficiency and an increasing demand for AI.

Throughput refers to the volume of tokens that a model can produce. Latency is the time a model takes to respond, often measured in time to first token — how long it takes before the first output appears — and time per output token, or how fast each additional token comes out. Goodput is a newer metric, measuring how much useful output a system can deliver while hitting key latency targets.
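As a rough illustration of how these metrics fit together, the sketch below derives time to first token, time per output token, throughput and a simple goodput figure from per-request timings; the timing values and latency target are assumptions made up for the example.

```python
# Toy calculation of inference metrics from per-request timings (illustrative values).
requests = [
    # request start, first-token time, finish time, tokens generated (seconds, counts)
    {"start": 0.00, "first_token": 0.35, "end": 2.10, "tokens": 128},
    {"start": 0.00, "first_token": 1.40, "end": 4.00, "tokens": 256},
]

LATENCY_TARGET_S = 0.5  # assumed service-level target for time to first token

total_tokens = sum(r["tokens"] for r in requests)
wall_clock = max(r["end"] for r in requests) - min(r["start"] for r in requests)
throughput = total_tokens / wall_clock  # tokens per second across the whole system

for r in requests:
    ttft = r["first_token"] - r["start"]                       # time to first token
    tpot = (r["end"] - r["first_token"]) / (r["tokens"] - 1)   # time per output token
    print(f"TTFT={ttft:.2f}s  TPOT={tpot * 1000:.1f} ms/token")

# Goodput: only tokens from requests that met the latency target count as useful output.
useful = sum(r["tokens"] for r in requests if r["first_token"] - r["start"] <= LATENCY_TARGET_S)
print(f"throughput={throughput:.1f} tok/s, goodput={useful / wall_clock:.1f} tok/s")
```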

User experience is key for any software application, and the same goes for AI factories. High throughput means smarter AI, and lower latency ensures timely responses. When both of these measures are balanced properly, AI factories can provide engaging user experiences by quickly delivering helpful outputs.

For example, an AI-powered customer service agent that responds in half a second is far more engaging and valuable than one that responds in five seconds, even if both ultimately generate the same number of tokens in the answer.

Companies can take the opportunity to place competitive prices on their inference output, resulting in more revenue potential per token.

Measuring and visualizing this balance can be difficult — which is where the concept of a Pareto frontier comes in.

AI Factory Output: The Value of Efficient Tokens

The Pareto frontier, represented in the figure below, helps visualize the most optimal ways to balance trade-offs between competing goals — like faster responses vs. serving more users simultaneously — when deploying AI at scale.

The vertical axis represents throughput efficiency, measured in tokens per second (TPS), for a given amount of energy used. The higher this number, the more requests an AI factory can handle concurrently.

The horizontal axis represents TPS for a single user: how quickly the model delivers tokens to an individual, which determines how fast a response appears. The higher the value, the better the expected user experience. Lower latency and faster response times are generally desirable for interactive applications like chatbots and real-time analysis tools.

The Pareto frontier’s maximum value — shown as the top value of the curve — represents the best output for given sets of operating configurations. The goal is to find the optimal balance between throughput and user experience for different AI workloads and applications.
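As a small illustration of the underlying idea, the sketch below keeps only configurations that no other configuration beats on both per-user speed and total system throughput; the configuration names and numbers are invented for the example.

```python
# Find Pareto-optimal deployment configurations (illustrative, made-up measurements).
configs = [
    {"name": "cfg-a", "tps_per_user": 20, "tps_per_system": 9000},
    {"name": "cfg-b", "tps_per_user": 60, "tps_per_system": 7000},
    {"name": "cfg-c", "tps_per_user": 40, "tps_per_system": 6500},  # dominated by cfg-b
    {"name": "cfg-d", "tps_per_user": 120, "tps_per_system": 3000},
]

def dominated(c, others):
    # c is dominated if some other config is at least as good on both axes and strictly better on one.
    return any(
        o is not c
        and o["tps_per_user"] >= c["tps_per_user"]
        and o["tps_per_system"] >= c["tps_per_system"]
        and (o["tps_per_user"] > c["tps_per_user"] or o["tps_per_system"] > c["tps_per_system"])
        for o in others
    )

frontier = [c for c in configs if not dominated(c, configs)]
for c in sorted(frontier, key=lambda c: c["tps_per_user"]):
    print(c["name"], c["tps_per_user"], "tok/s/user,", c["tps_per_system"], "tok/s total")
```

Every configuration that survives this filter sits on the frontier; the deployment team then picks the point on that curve that matches the latency and concurrency needs of the workload.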

The best AI factories use accelerated computing to increase tokens per watt — optimizing AI performance while dramatically increasing energy efficiency across AI factories and applications.

The animation above compares user experience when running on NVIDIA H100 GPUs configured to run at 32 tokens per second per user, versus NVIDIA B300 GPUs running at 344 tokens per second per user. At the configured user experience, Blackwell Ultra delivers over a 10x better experience and almost 5x higher throughput, enabling up to 50x higher revenue potential.

How an AI Factory Works in Practice

An AI factory is a system of components that come together to turn data into intelligence. It doesn’t necessarily take the form of a high-end, on-premises data center, but could be an AI-dedicated cloud or hybrid model running on accelerated compute infrastructure. Or it could be a telecom infrastructure that can both optimize the network and perform inference at the edge.

Any dedicated accelerated computing infrastructure paired with software turning data into intelligence through AI is, in practice, an AI factory.

The components include accelerated computing, networking, software, storage, systems, and tools and services.

When a person prompts an AI system, the full stack of the AI factory goes to work. The factory tokenizes the prompt, turning data into small units of meaning — like fragments of images, sounds and words.
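For a sense of what tokenization looks like in code, the snippet below uses the open-source tiktoken library with a common, publicly available encoding; it is a generic illustration, not the tokenizer of any particular model discussed here.

```python
# Tokenize a prompt into integer token IDs, then map them back to text fragments.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # a common, publicly available encoding
prompt = "Summarize last quarter's sales performance."
token_ids = enc.encode(prompt)

print(token_ids)                             # the list of integer token IDs
print([enc.decode([t]) for t in token_ids])  # the text fragment behind each token
```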

Each token is put through a GPU-powered AI model, which performs compute-intensive reasoning to generate the best response. Each GPU performs parallel processing — enabled by high-speed networking and interconnects — to crunch data simultaneously.

An AI factory will run this process for different prompts from users across the globe. This is real-time inference, producing intelligence at industrial scale.

Because AI factories unify the full AI lifecycle, this system is continuously improving: inference is logged, edge cases are flagged for retraining and optimization loops tighten over time — all without manual intervention, an example of goodput in action.

Leading global security technology company Lockheed Martin has built its own AI factory to support diverse uses across its business. Through its Lockheed Martin AI Center, the company centralized its generative AI workloads on the NVIDIA DGX SuperPOD to train and customize AI models, use the full power of specialized infrastructure and reduce the overhead costs of cloud environments.

“With our on-premises AI factory, we handle tokenization, training and deployment in house,” said Greg Forrest, director of AI foundations at Lockheed Martin. “Our DGX SuperPOD helps us process over 1 billion tokens per week, enabling fine-tuning, retrieval-augmented generation or inference on our large language models. This solution avoids the escalating costs and significant limitations of fees based on token usage.”

NVIDIA Full-Stack Technologies for AI Factory

An AI factory transforms AI from a series of isolated experiments into a scalable, repeatable and reliable engine for innovation and business value.

NVIDIA provides all the components needed to build AI factories, including accelerated computing, high-performance GPUs, high-bandwidth networking and optimized software.

NVIDIA Blackwell GPUs, for example, can be connected via networking, liquid-cooled for energy efficiency and orchestrated with AI software.

The NVIDIA Dynamo open-source inference platform offers an operating system for AI factories. It’s built to accelerate and scale AI with maximum efficiency and minimum cost. By intelligently routing, scheduling and optimizing inference requests, Dynamo ensures that every GPU cycle is fully utilized, driving token production at peak performance.

NVIDIA Blackwell GB200 NVL72 systems and NVIDIA InfiniBand networking are tailored to maximize token throughput per watt, making the AI factory highly efficient from both total throughput and low latency perspectives.

By validating optimized, full-stack solutions, organizations can build and maintain cutting-edge AI systems efficiently. A full-stack AI factory supports enterprises in achieving operational excellence, enabling them to harness AI’s potential faster and with greater confidence.

Learn more about how AI factories are redefining data centers and enabling the next era of AI.

Time to Slay: ‘DOOM: The Dark Ages’ Looms on GeForce NOW

Steel clashes and war drums thunder as a new age of battle dawns — one that will test even the mightiest Slayer.

This GFN Thursday, DOOM: The Dark Ages — the bold medieval-inspired prequel to DOOM and DOOM Eternal — is available for GeForce NOW premium members, aka Ultimate and Performance members, to stream from the cloud at launch. Premium members can also slay in style with a free in-game reward.

The stage is set and the crowd is buzzing — Capcom: Fighting Collection 2 is joining GeForce NOW at launch.

Plus, get ready to take to the skies with Microsoft Flight Simulator 2024 coming to the cloud this week.

And catch the latest GeForce NOW updates rolling out to members starting this week. The updates include quality-of-life improvements, following performance enhancements like 120 frames-per-second streaming for SHIELD TV to keep the cloud gaming experience at its best.

It’s all part of another thrilling GFN Thursday, with five new games joining the cloud.

Stand and Fight

DOOM The Dark Ages on GeForce NOW
Keep your friends close and your enemies closer.

DOOM: The Dark Ages is a dark fantasy and sci-fi single-player experience that delivers the searing combat and over-the-top visuals of the DOOM franchise, powered by the latest idTech engine.

As the super weapon of gods and kings, shred enemies with devastating favorites like the Super Shotgun while wielding a variety of new bone-chewing weapons, including the versatile Shield Saw. Players will stand and fight on the demon-infested battlefields in the vicious, grounded combat the original DOOM is famous for. Take flight atop the new fierce Mecha Dragon, stand tall in a massive Atlan mech and beat demons to a pulp with the newly enhanced glory kill system. Only the Slayer has the power to wield these devastating tools of mayhem.

Experience every gory detail, thunderous shield bash and demon-splitting kill in the cloud. No downloads, no waiting — just pure, uninterrupted DOOM action, wherever members want to play.

DOOM reward on GeForce NOW
SHIELD your eyes.

GeForce NOW Ultimate or Performance members can now claim the DOOM Slayer Verdant skin reward, a fierce, ruthless-looking armor set that’s built for relentless slaughter. Those who’ve opted in to GeForce NOW’s Rewards program can check their email for instructions on how to redeem it. It’s available through Sunday, June 15, first come, first served.

Step Into the Ring

Capcom Fighting Collection 2
The fight continues.

Capcom’s new fighting collection hits the stage — and the cloud.

Choose from fan favorites like Capcom vs. SNK 2: Mark of the Millennium 2001 and Project Justice, as well as 3D action titles like Power Stone and Power Stone 2 in this collection of eight classic fighting games. Each can be played online or in co-op mode. Get back in the ring and duke it out in battles that everyone rumored but no one believed.

Chase victory by streaming on GeForce NOW. Ultimate and Performance members enjoy higher resolutions and lower latency compared with free users for a true cloud-gaming edge.

Game On

Streaming from a powerful GeForce RTX gaming rig in the cloud enables GeForce NOW to deliver continuous improvements and new features that enhance members’ streaming experiences. This week, update 2.0.74 is rolling out, bringing several enhancements to the cloud.

Members will see an upgraded library syncing feature for those using PC game subscription services like PC Game Pass and Ubisoft+, making it even easier to jump into games. Supported titles for these game services will now be automatically added to members’ “My Library” after resyncing their Ubisoft, Battle.net and Xbox connected accounts in the GeForce NOW app.

This update follows the recent performance boost for SHIELD TV users in SHIELD Experience 9.2.1, now supporting up to 120 fps 1080p streaming for GeForce NOW Ultimate members. Those who prefer higher resolution over frame rates can continue streaming at up to 4K 60 fps.

With such ongoing updates, GeForce NOW is making cloud gaming more seamless and accessible across devices.

Fly Your Way

Microsoft Flight Simulator 2024 on GeForce NOW
Fly anywhere with the cloud.

GeForce NOW brings a groundbreaking aviation experience to the cloud with Microsoft Flight Simulator 2024. Members can experience the game that redefines aviation simulation with unparalleled realism and global exploration.

Pursue dynamic aviation careers through missions like Medevac, Search and Rescue, and Aerial Firefighting. Plus, compete in thrilling events such as the Red Bull Air Races. The game introduces advanced physics, enhanced aircraft systems and a groundbreaking flight planner for immersive gameplay. Explore an exceptionally detailed digital recreation of Earth, featuring handcrafted airports, landmarks, dynamic biomes, and real-time air and maritime traffic.

With stunning visuals, diverse wildlife and realistic weather systems, Microsoft Flight Simulator 2024 offers unmatched experiences for pilots and adventurers. Ultimate and Performance members can play with GeForce RTX 4080-level performance with the highest frame rates and lowest latency. Ultimate members can elevate their adventures at up to 4K resolution and 120 fps for the most immersive rides in the sky.

Fired Up for New Games

Blacksmith Master on GeForce NOW
It’s hammer time.

Manage a medieval forge in Blacksmith Master, launching this week in the cloud. Find and hire the best staff and equip them with the right tools to optimize the business and train their skills over time. Design the shop for the best throughput, fulfill orders from across the kingdom to unlock new capabilities, and seek out new opportunities in the market as customers come looking for a variety of historically inspired items — from weapons and armor to tools and cooking utensils. Perfect the craft to become the Blacksmith Master.

Look for the following games available to stream in the cloud this week:

  • The Precinct (New release on Steam, May 13)
  • Blacksmith Master (New release on Steam, May 15)
  • Capcom Fighting Collection 2 (New release on Steam, May 15)
  • DOOM: The Dark Ages (New release on Steam, Battle.net and Xbox, available on PC Game Pass, May 15)
  • Microsoft Flight Simulator 2024 (Steam and Xbox, available on PC Game Pass)

What are you planning to play this weekend? Let us know on X or in the comments below.

Into the Omniverse: Computational Fluid Dynamics Simulation Finds Smoothest Flow With AI-Driven Digital Twins

Editor’s note: This post is part of Into the Omniverse, a series focused on how developers, 3D practitioners and enterprises can transform their workflows using the latest advances in OpenUSD and NVIDIA Omniverse.

Computer-aided engineering (CAE) is at the forefront of modern product development, enabling engineers to virtually test and refine designs before building physical prototypes. Among the powerful CAE methods, computational fluid dynamics (CFD) simulation plays a critical role in understanding and optimizing fluid flow for use cases, such as aerodynamic testing in aerospace and automotive engineering or thermal management for electronics.

The NVIDIA Omniverse Blueprint for real-time digital twins provides a powerful framework for developers to build complex CFD simulation solutions with the combined power of NVIDIA CUDA-X acceleration libraries, NVIDIA PhysicsNeMo AI framework and NVIDIA Omniverse, and Universal Scene Description (OpenUSD).

Multiphysics simulation generates a high diversity of data with optical, thermal, electromagnetic and mechanical applications, all requiring different inputs and outputs.

OpenUSD provides a unified data model that connects the CAE ecosystem so digital twins can operate in real time with diverse data inputs. This seamless interoperability between tools is crucial for engineering efforts that rely on accurate, consistent CFD simulations.

Industry Leaders Deliver 50x Faster Simulation 

At NVIDIA GTC in March, NVIDIA announced that leading CAE software providers, including Ansys, Altair, Cadence, Siemens and Synopsys, are accelerating their simulation tools, including for CFD, by up to 50x with the NVIDIA Blackwell platform.

Thanks to accelerated software, NVIDIA CUDA-X libraries and performance-optimization blueprints, industries like automotive, aerospace, energy, manufacturing and life sciences can greatly reduce product development time and costs while increasing design accuracy and remaining energy efficient.

Ansys, a leader in simulation software, is harnessing the power of NVIDIA technologies for real-time physics and accelerated simulation with AI-driven digital twins. By integrating NVIDIA GPUs and tapping into Blackwell’s advanced accelerated computing capabilities, Ansys software enables engineers to run complex CFD simulations at unprecedented speed and scale.

Real-Time Digital Twins for CFD

Ansys is also adopting Omniverse and OpenUSD to create more connected, collaborative simulation environments for CFD. Ansys users can build real-time digital twins that integrate data from multiple sources, and now those multidisciplinary CFD simulations can be integrated into the visually rich Omniverse environment.

Learn more about how Ansys is using NVIDIA technologies and OpenUSD to advance its CFD workflows in this livestream replay.

Get Plugged Into the World of OpenUSD

Join NVIDIA GTC Taipei at COMPUTEX, running May 19-23, to see how accelerated computing, Omniverse and OpenUSD advance 3D workflows. Watch NVIDIA founder and CEO Jensen Huang’s COMPUTEX keynote on Monday, May 19, at 11 a.m. Taiwan Time.

Ansys Simulation World is a virtual and in-person global simulation experience. The virtual event takes place July 16-17, and includes a keynote from Huang that will provide a closer look at the transformative power of accelerated computing and AI to enable computational engineering breakthroughs – including CFD – across all industries. Until then, watch Ansys GTC sessions on demand to learn more.

Discover why developers and 3D practitioners are using OpenUSD and learn how to optimize 3D workflows with the new self-paced “Learn OpenUSD” curriculum for 3D developers and practitioners, available for free through the NVIDIA Deep Learning Institute.

For more resources on OpenUSD, explore the Alliance for OpenUSD forum and the AOUSD website.

Stay up to date by subscribing to NVIDIA Omniverse news, joining the community and following NVIDIA Omniverse on Instagram, LinkedIn, Medium and X.

Featured image courtesy of Ansys.

Visa Makes Payments Personalized and Secure With AI

Think tap to pay — but smarter and safer. Visa is tapping into AI to enhance services for its global network of customers, focused on fraud prevention, personalization and agentic commerce.

Sarah Laszlo, senior director of Visa’s machine learning platform, joined the AI Podcast to discuss how artificial intelligence is powering the next generation of payment experiences.

Visa processes hundreds of billions of transactions each year, so even small technological enhancements can have a large impact.

The company prevents $40 billion in fraud annually. “There’s so much attempted fraud that, even though we’re very good at preventing it, marginal improvements save large dollar amounts,” Laszlo said.

AI also powers Visa’s personalization systems, delivering smarter, more relevant offers and recommendations to cardholders. The company’s unique dataset presents a huge opportunity to improve transaction predictions — but also brings privacy challenges.

“The key to addressing those challenges is building abstract representations of users — embeddings that capture preferences without exposing private data,” Laszlo said.
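As a very rough illustration of that idea, and not Visa's actual method, the sketch below collapses a user's raw transactions into a normalized category-frequency vector, so downstream models see only an abstract preference profile rather than individual purchases.

```python
# Collapse raw transactions into an abstract preference embedding (generic illustration).
from collections import Counter
import math

CATEGORIES = ["grocery", "travel", "dining", "electronics"]

def user_embedding(transactions):
    # transactions: list of (category, amount); only category frequencies survive.
    counts = Counter(category for category, _ in transactions)
    vec = [counts.get(c, 0) for c in CATEGORIES]
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]  # unit-length preference vector, no raw purchase data

print(user_embedding([("grocery", 42.10), ("dining", 18.00), ("grocery", 9.99)]))
```

Production systems would learn such representations with models rather than hand-built counts, but the privacy principle is the same: the embedding, not the underlying transaction history, is what gets shared downstream.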

The company is also working toward agentic commerce — where AI agents help customers with payments. For example, AI agents with access to payment credentials can handle transactions on behalf of consumers, like booking travel arrangements, when instructed.

Laszlo shared best practices for enterprises adopting generative AI. She also recommended using open-source models when possible and developing strong relationships between governance and technical teams. In one success story, Visa used GPT-4 to convert legacy code to Python, saving $5 million, with just one engineer completing 50 conversion jobs in a quarter.

To learn more, watch Laszlo’s GTC session, The Next Era of Payments: How Generative AI is Shaping the Future.

Time Stamps

03:28 – Visa’s priorities for AI use.

07:09 – How Visa optimizes resources using virtual GPUs.

14:42 – AI factories and unified pipelines.

18:52 – Best practices for AI in financial services.

You Might Also Like… 

NVIDIA’s Jacob Liberman on Bringing Agentic AI to Enterprises

Agentic AI enables developers to create intelligent multi-agent systems that reason, act and execute complex tasks with a degree of autonomy. Jacob Liberman, director of product management at NVIDIA, explains how agentic AI bridges the gap between powerful AI models and practical enterprise applications.

Firsthand’s Jon Heller Shares How AI Agents Enhance Consumer Journeys in Retail

Jon Heller, co-CEO and founder of Firsthand, discusses how the company’s Brand Agents are transforming the retail landscape by personalizing customer journeys, converting marketing interactions into valuable research data and enhancing the customer experience with hyper-personalized insights and recommendations.

Telenor Builds Norway’s First AI Factory, Offering Sustainable and Sovereign Data Processing

Telenor opened Norway’s first AI factory in November 2024, enabling organizations to process sensitive data securely on Norwegian soil while prioritizing environmental responsibility. Telenor’s Chief Innovation Officer and Head of the AI Factory Kaaren Hilsen discusses the AI factory’s rapid development, going from concept to reality in under a year.

Press Play on Don Diablo’s Music Video — Created With NVIDIA RTX-Powered Generative AI

Electronic music icon Don Diablo is known for pushing the boundaries of music, visual arts and live performance — and the music video for his latest single, “BLACKOUT,” explores using generative AI, combining NVIDIA RTX-powered and cloud-based tools in a hybrid workflow.

Set inside an industrial warehouse, the video features Diablo performing in front of an immersive, AI-generated environment — blending stylized effects, a cinematic mood and personalized elements.

The image creation and experimentation was done locally on a desktop powered by a NVIDIA GeForce RTX 5090 GPU, while the final animation was rendered using Kling AI, a cloud-based generative video tool. This hybrid approach gave the team creative control and high-fidelity results.

CTRL + ALT + Create With RTX

All image-based work in the music video — including stylization and identity modeling — was performed locally using ComfyUI, a visual, node-based interface for building and customizing generative AI workflows.

With minimal setup, the team could launch preconfigured workflows and experiment freely with different models and techniques. ComfyUI offered speed, ease of use and support for a wide range of open-source tools — enabling fast iteration and distinctive visuals that aligned with Diablo’s creative direction.
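For those curious how such workflows can be driven outside the graphical editor, ComfyUI also exposes a local HTTP API; the sketch below queues a previously exported workflow JSON against it. The port, file name and workflow contents are assumptions for illustration, not details of the "BLACKOUT" production.

```python
# Queue an exported ComfyUI workflow through its local HTTP API (assumed default port 8188).
import json
import urllib.request

with open("blackout_workflow.json") as f:      # hypothetical workflow exported from ComfyUI
    workflow = json.load(f)

payload = json.dumps({"prompt": workflow}).encode("utf-8")
req = urllib.request.Request(
    "http://127.0.0.1:8188/prompt",            # ComfyUI's default local endpoint
    data=payload,
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read()))             # returns an ID for the queued job
```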

Diablo wanted the visuals to reflect his identity — not just serve as decor. To achieve this, a custom model was trained on a facial dataset of Diablo, using low-rank adaptation in the FLUXGYM user interface.

The resulting model was then loaded into ComfyUI and integrated into a Stable Diffusion pipeline for stylized image generation. An identity-embedding node ensured consistent facial features across frames, as well as stylistic cohesion.

Once the imagery was locked, the team turned to Kling AI in the cloud to animate the visuals into dynamic scenes, harnessing the tool’s advanced physics and cinematic capabilities.

The local portion of this workflow, running on an RTX 5090 GPU, enabled faster iteration, precise visual control, tighter feedback loops and full creative autonomy. It showcases how creators can use local RTX-powered tools — as well as cloud-based platforms for predefined workloads that require more compute — to quickly and easily produce high-impact content on their terms.

“Working with these new AI tools literally feels like stepping into my own mind,” said Diablo. “We were able to creatively raise the bar, shape the visuals in real time and stay focused on the story. AI is a powerful tool for creativity.”

Creative Workflows, Powered by RTX AI

Watch the full “BLACKOUT” music video to see how the team brought their creative ideas to life with RTX AI.

Artists and developers are already using generative AI to streamline their work and push creative boundaries, whether exploring concepts, designing virtual worlds or building intelligent apps. With RTX AI PCs, users can access the latest and greatest models and tools, as well as powerful AI performance.

Learn more about the NVIDIA Studio platform and RTX AI PCs.

Each week, the RTX AI Garage blog series features community-driven AI innovations and content for those looking to learn more about NVIDIA NIM microservices and AI Blueprints, as well as building AI agents, creative workflows, digital humans, productivity apps and more on AI PCs and workstations.

Plug in to NVIDIA AI PC on Facebook, Instagram, TikTok and X — and stay informed by subscribing to the RTX AI PC newsletter.

Follow NVIDIA Workstation on LinkedIn and X.

How Reasoning AI Agents Transform High-Stakes Decision Making

AI agents powered by large language models (LLMs) have grown past their FAQ chatbot beginnings to become true digital teammates capable of planning, reasoning and taking action — and taking in corrective feedback along the way.

Thanks to reasoning AI models, agents can learn how to think critically and tackle complex tasks. This new class of “reasoning agents” can break down complicated problems, weigh options and make informed decisions — while using only as much compute and as many tokens as needed.

Reasoning agents are making a splash in industries where decisions rely on multiple factors. Such industries range from customer service and healthcare to manufacturing and financial services.

Reasoning On vs. Reasoning Off

Modern AI agents can toggle reasoning on and off, allowing them to efficiently use compute and tokens.

A full chain‑of‑thought pass performed during reasoning can take up to 100x more compute and tokens than a quick, single‑shot reply — so it should only be used when needed. Think of it like turning on headlights — switching on high beams only when it’s dark and turning them back to low when it’s bright enough out.

Single-shot responses are great for simple queries — like checking an order number, resetting a password or answering a quick FAQ. Reasoning might be needed for complex, multistep tasks such as reconciling tax depreciation schedules or orchestrating the seating at a 120‑guest wedding.

New NVIDIA Llama Nemotron models, featuring advanced reasoning capabilities, expose a simple system‑prompt flag to enable or disable reasoning, so developers can programmatically decide per query. This allows agents to perform reasoning only when the stakes demand it — saving users wait times and minimizing costs.
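A minimal sketch of that per-query toggle is shown below, assuming an OpenAI-compatible Llama Nemotron endpoint such as a locally hosted NIM microservice. The base URL and model name are placeholders, and the "detailed thinking on/off" system prompt follows the convention NVIDIA documents for these models; check the model card for the exact flag.

```python
# Toggle reasoning per query via the system prompt (endpoint and model name are placeholders).
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="none")  # hypothetical NIM endpoint

def ask(question, reason=False):
    # Llama Nemotron models expose reasoning as a system-prompt switch;
    # "detailed thinking on/off" is the documented convention, used here as an assumption.
    system = "detailed thinking on" if reason else "detailed thinking off"
    resp = client.chat.completions.create(
        model="nvidia/llama-3.1-nemotron-ultra-253b-v1",
        messages=[{"role": "system", "content": system},
                  {"role": "user", "content": question}],
    )
    return resp.choices[0].message.content

print(ask("What's the status of order #1234?"))                   # cheap single-shot reply
print(ask("Plan seating for a 120-guest wedding.", reason=True))  # full reasoning pass
```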

Reasoning AI Agents in Action

Reasoning AI agents are already being used for complex problem-solving across industries, including:

  • Healthcare: Enhancing diagnostics and treatment planning.
  • Customer Service: Automating and personalizing complex customer interactions, from resolving billing disputes to recommending tailored products.
  • Finance: Autonomously analyzing market data and providing investment strategies.
  • Logistics and Supply Chain: Optimizing delivery routes, rerouting shipments in response to disruptions and simulating possible scenarios to anticipate and mitigate risks.
  • Robotics: Powering warehouse robots and autonomous vehicles, enabling them to plan, adapt and safely navigate dynamic environments.

Many customers are already experiencing enhanced workflows and benefits using reasoning agents.

Amdocs uses reasoning-powered AI agents to transform customer engagement for telecom operators. Its amAIz GenAI platform, enhanced with advanced reasoning models such as NVIDIA Llama Nemotron and amAIz Telco verticalization, enables agents to autonomously handle complex, multistep customer journeys — spanning customer sales, billing and care.

EY is using reasoning agents to significantly improve the quality of responses to tax-related queries. The company compared generic models to tax-specific reasoning models, which revealed up to an 86% improvement in response quality for tax questions when using a reasoning approach.

SAP’s Joule agents — which will be equipped with reasoning capabilities from Llama Nemotron — can interpret complex user requests, surface relevant insights from enterprise data and execute cross-functional business processes autonomously.

Designing an AI Reasoning Agent

A few key components are required to build an AI agent, including tools, memory and planning modules. Each of these components augments the agent’s ability to interact with the outside world, create and execute detailed plans, and otherwise act semi- or fully autonomously.

Reasoning capabilities can be added to AI agents at various places in the development process. The most natural way to do so is by augmenting planning modules with a large reasoning model, like Llama Nemotron Ultra or DeepSeek-R1. This allows more time and reasoning effort to be used during the initial planning phase of the agentic workflow, which has a direct impact on the overall outcomes of systems.
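To ground those components, here is a skeletal agent in Python with tools, memory and a planning step. The reasoning_llm function is a stand-in for a call to whichever reasoning model backs the planner (the article names Llama Nemotron Ultra and DeepSeek-R1 as options); tool names and outputs are invented for the example.

```python
# Skeletal agent: tools, memory, and a planning module backed by a reasoning model (stand-in).
def reasoning_llm(prompt: str) -> str:
    # Placeholder for a call to a large reasoning model; returns one tool name per line.
    return "search_docs\nsummarize"

TOOLS = {
    "search_docs": lambda task: f"top documents for: {task}",
    "summarize": lambda text: f"summary of: {text}",
}

class Agent:
    def __init__(self):
        self.memory: list[str] = []  # running record of tool calls and observations

    def plan(self, task: str) -> list[str]:
        # Spend extra reasoning effort up front: the plan shapes the whole workflow.
        steps = reasoning_llm(f"Plan tool calls for: {task}").splitlines()
        return [s for s in steps if s in TOOLS]

    def run(self, task: str) -> str:
        result = task
        for step in self.plan(task):
            result = TOOLS[step](result)
            self.memory.append(f"{step} -> {result}")
        return result

print(Agent().run("Compare Q1 and Q2 churn and explain the drivers."))
```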

The AI-Q NVIDIA AI Blueprint and the NVIDIA Agent Intelligence toolkit can help enterprises break down silos, streamline complex workflows and optimize agentic AI performance at scale.

The AI-Q blueprint provides a reference workflow for building advanced agentic AI systems, making it easy to connect to NVIDIA accelerated computing, storage and tools for high-accuracy, high-speed digital workforces. AI-Q integrates fast multimodal data extraction and retrieval using NVIDIA NeMo Retriever, NIM microservices and AI agents.

In addition, the open-source NVIDIA Agent Intelligence toolkit enables seamless connectivity between agents, tools and data. Available on GitHub, this toolkit lets users connect, profile and optimize teams of AI agents, with full system traceability and performance profiling to identify inefficiencies and improve outcomes. It’s framework-agnostic, simple to onboard and can be integrated into existing multi-agent systems as needed.

Build and Test Reasoning Agents With Llama Nemotron

Learn more about Llama Nemotron, which recently was at the top of industry benchmark leaderboards for advanced science, coding and math tasks. Join the community shaping the future of agentic, reasoning-powered AI.

Plus, explore and fine-tune using the open Llama Nemotron post-training dataset to build custom reasoning agents. Experiment with toggling reasoning on and off to optimize for cost and performance.

And test NIM-powered agentic workflows, including retrieval-augmented generation and the NVIDIA AI Blueprint for video search and summarization, to quickly prototype and deploy advanced AI solutions.

NVIDIA Partners Showcase Cutting-Edge Robotic and Industrial AI Solutions at Automate 2025

As the manufacturing industry faces challenges — such as labor shortages, reshoring and inconsistent operational strategies — AI-powered robots present a significant opportunity to accelerate industrial automation.

At Automate, the largest robotics and automation event in North America, robotics leaders KUKA, Standard Bots, Universal Robots (UR) and Vention are showcasing hardware and robots powered by the NVIDIA accelerated computing, Omniverse and Isaac platforms — helping manufacturers everywhere automate and optimize their production lines.

Deepu Talla, vice president of robotics and edge AI at NVIDIA, delivered a keynote on physical AI and industrial autonomy.

“The manufacturing industry is experiencing a fundamental shift, with industrial automation and AI-powered robots increasingly changing how warehouses and factories operate worldwide,” said Deepu Talla, vice president of robotics and edge AI at NVIDIA. “NVIDIA’s three-computer architecture — enabling robot training, simulation and accelerated runtime — is empowering the entire robotics ecosystem to accelerate this shift toward software-defined autonomous facilities.”

Synthetic Data Generation Blueprint Speeds Up Robot Development Pipelines

Embodied AI systems, which integrate AI into physical systems, must be trained with real-world data — traditionally a complex and resource-intensive process. Each robot typically needs its own custom dataset due to differences in hardware, sensors and environments.

Synthetic data offers a powerful alternative. NVIDIA Isaac Lab 2.1 — the latest version of the open-source robot learning framework, announced at Automate — provides developers with tools to accelerate the robot training process using the NVIDIA Isaac GR00T Blueprint for synthetic motion generation. Built on NVIDIA Omniverse, a physical AI simulation platform, and NVIDIA Cosmos world foundation models, the blueprint provides a reference workflow for creating vast amounts of synthetic and robot manipulation data, making it easier and faster to train robots, like manipulators and humanoids, for a variety of tasks.

NVIDIA showcases the synthetic manipulation motion generation blueprint.

Robotics Leaders Harness NVIDIA Technologies for Industrial AI

Image courtesy of UR.

Robotics leaders are building next-generation robots, tapping into NVIDIA technologies to train, power and deploy physical AI in industrial settings.

Universal Robots, a leader in collaborative robotics, introduced UR15, its fastest collaborative robot yet, featuring improved cycle times and advanced motion control. Using UR’s AI Accelerator — developed on the NVIDIA Isaac platform’s CUDA-accelerated libraries and AI models, and NVIDIA Jetson AGX Orin — manufacturers can build AI applications to embody intelligence into cobots.

Vention, a manufacturing automation company, announced MachineMotion AI, an automation controller designed to unify motion, sensing, vision and AI. The system taps into the NVIDIA Jetson platform for embedded computing and NVIDIA Isaac’s CUDA-accelerated libraries and models, enabling compute-intensive AI tasks such as real-time vision processing, bin-picking and autonomous decision-making. This technology shows the value AI brings to the manufacturing floor for practical deployment of robotic solutions.

Standard Bots, a robotics developer, unveiled its manipulator, a 30kg-payload, 2m-reach robot that can be used for heavy-duty tooling and moving large objects in the automotive, aerospace and logistics industries. With NVIDIA Isaac Sim, a reference application built on Omniverse, robots can be taught tasks through demonstrations, eliminating the need for traditional coding or programming to free up developers for higher-value tasks. Standard Bots also announced teleoperation capabilities via a tablet device, which can efficiently collect training data.

KUKA, a leading supplier of intelligent automation solutions, unveiled its KR C5 Micro-2, a small robot controller integrated with an NVIDIA Jetson extension for AI-ready applications. It will provide future KUKA robots with better AI vision and AI-based control tasks powered by NVIDIA’s software stack.

NVIDIA Brings Software to Deploy AI Agents in Factories and Warehouses

In addition to robots, manufacturers everywhere are increasingly turning to AI agents capable of analyzing and acting upon ever-growing video data.

The NVIDIA AI Blueprint for video search and summarization (VSS), part of the NVIDIA Metropolis platform, combines generative AI, large language models, vision language models and media management services to deploy visual AI agents that can optimize processes, such as visual inspection and assembly, and enhance worker safety in factories and warehouses.

This helps eliminate manual monitoring and enables rapid processing and interpretation of vast amounts of video data, helping businesses drive industrial automation and make data-driven decisions. Developers can now use their own video data to try the AI Blueprint for VSS in the cloud with NVIDIA Launchable.

Industry leaders are using the blueprint for VSS to enable advanced video analytics and computer vision capabilities across domains.

At Automate, Siemens will be showcasing its Industrial Copilot for Operations, a generative AI-powered assistant that optimizes workflows and enhances collaboration between humans and AI. Using the tool, shop floor operators, maintenance engineers and service technicians can receive machine instructions and guidance more quickly, using natural language. The copilot uses NVIDIA accelerated computing and NVIDIA NIM and NeMo Retriever microservices from the AI Blueprint for VSS to add multimodal capabilities.

Connect Tech, an edge computing company, is analyzing drone footage with the blueprint for VSS running on NVIDIA Jetson edge devices to enable real-time Q&A and zero-shot detections for hazards like fires or flooding in remote areas.

DeepHow, a generative AI-powered video training platform provider, is using the blueprint to create smart videos that capture key workflows and convert them into structured training content, improving shop floor operator efficiency.

InOrbit.AI, a software platform for robot orchestration, will showcase its latest improvements in InOrbit Space Intelligence, which harnesses physical AI, computer vision and the VSS blueprint to analyze robot operations and optimize real-world workflows.

And KoiReader Technologies, a provider of vision and generative AI-powered automation solutions, is using the blueprint to enable true real-time operational intelligence from events occurring in supply chain and manufacturing environments.

Connect with NVIDIA and its partners through talks and sessions at Automate.

Learn more about NVIDIA’s latest work in robotics and industrial AI at Automate, running through May 15.

NVIDIA Scores COMPUTEX Best Choice Awards

NVIDIA today received multiple accolades at COMPUTEX’s Best Choice Awards, in recognition of innovation across the company.

The NVIDIA GeForce RTX 5090 GPU won the Gaming and Entertainment category award; the NVIDIA Quantum-X Photonics InfiniBand switch system won the Networking and Communication category award; NVIDIA DGX Spark won the Computer and System category award; and the NVIDIA GB200 NVL72 system and NVIDIA Cosmos world foundation model development platform won Golden Awards.

The awards recognize the outstanding functionality, innovation and market promise of technologies in each category.

Jensen Huang, founder and CEO of NVIDIA, will deliver a keynote at COMPUTEX on Monday, May 19, at 11 a.m. Taiwan time.

GB200 NVL72 and NVIDIA Cosmos Go Gold

NVIDIA GB200 NVL72 and NVIDIA Cosmos each won Golden Awards.

The NVIDIA GB200 NVL72 system connects 36 NVIDIA Grace CPUs and 72 NVIDIA Blackwell GPUs in a rack-scale design. It delivers 1.4 exaflops of AI performance and 30 terabytes of fast memory, as well as 30x faster real-time trillion-parameter large language model inference with 25x energy efficiency compared with the NVIDIA H100 GPU.

By design, the GB200 NVL72 accelerates the most compute-intensive AI and high-performance computing workloads, including AI training and data processing for engineering design and simulation.

NVIDIA Cosmos accelerates physical AI development by enabling developers to build and deploy world foundation models with unprecedented speed and scale.

Pretrained on 9,000 trillion tokens of robotics and driving data, Cosmos world foundation models can rapidly generate synthetic, physics-based data or be post-trained for downstream robotics and autonomous vehicle foundation models, significantly reducing development time and the costs of real-world data collection.

The platform’s accelerated video data processing pipeline can process and label 20 million hours of video in just two weeks, a task that would otherwise take over three years with CPU-only systems.

Spotlighting NVIDIA Technologies

All the NVIDIA technologies nominated — including the NVIDIA GeForce RTX 5090 GPU, NVIDIA Quantum-X Photonics InfiniBand switch system and NVIDIA DGX Spark — won in their respective categories.

The NVIDIA GeForce RTX 5090, built on NVIDIA Blackwell architecture and equipped with ultra-fast GDDR7 memory, delivers powerful gaming and creative performance. It features fifth-generation Tensor Cores and a 512-bit memory bus, enabling high-performance gaming and AI-accelerated workloads with next-generation ray-tracing and NVIDIA DLSS 4 technologies.

The NVIDIA Quantum-X Photonics Q3450-LD InfiniBand switch advances data center networking for the agentic AI era with co-packaged optics. By integrating silicon photonics directly with the InfiniBand switch ASIC, the Q3450-LD eliminates the need for pluggable optical transceivers — reducing electrical loss, enhancing signal integrity and improving overall power and thermal efficiency.

NVIDIA DGX Spark is a personal AI supercomputer, bringing the power of the NVIDIA Grace Blackwell architecture to desktops to enable researchers, developers and students to prototype, fine-tune and run advanced AI models locally with up to 1,000 trillion operations per second of performance.

With its compact, power-efficient design and seamless integration into the NVIDIA AI ecosystem, DGX Spark empowers users to accelerate generative and physical AI workloads — whether working at the desk, in the lab or deploying to the cloud.

Learn more about the latest agentic AI advancements at NVIDIA GTC Taipei, running May 21-22 at COMPUTEX.

Wildfire Prevention: AI Startups Support Prescribed Burns, Early Alerts

Artificial intelligence is helping identify and treat diseases faster with better results for humankind. Natural disasters like wildfires are next.

Fires in the Los Angeles area have claimed more than 16,000 homes and other structures so far this year. Damages in January were estimated as high as $164 billion, making it potentially the worst natural disaster financially in U.S. history, according to Bloomberg.

The U.S. Department of Agriculture and the U.S. Forest Service have reportedly been redirecting resources in recent months toward beneficial fires to reduce overgrowth.

AI enables fire departments to keep more eyes on controlled burns, making them safer and more accepted in communities, say industry experts.

“This is just like cancer treatment,” said Sonia Kastner, CEO and founder of Pano AI, based in San Francisco. “You can do early screening, catch it when it’s in phase one, and hit it with really aggressive treatment so it doesn’t progress — what we’ve seen this fire season is proof that our customers across the country use our solution in this way.”

San Ramon, California-based Green Grid, which specializes in AI for utility companies, in September alerted a customer at a Big Bear resort that a fire that had started in the San Bernardino National Forest was nearby, said Chinmoy Saha, the company’s CEO. By acting early, the resort was able to prepare suppression measures before the fire could reach it and become uncontrollable, he said. Thanks to favorable weather conditions, the fire ultimately did not reach the customer’s property.

In the recent Los Angeles area fires, Saha said he had been in discussions with a customer seeking to bring AI to cameras located at the site of the now-devastated Eaton Fire, which has claimed 17 lives and more than 9,000 buildings.

“If we had our system there, this fire could have been mitigated,” said Saha. “Early detection is the key, so the fire is contained and it doesn’t become a catastrophic wildfire.”

Aiding First Responders With Accelerated Computing

Pano’s service provides human-in-the-loop, AI-driven fire detection and alerts that have enabled fire departments to act faster than they could by relying on 911 calls, accelerating containment efforts, said Kastner.

The company’s Pano Station uses two ultra-high-definition cameras mounted atop mountains, cell-tower style, rotating 360 degrees every minute to capture views 10 miles in all directions. Those images are transmitted to the cloud every minute, where AI models running on GPUs perform smoke-detection inference.

Pano AI’s Pano Station in Rancho Palos Verdes

Pano has a daytime smoke detection model, a nighttime near-infrared model and a nighttime geostationary satellite model. A human in the loop verifies detections, which can be confirmed using digital zoom and time-lapse imagery.

Pano trains its models on NVIDIA GPUs locally and runs inference on NVIDIA GPUs in the cloud.
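A stripped-down sketch of that camera-to-cloud loop appears below; the model file, image source and confidence threshold are hypothetical stand-ins, not details of Pano's production system.

```python
# Stripped-down smoke-detection loop: classify an incoming camera frame on a GPU (illustrative).
import torch
import torchvision.transforms as T
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model = torch.jit.load("smoke_classifier.pt").to(device).eval()  # hypothetical classifier with one smoke logit
preprocess = T.Compose([T.Resize((384, 384)), T.ToTensor()])

SMOKE_THRESHOLD = 0.8  # assumed confidence threshold before routing to a human reviewer

def check_frame(path: str) -> bool:
    img = preprocess(Image.open(path).convert("RGB")).unsqueeze(0).to(device)
    with torch.no_grad():
        prob = torch.sigmoid(model(img)).item()  # probability the frame contains smoke
    return prob >= SMOKE_THRESHOLD

if check_frame("station_042_latest.jpg"):
    print("Possible smoke detected; route to human verification.")
```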

Harnessing AI for Controlled Burns

The California Department of Forestry and Fire Protection (CAL FIRE) is carrying out prescribed fires, or controlled burns, to reduce the dry vegetation that creates fuel for wildfires.

“Controlled burns are necessary, and we didn’t do a good job in California for the past 30 or 40 years,” said Saha. Green Grid has deployed its trailer-mounted AI camera sensors to monitor fires and controlled burns before they get out of control.

Pano can be used by fire departments to monitor controlled burn zones with its AI-driven cameras to make sure that plumes of smoke don’t appear outside of the permitted zone, maintaining safety.

The company has its cameras stationed at Rancho Palos Verdes, south of the recent Los Angeles area fires.

“The area around the Palisades Fire was a very overgrown forest, and with a lot of dead fuels, so our hope is that there is going to be more focus on prescribed fires,” said Kastner.

Embracing AI at Fire Departments for Faster Mitigation

CAL FIRE has partnered with Alert California and UC San Diego on a network of cameras owned by investor-owned utilities, CAL FIRE, the U.S. Forest Service and U.S. Department of the Interior agencies.

Through that network, they’ve implemented an AI program that looks for new fire starts. The cameras pan every two minutes and continuously update, so Alert California has the most up-to-date information.

If AI can enable fire departments to get to the scene of a fire when it’s just a few acres, it’s a lot easier to control than if it’s 50 or more acres, said David Acuna, battalion chief at CAL FIRE, Clovis, California. This is particularly important in remote areas where it might take hours before a human sees and reports a fire, he added.

“They use AI to determine if this looks like a new start,” said Acuna. “Now the key here is the program will then send an email to the relevant emergency command center, saying ‘Hey, I think we spotted a new start, what do you think?’ And it has to be verified by a human.”
