Light Bulb Moment: NVIDIA CEO Sees Bright Future for AI-Powered Electric Grid

The electric grid and the utilities managing it have an important role to play in the next industrial revolution that’s being driven by AI and accelerated computing, said NVIDIA founder and CEO Jensen Huang Monday at the annual meeting of the Edison Electric Institute (EEI), an association of U.S. and international utilities.

“The future of digital intelligence is quite bright, and so the future of the energy sector is bright, too,” said Huang in a keynote before an audience of more than a thousand utility and energy industry executives.

Like other companies, utilities will apply AI to increase employee productivity, but “the greatest impact and return is in applying AI in the delivery of energy over the grid,” said Huang, in conversation with Pedro Pizarro, the chair of EEI and president and CEO of Edison International, the parent company of Southern California Edison, one of the nation’s largest electric utilities.

For example, Huang described how grids will use AI-powered smart meters to let customers sell their excess electricity to neighbors.

“You will connect resources and users, just like Google, so your power grid becomes a smart network with a digital layer like an app store for energy,” he said.

“My sense is, like previous industrial revolutions, [AI] will drive productivity to levels that we’ve never seen,” he added.

AI Lights Up Electric Grids

Today, electric grids are mainly one-way systems that link a few big power plants to many users. They’ll increasingly become two-way, flexible and distributed networks with solar and wind farms connecting homes and buildings that sport solar panels, batteries and electric vehicle chargers.

It’s a big job that requires autonomous control systems that process and analyze in real time a massive amount of data — work well suited to AI and accelerated computing.
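As a toy illustration of the kind of real-time analysis involved, the sketch below flags anomalous smart-meter readings with a rolling z-score check. All names, window sizes and thresholds are illustrative assumptions, not part of any utility or NVIDIA API:

```python
# Hypothetical sketch: flagging anomalous readings in a stream of
# smart-meter telemetry with a rolling mean/std (z-score) check.
# Window size and threshold are made-up illustrative values.
from collections import deque

def make_anomaly_detector(window=60, threshold=3.0):
    """Return a callable that flags readings far from the recent mean."""
    history = deque(maxlen=window)

    def check(reading_kw):
        history.append(reading_kw)
        if len(history) < 10:          # not enough context yet
            return False
        mean = sum(history) / len(history)
        var = sum((x - mean) ** 2 for x in history) / len(history)
        std = var ** 0.5
        return std > 0 and abs(reading_kw - mean) > threshold * std

    return check
```

A real grid control system would run models like this at the edge on every meter; the point here is only the streaming shape of the computation.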

AI is being applied to use cases across electric grids, thanks to a wide ecosystem of companies using NVIDIA’s technologies.

In a recent GTC session, utility vendor Hubbell and startup Utilidata, a member of the NVIDIA Inception program, described a new generation of smart meters using the NVIDIA Jetson platform that utilities will deploy to process and analyze real-time grid data using AI models at the edge. Deloitte announced today its support for the effort.

Siemens Energy detailed in a separate GTC session its work with AI and NVIDIA Omniverse creating digital twins of transformers in substations to improve predictive maintenance, boosting grid resilience. Another video shows how Siemens Gamesa used Omniverse and accelerated computing to optimize turbine placements for a large wind farm.

“Deploying AI and advanced computing technologies developed by NVIDIA enables faster and better grid modernization and we, in turn, can deliver for our customers,” said Maria Pope, CEO of Portland General Electric in Oregon.

NVIDIA Delivers 45,000x Gain in Energy Efficiency

The advances come as NVIDIA drives down the costs and energy needed to deploy AI.

Over the last eight years, NVIDIA has increased the energy efficiency of running AI inference on state-of-the-art large language models by a whopping 45,000x, Huang said in his recent keynote at COMPUTEX.

NVIDIA Blackwell architecture GPUs will provide 20x greater energy efficiency than CPUs for AI and high-performance computing. If all CPU servers for these jobs transitioned to GPUs, users would save 37 terawatt-hours a year, the equivalent of 25 million metric tons of carbon dioxide and the electricity use of 5 million homes.
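The quoted savings can be sanity-checked with quick arithmetic. The derived per-home and per-kilowatt-hour rates below are implied by the article's numbers, not official statistics:

```python
# Back-of-the-envelope check of the figures quoted above (all inputs
# come from the article; derived rates are implied, not official).
TWH_SAVED = 37                    # terawatt-hours per year
TONNES_CO2 = 25_000_000           # metric tons of CO2
HOMES = 5_000_000                 # U.S. homes

kwh_saved = TWH_SAVED * 1e9       # 1 TWh = 1e9 kWh
kwh_per_home = kwh_saved / HOMES  # implied annual electricity use per home
kg_co2_per_kwh = TONNES_CO2 * 1000 / kwh_saved  # implied grid carbon intensity

print(f"{kwh_per_home:,.0f} kWh per home per year")   # 7,400
print(f"{kg_co2_per_kwh:.3f} kg CO2 per kWh")         # 0.676
```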

That’s why NVIDIA-powered systems swept the top six spots and took seven of the top 10 in the latest ranking of the Green500, a list of the world’s most energy-efficient supercomputers.

In addition, a recent report calls for governments to accelerate adoption of AI as a significant new tool to drive energy efficiency across many industries. It cited examples of utilities adopting AI to make the electric grid more efficient.

Learn more about how utilities are deploying AI and accelerated computing to improve operations, saving cost and energy.

Read More

Seamless in Seattle: NVIDIA Research Showcases Advancements in Visual Generative AI at CVPR

NVIDIA researchers are at the forefront of the rapidly advancing field of visual generative AI, developing new techniques to create and interpret images, videos and 3D environments.

More than 50 of these projects will be showcased at the Computer Vision and Pattern Recognition (CVPR) conference, taking place June 17-21 in Seattle. Two of the papers — one on the training dynamics of diffusion models and another on high-definition maps for autonomous vehicles — are finalists for CVPR’s Best Paper Awards.

NVIDIA is also the winner of the CVPR Autonomous Grand Challenge’s End-to-End Driving at Scale track — a significant milestone that demonstrates the company’s use of generative AI for comprehensive self-driving models. The winning submission, which outperformed more than 450 entries worldwide, also received CVPR’s Innovation Award.

NVIDIA’s research at CVPR includes a text-to-image model that can be easily customized to depict a specific object or character, a new model for object pose estimation, a technique to edit neural radiance fields (NeRFs) and a visual language model that can understand memes. Additional papers introduce domain-specific innovations for industries including automotive, healthcare and robotics.

Collectively, the work introduces powerful AI models that could enable creators to more quickly bring their artistic visions to life, accelerate the training of autonomous robots for manufacturing, and support healthcare professionals by helping process radiology reports.

“Artificial intelligence, and generative AI in particular, represents a pivotal technological advancement,” said Jan Kautz, vice president of learning and perception research at NVIDIA. “At CVPR, NVIDIA Research is sharing how we’re pushing the boundaries of what’s possible — from powerful image generation models that could supercharge professional creators to autonomous driving software that could help enable next-generation self-driving cars.”

At CVPR, NVIDIA also announced NVIDIA Omniverse Cloud Sensor RTX, a set of microservices that enable physically accurate sensor simulation to accelerate the development of fully autonomous machines of every kind.

Forget Fine-Tuning: JeDi Simplifies Custom Image Generation

Creators harnessing diffusion models, the most popular method for generating images based on text prompts, often have a specific character or object in mind — they may, for example, be developing a storyboard around an animated mouse or brainstorming an ad campaign for a specific toy.

Prior research has enabled these creators to personalize the output of diffusion models to focus on a specific subject using fine-tuning — where a user trains the model on a custom dataset — but the process can be time-consuming and inaccessible for general users.

JeDi, a paper by researchers from Johns Hopkins University, Toyota Technological Institute at Chicago and NVIDIA, proposes a new technique that allows users to easily personalize the output of a diffusion model within a couple of seconds using reference images. The team found that the model achieves state-of-the-art quality, significantly outperforming existing fine-tuning-based and fine-tuning-free methods.

JeDi can also be combined with retrieval-augmented generation, or RAG, to generate visuals specific to a database, such as a brand’s product catalog.


New Foundation Model Perfects the Pose

NVIDIA researchers at CVPR are also presenting FoundationPose, a foundation model for object pose estimation and tracking that can be instantly applied to new objects during inference, without the need for fine-tuning.

The model, which set a new record on a popular benchmark for object pose estimation, uses either a small set of reference images or a 3D representation of an object to understand its shape. It can then identify and track how that object moves and rotates in 3D across a video, even in poor lighting conditions or complex scenes with visual obstructions.

FoundationPose could be used in industrial applications to help autonomous robots identify and track the objects they interact with. It could also be used in augmented reality applications where an AI model is used to overlay visuals on a live scene.

NeRFDeformer Transforms 3D Scenes With a Single Snapshot

A NeRF is an AI model that can render a 3D scene based on a series of 2D images taken from different positions in the environment. In fields like robotics, NeRFs can be used to generate immersive 3D renders of complex real-world scenes, such as a cluttered room or a construction site. However, to make any changes, developers would need to manually define how the scene has transformed — or remake the NeRF entirely.

Researchers from the University of Illinois Urbana-Champaign and NVIDIA have simplified the process with NeRFDeformer. The method, being presented at CVPR, can successfully transform an existing NeRF using a single RGB-D image, which is a combination of a normal photo and a depth map that captures how far each object in a scene is from the camera.

VILA Visual Language Model Gets the Picture

A CVPR research collaboration between NVIDIA and the Massachusetts Institute of Technology is advancing the state of the art for vision language models, which are generative AI models that can process videos, images and text.

The group developed VILA, a family of open-source visual language models that outperforms prior neural networks on key benchmarks that test how well AI models answer questions about images. VILA’s unique pretraining process unlocked new model capabilities, including enhanced world knowledge, stronger in-context learning and the ability to reason across multiple images.

VILA can understand memes and reason based on multiple images or video frames.

The VILA model family can be optimized for inference using the NVIDIA TensorRT-LLM open-source library and can be deployed on NVIDIA GPUs in data centers, workstations and even edge devices.

Read more about VILA on the NVIDIA Technical Blog and GitHub.

Generative AI Fuels Autonomous Driving, Smart City Research

A dozen of the NVIDIA-authored CVPR papers focus on autonomous vehicle research. Other AV-related highlights include:

Also at CVPR, NVIDIA contributed the largest ever indoor synthetic dataset to the AI City Challenge, helping researchers and developers advance the development of solutions for smart cities and industrial automation. The challenge’s datasets were generated using NVIDIA Omniverse, a platform of APIs, SDKs and services that enable developers to build Universal Scene Description (OpenUSD)-based applications and workflows.

NVIDIA Research has hundreds of scientists and engineers worldwide, with teams focused on topics including AI, computer graphics, computer vision, self-driving cars and robotics. Learn more about NVIDIA Research at CVPR.

Read More

NVIDIA Research Wins CVPR Autonomous Grand Challenge for End-to-End Driving

Making moves to accelerate self-driving car development, NVIDIA was today named an Autonomous Grand Challenge winner at the Computer Vision and Pattern Recognition (CVPR) conference, running this week in Seattle.

Building on last year’s win in 3D Occupancy Prediction, NVIDIA Research topped the leaderboard this year in the End-to-End Driving at Scale category with its Hydra-MDP model, outperforming more than 400 entries worldwide.

This milestone shows the importance of generative AI in building applications for physical AI deployments in autonomous vehicle (AV) development. The technology can also be applied to industrial environments, healthcare, robotics and other areas.

The winning submission received CVPR’s Innovation Award as well, recognizing NVIDIA’s approach to improving “any end-to-end driving model using learned open-loop proxy metrics.”

In addition, NVIDIA announced NVIDIA Omniverse Cloud Sensor RTX, a set of microservices that enable physically accurate sensor simulation to accelerate the development of fully autonomous machines of every kind.

How End-to-End Driving Works

The race to develop self-driving cars isn’t a sprint but more a never-ending triathlon, with three distinct yet crucial parts operating simultaneously: AI training, simulation and autonomous driving. Each requires its own accelerated computing platform, and together, the full-stack systems purpose-built for these steps form a powerful triad that enables continuous development cycles, always improving in performance and safety.

To accomplish this, a model is first trained on an AI supercomputer such as NVIDIA DGX. It’s then tested and validated in simulation — using the NVIDIA Omniverse platform and running on an NVIDIA OVX system — before entering the vehicle, where, lastly, the NVIDIA DRIVE AGX platform processes sensor data through the model in real time.

Building an autonomous system to navigate safely in the complex physical world is extremely challenging. The system needs to perceive and understand its surrounding environment holistically, then make correct, safe decisions in a fraction of a second. This requires human-like situational awareness to handle potentially dangerous or rare scenarios.

AV software development has traditionally been based on a modular approach, with separate components for object detection and tracking, trajectory prediction, and path planning and control.

End-to-end autonomous driving systems streamline this process using a unified model to take in sensor input and produce vehicle trajectories, helping avoid overcomplicated pipelines and providing a more holistic, data-driven approach to handle real-world scenarios.
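A minimal sketch of this unified-model idea follows, with a tiny linear function standing in for the deep network and invented feature sizes. Everything here is an illustrative assumption, not Hydra-MDP itself:

```python
# Minimal sketch of the end-to-end idea: one model maps fused sensor
# features directly to future trajectory waypoints, replacing separate
# detection / prediction / planning modules. Shapes and the tiny linear
# "model" are illustrative assumptions only.
import numpy as np

rng = np.random.default_rng(0)

def end_to_end_policy(camera_feat, lidar_feat, history, W):
    """Map concatenated sensor features + ego history to (T, 2) waypoints."""
    x = np.concatenate([camera_feat, lidar_feat, history.ravel()])
    out = W @ x                     # stand-in for a deep network
    return out.reshape(-1, 2)       # T waypoints of (x, y)

# toy inputs: 8 camera features, 8 lidar features, 4 past (x, y) poses
camera = rng.normal(size=8)
lidar = rng.normal(size=8)
history = rng.normal(size=(4, 2))
W = rng.normal(size=(10, 8 + 8 + 8)) * 0.1  # predicts 5 (x, y) waypoints

trajectory = end_to_end_policy(camera, lidar, history, W)
print(trajectory.shape)  # (5, 2)
```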

Watch a video about the Hydra-MDP model, winner of the CVPR Autonomous Grand Challenge for End-to-End Driving:

Navigating the Grand Challenge 

This year’s CVPR challenge asked participants to develop an end-to-end AV model, trained using the nuPlan dataset, to generate a driving trajectory based on sensor data.

The models were submitted for testing inside the open-source NAVSIM simulator and were tasked with navigating thousands of scenarios they hadn’t experienced yet. Model performance was scored based on metrics for safety, passenger comfort and deviation from the original recorded trajectory.
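A hypothetical scoring function in the spirit of those metrics: a hard safety gate multiplied into soft comfort and deviation terms. The weights and formula are assumptions for illustration, not the official NAVSIM score:

```python
# Illustrative trajectory scoring: any collision zeroes the score;
# otherwise comfort and deviation from the recorded path are blended.
# Weights and the max_dev cutoff are made-up assumptions.
def score_trajectory(collision_free, comfort, deviation_m, max_dev=5.0):
    """Return a score in [0, 1] for a candidate trajectory."""
    if not collision_free:
        return 0.0
    deviation_term = max(0.0, 1.0 - deviation_m / max_dev)
    return 0.5 * comfort + 0.5 * deviation_term
```

For example, a collision-free trajectory with comfort 0.9 that deviates 1 meter would score 0.85 under these assumed weights, while any collision scores 0.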

NVIDIA Research’s winning end-to-end model ingests camera and lidar data, as well as the vehicle’s trajectory history, to generate a safe, optimal vehicle path for the five seconds following sensor input.

The workflow NVIDIA researchers used to win the competition can be replicated in high-fidelity simulated environments with NVIDIA Omniverse. This means AV simulation developers can recreate the workflow in a physically accurate environment before testing their AVs in the real world. NVIDIA Omniverse Cloud Sensor RTX microservices will be available later this year. Sign up for early access.

In addition, NVIDIA ranked second for its submission to the CVPR Autonomous Grand Challenge for Driving with Language. NVIDIA’s approach connects vision language models and autonomous driving systems, integrating the power of large language models to help make decisions and achieve generalizable, explainable driving behavior.

Learn More at CVPR 

More than 50 NVIDIA papers were accepted to this year’s CVPR, on topics spanning automotive, healthcare, robotics and more. Over a dozen papers will cover NVIDIA automotive-related research, including:

Sanja Fidler, vice president of AI research at NVIDIA, will speak on vision language models at the CVPR Workshop on Autonomous Driving.

Learn more about NVIDIA Research, a global team of hundreds of scientists and engineers focused on topics including AI, computer graphics, computer vision, self-driving cars and robotics.

Read More

NVIDIA Advances Physical AI at CVPR With Largest Indoor Synthetic Dataset

NVIDIA contributed the largest ever indoor synthetic dataset to the Computer Vision and Pattern Recognition (CVPR) conference’s annual AI City Challenge — helping researchers and developers advance the development of solutions for smart cities and industrial automation.

The challenge, which garnered over 700 teams from nearly 50 countries, tasked participants with developing AI models to enhance operational efficiency in physical settings, such as retail and warehouse environments, and intelligent traffic systems.

Teams tested their models on the datasets that were generated using NVIDIA Omniverse, a platform of application programming interfaces (APIs), software development kits (SDKs) and services that enable developers to build Universal Scene Description (OpenUSD)-based applications and workflows.

Creating and Simulating Digital Twins for Large Spaces

In large indoor spaces like factories and warehouses, daily activities involve a steady stream of people, small vehicles and future autonomous robots. Developers need solutions that can observe and measure activities, optimize operational efficiency, and prioritize human safety in complex, large-scale settings.

Researchers are addressing that need with computer vision models that can perceive and understand the physical world. These models can be used in applications like multi-camera tracking, in which a model tracks multiple entities within a given environment.

To ensure their accuracy, the models must be trained on large, ground-truth datasets for a variety of real-world scenarios. But collecting that data can be a challenging, time-consuming and costly process.

AI researchers are turning to physically based simulations — such as digital twins of the physical world — to enhance AI simulation and training. These virtual environments can help generate synthetic data used to train AI models. Simulation also provides a way to run a multitude of “what-if” scenarios in a safe environment while addressing privacy and AI bias issues.

Creating synthetic data is important for AI training because it offers large amounts of scalable, expandable data. Teams can generate a diverse set of training data by changing many parameters, including lighting, object locations, textures and colors.
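The parameter-randomization idea can be sketched as below. The parameter names and ranges are invented for illustration; Omniverse Replicator exposes its own APIs for this:

```python
# Illustrative domain-randomization sketch: sample scene parameters to
# diversify synthetic training data. All parameter names and ranges are
# hypothetical, not Omniverse Replicator's actual interface.
import random

def sample_scene(rng):
    return {
        "light_intensity": rng.uniform(200, 2000),   # lux
        "object_x": rng.uniform(-5.0, 5.0),          # meters
        "object_y": rng.uniform(-5.0, 5.0),
        "texture": rng.choice(["wood", "metal", "fabric"]),
        "color_hue": rng.uniform(0, 360),            # degrees
    }

rng = random.Random(42)
scenes = [sample_scene(rng) for _ in range(1000)]   # 1,000 varied scenes
```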

Building Synthetic Datasets for the AI City Challenge

This year’s AI City Challenge consists of five computer vision challenge tracks that span traffic management to worker safety.

NVIDIA contributed datasets for the first track, Multi-Camera Person Tracking, which saw the highest participation, with over 400 teams. The challenge used a benchmark and the largest synthetic dataset of its kind — comprising 212 hours of 1080p videos at 30 frames per second spanning 90 scenes across six virtual environments, including a warehouse, retail store and hospital.

Created in Omniverse, these scenes simulated nearly 1,000 cameras and featured around 2,500 digital human characters. Omniverse also provided a way for the researchers to generate data of the right size and fidelity to achieve the desired outcomes.
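Quick arithmetic on the quoted dataset figures shows the scale involved:

```python
# Frame count implied by the dataset figures quoted above
# (212 hours of video at 30 frames per second).
HOURS = 212
FPS = 30

total_frames = HOURS * 3600 * FPS
print(f"{total_frames:,} frames")  # 22,896,000
```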

The benchmarks were created using Omniverse Replicator in NVIDIA Isaac Sim, a reference application that enables developers to design, simulate and train AI for robots, smart spaces or autonomous machines in physically based virtual environments built on NVIDIA Omniverse.

Omniverse Replicator, an SDK for building synthetic data generation pipelines, automated many manual tasks involved in generating quality synthetic data, including domain randomization, camera placement and calibration, character movement, and semantic labeling of data and ground-truth for benchmarking.

Ten institutions and organizations are collaborating with NVIDIA for the AI City Challenge:

  • Australian National University, Australia
  • Emirates Center for Mobility Research, UAE
  • Indian Institute of Technology Kanpur, India
  • Iowa State University, U.S.
  • Johns Hopkins University, U.S.
  • National Yang Ming Chiao Tung University, Taiwan
  • Santa Clara University, U.S.
  • The United Arab Emirates University, UAE
  • University at Albany – SUNY, U.S.
  • Woven by Toyota, Japan

Driving the Future of Generative Physical AI 

Researchers and companies around the world are developing infrastructure automation and robots powered by physical AI — which are models that can understand instructions and autonomously perform complex tasks in the real world.

Generative physical AI uses reinforcement learning in simulated environments, where it perceives the world using accurately simulated sensors, performs actions grounded by laws of physics, and receives feedback to reason about the next set of actions.
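That perceive-act-feedback loop can be illustrated with a toy one-dimensional environment. The "policy search" here is crude hill-climbing over a single parameter, a deliberately simplified stand-in for real reinforcement learning algorithms and simulators:

```python
# Schematic perceive-act-feedback loop in a toy 1-D world: the agent
# senses its distance to a goal, acts under simple dynamics, and
# accumulates reward that guides the choice of policy parameter.
# Purely illustrative; real physical AI uses full RL and simulation.
def run_episode(action_gain, steps=20):
    pos, goal = 0.0, 1.0
    reward = 0.0
    for _ in range(steps):
        obs = goal - pos                 # simulated sensor reading
        pos += action_gain * obs         # act, grounded by simple dynamics
        reward -= abs(goal - pos)        # feedback for the next decision
    return reward

# crude search over the policy parameter using the reward signal
best = max([g / 10 for g in range(1, 11)], key=run_episode)
```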

Developers can tap into developer SDKs and APIs, such as the NVIDIA Metropolis developer stack — which includes a multi-camera tracking reference workflow — to add enhanced perception capabilities for factories, warehouses and retail operations. And with the latest release of NVIDIA Isaac Sim, developers can supercharge robotics workflows by simulating and training AI-based robots in physically based virtual spaces before real-world deployment.

Researchers and developers are also combining high-fidelity, physics-based simulation with advanced AI to bridge the gap between simulated training and real-world application. This helps ensure that synthetic training environments closely mimic real-world conditions for more seamless robot deployment.

NVIDIA is taking the accuracy and scale of simulations further with the recently announced NVIDIA Omniverse Cloud Sensor RTX, a set of microservices that enable physically accurate sensor simulation to accelerate the development of fully autonomous machines.

This technology will allow autonomous systems, whether a factory, vehicle or robot, to gather essential data to effectively perceive, navigate and interact with the real world. Using these microservices, developers can run large-scale tests on sensor perception within realistic, virtual environments, significantly reducing the time and cost associated with real-world testing.

Omniverse Cloud Sensor RTX microservices will be available later this year. Sign up for early access.

Showcasing Advanced AI With Research

Participants submitted research papers for the AI City Challenge and a few achieved top rankings, including:

All accepted papers will be presented at the AI City Challenge 2024 Workshop, taking place on June 17.

At CVPR 2024, NVIDIA Research will present over 50 papers, introducing generative physical AI breakthroughs with potential applications in areas like autonomous vehicle development and robotics.

Papers that used NVIDIA Omniverse to generate synthetic data or digital twins of environments for model simulation, testing and validation include:

Read more about NVIDIA Research at CVPR, and learn more about the AI City Challenge.

Get started with NVIDIA Omniverse by downloading the standard license free, access OpenUSD resources and learn how Omniverse Enterprise can connect teams. Follow Omniverse on Instagram, Medium, LinkedIn and X. For more, join the Omniverse community on the forums, Discord server, Twitch and YouTube channels.

Read More

‘Believe in Something Unconventional, Something Unexplored,’ NVIDIA CEO Tells Caltech Grads

NVIDIA founder and CEO Jensen Huang on Friday encouraged Caltech graduates to pursue their craft with dedication and resilience — and to view setbacks as new opportunities.

“I hope you believe in something. Something unconventional, something unexplored. But let it be informed, and let it be reasoned, and dedicate yourself to making that happen,” he said. “You may find your GPU. You may find your CUDA. You may find your generative AI. You may find your NVIDIA.”

Trading his signature leather jacket for black and yellow academic regalia, Huang addressed the nearly 600 graduates at their commencement ceremony in Pasadena, Calif., starting with the tale of the computing industry’s decades-long evolution to reach this pivotal moment of AI transformation.

“Computers today are the single most important instrument of knowledge, and it’s foundational to every single industry in every field of science,” Huang said. “As you enter industry, it’s important you know what’s happening.”

He shared how, over a decade ago, NVIDIA — a small company at the time — bet on deep learning, investing billions of dollars and years of engineering resources to reinvent every computing layer.

“No one knew how far deep learning could scale, and if we didn’t build it, we’d never know,” Huang said. Referencing the famous line from Field of Dreams — if you build it, he will come — he said, “Our logic is: If we don’t build it, they can’t come.”

Looking to the future, Huang said, the next wave of AI is robotics, a field where NVIDIA’s journey resulted from a series of setbacks.

He reflected on a period in NVIDIA’s past when the company each year built new products that “would be incredibly successful, generate enormous amounts of excitement. And then one year later, we were kicked out of those markets.”

These roadblocks pushed NVIDIA to seek out untapped areas — what Huang refers to as “zero-billion-dollar markets.”

“With no more markets to turn to, we decided to build something where we are sure there are no customers,” Huang said. “Because one of the things you can definitely guarantee is where there are no customers, there are also no competitors.”

Robotics was that new market. NVIDIA built the first robotics computer to process a deep learning algorithm, Huang said. Over a decade later, that pivot has given the company the opportunity to create the next wave of AI.

“One setback after another, we shook it off and skated to the next opportunity. Each time, we gain skills and strengthen our character,” Huang said. “No setback that comes our way doesn’t look like an opportunity these days.”

Huang stressed the importance of resilience and agility as superpowers that strengthen character.

“The world can be unfair and deal you tough cards. Swiftly shake it off,” he said, with a tongue-in-cheek reference to one of Taylor Swift’s biggest hits. “There’s another opportunity out there — or create one.”

Huang concluded by sharing a story from his travels to Japan, where, as he watched a gardener painstakingly tending to Kyoto’s famous moss garden, he realized that when a person is dedicated to their craft and prioritizes doing their life’s work, they always have plenty of time.

“Prioritize your life,” he said, “and you will have plenty of time to do the important things.”

Main image courtesy of Caltech. 

Read More

NVIDIA Releases Open Synthetic Data Generation Pipeline for Training Large Language Models

NVIDIA today announced Nemotron-4 340B, a family of open models that developers can use to generate synthetic data for training large language models (LLMs) for commercial applications across healthcare, finance, manufacturing, retail and every other industry.

High-quality training data plays a critical role in the performance, accuracy and quality of responses from a custom LLM — but robust datasets can be prohibitively expensive and difficult to access.

Through a uniquely permissive open model license, Nemotron-4 340B gives developers a free, scalable way to generate synthetic data that can help build powerful LLMs.

The Nemotron-4 340B family includes base, instruct and reward models that form a pipeline to generate synthetic data used for training and refining LLMs. The models are optimized to work with NVIDIA NeMo, an open-source framework for end-to-end model training, including data curation, customization and evaluation. They’re also optimized for inference with the open-source NVIDIA TensorRT-LLM library.

Nemotron-4 340B can be downloaded now from Hugging Face. Developers will soon be able to access the models at ai.nvidia.com, where they’ll be packaged as an NVIDIA NIM microservice with a standard application programming interface that can be deployed anywhere.

Navigating Nemotron to Generate Synthetic Data

LLMs can help developers generate synthetic training data in scenarios where access to large, diverse labeled datasets is limited.

The Nemotron-4 340B Instruct model creates diverse synthetic data that mimics the characteristics of real-world data, helping improve data quality to increase the performance and robustness of custom LLMs across various domains.

Then, to boost the quality of the AI-generated data, developers can use the Nemotron-4 340B Reward model to filter for high-quality responses. Nemotron-4 340B Reward grades responses on five attributes: helpfulness, correctness, coherence, complexity and verbosity. It currently ranks first on the Hugging Face RewardBench leaderboard, created by AI2, for evaluating the capabilities, safety and pitfalls of reward models.

In this synthetic data generation pipeline, (1) the Nemotron-4 340B Instruct model is first used to produce synthetic text-based output. An evaluator model, (2) Nemotron-4 340B Reward, then assesses this generated text — providing feedback that guides iterative improvements and ensures the synthetic data is accurate, relevant and aligned with specific requirements.
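The generate-then-filter pattern might be sketched as follows. The grading function here is a stand-in for a real reward-model call; only the five attribute names come from the article:

```python
# Sketch of the two-stage pipeline described above: an instruct model
# proposes synthetic examples and a reward model filters them. The
# grade function is a hypothetical stand-in for a real model call.
ATTRIBUTES = ["helpfulness", "correctness", "coherence", "complexity", "verbosity"]

def filter_synthetic(candidates, grade, min_score=3.5):
    """Keep candidates whose mean attribute score clears a threshold."""
    kept = []
    for text in candidates:
        scores = grade(text)                      # dict: attribute -> score
        mean = sum(scores[a] for a in ATTRIBUTES) / len(ATTRIBUTES)
        if mean >= min_score:
            kept.append(text)
    return kept

# toy stand-in grader: longer responses score higher (illustration only)
def toy_grade(text):
    s = min(5.0, len(text) / 10)
    return {a: s for a in ATTRIBUTES}

kept = filter_synthetic(
    ["short", "a considerably longer and more detailed response"], toy_grade
)
```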

Researchers can also create their own instruct or reward models by customizing the Nemotron-4 340B Base model using their proprietary data, combined with the included HelpSteer2 dataset.

Fine-Tuning With NeMo, Optimizing for Inference With TensorRT-LLM

Using open-source NVIDIA NeMo and NVIDIA TensorRT-LLM, developers can optimize the efficiency of their instruct and reward models to generate synthetic data and to score responses.

All Nemotron-4 340B models are optimized with TensorRT-LLM to take advantage of tensor parallelism, a type of model parallelism in which individual weight matrices are split across multiple GPUs and servers, enabling efficient inference at scale.
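Tensor parallelism can be illustrated in miniature: split a weight matrix column-wise across two "devices", compute partial results on each, and concatenate. NumPy arrays stand in for GPU memory here; TensorRT-LLM manages the real sharding across devices and servers:

```python
# Toy illustration of tensor parallelism: each "GPU" holds a shard of
# the weight matrix and computes a slice of the output independently.
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=4)            # input activation
W = rng.normal(size=(4, 6))       # full weight matrix

# split the output dimension across two workers
W0, W1 = W[:, :3], W[:, 3:]
partial0 = x @ W0                 # computed on "GPU 0"
partial1 = x @ W1                 # computed on "GPU 1"
y_parallel = np.concatenate([partial0, partial1])

y_full = x @ W                    # single-device reference result
```

The sharded result matches the single-device computation exactly, which is why the split is free of accuracy cost while halving each device's share of the weights.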

Nemotron-4 340B Base, trained on 9 trillion tokens, can be customized using the NeMo framework to adapt to specific use cases or domains. This fine-tuning process benefits from extensive pretraining data and yields more accurate outputs for specific downstream tasks.

A variety of customization methods are available through the NeMo framework, including supervised fine-tuning and parameter-efficient fine-tuning methods such as low-rank adaptation, or LoRA.
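A minimal sketch of the LoRA idea, with illustrative dimensions: the pretrained weight stays frozen while a small low-rank update B·A is trained, cutting the trainable parameter count dramatically:

```python
# Minimal numpy sketch of low-rank adaptation (LoRA): instead of
# updating a frozen weight matrix W, train a small low-rank update
# B @ A. Dimensions here are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
d, k, r = 512, 512, 8               # layer dims and LoRA rank

W = rng.normal(size=(d, k))         # frozen pretrained weight
A = rng.normal(size=(r, k)) * 0.01  # trainable, rank r
B = np.zeros((d, r))                # trainable, starts at zero

def adapted_forward(x):
    # effective weight is W + B @ A; the product is never materialized
    return W @ x + B @ (A @ x)

trainable = A.size + B.size
frozen = W.size
print(f"trainable fraction: {trainable / frozen:.4f}")  # 0.0312
```

Because B starts at zero, the adapted model initially behaves exactly like the pretrained one, and only about 3% of the layer's parameters are trained at this rank.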

To boost model quality, developers can align their models with NeMo Aligner and datasets annotated by Nemotron-4 340B Reward. Alignment is a key step in training LLMs, where a model’s behavior is fine-tuned using algorithms like reinforcement learning from human feedback (RLHF) to ensure its outputs are safe, accurate, contextually appropriate and consistent with its intended goals.

Businesses seeking enterprise-grade support and security for production environments can also access NeMo and TensorRT-LLM through the cloud-native NVIDIA AI Enterprise software platform, which provides accelerated and efficient runtimes for generative AI foundation models.

Evaluating Model Security and Getting Started

The Nemotron-4 340B Instruct model underwent extensive safety evaluation, including adversarial tests, and performed well across a wide range of risk indicators. Users should still perform careful evaluation of the model’s outputs to ensure the synthetically generated data is suitable, safe and accurate for their use case.

For more information on model security and safety evaluation, read the model card.

Download Nemotron-4 340B models via Hugging Face. For more details, read the research papers on the model and dataset.

See notice regarding software product information.

Read More

‘The Proudest Refugee’: How Veronica Miller Charts Her Own Path at NVIDIA

When Veronica Miller (née Teklai) was five years old, she and her family left their homeland of Eritrea, in the Horn of Africa, to escape an ongoing war with Ethiopia and build a new life in the U.S.

She grew up in East Orange, New Jersey, watching others judge her parents and turn them away from jobs they were qualified for because of their appearance, their accented English or their unfamiliar names.

After working in the shipping industry for 20 years, Miller’s dad eventually became a New York City cab driver, an often-dangerous job in the 1980s. Her mom, despite earning a computer science degree in the U.S., trained to become a home health aide, a field where jobs were more available.

“My parents’ resilience and courage made my life possible,” Miller said.

After graduating from Ramapo College of New Jersey with a degree in international business, Miller worked at large automotive companies in client support, production support and project management.

Now working as a technical program manager in product security at NVIDIA, she feels like her family’s journey has come full circle.

“It’s the honor of my life being here at NVIDIA: I’m the proudest refugee,” she said.

In her role, Miller functions like a conductor in an orchestra. She works with engineers to bridge gaps and understand challenges to define solutions — always trying to create opportunities to turn a “no” into a “yes” through collaboration.

At NVIDIA, Miller feels she can be herself, which helps her thrive. She no longer feels pressure to conform to fit in, allowing her creativity to flow freely as she solves problems.

“Previously in my career, I never wore my hair curly. After someone once asked to touch my curly hair, I believed it would be easier to make myself look like everyone else. I thought it was the best way to let my work be the focus instead of my hair,” she said. “NVIDIA is the first employer that encouraged me to bring my full self to work.”

Outside of work, Miller and her husband, Nathan, are passionate about paying it forward and helping local youth in Trenton, New Jersey. Together, they founded The Miller Family Foundation to help with community needs, including education. The foundation’s scholarship fund has donated $20,000 to low-income high school students to provide support for college tuition and career mentorship.

“I truly believe anyone could get here. There wasn’t anyone that showed me the path. It was belief in myself, a ton of research and endless hard work,” she said. “We’re in a special place where my husband and I can give the next generation some of the financial support and career guidance we didn’t have.”

Learn more about NVIDIA life, culture and careers

Read More

Cloud Ahoy! Treasure Awaits With ‘Sea of Thieves’ on GeForce NOW

Set sail for adventure, pirates. Sea of Thieves makes waves in the cloud this week. It’s an adventure-filled GFN Thursday with four new games joining the GeForce NOW library.

#GreetingsFromGFN by Cloud Gaming Photography.

Plus, members are sharing their favorite locations they can access from the cloud. Follow along all month on @NVIDIAGFN social media accounts and post your own favorite cloud screenshots using #GreetingsfromGFN.

Seas the Day

Live the pirate life in the smash-hit pirate adventure game from Rare and Xbox Game Studios. Sea of Thieves takes place in an open world where players can explore the vast seas, engage in ship battles, hunt for treasure and embark on exciting quests.

Come sea what’s possible in the cloud.

The Sea of Thieves environment is always changing, as various seasons bring new features to the game and offer rich rewards for pirates old and new. Visit uncharted islands in search of treasure, dive deep into narrative-focused Tall Tales, take part in events and forge a path to become a true Pirate Legend. The newest season features the mysterious Sunken City, Cursed Sloop skeleton ships and fresh cosmetics.

Every pirate needs a crew, so grab some mateys and carve a fearsome reputation across the open seas, or adventure solo to keep all the bountiful treasure. Make the journey more rewarding with a GeForce NOW Ultimate membership, which offers gaming sessions of up to eight hours and lets members play with gamers across the world for a kraken good time.

New Games Zoom Onto the Cloud

Take the tracks by storm.

Drift into the ultimate hero-based combat racing game in Disney Speedstorm, a free-to-play kart-racing game that features characters and high-speed circuits inspired by beloved Disney and Pixar worlds. Customize racers and karts, master each character’s unique skills and engage in thrilling multiplayer races. Whether exploring the docks of the Pirates’ Island track from Pirates of the Caribbean or the wilds of the Jungle Ruins map from The Jungle Book, players can experience iconic environments in the game.

Check out the list of new games this week:

  • SunnySide (New release on Steam, June 14)
  • Disney Speedstorm (Steam and Xbox, available on PC Game Pass)
  • Sea of Thieves (Steam and Xbox, available on PC Game Pass)
  • Bodycam (Steam)

What are you planning to play this weekend? Let us know on X or in the comments below.


Read More

Every Company’s Data Is Their ‘Gold Mine,’ NVIDIA CEO Says at Databricks Data + AI Summit

Accelerated computing is transforming data processing and analytics for enterprises, declared NVIDIA founder and CEO Jensen Huang Wednesday during an on-stage chat with Databricks cofounder and CEO Ali Ghodsi at the Databricks Data + AI Summit 2024.

“Every company’s business data is their gold mine,” Huang said, explaining that every company has enormous amounts of data, but extracting insights and distilling intelligence from it has been challenging.

Databricks Leverages NVIDIA’s Full Stack to Accelerate Generative AI Applications

To unlock all that intelligence, Huang and Ghodsi announced the integration of NVIDIA’s accelerated computing with Databricks Photon, Databricks’ engine for fast data processing, designed to power Databricks SQL with top-tier performance and cost efficiency.

“This is a big announcement,” Huang said, adding that accelerated computing and generative AI are the two most important technological trends today. “NVIDIA and Databricks are going to partner to combine our skills in these areas and bring them to all of you.”

Huang shared that it’s taken NVIDIA five years to build a set of libraries that make it possible to accelerate Photon, allowing users to “wrangle data faster, more cost-effectively and consume a lot less energy.”

“We are super-excited to partner with you to use GPU acceleration on the Photon engine to enhance core data processing and get them to also run on NVIDIA GPUs,” Ghodsi said.

Creating Generative AI Factories With NVIDIA NIM

NVIDIA and Databricks also announced that Databricks’ open-source model DBRX is now available as an NVIDIA NIM microservice hosted on the NVIDIA API catalog.

NVIDIA NIM inference microservices provide models as fully optimized, pre-built containers for deployment anywhere.

“Creating these endpoints is complicated,” Huang explained. “We optimized everything into a microservice, which runs on every cloud and on premises.”

Microservices dramatically increase enterprise developer productivity by providing a simple, standardized way to add generative AI models to applications.

Launched in March, DBRX was built entirely on top of Databricks, leveraging all the tools and techniques available to Databricks customers and partners, and was trained with NVIDIA DGX Cloud, a scalable end-to-end AI platform for developers.

Organizations can customize DBRX with enterprise data to create high-quality, organization-specific models or use it as a reference architecture to build a custom DBRX-style mixture-of-experts model.

Huang said that accelerating data processing is a huge opportunity, encouraging everyone to put accelerated computing and generative AI to work.

“Whatever you do, just start — you have to engage in this incredibly fast-moving train,” Huang said. “Remember, generative AI is growing exponentially — you don’t want to wait and observe an exponential trend, because in a couple of years, you’ll be so far behind.”

Joining the Conversation

Attendees at the summit are encouraged to participate in sessions and engage with NVIDIA experts to learn more about how NVIDIA and Databricks are driving the future of AI and data intelligence.

Key sessions, taking place June 13, include:

  • “Development and Deployment of Generative AI with NVIDIA” at 12:30 p.m. PT
  • “Spark RAPIDS ML: GPU Accelerated Distributed ML in Spark Clusters” at 1:30 p.m. PT
  • “Architecture Analysis for ETL Processing: CPU vs. GPU” at 4:30 p.m. PT

Read More

Scaling to New Heights: NVIDIA MLPerf Training Results Showcase Unprecedented Performance and Elasticity

The full-stack NVIDIA accelerated computing platform has once again demonstrated exceptional performance in the latest MLPerf Training v4.0 benchmarks.

NVIDIA more than tripled the performance on the large language model (LLM) benchmark, based on GPT-3 175B, compared to the record-setting NVIDIA submission made last year. Using an AI supercomputer featuring 11,616 NVIDIA H100 Tensor Core GPUs connected with NVIDIA Quantum-2 InfiniBand networking, NVIDIA achieved this remarkable feat through larger scale — more than triple that of the 3,584 H100 GPU submission a year ago — and extensive full-stack engineering.

Thanks to the scalability of the NVIDIA AI platform, Eos can now train massive AI models like GPT-3 175B even faster, and this great AI performance translates into significant business opportunities. For example, in NVIDIA’s recent earnings call, we described how LLM service providers can turn a single dollar invested into seven dollars in just four years running the Llama 3 70B model on NVIDIA HGX H200 servers. This return assumes an LLM service provider serving Llama 3 70B at $0.60/M tokens, with an HGX H200 server throughput of 24,000 tokens/second.
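The arithmetic behind that figure can be checked directly from the stated throughput and price; the implied server cost below is an inference, not a number from NVIDIA's statement.

```python
# Revenue check for the stated LLM-serving economics:
# 24,000 tokens/second served at $0.60 per million tokens, over 4 years.
tokens_per_sec = 24_000
price_per_token = 0.60 / 1_000_000      # $0.60 per million tokens
seconds_in_4_years = 4 * 365 * 24 * 3600

revenue = tokens_per_sec * price_per_token * seconds_in_4_years
print(f"${revenue:,.0f}")  # roughly $1.8M over four years

# A 7x return implies an all-in cost near revenue / 7 -- an assumption,
# since the statement does not break out the server cost.
implied_cost = revenue / 7
print(f"${implied_cost:,.0f}")
```

At full utilization, a single HGX H200 server would serve roughly 3 trillion tokens over four years, which is where the seven-to-one return comes from under these assumptions.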

NVIDIA H200 GPU Supercharges Generative AI and HPC 

The NVIDIA H200 Tensor Core GPU builds upon the strength of the Hopper architecture, with 141GB of HBM3e memory and over 40% more memory bandwidth compared to the H100 GPU. Pushing the boundaries of what’s possible in AI training, the H200 extended the H100’s performance by up to 47% in its MLPerf Training debut.
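The bandwidth figures below are taken from NVIDIA's published spec sheets for the two GPUs (an outside assumption, since the post only states "over 40%"); the quick check confirms the claim.

```python
# "Over 40% more memory bandwidth": checking against published specs
# (H100 SXM: 3.35 TB/s; H200: 4.8 TB/s).
h100_bw_tbs = 3.35
h200_bw_tbs = 4.8

increase = (h200_bw_tbs / h100_bw_tbs - 1) * 100
print(f"{increase:.0f}%")  # ~43%, consistent with "over 40%"
```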

NVIDIA Software Drives Unmatched Performance Gains

Additionally, our submissions using a 512 H100 GPU configuration are now up to 27% faster compared to just one year ago due to numerous optimizations to the NVIDIA software stack. This improvement highlights how continuous software enhancements can significantly boost performance, even with the same hardware.

This work also delivered nearly perfect scaling. As the number of GPUs increased by 3.2x — going from 3,584 H100 GPUs last year to 11,616 H100 GPUs with this submission — so did the delivered performance.
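The GPU counts come from the post; the 3.2x speedup factor below is the approximation the post itself uses ("more than tripled"), so the efficiency figure is a sketch rather than a measured number.

```python
# Scaling efficiency for the GPT-3 175B submission: GPU count grew
# ~3.24x (3,584 -> 11,616) and delivered performance grew by roughly
# the same factor, i.e. near-linear scaling.
gpus_2023 = 3_584
gpus_2024 = 11_616

gpu_ratio = gpus_2024 / gpus_2023
print(f"{gpu_ratio:.2f}x")    # ~3.24x more GPUs

# If performance also grows ~3.2x, scaling efficiency is speedup/ratio:
speedup = 3.2                 # "more than tripled", per the post
efficiency = speedup / gpu_ratio
print(f"{efficiency:.0%}")    # close to 100%, i.e. near-perfect scaling
```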

Learn more about these optimizations on the NVIDIA Technical Blog.

Excelling at LLM Fine-Tuning

As enterprises seek to customize pretrained large language models, LLM fine-tuning is becoming a key industry workload. MLPerf introduced a new LLM fine-tuning benchmark this round, based on the popular low-rank adaptation (LoRA) technique applied to Meta Llama 2 70B.

The NVIDIA platform excelled at this task, scaling from eight to 1,024 GPUs, with the largest-scale NVIDIA submission completing the benchmark in a record 1.5 minutes.

Accelerating Stable Diffusion and GNN Training

NVIDIA also accelerated Stable Diffusion v2 training performance by up to 80% at the same system scales submitted last round. These advances reflect numerous enhancements to the NVIDIA software stack, showcasing how software and hardware improvements go hand-in-hand to deliver top-tier performance.

On the new graph neural network (GNN) test based on R-GAT, the NVIDIA platform with H100 GPUs excelled at both small and large scales. The H200 delivered a 47% boost on single-node GNN training compared to the H100. This showcases the powerful performance and high efficiency of NVIDIA GPUs, which make them ideal for a wide range of AI applications.

Broad Ecosystem Support

Reflecting the breadth of the NVIDIA AI ecosystem, 10 NVIDIA partners submitted results, including ASUS, Dell Technologies, Fujitsu, GIGABYTE, Hewlett Packard Enterprise, Lenovo, Oracle, Quanta Cloud Technology, Supermicro and Sustainable Metal Cloud. This broad participation, along with the partners’ own impressive benchmark results, underscores the widespread adoption of and trust in NVIDIA’s AI platform across the industry.

MLCommons’ ongoing work to bring benchmarking best practices to AI computing is vital. By enabling peer-reviewed comparisons of AI and HPC platforms, and keeping pace with the rapid changes that characterize AI computing, MLCommons provides companies everywhere with crucial data that can help guide important purchasing decisions.

And with the NVIDIA Blackwell platform, next-level AI performance on trillion-parameter generative AI models for both training and inference is coming soon.

Read More