NVIDIA Releases Open Synthetic Data Generation Pipeline for Training Large Language Models

NVIDIA today announced Nemotron-4 340B, a family of open models that developers can use to generate synthetic data for training large language models (LLMs) for commercial applications across healthcare, finance, manufacturing, retail and every other industry.

High-quality training data plays a critical role in the performance, accuracy and quality of responses from a custom LLM — but robust datasets can be prohibitively expensive and difficult to access.

Through a uniquely permissive open model license, Nemotron-4 340B gives developers a free, scalable way to generate synthetic data that can help build powerful LLMs.

The Nemotron-4 340B family includes base, instruct and reward models that form a pipeline to generate synthetic data used for training and refining LLMs. The models are optimized to work with NVIDIA NeMo, an open-source framework for end-to-end model training, including data curation, customization and evaluation. They’re also optimized for inference with the open-source NVIDIA TensorRT-LLM library.

Nemotron-4 340B can be downloaded now from Hugging Face. Developers will soon be able to access the models at ai.nvidia.com, where they’ll be packaged as an NVIDIA NIM microservice with a standard application programming interface that can be deployed anywhere.

Navigating Nemotron to Generate Synthetic Data

LLMs can help developers generate synthetic training data in scenarios where access to large, diverse labeled datasets is limited.

The Nemotron-4 340B Instruct model creates diverse synthetic data that mimics the characteristics of real-world data, helping improve data quality to increase the performance and robustness of custom LLMs across various domains.

Then, to boost the quality of the AI-generated data, developers can use the Nemotron-4 340B Reward model to filter for high-quality responses. Nemotron-4 340B Reward grades responses on five attributes: helpfulness, correctness, coherence, complexity and verbosity. It currently holds first place on the Hugging Face RewardBench leaderboard, created by AI2, which evaluates the capabilities, safety and pitfalls of reward models.

Nemotron synthetic data generation pipeline diagram
In this synthetic data generation pipeline, (1) the Nemotron-4 340B Instruct model is first used to produce synthetic text-based output. An evaluator model, (2) Nemotron-4 340B Reward, then assesses this generated text — providing feedback that guides iterative improvements and ensures the synthetic data is accurate, relevant and aligned with specific requirements.
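
To make the generate-then-score flow concrete, here's a minimal Python sketch of how a developer might wire the two models together behind serving endpoints. The URLs, payload shapes, helper names and the 3.5 helpfulness threshold are illustrative assumptions rather than NVIDIA's documented API; the attribute names are the five listed above.

```python
# Minimal sketch of a generate-then-score synthetic data loop.
# The endpoints and payload shapes below are illustrative assumptions,
# not NVIDIA's documented API.
import requests

INSTRUCT_URL = "http://localhost:8000/v1/instruct"  # hypothetical serving endpoint
REWARD_URL = "http://localhost:8001/v1/reward"      # hypothetical serving endpoint

def generate_response(prompt: str) -> str:
    """Ask the instruct model for a synthetic response to a prompt."""
    resp = requests.post(INSTRUCT_URL, json={"prompt": prompt, "max_tokens": 512})
    resp.raise_for_status()
    return resp.json()["text"]

def score_response(prompt: str, response: str) -> dict:
    """Ask the reward model to grade a (prompt, response) pair on the five attributes."""
    resp = requests.post(REWARD_URL, json={"prompt": prompt, "response": response})
    resp.raise_for_status()
    return resp.json()["scores"]  # e.g. {"helpfulness": 3.9, "correctness": 4.1, ...}

def build_synthetic_dataset(prompts, min_helpfulness=3.5):
    """Keep only responses the reward model rates as sufficiently helpful."""
    kept = []
    for prompt in prompts:
        response = generate_response(prompt)
        scores = score_response(prompt, response)
        if scores["helpfulness"] >= min_helpfulness:
            kept.append({"prompt": prompt, "response": response, "scores": scores})
    return kept
```

In practice, the filtered examples would then feed the fine-tuning and alignment steps described below.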

Researchers can also create their own instruct or reward models by customizing the Nemotron-4 340B Base model using their proprietary data, combined with the included HelpSteer2 dataset.

Fine-Tuning With NeMo, Optimizing for Inference With TensorRT-LLM

Using open-source NVIDIA NeMo and NVIDIA TensorRT-LLM, developers can optimize the efficiency of their instruct and reward models to generate synthetic data and to score responses.

All Nemotron-4 340B models are optimized with TensorRT-LLM to take advantage of tensor parallelism, a type of model parallelism in which individual weight matrices are split across multiple GPUs and servers, enabling efficient inference at scale.
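
For intuition, the core idea of tensor parallelism can be sketched in a few lines: split a weight matrix column-wise, multiply each shard independently (as each GPU would do locally), then gather the partial outputs. This is a conceptual NumPy sketch, not TensorRT-LLM's implementation, which runs the shards on separate GPUs with fused kernels and collective communication.

```python
# Conceptual sketch of column-parallel matrix multiplication, the core idea
# behind tensor parallelism.
import numpy as np

def column_parallel_matmul(x, weight, num_shards):
    """Split `weight` column-wise into `num_shards` pieces, multiply each
    shard independently (as each GPU would), then concatenate the outputs."""
    shards = np.array_split(weight, num_shards, axis=1)  # one shard per "GPU"
    partial_outputs = [x @ shard for shard in shards]    # local matmuls
    return np.concatenate(partial_outputs, axis=-1)      # gather step

# The sharded result matches the unsharded computation.
x = np.random.randn(4, 1024)
w = np.random.randn(1024, 4096)
assert np.allclose(x @ w, column_parallel_matmul(x, w, num_shards=8))
```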

Nemotron-4 340B Base, trained on 9 trillion tokens, can be customized using the NeMo framework to adapt to specific use cases or domains. This fine-tuning process benefits from extensive pretraining data and yields more accurate outputs for specific downstream tasks.

A variety of customization methods are available through the NeMo framework, including supervised fine-tuning and parameter-efficient fine-tuning methods such as low-rank adaptation, or LoRA.
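
As a refresher on the technique, LoRA freezes the pretrained weight matrix and trains only a small low-rank update, so far fewer parameters need gradients. The sketch below shows the idea in PyTorch terms; it is illustrative and independent of NeMo's actual module names.

```python
# Minimal conceptual LoRA layer: a frozen base weight augmented with a
# trainable low-rank update (scaling * B @ A). This sketches the technique,
# not NeMo's implementation.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, in_features, out_features, rank=8, alpha=16):
        super().__init__()
        self.base = nn.Linear(in_features, out_features, bias=False)
        self.base.weight.requires_grad = False              # freeze pretrained weight
        self.lora_a = nn.Parameter(torch.randn(rank, in_features) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(out_features, rank))  # update starts at zero
        self.scaling = alpha / rank                          # standard LoRA scaling

    def forward(self, x):
        # y = x W^T + scaling * x (B A)^T
        return self.base(x) + self.scaling * (x @ self.lora_a.T @ self.lora_b.T)
```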

To boost model quality, developers can align their models with NeMo Aligner and datasets annotated by Nemotron-4 340B Reward. Alignment is a key step in training LLMs, where a model’s behavior is fine-tuned using algorithms like reinforcement learning from human feedback (RLHF) to ensure its outputs are safe, accurate, contextually appropriate and consistent with its intended goals.
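
One common way to turn reward-model scores into alignment data is to build preference pairs, matching a higher-scored response against a lower-scored one for the same prompt, which preference-based alignment algorithms can then consume. The snippet below sketches only that bookkeeping; the field names and the use of a mean attribute score are assumptions, not NeMo Aligner's format.

```python
# Illustrative sketch: convert reward-scored candidate responses into
# chosen/rejected preference pairs for preference-based alignment.
# Field names and the mean-score ranking are assumptions.

def to_preference_pairs(samples):
    """`samples` maps each prompt to a list of (response, scores) tuples,
    where `scores` is a dict over the five reward attributes."""
    pairs = []
    for prompt, candidates in samples.items():
        ranked = sorted(
            candidates,
            key=lambda item: sum(item[1].values()) / len(item[1]),  # mean attribute score
            reverse=True,
        )
        if len(ranked) >= 2:
            pairs.append({
                "prompt": prompt,
                "chosen": ranked[0][0],     # best-scored response
                "rejected": ranked[-1][0],  # worst-scored response
            })
    return pairs
```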

Businesses seeking enterprise-grade support and security for production environments can also access NeMo and TensorRT-LLM through the cloud-native NVIDIA AI Enterprise software platform, which provides accelerated and efficient runtimes for generative AI foundation models.

Evaluating Model Security and Getting Started

The Nemotron-4 340B Instruct model underwent extensive safety evaluation, including adversarial tests, and performed well across a wide range of risk indicators. Users should still perform careful evaluation of the model’s outputs to ensure the synthetically generated data is suitable, safe and accurate for their use case.

For more information on model security and safety evaluation, read the model card.

Download Nemotron-4 340B models via Hugging Face. For more details, read the research papers on the model and dataset.

See notice regarding software product information.

‘The Proudest Refugee’: How Veronica Miller Charts Her Own Path at NVIDIA

When she was five years old, Veronica Miller (née Teklai) and her family left their homeland of Eritrea, in the Horn of Africa, to escape an ongoing war with Ethiopia and create a new life in the U.S.

She grew up in East Orange, New Jersey, watching others judge her parents and turn them away from jobs they were qualified for because of their appearance, their accented English or their unfamiliar names.

After working in the shipping industry for 20 years, Miller’s dad eventually became a New York City cab driver, an often-dangerous job in the 1980s. Her mom, despite earning a computer science degree in the U.S., trained to become a home health aide, a field where jobs were more available.

“My parents’ resilience and courage made my life possible,” Miller said.

After graduating from Ramapo College of New Jersey with a degree in international business, Miller worked at large automotive companies in client support, production support and project management.

Now working as a technical program manager in product security at NVIDIA, she feels like her family’s journey has come full circle.

“It’s the honor of my life being here at NVIDIA: I’m the proudest refugee,” she said.

In her role, Miller functions like a conductor in an orchestra. She works with engineers to bridge gaps and understand challenges to define solutions — always trying to create opportunities to turn a “no” into a “yes” through collaboration.

At NVIDIA, Miller feels she can be herself, which helps her thrive. She no longer feels pressure to conform in order to fit in, allowing her creativity to flow freely as she solves problems.

“Previously in my career, I never wore my hair curly. After someone once asked to touch my curly hair, I believed it would be easier to make myself look like everyone else. I thought it was the best way to let my work be the focus instead of my hair,” she said. “NVIDIA is the first employer that encouraged me to bring my full self to work.”

Outside of work, Veronica and her husband, Nathan, are passionate about paying it forward and helping local youth in Trenton, New Jersey. Together, they’ve developed The Miller Family Foundation to help with community needs, including education. The foundation’s scholarship fund has donated $20,000 to low-income high school students to provide support for college tuition and career mentorship.

“I truly believe anyone could get here. There wasn’t anyone that showed me the path. It was belief in myself, a ton of research and endless hard work,” she said. “We’re in a special place where my husband and I can give the next generation some of the financial support and career guidance we didn’t have.”

Learn more about NVIDIA life, culture and careers

Server-side Rescoring of Spoken Entity-centric Knowledge Queries for Virtual Assistants

On-device Virtual Assistants powered by Automated Speech Recognition (ASR) require effective knowledge integration for the challenging task of entity-rich query recognition. In this paper, we conduct an empirical study of modeling strategies for server-side rescoring of spoken information domain queries using various categories of Language Models (N-Gram word Language Models, sub-word neural LMs). We investigate the combination of on-device and server-side signals, and demonstrate significant WER improvements of 23%-35% on various entity-centric query subpopulations by integrating various server-side…Apple Machine Learning Research
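
The paper's exact recipe isn't reproduced in the excerpt above, but the general pattern of server-side N-best rescoring can be sketched as a log-linear combination of on-device and server-side language model scores; the weight value and function names below are illustrative assumptions.

```python
# Generic sketch of N-best rescoring by interpolating an on-device score with
# a server-side language model score. Weights and score sources are
# illustrative, not the paper's exact method.

def rescore_nbest(hypotheses, server_lm_score, weight=0.4):
    """`hypotheses` is a list of (text, on_device_score) pairs, where scores
    are log-probabilities; `server_lm_score` maps text -> server LM log-prob."""
    rescored = [
        (text, (1.0 - weight) * device_score + weight * server_lm_score(text))
        for text, device_score in hypotheses
    ]
    # Return hypotheses sorted best-first by the combined score.
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)
```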

Hypernetworks for Personalizing ASR to Atypical Speech

*Equal Contributors
Parameter-efficient fine-tuning (PEFT) for personalizing automatic speech recognition (ASR) has recently shown promise for adapting general population models to atypical speech. However, these approaches assume a priori knowledge of the atypical speech disorder being adapted for — the diagnosis of which requires expert knowledge that is not always available. Even given this knowledge, data scarcity and high inter/intra-speaker variability further limit the effectiveness of traditional fine-tuning. To circumvent these challenges, we first identify the minimal set of model…Apple Machine Learning Research
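
As a generic illustration of the hypernetwork idea rather than the paper's architecture, a small network can map a speaker or utterance embedding to low-rank adapter weights, so the personalization is generated per speaker instead of fine-tuned per condition. Dimensions, layer choices and names below are assumptions.

```python
# Generic sketch of a hypernetwork that predicts low-rank adapter factors from
# a speaker/utterance embedding. Architecture and dimensions are illustrative.
import torch
import torch.nn as nn

class AdapterHypernetwork(nn.Module):
    def __init__(self, embed_dim=256, hidden=512, in_features=768, out_features=768, rank=8):
        super().__init__()
        self.rank, self.in_features, self.out_features = rank, in_features, out_features
        self.net = nn.Sequential(
            nn.Linear(embed_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, rank * (in_features + out_features)),
        )

    def forward(self, speaker_embedding):
        flat = self.net(speaker_embedding)
        # Split the flat output into the two low-rank adapter factors.
        a = flat[..., : self.rank * self.in_features].reshape(-1, self.rank, self.in_features)
        b = flat[..., self.rank * self.in_features :].reshape(-1, self.out_features, self.rank)
        return a, b  # per-speaker adapter factors applied as a low-rank weight update
```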

Time Sensitive Knowledge Editing through Efficient Finetuning

Large Language Models (LLMs) have demonstrated impressive capability in different tasks and are bringing transformative changes to many domains. However, keeping the knowledge in LLMs up-to-date remains a challenge once pretraining is complete. It is thus essential to design effective methods to both update obsolete knowledge and induce new knowledge into LLMs. Existing locate-and-edit knowledge editing (KE) methods suffer from two limitations. First, the post-edit LLMs produced by such methods generally have poor capability in answering complex queries that require multi-hop reasoning. Second, the…Apple Machine Learning Research

Transformer-based Model for ASR N-Best Rescoring and Rewriting

Voice assistants increasingly use on-device Automatic Speech Recognition (ASR) to ensure speed and privacy. However, due to resource constraints on the device, queries pertaining to complex information domains often require further processing by a search engine. For such applications, we propose a novel Transformer based model capable of rescoring and rewriting, by exploring full context of the N-best hypotheses in parallel. We also propose a new discriminative sequence training objective that can work well for both rescore and rewrite tasks. We show that our Rescore+Rewrite model outperforms…Apple Machine Learning Research

Cloud Ahoy! Treasure Awaits With ‘Sea of Thieves’ on GeForce NOW

Set sail for adventure, pirates. Sea of Thieves makes waves in the cloud this week. It’s an adventure-filled GFN Thursday with four new games joining the GeForce NOW library.

#GreetingsfromGFN
#GreetingsFromGFN by Cloud Gaming Photography.

Plus, members are sharing their favorite locations they can access from the cloud. Follow along all month on @NVIDIAGFN social media accounts and post your own favorite cloud screenshots using #GreetingsfromGFN.

Seas the Day

Live the pirate life in the smash-hit adventure game from Rare and Xbox Game Studios. Sea of Thieves takes place in an open world where players can explore the vast seas, engage in ship battles, hunt for treasure and embark on exciting quests.

Sea of Thieves on GeForce NOW
Come sea what’s possible in the cloud.

The Sea of Thieves environment is always changing, as various seasons bring new features to the game and offer rich rewards for pirates old and new. Visit uncharted islands in search of treasure, dive deep into narrative-focused Tall Tales, take part in events and forge a path to become a true Pirate Legend. The newest season features the mysterious Sunken City, Cursed Sloop skeleton ships and fresh cosmetics.

Every pirate needs a crew, so grab some mateys and carve a fearsome reputation across the open seas, or adventure solo to keep all the bountiful treasure. Make the journey more rewarding with a GeForce NOW Ultimate membership and play with gamers across the world in gaming sessions of up to eight hours for a kraken good time.

New Games Zoom Onto the Cloud

Disney Speedstorm on GeForce NOW
Take the tracks by storm.

Drift into the ultimate hero-based combat racing game in Disney Speedstorm, a free-to-play kart-racing game that features characters and high-speed circuits inspired by beloved Disney and Pixar worlds. Customize racers and karts, master each character’s unique skills and engage in thrilling multiplayer races. Whether exploring the docks of the Pirates’ Island track from Pirates of the Caribbean or the wilds of the Jungle Ruins map from The Jungle Book, players can experience iconic environments in the game.

Check out the list of new games this week:

  • SunnySide (New release on Steam, June 14)
  • Disney Speedstorm (Steam and Xbox, available on PC Game Pass)
  • Sea of Thieves (Steam and Xbox, available on PC Game Pass)
  • Bodycam (Steam)

What are you planning to play this weekend? Let us know on X or in the comments below.