Unleash the Dragonborn: ‘Elder Scrolls V: Skyrim Special Edition’ Joins GeForce NOW

Unleash the Dragonborn: ‘Elder Scrolls V: Skyrim Special Edition’ Joins GeForce NOW

“Hey, you. You’re finally awake.”

It’s the summer of Elder Scrolls — whether a seasoned Dragonborn or a new adventurer, dive into the legendary world of Tamriel this GFN Thursday as The Elder Scrolls V: Skyrim Special Edition joins the cloud.

Epic adventures await, along with nine new games joining the GeForce NOW library this week.

Plus make sure to catch the GeForce NOW Summer Sale for 50% off new Ultimate and Priority memberships.

Unleash the Dragonborn

Skyrim on GeForce NOW
Taking an arrow to the knee won’t stop gamers from questing in the cloud.

Experience the legendary adventures, breathtaking landscapes and immersive storytelling of the iconic role-playing game The Elder Scrolls V: Skyrim Special Edition from Bethesda Game Studios — now accessible on any device from the cloud. Become the Dragonborn and defeat Alduin the World-Eater, a dragon prophesied to destroy the world. 

Explore a vast landscape, complete quests and improve skills to develop characters in the open world of Skyrim. The Special Edition includes add-ons with all-new features, including remastered art and effects. It also brings the adventure of Bethesda Game Studios creations, including new quests, environments, characters, dialogue, armor and weapons.

Get ready to embark on unforgettable quests, battle fearsome foes and uncover the rich lore of the Elder Scrolls universe, all with the power and convenience of GeForce NOW. “Fus Ro Dah” with an Ultimate membership to stream at up to 4K resolution and 120 frames per second with up to eight-hour gaming sessions for the ultimate immersive experience throughout the realms of Tamriel.

All Hands on Deck

World of Warships members rewards on GeForce NOW
Get those sea legs ready for a reward.

Wargaming is bringing back an in-game event exclusively for GeForce NOW members this week.

Through Tuesday, July 30, members who complete the quest while streaming World of Warships can earn up to five GeForce NOW one-day Priority codes — one for each day of the challenge. Aspiring admirals can learn more on the World of Warships blog and social channels.

Shiny and New

Conscript on GeForce NOW
Rendezvous with death.

Take on classic survival horror in CONSCRIPT from Jordan Mochi and Team17. Inspired by legendary games in the genre, the game is set in 1916 during the Great War. CONSCRIPT blends all the punishing mechanics of older horror games into a cohesive, tense and unique experience. Play as a French soldier searching for his missing-in-action brother during the Battle of Verdun. Search through twisted trenches, navigate overrun forts and cross no-man’s-land to find him.

Here’s the full list of new games this week:

  • Cataclismo (New release on Steam, July 22
  • CONSCRIPT (New release on Steam, July 23)
  • F1 Manager 2024 (New release on Steam, July 23)
  • EARTH DEFENSE FORCE 6 (New release on Steam, July 25)
  • The Elder Scrolls V: Skyrim (Steam)
  • The Elder Scrolls V: Skyrim Special Edition (Steam, Epic Games Store and Xbox, available on PC Game Pass)
  • Gang Beasts (Steam and Xbox, available on PC Game Pass)
  • Kingdoms and Castles (Steam)
  • The Settlers: New Allies (Steam)

What are you planning to play this weekend? Let us know on X or in the comments below.

 

Read More

Demystifying AI-Assisted Artistry With Adobe Apps Using NVIDIA RTX

Demystifying AI-Assisted Artistry With Adobe Apps Using NVIDIA RTX

Editor’s note: This post is part of the AI Decoded series, which demystifies AI by making the technology more accessible, and showcases new hardware, software, tools and accelerations for RTX PC users.

Adobe Creative Cloud applications, which tap NVIDIA RTX GPUs, are designed to enhance the creativity of users, empowering them to work faster and focus on their craft.

These tools seamlessly integrate into existing creator workflows, enabling greater productivity and delivering power and precision.

Look to the Light

Generative AI creates new data in forms such as images or text by learning from existing data. It effectively visualizes and generates content to match what a user describes and helps open up fresh avenues for creativity.

Adobe Firefly is Adobe’s family of creative generative AI models that offer new ways to ideate and create while assisting creative workflows using generative AI. They’re designed to be safe for commercial use and were trained, using NVIDIA GPUs, on licensed content, like Adobe Stock Images, and public domain content where copyright has expired.

Firefly features are integrated in Adobe’s most popular creative apps.

Adobe Photoshop features the Generative Fill tool, which uses simple description prompts to easily add content from images. With the latest Reference Image feature currently in beta, users can also upload a sample image to get image results closer to their desired output.

Use Generative Fill to add content and Reference Image to refine it.

Generative Expand allows artists to extend the border of their image with the Crop tool, filling in bigger canvases with new content that automatically blends in with the existing image.

Bigger canvas? Not a problem.

RTX-accelerated Neural Filters, such as Photo Restoration, enable complex adjustments such as colorizing black-and-white photos and performing style transfers using AI. The Smart Portrait filter, which allows non-destructive editing with filters, is based on work from NVIDIA Research.

The brand-new Generative Shape Fill (beta) in Adobe Illustrator, powered by the latest Adobe Firefly Vector Model, allows users to accelerate design workflows by quickly filling shapes with detail and color in their own styles. With Generative Shape Fill, designers can easily match the style and color of their own artwork to create a wide variety of editable and scalable vector graphic options.

Generative AI.

Adobe Illustrator’s Generative Recolor feature lets creators type in a text prompt to explore custom color palettes and themes for their vector artwork in seconds.

Color us impressed.

NVIDIA will continue working with Adobe to support advanced generative AI models, with a focus on deep integration into the apps the world’s leading creators use.

Making Moves on Video

Adobe Premiere Pro is one of the most popular and powerful video editing solutions.

Its Enhance Speech tool, accelerated by RTX, uses AI to remove unwanted noise and improve the quality of dialogue clips so they sound professionally recorded. It’s up to 4.5x faster on RTX PCs.

Adobe Premiere Pro’s AI-powered Enhance Speech tool removes unwanted noise and improves dialogue quality.

Auto Reframe, another Adobe Premiere feature, uses GPU acceleration to identify and track the most relevant elements in a video, and intelligently reframes video content for different aspect ratios. Scene Edit Detection automatically finds the original edit points in a video, a necessary step before the video editing stage begins.

Visual Effects

Separating a foreground object from a background is a crucial step in many visual effects and compositing workflows.

Adobe After Effects has a new feature that uses a matte to isolate an object, enabling capabilities including background replacement and the selective application of effects to the foreground.

Using the Roto Brush tool, artists can draw strokes on representative areas of the foreground and background elements. After Effects uses that information to create a segmentation boundary between the foreground and background elements, delivering cleaner cutouts with fewer clicks.

Creating 3D Product Shots

The Substance 3D Collection is Adobe’s solution for 3D material authoring, texturing and rendering, enabling users to rapidly create stunningly photorealistic 3D content, including models, materials and lighting.

Visualizing products and designs in the context of a space is compelling, but it can be time-consuming to find the right environment for the objects to live in. Substance 3D Stager’s Generative Background feature, powered by Adobe Firefly, solves this issue by letting artists quickly explore generated backgrounds to composite 3D models.

Once an environment is selected, Stager can automatically match the perspective and lighting to the generated background.

Material Authoring With AI

Adobe Substance 3D Sampler, also part of the Substance 3D Collection, is designed to transform images of surfaces and objects into photorealistic physically based rendering (PBR) materials, 3D models and high-dynamic range environment lights. With the recent introduction of new generative workflows powered by Adobe Firefly, Sampler is making it easier than ever for artists to explore variations when creating materials for everything from product visualization projects to the latest AAA games.

Sampler’s Text-to-Texture feature allows users to generate tiled images from detailed text prompts. These generated images can then be edited and transformed into photorealistic PBR materials using the machine learning-powered Image-to-Material feature or any Sampler filter.

Image-to-Texture similarly enables the creation of tiled textures from reference images, providing an alternate way to prompt and generate variations from existing visual content.

Adobe 3D Sampler’s Image-to-Texture feature.

Sampler’s Text-to-Pattern feature uses text prompts to generate tiling patterns, which can be used as base colors or inputs for various filters, such as the Cloth Weave filter for creating original fabric materials.

All of these generative AI features in the Substance 3D Collection, supercharged with RTX GPUs, are designed to help 3D creators ideate and create faster.

Photo-tastic Features

Adobe Lightroom’s AI-powered Raw Details feature produces crisp detail and more accurate renditions of edges, improves color rendering and reduces artifacts, enhancing the image without changing its original resolution. This feature is handy for large displays and prints, where fine details are visible.

Enhance, enhance, enhance.

Super Resolution helps create an enhanced image with similar results as Raw Details but with 2x the linear resolution. This means that the enhanced image will have 2x the width and height of the original image — or 4x the total pixel count. This is especially useful for increasing the resolution of cropped imagery.

For faster editing, AI-powered, RTX-accelerated masking tools like Select Subject, which isolates people from an image, and Select Sky, which captures skies, enable users to create complex masks with the click of a button.

Visit Adobe’s AI features page for a complete list of AI features using RTX.

Looking for more AI-powered content creation apps? Consider NVIDIA Broadcast, which transforms any room into a home studio, free for RTX GPU owners. 

Generative AI is transforming gaming, videoconferencing and interactive experiences of all kinds. Make sense of what’s new and what’s next by subscribing to the AI Decoded newsletter.

Read More

How Georgia Tech’s AI Makerspace Is Preparing the Future Workforce for AI

How Georgia Tech’s AI Makerspace Is Preparing the Future Workforce for AI

AI is set to transform the workforce — and the Georgia Institute of Technology’s new AI Makerspace is helping tens of thousands of students get ahead of the curve. In this episode of NVIDIA’s AI Podcast, host Noah Kravitz speaks with Arijit Raychowdhury, a professor and Steve W. Cedex school chair of electrical engineering at Georgia Tech’s college of engineering, about the supercomputer hub, which provides students with the computing resources to reinforce their coursework and gain hands-on experience with AI. Built in collaboration with NVIDIA, the AI Makerspace underscores Georgia Tech’s commitment to preparing students for an AI-driven future, while fostering collaboration with local schools and universities.

Time Stamps

1:45: What is the AI Makerspace?

5:57: What computing resources are included in the AI Makerspace?

7:23: What is the aim of the AI Makerspace?

14:47: Georgia Tech’s AI-focused minor and coursework

19:25: Raychowdhury’s insight on the intersection of AI and higher education

23:33: How have industries and jobs already changed as a result of AI?

27:44: What can younger students do to prepare to get a spot in Georgia Tech’s engineering program?

You Might Also Like…

How Two Students Are Building Robots for Handling Houeshold Chores – Ep. 224
Imagine having a robot that could help you clean up after a party — or fold heaps of laundry. Chengshu Eric Li and Josiah David Wong, two Stanford University Ph.D. students advised by renowned computer science professor Fei-Fei Li, are making that a ‌dream come true with BEHAVIOR-1K, a project that aims to enable robots to perform 1,000 household chores, including picking up fallen objects or cooking.

Making Machines Mindful: NYU Professor Talks Responsible AI – Ep. 205

Artificial intelligence is now a household term. Responsible AI is hot on its heels. Julia Stoyanovich, associate professor of computer science and engineering at NYU and director of the university’s Center for Responsible AI, wants to make the terms “AI” and “responsible AI” synonymous, sharing her advocacy efforts and how people can help.

Replit CEO Amjad Masad on Empowering the Next Billion Software Creators – Ep. 201

Replit aims to empower the next billion software creators. Replit CEO Amjad Masad aims to bridge the gap between ideas and software, a task simplified by advances in generative AI. The company’s suite of technologies help make software creation accessible to all, even those with no coding experience.

MIT’s Anant Agarwal on AI in Education – Ep. 197

Anant Agarwal, founder of edX and chief platform officer at 2U, shares his vision for the future of online education and the impact of AI in revolutionizing the learning experience, emphasizing the importance of accessibility and quality in education.

Subscribe to the AI Podcast

Get the AI Podcast through iTunes, Google Play, Amazon Music, Castbox, DoggCatcher, Overcast, PlayerFM, Pocket Casts, Podbay, PodBean, PodCruncher, PodKicker, Soundcloud, Spotify, Stitcher and TuneIn.

Make the AI Podcast better: Have a few minutes to spare? Fill out this listener survey.

Read More

How NVIDIA AI Foundry Lets Enterprises Forge Custom Generative AI Models

How NVIDIA AI Foundry Lets Enterprises Forge Custom Generative AI Models

Businesses seeking to harness the power of AI need customized models tailored to their specific industry needs.

NVIDIA AI Foundry is a service that enables enterprises to use data, accelerated computing and software tools to create and deploy custom models that can supercharge their generative AI initiatives.

Just as TSMC manufactures chips designed by other companies, NVIDIA AI Foundry provides the infrastructure and tools for other companies to develop and customize AI models — using DGX Cloud, foundation models, NVIDIA NeMo software, NVIDIA expertise, as well as ecosystem tools and support.

The key difference is the product: TSMC produces physical semiconductor chips, while NVIDIA AI Foundry helps create custom models. Both enable innovation and connect to a vast ecosystem of tools and partners.

Enterprises can use AI Foundry to customize NVIDIA and open community models, including the new Llama 3.1 collection, as well as NVIDIA Nemotron, CodeGemma by Google DeepMind, CodeLlama, Gemma by Google DeepMind, Mistral, Mixtral, Phi-3, StarCoder2 and others.

Industry Pioneers Drive AI Innovation

Industry leaders Amdocs, Capital One, Getty Images, KT, Hyundai Motor Company, SAP, ServiceNow and Snowflake are among the first using NVIDIA AI Foundry. These pioneers are setting the stage for a new era of AI-driven innovation in enterprise software, technology, communications and media.

“Organizations deploying AI can gain a competitive edge with custom models that incorporate industry and business knowledge,” said Jeremy Barnes, vice president of AI Product at ServiceNow. “ServiceNow is using NVIDIA AI Foundry to fine-tune and deploy models that can integrate easily within customers’ existing workflows.”

The Pillars of NVIDIA AI Foundry 

NVIDIA AI Foundry is supported by the key pillars of foundation models, enterprise software, accelerated computing, expert support and a broad partner ecosystem.

Its software includes AI foundation models from NVIDIA and the AI community as well as the complete NVIDIA NeMo software platform for fast-tracking model development.

The computing muscle of NVIDIA AI Foundry is NVIDIA DGX Cloud, a network of accelerated compute resources co-engineered with the world’s leading public clouds — Amazon Web Services, Google Cloud and Oracle Cloud Infrastructure. With DGX Cloud, AI Foundry customers can develop and fine-tune custom generative AI applications with unprecedented ease and efficiency, and scale their AI initiatives as needed without significant upfront investments in hardware. This flexibility is crucial for businesses looking to stay agile in a rapidly changing market.

If an NVIDIA AI Foundry customer needs assistance, NVIDIA AI Enterprise experts are on hand to help. NVIDIA experts can walk customers through each of the steps required to build, fine-tune and deploy their models with proprietary data, ensuring the models tightly align with their business requirements.

NVIDIA AI Foundry customers have access to a global ecosystem of partners that can provide a full range of support. Accenture, Deloitte, Infosys and Wipro are among the NVIDIA partners that offer AI Foundry consulting services that encompass design, implementation and management of AI-driven digital transformation projects. Accenture is first to offer its own AI Foundry-based offering for custom model development, the Accenture AI Refinery framework.

Additionally, service delivery partners such as Data Monsters, Quantiphi, Slalom and SoftServe help enterprises navigate the complexities of integrating AI into their existing IT landscapes, ensuring that AI applications are scalable, secure and aligned with business objectives.

Customers can develop NVIDIA AI Foundry models for production using AIOps and MLOps platforms from NVIDIA partners, including Cleanlab, DataDog, Dataiku, Dataloop, DataRobot, Domino Data Lab, Fiddler AI, New Relic, Scale and Weights & Biases.

Customers can output their AI Foundry models as NVIDIA NIM inference microservices — which include the custom model, optimized engines and a standard API — to run on their preferred accelerated infrastructure.

Inferencing solutions like NVIDIA TensorRT-LLM deliver improved efficiency for Llama 3.1 models to minimize latency and maximize throughput. This enables enterprises to generate tokens faster while reducing total cost of running the models in production. Enterprise-grade support and security is provided by the NVIDIA AI Enterprise software suite.

NVIDIA NIM and TensorRT-LLM minimize inference latency and maximize throughput for Llama 3.1 models to generate tokens faster.

The broad range of deployment options includes NVIDIA-Certified Systems from global server manufacturing partners including Cisco, Dell Technologies, Hewlett Packard Enterprise, Lenovo and Supermicro, as well as cloud instances from Amazon Web Services, Google Cloud and Oracle Cloud Infrastructure.

Additionally, Together AI, a leading AI acceleration cloud, today announced it will enable its ecosystem of over 100,000 developers and enterprises to use its NVIDIA GPU-accelerated inference stack to deploy Llama 3.1 endpoints and other open models on DGX Cloud.

“Every enterprise running generative AI applications wants a faster user experience, with greater efficiency and lower cost,” said Vipul Ved Prakash, founder and CEO of Together AI. “Now, developers and enterprises using the Together Inference Engine can maximize performance, scalability and security on NVIDIA DGX Cloud.”

NVIDIA NeMo Speeds and Simplifies Custom Model Development

With NVIDIA NeMo integrated into AI Foundry, developers have at their fingertips the tools needed to curate data, customize foundation models and evaluate performance. NeMo technologies include:

  • NeMo Curator is a GPU-accelerated data-curation library that improves generative AI model performance by preparing large-scale, high-quality datasets for pretraining and fine-tuning.
  • NeMo Customizer is a high-performance, scalable microservice that simplifies fine-tuning and alignment of LLMs for domain-specific use cases.
  • NeMo Evaluator provides automatic assessment of generative AI models across academic and custom benchmarks on any accelerated cloud or data center.
  • NeMo Guardrails orchestrates dialog management, supporting accuracy, appropriateness and security in smart applications with large language models to provide safeguards for generative AI applications.

Using the NeMo platform in NVIDIA AI Foundry, businesses can create custom AI models that are precisely tailored to their needs. This customization allows for better alignment with strategic objectives, improved accuracy in decision-making and enhanced operational efficiency. For instance, companies can develop models that understand industry-specific jargon, comply with regulatory requirements and integrate seamlessly with existing workflows.

“As a next step of our partnership, SAP plans to use NVIDIA’s NeMo platform to help businesses to accelerate AI-driven productivity powered by SAP Business AI,” said Philipp Herzig, chief AI officer at SAP.

Enterprises can deploy their custom AI models in production with NVIDIA NeMo Retriever NIM inference microservices. These help developers fetch proprietary data to generate knowledgeable responses for their AI applications with retrieval-augmented generation (RAG).

“Safe, trustworthy AI is a non-negotiable for enterprises harnessing generative AI, with retrieval accuracy directly impacting the relevance and quality of generated responses in RAG systems,” said Baris Gultekin, Head of AI, Snowflake. “Snowflake Cortex AI leverages NeMo Retriever, a component of NVIDIA AI Foundry, to further provide enterprises with easy, efficient, and trusted answers using their custom data.”

Custom Models Drive Competitive Advantage

One of the key advantages of NVIDIA AI Foundry is its ability to address the unique challenges faced by enterprises in adopting AI. Generic AI models can fall short of meeting specific business needs and data security requirements. Custom AI models, on the other hand, offer superior flexibility, adaptability and performance, making them ideal for enterprises seeking to gain a competitive edge.

Learn more about how NVIDIA AI Foundry allows enterprises to boost productivity and innovation.

Read More

AI, Go Fetch! New NVIDIA NeMo Retriever Microservices Boost LLM Accuracy and Throughput

AI, Go Fetch! New NVIDIA NeMo Retriever Microservices Boost LLM Accuracy and Throughput

Generative AI applications have little, or sometimes negative, value without accuracy — and accuracy is rooted in data.

To help developers efficiently fetch the best proprietary data to generate knowledgeable responses for their AI applications, NVIDIA today announced four new NVIDIA NeMo Retriever NIM inference microservices.

Combined with NVIDIA NIM inference microservices for the Llama 3.1 model collection, also announced today, NeMo Retriever NIM microservices enable enterprises to scale to agentic AI workflows — where AI applications operate accurately with minimal intervention or supervision — while delivering the highest accuracy retrieval-augmented generation, or RAG.

NeMo Retriever allows organizations to seamlessly connect custom models to diverse business data and deliver highly accurate responses for AI applications using RAG. In essence, the production-ready microservices enable highly accurate information retrieval for building highly accurate AI applications.

For example, NeMo Retriever can boost model accuracy and throughput for developers creating AI agents and customer service chatbots, analyzing security vulnerabilities or extracting insights from complex supply chain information.

NIM inference microservices enable high-performance, easy-to-use, enterprise-grade inferencing. And with NeMo Retriever NIM microservices, developers can benefit from all of this — superpowered by their data.

These new NeMo Retriever embedding and reranking NIM microservices are now generally available:

  • NV-EmbedQA-E5-v5, a popular community base embedding model optimized for text question-answering retrieval
  • NV-EmbedQA-Mistral7B-v2, a popular multilingual community base model fine-tuned for text embedding for high-accuracy question answering
  • Snowflake-Arctic-Embed-L, an optimized community model, and
  • NV-RerankQA-Mistral4B-v3, a popular community base model fine-tuned for text reranking for high-accuracy question answering.

They join the collection of NIM microservices easily accessible through the NVIDIA API catalog.

Embedding and Reranking Models

NeMo Retriever NIM microservices comprise two model types — embedding and reranking — with open and commercial offerings that ensure transparency and reliability.

A diagram showing a user prompt inquiring about a bill, retrieving the most accurate response.
Example RAG pipeline using NVIDIA NIM microservices for Llama 3.1 and NeMo Retriever embedding and reranking NIM microservices for a customer service AI chatbot application.

An embedding model transforms diverse data — such as text, images, charts and video — into numerical vectors, stored in a vector database, while capturing their meaning and nuance. Embedding models are fast and computationally less expensive than traditional large language models, or LLMs.

A reranking model ingests data and a query, then scores the data according to its relevance to the query. Such models offer significant accuracy improvements while being computationally complex and slower than embedding models.

NeMo Retriever provides the best of both worlds. By casting a wide net of data to be retrieved with an embedding NIM, then using a reranking NIM to trim the results for relevancy, developers tapping NeMo Retriever can build a pipeline that ensures the most helpful, accurate results for their enterprise.

With NeMo Retriever, developers get access to state-of-the-art open, commercial models for building text Q&A retrieval pipelines that provide the highest accuracy. When compared with alternate models, NeMo Retriever NIM microservices provided 30% fewer inaccurate answers for enterprise question answering.

Bar chart showing lexical search (45%), alternative embedder (63%), compared with NeMo Retriever embedding NIM (73%) and NeMo Retriever embedding + reranking NIM microservices (75%).
Comparison of NeMo Retriever embedding NIM and embedding plus reranking NIM microservices performance versus lexical search and an alternative embedder.

Top Use Cases

From RAG and AI agent solutions to data-driven analytics and more, NeMo Retriever powers a wide range of AI applications.

The microservices can be used to build intelligent chatbots that provide accurate, context-aware responses. They can help analyze vast amounts of data to identify security vulnerabilities. They can assist in extracting insights from complex supply chain information. And they can boost AI-enabled retail shopping advisors that offer natural, personalized shopping experiences, among other tasks.

NVIDIA AI workflows for these use cases provide an easy, supported starting point for developing generative AI-powered technologies.

Dozens of NVIDIA data platform partners are working with NeMo Retriever NIM microservices to boost their AI models’ accuracy and throughput.

DataStax has integrated NeMo Retriever embedding NIM microservices in its Astra DB and Hyper-Converged platforms, enabling the company to bring accurate, generative AI-enhanced RAG capabilities to customers with faster time to market.

Cohesity will integrate NVIDIA NeMo Retriever microservices with its AI product, Cohesity Gaia, to help customers put their data to work to power insightful, transformative generative AI applications through RAG.

Kinetica will use NVIDIA NeMo Retriever to develop LLM agents that can interact with complex networks in natural language to respond more quickly to outages or breaches — turning insights into immediate action.

NetApp is collaborating with NVIDIA to connect NeMo Retriever microservices to exabytes of data on its intelligent data infrastructure. Every NetApp ONTAP customer will be able to seamlessly “talk to their data” to access proprietary business insights without having to compromise the security or privacy of their data.

NVIDIA global system integrator partners including Accenture, Deloitte, Infosys, LTTS, Tata Consultancy Services, Tech Mahindra and Wipro, as well as service delivery partners Data Monsters, EXLService (Ireland) Limited, Latentview, Quantiphi, Slalom, SoftServe and Tredence, are developing services to help enterprises add NeMo Retriever NIM microservices into their AI pipelines.

Use With Other NIM Microservices

NeMo Retriever NIM microservices can be used with NVIDIA Riva NIM microservices, which  supercharge speech AI applications across industries — enhancing customer service and enlivening digital humans.

New models that will soon be available as Riva NIM microservices include: FastPitch and HiFi-GAN for text-to-speech applications; Megatron for multilingual neural machine translation; and the record-breaking NVIDIA Parakeet family of models for automatic speech recognition.

NVIDIA NIM microservices can be used all together or separately, offering developers a modular approach to building AI applications. In addition, the microservices can be integrated with community models, NVIDIA models or users’ custom models — in the cloud, on premises or in hybrid environments — providing developers with further flexibility.

NVIDIA NIM microservices are available at ai.nvidia.com. Enterprises can deploy AI applications in production with NIM through the NVIDIA AI Enterprise software platform.

NIM microservices can run on customers’ preferred accelerated infrastructure, including cloud instances from Amazon Web Services, Google Cloud, Microsoft Azure and Oracle Cloud Infrastructure, as well as NVIDIA-Certified Systems from global server manufacturing partners including Cisco, Dell Technologies, Hewlett Packard Enterprise, Lenovo and Supermicro.

NVIDIA Developer Program members will soon be able to access NIM for free for research, development and testing on their preferred infrastructure.

Learn more about the latest in generative AI and accelerated computing by joining NVIDIA at SIGGRAPH, the premier computer graphics conference, running July 28-Aug. 1 in Denver. 

See notice regarding software product information.

Read More

NVIDIA’s AI Masters Sweep KDD Cup 2024 Data Science Competition

NVIDIA’s AI Masters Sweep KDD Cup 2024 Data Science Competition

Team NVIDIA has triumphed at the Amazon KDD Cup 2024, securing first place Friday across all five competition tracks.

The team — consisting of NVIDIANs Ahmet Erdem, Benedikt Schifferer, Chris Deotte, Gilberto Titericz, Ivan Sorokin and Simon Jegou — demonstrated its prowess in generative AI, winning in categories that included text generation, multiple-choice questions, name entity recognition, ranking, and retrieval.

The competition, themed “Multi-Task Online Shopping Challenge for LLMs,” asked participants to solve various challenges using limited datasets.

“The new trend in LLM competitions is that they don’t give you training data,” said Deotte, a senior data scientist at NVIDIA. “They give you 96 example questions — not enough to train a model — so we came up with 500,000 questions on our own.”

Deotte explained that the NVIDIA team generated a variety of questions by writing some themselves, using a large language model to create others, and transforming existing e-commerce datasets.

“Once we had our questions, it was straightforward to use existing frameworks to fine-tune a language model,” he said.

The competition organizers hid the test questions to ensure participants couldn’t exploit previously known answers. This approach encourages models that generalize well to any question about e-commerce, proving the model’s capability to handle real-world scenarios effectively.

Despite these constraints, Team NVIDIA’s innovative approach outperformed all competitors by using Qwen2-72B, a just-released LLM with 72 billion parameters, fine-tuned on eight NVIDIA A100 Tensor Core GPUs, and employing QLoRA, a technique for fine-tuning models with datasets.

About the KDD Cup 2024

The KDD Cup, organized by the Association for Computing Machinery’s Special Interest Group on Knowledge Discovery and Data Mining, or ACM SIGKDD, is a prestigious annual competition that promotes research and development in the field.

This year’s challenge, hosted by Amazon, focused on mimicking the complexities of online shopping with the goal of making it a more intuitive and satisfying experience using large language models. Organizers utilized the test dataset ShopBench — a benchmark that replicates the massive challenge for online shopping with 57 tasks and about 20,000 questions derived from real-world Amazon shopping data — to evaluate participants’ models.

The ShopBench benchmark focused on four key shopping skills, along with a fifth “all-in-one” challenge:

  1. Shopping Concept Understanding: Decoding complex shopping concepts and terminologies.
  2. Shopping Knowledge Reasoning: Making informed decisions with shopping knowledge.
  3. User Behavior Alignment: Understanding dynamic customer behavior.
  4. Multilingual Abilities: Shopping across languages.
  5. All-Around: Solving all tasks from the previous tracks in a unified solution.

NVIDIA’s Winning Solution

NVIDIA’s winning solution involved creating a single model for each track.

The team fine-tuned the just-released Qwen2-72B model using eight NVIDIA A100 Tensor Core GPUs for approximately 24 hours. The GPUs provided fast and efficient processing, significantly reducing the time required for fine-tuning.

First, the team generated training datasets based on the provided examples and synthesized additional data using Llama 3 70B hosted on build.nvidia.com.

Next, they employed QLoRA (Quantized Low-Rank Adaptation), a training process using the data created in step one. QLoRA modifies a smaller subset of the model’s weights, allowing efficient training and fine-tuning.

The model was then quantized — making it smaller and able to run on a system with a smaller hard drive and less memory — with AWQ 4-bit and used the vLLM inference library to predict the test datasets on four NVIDIA T4 Tensor Core GPUs within the time constraints.

This approach secured the top spot in each individual track and the overall first place in the competition—a clean sweep for NVIDIA for the second year in a row.

The team plans to submit a detailed paper on its solution next month and plans to present its findings at KDD 2024 in Barcelona.

Read More

Sustainable Strides: How AI and Accelerated Computing Are Driving Energy Efficiency

Sustainable Strides: How AI and Accelerated Computing Are Driving Energy Efficiency

AI and accelerated computing — twin engines NVIDIA continuously improves — are delivering energy efficiency for many industries.

It’s progress the wider community is starting to acknowledge.

“Even if the predictions that data centers will soon account for 4% of global energy consumption become a reality, AI is having a major impact on reducing the remaining 96% of energy consumption,” said a report from Lisbon Council Research, a nonprofit formed in 2003 that studies economic and social issues.

The article from the Brussels-based research group is among a handful of big-picture AI policy studies starting to emerge. It uses Italy’s Leonardo supercomputer, accelerated with nearly 14,000 NVIDIA GPUs, as an example of a system advancing work in fields from automobile design and drug discovery to weather forecasting.

Energy-efficiency gains over time for the most efficient supercomputer on the TOP500 list. Source: TOP500.org

Why Accelerated Computing Is Sustainable Computing

Accelerated computing uses the parallel processing of NVIDIA GPUs to do more work in less time. As a result, it consumes less energy than general-purpose servers that employ CPUs built to handle one task at a time.

That’s why accelerated computing is sustainable computing.

Accelerated systems use parallel processing on GPUs to do more work in less time, consuming less energy than CPUs.

The gains are even greater when accelerated systems apply AI, an inherently parallel form of computing that’s the most transformative technology of our time.

“When it comes to frontier applications like machine learning or deep learning, the performance of GPUs is an order of magnitude better than that of CPUs,” the report said.

NVIDIA offers a combination of GPUs, CPUs, and DPUs tailored to maximize energy efficiency with accelerated computing.

User Experiences With Accelerated AI

Users worldwide are documenting energy-efficiency gains with AI and accelerated computing.

In financial services, Murex — a Paris-based company with a trading and risk-management platform used daily by more than 60,000 people — tested the NVIDIA Grace Hopper Superchip. On its workloads, the CPU-GPU combo delivered a 4x reduction in energy consumption and a 7x reduction in time to completion compared with CPU-only systems (see chart below).

“On risk calculations, Grace is not only the fastest processor, but also far more power-efficient, making green IT a reality in the trading world,” said Pierre Spatz, head of quantitative research at Murex.

In manufacturing, Taiwan-based Wistron built a digital copy of a room where NVIDIA DGX systems undergo thermal stress tests to improve operations at the site. It used NVIDIA Omniverse, a platform for industrial digitization, with a surrogate model, a version of AI that emulates simulations.

The digital twin, linked to thousands of networked sensors, enabled Wistron to increase the facility’s overall energy efficiency by up to 10%. That amounts to reducing electricity consumption by 120,000 kWh per year and carbon emissions by a whopping 60,000 kilograms.

Up to 80% Fewer Carbon Emissions

The RAPIDS Accelerator for Apache Spark can reduce the carbon footprint for data analytics, a widely used form of machine learning, by as much as 80% while delivering 5x average speedups and 4x reductions in computing costs, according to a recent benchmark.

Thousands of companies — about 80% of the Fortune 500 — use Apache Spark to analyze their growing mountains of data. Companies using NVIDIA’s Spark accelerator include Adobe, AT&T and the U.S. Internal Revenue Service.

In healthcare, Insilico Medicine discovered and put into phase 2 clinical trials a drug candidate for a relatively rare respiratory disease, thanks to its NVIDIA-powered AI platform.

Using traditional methods, the work would have cost more than $400 million and taken up to six years. But with generative AI, Insilico hit the milestone for one-tenth of the cost in one-third of the time.

“This is a significant milestone not only for us, but for everyone in the field of AI-accelerated drug discovery,” said Alex Zhavoronkov, CEO of Insilico Medicine.

This is just a sampler of results that users of accelerated computing and AI are pursuing at companies such as Amgen, BMW, Foxconn, PayPal and many more.

Speeding Science With Accelerated AI 

In basic research, the National Energy Research Scientific Computing Center (NERSC), the U.S. Department of Energy’s lead facility for open science, measured results on a server with four NVIDIA A100 Tensor Core GPUs compared with dual-socket x86 CPU servers across four of its key high-performance computing and AI applications.

Researchers found that the apps, when accelerated with the NVIDIA A100 GPUs, saw energy efficiency rise 5x on average (see below). One application, for weather forecasting, logged gains of nearly 10x.

Scientists and researchers worldwide depend on AI and accelerated computing to achieve high performance and efficiency.

In a recent ranking of the world’s most energy-efficient supercomputers, known as the Green500, NVIDIA-powered systems swept the top six spots, and 40 of the top 50.

Underestimated Energy Savings

The many gains across industries and science are sometimes overlooked in forecasts that extrapolate only the energy consumption of training the largest AI models. That misses the benefits from most of an AI model’s life when it’s consuming relatively little energy, delivering the kinds of efficiencies users described above.

In an analysis citing dozens of sources, a recent study debunked as misleading and inflated projections based on training models.

“Just as the early predictions about the energy footprints of e-commerce and video streaming ultimately proved to be exaggerated, so too will those estimates about AI likely be wrong,” said the report from the Information Technology and Innovation Foundation (ITIF), a Washington-based think tank.

The report notes as much as 90% of the cost — and all the efficiency gains — of running an AI model are in deploying it in applications after it’s trained.

“Given the enormous opportunities to use AI to benefit the economy and society — including transitioning to a low-carbon future — it is imperative that policymakers and the media do a better job of vetting the claims they entertain about AI’s environmental impact,” said the report’s author, who described his findings in a recent podcast.

Others Cite AI’s Energy Benefits

Policy analysts from the R Street Institute, also in Washington, D.C., agreed.

“Rather than a pause, policymakers need to help realize the potential for gains from AI,” the group wrote in a 1,200-word article.

“Accelerated computing and the rise of AI hold great promise for the future, with significant societal benefits in terms of economic growth and social welfare,” it said, citing demonstrated benefits of AI in drug discovery, banking, stock trading and insurance.

AI can make the electric grid, manufacturing and transportation sectors more efficient, it added.

AI Supports Sustainability Efforts

The reports also cited the potential of accelerated AI to fight climate change and promote sustainability.

“AI can enhance the accuracy of weather modeling to improve public safety as well as generate more accurate predictions of crop yields. The power of AI can also contribute to … developing more precise climate models,” R Street said.

The Lisbon report added that AI plays “a crucial role in the innovation needed to address climate change” for work such as discovering more efficient battery materials.

How AI Can Help the Environment

ITIF called on governments to adopt AI as a tool in efforts to decarbonize their operations.

Public and private organizations are already applying NVIDIA AI to protect coral reefs, improve tracking of wildfires and extreme weather, and enhance sustainable agriculture.

For its part, NVIDIA is working with hundreds of startups employing AI to address climate issues. NVIDIA also announced plans for Earth-2, expected to be the world’s most powerful AI supercomputer dedicated to climate science.

Enhancing Energy Efficiency Across the Stack

Since its founding in 1993, NVIDIA has worked on energy efficiency across all its products — GPUs, CPUs, DPUs, networks, systems and software, as well as platforms such as Omniverse.

In AI, the brunt of an AI model’s life is in inference, delivering insights that help users achieve new efficiencies. The NVIDIA GB200 Grace Blackwell Superchip has demonstrated 25x energy efficiency over the prior NVIDIA Hopper GPU generation in AI inference.

Over the last eight years, NVIDIA GPUs have advanced a whopping 45,000x in their energy efficiency running large language models (see chart below).

Recent innovations in software include TensorRT-LLM. It can help GPUs reduce 3x the energy consumption of LLM inference.

Here’s an eye-popping stat: If the efficiency of cars improved as much as NVIDIA has advanced the efficiency of AI on its accelerated computing platform, cars would get 280,000 miles per gallon. That means you could drive to the moon on less than a gallon of gas.

The analysis applies to the fuel efficiency of cars NVIDIA’s whopping 10,000x efficiency gain in AI training and inference from 2016 to 2025 (see chart below).

How the big AI efficiency leap from the NVIDIA P100 GPU to the NVIDIA Grace Blackwell compares to car fuel-efficiency gains.

Driving Data Center Efficiency

NVIDIA delivers many optimizations through system-level innovations. For example, NVIDIA BlueField-3 DPUs can reduce power consumption up to 30% by offloading essential data center networking and infrastructure functions from less efficient CPUs.

Last year, NVIDIA received a $5 million grant from the U.S. Department of Energy — the largest of 15 grants from a pool of more than 100 applications — to design a new liquid-cooling technology for data centers. It will run 20% more efficiently than today’s air-cooled approaches and has a smaller carbon footprint.

These are just some of the ways NVIDIA contributes to the energy efficiency of data centers.

Data centers are among the most efficient users of energy and one of the largest consumers of renewable energy.

The ITIF report notes that between 2010 and 2018, global data centers experienced a 550% increase in compute instances and a 2,400% increase in storage capacity, but only a 6% increase in energy use, thanks to improvements across hardware and software.

NVIDIA continues to drive energy efficiency for accelerated AI, helping users in science, government and industry accelerate their journeys toward sustainable computing.

Try NVIDIA’s energy-efficiency calculator to find ways to improve energy efficiency. And check out NVIDIA’s sustainable computing site and corporate sustainability report for more information. 

Read More

Byte-Sized Courses: NVIDIA Offers Self-Paced Career Development in AI and Data Science

Byte-Sized Courses: NVIDIA Offers Self-Paced Career Development in AI and Data Science

AI has seen unprecedented growth — spurring the need for new training and education resources for students and industry professionals.

NVIDIA’s latest on-demand webinar, Essential Training and Tips to Accelerate Your Career in AI, featured a panel discussion with industry experts on fostering career growth and learning in AI and other advanced technologies.

Over 1,800 attendees gained insights on how to kick-start their careers and use NVIDIA’s technologies and resources to accelerate their professional development.

Opportunities in AI

AI’s impact is touching nearly every industry, presenting new career opportunities for professionals of all backgrounds.

Lauren Silveira, a university recruiting program manager at NVIDIA, challenged attendees to take their unique education and experience and apply it in the AI field.

“You don’t have to work directly in AI to impact the industry,” said Silveira. “I knew I wouldn’t be a doctor or engineer — that wasn’t in my career path — but I could create opportunities for those that wanted to pursue those dreams.”

Kevin McFall, a principal instructor for the NVIDIA Deep Learning Institute, offered some advice for those looking to navigate a career in AI and advanced technologies but finding themselves overwhelmed or unsure of where to start.

“Don’t try to do it all by yourself,” he said. “Don’t get focused on building everything from scratch — the best skill that you can have is being able to take pieces of code or inspiration from different resources and plug them together to make a whole.”

A main takeaway from the panelists was that students and industry professionals can significantly enhance their capabilities by leveraging tools and resources in addition to their networks.

Every individual can access a variety of free software development kits, community resources and specialized courses in areas like robotics, CUDA and OpenUSD through the NVIDIA Developer Program. Additionally, they can kick off projects with the CUDA code sample library and explore specialized guides such as “A Simple Guide to Deploying Generative AI With NVIDIA NIM”.

Spinning a Network

Staying up to date on the rapidly expanding technology industry involves more than just keeping up with the latest education and certifications.

Sabrina Koumoin, a senior software engineer at NVIDIA, spoke on the importance of networking. She believes people can find like-minded peers and mentors to gain inspiration from by sharing their personal learning journeys or projects on social platforms like LinkedIn.

A self-taught coder, Koumoin also advocates for active engagement and education accessibility. Outside of work, she hosted multiple coding bootcamps for people looking to break into tech.

“It’s a way to show that learning technical skills can be engaging, not intimidating,” she said.

David Ajoku, founder and CEO at Demystifyd and Aware.ai, also emphasized the importance of using LinkedIn to build connections, demonstrate key accomplishments and show passion.

He outlined a three-step strategy to enhance your LinkedIn presence, designed to help you stand out, gain deeper insights into your preferred companies and boldly share your aspirations and interests:

  1. Think about a company you’d like to work for and what draws you to it.
  2. Research thoroughly, focusing on its main activities, mission and goals.
  3. Be bold — create a series of posts informing your network about your career journey and what advancements interest you in the chosen company.

One attendee asked about how AI might evolve over the next decade and what skills professionals should focus on to stay relevant. Louis Stewart, head of strategic initiatives at NVIDIA, replied that crafting a personal narrative and growth journey is just as important as ensuring certifications and skills are up to date.

“Be intentional and purposeful — have an end in mind,” he said. “That’s how you connect with future potential companies and people — it’s a skill you have to develop to stay ahead.”

Deep Dive Into Learning

NVIDIA offers a variety of programs and resources to equip the next generation of AI professionals with the skills and training needed to excel in a career in AI.

NVIDIA’s AI Learning Essentials is designed to give individuals the knowledge, skills and certifications they need to be prepared for the workforce and the fast moving field of AI. It includes free access to self-paced introductory courses and webinars on topics such as generative AI, retrieval-augmented generation (RAG) and CUDA.

The NVIDIA Deep Learning Institute (DLI) provides a diverse range of resources, including learning materials, self-paced and live trainings, and educator programs spanning AI, accelerated computing and data science, graphics simulation and more. They also offer technical workshops for students currently enrolled in universities.

DLI provides comprehensive training for generative AI, RAG, NVIDIA NIM inference microservices and large language models. Offerings also include certifications for generative AI LLMs and generative AI multimodal that help learners showcase their expertise and stand out from the crowd.

Get started with AI Learning Essentials, the NVIDIA Deep Learning Institute and on-demand resources.

Read More

Magnetic Marvels: NVIDIA’s Supercomputers Spin a Quantum Tale

Magnetic Marvels: NVIDIA’s Supercomputers Spin a Quantum Tale

Research published earlier this month in the science journal Nature used NVIDIA-powered supercomputers to validate a pathway toward the commercialization of quantum computing.

The research, led by Nobel laureate Giorgio Parisi, focuses on quantum annealing, a method that may one day tackle complex optimization problems that are extraordinarily challenging to conventional computers.

To conduct their research, the team utilized 2 million GPU computing hours at the Leonardo facility (Cineca, in Bologna, Italy), nearly 160,000 GPU computing hours on the Meluxina-GPU cluster, in Luxembourg, and 10,000 GPU hours from the Spanish Supercomputing Network. Additionally, they accessed the Dariah cluster, in Lecce, Italy.

They used these state-of-the-art resources to simulate the behavior of a certain kind of quantum computing system known as a quantum annealer.

Quantum computers fundamentally rethink how information is computed to enable entirely new solutions.

Unlike classical computers, which process information in binary — 0s and 1s — quantum computers use quantum bits or qubits that can allow information to be processed in entirely new ways.

Quantum annealers are a special type of quantum computer that, though not universally useful, may have advantages for solving certain types of optimization problems.

The paper, “The Quantum Transition of the Two-Dimensional Ising Spin Glass,” represents a significant step in understanding the phase transition — a change in the properties of a quantum system — of Ising spin glass, a disordered magnetic material in a two-dimensional plane, a critical problem in computational physics.

The paper addresses the problem of how the properties of magnetic particles arranged in a two-dimensional plane can abruptly change their behavior.

The study also shows how GPU-powered systems play a key role in developing approaches to quantum computing.

GPU-accelerated simulations allow researchers to understand the complex systems’ behavior in developing quantum computers, illuminating the most promising paths forward.

Quantum annealers, like the systems developed by the pioneering quantum computing company D-Wave, operate by methodically decreasing a magnetic field that is applied to a set of magnetically susceptible particles.

When strong enough, the applied field will act to align the magnetic orientation of the particles — similar to how iron filings will uniformly stand to attention near a bar magnet.

If the strength of the field is varied slowly enough, the magnetic particles will arrange themselves to minimize the energy of the final arrangement.

Finding this stable, minimum-energy state is crucial in a particularly complex and disordered magnetic system known as a spin glass since quantum annealers can encode certain kinds of problems into the spin glass’s minimum-energy configuration.

Finding the stable arrangement of the spin glass then solves the problem.

Understanding these systems helps scientists develop better algorithms for solving difficult problems by mimicking how nature deals with complexity and disorder.

That’s crucial for advancing quantum annealing and its applications in solving extremely difficult computational problems that currently have no known efficient solution — problems that are pervasive in fields ranging from logistics to cryptography.

Unlike gate-model quantum computers, which operate by applying a sequence of quantum gates, quantum annealers allow a quantum system to evolve freely in time.

This is not a universal computer — a device capable of performing any computation given sufficient time and resources — but may have advantages for solving particular sets of optimization problems in application areas such as vehicle routing, portfolio optimization and protein folding.

Through extensive simulations performed on NVIDIA GPUs, the researchers learned how key parameters of the spin glasses making up quantum annealers change during their operation, allowing a better understanding of how to use these systems to achieve a quantum speedup on important problems.

Much of the work for this groundbreaking paper was first presented at NVIDIA’s GTC 2024 technology conference. Read the full paper and learn more about NVIDIA’s work in quantum computing.

Read More

Mistral AI and NVIDIA Unveil Mistral NeMo 12B, a Cutting-Edge Enterprise AI Model

Mistral AI and NVIDIA Unveil Mistral NeMo 12B, a Cutting-Edge Enterprise AI Model

Mistral AI and NVIDIA today released a new state-of-the-art language model, Mistral NeMo 12B, that developers can easily customize and deploy for enterprise applications supporting chatbots, multilingual tasks, coding and summarization.

By combining Mistral AI’s expertise in training data with NVIDIA’s optimized hardware and software ecosystem, the Mistral NeMo model offers high performance for diverse applications.

“We are fortunate to collaborate with the NVIDIA team, leveraging their top-tier hardware and software,” said Guillaume Lample, cofounder and chief scientist of Mistral AI. “Together, we have developed a model with unprecedented accuracy, flexibility, high-efficiency and enterprise-grade support and security thanks to NVIDIA AI Enterprise deployment.”

Mistral NeMo was trained on the NVIDIA DGX Cloud AI platform, which offers dedicated, scalable access to the latest NVIDIA architecture.

NVIDIA TensorRT-LLM for accelerated inference performance on large language models and the NVIDIA NeMo development platform for building custom generative AI models were also used to advance and optimize the process.

This collaboration underscores NVIDIA’s commitment to supporting the model-builder ecosystem.

Delivering Unprecedented Accuracy, Flexibility and Efficiency 

Excelling in multi-turn conversations, math, common sense reasoning, world knowledge and coding, this enterprise-grade AI model delivers precise, reliable performance across diverse tasks.

With a 128K context length, Mistral NeMo processes extensive and complex information more coherently and accurately, ensuring contextually relevant outputs.

Released under the Apache 2.0 license, which fosters innovation and supports the broader AI community, Mistral NeMo is a 12-billion-parameter model. Additionally, the model uses the FP8 data format for model inference, which reduces memory size and speeds deployment without any degradation to accuracy.

That means the model learns tasks better and handles diverse scenarios more effectively, making it ideal for enterprise use cases.

Mistral NeMo comes packaged as an NVIDIA NIM inference microservice, offering performance-optimized inference with NVIDIA TensorRT-LLM engines.

This containerized format allows for easy deployment anywhere, providing enhanced flexibility for various applications.

As a result, models can be deployed anywhere in minutes, rather than several days.

NIM features enterprise-grade software that’s part of NVIDIA AI Enterprise, with dedicated feature branches, rigorous validation processes, and enterprise-grade security and support.

It includes comprehensive support, direct access to an NVIDIA AI expert and defined service-level agreements, delivering reliable and consistent performance.

The open model license allows enterprises to integrate Mistral NeMo into commercial applications seamlessly.

Designed to fit on the memory of a single NVIDIA L40S, NVIDIA GeForce RTX 4090 or NVIDIA RTX 4500 GPU, the Mistral NeMo NIM offers high efficiency, low compute cost, and enhanced security and privacy.

Advanced Model Development and Customization 

The combined expertise of Mistral AI and NVIDIA engineers has optimized training and inference for Mistral NeMo.

Trained with Mistral AI’s expertise, especially on multilinguality, code and multi-turn content, the model benefits from accelerated training on NVIDIA’s full stack.

It’s designed for optimal performance, utilizing efficient model parallelism techniques, scalability and mixed precision with Megatron-LM.

The model was trained using Megatron-LM, part of NVIDIA NeMo, with 3,072 H100 80GB Tensor Core GPUs on DGX Cloud, composed of NVIDIA AI architecture, including accelerated computing, network fabric and software to increase training efficiency.

Availability and Deployment

With the flexibility to run anywhere — cloud, data center or RTX workstation — Mistral NeMo is ready to revolutionize AI applications across various platforms.

Experience Mistral NeMo as an NVIDIA NIM today via ai.nvidia.com, with a downloadable NIM coming soon.

See notice regarding software product information.

Read More