Boom in AI-Enabled Medical Devices Transforms Healthcare

The future of healthcare is software-defined and AI-enabled. Around 700 FDA-cleared, AI-enabled medical devices are now on the market, more than 10x the number available in 2020.

Many of the innovators behind this boom announced their latest AI-powered solutions at NVIDIA GTC, a global conference that last week attracted more than 16,000 business leaders, developers and researchers in Silicon Valley and many more online. 

Designed to make healthcare more efficient and help improve patient outcomes, these new technologies include foundation models to accelerate ultrasound analysis, augmented and virtual reality solutions for cardiac imaging, and generative AI software to support surgeons. 

Shifting From Hardware to Software-Defined Medical Devices 

Medical devices have long been hardware-centric, relying on intricate designs and precise engineering. They’re now shifting to be software-defined, meaning they can be enhanced over time through software updates — the same way that smartphones can be upgraded with new apps and features for years before a user upgrades to a new device.  

This new approach, supported by NVIDIA’s domain-specific platforms for real-time accelerated computing, is taking center stage because of its potential to transform patient care, increase efficiencies, enhance the clinician experience and drive better outcomes. 

Leading medtech companies such as GE Healthcare are using NVIDIA technology to develop, fine-tune and deploy AI for software-defined medical imaging applications.   

GE Healthcare announced at GTC that it used NVIDIA tools including the TensorRT software development kit to develop and optimize SonoSAMTrack, a recent research foundation model that delineates and tracks organs, structures or lesions across medical images with just a few clicks. The research model has the potential to simplify and speed up ultrasound analysis for healthcare professionals. 

Powering the Next Generation of Digital Surgery 

With the NVIDIA IGX edge computing platform and NVIDIA Holoscan medical-grade edge AI platform, medical device companies are accelerating the development and deployment of AI-powered innovation in the operating room.  

Johnson & Johnson MedTech is working with NVIDIA to test new AI capabilities for the company’s connected digital ecosystem for surgery. It aims to enable open innovation and accelerate the delivery of real-time insights at scale to support medical professionals before, during and after procedures.  

Paris-based robotic surgery company Moon Surgical is using Holoscan and IGX to power its Maestro System, which is used in laparoscopy, a technique where surgeons operate through small incisions with an internal camera and instruments.  

Maestro’s ScoPilot enables surgeons to control a laparoscope without taking their hands off other surgical tools during an operation. To date, it has been used to successfully treat more than 200 patients.

Moon Surgical and NVIDIA are also collaborating to bring generative AI features to the operating room using Maestro and Holoscan. 

NVIDIA Platforms Power Thriving Medtech Ecosystem 

A growing number of medtech companies and solution providers are making it easier for customers to adopt NVIDIA’s edge AI platforms to enhance and accelerate healthcare.

Arrow Electronics is delivering IGX as a subscription-like platform-as-a-service for industrial and medical customers. Customers who have adopted Arrow’s business model to accelerate application deployment include Kaliber AI, a company developing AI tools to assist minimally invasive surgery. At GTC, Kaliber showcased AI-generated insights for surgeons and a large language model to respond to patient questions. 

Global visualization leader Barco is adopting Holoscan and IGX to build a turnkey surgical AI platform for customers seeking an off-the-shelf offering that allows them to focus their engineering resources on application development. The company is working with SoftAcuity on two Holoscan-based products that will include generative AI voice control and AI-powered data analytics.  

And Magic Leap has integrated Holoscan in its extended reality software stack, enhancing the capabilities of customers like Medical iSight — a software developer building real-time, intraoperative support for minimally invasive treatments of stroke and neurovascular conditions. 

Learn more about NVIDIA-accelerated medtech 

Get started on NVIDIA NGC or visit ai.nvidia.com to experiment with more than two dozen healthcare microservices 

Model Innovators: How Digital Twins Are Making Industries More Efficient

A manufacturing plant near Hsinchu, Taiwan’s Silicon Valley, is among facilities worldwide boosting energy efficiency with AI-enabled digital twins.

A virtual model can help streamline operations, maximizing throughput for its physical counterpart, say engineers at Wistron, a global designer and manufacturer of computers and electronics systems.

In the first of several use cases, the company built a digital copy of a room where NVIDIA DGX systems undergo thermal stress tests. Early results were impressive.

Making Smart Simulations

Using NVIDIA Modulus, a framework for building AI models that understand the laws of physics, Wistron created digital twins that let them accurately predict the airflow and temperature in test facilities that must remain between 27 and 32 degrees C.

A simulation that would’ve taken nearly 15 hours with traditional methods on a CPU took just 3.3 seconds on an NVIDIA GPU running inference with an AI model developed using Modulus, a whopping 15,000x speedup.
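
As a rough check on that figure, the speedup is simply the ratio of the two runtimes. The short Python sketch below assumes the CPU run took a full 15 hours (the article says "nearly 15 hours"), so it lands slightly above the quoted 15,000x. The variable names are illustrative only.

```python
# Back-of-the-envelope check of the quoted speedup.
# Assumption: the CPU run is taken as a full 15 hours; the article says "nearly 15 hours".
cpu_seconds = 15 * 3600      # traditional CPU simulation, in seconds
gpu_seconds = 3.3            # AI-model inference on a GPU, in seconds

speedup = cpu_seconds / gpu_seconds
print(f"Speedup: about {speedup:,.0f}x")
```

A CPU runtime of about 13.75 hours would give exactly 15,000x, which is consistent with the "nearly 15 hours" wording.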

The results were fed into tools and applications built by Wistron developers with NVIDIA Omniverse, a platform for creating 3D workflows and applications based on OpenUSD.

A bird’s-eye view of the model of Wistron’s computer test room.

With their Omniverse-powered software, Wistron created realistic and immersive simulations that operators interact with via VR headsets. And thanks to the AI models they developed using Modulus, the airflows in the simulation obey the laws of physics.

“Physics-informed models let us control the test process and the room’s temperature remotely in near real time, saving time and energy,” said John Lu, a manufacturing operations director at Wistron.

Specifically, Wistron combined separate models for predicting air temperature and airflow to eliminate risks of overheating in the test room. It also created a recommendation system to identify the best locations to test computer baseboards.

The digital twin, linked to thousands of networked sensors, enabled Wistron to increase the facility’s overall energy efficiency by up to 10%. That amounts to using up to 121,600 kWh less electricity a year, reducing carbon emissions by a whopping 60,192 kilograms.
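
The two savings figures can be related with simple arithmetic. The sketch below derives the emission factor they imply and, under the assumption that the 10% efficiency gain corresponds directly to a 10% cut in annual electricity use, the implied baseline consumption. Neither derived number is stated in the article.

```python
# Figures quoted in the article.
energy_saved_kwh = 121_600   # annual electricity saved (upper bound)
co2_saved_kg = 60_192        # annual carbon emissions avoided

# Implied emission factor (kg of CO2 per kWh); derived here, not stated in the article.
emission_factor = co2_saved_kg / energy_saved_kwh

# Assumption: the 10% gain maps directly to a 10% cut in annual electricity use.
implied_baseline_kwh = energy_saved_kwh / 0.10

print(f"{emission_factor:.3f} kg CO2 per kWh")
print(f"{implied_baseline_kwh:,.0f} kWh per year baseline")
```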

An Expanding Effort

Currently, the group is expanding its AI model to track more than a hundred variables in a space that holds 50 computer racks. The team is also simulating all the mechanical details of the servers and testers.

“The final model will help us optimize test scheduling as well as the energy efficiency of the facilities’ air conditioning system,” said Derek Lai, a Wistron technical supervisor with expertise in physics-informed neural networks.

Looking ahead, “The tools and applications we’re building with Omniverse help us improve the layout of our DGX factories to provide the best throughput, further improving efficiency,” said Lu.

Efficiently Generating Energy

Half a world away, Siemens Energy is demonstrating the power of digital industrialization using Modulus and Omniverse.

The Munich-based company, whose technology generates one-sixth of the world’s electricity, achieved a 10,000x speedup simulating a heat-recovery steam generator using a physics-informed AI model.

By using a digital twin to detect corrosion early, operators of these massive systems can reduce downtime by 70%, potentially saving the industry $1.7 billion annually. A standard simulation, by comparison, took half a month.

“The reduced computational time enables us to develop energy-efficient digital twins for a sustainable, reliable and affordable energy ecosystem,” said Georg Rollmann, head of advanced analytics and AI at Siemens Energy.

Digital Twins Drive Science and Industry

Automotive companies are applying the technology to the design of new cars and manufacturing plants. Scientists are using it in fields as diverse as astrophysics, genomics and weather forecasting. It’s even being used to create a digital twin of Earth to understand and mitigate the impacts of climate change.

Every year, physics simulations, typically run on supercomputer-class systems, consume an estimated 200 billion CPU core hours and 4 terawatt hours of energy. Physics-informed AI is accelerating these complex workflows 200x on average, saving time, cost and energy.

For more insights, listen to a talk from GTC describing Wistron’s work and a panel about industries using generative AI.

Learn more about the impact accelerated computing is having on sustainability.

Into the Omniverse: Groundbreaking OpenUSD Advancements Put NVIDIA GTC Spotlight on Developers

Editor’s note: This post is part of Into the Omniverse, a series focused on how artists, developers and enterprises can transform their workflows using the latest advances in OpenUSD and NVIDIA Omniverse.

The Universal Scene Description framework, aka OpenUSD, has emerged as a game-changer for building virtual worlds and accelerating creative workflows. It can ease the handling of complex datasets, facilitate collaboration and enable seamless interoperability between 3D applications.

The latest news and demos from NVIDIA GTC, a global AI conference that ran last week, put on display the power developers gain from NVIDIA Omniverse — a platform of application programming interfaces (APIs) and software development kits (SDKs) that enable them to build 3D pipelines, tools, applications and services.

Newly announced NVIDIA Omniverse Cloud APIs, coming first to Microsoft Azure, allow developers to send their OpenUSD industrial scenes from content-creation applications to the NVIDIA Graphics Delivery Network.

Such a workflow was showcased in a demo featuring an interactive car configurator application, developed by computer-generated-imagery studio Katana using Omniverse, streamed in full fidelity to an Apple Vision Pro’s high-resolution display. A designer wearing the Vision Pro toggled through paint and trim options, and even entered the vehicle.

In a separate demo, Dassault Systèmes showcased, using its 3DEXCITE portfolio, a powerful web-based application for 3D data preparation supercharged with NVIDIA AI and Omniverse Cloud APIs to deliver new generative storytelling capabilities.

OpenUSD also played a part in the announcement of NVIDIA’s latest AI supercomputer, a powerful cluster based on the NVIDIA GB200 NVL72 liquid-cooled system, which was showcased as a digital twin in Omniverse.

Engineers unified and visualized multiple computer-aided design datasets with full physical accuracy and photorealism using OpenUSD through the Cadence Reality digital twin platform, powered by Omniverse APIs. The technologies together provided a powerful computing platform for developing OpenUSD-based 3D tools, workflows and applications.

Siemens announced it has integrated OpenUSD into its Xcelerator platform applications via Omniverse Cloud APIs, enabling its customers to unify their 3D data and services in digital twins with physically based rendering.

A demo showcased how ship manufacturer HD Hyundai used Siemens’ Teamcenter X, which is part of Xcelerator, to design digital twins of complex engineering projects, delivering accelerated collaboration, minimized workflow waste, time and cost savings, and reduced manufacturing defects.

OpenUSD Ecosystem Updates on Replay

The latest OpenUSD ecosystem updates shared at GTC include:

  • Ansys is adopting OpenUSD and Omniverse Cloud APIs to enable data interoperability and NVIDIA RTX visualization in technologies such as Ansys AVxcelerate for autonomous vehicles, Ansys Perceive EM for 6G simulation, and NVIDIA-accelerated solvers such as Ansys Fluent.
  • Dassault Systèmes is using OpenUSD, Omniverse Cloud APIs and Shutterstock 3D AI Services for generative storytelling in 3DEXCITE applications.
  • Continental is developing an OpenUSD-based digital twin platform to optimize factory operations and speed time to market.
  • Hexagon is integrating reality-capture sensors and digital-reality platforms with OpenUSD and Omniverse Cloud APIs for hyperrealistic simulation and visualization.
  • Media.Monks is adopting Omniverse for a generative AI- and OpenUSD-enabled content-creation pipeline for scalable hyper-personalization.
  • Microsoft is integrating Omniverse Cloud APIs with Microsoft Power BI, so factory operators can see real-time factory data overlaid on a 3D digital twin to speed up production.
  • Rockwell Automation is using OpenUSD and Omniverse Cloud APIs for RTX-enabled visualization in industrial automation and digital transformation.
  • Trimble is enabling interactive NVIDIA Omniverse RTX viewers with Trimble model data using OpenUSD and Omniverse Cloud APIs.
  • Wistron is building OpenUSD-based digital twins of NVIDIA DGX and HGX factories using custom software developed with Omniverse SDKs and APIs.
  • WPP is expanding its Omniverse Cloud-based OpenUSD and generative AI content-generation engine to the retail and consumer packaged goods sector.

Get Plugged In to the World of OpenUSD

Several GTC sessions expanded on the latest OpenUSD advancements. Register free to watch them on demand.

Get started with NVIDIA Omniverse by downloading the standard license free, accessing OpenUSD resources and learning how Omniverse Enterprise can connect your team. Stay up to date on Instagram, Medium and X. For more, join the Omniverse community on the forums, Discord server, Twitch and YouTube channels.

Featured image courtesy of Siemens, HD Hyundai.

NVIDIA Blackwell and Automotive Industry Innovators Dazzle at NVIDIA GTC

Generative AI, in the data center and in the car, is making vehicle experiences safer and more enjoyable.

The latest advancements in automotive technology were on display last week at NVIDIA GTC, a global AI conference that drew tens of thousands of business leaders, developers and researchers from around the world.

The event kicked off with NVIDIA founder and CEO Jensen Huang’s keynote, which included the announcement of the NVIDIA Blackwell platform — purpose-built to power a new era of AI computing.

The NVIDIA Blackwell GPU architecture will be integrated into the NVIDIA DRIVE Thor centralized car computer to enable generative AI applications and immersive in-vehicle experiences. Large language models will be able to run in the car, enabling an intelligent copilot that understands and speaks in natural language.

BYD, the world’s largest electric-vehicle maker, announced it will adopt DRIVE Thor as the AI brain of its future fleets. In addition, the company will use NVIDIA’s AI infrastructure for cloud-based AI development and training, and the NVIDIA Isaac and NVIDIA Omniverse platforms to develop tools and applications for virtual factory planning and retail configurators. Hyper, Nuro, Plus, Waabi, WeRide and XPENG are also adopting DRIVE Thor.

Learn more about the automotive ecosystem’s announcements at GTC.

Some of the latest NVIDIA-powered vehicles displayed on the exhibition floor included:

  • Aurora self-driving truck, already on the highways of Texas
  • Lucid Air long-range electric sedan
  • Mercedes-Benz Concept CLA Class, showcasing what’s to come
  • Nuro R3, a fully autonomous robotic delivery model
  • Polestar 3, the SUV for the electric age
  • Volvo Cars EX90, its new fully electric, flagship SUV
  • And WeRide’s Robobus, a new form of urban mobility.

Mercedes-Benz Concept CLA Class.

The NVIDIA auto booth highlighted the wide adoption of the NVIDIA DRIVE platform, with displays featuring electronic control units from a variety of partners, including Bosch, Lenovo and ZEEKR.

A wide range of NVIDIA automotive partners, including Ansys, Foretellix, Lenovo, MediaTek, NODAR, OMNIVISION, Plus, Seyond, SoundHound, Voxel51 and Waabi, all made next-generation product announcements at GTC.

In addition, the automotive pavilion buzzed with interest in the latest lidar advancements from Luminar and Robosense, as well as Helm.ai’s software offerings for the level 2 to level 4 autonomous driving stack.

And other partners, such as Ford, Geely, General Motors, Jaguar Land Rover and Zoox, participated in dozens of sessions and panels covering topics such as building data center applications and developing safe autonomous vehicles. Watch the sessions on demand.

Learn more about the latest advancements in generative AI and automotive technology by watching Huang’s GTC keynote in replay.

AI’s New Frontier: From Daydreams to Digital Deeds

Imagine a world where you can whisper your digital wishes into your device, and poof, it happens.

That world may be coming sooner than you think. But if you’re worried about AI doing your thinking for you, you might be waiting for a while.

In a fireside chat Wednesday at NVIDIA GTC, the global AI conference, Kanjun Qiu, CEO of Imbue, and Bryan Catanzaro, VP of applied deep learning research at NVIDIA, challenged many of the clichés that have long dominated conversations about AI.

Launched in October 2022, Imbue made headlines with its Series B fundraiser last year, raising over $200 million at a $1 billion valuation.

Bridging the Gap Between ‘Idea and Execution’

The discussion highlighted not only Imbue’s approach toward building practical AI agents able to automate menial, unrewarding work, but also painted a vivid picture of what the next chapter in AI innovation might hold.

“Our lives are full of so much friction … every single person’s vision can come to life,” Qiu said. “The barrier between idea and execution can be much smaller.”

Catanzaro’s reflections on the practical difficulties of using AI for simple tasks, such as his own challenges trying to get his digital assistant to help him find his next meeting, underscored the current limitations in human-AI interaction.

It turns out that figuring out where and when to go to a meeting, while easy for a human assistant, isn’t easy to automate.

“We tend to underestimate the things that we do naturally and overestimate the things that require reasoning,” Catanzaro observed. “One of the things humans deal with well is ambiguity.”

This set the stage for a broader discussion of the need for AI to move beyond mere code generation and become a dynamic, intuitive interface between humans and computers.

Qiu said the idea that AI can be a magical assistant, one that knows everything about you, “isn’t necessarily the right paradigm.”

That’s because delegation is hard.

“When I’m delegating something, even to a human, I have to think a lot about ‘okay, how can I package this up so that the person will do the right thing?’”

Instead, the better model might be telling your computer to do anything you want. So you’re “telling your computer to do stuff and the agent is a middle layer,” she said.

Such agents will need to be able to interact with people — something often described as “reasoning,” the two observed — and communicate with computers — or “code.”

A Vision for Empowerment Through Technology

Qiu and Catanzaro — who often completed each other’s sentences during the 45-minute conversation — compared AI’s potential to democratize software creation to the Industrial Revolution’s impact on manufacturing.

The parts needed for a steam engine, for example, once took years to create. Now they can be ordered off the shelf for a small sum.

Both speakers emphasized the importance of creating intuitive interfaces that allow individuals from nontechnical backgrounds to engage with computers more effectively, fostering a more inclusive digital landscape.

That means going beyond coding, which is done in text-heavy environments such as an integrated development environment, and even beyond text-based chats.

“The interface to agents, a lot of them today, is like a chat interface. It’s not a very good interface, in a lot of ways, very restrictive. And so there are much better ways of working with these systems,” Qiu said.

The Future of Personal Computing

Qiu and Catanzaro discussed the role that virtual worlds will play in this, and how they could serve as interfaces for human-technology interaction.

“I think it’s pretty clear that AI is going to help build virtual worlds,” said Catanzaro. “I think the maybe more controversial part is virtual worlds are going to be necessary for humans to interact with AI.”

People have an almost primal fear of being displaced, Catanzaro said, but what’s much more likely is that our capabilities will be amplified as the technology fades into the background.

Catanzaro compared it to the adoption of electricity. A century ago, people talked a lot about electricity. Now that it’s ubiquitous, it’s no longer the focus of broader conversations, even as it makes our day-to-day lives better.

“I think of it as really being able to [help us] control information environments … once we have control over information environments, we’ll feel a lot more empowered,” Qiu said. “Every single person’s vision can come to life.”

Here Be Dragons: ‘Dragon’s Dogma 2’ Comes to GeForce NOW

Arise for a new adventure with Dragon’s Dogma 2, leading two new titles joining the GeForce NOW library this week.

Set Forth, Arisen

Fulfill a forgotten destiny in “Dragon’s Dogma 2” from Capcom.

Time to go on a grand adventure, Arisen!

Dragon’s Dogma 2, the long-awaited sequel to Capcom’s legendary action role-playing game, streams this week on GeForce NOW.

The game challenges players to choose their own experience, including their Arisen’s appearance, vocation, party, approaches to different situations and more. Wield swords, bows and magick across an immersive fantasy world full of life and battle. But players won’t be alone. Recruit Pawns — mysterious otherworldly beings — to aid in battle and work with other players’ Pawns to fight the diverse monsters inhabiting the ever-changing lands.

Upgrade to a GeForce NOW Ultimate membership to stream Dragon’s Dogma 2 from NVIDIA GeForce RTX 4080 servers in the cloud for the highest performance, even on low-powered devices. Ultimate members also get exclusive access to servers to get right into gaming without waiting for any downloads.

New Games, New Challenges

No holding back.

Battlefield 2042: Season 7 Turning Point is here. Do whatever it takes to battle for Earth’s most valuable resource — water — in a Chilean desert. Deploy on a new map, Haven, focused on suburban combat, and revisit a fan-favorite front: Stadium. Gear up with new hardware like the SCZ-3 SMG or the Predator SRAW, and jump into a battle for ultimate power.

Then, look forward to the following list of games this week:

  • Alone in the Dark (New release on Steam, March 20)
  • Dragon’s Dogma 2 (New release on Steam, March 21)

What are you planning to play this weekend? Let us know on X or in the comments below.

Instant Latte: NVIDIA Gen AI Research Brews 3D Shapes in Under a Second

NVIDIA researchers have pumped a double shot of acceleration into their latest text-to-3D generative AI model, dubbed LATTE3D.

Like a virtual 3D printer, LATTE3D turns text prompts into 3D representations of objects and animals within a second.

Crafted in a popular format used for standard rendering applications, the generated shapes can be easily served up in virtual environments for developing video games, ad campaigns, design projects or virtual training grounds for robotics.

“A year ago, it took an hour for AI models to generate 3D visuals of this quality — and the current state of the art is now around 10 to 12 seconds,” said Sanja Fidler, vice president of AI research at NVIDIA, whose Toronto-based AI lab team developed LATTE3D. “We can now produce results an order of magnitude faster, putting near-real-time text-to-3D generation within reach for creators across industries.”

This advancement means that LATTE3D can produce 3D shapes near instantly when running inference on a single GPU, such as the NVIDIA RTX A6000, which was used for the NVIDIA Research demo.

Ideate, Generate, Iterate: Shortening the Cycle

Instead of starting a design from scratch or combing through a 3D asset library, a creator could use LATTE3D to generate detailed objects as quickly as ideas pop into their head.

The model generates a few different 3D shape options based on each text prompt, giving a creator options. Selected objects can be optimized for higher quality within a few minutes. Then, users can export the shape into graphics software applications or platforms such as NVIDIA Omniverse, which enables Universal Scene Description (OpenUSD)-based 3D workflows and applications.

While the researchers trained LATTE3D on two specific datasets — animals and everyday objects — developers could use the same model architecture to train the AI on other data types.

If trained on a dataset of 3D plants, for example, a version of LATTE3D could help a landscape designer quickly fill out a garden rendering with trees, flowering bushes and succulents while brainstorming with a client. If trained on household objects, the model could generate items to fill in 3D simulations of homes, which developers could use to train personal assistant robots before they’re tested and deployed in the real world.

LATTE3D was trained using NVIDIA A100 Tensor Core GPUs. In addition to 3D shapes, the model was trained on diverse text prompts generated using ChatGPT to improve the model’s ability to handle the various phrases a user might come up with to describe a particular 3D object — for example, understanding that prompts featuring various canine species should all generate doglike shapes.

NVIDIA Research comprises hundreds of scientists and engineers worldwide, with teams focused on topics including AI, computer graphics, computer vision, self-driving cars and robotics.

Researchers shared work at NVIDIA GTC this week that advances the state of the art for training diffusion models. Read more on the NVIDIA Technical Blog, and see the full list of NVIDIA Research sessions at GTC, running in San Jose, Calif., and online through March 21.

For the latest NVIDIA AI news, watch the replay of NVIDIA founder and CEO Jensen Huang’s keynote address at GTC.

‘You Transformed the World,’ NVIDIA CEO Tells Researchers Behind Landmark AI Paper

Of GTC’s 900+ sessions, the most wildly popular was a conversation hosted by NVIDIA founder and CEO Jensen Huang with seven of the authors of the legendary research paper that introduced the aptly named transformer — a neural network architecture that went on to change the deep learning landscape and enable today’s era of generative AI.

“Everything that we’re enjoying today can be traced back to that moment,” Huang said to a packed room with hundreds of attendees, who heard him speak with the authors of “Attention Is All You Need.”

Sharing the stage for the first time, the research luminaries reflected on the factors that led to their original paper, which has been cited more than 100,000 times since it was first published and presented at the NeurIPS AI conference. They also discussed their latest projects and offered insights into future directions for the field of generative AI.

While they started as Google researchers, the collaborators are now spread across the industry, most as founders of their own AI companies.

“We have a whole industry that is grateful for the work that you guys did,” Huang said.

From L to R: Lukasz Kaiser, Noam Shazeer, Aidan Gomez, Jensen Huang, Llion Jones, Jakob Uszkoreit, Ashish Vaswani and Illia Polosukhin.

Origins of the Transformer Model

The research team initially sought to overcome the limitations of recurrent neural networks, or RNNs, which were then the state of the art for processing language data.

Noam Shazeer, cofounder and CEO of Character.AI, compared RNNs to the steam engine and transformers to the improved efficiency of internal combustion.

“We could have done the industrial revolution on the steam engine, but it would just have been a pain,” he said. “Things went way, way better with internal combustion.”

“Now we’re just waiting for the fusion,” quipped Illia Polosukhin, cofounder of blockchain company NEAR Protocol.

The paper’s title came from a realization that attention mechanisms, the elements of a neural network that let it determine the relationships between different parts of the input data, were the most critical component of their model’s performance.
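
To make the idea concrete, here is a minimal sketch of the scaled dot-product attention formula from the paper, softmax(QK^T / sqrt(d_k))V, written in plain Python. It is a teaching sketch under stated assumptions, not the authors' implementation: it uses a single head, no learned projection matrices and no masking, and the toy input values are made up.

```python
import math

def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    peak = max(scores)                       # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Score the query against every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys]
        weights = softmax(scores)
        # Each output is a weighted average of the value vectors.
        out = [sum(w * v[j] for w, v in zip(weights, values))
               for j in range(len(values[0]))]
        outputs.append(out)
    return outputs

# Toy input: three tokens with 2-dimensional embeddings (made-up numbers).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
result = attention(tokens, tokens, tokens)   # self-attention: Q = K = V
for row in result:
    print([round(x, 3) for x in row])
```

Each output row is a blend of all the value vectors, weighted by how strongly that token's query matches every key. That weighting is how the model relates one part of the input to another.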

“We had very recently started throwing bits of the model away, just to see how much worse it would get. And to our surprise it started getting better,” said Llion Jones, cofounder and chief technology officer at Sakana AI.

Having a name as general as “transformers” spoke to the team’s ambitions to build AI models that could process and transform every data type — including text, images, audio, tensors and biological data.

“That North Star, it was there on day zero, and so it’s been really exciting and gratifying to watch that come to fruition,” said Aidan Gomez, cofounder and CEO of Cohere. “We’re actually seeing it happen now.”

Packed house at the San Jose Convention Center.

Envisioning the Road Ahead 

Adaptive computation, where a model adjusts how much computing power is used based on the complexity of a given problem, is a key factor the researchers see improving in future AI models.

“It’s really about spending the right amount of effort and ultimately energy on a given problem,” said Jakob Uszkoreit, cofounder and CEO of biological software company Inceptive. “You don’t want to spend too much on a problem that’s easy or too little on a problem that’s hard.”

A math problem like two plus two, for example, shouldn’t be run through a trillion-parameter transformer model — it should run on a basic calculator, the group agreed.
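
The calculator example can be illustrated with a toy router. This is only a loose analogy, since adaptive computation as the researchers describe it would happen inside a model rather than by dispatching between two systems. The `large_model` function is a hypothetical stand-in, not a real API.

```python
import ast
import operator

# Operators the "basic calculator" path can handle.
_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
        ast.Mult: operator.mul, ast.Div: operator.truediv}

def _calc(node):
    """Evaluate a parsed arithmetic expression; raise ValueError for anything else."""
    if (isinstance(node, ast.Constant)
            and isinstance(node.value, (int, float))
            and not isinstance(node.value, bool)):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        return _OPS[type(node.op)](_calc(node.left), _calc(node.right))
    raise ValueError("not simple arithmetic")

def large_model(prompt):
    """Hypothetical stand-in for an expensive model call."""
    return f"[large model handles: {prompt}]"

def answer(prompt):
    """Route easy arithmetic to the cheap path and everything else to the big model."""
    try:
        tree = ast.parse(prompt, mode="eval")
        return "calculator", _calc(tree.body)
    except (SyntaxError, ValueError, ZeroDivisionError):
        return "large_model", large_model(prompt)

print(answer("2 + 2"))
print(answer("Summarize this paper"))
```

The design choice is to try the cheap path first and fall back only when it cannot handle the input, so easy problems never pay the cost of the expensive one.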

They’re also looking forward to the next generation of AI models.

“I think the world needs something better than the transformer,” said Gomez. “I think all of us here hope it gets succeeded by something that will carry us to a new plateau of performance.”

“You don’t want to miss these next 10 years,” Huang said. “Unbelievable new capabilities will be invented.”

The conversation concluded with Huang presenting each researcher with a framed cover plate of the NVIDIA DGX-1 AI supercomputer, signed with the message, “You transformed the world.”

Jensen presents lead author Ashish Vaswani with a signed DGX-1 cover.

There’s still time to catch the session replay by registering for a virtual GTC pass — it’s free.

To discover the latest in generative AI, watch Huang’s GTC keynote address.

AI Decoded From GTC: The Latest Developer Tools and Apps Accelerating AI on PC and Workstation

Editor’s note: This post is part of the AI Decoded series, which demystifies AI by making the technology more accessible, and which showcases new hardware, software, tools and accelerations for RTX PC users.

NVIDIA’s RTX AI platform includes tools and software development kits that help Windows developers create cutting-edge generative AI features to deliver the best performance on AI PCs and workstations.

At GTC, NVIDIA’s annual technology conference, a dream team of industry luminaries, developers and researchers has come together to learn from one another, fueling what’s next in AI and accelerated computing.

This special edition of AI Decoded from GTC spotlights the best AI tools currently available and looks at what’s ahead for the 100 million RTX PC and workstation users and developers.

Chat with RTX, the tech demo and developer reference project that quickly and easily allows users to connect a powerful LLM to their own data, showcased new capabilities and new models in the GTC exhibit hall.

The winners of the Gen AI on RTX PCs contest were announced Monday. OutlookLLM, Rocket League BotChat and CLARA were highlighted in one of the AI Decoded talks in the generative AI theater, and each is accelerated by NVIDIA TensorRT-LLM. Two other AI Decoded talks covered using generative AI in content creation and a deep dive on Chat with RTX.

Developer frameworks and interfaces with TensorRT-LLM integration continue to grow: Jan.AI, LangChain, LlamaIndex and Oobabooga will all soon be accelerated, adding to the more than 500 AI applications already available for RTX PCs and workstations.

NVIDIA NIM microservices are coming to RTX PCs and workstations. They provide pre-built containers, with industry standard APIs, enabling developers to accelerate deployment on RTX PCs and workstations. NVIDIA AI Workbench, an easy-to-use developer toolkit to manage AI model customization and optimization workflows, is now generally available for RTX developers.

These ecosystem integrations and tools will accelerate development of new Windows apps and features. And today’s contest winners are an inspiring glimpse into what that content will look like.

Hear More, See More, Chat More

Chat with RTX, or ChatRTX for short, uses retrieval-augmented generation, NVIDIA TensorRT-LLM software and NVIDIA RTX acceleration to bring local generative AI capabilities to RTX-powered Windows systems. Users can quickly and easily connect local files as a dataset to an open large language model like Mistral or Llama 2, enabling queries for quick, contextually relevant answers.
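Retrieval-augmented generation can be illustrated with a toy example. The sketch below is not ChatRTX's implementation: it scores documents by simple word overlap instead of learned embeddings, and it stops at building the prompt rather than calling an LLM. All function names and sample documents are hypothetical.

```python
import re
from collections import Counter


def tokenize(text: str) -> Counter:
    """Lowercase bag-of-words representation of a text."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, documents: dict[str, str], k: int = 1) -> list[str]:
    """Return the names of the k documents sharing the most words with the query."""
    query_words = tokenize(query)
    scores = {
        name: sum((tokenize(body) & query_words).values())
        for name, body in documents.items()
    }
    return sorted(scores, key=scores.get, reverse=True)[:k]


def build_prompt(query: str, documents: dict[str, str], k: int = 1) -> str:
    """Prepend the retrieved passages so the model can ground its answer in them."""
    context = "\n".join(documents[name] for name in retrieve(query, documents, k))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"


# Hypothetical local files standing in for a user's dataset.
docs = {
    "trip.txt": "The team offsite is in Lisbon on the first week of June.",
    "budget.txt": "The hardware budget for this quarter is 12,000 dollars.",
}
prompt = build_prompt("Where is the team offsite?", docs)
```

The resulting `prompt` carries the most relevant passage ahead of the question, which is what lets a general-purpose model answer from the user's own files.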

Moving beyond text, ChatRTX will soon add support for voice, images and new models.

Users will be able to talk to ChatRTX with Whisper — an automatic speech recognition system that uses AI to process spoken language. When the feature becomes available, ChatRTX will be able to “understand” spoken language, and provide text responses.

A future update will also add support for photos. By integrating OpenAI’s CLIP — Contrastive Language-Image Pre-training — users will be able to search by words, terms or phrases to find photos in their private library.

In addition to Google’s Gemma, ChatGLM will get support in a future update.

Developers can start with the latest version of the developer reference project on GitHub.

Generative AI for the Win

The NVIDIA Generative AI on NVIDIA RTX developer contest prompted developers to build a Windows app or plug-in.

“I found that playing against bots that react to game events with in-game messages in near real time adds a new level of entertainment to the game, and I’m excited to share my approach to incorporating AI into gaming as a participant in this developer contest. The target audience for my project is anyone who plays Rocket League with RTX hardware.” — Brian Caffey, Rocket League BotChat developer

Submissions were judged on three criteria, including a short demo video posted to social media, relative impact and ease of use of the project, and how effectively NVIDIA’s technology stack was used in the project. Each of the three winners received a pass to GTC, including a spot in the NVIDIA Deep Learning Institute GenAI/LLM courses, and a GeForce RTX 4090 GPU to power future development work.

OutlookLLM gives Outlook users generative AI features — such as email composition — securely and privately in their email client on RTX PCs and workstations. It uses a local LLM served via TensorRT-LLM.

Rocket League BotChat, for the popular Rocket League game, is a plug-in that allows bots to send contextual in-game chat messages based on a log of game events, such as scoring a goal or making a save. Designed to be used only in offline games against bot players, the plug-in is configurable in many ways via its settings menu.

CLARA (short for Command Line Assistant with RTX Acceleration) is designed to enhance the command line interface of PowerShell by translating plain English instructions into actionable commands. The extension runs locally, responds quickly and keeps users in their PowerShell context. Once it’s enabled, users type their English instructions and press the Tab key to invoke CLARA. Installation is straightforward, and there are options for both script-based and manual setup.

From the Generative AI Theater

GTC attendees can attend three AI Decoded talks on Wednesday, March 20 at the generative AI theater. These 15-minute sessions will guide the audience through ChatRTX and how developers can productize their own personalized chatbot; how each of the three contest winners showed some of the possibilities for generative AI apps on RTX systems; and a celebration of artists and the NVIDIA-powered tools and methods they use.

In the creator session, Lee Fraser, senior developer relations manager for generative AI media and entertainment at NVIDIA, will explore why generative AI has become so popular. He’ll show off new workflows and how creators can rapidly explore ideas. Artists to be featured include Steve Talkowski, Sophia Crespo, Lim Wenhui, Erik Paynter, Vanessa Rosa and Refik Anadol.

Anadol also has an installation at the show that combines data visualization and imagery based on that data.

Ecosystem of Acceleration

Top creative app developers, like Blackmagic Design and Topaz Labs, have integrated RTX AI acceleration in their software. TensorRT doubles the speed of AI effects like rotoscoping, denoising, super-resolution and video stabilization in the DaVinci Resolve and Topaz apps.

“Blackmagic Design and NVIDIA’s ongoing collaborations to run AI models on RTX AI PCs will produce a new wave of groundbreaking features that give users the power to create captivating and immersive content, faster.” — Rohit Gupta, director of software development at Blackmagic Design

TensorRT-LLM is being integrated with popular developer frameworks and ecosystems such as LangChain, LlamaIndex, Oobabooga and Jan.AI. Developers and enthusiasts can easily access the performance benefits of TensorRT-LLM through top LLM frameworks to build and deploy generative AI apps to both local and cloud GPUs.

Enthusiasts can also try out their favorite LLMs — accelerated with TensorRT-LLM on RTX systems — through the Oobabooga and Jan.AI chat interfaces.

AI That’s NIMble, AI That’s Quick

Developers and tinkerers can tap into NIM microservices. These pre-built AI “containers,” with industry-standard APIs, provide an optimized solution that helps to reduce deployment times from weeks to minutes. They can be used with more than two dozen popular models from NVIDIA, Getty Images, Google, Meta, Microsoft, Shutterstock and more.

NVIDIA AI Workbench is now generally available, helping developers quickly create, test and customize pretrained generative AI models and LLMs on RTX GPUs. It offers streamlined access to popular repositories like Hugging Face, GitHub and NVIDIA NGC, along with a simplified user interface that enables developers to easily reproduce, collaborate on and migrate projects.

Projects can be easily scaled up when additional performance is needed — whether to the data center, a public cloud or NVIDIA DGX Cloud — and then brought back to local RTX systems on a PC or workstation for inference and light customization. AI Workbench is a free download and provides example projects to help developers get started quickly.

These tools, and many others announced and shown at GTC, are helping developers drive innovative AI solutions.

From the Blackwell platform’s arrival, to a digital twin for Earth’s climate, it’s been a GTC to remember. For RTX PC and workstation users and developers, it was also a glimpse into what’s next for generative AI.

See notice regarding software product information.

Secure by Design: NVIDIA AIOps Partner Ecosystem Blends AI for Businesses

In today’s complex business environments, IT teams face a constant flow of challenges, from simple issues like employee account lockouts to critical security threats. These situations demand both quick fixes and strategic defenses, making the job of maintaining smooth and secure operations ever tougher.

That’s where AIOps comes in, blending artificial intelligence with IT operations to not only automate routine tasks, but also enhance security measures. This efficient approach allows teams to quickly deal with minor issues and, more importantly, to identify and respond to security threats faster and with greater accuracy than before.

By using machine learning, AIOps becomes a crucial tool in not just streamlining operations but also in strengthening security across the board. It’s proving to be a game-changer for businesses looking to integrate advanced AI into their teams, helping them stay a step ahead of potential security risks.

According to IDC, the IT operations management software market is expected to grow at a rate of 10.3% annually, reaching a projected revenue of $28.4 billion by 2027. This growth underscores the increasing reliance on AIOps for operational efficiency and as a critical component of modern cybersecurity strategies.

As machine learning operations grow rapidly in the era of generative AI, a broad ecosystem of NVIDIA partners is offering AIOps solutions that leverage NVIDIA AI to improve IT operations.

NVIDIA is helping a broad ecosystem of AIOps partners with accelerated compute and AI software. This includes NVIDIA AI Enterprise, a cloud-native stack that can run anywhere and provides a basis for AIOps through software like NVIDIA NIM for accelerated inference of AI models, NVIDIA Morpheus for AI-based cybersecurity and NVIDIA NeMo for custom generative AI. This software facilitates GenAI-based chatbot, summarization and search functionality.

AIOps providers using NVIDIA AI include:

  • Dynatrace Davis hypermodal AI advances AIOps by integrating causal, predictive and generative AI techniques with the addition of Davis CoPilot. This combination enhances observability and security across IT, development, security and business operations by offering precise and actionable, AI-driven answers and automation.
  • Elastic offers Elasticsearch Relevance Engine (ESRE) for semantic and vector search, which integrates with popular LLMs like GPT-4 to power AI Assistants in their Observability and Security solutions. The Observability AI Assistant is a next-generation AIOps capability that helps IT teams understand complex systems, monitor health and automate remediation of operational issues.
  • New Relic is advancing AIOps by leveraging its machine learning, generative AI assistant frameworks and longstanding expertise in observability. Its machine learning and advanced logic help IT teams reduce alerting noise, improve mean time to detect and mean time to repair, automate root cause analysis and generate retrospectives. Its GenAI assistant, New Relic AI, accelerates issue resolution by allowing users to identify, explain and resolve errors without switching contexts, and suggests and applies code fixes directly in a developer’s integrated development environment. It also extends incident visibility and prevention to non-technical teams by automatically producing high-level system health reports, analyzing and summarizing dashboards and answering plain-language questions about a user’s applications, infrastructure and services. New Relic also provides full-stack observability for AI-powered applications benefitting from NVIDIA GPUs.
  • PagerDuty has introduced a new feature in PagerDuty Copilot, integrating a generative AI assistant within Slack to offer insights from incident start to resolution, streamlining the incident lifecycle and reducing manual task loads for IT teams.
  • ServiceNow’s commitment to creating proactive IT operations encompasses automating insights for rapid incident response, optimizing service management and detecting anomalies. Now, in collaboration with NVIDIA, it is pushing into generative AI to further innovate technology service and operations.
  • Splunk’s technology platform applies artificial intelligence and machine learning to automate the processes of identifying, diagnosing and resolving operational issues and threats, thereby enhancing IT efficiency and security posture. Splunk IT Service Intelligence serves as Splunk’s primary AIOps offering, providing embedded AI-driven incident prediction, detection and resolution all from one place.
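Mean time to detect (MTTD) and mean time to repair (MTTR), the metrics several of these tools aim to improve, are simple averages over incident timestamps. The sketch below is a generic illustration, not any vendor's calculation. It assumes MTTR is measured from detection to resolution (definitions vary between teams), and the `Incident` class and sample incidents are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Incident:
    started: datetime   # when the fault actually began
    detected: datetime  # when monitoring raised it
    resolved: datetime  # when service was restored


def mean_delta(deltas: list[timedelta]) -> timedelta:
    """Average a non-empty list of durations."""
    return sum(deltas, timedelta()) / len(deltas)


def mttd(incidents: list[Incident]) -> timedelta:
    """Mean time from fault start to detection."""
    return mean_delta([i.detected - i.started for i in incidents])


def mttr(incidents: list[Incident]) -> timedelta:
    """Mean time from detection to resolution (assumed definition)."""
    return mean_delta([i.resolved - i.detected for i in incidents])


# Hypothetical incident log.
incidents = [
    Incident(datetime(2024, 3, 18, 9, 0), datetime(2024, 3, 18, 9, 10), datetime(2024, 3, 18, 10, 10)),
    Incident(datetime(2024, 3, 19, 14, 0), datetime(2024, 3, 19, 14, 20), datetime(2024, 3, 19, 14, 50)),
]

print(mttd(incidents))  # 0:15:00
print(mttr(incidents))  # 0:45:00
```

Faster detection lowers the first number, while automated root cause analysis and remediation lower the second.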

Cloud service providers including Amazon Web Services (AWS), Google Cloud and Microsoft Azure enable organizations to automate and optimize their IT operations, leveraging the scale and flexibility of cloud resources.

  • AWS offers a suite of services conducive to AIOps, including Amazon CloudWatch for monitoring and observability; AWS CloudTrail for tracking user activity and API usage; Amazon SageMaker for creating repeatable and responsible machine learning workflows; and AWS Lambda for serverless computing, allowing for the automation of response actions based on triggers.
  • Google Cloud supports AIOps through services like Google Cloud Operations, which provides monitoring, logging and diagnostics across applications on the cloud and on-premises. Google Cloud’s AI and machine learning products include Vertex AI for model training and prediction and BigQuery for fast SQL queries using the processing power of Google’s infrastructure.
  • Microsoft Azure facilitates AIOps with Azure Monitor for comprehensive monitoring of applications, services and infrastructure. Azure Monitor’s built-in AIOps capabilities help predict capacity usage, enable autoscaling, identify application performance issues and detect anomalous behaviors in virtual machines, containers and other resources. Microsoft Azure Machine Learning (AzureML) offers a cloud-based MLOps environment for training, deploying and managing machine learning models responsibly, securely and at scale.

Platforms specializing in MLOps primarily focus on streamlining the lifecycle of machine learning models, from development to deployment and monitoring. While the core mission centers on making machine learning more accessible, efficient and scalable, their technologies and methodologies indirectly support AIOps by enhancing AI capabilities within IT operations: 

  • Anyscale’s platform, based on Ray, allows for the easy scaling of AI and machine learning applications, including those used in AIOps for tasks like anomaly detection and automated remediation. By facilitating distributed computing, Anyscale helps AIOps systems process large volumes of operational data more efficiently, enabling real-time analytics and decision-making.
  • Dataiku can be used to create models that predict IT system failures or optimize resource allocation, with features that allow IT teams to quickly deploy and iterate on these models in production environments.
  • Dataloop’s platform delivers full data lifecycle management and a flexible way to plug in AI models for an end-to-end workflow, allowing users to develop AI applications using their data.
  • DataRobot is a full AI lifecycle platform that enables IT operations teams to rapidly build, deploy and govern AI solutions, improving operational efficiency and performance.
  • Domino Data Lab’s platform lets enterprises and their data scientists build, deploy and manage AI on a unified, end-to-end platform. Data, tools, compute, models and projects across all environments are centrally managed so teams can collaborate, monitor production models and standardize best practices for governed AI innovation. This approach is vital for AIOps as it balances the self-service needed by data science teams with complete reproducibility, granular cost tracking and proactive governance for IT operational needs.
  • Weights & Biases provides tools for experiment tracking, model optimization, and collaboration, crucial for developing and fine-tuning AI models used in AIOps. By offering detailed insights into model performance and facilitating collaboration across teams, Weights & Biases helps ensure that AI models deployed for IT operations are both effective and transparent.
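Anomaly detection, mentioned in several of the offerings above, can be illustrated with a rolling z-score over a metric stream. This is a deliberately simple sketch under stated assumptions, not the method used by any of the platforms listed: the function name, window size, threshold and sample CPU readings are all hypothetical.

```python
from statistics import mean, stdev


def find_anomalies(values: list[float], window: int = 5, threshold: float = 3.0) -> list[int]:
    """Return indexes whose value lies more than `threshold` standard
    deviations from the mean of the preceding `window` values."""
    anomalies = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu, sigma = mean(history), stdev(history)
        # Skip flat histories, where the z-score is undefined.
        if sigma > 0 and abs(values[i] - mu) > threshold * sigma:
            anomalies.append(i)
    return anomalies


# Hypothetical CPU utilization samples (percent), with one spike.
cpu_percent = [41, 43, 42, 44, 42, 43, 95, 42, 44, 43]
print(find_anomalies(cpu_percent))  # [6]
```

In an AIOps pipeline, a flagged index like this would typically trigger an alert or an automated remediation step. Production systems generally use learned models that account for seasonality and correlated signals.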

Learn more about NVIDIA’s partner ecosystem and their work at NVIDIA GTC.
