FOMO Alert: Discover 7 Unmissable Reasons to Attend GTC 2024

“I just got back from GTC and ….”

In four weeks, those will be among the most powerful words in your industry. But you won’t be able to use them if you haven’t been here.

NVIDIA’s GTC 2024 transforms the San Jose Convention Center into a crucible of innovation, learning and community from March 18-21, marking a return to in-person gatherings that can’t be missed.

Tech enthusiasts, industry leaders and innovators from around the world are set to present and explore over 900 sessions and close to 300 exhibits.

They’ll dive into the future of AI, computing and beyond, with contributions from some of the brightest minds at companies such as Amazon, Amgen, Character.AI, Ford Motor Co., Genentech, L’Oréal, Lowe’s, Lucasfilm and Industrial Light & Magic, Mercedes-Benz, Pixar, Siemens, Shutterstock, xAI and many more.

Among the most anticipated events is the Transforming AI Panel, featuring the original architects behind the concept that revolutionized the way we approach AI today: Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, and Illia Polosukhin.

All eight authors of “Attention Is All You Need,” the seminal 2017 NeurIPS paper that introduced the trailblazing transformer neural network architecture, will appear in person at GTC on a panel hosted by NVIDIA Founder and CEO Jensen Huang.

Located in the vibrant heart of Silicon Valley, GTC stands as a pivotal gathering where the convergence of technology and community shapes the future. This conference offers more than just presentations; it’s a collaborative platform for sharing knowledge and sparking innovation.

  1. Exclusive Insights: Last year, Huang set the stage by announcing a “lightspeed” leap in computing and partnerships with giants like Microsoft. This year, anticipate more innovations at the SAP Center, where attendees will get a first look at the next transformative breakthroughs.
  2. Networking Opportunities: GTC’s networking events are designed to transform casual encounters into pivotal career opportunities. Connect directly with industry leaders and innovators, making every conversation a potential gateway to your next big role or project.
  3. Cutting-Edge Exhibits: Step into the future with exhibits that showcase the latest in AI and robotics. Beyond mere displays, these exhibits offer hands-on learning experiences, providing attendees with invaluable knowledge to stay ahead.

    AI is spilling out in all directions, and GTC is the best way to capture it all. Pictured: The latest installation from AI artist Refik Anadol, whose work will be featured at GTC.

  4. Diversity and Innovation: Begin your day at the Women In Tech breakfast. This, combined with unique experiences like generative AI art installations and street food showcases, feeds creativity and fosters innovation in a relaxed setting.
  5. Learn From the Best: Engage with sessions led by visionaries from organizations such as Disney Research, Google DeepMind, Johnson & Johnson Innovative Medicine, Stanford University and beyond. These aren’t just lectures but opportunities to question, engage and turn insights into actionable knowledge that can shape your career trajectory.
  6. Silicon Valley Experience: Embrace the energy of the world’s foremost tech hub. Inside the conference, GTC connects attendees with the latest technologies and minds. Beyond the show floor, it’s a gateway to building lasting relationships with leaders and thinkers across industries.
  7. Seize the Future Now: Don’t just join a story. Write one. Register now for GTC and be part of this transformative moment in AI, at the epicenter of technological advancement.

Read More

Shining Brighter Together: Google’s Gemma Optimized to Run on NVIDIA GPUs

NVIDIA, in collaboration with Google, today launched optimizations across all NVIDIA AI platforms for Gemma — Google’s state-of-the-art lightweight open language models, available in 2 billion- and 7 billion-parameter versions, which can run anywhere, reducing costs and speeding innovative work for domain-specific use cases.

Teams from the two companies worked closely together to accelerate the performance of Gemma — built from the same research and technology used to create the Gemini models — with NVIDIA TensorRT-LLM, an open-source library for optimizing large language model inference. The optimizations apply when running on NVIDIA GPUs in the data center, in the cloud and on PCs with NVIDIA RTX GPUs.

This allows developers to target the installed base of over 100 million NVIDIA RTX GPUs available in high-performance AI PCs globally.

Developers can also run Gemma on NVIDIA GPUs in the cloud, including on Google Cloud’s A3 instances based on the H100 Tensor Core GPU and soon, NVIDIA’s H200 Tensor Core GPUs — featuring 141GB of HBM3e memory at 4.8 terabytes per second — which Google will deploy this year.

Enterprise developers can additionally take advantage of NVIDIA’s rich ecosystem of tools — including NVIDIA AI Enterprise with the NeMo framework and TensorRT-LLM — to fine-tune Gemma and deploy the optimized model in their production application.

Learn more about how TensorRT-LLM is revving up inference for Gemma, along with additional information for developers. This includes several model checkpoints of Gemma and the FP8-quantized version of the model, all optimized with TensorRT-LLM.

Experience Gemma 2B and Gemma 7B directly from your browser on the NVIDIA AI Playground.
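
To get a quick feel for Gemma locally before wiring up the TensorRT-LLM path described above, developers can also load the model through the Hugging Face Transformers API. The sketch below is a minimal, hedged example: the checkpoint ID, prompt and generation settings are illustrative assumptions, and it uses plain Transformers rather than the TensorRT-LLM optimizations covered in this post.

    # A minimal local-inference sketch using the Hugging Face Transformers API.
    # Assumptions: access to the gated "google/gemma-2b-it" checkpoint has been
    # granted, a CUDA-capable RTX GPU is available, and transformers >= 4.38.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "google/gemma-2b-it"  # assumed checkpoint name
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype=torch.float16
    ).to("cuda")

    prompt = "Explain in one sentence why local inference keeps data private."
    inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
    outputs = model.generate(**inputs, max_new_tokens=64)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))

TensorRT-LLM and the FP8-quantized checkpoints mentioned above target the same models but with inference further optimized for NVIDIA GPUs.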

Gemma Coming to Chat With RTX

Adding support for Gemma soon is Chat with RTX, an NVIDIA tech demo that uses retrieval-augmented generation and TensorRT-LLM software to give users generative AI capabilities on their local, RTX-powered Windows PCs.

Chat with RTX lets users personalize a chatbot with their own data by easily connecting local files on a PC to a large language model.

Since the model runs locally, it provides results fast, and user data stays on the device. Rather than relying on cloud-based LLM services, Chat with RTX lets users process sensitive data on a local PC without the need to share it with a third party or have an internet connection.
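
To make the retrieval-augmented generation pattern concrete, here is a minimal, generic sketch of RAG over local text files. It is not Chat with RTX’s actual pipeline: the folder layout is assumed, retrieval is handled with scikit-learn TF-IDF, and the assembled prompt would then be passed to whatever locally hosted model is available.

    # A generic retrieval-augmented generation (RAG) sketch, illustrative only.
    # Retrieval uses scikit-learn TF-IDF; the resulting prompt would be handed
    # to a locally hosted LLM.
    from pathlib import Path

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity


    def load_documents(folder: str) -> list[str]:
        """Read every .txt file in a local folder (hypothetical data layout)."""
        return [p.read_text(encoding="utf-8") for p in Path(folder).glob("*.txt")]


    def retrieve(query: str, docs: list[str], top_k: int = 2) -> list[str]:
        """Rank local documents by TF-IDF cosine similarity to the query."""
        vectorizer = TfidfVectorizer()
        doc_vectors = vectorizer.fit_transform(docs)
        query_vector = vectorizer.transform([query])
        scores = cosine_similarity(query_vector, doc_vectors)[0]
        best = scores.argsort()[::-1][:top_k]
        return [docs[i] for i in best]


    def build_prompt(query: str, folder: str) -> str:
        """Stuff the retrieved context into the prompt sent to the local model."""
        context = "\n\n".join(retrieve(query, load_documents(folder)))
        return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


    print(build_prompt("What did my meeting notes say about GTC?", "./my_notes"))

Chat with RTX layers TensorRT-LLM acceleration and a local LLM on top of this same basic retrieve-then-prompt pattern, which is why the data never has to leave the PC.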

Read More

AI’s Hottest Ticket: NVIDIA GTC Brings Together Automotive Leaders and Visionaries Transforming the Future of Transportation

Generative AI and software-defined computing are transforming the automotive landscape — making the journey behind the wheel safer, smarter and more enjoyable.

Dozens of automakers and NVIDIA DRIVE ecosystem partners will demonstrate their latest developments in mobility and showcase their next-gen vehicles at GTC, the conference for the era of AI, running from March 18-21 in San Jose, Calif., and online. These include the Mercedes-Benz Concept CLA Class, the new Volvo EX90, Polestar 3, WeRide Robobus, Nuro R3 autonomous delivery vehicle and more.

Explore myriad sessions to learn about the latest developments in mobility — from highly automated and autonomous driving, generative AI and large language models to simulation, safety, design and manufacturing.

Rounding out the week will be DRIVE Developer Day on Thursday, March 21 — featuring a series of deep-dive sessions on how to build safe and robust self-driving systems. Led by NVIDIA’s engineering experts, these talks will highlight the latest DRIVE features and developments.

Find additional details on automotive-specific programming at GTC here.

Don’t stall — register today to learn how generative AI and software-defined computing are transforming the auto industry.

Read More

Telco GPT: Survey Shows Scale of Industry’s Enthusiasm and Adoption of Generative AI

It’s been five years since the telecommunications industry first deployed 5G networks to drive new performance levels for customers and unlock new value for telcos.

But that industry milestone has been overshadowed by the emergence of generative AI and the swift pace at which telcos are embracing large language models as they seek to transform all parts of their business.

A recent survey of more than 400 telecommunications industry professionals from around the world showed that generative AI is the breakout technology of the year, and that enthusiasm for, and adoption of, both generative AI and AI in general are booming. In addition, the survey showed that, among respondents, AI is improving both revenues and cost savings.

The generative AI insight is the main highlight in the second edition of NVIDIA’s “State of AI in Telecommunications” survey, which included questions covering a range of AI topics, including infrastructure spending, top use cases, biggest challenges and deployment models.

Survey respondents included C-suite leaders, managers, developers and IT architects from mobile telecoms, fixed and cable companies. The survey was conducted over eight weeks between October and December.

Ramping Up on Generative AI

The survey results show how generative AI went from relative obscurity in 2022 to a key solution within a year. Forty-three percent of respondents reported they were investing in it, showing clear evidence that the telecom industry is enthusiastically embracing the generative AI wave to address a wide variety of business goals.

More broadly, there was a marked increase in interest in adopting AI and growing expectations of success from the technology, especially among industry executives. In the survey, 53% of respondents agreed or strongly agreed that adopting AI will be a source of competitive advantage, compared to 39% who reported the same in 2022. For management respondents, the figure was 56%.

The primary reason for this sustained engagement is that many industry stakeholders expect AI to contribute to their company’s success. Overall, 56% of respondents agreed or strongly agreed that “AI is important to my company’s future success,” with the figure rising to 61% among decision-making management respondents. The overall figure is a 14-point boost over the 42% result from the 2022 survey.

Customer Experience Remains Key Driver of AI Investment 

Telcos are adopting AI and generative AI to address a wide variety of business needs. Overall, 31% of respondents said they invested in at least six AI use cases in 2023, while 40% are planning to scale to six or more use cases in 2024.

But enhancing customer experiences remains the biggest AI opportunity for the telecom industry, with 48% of survey respondents selecting it as their main goal for using the technology. Likewise, some 35% of respondents identified customer experiences as their key AI success story.

For generative AI, 57% are using it to improve customer service and support, 57% to improve employee productivity, 48% for network operations and management, 40% for network planning and design, and 32% for marketing content generation.

Early Phase of AI Investment Cycle

The focus on customer experience is influencing investments. Customer-experience optimization remains the most popular AI use case for 2023 (49% of respondents) and the top target for generative AI investments (57% of respondents).

Telcos are also investing in other AI use cases: security (42%), network predictive maintenance (37%), network planning and operations (34%) and field operations (34%) are notable examples. However, using AI for fraud detection in transactions and payments had the biggest jump in popularity between 2022 and 2023, rising 14 points to 28% of respondents.

Overall, AI spending is still in an early phase of the investment cycle, though growing strongly. In the survey, 43% of respondents reported an investment of over $1 million in AI in their previous year, 52% reported the same for the current year, and 66% reported their budget for AI infrastructure will increase in the next year.

For those who are already investing in AI, 67% reported that AI adoption has helped them increase revenues, with 19% of respondents noting that this revenue growth is more than 10% in specific business areas. Likewise, 63% reported that AI adoption has helped them reduce costs in specific business areas, with 14% noting that this cost reduction is more than 10%.

Innovation With Partners

While telcos are increasing their investments to improve their internal AI capabilities, partnerships remain critical for the adoption of AI solutions in the industry. This is applicable both for AI models and AI hardware infrastructure.

In the survey, 44% of respondents reported that co-development with partners is their company’s preferred approach to building AI solutions. Some 28% of respondents prefer to use open-source tools, while 25% take an AI-as-a-service approach. For generative AI, 29% of respondents built or customized models with a partner, an understandable conservative approach for the telecom industry with its stringent data protection rules.

On the infrastructure side, telcos are increasingly opting for cloud hosting, although the hybrid model remains dominant. In the survey, 31% of respondents reported that they run most of their AI workloads in the cloud (44% for hybrid), compared to 21% of respondents in the previous survey (56% for hybrid). This is helping to fuel the growing need for more localized cloud infrastructure.

Download the “State of AI in Telecommunications: 2024 Trends” report for in-depth results and insights.

Explore how AI is transforming telecommunications at NVIDIA GTC, featuring industry leaders including Amdocs, Indosat, KT, Samsung Research, ServiceNow, Singtel, SoftBank, Telconet and Verizon.

Learn more about NVIDIA solutions for telecommunications across customer experience, network operations, sovereign AI factories and more.

Read More

Artistry With Adobe: Creator Esteban Toro Delivers Inspirational Master Class Powered by AI and RTX

Adobe is putting generative AI into the hands of creators with Adobe Firefly — powered by NVIDIA in the cloud — and adding to its impressive app lineup with exciting new features.

The AI-powered Enhance Speech tool, available soon in Adobe Premiere Pro, is accelerated by NVIDIA RTX. This new feature removes unwanted noise and improves the quality of dialogue clips so they sound professionally recorded.

Esteban Toro, senior community relationship manager at Adobe and this week’s featured In the NVIDIA Studio artist, expertly wields AI-powered features in Adobe Photoshop and Lightroom to create his emotionally moving Cinematic Portraits series.

A sneak peek of Toro’s work.

Have a Chat with RTX — the tech demo app that lets GeForce RTX owners personalize a generative pretrained transformer large language model connected to their own content, whether in documents, notes, videos or other data formats. Since it runs locally on a Windows RTX PC or workstation, results are fast and secure. Download Chat with RTX today.

Don’t forget GTC registration is open for virtual or in-person attendance. Running March 18-21 in San Jose, Calif., the event delivers something for every technical level and interest area, including sessions on how to power content creation using OpenUSD and generative AI.

And Omniverse OpenUSD month rolls on, spotlighting the open and extensible ecosystem for describing, composing, simulating and collaborating within 3D worlds. Follow NVIDIA Studio on Instagram, X and Facebook to learn more.

Storytelling With Adobe AI and RTX 

The talented Toro is driven by stories.

Stories fuel Toro’s creative process.

“Understanding how every person has a different upbringing and how the decisions they made took them to different places is absolutely inspiring,” said Toro. “When I discover a story worth telling, I just feel a necessity to tell it — and tell it right.”

It’s those stories that gave rise to Cinematic Portraits, a photo and video collection of people Toro’s befriended, such as Korean painter Kim Nam Soon, age 81, who impressively learned how to paint at 65.

 

Toro’s planning process is long and thorough — he can only retell stories by first having a conversation with each subject, making sure that they understand what the project is about and building a relationship with them so they feel comfortable enough to authentically share.

He captures video and photos of his subjects using Hasselblad and Sony camera gear. Then, he uses Adobe apps, accelerated by GeForce RTX and NVIDIA RTX technology, in post-production.

Toro deployed the Enhance Speech tool to boost the clarity and quality of voice recordings and adjusted enhancement levels with the Mix Amount setting — all powered by AI. The feature is 75% faster on a GeForce RTX 4090 laptop GPU compared with an RTX 3080 Ti.

“Without AI, the footage, filmed in challenging, noisy conditions, would be unusable,” he said.

The Text-Based Editing tool in Premiere Pro allowed Toro to use speech-to-text AI capabilities to automatically create captions, supported in 18 languages, for video footage — speeding the editing process.

The Text-Based Editing tool can create a transcription of a video sequence and add captions.

Toro also used the Filler Word Detection feature, which detects and deletes filler words and pauses, to achieve cleaner, more accurate transcripts. Filler words are language agnostic, so the feature works in all 18 languages supported in Text-Based Editing.

Adobe expert Esteban Toro hard at work.

Adobe offers a wide variety of time-saving features, such as the AI-powered Auto Reframe tool, which automatically reframes footage into multiple size formats for social media, along with project templates. Toro’s NVIDIA Studio laptop with its GeForce RTX 4070 graphics card accelerates all of these powerful tools.

Final file exports were achieved 4x faster than with a CPU alone thanks to the GPU-accelerated NVIDIA video encoder (NVENC). Toro quickly and easily added finishing touches in Photoshop Lightroom, using the RTX-accelerated, AI-powered Raw Details feature to refine the color detail of his high-resolution RAW images, and the Super Resolution feature to upscale images with higher quality than traditional methods.
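
The GPU encoder Toro relied on for exports is also accessible outside Premiere Pro. Below is a purely illustrative Python sketch that hands encoding to NVENC via ffmpeg; the file names and bitrate are hypothetical, and this is not Toro’s actual workflow, just a generic example of the NVENC encoder mentioned above.

    # Illustrative only: hand video encoding to the GPU's NVENC block via ffmpeg,
    # called from Python. Assumes an ffmpeg build compiled with NVENC support;
    # the input and output file names are hypothetical.
    import subprocess

    subprocess.run(
        [
            "ffmpeg",
            "-i", "interview_raw.mov",   # hypothetical source clip
            "-c:v", "h264_nvenc",        # encode video on the NVENC hardware block
            "-b:v", "20M",               # example target bitrate
            "-c:a", "copy",              # pass the audio track through untouched
            "interview_final.mp4",
        ],
        check=True,
    )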

“Having a dedicated GPU for video projects when filming high-quality video is almost mandatory,” said Toro. “Using NVIDIA GPUs allows me to render and process my projects faster, so the post-processing tools are serving my creative ideas, and I’m not limited by what the computer can do, but exactly what I want to create.”

Artist and Adobe expert Esteban Toro.

Follow Esteban Toro on Instagram.

Read More

NVIDIA Eos Revealed: Peek Into Operations of a Top 10 Supercomputer

Providing a peek at the architecture powering advanced AI factories, NVIDIA Thursday released a video that offers the first public look at Eos, its latest data-center-scale supercomputer.

An extremely large-scale NVIDIA DGX SuperPOD, Eos is where NVIDIA developers create their AI breakthroughs using accelerated computing infrastructure and fully optimized software.

Eos is built with 576 NVIDIA DGX H100 systems, NVIDIA Quantum-2 InfiniBand networking and software, providing a total of 18.4 exaflops of FP8 AI performance.

Revealed in November at the Supercomputing 2023 trade show, Eos — named for the Greek goddess said to open the gates of dawn each day — reflects NVIDIA’s commitment to advancing AI technology.

Eos Supercomputer Fuels Innovation

Each DGX H100 system is equipped with eight NVIDIA H100 Tensor Core GPUs. Eos features a total of 4,608 H100 GPUs.

As a result, Eos can handle the largest AI workloads to train large language models, recommender systems, quantum simulations and more.
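
As a quick sanity check on the numbers above, the sketch below derives the GPU count and the implied per-GPU FP8 throughput from the figures quoted in this article; the per-GPU value is a back-of-the-envelope estimate, not an official specification.

    # Back-of-the-envelope check on the Eos figures quoted above. All inputs come
    # from this article; the per-GPU number is a derived estimate.
    num_systems = 576            # NVIDIA DGX H100 systems in Eos
    gpus_per_system = 8          # H100 Tensor Core GPUs per DGX H100
    total_fp8_exaflops = 18.4    # aggregate FP8 AI performance

    total_gpus = num_systems * gpus_per_system
    per_gpu_fp8_teraflops = total_fp8_exaflops * 1e18 / total_gpus / 1e12

    print(total_gpus)                    # 4608, matching the article
    print(round(per_gpu_fp8_teraflops))  # ~3993 TFLOPS of FP8 per GPU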

It’s a showcase of what NVIDIA’s technologies can do when working at scale.

Eos is arriving at the perfect time. People are changing the world with generative AI, from drug discovery to chatbots to autonomous machines and beyond.

To achieve these breakthroughs, they need more than AI expertise and development skills. They need an AI factory — a purpose-built AI engine that’s always available and can help ramp their capacity to build AI models at scale.

Eos delivers. Ranked No. 9 in the TOP500 list of the world’s fastest supercomputers, Eos pushes the boundaries of AI technology and infrastructure.

It includes NVIDIA’s advanced accelerated computing and networking alongside sophisticated software offerings such as NVIDIA Base Command and NVIDIA AI Enterprise.


Eos’s architecture is optimized for AI workloads demanding ultra-low-latency and high-throughput interconnectivity across a large cluster of accelerated computing nodes, making it an ideal solution for enterprises looking to scale their AI capabilities.

Based on NVIDIA Quantum-2 InfiniBand with In-Network Computing technology, its network architecture supports data transfer speeds of up to 400Gb/s, facilitating the rapid movement of large datasets essential for training complex AI models.
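
To put that line rate in perspective, here is a rough, illustrative calculation of how long it takes to move a hypothetical 1 TB dataset over a single 400 Gb/s link, assuming the full rate is sustained and ignoring protocol overhead.

    # Illustrative only: transfer time for a hypothetical 1 TB dataset over a
    # single 400 Gb/s link at full line rate, ignoring protocol overhead.
    dataset_bytes = 1e12         # 1 TB (hypothetical shard size)
    link_gbps = 400              # NVIDIA Quantum-2 InfiniBand line rate

    seconds = dataset_bytes * 8 / (link_gbps * 1e9)
    print(round(seconds, 1))     # 20.0 seconds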

At the heart of Eos lies the groundbreaking DGX SuperPOD architecture powered by NVIDIA’s DGX H100 systems.

The architecture is built to provide the AI and computing fields with tightly integrated full-stack systems capable of computing at an enormous scale.

As enterprises and developers worldwide seek to harness the power of AI, Eos stands as a pivotal resource, promising to accelerate the journey towards AI-infused applications that fuel every organization.

Read More

The Easiest Upgrade: Play at Ultimate Quality With GeForce NOW

GFN Thursday keeps its fourth anniversary celebrations rolling by bringing Ubisoft’s Skull and Bones and Microsoft’s Halo Infinite to the cloud this week.

They’re part of five newly supported games, and thanks to the power of the cloud, members can play them at unrivaled quality across nearly any device.

The Ultimate Upgrade, Instantly

When GeForce NOW launched in 2020, members flocked to take advantage of NVIDIA GeForce RTX 20 Series GPU-powered servers and experience real-time ray tracing on low-powered devices. For the first time, high-performance PC gaming was available to all.

Later, members gained access to the Ultimate upgrade, as NVIDIA cloud gaming servers brought GeForce RTX 3080-class power to users across the globe.

Now, with the NVIDIA Ada Lovelace GPU architecture, cloud gaming has taken another leap forward, powered by the GeForce RTX 4080 SuperPOD.

Oh deer, experience “Alan Wake 2” at the highest performance from the cloud.

That means nearly anyone can experience groundbreaking PC gaming technologies like NVIDIA DLSS 3.5, with its AI-powered Frame Generation and upscaling features. Members can explore their favorite game worlds rendered with cinematic lighting and reflections thanks to RTX ON, with full ray tracing supported in titles like Cyberpunk 2077 and Alan Wake 2. Experience immersive gaming, even on old laptops or smartphones.

Enjoy the greatest PC games available at up to 4K resolution with an Ultimate membership, and explore a whole new world with support for 21:9 ultrawide resolutions.

Members also have the competitive edge in the cloud, thanks to support for NVIDIA Reflex technology. Ultimate members can take aim and make every shot count with ultra-low latency and support for up to 240 frames per second performance — a first for cloud gaming — all made possible by GeForce NOW. Upgrade today to feel the difference.

Shiver Me Timbers

Sail the seven seas in Ubisoft’s latest title.

Enter the perilous world of Skull and Bones, Ubisoft’s nautical action-packed adventure streaming now on GeForce NOW.

Sail the seas as a fearsome pirate kingpin, gaining infamy and gathering resources while building a smuggling empire. Engage in thrilling naval battles and risk it all for the biggest loot. Equip powerful weapons to outgun other ships and rain terror on enemy forts. Craft and sail up to 10 ships, each with unique perks, and become a force of destruction on the water.

Upgrade to a GeForce NOW Ultimate membership to loot and plunder at full quality, with support for ultrawide resolutions and gameplay at up to 4K resolution and 120 fps on PCs and Macs.

Infinite Action

“I need a weapon.”

Step inside the armor of humanity’s greatest hero. Halo Infinite joins GeForce NOW this week, delivering the most expansive Master Chief campaign yet and a groundbreaking, free-to-play multiplayer experience. Plus, read this article and search for Halo Infinite for more details on how to launch the game.

It’s part of five new games this week:

  • Banishers: Ghosts of New Eden (New release on Steam, Feb. 12)
  • Deep Rock Galactic: Survivor (New release on Steam, Feb. 14)
  • Goat Simulator 3 (New release on Steam, Feb. 15)
  • Skull and Bones (New release on Ubisoft, Feb. 16)
  • Halo Infinite (Steam and Xbox, available on PC Game Pass)

What are you planning to play this weekend? Let us know on X or in the comments below.

Read More

Digitalization: A Game Changer for the Auto Industry

The fusion of the physical and digital worlds is reshaping the automotive industry. NVIDIA’s automotive partners are using digitalization to transform every phase of the product lifecycle — evolving primarily physical, manual processes into software-driven, AI-enhanced digital systems.

Watch the video to learn more.

Digitalization: A Game Changer From End to End

Kaivan Karimi, global partner strategy lead at Microsoft, observes that companies are achieving “huge” results from “digitizing the physical entity, running simulations and rendering in 3D, whether it’s factory automation or modernizing the design and development of the car.”

Brian Ullem, vice president of engineering at Capgemini, explains that, with “the 30,000 parts that go into a car, it takes approximately five years to develop a vehicle end to end. Instead of building 50 or 100 cars, we can use digitalization to simulate without having to build prototypes. That saves a lot of time and money in the process.”

Thomas Mueller, chief technology officer of engineering at Wipro, adds that with digitalization, “we are now able to run simulations at a low cost…and improve the user experience.”

Simulation: Critical for Autonomous Driving

“Simulation is crucial to the development of autonomous systems,” says Ziv Binyamini, CEO of Foretellix. “On one hand, you need the real world, but this is highly costly. So you have to complement it with the ability to simulate a virtual world where everything is possible. And then you can, in a very cost-effective way, iterate quickly and ensure the system operates under all of these conditions.”

Simulation “gives our customers the power to validate their ADAS or autonomous systems virtually — with highly accurate sensors in the camera, lidar and radar domains — without having to rely on actual physical drives,” adds Tony Karam, global sales director at Ansys.

Austin Russell, founder and CEO of Luminar, agrees that “simulation is absolutely critical for autonomous driving. It’s great to see the work that NVIDIA has been doing in that domain, with not just the hardware but also the software.”

NVIDIA Omniverse: The Digital-Physical Convergence

“Software is a new component in the value proposition,” notes Walid Negm, chief technology officer of product engineering at Deloitte. The companies that will “survive and thrive are going to have to become much more efficient using the digital-physical convergence. The Omniverse experience is going to be important for the automotive sector.”

Shiv Tasker, global vice president of engineering at Capgemini, adds that the “visualization and production of digital twins relies on an efficient, high-performance infrastructure as well as the platforms that make it easy for customers to adopt the technology.”

Omniverse “will allow your worldwide team to simultaneously collaborate,” says Karimi of Microsoft. “Design engineers, migration engineers, test engineers — everybody collaborates simultaneously. That’s the power of NVIDIA Omniverse.”

Learn more about the NVIDIA DRIVE platform and how it’s helping industry leaders redefine transportation.

Join NVIDIA at GTC from March 18-21 in San Jose, Calif., to learn more about digitalization in the automotive industry.

Read More

Speak Like a Native: NVIDIA Parlays Win in Voice Challenge

Thanks to their work driving AI forward, Akshit Arora and Rafael Valle could someday speak to their spouses’ families in their native languages.

Arora and Valle — along with colleagues Sungwon Kim and Rohan Badlani — won the LIMMITS ’24 challenge, which asks contestants to recreate, in real time, a speaker’s voice in English or any of six languages spoken in India with the appropriate accent. Their novel AI model required only a three-second speech sample.

The NVIDIA team advanced the state of the art in an emerging field of personalized voice interfaces for more than a billion native speakers of Bengali, Chhattisgarhi, Hindi, Kannada, Marathi and Telugu.

Making Voice Interfaces Realistic

The technology for personalized text-to-speech translation is a work in progress. Existing services sometimes fail to accurately reflect the accents of the target language or nuances of the speaker’s voice.

The challenge judged entries by listening for the naturalness of models’ resulting speech and its similarity to the original speaker’s voice.

The latest improvements promise personalized, realistic conversations and experiences that break language barriers. Broadcasters, telcos, universities, as well as e-commerce and online gaming services are eager to deploy such technology to create multilingual movies, lectures and virtual agents.

“We demonstrated we can do this at a scale not previously seen,” said Arora, who has two uses close to his heart.

Breaking Down Linguistic Barriers

A senior data scientist who supports one of NVIDIA’s biggest customers, Arora speaks Punjabi, while his wife and her family are native Tamil speakers.

It’s a gulf he’s long wanted to bridge for himself and others. “I had classmates who knew their native languages much better than the Hindi and English used in school, so they struggled to understand class material,” he said.

The gulf crosses continents for Valle, a native of Brazil whose wife and family speak Gujarati, a language widely spoken in western India.

“It’s a problem I face every day,” said Valle, an AI researcher with degrees in computer music and machine listening and improvisation. “We’ve tried many products to help us have clearer conversations.”

Badlani, an AI researcher, said living in seven different Indian states, each with its own popular language, inspired him to work in the field.

A Race to the Finish Line

The initiative started nearly two years ago, when Arora and Badlani formed the four-person team to work on an earlier, very different version of the challenge held in 2023.

Their efforts generated a working code base for the so-called Indic languages. But getting to the win announced in January required a full-on sprint because the 2024 challenge didn’t get on the team’s radar until 15 days before the deadline.

Luckily, Kim, a deep learning researcher in NVIDIA’s Seoul office, had been working for some time on an AI model well suited to the challenge.

A specialist in text-to-speech voice synthesis, Kim was designing a so-called P-Flow model before starting his second internship at NVIDIA in 2023. P-Flow borrows a technique that large language models employ: it uses a short voice sample as a prompt, so the model can respond to new inputs without retraining.

“I created the model for English, but we were able to generalize it for any language,” he said.

“We were talking and texting about this model even before he started at NVIDIA,” said Valle, who mentored Kim in two internships before he joined full time in January.

Giving Others a Voice

P-Flow will soon be part of NVIDIA Riva, a framework for building multilingual speech and translation AI software, included in the NVIDIA AI Enterprise software platform.

The new capability will let users deploy the technology inside their data centers, on personal systems or in public or private cloud services. Today, voice translation services typically run on public cloud services.

“I hope our customers are inspired to try this technology,” Arora said. “I enjoy being able to showcase in challenges like this one the work we do every day.”

The contest is part of an initiative to develop open-source datasets and AI models for nine languages most widely spoken in India.

Hear Arora and Badlani share their experiences in a session at GTC next month.

And listen to the results of the team’s model below, starting with a three-second sample of a native Kannada speaker:


 

Here’s a similar-sounding synthesized voice reading the first sentence of this blog in Hindi:

 

And then in English:

See notice regarding software product information.

Read More

How the Ohio Supercomputer Center Drives the Future of Computing

NASCAR races are all about speed, but even the fastest cars need to factor in safety, especially as rules and tracks change. The Ohio Supercomputer Center is ready to help.

In this episode of NVIDIA’s AI Podcast, host Noah Kravitz speaks with Alan Chalker, director of strategic programs at the OSC, about all things supercomputing. The center’s Open OnDemand program, a web-based interface, provides Ohio higher education institutions and industries with accessible, reliable and secure computational services, along with training and educational programs.

Chalker dives into the history and evolution of the OSC and explains how it’s working with client companies like NASCAR, which is simulating race car designs virtually. Tune in to learn more about Chalker’s outlook on the future of supercomputing and the OSC’s role in realizing it.

Time Stamps:

1:39: History of the Ohio Supercomputer Center
3:18: What are supercomputers?
5:08: How the Open OnDemand program came to be
11:50: How is Open OnDemand being used across higher education and industry?
22:45: OSC’s work with NASCAR
26:57: What’s on the horizon for Open OnDemand?

You Might Also Like…

MIT’s Anant Agarwal on AI in Education – Ep. 197

AI could help students work smarter, not harder. Anant Agarwal, founder of edX and Chief Platform Officer at 2U, shares his vision for the future of online education and the impact of AI in revolutionizing the learning experience.

UF Provost Joe Glover on Building a Leading AI University – Ep. 186

Joe Glover, provost and senior vice president of academic affairs at the University of Florida, discusses the university’s efforts to implement AI across all aspects of higher education, including a public-private partnership with NVIDIA that has helped transform UF into one of the leading AI universities in the country.

NVIDIA’s Marc Hamilton on Building the Cambridge-1 Supercomputer During a Pandemic – Ep. 137

Cambridge-1, the U.K.’s most powerful supercomputer, ranks among the world’s top three most energy-efficient supercomputers and was built to help healthcare researchers make new discoveries. Marc Hamilton, vice president of solutions architecture and engineering at NVIDIA, speaks on how he remotely oversaw its construction.

Subscribe to the AI Podcast

Get the AI Podcast through iTunes, Google Podcasts, Google Play, Amazon Music, Castbox, DoggCatcher, Overcast, PlayerFM, Pocket Casts, Podbay, PodBean, PodCruncher, PodKicker, Soundcloud, Spotify, Stitcher and TuneIn.

Make the AI Podcast better: Have a few minutes to spare? Fill out this listener survey.

Read More