AWS and NVIDIA to bring Arm-based instances with GPUs to the cloud

AWS continues to innovate on behalf of our customers. We’re working with NVIDIA to bring an Arm processor-based, NVIDIA GPU-accelerated Amazon Elastic Compute Cloud (Amazon EC2) instance to the cloud in the second half of 2021. This instance will feature the Arm-based AWS Graviton2 processor, which was built from the ground up by AWS and optimized for how customers run their workloads in the cloud, eliminating many of the unneeded components that would otherwise go into a general-purpose processor.

AWS innovation with Arm technology

AWS has continued to pioneer cloud computing for our customers. In 2018, AWS was the first major cloud provider to offer Arm-based instances in the cloud with EC2 A1 instances powered by AWS Graviton processors. These instances are built around Arm cores and make extensive use of AWS custom-built silicon. They’re a great fit for scale-out workloads in which you can share the load across a group of smaller instances.

In 2020, AWS released AWS-designed, Arm-based Graviton2 processors, delivering a major leap in performance and capabilities over first-generation AWS Graviton processors. These processors power EC2 general purpose (M6g, M6gd, T4g), compute-optimized (C6g, C6gd, C6gn), and memory-optimized (R6g, R6gd, X2gd) instances, and provide up to 40% better price performance over comparable current generation x86-based instances for a wide variety of workloads. AWS Graviton2 processors deliver seven times the performance, four times the compute cores, five times faster memory, and twice the cache of first-generation AWS Graviton processors.

Customers including Domo, Formula One, Honeycomb.io, Intuit, LexisNexis Risk Solutions, Nielsen, NextRoll, Redbox, SmugMug, Snap, and Twitter have seen significant performance gains and reduced costs from running AWS Graviton2-based instances in production. AWS Graviton2 processors, based on the 64-bit Arm architecture, are supported by popular Linux operating systems, including Amazon Linux 2, Red Hat, SUSE, and Ubuntu. Many popular applications and services from AWS and ISVs also support AWS Graviton2-based instances. Arm developers can use these instances to build applications natively in the cloud, thereby eliminating the need for emulation and cross-compilation, which are error-prone and time-consuming. Adding NVIDIA GPUs accelerates Graviton2-based instances for diverse cloud workloads, including gaming and other Arm-based workloads like machine learning (ML) inference.

Easily move Android games to the cloud

According to research from App Annie, mobile gaming is now the most popular form of gaming, having overtaken console, PC, and Mac gaming. Additional research from App Annie shows that up to 10% of all time spent on mobile devices goes to games, and game developers need to support and optimize their games for the diverse set of mobile devices in use today and in the future. By leveraging the cloud, game developers can provide a uniform experience across the spectrum of mobile devices and extend battery life, thanks to lower compute and power demands on the device. The AWS Graviton2 instance with NVIDIA GPU acceleration enables game developers to run Android games natively, encode the rendered graphics, and stream the game over networks to a mobile device, all without needing to run emulation software on x86 CPU-based infrastructure.

Cost-effective, GPU-based machine learning inference

In addition to mobile gaming, customers running machine learning models in production are continuously looking for ways to lower costs, as ML inference can represent up to 90% of the overall infrastructure spend for running these applications at scale. With this new offering, customers will be able to take advantage of the price/performance benefits of Graviton2 to deploy GPU-accelerated deep learning models at a significantly lower cost versus x86-based instances with GPU acceleration.
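As a rough illustration of the deployment model, a sketch like the following runs unchanged on an Arm-based instance with an attached NVIDIA GPU, assuming a PyTorch build with CUDA support for the platform (the model choice and batch shape here are arbitrary stand-ins):

    import torch
    import torchvision.models as models

    # The same inference code targets the GPU whether the host CPU is
    # x86 or Arm; only the underlying PyTorch build differs.
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model = models.resnet50(pretrained=True).eval().to(device)

    batch = torch.randn(8, 3, 224, 224, device=device)  # stand-in input batch
    with torch.no_grad():
        logits = model(batch)
    print(logits.shape)  # torch.Size([8, 1000])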

AWS and NVIDIA: A long history of collaboration

AWS and NVIDIA have collaborated for over 10 years to continually deliver powerful, cost-effective, and flexible GPU-based solutions to customers, including the latest EC2 G4 instances with NVIDIA T4 GPUs launched in 2019 and EC2 P4d instances with NVIDIA A100 GPUs launched in 2020. EC2 P4d instances are deployed in hyperscale clusters called EC2 UltraClusters, composed of the highest-performance compute, networking, and storage in the cloud. EC2 UltraClusters support 400 Gbps instance networking, Elastic Fabric Adapter (EFA), and NVIDIA GPUDirect RDMA technology to help rapidly train ML models using scale-out and distributed techniques.
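For a sense of how the scale-out piece looks in code, here is a minimal data-parallel training step using PyTorch’s NCCL backend. On P4d instances, NCCL can route its collectives over EFA and GPUDirect RDMA (via the aws-ofi-nccl plugin) without changes to the training loop itself; the model and tensor shapes below are placeholders:

    import os
    import torch
    import torch.distributed as dist
    from torch.nn.parallel import DistributedDataParallel as DDP

    # Launched with torchrun, one process per GPU across the cluster.
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ.get("LOCAL_RANK", "0"))
    torch.cuda.set_device(local_rank)

    model = DDP(torch.nn.Linear(1024, 1024).cuda(), device_ids=[local_rank])
    opt = torch.optim.SGD(model.parameters(), lr=0.1)

    x = torch.randn(32, 1024, device="cuda")
    loss = model(x).square().mean()   # stand-in loss
    loss.backward()                   # gradients all-reduced across GPUs and nodes here
    opt.step()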

In addition to being first in the cloud to offer GPU-accelerated instances and first in the cloud to offer NVIDIA V100 GPUs, we’re now working together with NVIDIA to offer new EC2 instances that combine an Arm-based processor with a GPU accelerator in the second half of 2021. To learn more about how AWS and NVIDIA work together to bring innovative technology to customers, visit AWS at NVIDIA GTC 21.


About the Author

Geoff Murase is a Senior Product Marketing Manager for AWS EC2 accelerated computing instances, helping customers meet their compute needs by providing access to hardware-based compute accelerators such as Graphics Processing Units (GPUs) or Field Programmable Gate Arrays (FPGAs). In his spare time, he enjoys playing basketball and biking with his family.


NVIDIA DRIVE Sim Ecosystem Creates Diverse Proving Ground for Self-Driving Vehicles

Developing autonomous vehicles with large-scale simulation requires an ecosystem of partners and tools that’s just as wide-ranging.

NVIDIA DRIVE Sim powered by Omniverse addresses AV development challenges with a scalable, diverse and physically accurate simulation platform. With DRIVE Sim, developers can improve productivity and test coverage, accelerating their time to market while minimizing the need for real-world driving.

The variety and depth of the companies that form the DRIVE Sim ecosystem are core to what makes the platform the foremost solution for autonomous vehicle simulation.

DRIVE Sim enables high-fidelity simulation by tapping into NVIDIA’s core technologies, including NVIDIA RTX, Omniverse and AI, to deliver a powerful, cloud-based simulation platform. It can generate datasets to train the vehicle’s perception system or provide a virtual proving ground to test the vehicle’s decision-making and control logic.

The platform can be connected to the AV stack in software-in-the-loop or hardware-in-the-loop configurations to test the full driving experience.

DRIVE Sim comes with a rich library of configurable models for environments, scenarios, vehicles, sensors and traffic that work right out of the box.

It also includes dedicated application programming interfaces that enable developers to build DRIVE Sim connectors, plugins, and extensions to tailor the simulation experience to specific requirements and workflows. These APIs make it possible to leverage past investment and development by allowing integration into pre-established AV simulation toolchains.


With a broad ecosystem of simulation partners, DRIVE Sim always features the cutting edge in virtual simulation models, rich environments, and verification and validation tools.

Ever-Changing Environments

Driving behavior varies with the environment the vehicle is operating in. From the dense traffic of urban streets to sparse, winding highways, self-driving cars must be able to handle different domains, as well as follow the unique traffic laws of different countries.

DRIVE Sim ecosystem partners provide realistic virtual models of the three-dimensional road environment, including tools to create such environments, reference maps to build accurate road networks, and environment assets such as traffic signs and lights, other vehicles, pedestrians, bicyclists, buildings, trees, lamp posts, fire hydrants and road debris.

DRIVE Sim features realistic virtual models of complex road environments, either via out-of-the-box sample environments or via imported environments and assets from ecosystem partners.

NVIDIA is partnering with various 3D model providers to make these assets available for easy download and import via Omniverse into simulated environments and scenarios for DRIVE Sim.

Modeling Vehicle Behavior

In addition to recreating the real-world environment in the virtual world, simulation must accurately reproduce the way the vehicle itself responds to road inputs and controls, such as acceleration, steering and braking.

Vehicle dynamics models respond to the control signals sent by DRIVE Sim, returning the correct position and orientation of the vehicle given those inputs.

These models simulate the vehicle dynamics to help validate planning and control algorithms with the highest possible fidelity. They can recreate the orientation and motion of sensors as the vehicle turns or brakes suddenly, as well as the sensor reaction to road vibration or other harsh conditions.
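As a drastically simplified illustration of the interface such a model implements, the kinematic bicycle model below maps steering and acceleration inputs to an updated vehicle pose each timestep. Production dynamics models are far higher fidelity; the wheelbase value here is arbitrary:

    import math
    from dataclasses import dataclass

    @dataclass
    class VehicleState:
        x: float = 0.0    # position (m)
        y: float = 0.0
        yaw: float = 0.0  # heading (rad)
        v: float = 0.0    # speed (m/s)

    def step(s: VehicleState, accel: float, steer: float,
             wheelbase: float = 2.9, dt: float = 0.01) -> VehicleState:
        """Advance the pose one timestep from acceleration and steering inputs."""
        s.x += s.v * math.cos(s.yaw) * dt
        s.y += s.v * math.sin(s.yaw) * dt
        s.yaw += s.v / wheelbase * math.tan(steer) * dt
        s.v += accel * dt
        return s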

Vehicle models also help assess the robustness of the autonomous driving system itself. As the vehicle experiences tire and brake wear, varying cargo loads and wheel alignment, it’s critical to see how the system responds to ensure safety.

High-fidelity vehicle dynamics models are necessary to evaluate planning and control algorithms, even for low-speed parking maneuvers.

NVIDIA is collaborating with all major vehicle dynamics model providers to ensure that their models can be integrated into DRIVE Sim.

Sensing Simulation

Just as with autonomous vehicles in the physical world, virtual vehicles also need sensors to perceive their surroundings. DRIVE Sim comes with a library of standard models for camera, radar, lidar and ultrasonic sensors.

Through APIs, it’s also possible for users and ecosystem partners to integrate dedicated models for sensor simulation into DRIVE Sim.

These models typically simulate sensor components such as transmitters, receivers, imagers and lenses, as well as include signal-processing software and transcoders.
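As a toy example of one component such models capture, the sketch below projects 3D points through an ideal pinhole camera. Real sensor models layer lens distortion, imager noise, rolling shutter and full signal-processing chains on top of this; the intrinsics used here are arbitrary:

    import numpy as np

    def project_points(points_cam: np.ndarray, fx: float, fy: float,
                       cx: float, cy: float) -> np.ndarray:
        """Project Nx3 camera-frame points onto the image plane (no distortion)."""
        X, Y, Z = points_cam[:, 0], points_cam[:, 1], points_cam[:, 2]
        return np.stack([fx * X / Z + cx, fy * Y / Z + cy], axis=1)

    pts = np.array([[0.5, -0.2, 10.0], [1.0, 0.3, 25.0]])  # points in front of the camera
    print(project_points(pts, fx=1000, fy=1000, cx=960, cy=540))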

Physically accurate light simulation using RTX real-time ray tracing, in combination with detailed sensor models, is used to validate perception edge cases, for example at sunrise or sunset, when sunlight shines directly into the camera.

Multiple camera, radar and lidar suppliers already provide models of their sensors for DRIVE Sim. By incorporating sensor models with this level of granularity, DRIVE Sim can accurately recreate the output of what a physical sensor in the real world would create as the vehicle drives.

Finding the Unknowns

Vehicles driving in the real world aren’t the only ones on the road, and the same is true in simulation.

With detailed traffic models, developers can play out specific scenarios with the same variables and unpredictability of the real world. Some DRIVE Sim partners develop naturalistic traffic — or situations where the end result is unknown — to test and validate the autonomous vehicle systems.

Getting realistic (and sometimes unpredictable) events into DRIVE Sim can be achieved via scenario catalogs, traffic simulation models and scenario-based verification and validation methodologies from ecosystem partners.

Other partners contribute specific scenario catalogs and scenario-based verification and validation methodologies that evaluate whether an autonomous vehicle system meets specific key performance indicators.

These criteria can be regulatory requirements or industry standards. NVIDIA is participating in multiple projects, consortia and standards organizations across the globe aimed at creating standards for autonomous vehicle simulation.

Always in the Loop

Finally, the DRIVE Sim ecosystem makes it possible to use simulation to test and validate the full autonomous vehicle hardware system.

The NVIDIA DRIVE Constellation hardware-in-the-loop platform, which contains the AI compute system that runs in the vehicle, allows for bit-accurate at-scale validation of the AV stack on the target hardware.

System integration partners provide the infrastructure to connect DRIVE Constellation to the rest of the vehicle’s electronic architecture. This full integration with components like the braking, engine and cockpit control units enables developers to evaluate how the full vehicle reacts in specific self-driving scenarios.

With experienced partners contributing diverse and constantly updated models, self-driving systems can be continually developed, tested and validated using the highest quality content.


Carestream Health and Startups Develop AI-Enabled Medical Instruments with NVIDIA Clara AGX Developer Kit

Carestream Health, a leading maker of medical imaging systems, is investigating the use of NVIDIA Clara AGX — an embedded AI platform for medical devices — in the development of AI-powered features for single-frame and streaming X-ray applications.

Startups around the world, too, are adopting Clara AGX for AI solutions in medical imaging, surgery and electron microscopy. Among them is Boston-based Activ Surgical, which recently received FDA clearance for a hardware imaging module to deliver real-time AI insights to the operating room.

Now in general availability, the NVIDIA Clara AGX developer kit advances the development of software-defined instruments, such as microscopes, ultrasounds and endoscopes.

This emerging generation of medical devices is equipped with dozens of real-time AI applications providing support at every step of the clinical experience — from automating patient set-up for scans and improving image quality to analyzing data streams and delivering critical insights to care providers.

NVIDIA Clara AGX is accelerating the development of these new medical instruments by providing a universal platform that can deliver high-bandwidth signal processing, accelerated computing reconstruction, AI processing and advanced 3D visualization.

Helping Clinicians Sense in Real Time 

Medical instruments like endoscopes and surgical robots are mounted with cameras, sending a live video feed to the clinicians operating the devices. Capturing these streams and applying computer vision AI to the video content can give medical professionals tools to improve patient care and bolster the capabilities of hospitals that lack adequate medical imaging resources.

Architected with NVIDIA Jetson AGX Xavier, an NVIDIA RTX 6000 GPU and the NVIDIA Mellanox ConnectX-6 SmartNIC, the Clara AGX developer kit comes with an SDK that makes it easy for developers to get up and running with real-time system software, libraries for input/output and video pipelining, and reference applications to create AI models for ultrasound and endoscopy.

Built into the platform is the NVIDIA EGX stack for cloud-native, containerized software and microservices, including NVIDIA Fleet Command for securely deploying fleets of devices in hospitals. Together, these components transform everyday sensors into smart sensors.

These smart sensors will be software-defined, meaning they can be regularly updated with AI algorithms as they improve — an essential capability to continuously connect research breakthroughs with the day-to-day practice of medicine.

Enabling Intelligent Instruments

Carestream Health is creating smart X-ray rooms that will include AI-powered features for an enhanced imaging workflow and faster, more efficient exams. The devices include automated positioning and exposure settings for similar exam types, which helps improve the consistency of X-ray images, boosting diagnostic confidence.

And Activ Surgical, a member of the NVIDIA Inception startup accelerator program, is using NVIDIA GPU-accelerated AI to deliver real-time surgical guidance. The company’s newly FDA-cleared ActivSight module will power its ActivINSIGHT product, which will provide surgeons with previously unavailable visual overlays, including blood flow and perfusion, without the need for injected dyes.

Carestream Health and Activ Surgical are just two of the pioneering companies worldwide using NVIDIA AGX systems to power intelligent medical devices. Others include:

  • AJA Video Systems, based in California’s Gold Country, develops professional video and audio PCIe cards for high-bandwidth streaming. When combined with the NVIDIA Clara AGX developer kit, which includes two PCIe slots and high-speed network ports, the company’s cards can be used for endoscopy and surgical visualization applications.
  • Kaliber Labs, an NVIDIA Inception member, is building real-time AI-powered software solutions to support surgeons performing arthroscopic and minimally invasive procedures. Kaliber uses NVIDIA Clara AGX to deploy its surgical software suite, which equips surgeons with a first-of-its-kind contextualized and personalized surgical toolkit to help them perform at the highest level and reduce surgical variability.
  • KAYA Instruments, an NVIDIA Inception member, develops computer vision products that can be used with imaging devices, including electron microscopes, ultrasound machines and MRI equipment. The Israel-based company’s video acquisition cards and cameras transfer medical imaging content to NVIDIA GPUs for real-time processing and AI-accelerated analysis.
  • Subtle Medical, an NVIDIA Inception member, has deployed FDA-cleared and CE-marked deep-learning powered image enhancement software solutions for PET and MRI protocols. The company will leverage NVIDIA Clara AGX for SubtleIR, an AI-powered software under development that improves the speed and quality of interventional imaging procedures.
  • Theator, an NVIDIA Inception member, will use NVIDIA Clara AGX to develop its surgical analytics platform. The Palo Alto-based startup is developing edge GPU-accelerated AI systems to annotate operating room footage, allowing surgeons to conduct post-surgery reviews in which they can compare parts of a procedure with previous identical procedures.
  • us4us, a Poland-based maker of ultrasound research systems, is using NVIDIA AGX systems for a portable ultrasound platform that will support real-time digital beamforming — a compute-intensive technique essential to capturing quality ultrasound images. The software-defined system uses embedded GPU modules so medical researchers can develop and deploy custom AI models for image processing during ultrasound scans.

Learn more about Clara AGX for AI-powered medical devices and instruments in the GTC talk, “Using Ethernet to Stream High-Throughput, Low-Latency Medical Sensor Data.” Registration for the NVIDIA GPU Technology Conference is free. The healthcare track includes 16 live webinars, 18 special events and over 100 recorded sessions.

Registration isn’t required to watch NVIDIA CEO Jensen Huang’s keynote address.

Subscribe to NVIDIA healthcare news, and follow NVIDIA Healthcare on Twitter.


NVIDIA Gives Arm a Second Shot of Acceleration

The Arm ecosystem got a booster shot of advances from NVIDIA at GTC today.

NVIDIA discussed work with Arm-based silicon, software and service providers, showing the potential of energy-efficient, accelerated platforms and applications across client, cloud, HPC and edge computing.

NVIDIA also announced three new processors built around Arm IP, including “Grace,” its first data center CPU, which takes AI, cloud and high performance computing to new heights.

Separately, the new BlueField-3 data processing unit (DPU) sports more Arm cores, opening doors to new, more powerful applications in data center networking.

And NVIDIA DRIVE Atlan becomes the company’s first processor for autonomous vehicles packing an Arm-enabled DPU, showing the potential for high performance networks in automakers’ 2025 models.

A Vision of What’s Possible

In his GTC keynote, NVIDIA CEO Jensen Huang shared his vision for AI, HPC, data science, graphics and more. He also reaffirmed his pledge to expand the Arm ecosystem as part of the Arm acquisition deal NVIDIA announced in September 2020.

On the road to making that vision a reality, NVIDIA described a set of efforts to accelerate CPUs from four key Arm partners with NVIDIA GPUs, DPUs and software, enhancing apps from Arm developers.

GPUs Boost AWS Graviton2 Instances

In the cloud, NVIDIA announced it will provide GPU acceleration for Amazon Web Services Graviton2, the cloud-service provider’s own Arm-based processor. The accelerated Graviton2 instances will provide rich game-streaming experiences and lower the cost of powerful AI inference capabilities.

For example, game developers will use the AWS instances to stream Android games and other services that combine the efficiency of Graviton2 with NVIDIA RTX graphics technologies like ray tracing and DLSS.

In high performance computing, the new NVIDIA Arm HPC Developer Kit provides a high-performance, energy-efficient platform for supercomputers that combine Ampere Computing’s Altra — a CPU packing 80 Arm cores running up to 3.3 GHz — with the latest NVIDIA GPUs and DPUs.

The devkit runs a suite of NVIDIA compilers, libraries and tools for AI and HPC so developers can accelerate Arm-based systems for science and technical computing. Leading research institutions, including Oak Ridge and Los Alamos National Laboratories in the U.S. as well as national labs in South Korea and Taiwan, will be among its first users.

Pumping Up Client, Edge Platforms

In PCs, NVIDIA is working with MediaTek, the world’s largest supplier of smartphone chips, to create a new class of notebooks powered by an Arm-based CPU alongside an NVIDIA RTX GPU.

The notebooks will use Arm cores and NVIDIA graphics to give consumers energy-efficient portables with no-compromise media capabilities based on a reference platform that supports Chromium, Linux and NVIDIA SDKs.

And in edge computing, NVIDIA is working with Marvell Semiconductor to team its OCTEON Arm-based processors with NVIDIA’s GPUs. Together they will speed up AI workloads for network optimization and security.

Top AI Systems Join Arm’s Family

Two powerful AI supercomputers will come online next year.

The Swiss National Supercomputing Centre is building a system with 20 exaflops of AI performance. And in the U.S., the Los Alamos National Laboratory will switch on a new AI supercomputer for its researchers.

Both will be powered by NVIDIA’s first data center CPU, “Grace,” an Arm-based processor that will deliver 10x the performance of today’s fastest servers on the most complex AI and HPC workloads.

Named after pioneering computer scientist Grace Hopper, this CPU has the plumbing needed for the data-driven AI era. It sports coherent connections to NVIDIA GPUs running at 900 GB/s, thanks to fourth-generation NVLink — that’s 14x the bandwidth of today’s servers.
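The arithmetic checks out: 900 GB/s divided by 14 is roughly 64 GB/s, which corresponds to the total bidirectional bandwidth of the PCIe Gen 4 x16 links that attach GPUs in today’s servers.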

More Arm Cores for Networking

NVIDIA Mellanox networking is more than doubling down on its investment in Arm. The BlueField-3 DPU announced today packs 400-Gbps links and 5x the Arm compute power of the current BlueField-2 DPU.

Simple math shows why bulking up on Arm makes sense: one BlueField-3 DPU delivers data center services equivalent to those that could consume up to 300 x86 CPU cores.

The advance gives Arm developers an expanding set of opportunities to build fast, efficient and smart data center networks.

Today DPUs offload communications, storage, security and systems-management tasks. That’s enabling whole new classes of systems such as the cloud-native supercomputer NVIDIA announced today.

NVIDIA and Arm Behind the Wheel

Arm cores will debut in next-generation AI-enabled autonomous vehicles powered by NVIDIA DRIVE Atlan, the next leap on NVIDIA’s roadmap.

DRIVE Atlan will pack quite a punch, kicking out more than 1,000 trillion operations per second. Atlan marks the first time the DRIVE platform integrates a DPU, carrying Arm cores that will help it pack the equivalent of data center networking into autonomous vehicles.

The DPU in Atlan provides a platform for Arm developers to create innovative applications in security, storage, networking and more.

The Best Is Yet to Come 

The expanding products and partnerships mark progress on our intention announced in October to bring the Arm ecosystem four acceleration suites:

  • NVIDIA AI – the industry standard for accelerating AI training and inference
  • RAPIDS – a suite of open-source software libraries maintained by NVIDIA to run data science and analytics on GPUs
  • NVIDIA HPC SDK – compilers, libraries and software tools for high performance computing
  • NVIDIA RTX – graphics drivers that deliver ray tracing and AI capabilities

And we’re just getting started. There’s much more to come and much more to say.

Learn about new opportunities combining NVIDIA and Arm at GTC21. Registration is free.


NVIDIA DRIVE Sim Powered by Omniverse Available for Early Access This Summer

The path to autonomous vehicle deployment is accelerating through the Omniverse.

During his opening keynote at GTC, NVIDIA founder and CEO Jensen Huang announced the next generation of autonomous vehicle simulation, NVIDIA DRIVE Sim, now powered by NVIDIA Omniverse.

DRIVE Sim enables high-fidelity simulation by tapping into NVIDIA’s core technologies to deliver a powerful, cloud-based computing platform. It can generate datasets to train the vehicle’s perception system and provide a virtual proving ground to test the vehicle’s decision-making process while accounting for edge cases. The platform can be connected to the AV stack in software-in-the-loop or hardware-in-the-loop configurations to test the full driving experience.

DRIVE Sim on Omniverse is a major step forward as NVIDIA transitions the foundation for autonomous vehicle simulation from a game engine to a simulation engine.

This shift to simulation architected specifically for self-driving development has required significant effort, but brings an array of new capabilities and opportunities.

Enter the Omniverse

Creating a purpose-built autonomous vehicle simulation platform is not a simple undertaking. Game engines are powerful tools that provide incredible capabilities; however, they’re designed to build games, not scientific, physically accurate, repeatable simulations.

Designing the next generation of DRIVE Sim required a new approach. This new simulator had to be repeatable with precise timing, easily scale across GPUs and server nodes, simulate sensor feeds with physical accuracy and act as a modular and extensible platform.

NVIDIA Omniverse is the confluence of almost every core technology developed by NVIDIA. And DRIVE Sim takes advantage of the company’s expertise in graphics, high performance computing, AI and hardware design. Combining these capabilities provides a technology platform that is perfect for autonomous vehicle simulation.

Specifically, Omniverse provides a platform that was designed from the ground up to support multi-GPU computing. It incorporates a physically accurate, ray-tracing renderer based on NVIDIA RTX technology.

NVIDIA Omniverse also includes “Kit,” a scalable and extensible simulation framework for building interactive 3D applications and microservices. Using Kit over the last year, NVIDIA has implemented the DRIVE Sim core simulation engine in a way that supports repeatable simulation with precise control over all processes.

Timing and Repeatability

Autonomous vehicle simulation can only be an effective development tool if scenarios are repeatable and timing is accurate.

For instance, NVIDIA Omniverse schedules and manages all sensor and environment rendering functions to ensure repeatability without loss of accuracy. It does this across GPUs and across nodes, giving DRIVE Sim the ability to handle detailed environments and test vehicles with complex sensor suites. Additionally, it can run such workloads slower or faster than real time while generating repeatable results.

Omniverse was designed to scale to many GPUs, providing DRIVE Sim with real-time rendering capabilities and repeatable results for complex sensor sets.

Not only does the platform enable this flexibility and accuracy, it does so in a way that’s scalable, so developers can run fleets of vehicles with various sensor suites at large scale and at the highest levels of fidelity.
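To make that repeatability contract concrete, here is a minimal sketch of the principle in Python (an illustration, not Omniverse’s actual scheduler): with a fixed timestep and seeded randomness, two runs of the same scenario produce identical traces regardless of execution speed.

    import random

    def simulate(seed: int, steps: int, dt: float = 0.01) -> list:
        """Fixed-timestep, seeded simulation loop: deterministic by construction."""
        rng = random.Random(seed)
        state, trace = 0.0, []
        for i in range(steps):
            t = i * dt                       # simulated time, decoupled from wall clock
            state += rng.gauss(0, 1) * dt    # stand-in for scheduled simulation work
            trace.append((t, state))
        return trace

    assert simulate(42, 100) == simulate(42, 100)  # identical runs, bit for bit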

Physically Accurate Sensors

In addition to accurately recreating real-world driving conditions, the simulation environment must also render vehicle sensor data in the exact same way cameras, radars and lidars take in data from the physical world.

With NVIDIA RTX technology, DRIVE Sim is able to render physically accurate sensor data in real time. Ray tracing provides realistic lighting by simulating the physical properties of visible and non-visible waveforms. And the NVIDIA Omniverse RTX renderer coupled with NVIDIA RTX GPUs enables ray tracing at real-time frame rates.

This scene of vehicles in a tunnel uses indirect lighting, which is challenging to render accurately in real time, but is enabled in DRIVE Sim by the Omniverse RTX renderer.

The capability to simulate light in real time has significant benefits for autonomous vehicle simulation. It makes it possible to recreate lighting environments that can be virtually impossible to capture using rasterization — from the reflections off a tanker truck to the shadows inside a dim tunnel.

Generating physically accurate sensor data is especially powerful for building datasets to train AI-based perception networks, outputting ground-truth data alongside the virtual sensor data. DRIVE Sim includes tools for advanced dataset creation, including a powerful Python scripting interface and domain randomization tools.

Using this synthetic data in the deep neural network (DNN) training process saves the cost of collecting and labeling real-world data, and speeds up iteration for streamlined autonomous vehicle deployment.

DRIVE Sim provides tools to generate ground-truth data with simulation data, enabling rapid generation of complex datasets to train DNNs for autonomous vehicle perception.
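The snippet below sketches the domain-randomization idea in plain Python; the parameter names and the render_frame() hook are hypothetical stand-ins, not DRIVE Sim’s actual scripting API:

    import random

    def sample_scene_params(rng: random.Random) -> dict:
        """Randomize scene parameters so each rendered training frame differs."""
        return {
            "sun_elevation_deg": rng.uniform(-5, 60),  # includes low-sun edge cases
            "fog_density": rng.uniform(0.0, 0.3),
            "num_vehicles": rng.randint(0, 40),
            "camera_exposure": rng.uniform(0.5, 2.0),
        }

    rng = random.Random(0)
    for _ in range(1000):
        params = sample_scene_params(rng)
        # frame, labels = render_frame(params)  # hypothetical simulator call
        # dataset.append((frame, labels))       # image plus ground-truth labels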

Modular and Extensible

As a modular, open and extensible platform, DRIVE Sim provides developers the ultimate flexibility and efficiency in simulation testing.

DRIVE Sim on Omniverse allows different components of the simulator to run independently to support different use cases. One group of engineers can run just the perception stack in simulation. Another can focus on the planning and control stack by simulating scenarios based on ground-truth object data (thus bypassing the perception stack).

This modularity significantly cuts down on development time by allowing developers to focus on the task at hand, while ensuring that the entire team is using the same tools, scenarios, models and assets in simulation for consistent results.

Using the NVIDIA Omniverse Kit SDK, DRIVE Sim allows developers to build custom models, 3D content and validation tools or to interface with other simulations. Users can create their own plugins or choose from a rich library of vehicle, sensor and traffic plugins provided by DRIVE Sim ecosystem partners. This flexibility enables users to customize DRIVE Sim for their unique use case and tailor the simulation experience to their development and validation needs.

DRIVE Sim on Omniverse will be available to developers via an early access program this summer. Learn more about DRIVE Sim and accelerate the development of safer, more efficient transportation today.


A Data Center on Wheels: NVIDIA Unveils DRIVE Atlan Autonomous Vehicle Platform

The next stop on the NVIDIA DRIVE roadmap is Atlan.

During today’s opening keynote of the GPU Technology Conference, NVIDIA founder and CEO Jensen Huang unveiled the upcoming generation of AI compute for autonomous vehicles, NVIDIA DRIVE Atlan. A veritable data center on wheels, Atlan centralizes the vehicle’s entire compute infrastructure into a single system-on-a-chip.

While vehicles are packing in more and more compute technology, they lack the physical security that protects data center-level processing. Atlan is a technical marvel for safe and secure AI computing, fusing all of NVIDIA’s technologies in AI, automotive, robotics, safety and BlueField data centers.

The next-generation platform will achieve an unprecedented 1,000 trillion operations per second (TOPS) of performance and an estimated SPECint score of more than 100 (SPECrate2017_int) — greater than the total compute in most robotaxis today. Atlan is also the first SoC to be equipped with an NVIDIA BlueField data processing unit (DPU) for trusted security, advanced networking and storage services.

While Atlan will not be available for a couple of years, software development is well underway. Like NVIDIA DRIVE Orin, the next-gen platform is software compatible with previous DRIVE compute platforms, allowing customers to leverage their existing investments across multiple product generations.

“To achieve higher levels of autonomy in more conditions, the number of sensors and their resolutions will continue to increase,” Huang said. “AI models will get more sophisticated. There will be more redundancy and safety functionality. We’re going to need all of the computing we can get.”

Advancing Performance at Light Speed

Autonomous vehicle technology is developing faster than it has in previous years, and the core AI compute must advance in lockstep to support this critical progress.

Cars and trucks of the future will require an optimized AI architecture not only for autonomous driving, but also for intelligent vehicle features like speech recognition and driver monitoring. Upcoming software-defined vehicles will be able to converse with occupants: answering questions, providing directions and warning of road conditions ahead.

Atlan is able to deliver more than 1,000 TOPS — a 4x gain over the previous generation — by leveraging NVIDIA’s latest GPU architecture, new Arm CPU cores, and deep learning and computer vision accelerators. The platform architecture provides ample compute horsepower for the redundant and diverse deep neural networks that will power future AI vehicles, and leaves headroom for developers to continue adding features and improvements.

This high-performance platform will run autonomous vehicle, intelligent cockpit and traditional infotainment applications concurrently.

A Guaranteed Shield with BlueField

Like every generation of NVIDIA DRIVE, Atlan is designed with the highest level of safety and security.

As a data-center-infrastructure-on-a-chip, the NVIDIA BlueField DPU is architected to handle the complex compute and AI workloads required for autonomous vehicles. By combining the industry-leading ConnectX network adapter with an array of Arm cores, BlueField offers purpose-built hardware acceleration engines with full programmability to deliver “zero-trust” security to prevent data breaches and cyberattacks.

This secure architecture will extend the safety and reliability of the NVIDIA DRIVE platform for vehicle generations to come. NVIDIA DRIVE Orin vehicle production timelines start in 2022, and Atlan will follow, sampling in 2023 and slated for 2025 production vehicles.


NVIDIA Opens Up Hyperion 8 Autonomous Vehicle Platform for AV Ecosystem

The next generation of vehicles will be packed with more technology than any computing system today.

And with NVIDIA DRIVE Hyperion, companies can embrace this shift to more intelligent, software-defined vehicles. Announced at GTC, the eighth-generation Hyperion platform includes the sensors, high-performance compute and software necessary for autonomous vehicle development, all verified, calibrated and synchronized right out of the box.

Developing an AV — essentially a data center on wheels — requires an entirely new process. Both the hardware and software must be comprehensively tested and validated to ensure they can not only handle the real-time processing for autonomous driving, but also withstand the harsh conditions of daily driving.

Hyperion is a fully operational, production-ready and open autonomous vehicle platform that cuts down the massive amount of time and cost required to outfit vehicles with the technology required for AI features and autonomous driving.

What’s Included

Hyperion comes with all the hardware needed to validate an autonomous driving system at the highest levels of performance.

At its core, two NVIDIA DRIVE Orin systems-on-a-chip (SoCs) provide ample compute for level 4 self-driving and intelligent cockpit capabilities. These SoCs process data from a halo of 12 exterior cameras, three interior cameras, nine radars and two lidar sensors in real time for safe autonomous operation.

Hyperion also includes all the tools necessary to evaluate the NVIDIA DRIVE AV and DRIVE IX software stack, as well as real-time record and capture capabilities for streamlined driving data processing.

And this entire toolset is synchronized and calibrated precisely for 3D data collection, giving developers valuable time back in setting up and running autonomous vehicle test drives.

Seamless Integration

With much of the industry leveraging NVIDIA DRIVE Orin for in-vehicle compute, DRIVE Hyperion is the next step for full autonomous vehicle development and validation.

By including a complete sensor setup on top of centralized compute, Hyperion provides everything needed to validate an intelligent vehicle’s hardware on the road. And with its compatibility with the NVIDIA DRIVE AV and DRIVE IX software stacks, Hyperion is also a critical platform for evaluating and validating self-driving software.

Plus, it’s already streamlining critical self-driving research and development. Institutions such as the Virginia Tech Transportation Institute and Stanford University are leveraging the current generation of Hyperion in autonomous vehicle research pilots.

Developers can begin leveraging the latest open platform soon — the eighth generation of Hyperion will be available to the NVIDIA DRIVE ecosystem later in 2021.


Brain Gain: NVIDIA DRIVE Orin Now Central Computer for Intelligent Vehicles

NVIDIA DRIVE Orin, our breakthrough autonomous vehicle system-on-a-chip, is the new mega brain of the software-defined vehicle.

Beyond self-driving features, NVIDIA CEO and founder Jensen Huang announced today during his GTC keynote that the SoC can power all the intelligent computing functions inside vehicles, including confidence view visualization of autonomous driving capabilities, digital clusters, infotainment and passenger interaction AI.

Slated for 2022 vehicle product lines, Orin processes more than 250 trillion operations per second while achieving systematic safety standards such as ISO 26262 ASIL-D.

Typically, vehicle functions are controlled by tens of electronic control units distributed throughout a vehicle. By centralizing control of these core domains, Orin can replace these components and simplify what has been an incredibly complex supply chain for automakers.

“The future is one central computer — four domains, virtualized and isolated, architected for functional safety and security, software-defined and upgradeable for the life of the car — in addition to super-smart AI and beautiful graphics,” Huang said.

Secure Computing for Every Need

Managing a system with multiple complex applications is incredibly difficult. And when it comes to automotive, safety is critical.

DRIVE Orin supports multiple operating systems, including Linux, QNX and Android, to enable this wide range of applications. As a high-performance compute platform architected for the highest level of safety, it does so in a way that is secure, virtualized and accelerated.

The digital cluster, driver monitoring system and AV confidence view are all crucial to ensuring the safety of a vehicle’s occupants. Each must be functionally secure, with the ability to update each application individually without requiring a system reboot.

DRIVE Orin is designed for software-defined operation, meaning it’s purpose-built to handle these continuous upgrades throughout the life of the vehicle.

The Highest Levels of Confidence

As vehicles become more and more autonomous, visualization within the cabin will be critical for building trust with occupants. And with the DRIVE Orin platform, manufacturers can integrate enhanced capability into their fleets over the life of their vehicles.

The confidence view is a rendering of the mind of the vehicle’s AI. It shows exactly what the sensor suite and perception system are detecting in real time and constructs it into a 3D surround model.

By incorporating this view in the cabin interior, the vehicle can communicate the accuracy and reliability of the autonomous driving system at every step of the journey. And occupants can gain a better understanding of how the vehicle’s AI sees the world.

As a high-performance AI compute platform, DRIVE Orin enables this visualization alongside the digital cluster, infotainment, and driver and occupant monitoring, while maintaining enough compute headroom to add new features that delight customers through the life of their vehicles.

The ability to support this multi-functionality safely and securely is what makes NVIDIA DRIVE Orin truly central to the next-generation intelligent vehicle experience.


NVIDIA Triton Tames the Seas of AI Inference

You don’t need a hunky sea god with a three-pronged spear to make AI work, but a growing group of companies from car makers to cloud service providers say you’ll feel a sea change if you sail with Triton.

More than half a dozen companies share hands-on experiences this week in deep learning with the NVIDIA Triton Inference Server, open-source software that takes AI into production by simplifying how models run in any framework on any GPU or CPU for all forms of inference.

For instance, in a talk at GTC (free with registration) Fabian Bormann, an AI engineer at Volkswagen Group, conducts a virtual tour through the Computer Vision Model Zoo, a repository of solutions curated from the company’s internal teams and future partners.

The car maker integrates Triton into its Volkswagen Computer Vision Workbench so users can contribute to the Model Zoo without worrying about whether their models are based on ONNX, PyTorch or TensorFlow. Triton simplifies model management and deployment, and that’s key for VW’s work serving up AI models in new and interesting environments, Bormann says in the description of his GTC talk (session E32736).

Salesforce Sold on Triton Benchmarks

A leader in customer-relationship management software and services, Salesforce recently benchmarked Triton’s performance on some of the world’s largest AI models — the transformers used for natural-language processing.

“Triton not only has excellent serving performance, but also comes included with several critical functions like dynamic batching, model management and model prioritization. It is quick and easy to set up and works for many deep learning frameworks including TensorFlow and PyTorch,” said Nitish Shirish Keskar, a senior research manager at Salesforce who’s presenting his work at GTC (session S32713).

Keskar described in a recent blog his work validating that Triton can handle 500-600 queries per second (QPS) while processing 100 concurrent threads and staying under 200ms latency on the well-known BERT models used to understand speech and text. He tested Triton on the much larger CTRL and GPT2-XL models, finding that despite their billions of neural-network nodes, Triton still cranked out an amazing 32-35 QPS.
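For a sense of what calling Triton looks like from a client, here is a minimal HTTP inference request using the tritonclient Python package. The model name and tensor names are assumptions for illustration; they must match the deployed model’s configuration:

    import numpy as np
    import tritonclient.http as httpclient

    client = httpclient.InferenceServerClient(url="localhost:8000")

    # Dummy token IDs standing in for a tokenized sentence.
    input_ids = np.zeros((1, 128), dtype=np.int32)
    inp = httpclient.InferInput("input_ids", list(input_ids.shape), "INT32")
    inp.set_data_from_numpy(input_ids)

    result = client.infer(
        model_name="bert",
        inputs=[inp],
        outputs=[httpclient.InferRequestedOutput("logits")],
    )
    print(result.as_numpy("logits").shape)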

A Model Collaboration with Hugging Face

More than 5,000 organizations turn to Hugging Face for help summarizing, translating and analyzing text with its 7,000 AI models for natural-language processing. Jeff Boudier, its product director, will describe at GTC (session S32003) how his team drove 100x improvements in AI inference on its models, thanks to a flow that included Triton.

“We have a rich collaboration with NVIDIA, so our users can have the most optimized performance running models on a GPU,” said Boudier.

Hugging Face aims to combine Triton with TensorRT, NVIDIA’s software for optimizing AI models, to drive the time to process an inference with a BERT model down to less than a millisecond. “That would push the state of the art, opening up new use cases with benefits for a broad market,” he said.

Deployed at Scale for AI Inference

American Express uses Triton in an AI service that operates within a 2ms latency requirement to detect fraud in real time across $1 trillion in annual transactions.

As for throughput, Microsoft uses Triton on its Azure cloud service to power the AI behind GrammarLink, its online editor for Microsoft Word that’s expected to serve as many as half a trillion queries a year.

Less well known but well worth noting, LivePerson, based in New York, plans to run thousands of models on Triton in a cloud service that provides conversational AI capabilities to 18,000 customers including GM Financial, Home Depot and European cellular provider Orange.

Triton simplifies the job of executing multiple styles of inference with models based on various frameworks, while maintaining the highest throughput and system utilization.

And the chief technology officer of London-based Intelligent Voice will describe at GTC (session S31452) its LexIQal system, which uses Triton for AI inference to detect fraud in insurance and financial services.

They are among many companies using NVIDIA for AI inference today. In the past year alone, users downloaded the Triton software more than 50,000 times.

Triton’s Swiss Army Spear

Triton is getting traction in part because it can handle any kind of AI inference job, whether it runs in real time, in batch mode or as a streaming service, or even involves a chain or ensemble of models. That flexibility eliminates the need for users to adopt and manage custom inference servers for each type of task.

In addition, Triton assures high system utilization, distributing work evenly across GPUs whether inference is running in a cloud service, in a local data center or at the edge of the network. And its open, extensible code lets users customize Triton to their specific needs.

NVIDIA keeps improving Triton, too. A recently added model analyzer combs through the options to show users the optimal batch size or number of instances per GPU for their job. A new tool automates translating and validating a model trained in TensorFlow or PyTorch into TensorRT format; in the future, it will support translating models to and from any neural-network format.

Meet Our Inference Partners

Triton has attracted several partners who support the software in their cloud services, including Amazon, Google, Microsoft and Tencent. Others, such as Allegro, Seldon and Red Hat, support Triton in software for enterprise data centers, for workflows including MLOps, the extension of DevOps to AI.

At GTC (session S33118), Arm will describe how it adapted Triton as part of its neural-network software that runs inference directly on edge gateways. Two engineers from Dell EMC will show how to boost performance in video analytics 6x using Triton (session S31437), and NetApp will talk about its work integrating Triton with its solid-state storage arrays (session S32187).

To learn more, register for GTC and check out one of two introductory sessions (S31114, SE2690) with NVIDIA experts on Triton for deep learning inference.


Like Magic: NVIDIA Merlin Gains Adoption for Training and Inference

Recommenders personalize the internet. They suggest videos, foods, sneakers and advertisements that seem magically clairvoyant in knowing your tastes and interests.

It’s an AI that makes online experiences more enjoyable and efficient, quickly taking you to the things you want to see. While delivering content you like, it also targets tempting ads for jeans, or recommends comfort dishes that fit those midnight cravings.

But not all recommender systems can handle the data requirements to make smarter suggestions. That leads to slower training and less intuitive internet user experiences.

NVIDIA Merlin is turbocharging recommenders, boosting training and inference. Leaders in media, entertainment and on-demand delivery use the open source recommender framework for running accelerated deep learning on GPUs. Improving recommendations increases clicks, purchases — and satisfaction.

Merlin-Accelerated Recommenders 

NVIDIA Merlin enables businesses of all types to build recommenders accelerated by NVIDIA GPUs.

Its collection of libraries includes tools for building deep learning-based systems that provide better predictions than traditional methods and increase clicks. Each stage of the pipeline is optimized to support hundreds of terabytes of data, all accessible through easy-to-use APIs.

Merlin is in testing with hundreds of companies worldwide. Social media and video services are evaluating it for suggestions on next views and ads. And major on-demand apps and retailers are looking at it for suggestions on new items to purchase.

Videos with Snap

With Merlin, Snap is improving the customer experience with better load times by ranking content and ads 60% faster, while also reducing its infrastructure costs. Using GPUs and Merlin provides Snap with additional compute capacity to explore more complex and accurate ranking models. These improvements allow Snap to deliver even more engaging experiences at a lower cost.

Tencent: Ads that Click

Tencent, China’s leading online video media platform, uses Merlin HugeCTR to help connect over 500 million monthly active users with ads that are relevant and engaging. With such a huge dataset, training speed matters and determines the performance of the recommender model. Tencent deployed its real-time training with Merlin and achieved more than a 7x speedup over its original TensorFlow solution on the same GPU platform. Tencent dives into this further in its GTC presentation.

Postmates Food Picks

Merlin was designed to streamline and support recommender workflows. Postmates uses recommenders to help people decide what’s for dinner, and uses Merlin NVTabular to cut training time from 1 hour on CPUs to just 5 minutes on GPUs.

Using NVTabular for feature engineering, the company reduced training costs by 95 percent and is exploring more advanced deep learning models. Postmates delves more into this in its GTC presentation.

Merlin Streamlines Recommender Workflows at Scale

As Merlin is interoperable, it provides flexibility to accelerate recommender workflow pipelines.

The open beta release of the Merlin recommendation engine delivers leaps in data loading and training of deep learning systems.

NVTabular reduces data preparation time by GPU-accelerating feature transformations and preprocessing, and makes loading massive data lakes into training pipelines easier. The open beta adds multi-GPU support and improved interoperability with TensorFlow and PyTorch.
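A minimal NVTabular pipeline looks roughly like the following; the column names and file paths are assumptions for illustration:

    import nvtabular as nvt
    from nvtabular import ops

    # Declare GPU-accelerated feature transformations per column group.
    cat_features = ["user_id", "item_id"] >> ops.Categorify()
    cont_features = ["price", "age"] >> ops.Normalize()
    workflow = nvt.Workflow(cat_features + cont_features)

    # Datasets iterate over parquet files larger than GPU (or host) memory.
    train = nvt.Dataset("train/*.parquet")
    workflow.fit(train)                                 # compute statistics on GPU
    workflow.transform(train).to_parquet("train_out/")  # write preprocessed data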

Merlin’s Magic for Training

Merlin HugeCTR is the main training component. It’s a deep neural network training framework designed specifically for recommender workflows, capable of distributed training across multiple GPUs and nodes for maximum performance. HugeCTR comes with its own optimized data loader that vastly outperforms generic deep learning frameworks, and provides a parquet data reader to digest NVTabular-preprocessed data.
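HugeCTR has its own Python interface and data loader; purely to illustrate the shape of the models it trains, here is a generic embedding-plus-MLP click-through-rate network in plain PyTorch (all sizes are arbitrary, and this is not HugeCTR’s actual API):

    import torch
    import torch.nn as nn

    class CTRModel(nn.Module):
        """Embedding tables for categorical features, MLP over the concatenation."""
        def __init__(self, cardinalities=(10000, 5000), n_cont=3, dim=16):
            super().__init__()
            self.embs = nn.ModuleList(nn.Embedding(c, dim) for c in cardinalities)
            self.mlp = nn.Sequential(
                nn.Linear(dim * len(self.embs) + n_cont, 128),
                nn.ReLU(),
                nn.Linear(128, 1),
            )

        def forward(self, cats, conts):
            e = [emb(cats[:, i]) for i, emb in enumerate(self.embs)]
            return torch.sigmoid(self.mlp(torch.cat(e + [conts], dim=1)))

    model = CTRModel()
    cats = torch.randint(0, 5000, (32, 2))  # batch of categorical feature IDs
    conts = torch.randn(32, 3)              # batch of continuous features
    clicks = model(cats, conts)             # predicted click probabilities, (32, 1)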

NVIDIA Triton Inference Server accelerates production inference on GPUs for feature transforms and neural network execution.

Learn more about the technology advances behind Merlin since its initial launch, including its support for NVTabular, HugeCTR and NVIDIA Triton Inference Server.

 
