NVIDIA Takes Inference to New Heights Across MLPerf Tests

MLPerf remains the definitive measurement for AI performance as an independent, third-party benchmark. NVIDIA’s AI platform has consistently shown leadership across both training and inference since the inception of MLPerf, including the MLPerf Inference 3.0 benchmarks released today.

“Three years ago when we introduced A100, the AI world was dominated by computer vision. Generative AI has arrived,” said NVIDIA founder and CEO Jensen Huang.

“This is exactly why we built Hopper, specifically optimized for GPT with the Transformer Engine. Today’s MLPerf 3.0 highlights Hopper delivering 4x more performance than A100.

“The next level of Generative AI requires new AI infrastructure to train large language models with great energy efficiency. Customers are ramping Hopper at scale, building AI infrastructure with tens of thousands of Hopper GPUs connected by NVIDIA NVLink and InfiniBand.

“The industry is working hard on new advances in safe and trustworthy Generative AI. Hopper is enabling this essential work,” he said.

The latest MLPerf results show NVIDIA taking AI inference to new levels of performance and efficiency from the cloud to the edge.

Specifically, NVIDIA H100 Tensor Core GPUs running in DGX H100 systems delivered the highest performance in every test of AI inference, the job of running neural networks in production. Thanks to software optimizations, the GPUs delivered up to 54% performance gains from their debut in September.

In healthcare, H100 GPUs delivered a 31% performance increase since September on 3D-UNet, the MLPerf benchmark for medical imaging.

H100 GPU AI inference performance on MLPerf workloads

Powered by its Transformer Engine, the H100 GPU, based on the Hopper architecture, excelled on BERT, a transformer-based large language model that paved the way for today’s broad use of generative AI.

Generative AI lets users quickly create text, images, 3D models and more. It’s a capability companies from startups to cloud service providers are rapidly adopting to enable new business models and accelerate existing ones.

Hundreds of millions of people are now using generative AI tools like ChatGPT — also a transformer model — expecting instant responses.

At this iPhone moment of AI, performance on inference is vital. Deep learning is now being deployed nearly everywhere, driving an insatiable need for inference performance from factory floors to online recommendation systems.

L4 GPUs Speed Out of the Gate

NVIDIA L4 Tensor Core GPUs made their debut in the MLPerf tests at over 3x the speed of prior-generation T4 GPUs. Packaged in a low-profile form factor, these accelerators are designed to deliver high throughput and low latency in almost any server.

L4 GPUs ran all MLPerf workloads. Thanks to their support for the key FP8 format, their results were particularly stunning on the performance-hungry BERT model.

NVIDIA L4 GPU AI inference performance on MLPerf workloads

In addition to stellar AI performance, L4 GPUs deliver up to 10x faster image decode, up to 3.2x faster video processing and over 4x faster graphics and real-time rendering performance.

Announced two weeks ago at GTC, these accelerators are already available from major systems makers and cloud service providers. L4 GPUs are the latest addition to NVIDIA’s portfolio of AI inference platforms launched at GTC.

Software, Networks Shine in System Test

NVIDIA’s full-stack AI platform showed its leadership in a new MLPerf test.

The so-called network-division benchmark streams data to a remote inference server. It reflects the popular scenario of enterprise users running AI jobs in the cloud with data stored behind corporate firewalls.

On BERT, remote NVIDIA DGX A100 systems delivered up to 96% of their maximum local performance, slowed in part because they needed to wait for CPUs to complete some tasks. On the ResNet-50 test for computer vision, handled solely by GPUs, they hit the full 100%.

Both results are thanks, in large part, to NVIDIA Quantum InfiniBand networking, NVIDIA ConnectX SmartNICs and software such as NVIDIA GPUDirect.

Orin Shows 3.2x Gains at the Edge

Separately, the NVIDIA Jetson AGX Orin system-on-module delivered gains of up to 63% in energy efficiency and 81% in performance compared with its results a year ago. Jetson AGX Orin supplies inference when AI is needed in confined spaces at low power levels, including on systems powered by batteries.

Jetson AGX Orin AI inference performance on MLPerf benchmarks

For applications needing even smaller modules drawing less power, the Jetson Orin NX 16G shined in its debut in the benchmarks. It delivered up to 3.2x the performance of the prior-generation Jetson Xavier NX processor.

A Broad NVIDIA AI Ecosystem

The MLPerf results show NVIDIA AI is backed by the industry’s broadest ecosystem in machine learning.

Ten companies submitted results on the NVIDIA platform in this round. They came from the Microsoft Azure cloud service and system makers including ASUS, Dell Technologies, GIGABYTE, H3C, Lenovo, Nettrix, Supermicro and xFusion.

Their work shows users can get great performance with NVIDIA AI both in the cloud and in servers running in their own data centers.

NVIDIA partners participate in MLPerf because they know it’s a valuable tool for customers evaluating AI platforms and vendors. Results in the latest round demonstrate that the performance they deliver today will grow with the NVIDIA platform.

Users Need Versatile Performance

NVIDIA AI is the only platform to run all MLPerf inference workloads and scenarios in data center and edge computing. Its versatile performance and efficiency make users the real winners.

Real-world applications typically employ many neural networks of different kinds that often need to deliver answers in real time.

For example, an AI application may need to understand a user’s spoken request, classify an image, make a recommendation and then deliver a response as a spoken message in a human-sounding voice. Each step requires a different type of AI model.

The MLPerf benchmarks cover these and other popular AI workloads. That’s why the tests ensure IT decision makers will get performance that’s dependable and flexible to deploy.

Users can rely on MLPerf results to make informed buying decisions, because the tests are transparent and objective. The benchmarks enjoy backing from a broad group that includes Arm, Baidu, Facebook AI, Google, Harvard, Intel, Microsoft, Stanford and the University of Toronto.

Software You Can Use

The software layer of the NVIDIA AI platform, NVIDIA AI Enterprise, ensures users get optimized performance from their infrastructure investments as well as the enterprise-grade support, security and reliability required to run AI in the corporate data center.

All the software used for these tests is available from the MLPerf repository, so anyone can get these world-class results.

Optimizations are continuously folded into containers available on NGC, NVIDIA’s catalog for GPU-accelerated software. The catalog hosts NVIDIA TensorRT, used by every submission in this round to optimize AI inference.

Read this technical blog for a deeper dive into the optimizations fueling NVIDIA’s MLPerf performance and efficiency.

Automate and implement version control for Amazon Kendra FAQs

Amazon Kendra is an intelligent search service powered by machine learning (ML). Amazon Kendra reimagines enterprise search for your websites and applications so your employees and customers can easily find the content they’re looking for, even when it’s scattered across multiple locations and content repositories within your organization.

Amazon Kendra FAQs allow users to upload frequently asked questions with their corresponding answers. This helps to consistently answer common queries among end-users. As of this writing, when you want to update FAQs, you must delete the FAQ and create it again. In this post, we present a simpler, faster approach for updating your Amazon Kendra FAQs (with versioning enabled). Our method eliminates the manual steps of creating and deleting FAQs when you update their contents.

Overview of solution

We use a fully deployable AWS CloudFormation template to create an Amazon Simple Storage Service (Amazon S3) bucket, which becomes the source to store your Amazon Kendra FAQs. Each index-based FAQ is maintained in the folder with a prefix relating to the Amazon Kendra index.

This solution uses an AWS Lambda function that gets triggered by an Amazon S3 event notification. When you upload an FAQ to the S3 folder mapped to a specific Amazon Kendra index, it creates a new version of the FAQ for your index. Older versions of FAQs are deleted only after the new FAQ index version is created, achieving near-zero downtime of index searching.

The following figure shows the workflow of how our method creates and deletes a new version of an Amazon Kendra FAQ.

Architecture for Automated FAQ Update for Amazon Kendra

The workflow steps are as follows:

  1. The user uploads the Amazon Kendra FAQ document to the S3 bucket mapped to the Amazon Kendra index.
  2. The Amazon S3 PutObject event triggers the Lambda function, which reads the event details.
  3. The Lambda function creates a new version of the FAQ for the target index for each uploaded document and deletes the older versions of the FAQ (see the sketch after this list).
  4. The Lambda function then publishes a message to Amazon Simple Notification Service (Amazon SNS), which sends an email to the user notifying them that the FAQ has been successfully updated.
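Steps 2 through 4 can be implemented in a single Lambda handler. The following is a minimal sketch using boto3; the environment variable names, the FAQ naming scheme and the polling loop are assumptions based on this walkthrough rather than the exact code in the GitHub repository:

import os
import time
import boto3

kendra = boto3.client("kendra")
sns = boto3.client("sns")

def lambda_handler(event, context):
    # Step 2: read the S3 PutObject event details
    record = event["Records"][0]
    bucket = record["s3"]["bucket"]["name"]
    key = record["s3"]["object"]["key"]          # e.g. "faq-<index-id>/demo.json"
    folder, file_name = key.split("/", 1)
    index_id = folder.replace("faq-", "", 1)     # the target index ID is the folder suffix
    faq_name = f"{file_name.replace('.', '-')}-faq"
    # The file format is derived from the file name (header_ prefix marks a CSV with a header)
    if file_name.lower().endswith(".json"):
        file_format = "JSON"
    elif file_name.lower().startswith("header_"):
        file_format = "CSV_WITH_HEADER"
    else:
        file_format = "CSV"
    # Step 3: create the new FAQ version, wait until it is active, then delete older versions
    new_faq_id = kendra.create_faq(
        IndexId=index_id,
        Name=faq_name,
        S3Path={"Bucket": bucket, "Key": key},
        RoleArn=os.environ["KENDRA_FAQ_ROLE_ARN"],   # assumed environment variable
        FileFormat=file_format,
    )["Id"]
    while kendra.describe_faq(Id=new_faq_id, IndexId=index_id)["Status"] == "CREATING":
        time.sleep(15)   # keep the old FAQ searchable until the new one is ready
    for faq in kendra.list_faqs(IndexId=index_id)["FaqSummaryItems"]:
        if faq["Name"] == faq_name and faq["Id"] != new_faq_id:
            kendra.delete_faq(Id=faq["Id"], IndexId=index_id)
    # Step 4: notify subscribers that the FAQ has been refreshed
    sns.publish(
        TopicArn=os.environ["SNS_TOPIC_ARN"],        # assumed environment variable
        Subject="Amazon Kendra FAQ updated",
        Message=f"FAQ {faq_name} was refreshed from s3://{bucket}/{key}",
    )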

Prerequisites

Before you begin the walkthrough, you need an AWS account (if you don’t have one, you can sign up for one). You also need to create the files containing the sample FAQs:

  • basic.csv – The following code is the sample FAQ CSV template:
    How many free clinics are in Spokane WA?, 13, https://www.freeclinics.com/
    How many free clinics are there in Mountain View Missouri?, 7, https://www.freeclinics.com/

  • demo.json – The following code is the sample FAQ JSON template:
    {
      "SchemaVersion": 1,
      "FaqDocuments": [
        {
          "Question": "How many free clinics are in Spokane WA?",
          "Answer": "13"
        },
        {
          "Question": "How many free clinics are there in Mountain View Missouri?",
          "Answer": "7",
          "Attributes": {
            "_source_uri": "https://www.freeclinics.com",
            "_category": "Charitable Clinics"
          }
        }
      ]
    }

  • header_demo.csv – The following code is the sample FAQ CSV template with header:
    _question,_answer,_last_updated_at
    How many free clinics are in Spokane WA?, 13, 2012-03-25T12:30:10+01:00
    How many free clinics are there in Mountain View Missouri?, 7, 2012-03-25T12:30:10+01:00

Deploy the solution

The CloudFormation templates that create the resources used by this solution can be found in the GitHub repository. Follow the instructions in the repository to deploy the solution. AWS CloudFormation creates the following resources in your account:

  • An S3 bucket that will be the source for the Amazon Kendra FAQ.
  • An Amazon Kendra index.
  • An AWS Identity and Access Management (IAM) role for the Amazon Kendra FAQ to read (GetObject) from the S3 bucket.
  • A Lambda function that is configured to get triggered by an Amazon S3 event. The function is created outside of an Amazon VPC.

Note that resource creation can take approximately 30 minutes.

After you run the deployment, you’ll receive an email prompting you to confirm the subscription at the approver email address. Choose Confirm subscription.

Amazon SNS subscription Email

You’re redirected to a page confirming your subscription.

SNS Subscription Confirmation

Verify that the Amazon Kendra index is listed on the Amazon Kendra console. In this post, we named the Amazon Kendra index sample-kendra-index.

Amazon Kendra index as seen from the Amazon Kendra console

Upload a sample FAQ document to Amazon S3

In the previous step, you successfully deployed the CloudFormation stack. We use the output of the stack in the following steps:

  1. On the Outputs tab of the CloudFormation stack, note the values for S3Bucket (kendra-faq-<random-stack-id>) and KendraIndex.
    AWS CloudFormation Output
  2. On the Amazon S3 console, navigate to the S3 bucket created from the CloudFormation stack.
  3. Choose Create folder and create a folder called faq-<index-id>. For index-id, use the value you noted for the CloudFormation output KendraIndex. After the folder is created, this becomes the prefix for the sample-kendra-index FAQ.
    Create S3 folder prefixed with faq
  4. Upload the demo.json FAQ document to that folder.
    Upload the demo.json FAQ document in that folder

Verify that the index FAQ is created

To confirm that the index FAQ is created, complete the following steps:

  1. On the Amazon Kendra console, navigate to the index sample-kendra-index, which was created as part of the deployment.
  2. Navigate to the FAQs page for this index to check if an FAQ is listed.

The FAQ has the naming convention <file-name>-faq-<Date-Time>.

Resulting FAQ created by the automation solution

When the FAQ is successfully created, you will receive another email informing you about it. You may upload new versions of the FAQ after you have received this email.

Receiving email for successful FAQ creation

Note that the automation determines the file format to use when creating the FAQ from the uploaded file’s extension, with one exception: a CSV document that contains a header row must have the header_ prefix in its file name. The target Amazon Kendra index is identified by the S3 bucket folder name, which has the index ID as its suffix; for example, faq-1f01abb8-341c-4921-ad16-139ee517a845.

Upload additional FAQ documents

Amazon Kendra FAQ supports three types of file format: CSV, CSV_WITH_HEADER, and JSON. When you upload a CSV file that contains a header row, make sure its file name has the header_ prefix (this applies only to the CSV format with a header). To upload your FAQ documents, complete the following steps:

  1. Upload the header_demo.csv file to the same folder.
    Upload the header_demo.csv FAQ document in that folder
  2. Verify that the FAQ is created on the Amazon Kendra console.
    Verify that the FAQ is created

FAQ creation is case-sensitive to the file format of the FAQ document that you upload. For example, if you upload demo.json and demo.JSON, both are treated as unique objects in Amazon S3. Therefore, this action creates two FAQs, such as demo-json-faq-22-09-2022-20-09-11 and demo-JSON-faq-22-09-2022-20-09-11.

  1. Upload demo.JSON.
    demo.json and demo.JSON are uploaded to the S3 bucket
  2. Verify that the FAQ for demo.JSON is created on the Amazon Kendra console.
    Case sensitive file names result in 2 new FAQs created

Create a new version of the index FAQ

The solution is now self-sufficient and runs automatically whenever you upload a new version of an FAQ document to Amazon S3.

To test this, upload a new updated version of your demo.json FAQ document to the faq-<index-id> folder. When you navigate to the FAQ for the index, there will be an FAQ named <file-name>-faq-<Date-Time>.

The solution creates a new version of the FAQ for the new version of the FAQ document uploaded to Amazon S3. Once the new FAQ is active, it deletes the older version of the FAQ for the same document.

Verify that only the latest version of the FAQ exists in the index

Create an FAQ with a description

This solution also supports creating an FAQ with a description when files are named in a specific manner: <document_name>-desc-<your faq description>.fileformat[json|csv]. For example, demo-desc-hello world.json. Upload this FAQ document to the faq-<index-id> folder.

Upload the file with the description in its name to S3

After you upload the document, the FAQ will be created and it will have the description as mentioned in the file name.

FAQ created with description

You should only use -desc- when you must add a description to an FAQ. If you upload a file with the same document_name prefix, it will delete the old FAQ created from the document_name.fileformat FAQ document and create a new FAQ with the description.
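As an illustration, the -desc- naming convention could be handled with a small parsing helper along these lines (the function name and exact behavior are illustrative, not taken from the solution’s code):

import os
import re

def parse_faq_name(file_name):
    # Split an uploaded file name into an FAQ name and an optional description,
    # e.g. "demo-desc-hello world.json" -> ("demo", "hello world")
    #      "demo.json"                  -> ("demo", None)
    stem, _extension = os.path.splitext(file_name)
    match = re.match(r"(?P<name>.+?)-desc-(?P<description>.+)$", stem)
    if match:
        return match.group("name"), match.group("description")
    return stem, None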

Clean up

To clean up, perform the following actions:

  1. Empty the S3 bucket that was created by the CloudFormation stack to store the FAQ documents. For instructions, refer to Emptying a bucket.
  2. Delete the CloudFormation stack. For instructions, refer to Deleting a stack on the AWS CloudFormation console.

Conclusion

In this post, we introduced an automated way to manage your Amazon Kendra FAQs. After implementing this solution, you should be able to create and delete FAQs just by uploading them to an S3 bucket. This way, you save time by avoiding repetitive manual changes and troubleshooting inconsistent issues that are caused by unexpected operational incidents. You can also audit Amazon Kendra FAQs across your organization with confidence.

Do you have feedback about this post? Submit your comments in the comments section. You can also post questions on the AWS re:Post forum.


About the Author

Debojit is a DevOps consultant who specializes in helping customers deliver secure and reliable solutions using AWS services. He concentrates on infrastructure development and building serverless solutions with AWS and DevOps. Apart from work, Debojit enjoys watching movies and spending time with his family.

Glenn is a Cloud Architect at AWS. He utilizes technology to help customers deliver on their desired outcomes in their cloud adoption journey. His current focus is DevOps and developing open-source software.

Shalabh is a Senior Consultant based in London. His main focus is helping companies deliver secure, reliable, and fast solutions using AWS services. He gets very excited about customers innovating with AWS and DevOps. Outside of work, Shalabh is a cricket fan and a passionate singer.

Boost your forecast accuracy with time series clustering

Time series are sequences of data points that occur in successive order over some period of time. We often analyze these data points to make better business decisions or gain competitive advantages. An example is Shimamura Music, who used Amazon Forecast to improve shortage rates and increase business efficiency. Another great example is Arneg, who used Forecast to predict maintenance needs.

AWS provides various services catered to time series data that are low code/no code, which both machine learning (ML) and non-ML practitioners can use for building ML solutions. These include libraries and services like AutoGluon, Amazon SageMaker Canvas, Amazon SageMaker Data Wrangler, Amazon SageMaker Autopilot, and Amazon Forecast.

In this post, we seek to separate a time series dataset into individual clusters that exhibit a higher degree of similarity between their data points and reduce noise. The purpose is to improve accuracy by either training a global model that contains the cluster configuration or training local models specific to each cluster.

We explore how to extract characteristics, also called features, from time series data using the TSFresh library—a Python package for computing a large number of time series characteristics—and perform clustering using the K-Means algorithm implemented in the scikit-learn library.

We use the Time Series Clustering using TSFresh + KMeans notebook, which is available on our GitHub repo. We recommend running this notebook on Amazon SageMaker Studio, a web-based, integrated development environment (IDE) for ML.

Solution overview

Clustering is an unsupervised ML technique that groups items together based on a distance metric. The Euclidean distance is most commonly used for non-sequential datasets. However, because a time series inherently has a sequence (timestamp), the Euclidean distance doesn’t work well when used directly on time series because it’s invariant to time shifts, ignoring the time dimension of data. For a more detailed explanation, refer to Time Series Classification and Clustering with Python. A better distance metric that works directly on time series is Dynamic Time Warping (DTW). For an example of clustering based on this metric, refer to Cluster time series data for use with Amazon Forecast.

In this post, we generate features from the time series dataset using the TSFresh Python library for data extraction. TSFresh is a library that calculates a large number of time series characteristics, which include the standard deviation, quantile, and Fourier entropy, among others. This allows us to remove the time dimensionality of the dataset and apply common techniques that work for data with flattened formats. In addition to TSFresh, we also use StandardScaler, which standardizes features by removing the mean and scaling to unit variance, and Principal component analysis (PCA) to perform dimensionality reduction. Scaling reduces the distance between data points, which in turn promotes stability in the model training process, and dimensionality reduction allows the model to learn from fewer features while retaining the major trends and patterns, thereby enabling more efficient training.

Data loading

For this example, we use the UCI Online Retail II Data Set and perform basic data cleansing and preparation steps as detailed in the Data Cleaning and Preparation notebook.

Feature extraction with TSFresh

Let’s start by using TSFresh to extract features from our time series dataset:

from tsfresh import extract_features
extracted_features = extract_features(
    df_final, 
    column_id="StockCode", 
    column_sort="timestamp")

Note that our data has been converted from a time series to a table of StockCode values vs. feature values.

feature table

Next, we drop all features with n/a values by utilizing the dropna method:

extracted_features_cleaned = extracted_features.dropna(axis=1)

Then we scale the features using StandardScaler. The values in the extracted features consist of both negative and positive values. Therefore, we use StandardScaler instead of MinMaxScaler:

from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
extracted_features_cleaned_std = scaler.fit_transform(extracted_features_cleaned)

We use PCA to do dimensionality reduction:

from sklearn.decomposition import PCA
pca = PCA()
pca.fit(extracted_features_cleaned_std)

And we determine the optimal number of components for PCA:

import numpy as np
import matplotlib.pyplot as plt

plt.figure(figsize=(20,10))
plt.grid()
plt.plot(np.cumsum(pca.explained_variance_ratio_))
plt.xlabel('number of components')
plt.ylabel('cumulative explained variance')

The explained variance ratio is the percentage of variance attributed to each of the selected components. Typically, you determine the number of components to include in your model by cumulatively adding the explained variance ratio of each component until you reach 0.8–0.9 to avoid overfitting. The optimal value usually occurs at the elbow.

As shown in the following chart, the elbow value is approximately 100. Therefore, we use 100 as the number of components for PCA.

PCA
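The PCA scores with the chosen number of components feed the clustering step that follows. A minimal sketch of that step, assuming the scores_pca variable name used in the next snippet:

pca = PCA(n_components=100)
scores_pca = pca.fit_transform(extracted_features_cleaned_std)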

Clustering with K-Means

Now let’s use K-Means with the Euclidean distance metric for clustering. In the following code snippet, we determine the optimal number of clusters. Adding more clusters decreases the inertia value, but it also decreases the information contained in each cluster. Additionally, more clusters means more local models to maintain. Therefore, we want a small number of clusters with a relatively low inertia value. The elbow heuristic works well for finding the optimal number of clusters.

from sklearn.cluster import KMeans

# Compute the within-cluster sum of squares (inertia) for 1-9 clusters
wcss = []
for i in range(1, 10):
    km = KMeans(n_clusters=i)
    km.fit(scores_pca)
    wcss.append(km.inertia_)
plt.figure(figsize=(20,10))
plt.grid()
plt.plot(range(1, 10), wcss, marker='o', linestyle='--')
plt.xlabel('number of clusters')
plt.ylabel('WCSS')

The following chart visualizes our findings.

Elbow

Based on this chart, we have decided to use two clusters for K-Means. We made this decision because the within-cluster sum of squares (WCSS) decreases at the highest rate between one and two clusters. It’s important to balance ease of maintenance with model performance and complexity, because although WCSS continues to decrease with more clusters, additional clusters increase the risk of overfitting. Furthermore, slight variations in the dataset can unexpectedly reduce accuracy.
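With the number of clusters chosen, the final model can be fit and the labels attached back to each StockCode. A brief sketch reusing the variables from the earlier snippets (the random_state value is an arbitrary choice for reproducibility):

km = KMeans(n_clusters=2, random_state=42)
cluster_labels = km.fit_predict(scores_pca)
extracted_features_cleaned["cluster"] = cluster_labels   # one label per StockCode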

It’s important to note that both clustering methods, K-Means with Euclidean distance (discussed in this post) and K-Means with DTW, have their strengths and weaknesses. The best approach depends on the nature of your data and the forecasting methods you’re using. Therefore, we highly recommend experimenting with both approaches and comparing their performance to gain a more holistic understanding of your data.

Conclusion

In this post, we discussed the powerful techniques of feature extraction and clustering for time series data. Specifically, we showed how to use TSFresh, a popular Python library for feature extraction, to preprocess your time series data and obtain meaningful features.

When the clustering step is complete, you can train multiple Forecast models for each cluster, or use the cluster configuration as a feature. Refer to the Amazon Forecast Developer Guide for information about data ingestion, predictor training, and generating forecasts. If you have item metadata and related time series data, you can also include these as input datasets for training in Forecast. For more information, refer to Start your successful journey with time series forecasting with Amazon Forecast.


About the Authors

Aleksandr Patrushev is an AI/ML Specialist Solutions Architect at AWS, based in Luxembourg. He is passionate about the cloud and machine learning, and the way they could change the world. Outside work, he enjoys hiking, sports, and spending time with his family.

Chong En Lim is a Solutions Architect at AWS. He is always exploring ways to help customers innovate and improve their workflows. In his free time, he loves watching anime and listening to music.

Egor Miasnikov is a Solutions Architect at AWS based in Germany. He is passionate about the digital transformation of our lives, businesses, and the world itself, as well as the role of artificial intelligence in this transformation. Outside of work, he enjoys reading adventure books, hiking, and spending time with his family.

Celebrating Google Summer of Code Responsible AI Projects

Posted by Bhaktipriya Radharapu, Software Engineer, Google Research

One of the key goals of Responsible AI is to develop software ethically and in a way that is responsive to the needs of society and takes into account the diverse viewpoints of users. Open source software helps address this by providing a way for a wide range of stakeholders to contribute.

To continue making Responsible AI development more inclusive and transparent, and in line with our AI Principles, Google’s Responsible AI team partnered with Google Summer of Code (GSoC) to provide students and professionals with the opportunity to contribute to open source projects that promote Responsible AI resources and practices. GSoC is a global, online program focused on bringing new contributors into open source software development. GSoC contributors work with an open source organization on a 12+ week programming project under the guidance of mentors. By bringing in new contributors and ideas, we saw that GSoC helped to foster a more innovative and creative environment for Responsible AI development.

This was also the first time several of Google’s Responsible AI tools, such as The Learning Interpretability Tool (LIT), TensorFlow Model Remediation and Data Cards Playbook, pulled in contributions from third-party developers across the globe, bringing in diverse and new developers to join us in our journey for building Responsible AI for all.

We’re happy to share the work completed by GSoC participants, what they learned about working with state-of-the-art fairness and interpretability techniques, what we learned as mentors, and how rewarding Summer of Code was for each of us and for the Responsible AI community.

We had the opportunity to mentor four developers – Aryan Chaurasia, Taylor Lee, Anjishnu Mukherjee and Chris Schmitz. Aryan successfully implemented XAI tutorials for LIT under the mentorship of Ryan Mullins, a software engineer at Google. These showcase how LIT can be used to evaluate the performance of (multi-lingual) question-answering models and understand behavioral patterns in text-to-image generation models.

Anjishnu also implemented tutorials for LIT under the mentorship of Ryan Mullins. Anjishnu’s work influenced in-review research assessing professionals’ interpretability practices in production settings.

Chris, under the technical guidance of Jenny Hamer, a software engineer at Google, created two tutorials for TensorFlow Model Remediations’ experimental technique, Fair Data Reweighting. The tutorials help developers apply a fairness-enforcing data reweighting algorithm, a pre-processing bias remediation technique that is model architecture agnostic.

Finally, Taylor, under the guidance of Mahima Pushkarna, a senior UX designer at Google Research, and Andrew Zaldivar, a Responsible AI Developer Advocate at Google, designed the information architecture and user experience for activities from the Data Cards Playbook. This project translated a manual calculator that helps groups assess the reader-centricity of their Data Card templates into virtual experiences to foster rich discussion.

The participants learned a lot about working with state-of-the-art fairness and interpretability techniques. They also learned about the challenges of developing Responsible AI systems, and about the importance of considering the social implications of their work. What is also unique about GSoC is that it wasn’t just code and development – mentees were exposed to code-adjacent work such as design and technical writing skills that are essential for the success of software projects and critical for cutting-edge Responsible AI work, giving them a 360º view into the lifecycle of Responsible AI projects.

The program was open to participants from all over the world, and saw participation from 14 countries. We set up several community channels for participants and professionals to discuss Responsible AI topics and Google’s Responsible AI tools and offerings, which organically grew to 300+ members. The community engaged in various hands-on starter projects for GSoC in the areas of fairness, interpretability and transparency, and were guided by a team of 8 Google Research mentors and organizers.

We were able to underscore the importance of community and collaboration in open source software development, especially in a field like Responsible AI, which thrives on transparent, inclusive development. Overall, the Google Summer of Code program has been a valuable tool for democratizing the responsible development of AI technologies. By providing a platform for mentorship, and innovation, GSoC has helped us improve the quality of open source software and to guide developers with tools and techniques to build AI in a safe and responsible way.

We’d like to say a heartfelt thank you to all the participants, mentors, and organizers who made Summer of Code a success. We’re excited to see how our developer community continues to work on the future of Responsible AI, together.

We encourage you to check out Google’s Responsible AI toolkit and share what you have built with us by tagging #TFResponsibleAI on your social media posts, or share your work for the community spotlight program.

If you’re interested in participating in the Summer of Code with TensorFlow in 2023, you can find more information about our organization and suggested projects here.

Acknowledgements:

Mentors and Organizers:

Andrew Zaldivar, Mahima Pushkarna, Ryan Mullins, Jenny Hamer, Pranjal Awasthi, Tesh Goyal, Parker Barnes, Bhaktipriya Radharapu

Sponsors and champions:

Special thanks to Shivani Poddar, Amy Wang, Piyush Kumar, Donald Gonzalez, Nikhil Thorat, Daniel Smilkov, James Wexler, Stephanie Taylor, Thea Lamkin, Philip Nelson, Christina Greer, Kathy Meier-Hellstern and Marian Croak for enabling this work.

NVIDIA Honors Partners Helping Industries Harness AI to Transform Business

NVIDIA today recognized a dozen partners in the Americas for their work enabling customers to build and deploy AI applications across a broad range of industries.

NVIDIA Partner Network (NPN) Americas Partner of the Year awards were given out to companies in 13 categories covering AI, consulting, distribution, education, healthcare, integration, networking, the public sector, rising star, service delivery, software and the Canadian market. A new award category created this year recognizes growing AI adoption in retail, as leaders begin to introduce new AI-powered services addressing customer service, loss prevention and restocking analytics.

“NVIDIA’s commitment to driving innovation in AI has created new opportunities for partners to help customers leverage cutting-edge technology to reduce costs, grow opportunities and solve business challenges,” said Rob Enderle, president and principal analyst at the Enderle Group. “The winners of the 2023 NPN awards reflect a diverse group of AI business experts that have showcased deep knowledge in delivering transformative solutions to customers across a range of industries.”

The 2023 NPN award winners for the Americas are:

  • Arrow Electronics: Distribution Partner of the Year. Recognized for providing end-to-end NVIDIA AI technologies across a variety of industries, such as manufacturing, retail, healthcare and robotics, to help organizations drive accelerated computing and robotics strategies via on-prem, hybrid cloud and intelligent edge solutions, and through Arrow’s Autonomous Machines Center of Excellence.
  • Cambridge Computer: Higher Education Partner of the Year. Recognized for the third consecutive year for its continued focus on providing NVIDIA AI solutions to the education, life sciences and research computing sectors.
  • CDW: Software Partner of the Year. Recognized for deploying NVIDIA AI and visualization solutions to customers from a broad range of industries and adopting deep industry expertise for end-to-end customer support.
  • CDW Canada: Canadian Partner of the Year. Recognized for providing IT solutions that enable the nation’s leading vendors to offer customized solutions with NVIDIA technology, meeting the needs of each client.
  • Deloitte: Consulting Partner of the Year. Recognized for the third consecutive year for creating new AI markets for clients by expanding AI investments in solutions developed with NVIDIA across enterprise AI, as well as expanding into new offerings with generative AI and NVIDIA DGX Cloud.
  • FedData Technology Solutions: Rising Star Partner of the Year. Recognized for NVIDIA DGX-based design wins with key federal customers and emerging work with the NVIDIA Omniverse platform for building and operating metaverse applications.
  • Insight: Retail Partner of the Year. Recognized for its deep understanding of the industry, ecosystem partnerships and the ability to orchestrate best-in-class solutions to bring real-time speed and predictability to retailers, enabling intelligent stores, intelligent quick-service restaurants, intelligent supply chain and omni-channel management.
  • Lambda: Solution Integration Partner of the Year. Recognized for the third consecutive year for its commitment to providing end-to-end NVIDIA solutions, both on premises and in the cloud, across industries including higher education and research, the federal and public sectors, and healthcare and life sciences.
  • Mark III: Healthcare Partner of the Year. Recognized for its unique team and deep understanding of the NVIDIA portfolio, which provides academic medical centers, research institutions, healthcare systems and life sciences organizations with NVIDIA infrastructure, software and cloud technologies to build out AI, HPC and simulation Centers of Excellence.
  • Microway: Public Sector Partner of the Year. Recognized for its technical depth and engineering focus on servicing the public sector using technologies across the NVIDIA portfolio, including high performance computing and other specializations.
  • Quantiphi: Service Delivery Partner of the Year. Recognized for the second consecutive year for its commitment to driving adoption of NVIDIA products in areas like generative AI services with customized large language models, digital avatars, edge computing, medical imaging and data science, as well as its expertise in helping customers build and deploy AI solutions at scale.
  • World Wide Technology: AI Solution Provider of the Year. Recognized for its leadership in driving adoption of the NVIDIA portfolio of AI and accelerated computing solutions, as well as its continued investments in AI infrastructure for large language models, computer vision, Omniverse-based digital twins, and customer testing and labs in the WWT Advanced Technology Center.
  • World Wide Technology: Networking Partner of the Year. Recognized for its expertise driving NVIDIA high-performance networking solutions to support accelerated computing environments across multiple industries and AI solutions.

This year’s awards arrive as AI adoption is rapidly expanding across industries, unlocking new opportunities and accelerating discovery in healthcare, finance, business services and more. As AI models become more complex, the 2023 NPN Award winners are expert partners that can help enterprises develop and deploy AI in production using the infrastructure that best aligns with their operations.

Learn how to join the NPN, or find your local NPN partner.

Video Editor Patrick Stirling Invents Custom Effect for DaVinci Resolve Software

Editor’s note: This post is part of our weekly In the NVIDIA Studio series, which celebrates featured artists, offers creative tips and tricks, and demonstrates how NVIDIA Studio technology improves creative workflows. We’re also deep diving on new GeForce RTX 40 Series GPU features, technologies and resources, and how they dramatically accelerate content creation.

AI-powered technology in creative apps, once considered nice to have, is quickly becoming essential for aspiring and experienced content creators.

Video editor Patrick Stirling used the Magic Mask feature in Blackmagic Design’s DaVinci Resolve software to build a custom effect that creates textured animations of people, this week In the NVIDIA Studio.

“I wanted to use ‘Magic Mask’ to replace subjects with textured, simplified, cut-out versions of themselves,” said the artist. “This style is reminiscent of construction-paper creations that viewers might have played with in childhood, keeping the energy of a scene while also pulling attention away from any specific features of the subject.”

Stirling’s effect creates textured, animated characters.

Stirling’s original attempts to implement this effect were cut short due to the limitations of his six-year-old system. So Stirling built his first custom PC — equipped with a GeForce RTX 4080 GPU — to tackle the challenge. The difference was night and day, he said.

Stirling’s effect on full display in DaVinci Resolve.

“I was able to find and maintain a creative flow so much more easily when I didn’t feel like I was constantly running into a wall and waiting for my system to catch up,” said Stirling.

“While the raw power of RTX GPUs is incredible, the work NVIDIA does to improve working in DaVinci Resolve, specifically, is really impressive. It’s extremely reassuring to know that I have the power to build complex effects.” — Patrick Stirling

The AI-powered Magic Mask feature, which allows quick selection of objects and people in a scene, was accelerated by his RTX 4080 GPU, delivering up to a 2x increase in AI performance over the previous generation. “The GPU also provides the power the DaVinci Neural Engine needs for some of these really cool effects,” said Stirling.

Stirling opened a short clip within the RTX GPU-accelerated Fusion page in DaVinci Resolve, a node-based workflow with hundreds of 2D and 3D tools. Nodes are popular as they make video editing a completely procedural process — allowing for non-linear, non-destructive workflows.

He viewed edits in real time using two windows opened side by side, with original footage on the left and node modifications on the right.

Original footage and node-based modifications, side by side.

Stirling then drew blue lines to apply Magic Mask to each surface on the subject that he wanted to layer. As its name suggests, Magic Mask works like magic, but it’s not perfect. When the effect masked more than the extended jacket layer, Stirling drew a secondary red line to designate what not to capture in that area.

The suit-jacket layer is masked as intended.

He applied similar techniques to the dress shirt, hands, beard, hair and facial skin. The artist then added generic colored backgrounds with Background nodes on each layer to complete his 2D character.

Textures provide contrast to the scene.

Stirling used Merge nodes to combine background and foreground images. He deployed the Fast Noise node to create two types of textures for the 2D man and the real-life footage, providing more contrast for the visual.

Organizing nodes is important to this creative workflow.

Stirling then added a color corrector to tweak saturation, his RTX GPU accelerating the process. He completed his video editing by combining the Magic Mask effect and all remaining nodes — Background, Merge and Fast Noise.

“DaVinci Resolve and the GeForce RTX 4080 feel like a perfect fit,” said Stirling.

When it’s time to wrap up the project, Stirling can deploy the RTX 4080 GPU’s dual AV1 video encoders — which would cut export times in half.

Stirling encourages aspiring content creators to “stay curious” and “not ignore the value of connecting with other creative people.”

“Regularly being around people doing the same kind of work as you will constantly expose new methods and approaches for your own creative projects,” he said.

Video editor Patrick Stirling.

Check out Stirling’s YouTube channel for DaVinci Resolve tutorials.

Follow NVIDIA Studio on Instagram, Twitter and Facebook. Access tutorials on the Studio YouTube channel and get updates directly in your inbox by subscribing to the Studio newsletter.
