Getting started with the Amazon Kendra Google Drive connector

Amazon Kendra is a highly accurate and easy-to-use intelligent search service powered by machine learning (ML). To simplify the process of connecting data sources to your index, Amazon Kendra offers several native data source connectors to help get your documents easily ingested.

For many organizations, Google Drive is a core part of their productivity suite, and often contains important documents and presentations. In this post, we illustrate how you can use the Google Drive connector in Amazon Kendra to synchronize content between Google Drive and your Amazon Kendra index, making it searchable using Amazon Kendra’s intelligent search capabilities.

The Google Drive connector indexes documents stored in shared drives as well as documents stored in a user’s own drive (such as My Drive). By default, Amazon Kendra indexes all documents in your Google Drive, but it also provides the flexibility to exclude documents from the index based on certain criteria, including the ID of a shared drive, the document owner, the MIME type of the document, or the document path.

Prerequisites

The Amazon Kendra Google Drive connector supports Google Docs and Google Slides. We demonstrate how to search a Google Drive Workspace in Amazon Kendra using an AWS Whitepaper dataset.

First, we set up the necessary permissions within your Google Drive Workspace. We then illustrate how to create the Amazon Kendra Google Drive connector on the AWS Management Console, followed by creating the Amazon Kendra Google Drive connector via the (Python) API. Lastly, we perform some example search queries with Amazon Kendra after ingesting the AWS Whitepaper dataset.

Setting up an Amazon Kendra Google Drive connector includes the following steps:

  • Setting up a name and tags
  • Entering the credentials for your Google service account
  • Setting up a sync schedule
  • Configuring the index field mappings

Setting up the necessary permissions within your Google Drive Workspace includes the following steps:

  • Creating a Google Drive service account if one doesn’t exist
  • Configuring the Google Drive service account
  • Enabling the Admin and Google Drive APIs
  • Enabling the Google API scope

If you haven’t previously created a service account, see the section Creating a Google Drive service account in this post.

Creating a Google Drive data source on the Amazon Kendra console

Before you create your data source, you must create an Amazon Kendra index. For instructions, see the section Creating an Amazon Kendra index in Getting started with the Amazon Kendra SharePoint Online connector.

After you create your index, you can create a Google Drive data source.

  1. On the Amazon Kendra console, under Data management, choose Data sources.
  2. Choose Create data source.
  3. Under Google Drive, choose Add connector.

  4. For Data source name, enter a name (for example, MyGoogleDriveDataSource).
  5. Choose Next.

  6. In the Authentication section, you need information from the JSON document that was downloaded when you configured the service account. Make sure you include everything between the quotation marks for your private key.

The following screenshot shows what the JSON document looks like.
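
If you prefer to pull these values out programmatically, the following minimal sketch reads the downloaded key file with Python. The field names are the standard Google service account key fields; the file name is a placeholder.

import json

# Load the key file downloaded when you created the service account key
with open("my-service-account-key.json") as f:  # placeholder file name
    key = json.load(f)

print(key["client_email"])  # the service account email for the console form
print(key["private_key"])   # the private key; include everything between the quotation marks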

The following screenshot shows our configuration on the Authentication page.

  7. For IAM role, choose Create a new role to create a new AWS Identity and Access Management (IAM) role.
  8. For Role name, enter a name for your role.

  9. Choose Next.
  10. For Set sync scope, you can define which user accounts, shared drives, or file types to exclude. For this post, we don’t modify these settings.

  11. For Additional configuration, you can also include or exclude paths, files, or file types. For this post, I ingest everything I have on my Google Drive.

  12. In the Sync run schedule section, for Frequency, you can choose the frequency of data source synchronization: on demand, hourly, daily, weekly, monthly, or custom. For this post, I choose Run on demand.
  13. Choose Next.

  14. In the Field mapping section, you can define which file attributes you want to map into your index. For this post, I use the default field mapping.

The following table lists the available fields.

Google Drive Property Name    Suggested Amazon Kendra Field Name
createdTime                   _created_at
dataSize                      gd_data_size
displayUrl                    gd_source_url
fileExtension                 _file_type
id                            _document_id
mimeType                      gd_mime_type
modifiedTime                  _last_updated_at
name                          _document_title
owner                         gd_owner
version                       gd_version

The following screenshot shows our configuration.

  15. Choose Next.
  16. Review your settings and choose Create.
  17. After the data source is created, you can start the sync process by choosing Sync now.

Creating an Amazon Kendra Google Drive connector with Python

You can create a new Amazon Kendra index Google Drive connector and sync it by using the AWS SDK for Python (Boto3). Boto3 makes it easy to integrate your Python application, library, or script with AWS services, including Amazon Kendra.

IAM roles requirements and overview

To create an index using the AWS SDK, you need to have the policy AmazonKendraFullAccess attached to the role you’re using.

At a high level, Amazon Kendra requires the following:

  • IAM roles for indexes – Needed to write to Amazon CloudWatch Logs.
  • IAM roles for data sources – Needed when you use the CreateDataSource API. These roles require a specific set of permissions depending on the connector you use. For our use case, the role needs permissions to access the following:
    • AWS Secrets Manager, where the Google Drive credentials are stored.
    • The AWS Key Management Service (AWS KMS) customer master key (CMK) that Secrets Manager uses to decrypt the credentials.
    • The BatchPutDocument and BatchDeleteDocument operations to update the index.

For more information, see IAM access roles for Amazon Kendra.

For this solution, you also need the following:

  • An Amazon Kendra IAM role for CloudWatch
  • An Amazon Kendra IAM role for the Google Drive connector
  • Google Drive service account credentials stored on Secrets Manager

Creating an Amazon Kendra index

To create an index, use the following code:

import boto3
from botocore.exceptions import ClientError
import pprint
import time
 
kendra = boto3.client("kendra")
 
print("Creating an index")
 
description = "<YOUR INDEX DESCRIPTION>"
index_name = "<YOUR NEW INDEX NAME>"
role_arn = "<YOUR KENDRA ROLE WITH CLOUDWATCH PERMISSIONS ARN>"
 
try:
    index_response = kendra.create_index(
        Description = description,
        Name = index_name,
        RoleArn = role_arn,
        Edition = "DEVELOPER_EDITION",
        Tags=[
        {
            'Key': 'Project',
            'Value': 'Google Drive Test'
        } 
        ]
    )
 
    pprint.pprint(index_response)
 
    index_id = index_response['Id']
 
    print("Wait for Kendra to create the index.")
 
    while True:
        # Get index description
        index_description = kendra.describe_index(
            Id = index_id
        )
        # If status is not CREATING quit
        status = index_description["Status"]
        print("    Creating index. Status: "+status)
        if status != "CREATING":
            break
        time.sleep(60)
 
except ClientError as e:
    print("%s" % e)
 
print("Done creating index.")

While your index is being created, you get regular updates (every 60 seconds, per the time.sleep(60) call in the loop) until the process is complete. See the following code:

Creating an index
{'Id': '3311b507-bfef-4e2b-bde9-7c297b1fd13b',
 'ResponseMetadata': {'HTTPHeaders': {'content-length': '45',
                                      'content-type': 'application/x-amz-json-1.1',
                                      'date': 'Mon, 20 Jul 2020 19:58:19 GMT',
                                      'x-amzn-requestid': 'a148a4fc-7549-467e-b6ec-6f49512c1602'},
                      'HTTPStatusCode': 200,
                      'RequestId': 'a148a4fc-7549-467e-b6ec-6f49512c1602',
                      'RetryAttempts': 2}}
Wait for Kendra to create the index.
    Creating index. Status: CREATING
    Creating index. Status: CREATING
    Creating index. Status: CREATING
    Creating index. Status: CREATING
    Creating index. Status: ACTIVE
Done creating index.

When your index is ready, the response includes its ID (3311b507-bfef-4e2b-bde9-7c297b1fd13b in this example). Your index ID will be different from the ID in this post.
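
If you lose track of the ID later, you can look it up with the ListIndices API. A quick sketch:

# List existing indexes with their IDs and status
response = kendra.list_indices()
for index in response["IndexConfigurationSummaryItems"]:
    print(index["Name"], index["Id"], index["Status"])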

Providing the Google Drive service account credentials

You also need the secretsmanager:GetSecretValue permission for your secret stored in Secrets Manager.

If you need to create a new secret in Secrets Manager to store the Google service account credentials, make sure the role you use has permissions to create and tag secrets. See the following policy code:

{"Version": "2012-10-17","Statement": [{"Sid": "SecretsManagerWritePolicy","Effect": "Allow","Action": ["secretsmanager:UntagResource","secretsmanager:CreateSecret","secretsmanager:TagResource"],"Resource": "*"}]}

To create a secret on Secrets Manager, enter the following code:

secretsmanager = boto3.client('secretsmanager')

SecretName = "<YOUR_SECRETNAME>"
GoogleDriveCredentials = '{"clientAccount": "<YOUR SERVICE ACCOUNT EMAIL>", "adminAccount": "<YOUR GSUITE ADMINISTRATOR EMAIL>", "privateKey": "<YOUR SERVICE ACCOUNT PRIVATE KEY>"}'

try:
    create_secret_response = secretsmanager.create_secret(
        Name=SecretName,
        Description='Secret for a Google Drive data source connector',
        SecretString=GoogleDriveCredentials,
        Tags=[{'Key': 'Project', 'Value': 'Google Drive Test'}]
    )
    pprint.pprint(create_secret_response)
except ClientError as e:
    print('%s' % e)

If everything goes well, you get a response with your secret’s ARN:

{'ARN': <YOUR_SECRET_ARN>,
 'Name': '<YOUR_SECRETNAME>',
 'ResponseMetadata': {'HTTPHeaders': {'connection': 'keep-alive',
                                      'content-length': '161',
                                      'content-type': 'application/x-amz-json-1.1',
                                      'date': 'Wed, 25 Nov 2020 14:23:54 GMT',
                                      'x-amzn-requestid': 'a2f7af73-be54-4388-bc53-427b5f201b8f'},
                      'HTTPStatusCode': 200,
                      'RequestId': 'a2f7af73-be54-4388-bc53-427b5f201b8f',
                      'RetryAttempts': 0},
 'VersionId': '90c1f8b7-6c26-4d42-ba4c-e1470b648c5c'}
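
To confirm the secret was stored as expected, you can read it back with GetSecretValue. This sketch assumes your role has the secretsmanager:GetSecretValue permission mentioned earlier; avoid printing secret values outside of a private test environment.

# Read the secret back to verify its contents
get_secret_response = secretsmanager.get_secret_value(SecretId=SecretName)
print(get_secret_response["ARN"])
pprint.pprint(get_secret_response["SecretString"])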

Creating the Amazon Kendra Google Drive data source

Your Amazon Kendra index is up and running, and you have established the attributes that you want to map to your Google Drive documents’ attributes.

You now need an IAM role with kendra:BatchPutDocument and kendra:BatchDeleteDocument permissions. For more information, see IAM access roles for Amazon Kendra. We use the ARN of this IAM role when invoking the CreateDataSource API.

Make sure the role you use for your data source connector has a trust relationship with Amazon Kendra. See the following code:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Service": "kendra.amazonaws.com"
            },
            "Action": "sts:AssumeRole"
        }
    ]
}
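
If you prefer to create this role programmatically, the following sketch uses Boto3 to create it with the preceding trust relationship; the role name is a placeholder, and you still need to attach the permissions policy shown next.

import json

iam = boto3.client("iam")

trust_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "kendra.amazonaws.com"},
            "Action": "sts:AssumeRole"
        }
    ]
}

# Create the data source role with the Amazon Kendra trust relationship
create_role_response = iam.create_role(
    RoleName="Kendra-Datasource",  # placeholder role name
    AssumeRolePolicyDocument=json.dumps(trust_policy)
)
print(create_role_response["Role"]["Arn"])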

The following code is the policy structure used:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "secretsmanager:GetSecretValue"
            ],
            "Resource": [
                "arn:aws:secretsmanager:<REGION>-<YOUR ACCOUNT NUMBER>:secret:<YOUR-SECRET-ID>"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "kms:Decrypt"
            ],
            "Resource": [
                "arn:aws:kms:<REGION>-<YOUR ACCOUNT NUMBER>:index/<YOUR-INDEX-ID>"
            ],
            "Condition": {
                "StringLike": {
                    "kms:ViaService": [
                        "secretsmanager.*.amazonaws.com"
                    ]
                }
            }
        },
        {
            "Effect": "Allow",
            "Action": [
                "kendra:BatchPutDocument",
                "kendra:BatchDeleteDocument"
            ],
            "Resource": "arn:aws:kendra:<REGION>-<YOUR ACCOUNT NUMBER>:index/<YOUR-INDEX-ID>"
        }
    ]
}

The following code is my role’s ARN:

arn:aws:iam::<YOUR ACCOUNT NUMBER>:role/Kendra-Datasource

Following the least privilege principle, we only allow our role to put and delete documents in our index and read the credentials of the Google service account.

When creating a data source, you can specify the sync schedule, which indicates how often your index syncs with the data source we create. This schedule is defined in the Schedule key of our request. You can use schedule expressions for rules to define how often you want to sync your data source. For this use case, the ScheduleExpression is 'cron(0 11 * * ? *)', which sets the data source to sync every day at 11:00 AM (UTC).
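
For reference, here are a few alternative expressions in the same cron syntax (times are in UTC); these are illustrative values, not requirements:

ScheduleExpression = 'cron(0 11 * * ? *)'     # every day at 11:00 AM
# ScheduleExpression = 'cron(0 0 ? * MON *)'  # every Monday at midnight
# ScheduleExpression = 'cron(0 0/6 * * ? *)'  # every 6 hours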

I use the following code. Make sure you use your own SecretArn, IndexId, and DSRoleArn values.

import boto3
from botocore.exceptions import ClientError
import pprint
import time

kendra = boto3.client("kendra")

print('Create a Google Drive data source')
 
SecretArn= "<YOUR-SECRET-ARN>"
DSName= "<YOUR-DATASOURCE-NAME>"
IndexId= "<YOUR-INDEX-ID>"
DSRoleArn= "<YOUR-DATASOURCE-ROLE-ARN>"
ScheduleExpression='cron(0 11 * * ? *)'

try:
    datasource_response = kendra.create_data_source(
    Name=DSName,
    IndexId=IndexId,        
    Type='GOOGLEDRIVE',
    Configuration={
        'GoogleDriveConfiguration': {
            'SecretArn': SecretArn,
        },
    },
    Description='My GoogleDrive Datasource',
    RoleArn=DSRoleArn,
    Schedule=ScheduleExpression,
    Tags=[
        {
            'Key': 'Project',
            'Value': 'GoogleDrive Test'
        }
    ]
    )
    pprint.pprint(datasource_response)
    print('Waiting for Kendra to create the DataSource.')
    datasource_id = datasource_response['Id']
    while True:
        # Get index description
        datasource_description = kendra.describe_data_source(
            Id=datasource_id,
            IndexId=IndexId
        )
        # If status is not CREATING quit
        status = datasource_description["Status"]
        print("    Creating index. Status: "+status)
        if status != "CREATING":
            break
        time.sleep(60)    

except ClientError as e:
    print('%s' % e)
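
The GoogleDriveConfiguration block also accepts the sync scope and field mapping options you saw on the console. The following sketch shows the shape of those optional keys with placeholder values; custom index fields such as gd_owner must already exist in your index before you map to them.

# Optional sync scope and field mapping settings (placeholder values)
configuration = {
    'GoogleDriveConfiguration': {
        'SecretArn': SecretArn,
        'ExcludeMimeTypes': ['application/pdf'],
        'ExcludeUserAccounts': ['user@example.com'],
        'ExcludeSharedDrives': ['<SHARED-DRIVE-ID>'],
        'ExclusionPatterns': ['*archive*'],
        'FieldMappings': [
            {
                'DataSourceFieldName': 'owner',
                'IndexFieldName': 'gd_owner'
            }
        ]
    }
}
# Pass this dictionary as the Configuration argument of create_data_source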

You should get a response like the following code:

{'Id': '<YOUR-DATA-SOURCE-ID>',
 'ResponseMetadata': {'HTTPHeaders': {'content-length': '45',
                                      'content-type': 'application/x-amz-json-1.1',
                                      'date': 'Wed, 02 Dec 2020 19:03:17 GMT',
                                      'x-amzn-requestid': '8d19fa35-adb6-41e2-92d6-0df2797707d8'},
                      'HTTPStatusCode': 200,
                      'RequestId': '8d19fa35-adb6-41e2-92d6-0df2797707d8',
                      'RetryAttempts': 0}}

Syncing the data source

Even though you defined a schedule for syncing the data source, you can sync on demand by using start_data_source_sync_job:

DSId = "<YOUR DATA SOURCE ID>"
IndexId = "<YOUR INDEX ID>"

try:
    ds_sync_response = kendra.start_data_source_sync_job(
        Id=DSId,
        IndexId=IndexId
    )
    pprint.pprint(ds_sync_response)
except ClientError as e:
    print('%s' % e)

You get a result similar to the following code:

{'ExecutionId': '99bdd945-fe1e-4401-a9d6-a0272ce2dae7',
 'ResponseMetadata': {'HTTPHeaders': {'content-length': '54',
                                      'content-type': 'application/x-amz-json-1.1',
                                      'date': 'Wed, 02 Dec 2020 19:12:25 GMT',
                                      'x-amzn-requestid': '68a05d7b-26bf-4821-ae43-1a491f4cf314'},
                      'HTTPStatusCode': 200,
                      'RequestId': '68a05d7b-26bf-4821-ae43-1a491f4cf314',
                      'RetryAttempts': 0}}
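
You can monitor the progress of a sync run with the ListDataSourceSyncJobs API; the following sketch prints the status of recent sync jobs:

# Check the status of recent sync jobs for this data source
sync_jobs = kendra.list_data_source_sync_jobs(
    Id=DSId,
    IndexId=IndexId
)
for job in sync_jobs["History"]:
    print(job["ExecutionId"], job["Status"])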

Testing

Now that you have ingested the AWS Whitepapers dataset into your Amazon Kendra index, you can test some queries. I submit each test query first into the built-in Google Drive search bar and then retry the search with Amazon Kendra.
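
Because the index was created with the API, you can also submit these test queries programmatically. The following is a minimal sketch using the Query API:

query_response = kendra.query(
    IndexId=IndexId,
    QueryText="What AWS service has 11 9s of durability?"
)
# Print the type and document title of each result
for result in query_response["ResultItems"]:
    print(result["Type"], result.get("DocumentTitle", {}).get("Text"))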

The first query I test is “What AWS service has 11 9s of durability?” The following screenshot shows the Google Drive output.

The following screenshot shows the query results in Amazon Kendra.

The next query is “How many pillars compose the well architected framework?” The following screenshot shows the response from Google Drive.

The following screenshot shows the results from Amazon Kendra.

The third query is “How can I get volume discounts?” The following screenshot shows the response from Google Drive.

The following screenshot shows the query results in Amazon Kendra.

The fourth query is “How can you control access to an RDS instance?” The following screenshot shows the Google Drive response.

The following screenshot shows the query results in Amazon Kendra.

Now let’s try something else. Instead of natural language search, let’s try the keyword search “volume discounts.” The following screenshot shows the Google Drive response.

The following screenshot shows the Amazon Kendra response.

Conclusion

Helping customers and employees find relevant information quickly increases workforce productivity and enhances overall customer experiences. In this post, we outlined how you can set up an Amazon Kendra Google Drive connector with Google Workspace, either on the Amazon Kendra console or via the AWS API.

To learn more about the Amazon Kendra Google Drive connector, see Amazon Kendra Google data source documentation, or you can explore other Amazon Kendra data source connectors by visiting the Amazon Kendra connector library. To get started with Amazon Kendra, visit the Amazon Kendra Essentials+ workshop for an interactive walkthrough.

Appendix

If you haven’t previously created a service account, complete the steps in this section before creating your Google Drive data source.

Creating a Google Drive service account

To ingest documents stored in Google Drive into your Amazon Kendra index, you need a Google Drive service account with sufficient permissions to access the documents stored within the Google Drive Workspace.

Follow these instructions:

  1. Log in to the Google Cloud Platform console with an account that has administrator privileges.
  2. On the menu, choose your project (for this post, MyFirstProject).

  1. Choose IAM & Admin and choose Service Accounts.

Choose IAM & Admin and choose Service Accounts.

  4. Choose CREATE SERVICE ACCOUNT.

  5. Enter a service account name and description.

The service account ID, an email address, is generated automatically.

  6. Choose Create.

  7. Skip steps 2 (Grant this service account access to project) and 3 (Grant users access to this service account).
  8. Choose Done to continue.

Configuring the Google Drive service account

Now that you have your service account created, it’s time to configure it.

  1. Choose the service account name you created.
  2. Choose Edit.

  3. On the service account page, choose SHOW DOMAIN-WIDE DELEGATION to view the available options.

  4. Select Enable G Suite Domain-wide Delegation.
  5. For Product name for the consent screen, enter a name.

  6. In the Keys section, choose ADD KEY and choose Create new key.

  7. For Key type, select JSON.
  8. Choose Create.

A JSON file containing the service account email address and private key is downloaded to your computer.

  9. Choose CLOSE.

  10. On the service account details page, take note of the account’s unique ID to use later.

Enabling the Admin and Google Drive APIs

You’re now ready to enable the Admin and Google Drive APIs.

  1. Choose APIs & Services and choose Library.

Choose APIs & Services and choose Library.

  2. Search for and choose Admin SDK.

  3. Choose Enable.

  1. Choose APIs & Services and choose Library.
  2. Search for and choose Google Drive API.

  6. Choose Enable.

Enabling Google API scopes

In this section, you configure the OAuth 2.0 scopes needed to access the Admin and Google Drive APIs required by the Amazon Kendra Google Drive connector.

  1. Log in to Google’s admin interface as your organization’s administrator user.
  2. Choose Security and choose API controls.

  3. Scroll down and choose MANAGE DOMAIN-WIDE DELEGATION in the Domain-wide delegation section.

  4. Choose Add new.

  5. For Client ID, enter the unique ID from your service account details.
  6. For OAuth scopes, enter the following code:
    https://www.googleapis.com/auth/drive.readonly,
    https://www.googleapis.com/auth/drive.metadata.readonly,
    https://www.googleapis.com/auth/admin.directory.user.readonly,
    https://www.googleapis.com/auth/admin.directory.group.readonly

  7. Choose Authorize.

After you create a service account and configure it to use the Google API, you can create a Google Drive data source.


About the Authors

Juan Pablo Bustos is an AI Services Specialist Solutions Architect at Amazon Web Services, based in Dallas, TX. Outside of work, he loves spending time writing and playing music as well as trying random restaurants with his family.

David Shute is a Senior ML GTM Specialist at Amazon Web Services focused on Amazon Kendra. When not working, he enjoys hiking and walking on a beach.

Read More

NVIDIA Chief Scientist Bill Dally to Keynote at GTC China

Bill Dally — one of the world’s foremost computer scientists and head of NVIDIA’s research efforts — will deliver the keynote address during GTC China, the latest event in the world’s premier conference series focused on AI, deep learning and high performance computing.

Registration is not required to view the keynote, which will take place on Dec. 14, at 6 p.m. Pacific time (Dec. 15, 10 a.m. China Standard time). GTC China is a free, online event, running Dec. 15-19.

Tens of thousands of attendees are expected to join the event, with thousands more tuning in to hear Dally speak on the latest innovations in AI, graphics, HPC, healthcare, edge computing and autonomous machines. He will also share new research in the areas of AI inference, silicon photonics, and GPU cluster acceleration.

In a career spanning nearly four decades, Dally has pioneered many of the fundamental technologies underlying today’s supercomputer and networking architectures. As head of NVIDIA Research, he leads a team of more than 200 around the globe who are inventing technologies for a wide variety of applications, including AI, HPC, graphics and networking.

Prior to joining NVIDIA as chief scientist and senior vice president of research in 2009, he chaired Stanford University’s computer science department.

Dally is a member of the National Academy of Engineering and a fellow of the American Academy of Arts & Sciences, the Institute of Electrical and Electronics Engineers and the Association for Computing Machinery. He’s written four textbooks, published more than 250 papers and holds over 120 patents, and has received the IEEE Seymour Cray Award, ACM Eckert-Mauchly Award and ACM Maurice Wilkes Award.

Following Dally’s keynote, four senior NVIDIA executives will describe how the company’s latest breakthroughs in AI, data science and healthcare are being adopted in China. The panel discussion will take place on Monday, Dec. 14, at 7:10 p.m. Pacific (Dec. 15 at 11:10 a.m. CST).

GTC China Highlights

GTC is the premier conference for developers to strengthen their skills on a wide range of technologies. It will include 220+ live and on-demand sessions and enable attendees to ask questions and interact with experts.

Many leading organizations will participate, including Alibaba, AWS, Baidu, ByteDance, China Telecom, Dell Technologies, Didi, Hewlett Packard Enterprise, Inspur, Kuaishou, Lenovo, Microsoft, Ping An, Tencent, Tsinghua University and Xiaomi.

Certified instructors will provide virtual training for hundreds of participants in the NVIDIA Deep Learning Institute. DLI seats are currently sold out.

NVIDIA Inception, an acceleration program for AI and data science startups, will host 12 leading Chinese startups in the NVIDIA Inception Startup Showcase. Attendees will have the opportunity to see presentations from the 12 CXOs, whose companies were selected by winning a vote among more than 40 participating startups.

For more details and to register for GTC China at no charge, visit www.nvidia.cn/gtc/.

The post NVIDIA Chief Scientist Bill Dally to Keynote at GTC China appeared first on The Official NVIDIA Blog.

Read More

Meet our TensorFlow AI Service Partners

Posted by Amy Hsueh, TensorFlow Partnerships Lead, and Sandeep Gupta, TensorFlow Product Manager

Implementing machine learning solutions can help businesses innovate, but it can be a challenge if companies don’t have the knowledge, experience, or resources in-house to get started.

That’s where our TensorFlow AI Service Partners may be able to help. We’ve selected AI/ML practitioners who have experience helping businesses implement AI/ML and TensorFlow-based solutions. We hope that these partners can help more enterprises benefit from AI-based systems and innovate faster, solve smarter, and scale bigger.

“Service partners are critical in driving large adoption of AI in the enterprise”, said Kemal El Moujahid, Director of Product Management for TensorFlow, “so we are excited to partner with some of the leading AI Service companies, who excel at building powerful solutions with TensorFlow. The breadth and diversity of business problems that these companies solve for their customers are incredible, and we are looking forward to seeing even more real-world impact with TensorFlow.”

Our selected partners are experienced in creating a range of consulting and software solutions powered by TensorFlow and other frameworks that span across the machine learning workflow, including preparing and ingesting data, training and optimizing models, and productionizing them.

TensorFlow AI Service Partners share their insights and product feedback with the TensorFlow team, helping us make enhancements and improvements that address enterprise ML needs.

“We are thrilled to be partnering with companies that have deep TensorFlow expertise and demonstrated track records in solving their customer’s business critical needs,” remarks Sarah Sirajuddin, Engineering Director of TensorFlow, “We value user feedback tremendously, and believe that the feedback and insights gathered from these partners will help us improve TensorFlow for all our users.”

You can browse our partners on our website; they range in geographic reach and industry specialization, and all have demonstrated expertise in TensorFlow. You can also hear from them directly on why they are excited about this program on their respective blogs: Determined AI, Labelbox, Paperspace, SpringML, Stradigi AI, Quantiphi.

We look forward to seeing this program grow and adding additional partners in the future. If you’re interested in becoming a partner, check out our application guide.

Read More

Majority Report: Experts Talk Future of AI and Its Impact on Global Industries

AI is the largest technology force of our time, with the most potential to transform industries. It will bring new intelligence to healthcare, education, automotive, retail and finance, creating trillions of dollars in a new AI economy.

As businesses look ahead to 2021 priorities, now’s a great time to look back at where the world stands on global AI adoption.

Retailers like Walmart and Tesco are mining new AI opportunities for product forecasting, supply chain management, intelligent store installations and predicting consumer buying trends. Healthcare players in the age of COVID are trying to speed scientific research and vaccine development.

Meantime, educators are employing AI to train a data-savvy workforce. And legions of businesses are examining how AI can help them adapt to remote work and distance collaboration.

Yet mainstream adoption of AI continues to skew toward big tech companies, automotive and retail, which are attempting to scale across their organizations instead of investing in skunkwork projects, according to a 2019 McKinsey global survey of about 2,000 organizations.

We asked some of the top experts at NVIDIA where they see the next big things in AI happening as companies parse big data and look for new revenue opportunities. NVIDIA works with thousands of AI-focused startups, ISVs, hardware vendors and cloud companies, as well as companies and research organizations around the world. These broad collaborations offer a bird’s eye view into what’s happening and where.

Here’s what our executives had to say:

CLEMENT FARABET
Vice President, NVIDIA AI Infrastructure

AI as a Compiler: As AI training algorithms get faster, more robust and with richer tooling, AI will become equivalent to a compiler — developers will organize their datasets as code, and use AI to compile them into models. The end state of this is a large ecosystem of tooling/platforms (just like today’s tools for regular software) to enable more and more non-experts to “program” AIs. We’re partially there, but I think the end state will look very different than where we are today — think compilation in seconds to minutes instead of days of training. And we’ll have very efficient tools to organize data, like we do for code via git today.

AI as a Driver: AI will be assisting most vehicles to move around the physical world and continuously learning from their environments and co-pilots (human drivers) to improve, on their way to becoming fully independent drivers. The value of this is there today and will only grow larger. The end state is commoditized level 4 autonomous vehicles, relying on cheap enough sensor platforms.

BRYAN CATANZARO
Vice President, NVIDIA Applied Deep Learning Research

Conversational AI: Chatbots might seem like so-last-decade when it comes to video games designed to take advantage of powerful PC graphics cards and CPUs in today’s computers. AI for some time has been used to generate responsive, adaptive or intelligent behaviors primarily in non-player characters. Conversational AI will take gameplay further by allowing real-time interaction via voice to flesh out character-driven approaches. When your in-game enemies start to talk and think like you, watch out.

Multimodal Synthesis: Can a virtual actor win an Academy Award? Advances in multimodal synthesis — the AI-driven art of creating speech and facial expressions from data — will be able to create characters that look and sound as real as a Meryl Streep or Dwayne Johnson.

Remote Work: AI solutions will make working from home easier and more reliable (and perhaps more pleasant) through better videoconferencing, audio quality and auto-transcription capabilities.

ANIMA ANANDKUMAR
Director of ML Research, NVIDIA, and Bren Professor at Caltech

Embodied AI: The mind and body will start coming together. We will see greater adaptivity and agility in our robots as we train them to do more complex and diverse tasks. The role of high fidelity simulations is critical here to overcome the dearth of real data.

AI4Science: AI will continue to get integrated into scientific applications at scale. Traditional solvers and pipelines will ultimately be completely replaced with AI to achieve as much as a 1,000x increase in speed. This will require combining deep learning with domain-specific knowledge and constraints.

ALISON LOWNDES
Artificial Intelligence, NVIDIA Developer Relations

Democratized AI: The more people who have access to the dataset, and who are trained in how to mine it, the more innovations that will emerge. Nations will begin to solidify AI strategies, while universities and colleges will work in partnership with private industry to create more end-user mobile applications and scientific breakthroughs.

Simulation AI: “What does (insert AI persona here) think?” AI-based simulation will increasingly mimic human intelligence, with the ability to reason, problem solve and make decisions. You’ll see increased use here for both AI research and design and engineering.

AI for Earth Observation (AI4EO): It may be a small world after all, but there’s still a lot we don’t know about Mother Earth. A global AI framework would process satellite data in orbit and on the ground for rapid, if not real-time, actionable knowledge. It could create new monitoring solutions, especially for climate change, disaster response and biodiversity loss.

KIMBERLY POWELL
Vice President & General Manager, NVIDIA Healthcare

Federated Learning: The clinical community will increase their use of federated learning approaches to build robust AI models across various institutions, geographies, patient demographics and medical scanners. The sensitivity and selectivity of these models are outperforming AI models built at a single institution, even when there is copious data to train with. As an added bonus, researchers can collaborate on AI model creation without sharing confidential patient information. Federated learning is also beneficial for building AI models for areas where data is scarce, such as for pediatrics and rare diseases.

AI-Driven Drug Discovery: The COVID-19 pandemic has put a spotlight on drug discovery, which encompasses microscopic viewing of molecules and proteins, sorting through millions of chemical structures, in-silico methods for screening, protein-ligand interactions, genomic analysis, and assimilating data from structured and unstructured sources. Drug development typically takes over 10 years; however, in the wake of COVID, pharmaceutical companies, biotechs and researchers realize that acceleration of traditional methods is paramount. Newly created AI-powered discovery labs with GPU-accelerated instruments and AI models will expedite time to insight — creating a computing time machine.

Smart Hospitals: The need for smart hospitals has never been more urgent. Similar to the experience at home, smart speakers and smart cameras help automate and inform activities. The technology, when used in hospitals, will help scale the work of nurses on the front lines, increase operational efficiency and provide virtual patient monitoring to predict and prevent adverse patient events.

CHARLIE BOYLE
Vice President & General Manager, NVIDIA DGX Systems

Shadow AI: Managing AI across an organization will be a hot-button internal issue if data science teams implement their own AI platforms and infrastructure without IT involvement. Avoiding shadow AI requires a centralized enterprise IT approach to infrastructure, tools and workflow, which ultimately enables faster, more successful deployments of AI applications.

AI Center of Excellence: Companies have scrambled over the past 10 years to snap up highly paid data scientists, yet their productivity has been lower than expected because of a lack of supportive infrastructure. More organizations will speed the investment return on AI by building centralized, shared infrastructure at supercomputing scale. This will facilitate the grooming and scaling of data science talent, the sharing of best practices and accelerate the solving of complex AI problems.

Hybrid Infrastructure: The Internet of Things will lead decision-makers to adopt a mixed AI approach, using the public cloud (AWS, Azure, Oracle Cloud, Google Cloud) and private clouds (on-premises servers) to deliver applications faster (with lower latency, in industry parlance) to customers and partners while maintaining security by limiting the amount of sensitive data shared across networks. Hybrid approaches will also become more popular as governments adopt strict data protection laws governing the use of personal information.

KEVIN DEIERLING
Senior Vice President, NVIDIA Networking

Accelerating Change in the Data Center: Security and management will be offloaded from CPUs into GPUs, SmartNICs and programmable data processing units to deliver expanded application acceleration to all enterprise workloads and provide an extra layer of security. Virtualization and scalability will be faster, while CPUs will run apps faster and offer accelerated services.

AI as a Service: Companies that are reluctant to spend time and resources investing in AI, whether for financial reasons or otherwise, will begin turning to third-party providers for experimentation. AI platform companies and startups will become key partners by providing access to software, infrastructure and potential partners.

Transformational 5G: Companies will begin defining what “the edge” is. Autonomous driving is essentially a data center in the car, allowing the AI to make instantaneous decisions, while also being able to report back for training. You’ll see the same thing with robots in the warehouse and the workplace, where there will be inference learning at the edge and training at the core. Just like 4G spawned transformational change in transportation with Lyft and Uber, 5G will bring transformational deals and capabilities. It won’t happen all at once, but you’ll start to see the beginnings of companies seeking to take advantage of the confluence of AI, 5G and new computing platforms.

SANJA FIDLER
Director of AI, NVIDIA, and Professor, Vector Institute for Artificial Intelligence

AI for 3D Content Creation: AI will revolutionize the content creation process, offering smart tools to reduce mundane work and to empower creativity. In particular, creating 3D content for architecture, gaming, films and VR/AR has been very laborious: games like Call of Duty take at least a year to make, even with hundreds of people involved and millions budgeted.

With AI, one will be able to build virtual cities by describing them in words, and see virtual characters come to life to converse and behave in desired ways without needing to hard code the behavior. Creating a 3D asset will become as easy as snapping a photo, and modernizing and restyling old games will happen with the click of a button.

AI for Robotics Simulation: Testing robots in simulated environments is key for safety-critical applications such as self-driving cars or operating robots. Deep learning will bring simulation to the next level, by learning to mimic the world from data, both in terms of creating 3D environments, simulating diverse behaviors, simulating and re-simulating new or observed road scenarios, and simulating the sensors in ways that are closer to reality.

An Opportunity for Reinvention

To accomplish any or all of these tasks, organizations will have to move more quickly for internal alignment. For example, 72 percent of big AI adopters in the McKinsey survey say their companies’ AI strategy aligns with their corporate strategy, compared with 29 percent of respondents from other companies. Similarly, 65 percent of the high performers report having a clear data strategy that supports and enables AI, compared with 20 percent from other companies.

Even as the global pandemic creates uncertainty around the world, 2021 will be a time of reinvention as players large and small leverage AI to improve on their business models. More companies will operationalize AI as early results prove promising enough to commit more resources to their efforts.

The post Majority Report: Experts Talk Future of AI and Its Impact on Global Industries appeared first on The Official NVIDIA Blog.

Read More

How Thomson Reuters accelerated research and development of natural language processing solutions with Amazon SageMaker

This post is co-written by John Duprey and Filippo Pompili from Thomson Reuters.

Thomson Reuters (TR) is one of the world’s most trusted providers of answers, helping professionals make confident decisions and run better businesses. Teams of experts from TR bring together information, innovation, and confident insights to unravel complex situations, and their worldwide network of journalists and editors keeps customers up to speed on global developments. TR has over 150 years of rich, human-annotated data on law, tax, news, and other segments. TR’s data is the crown jewel of the business. It’s one of the aspects that distinguishes TR from its competitors.

In 2018, a team of research scientists from the Center for AI and Cognitive Computing at TR started an experimental project at the forefront of natural language understanding. The project is based on the latest scientific discoveries that brought wide disruptions in the field of machine reading comprehension (MRC) and aims to develop technologies that you can use to solve numerous tasks, including text classification and natural language question answering.

In this post, we discuss how TR used Amazon SageMaker to accelerate their research and development efforts, and did so with significant cost savings and flexibility. We explain how the team experimented with many variants of BERT to produce a powerful question-answering capability. Lastly, we describe TR’s Secure Content Workspace (SCW), which provided the team with easy and secure access to Amazon SageMaker resources and TR proprietary data.

Customer challenge

The research and development team at TR needed to iterate quickly and securely. Team members already had significant expertise developing question-answering solutions, both via dedicated feature engineering for shallow algorithms and with featureless neural-based solutions. They played a key role in developing the technology powering Westlaw Edge (legal) and Checkpoint Edge (tax), two well-received products from TR. These projects each required 15–18 months of intense research and development efforts and have reached remarkable performance levels. For MRC, the research team decided to experiment with BERT and several of its variants on two sets of TR’s data, one from the legal domain and another from the tax domain.

The legal training corpus was composed of tens of thousands of editorially reviewed questions. Each question was compared against several potential answers in the form of short, on-point, text summaries. These summaries were highly curated editorial material that was extracted from legal cases across many decades—resulting in a candidate training set of several hundred thousand question-answer (QA) pairs, drawn from tens of millions of text summaries. The tax corpus, comprised of more than 60,000 editorially curated documents on US federal tax law, contained thousands of questions and tens of thousands of QA pairs.

Model pretraining and fine-tuning against these datasets would be impossible without state-of-art compute power. Procuring these compute resources typically required a big upfront investment with long lead times. For research ideas that might or might not become a product, it was hard to justify such a significant cost for experimentation.

Why AWS and Amazon SageMaker?

TR chose Amazon SageMaker as the machine learning (ML) service for this project. Amazon SageMaker is a fully managed service to build, train, tune, and deploy ML models at scale. One of the key factors in TR’s decision to choose Amazon SageMaker was the benefit of a managed service with pay-as-you-go billing. Amazon SageMaker lets TR decide how many experiments to run, and helps control the cost of training. More importantly, when a training job completes, the team is no longer charged for the GPU instances they were using. This resulted in substantial cost savings compared to managing their own training resources, which would have resulted in low server utilization. The research team could spin up as many instances as required and let the framework take care of shutting down long-running experiments when they were done. This enabled rapid prototyping at scale.

In addition, Amazon SageMaker has a built-in capability to use managed Spot Instances, which reduced the cost of training in some cases by more than 50%. For some large natural language processing (NLP) experiments using models like BERT on vast proprietary datasets, training time is measured in days, if not weeks, and the hardware involved is expensive GPUs. A single experiment can cost a few thousand dollars. Managed Spot Training with Amazon SageMaker helped TR reduce training costs by 40–50% on average. In comparison to self-managed training, Amazon SageMaker also comes with a full set of built-in security capabilities. This saved the team countless hours of coding that would have been necessary on a self-managed ML infrastructure.
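
To illustrate the mechanics (this is a sketch, not TR’s actual training code), enabling Managed Spot Training in the SageMaker Python SDK comes down to a few estimator parameters; the image URI, role, and bucket below are placeholders:

from sagemaker.estimator import Estimator

estimator = Estimator(
    image_uri="<YOUR-TRAINING-IMAGE-URI>",       # placeholder
    role="<YOUR-SAGEMAKER-EXECUTION-ROLE-ARN>",  # placeholder
    instance_count=1,
    instance_type="ml.p3.2xlarge",
    use_spot_instances=True,   # train on managed Spot capacity
    max_run=24 * 3600,         # maximum training time, in seconds
    max_wait=36 * 3600,        # total time including Spot waits; must be >= max_run
    checkpoint_s3_uri="s3://<YOUR-BUCKET>/checkpoints/",  # resume after interruptions
)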

After they launched the training jobs, TR could easily monitor them on the Amazon SageMaker console. The logging and hardware utilization metering facilities allowed the team to have a quick overview of their jobs’ status. For example, they could ensure the training loss was evolving as expected and see how well the allocated GPUs were utilized.

Amazon SageMaker provided TR easy access to state-of-the-art underlying GPU infrastructure without having to provision their own infrastructure or shoulder the burden of managing a set of servers, their security posture, and their patching levels. As faster and cheaper GPU instances become available going forward, TR can use them to reduce cost and training times with a simple configuration change to use the new type. On this project, the team was able to easily experiment with instances from the P2, P3, and G4 family based on their specific needs. AWS also gave TR a broad set of ML services, cost-effective pricing options, granular security controls, and technical support.

Solution overview

Customers operate in complex arenas that move society forward—law, tax, compliance, government, and media—and face increasing complexity as regulation and technology disrupts every industry. TR helps them reinvent the way they work. Using MRC, TR expects to offer natural language searches that outperform previous models that relied on manual feature engineering.

The BERT-based MRC models that the TR research team is developing run on text datasets exceeding several tens of GBs of compressed data. The deep learning frameworks of choice for TR are TensorFlow and PyTorch. The team uses GPU instances for time-consuming neural network training jobs, with runtimes ranging from tens of minutes to several days.

The MRC team has experimented with many variants of BERT, initially starting from the base model (12 layers of stacked transformer encoders and 12 attention heads, for roughly 100 million parameters) and moving up to the large model (24 layers, 16 heads, and 300 million parameters). The availability of V100 GPUs with the largest amount of 32 GB of RAM was instrumental in training the largest model variants. The team formulated the question-answering problem as a binary classification task. Each QA pair is graded by a pool of subject matter experts (SMEs), who assign one of four grades: A, C, D, and F, where A is a perfect answer and F a completely wrong one. The grades of each QA pair are converted to numbers, averaged across graders, and binarized.
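
As an illustration of that grading scheme (not TR’s actual mapping), converting SME letter grades into a binary label might look like the following sketch; the numeric values and the 0.5 threshold are assumptions:

# Hypothetical grade-to-number mapping; A is best, F is worst
GRADE_VALUES = {"A": 1.0, "C": 0.66, "D": 0.33, "F": 0.0}

def binarize(grades, threshold=0.5):
    """Average one QA pair's grades across graders, then binarize."""
    mean = sum(GRADE_VALUES[g] for g in grades) / len(grades)
    return 1 if mean >= threshold else 0

print(binarize(["A", "A", "C"]))  # 1: mostly good answers
print(binarize(["D", "F", "F"]))  # 0: mostly wrong answers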

Because each question-answering system is domain-specific, the research team used transfer learning and domain-adaptation techniques to enable this capability across different sub-domains (for example, law isn’t a single domain). TR used Amazon SageMaker for both language model pretraining and fine-tuning of their BERT models. When compared to the available on-premises hardware, the Amazon SageMaker P3 instance shrunk the training time from many hours to less than 1 hour for fine-tuning jobs. The pretraining of BERT on the domain-specific corpus was reduced from an estimated several weeks to only a few days. Without the dramatic time savings and cost savings provided by Amazon SageMaker, the TR research team would likely not have completed the extensive experimentation required for this project. With Amazon SageMaker, they made breakthroughs that drove key improvements to their applications, enabling faster and more accurate searches by their users.

For inference, TR used the Amazon SageMaker batch transform function for model scoring on vast amounts of test samples. When testing of model performance was satisfactory, Amazon SageMaker managed hosting enabled real-time inference. TR is taking the results of the research and development effort and moving it to production, where they expect to use Amazon SageMaker endpoints to handle millions of requests per day on highly specialized professional domains.

Secure, easy, and continuous access to the vast amounts of proprietary data

Protecting TR’s intellectual property is very important to the long-term success of the business. Because of this, TR has clear, ever-evolving standards around security and ways of working in the cloud that must be followed to protect their assets.

This raises some key questions for TR’s scientists. How can they create an instance of an Amazon SageMaker notebook (or launch a training job) that’s secure and compliant with TR’s standards? How can a scientist get secure access to TR’s data within Amazon SageMaker? TR needed to ensure scientists could do this consistently, securely, and with minimal effort.

Enter Secure Content Workspaces. SCW is a web-based tool developed by TR’s research and development team and answers these questions. The following diagram shows SCW in the context of TR’s research effort described earlier.

SCW enables secure and controlled access to TR’s data. It also provisions services, like Amazon SageMaker, in ways that are compliant with TR’s standards. With the help of SCW, scientists can work in the cloud with peace of mind knowing they comply with security protocols. SCW lets them focus on what they’re good at—solving hard problems with artificial intelligence (AI).

Conclusion

Thomson Reuters is fully committed to the research and development of state-of-the-art AI capabilities to aid their customers’ work. The MRC research was the latest in these endeavors. Initial results indicate broad applications across TR’s product line—especially for natural language question answering. Whereas past solutions involved extensive feature engineering and complex systems, this new research shows simpler ML solutions are possible. The entire scientific community is very active in this space, and TR is proud to be a part of it.

This research would not have been possible without the significant computational power offered by GPUs and the ability to scale it on demand. The Amazon SageMaker suite of capabilities provided TR with the raw horsepower and necessary frameworks to build, train, and host models for testing. TR built SCW to support cloud-based research and development, like MRC. SCW sets up scientists’ working environment in the cloud and ensures compliance with all of TR’s security standards and recommendations. It made using tools like Amazon SageMaker with TR’s data safe.

Moving forward, the TR research team is looking at introducing a much wider range of AI/ML features based on these powerful deep learning architectures, using Amazon SageMaker and SCW. Examples of such advanced capabilities include on-the-fly answer generation, long text summarization, and fully interactive, conversational, question answering. These capabilities will enable a comprehensive assistive AI system that can guide users toward the best solution for all their information needs.


About the Authors

Mark Roy is a Machine Learning Specialist Solutions Architect, helping customers on their journey to well-architected machine learning solutions at scale. In his spare time, Mark loves to play, coach, and follow basketball.

Qingwei Li is a Machine Learning Specialist at Amazon Web Services. He received his Ph.D. in Operations Research after he broke his advisor’s research grant account and failed to deliver the Nobel Prize he promised. Currently he helps customers in the financial service and insurance industry build machine learning solutions on AWS. In his spare time, he likes reading and teaching.

John Duprey is senior director of engineering for the Center for AI and Cognitive Computing (C3) at Thomson Reuters. John and the engineering team work alongside scientists and product technology teams to develop AI-based solutions to Thomson Reuters customers’ most challenging problems.

Filippo Pompili is Sr NLP Research Scientist at the Center for AI and Cognitive Computing (C3) at Thomson Reuters. Filippo has expertise in machine reading comprehension, information retrieval, and neural language modeling. He actively works on bringing state-of-the-art machine learning discoveries into Thomson Reuters’ most advanced products.

Read More

Faster Physics: How AI and NVIDIA A100 GPUs Automate Particle Physics

What are the fundamental laws that govern our universe? How did the matter in the universe today get there? What exactly is dark matter?

The questions may be eternal, but no human scientist has an eternity to answer them.

Now, thanks to NVIDIA technology and cutting-edge AI, the more than 1,000 collaborators from 26 countries working on the Belle II particle physics experiment are able to learn more about these big questions, faster.

The Belle II detector, based just north of Tokyo, reproduces the particles created during the early universe by smashing high-energy electrons and anti-electrons together.

These collisions generate a serious amount of data. Researchers will make high-precision recordings of hundreds of billions of collisions over the experiment’s lifetime. Sifting through all this data, without sacrificing the detailed information needed for high-precision measurements, is a daunting task.

To reconstruct the way individual particles, detected at Belle II, decayed from larger groups of particles, researchers turned to AI, says James Kahn from the Karlsruhe Institute of Technology, or KIT, a Belle II researcher and AI consultant with Helmholtz AI, a German public research platform for applied AI.

“Given the successes of AI and its ability to learn by example on large volumes of data, this is the perfect place to apply it,” Kahn said.

And to accelerate that AI, they’re using the NVIDIA Ampere architecture’s multi-instance GPU technology, built into the NVIDIA A100 GPU.

Physics Meets the A100

Kahn’s team was able to get early access to the “fresh out of the oven” NVIDIA DGX A100, a compact system packing 5 petaflops of AI computing power.

It’s among the first in Europe, and the first connected via InfiniBand high-speed interconnect technology. It was installed at KIT thanks to the high-performance computing operations team at the Steinbuch Center for Computing.

This close connection among the AI consultant team, international scientists and the HPC operations team will be a benefit for future research.

“We are really happy to see that only a few hours after we had the DGX A100 up and running, scientific analyses were already being performed,” said Jennifer Buchmüller, HPC core facility leader at KIT.

There’s more to come: HoreKa, the next supercomputer at KIT, will be equipped with more than 740 NVIDIA A100 GPUs.

A New View on Particle Decays

All of this helps Kahn and his team accelerate a new approach developed at KIT in collaboration with researchers from the nearby University of Strasbourg.

By designing a new representation of particle decays, or how unstable subatomic particles fall apart, Kahn’s team has been able to use a specialized neural network, known as a graph neural network, to automate the reconstruction of the particle decays from the individual particles detected by Belle II.

“We realized we could re-express particle decays in terms of the detected particles’ relations alone,” said Kahn. “This was the key ingredient to enable a full, end-to-end AI solution.”
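The post doesn’t detail the team’s architecture, so the following PyTorch sketch is only a generic illustration of the idea: detected particles become graph nodes, and a message-passing network scores candidate relations between them. The feature sizes, layer widths, and edge-scoring head here are assumptions, not the Belle II team’s actual model.

```python
import torch
import torch.nn as nn

class EdgeGNN(nn.Module):
    """Toy message-passing network over detected particles.

    Nodes carry per-particle features (e.g. momentum, energy); the network
    scores every candidate edge, i.e. whether two detected particles descend
    from a common ancestor in the decay tree.
    """
    def __init__(self, node_dim=8, hidden=64):
        super().__init__()
        self.encode = nn.Linear(node_dim, hidden)
        self.message = nn.Sequential(nn.Linear(2 * hidden, hidden), nn.ReLU())
        self.edge_score = nn.Linear(2 * hidden, 1)

    def forward(self, x, edge_index):
        h = torch.relu(self.encode(x))        # (N, hidden) node embeddings
        src, dst = edge_index                 # (2, E) candidate particle pairs
        m = self.message(torch.cat([h[src], h[dst]], dim=-1))
        agg = torch.zeros_like(h).index_add_(0, dst, m)  # sum incoming messages
        h = h + agg                           # one round of message passing
        return self.edge_score(torch.cat([h[src], h[dst]], dim=-1)).squeeze(-1)

# Usage: 5 detected particles with 8 toy features, all candidate pairs.
x = torch.randn(5, 8)
edge_index = torch.combinations(torch.arange(5)).T  # (2, 10)
scores = EdgeGNN()(x, edge_index)                   # one logit per candidate edge
```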

The team has already demonstrated this technique’s success on a selection of specially designed simulations of particle decays, and recently scaled up to simulations of the interactions occurring at Belle II.

Scaling up, however, required resources that could handle both the volume of data and the large neural networks trained on it.

To do so, they split up the GPUs using multi-instance GPU technology, which allows a single GPU to be partitioned into multiple isolated instances that run tasks simultaneously, and performed a spread-and-search over the network hyperparameters.

“Architecture searches which took days could now be completed in a matter of hours,” Kahn said.
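A spread-and-search like this maps naturally onto MIG: partition the A100 into isolated instances and pin one trial to each slice via CUDA_VISIBLE_DEVICES. The sketch below is illustrative only; the MIG UUIDs and the train.py entry point are placeholders, not part of the Belle II setup.

```python
import itertools
import os
import subprocess

# Placeholder MIG instance UUIDs; list the real ones with `nvidia-smi -L`
# after partitioning the GPU (e.g. `sudo nvidia-smi mig -cgi 19,19,19 -C`).
MIG_DEVICES = [
    "MIG-GPU-placeholder/1/0",
    "MIG-GPU-placeholder/2/0",
    "MIG-GPU-placeholder/3/0",
]

# Toy hyperparameter grid; train.py stands in for the actual training script.
GRID = list(itertools.product([2, 4, 8], [1e-3, 1e-4]))  # (layers, learning rate)

# Run one trial per MIG slice, a batch at a time.
for start in range(0, len(GRID), len(MIG_DEVICES)):
    batch = GRID[start:start + len(MIG_DEVICES)]
    procs = []
    for device, (layers, lr) in zip(MIG_DEVICES, batch):
        env = dict(os.environ, CUDA_VISIBLE_DEVICES=device)  # pin to one slice
        procs.append(subprocess.Popen(
            ["python", "train.py", f"--layers={layers}", f"--lr={lr}"],
            env=env))
    for p in procs:
        p.wait()  # finish the batch before launching the next
```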

The result: more time for more science, and for more of those eternal questions to be asked, and answered.


Read More

iGibson: A Simulation Environment to Train AI Agents in Large Realistic Scenes

iGibson: A Simulation Environment to Train AI Agents in Large Realistic Scenes

Why simulation for AI?

We are living in a Golden Age of simulation environments in AI and robotics. Looking back ten years, simulation environments were rare: only a handful were available, and they were complex and used only by experts. Today, there are many available simulation environments, and most papers in AI and robotics at first-tier conferences such as NeurIPS, CoRL, or even ICRA and IROS make some use of them. What has changed?

This extensive use of simulation environments is the result of several trends:

  • First, the increasing role of machine learning in robotics creates a demand for more data (for example, interactive experiences) than what can be generated in real time [1, 2, 3, 4]. Also, the initial data collection process often involves random exploration that may be dangerous for physical robots or their surroundings.
  • Second, simulation environments have matured to be more robust, realistic (visually and physically), user-friendly and accessible to all types of users, and the necessary computation to simulate complex physics is reasonably fast on most modern machines. Therefore, simulation environments have the potential to lower the barrier to entry in robotics, even for researchers without the funds to acquire expensive real robot platforms.
  • Finally, the increasing number of robotic solutions to tasks such as grasping, navigation or manipulation have brought more attention to a critical absence in our community: the lack of repeatable benchmarks. Mature sciences are based on experiments that can be easily and reliably replicated, so that different techniques, theories, and solutions can be compared in fair conditions. Simulation environments can help us to establish repeatable benchmarks, which is very difficult to achieve with real robots, which can in turn help us understand the status of our field.

Why iGibson?

These ideas motivated us in the Stanford Vision and Learning Lab to develop a simulation environment that can serve as a “playground” to train and test interactive AI agents – an environment we call iGibson [5]. What makes iGibson special? To understand this, let’s first define what a simulation environment is and how it is different from a physics simulator. A physics simulator is an engine capable of computing the physical effect of actions on an environment (e.g. motion of bodies when a force is applied, or flow of liquid particles when being poured). There are many existing physics simulation engines. The best known in robotics are Bullet and its Python extension PyBullet, MuJoCo, Nvidia PhysX and Flex, UnrealEngine, DART, Unity, and ODE. Given a physical problem (objects, forces, particles, and physics parameters), these engines compute the temporal evolution of the system. On the other hand, a simulation environment is a framework that includes a physics simulator, a renderer of virtual signals, and a set of assets (i.e. models of scenes, objects, and robots) that can be used to create simulations of problems to study and develop solutions for different tasks. The decision on which physics engine to use is based on the type of physical process that dominates the problem, for example rigid body physics or motion of fluids. To decide on which simulation environment to use, however, researchers are guided by the application domain they are interested in and the research questions they want to explore. With iGibson, we aim to support the study of interactive tasks in large realistic scenes, guided by high quality virtual visual signals.
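To make the distinction concrete, here is what the physics-simulator layer alone looks like, using PyBullet as an example: it advances physical state and nothing more. This is a small illustrative snippet, not iGibson code.

```python
import pybullet as p
import pybullet_data

# The engine's job: given bodies, forces, and parameters, step time forward.
p.connect(p.DIRECT)  # headless physics; no rendering, sensors, or tasks
p.setAdditionalSearchPath(pybullet_data.getDataPath())
p.setGravity(0, 0, -9.8)
p.loadURDF("plane.urdf")
cube = p.loadURDF("cube_small.urdf", basePosition=[0, 0, 1])

for _ in range(240):  # simulate one second at the default 240 Hz timestep
    p.stepSimulation()

pos, _ = p.getBasePositionAndOrientation(cube)
print("cube settled at:", pos)  # it has fallen onto the plane
p.disconnect()
# Everything else (rendering virtual sensors, scene and robot assets, task
# definitions) is what a simulation environment adds on top of this engine.
```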

Comparison to existing simulators

No existing simulation environment supports developing solutions for problems involving interactions in large-scale scenes like full houses. There are several simulation environments for tasks with stationary arms, such as meta-world, RLBench, RoboSuite or DoorGym, but none of them include large realistic scenes like homes with multiple rooms for tasks that include navigation. For navigation, our previous version, Gibson (v1), and Habitat have proven to be great environments that allow researchers to study visual and language guided navigation. However, the included assets (scenes) are single meshes that cannot change when interactions are applied, like opening doors or moving objects.

Finally, a set of recent simulation environments allow for scene-level interactive tasks, such as Sapien, AI2Thor and ThreeDWorld (TDW). Sapien focuses on interaction with articulated objects (doors, cabinets, and drawers). TDW is a multi-modal simulator with audio, high quality visuals, and simulation of flexible materials and liquids via Nvidia Flex. But neither Sapien nor TDW include fully interactive scenes aligned with real object distribution and layout as part of the environment. AI2Thor includes fully interactive scenes, but the interactions are scripted: interactable objects are annotated with the possible actions they can receive. When the agent is close enough to an object and the object is in the right state (precondition), the agent can select a predefined action, and the object is “transitioned” to the next state (postcondition). RoboThor, an alternative version of AI2Thor, enables continuous interactions but focuses on navigation. It provides limited sensory signals to the agent (only RGB-D images), which is always embodied as a LoCoBot, a low-cost platform with limited interaction capabilities. Here at SVL, we want to study complex, long-horizon mobile manipulation tasks such as tidying a house or searching for objects, which requires access to fully interactive, realistic, large-scale scenes.

iGibson’s new features

The main focus of iGibson is interactivity: enabling realistic interactions in large scenes. For that, we have included several key features (a minimal usage sketch follows the list):

  • Fifteen fully interactive visually realistic scenes representing real world homes with furniture and articulated object models annotated with materials and dynamics properties.
  • Capabilities to import models from CubiCasa5K [6] and 3D-Front [7], giving access to more than 8000 additional interactive home scenes.
  • Realistic virtual sensor signals, including high quality RGB images from a physics-based renderer, depth maps, 1-beam and 16-beam virtual LiDAR signals, semantic/instance/material segmentation, optical and scene flow, and surface normals.
  • Domain randomization for visual texture, dynamics properties and object instances for endless variations of scenes.
  • Human-computer interface for humans to provide demonstrations of fully physical interactions with the scenes.
  • Integration with sampling-based motion planners to facilitate motion of robotic bases (navigation in 2D layout) and arms (interaction in 3D space).
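Here is the promised usage sketch, assuming iGibson’s gym-style Python API; exact import paths and config file names vary across releases, so treat the names below as illustrative.

```python
# Illustrative only: module paths and the config file depend on the
# iGibson release you install.
from igibson.envs.igibson_env import iGibsonEnv

env = iGibsonEnv(config_file="turtlebot_nav.yaml", mode="headless")

obs = env.reset()  # obs bundles the virtual sensors: RGB, depth, LiDAR, ...
for _ in range(100):
    action = env.action_space.sample()          # random exploration
    obs, reward, done, info = env.step(action)  # standard gym-style step
    if done:
        obs = env.reset()
env.close()
```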



Using iGibson for robot learning

These novel features in iGibson allow us to study and develop solutions for new interactive tasks in large environments. One of these new problems is Interactive Navigation, where the agents need to interact with the environment to change its configuration, for example, to open doors or push obstacles away. This is a common type of navigation in our homes and offices, but non-interactive simulation environments cannot be used to study it. In iGibson we have developed hierarchical reinforcement learning solutions for interactive navigation that decide explicitly what part of the body to use in the next phase of the task: the arm (for interactions), the base (for navigation), or the combination of both [8]. We also propose a new learning solution for interactive navigation that integrates a motion planner: the learning algorithm decides on the next point to interact with, and the motion planner finds a collision-free path to that point of interaction [9]. But this is just the tip of the iceberg: many of SVL’s projects are leveraging iGibson to study a wide variety of interactive tasks in large realistic scenes.
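Schematically, the planner-in-the-loop idea looks like the sketch below: the learned policy chooses where to interact, and a motion planner works out a collision-free way to get there. All names are illustrative stand-ins; this is not the implementation from [9].

```python
# Schematic sketch of learning + motion planning for interactive navigation.
# `policy` and `motion_planner` are hypothetical stand-ins.
def rollout(env, policy, motion_planner, max_subgoals=50):
    obs = env.reset()
    total_reward = 0.0
    for _ in range(max_subgoals):
        subgoal = policy(obs)                     # learned: where to interact next
        path = motion_planner.plan(obs, subgoal)  # classic: collision-free path
        if path is None:
            continue                              # unreachable; let the policy retry
        for waypoint in path:                     # execute the planned motion
            obs, reward, done, _ = env.step(waypoint)
            total_reward += reward
            if done:
                return total_reward
    return total_reward
```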


Summary

Simulation environments have the potential to support researchers in their study of robotics and embodied AI problems. With iGibson, SVL contributes to the community an open source, fully academically developed simulation environment for interactive tasks in large realistic scenes. If you want to start using it, visit our website and download it – setup should be straightforward, and we’re happy to answer any questions about getting the simulator up and running for your research! We hope we can facilitate new avenues of research in robotics and AI.

  1. Andrychowicz, Marcin, et al. “Learning dexterous in-hand manipulation.” The International Journal of Robotics Research 39.1 (2020): 3-20. 

  2. Rajeswaran, Aravind, et al. “Learning complex dexterous manipulation with deep reinforcement learning and demonstrations.” Robotics: Science and Systems, 2017 

  3. Peng, Xue Bin, et al. “SFV: Reinforcement learning of physical skills from videos.” ACM Transactions on Graphics (TOG) 37.6 (2018): 1-14. 

  4. Zhu, Yuke, et al. “robosuite: A modular simulation framework and benchmark for robot learning.” arXiv preprint arXiv:2009.12293 (2020). 

  5. A note on Gibson – Our simulation environment takes its name from James J. Gibson [1904-1979]. Gibson was an influential psychologist and cognitive scientist with, at the time, disruptive ideas. He pushed forward a new concept of perception as 1) an ecological process that cannot and should not be studied in isolation from the environment, and 2) an active process that needs agency and interactivity. This was in contrast to the then-predominant view of perception as a passive process in which signals “arrive” and “are processed” by the brain. Instead, he argued that agents seek information, interacting with the environment to reveal it. He also coined the term “affordance” as the opportunity the environment offers to an agent to perform a task. A quote from a colleague summarizing his research directly connects to the guiding principle behind our work in the iGibson team: “ask not what’s inside your head, but what your head is inside of”. 

  6. Kalervo, Ahti, et al. “Cubicasa5k: A dataset and an improved multi-task model for floorplan image analysis.” Scandinavian Conference on Image Analysis. Springer, Cham, 2019. 

  7. Fu, Huan, et al. “3D-FRONT: 3D Furnished Rooms with layOuts and semaNTics.” arXiv preprint arXiv:2011.09127 (2020). 

  8. Li, Chengshu, et al. “Hrl4in: Hierarchical reinforcement learning for interactive navigation with mobile manipulators.” Conference on Robot Learning. PMLR, 2020. 

  9. Xia, Fei, et al. “Relmogen: Leveraging motion generation in reinforcement learning for mobile manipulation.” arXiv preprint arXiv:2008.07792 (2020). 

Read More

Google at NeurIPS 2020

Google at NeurIPS 2020

Posted by Jaqui Herman and Cat Armato, Program Managers

This week marks the beginning of the 34th annual Conference on Neural Information Processing Systems (NeurIPS 2020), the biggest machine learning conference of the year. Held virtually for the first time, this conference includes invited talks, demonstrations and presentations of some of the latest in machine learning research. As a Platinum Sponsor of NeurIPS 2020, Google will have a strong presence with more than 180 accepted papers, additionally contributing to and learning from the broader academic research community via talks, posters, workshops and tutorials.

If you are registered for NeurIPS 2020, we hope you’ll visit our virtual booth and chat with our researchers about the projects and opportunities at Google that go into solving the world’s most challenging research problems, and see demonstrations of some of the exciting research we pursue, such as Transformers for image recognition, Tone Transfer, large-scale distributed RL, recreating historical streetscapes and much more. You can also learn more about our work being presented in the list below (Google affiliations highlighted in blue).


Organizing Committees

General Chair: Hugo Larochelle

Workshop Co-Chair: Sanmi Koyejo

Diversity and Inclusion Chairs include: Katherine Heller

Expo Chair: Pablo Samuel Castro

Senior Area Chairs include: Corinna Cortes, Fei Sha, Mohammad Ghavamzadeh, Sanjiv Kumar, Charles Sutton, Dale Schuurmans, David Duvenaud, Elad Hazan, Marco Cuturi, Peter Bartlett, Samy Bengio, Tong Zhang, Claudio Gentile, Kevin Murphy, Cordelia Schmid, Amir Globerson

Area Chairs include: Boqing Gong, Afshin Rostamizadeh, Alex Kulesza, Branislav Kveton, Craig Boutilier, Heinrich Jiang, Manzil Zaheer, Silvio Lattanzi, Slav Petrov, Srinadh Bhojanapalli, Rodolphe Jenatton, Mathieu Blondel, Aleksandra Faust, Alexey Dosovitskiy, Ashish Vaswani, Augustus Odena, Balaji Lakshminarayanan, Ben Poole, Colin Raffel, Danny Tarlow, David Ha, Denny Zhou, Dumitru Erhan, Dustin Tran, George Tucker, Honglak Lee, Ilya Tolstikhin, Jasper Snoek, Jean-Philippe Vert, Jeffrey Pennington, Kevin Swersky, Matthew Johnson, Minmin Chen, Mohammad Norouzi, Moustapha Cisse, Naman Agarwal, Nicholas Carlini, Olivier Bachem, Tim Salimans, Vincent Dumoulin, Yann Dauphin, Andrew Dai, Izhak Shafran, Karthik Sridharan, Abhinav Gupta, Abhishek Kumar, Adam White, Aditya Menon, Kun Zhang, Ce Liu, Cristian Sminchisescu, Hossein Mobahi, Phillip Isola, Tomer Koren, Chelsea Finn, Amin Karbasi

NeurIPS 2020 Foundation Board includes: Michael Mozer, Samy Bengio, Corinna Cortes, Hugo Larochelle, John C. Platt, Fernando Pereira


Accepted Papers

Rankmax: An Adaptive Projection Alternative to the Softmax Function
Weiwei Kong*, Walid Krichene, Nicolas Mayoraz, Steffen Rendle, Li Zhang

Unsupervised Sound Separation Using Mixture Invariant Training
Scott Wisdom, Efthymios Tzinis*, Hakan Erdogan, Ron Weiss, Kevin Wilson, John Hershey

Learning to Select Best Forecast Tasks for Clinical Outcome Prediction
Yuan Xue, Nan Du, Anne Mottram, Martin Seneviratne, Andrew M. Dai

Interpretable Sequence Learning for Covid-19 Forecasting
Sercan O. Arık, Chun-Liang Li, Jinsung Yoon, Rajarishi Sinha, Arkady Epshteyn, Long T. Le, Vikas Menon, Shashank Singh, Leyou Zhang, Nate Yoder, Martin Nikoltchev, Yash Sonthalia, Hootan Nakhost, Elli Kanal, Tomas Pfister

Towards Learning Convolutions from Scratch
Behnam Neyshabur

Emergent Complexity and Zero-shot Transfer via Unsupervised Environment Design
Michael Dennis, Natasha Jaques, Eugene Vinitsky, Alexandre Bayen, Stuart Russell, Andrew Critch, Sergey Levine

Inverse Rational Control with Partially Observable Continuous Nonlinear Dynamics
Minhae Kwon, Saurabh Daptardar, Paul Schrater, Xaq Pitkow

Off-Policy Evaluation via the Regularized Lagrangian
Mengjiao Yang, Ofir Nachum, Bo Dai, Lihong Li, Dale Schuurmans

CoinDICE: Off-Policy Confidence Interval Estimation
Bo Dai, Ofir Nachum, Yinlam Chow, Lihong Li, Csaba Szepesvári, Dale Schuurmans

Unsupervised Data Augmentation for Consistency Training
Qizhe Xie, Zihang Dai, Eduard Hovy, Minh-Thang Luong, Quoc V. Le

VIME: Extending the Success of Self- and Semi-supervised Learning to Tabular Domain
Jinsung Yoon, Yao Zhang, James Jordon, Mihaela van der Schaar

Funnel-Transformer: Filtering out Sequential Redundancy for Efficient Language Processing
Zihang Dai, Guokun Lai, Yiming Yang, Quoc Le

Big Bird: Transformers for Longer Sequences
Manzil Zaheer, Guru Guruganesh, Avinava Dubey, Joshua Ainslie, Chris Alberti, Santiago Ontanon, Philip Pham, Anirudh Ravula, Qifan Wang, Li Yang, Amr Ahmed

Provably Efficient Neural Estimation of Structural Equation Models: An Adversarial Approach
Luofeng Liao, You-Lin Chen, Zhuoran Yang, Bo Dai, Zhaoran Wang, Mladen Kolar

Conservative Q-Learning for Offline Reinforcement Learning
Aviral Kumar, Aurick Zhou, George Tucker, Sergey Levine

MOReL: Model-Based Offline Reinforcement Learning
Rahul Kidambi, Aravind Rajeswaran, Praneeth Netrapalli, Thorsten Joachims

Maximum-Entropy Adversarial Data Augmentation for Improved Generalization and Robustness
Long Zhao, Ting Liu, Xi Peng, Dimitris Metaxas

Generative View Synthesis: From Single-view Semantics to Novel-view Images
Tewodros Habtegebrial, Varun Jampani, Orazio Gallo, Didier Stricker

PIE-NET: Parametric Inference of Point Cloud Edges
Xiaogang Wang, Yuelang Xu, Kai Xu, Andrea Tagliasacchi, Bin Zhou, Ali Mahdavi-Amiri, Hao Zhang

Enabling Certification of Verification-Agnostic Networks via Memory-Efficient Semidefinite Programming
Sumanth Dathathri, Krishnamurthy (Dj) Dvijotham, Alex Kurakin, Aditi Raghunathan, Jonathan Uesato, Rudy Bunel, Shreya Shankar, Jacob Steinhardt, Ian Goodfellow*, Percy Liang, Pushmeet Kohli

An Analysis of SVD for Deep Rotation Estimation
Jake Levinson, Carlos Esteves, Kefan Chen, Noah Snavely, Angjoo Kanazawa, Afshin Rostamizadeh, Ameesh Makadia

Direct Policy Gradients: Direct Optimization of Policies in Discrete Action Spaces
Guy Lorberbom, Chris J. Maddison, Nicolas Heess, Tamir Hazan, Daniel Tarlow

Faster Differentially Private Samplers via Rényi Divergence Analysis of Discretized Langevin MCMC
Arun Ganesh*, Kunal Talwar*

DISK: Learning Local Features with Policy Gradient
Michał J. Tyszkiewicz, Pascal Fua, Eduard Trulls

Robust Large-margin Learning in Hyperbolic Space
Melanie Weber*, Manzil Zaheer, Ankit Singh Rawat, Aditya Menon, Sanjiv Kumar

Gamma-Models: Generative Temporal Difference Learning for Infinite-Horizon Prediction
Michael Janner, Igor Mordatch, Sergey Levine

Adversarially Robust Streaming Algorithms via Differential Privacy
Avinatan Hassidim, Haim Kaplan, Yishay Mansour, Yossi Matias, Uri Stemmer

Faster DBSCAN via Subsampled Similarity Queries
Heinrich Jiang, Jennifer Jang, Jakub Łacki

Exact Recovery of Mangled Clusters with Same-Cluster Queries
Marco Bressan, Nicolò Cesa-Bianchi, Silvio Lattanzi, Andrea Paudice

A Maximum-Entropy Approach to Off-Policy Evaluation in Average-Reward MDPs
Nevena Lazic, Dong Yin, Mehrdad Farajtabar, Nir Levine, Dilan Görür, Chris Harris, Dale Schuurmans

Fairness in Streaming Submodular Maximization: Algorithms and Hardness
Marwa El Halabi, Slobodan Mitrović, Ashkan Norouzi-Fard, Jakab Tardos, Jakub Tarnawski

Efficient Active Learning of Sparse Halfspaces with Arbitrary Bounded Noise
Chicheng Zhang, Jie Shen, Pranjal Awasthi

Private Learning of Halfspaces: Simplifying the Construction and Reducing the Sample Complexity
Haim Kaplan, Yishay Mansour, Uri Stemmer, Eliad Tsfadia

Synthetic Data Generators — Sequential and Private
Olivier Bousquet, Roi Livni, Shay Moran

Learning Discrete Distributions: User vs Item-level Privacy
Yuhan Liu, Ananda Theertha Suresh, Felix Xinnan X. Yu, Sanjiv Kumar, Michael Riley

Learning Differential Equations that are Easy to Solve
Jacob Kelly, Jesse Bettencourt, Matthew J. Johnson, David K. Duvenaud

An Optimal Elimination Algorithm for Learning a Best Arm
Avinatan Hassidim, Ron Kupfer, Yaron Singer

The Convex Relaxation Barrier, Revisited: Tightened Single-Neuron Relaxations for Neural Network Verification
Christian Tjandraatmadja, Ross Anderson, Joey Huchette, Will Ma, Krunal Kishor Patel*, Juan Pablo Vielma

Escaping the Gravitational Pull of Softmax
Jincheng Mei, Chenjun Xiao, Bo Dai, Lihong Li*, Csaba Szepesvari, Dale Schuurmans

The Complexity of Adversarially Robust Proper Learning of Halfspaces with Agnostic Noise
Ilias Diakonikolas, Daniel M. Kane, Pasin Manurangsi

PAC-Bayes Learning Bounds for Sample-Dependent Priors
Pranjal Awasthi, Satyen Kale, Stefani Karp, Mehryar Mohri

Fictitious Play for Mean Field Games: Continuous Time Analysis and Applications
Sarah Perrin, Julien Perolat, Mathieu Lauriere, Matthieu Geist, Romuald Elie, Olivier Pietquin

What Do Neural Networks Learn When Trained With Random Labels?
Hartmut Maennel, Ibrahim M. Alabdulmohsin, Ilya O. Tolstikhin, Robert Baldock*, Olivier Bousquet, Sylvain Gelly, Daniel Keysers

Online Planning with Lookahead Policies
Yonathan Efroni, Mohammad Ghavamzadeh, Shie Mannor

Smoothly Bounding User Contributions in Differential Privacy
Alessandro Epasto, Mohammad Mahdian, Jieming Mao, Vahab Mirrokni, Lijie Ren

Differentially Private Clustering: Tight Approximation Ratios
Badih Ghazi, Ravi Kumar, Pasin Manurangsi

Hitting the High Notes: Subset Selection for Maximizing Expected Order Statistics
Aranyak Mehta, Uri Nadav, Alexandros Psomas*, Aviad Rubinstein

Myersonian Regression
Allen Liu, Renato Leme, Jon Schneider

Assisted Learning: A Framework for Multi-Organization Learning
Xun Xian, Xinran Wang, Jie Ding, Reza Ghanadan

Adversarial Robustness via Robust Low Rank Representations
Pranjal Awasthi, Himanshu Jain, Ankit Singh Rawat, Aravindan Vijayaraghavan

Multi-Plane Program Induction with 3D Box Priors
Yikai Li, Jiayuan Mao, Xiuming Zhang, Bill Freeman, Josh Tenenbaum, Noah Snavely, Jiajun Wu

Privacy Amplification via Random Check-Ins
Borja Balle, Peter Kairouz, Brendan McMahan, Om Dipakbhai Thakkar, Abhradeep Thakurta

Rethinking Pre-training and Self-training
Barret Zoph, Golnaz Ghiasi, Tsung-Yi Lin, Yin Cui, Hanxiao Liu, Ekin Dogus Cubuk, Quoc Le

Reinforcement Learning with Combinatorial Actions: An Application to Vehicle Routing
Arthur Delarue, Ross Anderson, Christian Tjandraatmadja

Online Agnostic Boosting via Regret Minimization
Nataly Brukhim, Xinyi Chen, Elad Hazan, Shay Moran*

From Trees to Continuous Embeddings and Back: Hyperbolic Hierarchical Clustering
Ines Chami, Albert Gu, Vaggos Chatziafratis, Christopher Ré

Faithful Embeddings for Knowledge Base Queries
Haitian Sun, Andrew Arnold*, Tania Bedrax Weiss, Fernando Pereira, William W. Cohen

Contextual Reserve Price Optimization in Auctions via Mixed Integer Programming
Joey Huchette, Haihao Lu, Hossein Esfandiari, Vahab Mirrokni

An Operator View of Policy Gradient Methods
Dibya Ghosh, Marlos C. Machado, Nicolas Le Roux

Reinforcement Learning with Feedback Graphs
Christoph Dann, Yishay Mansour, Mehryar Mohri, Ayush Sekhari, Karthik Sridharan

On Completeness-aware Concept-Based Explanations in Deep Neural Networks
Chih-Kuan Yeh, Been Kim, Sercan Arik, Chun-Liang Li, Tomas Pfister, Pradeep Ravikumar

Rewriting History with Inverse RL: Hindsight Inference for Policy Improvement
Benjamin Eysenbach, Xinyang Geng, Sergey Levine, Ruslan Salakhutdinov

The Flajolet-Martin Sketch Itself Preserves Differential Privacy: Private Counting with Minimal Space
Adam Smith, Shuang Song, Abhradeep Thakurta

What is Being Transferred in Transfer Learning?
Behnam Neyshabur, Hanie Sedghi, Chiyuan Zhang

Latent Bandits Revisited
Joey Hong, Branislav Kveton, Manzil Zaheer, Yinlam Chow, Amr Ahmed, Craig Boutilier

MetaSDF: Meta-Learning Signed Distance Functions
Vincent Sitzmann, Eric Chan, Richard Tucker, Noah Snavely, Gordon Wetzstein

Measuring Robustness to Natural Distribution Shifts in Image Classification
Rohan Taori, Achal Dave, Vaishaal Shankar, Nicholas Carlini, Benjamin Recht, Ludwig Schmidt

Robust Optimization for Fairness with Noisy Protected Groups
Serena Wang, Wenshuo Guo, Harikrishna Narasimhan, Andrew Cotter, Maya Gupta, Michael I. Jordan

Learning Discrete Energy-based Models via Auxiliary-variable Local Exploration
Hanjun Dai, Rishabh Singh, Bo Dai, Charles Sutton, Dale Schuurmans

Breaking the Communication-Privacy-Accuracy Trilemma
Wei-Ning Chen, Peter Kairouz, Ayfer Ozgur

Differentiable Meta-Learning of Bandit Policies
Craig Boutilier, Chih-wei Hsu, Branislav Kveton, Martin Mladenov, Csaba Szepesvari, Manzil Zaheer

Multi-Stage Influence Function
Hongge Chen*, Si Si, Yang Li, Ciprian Chelba, Sanjiv Kumar, Duane Boning, Cho-Jui Hsieh

Compositional Visual Generation with Energy Based Models
Yilun Du, Shuang Li, Igor Mordatch

O(n) Connections are Expressive Enough: Universal Approximability of Sparse Transformers
Chulhee Yun, Yin-Wen Chang, Srinadh Bhojanapalli, Ankit Singh Rawat, Sashank Reddi, Sanjiv Kumar

Curriculum By Smoothing
Samarth Sinha, Animesh Garg, Hugo Larochelle

Online Linear Optimization with Many Hints
Aditya Bhaskara, Ashok Cutkosky, Ravi Kumar, Manish Purohit

Prediction with Corrupted Expert Advice
Idan Amir, Idan Attias, Tomer Koren, Roi Livni, Yishay Mansour

Agnostic Learning with Multiple Objectives
Corinna Cortes, Mehryar Mohri, Javier Gonzalvo, Dmitry Storcheus

CoSE: Compositional Stroke Embeddings
Emre Aksan, Thomas Deselaers*, Andrea Tagliasacchi, Otmar Hilliges

Reparameterizing Mirror Descent as Gradient Descent
Ehsan Amid, Manfred K. Warmuth

Understanding Double Descent Requires A Fine-Grained Bias-Variance Decomposition
Ben Adlam, Jeffrey Pennington

DisARM: An Antithetic Gradient Estimator for Binary Latent Variables
Zhe Dong, Andriy Mnih, George Tucker

Big Self-Supervised Models are Strong Semi-Supervised Learners
Ting Chen, Simon Kornblith, Kevin Swersky, Mohammad Norouzi, Geoffrey Hinton

JAX MD: A Framework for Differentiable Physics
Samuel S. Schoenholz, Ekin D. Cubuk

Gradient Surgery for Multi-Task Learning
Tianhe Yu, Saurabh Kumar, Abhishek Gupta, Sergey Levine, Karol Hausman, Chelsea Finn

LoopReg: Self-supervised Learning of Implicit Surface Correspondences, Pose and Shape for 3D Human Mesh Registration
Bharat Lal Bhatnagar, Cristian Sminchisescu, Christian Theobalt, Gerard Pons-Moll

ICE-BeeM: Identifiable Conditional Energy-Based Deep Models Based on Nonlinear ICA
Ilyes Khemakhem, Ricardo P. Monti, Diederik P. Kingma, Aapo Hyvärinen

Demystifying Orthogonal Monte Carlo and Beyond
Han Lin, Haoxian Chen, Tianyi Zhang, Clement Laroche, Krzysztof Choromanski

FixMatch: Simplifying Semi-Supervised Learning with Consistency and Confidence
Kihyuk Sohn, David Berthelot, Chun-Liang Li, Zizhao Zhang, Nicholas Carlini, Ekin D. Cubuk, Alex Kurakin, Han Zhang, Colin Raffel

Compositional Generalization via Neural-Symbolic Stack Machines
Xinyun Chen, Chen Liang, Adams Wei Yu, Dawn Song, Denny Zhou

Universally Quantized Neural Compression
Eirikur Agustsson, Lucas Theis

Self-Distillation Amplifies Regularization in Hilbert Space
Hossein Mobahi, Mehrdad Farajtabar, Peter L. Bartlett

ShapeFlow: Learnable Deformation Flows Among 3D Shapes
Chiyu “Max” Jiang, Jingwei Huang, Andrea Tagliasacchi, Leonidas Guibas

Entropic Optimal Transport between Unbalanced Gaussian Measures has a Closed Form
Hicham Janati, Boris Muzellec, Gabriel Peyré, Marco Cuturi

High-Fidelity Generative Image Compression
Fabian Mentzer*, George Toderici, Michael Tschannen*, Eirikur Agustsson

COT-GAN: Generating Sequential Data via Causal Optimal Transport
Tianlin Xu, Li K. Wenliang, Michael Munn, Beatrice Acciaio

When Do Neural Networks Outperform Kernel Methods?
Behrooz Ghorbani, Song Mei, Theodor Misiakiewicz, Andrea Montanari

Sense and Sensitivity Analysis: Simple Post-Hoc Analysis of Bias Due to Unobserved Confounding
Victor Veitch, Anisha Zaveri

Exemplar VAE: Linking Generative Models, Nearest Neighbor Retrieval, and Data Augmentation
Sajad Norouzi, David J. Fleet, Mohammad Norouzi

Mitigating Forgetting in Online Continual Learning via Instance-Aware Parameterization
Hung-Jen Chen, An-Chieh Cheng, Da-Cheng Juan, Wei Wei, Min Sun
 
Consistent Plug-in Classifiers for Complex Objectives and Constraints
Shiv Kumar Tavker, Harish Guruprasad Ramaswamy, Harikrishna Narasimhan

Online MAP Inference of Determinantal Point Processes
Aditya Bhaskara, Amin Karbasi, Silvio Lattanzi, Morteza Zadimoghaddam

Organizing Recurrent Network Dynamics by Task-computation to Enable Continual Learning
Lea Duncker, Laura Driscoll, Krishna V. Shenoy, Maneesh Sahani, David Sussillo

RL Unplugged: A Collection of Benchmarks for Offline Reinforcement Learning
Caglar Gulcehre, Ziyu Wang, Alexander Novikov, Thomas Paine, Sergio Gómez, Konrad Zolna, Rishabh Agarwal, Josh S. Merel, Daniel J. Mankowitz, Cosmin Paduraru, Gabriel Dulac-Arnold, Jerry Li, Mohammad Norouzi, Matthew Hoffman, Nicolas Heess, Nando de Freitas

Neural Execution Engines: Learning to Execute Subroutines
Yujun Yan*, Kevin Swersky, Danai Koutra, Parthasarathy Ranganathan, Milad Hashemi

Spin-Weighted Spherical CNNs
Carlos Esteves, Ameesh Makadia, Kostas Daniilidis

An Efficient Nonconvex Reformulation of Stagewise Convex Optimization Problems
Rudy R. Bunel, Oliver Hinder, Srinadh Bhojanapalli, Krishnamurthy Dvijotham

Stochastic Optimization with Laggard Data Pipelines
Naman Agarwal, Rohan Anil, Tomer Koren, Kunal Talwar*, Cyril Zhang*

Regularizing Towards Permutation Invariance In Recurrent Models
Edo Cohen-Karlik, Avichai Ben David, Amir Globerson

Fast and Accurate k-means++ via Rejection Sampling
Vincent Cohen-Addad, Silvio Lattanzi, Ashkan Norouzi-Fard, Christian Sohler*, Ola Svensson

Fairness Without Demographics Through Adversarially Reweighted Learning
Preethi Lahoti*, Alex Beutel, Jilin Chen, Kang Lee, Flavien Prost, Nithum Thain, Xuezhi Wang, Ed Chi

Gradient Estimation with Stochastic Softmax Tricks
Max Paulus, Dami Choi, Daniel Tarlow, Andreas Krause, Chris J. Maddison

Just Pick a Sign: Optimizing Deep Multitask Models with Gradient Sign Dropout
Zhao Chen, Jiquan Ngiam, Yanping Huang, Thang Luong, Henrik Kretzschmar, Yuning Chai, Dragomir Anguelov

A Spectral Energy Distance for Parallel Speech Synthesis
Alexey A. Gritsenko, Tim Salimans, Rianne van den Berg, Jasper Snoek, Nal Kalchbrenner

Ode to an ODE
Krzysztof Choromanski, Jared Quincy Davis, Valerii Likhosherstov, Xingyou Song, Jean-Jacques Slotine, Jacob Varley, Honglak Lee, Adrian Weller, Vikas Sindhwani

RandAugment: Practical Automated Data Augmentation with a Reduced Search Space
Ekin Dogus Cubuk, Barret Zoph, Jon Shlens, Quoc Le

On Adaptive Attacks to Adversarial Example Defenses
Florian Tramer, Nicholas Carlini, Wieland Brendel, Aleksander Madry

Fair Performance Metric Elicitation
Gaurush Hiranandani, Harikrishna Narasimhan, Oluwasanmi O. Koyejo

Robust Pre-Training by Adversarial Contrastive Learning
Ziyu Jiang, Tianlong Chen, Ting Chen, Zhangyang Wang

Why are Adaptive Methods Good for Attention Models?
Jingzhao Zhang, Sai Praneeth Karimireddy, Andreas Veit, Seungyeon Kim, Sashank Reddi, Sanjiv Kumar, Suvrit Sra

PyGlove: Symbolic Programming for Automated Machine Learning
Daiyi Peng, Xuanyi Dong, Esteban Real, Mingxing Tan, Yifeng Lu, Gabriel Bender, Hanxiao Liu, Adam Kraft, Chen Liang, Quoc Le

Fair Hierarchical Clustering
Sara Ahmadian, Alessandro Epasto, Marina Knittel, Ravi Kumar, Mohammad Mahdian, Benjamin Moseley, Philip Pham, Sergei Vassilvitskii, Yuyan Wang

Fairness with Overlapping Groups; a Probabilistic Perspective
Forest Yang*, Moustapha Cisse, Sanmi Koyejo

Differentiable Top-k with Optimal Transport
Yujia Xie*, Hanjun Dai, Minshuo Chen, Bo Dai, Tuo Zhao, Hongyuan Zha, Wei Wei, Tomas Pfister

The Origins and Prevalence of Texture Bias in Convolutional Neural Networks
Katherine Hermann, Ting Chen, Simon Kornblith

Approximate Heavily-Constrained Learning with Lagrange Multiplier Models
Harikrishna Narasimhan, Andrew Cotter, Yichen Zhou, Serena Wang, Wenshuo Guo

Evaluating Attribution for Graph Neural Networks
Benjamin Sanchez-Lengeling, Jennifer Wei, Brian Lee, Emily Reif, Peter Wang, Wesley Wei Qian, Kevin McCloskey, Lucy Colwell, Alexander Wiltschko

Sliding Window Algorithms for k-Clustering Problems
Michele Borassi, Alessandro Epasto, Silvio Lattanzi, Sergei Vassilvitskii, Morteza Zadimoghaddam

Meta-Learning Requires Meta-Augmentation
Janarthanan Rajendran*, Alex Irpan, Eric Jang

What Makes for Good Views for Contrastive Learning?
Yonglong Tian, Chen Sun, Ben Poole, Dilip Krishnan, Cordelia Schmid, Phillip Isola

Supervised Contrastive Learning
Prannay Khosla*, Piotr Teterwak*, Chen Wang*, Aaron Sarna, Yonglong Tian, Phillip Isola, Aaron Maschinot, Ce Liu, Dilip Krishnan

Critic Regularized Regression
Ziyu Wang, Alexander Novikov, Konrad Zolna, Josh Merel, Jost Tobias Springenberg, Scott Reed, Bobak Shahriari, Noah Siegel, Caglar Gulcehre, Nicolas Heess, Nando de Freitas

Off-Policy Imitation Learning from Observations
Zhuangdi Zhu, Kaixiang Lin, Bo Dai, Jiayu Zhou

Effective Diversity in Population Based Reinforcement Learning
Jack Parker-Holder, Aldo Pacchiano, Krzysztof Choromanski, Stephen Roberts

Memory Based Trajectory-conditioned Policies for Learning from Sparse Rewards
Yijie Guo, Jongwook Choi, Marcin Moczulski, Shengyu Feng, Samy Bengio, Mohammad Norouzi, Honglak Lee

Object-Centric Learning with Slot Attention
Francesco Locatello*, Dirk Weissenborn, Thomas Unterthiner, Aravindh Mahendran, Georg Heigold, Jakob Uszkoreit, Alexey Dosovitskiy, Thomas Kipf

On the Power of Louvain in the Stochastic Block Model
Vincent Cohen-Addad, Adrian Kosowski, Frederik Mallmann-Trenn, David Saulpic

Learning to Execute Programs with Instruction Pointer Attention Graph Neural Networks
David Bieber, Charles Sutton, Hugo Larochelle, Daniel Tarlow

SMYRF – Efficient Attention using Asymmetric Clustering
Giannis Daras, Nikita Kitaev, Augustus Odena, Alexandros G. Dimakis

Graph Contrastive Learning with Augmentations
Yuning You, Tianlong Chen, Yongduo Sui, Ting Chen, Zhangyang Wang, Yang Shen

WOR and p’s: Sketches for ℓp-Sampling Without Replacement
Edith Cohen, Rasmus Pagh, David P. Woodruff

Fourier Features Let Networks Learn High Frequency Functions in Low Dimensional Domains
Matthew Tancik, Pratul Srinivasan, Ben Mildenhall, Sara Fridovich-Keil, Nithin Raghavan, Utkarsh Singhal, Ravi Ramamoorthi, Jonathan Barron, Ren Ng

Model Selection in Contextual Stochastic Bandit Problems
Aldo Pacchiano, My Phan, Yasin Abbasi Yadkori, Anup Rao, Julian Zimmert, Tor Lattimore, Csaba Szepesvari

Adapting to Misspecification in Contextual Bandits
Dylan J. Foster, Claudio Gentile, Mehryar Mohri, Julian Zimmert

Leverage the Average: an Analysis of KL Regularization in Reinforcement Learning
Nino Vieillard, Tadashi Kozuno, Bruno Scherrer, Olivier Pietquin, Rémi Munos, Matthieu Geist

Learning with Differentiable Perturbed Optimizers
Quentin Berthet, Mathieu Blondel, Olivier Teboul, Marco Cuturi, Jean-Philippe Vert, Francis Bach

Munchausen Reinforcement Learning
Nino Vieillard, Olivier Pietquin, Matthieu Geist

Log-Likelihood Ratio Minimizing Flows: Towards Robust and Quantifiable Neural Distribution Alignment
Ben Usman, Avneesh Sud, Nick Dufour, Kate Saenko

Your GAN is Secretly an Energy-based Model and You Should Use Discriminator Driven Latent Sampling
Tong Che, Ruixiang Zhang, Jascha Sohl-Dickstein, Hugo Larochelle, Liam Paull, Yuan Cao, Yoshua Bengio

Sample Complexity of Uniform Convergence for Multicalibration
Eliran Shabat, Lee Cohen, Yishay Mansour

Implicit Regularization and Convergence for Weight Normalization
Xiaoxia Wu, Edgar Dobriban, Tongzheng Ren, Shanshan Wu, Zhiyuan Li, Suriya Gunasekar, Rachel Ward, Qiang Liu

Most ReLU Networks Suffer from ℓ² Adversarial Perturbations
Amit Daniely, Hadas Shacham

Geometric Exploration for Online Control
Orestis Plevrakis, Elad Hazan

PLLay: Efficient Topological Layer Based on Persistent Landscapes
Kwangho Kim, Jisu Kim, Manzil Zaheer, Joon Sik Kim, Frederic Chazal, Larry Wasserman

Simple and Principled Uncertainty Estimation with Deterministic Deep Learning via Distance Awareness
Jeremiah Zhe Liu*, Zi Lin, Shreyas Padhy, Dustin Tran, Tania Bedrax-Weiss, Balaji Lakshminarayanan

Bayesian Deep Ensembles via the Neural Tangent Kernel
Bobby He, Balaji Lakshminarayanan, Yee Whye Teh

Hyperparameter Ensembles for Robustness and Uncertainty Quantification
Florian Wenzel, Jasper Snoek, Dustin Tran, Rodolphe Jenatton

Conic Descent and its Application to Memory-efficient Optimization Over Positive Semidefinite Matrices
John Duchi, Oliver Hinder, Andrew Naber, Yinyu Ye

On the Training Dynamics of Deep Networks with L₂ Regularization
Aitor Lewkowycz, Guy Gur-Ari

The Surprising Simplicity of the Early-Time Learning Dynamics of Neural Networks
Wei Hu*, Lechao Xiao, Ben Adlam, Jeffrey Pennington

Adaptive Probing Policies for Shortest Path Routing
Aditya Bhaskara, Sreenivas Gollapudi, Kostas Kollias, Kamesh Munagala

Optimal Approximation — Smoothness Tradeoffs for Soft-Max Functions
Alessandro Epasto, Mohammad Mahdian, Vahab Mirrokni, Emmanouil Zampetakis

An Unsupervised Information-Theoretic Perceptual Quality Metric
Sangnie Bhardwaj, Ian Fischer, Johannes Ballé, Troy Chinen

Learning Graph Structure With A Finite-State Automaton Layer
Daniel Johnson, Hugo Larochelle, Daniel Tarlow

Estimating Training Data Influence by Tracing Gradient Descent
Garima Pruthi, Frederick Liu, Satyen Kale, Mukund Sundararajan


Tutorials

Practical Uncertainty Estimation and Out-of-Distribution Robustness in Deep Learning
Organizers: Dustin Tran, Balaji Lakshminarayanan, Jasper Snoek

Abstraction & Reasoning in AI systems: Modern Perspectives
Organizers: Francois Chollet, Melanie Mitchell, Christian Szegedy

Policy Optimization in Reinforcement Learning
Organizers: Sham M Kakade, Martha White, Nicolas Le Roux

Federated Learning and Analytics: Industry Meets Academia
Organizers: Brendan McMahan, Virginia Smith, Peter Kairouz

Deep Implicit Layers: Neural ODEs, Equilibrium Models, and Differentiable Optimization
Organizers: David Duvenaud, J. Zico Kolter, Matthew Johnson

Beyond Accuracy: Grounding Evaluation Metrics for Human-Machine Learning Systems
Organizers: Praveen Chandar, Fernando Diaz, Brian St. Thomas


Workshops

Black in AI Workshop @ NeurIPS 2020 (Diamond Sponsor)
Mentorship Roundtables: Natasha Jaques

LatinX in AI Workshop @ NeurIPS 2020 (Platinum Sponsor)
Organizers include: Pablo Samuel Castro
Invited Speaker: Fernanda Viégas
Mentorship Roundtables: Tomas Izo

Queer in AI Workshop @ NeurIPS 2020 (Platinum Sponsor)
Organizers include: Raphael Gontijo Lopes

Women in Machine Learning (Platinum Sponsor)
Organizers include: Xinyi Chen, Jessica Schrouff
Invited Speaker: Fernanda Viégas
Sponsor Talk: Jessica Schrouff
Mentorship Roundtables: Hanie Sedghi, Marc Bellemare, Katherine Heller, Rianne van den Berg, Natalie Schluter, Colin Raffel, Azalia Mirhoseini, Emily Denton, Jesse Engel, Anusha Ramesh, Matt Johnson, Jeff Dean, Laurent Dinh, Samy Bengio, Yasaman Bahri, Corinna Cortes, Nicolas le Roux, Hugo Larochelle, Sergio Guadarrama, Natasha Jaques, Pablo Samuel Castro, Elaine Le, Cory Silvear

Muslims in ML
Organizers include: Mohammad Norouzi

Resistance AI Workshop
Organizers include: Elliot Creager, Raphael Gontijo Lopes

Privacy Preserving Machine Learning — PriML and PPML Joint Edition
Organizers include: Adria Gascon, Mariana Raykova

OPT2020: Optimization for Machine Learning
Organizers include: Courtney Paquette

Human in the Loop Dialogue Systems
Organizers include: Rahul Goel
Invited Speaker: Ankur Parikh

Self-Supervised Learning for Speech and Audio Processing
Organizers include: Tara Sainath
Invited Speaker: Bhuvana Ramabhadran

3rd Robot Learning Workshop
Organizers include: Alex Bewley, Vincent Vanhoucke
Invited Speaker: Pete Florence

Deep Reinforcement Learning
Organizers include: Chelsea Finn
Invited Speaker: Marc Bellemare

Machine Learning for Engineering Modeling, Simulation and Design
Organizers include: Stephan Hoyer

Machine Learning for Molecules
Organizers include: Jennifer Wei
Invited Speaker: Benjamin Sanchez-Lengeling

The Challenges of Real World Reinforcement Learning
Organizers include: Gabriel Dulac-Arnold
Invited Speaker: Chelsea Finn

Workshop on Computer Assisted Programming (CAP)
Organizers include: Charles Sutton, Augustus Odena

Self-Supervised Learning — Theory and Practice
Organizers include: Barret Zoph
Invited Speaker: Quoc V. Le

Deep Learning Through Information Geometry
Organizers include: Alexander Alemi

Expo

Drifting Efficiently Through the Stratosphere Using Deep Reinforcement Learning
Organizers include: Sal Candido

Accelerating Eye Movement Research via Smartphone Gaze
Organizers include: Vidhya Navalpakkam

Mining and Learning with Graphs at Scale
Organizers include: Bryan Perozzi, Vahab Mirrokni, Jonathan Halcrow, Jakub Lacki

*Work performed while at Google

Read More

Researchers can use qsim to explore quantum algorithms

Researchers can use qsim to explore quantum algorithms

A year ago, Google’s Quantum AI team achieved a beyond-classical computation by using a quantum computer to outperform the world’s fastest classical computer. With this, we entered a new era of quantum computing. We still have a long journey ahead of us to find practical applications, and we know we can’t get there alone. So today we’re launching qsim, a new open source quantum simulator that will help researchers develop quantum algorithms. 

The importance of simulators in quantum computing

Simulators are important tools for writing and debugging quantum code, and they’re essential for developing quantum algorithms. The few experimental quantum processors currently available, like the one that achieved a beyond-classical computation, are prone to noise and don’t perform error correction. This is where simulators like qsim come in. They allow researchers to explore quantum algorithms under idealized conditions and are more readily available. They also help prepare experiments to run on actual quantum hardware.

qsim can simulate around 30 qubits on a laptop, or up to 40 qubits in Google Cloud. What used to take an expensive cluster of computers to simulate can now be done on a single computer with qsim. We use qsim frequently at Google to test and benchmark quantum algorithms and processors. One example of this is our research in quantum neural networks. By using qsim with Cirq and TensorFlow Quantum, we’ve trained quantum ML models involving hundreds of thousands of circuits. 

Open source software tools for developing quantum algorithms

qsim is part of our open source ecosystem of software tools. These include Cirq, our quantum programming framework, ReCirq, a repository of research examples, and application-specific libraries such as OpenFermion for quantum chemistry and TensorFlow Quantum for quantum machine learning. These tools are designed to work together and to help you get started easily. Researchers who have developed quantum algorithms with Cirq can now use qsim by changing one line of code in Colab and experience an instant speedup in their circuit simulations.
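For example, using the qsimcirq package, the swap amounts to replacing Cirq’s built-in simulator with qsim’s drop-in equivalent. A small sketch, with a Bell-state circuit standing in for your own algorithm:

```python
import cirq
import qsimcirq

# Build a two-qubit Bell-state circuit in Cirq.
q0, q1 = cirq.LineQubit.range(2)
circuit = cirq.Circuit(
    cirq.H(q0),
    cirq.CNOT(q0, q1),
    cirq.measure(q0, q1, key="m"),
)

# The one-line change: use qsim instead of cirq.Simulator().
simulator = qsimcirq.QSimSimulator()
result = simulator.run(circuit, repetitions=1000)
print(result.histogram(key="m"))  # expect roughly even counts of 00 and 11
```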

Google Quantum AI website

To help you get started with qsim and our other open source quantum software, we’ve launched a new website that brings together all of our tools, research initiatives, and educational material. Researchers can access our latest publications and research repositories, students can find educational resources or apply for internships, and developers interested in quantum computing can join our growing community of contributors.

Read More