This month in AWS Machine Learning: August 2020 edition

Every day there is something new going on in the world of AWS Machine Learning—from launches to new use cases to interactive trainings. We’re packaging some of the not-to-miss information from the ML Blog and beyond for easy perusing each month. Check back at the end of each month for the latest roundup.


This month we gave you a new way to add intelligence to your contact center, improved personalized recommendations, made our Machine Learning University content available, and more. Read on for our August launches:

Use cases

Get ideas and architectures from AWS customers, partners, ML Heroes, and AWS experts on how to apply ML to your use case:

Explore more ML stories

Want more news about developments in ML? Check out the following stories:

Mark your calendars

Join us for the following exciting ML events:

  • Register for the Public Sector AWS Artificial Intelligence and Machine Learning Week, September 14–18, 2020. Whether you’re in a government, nonprofit, university, or hospital setting, this webinar series is designed to help educate those new to AI, spark new ideas for business stakeholders, and deep dive into technical implementation for developers.
  • AWS Power Hour: Machine Learning streams every Thursday at 4:00 PM PST on Twitch. The series offers free, fun, and interactive training with AWS expert hosts as they demonstrate how to build apps with AWS AI services. Designed for developers—even those without prior ML experience—the show helps you learn to build apps that showcase natural language, speech recognition, and other personalized recommendations. Tune in live, or catch the recorded episodes whenever it’s convenient for you.
  • AWS and Pluralsight are hosting a three-part webinar series on the ins-and-outs of AWS DeepRacer. In the series, you will learn about the basics of DeepRacer, reinforcement learning and refinement, and the future of DeepRacer. View the first two webinars and register for the live webinar on September 22 here.

Also, if you missed it, the season finale of SageMaker Fridays aired on August 28. Stay tuned for more news on season 2!

See you next month for more on AWS ML!

About the Author

Laura Jones is a product marketing lead for AWS AI/ML where she focuses on sharing the stories of AWS’s customers and educating organizations on the impact of machine learning. As a Florida native living and surviving in rainy Seattle, she enjoys coffee, attempting to ski and enjoying the great outdoors.

Getting started with the Amazon Kendra SharePoint Online connector

Amazon Kendra is a highly accurate and easy-to-use enterprise search service powered by machine learning (ML). To get started with Amazon Kendra, we offer data source connectors to get your documents easily ingested and indexed.

This post describes how to use Amazon Kendra’s SharePoint Online connector. To allow the connector to access your SharePoint Online site, you only need to provide the site URL and the credentials of a user with owner rights. These credentials are securely stored in AWS Secrets Manager.

Currently, Amazon Kendra has two provisioning editions: the Amazon Kendra Developer Edition for building proof of concepts (POCs) and the Amazon Kendra Enterprise Edition. Amazon Kendra connectors work with both editions.


To get started, you need the following:

  • A SharePoint Online site
  • A SharePoint Online user with owner rights

Owner rights are the minimum admin rights needed for the connector to access and ingest documents from your SharePoint site. This follows the AWS principle of granting least privilege access.

The metadata in your SharePoint Online documents must be explicitly mapped to Amazon Kendra attributes. You do this in the Attributes and field mappings section of this post. The SharePoint document title is mapped to the Amazon Kendra system attribute _document_title. If you skip the field mapping step and later want to use that metadata, you need to create a new data source connector for the SharePoint Online site.

The AWS Identity and Access Management (IAM) role for the SharePoint Online data source is not the same as the Amazon Kendra index IAM role. Please read the section Defining targets: Site URL and data source IAM role carefully. Pay particular attention to the interplay between the SharePoint Online data source’s IAM role and the Secrets Manager secret that contains your SharePoint Online credentials.

For this post, we assume that you already have a SharePoint Online site deployed.

Setting up a SharePoint Online connector for Amazon Kendra from the console

The following section describes the process of deploying an Amazon Kendra index and configuring a SharePoint Online connector. If you already have an index, you can skip to the Configuring the SharePoint Online connector section.

For this use case, our SharePoint Online site contains a collection of AWS whitepapers with custom columns, such as Topics.

Creating an Amazon Kendra index

In an Amazon Kendra setup workflow, the first step is to create an index, where you define an IAM role and the method you want Amazon Kendra to use for data encryption. For this use case, we create a new role.

If you use an existing role, check that it has permission to write to an Amazon CloudWatch log. For more information, see IAM roles for indexes.

Next, you select which provisioning edition to use. For this post, I select the Developer Edition. If you’re new to Amazon Kendra, we recommend creating an Amazon Kendra Developer Edition index because it’s a more cost-efficient way to explore Amazon Kendra. For production environments, we highly recommend using the Enterprise Edition because it allows for more storage capacity and queries per day, and is designed for high availability.

Configuring the SharePoint Online connector

After you create your index, you set up the data sources. One of the advantages of implementing Amazon Kendra is that you can use a set of prebuilt connectors for data sources such as Amazon Simple Storage Service (Amazon S3), Amazon Relational Database Service (Amazon RDS), SharePoint Online, and Salesforce.

For this use case, we choose SharePoint Online.

Assigning a name to the data source

In the Define attributes section, you enter a name for the data source, an optional description, and assign optional tags.

Defining targets: Site URL and data source IAM role

In the Define targets section, you enter the SharePoint Online site URLs where the documents reside and the IAM role that the connector uses to operate. Remember that this IAM role is different from the one used to create the index. For more information, see IAM roles for data sources.

If you don’t have an IAM role for this task, you can easily create one by choosing Create New Role. For this use case, I use a previously created role.

Under the URL text box, you can select Use change log, which enables the connector to use the SharePoint change log to determine the documents that need to be updated in the index. If your SharePoint change log is too large, your sync process may take longer.

You can also select Crawl attachments, which allows the crawler to include the attachments associated with items stored in your site.

You can also include or exclude documents by using regular expressions. You can define patterns that Amazon Kendra either uses to exclude certain documents from indexing or include only documents with that pattern. For more information, see SharePointConfiguration.
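
To get a feel for how inclusion and exclusion patterns interact, here is a minimal local sketch in Python. It is not Amazon Kendra’s implementation: the document URLs and the function name are hypothetical, and the precedence rule (an exclusion match wins over an inclusion match) is an assumption you should confirm against the SharePointConfiguration reference.

```python
import re

def would_be_indexed(url, inclusion_patterns=(), exclusion_patterns=()):
    """Approximate include/exclude filtering locally (assumed semantics)."""
    # Assumption: a document matching any exclusion pattern is skipped,
    # even if it also matches an inclusion pattern.
    if any(re.search(p, url) for p in exclusion_patterns):
        return False
    # With no inclusion patterns, everything that is not excluded is kept.
    if not inclusion_patterns:
        return True
    return any(re.search(p, url) for p in inclusion_patterns)

# Hypothetical document URLs on a SharePoint Online site.
urls = [
    "https://example.sharepoint.com/sites/wp/Shared Documents/storage-whitepaper.pdf",
    "https://example.sharepoint.com/sites/wp/Shared Documents/drafts/notes.docx",
    "https://example.sharepoint.com/sites/wp/Shared Documents/pricing.xlsx",
]

kept = [
    u for u in urls
    if would_be_indexed(u,
                        inclusion_patterns=[r"\.pdf$", r"\.docx$"],
                        exclusion_patterns=[r"/drafts/"])
]
print(kept)  # only the storage whitepaper PDF remains
```

Trying your patterns against a handful of real document URLs like this before a sync can save a full re-crawl.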

Providing SharePoint Online credentials

In the Configure settings section, you set up your SharePoint Online user (if you don’t have one, you can create an additional user). The credentials you enter are stored in Secrets Manager.

Save the authentication information and set up the sync run schedule, which determines how often Amazon Kendra checks your SharePoint Online site URLs for changes. For this use case, I choose Run on demand.

Attributes and field mappings

In this next step, you can create field mappings. Even though this is an optional step, it’s a good idea to add this extra layer of metadata to your documents from SharePoint Online. Metadata enables you to improve accuracy through manual tuning, filtering, and faceting. You can’t add metadata to already ingested documents, so if you want to add metadata later, you need to delete this data source, recreate it with the field mappings, and re-ingest your documents.

The default SharePoint Online metadata fields are Title, Created, and Modified.

One powerful feature is the ability to create custom field mappings. For example, on my SharePoint Online site, I created a column named Category. By importing this extra piece of information, we can create filters based on category names.

To import that extra information, you create a custom field mapping by choosing Add a new field mapping.

If you’re combining multiple data sources, you can map this new field to an existing field. For this use case, I have other documents that have the attribute Category, so I choose Option A to map fields to an existing document attributes field in my Amazon Kendra index. For more information, see Creating custom document attributes.

Also, on my SharePoint Site, I have an additional field called Topic. Because I don’t have that field on my index yet, I select Option B and enter the data source field name and select the data type (for this use case, String).

Field names are case-sensitive, so we need to make sure we match them. Additionally, when a data field on SharePoint is renamed, only the display name changes. This means that if you want to import a data field, you need to refer to the original name. A way to find it is to sort by that column and check the name as listed on the address bar.
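
If you prefer not to read the name off the address bar by eye, the following sketch pulls it out of a copied URL. It assumes the sorted column is carried in a query parameter named sortField (matched case-insensitively), which is what I would expect on a modern SharePoint Online list view; the URL itself is hypothetical, so check the parameter name on your own site.

```python
from urllib.parse import urlparse, parse_qs

def sort_field_from_url(url):
    """Return the sort field name found in a SharePoint list URL, or None."""
    query = parse_qs(urlparse(url).query)
    # Assumption: the column's original name is carried in a 'sortField'
    # query parameter; your site may use a different parameter.
    for key, values in query.items():
        if key.lower() == "sortfield" and values:
            return values[0]
    return None

# Hypothetical address bar content after sorting by a renamed column.
url = ("https://example.sharepoint.com/sites/wp/Shared%20Documents/Forms/"
       "AllItems.aspx?sortField=Topic&isAscending=true")
print(sort_field_from_url(url))  # Topic
```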

Let’s check what field is used for sorting:

Reviewing settings and creating a SharePoint Online data source

As a last step, you review the settings and create the data source. The Domain(s) and role section provides additional configuration information.

After you create your SharePoint Online data source, a banner similar to the following screenshot will appear at the top of your screen. To start the syncing and document ingestion process, choose Sync now.

You see a banner indicating the progress of the data source sync job. After the sync job is finished, you can test your index.


You can test your new index on the Amazon Kendra search console. See the following screenshot.

Also, if you configured extra fields as facetable, you can filter your documents by those facets. See the following screenshot.

Creating an Amazon Kendra index with a SharePoint Online connector with Python

In addition to the console, you can create a new Amazon Kendra index and SharePoint Online connector, and sync it, by using the AWS SDK for Python (Boto3). Boto3 makes it easy to integrate your Python application, library, or script with AWS services, including Amazon Kendra.

My personal preference for testing my Python scripts is to spin up an Amazon SageMaker notebook instance, a fully managed ML Amazon Elastic Compute Cloud (Amazon EC2) instance that runs the Jupyter Notebook app. For instructions, see Create an Amazon SageMaker Notebook Instance.

IAM roles requirements and overview

To create an index using the AWS SDK, you need to have the policy AmazonKendraFullAccess attached to the role you are using.

At a high level, these are the different roles Amazon Kendra requires:

  • IAM roles for indexes – Needed to write to CloudWatch Logs.
  • IAM roles for data sources – Needed when you use the CreateDataSource method. These roles require a specific set of permissions depending on the connector you use. For our use case, it needs permissions to access the following:
    • Secrets Manager, where the SharePoint Online credentials are stored.
    • The AWS Key Management Service (AWS KMS) customer master key (CMK) that Secrets Manager uses to decrypt the credentials.
    • The BatchPutDocument and BatchDeleteDocument operations to update the index.
    • The Amazon S3 bucket that contains the SSL certificate used to communicate with the SharePoint Site (we use SSL for this use case).

For more information, see IAM access roles for Amazon Kendra.

For this method, you need:

  • An Amazon SageMaker notebook role with permission to create an Amazon Kendra index from the notebook you’re using
  • An Amazon Kendra IAM role for CloudWatch
  • An Amazon Kendra IAM role for the SharePoint Online connector
  • SharePoint Online credentials stored in Secrets Manager

Creating an Amazon Kendra index

To create an index, you use the following code:

import boto3
from botocore.exceptions import ClientError
import pprint
import time

kendra = boto3.client("kendra")

print("Creating an index")

# Replace the placeholders with your own values
description = "<YOUR INDEX DESCRIPTION>"
index_name = "<YOUR NEW INDEX NAME>"
role_arn = "<YOUR INDEX ROLE ARN>"  # index role with CloudWatch permissions

try:
    index_response = kendra.create_index(
        Description = description,
        Name = index_name,
        RoleArn = role_arn,
        Edition = "DEVELOPER_EDITION",
        Tags = [
            {
                'Key': 'Project',
                'Value': 'SharePoint Test'
            }
        ]
    )
    pprint.pprint(index_response)
    index_id = index_response['Id']
    print("Wait for Kendra to create the index.")
    while True:
        # Get index description
        index_description = kendra.describe_index(
            Id = index_id
        )
        # If status is not CREATING quit
        status = index_description["Status"]
        print("    Creating index. Status: "+status)
        if status != "CREATING":
            break
        # Wait before polling again
        time.sleep(60)
except ClientError as e:
    print("%s" % e)

print("Done creating index.")

While your index is being created, you get regular status updates (every 60 seconds) until the process is complete. See the following output:

Creating an index
{'Id': '3311b507-bfef-4e2b-bde9-7c297b1fd13b',
 'ResponseMetadata': {'HTTPHeaders': {'content-length': '45',
                                      'content-type': 'application/x-amz-json-1.1',
                                      'date': 'Mon, 20 Jul 2020 19:58:19 GMT',
                                      'x-amzn-requestid': 'a148a4fc-7549-467e-b6ec-6f49512c1602'},
                      'HTTPStatusCode': 200,
                      'RequestId': 'a148a4fc-7549-467e-b6ec-6f49512c1602',
                      'RetryAttempts': 2}}
Wait for Kendra to create the index.
    Creating index. Status: CREATING
    Creating index. Status: CREATING
    Creating index. Status: CREATING
    Creating index. Status: CREATING
    Creating index. Status: CREATING
    Creating index. Status: CREATING
    Creating index. Status: CREATING
    Creating index. Status: CREATING
    Creating index. Status: CREATING
    Creating index. Status: CREATING
    Creating index. Status: CREATING
    Creating index. Status: CREATING
    Creating index. Status: CREATING
    Creating index. Status: CREATING
    Creating index. Status: CREATING
    Creating index. Status: CREATING
    Creating index. Status: CREATING
    Creating index. Status: ACTIVE
Done creating index

When your index is ready, the response provides its ID (3311b507-bfef-4e2b-bde9-7c297b1fd13b in this example). Your index ID will be different from the ID in this post.
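
The polling loop in this post runs until the status changes, however long that takes. If you would rather give up after a fixed number of attempts, the following is one possible sketch. The helper name and its parameters are my own, not part of Boto3, and a stand-in function replaces kendra.describe_index so the example runs without AWS access.

```python
import time

def wait_while_creating(describe, poll_seconds=60, max_polls=60, sleep=time.sleep):
    """Poll describe() until the status leaves CREATING or the polls run out."""
    for _ in range(max_polls):
        status = describe()["Status"]
        print("    Creating index. Status: " + status)
        if status != "CREATING":
            return status
        sleep(poll_seconds)
    raise TimeoutError("Resource was still CREATING after %d polls" % max_polls)

# Stand-in for kendra.describe_index so the sketch runs without AWS access.
fake_statuses = iter(["CREATING", "CREATING", "ACTIVE"])
final_status = wait_while_creating(
    lambda: {"Status": next(fake_statuses)},
    sleep=lambda seconds: None,  # skip the real wait in this demo
)
print(final_status)  # ACTIVE
```

With a real client, you would pass something like lambda: kendra.describe_index(Id=index_id) as the describe argument and keep the default sleep.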

Adding attributes to the Amazon Kendra index

If you have metadata attributes associated with your SharePoint Online documents, you should do the following:

  1. Determine the Amazon Kendra attribute name you want for each of your SharePoint Online metadata attributes. By default, Amazon Kendra has six reserved fields (_category, _created_at, _file_type, _last_updated_at, _source_uri, and _view_count).
  2. Update the Amazon Kendra index with the Amazon Kendra attribute names.
  3. Map each SharePoint Online metadata attribute to each Amazon Kendra metadata attribute.

If you have the metadata attribute Topic associated with your SharePoint Online document, and you want to use the same attribute name in the Amazon Kendra index, the following code adds the attribute Topic to your Amazon Kendra index:

try:
    update_response = kendra.update_index(
        Id=index_id,  # the ID returned when the index was created
        RoleArn='arn:aws:iam::<YOUR ACCOUNT NUMBER>:role/service-role/AmazonKendra-us-east-1-KendraRole',
        DocumentMetadataConfigurationUpdates=[
            {
                'Name': 'Topic',
                'Type': 'STRING_VALUE',
                'Search': {
                    'Facetable': True,
                    'Searchable': True,
                    'Displayable': True
                }
            }
        ]
    )
    pprint.pprint(update_response)
except ClientError as e:
    print('%s' % e)

If everything goes well, we receive a 200 response:

{'ResponseMetadata': {'HTTPHeaders': {'content-length': '0',
                                      'content-type': 'application/x-amz-json-1.1',
                                      'date': 'Mon, 20 Jul 2020 20:17:07 GMT',
                                      'x-amzn-requestid': '3eba66c9-972b-4757-8d92-37be17c8f8a2'},
                      'HTTPStatusCode': 200,
                      'RequestId': '3eba66c9-972b-4757-8d92-37be17c8f8a2',
                      'RetryAttempts': 0}} 
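
If you have several SharePoint columns to add, a small helper keeps the entries consistent. This is only a sketch: attribute_update is a hypothetical helper name, and Category is the example column from the console walkthrough earlier in this post.

```python
def attribute_update(name, value_type="STRING_VALUE", facetable=True,
                     searchable=True, displayable=True):
    """Build one entry for the DocumentMetadataConfigurationUpdates list."""
    return {
        "Name": name,
        "Type": value_type,
        "Search": {
            "Facetable": facetable,
            "Searchable": searchable,
            "Displayable": displayable,
        },
    }

# One entry per custom attribute you plan to map from SharePoint Online.
updates = [
    attribute_update("Topic"),
    attribute_update("Category", searchable=False),
]
for entry in updates:
    print(entry["Name"], entry["Search"])
```

You would then pass the list in a single call, for example kendra.update_index(Id=index_id, DocumentMetadataConfigurationUpdates=updates).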

Providing the SharePoint Online credentials

The data source role also needs the secretsmanager:GetSecretValue permission for your secret stored in Secrets Manager.

If you need to create a new secret in Secrets Manager to store the SharePoint Online credentials, make sure the role you use has permissions to create and tag a secret. See the following policy code:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "SecretsManagerWritePolicy",
            "Effect": "Allow",
            "Action": ["secretsmanager:CreateSecret", "secretsmanager:TagResource"],
            "Resource": "*"
        }
    ]
}

To create a secret on Secrets Manager, enter the following code:

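import json

secretsmanager = boto3.client('secretsmanager')
SecretName = '<YOUR SECRET NAME>'
# Store the credentials as a valid JSON string
SharePointCredentials = json.dumps({
    'username': '<YOUR SHAREPOINT SITE USERNAME>',
    'password': '<YOUR SHAREPOINT SITE PASSWORD>'
})

try:
    create_secret_response = secretsmanager.create_secret(
        Name=SecretName,
        Description='Secret for a Sharepoint data source connector',
        SecretString=SharePointCredentials,
        Tags=[{'Key': 'Project', 'Value': 'SharePoint Test'}]
    )
    pprint.pprint(create_secret_response)
except ClientError as e:
    print('%s' % e)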

If everything went well, you get a response with your secret’s ARN:

{'ARN': '<YOUR SECRET ARN>',
 'Name': '<YOUR SECRET NAME>',
 'ResponseMetadata': {'HTTPHeaders': {'connection': 'keep-alive',
                                      'content-length': '159',
                                      'content-type': 'application/x-amz-json-1.1',
                                      'date': 'Wed, 22 Jul 2020 16:05:32 GMT',
                                      'x-amzn-requestid': '3d0ac6ff-bd32-4d2e-8107-13e49f070de5'},
                      'HTTPStatusCode': 200,
                      'RequestId': '3d0ac6ff-bd32-4d2e-8107-13e49f070de5',
                      'RetryAttempts': 0},
 'VersionId': '7f7633ce-7f6c-4b10-b5b2-2943dd3fd6ee'}

Creating the SharePoint Online data source

Your Amazon Kendra index is up and running, and you have established the attributes that you want to map to your SharePoint Online document’s attributes.

You now need an IAM role with kendra:BatchPutDocument and kendra:BatchDeleteDocument permissions. For more information, see IAM roles for Microsoft SharePoint Online data sources. We use the ARN for this IAM role when invoking the CreateDataSource API.

Make sure the role you use for your data source connector has a trust relationship with Amazon Kendra. See the following code:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "kendra.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}

The following code is the policy structure used:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["secretsmanager:GetSecretValue"],
            "Resource": [
                "arn:aws:secretsmanager:region:account ID:secret:secret ID"
            ]
        },
        {
            "Effect": "Allow",
            "Action": ["kms:Decrypt"],
            "Resource": [
                "arn:aws:kms:region:account ID:key/key ID"
            ]
        },
        {
            "Effect": "Allow",
            "Action": ["kendra:BatchPutDocument", "kendra:BatchDeleteDocument"],
            "Resource": [
                "arn:aws:kendra:region:account ID:index/index ID"
            ],
            "Condition": {
                "StringLike": {
                    "kms:ViaService": ["kendra.amazonaws.com"]
                }
            }
        },
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject"],
            "Resource": [
                "arn:aws:s3:::bucket name/*"
            ]
        }
    ]
}

The following code is my role’s ARN:

arn:aws:iam::<YOUR ACCOUNT NUMBER>:role/Kendra-Datasource

Following the least privilege principle, we only allow our role to put and delete documents in our index and read the secrets to connect to our SharePoint Online site.
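
If you manage more than one index, you may want to generate this policy rather than edit it by hand. The sketch below is one way to do that under some assumptions: the helper name is hypothetical, the account number and secret ARN are made-up examples, and the action names are my reading of the permissions listed in this post, so compare the result with the IAM roles documentation before using it.

```python
import json

def data_source_policy(region, account_id, index_id, secret_arn, kms_key_id=None):
    """Build a least-privilege policy document for the data source role."""
    statements = [
        {   # read the SharePoint Online credentials
            "Effect": "Allow",
            "Action": ["secretsmanager:GetSecretValue"],
            "Resource": [secret_arn],
        },
        {   # add and remove documents in one specific index
            "Effect": "Allow",
            "Action": ["kendra:BatchPutDocument", "kendra:BatchDeleteDocument"],
            "Resource": ["arn:aws:kendra:%s:%s:index/%s" % (region, account_id, index_id)],
        },
    ]
    if kms_key_id:
        # Assumption: only needed when the secret is encrypted with your own key.
        statements.append({
            "Effect": "Allow",
            "Action": ["kms:Decrypt"],
            "Resource": ["arn:aws:kms:%s:%s:key/%s" % (region, account_id, kms_key_id)],
        })
    return {"Version": "2012-10-17", "Statement": statements}

# Hypothetical account number and secret ARN; the index ID is the one from this post.
policy = data_source_policy(
    region="us-east-1",
    account_id="111122223333",
    index_id="3311b507-bfef-4e2b-bde9-7c297b1fd13b",
    secret_arn="arn:aws:secretsmanager:us-east-1:111122223333:secret:sharepoint-example",
)
print(json.dumps(policy, indent=2))
```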

When creating a data source, you can specify the sync schedule, which indicates how often your index syncs with the data source we create. This schedule is defined on the Schedule key of our request. You can use schedule expressions for rules to define how often you want to sync your data source. For this use case, the ScheduleExpression is 'cron(0 11 * * ? *)', which sets the data source to sync every day at 11:00 AM.

I use the following code. Make sure you match your SiteURL and SecretARN, as well as your IndexID. Additionally, FieldMappings is where you map between the SharePoint Online attribute name and the Amazon Kendra index attribute name. I use the same attribute name in both, but you can name the Amazon Kendra attribute whatever you’d like.

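print('Create a data source')

# Replace the placeholders with your own values
SecretArn = '<YOUR SHAREPOINT ONLINE CREDENTIALS SECRET ARN>'
SiteUrl = '<YOUR SHAREPOINT SITE URL>'
DSName = '<YOUR NEW DATA SOURCE NAME>'
DSRoleArn = '<YOUR DATA SOURCE ROLE ARN>'
ScheduleExpression = 'cron(0 11 * * ? *)'

try:
    datasource_response = kendra.create_data_source(
        Name=DSName,
        IndexId=index_id,
        Type='SHAREPOINT',
        Configuration={
            'SharePointConfiguration': {
                'SharePointVersion': 'SHAREPOINT_ONLINE',
                'Urls': [SiteUrl],
                'SecretArn': SecretArn,
                'CrawlAttachments': True,
                'UseChangeLog': True,
                'FieldMappings': [
                    {
                        'DataSourceFieldName': 'Topic',
                        'IndexFieldName': 'Topic'
                    }
                ],
                'DocumentTitleFieldName': 'Title'
            }
        },
        Description='My SharePointOnline Datasource',
        RoleArn=DSRoleArn,
        Schedule=ScheduleExpression,
        Tags=[
            {
                'Key': 'Project',
                'Value': 'SharePoint Test'
            }
        ]
    )
    pprint.pprint(datasource_response)
    print('Waiting for Kendra to create the DataSource.')
    datasource_id = datasource_response['Id']
    while True:
        # Get data source description
        datasource_description = kendra.describe_data_source(
            Id=datasource_id,
            IndexId=index_id
        )
        # If status is not CREATING quit
        status = datasource_description["Status"]
        print("    Creating data source. Status: "+status)
        if status != "CREATING":
            break
        time.sleep(60)
except ClientError as e:
    print('%s' % e)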

At this point, you should receive a 200 response:

Create a data source
{'Id': '527ac6f7-5f3c-46ec-b2cd-43980c714bf7',
 'ResponseMetadata': {'HTTPHeaders': {'content-length': '45',
                                      'content-type': 'application/x-amz-json-1.1',
                                      'date': 'Mon, 20 Jul 2020 15:26:13 GMT',
                                      'x-amzn-requestid': '30480044-0a86-446c-aadc-f64acb4b3a86'},
                      'HTTPStatusCode': 200,
                      'RequestId': '30480044-0a86-446c-aadc-f64acb4b3a86',
                      'RetryAttempts': 0}}
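
The schedule expression and the FieldMappings list are the two parts of this request you are most likely to change, so it can help to build them with small functions. The following is a sketch under stated assumptions: both helper names are hypothetical, and the six cron fields are taken to be minutes, hours, day of month, month, day of week, and year, as in the expression used in this post.

```python
def daily_schedule(hour, minute=0):
    """Build a schedule expression that runs once a day at hour:minute."""
    if not (0 <= hour <= 23 and 0 <= minute <= 59):
        raise ValueError("hour must be 0-23 and minute must be 0-59")
    # Field order: minutes hours day-of-month month day-of-week year
    return "cron(%d %d * * ? *)" % (minute, hour)

def field_mappings(columns):
    """Turn {SharePoint column name: Kendra attribute name} into FieldMappings."""
    return [
        {"DataSourceFieldName": source, "IndexFieldName": target}
        for source, target in columns.items()
    ]

print(daily_schedule(11))      # cron(0 11 * * ? *)
print(daily_schedule(6, 30))   # cron(30 6 * * ? *)

mappings = field_mappings({"Topic": "Topic", "Category": "Category"})
print(mappings)
```

Remember that the SharePoint column name must be the original field name, not the display name, and that each index attribute must already exist in the index.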

Syncing the data source

Even though you defined a schedule for syncing the data source, you can sync on demand by using start_data_source_sync_job:

try:
    ds_sync_response = kendra.start_data_source_sync_job(Id=datasource_id, IndexId=index_id)
    pprint.pprint(ds_sync_response)
except ClientError as e:
    print('%s' % e)

The response should look like the following code:

{'ExecutionId': '6574acd6-e66f-4797-85cf-278dce9256b4',
 'ResponseMetadata': {'HTTPHeaders': {'content-length': '54',
                                      'content-type': 'application/x-amz-json-1.1',
                                      'date': 'Mon, 20 Jul 2020 15:54:24 GMT',
                                      'x-amzn-requestid': '415547b2-d095-4501-b6ad-eba4b731d109'},
                      'HTTPStatusCode': 200,
                      'RequestId': '415547b2-d095-4501-b6ad-eba4b731d109',
                      'RetryAttempts': 0}}
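
Starting a sync job only returns an execution ID, so you may want to check how the job ended. The sketch below summarizes a sync job history list such as the History value returned by list_data_source_sync_jobs. The helper names are hypothetical and the sample records are made up for illustration, so the example runs without AWS access.

```python
from datetime import datetime

def latest_sync_job(history):
    """Return the most recently started job from a History list, or None."""
    return max(history, key=lambda job: job["StartTime"], default=None)

def describe_job(job):
    """Format one sync job as 'ExecutionId: Status (error message)'."""
    line = "%s: %s" % (job["ExecutionId"], job["Status"])
    if job.get("ErrorMessage"):
        line += " (" + job["ErrorMessage"] + ")"
    return line

# Made-up history records shaped like sync job history entries.
sample_history = [
    {"ExecutionId": "job-1", "Status": "FAILED",
     "StartTime": datetime(2020, 7, 20, 15, 0),
     "ErrorMessage": "The URLs specified in the data source configuration aren't valid."},
    {"ExecutionId": "job-2", "Status": "SUCCEEDED",
     "StartTime": datetime(2020, 7, 20, 16, 0)},
]
print(describe_job(latest_sync_job(sample_history)))  # job-2: SUCCEEDED
```

With a real client, you would get the list from kendra.list_data_source_sync_jobs(Id=datasource_id, IndexId=index_id)["History"].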


Finally, you can query your index. See the following code:

response = kendra.query(
    IndexId=index_id,
    QueryText='Is there a service that has 11 9s of durability?')
if response['TotalNumberOfResults'] > 0:
    print(response['ResultItems'][0]['DocumentExcerpt']['Text'])
    print("More information: "+response['ResultItems'][0]['DocumentURI'])
else:
    print('No results found, please try a different search term.')

You will get a result like the following code:

Amazon S3 has a data durability of 11 nines. 
For transactional data storage, customers have the option to take advantage of the fully 
managed Amazon Relational Database Service (Amazon RDS) that supports Amazon 
Aurora, PostgreSQL, MySQL, MariaDB, Oracle, and Microsoft SQL Server with high 
More information:
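
Because Topic was added to the index as a facetable attribute, you can also narrow a query to a single topic. The following sketch only builds the request arguments, so it runs without AWS access. The helper name and the Storage topic value are assumptions for illustration.

```python
def topic_query(index_id, text, topic=None):
    """Build keyword arguments for kendra.query, optionally filtered by Topic."""
    request = {
        "IndexId": index_id,
        "QueryText": text,
        # Ask for the Topic facet counts alongside the results.
        "Facets": [{"DocumentAttributeKey": "Topic"}],
    }
    if topic is not None:
        # Only return documents whose Topic attribute equals the given value.
        request["AttributeFilter"] = {
            "EqualsTo": {"Key": "Topic", "Value": {"StringValue": topic}}
        }
    return request

# The index ID is the example from this post; 'Storage' is a hypothetical topic.
request = topic_query("3311b507-bfef-4e2b-bde9-7c297b1fd13b",
                      "Is there a service that has 11 9s of durability?",
                      topic="Storage")
print(request)
```

You would then run the query with response = kendra.query(**request).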

Common errors

Each of the errors noted in this section can occur if you’re using the Amazon Kendra console or the Amazon Kendra API.

You should look at the CloudWatch logs and error messages returned on the Amazon Kendra console or via the Amazon Kendra API. The CloudWatch logs help you determine the reason for a particular error, whether you are experiencing it using the console or programmatically.

Common errors when trying to access SharePoint Online as a data source are:

  • Secrets Manager errors
  • SharePoint credential errors
  • IAM role errors
  • URL errors

In the following sections, we provide more details on how to address each error.

Secrets Manager errors

You might get an error message from Secrets Manager stating that your role doesn’t have permissions to retrieve the secret value. This can occur when you create a new secret in Secrets Manager and you don’t add read permissions to the data source role.

Here’s an example of the error message:

Create a DataSource
('An error occurred (ValidationException) when calling the CreateDataSource '
 'operation: Secrets Manager throws the exception: User: '
 'arn:aws:sts::<YOUR ACCOUNT NUMBER>:assumed-role/Kendra-Datasource/DataSourceConfigurationValidator '
 'is not authorized to perform: secretsmanager:GetSecretValue on resource: '
 <YOUR SECRET ARN> '(Service: AWSSecretsManager; Status Code: 400; Error Code: '
 'AccessDeniedException; Request ID: 886ff6ac-f8f3-46b0-94dc-8286fd1682c1; '
 'Proxy: null)')

To address this, make sure that your data source role has a policy attached with GetSecretValue permissions on the secret.

If you’re troubleshooting on the console, complete the following steps:

  1. On the Secrets Manager console, copy the secret ARN.

The secret ARN is listed in the Secret details section.

  2. On the IAM console, choose Roles.
  3. Search for the role associated with Amazon Kendra.
  4. Choose the role that you assigned to the data source.
  5. Choose Add inline policy.
  6. For Select Service, choose Secrets Manager.
  7. In the visual editor, under Access level, choose Read.
  8. Choose GetSecretValue.
  9. Under Resources, select Specific.
  10. Choose Add ARN.
  11. For Specify ARN for secret, enter the secret ARN you copied.
  12. Review and choose Create Policy.

You can now go back to your Amazon Kendra data source setup and finish the process.
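
If you’re working programmatically instead, the same fix can be sketched with the IAM put_role_policy operation. Treat this as an illustration rather than the definitive approach: the function names and the policy name are hypothetical, the secret ARN is a made-up example, and the caller needs permission to modify the role.

```python
import json

def secret_read_policy(secret_arn):
    """Inline policy document that allows reading one specific secret."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "secretsmanager:GetSecretValue",
                "Resource": secret_arn,
            }
        ],
    }

def attach_secret_read_policy(iam, role_name, secret_arn,
                              policy_name="KendraReadSharePointSecret"):
    """Attach the inline policy to the data source role using an IAM client."""
    iam.put_role_policy(
        RoleName=role_name,
        PolicyName=policy_name,
        PolicyDocument=json.dumps(secret_read_policy(secret_arn)),
    )

# Made-up secret ARN, printed only to show the resulting document.
example_arn = "arn:aws:secretsmanager:us-east-1:111122223333:secret:sharepoint-example"
print(json.dumps(secret_read_policy(example_arn), indent=2))
```

With Boto3, you would call attach_secret_read_policy(boto3.client("iam"), "Kendra-Datasource", your_secret_arn), where Kendra-Datasource is the data source role used in this post.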

SharePoint credential errors

Another common issue can be caused by a failure to crawl the site. On the sync details, the error message may say something about invalid URLs. To dive deeper into the issue, select the error message.

This takes you to the CloudWatch console, where you can enter a query on the latest logs and choose Run Query.

The results appear on the Logs tab.

You can see three records matching the logStream generated by the data source sync job.

For the first document, the error message is “The URLs specified in the data source configuration aren’t valid. The URLs should be either a SharePoint site or list. Check the URLs and try the request again.”

However, this is the last message generated. Let’s see what Document #2 shows us:

You may receive an invalid URL error for the data source configuration that is actually triggered by an underlying authentication problem.

The easiest way to address this issue is to generate new credentials for the Amazon Kendra crawler.

  1. To set up a user for the crawler to run, log in to your SharePoint Online configuration and open the Microsoft 365 Admin page.
  2. In the User management section, choose Add user.
  3. Fill in the form with the details for the crawler.

For this use case, you don’t need to assign a license for this user.

  4. Set it up as a user without admin center access.
  5. After you create the user, record the generated password because you need to modify it later.
  6. Go back to your site and choose the members icon on the top right of the screen.
  7. To add a member, choose Add members.
  8. Add the new user you just created and choose Save.
  9. From the drop-down menu under the new user’s name, choose Owner.

IAM role issues

Another common issue is caused by lack of permissions for the IAM role used to crawl your data source.

You can identify this issue on the CloudWatch logs. See the following code:

{
    "CrawlStatus": "ERROR",
    "ErrorCode": "InvalidRequest",
    "ErrorMessage": "Amazon Kendra can't run the BatchDeleteDocument action with the specified role. Make sure that the role grants the kendra:BatchDeleteDocument permission."
}

The permissions needed for this task are BatchPutDocument and BatchDeleteDocument.

Make sure that the resource matches your index ID (you can find your index ID on the index details page on the console).
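
When a sync job produces many log records, it can be quicker to filter them in code than to read them one by one. The sketch below keeps only the records whose CrawlStatus is ERROR. It assumes each log message is a JSON document shaped like the record shown in this section; the function name is hypothetical, and the second sample message is made up.

```python
import json

def crawl_errors(messages):
    """Return (ErrorCode, ErrorMessage) pairs for messages with CrawlStatus ERROR."""
    errors = []
    for message in messages:
        try:
            record = json.loads(message)
        except ValueError:
            continue  # skip log lines that are not JSON
        if isinstance(record, dict) and record.get("CrawlStatus") == "ERROR":
            errors.append((record.get("ErrorCode"), record.get("ErrorMessage")))
    return errors

sample_messages = [
    json.dumps({
        "CrawlStatus": "ERROR",
        "ErrorCode": "InvalidRequest",
        "ErrorMessage": "Amazon Kendra can't run the BatchDeleteDocument action "
                        "with the specified role. Make sure that the role grants "
                        "the kendra:BatchDeleteDocument permission.",
    }),
    "plain text line that is not JSON",
]
for code, text in crawl_errors(sample_messages):
    print(code + ": " + text)
```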

Wrong SharePoint site URL

You may experience an error stating you need to provide a URL. Make sure your site URL is under


You have now learned how to ingest the documents from your SharePoint Online site into your Amazon Kendra index, either through the console or programmatically. In this example, you loaded some AWS whitepapers into your index. You can now run queries such as “What AWS service has 11 nines of durability?”

Finally, don’t forget to check the other blog posts about Amazon Kendra!

About the Author

Juan Pablo Bustos is an AI Services Specialist Solutions Architect at Amazon Web Services, based in Dallas, TX. Outside of work, he loves spending time writing and playing music as well as trying random restaurants with his family.




David Shute is a Senior ML GTM Specialist at Amazon Web Services focused on Amazon Kendra. When not working, he enjoys hiking and walking on a beach.


Six strategic areas identified for shared faculty hiring in computing

Nearly every aspect of the modern world is being transformed by computing. As computing technology continues to revolutionize the way people live, work, learn, and interact, computing research and education are increasingly playing a role in a broad range of academic disciplines, and are in turn being shaped by this expanding breadth.

To connect computing and other disciplines in addressing critical challenges and opportunities facing the world today, the MIT Stephen A. Schwarzman College of Computing is planning to create 25 new faculty positions that will be shared between the college and an MIT department or school. Hiring for these new positions will be focused on six strategic areas of inquiry, to build capacity at MIT in key computing domains that cut across departments and schools. The shared faculty members are expected to engage in research and teaching that contributes to their home department, that is of mutual value to that department and the college, and that helps form and strengthen cross-departmental ties.

“These new shared faculty positions present an unprecedented opportunity to develop crucial areas at MIT which connect computing with other disciplines,” says Daniel Huttenlocher, dean of the MIT Schwarzman College of Computing. “By coordinated hiring between the college and departments and schools, we expect to have significant impact with multiple touch points across MIT.”

The six strategic areas and the schools expected to be involved in hiring for each are as follows:

Social, Economic, and Ethical Implications of Computing and Networks. Associated schools: School of Humanities, Arts, and Social Sciences and MIT Sloan School of Management.

There have been tremendous advances in new digital platforms and algorithms, which have already transformed our economic, social, and even political lives. But the future societal implications of these technologies and the consequences of the use and misuse of massive social data are poorly understood. There are exciting opportunities for building on the growing intellectual connections between computer science, data science, and social science and humanities, in order to bring a better conceptual framework to understand the social and economic implications, ethical dimensions, and regulation of these technologies.

Focusing on the interplay between computing systems and our understanding of individuals and societal institutions, this strategic hiring area will include faculty whose work focuses on the broader consequences of the changing digital and information environment, market design, digital commerce and competition, and economic and social networks. Issues of interest include how computing and AI technologies have shaped and are shaping the work of the future; how social media tools have reshaped political campaigns, changed the nature and organization of mass protests, and spurred governments to either reduce or dramatically enhance censorship and social control; increasing challenges in adjudicating what information is reliable, what is slanted, and what is entirely fake; conceptions of privacy, fairness, and transparency of algorithms; and the effects of new technologies on democratic governance.

Computing and Natural Intelligence: Cognition, Perception, and Language. Associated schools: School of Science; School of Humanities, Arts, and Social Sciences; and School of Architecture and Planning.

Intelligence (what it is, how the brain produces it, and how it can be engineered) is simultaneously one of the greatest open questions in the natural sciences and one of the most important engineering challenges of our time. Significant advances in computing and machine learning have enabled a better understanding of the brain and the mind. Concurrently, neuroscience and cognitive science have started to give meaningful engineering guidance to AI and related computing efforts. Yet, huge gaps remain in connecting the science and engineering of intelligence.

Integrating science, computing, and social sciences and humanities, this strategic hiring area aims to address the gap between science and engineering of intelligence, in order to make transformative advances in AI and deepen our understanding of natural intelligence. Hiring in this area is expected to advance a holistic approach to understanding human perception and cognition through work such as the study of computational properties of language by bridging linguistic theory, cognitive science, and computer science; improving the art of listening by re-engineering music through music classification and machine learning, music cognition, and new interfaces for musical expression; discovering how artificial systems might help explain natural intelligence and vice versa; and seeking ways in which computing can aid in human expression, communications, health, and meaning.

Computing in Health and Life Sciences. Associated schools: School of Engineering; School of Science; and MIT Sloan School of Management.

Computing is increasingly becoming an indispensable tool in the health and life sciences. A key area is facilitating new approaches to identifying molecular and biomolecular agents with desired functions and for discovering new medications and new means of diagnosis. For instance, machine learning provides a unique opportunity in the pursuit of molecular and biomolecular discovery to parameterize and augment physics-based models, or possibly even replace them, and enable a revolution in molecular science and engineering. Another major area is health-care delivery, where novel algorithms, high performance computing, and machine learning offer new possibilities to transform health monitoring and treatment planning, facilitating better patient care, and enabling more effective ways to help prevent disease. In diagnosis, machine learning methods hold the promise of improved detection of diseases, increasing both specificity and sensitivity of imaging and testing.

This strategic area aims to hire faculty who help create transformative new computational methods in health and life sciences, while complementing the considerable existing work at MIT by forging additional connections. The broad scope ranges from computational approaches to fundamental problems in molecular design and synthesis for human health; to reshaping health-care delivery and personalized medicine; to understanding radiation effects and optimizing dose delivery on target cells; to improving tracing, imaging, and diagnosis techniques.

Computing for Health of the Planet. Associated schools: School of Engineering; School of Science; and School of Architecture and Planning.

The health of the planet is one of the most important challenges facing humankind today. Rapid industrialization has led to a number of serious threats to human and ecosystem health, including climate change, unsafe levels of air and water pollution, coastal and agricultural land erosion, and many others. Ensuring the health and safety of our planet necessitates an interdisciplinary approach that connects scientific understanding, engineering solutions, and social, economic, and political aspects with new computational methods, in order to provide data-driven models and solutions for clean air, usable water, resilient food, efficient transportation systems, and sustainable sources of energy.

This strategic hiring area will help facilitate such collaborations by bringing together expertise that will enable us to advance physical understanding of low-carbon energy solutions, earth-climate modelling, and urban planning through high performance computing, transformational numerical methods, and/or machine learning techniques.

Computing and Human Experience. Associated schools: School of Humanities, Arts, and Social Sciences and School of Architecture and Planning.

Computing and digital technologies are challenging the very ways in which people understand reality and our role in it. These technologies are embedded in the everyday lives of people around the world, and while frequently highly useful, they can reflect cultural assumptions and technological heritage, even though they are often viewed as being neutral prescriptions for structuring the world. Indeed, as becomes increasingly apparent, these technologies are able to alter individual and societal perceptions and actions, or affect societal institutions, in ways that are not broadly understood or intended. Moreover, although these technologies are conventionally developed for improved efficacy or efficiency, they can also provide opportunities for less utilitarian purposes such as supporting introspection and personal reflection.

This strategic hiring area focuses on growing the set of scholars in the social sciences, humanities, and computing who examine technology designs, systems, policies, and practices that can address the dual challenges of the lack of understanding of these technologies and their implications, including the design of systems that may help ameliorate rather than exacerbate inequalities. It further aims to develop techniques and systems that help people interpret and gain understanding from societal and historical data, including in humanities disciplines such as comparative literature, history, and art and architectural history.

Quantum Computing. Associated schools: School of Engineering and School of Science.

One of the most promising directions for continuing improvements in computing power comes from quantum mechanics. In the coming years, new hardware, algorithms, and discoveries offer the potential to dramatically increase the power of quantum computers far beyond current machines. Achieving these advances poses challenges that span multiple scientific and engineering fields, from quantum hardware to quantum computing algorithms. Potential quantum computing applications span a broad range of fields, including chemistry, biology, materials science, atmospheric modeling, urban system simulation, nuclear engineering, finance, optimization, and others, requiring a deep understanding of both quantum computing algorithms and the problem space.

This strategic hiring area aims to build on MIT’s rich set of activities in the space to catalyze research and education in quantum computing and quantum information across the Institute, including the study of quantum materials; developing robust controllable quantum devices and networks that can faithfully transmit quantum information; and new algorithms for machine learning, AI, optimization, and data processing to fully leverage the promise of quantum computing.

A coordinated approach

Over the past few months, the MIT Schwarzman College of Computing has undertaken a strategic planning exercise to identify key areas for hiring the new shared faculty. The process has been led by Huttenlocher, together with MIT Provost Martin Schmidt and the deans of the five schools — Anantha Chandrakasan, dean of the School of Engineering; Melissa Nobles, Kenan Sahin Dean of the School of Humanities, Arts, and Social Sciences; Hashim Sarkis, dean of the School of Architecture and Planning; David Schmittlein, John C. Head III Dean of MIT Sloan; and Michael Sipser, dean of the School of Science — beginning with input from departments across the Institute.

This input was in the form of proposals for interdisciplinary computing areas that were solicited from department heads. A total of 29 proposals were received. Over a six-week period, the committee worked with proposing departments to identify strategic hiring themes. The process yielded the six areas that cover several critically important directions. 

“These areas not only bring together computing with numerous departments and schools, but also involve multiple modes of academic inquiry, offering opportunities for new collaborations in research and teaching across a broad range of fields,” says Schmidt. “I’m excited to see us launch this critical part of the college’s mission.”

The college will also coordinate with each of the five schools to ensure that diversity, equity, and inclusion are at the forefront for all of the hiring areas.

Hiring for the 2020-21 academic year

While the number of searches and involved schools will vary from year to year, the plan for the coming academic year is to have five searches, one with each school. These searches will be in three of the six strategic hiring areas as follows:

Social, Economic, and Ethical Implications of Computing and Networks will focus on two searches, one with the Department of Philosophy in the School of Humanities, Arts, and Social Sciences, and one with the MIT Sloan School of Management.

Computing and Natural Intelligence: Cognition, Perception, and Language will focus on one search with the Department of Brain and Cognitive Sciences in the School of Science.

Computing for Health of the Planet will focus on two searches, one with the Department of Urban Studies and Planning in the School of Architecture and Planning, and one with a department to be identified in the School of Engineering.

Read More

F19 – Topics in data analysis

This series of blog posts is based on the Fall 2019 10-718 Data Analysis class at Carnegie Mellon University, taught by Leila Wehbe, with the assistance of Jacob Tyo, Aria Wang and Fabricio Flores. The blog posts were written by the students and edited by the instructors and the ML@CMU blog team.

What is data analysis? A simple definition is: the application of machine learning and statistical methods to real world data to solve a problem. While this statement is simple, data analysis eventually requires expertise from a vast number of disciplines such as the real world domain in question (e.g. healthcare, specific scientific field, finance, etc.) and machine learning and statistics, but could also require knowledge from other fields as diverse as computing or policy or law. The complexity of data science leads to a plethora of possible pitfalls, with no clear instructions on how to avoid them. It is very difficult to construct a specific set of such instructions because every application domain has very specific setups, goals and constraints. 

We focus here on these issues from the perspective of a machine learning expert and attempt to provide some general guidelines to avoid pitfalls. In some cases where it’s difficult to provide guidelines, we present a set of notable mistakes to avoid. Unlike usual machine learning classes or tutorials that focus on introducing methods and algorithms, we focus on the higher level of motivating the use of these algorithms and testing the generalizability of their conclusions. We focus on the connection between machine learning and its practice.

In this series of educational blog posts, we highlight components of data analysis by focusing on 7 topics. Each topic is based on key papers, book chapters or blog posts that we have discussed in class. For each topic, we highlight pitfalls to watch out for and propose solutions when possible, some inspired by the literature and others by class discussion. We invite the readers to share their comments with us to help us improve the posts.

1 – The Importance of Domain Knowledge

Why is domain knowledge important in data science? This blog post shows the value of domain knowledge in data analysis from multiple perspectives. It includes some simple case studies to demonstrate how domain knowledge can help us with every stage of the data analysis workflow, focuses on several examples to give an in-depth view of the use of domain knowledge in specific tasks and includes an interesting discussion about the relationship between domain knowledge and machine learning algorithms.

2 – Data Exploration

Although practitioners sometimes tend to spend more time on model architecture design and parameter tuning, the importance of data exploration should not be ignored. If the data breaks the assumptions of the model or contains errors, it will not be possible to get the desired results even with the best of models. This blog post introduces a protocol for data exploration along with several methods that may be useful in this process, including statistical and visualization methods, as well as examples of traps in data exploration and of how data exploration helps reduce bias in the dataset.

3 – Baselines

A baseline guides our selection of more complex models and provides insights into the task at hand. Nonetheless, such a useful tool is not easy to handle, and many researchers tend to compare their novel models against weak baselines, which poses a problem in the current research sphere because it leads to optimistic but false results. This blog provides a definition of different types of baselines, case studies of examples in which they are not correctly used, a discussion of such issues, and questions that are still open-ended.
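As a minimal sketch of the simplest kind of baseline, the Python snippet below scores a predictor that always outputs the most common training label. The toy labels and the function name `majority_baseline_accuracy` are illustrative assumptions and are not taken from the blog post itself.

```python
from collections import Counter


def majority_baseline_accuracy(train_labels, test_labels):
    """Accuracy of always predicting the most common training label."""
    majority_label, _ = Counter(train_labels).most_common(1)[0]
    correct = sum(1 for y in test_labels if y == majority_label)
    return majority_label, correct / len(test_labels)


# Hypothetical imbalanced toy data: 80% of the examples are "neg".
train = ["neg"] * 80 + ["pos"] * 20
test = ["neg"] * 8 + ["pos"] * 2

label, acc = majority_baseline_accuracy(train, test)
print(label, acc)
```

On data this imbalanced, a model that reports 80% accuracy has not yet beaten the trivial predictor, which is why a new model should be compared against at least this kind of baseline.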

4 – The Overfitting Iceberg

Overfitting, as a conventional and important topic of machine learning, has been well studied, with a large body of solid fundamental theory and empirical evidence. However, as breakthroughs in deep learning have rapidly changed science and society in recent years, practitioners have observed many phenomena that seem to contradict classical learning theory. This blog aims to understand the nuances and subtleties behind this apparent contradiction by introducing a proposed mechanism for their emergence; it also summarizes some state-of-the-art strategies to deal with overfitting in modern DL practice.

5 – Reproducibility

It is now widely agreed that reproducibility is a key part of any scientific process and that it should be considered a regular practice to make our research reproducible. Despite this widely accepted notion, many fields including machine learning are experiencing a reproducibility crisis. This blog explains the different definitions of reproducibility, describes the reproducibility crisis, and discusses its implications for scientific research and its more general impacts on society.

6 – Interpretability

The objectives that machine learning models optimize for do not always reflect the actual desiderata of the task at hand. Interpretability in models allows us to evaluate their decisions and obtain information that the objective alone cannot confer. Interpretability takes many forms and can be difficult to define; this blog explores general frameworks and sets of definitions in which model interpretability can be evaluated and compared, and analyzes several well-known examples of interpretability methods in the context of this framework.

7 – Causal Inference

The rules of causality play a role in almost everything we do and it is reasonable to assume that considering causality in a world model will be a critical component of intelligent systems in the future. However, the formalisms, mechanisms, and techniques of causal inference remain a niche subject few explore. In this blog we formally consider the statement “association does not equal causation”, review some of the basics of causal inference, discuss causal relationship discovery, and describe a few examples of the benefits of utilizing causality in AI research.

Read More

Toward a machine learning model that can reason about everyday actions

The ability to reason abstractly about events as they unfold is a defining feature of human intelligence. We know instinctively that crying and writing are means of communicating, and that a panda falling from a tree and a plane landing are variations on descending. 

Organizing the world into abstract categories does not come easily to computers, but in recent years researchers have inched closer by training machine learning models on words and images infused with structural information about the world, and how objects, animals, and actions relate. In a new study at the European Conference on Computer Vision this month, researchers unveiled a hybrid language-vision model that can compare and contrast a set of dynamic events captured on video to tease out the high-level concepts connecting them. 

Their model did as well as or better than humans at two types of visual reasoning tasks — picking the video that conceptually best completes the set, and picking the video that doesn’t fit. Shown videos of a dog barking and a man howling beside his dog, for example, the model completed the set by picking the crying baby from a set of five videos. Researchers replicated their results on two datasets for training AI systems in action recognition: MIT’s Multi-Moments in Time and DeepMind’s Kinetics.

“We show that you can build abstraction into an AI system to perform ordinary visual reasoning tasks close to a human level,” says the study’s senior author Aude Oliva, a senior research scientist at MIT, co-director of the MIT Quest for Intelligence, and MIT director of the MIT-IBM Watson AI Lab. “A model that can recognize abstract events will give more accurate, logical predictions and be more useful for decision-making.”

As deep neural networks become expert at recognizing objects and actions in photos and video, researchers have set their sights on the next milestone: abstraction, and training models to reason about what they see. In one approach, researchers have merged the pattern-matching power of deep nets with the logic of symbolic programs to teach a model to interpret complex object relationships in a scene. Here, in another approach, researchers capitalize on the relationships embedded in the meanings of words to give their model visual reasoning power.

“Language representations allow us to integrate contextual information learned from text databases into our visual models,” says study co-author Mathew Monfort, a research scientist at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL). “Words like ‘running,’ ‘lifting,’ and ‘boxing’ share some common characteristics that make them more closely related to the concept ‘exercising,’ for example, than ‘driving.’ ”

Using WordNet, a database of word meanings, the researchers mapped the relation of each action-class label in Moments and Kinetics to the other labels in both datasets. Words like “sculpting,” “carving,” and “cutting,” for example, were connected to higher-level concepts like “crafting,” “making art,” and “cooking.” Now when the model recognizes an activity like sculpting, it can pick out conceptually similar activities in the dataset. 
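The mapping described above can be pictured with a toy lookup table. The sketch below is only an illustration: the `PARENTS` dictionary is a hand-written, hypothetical stand-in for relations that the researchers derived from WordNet, and the helper names are assumptions rather than anything from the study.

```python
# Hypothetical toy hierarchy: each action label maps to the higher-level
# concepts it belongs to. The real study derived such links from WordNet.
PARENTS = {
    "sculpting": {"crafting", "making art"},
    "carving": {"crafting", "making art", "cooking"},
    "cutting": {"crafting", "cooking"},
    "running": {"exercising"},
    "lifting": {"exercising"},
    "boxing": {"exercising"},
    "driving": {"traveling"},
}


def shared_concepts(actions):
    """Return the higher-level concepts common to every action in the set."""
    return set.intersection(*(PARENTS[a] for a in actions))


def related_actions(action):
    """Return the other actions that share at least one concept with `action`."""
    return sorted(
        other
        for other, concepts in PARENTS.items()
        if other != action and concepts & PARENTS[action]
    )


print(shared_concepts(["sculpting", "carving", "cutting"]))  # {'crafting'}
print(related_actions("sculpting"))  # ['carving', 'cutting']
```

In this toy table, "running" and "driving" share no parent concept, which mirrors the point that some actions are more closely related to "exercising" than others.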

This relational graph of abstract classes is used to train the model to perform two basic tasks. Given a set of videos, the model creates a numerical representation for each video that aligns with the word representations of the actions shown in the video. An abstraction module then combines the representations generated for each video in the set to create a new set representation that is used to identify the abstraction shared by all the videos in the set.
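A rough sketch of the two tasks, set completion and odd-one-out, is shown below. It swaps in plain mean pooling and cosine similarity for the learned abstraction module, so it is not the authors' method; the three-dimensional embeddings and all function names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def set_representation(vectors):
    """Mean-pool a set of embeddings (a stand-in for the abstraction module)."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def complete_set(reference, candidates):
    """Pick the candidate closest to the representation of the reference set."""
    rep = set_representation(reference)
    return max(candidates, key=lambda name: cosine(rep, candidates[name]))


def odd_one_out(videos):
    """Pick the video least similar to the pooled representation of the others."""
    def score(name):
        others = [vec for key, vec in videos.items() if key != name]
        return cosine(set_representation(others), videos[name])
    return min(videos, key=score)


# Hypothetical embeddings; axes loosely mean (vocalizing, descending, exercising).
reference = [[0.9, 0.1, 0.0], [0.8, 0.0, 0.2]]  # dog barking, man howling
candidates = {
    "baby crying": [0.95, 0.05, 0.0],
    "plane landing": [0.0, 1.0, 0.0],
    "person jogging": [0.1, 0.0, 0.9],
}
print(complete_set(reference, candidates))  # baby crying

videos = {
    "dog barking": [0.9, 0.1, 0.0],
    "man howling": [0.8, 0.0, 0.2],
    "plane landing": [0.0, 1.0, 0.0],
}
print(odd_one_out(videos))  # plane landing
```

Mean pooling is the simplest way to turn several video vectors into one set vector; a learned module can weight features differently, which is presumably where the behavior reported in the study comes from.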

To see how the model would do compared to humans, the researchers asked human subjects to perform the same set of visual reasoning tasks online. To their surprise, the model performed as well as humans in many scenarios, sometimes with unexpected results. In a variation on the set completion task, after watching a video of someone wrapping a gift and covering an item in tape, the model suggested a video of someone at the beach burying someone else in the sand. 

“It’s effectively ‘covering,’ but very different from the visual features of the other clips,” says Camilo Fosco, a PhD student at MIT who is co-first author of the study with PhD student Alex Andonian. “Conceptually it fits, but I had to think about it.”

Limitations of the model include a tendency to overemphasize some features. In one case, it suggested completing a set of sports videos with a video of a baby and a ball, apparently associating balls with exercise and competition.

A deep learning model that can be trained to “think” more abstractly may be capable of learning from less data, the researchers say. Abstraction also paves the way toward higher-level, more human-like reasoning.

“One hallmark of human cognition is our ability to describe something in relation to something else — to compare and to contrast,” says Oliva. “It’s a rich and efficient way to learn that could eventually lead to machine learning models that can understand analogies and are that much closer to communicating intelligently with us.”

Other authors of the study are Allen Lee from MIT, Rogerio Feris from IBM, and Carl Vondrick from Columbia University.

Read More

Speed Reader: Startup Primer Helps Analysts Make Every Second Count

Expected to read upwards of 200,000 words daily from hundreds, if not thousands, of documents, financial analysts are asked to perform the impossible.

Primer is using AI to apply the equivalent of compression technology to this mountain of data to help make work easier for them as well as analysts across a range of other industries.

The five-year-old company, based in San Francisco, has built a natural language processing and machine learning platform that essentially does all the reading and collating for analysts in a tiny fraction of the time it would normally take them.

Whatever a given analyst might be monitoring, whether it’s a natural disaster, credit default or geo-political event, Primer slashes hours of human research into a few seconds of analysis.

The software combs through massive amounts of content, highlights pertinent information such as quotes and facts, and assembles them into related lists. It distills vast topics into the essentials in seconds.

“We train the models to mimic that human behavior,” said Barry Dauber, vice president of commercial sales at Primer. “It’s really a powerful analyst platform that uses natural language processing and machine learning to surface and summarize information at scale.”

The Power of 1,000 Analysts

Using Primer’s platform running on NVIDIA GPUs is akin to giving an analyst a virtual staff that delivers near-instantaneous results. The software can analyze and report on tens of thousands of documents from financial reports, internal proprietary content, social media, 30,000-40,000 news sources and elsewhere.

“Every time an analyst wants to know something about Syria, we cluster together documents about Syria, in real time,” said Ethan Chan, engineering manager and staff machine learning engineer at Primer. “The goal is to reduce the amount of effort an analyst has to expend to process more information.”

Primer has done just that to the relief of its customers, which include financial services firms, government agencies and an array of Fortune 500 companies.

As powerful as Primer’s natural language processing algorithms are, up until two years ago they required 20 minutes to deliver results because of the complexity of the document clustering they were asking CPUs to support.

“The clustering was the bottleneck,” said Chan. “Because we have to compare every document with every other document, we’re looking at nearly a trillion flops for a million documents.”
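The scale in that quote follows from simple counting. The sketch below treats each document-to-document comparison as a single operation, which is a simplifying assumption (a real similarity computation costs more than one floating-point operation); the function name is hypothetical.

```python
def pairwise_comparisons(n_docs, ordered=True):
    """Number of document-to-document comparisons for n_docs documents."""
    if ordered:
        # Every document against every document, as in a full similarity matrix.
        return n_docs * n_docs
    # Each unordered pair counted once, excluding self-comparisons.
    return n_docs * (n_docs - 1) // 2


n = 1_000_000
print(f"{pairwise_comparisons(n):.1e}")                 # 1.0e+12
print(f"{pairwise_comparisons(n, ordered=False):.1e}")  # 5.0e+11
```

Either way the count grows quadratically with the number of documents, so doubling the collection roughly quadruples the work.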

GPUs Slash Analysis Times

Primer’s team added GPUs to the clustering process in 2018 after joining NVIDIA Inception — an accelerator program for AI startups — and quickly slashed those analysis times to mere seconds.

Primer’s GPU work unfolds in the cloud, where it makes equally generous use of AWS, Google Cloud and Microsoft Azure. For prototyping and training of its NLP algorithms such as Named Entity Recognition and Headline Generation (on public, open-source news datasets), Primer uses instances with NVIDIA V100 Tensor Core GPUs.

Model serving and clustering happen on instances with NVIDIA T4 GPUs, which can be dialed up and down based on clustering needs. The company also uses CuPy, a NumPy-compatible array library that allows for CUDA-powered GPU acceleration in Python.

But what Chan believes is Primer’s most innovative use of GPUs is in acceleration of its clustering algorithms.

“Grouping documents together is not something anyone else is doing,” he said, adding that Primer’s success in this area further establishes that “you can use NVIDIA for new use cases and new markets.”

Flexible Delivery Model

With the cloud-based SaaS model, customers can increase or decrease their analysis speed, depending on how much they want to spend on GPUs.

Primer’s offering can also be deployed in a customer’s data center. There, the models can be trained on a customer’s IP and clustering can be performed on premises. This is an important consideration for those working in highly regulated or sensitive markets.

Analysts in finance and national security are currently Primer’s primary users. However, the company could help anyone tasked with combing through mounds of data actually make decisions instead of preparing to make decisions.

The post Speed Reader: Startup Primer Helps Analysts Make Every Second Count appeared first on The Official NVIDIA Blog.

Read More

Rise and Sunshine: NASA Uses Deep Learning to Map Flows on Sun’s Surface, Predict Solar Flares

Looking directly at the sun isn’t recommended — unless you’re doing it with AI, which is what NASA is working on.

The surface of the sun, which is the layer you can see with the eye, is actually bubbly: intense heat creates a boiling reaction, similar to water at high temperature. So when NASA researchers magnify images of the sun with a telescope, they can see tiny blobs, called granules, moving on the surface.

Studying the movement and flows of the granules helps the researchers better understand what’s happening underneath that outer layer of the sun.

The computations for tracking the motion of granules require advanced imaging techniques. Using data science and GPU computing with NVIDIA Quadro RTX-powered HP Z8 workstations, NASA researchers have developed deep learning techniques to more easily track the flows on the sun’s surface.

RTX Flares Up Deep Learning Performance

When studying how storms and hurricanes form, meteorologists analyze the flows of winds in Earth’s atmosphere. For this same reason, it’s important to measure the flows of plasma in the sun’s atmosphere to learn more about the short- and long-term evolution of our nearest star.

This helps NASA understand and anticipate events like solar flares, which can affect power grids, communication systems like GPS or radios, or even put space travel at risk because of the intense radiation and charged particles associated with space weather.

“It’s like predicting earthquakes,” said Michael Kirk, research astrophysicist at NASA. “Since we can’t see very well beneath the surface of the sun, we have to take measurements from the flows on the exterior to infer what is happening subsurface.”

Granules are transported by plasma motions — hot ionized gas under the surface. To capture these motions, NASA developed customized algorithms best tailored to their solar observations, with a deep learning neural network that observes the granules using images from the Solar Dynamics Observatory, and then learns how to reconstruct their motions.

“Neural networks can generate estimates of plasma motions at resolutions beyond what traditional flow tracking methods can achieve,” said Benoit Tremblay from the National Solar Observatory. “Flow estimates are no longer limited to the surface — deep learning can look for a relationship between what we see on the surface and the plasma motions at different altitudes in the solar atmosphere.”

“We’re training neural networks using synthetic images of these granules to learn the flow fields, so it helps us understand precursor environments that surround the active magnetic regions that can become the source of solar flares,” said Raphael Attie, solar astronomer at NASA’s Goddard Space Flight Center.

NVIDIA GPUs were essential in training the neural networks because NASA needed to complete several training sessions with data preprocessed in multiple ways to develop robust deep learning models, and CPU power was not enough for these computations.

When using TensorFlow on a 72 CPU-core compute node, it took an hour to complete only one pass with the training data. Even in a CPU-based cloud environment, it would still take weeks to train all the models that the scientists needed for a single project.

With an NVIDIA Quadro RTX 8000 GPU, the researchers can complete one training in about three minutes — a 20x speedup. This allows them to start testing the trained models after a day instead of having to wait weeks.

“This incredible speedup enables us to try out different ways to train the models and make ‘stress tests,’ like preprocessing images at different resolutions or introducing synthetic errors to better emulate imperfections in the telescopes,” said Attie. “That kind of accelerated workflow completely changed the scope of what we can afford to explore, and it allows us to be much more daring and creative.”

With NVIDIA Quadro RTX GPUs, the NASA researchers can accelerate workflows for their solar physics projects, and they have more time to conduct thorough research with simulations to gain deeper understandings of the sun’s dynamics.

Learn more about NVIDIA and HP data science workstations, and listen to the AI Podcast with NASA.

The post Rise and Sunshine: NASA Uses Deep Learning to Map Flows on Sun’s Surface, Predict Solar Flares appeared first on The Official NVIDIA Blog.

Safety Validation of Black-Box Autonomous Systems

With autonomous systems becoming more capable, they are entering into safety-critical domains such as autonomous driving, aircraft collision avoidance, and healthcare. Ensuring the safe operations of these systems is a crucial step before they can be deployed and accepted by our society. Failure to perform the proper degree of safety validation can risk the loss of property or even human life.

The autonomous system design cycle.

Safety can be incorporated at various stages of the development of an autonomous system. Consider the above model for the design cycle of such a system. A necessary component of safety is the definition of a complete set of realistic and safe requirements, such as the Responsibility-Sensitive Safety model [1], which encodes commonsense driving rules (such as “don’t rear-end anyone” and “right of way is given, not taken”) into formal mathematical statements about what a vehicle is and is not allowed to do in a given driving scenario. Safety can also be incorporated directly into the design of the system through techniques such as safety-masked reinforcement learning (RL) [2], where a driving agent learns how to drive under the constraint that it only takes actions that have a minimal likelihood of causing a collision. Compared to traditional reinforcement learning techniques, which place no constraint on their exploratory actions, safety-masked RL results in a safer driving policy.

Once a prototype of a system is available, safety validation can be performed through testing, performance evaluation, and interpretation of the failure modes of the system. Testing can discover failures due to implementation bugs, missing requirements, and emergent behavior due to the complex interaction of subcomponents. For complex autonomous systems operating in physical environments, we cannot guarantee safety in all situations, so performance evaluation techniques can determine if the system is acceptably safe. The failure examples generated from testing can then be used to understand flaws in the systems and help engineers to fix them in the next iteration. Even with safety embedded in the process of defining requirements and system design, safety validation is a critical part of ensuring safe autonomy.

There are multiple ways to go about safety validation. White-box approaches use knowledge of the design of the system to construct challenging scenarios and evaluate the behavior of the system. They are often interpretable and can give a high degree of confidence in a system, but can suffer from problems of scalability. Modern autonomous systems employ complex components such as deep neural networks for perception and decision making. Despite improvements to white-box approaches for small neural networks [3], they don’t scale to the large networks used in practice. We can, however, trade formal guarantees for scalability by employing algorithms that treat the autonomous system as a black box.

Safety validation algorithms for black-box autonomous systems have become the preferred tool for validation since they scale to complex systems and can rely on the latest advancements in machine learning to become more effective. In this blog post, we cover the latest research in algorithms for the safety validation of black-box autonomous systems. For a more in-depth description of the following algorithms (including pseudocode), see our recent survey paper, A Survey of Algorithms for Black-Box Safety Validation.

The problem formulation for the safety validation of black-box autonomous systems.

The setup for safety validation algorithms for black-box systems is shown above. We have a black-box system that is going to be tested, such as an autonomous vehicle driving policy or an aircraft collision avoidance system. We assume we have a simulated environment in which the system takes actions after making observations with its sensors, while an adversary perturbs the environment through disturbances in an effort to make the system fail. Disturbances could include sensor noise, the behavior of other agents in the environment, or environmental conditions such as weather. The adversary may have access to the state of the environment which, for example, may describe the positions and velocities of all the vehicles and pedestrians in a driving scenario. The systems we care about usually operate over time in a physical environment, in which case the adversary seeks to find the sequence of disturbances, or disturbance trajectory $x$, that leads to failure. Finding a disturbance trajectory that leads to failure, rather than just a single disturbance, makes the problem much more challenging. We may also have a model of the disturbances in the environment, $p(x)$, that describes which sequences of disturbances are most likely. The disturbance model can be constructed through expert knowledge or learned from real-world data. The exact goal of the adversary may be one of the following:

  1. Falsification: Find any disturbance trajectory that leads to a failure.
  2. Most likely failure analysis: Find the most likely disturbance trajectory that leads to a failure (i.e., maximize $p(x)$).
  3. Estimation of the probability of failure: Determine how likely it is that any failure will occur based on knowledge of $p(x)$.
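
To make the three goals concrete, here is a minimal sketch on a made-up system. Everything in it is an assumption for illustration: the one-dimensional `simulate` dynamics, the failure `THRESHOLD`, and the independent standard-normal disturbance model are hypothetical stand-ins for a real simulator and a real disturbance model. Each goal is tackled by plain random sampling, which is the baseline the algorithms below try to improve on.

```python
import math
import random

THRESHOLD = 3.0  # hypothetical failure boundary for the toy system
HORIZON = 10     # number of disturbances in one trajectory


def simulate(disturbances):
    """Toy stand-in for a black-box simulator; returns the visited states.

    Each step the system halves its deviation from 0 (its corrective
    action), then the adversary's disturbance is added.
    """
    state, states = 0.0, []
    for d in disturbances:
        state = 0.5 * state + d
        states.append(state)
    return states


def is_failure(disturbances):
    """A trajectory fails if the state ever reaches the threshold."""
    return any(s >= THRESHOLD for s in simulate(disturbances))


def sample_trajectory(rng):
    """Assumed disturbance model: independent standard-normal disturbances."""
    return [rng.gauss(0.0, 1.0) for _ in range(HORIZON)]


def log_likelihood(disturbances):
    """Log-density of a trajectory under the assumed disturbance model."""
    return sum(-0.5 * d * d - 0.5 * math.log(2.0 * math.pi) for d in disturbances)


def falsify(rng, max_trials=2000):
    """Goal 1: return any failing trajectory found by random sampling."""
    for _ in range(max_trials):
        x = sample_trajectory(rng)
        if is_failure(x):
            return x
    return None


def most_likely_failure(rng, trials=2000):
    """Goal 2 (crudely): the most likely failure among sampled trajectories."""
    samples = (sample_trajectory(rng) for _ in range(trials))
    failures = [x for x in samples if is_failure(x)]
    return max(failures, key=log_likelihood) if failures else None


def estimate_failure_probability(rng, trials=20000):
    """Goal 3: plain Monte Carlo estimate of the failure probability."""
    return sum(is_failure(sample_trajectory(rng)) for _ in range(trials)) / trials
```

Random sampling only works here because failures of this toy system are not especially rare. When they are rare, or when each simulation is expensive, the guided searches described in the following sections become necessary.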

The adversary can use a variety of algorithms to generate disturbances. We will cover four categories: optimization, path planning, reinforcement learning, and importance sampling.


Optimization

Optimization approaches search over the space of possible disturbance trajectories to find those that lead to a system failure. Optimization techniques can involve adaptive sampling or a coordinated search, both of which are guided by a cost function that measures the level of safety for a particular disturbance trajectory. The lower the cost, the closer we are to a failure. Some common cost functions include

  • Miss distance: Often a physically-motivated measure of safety such as the point of closest approach between two aircraft or two vehicles.
  • Temporal logic robustness: When the safety requirements of a system are expressed formally using temporal logic, a language used to reason about events over time, the robustness [4] measures how close a trajectory is to violating the specification [5].
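
Both cost functions can be sketched in a few lines. The function names and the track format (lists of `(x, y)` positions sampled at the same time steps) are assumptions made for this example. The robustness function covers only the simplest temporal requirement, "the signal always stays below a limit", for which the robustness is the smallest margin seen along the trajectory.

```python
import math


def miss_distance(track_a, track_b):
    """Smallest separation between two agents sampled at the same time steps.

    Each track is a list of (x, y) positions. Because only the sampled
    instants are checked, this approximates the true closest approach.
    """
    return min(math.hypot(ax - bx, ay - by)
               for (ax, ay), (bx, by) in zip(track_a, track_b))


def robustness_always_below(signal, limit):
    """Robustness of the requirement "the signal always stays below limit".

    Positive: satisfied with that much margin. Negative: violated by that much.
    """
    return min(limit - value for value in signal)
```

In both cases a smaller value means the trajectory came closer to a failure, which is what lets an optimizer use the value as a cost.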

When performing most likely failure analysis, the probability $p(x)$ of the disturbance trajectory $x$ is incorporated into the regular cost function $c(x)$ to produce a new cost $c'(x)$. Ideally, probability can be incorporated as a piecewise objective where $c'(x) = c(x)$ when $x$ does not lead to failure and $c'(x) = -p(x)$ when $x$ does lead to a failure. In practice, however, using a penalty term $c'(x) = c(x) - \lambda p(x)$ may be easier to optimize.
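
The two ways of folding probability into the cost can be sketched as follows. The exact forms used here are assumptions: the piecewise version returns the negated probability for failing trajectories and the plain cost otherwise, and the penalty version subtracts a weighted probability, with the weight `lam` chosen by hand. The toy system, its `THRESHOLD`, and the standard-normal disturbance density are hypothetical.

```python
import math

THRESHOLD = 3.0  # hypothetical failure boundary for the toy system


def cost(disturbances):
    """Margin to failure of a toy system; a value <= 0 means failure."""
    state, worst = 0.0, -math.inf
    for d in disturbances:
        state = 0.5 * state + d  # system halves its deviation, disturbance is added
        worst = max(worst, state)
    return THRESHOLD - worst


def density(disturbances):
    """Assumed disturbance model: independent standard-normal disturbances."""
    n = len(disturbances)
    return math.exp(-0.5 * sum(d * d for d in disturbances)) / (2.0 * math.pi) ** (n / 2)


def piecewise_cost(disturbances):
    """Plain cost for safe trajectories, negated probability for failures."""
    c = cost(disturbances)
    return -density(disturbances) if c <= 0.0 else c


def penalized_cost(disturbances, lam=1.0):
    """Plain cost minus a probability term weighted by lam."""
    return cost(disturbances) - lam * density(disturbances)
```

With the piecewise form, every failure scores below every safe trajectory (as long as the plain cost is positive for safe trajectories), and among failures the more likely one scores lower. The penalty form is smooth across the failure boundary, but the ranking it produces depends on the choice of `lam`.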

The upside of formulating safety validation as an optimization problem is the ability to use off-the-shelf optimizers and rely on the significant amount of optimization literature (see Kochenderfer and Wheeler [6] for an overview). Approaches that have been successfully used for safety validation include simulated annealing [7], genetic algorithms [8], Bayesian optimization [9], extended ant-colony optimization [10], and genetic programming [11].
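
As an illustration of the optimization view, the sketch below runs a basic simulated annealing loop over disturbance trajectories of a made-up system. It is not the method of any of the cited papers; the toy dynamics, the `THRESHOLD`, the step size, and the cooling schedule are all assumptions chosen to keep the example short.

```python
import math
import random

THRESHOLD = 3.0  # hypothetical failure boundary for the toy system


def cost(disturbances):
    """Margin to failure of the toy system; a value <= 0 means failure."""
    state, worst = 0.0, -math.inf
    for d in disturbances:
        state = 0.5 * state + d
        worst = max(worst, state)
    return THRESHOLD - worst


def simulated_annealing(rng, horizon=10, iterations=2000, step=0.5):
    """Search for a low-cost disturbance trajectory; stop at the first failure."""
    current = [0.0] * horizon
    current_cost = cost(current)
    best, best_cost = list(current), current_cost
    temperature = 1.0
    for _ in range(iterations):
        if best_cost <= 0.0:
            break  # a failing trajectory has been found
        # Propose a neighbor by perturbing one randomly chosen disturbance.
        candidate = list(current)
        candidate[rng.randrange(horizon)] += rng.gauss(0.0, step)
        candidate_cost = cost(candidate)
        delta = candidate_cost - current_cost
        # Always accept improvements; accept worse moves with a probability
        # that shrinks as the temperature cools.
        if delta <= 0.0 or rng.random() < math.exp(-delta / temperature):
            current, current_cost = candidate, candidate_cost
            if current_cost < best_cost:
                best, best_cost = list(current), current_cost
        temperature *= 0.995
    return best, best_cost
```

Because this cost ignores how likely the disturbances are, the search is free to return an implausible failure. Swapping in a probability-aware cost is how the same loop would be pointed at most likely failure analysis.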

The downsides of optimization-based approaches are twofold. First, we are directly searching over the space of all possible disturbance trajectories, which grows exponentially with the length of the trajectory. This can quickly get out of hand. Second, the state of the environment is not typically used when choosing the disturbance trajectory. The state of the environment may not be available for logistical or privacy reasons, but if it is, then the state can provide additional information to the adversary. The next two sections describe techniques to address these limitations by building the disturbance trajectories sequentially and using the state information to help guide the search.

Path Planning

When the safety validation problem is cast as a path-planning problem, we search for failures by sequentially building disturbance trajectories that explore the state space of the environment. There are several metrics of state-space coverage that can be used to guide the search and decide when the state space has been sufficiently explored [12].

Two sample trees generated by the RRT Algorithm.

One of the most common path-planning algorithms that has been used for safety validation is the rapidly-exploring random tree (RRT) algorithm [13], depicted above. In RRT, a space-filling tree is iteratively constructed by choosing disturbances that bring the environment into unexplored regions of the state space. The RRT algorithm has been used to find failures of an adaptive cruise control system [14], where failures involved complex motion of the lead vehicle (shown below) that would rarely be discovered by traditional sampling techniques.
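
The tree-growing loop can be sketched as follows. This is a bare-bones RRT variant on a hypothetical two-dimensional environment (a double integrator whose acceleration is the disturbance), not the algorithm used in the cited cruise control study. The disturbance set, the sampling box, and the failure region are assumptions, and the sketch needs access to the environment state, which a fully black-box setting may not provide.

```python
import math
import random

DT = 0.2
DISTURBANCES = (-1.0, 0.0, 1.0)  # hypothetical disturbance choices


def step(state, d):
    """Toy environment: a double integrator pushed by the disturbance."""
    pos, vel = state
    return (pos + DT * vel, vel + DT * d)


def is_failure(state):
    return state[0] >= 1.0  # hypothetical failure region


def rrt_falsify(rng, iterations=2000):
    """Grow a tree of reachable states.

    Returns (nodes, index of a failing node or None). Each node is
    (state, parent_index, disturbance_from_parent).
    """
    nodes = [((0.0, 0.0), None, None)]
    for _ in range(iterations):
        # Sample a random target state and find the closest tree node.
        target = (rng.uniform(-1.5, 1.5), rng.uniform(-1.5, 1.5))
        nearest = min(range(len(nodes)), key=lambda i: math.dist(nodes[i][0], target))
        # Extend with the disturbance that moves that node closest to the target.
        new_state, d = min(((step(nodes[nearest][0], d), d) for d in DISTURBANCES),
                           key=lambda pair: math.dist(pair[0], target))
        nodes.append((new_state, nearest, d))
        if is_failure(new_state):
            return nodes, len(nodes) - 1
    return nodes, None


def disturbance_trajectory(nodes, index):
    """Walk back to the root to recover the disturbances that reach a node."""
    trajectory = []
    while nodes[index][1] is not None:
        trajectory.append(nodes[index][2])
        index = nodes[index][1]
    return trajectory[::-1]


def replay(trajectory):
    """Re-simulate a disturbance trajectory from the initial state."""
    state = (0.0, 0.0)
    for d in trajectory:
        state = step(state, d)
    return state
```

Sampling targets uniformly over the state space is what pulls the tree toward unexplored regions: large empty areas are more likely to contain the sampled target, so the nodes bordering them are extended most often. This simple version does not check for duplicate nodes.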

Sample failure of an adaptive cruise control system.

Many path-planning approaches were designed to be used with white-box systems and environments where dynamics and gradient information are available. When applied to black-box safety validation, these algorithms need to be adapted to forgo the use of such information. For example, in multiple shooting methods, a trajectory is constructed through disjoint segments, which are then joined using gradient descent. In the absence of gradient information, a black-box multiple shooting method was developed that connects segments by successively refining the segment inputs and outputs through full trajectory rollouts [15].

Reinforcement Learning

The safety validation problem can be further simplified if we describe it as a Markov decision process where the next state of the environment is only a function of the current state and disturbance. The Markov assumption allows us to select disturbances based only on the current state and apply reinforcement learning (RL) algorithms such as Monte Carlo tree search (MCTS), and deep RL algorithms such as Deep Q-Networks or Proximal Policy Optimization.
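
To show what the Markov decision process framing looks like in code, the sketch below trains an adversary with tabular Q-learning, a much simpler stand-in for the MCTS and deep RL algorithms named above. The integer-valued toy environment, the disturbance costs (meant to mimic larger disturbances being less likely), the failure bonus of 10, and the discount factor are all assumptions.

```python
import random

ACTIONS = (0, 1, 2)                     # hypothetical disturbance magnitudes
ACTION_COST = {0: 0.0, 1: 0.1, 2: 0.2}  # larger disturbances are assumed less likely
FAILURE_STATE = 5
GAMMA = 0.9


def step(state, disturbance):
    """Toy environment: the system recovers one unit, then the disturbance is added."""
    return max(0, state - 1) + disturbance


def train_adversary(rng, episodes=2000, horizon=20, alpha=0.5):
    """Learn Q(state, disturbance) from randomly explored episodes."""
    q = {(s, a): 0.0 for s in range(FAILURE_STATE) for a in ACTIONS}
    for _ in range(episodes):
        state = 0
        for _ in range(horizon):
            action = rng.choice(ACTIONS)  # purely random exploration (off-policy)
            nxt = step(state, action)
            failed = nxt >= FAILURE_STATE
            if failed:
                target = 10.0 - ACTION_COST[action]  # bonus for causing a failure
            else:
                target = -ACTION_COST[action] + GAMMA * max(q[(nxt, a)] for a in ACTIONS)
            q[(state, action)] += alpha * (target - q[(state, action)])
            if failed:
                break
            state = nxt
    return q


def greedy_rollout(q, horizon=20):
    """Follow the learned policy; returns (disturbance trajectory, failed?)."""
    state, trajectory = 0, []
    for _ in range(horizon):
        action = max(ACTIONS, key=lambda a: q[(state, a)])
        trajectory.append(action)
        state = step(state, action)
        if state >= FAILURE_STATE:
            return trajectory, True
    return trajectory, False
```

The learned policy picks each disturbance from the current state alone, which is exactly what the Markov assumption permits. A table only works because this state space has five entries; for large or continuous state spaces the table would be replaced by a neural network.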

Monte Carlo tree search is similar to RRT in that a search tree is iteratively created to find disturbance trajectories that end in failure. Unlike RRT, however, MCTS is designed for use with black-box systems. The trajectories are always rolled out from the initial state of the simulator, and the search is guided by a reward function rather than by coverage of the state space. These modifications allow MCTS to be applied in the most information-poor environments. Lee et al. [16] used MCTS to find failures of an aircraft collision avoidance system (an example failure is depicted below) where they had no access to the simulator state and could only control actions through a pseudorandom seed. This approach may be preferred when organizations don’t want to expose any aspect of the functioning of their system.

Deep RL has seen a lot of success in recent years due to its ability to solve problems with large state spaces, complex dynamics, and large action spaces. The success of deep RL is due to the large representational capacity of neural networks and advanced optimization techniques, which make it a natural choice as a safety validation algorithm. For example, it has been used to find failures of autonomous driving policies [17] where the state and action spaces are large and continuous, attributes that are difficult for other algorithms to handle well. A sample failure of an autonomous driving policy is demonstrated below [18].

(Left) Sample failure of an aircraft collision avoidance system, (right) sample failure of a driving policy.

Optimization, path-planning and RL approaches all lend themselves to solving the problems of falsification and most likely failure analysis. However, when we need to evaluate the failure probability of a system, importance sampling approaches should be used.

Importance Sampling

The final set of approaches is well suited to the task of estimating the probability of failure of the system from many failure examples. Importance sampling approaches seek to learn a sampling distribution that reliably produces failures and can be used to estimate the probability of failure with a minimal number of samples. Some common approaches are the cross-entropy method [19], multilevel splitting [20], supervised learning [21], and approximate dynamic programming [22].
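
The core idea can be shown on a toy problem where the true answer is known. In the sketch below, the system, the `THRESHOLD`, and the hand-picked mean `shift` of the sampling distribution are all assumptions; the methods cited above learn the sampling distribution rather than fixing it by hand. Disturbances are drawn from a shifted Gaussian that produces failures often, and each failure is weighted by the likelihood ratio between the assumed disturbance model and the sampling distribution.

```python
import math
import random

HORIZON = 4
THRESHOLD = 8.0  # hypothetical failure boundary


def is_failure(disturbances):
    """Toy system: fails if the accumulated disturbance reaches the threshold."""
    return sum(disturbances) >= THRESHOLD


def importance_sampling_estimate(rng, shift=2.0, samples=10000):
    """Estimate P(failure) under N(0, 1) disturbances by sampling from N(shift, 1)."""
    total = 0.0
    for _ in range(samples):
        x = [rng.gauss(shift, 1.0) for _ in range(HORIZON)]
        if is_failure(x):
            # Likelihood ratio p(x) / q(x) for unit-variance Gaussians.
            total += math.exp(sum(-shift * d + 0.5 * shift * shift for d in x))
    return total / samples


def exact_failure_probability():
    """The sum of HORIZON standard normals is N(0, HORIZON), so the tail is known."""
    return 0.5 * math.erfc(THRESHOLD / math.sqrt(2.0 * HORIZON))
```

The exact failure probability here is about 3 in 100,000, so plain Monte Carlo with 10,000 samples would usually observe no failures at all, while the shifted sampler sees a failure in roughly half of its samples.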

Most importance sampling approaches suffer the same drawback as optimization-based approaches: they construct a distribution across the entire disturbance trajectory $x$. If we can invoke the Markov assumption, however, then we can construct a good sampling distribution based only on the current state using dynamic programming. The downside to dynamic programming is its inability to scale to large state spaces and thus complex scenarios. Our recent work [23] shows that we can overcome this scalability problem by decomposing the system into subproblems and combining the subproblem solutions. For example, in an autonomous driving scenario, each adversarial agent on the road is paired with the ego vehicle to create a smaller safety validation problem with just two agents. Each of these problems is solved, and the solutions are then recombined using a neural network based on the Attend, Adapt and Transfer (A2T) architecture [24]. The combined solution is then refined using simulations of the full scenario. The decomposition strategy, network architecture, and a sample failure for a 5-agent driving scenario are shown below. These types of hybrid approaches will be required to solve the most challenging safety validation problems.

(Left) Decomposition into pairwise subproblems, each involving the blue ego vehicle. (Right) The network used to fuse the subproblem solutions based on A2T.

Sample failure for an autonomous driving policy in a complex environment.

The Future

The validation of complex and safety-critical autonomous systems will likely involve many different techniques throughout the system design cycle, and black-box safety validation algorithms will play a crucial role. In particular, black-box algorithms are useful to the engineers who design safety-critical systems as well as third-party organizations that wish to validate the safety of such systems for regulatory or risk-assessment purposes. Although this post reviews many algorithms that will be of practical use for the validation of safety-critical autonomous systems, there are still areas that require more investigation. For example, we would like to be able to answer the question: if no failure has been found, how sure are we that the system is safe? This will require the development of algorithms that have formal or probabilistic guarantees of convergence. Scalability also remains a significant challenge. Autonomous systems can encounter a wide range of complex interactions, so safety validation algorithms must be able to efficiently discover failures in the most complex scenarios. The algorithms presented in this survey are a promising step toward safe and beneficial autonomy.


Many thanks to Michelle Lee, Andrey Kurenkov, Robert Moss, Mark Koren, Ritchie Lee, and Mykel Kochenderfer for comments and edits on this blog post.

  1. Shalev-Shwartz, Shai, et al. “On a formal model of safe and scalable self-driving cars.” arXiv preprint arXiv:1708.06374 (2017). 

  2. Bouton, Maxime, et al. “Reinforcement learning with probabilistic guarantees for autonomous driving.” arXiv preprint arXiv:1904.07189 (2019). 

  3. Katz, Guy, et al. “Reluplex: An efficient SMT solver for verifying deep neural networks.” International Conference on Computer Aided Verification. Springer, 2017. 

  4. Fainekos, Georgios E., et al. “Robustness of temporal logic specifications for continuous-time signals.” Theoretical Computer Science 410.42 (2009): 4262-4291. 

  5. Mathesen, Logan, et al. “Falsification of cyber-physical systems with robustness uncertainty quantification through stochastic optimization with adaptive restart.” International Conference on Automation Science and Engineering (CASE). IEEE, 2019. 

  6. Kochenderfer, Mykel J., and Tim A. Wheeler. Algorithms for Optimization. MIT Press, 2019.

  7. Abbas, Houssam, et al. “Probabilistic temporal logic falsification of cyber-physical systems.” ACM Transactions on Embedded Computing Systems (TECS) 12.2s (2013): 1-30. 

  8. Zou, Xueyi, et al. “Safety validation of sense and avoid algorithms using simulation and evolutionary search.” International Conference on Computer Safety, Reliability, and Security. Springer, 2014. 

  9. Mullins, Galen E., et al. “Adaptive generation of challenging scenarios for testing and evaluation of autonomous vehicles.” Journal of Systems and Software 137 (2018): 197-215. 

  10. Annapureddy, Yashwanth Singh Rahul, et al. “Ant colonies for temporal logic falsification of hybrid systems.” Annual Conference on IEEE Industrial Electronics Society (IECON). IEEE, 2010. 

  11. Corso, Anthony, et al. “Interpretable safety validation for autonomous vehicles.” To appear in International Conference on Intelligent Transportation Systems (ITSC). IEEE, 2020. 

  12. Nahhal, Tarik, et al. “Test coverage for continuous and hybrid systems.” International Conference on Computer Aided Verification. Springer, Berlin, Heidelberg, 2007. 

  13. LaValle, Steven M. Planning algorithms. Cambridge University Press, 2006. 

  14. Koschi, Markus, et al. “Computationally efficient safety falsification of adaptive cruise control systems.” Intelligent Transportation Systems Conference (ITSC). IEEE, 2019.

  15. Zutshi, Aditya, et al. “Multiple shooting, cegar-based falsification for hybrid systems.” International Conference on Embedded Software. 2014. 

  16. Lee, Ritchie, et al. “Adaptive stress testing of airborne collision avoidance systems.” Digital Avionics Systems Conference (DASC). IEEE, 2015. 

  17. Koren, Mark, et al. “Adaptive stress testing for autonomous vehicles.” Intelligent Vehicles Symposium (IV). IEEE, 2018. 

  18. Corso, Anthony, et al. “Adaptive stress testing with reward augmentation for autonomous vehicle validation.” Intelligent Transportation Systems Conference (ITSC). IEEE, 2019. 

  19. O’Kelly, Matthew, et al. “Scalable end-to-end autonomous vehicle testing via rare-event simulation.” Advances in Neural Information Processing Systems. 2018. 

  20. Norden, Justin, et al. “Efficient black-box assessment of autonomous vehicle safety.” arXiv preprint arXiv:1912.03618 (2019). 

  21. Uesato, Jonathan, et al. “Rigorous agent evaluation: An adversarial approach to uncover catastrophic failures.” arXiv preprint arXiv:1812.01647 (2018). 

  22. Corso, Anthony, et al. “Scalable autonomous vehicle safety validation through dynamic programming and scene decomposition.” To appear in International Conference on Intelligent Transportation Systems (ITSC). IEEE, 2020. 

  23. Corso, Anthony, et al. “Scalable autonomous vehicle safety validation through dynamic programming and scene decomposition.” To appear in International Conference on Intelligent Transportation Systems (ITSC). IEEE, 2020. 

  24. Rajendran, Janarthanan, et al. “Attend, adapt and transfer: Attentive deep architecture for adaptive transfer from multiple sources in the same domain.” arXiv preprint arXiv:1510.02879 (2015). 

Robot takes contact-free measurements of patients’ vital signs

The research described in this article has been published on a preprint server but has not yet been peer-reviewed by scientific or medical experts.

During the current coronavirus pandemic, one of the riskiest parts of a health care worker’s job is assessing people who have symptoms of Covid-19. Researchers from MIT and Brigham and Women’s Hospital hope to reduce that risk by using robots to remotely measure patients’ vital signs.

The robots, which are controlled by a handheld device, can also carry a tablet that allows doctors to ask patients about their symptoms without being in the same room.

“In robotics, one of our goals is to use automation and robotic technology to remove people from dangerous jobs,” says Henwei Huang, an MIT postdoc. “We thought it should be possible for us to use a robot to remove the health care worker from the risk of directly exposing themselves to the patient.”

Using four cameras mounted on a dog-like robot developed by Boston Dynamics, the researchers have shown that they can measure skin temperature, breathing rate, pulse rate, and blood oxygen saturation in healthy patients, from a distance of 2 meters. They are now making plans to test it in patients with Covid-19 symptoms.

“We are thrilled to have forged this industry-academia partnership in which scientists with engineering and robotics expertise worked with clinical teams at the hospital to bring sophisticated technologies to the bedside,” says Giovanni Traverso, an MIT assistant professor of mechanical engineering, a gastroenterologist at Brigham and Women’s Hospital, and the senior author of the study.

The researchers have posted a paper on their system on the preprint server TechRxiv, and have submitted it to a peer-reviewed journal. Huang is one of the lead authors of the study, along with Peter Chai, an assistant professor of emergency medicine at Brigham and Women’s Hospital, and Claas Ehmke, a visiting scholar from ETH Zurich.

Measuring vital signs

When Covid-19 cases began surging in Boston in March, many hospitals, including Brigham and Women’s, set up triage tents outside their emergency departments to evaluate people with Covid-19 symptoms. One major component of this initial evaluation is measuring vital signs, including body temperature.

The MIT and BWH researchers came up with the idea to use robotics to enable contactless monitoring of vital signs, to allow health care workers to minimize their exposure to potentially infectious patients. They decided to use existing computer vision technologies that can measure temperature, breathing rate, pulse, and blood oxygen saturation, and worked to make them mobile.

To achieve that, they used a robot known as Spot, which can walk on four legs, similarly to a dog. Health care workers can maneuver the robot to wherever patients are sitting, using a handheld controller. The researchers mounted four different cameras onto the robot — an infrared camera plus three monochrome cameras that filter different wavelengths of light.

The researchers developed algorithms that allow them to use the infrared camera to measure both elevated skin temperature and breathing rate. For body temperature, the camera measures skin temperature on the face, and the algorithm correlates that temperature with core body temperature. The algorithm also takes into account the ambient temperature and the distance between the camera and the patient, so that measurements can be taken from different distances, under different weather conditions, and still be accurate.

Measurements from the infrared camera can also be used to calculate the patient’s breathing rate. As the patient breathes in and out, wearing a mask, their breath changes the temperature of the mask. Measuring this temperature change allows the researchers to calculate how rapidly the patient is breathing.
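
The idea of turning a periodic temperature change into a rate can be sketched as follows. This is not the researchers’ algorithm, which the article does not describe; it is a simple stand-in that counts how often the signal rises through its own mean. The function name, the 10 Hz sample rate, and the synthetic mask temperature signal are assumptions for illustration.

```python
import math


def breathing_rate_per_minute(temperatures, sample_rate_hz):
    """Estimate breaths per minute by counting upward crossings of the mean."""
    mean = sum(temperatures) / len(temperatures)
    centered = [t - mean for t in temperatures]
    breaths = sum(1 for prev, cur in zip(centered, centered[1:]) if prev < 0.0 <= cur)
    duration_minutes = len(temperatures) / sample_rate_hz / 60.0
    return breaths / duration_minutes


# Synthetic mask temperature: 60 seconds sampled at 10 Hz, oscillating
# at 0.25 Hz, which corresponds to 15 breaths per minute.
SAMPLE_RATE_HZ = 10.0
signal = [33.0 + 0.5 * math.sin(2.0 * math.pi * 0.25 * (i / SAMPLE_RATE_HZ) + 0.5)
          for i in range(600)]
rate = breathing_rate_per_minute(signal, SAMPLE_RATE_HZ)
```

Real infrared measurements are noisy, so a practical version would need to smooth or band-pass filter the signal first; otherwise noise near the mean would be counted as extra breaths.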

The three monochrome cameras each filter a different wavelength of light — 670, 810, and 880 nanometers. These wavelengths allow the researchers to measure the slight color changes that result when hemoglobin in blood cells binds to oxygen and flows through blood vessels. The researchers’ algorithm uses these measurements to calculate both pulse rate and blood oxygen saturation.

“We didn’t really develop new technology to do the measurements,” Huang says. “What we did is integrate them together very specifically for the Covid application, to analyze different vital signs at the same time.”

Continuous monitoring

In this study, the researchers performed the measurements on healthy volunteers, and they are now making plans to test their robotic approach in people who are showing symptoms of Covid-19, in a hospital emergency department.

While in the near term, the researchers plan to focus on triage applications, in the longer term, they envision that the robots could be deployed in patients’ hospital rooms. This would allow the robots to continuously monitor patients and also allow doctors to check on them, via tablet, without having to enter the room. Both applications would require approval from the U.S. Food and Drug Administration.

The research was funded by the MIT Department of Mechanical Engineering and the Karl van Tassel (1925) Career Development Professorship.

Pixel Perfect: V7 Labs Automates Image Annotation for Deep Learning Models

Cells under a microscope, grapes on a vine and species in a forest are just a few of the things that AI can identify using the image annotation platform created by startup V7 Labs.

Whether a user wants AI to detect and label images showing equipment in an operating room or livestock on a farm, the London-based company offers V7 Darwin, an AI-powered web platform with a trained model that already knows what almost any object looks like, according to Alberto Rizzoli, co-founder of V7 Labs.

It’s a boon for small businesses and other users that are new to AI or want to reduce the costs of training deep learning models with custom data. Users can load their data onto the platform, which then segments objects and annotates them. It also allows for training and deploying models.

V7 Darwin is trained on several million images and optimized on NVIDIA GPUs. The startup is also exploring the use of NVIDIA Clara Guardian, which includes the NVIDIA DeepStream SDK intelligent video analytics framework on edge AI embedded systems. So far, it’s piloted laboratory perception, quality inspection, and livestock monitoring projects, using the NVIDIA Jetson AGX Xavier and Jetson TX2 modules for the edge deployment of trained models.

V7 Labs is a member of NVIDIA Inception, a program that provides AI startups with go-to-market support, expertise and technology assistance.

Pixel-Perfect Object Classification

“For AI to learn to see something, you need to give it examples,” said Rizzoli. “And to have it accurately identify an object based on an image, you need to make sure the training sample captures 100 percent of the object’s pixels.”

Annotating and labeling an object based on such a level of “pixel-perfect” granular detail takes just two-and-a-half seconds for V7 Darwin — up to 50x faster than a human, depending on the complexity of the image, said Rizzoli.

Saving time and costs around image annotation is especially important in the context of healthcare, he said. Healthcare professionals must look at hundreds of thousands of X-ray or CT scans and annotate abnormalities, Rizzoli said, but this can be automated.

For example, during the COVID-19 pandemic, V7 Labs worked with the U.K.’s National Health Service and Italy’s San Matteo Hospital to develop a model that detects the severity of pneumonia in a chest X-ray and predicts whether a patient will need to enter an intensive care unit.

The company also published an open dataset with over 6,500 X-ray images showing pneumonia, 500 cases of which were caused by COVID-19.

V7 Darwin can be used in a laboratory setting, helping to detect protocol errors and automatically log experiments.

Application Across Industries

Companies in a wide variety of industries beyond healthcare can benefit from V7’s technology.

“Our goal is to capture all of computer vision and make it remarkably easy to use,” said Rizzoli. “We believe that if we can identify a cell under a microscope, we can also identify, say, a house from a satellite. And if we can identify a doctor performing an operation or a lab technician performing an experiment, we can also identify a sculptor or a person preparing a cake.”

Global uses of the platform include assessing the damage of natural disasters, observing the growth of human and animal embryos, detecting caries in dental X-rays, creating autonomous machines to evaluate safety protocols in manufacturing, and allowing farming robots to count their harvests.

Stay up to date with the latest healthcare news from NVIDIA, and explore how AI, accelerated computing, and GPU technology contribute to the worldwide battle against the novel coronavirus on our COVID-19 research hub.

The post Pixel Perfect: V7 Labs Automates Image Annotation for Deep Learning Models appeared first on The Official NVIDIA Blog.
