An update on our work on AI and responsible innovation

AI is a powerful tool that will have a significant impact on society for many years to come, from improving sustainability around the globe to advancing the accuracy of disease screenings. As a leader in AI, we’ve always prioritized the importance of understanding its societal implications and developing it in a way that gets it right for everyone. 

That’s why we first published our AI Principles two years ago and why we continue to provide regular updates on our work. As our CEO Sundar Pichai said in January, developing AI responsibly and with social benefit in mind can help avoid significant challenges and increase the potential to improve billions of lives. 

The world has changed a lot since January, and in many ways our Principles have become even more important to the work of our researchers and product teams. As we develop AI, we are committed to testing safety, measuring social benefits, and building strong privacy protections into products. Our Principles give us a clear framework for the kinds of AI applications we will not design or deploy, like those that violate human rights or enable surveillance that violates international norms. For example, we were the first major company to decide, several years ago, not to make general-purpose facial recognition commercially available.

Over the last 12 months, we’ve shared our point of view on how to develop AI responsibly—see our 2019 annual report and our recent submission to the European Commission’s Consultation on Artificial Intelligence. This year, we’ve also expanded our internal education programs, applied our principles to our tools and research, continued to refine our comprehensive review process, and engaged with external stakeholders around the world, while identifying emerging trends and patterns in AI. 

Building on previous AI Principles updates we shared here on the Keyword in 2018 and 2019, here’s our latest overview of what we’ve learned, and how we’re applying these learnings in practice.

Internal education

In addition to the initial Tech Ethics training, which more than 800 Googlers have taken since its launch last year, this year we developed a new training on spotting AI Principles issues. We piloted the course with more than 2,000 Googlers, and it is now available as an online self-study course to all Googlers across the company. The course coaches employees on asking critical questions to spot potential ethical issues, such as whether an AI application might lead to economic or educational exclusion, or cause physical, psychological, social or environmental harm. We recently released a version of this training as a mandatory course for customer-facing Cloud teams, and 5,000 Cloud employees have already taken it.

Tools and research

Our researchers are working on computer science and technology not just for today, but for tomorrow as well. They continue to play a leading role in the field, publishing more than 200 academic papers and articles in the last year on new methods for putting our principles into practice. These publications address technical approaches to fairness, safety, privacy, and accountability to people, including effective techniques for improving fairness in machine learning at scale, a method for incorporating ethical principles into a machine-learned model, and design principles for interpretable machine learning systems.

Over the last year, a team of Google researchers and collaborators published an academic paper proposing a framework called Model Cards. Similar to a food nutrition label, a Model Card reports an AI model’s intended use and its performance for people from a variety of backgrounds. We’ve applied this research by releasing Model Cards for the Face Detection and Object Detection models used in Google Cloud’s Vision API product.
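
To make the idea concrete, here’s a minimal sketch of the kind of information a Model Card reports, written as plain Python. The field names and numbers are illustrative assumptions for this example, not Google’s actual schema.

from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class ModelCard:
    # Illustrative fields only; not Google's actual Model Card schema.
    name: str
    intended_use: str
    limitations: List[str]
    # Performance broken out by subgroup, the "nutrition label" part.
    subgroup_metrics: Dict[str, Dict[str, float]] = field(default_factory=dict)

card = ModelCard(
    name="face_detector_v1",
    intended_use="Detect the presence and location of faces in images.",
    limitations=["Not designed to infer identity, emotion, or demographics."],
    subgroup_metrics={
        "narrow_faces": {"precision": 0.94, "recall": 0.91},  # made-up numbers
        "wide_faces": {"precision": 0.92, "recall": 0.89},
    },
)
print(card.intended_use)
print(card.subgroup_metrics["narrow_faces"])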

Our goal is for Google to be a helpful partner not only to researchers and developers who are building AI applications, but also to the billions of people who use them in everyday products. We’ve gone a step further, releasing 14 new tools that help explain how responsible AI works, from simple data visualizations on algorithmic bias for general audiences to Explainable AI dashboards and tool suites for enterprise users. You’ll find a number of these within our new Responsible AI with TensorFlow toolkit.
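
As a flavor of what such tools compute under the hood, here’s a small, self-contained sketch of a per-group accuracy gap, the kind of disparity a bias visualization surfaces. The data here is invented for illustration and is not from any Google tool.

from collections import defaultdict

def accuracy_by_group(records):
    # records: iterable of (group, true_label, predicted_label) triples
    correct, total = defaultdict(int), defaultdict(int)
    for group, label, pred in records:
        total[group] += 1
        correct[group] += int(label == pred)
    return {g: correct[g] / total[g] for g in total}

# Invented toy data: (group, label, prediction)
records = [("a", 1, 1), ("a", 0, 0), ("b", 1, 0), ("b", 0, 0)]
acc = accuracy_by_group(records)
print(acc)                                    # {'a': 1.0, 'b': 0.5}
print(max(acc.values()) - min(acc.values()))  # the gap a dashboard would plot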

Review process 

As we’ve shared previously, Google has a central, dedicated team that reviews proposals for AI research and applications for alignment with our principles. Operationalizing the AI Principles is challenging work. Our review process is iterative, and we continue to refine and improve our assessments as advanced technologies emerge and evolve. The team also consults with internal domain experts in machine-learning fairness, security, privacy, human rights, and other areas. 

Whenever relevant, we conduct additional expert human rights assessments of new products in our review process, before launch. For example, we enlisted the nonprofit organization BSR (Business for Social Responsibility) to conduct a formal human rights assessment of the new Celebrity Recognition tool, offered within Google Cloud Vision and Video Intelligence products. BSR applied the UN’s Guiding Principles on Business and Human Rights as a framework to guide the product team to consider the product’s implications across people’s privacy and freedom of expression, as well as potential harms that could result, such as discrimination. This assessment informed not only the product’s design, but also the policies around its use. 

In addition, because any robust evaluation of AI needs to consider not just technical methods but also social context(s), we consult a wider spectrum of perspectives to inform our AI review process, including social scientists and Google’s employee resource groups.

As one example, consider how we’ve built upon learnings from a case we published in our last AI Principles update: the review of academic research on text-to-speech (TTS) technology. Since then, we have applied what we learned in that earlier review to establish a Google-wide approach to TTS. Google Cloud’s Text-to-Speech service, used in products such as Google Lens, puts this approach into practice.

Because TTS could be used across a variety of products, a group of senior Google technical and business leads was consulted. They considered the proposal against our AI Principles of being socially beneficial and accountable to people, as well as the need to incorporate privacy by design and avoid technologies that cause or are likely to cause overall harm.

  • Reviewers identified the benefits of an improved user interface for various products, and significant accessibility benefits for people with hearing impairments. 

  • They considered the risks of voice mimicry and impersonation, media manipulation, and defamation.

  • They took into account how an AI model is used, and recognized the importance of adding layers of barriers for potential bad actors, to make harmful outcomes less likely.

  • They recommended on-device privacy and security precautions that serve as barriers to misuse, reducing the risk of overall harm from use of TTS technology for nefarious purposes.  

  • The reviewers recommended approving TTS technology for use in our products, but only with user consent and on-device privacy and security measures.

  • They did not approve open-sourcing of TTS models, due to the risk that someone might misuse them to build harmful deepfakes and distribute misinformation. 

External engagement

To increase the number and variety of outside perspectives, this year we launched the Equitable AI Research Roundtable, which brings together advocates for communities of people who are currently underrepresented in the technology industry, and who are most likely to be impacted by the consequences of AI and advanced technology. This group of community-based, non-profit leaders and academics meets with us quarterly to discuss AI ethics issues, and learnings from these discussions help shape our operational efforts and decision-making frameworks.

Our global efforts this year included new programs to support non-technical audiences in their understanding of, and participation in, the creation of responsible AI systems, whether they are policymakers, first-time ML (machine learning) practitioners or domain experts. These included:

  • Partnering with Yielding Accomplished African Women to implement the first-ever Women in Machine Learning Conference in Africa. We built a network of 1,250 female machine learning engineers from six different African countries. Using the Google Cloud Platform, we trained and certified 100 women at the conference in Accra, Ghana. More than 30 universities and 50 companies and organizations were represented. The conference schedule included workshops on Qwiklabs, AutoML, TensorFlow, a human-centered approach to AI, mindfulness and #IamRemarkable.

  • Releasing, in partnership with the Ministry of Public Health in Thailand, the first study of its kind on how researchers apply nurses’ and patients’ input to make recommendations on future AI applications, based on how nurses deployed a new AI system to screen patients for diabetic retinopathy. 

  • Launching an ML workshop for policymakers featuring content and case studies covering the topics of Explainability, Fairness, Privacy, and Security. We’ve run this workshop, via Google Meet, with over 80 participants in the policy space, with more workshops planned for the remainder of the year. 

  • Hosting the PAIR (People + AI Research) Symposium in London, which focused on participatory ML and marked PAIR’s expansion to the EMEA region. The event drew 160 attendees across academia, industry, engineering, and design, and featured cross-disciplinary discussions on human-centered AI and hands-on demos of ML Fairness and interpretability tools. 

We remain committed to external, cross-stakeholder collaboration. We continue to serve on the board and as a member of the Partnership on AI, a multi-stakeholder organization that studies and formulates best practices on AI technologies. As an example of our work together, the Partnership on AI is developing best practices that draw from our Model Cards proposal as a framework for accountability among its member organizations. 

Trends, technologies and patterns emerging in AI

We know no system, whether human or AI-powered, will ever be perfect, so we don’t consider the task of improving it to ever be finished. We continue to identify emerging trends and challenges that surface in our AI Principles reviews. These prompt us to ask questions such as when and how to responsibly develop synthetic media, keep humans in an appropriate loop of AI decisions, launch products with strong fairness metrics, deploy affective technologies, and offer explanations on how AI works, within products themselves. 

As Sundar wrote in January, it’s crucial that companies like ours not only build promising new technologies, but also harness them for good—and make them available for everyone. This is why we believe regulation can offer helpful guidelines for AI innovation, and why we share our principled approach to applying AI. As we continue to responsibly develop and use AI to benefit people and society, we look forward to continuing to update you on specific actions we’re taking, and on our progress.

Read More

Ask a Techspert: How do machine learning models explain themselves?

Editor’s Note: Do you ever feel like a fish out of water? Try being a tech novice and talking to an engineer at a place like Google. Ask a Techspert is a series on the Keyword asking Googler experts to explain complicated technology for the rest of us. This isn’t meant to be comprehensive, but just enough to make you sound smart at a dinner party. 

A few years ago, I learned that a translation from Finnish to English using Google Translate led to an unexpected outcome. The sentence “hän on lentäjä” became “he is a pilot” in English, even though “hän” is a gender-neutral word in Finnish. Why did Translate assume it was “he” as the default? 

As I started looking into it, I became aware that just like humans, machines are affected by society’s biases. The machine learning model for Translate relied on training data, which consisted of the input from hundreds of millions of already-translated examples from the web. “He” was more associated with some professions than “she” was, and vice versa. 

Now, Google provides options for both feminine and masculine translations when adapting gender-neutral words in several languages, and there’s a continued effort to roll it out more broadly. But it’s still a good example of how machine learning can reflect the biases we see all around us. Thankfully, there are teams at Google dedicated to finding human-centered solutions to making technology inclusive for everyone. I sat down with Been Kim, a Google researcher working on the People + AI Research (PAIR) team, who devotes her time to making sure artificial intelligence puts people, not machines, at its center, and helping others understand the full spectrum of human interaction with machine intelligence. We talked about how you make machine learning models easy to interpret and understand, and why it’s important for everybody to have a basic idea of how the technology works.

Why is this field of work so important?

Machine learning is such a powerful tool, and because of that, you want to make sure you’re using it responsibly. Let’s take an electric saw as an example. It’s a super powerful tool, but you need to learn how to use it in order not to cut your fingers. Once you learn, it’s so useful and efficient that you’ll never want to go back to using a hand saw. And the same goes for machine learning. We want to help you understand and use machine learning correctly, fairly and safely. 

Since machine learning is used in our everyday lives, it’s also important for everyone to understand how it impacts us. Whether you’re a coffee shop owner using machine learning to optimize the purchase of your beans based on seasonal trends, or a patient whose doctor diagnoses a disease with the help of this technology, it’s often crucial to understand why a machine learning model has produced the outcome it has. To make that possible, developers and decision-makers need to be able to explain or present a machine learning model to people. This is what we call “interpretability.” 

How do you make machine learning models easier to understand and interpret? 

There are many different ways to make an ML model easier to understand. One way is to make the model reflect how humans think from the start, and have the model “trained” to provide explanations along with predictions, meaning when it gives you an outcome, it also has to explain how it got there. 

Another way is to try and explain a model after the training on data is done. This works even when the model was built purely to map inputs to outputs, optimizing for prediction without exposing a clear “how.” You’re able to plug things into it and see what comes out, and that can give you some insight into how the model generally makes decisions, but you don’t necessarily know exactly how specific inputs are interpreted by the model in specific cases. 
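
To illustrate the black-box flavor of this approach, here’s a toy Python sketch: we can only call the model, so we perturb one input at a time and watch how the output moves. The model function here is a made-up stand-in, not a real trained network.

def model(features):
    # Stand-in for a trained model we can only call, not inspect.
    return 0.7 * features["whiskers"] + 0.1 * features["background"]

base = {"whiskers": 1.0, "background": 1.0}
for name in base:
    perturbed = dict(base, **{name: 0.0})  # zero out one feature
    delta = model(base) - model(perturbed)
    print(f"removing {name!r} changes the score by {delta:.2f}")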

One way to try and explain models after they’ve been trained is by using low-level features or high-level concepts. Let me give you an example of what this means. Imagine a system that classifies pictures: you give it a picture and it says, “This is a cat.” With low-level features, I ask the machine which pixels mattered for that prediction, and it can tell us whether it was one pixel or another; we might be able to see that the pixels in question show the cat’s whiskers. But we might also see that it is a scattering of pixels that don’t appear meaningful to the human eye, or that the model has made the wrong interpretation. High-level concepts are more similar to the way humans communicate with one another. Instead of asking about pixels, I’d ask, “Did the whiskers matter for the prediction? Or the paws?” and again, the machine can show me what imagery led it to reach this conclusion. Based on the outcome, I can understand the model better. (Together with researchers from Stanford, we’ve published papers that go into further detail on this for those who are interested.)
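
Here’s a minimal gradient-saliency sketch of the low-level, pixel-attribution idea in TensorFlow. The tiny untrained model and random image are placeholders; real systems apply the same gradient trick to trained networks, and concept-based methods ask a similar question about higher-level directions rather than individual pixels.

import tensorflow as tf

# Tiny untrained classifier, purely for illustration.
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(4, 3, activation="relu", input_shape=(32, 32, 3)),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(2),  # e.g., cat vs. not-cat
])

image = tf.random.uniform((1, 32, 32, 3))
with tf.GradientTape() as tape:
    tape.watch(image)          # track gradients w.r.t. the input pixels
    logits = model(image)
    cat_score = logits[0, 0]

# |d score / d pixel|: large values mark pixels that mattered most.
saliency = tf.reduce_max(tf.abs(tape.gradient(cat_score, image)), axis=-1)
print(saliency.shape)  # (1, 32, 32) heatmap over the input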

Can machines understand some things that we humans can’t? 

Yes! This is an area that I am very interested in myself. I am currently working on a way to showcase how technology can help humans learn new things. Machine learning technology is better at some things than we are; for example it can analyze and interpret data at a much larger scale than humans can. Leveraging this technology, I believe we can enlighten human scientists with knowledge they haven’t previously been aware of. 

What do you need to be careful of when you’re making conclusions based on machine learning models?

First of all, we have to be careful that human bias doesn’t come into play. Humans carry biases that we simply cannot help and are often unaware of, so if an explanation is up to a human’s interpretation, and often it is, then we have a problem. Humans read what they want to read. Now, this doesn’t mean that you should remove humans from the loop. Humans communicate with machines, and vice versa. Machines need to communicate their outcomes in the form of a clear statement using quantitative data, not one that is vague and completely open to interpretation. If the latter happens, then the machine hasn’t done a very good job and the human isn’t able to provide good feedback to the machine. It could also be that the outcome simply lacks additional context that only the human can provide, or that it could benefit from having caveats, so that humans can make an informed judgment about the results of the model. 

What are some of the main challenges of this work? 

Well, one of the challenges for computer scientists in this field is dealing with non-mathematical objectives, which are things you might want to optimize for but don’t have an equation for. You can’t always define what is good for humans using math. That requires us to test and evaluate methods with rigor, and have a table full of different people to discuss the outcome. Another challenge has to do with complexity. Humans are so complex that we have a whole field of work – psychology – to study this. So in my work, we don’t just have computational challenges, but also complex humans that we have to consider. Value-based questions such as “what defines fairness?” are even harder. They require interdisciplinary collaboration, and a diverse group of people in the room to discuss each individual matter.

What’s the most exciting part? 

I think interpretability research and methods are making a huge impact. Machine learning technology is a powerful tool that will transform society as we know it, and helping others to use it safely is very rewarding. 

On a more personal note, I come from South Korea and grew up in circumstances where I feel I didn’t have too many opportunities. I was incredibly lucky to get a scholarship to MIT and come to the U.S. When I think about the people who haven’t had these opportunities to be educated in science or machine learning, and I know that machine learning technology can really help and be useful to them in their everyday lives if they use it safely, I feel really motivated to work on democratizing this technology. There are many ways to do it, and interpretability is one of the ways I can contribute.

Read More

OpenAI Scholars Spring 2020: Final Projects

Our third class of OpenAI Scholars presented their final projects at a virtual Demo Day, showcasing their research results from the past five months. These projects investigated problems such as analyzing how GPT-2 represents grammar, measuring the interpretability of models trained on CoinRun, and predicting epileptic seizures using brain recordings. More information about the next class of Scholars and how to apply will be announced this fall.

The OpenAI Scholars program provides stipends and mentorship to individuals from underrepresented groups to study deep learning and open-source a project.

Our Scholars have demonstrated core technical skills across various expert domains and self-motivation—critical competencies for a self-directed program like this one. They each entered the field of machine learning as relative newcomers, and we hope their progress shows how accessible machine learning is.

Demo Day introductions by Sam Altman and Greg Brockman

Learn more about our Scholars program.

Alethea Power

Looking for Grammar in All The Right Places

I’m fascinated by neural network interpretability. Understanding how networks of various architectures represent information can help us build simpler and more efficient networks, as well as predict how the networks we’ve built will behave, and perhaps even give us some insight into how human beings think. Along these lines, I analyzed how GPT-2 represents English grammar, and found smaller sub-networks that seem to correspond to various grammatical structures. I will present my methodology and results.
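
As a hedged illustration of one way to go looking for grammar in a transformer (an assumption about the general approach, not necessarily this project’s exact method), the sketch below checks which GPT-2 attention heads link a verb to its subject:

import torch
from transformers import GPT2Model, GPT2Tokenizer

tok = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2Model.from_pretrained("gpt2").eval()

ids = tok("The keys to the cabinet are here", return_tensors="pt").input_ids
with torch.no_grad():
    attentions = model(ids, output_attentions=True).attentions  # one per layer

verb, subject = 5, 1  # token positions of " are" and " keys", counted by hand
for layer, att in enumerate(attentions):
    # att: (batch, heads, seq, seq); weight flowing from the verb to its subject
    weights = att[0, :, verb, subject]
    best = weights.argmax().item()
    print(f"layer {layer}: head {best} weight {weights[best]:.2f}")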

Next, I want to work on understanding how neural networks represent information, and use that understanding to better predict how deep learning systems behave. I believe this work will make such systems safer and more beneficial to humanity, as well as making them simpler, faster, and more computationally efficient.

Blog

Andre Carerra

Semantic Parsing English to GraphQL

My scholars program project is semantic parsing English-to-GraphQL. Given an English prompt such as “How many employees do we have?”, find a corresponding GraphQL query to return the information. The project involved creating a dataset, training models, and creating an interaction tool to see results.
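
To make the task concrete, here’s an invented example of the kind of (prompt, query) pair such a dataset contains, plus a toy word-overlap baseline; the project’s actual dataset, GraphQL schema, and models are its own.

# Invented (prompt, query) pairs; the real dataset and schema differ.
pairs = [
    ("How many employees do we have?",
     "query { employeesAggregate { count } }"),
    ("List all office locations",
     "query { offices { city country } }"),
]

def parse(prompt):
    # Toy baseline: return the query whose prompt shares the most words.
    words = set(prompt.lower().split())
    return max(pairs, key=lambda p: len(words & set(p[0].lower().split())))[1]

print(parse("how many employees are there?"))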

I wanted to have a say in how AI is shaped—the Scholars program has been a great opportunity to learn and participate.

Blog

Cathy Yeh

Long Term Credit Assignment with Temporal Reward Transport

Standard reinforcement learning algorithms struggle with poor sample efficiency in the presence of sparse rewards with long temporal delays between action and effect. To address the long-term credit assignment problem, we use “temporal reward transport” (TRT) to augment the immediate rewards of significant state-action pairs with rewards from the distant future, using an attention mechanism to identify candidates for TRT. A series of gridworld experiments shows clear improvements in learning when TRT is used in conjunction with a standard advantage actor-critic algorithm.
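
A toy numpy sketch of the core idea, with hand-set attention weights standing in for the learned attention mechanism:

import numpy as np

rewards = np.array([0.0, 0.0, 0.0, 0.0, 1.0])   # sparse, delayed reward
# attention[t]: learned in the project; set by hand here for illustration
attention = np.array([0.6, 0.0, 0.3, 0.1, 0.0])

def transport(rewards, attention, alpha=0.5):
    # Route a share of the final reward back to earlier significant steps.
    return rewards + alpha * rewards[-1] * attention

print(transport(rewards, attention))
# -> [0.3, 0, 0.15, 0.05, 1.0]: key early steps now carry reward signal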

I appreciate that this program gave me the freedom to learn deeply and flex my creativity.

Blog

Jorge Orbay

Quantifying Interpretability of Models Trained on CoinRun

This project’s purpose is to create a scalar that measures the interpretability of an A2C model trained on Procgen’s CoinRun. The scalar is generated using a combination of attribution on the model and masks of CoinRun’s assets. The scalar is used to test the validity of the diversity hypothesis.
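
One plausible reading of such a scalar, sketched in numpy (the arrays and the exact formula are assumptions for illustration, not the project’s definition): measure how much of the model’s attribution mass falls inside the human-meaningful asset masks.

import numpy as np

rng = np.random.default_rng(0)
attribution = rng.random((64, 64))            # e.g., a saliency map
asset_mask = np.zeros((64, 64), dtype=bool)   # pixels covered by game assets
asset_mask[20:40, 20:40] = True               # pretend the coin sits here

def interpretability_score(attribution, mask):
    # Fraction of attribution mass that lands on human-meaningful assets.
    return attribution[mask].sum() / attribution.sum()

print(interpretability_score(attribution, asset_mask))  # scalar in [0, 1]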

This program, and specifically my mentor, has fostered a self-confidence in me to dive into a field I don’t understand and break down problems until I can solve them. I’m hoping to take the self-confidence I’ve gained from this program to continue breaking down problems in and with AI.

Blog

Kamal Ndousse

Social Learning in Independent Multi-Agent Reinforcement Learning

My project has explored the social transfer of expertise among completely independent RL agents trained in shared environments. The motivating question is whether novice agents can learn to mimic expert behavior to solve hard-exploration tasks that they couldn’t master in isolation. I’ll discuss my observations as well as the environments I developed to experiment with social skill transfer.

I joined the Scholars program in order to learn from the brilliant folks at OpenAI and to immerse myself in AI research. I’m grateful to have had the opportunity to explore state-of-the-art research with the support of such talented researchers (special thanks to my mentor Natasha Jaques!).

Blog


Kata Slama

Towards Epileptic Seizure Prediction with Deep Network

I have been working on a project to predict epileptic seizures using brain recordings. I framed it as an image classification problem based on the spectrogram representation of the brain data. My most successful model so far has been a ResNet18. In my post-Scholars life, I plan to continue working on this project, and make my way to interpretability of spectrogram classification networks.
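
A minimal sketch of that setup with torchvision’s ResNet18; the input sizes and the single-channel spectrogram assumption are illustrative, not the project’s exact configuration.

import torch
import torch.nn as nn
from torchvision.models import resnet18

model = resnet18(num_classes=2)  # seizure vs. no seizure
# Spectrograms are single-channel, so swap the 3-channel input conv.
model.conv1 = nn.Conv2d(1, 64, kernel_size=7, stride=2, padding=3, bias=False)

batch = torch.randn(8, 1, 224, 224)  # 8 spectrogram windows
logits = model(batch)
print(logits.shape)  # torch.Size([8, 2])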

I wanted to learn how to apply deep learning for solving scientific and real-world problems. The OpenAI Scholars program was this magical opportunity to get started by learning from the very best minds in the field.

Blog


Pamela Mishkin

Universal Adversarial Perturbations and Language Models

Adversarial perturbations are well understood for images but less so for language. My presentation will review the literature on how universal adversarial examples can inform our understanding of generative models, replicating results that generate universal adversarial triggers for GPT-2 and attack NLI models.
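
As a rough illustration of the objective behind universal triggers (the published method searches with gradient-guided token flips; this brute-force sketch only scores a few hand-picked, made-up candidate prefixes against a placeholder target):

import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tok = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

target = " nothing makes sense"  # placeholder target continuation
candidates = ["the movie was", "a b c", "frankly speaking"]  # made-up triggers

for trigger in candidates:
    trig_ids = tok(trigger, return_tensors="pt").input_ids
    tgt_ids = tok(target, return_tensors="pt").input_ids
    ids = torch.cat([trig_ids, tgt_ids], dim=1)
    labels = ids.clone()
    labels[:, :trig_ids.shape[1]] = -100  # score only the target tokens
    with torch.no_grad():
        loss = model(ids, labels=labels).loss
    print(f"{trigger!r}: target loss {loss.item():.2f}")  # lower = stronger pull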

This program strengthened my technical basis in machine learning and helped me understand how AI researchers understand policy implications of their work.

Blog

Diversity is core to AI having a positive effect on the world—it’s necessary to ensure the advanced AI systems in the future are built to benefit everyone.

If you’re excited to begin your own journey into ML, check out some of our educational materials. More information about the next class of Scholars and how to apply will be announced this fall. Stay tuned!

Huge thanks to Microsoft for providing Azure compute credits to scholars, to our mentors for their time and commitment, and to all the supporters that made this program possible.

OpenAI

NVIDIA Ampere GPUs Come to Google Cloud at Speed of Light

The NVIDIA A100 Tensor Core GPU has landed on Google Cloud.

Available in alpha on Google Compute Engine just over a month after its introduction, A100 has come to the cloud faster than any NVIDIA GPU in history.

Today’s introduction of the Accelerator-Optimized VM (A2) instance family featuring A100 makes Google the first cloud service provider to offer the new NVIDIA GPU.

A100, which is built on the newly introduced NVIDIA Ampere architecture, delivers NVIDIA’s greatest generational leap ever. It boosts training and inference computing performance by 20x over its predecessors, providing tremendous speedups for workloads to power the AI revolution.

“Google Cloud customers often look to us to provide the latest hardware and software services to help them drive innovation on AI and scientific computing workloads,” said Manish Sainani, director of Product Management at Google Cloud. “With our new A2 VM family, we are proud to be the first major cloud provider to market NVIDIA A100 GPUs, just as we were with NVIDIA T4 GPUs. We are excited to see what our customers will do with these new capabilities.”

In cloud data centers, A100 can power a broad range of compute-intensive applications, including AI training and inference, data analytics, scientific computing, genomics, edge video analytics, 5G services, and more.

Fast-growing, critical industries will be able to accelerate their discoveries with the breakthrough performance of A100 on Google Compute Engine. From scaling up AI training and scientific computing, to scaling out inference applications, to enabling real-time conversational AI, A100 accelerates complex and unpredictable workloads of all sizes running in the cloud. 

NVIDIA CUDA 11, coming to general availability soon, makes the new capabilities of NVIDIA A100 GPUs accessible to developers, including Tensor Cores, mixed-precision modes, multi-instance GPU, advanced memory management and standard C++/Fortran parallel language constructs.
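
For a flavor of one of those capabilities, here’s a generic PyTorch mixed-precision sketch of the kind of training step Tensor Cores accelerate. The tiny model and data are placeholders, and this is standard torch.cuda.amp usage rather than NVIDIA sample code.

import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"
model = nn.Linear(512, 10).to(device)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
scaler = torch.cuda.amp.GradScaler(enabled=(device == "cuda"))

x = torch.randn(64, 512, device=device)
y = torch.randint(0, 10, (64,), device=device)

opt.zero_grad()
with torch.cuda.amp.autocast(enabled=(device == "cuda")):
    loss = nn.functional.cross_entropy(model(x), y)  # runs in reduced precision
scaler.scale(loss).backward()  # loss scaling avoids fp16 gradient underflow
scaler.step(opt)
scaler.update()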

Breakthrough A100 Performance in the Cloud for Every Size Workload

The new A2 VM instances can deliver different levels of performance to efficiently accelerate workloads across CUDA-enabled machine learning training and inference, data analytics, and high-performance computing.

For large, demanding workloads, Google Compute Engine offers customers the a2-megagpu-16g instance, which comes with 16 A100 GPUs, offering a total of 640GB of GPU memory and 1.3TB of system memory — all connected through NVSwitch with up to 9.6TB/s of aggregate bandwidth.

For those with smaller workloads, Google Compute Engine is also offering A2 VMs in smaller configurations to match specific applications’ needs.

Google Cloud announced that additional NVIDIA A100 support is coming soon to Google Kubernetes Engine, Cloud AI Platform and other Google Cloud services. For more information, including technical details on the new A2 VM family and how to sign up for access, visit the Google Cloud blog.

The post NVIDIA Ampere GPUs Come to Google Cloud at Speed of Light appeared first on The Official NVIDIA Blog.

Read More

Building a custom Angular application for labeling jobs with Amazon SageMaker Ground Truth

As a data scientist attempting to solve a problem using supervised learning, you usually need a high-quality labeled dataset before starting your model building. Amazon SageMaker Ground Truth makes dataset building for a diverse range of tasks, like text classification and object detection, easier and more accessible to everyone.

Ground Truth also helps you build datasets for custom user-defined tasks that let you annotate anything. This capability is powered by the following:

  • Custom AWS Lambda functions that can be triggered between labeling steps. This allows you to add custom pre-labeling logic, like filtering examples or augmenting them with metadata using other services such as Amazon Translate or Amazon Rekognition, as well as post-labeling logic for label consolidation or quality control.
  • Custom web templates that let you build unique user interfaces using HTML and JavaScript that integrate perfectly with Ground Truth workflows. These templates are easy to build with Crowd HTML Elements, which are a set of common UI elements used for text, video, and audio labeling jobs that you can arrange like blocks in your custom template.
  • Availability of a large set of skilled and specialized workforces in the AWS Marketplace and in Amazon Mechanical Turk if you need to augment your private teams of subject matter experts. Vetted partners in the AWS Marketplace cover numerous languages as well as specific skills in video and image annotations that fit different industry needs (like medical labeling).

For complex labeling tasks, such as complex taxonomy classification, extreme multi-class classifications, or autonomous driving labeling tasks, you may need to build a more complex front-end application for your labeling workforce. Front-end frameworks like Angular are helpful in these cases because they bring useful design patterns like model-view-controller (MVC), which makes your codebase more robust and maintainable for a larger team composed of UX/UI designers and software developers.

This post walks you through using Angular and Angular Elements to create fully customizable solutions that work nicely with Ground Truth. This walkthrough assumes that you’re familiar with running a custom labeling job with Ground Truth and Crowd HTML Elements. For more information, see Build a custom data labeling workflow with Amazon SageMaker Ground Truth.

The approach described in this post also works with Amazon Augmented AI (Amazon A2I), which makes it easy to build the workflows required for human review of machine learning predictions. This is possible because Amazon A2I uses Crowd HTML Elements to create custom worker templates. For more information, see Create Custom Worker Templates.

Building a custom UI for complex taxonomy classification

If you manage large supply chains and interact with different types of suppliers, like global food restaurants or automotive manufacturers, you likely receive invoices in different formats and languages. To keep track of your operations and drive financial efficiencies, you need teams behind the scenes to map invoices and receipts to large categories of products and organize them in hierarchical taxonomies.

The following diagram illustrates a hierarchical taxonomy of computer components.

The following diagram illustrates a hierarchical taxonomy of types of food.

Hierarchical taxonomies can have thousands of categories at their leaf level. Such examples can include web directories (the Yahoo! Directory or the Open Directory Project), library classification schemes (Dewey Decimal or Library of Congress), or the classification schemes used in natural science, legal, or medical applications.

What if a natural language processing (NLP) model could help you automatically tag every invoice to the proper category? What if text labeling tools could extract categories from invoices?

Even though accurate classification over large sets of closely related classes is inherently difficult, it all starts with constructing a high-quality dataset in the most cost-efficient manner.

Taxonomy labeling with Angular Elements

For the following use case, you are one of the biggest fast food chains operating and sourcing materials across the world. To build a dataset for your NLP model, you came up with a single-page web app based on UX research that helps your workforce read an invoice description and select the corresponding category in the taxonomy. See the following screenshot.

This implementation makes use of Angular Material tabs and a filter box that makes navigating the categories easy. It also displays an English translation of the invoice description so that workers can label invoices from across the world. Moreover, because it’s built on a framework like Angular, you can improve it down the line with more elements, such as drop-downs for the higher levels of the taxonomy or dynamic content like images or videos based on third-party APIs.

For more information about this application, see the GitHub repo.

The application is built using Angular Elements, which creates Angular components packaged as custom elements (also called web components), a web standard for defining new HTML elements in a framework-agnostic way. This enables you to integrate smoothly with Crowd HTML Elements later on.

Angular Elements inputs and outputs

In this use case, your Angular component expects two inputs: an invoice description and its English translation. These are passed to it as tag attributes on <ng-home> (the directive that designates the root element of the application). The values are then captured by the @Input() annotations defined in the Angular controller in src/app/home.ts. See the following code:

<ng-home source='10牛ステーキ-20パッケージ-ブランドX' translation='10 beef steak - 20 packages - brand X' id="home">loading</ng-home> 

export class Home implements OnInit {

  @Input() invoice = '';
  @Input() translation = '';
  
  ...

The values are rendered through data binding in the placeholders {{ invoice }} and {{ translation }} in the Angular view in src/app/home.html. See the following code:

<!-- Invoice Description -->
<div class="card" >
    <div class="card-header">
        <h3>Invoice Description</h3>
    </div>
    <div>
        <p id="step1">
        <span>Invoice Description: <br />
        <b>{{ invoice }}</b></span>
        </p>
        <p style='font-weight: small; color: gray;' id="step2">
        <span>English Translation: <br /> {{ translation }}</span>
        </p>
    </div>
</div>

The following screenshot shows the Meats tab on the Food Categories page.

When you choose a category and choose Submit, the Angular component should also broadcast a JavaScript event containing the category ID to its parent DOM element. This is achieved using the @Output() annotation in the Angular controller in src/app/home.ts. See the following code:

<button mat-button color="primary" (click)="onSubmit()" id="submitButton">Submit</button>

<table>
    ...
    <tr mat-row *matRowDef="let row; columns: displayedColumns;"
        (click)="selectRow(row)" [ngClass]="{ 'highlight': row === selectedRow }">
    </tr>
</table>
@Output('rowselected') rowselected = new EventEmitter<any>();

// called when the user clicks a row in the table ("selecting" a category)
selectRow(row) {
    this.selectedRow = row;
}

// called when the user clicks the Submit button
onSubmit() {
    this.rowselected.emit(this.selectedRow);
}

Angular integration with Crowd HTML Elements

Communication between Angular Elements and Crowd HTML Elements happens through the mechanism described in the preceding section.

Following the steps described in Build a custom data labeling workflow with Amazon SageMaker Ground Truth, you can adapt how to pass the text to annotate and how to catch the broadcasted event from Angular Elements to create your custom template.

The following code shows the full Liquid HTML template to use when creating your labeling job. This file should also be the index.html root file of your Angular app under the src/ folder. (Make sure to use the index.html file under the dist folder, which has the minified .js files injected into it, with the right Amazon Simple Storage Service (Amazon S3) path to host your app.)

<!doctype html>
<html lang="en">
  <head>
    <script src="https://assets.crowd.aws/crowd-html-elements.js"></script>
  </head>
  <body>

    <crowd-form style="display: none;">
        <input name="annotations" id="annotations" type="hidden">
        <input name="timeElapsed" id="timeElapsed" type="hidden">
         <!-- Prevent crowd-form from creating its own button -->
        <crowd-button form-action="submit" style="display: none;"></crowd-button>
    </crowd-form>

    <div class="mat-app-background basic-container">
      <!-- Dev Mode to test the Angular Element -->
      <!-- <ng-home source='10牛ステーキ-20パッケージ-ブランドX' translation='10 beef steak - 20 packages - brand X' id="home">loading</ng-home> -->
      <ng-home source='{{ task.input.source }}' translation='{{ task.input.translatedDesc }}'>loading</ng-home>
    </div>

    <script src="<your-s3-bucket-angular-app>/runtime-es2015.js" type="module"></script>
    <script src="<your-s3-bucket-angular-app>/runtime-es5.js" nomodule defer></script>
    <script src="<your-s3-bucket-angular-app>/polyfills-es5.js" nomodule defer></script>
    <script src="<your-s3-bucket-angular-app>/polyfills-es2015.js" type="module"></script>
    <script src="<your-s3-bucket-angular-app>/styles-es2015.js" type="module"></script>
    <script src="<your-s3-bucket-angular-app>/styles-es5.js" nomodule defer></script>
    <script src="<your-s3-bucket-angular-app>/vendor-es2015.js" type="module"></script>
    <script src="<your-s3-bucket-angular-app>/vendor-es5.js" nomodule defer></script>
    <script src="<your-s3-bucket-angular-app>/main-es2015.js" type="module"></script>
    <script src="<your-s3-bucket-angular-app>/main-es5.js" nomodule defer></script>
</body>
</html>

<script>

  document.addEventListener("DOMContentLoaded", function(event) {
    // Counter
    var enterDate = new Date();
    function secondsSinceEnter()
    {
      return (new Date() - enterDate) / 1000;
    }

    // GT Form Submitting
    document.querySelector('ng-home').addEventListener('rowselected', (event) => {
      // alert(event.detail.CODE);
      document.getElementById('annotations').value = event.detail.CODE;
      document.getElementById('timeElapsed').value = secondsSinceEnter();
      document.querySelector('crowd-form').submit();
    });

  });

</script>
<style>
  .body {
    background-color: #fafafa;
  }

  .header {
    background: #673ab7;
      color: #fff;
      padding: 0 16px;
      margin: 20px 20px 0px 20px;
      padding: 20px;
  }

  .cards {
    display: grid;
    grid-template-columns: 30% auto;
    grid-auto-rows: auto;
    grid-gap: 1rem;
    margin: 20px 20px 0px 20px;
  }

  .card {
    box-shadow: 0 2px 1px -1px rgba(0,0,0,.2), 0 1px 1px 0 rgba(0,0,0,.14), 0 1px 3px 0 rgba(0,0,0,.12);
    transition: box-shadow 280ms cubic-bezier(.4,0,.2,1);
    display: block;
    position: relative;
    padding: 16px;
    border-radius: 4px;
    /* margin: 20px 0px 0px 20px; */
    border: 2px solid #e7e7e7;
    border-radius: 4px;
  }

  .highlight-step {
    background-color: #2515424a;
    margin: 0px -15px 0px -15px;
    padding: 15px;
  }
</style>

Creating the template

To create the preceding template, complete the following steps:

  1. Add the crowd-html-element.js script at the top of the template so you can use Crowd HTML Elements:
    <script src="https://assets.crowd.aws/crowd-html-elements.js"></script>

  2. Inject the text to annotate and the associated metadata coming from the pre-processing Lambda function to the user interface using the Liquid templating language directly in root element <ng-home>:
    <ng-home source='{{ task.input.source }}' translation='{{ task.input.translatedDesc }}' id="home">loading</ng-home>

  3. Use the <crowd-form /> element, which submits the annotations to Ground Truth. The element is hidden because the submission happens in the background. See the following code:
    <crowd-form style="display: none;">
            <input name="annotations" id="annotations" type="hidden">
            <input name="timeElapsed" id="timeElapsed" type="hidden">
             <!-- Prevent crowd-form from creating its own button -->
            <crowd-button form-action="submit" style="display: none;"></crowd-button>
    </crowd-form>
    

  4. Instead of using Crowd HTML Elements to submit the annotation, include a small script to integrate the Angular Element with <crowd-form />:
    document.addEventListener("DOMContentLoaded", function(event) {
    
        var enterDate = new Date();    
        function secondsSinceEnter()
        {
          return (new Date() - enterDate) / 1000;
        }
      
        document.querySelector('ng-home').addEventListener('rowselected', (event) => {
          document.getElementById('annotations').value = event.detail.CODE;
          document.getElementById('timeElapsed').value = secondsSinceEnter();
          document.querySelector('crowd-form').submit();
        });
      
      });
    

For this use case, I’m also keeping a counter to monitor the time it takes a worker to complete the annotation.

The following diagram illustrates the data flow between each element.

Conclusion

This post showed how to build a custom labeling UI with Angular and Ground Truth. The solution can handle communication between the different scopes in the custom template provided in the labeling job creation. The ability to use a custom front-end framework like Angular enables you to easily create modern web applications that serve your exact needs when tapping into public, private, or vendor labeling workforces.

For more information about hierarchical taxonomies in Ground Truth, see Creating hierarchical label taxonomies using Amazon SageMaker Ground Truth.

If you have any comments or questions about this post, please use the comments section. Happy labeling!


About the Authors

Yassine Landa is a Data Scientist at AWS. He holds an undergraduate degree in Math and Physics, and master’s degrees from French universities in Computer Science and Data Science, Web Intelligence, and Environment Engineering. He is passionate about building machine learning and artificial intelligence products for customers, and has won multiple awards for machine learning products he has built with tech startups and as a startup founder.

Read More