Introducing AWS AI Service Cards: A new resource to enhance transparency and advance responsible AI

Artificial intelligence (AI) and machine learning (ML) are some of the most transformative technologies we will encounter in our generation—to tackle business and societal problems, improve customer experiences, and spur innovation. Along with the widespread use and growing scale of AI comes the recognition that we must all build responsibly. At AWS, we think responsible AI encompasses a number of core dimensions including:

  • Fairness and bias – How a system impacts different subpopulations of users (e.g., by gender, ethnicity)
  • Explainability – Mechanisms to understand and evaluate the outputs of an AI system
  • Privacy and security – Data protected from theft and exposure
  • Robustness – Mechanisms to ensure an AI system operates reliably
  • Governance – Processes to define, implement, and enforce responsible AI practices within an organization
  • Transparency – Communicating information about an AI system so stakeholders can make informed choices about their use of the system

Our commitment to developing AI and ML in a responsible way is integral to how we build our services, engage with customers, and drive innovation. We are also committed to providing customers with tools and resources to develop and use AI/ML responsibly, from enabling ML builders with a fully managed development environment to helping customers embed AI services into common business use cases.

Providing customers with more transparency

Our customers want to know that the technology they are using was developed in a responsible way. They want resources and guidance to implement that technology responsibly at their own organization. And most importantly, they want to ensure that the technology they roll out is for everyone’s benefit, especially their end-users’. At AWS, we want to help them bring this vision to life.

To deliver the transparency that customers are asking for, we are excited to launch AWS AI Service Cards, a new resource to help customers better understand our AWS AI services. AI Service Cards are a form of responsible AI documentation that provide customers with a single place to find information on the intended use cases and limitations, responsible AI design choices, and deployment and performance optimization best practices for our AI services. They are part of a comprehensive development process we undertake to build our services in a responsible way that addresses fairness and bias, explainability, robustness, governance, transparency, privacy, and security. At AWS re:Invent 2022 we’re making the first three AI Service Cards available: Amazon Rekognition – Face Matching, Amazon Textract – AnalyzeID, and Amazon Transcribe – Batch (English-US).

Components of the AI Service Cards

Each AI Service Card contains four sections covering:

  • Basic concepts to help customers better understand the service or service features
  • Intended use cases and limitations
  • Responsible AI design considerations
  • Guidance on deployment and performance optimization

The content of the AI Service Cards addresses a broad audience of customers, technologists, researchers, and other stakeholders who seek to better understand key considerations in the responsible design and use of an AI service.

Our customers use AI in an increasingly diverse set of applications. The intended use cases and limitations section provides information about common uses for a service, and helps customers assess whether a service is a good fit for their application. For example, in the Amazon Transcribe – Batch (English-US) Card we describe the service use case of transcribing general-purpose vocabulary spoken in US English from an audio file. If a company wants a solution that automatically transcribes a domain-specific event, such as an international neuroscience conference, it can add custom vocabularies and custom language models that cover the scientific terminology to increase the accuracy of the transcription.
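
As a concrete illustration of that lever, the sketch below (ours, not taken from the Service Card) attaches a custom vocabulary and a custom language model to a batch transcription job with boto3; the vocabulary name, job name, S3 URI, and language model name are placeholders.

```python
# Minimal sketch: custom vocabulary + custom language model for a batch job.
# All names and URIs below are placeholders, not values from the Service Card.
import boto3

transcribe = boto3.client("transcribe")

transcribe.create_vocabulary(
    VocabularyName="neuroscience-terms",          # hypothetical vocabulary
    LanguageCode="en-US",
    Phrases=["hippocampus", "optogenetics", "neurotransmitter"],
)
# Note: vocabulary creation is asynchronous; it must reach the READY state
# before a transcription job can reference it.

transcribe.start_transcription_job(
    TranscriptionJobName="conference-keynote-001",            # hypothetical job name
    LanguageCode="en-US",
    MediaFormat="wav",
    Media={"MediaFileUri": "s3://example-bucket/keynote.wav"},  # placeholder URI
    Settings={"VocabularyName": "neuroscience-terms"},
    ModelSettings={"LanguageModelName": "neuro-clm"},          # assumes a trained custom LM
)
```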

In the design section of each AI Service Card, we explain key responsible AI design considerations across important areas, such as our test-driven methodology, fairness and bias, explainability, and performance expectations. We provide example performance results on an evaluation dataset that is representative of a common use case. This example is just a starting point though, as we encourage customers to test on their own datasets to better understand how the service will perform on their own content and use cases in order to deliver the best experience for their end customers. And this is not a one-time evaluation. To build in a responsible way, we recommend an iterative approach where customers periodically test and evaluate their applications for accuracy or potential bias.

In the best practices for deployment and performance optimization section, we lay out key levers that customers should consider to optimize the performance of their application for real-world deployment. It’s important to explain how customers can optimize the performance of an AI system that acts as a component of their overall application or workflow to get the maximum benefit. For example, in the Amazon Rekognition Face Matching Card that covers adding face recognition capabilities to identity verification applications, we share steps customers can take to increase the quality of the face matching predictions incorporated into their workflow.
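
One such lever is the similarity threshold applied to match confidence. The sketch below (our illustration, not the Card's own example) sets that threshold explicitly in a CompareFaces call; the bucket and object names are placeholders, and the right threshold depends on the workflow.

```python
# Minimal sketch of a face-matching call with an explicit similarity threshold.
# Bucket/object names are placeholders; threshold choice depends on the use case.
import boto3

rekognition = boto3.client("rekognition")

response = rekognition.compare_faces(
    SourceImage={"S3Object": {"Bucket": "example-bucket", "Name": "id-photo.jpg"}},
    TargetImage={"S3Object": {"Bucket": "example-bucket", "Name": "selfie.jpg"}},
    SimilarityThreshold=99.0,   # raise for identity verification, lower for search-style use
    QualityFilter="AUTO",       # filter out low-quality face detections before matching
)

for match in response["FaceMatches"]:
    print(f"Similarity: {match['Similarity']:.1f}%")
```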

Delivering responsible AI resources and capabilities

Offering our customers the resources and tools they need to transform responsible AI from theory to practice is an ongoing priority for AWS. Earlier this year we launched our Responsible Use of Machine Learning guide that provides considerations and recommendations for responsibly using ML across all phases of the ML lifecycle. AI Service Cards complement our existing developer guides and blog posts, which provide builders with descriptions of service features and detailed instructions for using our service APIs. And with Amazon SageMaker Clarify and Amazon SageMaker Model Monitor, we offer capabilities to help detect bias in datasets and models and better monitor and review model predictions through automation and human oversight.

At the same time, we continue to advance responsible AI across other key dimensions, such as governance. At re:Invent today we launched a new set of purpose-built tools to help customers improve governance of their ML projects with Amazon SageMaker Role Manager, Amazon SageMaker Model Cards, and Amazon SageMaker Model Dashboard. Learn more on the AWS News blog and website about how these tools help to streamline ML governance processes.

Education is another key resource that helps advance responsible AI. At AWS we are committed to building the next generation of developers and data scientists in AI with the AI and ML Scholarship Program and AWS Machine Learning University (MLU). This week at re:Invent we launched a new, public MLU course on fairness considerations and bias mitigation across the ML lifecycle. Taught by the same Amazon data scientists who train AWS employees on ML, this free course features 9 hours of lectures and hands-on exercises, making it easy to get started.

AI Service Cards: A new resource—and an ongoing commitment

We are excited to bring a new transparency resource to our customers and the broader community and provide additional information on the intended uses, limitations, design, and optimization of our AI services, informed by our rigorous approach to building AWS AI services in a responsible way. Our hope is that AI Service Cards will act as a useful transparency resource and an important step in the evolving landscape of responsible AI. AI Service Cards will continue to evolve and expand as we engage with our customers and the broader community to gather feedback and continually iterate on our approach.

Contact our group of responsible AI experts to start a conversation.


About the authors

Vasi Philomin is currently a Vice President in the AWS AI team, responsible for services in the language and speech technologies areas such as Amazon Lex, Amazon Polly, Amazon Translate, Amazon Transcribe and Transcribe Medical, Amazon Comprehend, Amazon Kendra, Amazon CodeWhisperer, Amazon Monitron, Amazon Lookout for Equipment, and Contact Lens/Voice ID for Amazon Connect, as well as the Machine Learning Solutions Lab and Responsible AI.

Peter Hallinan leads initiatives in the science and practice of Responsible AI at AWS AI, alongside a team of responsible AI experts. He has deep expertise in AI (PhD, Harvard) and entrepreneurship (Blindsight, sold to Amazon). His volunteer activities have included serving as a consulting professor at the Stanford University School of Medicine, and as the president of the American Chamber of Commerce in Madagascar. When possible, he’s off in the mountains with his children: skiing, climbing, hiking, and rafting.

Making a Traversable Wormhole with a Quantum Computer

Wormholes — wrinkles in the fabric of spacetime that connect two disparate locations — may seem like the stuff of science fiction. But whether or not they exist in reality, studying these hypothetical objects could be the key to making concrete the tantalizing link between information and matter that has bedeviled physicists for decades.

Surprisingly, a quantum computer is an ideal platform to investigate this connection. The trick is to use a correspondence called AdS/CFT, which establishes an equivalence between a theory that describes gravity and spacetime (and wormholes) in a fictional world with a special geometry (AdS) and a quantum theory that does not contain gravity at all (CFT).

In “Traversable wormhole dynamics on a quantum processor”, published in Nature today, we report on a collaboration with researchers at Caltech, Harvard, MIT, and Fermilab to simulate the CFT on the Google Sycamore processor. By studying this quantum theory on the processor, we are able to leverage the AdS/CFT correspondence to probe the dynamics of a quantum system equivalent to a wormhole in a model of gravity. The Google Sycamore processor is among the first to have the fidelity needed to carry out this experiment.

Background: It from Qubit

The AdS/CFT correspondence was discovered at the end of a series of inquiries arising from the question: What’s the maximum amount of information that can fit in a single region of space? If one asked an engineer how much information could possibly be stored in a data center, the answer would likely be that it depends on the number and type of memory chips inside it. But surprisingly, what is inside the data center is ultimately irrelevant. If one were to cram more and more memory chips with denser and denser electronics into the data center, it would eventually collapse into a black hole and disappear behind an event horizon.

When physicists such as Jacob Bekenstein and Stephen Hawking tried to compute the information content of a black hole, they found to their surprise that it is given by the area of the event horizon — not by the volume of the black hole. It looks as if the information inside the black hole was written on the event horizon. Specifically, a black hole with an event horizon that can be tiled with A tiny units of area (each unit, called a “Planck area,” is 2.6121×10⁻⁷⁰ m²) has at most A/4 bits of information. This limit is known as the Bekenstein-Hawking bound.
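
For a sense of scale, here is a back-of-the-envelope application of that bound to a solar-mass black hole; the physical constants are standard values, and the example is ours rather than the post's.

```python
# Back-of-the-envelope application of the Bekenstein-Hawking bound to a
# solar-mass black hole (illustrative; not from the original post).
import math

G = 6.674e-11             # gravitational constant, m^3 kg^-1 s^-2
c = 2.998e8               # speed of light, m/s
M_sun = 1.989e30          # solar mass, kg
planck_area = 2.6121e-70  # m^2, as quoted above

r_s = 2 * G * M_sun / c**2   # Schwarzschild radius, about 2.95 km
A = 4 * math.pi * r_s**2     # horizon area, about 1.1e8 m^2
bits = A / planck_area / 4   # Bekenstein-Hawking bound: A/4 bits

print(f"Horizon area: {A:.2e} m^2, information bound: {bits:.1e} bits")
# About 1e77 bits, set by the horizon area rather than the enclosed volume.
```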

This discovery that the maximum amount of information that could fit in a region was proportional not to its volume, but to the surface area of the region’s boundary hinted at an intriguing relationship between quantum information and the three-dimensional spatial world of our everyday experience. This relationship has been epitomized by the phrase “It from qubit,” describing how matter (“it”) emerges from quantum information (“qubit”).

While formalizing such a relationship is difficult for ordinary spacetime, recent research has led to remarkable progress with a hypothetical universe with hyperbolic geometry known as “anti-de Sitter space” in which the theory of quantum gravity is more naturally constructed. In anti-de Sitter space, the description of a volume of space with gravity acting in it can be thought of as encoded on the boundary enclosing the volume: every object inside the space has a corresponding description on the boundary and vice versa. This correspondence of information is called the holographic principle, which is a general principle inspired by Bekenstein and Hawking’s observations.

Schematic representation of anti-de Sitter space (interior of cylinder) and its dual representation as quantum information on the boundary (surface of cylinder).

The AdS/CFT correspondence allows physicists to connect objects in space with specific ensembles of interacting qubits on the surface. That is, each region of the boundary encodes (in quantum information) the content of a region in spacetime such that matter at any given location can be “constructed” from the quantum information. This allows quantum processors to work directly with qubits while providing insights into spacetime physics. By carefully defining the parameters of the quantum computer to emulate a given model, we can look at black holes, or even go further and look at two black holes connected to each other — a configuration known as a wormhole, or an Einstein-Rosen bridge.

Experiment: Quantum Gravity in the Lab

Implementing these ideas on a Sycamore processor, we have constructed a quantum system that is dual to a traversable wormhole. Translated from the language of quantum information to spacetime physics via the holographic principle, the experiment let a particle fall into one side of a wormhole and observed it emerging on the other side.

Traversable wormholes were recently shown to be possible by Daniel Jafferis, Ping Gao and Aron Wall. Wormholes have long been a staple of science fiction, and there are many possible spacetime geometries in which a wormhole can form, but a naïvely constructed one would collapse on a particle traveling through it. The authors showed that a shockwave — i.e., a deformation of spacetime that propagates at the speed of light — of negative energy would solve this problem, propping open the wormhole long enough to enable traversability. The presence of negative energy in a traversable wormhole is similar to negative energy in the Casimir effect, where vacuum energy pushes together closely spaced plates. In both cases, quantum mechanics permits the energy density at a given location in space to be either positive or negative. On the other hand, if the wormhole experienced a shockwave of positive energy, no information would be allowed to pass through.

The simplest application of the holographic principle to create a wormhole requires many, many qubits — in fact, to approach the pencil-and-paper solutions given by theoretical physicists, one would need an arbitrarily large number of qubits. As the number of qubits is reduced, additional corrections are required that are still poorly understood today. New ideas were needed to build a traversable wormhole on a quantum computer with a limited number of qubits.

One of us (Zlokapa) adopted ideas from deep learning to design a small quantum system that preserved key aspects of gravitational physics. Neural networks are trained via backpropagation, a method that optimizes parameters by directly computing the gradient through the layers of the network. To improve the performance of a neural network and prevent it from overfitting to the training dataset, machine learning (ML) practitioners employ a host of techniques. One of these, sparsification, attempts to restrict the detail of information in the network by setting as many weights as possible to zero.
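
As a generic illustration of sparsification (magnitude pruning here, not the exact procedure used in this work), the idea is to zero out most weights and keep training only the survivors:

```python
# Generic sparsification by magnitude pruning: keep only the largest-magnitude
# weights, zero out the rest, and continue training the surviving parameters.
import numpy as np

rng = np.random.default_rng(0)
weights = rng.normal(size=(8, 8))   # stand-in for a layer's parameters

def prune_by_magnitude(w, sparsity=0.9):
    """Zero out the smallest-magnitude entries so that `sparsity` of them are zero."""
    threshold = np.quantile(np.abs(w), sparsity)
    mask = np.abs(w) >= threshold
    return w * mask, mask

pruned, mask = prune_by_magnitude(weights, sparsity=0.9)
print(f"nonzero weights kept: {mask.sum()} of {mask.size}")
# During further training, gradients would be applied only where mask is True,
# so the model stays sparse while it is optimized to preserve the target behavior.
```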

Similarly, to create the wormhole, we started with a large quantum system and treated it like a neural network. Backpropagation updated the parameters of the system in order to maintain gravitational properties while sparsification reduced the size of the system. We applied ML to learn a system that preserved only one key gravitational signature: the importance of using a negative energy shockwave. The training dataset compared dynamics of a particle traversing a wormhole propped open with negative energy and collapsed with positive energy. By ensuring the learned system preserved this asymmetry, we obtained a sparse model consistent with wormhole dynamics.

Learning procedure to produce a sparse quantum system that captures gravitational dynamics. A single coupling consists of all six possible connections between a given group of four fermions.

Working with Jafferis and a handful of collaborators from Caltech, Fermilab, and Harvard, we subjected the new quantum system to numerous tests to determine if it showed gravitational behavior beyond signatures induced by different energy shockwaves. For example, while quantum mechanical effects can transmit information across a quantum system in a diverse set of ways, information that travels in spacetime — including through a wormhole — must be causally consistent. This and other signatures were verified on classical computers, confirming that the dynamics of the quantum system were consistent with a gravitational interpretation as viewed through the dictionary of the holographic principle.

Implementing the traversable wormhole as an experiment on a quantum processor is an extraordinarily delicate process. The microscopic mechanism of information transfer across qubits is highly chaotic: imagine an ink drop swirling in water. As a particle falls into a wormhole, its information gets smeared over the entire quantum system in the holographic picture. For the negative energy shockwave to work, the scrambling of information must follow a particular pattern known as perfect size winding. After the particle hits the negative energy shockwave, the chaotic patterns effectively proceed in reverse: when the particle emerges from the wormhole, it is as if the ink drop has come back together by exactly undoing its original turbulent spread. If, at any point in time, a small error occurs, the chaotic dynamics will not undo themselves, and the particle will not make it through the wormhole.

Left: Quantum circuit describing a traversable wormhole. A maximally entangled pair of qubits (“EPR pair”) are used as an entanglement probe to send a qubit through the wormhole. The qubit is swapped into the left side of the wormhole at time –t0; the energy shockwave is applied at time 0; and the right side of the wormhole is measured at time t1. Right: Photograph of the Google Sycamore quantum processor.

On the Sycamore quantum processor, we measured how much quantum information passed from one side of the system to the other when applying a negative versus a positive energy shockwave. We observed a slight asymmetry between the two energies, showing the key signature of a traversable wormhole. Due to the protocol’s sensitivity to noise, the Sycamore processor’s low error rates were critical to measuring the signal; with even 1.5x the amount of noise, the signal would have been entirely obscured.

Looking Forward

As quantum devices continue to improve, lower error rates and larger chips will allow deeper probes of gravitational phenomena. Unlike experiments such as LIGO that record data about gravity in the world around us, quantum computers provide a tool to explore theories of quantum gravity. We hope that quantum computers will help develop our understanding of future theories of quantum gravity beyond current models.

Gravity is only one example of the unique ability of quantum computers to probe complex physical theories: quantum processors can provide insight into time crystals, quantum chaos, and chemistry. Our work demonstrating wormhole dynamics represents a step towards discovering fundamental physics using quantum processors at Google Quantum AI.

You can also read more about this result here.

Acknowledgements

We would like to thank our Quantum Science Communicator Katherine McCormick for her help writing this blog post.

Qubit Pharmaceuticals Accelerates Drug Discovery With Hybrid Quantum Computing

The promise of quantum computing is to solve unsolvable problems. And companies are already making headway with hybrid approaches — those that combine classical and quantum computing — to tackle challenges like drug discovery for incurable diseases.

By accelerating drug molecule simulation and modeling with hybrid quantum computing, startup Qubit Pharmaceuticals is significantly reducing the time and investment needed to identify promising treatments in oncology, inflammatory diseases and antivirals.

Qubit is building a drug discovery platform using the NVIDIA QODA programming model for hybrid quantum-classical computers and the startup’s Atlas software suite. Atlas creates detailed simulations of physical molecules, accelerating calculations by a factor of 100,000 compared to traditional research methods.

Founded in 2020, the Paris and Boston-based company is a member of NVIDIA Inception, a program that offers go-to-market support, expertise and technology for cutting-edge startups.

Qubit has one of France’s largest GPU supercomputers for drug discovery, powered by NVIDIA DGX systems. The startup aims for pharmaceutical companies to begin testing their first drug candidates discovered through its GPU-accelerated research next year.

“By combining NVIDIA’s computational power and leading-edge software with Qubit’s simulation and molecular modeling capabilities, we are confident in our ability to dramatically reduce drug discovery time and cut its cost by a factor of 10,” said Robert Marino, president of Qubit Pharmaceuticals. “This unique collaboration should enable us to develop the first quantum physics algorithms applied to drug discovery.”

Tapping Unprecedented Computational Capabilities 

Computational drug discovery involves generating high-resolution simulations of potential drug molecules and predicting how well those molecules might bind to a target protein in the body.

For accurate results, researchers need to perform massive sampling, simulating hundreds of different conformations — possible spatial arrangements of a molecule’s atoms. They must also correctly model molecules’ force fields, the electric charges that predict affinity, or how a molecule will bind to another.
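
As a rough illustration of what a classical force field evaluates (a generic textbook form, not the polarizable force fields used in Qubit's Atlas software), the nonbonded energy between two atoms combines a Coulomb term for the charges with a Lennard-Jones term for dispersion and repulsion:

```python
# Generic textbook nonbonded energy between two atoms (Coulomb + Lennard-Jones).
# Illustrative only; parameters below are made up.
COULOMB_K = 332.0636  # kcal*Angstrom/(mol*e^2), conventional MD units

def nonbonded_energy(r, q1, q2, epsilon, sigma):
    """Pairwise energy (kcal/mol) at distance r (Angstrom) for charges q1, q2 (e)."""
    coulomb = COULOMB_K * q1 * q2 / r
    lj = 4.0 * epsilon * ((sigma / r) ** 12 - (sigma / r) ** 6)
    return coulomb + lj

# Example: two partially charged atoms 3.5 Angstrom apart.
print(nonbonded_energy(r=3.5, q1=-0.4, q2=0.3, epsilon=0.15, sigma=3.4))
```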

This simulation and modeling requires high performance computing, so Qubit selected an in-house supercomputer built with NVIDIA DGX systems and other NVIDIA-accelerated servers, totaling 200 NVIDIA Tensor Core GPUs. The supercomputer runs Qubit’s Atlas software, performing in just a few hours calculations that would take several years with conventional methods.

Atlas models quantum physics at the microscopic level to achieve maximum accuracy. The Qubit team is adopting NVIDIA QODA to explore the hybrid use of GPU-accelerated supercomputers and quantum computers, where QPUs, or quantum processing units, could one day speed up key software kernels for molecular modeling.

Using the NVIDIA cuQuantum SDK, Qubit’s developers can simulate quantum circuits, allowing the team to design algorithms ready to run on future quantum computers.

AI for Every Stage of Drug Discovery

Qubit estimates that while conventional research methods require pharmaceutical developers to start by synthesizing an average of 5,000 drug compounds before preclinical testing to bring a single drug to market, a simulation-based drug discovery approach could reduce the figure to about 200 — saving hundreds of millions of dollars and years of development time.

The company’s Atlas software includes AI algorithms for every stage of the drug discovery cycle. To support target characterization, where researchers analyze a protein that plays a role in disease, Atlas supports molecular dynamics simulations at microsecond timescales — helping scientists identify new pockets for drug molecules to bind with the protein.

During drug candidate screening and validation, researchers can use AI models that help narrow the field of potential molecules and generate novel compounds. Qubit is also developing additional filters that predict a candidate molecule’s druggability, safety and cross-reactivity.

Learn more about Qubit’s HPC and quantum-accelerated molecular dynamics software from company co-founders Jean-Philip Piquemal and Louis Lagardère through NVIDIA On-Demand.

Main image courtesy of Qubit Pharmaceuticals.

Introducing ChatGPT

We’ve trained a model called ChatGPT which interacts in a conversational way. The dialogue format makes it possible for ChatGPT to answer followup questions, admit its mistakes, challenge incorrect premises, and reject inappropriate requests. (OpenAI Blog)

Research Focus: Week of November 28, 2022


This special edition of Research Focus highlights some of the 100+ papers from Microsoft Research that were accepted for publication at NeurIPS 2022 – the thirty-sixth annual Conference on Neural Information Processing Systems.

Few-shot Task-agnostic Neural Architecture Search for Distilling Large Language Models

Dongkuan Xu, Subhabrata Mukherjee, Xiaodong Liu, Debadeepta Dey, Wenhui Wang, Xiang Zhang, Ahmed Hassan Awadallah, Jianfeng Gao

Knowledge distillation (KD) is effective in compressing large pre-trained language models, where we train a small student model to mimic the output distribution of a large teacher model (e.g., BERT, GPT-X). KD relies on hand-designed student model architectures that require several trials and pre-specified compression rates. In our paper, Few-shot Task-agnostic Neural Architecture Search for Distilling Large Language Models, we discuss AutoDistil, a new technique pioneered by Microsoft Research that leverages advances in neural architecture search (NAS) to automatically generate a suite of compressed models with variable computational cost (e.g., varying sizes, FLOPs, and latency). NAS for distillation replaces the hand-engineering of compressed model architectures with an automated framework that can target diverse deployment environments with variable resource constraints. AutoDistil-generated compressed models obtain up to a 41x reduction in FLOPs with limited regression in task performance, and a 6x FLOPs reduction with performance parity with the large teacher model. Given any state-of-the-art compressed model, AutoDistil finds a better compressed variant with a better trade-off between task performance and computational cost during inference.


Neuron with steady response leads to better generalization

Qiang Fu, Lun Du, Haitao Mao, Xu Chen, Wei Fang, Shi Han, Dongmei Zhang

Improving models’ ability to generalize is one of the most important research problems in machine learning. Deep neural networks with diverse architectures have been invented and widely applied to various domains and tasks. Our goal was to study and identify the fundamental properties commonly shared by different kinds of deep neural networks, and then design a generic technique applicable for all of them to improve their generalization.

In this paper, we study the characteristics of individual neurons’ responses during training dynamics at the neuron-level granularity. We find that keeping the response of activated neurons stable for the same class helps improve models’ ability to generalize. This is a new regularization perspective based on the neuron-level, class-dependent response distribution. Meanwhile, we observed that a traditional vanilla model usually lacks good steadiness of intra-class response. Based on these observations, we designed a generic regularization method, Neuron Steadiness Regularization (NSR), to reduce large intra-class neuron response variance. NSR is computationally efficient and applicable to various architectures and tasks. Significant improvements are obtained in extensive experiments with multiple types of datasets and various network architectures. We will continue this research to further improve model generalization.
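
As a rough illustration of the idea (not the paper's exact formulation, which specifies how layers are selected and how the penalty is weighted), a per-neuron intra-class variance penalty could be sketched in PyTorch as follows:

```python
# Minimal sketch of an intra-class neuron-response variance penalty.
# Layer choice and weighting are assumptions, not the paper's exact recipe.
import torch

def neuron_steadiness_penalty(activations: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    """activations: [batch, num_neurons]; labels: [batch] integer class ids."""
    penalty = activations.new_zeros(())
    for c in labels.unique():
        class_acts = activations[labels == c]          # responses for one class
        if class_acts.shape[0] > 1:
            # variance of each neuron's response within the class, averaged over neurons
            penalty = penalty + class_acts.var(dim=0, unbiased=False).mean()
    return penalty / labels.unique().numel()

# Usage inside a training step (lambda_nsr is a hypothetical hyperparameter):
# loss = cross_entropy(logits, labels) + lambda_nsr * neuron_steadiness_penalty(hidden, labels)
```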


Long-form video-language pre-training with multimodal temporal contrastive learning

Yuchong Sun, Hongwei Xue, Ruihua Song, Bei Liu, Huan Yang, Jianlong Fu

Huge numbers of videos on diverse topics and of various lengths are shared on social media. Analyzing and understanding these videos is an important but challenging problem. Previous work on action and scene recognition has been limited to certain labels, while neglecting the rich semantic and dynamic information in other videos. Inspired by the cross-modal pre-training paradigm in the image-language domain (e.g., CLIP, Florence), researchers have explored video-language joint pre-training, which has mainly used short-form videos.

In this research, we propose a Long-Form VIdeo-LAnguage pre-training model (LF-VILA) to explore long-form video representation learning, and train it on a long-form video-language dataset (LF-VILA-8M) built on the basis of our newly collected video-language dataset (HD-VILA-100M). We then design a Multimodal Temporal Contrastive (MTC) loss to capture the temporal relation between video clips and single sentences. We also propose the Hierarchical Temporal Window Attention (HTWA) mechanism in the video encoder to reduce the training time by one-third. Our model achieves significant improvements on nine benchmarks, including paragraph-to-video retrieval, long-form video question-answering, and action recognition tasks. In the future, we will explore using it for broader scenarios, such as ego-centric video understanding.


Microsoft Research Causality and ML team features multiple papers and workshops at NeurIPS 2022

Parikshit Bansal, Ranveer Chandra, Eleanor Dillon, Saloni Dash, Rui Ding, Darren Edge, Adam Foster, Wenbo Gong, Shi Han, Agrin Hilmkil, Joel Jennings, Jian Jiao, Emre Kıcıman, Hua Li, Chao Ma, Sara Malvar, Robert Ness, Nick Pawlowski, Yashoteja Prabhu, Eduardo Rodrigues, Amit Sharma, Swati Sharma, Cheng Zhang, Dongmei Zhang

Identifying causal effects is an integral part of scientific inquiry, helping us to understand everything from educational outcomes to the effects of social policies to risk factors for diseases. Questions of cause-and-effect are also critical for the design and data-driven improvement and evaluation of business and technological systems we build today. The intersection of causal analysis and machine learning is driving rapid advances. Microsoft researchers are excited to be presenting three papers at NeurIPS, along with workshops on new methods and their applications. This includes work improving deep methods for causal discovery, applying causal insights to improve responsible language models, and improving soil carbon modeling with causal approaches. To accelerate research and broaden adoption of the latest causal methods, Microsoft researchers are co-organizing the Workshop on Causality for Real-world Impact and releasing new no-code interactive ShowWhy tools for causal discovery and analysis. We encourage NeurIPS attendees to learn more via the links below or stop by the Microsoft booth for demos and talks.

Main conference papers

Workshop papers

Workshop on Causality for Real-world Impact

Workshop on Tackling Climate Change with Machine Learning

Workshop on Distribution Shifts

Workshop on Understanding Deep Learning Through Empirical Falsification (“I can’t believe it’s not better”)
We’ll be participating in the panel.


New research on generative models

Two papers covering new research on generative models will be presented at NeurIPS 2022.

Vikas Raunak, Matt Post, Arul Menezes

The first paper, Operationalizing Specifications, In Addition to Test Sets for Evaluating Constrained Generative Models, presents recommendations on the evaluation of state-of-the-art generative models for constrained generation tasks. Progress on generative models has been rapid in recent years. These large-scale models have had three impacts: 1) the fluency of generation in both language and vision modalities has rendered common average-case evaluation metrics much less useful in diagnosing system errors; 2) the same substrate models now form the basis of a number of applications, driven both by the utility of their representations as well as phenomena such as in-context learning, which raise the abstraction level of interacting with such models; and 3) user expectations around these models have made the technical challenge of out-of-domain generalization much less excusable in practice. However, our evaluation methodologies haven’t adapted to these changes: while the utility of generative models and the methods of interacting with them have expanded, a similar expansion has not been observed in their evaluation practices. In this paper, we argue that the scale of generative models could be exploited to raise the abstraction level at which evaluation itself is conducted, and we provide recommendations for doing so. Our recommendations are based on leveraging specifications as a powerful instrument to evaluate generation quality and are readily applicable to a variety of tasks.

Vikas Raunak, Arul Menezes

The second paper is Rank-One Editing of Encoder-Decoder Models. Here, we look at large sequence-to-sequence models for tasks such as neural machine translation (NMT), which are usually trained over hundreds of millions of samples. However, training is just the origin of a model’s life cycle. Real-world deployments of models require further behavioral adaptations as new requirements emerge or shortcomings become known. Typically, behavior deletion requests are addressed through model retraining, whereas behavior addition requests are addressed through fine-tuning; both procedures are instances of data-based model intervention. In this work, we present a preliminary study investigating rank-one editing as a direct intervention method for behavior deletion requests in encoder-decoder transformer models. We propose four editing tasks for NMT and show that the proposed editing algorithm achieves high efficacy while requiring only a single positive example to fix an erroneous (negative) model behavior. This research therefore explores a path toward fixing the deleterious behaviors of encoder-decoder models for tasks such as translation, making them safer and more reliable without investing a huge computational budget.
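
To make the mechanism concrete, here is a minimal numerical sketch of a rank-one edit to a single weight matrix; the paper's choice of which layer to edit and how the key and value vectors are derived for NMT is not reproduced here.

```python
# Rank-one edit: modify W so that a chosen key vector k maps to a desired value
# v_star, while the change to W remains a single rank-one perturbation.
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 3))      # stand-in for a weight matrix inside the model
k = rng.normal(size=3)           # key direction associated with the unwanted behavior
v_star = rng.normal(size=4)      # desired output for that key

# Rank-one update: W_new @ k == v_star exactly.
W_new = W + np.outer(v_star - W @ k, k) / (k @ k)

assert np.allclose(W_new @ k, v_star)
print("Frobenius norm of the edit:", np.linalg.norm(W_new - W))
```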


Award Winner: A Neural Corpus Indexer for Document Retrieval

Yujing Wang, Yingyan Hou, Haonan Wang, Ziming Miao, Shibin Wu, Hao Sun, Qi Chen, Yuqing Xia, Chengmin Chi, Guoshuai Zhao, Zheng Liu, Xing Xie, Hao Allen Sun, Weiwei Deng, Qi Zhang, Mao Yang

Note: this paper was named an Outstanding Paper at NeurIPS 2022

Current state-of-the-art document retrieval solutions typically follow an index-retrieve paradigm, where the index is not directly optimized towards the final target. The proposed Neural Corpus Indexer (NCI) model, instead, leverages a sequence-to-sequence architecture, which serves as a model-based index that takes a query as input and outputs the most relevant document identifiers. For the first time, we demonstrate that an end-to-end differentiable document retrieval model can significantly outperform both sparse inverted index and dense retrieval methods. Specifically, NCI achieves +17.6% and +16.8% relative improvements for Recall@1 on the NQ320k dataset and R-Precision on the TriviaQA dataset, respectively, and a competitive MRR without using an explicit re-ranking model.

The pipeline is composed of three stages. In the first stage, documents are encoded into semantic identifiers by a hierarchical k-means algorithm. In the second stage, a query generation model is employed to prepare training pairs. In the third stage, the NCI is trained with cross-entropy and consistency-based regularization losses. To further align with the hierarchical nature of the semantic identifiers, a weight adaptation mechanism is introduced to make the decoder aware of semantic prefixes. During inference, the top N relevant documents can be easily obtained via beam search. The proposed approach introduces architectural and training choices that demonstrate the promising future of neural indexers as a viable alternative, and the open questions discussed can serve as inspiration for future research.
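
A small sketch of the first stage is below, assuming document embeddings are already available and using scikit-learn's KMeans; the cluster count, leaf size, and depth are illustrative rather than the paper's settings.

```python
# Hierarchical k-means semantic identifiers: each document gets a sequence of
# cluster indices, terminated by its position within a small leaf cluster.
import numpy as np
from sklearn.cluster import KMeans

def semantic_ids(embeddings: np.ndarray, k: int = 10, leaf_size: int = 100, prefix=()):
    """Recursively cluster embeddings and return {doc_index: identifier_tuple}."""
    ids = {}
    if len(embeddings) <= leaf_size:
        for i in range(len(embeddings)):
            ids[i] = prefix + (i,)            # within-leaf position terminates the ID
        return ids
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(embeddings)
    for c in range(k):
        members = np.where(labels == c)[0]
        sub = semantic_ids(embeddings[members], k, leaf_size, prefix + (c,))
        for local_idx, sid in sub.items():
            ids[members[local_idx]] = sid      # map local indices back to this level
    return ids

docs = np.random.default_rng(0).normal(size=(500, 64))   # toy document embeddings
# Each document maps to a tuple like (cluster, ..., position-within-leaf).
print(list(semantic_ids(docs).items())[:3])
```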


Microsoft Research career opportunities – come join us!

We’re hiring for multiple roles including internships and researchers at all levels in multiple Microsoft Research labs. Join us and work on causal ML, precision health, genomics, deep learning, robotics, or computational chemistry. If you’re attending the conference, stop by the Microsoft booth (Expo Hall G, Booth #202) to speak with researchers and recruiters about working at Microsoft and open job opportunities. Or you can browse our current openings at NeurIPS 2022 – Microsoft Research career opportunities.

Better Language Models Without Massive Compute

In recent years, language models (LMs) have become more prominent in natural language processing (NLP) research and are also becoming increasingly impactful in practice. Scaling up LMs has been shown to improve performance across a range of NLP tasks. For instance, scaling up language models can improve perplexity across seven orders of magnitude of model sizes, and new abilities such as multi-step reasoning have been observed to arise as a result of model scale. However, one of the challenges of continued scaling is that training new, larger models requires great amounts of computational resources. Moreover, new models are often trained from scratch and do not leverage the weights from previously existing models.

In this blog post, we explore two complementary methods for improving existing language models by a large margin without using massive computational resources. First, in “Transcending Scaling Laws with 0.1% Extra Compute”, we introduce UL2R, which is a lightweight second stage of pre-training that uses a mixture-of-denoisers objective. UL2R improves performance across a range of tasks and even unlocks emergent performance on tasks that previously had close to random performance. Second, in “Scaling Instruction-Finetuned Language Models”, we explore fine-tuning a language model on a collection of datasets phrased as instructions, a process we call “Flan”. This approach not only boosts performance, but also improves the usability of the language model to user inputs without engineering of prompts. Finally, we show that Flan and UL2R can be combined as complementary techniques in a model called Flan-U-PaLM 540B, which outperforms the unadapted PaLM 540B model by 10% across a suite of challenging evaluation benchmarks.

UL2R Training

Traditionally, most language models are pre-trained either on a causal language modeling objective that enables the model to predict the next word in a sequence (e.g., GPT-3 or PaLM) or on a denoising objective, where the model learns to recover the original sentence from a corrupted sequence of words (e.g., T5). Although there are some tradeoffs between language modeling objectives, in that causal LMs are better at long-form generation and LMs trained on a denoising objective are better for fine-tuning, in prior work we demonstrated that a mixture-of-denoisers objective that includes both results in better performance in both scenarios.
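
To make the distinction concrete, here is a toy, word-level sketch of how the same sequence yields training examples under each objective family; the actual UL2 mixture uses several denoiser configurations over subword tokens, which this sketch does not reproduce.

```python
# Toy illustration of the two objective families being mixed.
import random

tokens = "the quick brown fox jumps over the lazy dog".split()

# Causal LM objective: predict each next token from the prefix.
causal_pairs = [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]

def span_corrupt(seq, span_len=2, seed=0):
    """Denoising objective: mask a contiguous span, ask the model to reconstruct it."""
    rng = random.Random(seed)
    start = rng.randrange(0, len(seq) - span_len)
    corrupted = seq[:start] + ["<X>"] + seq[start + span_len:]   # <X> is a sentinel token
    target = ["<X>"] + seq[start:start + span_len]
    return corrupted, target

print(causal_pairs[3])       # (['the', 'quick', 'brown', 'fox'], 'jumps')
print(span_corrupt(tokens))  # masked input and the span to reconstruct
```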

However, pre-training a large language model on a different objective from scratch can be computationally prohibitive. Hence, we propose UL2 Repair (UL2R), an additional stage of continued pre-training with the UL2 objective that only requires a relatively small amount of compute. We apply UL2R to PaLM and call the resulting new language model U-PaLM.

In empirical evaluations, we found that scaling curves improve substantially with only a small amount of UL2 training. For instance, we show that by using UL2R on the intermediate checkpoint of PaLM 540B, we reach the performance of the final PaLM 540B checkpoint with half the compute (a saving of 4.4 million TPUv4 hours). Naturally, applying UL2R to the final PaLM 540B checkpoint also leads to substantial improvements, as described in the paper.

Compute versus model performance of PaLM 540B and U-PaLM 540B on 26 NLP benchmarks (listed in Table 8 in the paper). U-PaLM 540B continues training PaLM for a very small amount of compute but provides a substantial gain in performance.

Another benefit that we observed from using UL2R is that on some tasks, performance is much better than models trained purely on the causal language modeling objective. For instance, there are many BIG-Bench tasks that have been described as “emergent abilities”, i.e., abilities that can only be observed in sufficiently large language models. Although the way that emergent abilities are most commonly found is by scaling up the size of the LM, we found that UL2R can actually elicit emergent abilities without increasing the scale of the LM.

For instance, in the Navigate task from BIG-Bench, which measures the model’s ability to perform state tracking, all models except U-PaLM with less than 10²³ training FLOPs achieve approximately random performance. U-PaLM’s performance is more than 10 points above that. Another example is the Snarks task from BIG-Bench, which measures the model’s ability to detect sarcasm. Again, whereas all models with less than 10²⁴ training FLOPs achieve approximately random performance, U-PaLM scores well above random even at the 8B and 62B model sizes.

For two abilities from BIG-Bench that demonstrate emergent task performance, U-PaLM achieves emergence at a smaller model size due to its use of the UL2R objective.

Instruction Fine-Tuning

In our second paper, we explore instruction fine-tuning, which involves fine-tuning LMs on a collection of NLP datasets phrased as instructions. In prior work, we applied instruction fine-tuning to a 137B-parameter model on 62 NLP tasks, such as answering a trivia question, classifying the sentiment of a movie, or translating a sentence to Spanish.

In this work we fine-tune a 540B parameter language model on more than 1.8K tasks. Moreover, whereas previous efforts only fine-tuned a LM with few-shot exemplars (e.g., MetaICL) or zero-shot without exemplars (e.g., FLAN, T0), we fine-tune on a combination of both. We also include chain of thought fine-tuning data, which enables the model to perform multi-step reasoning. We call our improved methodology “Flan”, for fine-tuning language models. Notably, even with fine-tuning on 1.8K tasks, Flan only uses a small portion of compute compared to pre-training (e.g., for PaLM 540B, Flan only requires 0.2% of the pre-training compute).
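
To illustrate what "phrased as instructions" means in practice, here is a simplified sketch of zero-shot and few-shot instruction templates for a sentiment task; these templates are illustrative and are not the Flan collection's actual prompts.

```python
# Simplified instruction templates for one task, with and without exemplars.
def zero_shot(review: str) -> str:
    return (
        "Classify the sentiment of the following movie review as positive or negative.\n"
        f"Review: {review}\nSentiment:"
    )

def few_shot(review: str, exemplars: list[tuple[str, str]]) -> str:
    shots = "\n".join(f"Review: {r}\nSentiment: {label}" for r, label in exemplars)
    return (
        "Classify the sentiment of each movie review as positive or negative.\n"
        f"{shots}\nReview: {review}\nSentiment:"
    )

print(zero_shot("A moving, beautifully shot film."))
print(few_shot("Flat characters and a predictable plot.",
               [("An absolute delight.", "positive"),
                ("Two hours I will never get back.", "negative")]))
```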

We fine-tune language models on 1.8K tasks phrased as instructions, and evaluate them on unseen tasks, which are not included in fine-tuning. We fine-tune both with and without exemplars (i.e., zero-shot and few-shot) and with and without chain of thought, enabling generalization across a range of evaluation scenarios.

In the paper, we instruction-fine-tune LMs across a range of sizes to investigate the joint effect of scaling both the size of the LM and the number of fine-tuning tasks; for the PaLM class of LMs, for instance, this includes models with 8B, 62B, and 540B parameters. We evaluate our models on four challenging benchmark evaluation suites (MMLU, BBH, TyDiQA, and MGSM), and find that both scaling the number of parameters and the number of fine-tuning tasks improves performance on unseen tasks.

Both scaling up to a 540B parameter model and using 1.8K fine-tuning tasks improves the performance on unseen tasks. The y-axis is the normalized average over four evaluation suites (MMLU, BBH, TyDiQA, and MGSM).

In addition to better performance, instruction fine-tuning a LM enables it to respond to user instructions at inference time, without few-shot exemplars or prompt engineering. This makes LMs more user-friendly across a range of inputs. For instance, LMs without instruction fine-tuning can sometimes repeat the input or fail to follow instructions, but instruction fine-tuning mitigates such errors.

Our instruction–fine-tuned language model, Flan-PaLM, responds better to instructions compared to the PaLM model without instruction fine-tuning.

Putting Them Together

Finally, we show that UL2R and Flan can be combined to train the Flan-U-PaLM model. Since Flan uses new data from NLP tasks and enables zero-shot instruction following, we apply Flan as the second method after UL2R. We again evaluate on the four benchmark suites, and find that the Flan-U-PaLM model outperforms PaLM models with just UL2R (U-PaLM) or just Flan (Flan-PaLM). Further, Flan-U-PaLM achieves a new state-of-the-art on the MMLU benchmark with a score of 75.4% when combined with chain of thought and self-consistency.

Combining UL2R and Flan (Flan-U-PaLM) leads to the best performance compared to just using UL2R (U-PaLM) or just Flan (Flan-PaLM). Performance is the normalized average over four evaluation suites (MMLU, BBH, TyDiQA, and MGSM).

Average performance on the four challenging evaluation suites:

  • PaLM: 49.1%
  • U-PaLM: 50.2%
  • Flan-PaLM: 58.4%
  • Flan-U-PaLM: 59.1%

Overall, UL2R and Flan are two complementary methods for improving pre-trained language models. UL2R adapts the LM to a mixture-of-denoisers objective using the same data, whereas Flan leverages training data from over 1.8K NLP tasks to teach the model to follow instructions. As LMs become even larger, techniques such as UL2R and Flan that improve general performance without large amounts of compute may become increasingly attractive.

Acknowledgements

It was a privilege to collaborate on these two papers with Hyung Won Chung, Vinh Q. Tran, David R. So, Siamak Shakeri, Xavier Garcia, Huaixiu Steven Zheng, Jinfeng Rao, Aakanksha Chowdhery, Denny Zhou, Donald Metzler, Slav Petrov, Neil Houlsby, Quoc V. Le, Mostafa Dehghani, Le Hou, Shayne Longpre, Barret Zoph, Yi Tay, William Fedus, Yunxuan Li, Xuezhi Wang, Siddhartha Brahma, Albert Webson, Shixiang Shane Gu, Zhuyun Dai, Mirac Suzgun, Xinyun Chen, Sharan Narang, Gaurav Mishra, Adams Yu, Vincent Zhao, Yanping Huang, Andrew Dai, Hongkun Yu, Ed H. Chi, Jeff Dean, Jacob Devlin, and Adam Roberts.

Building a TensorFlow Lite based computer vision emoji input device with OpenMV

A guest post by Sandeep Mistry, Arm

Introduction

Emojis allow us to express emotions in the digital world. They are relatively easy to input on smartphone and tablet devices equipped with touch-screen-based virtual keyboards, but they are not as easy to input on traditional computing devices that have physical keyboards. To input emojis on these devices, users typically use a keyboard shortcut or mouse to bring up an on-screen emoji selector, and then use a mouse to select the desired emoji from a series of categories.

This blog will highlight an in-depth open-source guide that uses tinyML on an Arm Cortex-M based device to create a dedicated input device. This device takes real-time input from a camera and applies a machine learning (ML) image classification model to detect if the image from the camera contains a set of known hand gestures (✋, 👎, 👍, 👊). When a hand gesture is detected with high certainty, the device then uses the USB Human Interface Device (HID) protocol to “type” the emoji on the PC.

The TensorFlow Lite for Microcontrollers run-time with Arm CMSIS-NN is used as the on-device ML inferencing framework on the dedicated input device. On-device inferencing will allow us to reduce the latency of the system, as the image data will be processed at the source (instead of being transmitted to a cloud service). The user’s privacy will also be preserved, as no image data will leave the device at inference time.

NOTE: The complete in-depth and interactive tutorial is available on Google Colab and all technical assets for the guide can be found on GitHub.

Microcontrollers and Keyboards

Microcontroller Units (MCUs) are self-contained computing systems embedded in the devices you use every day, including your keyboard! Like all computing systems, they have inputs and outputs.

The MCU inside a USB keyboard reacts to the digital events that occur when one or more of the key switches on the keyboard are pressed or released. The MCU determines which key(s) triggered the event and then translates the event into a USB HID message to send to the PC using the USB standard.
Block diagram of USB keyboard
The emoji ‘keyboard’ will use an image sensor for input (instead of key switches) and then process the image data locally on a more powerful Arm Cortex-M7 based microcontroller. All operations, including ML inferencing, are performed on a STM32H7 MCU, which contains an Arm Cortex-M7 CPU along with a digital interface for the image sensor and USB communications.
Block diagram of computer vision based emoji “keyboard”
Even though the STM32H7 is a constrained computing platform that runs at 480 MHz with 1 MB of on-board RAM, we can still process a grayscale 96×96 pixel image input from the camera at just under 20 frames per second (fps)!

The OpenMV development platform

OpenMV is an open source (Micro) Python powered Machine Vision platform. The OpenMV product line-up consists of several Arm Cortex-M based development boards. Each board is equipped with an on-board camera and MCU. For this project, the OpenMV Cam H7 or OpenMV Cam H7 R2 board will suit our needs.

What we will need

OpenMV Cam H7 camera (left) and microSD card (right)

  • Hardware: an OpenMV Cam H7 (or OpenMV Cam H7 R2) camera board and a microSD card

Dataset

Kaggle user Sparsh Gupta (@imsparsh) has previously curated and shared an excellent Gesture Recognition dataset and made it publicly available on Kaggle under a permissive CC0 1.0 Universal (CC0 1.0) Public Domain license.

The dataset contains ~23k image files of people performing various hand gestures over a 30 second period.

Images from the dataset will need to be relabeled as follows:

Original labels:

  1. Left hand swipe
  2. Right hand swipe
  3. Thumbs down
  4. Thumbs up

New labels:

  1. 🚫 – No gesture
  2. ✋ – Hand up
  3. 👎 – Thumbs down
  4. 👍 – Thumbs up
  5. 👊 – Fist

Since the swipe right and swipe left gestures in the Kaggle dataset do not correspond to any of these classes, any images in these classes will need to be discarded for our model.

Because images in the Kaggle dataset are taken over a 30-second period, they might contain other gestures at the start or end of the series. For example, some of the people in the dataset started with their hands in a fist position before eventually going to the labeled gesture (hand up, thumbs up, or thumbs down). Other times, the person in the dataset starts off with no hand gesture in frame.

We have gone ahead and manually re-labeled the images into these classes; the resulting labels can be found in CSV format in the data folder on GitHub and cover ~14k images.

TensorFlow model

You can find more details on the training pipeline used here in this Colab Notebook.

Loading and Augmenting Images

Images from the dataset can be loaded as a TensorFlow Dataset using the tf.keras.utils.image_dataset_from_directory(…) API. This API supports adjusting the image’s color mode (to grayscale) and size (96×96 pixels) to meet the model’s desired input format. Built-in Keras layers for data augmentation (random: flipping, rotation, zooming, and contrast adjustments) will also be used during training.
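
A sketch of that setup is below; the directory layout and batch size are assumptions, while the grayscale color mode and 96×96 size match the model input described above.

```python
# Load the images as a tf.data pipeline and apply random augmentation during training.
import tensorflow as tf

train_ds = tf.keras.utils.image_dataset_from_directory(
    "data/train",                 # hypothetical directory of class-labeled images
    color_mode="grayscale",
    image_size=(96, 96),
    batch_size=32,
)

data_augmentation = tf.keras.Sequential([
    tf.keras.layers.RandomFlip("horizontal"),
    tf.keras.layers.RandomRotation(0.1),
    tf.keras.layers.RandomZoom(0.1),
    tf.keras.layers.RandomContrast(0.1),
])

train_ds = train_ds.map(lambda x, y: (data_augmentation(x, training=True), y))
```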

Model Architecture

MobileNetV1 is a well-known model architecture used for image classification tasks, including the TensorFlow Lite for Microcontrollers Person detection example. This model architecture is trained on our dataset with the same alpha (0.25) and image size (96x96x1) used in the Visual Wake Words Dataset paper. A MobileNetV1 model is composed of 28 layers, but a single call to the Keras tf.keras.applications.mobilenet.MobileNet(…) API can be used to easily create a MobileNetV1 model for 5 output classes and the desired alpha and input shape values:

```python
mobilenet_025_96 = tf.keras.applications.mobilenet.MobileNet(
    input_shape=(96, 96, 1),
    alpha=0.25,
    dropout=0.10,
    weights=None,
    pooling='avg',
    classes=5,
)
```

The MicroPython based firmware used on the OpenMV Cam H7 does not include support for all of the layer types in the MobileNetV1 model created using the Keras API, however it can be adapted to use supported layers using only ~30 lines of Python code. Once the model is adapted and trained it can then be converted to TensorFlow Lite format using the tf.lite.TFLiteConverter.from_keras_model(..) API. The resulting .tflite file can then be used for on-device inference on the OpenMV development board.
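
A minimal sketch of that conversion step is shown below; quantization and other export options are omitted here and covered in the full guide on GitHub.

```python
# Convert the adapted, trained Keras model to TensorFlow Lite format.
import tensorflow as tf

# `mobilenet_025_96` is the adapted and trained Keras model from the snippet above.
converter = tf.lite.TFLiteConverter.from_keras_model(mobilenet_025_96)
tflite_model = converter.convert()

with open("gesture_model.tflite", "wb") as f:   # file name is illustrative
    f.write(tflite_model)
```
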
OpenMV Application and inferencing

The .tflite model can then be integrated into the OpenMV application. You can find more details on the inference application in the Colab Notebook and full source code in the openmv folder on GitHub.

The application will loop continuously performing the following steps:

Block Diagram of Application processing pipeline
  1. Grab an image frame from the camera.
  2. Get the ML model’s output for the captured image frame.
  3. Filter the ML model’s output for high certainty predictions using “low activation” and “margin of confidence” techniques (steps 3-5 are sketched in code after this list).
  4. Use an exponential smoothing function to smooth the model’s noisy (Softmax) outputs.
  5. Use the exponentially smoothed model outputs to determine if a new hand gesture is present.
  6. Then “type” the associated emoji on a PC using the USB HID protocol.
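
A minimal NumPy sketch of steps 3-5 follows; the thresholds and smoothing factor are illustrative, not the values used in the guide.

```python
# Reject low-confidence predictions, smooth the softmax outputs over time,
# and report a gesture class only when one class clearly dominates.
import numpy as np

NUM_CLASSES = 5
smoothed = np.full(NUM_CLASSES, 1.0 / NUM_CLASSES)

def update(probs, alpha=0.3, min_activation=0.5, min_margin=0.2):
    """probs: softmax output for one frame. Returns a class index or None."""
    global smoothed
    smoothed = alpha * probs + (1 - alpha) * smoothed   # exponential smoothing
    top2 = np.sort(smoothed)[-2:]
    if top2[1] < min_activation:                        # "low activation" filter
        return None
    if top2[1] - top2[0] < min_margin:                  # "margin of confidence" filter
        return None
    return int(np.argmax(smoothed))
```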

Conclusion

Throughout this project we’ve covered an end-to-end flow of training a custom image classification model and how to deploy it locally to an Arm Cortex-M7 based OpenMV development board using TensorFlow Lite! TensorFlow was used in a Google Colab notebook to train the model on a re-labeled public dataset from Kaggle. After training, the model was converted into TensorFlow Lite format to run on the OpenMV board using the TensorFlow Lite for Microcontrollers run-time along with accelerated Arm CMSIS-NN kernels.

At inference time, the model’s outputs were processed using model certainty techniques, and the (softmax) activation outputs were then fed into an exponential smoothing function to determine when to send keystrokes over USB HID to type emojis on a PC. The dedicated input device we created was able to capture and process grayscale 96×96 image data at just under 20 fps on an Arm Cortex-M7 processor running at 480 MHz. On-device inferencing provided a low-latency response and preserved the privacy of the user by keeping all image data at the source and processing it locally.

Build one yourself by purchasing an OpenMV Cam H7 R2 board on openmv.io or from a distributor. The project can be extended by fine-tuning the model on your own data or by applying transfer learning techniques, using the model we developed as a base to train other hand gestures. Maybe you can find another public dataset for facial gestures and use it to type 😀 emojis when you smile!

A big thanks to Sparsh Gupta for sharing the Gesture Recognition dataset on Kaggle under a public domain license and my Arm colleagues Rod Crawford, Prathyusha Venkata, Elham Harirpoush, and Liliya Wu for their help in reviewing the material for this blog post and associated tutorial!

Siemens Taps Omniverse Replicator on AWS for Synthetic Data Generation to Accelerate Defect Detection Model Development by 5X

Industrial leader Siemens is accelerating development of defect detection models with 3D synthetic data generation from NVIDIA Omniverse, the latest manufacturing gains to emerge from an extended partnership for the industrial metaverse that aims to advance digital twins.

The Siemens Xcelerator and NVIDIA Omniverse platforms are building connections to enable full-design-fidelity, live digital twins that connect software-defined AI systems from edge to cloud.

Europe’s largest industrial manufacturer manages a lot of moving parts, so AI-driven defect detection promises to boost quality assurance and yield at massive scale.

But building AI models requires hefty amounts of data, and producing labeled datasets for training models to detect defects is a time-consuming and expensive process. In most cases, such data may not cover all the types of defects or their locations.

“Using NVIDIA Replicator and Siemens SynthAI technology, we can procedurally generate sets of photorealistic images using the digital models of our products and production resources and an integrated training pipeline to train ready-to-use models. This speeds up our set-up time for AI inspection models by a factor of five,” said Maximilian Metzner, global lead for autonomous manufacturing systems for electronics at GWE.

As a result, Siemens has begun tapping into NVIDIA Omniverse Replicator running on Amazon G5 instances for synthetic data generation, accelerating its AI model development times from taking “months” to “days,” according to the company.

Synthetic data is turbocharging model development. It’s boosting data sets for everything from German company Festo’s robotic arm work, to efforts at Amazon Robotics using synthetic data to train robots to identify packages.

At Siemens, synthetic data generation is being used beyond defect detection to assist in areas including, but not limited to, robotic bin picking, safety monitoring, welding and wiring inspections, and checking kits of parts.

“The better the synthetic data you have, the less real data you need — obtaining real data is a hassle, so you want to reduce that as much as possible without sacrificing accuracy,” said Alex Greenberg, director of advanced robotics simulation at Siemens Digital Industries Software.

Inspecting Motion Control Devices

The Siemens Motion Control Business Unit produces inverters, drive controllers and motors for more than 30,000 customers worldwide. The lead electronics plant, GWE, based in Erlangen, Germany, has been working on AI-enabled computer vision for defect detection using custom methods and different modes of synthetic data generation.

Common synthetic data generation methods, however, weren’t sufficient for production-ready robustness in some use cases, leading to a need for real data acquisition and labeling, which could take months.

GWE worked with the Siemens Digital Industries Software division to find a better way to produce datasets.

“For many industrial use cases, products are changing rapidly. Materials are changing rapidly. It needs to be automated in a fast way and without a lot of know-how from the endpoint engineer,” said Zac Mann, advanced robotics simulation lead at Siemens Digital Industries Software.

Catching Printed Circuit Board Defects

The challenge at GWE is to catch defects early in the ramp-up of new products and production lines. Waiting for real errors to happen just to enhance the training datasets is not an option.

One area of focus for defects in a printed circuit board (PCB) is examining the thermal paste that’s applied to some components on the PCB in order to help transfer heat quickly to the attached heatsink, away from the components.

To catch PCB defects, the Siemens Digital Industries Software team took another approach by relying on synthetic data driven by Omniverse Replicator.

With Omniverse, a platform for building custom 3D pipelines and simulating virtual worlds, Siemens can generate scenarios and much more realistic images easily, aided with RTX technology-enabled physics-based rendering and materials.

This enables Siemens to develop more quickly and smoothly, closing the gap from simulation to reality, said Mann.

“Using Omniverse Replicator and Siemens SynthAI technology, we can procedurally generate sets of photorealistic images using the digital models of our products and production resources and an integrated training pipeline to train ready-to-use models. This speeds up our set-up time for AI inspection models by a factor of five and increases their robustness massively,” said Maximilian Metzner, global lead for autonomous manufacturing systems for electronics at GWE.

Tapping Into Randomization With SynthAI

GWE engineers can now take a 3D CAD model of the PCB and import that into Siemens’ SynthAI tool. SynthAI is designed to build data sets for training AI models.

Tapping into Replicator, SynthAI can access its powerful randomization features to vary the sizes and locations of defects, change lighting, color, texture and more to develop a robust dataset.
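
Conceptually, this randomization amounts to sampling a fresh scene configuration for every rendered image. The sketch below is a library-agnostic illustration in plain Python, not the Omniverse Replicator or SynthAI API, and every parameter name and range is made up.

```python
# Library-agnostic sketch of domain randomization: one sampled configuration
# per synthetic image, varying defect geometry, lighting, and materials.
import random

def sample_scene(rng: random.Random) -> dict:
    return {
        "defect": {
            "present": rng.random() < 0.5,
            "size_mm": rng.uniform(0.2, 2.0),
            "position": (rng.uniform(0, 120), rng.uniform(0, 80)),  # board coords, mm
        },
        "lighting": {"intensity": rng.uniform(300, 1500), "angle_deg": rng.uniform(0, 90)},
        "paste_color": rng.choice(["gray", "silver", "dark-gray"]),
        "camera_noise": rng.gauss(0.0, 0.01),
    }

rng = random.Random(42)
dataset_configs = [sample_scene(rng) for _ in range(1000)]  # one config per rendered image
```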

Once data is generated with Replicator, it can be run through a defect detection model for initial training. This enables GWE engineers to quickly test and iterate on models, requiring only a small set of data to begin.

“This gives you visibility earlier into the design phase, and it can shorten time to market, which is very important,” said Greenberg.

Get started using NVIDIA Omniverse Replicator.
