Optimize for sustainability with Amazon CodeWhisperer

This post explores how Amazon CodeWhisperer can help optimize code for sustainability through increased resource efficiency. Computationally resource-efficient coding aims to reduce the energy required to process each line of code and, as a result, helps companies consume less energy overall. In this era of cloud computing, developers harness open source libraries and advanced processing power to build large-scale microservices that must be operationally efficient, performant, and resilient. However, modern applications often consist of extensive code, demanding significant computing resources. Although the direct environmental impact might not be obvious, sub-optimized code amplifies the carbon footprint of modern applications through factors like heightened energy consumption, prolonged hardware usage, and outdated algorithms. In this post, we show how Amazon CodeWhisperer helps address these concerns and reduce the environmental footprint of your code.

Amazon CodeWhisperer is a generative AI coding companion that speeds up software development by making suggestions based on existing code and natural language comments, reducing overall development effort and freeing up time for brainstorming, solving complex problems, and authoring differentiated code. Amazon CodeWhisperer can help developers streamline their workflows, enhance code quality, build stronger security postures, generate robust test suites, and write computationally resource-friendly code, which can help you optimize for environmental sustainability. It is available as part of the AWS Toolkit for Visual Studio Code, AWS Cloud9, JupyterLab, Amazon SageMaker Studio, AWS Lambda, AWS Glue, and JetBrains IntelliJ IDEA. Amazon CodeWhisperer currently supports Python, Java, JavaScript, TypeScript, C#, Go, Rust, PHP, Ruby, Kotlin, C, C++, shell scripting, SQL, and Scala.

Impact of unoptimized code on cloud computing and application carbon footprint

AWS’s infrastructure is 3.6 times more energy efficient than the median of surveyed US enterprise data centers and up to 5 times more energy efficient than the average European enterprise data center. As a result, AWS can help lower a workload’s carbon footprint by up to 96%. You can now use Amazon CodeWhisperer to write quality code with reduced resource usage and energy consumption, and meet scalability objectives while benefiting from AWS energy-efficient infrastructure.

Increased resource usage

Unoptimized code can result in the ineffective usage of cloud computing resources. As a result, more virtual machines (VMs) or containers may be required, increasing resource allocation, energy use, and the related carbon footprint of the workload. You might encounter increases in the following:

  • CPU utilization – Unoptimized code often contains inefficient algorithms or coding practices that require excessive CPU cycles to run (see the sketch after this list).
  • Memory consumption – Inefficient memory management in unoptimized code can result in unnecessary memory allocation, deallocation, or data duplication.
  • Disk I/O operations – Inefficient code can perform excessive input/output (I/O) operations. For example, if data is read from or written to disk more frequently than necessary, it can increase disk I/O utilization and latency.
  • Network usage – Due to ineffective data transmission techniques or duplicate communication, poorly optimized code may cause an excessive amount of network traffic. This can lead to higher latency and increased network bandwidth utilization. Increased network utilization may result in higher expenses and resource needs in situations where network resources are taxed based on usage, such as in cloud computing.
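
To make the CPU point concrete, here is a minimal Python sketch (illustrative only; the data and sizes are arbitrary) contrasting a quadratic membership test with an equivalent set-based version that needs far fewer CPU cycles for the same result:

    # Illustrative sketch: the same task at two different algorithmic costs.
    def common_items_slow(a: list, b: list) -> list:
        # O(len(a) * len(b)): each "x in b" scans the whole list.
        return [x for x in a if x in b]

    def common_items_fast(a: list, b: list) -> list:
        # O(len(a) + len(b)): set membership is constant time on average.
        b_set = set(b)
        return [x for x in a if x in b_set]

Fewer CPU cycles for the same output translates directly into less energy per request, and the effect multiplies across every VM or container running the code.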

Higher energy consumption

The infrastructure supporting applications with inefficient code consumes more processing power. Overusing computing resources due to inefficient, bloated code can result in higher energy consumption and heat production, which in turn necessitates more energy for cooling. Along with the servers, the cooling systems, the power distribution infrastructure, and other auxiliary elements also consume energy.

Scalability challenges

In application development, unoptimized code can cause scalability issues. Such code may not scale effectively as the workload grows, necessitating more resources and consuming more energy. As mentioned previously, inefficient or wasteful code has a compounding effect at scale.

The energy savings from optimizing the code that customers run in a given data center are compounded further when we consider that cloud providers such as AWS operate dozens of data centers around the world.

Amazon CodeWhisperer uses machine learning (ML) and large language models to provide code recommendations in real time, based on the original code and natural language comments, and provides code recommendations that can be more efficient. A program’s infrastructure usage efficiency can be increased by optimizing the code using strategies such as algorithmic improvements, effective memory management, and a reduction in unnecessary I/O operations.
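
As a hedged illustration of the memory and I/O strategies just mentioned, the following Python sketch streams a large file instead of loading it into memory all at once; the file name and the 120-character threshold are hypothetical placeholders:

    # Illustrative only: count long lines in a large log file, two ways.
    def count_long_lines_wasteful(path: str) -> int:
        # Loads the entire file into memory: peak memory grows with file size.
        with open(path) as f:
            return len([line for line in f.readlines() if len(line) > 120])

    def count_long_lines_efficient(path: str) -> int:
        # Streams line by line: constant memory, no wasted allocations.
        with open(path) as f:
            return sum(1 for line in f if len(line) > 120)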

Code generation, completion, and suggestions

Let’s examine several situations where Amazon CodeWhisperer can be useful.

By automating the development of repetitive or complex code, code generation tools minimize the possibility of human error while focusing on platform-specific optimizations. By using established patterns or templates, these tools can produce code that more consistently adheres to sustainability best practices. Developers can produce code that complies with particular coding standards, helping deliver more consistent and dependable code throughout the project. The resulting code may be more efficient and, because it removes human coding variations, more legible, improving development speed. Such tools can automatically implement ways to reduce application size and length, such as deleting superfluous code, improving variable storage, or using compression methods. These optimizations can help optimize memory consumption and boost overall system efficiency by shrinking the package size.
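
As a hedged sketch of this comment-driven workflow, a developer might write a natural language comment like the one below and accept a completion along these lines; the function name and logic are hypothetical, and the suggestions CodeWhisperer actually produces will vary:

    # deduplicate a list of records by id, keeping the first occurrence
    def dedupe_by_id(records: list[dict]) -> list[dict]:
        # Compact, single-pass completion in the style a coding companion
        # might suggest; real output will differ.
        seen = set()
        result = []
        for record in records:
            if record["id"] not in seen:
                seen.add(record["id"])
                result.append(record)
        return result

A single pass with a set avoids the quadratic rescanning a naive nested-loop version would incur, which is exactly the kind of small efficiency win that compounds at scale.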

Generative AI has the potential to make programming more sustainable by optimizing resource allocation. Looking holistically at an application’s carbon footprint is important. Tools like Amazon CodeGuru Profiler can collect performance data to optimize latency between components. The profiling service examines code runs and identifies potential improvements. Developers can then manually refine the auto-generated code based on these findings to further improve energy efficiency. The combination of generative AI, profiling, and human oversight creates a feedback loop that can continuously improve code efficiency and reduce environmental impact.

The following screenshot shows results generated from CodeGuru Profiler in latency mode, which includes network and disk I/O. In this case, the application still spends most of its time in ImageProcessor.extractTasks (second row from the bottom), and almost all the time inside that is runnable, which means that it wasn’t waiting for anything. You can view these thread states by changing from CPU mode to latency mode. This can help you get a good idea of what is impacting the wall clock time of the application. For more information, refer to Reducing Your Organization’s Carbon Footprint with Amazon CodeGuru Profiler.

[Screenshot: Amazon CodeGuru Profiler results in latency mode]

Generating test cases

Amazon CodeWhisperer can help suggest test cases and verify the code’s functionality by considering boundary values, edge cases, and other potential issues that may need to be tested. Amazon CodeWhisperer can also simplify creating repetitive code for unit testing. For example, if you need to create sample data using INSERT statements, Amazon CodeWhisperer can generate the necessary inserts based on a pattern. The overall resource requirements for software testing can also be decreased by identifying and optimizing resource-intensive test cases or removing redundant ones. Improved test suites have the potential to make the application more environmentally friendly by increasing energy efficiency, decreasing resource consumption, minimizing waste, and reducing the workload carbon footprint.
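
For instance, given a small function under test, generated tests might cover boundary values along these lines (a hedged sketch; the function and cases are hypothetical, not actual CodeWhisperer output):

    import unittest

    # Hypothetical function under test.
    def clamp(value: float, low: float, high: float) -> float:
        return max(low, min(high, value))

    class TestClamp(unittest.TestCase):
        # Boundary-value cases of the kind a suggestion might propose.
        def test_below_range(self):
            self.assertEqual(clamp(-5, 0, 10), 0)

        def test_above_range(self):
            self.assertEqual(clamp(15, 0, 10), 10)

        def test_at_boundaries(self):
            self.assertEqual(clamp(0, 0, 10), 0)
            self.assertEqual(clamp(10, 0, 10), 10)

    if __name__ == "__main__":
        unittest.main()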

For a more hands-on experience with Amazon CodeWhisperer, refer to Optimize software development with Amazon CodeWhisperer. The post showcases the code recommendations from Amazon CodeWhisperer in Amazon SageMaker Studio. It also demonstrates the suggested code based on comments for loading and analyzing a dataset.

Conclusion

In this post, we learned how Amazon CodeWhisperer can help developers write optimized, more sustainable code. Using advanced ML models, Amazon CodeWhisperer analyzes your code and provides personalized recommendations for improving efficiency, which can reduce costs and help decrease the carbon footprint.

By suggesting minor adjustments and alternative approaches, Amazon CodeWhisperer enables developers to significantly cut resource usage and emissions without sacrificing functionality. Whether you’re looking to optimize an existing code base or ensure new projects are resource efficient, Amazon CodeWhisperer can be an invaluable aid. To learn more, explore the Amazon CodeWhisperer documentation and the AWS Sustainability resources for code optimization.

About the authors

Isha Dua is a Senior Solutions Architect based in the San Francisco Bay Area. She helps AWS enterprise customers grow by understanding their goals and challenges, and guides them on how they can architect their applications in a cloud-native manner while ensuring resilience and scalability. She’s passionate about machine learning technologies and environmental sustainability.

Ajjay Govindaram is a Senior Solutions Architect at AWS. He works with strategic customers who are using AI/ML to solve complex business problems. His experience lies in providing technical direction as well as design assistance for modest to large-scale AI/ML application deployments. His knowledge ranges from application architecture to big data, analytics, and machine learning. He enjoys listening to music while resting, experiencing the outdoors, and spending time with his loved ones.

Erick Irigoyen is a Solutions Architect at Amazon Web Services focusing on clients in the Semiconductors and Electronics industry. He works closely with customers to understand their business challenges and identify how AWS can be leveraged to achieve their strategic goals. His work has primarily focused on projects related to Artificial Intelligence and Machine Learning (AI/ML). Prior to joining AWS, he was a Senior Consultant at Deloitte’s Advanced Analytics practice where he led workstreams in several engagements across the United States focusing on Analytics and AI/ML. Erick holds a B.S. in Business from the University of San Francisco and an M.S. in Analytics from North Carolina State University.

Read More

Acing the Test: NVIDIA Turbocharges Generative AI Training in MLPerf Benchmarks

NVIDIA’s AI platform raised the bar for AI training and high performance computing in the latest MLPerf industry benchmarks.

Among many new records and milestones, one in generative AI stands out: NVIDIA Eos — an AI supercomputer powered by a whopping 10,752 NVIDIA H100 Tensor Core GPUs and NVIDIA Quantum-2 InfiniBand networking — completed a training benchmark based on a GPT-3 model with 175 billion parameters trained on one billion tokens in just 3.9 minutes.

That’s a nearly 3x gain from 10.9 minutes, the record NVIDIA set when the test was introduced less than six months ago.

[Chart: NVIDIA H100 training results over time on MLPerf benchmarks]

The benchmark uses a portion of the full GPT-3 data set behind the popular ChatGPT service; by extrapolation, Eos could now train on the full data set in just eight days, 73x faster than a prior state-of-the-art system using 512 A100 GPUs.

The acceleration in training time reduces costs, saves energy and speeds time-to-market. It’s heavy lifting that makes large language models widely available so every business can adopt them with tools like NVIDIA NeMo, a framework for customizing LLMs.

In a new generative AI test this round, 1,024 NVIDIA Hopper architecture GPUs completed a training benchmark based on the Stable Diffusion text-to-image model in 2.5 minutes, setting a high bar on this new workload.

By adopting these two tests, MLPerf reinforces its leadership as the industry standard for measuring AI performance, since generative AI is the most transformative technology of our time.

System Scaling Soars

The latest results were due in part to the use of the most accelerators ever applied to an MLPerf benchmark. The 10,752 H100 GPUs far surpassed the scaling in AI training in June, when NVIDIA used 3,584 Hopper GPUs.

The 3x scaling in GPU numbers delivered a 2.8x scaling in performance, a 93% efficiency rate thanks in part to software optimizations.
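
The efficiency figure is simply the ratio of the two gains, using the article’s rounded numbers:

    # Scaling efficiency = performance gain / resource gain.
    gpu_scaling = 10752 / 3584   # 3.0x more GPUs than the June submission
    perf_scaling = 2.8           # observed performance gain
    print(f"{perf_scaling / gpu_scaling:.0%}")   # ~93%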

Efficient scaling is a key requirement in generative AI because LLMs are growing by an order of magnitude every year. The latest results show NVIDIA’s ability to meet this unprecedented challenge for even the world’s largest data centers.

[Chart: near-linear scaling of H100 GPUs on MLPerf training]

The achievement is thanks to a full-stack platform of innovations in accelerators, systems and software that both Eos and Microsoft Azure used in the latest round.

Eos and Azure both employed 10,752 H100 GPUs in separate submissions. They achieved within 2% of the same performance, demonstrating the efficiency of NVIDIA AI in data center and public-cloud deployments.

[Chart: record Azure scaling in MLPerf training]

NVIDIA relies on Eos for a wide array of critical jobs. It helps advance initiatives like NVIDIA DLSS, AI-powered software for state-of-the-art computer graphics, and NVIDIA Research projects like ChipNeMo, generative AI tools that help design next-generation GPUs.

Advances Across Workloads

NVIDIA set several new records in this round in addition to making advances in generative AI.

For example, H100 GPUs were 1.6x faster than in the prior round at training recommender models, which are widely employed to help users find what they’re looking for online. Performance was up 1.8x on RetinaNet, a computer vision model.

These increases came from a combination of advances in software and scaled-up hardware.

NVIDIA was once again the only company to run all MLPerf tests. H100 GPUs demonstrated the fastest performance and the greatest scaling in each of the nine benchmarks.

[Image: list of six new NVIDIA records in MLPerf training]

Speedups translate to faster time to market, lower costs and energy savings for users training massive LLMs or customizing them with frameworks like NeMo for the specific needs of their business.

Eleven systems makers used the NVIDIA AI platform in their submissions this round, including ASUS, Dell Technologies, Fujitsu, GIGABYTE, Lenovo, QCT and Supermicro.

NVIDIA partners participate in MLPerf because they know it’s a valuable tool for customers evaluating AI platforms and vendors.

HPC Benchmarks Expand

In MLPerf HPC, a separate benchmark for AI-assisted simulations on supercomputers, H100 GPUs delivered up to twice the performance of NVIDIA A100 Tensor Core GPUs in the last HPC round. The results showed up to 16x gains since the first MLPerf HPC round in 2019.

The benchmark included a new test that trains OpenFold, a model that predicts the 3D structure of a protein from its sequence of amino acids. OpenFold can do in minutes vital work for healthcare that used to take researchers weeks or months.

Understanding a protein’s structure is key to finding effective drugs fast because most drugs act on proteins, the cellular machinery that helps control many biological processes.

In the MLPerf HPC test, H100 GPUs trained OpenFold in 7.5 minutes.  The OpenFold test is a representative part of the entire AlphaFold training process that two years ago took 11 days using 128 accelerators.

A version of the OpenFold model and the software NVIDIA used to train it will be available soon in NVIDIA BioNeMo, a generative AI platform for drug discovery.

Several partners made submissions on the NVIDIA AI platform in this round. They included Dell Technologies and supercomputing centers at Clemson University, the Texas Advanced Computing Center and — with assistance from Hewlett Packard Enterprise (HPE) — Lawrence Berkeley National Laboratory.

Benchmarks With Broad Backing

Since its inception in May 2018, the MLPerf benchmarks have enjoyed broad backing from both industry and academia. Organizations that support them include Amazon, Arm, Baidu, Google, Harvard, HPE, Intel, Lenovo, Meta, Microsoft, NVIDIA, Stanford University and the University of Toronto.

MLPerf tests are transparent and objective, so users can rely on the results to make informed buying decisions.

All the software NVIDIA used is available from the MLPerf repository, so all developers can get the same world-class results. These software optimizations get continuously folded into containers available on NGC, NVIDIA’s software hub for GPU applications.

Learn more about MLPerf and the details of this round.

Read More

Research Focus: Week of November 8, 2023

Welcome to Research Focus, a series of blog posts that highlights notable publications, events, code/datasets, new hires and other milestones from across the research community at Microsoft.

NEW RESEARCH

HMD-NeMo: Online 3D Avatar Motion Generation From Sparse Observations

Generating both plausible and accurate full body avatar motion is essential for creating high quality immersive experiences in mixed reality scenarios. Head-mounted devices (HMDs) typically only provide a few input signals, such as head and hands 6-DoF—or the six degrees of freedom of movement by a rigid body in a three-dimensional space. Recent approaches have achieved impressive performance in generating full body motion given only head and hands signal. However, all known existing approaches rely on full hand visibility. While this is the case when using motion controllers, for example, a considerable proportion of mixed reality experiences do not involve motion controllers and instead rely on egocentric hand tracking. This introduces the challenge of partial hand visibility, owing to the restricted field of view of the HMD.

In a recent paper: HMD-NeMo: Online 3D Avatar Motion Generation From Sparse Observations, researchers from Microsoft propose HMD-NeMo, the first unified approach that addresses plausible and accurate full body motion generation even when the hands may be only partially visible. HMD-NeMo is a lightweight neural network that predicts full body motion in an online and real-time fashion. At the heart of HMD-NeMo is a spatio-temporal encoder with novel temporally adaptable mask tokens that encourage plausible motion in the absence of hand observations. The researchers perform extensive analysis of the impact of different components in HMD-NeMo and, through their evaluation, introduce a new state-of-the-art on AMASS, a large database of human motion unifying different optical marker-based motion capture datasets.

Microsoft Research Podcast

Collaborators: Renewable energy storage with Bichlien Nguyen and David Kwabi

Dr. Bichlien Nguyen and Dr. David Kwabi explore their work in flow batteries and how machine learning can help more effectively search the vast organic chemistry space to identify compounds with properties just right for storing waterpower and other renewables.


NEW ARTICLE

Will Code Remain a Relevant User Interface for End-User Programming with Generative AI Models?

The research field of end-user programming has largely been concerned with helping non-experts learn to code well enough to achieve their own tasks. Generative AI stands to obviate this entirely by allowing users to generate code from naturalistic language prompts.

In a recent essay: Will Code Remain a Relevant User Interface for End-User Programming with Generative AI Models?, researchers from Microsoft explore the relevance of “traditional” programming languages for non-expert end-user programmers in a world with generative AI. They posit the “generative shift hypothesis”: that generative AI will create qualitative and quantitative expansions in the traditional scope of end-user programming. They outline some reasons that traditional programming languages may still be relevant and useful for end-user programmers, and speculate whether each of these reasons might endure or disappear with further improvements and innovations in generative AI. And finally, they articulate a set of implications for end-user programming research, including the possibility of needing to revisit many well-established core concepts, such as Ko’s learning barriers and Blackwell’s attention investment model.


NEW RESEARCH

LUT-NN: Empower Efficient Neural Network Inference with Centroid Learning and Table Lookup

On-device deep neural network (DNN) inference, widely used in mobile devices such as smartphones and smartwatches, offers unparalleled intelligent services, but also stresses the limited hardware resources on those devices.

In a recent paper: LUT-NN: Empower Efficient Neural Network Inference with Centroid Learning and Table Lookup, researchers at Microsoft propose a system that reduces latency and consumes less memory, disk, and power, for more efficient DNN inference. LUT-NN learns the typical features for each operator, known as centroids, and precomputes the results for these centroids to save in lookup tables. During inference, the results of the centroids closest to the inputs can be read directly from the table as the approximated outputs, without computation.

LUT-NN integrates two major novel techniques: (1) differentiable centroid learning through backpropagation, which adapts three levels of approximation to minimize the accuracy impact by centroids; (2) table lookup inference execution, which comprehensively considers different levels of parallelism, memory access reduction, and dedicated hardware units for optimal performance.
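
A minimal sketch of the table-lookup idea follows (a toy illustration under simplifying assumptions, not the paper’s implementation; the operator, centroid count, and distance metric are stand-ins):

    import numpy as np

    rng = np.random.default_rng(0)
    centroids = rng.normal(size=(16, 8))    # 16 learned centroids, input dim 8
    weights = rng.normal(size=(8, 4))       # stand-in for a DNN layer's weights

    def expensive_operator(x: np.ndarray) -> np.ndarray:
        return np.tanh(x @ weights)         # the computation we want to avoid

    table = expensive_operator(centroids)   # precomputed once, offline

    def lut_inference(x: np.ndarray) -> np.ndarray:
        # Read the precomputed result of the nearest centroid
        # instead of running the operator on x.
        idx = np.argmin(np.linalg.norm(centroids - x, axis=1))
        return table[idx]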

The post Research Focus: Week of November 8, 2023 appeared first on Microsoft Research.

Read More

NVIDIA Partners With APEC Economies to Change Lives, Increase Opportunity, Improve Outcomes

When patients in Vietnam enter a medical facility in distress, doctors use NVIDIA technology to get more accurate scans to diagnose their ailments. In Hong Kong, a different set of doctors leverage generative AI to discover new cures for patients.

Improving the health and well-being of citizens and strengthening economies and communities are key themes as world leaders soon gather in San Francisco for the 2023 Asia-Pacific Economic Cooperation (APEC) Summit.

When they meet to discuss bold solutions to improve the lives of their citizens and societies, NVIDIA’s AI and accelerated computing initiatives are a crucial enabler.

NVIDIA’s work to improve outcomes for everyday people while tackling future challenges builds on years of deep investment with APEC partners. With a strong presence in countries across the region, including a workforce of thousands and numerous collaborative projects in areas from farming to healthcare to education, NVIDIA is delivering new technologies and workforce training programs to enhance industrial development and advance generative AI research.

Beyond technological advancements, these efforts spur economic growth, create good-paying jobs and improve the health and well-being of people globally.

Research and National Compute Partnerships

NVIDIA has advanced AI research partnerships with several APEC economies. These accelerate scientific breakthroughs in AI and HPC to address national challenges, such as healthcare, skills development and creating more robust local AI ecosystems to protect and advance well-being, prosperity and security. For example:

  • Australia’s national science and research organization, CSIRO, has teamed with NVIDIA to advance Australia’s AI program across climate action, space exploration, quantum computing and AI education.
  • Singapore’s National Supercomputing Centre and Ministry of Education have partnered with NVIDIA to drive sovereign AI capabilities with a priority focus on sectors such as healthcare, climate science and digital twins.
  • Thailand was Southeast Asia’s first country to participate in NVIDIA’s AI Nations initiative, bringing together the Ministry of Education with a consortium of top universities to advance public-private collaborations in urban planning, public health and autonomous vehicles.
  • In Vietnam, NVIDIA is partnering with Viettel, the nation’s largest employer, and the Vietnam Academy of Science and Technology to upskill workforces, accelerate the introduction of AI services to industry and deploy next-generation 5G services.

Innovation Ecosystems

Startups are at the leading edge of AI innovation, and a robust startup ecosystem is vital to advancing technology within APEC economies.

NVIDIA Inception is a free program to help startups innovate faster. Through it, NVIDIA supports over 5,000 startups across APEC economies, and more than 15,000 globally, by providing cutting-edge technology, connections with venture capitalists and access to the latest technical resources.

In 2023, NVIDIA added nearly 1,000 APEC-area startups to the program. In addition to creating economic opportunities, Inception supports small- and medium-sized enterprises in developing novel solutions to some of society’s biggest challenges. Here’s what some of its members are doing:

  • In Malaysia, Tapway uses AI to reduce congestion and streamline traffic for more than 1 million daily travelers.
  • In New Zealand, Lynker uses geospatial analysis, deep learning and remote sensing for earth observation. Lynker’s technology measures carbon sequestration on farms; detects, monitors and helps restore wetlands; and enables more effective disaster relief.
  • In Thailand, AltoTech Global, an Inception partner, integrates AI software with Internet of Things devices to optimize energy consumption for hotels, buildings, factories and smart cities. AltoTech’s ultimate goal is contributing to the net-zero economy and helping customers achieve their net-zero targets.

Digital Upskilling and Tools for Growth

The NVIDIA Deep Learning Institute (DLI) provides AI training and digital upskilling programs that cultivate innovation and create economic opportunities.

DLI’s training and certification program helps individuals and organizations accelerate skills development and workforce transformation in AI, high performance computing and industrial digitalization.

Hands-on, self-paced and instructor-led courses are created and taught by NVIDIA experts, bringing real-world experience and deep technical know-how to developers and IT professionals.

Through this program, NVIDIA has trained more than 115,000 individuals in APEC economies, including more than 16,000 new trainees this year.

Separately, the NVIDIA Developer Program offers more than 2 million developers in APEC economies access to software development kits, application programming interfaces, pretrained AI models and performance analysis tools to help developers create and innovate. Members receive free hands-on training, access to developer forums and early access to new products and services.

Creating a Better Future for All

As nations work together to address common challenges and improve the lives of their citizens, NVIDIA will continue to leverage its world-class technologies to help create a better world for all.

Read More

Dr Aengus Tran, co-founder of Annalise.ai and Harrison.ai on Using AI as a Spell Check for Health Checks

Clinician-led healthcare AI company Harrison.ai has built an AI system that effectively serves as a “spell checker” for radiologists — flagging critical findings to improve the speed and accuracy of radiology image analysis, reducing misdiagnoses.

In the latest episode of NVIDIA’s AI Podcast, host Noah Kravitz spoke with Harrison.ai CEO and cofounder Aengus Tran about the company’s mission to scale global healthcare capacity with autonomous AI systems.

Harrison.ai’s initial product, annalise.ai, is an AI tool that automates radiology image analysis to enable faster, more accurate diagnoses. It can produce 124-130 different possible diagnoses and flag key findings to aid radiologists in their final diagnosis. Currently, annalise.ai works for chest X-rays and brain CT scans, with more on the way.

While an AI designed for categorizing traffic lights, for example, doesn’t need perfection,  medical tools must be highly accurate — any oversight could be fatal. To overcome this challenge, annalise.ai was trained on millions of meticulously annotated images — some were annotated three to five times over before being used for training.

Harrison.ai is also developing Franklin.ai, a sibling AI tool aimed at accelerating and improving the accuracy of histopathology diagnosis, in which a clinician performs a biopsy and inspects the tissue for the presence of cancerous cells. Similarly to annalise.ai, Franklin.ai flags critical findings to help pathologists make faster, more accurate diagnoses.

Ethical concerns about AI use are ever-rising, but for Tran, the concern is less about whether it’s ethical to use AI for medical diagnosis but “actually the converse: Is it ethical to not use AI for medical diagnosis,” especially if “humans using those AI systems simply pick up more misdiagnosis, pick up more cancer and conditions?”

Tran also talked about the future of AI systems, suggesting that the focus is twofold: first, improve pre-existing systems, and then think of new cutting-edge solutions.

And for those looking to break into careers in AI and healthcare, Tran says that the “first step is to decide upfront what problems you’re willing to spend a huge part of your time solving first, before the AI part,” emphasizing that the “first thing is actually to fall in love with some problem.”

You Might Also Like

Jules Anh Tuan Nguyen Explains How AI Lets Amputee Control Prosthetic Hand, Video Games
A postdoctoral researcher at the University of Minnesota discusses his efforts to allow amputees to control their prosthetic limb — right down to the finger motions — with their minds.

Overjet’s Wardah Inam on Bringing AI to Dentistry
Overjet, a member of NVIDIA Inception, is moving fast to bring AI to dentists’ offices. Dr. Wardah Inam, CEO of the company, discusses using AI to improve patient care.

Immunai CTO and Co-Founder Luis Voloch on Using Deep Learning to Develop New Drugs
Luis Voloch talks about tackling the challenges of the immune system with a machine learning and data science mindset.

Subscribe to the AI Podcast: Now Available on Amazon Music

The AI Podcast is now available through Amazon Music.

In addition, get the AI Podcast through iTunes, Google Podcasts, Google Play, Castbox, DoggCatcher, Overcast, PlayerFM, Pocket Casts, Podbay, PodBean, PodCruncher, PodKicker, Soundcloud, Spotify, Stitcher and TuneIn.

Make the AI Podcast better. Have a few minutes to spare? Fill out this listener survey.

Read More

STEER: Semantic Turn Extension-Expansion Recognition for Voice Assistants

*= Equal Contributors
In the context of a voice assistant system, steering refers to the phenomenon in which a user issues a follow-up command attempting to direct or clarify a previous turn. We propose STEER, a steering detection model that predicts whether a follow-up turn is a user’s attempt to steer the previous command. Constructing a training dataset for steering use cases poses challenges due to the cold-start problem. To overcome this, we developed heuristic rules to sample opt-in usage data, approximating positive and negative samples without any annotation. Our experimental results…Apple Machine Learning Research

SeMAnD: Self-Supervised Anomaly Detection in Multimodal Geospatial Datasets

*= Equal Contributors
We propose a Self-supervised Anomaly Detection technique, called SeMAnD, to detect geometric anomalies in Multimodal geospatial datasets. Geospatial data comprises acquired and derived heterogeneous data modalities that we transform to semantically meaningful, image-like tensors to address the challenges of representation, alignment, and fusion of multimodal data. SeMAnD is comprised of (i) a simple data augmentation strategy, called RandPolyAugment, capable of generating diverse augmentations of vector geometries, and (ii) a self-supervised training objective with three…Apple Machine Learning Research

EELBERT: Tiny Models through Dynamic Embeddings

We introduce EELBERT, an approach for compression of transformer-based models (for example, BERT), with minimal impact on the accuracy of downstream tasks. This is achieved by replacing the input embedding layer of the model with dynamic, for example, on-the-fly, embedding computations. Since the input embedding layer accounts for a significant fraction of the model size, especially for the smaller BERT variants, replacing this layer with an embedding computation function helps us reduce the model size significantly. Empirical evaluation on the GLUE benchmark shows that our BERT variants…Apple Machine Learning Research