Google at APS 2024

Today the 2024 March Meeting of the American Physical Society (APS) kicks off in Minneapolis, MN. A premier conference on topics ranging across physics and related fields, APS 2024 brings together researchers, students, and industry professionals to share their discoveries and build partnerships with the goal of realizing fundamental advances in physics-related sciences and technology.

This year, Google has a strong presence at APS with a booth hosted by the Google Quantum AI team, 50+ talks throughout the conference, and participation in conference organizing activities, special sessions and events. Attending APS 2024 in person? Come visit Google’s Quantum AI booth to learn more about the exciting work we’re doing to solve some of the field’s most interesting challenges.

You can learn more about the latest cutting-edge work we are presenting at the conference, along with our schedule of booth events, below (Googlers listed in bold).

Organizing Committee

Session Chairs include: Aaron Szasz

Booth Activities

This schedule is subject to change. Please visit the Google Quantum AI booth for more information.

Crumble

Presenter: Matt McEwen

Tue, Mar 5 | 11:00 AM CST

Qualtran

Presenter: Tanuj Khattar

Tue, Mar 5 | 2:30 PM CST

Qualtran

Presenter: Tanuj Khattar

Thu, Mar 7 | 11:00 AM CST

$5M XPRIZE / Google Quantum AI competition to accelerate quantum applications Q&A

Presenter: Ryan Babbush

Thu, Mar 7 | 11:00 AM CST

Talks

Monday

Certifying highly-entangled states from few single-qubit measurements

Presenter: Hsin-Yuan Huang

Author: Hsin-Yuan Huang

Session A45: New Frontiers in Machine Learning Quantum Physics

Toward high-fidelity analog quantum simulation with superconducting qubits

Presenter: Trond Andersen

Authors: Trond I Andersen, Xiao Mi, Amir H Karamlou, Nikita Astrakhantsev, Andrey Klots, Julia Berndtsson, Andre Petukhov, Dmitry Abanin, Lev B Ioffe, Yu Chen, Vadim Smelyanskiy, Pedram Roushan

Session A51: Applications on Noisy Quantum Hardware I

Measuring circuit errors in context for surface code circuits

Presenter: Dripto M Debroy

Authors: Dripto M Debroy, Jonathan A Gross, Élie Genois, Zhang Jiang

Session B50: Characterizing Noise with QCVV Techniques

Quantum computation of stopping power for inertial fusion target design I: Physics overview and the limits of classical algorithms

Presenter: Andrew D. Baczewski

Authors: Nicholas C. Rubin, Dominic W. Berry, Alina Kononov, Fionn D. Malone, Tanuj Khattar, Alec White, Joonho Lee, Hartmut Neven, Ryan Babbush, Andrew D. Baczewski

Session B51: Heterogeneous Design for Quantum Applications

Link to Paper

Quantum computation of stopping power for inertial fusion target design II: Physics overview and the limits of classical algorithms

Presenter: Nicholas C. Rubin

Authors: Nicholas C. Rubin, Dominic W. Berry, Alina Kononov, Fionn D. Malone, Tanuj Khattar, Alec White, Joonho Lee, Hartmut Neven, Ryan Babbush, Andrew D. Baczewski

Session B51: Heterogeneous Design for Quantum Applications

Link to Paper

Calibrating Superconducting Qubits: From NISQ to Fault Tolerance

Presenter: Sabrina S Hong

Author: Sabrina S Hong

Session B56: From NISQ to Fault Tolerance

Measurement and feedforward induced entanglement negativity transition

Presenter: Ramis Movassagh

Authors: Alireza Seif, Yu-Xin Wang, Ramis Movassagh, Aashish A. Clerk

Session B31: Measurement Induced Criticality in Many-Body Systems

Link to Paper

Effective quantum volume, fidelity and computational cost of noisy quantum processing experiments

Presenter: Salvatore Mandra

Authors: Kostyantyn Kechedzhi, Sergei V Isakov, Salvatore Mandra, Benjamin Villalonga, X. Mi, Sergio Boixo, Vadim Smelyanskiy

Session B52: Quantum Algorithms and Complexity

Link to Paper

Accurate thermodynamic tables for solids using Machine Learning Interaction Potentials and Covariance of Atomic Positions

Presenter: Mgcini K Phuthi

Authors: Mgcini K Phuthi, Yang Huang, Michael Widom, Ekin D Cubuk, Venkat Viswanathan

Session D60: Machine Learning of Molecules and Materials: Chemical Space and Dynamics

Tuesday

IN-Situ Pulse Envelope Characterization Technique (INSPECT)

Presenter: Zhang Jiang

Authors: Zhang Jiang, Jonathan A Gross, Élie Genois

Session F50: Advanced Randomized Benchmarking and Gate Calibration

Characterizing two-qubit gates with dynamical decoupling

Presenter: Jonathan A Gross

Authors: Jonathan A Gross, Zhang Jiang, Élie Genois, Dripto M Debroy, Ze-Pei Cian*, Wojciech Mruczkiewicz

Session F50: Advanced Randomized Benchmarking and Gate Calibration

Statistical physics of regression with quadratic models

Presenter: Blake Bordelon

Authors: Blake Bordelon, Cengiz Pehlevan, Yasaman Bahri

Session EE01: V: Statistical and Nonlinear Physics II

Improved state preparation for first-quantized simulation of electronic structure

Presenter: William J Huggins

Authors: William J Huggins, Oskar Leimkuhler, Torin F Stetina, Birgitta Whaley

Session G51: Hamiltonian Simulation

Controlling large superconducting quantum processors

Presenter: Paul V. Klimov

Authors: Paul V. Klimov, Andreas Bengtsson, Chris Quintana, Alexandre Bourassa, Sabrina Hong, Andrew Dunsworth, Kevin J. Satzinger, William P. Livingston, Volodymyr Sivak, Murphy Y. Niu, Trond I. Andersen, Yaxing Zhang, Desmond Chik, Zijun Chen, Charles Neill, Catherine Erickson, Alejandro Grajales Dau, Anthony Megrant, Pedram Roushan, Alexander N. Korotkov, Julian Kelly, Vadim Smelyanskiy, Yu Chen, Hartmut Neven

Session G30: Commercial Applications of Quantum Computing

Link to Paper

Gaussian boson sampling: Determining quantum advantage

Presenter: Peter D Drummond

Authors: Peter D Drummond, Alex Dellios, Ned Goodman, Margaret D Reid, Ben Villalonga

Session G50: Quantum Characterization, Verification, and Validation II

Attention to complexity III: learning the complexity of random quantum circuit states

Presenter: Hyejin Kim

Authors: Hyejin Kim, Yiqing Zhou, Yichen Xu, Chao Wan, Jin Zhou, Yuri D Lensky, Jesse Hoke, Pedram Roushan, Kilian Q Weinberger, Eun-Ah Kim

Session G50: Quantum Characterization, Verification, and Validation II

Balanced coupling in superconducting circuits

Presenter: Daniel T Sank

Authors: Daniel T Sank, Sergei V Isakov, Mostafa Khezri, Juan Atalaya

Session K48: Strongly Driven Superconducting Systems

Resource estimation of Fault Tolerant algorithms using Qualtran

Presenter: Tanuj Khattar

Author: Tanuj Khattar

Session K49: Algorithms and Implementations on Near-Term Quantum Computers

Wednesday

Discovering novel quantum dynamics with superconducting qubits

Presenter: Pedram Roushan

Author: Pedram Roushan

Session M24: Analog Quantum Simulations Across Platforms

Deciphering Tumor Heterogeneity in Triple-Negative Breast Cancer: The Crucial Role of Dynamic Cell-Cell and Cell-Matrix Interactions

Presenter: Susan Leggett

Authors: Susan Leggett, Ian Wong, Celeste Nelson, Molly Brennan, Mohak Patel, Christian Franck, Sophia Martinez, Joe Tien, Lena Gamboa, Thomas Valentin, Amanda Khoo, Evelyn K Williams

Session M27: Mechanics of Cells and Tissues II

Toward implementation of protected charge-parity qubits

Presenter: Abigail Shearrow

Authors: Abigail Shearrow, Matthew Snyder, Bradley G Cole, Kenneth R Dodge, Yebin Liu, Andrey Klots, Lev B Ioffe, Britton L Plourde, Robert McDermott

Session N48: Unconventional Superconducting Qubits

Electronic capacitance in tunnel junctions for protected charge-parity qubits

Presenter: Bradley G Cole

Authors: Bradley G Cole, Kenneth R Dodge, Yebin Liu, Abigail Shearrow, Matthew Snyder, Andrey Klots, Lev B Ioffe, Robert McDermott, B.L.T. Plourde

Session N48: Unconventional Superconducting Qubits

Overcoming leakage in quantum error correction

Presenter: Kevin C. Miao

Authors: Kevin C. Miao, Matt McEwen, Juan Atalaya, Dvir Kafri, Leonid P. Pryadko, Andreas Bengtsson, Alex Opremcak, Kevin J. Satzinger, Zijun Chen, Paul V. Klimov, Chris Quintana, Rajeev Acharya, Kyle Anderson, Markus Ansmann, Frank Arute, Kunal Arya, Abraham Asfaw, Joseph C. Bardin, Alexandre Bourassa, Jenna Bovaird, Leon Brill, Bob B. Buckley, David A. Buell, Tim Burger, Brian Burkett, Nicholas Bushnell, Juan Campero, Ben Chiaro, Roberto Collins, Paul Conner, Alexander L. Crook, Ben Curtin, Dripto M. Debroy, Sean Demura, Andrew Dunsworth, Catherine Erickson, Reza Fatemi, Vinicius S. Ferreira, Leslie Flores Burgos, Ebrahim Forati, Austin G. Fowler, Brooks Foxen, Gonzalo Garcia, William Giang, Craig Gidney, Marissa Giustina, Raja Gosula, Alejandro Grajales Dau, Jonathan A. Gross, Michael C. Hamilton, Sean D. Harrington, Paula Heu, Jeremy Hilton, Markus R. Hoffmann, Sabrina Hong, Trent Huang, Ashley Huff, Justin Iveland, Evan Jeffrey, Zhang Jiang, Cody Jones, Julian Kelly, Seon Kim, Fedor Kostritsa, John Mark Kreikebaum, David Landhuis, Pavel Laptev, Lily Laws, Kenny Lee, Brian J. Lester, Alexander T. Lill, Wayne Liu, Aditya Locharla, Erik Lucero, Steven Martin, Anthony Megrant, Xiao Mi, Shirin Montazeri, Alexis Morvan, Ofer Naaman, Matthew Neeley, Charles Neill, Ani Nersisyan, Michael Newman, Jiun How Ng, Anthony Nguyen, Murray Nguyen, Rebecca Potter, Charles Rocque, Pedram Roushan, Kannan Sankaragomathi, Christopher Schuster, Michael J. Shearn, Aaron Shorter, Noah Shutty, Vladimir Shvarts, Jindra Skruzny, W. Clarke Smith, George Sterling, Marco Szalay, Douglas Thor, Alfredo Torres, Theodore White, Bryan W. K. Woo, Z. Jamie Yao, Ping Yeh, Juhwan Yoo, Grayson Young, Adam Zalcman, Ningfeng Zhu, Nicholas Zobrist, Hartmut Neven, Vadim Smelyanskiy, Andre Petukhov, Alexander N. Korotkov, Daniel Sank, Yu Chen

Session N51: Quantum Error Correction Code Performance and Implementation I

Link to Paper

Modeling the performance of the surface code with non-uniform error distribution: Part 1

Presenter: Yuri D Lensky

Authors: Yuri D Lensky, Volodymyr Sivak, Kostyantyn Kechedzhi, Igor Aleiner

Session N51: Quantum Error Correction Code Performance and Implementation I

Modeling the performance of the surface code with non-uniform error distribution: Part 2

Presenter: Volodymyr Sivak

Authors: Volodymyr Sivak, Michael Newman, Cody Jones, Henry Schurkus, Dvir Kafri, Yuri D Lensky, Paul Klimov, Kostyantyn Kechedzhi, Vadim Smelyanskiy

Session N51: Quantum Error Correction Code Performance and Implementation I

Highly optimized tensor network contractions for the simulation of classically challenging quantum computations

Presenter: Benjamin Villalonga

Author: Benjamin Villalonga

Session Q51: Co-evolution of Quantum Classical Algorithms

Teaching modern quantum computing concepts using hands-on open-source software at all levels

Presenter: Abraham Asfaw

Author: Abraham Asfaw

Session Q61: Teaching Quantum Information at All Levels II

Thursday

New circuits and an open source decoder for the color code

Presenter: Craig Gidney

Authors: Craig Gidney, Cody Jones

Session S51: Quantum Error Correction Code Performance and Implementation II

Link to Paper

Performing Hartree-Fock many-body physics calculations with large language models

Presenter: Eun-Ah Kim

Authors: Eun-Ah Kim, Haining Pan, Nayantara Mudur, William Taranto, Subhashini Venugopalan, Yasaman Bahri, Michael P Brenner

Session S18: Data Science, AI and Machine Learning in Physics I

New methods for reducing resource overhead in the surface code

Presenter: Michael Newman

Authors: Craig M Gidney, Michael Newman, Peter Brooks, Cody Jones

Session S51: Quantum Error Correction Code Performance and Implementation II

Link to Paper

Challenges and opportunities for applying quantum computers to drug design

Presenter: Raffaele Santagati

Authors: Raffaele Santagati, Alan Aspuru-Guzik, Ryan Babbush, Matthias Degroote, Leticia Gonzalez, Elica Kyoseva, Nikolaj Moll, Markus Oppel, Robert M. Parrish, Nicholas C. Rubin, Michael Streif, Christofer S. Tautermann, Horst Weiss, Nathan Wiebe, Clemens Utschig-Utschig

Session S49: Advances in Quantum Algorithms for Near-Term Applications

Link to Paper

Dispatches from Google’s hunt for super-quadratic quantum advantage in new applications

Presenter: Ryan Babbush

Author: Ryan Babbush

Session T45: Recent Advances in Quantum Algorithms

Qubit as a reflectometer

Presenter: Yaxing Zhang

Authors: Yaxing Zhang, Benjamin Chiaro

Session T48: Superconducting Fabrication, Packaging, & Validation

Random-matrix theory of measurement-induced phase transitions in nonlocal Floquet quantum circuits

Presenter: Aleksei Khindanov

Authors: Aleksei Khindanov, Lara Faoro, Lev Ioffe, Igor Aleiner

Session W14: Measurement-Induced Phase Transitions

Continuum limit of finite density many-body ground states with MERA

Presenter: Subhayan Sahu

Authors: Subhayan Sahu, Guifré Vidal

Session W58: Extreme-Scale Computational Science Discovery in Fluid Dynamics and Related Disciplines II

Dynamics of magnetization at infinite temperature in a Heisenberg spin chain

Presenter: Eliott Rosenberg

Authors: Eliott Rosenberg, Trond Andersen, Rhine Samajdar, Andre Petukhov, Jesse Hoke*, Dmitry Abanin, Andreas Bengtsson, Ilya Drozdov, Catherine Erickson, Paul Klimov, Xiao Mi, Alexis Morvan, Matthew Neeley, Charles Neill, Rajeev Acharya, Richard Allen, Kyle Anderson, Markus Ansmann, Frank Arute, Kunal Arya, Abraham Asfaw, Juan Atalaya, Joseph Bardin, A. Bilmes, Gina Bortoli, Alexandre Bourassa, Jenna Bovaird, Leon Brill, Michael Broughton, Bob B. Buckley, David Buell, Tim Burger, Brian Burkett, Nicholas Bushnell, Juan Campero, Hung-Shen Chang, Zijun Chen, Benjamin Chiaro, Desmond Chik, Josh Cogan, Roberto Collins, Paul Conner, William Courtney, Alexander Crook, Ben Curtin, Dripto Debroy, Alexander Del Toro Barba, Sean Demura, Agustin Di Paolo, Andrew Dunsworth, Clint Earle, E. Farhi, Reza Fatemi, Vinicius Ferreira, Leslie Flores, Ebrahim Forati, Austin Fowler, Brooks Foxen, Gonzalo Garcia, Élie Genois, William Giang, Craig Gidney, Dar Gilboa, Marissa Giustina, Raja Gosula, Alejandro Grajales Dau, Jonathan Gross, Steve Habegger, Michael Hamilton, Monica Hansen, Matthew Harrigan, Sean Harrington, Paula Heu, Gordon Hill, Markus Hoffmann, Sabrina Hong, Trent Huang, Ashley Huff, William Huggins, Lev Ioffe, Sergei Isakov, Justin Iveland, Evan Jeffrey, Zhang Jiang, Cody Jones, Pavol Juhas, D. Kafri, Tanuj Khattar, Mostafa Khezri, Mária Kieferová, Seon Kim, Alexei Kitaev, Andrey Klots, Alexander Korotkov, Fedor Kostritsa, John Mark Kreikebaum, David Landhuis, Pavel Laptev, Kim Ming Lau, Lily Laws, Joonho Lee, Kenneth Lee, Yuri Lensky, Brian Lester, Alexander Lill, Wayne Liu, William P. Livingston, A. Locharla, Salvatore Mandrà, Orion Martin, Steven Martin, Jarrod McClean, Matthew McEwen, Seneca Meeks, Kevin Miao, Amanda Mieszala, Shirin Montazeri, Ramis Movassagh, Wojciech Mruczkiewicz, Ani Nersisyan, Michael Newman, Jiun How Ng, Anthony Nguyen, Murray Nguyen, M. Niu, Thomas O’Brien, Seun Omonije, Alex Opremcak, Rebecca Potter, Leonid Pryadko, Chris Quintana, David Rhodes, Charles Rocque, N. Rubin, Negar Saei, Daniel Sank, Kannan Sankaragomathi, Kevin Satzinger, Henry Schurkus, Christopher Schuster, Michael Shearn, Aaron Shorter, Noah Shutty, Vladimir Shvarts, Volodymyr Sivak, Jindra Skruzny, Clarke Smith, Rolando Somma, George Sterling, Doug Strain, Marco Szalay, Douglas Thor, Alfredo Torres, Guifre Vidal, Benjamin Villalonga, Catherine Vollgraff Heidweiller, Theodore White, Bryan Woo, Cheng Xing, Jamie Yao, Ping Yeh, Juhwan Yoo, Grayson Young, Adam Zalcman, Yaxing Zhang, Ningfeng Zhu, Nicholas Zobrist, Hartmut Neven, Ryan Babbush, Dave Bacon, Sergio Boixo, Jeremy Hilton, Erik Lucero, Anthony Megrant, Julian Kelly, Yu Chen, Vadim Smelyanskiy, Vedika Khemani, Sarang Gopalakrishnan, Tomaž Prosen, Pedram Roushan

Session W50: Quantum Simulation of Many-Body Physics

Link to Paper

The fast multipole method on a quantum computer

Presenter: Kianna Wan

Authors: Kianna Wan, Dominic W Berry, Ryan Babbush

Session W50: Quantum Simulation of Many-Body Physics

Friday

The quantum computing industry and protecting national security: what tools will work?

Presenter: Kate Weber

Author: Kate Weber

Session Y43: Industry, Innovation, and National Security: Finding the Right Balance

Novel charging effects in the fluxonium qubit

Presenter: Agustin Di Paolo

Authors: Agustin Di Paolo, Kyle Serniak, Andrew J Kerman, William D Oliver

Session Y46: Fluxonium-Based Superconducting Qubits

Microwave Engineering of Parametric Interactions in Superconducting Circuits

Presenter: Ofer Naaman

Author: Ofer Naaman

Session Z46: Broadband Parametric Amplifiers and Circulators

Linear spin wave theory of large magnetic unit cells using the Kernel Polynomial Method

Presenter: Harry Lane

Authors: Harry Lane, Hao Zhang, David A Dahlbom, Sam Quinn, Rolando D Somma, Martin P Mourigal, Cristian D Batista, Kipton Barros

Session Z62: Cooperative Phenomena, Theory


*Work done while at Google

Unlocking Innovation: AWS and Anthropic push the boundaries of generative AI together

Amazon Bedrock is the best place to build and scale generative AI applications with large language models (LLMs) and other foundation models (FMs). It enables customers to leverage a variety of high-performing FMs, such as the Claude family of models by Anthropic, to build custom generative AI applications. Looking back to 2021, when Anthropic first started building on AWS, no one could have envisioned how transformative the Claude family of models would be. We have been making state-of-the-art generative AI models accessible and usable for businesses of all sizes through Amazon Bedrock. In just a few short months since Amazon Bedrock became generally available on September 28, 2023, more than 10K customers have been using it, and many of them are using Claude. Customers such as ADP, Broadridge, Cloudera, Dana-Farber Cancer Institute, Genesys, Genomics England, GoDaddy, Intuit, M1 Finance, Perplexity AI, Proto Hologram, Rocket Companies and more are using Anthropic’s Claude models on Amazon Bedrock to drive innovation in generative AI and to build transformative customer experiences. And today, we are announcing an exciting milestone with the next generation of Claude coming to Amazon Bedrock: Claude 3 Opus, Claude 3 Sonnet, and Claude 3 Haiku.

Introducing Anthropic’s Claude 3 models

Anthropic is unveiling its next generation of Claude with three advanced models optimized for different use cases. Haiku, a compact model built for near-instant responsiveness, is the fastest and most cost-effective model on the market. Sonnet is 2x faster than Claude 2 and Claude 2.1 for the vast majority of workloads, with higher levels of intelligence; it excels at tasks demanding rapid responses, like knowledge retrieval or sales automation, and strikes the ideal balance between intelligence and speed, qualities especially critical for enterprise use cases. Opus is the most advanced, capable, state-of-the-art FM, with deep reasoning, advanced math, and coding abilities, and top-level performance on highly complex tasks; it can navigate open-ended prompts and novel scenarios with remarkable fluency, including task automation, hypothesis generation, and analysis of charts, graphs, and forecasts. Sonnet is the first of the three available on Amazon Bedrock, starting today. Current evaluations from Anthropic suggest that the Claude 3 model family outperforms comparable models on the math word problem solving (MATH) and multilingual math (MGSM) benchmarks, critical benchmarks used today for LLMs.

  1. Vision capabilities – Claude 3 models have been trained to understand structured and unstructured data across different formats, not just language, but also images, charts, diagrams, and more. This lets businesses build generative AI applications integrating diverse multimedia sources and solving truly cross-domain problems. For instance, pharmaceutical companies can query drug research papers alongside protein structure diagrams to accelerate discovery. Media organizations can generate image captions or video scripts automatically.
  2. Best-in-class benchmarks – Claude 3 exceeds existing models on standardized evaluations such as math problems, programming exercises, and scientific reasoning. Customers can optimize domain-specific experimental procedures in manufacturing, or audit financial reports based on contextual data, in an automated way and with high accuracy using AI-driven responses.

    Specifically, Opus outperforms its peers on most of the common evaluation benchmarks for AI systems, including undergraduate level expert knowledge (MMLU), graduate level expert reasoning (GPQA), basic mathematics (GSM8K), and more. It exhibits high levels of comprehension and fluency on complex tasks, leading the frontier of general intelligence.

  3. Reduced hallucination – Businesses require predictive, controllable outputs from AI systems directing automated processes or customer interactions. Claude 3 models mitigate hallucination through constitutional AI techniques that provide transparency into the model’s reasoning, as well as improve accuracy. Claude 3 Opus shows an estimated 2x gain in accuracy over Claude 2.1 on difficult open-ended questions, reducing the likelihood of faulty responses. As enterprise customers rely on Claude across industries like healthcare, finance, and legal research, reducing hallucinations is essential for safety and performance. The Claude 3 family sets a new standard for reliable generative AI output.

Benefits of Anthropic Claude 3 FMs on Amazon Bedrock

Through Amazon Bedrock, customers will get easy access to build with Anthropic’s newest models. This includes not only natural language models but also their expanded range of multimodal AI models capable of advanced reasoning across text, images, charts, and more. Our collaboration has already helped customers accelerate generative AI adoption and delivered business value to them. Here are a few ways customers have been using Anthropic’s Claude models on Amazon Bedrock:

“We are developing a generative AI solution on AWS to help customers plan epic trips and create life-changing experiences with personalized travel itineraries. By building with Claude on Amazon Bedrock, we reduced itinerary generation costs by nearly 80% when we quickly created a scalable, secure AI platform that can organize our book content in minutes to deliver cohesive, highly accurate travel recommendations. Now we can repackage and personalize our content in various ways on our digital platforms, based on customer preference, all while highlighting trusted local voices–just like Lonely Planet has done for 50 years.”

— Chris Whyde, Senior VP of Engineering and Data Science, Lonely Planet

“We are working with AWS and Anthropic to host our custom, fine-tuned Anthropic Claude model on Amazon Bedrock to support our strategy of rapidly delivering generative AI solutions at scale and with cutting-edge encryption, data privacy, and safe AI technology embedded in everything we do. Our new Lexis+ AI platform technology features conversational search, insightful summarization, and intelligent legal drafting capabilities, which enable lawyers to increase their efficiency, effectiveness, and productivity.”

— Jeff Reihl, Executive VP and CTO, LexisNexis Legal & Professional

“At Broadridge, we have been working to automate the understanding of regulatory reporting requirements to create greater transparency and increase efficiency for our customers operating in domestic and global financial markets. With use of Claude on Amazon Bedrock, we’re thrilled to get even higher accuracy in our experiments with processing and summarizing capabilities. With Amazon Bedrock, we have choice in our use of LLMs, and we value the performance and integration capabilities it offers.”

— Saumin Patel, VP Engineering generative AI, Broadridge

The Claude 3 model family caters to various needs, allowing customers to choose the model best suited for their specific use case, which is key to developing a successful prototype and later production systems that can deliver real impact—whether for a new product, feature or process that boosts the bottom line. Keeping customer needs top of mind, Anthropic and AWS are delivering where it matters most to organizations of all sizes:

  1. Improved performance – Claude 3 models are significantly faster for real-time interactions thanks to optimizations across hardware and software.
  2. Increased accuracy and reliability – Through massive scaling as well as new self-supervision techniques, expected gains of 2x in accuracy for complex questions over long contexts mean AI that’s even more helpful, safe, and honest.
  3. Simpler and secure customization – Customization capabilities, like retrieval-augmented generation (RAG), simplify training models on proprietary data and building applications backed by diverse data sources, so customers get AI tuned for their unique needs. In addition, proprietary data is never exposed to the public internet, never leaves the AWS network, is securely transferred through VPC, and is encrypted in transit and at rest.

And AWS and Anthropic are continuously reaffirming our commitment to advancing generative AI in a responsible manner. By constantly improving model capabilities and committing to frameworks like Constitutional AI or the White House voluntary commitments on AI, we can accelerate the safe, ethical development and deployment of this transformative technology.

The future of generative AI

Looking ahead, customers will build entirely new categories of generative AI-powered applications and experiences with the latest generation of models. We’ve only begun to tap generative AI’s potential to automate complex processes, augment human expertise, and reshape digital experiences. We expect to see unprecedented levels of innovation as customers choose Anthropic’s models, augmented with multimodal skills, leveraging all the tools they need to build and scale generative AI applications on Amazon Bedrock. Imagine sophisticated conversational assistants providing fast and highly contextual responses; picture personalized recommendation engines that seamlessly blend in relevant images, diagrams, and associated knowledge to intuitively guide decisions; envision scientific research turbocharged by generative AI that can read experiments, synthesize hypotheses, and even propose novel areas for exploration. There are so many possibilities that will be realized by taking full advantage of all generative AI has to offer through Amazon Bedrock. Our collaboration ensures enterprises and innovators worldwide will have the tools to reach the next frontier of generative AI-powered innovation responsibly, and for the benefit of all.

Conclusion

It’s still early days for generative AI, but strong collaboration and a focus on innovation are ushering in a new era of generative AI on AWS. We can’t wait to see what customers build next.


About the author

Swami Sivasubramanian is Vice President of Data and Machine Learning at AWS. In this role, Swami oversees all AWS Database, Analytics, and AI & Machine Learning services. His team’s mission is to help organizations put their data to work with a complete, end-to-end data solution to store, access, analyze, visualize, and predict.

Human Following in Mobile Platforms with Person Re-Identification

Human following serves as an important human-robot interaction feature, but real-world scenarios make it challenging, particularly for a mobile agent. The main challenge is that when a mobile agent tries to locate and follow a targeted person, this person can be in a crowd, be occluded by other people, and/or be facing (partially) away from the mobile agent. To address this challenge, we present a novel person re-identification module, which contains three parts: 1) a 360-degree visual registration process, 2) a neural-based person re-identification mechanism using multiple body parts – human faces…Apple Machine Learning Research

Privacy-Preserving Quantile Treatment Effect Estimation for Randomized Controlled Trials

In accordance with the principle of “data minimization,” many internet companies are opting to record less data. However, this is often at odds with A/B testing efficacy. For experiments with units with multiple observations, one popular data-minimizing technique is to aggregate data for each unit. However, exact quantile estimation requires the full observation-level data. In this paper, we develop a method for approximate Quantile Treatment Effect (QTE) analysis using histogram aggregation. In addition, we can also achieve formal privacy guarantees using differential privacy.Apple Machine Learning Research

What Can CLIP Learn From Task-specific Experts?

This paper has been accepted to the UniReps Workshop in NeurIPS 2023.
Contrastive language image pretraining has become the standard approach for training vision language models. Despite the utility of CLIP visual features as global representations for images, they have limitations when it comes to tasks involving object localization, pixel-level understanding of the image, or 3D perception. Multi-task training is a popular solution to address this drawback, but collecting a large-scale annotated multi-task dataset incurs significant costs. Furthermore, training on separate task specific…Apple Machine Learning Research

Knowledge Bases for Amazon Bedrock now supports hybrid search

At AWS re:Invent 2023, we announced the general availability of Knowledge Bases for Amazon Bedrock. With a knowledge base, you can securely connect foundation models (FMs) in Amazon Bedrock to your company data for fully managed Retrieval Augmented Generation (RAG).

In a previous post, we described how Knowledge Bases for Amazon Bedrock manages the end-to-end RAG workflow for you and shared details about some of the recent feature launches.

For RAG-based applications, the accuracy of the generated response from large language models (LLMs) is dependent on the context provided to the model. Context is retrieved from the vector database based on the user query. Semantic search is widely used because it is able to understand more human-like questions—a user’s query is not always directly related to the exact keywords in the content that answers it. Semantic search helps provide answers based on the meaning of the text. However, it has limitations in capturing all the relevant keywords. Its performance relies on the quality of the word embeddings used to represent the meaning of the text. To overcome such limitations, combining semantic search with keyword search (hybrid search) gives better results.

In this post, we discuss the new feature of hybrid search, which you can select as a query option alongside semantic search.

Hybrid search overview

Hybrid search takes advantage of the strengths of multiple search algorithms, integrating their unique capabilities to enhance the relevance of returned search results. For RAG-based applications, semantic search capabilities are commonly combined with traditional keyword-based search to improve the relevance of search results. It enables searching over both the content of documents and their underlying meaning. For example, consider the following query:

What is the cost of the book "<book_name>" on <website_name>?

In this query for a book name and website name, a keyword search will give better results, because we want the cost of the specific book. However, the term “cost” might have synonyms such as “price,” so it will be better to use semantic search, which understands the meaning of the text. Hybrid search brings the best of both approaches: precision of semantic search and coverage of keywords. It works great for RAG-based applications where the retriever has to handle a wide variety of natural language queries. The keywords help cover specific entities in the query such as product name, color, and price, while semantics better understands the meaning and intent within the query. For example, if you want to build a chatbot for an ecommerce website to handle customer queries such as the return policy or details of the product, hybrid search will be the most suitable option.
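
To make the combination concrete, here is a minimal sketch of one common fusion technique, reciprocal rank fusion (RRF), which merges a keyword-ranked list and a semantically ranked list into a single ranking. This is illustrative only: the document IDs are made up, and RRF is an assumption here, not necessarily the fusion method Knowledge Bases for Amazon Bedrock uses internally.

def reciprocal_rank_fusion(keyword_results, semantic_results, k=60):
    """Combine two ranked lists of document IDs into one fused ranking."""
    scores = {}
    for ranking in (keyword_results, semantic_results):
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank); k dampens the influence
            # of a single list's top-ranked outliers.
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical example: "10k-p54" ranks high in the keyword list (exact terms
# like "physical stores"), while the semantic list favors related passages.
keyword_results = ["10k-p54", "10k-p12", "10k-p3"]
semantic_results = ["10k-p12", "10k-p54", "10k-p7"]
print(reciprocal_rank_fusion(keyword_results, semantic_results))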

Use cases for hybrid search

The following are some common use cases for hybrid search:

  • Open domain question answering – This involves answering questions on a wide variety of topics. This requires searching over large collections of documents with diverse content, such as website data, which can include various topics such as sustainability, leadership, financial results, and more. Semantic search alone can’t generalize well for this task, because it lacks the capacity for lexical matching of unseen entities, which is important for handling out-of-domain examples. Therefore, combining keyword-based search with semantic search can help narrow down the scope and provide better results for open domain question answering.
  • Contextual-based chatbots – Conversations can rapidly change direction and cover unpredictable topics. Hybrid search can better handle such open-ended dialogs.
  • Personalized search – Web-scale search over heterogeneous content benefits from a hybrid approach. Semantic search handles popular head queries, while keywords cover rare long-tail queries.

Although hybrid search offers wider coverage by combining two approaches, semantic search has precision advantages when the domain is narrow and semantics are well-defined, or when there is little room for misinterpretation, like factoid question answering systems.

Benefits of hybrid search

Both keyword and semantic search will return a separate set of results along with their relevancy scores, which are then combined to return the most relevant results. Knowledge Bases for Amazon Bedrock currently supports four vector stores: Amazon OpenSearch Serverless, Amazon Aurora PostgreSQL-Compatible Edition, Pinecone, and Redis Enterprise Cloud. As of this writing, the hybrid search feature is available for OpenSearch Serverless, with support for other vector stores coming soon.

The following are some of the benefits of using hybrid search:

  • Improved accuracy – The accuracy of the generated response from the FM is directly dependent on the relevancy of retrieved results. Based on your data, it can be challenging to improve the accuracy of your application only using semantic search. The key benefit of using hybrid search is to get improved quality of retrieved results, which in turn helps the FM generate more accurate answers.
  • Expanded search capabilities – Keyword search casts a wider net and finds documents that may be relevant but might not contain semantic structure throughout the document. It allows you to search on keywords as well as the semantic meaning of the text, thereby expanding the search capabilities.

In the following sections, we demonstrate how to use hybrid search with Knowledge Bases for Amazon Bedrock.

Use hybrid search and semantic search options via SDK

When you call the Retrieve API, Knowledge Bases for Amazon Bedrock selects the right search strategy for you to give you the most relevant results. You have the option to override it to use either hybrid or semantic search in the API.

Retrieve API

The Retrieve API is designed to fetch relevant search results by providing the user query, knowledge base ID, and number of results that you want the API to return. This API converts user queries into embeddings, searches the knowledge base using either hybrid search or semantic (vector) search, and returns the relevant results, giving you more control to build custom workflows on top of the search results. For example, you can add postprocessing logic to the retrieved results or add your own prompt and connect with any FM provided by Amazon Bedrock for generating answers.
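
For instance, the following is a minimal sketch of such a custom workflow: it filters the retrieved chunks by relevancy score, builds a custom prompt, and calls a Claude model through the bedrock-runtime InvokeModel API. The score threshold and the model ID here are illustrative assumptions, not recommendations.

import json
import boto3

bedrock_agent_runtime = boto3.client("bedrock-agent-runtime")
bedrock_runtime = boto3.client("bedrock-runtime")

def answer_with_context(query, kb_id, min_score=0.5):
    # Fetch candidate chunks from the knowledge base.
    results = bedrock_agent_runtime.retrieve(
        retrievalQuery={"text": query},
        knowledgeBaseId=kb_id,
        retrievalConfiguration={"vectorSearchConfiguration": {"numberOfResults": 5}},
    )["retrievalResults"]

    # Postprocessing step: keep only chunks above a relevancy-score threshold
    # (the 0.5 default is an arbitrary illustrative choice).
    context = "\n\n".join(
        r["content"]["text"] for r in results if r["score"] >= min_score
    )

    # Build our own prompt and send it to a Claude model on Amazon Bedrock.
    body = json.dumps({
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": 512,
        "messages": [{
            "role": "user",
            "content": f"Answer using only this context:\n\n{context}\n\nQuestion: {query}",
        }],
    })
    response = bedrock_runtime.invoke_model(
        modelId="anthropic.claude-3-sonnet-20240229-v1:0", body=body
    )
    return json.loads(response["body"].read())["content"][0]["text"]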

To show you an example of switching between hybrid and semantic (vector) search options, we have created a knowledge base using the Amazon 10K document for 2023. For more details on creating a knowledge base, refer to Build a contextual chatbot application using Knowledge Bases for Amazon Bedrock.

To demonstrate the value of hybrid search, we use the following query:

As of December 31st 2023, what is the leased square footage for physical stores in North America?

The answer for the preceding query involves a few keywords, such as the date, physical stores, and North America. The correct response is 22,871 thousand square feet. Let’s observe the difference in the search results for both hybrid and semantic search.

The following code shows how to use hybrid or semantic (vector) search using the Retrieve API with Boto3:

import boto3

bedrock_agent_runtime = boto3.client(
    service_name="bedrock-agent-runtime"
)

def retrieve(query, kbId, numberOfResults=5):
    return bedrock_agent_runtime.retrieve(
        retrievalQuery={
            'text': query
        },
        knowledgeBaseId=kbId,
        retrievalConfiguration={
            'vectorSearchConfiguration': {
                'numberOfResults': numberOfResults,
                # optional; set to "HYBRID" or "SEMANTIC" to override the default
                'overrideSearchType': "HYBRID",
            }
        }
    )

response = retrieve("As of December 31st 2023, what is the leased square footage for physical stores in North America?", "<knowledge base id>")["retrievalResults"]

The overrideSearchType option in retrievalConfiguration offers the choice to use either HYBRID or SEMANTIC. By default, it will select the right strategy for you to give you the most relevant results, and if you want to override the default option to use either hybrid or semantic search, you can set the value to HYBRID or SEMANTIC. The output of the Retrieve API includes the retrieved text chunks, the location type and URI of the source data, and the relevancy scores of the retrievals. The scores help determine which chunks best match the query.

The following are the results for the preceding query using hybrid search (with some of the output redacted for brevity):

[
  {
    "content": {
      "text": "... Description of Use Leased Square Footage (1).... Physical stores (2) 22,871  ..."
    },
    "location": {
      "type": "S3",
      "s3Location": {
        "uri": "s3://<bucket_name>/amazon-10k-2023.pdf"
      }
    },
    "score": 0.6389407
  },
  {
    "content": {
      "text": "Property and equipment, net by segment is as follows (in millions): December 31, 2021 2022 2023 North America $ 83,640 $ 90,076 $ 93,632 International 21,718 23,347 24,357 AWS 43,245 60,324 72,701 Corporate 1.."
    },
    "location": {
      "type": "S3",
      "s3Location": {
        "uri": "s3://<bucket_name>/amazon-10k-2023.pdf"
      }
    },
    "score": 0.6389407
  },
  {
    "content": {
      "text": "..amortization of property and equipment acquired under finance leases of $9.9 billion, $6.1 billion, and $5.9 billion for 2021, 2022, and 2023. 54 Table of Contents Note 4 — LEASES We have entered into non-cancellable operating and finance leases for fulfillment network, data center, office, and physical store facilities as well as server and networking equipment, aircraft, and vehicles. Gross assets acquired under finance leases, ..."
    },
    "location": {
      "type": "S3",
      "s3Location": {
        "uri": "s3://<bucket_name>/amazon-10k-2023.pdf"
      }
    },
    "score": 0.61908984
  }
]

The following are the results for semantic search (with some of the output redacted for brevity):

[
  {
    "content": {
      "text": "Property and equipment, net by segment is as follows (in millions):    December 31,    2021 2022 2023   North America $ 83,640 $ 90,076 $ 93,632  International 21,718 23,347 24,357  AWS 43,245 60,324 72,701.."
    },
    "location": {
      "type": "S3",
      "s3Location": {
        "uri": "s3://<bucket_name>/amazon-10k-2023.pdf"
      }
    },
    "score": 0.6389407
  },
  {
    "content": {
      "text": "Depreciation and amortization expense on property and equipment was $22.9 billion, $24.9 billion, and $30.2 billion which includes amortization of property and equipment acquired under finance leases of $9.9 billion, $6.1 billion, and $5.9 billion for 2021, 2022, and 2023.   54        Table of Contents   Note 4 — LEASES We have entered into non-cancellable operating and finance leases for fulfillment network, data center, office, and physical store facilities as well a..."
    },
    "location": {
      "type": "S3",
      "s3Location": {
        "uri": "s3://<bucket_name>/amazon-10k-2023.pdf"
      }
    },
    "score": 0.61908984
  },
  {
    "content": {
      "text": "Incentives that we receive from property and equipment   vendors are recorded as a reduction to our costs. Property includes buildings and land that we own, along with property we have acquired under build-to-suit lease arrangements when we have control over the building during the construction period and finance lease arrangements..."
    },
    "location": {
      "type": "S3",
      "s3Location": {
        "uri": "s3://<bucket_name>/amazon-10k-2023.pdf"
      }
    },
    "score": 0.61353767
  }
]

As you can see in the results, hybrid search was able to retrieve the search result containing the leased square footage for physical stores in North America, as mentioned in the user query. The main reason is that hybrid search was able to combine matches on keywords such as the date, physical stores, and North America in the query, whereas semantic search did not. Therefore, when the search results are augmented with the user query and the prompt, the FM won’t be able to provide the correct response in the case of semantic search.

Now let’s look at the RetrieveAndGenerate API with hybrid search to understand the final response generated by the FM.

RetrieveAndGenerate API

The RetrieveAndGenerate API queries a knowledge base and generates a response based on the retrieved results. You specify the knowledge base ID as well as the FM to generate a response from the results. Amazon Bedrock converts the queries into embeddings, queries the knowledge base based on the search type, and then augments the FM prompt with the search results as context information and returns the FM-generated response.

Let’s use the query “As of December 31st 2023, what is the leased square footage for physical stores in North America?” and ask the RetrieveAndGenerate API to generate the response using our query:

def retrieveAndGenerate(input, kbId):
    return bedrock_agent_runtime.retrieve_and_generate(
        input={
            'text': input
        },
        retrieveAndGenerateConfiguration={
            'type': 'KNOWLEDGE_BASE',
            'knowledgeBaseConfiguration': {
                'knowledgeBaseId': kbId,
                'modelArn': 'arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-instant-v1',
                'retrievalConfiguration': {
                    'vectorSearchConfiguration': {
                        # optional; set to "HYBRID" or "SEMANTIC" to override the default
                        'overrideSearchType': 'HYBRID',
                    }
                }
            }
        }
    )

response = retrieveAndGenerate("As of December 31st 2023, what is the leased square footage for physical stores in North America?", "<knowledge base id>")["output"]["text"]

The following are the results using hybrid search:

22,871 thousand leased square feet

The following are the results using semantic search:

The search results do not contain any information about the leased square footage for physical stores in North America for 2023.

The actual answer for the query is 22,871 thousand leased square feet, which hybrid search generated. The search results retrieved by hybrid search included the information about the leased square footage for physical stores in North America, whereas semantic search wasn’t able to fetch the right information from the vector store because the query embedding didn’t capture the exact keywords. Therefore, the FM couldn’t provide the correct response because it didn’t have the correct and most relevant search results.

However, for more generic questions that don’t involve entities such as physical stores or North America, both hybrid and semantic search give similar results.

The following are sample responses from a few queries demonstrating cases when both hybrid and semantic search yield similar results.

Question 1: How does Amazon serve the developers and enterprises?

  • Semantic search (RAG API): We serve developers and enterprises of all sizes, including start-ups, government agencies, and academic institutions, through AWS, which offers a broad set of on-demand technology services, including compute, storage, database, analytics, and machine learning, and other services.
  • Hybrid search (RAG API): We serve developers and enterprises of all sizes, including start-ups, government agencies, and academic institutions, through AWS, which offers a broad set of on-demand technology services, including compute, storage, database, analytics, and machine learning, and other services.

Question 2: Who are the Executive Officers and Directors for Amazon as of January 24, 2024?

  • Semantic search (RAG API): The executive officers of Amazon as of 2024 include Andrew R. Jassy as President and Chief Executive Officer, Douglas J. Herrington as CEO Worldwide Amazon Stores, Brian T. Olsavsky as Senior Vice President and Chief Financial Officer, Shelley L. Reynolds as Vice President, Worldwide Controller, and Principal Accounting Officer, Adam N. Selipsky as CEO Amazon Web Services, and David A. Zapolsky as Senior Vice President, Global Public Policy and General Counsel.
  • Hybrid search (RAG API): As of 2024, Jeffrey P. Bezos serves as Executive Chair of Amazon.com. Andrew R. Jassy serves as President and Chief Executive Officer. Other executive officers include Douglas J. Herrington as CEO Worldwide Amazon Stores, Brian T. Olsavsky as Senior Vice President and Chief Financial Officer, Shelley L. Reynolds as Vice President, Worldwide Controller, and Principal Accounting Officer, and Adam N. Selipsky as CEO Amazon Web Services. David A. Zapolsky serves as Senior Vice President, Global Public Policy and General Counsel.

Use hybrid search and semantic search options via the Amazon Bedrock console

To use hybrid and semantic search options on the Amazon Bedrock console, complete the following steps:

  1. On the Amazon Bedrock console, choose Knowledge base in the navigation pane.
  2. Choose the knowledge base you created.
  3. Choose Test knowledge base.
  4. Choose the configurations icon.
  5. For Search type, select Hybrid search (semantic & text).

By default, you can choose an FM to get a generated response for your query. If you want to see only the retrieved results, you can toggle Generate response off to get only retrieved results.

Conclusion

In this post, we covered the new query feature in Knowledge Bases for Amazon Bedrock, which enables hybrid search. We learned how to configure the hybrid search option in the SDK and the Amazon Bedrock console. This helps overcome some of the limitations of relying solely on semantic search, especially for searching over large collections of documents with diverse content. The use of hybrid search depends on the document type and the use case that you are trying to implement.

For additional resources, refer to the following:

References

Improving Retrieval Performance in RAG Pipelines with Hybrid Search


About the Authors

Mani Khanuja is a Tech Lead – Generative AI Specialists, author of the book Applied Machine Learning and High Performance Computing on AWS, and a member of the Board of Directors for the Women in Manufacturing Education Foundation. She leads machine learning projects in various domains such as computer vision, natural language processing, and generative AI. She speaks at internal and external conferences such as AWS re:Invent, Women in Manufacturing West, YouTube webinars, and GHC 23. In her free time, she likes to go for long runs along the beach.

Pallavi Nargund is a Principal Solutions Architect at AWS. In her role as a cloud technology enabler, she works with customers to understand their goals and challenges, and gives prescriptive guidance to achieve their objectives with AWS offerings. She is passionate about women in technology and is a core member of Women in AI/ML at Amazon. She speaks at internal and external conferences such as AWS re:Invent, AWS Summits, and webinars. Outside of work she enjoys volunteering, gardening, cycling and hiking.

Automakers Electrify Geneva International Motor Show

The Geneva International Motor Show, one of the most important and long-standing global auto exhibitions, opened this week, with the spotlight on several China and U.S. EV makers building on NVIDIA DRIVE that are expanding their presence in Europe.

BYD

One of the key reveals is BYD’s Yangwang U8 large plug-in hybrid SUV, built on the NVIDIA DRIVE Orin platform. It features an electric drivetrain with a range of up to 1,000 kilometers, or 621 miles, thanks to its 49 kilowatt-hour battery and range-extender gasoline engine.

IM Motors

The premium EV brand by SAIC and Alibaba, IM Motors, unveiled its IM L6 mid-size electric saloon (aka sedan), which will soon be available in the European market.

The L6 features IM’s proprietary Intelligent Mobility Autonomous Driving System, powered by NVIDIA DRIVE Orin. The advanced system uses a comprehensive sensor suite, including one lidar, three radars, 11 cameras and 12 ultrasonic sensors. IM reports that the L6 is capable of highway navigation on autopilot, marking a milestone in the company’s mission to bring advanced self-driving vehicles to market.

IM Motors has been working with NVIDIA since 2021, using NVIDIA DRIVE Orin as the AI brain for its flagship LS7 SUV and L7 sedan.

IM Motors IM L6

Lucid 

The Geneva Motor Show marks the European debut of Lucid’s Gravity SUV.

The Gravity aims to set fresh benchmarks for sustainability and technological innovation when its production commences in late 2024. Powered by NVIDIA DRIVE, the luxury SUV features supercar levels of performance and an impressive battery range to mitigate range anxiety.

Lucid Gravity SUV

Explore the latest developments in mobility and meet NVIDIA DRIVE ecosystem partners showcasing their next-gen vehicles at GTC, the conference for the era of AI, running from March 18-21 at the San Jose Convention Center and online.

Find additional details on automotive-specific programming at GTC.

No Noobs Here: Top Pro Gamers Bolster Software Quality Assurance Testing

For some NVIDIANs, it’s always game day.

Our Santa Clara-based software quality assurance team boasts some of the world’s top gamers, whose search for bugs and errors is as strategic as their battle plans for toppling top-tier opponents in video games.

Two members of the QA team — friendly colleagues in the office but fierce rivals in the esports arena — recently competed against one another at the finals of the Guildhouse Fighters tournament, a local circuit in Northern California.

Eduardo “PR Balrog” Perez-Frangie, a veteran Street Fighter player, fought his way to the Grand Final to face Miky “Samurai” Chea.

Perez-Frangie came out on top, but there was a twist: he’d brought his two-year-old son to the contest, and the boy fell asleep on his father’s chest mid-match. “I played the rest of the game with a deadweight in my lap,” he said.

Perez-Frangie and Chea play at work — guess who won? QA team members Alyssa Ruiz and DaJuan McDaniel cheer them on.

A Competitive Spirit

Perez-Frangie has competed for 15 years in a series of fighting game titles, including Marvel vs. Capcom, Killer Instinct and Mortal Kombat. He was part of the Evil Geniuses esports organization when he joined NVIDIA almost a decade ago but now plays without a sponsor so he can enjoy more family time.

He’s played against Chea in esports events for years, and they’re just as competitive in the office as in the stadiums.

“Even when we’re in the test environment, when we’re looking for bugs, we are competitive,” Perez-Frangie said. “But Miky stays calm — he was a teacher, so he can put everyone in their place.”

Chea’s days teaching kindergarten through eighth grade in Fresno, California, are behind him, but the coaching aspect of his current gaming-related role reminds him of the classroom — a place to share insights and takeaways.

As new games are released and older ones updated, “the hardware and software stack needs to work in harmony,” Chea said. “Our pro gaming team is the last line of defense to ensure our customers have the best gaming experience possible.”

The QA team gathers to check out a new game.

QA team member DaJuan “Shroomed” McDaniel is a top-ranked Super Smash Bros. Melee player whose signature characters are Sheik and Marth. He’s also widely considered to be the best Dr. Mario player of all time.

“Being a competitive gamer, visual fidelity is so important,” McDaniel said. “We can see and feel visual anomalies, frame discrepancies, general latency and anything that’s off in ways that others won’t see.”

McDaniel playing “Cyberpunk 2077.”

A Winning Formula

Alyssa Ruiz joined the QA team a year ago, initially testing drivers as part of the pro gaming team before switching to testing NVIDIA DLSS, a suite of neural rendering techniques that use deep learning to improve image quality and performance.

Introduced to gaming by her brothers through Halo 3, she later dedicated hours to Fortnite before deciding to stream the gameplay directly from her console. She posted the content to TikTok and began playing in online tournaments. By then, her game of choice was Riot Games’ Valorant.

“The game has a large female player base with visually appealing graphics and an engrossing storyline,” she said. “It can be more complex than a fighting game because it relies on a combination of abilities with strategies. It’s also a team game, so if someone isn’t pulling their weight, it’s a loss for all of us.”

That’s not unlike the team dynamic in the office.

Perez-Frangie, Ruiz, Chea and McDaniel.

Each member brings their own specialties to the testing environment, where they’re using their keen eyes to scrutinize DLSS technologies.

Their acute awareness of game latency and image fidelity — honed through hundreds of hours of gameplay — means the team can achieve better test coverage all around.

“We’re all very competitive, but there’s a real diversity that contributes to a stronger team,” Ruiz said. “And we all get along really well.”

Learn more about NVIDIA life, culture and careers

What Is Trustworthy AI?

Artificial intelligence, like any transformative technology, is a work in progress — continually growing in its capabilities and its societal impact. Trustworthy AI initiatives recognize the real-world effects that AI can have on people and society, and aim to channel that power responsibly for positive change.

What Is Trustworthy AI?

Trustworthy AI is an approach to AI development that prioritizes safety and transparency for those who interact with it. Developers of trustworthy AI understand that no model is perfect, and take steps to help customers and the general public understand how the technology was built, its intended use cases and its limitations.

In addition to complying with privacy and consumer protection laws, trustworthy AI models are tested for safety, security and mitigation of unwanted bias. They’re also transparent — providing information such as accuracy benchmarks or a description of the training dataset — to various audiences including regulatory authorities, developers and consumers.

Principles of Trustworthy AI

Trustworthy AI principles are foundational to NVIDIA’s end-to-end AI development. They have a simple goal: to enable trust and transparency in AI and support the work of partners, customers and developers.

Privacy: Complying With Regulations, Safeguarding Data

AI is often described as data hungry. Often, the more data an algorithm is trained on, the more accurate its predictions.

But data has to come from somewhere. To develop trustworthy AI, it’s key to consider not just what data is legally available to use, but what data is socially responsible to use.

Developers of AI models that rely on data such as a person’s image, voice, artistic work or health records should evaluate whether individuals have provided appropriate consent for their personal information to be used in this way.

For institutions like hospitals and banks, building AI models means balancing the responsibility of keeping patient or customer data private while training a robust algorithm. NVIDIA has created technology that enables federated learning, where researchers develop AI models trained on data from multiple institutions without confidential information leaving a company’s private servers.

NVIDIA DGX systems and NVIDIA FLARE software have enabled several federated learning projects in healthcare and financial services, facilitating secure collaboration by multiple data providers on more accurate, generalizable AI models for medical image analysis and fraud detection.
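To make the idea concrete, here’s a minimal federated-averaging sketch in Python. It’s a toy illustration, not NVIDIA FLARE’s API: the site data and local update rule are hypothetical, and a real deployment would rely on a framework like FLARE to handle secure communication and orchestration.

    import numpy as np

    def local_update(global_weights, local_data, lr=0.01):
        # Hypothetical local training step: each site nudges the global
        # weights toward its own (private) data.
        X, y = local_data
        preds = X @ global_weights
        grad = X.T @ (preds - y) / len(y)  # gradient of mean squared error
        return global_weights - lr * grad

    # Each institution holds its data privately; only weights ever travel.
    sites = {
        "hospital_a": (np.random.randn(100, 5), np.random.randn(100)),
        "hospital_b": (np.random.randn(80, 5), np.random.randn(80)),
    }

    weights = np.zeros(5)
    for _ in range(10):
        # Sites train locally; the server only ever sees weight vectors.
        local = [local_update(weights, data) for data in sites.values()]
        weights = np.mean(local, axis=0)  # federated averaging

The key property is that raw records never leave a site; only model parameters are aggregated centrally.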

Safety and Security: Avoiding Unintended Harm, Malicious Threats

Once deployed, AI systems have real-world impact, so it’s essential they perform as intended to preserve user safety.

The freedom to use publicly available AI algorithms creates immense possibilities for positive applications, but also means the technology can be used for unintended purposes.

To help mitigate risks, NVIDIA NeMo Guardrails keeps AI language models on track by allowing enterprise developers to set boundaries for their applications. Topical guardrails ensure that chatbots stick to specific subjects. Safety guardrails set limits on the language and data sources the apps use in their responses. Security guardrails seek to prevent malicious use of a large language model that’s connected to third-party applications or application programming interfaces.
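In code, wiring guardrails into an application is brief. The following sketch uses the open-source nemoguardrails package; the ./config directory (holding the YAML model settings and Colang rail definitions) is an assumed local path, and an LLM API key must be configured separately.

    from nemoguardrails import LLMRails, RailsConfig

    # Load the model settings and Colang rail definitions from a local
    # directory (assumed layout; adjust the path to your project).
    config = RailsConfig.from_path("./config")
    rails = LLMRails(config)

    # Requests pass through the topical, safety and security rails
    # before and after the underlying model is called.
    response = rails.generate(messages=[
        {"role": "user", "content": "Tell me about your return policy."}
    ])
    print(response["content"])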

NVIDIA Research is working with the DARPA-run SemaFor program to help digital forensics experts identify AI-generated images. Last year, researchers published a novel method for addressing social bias using ChatGPT. They’re also creating methods for avatar fingerprinting — a way to detect if someone is using an AI-animated likeness of another individual without their consent.

To protect data and AI applications from security threats, NVIDIA H100 and H200 Tensor Core GPUs are built with confidential computing, which ensures sensitive data is protected while in use, whether deployed on premises, in the cloud or at the edge. NVIDIA Confidential Computing uses hardware-based security methods to ensure unauthorized entities can’t view or modify data or applications while they’re running — traditionally a time when data is left vulnerable.

Transparency: Making AI Explainable

To create a trustworthy AI model, the algorithm can’t be a black box — its creators, users and stakeholders must be able to understand how the AI works to trust its results.

Transparency in AI is a set of best practices, tools and design principles that helps users and other stakeholders understand how an AI model was trained and how it works. Explainable AI, or XAI, is a subset of transparency covering tools that inform stakeholders how an AI model makes certain predictions and decisions.

Transparency and XAI are crucial to establishing trust in AI systems, but there’s no universal solution to fit every kind of AI model and stakeholder. Finding the right solution involves a systematic approach to identify who the AI affects, analyze the associated risks and implement effective mechanisms to provide information about the AI system.

Retrieval-augmented generation, or RAG, is a technique that advances AI transparency by connecting generative AI services to authoritative external databases, enabling models to cite their sources and provide more accurate answers. NVIDIA is helping developers get started with a RAG workflow that uses the NVIDIA NeMo framework for developing and customizing generative AI models.
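The retrieval step itself is simple to sketch. The snippet below is a generic illustration of the pattern rather than the NeMo framework’s actual workflow, and the hash-based embed function is a stand-in for a real sentence-embedding model.

    import numpy as np

    def embed(text):
        # Stand-in embedding: deterministic random vectors. A real RAG
        # pipeline would call a sentence-embedding model here.
        rng = np.random.default_rng(abs(hash(text)) % (2**32))
        return rng.standard_normal(64)

    documents = [
        "Returns are accepted within 30 days with a receipt.",
        "Store hours are 9 a.m. to 6 p.m. on weekdays.",
    ]
    doc_vectors = np.stack([embed(d) for d in documents])

    def retrieve(query, k=1):
        # Rank documents by cosine similarity to the query.
        q = embed(query)
        scores = doc_vectors @ q / (
            np.linalg.norm(doc_vectors, axis=1) * np.linalg.norm(q))
        return [documents[i] for i in np.argsort(scores)[::-1][:k]]

    query = "What is the return policy?"
    sources = retrieve(query)
    # Prepending the retrieved sources lets the model ground its answer
    # and lets the user audit where the answer came from.
    prompt = f"Answer using only these sources: {sources}\nQuestion: {query}"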

NVIDIA is also part of the National Institute of Standards and Technology’s U.S. Artificial Intelligence Safety Institute Consortium, or AISIC, to help create tools and standards for responsible AI development and deployment. As a consortium member, NVIDIA will promote trustworthy AI by leveraging best practices for implementing AI model transparency.

And on NVIDIA’s hub for accelerated software, NGC, model cards offer detailed information about how each AI model works and was built. NVIDIA’s Model Card ++ format describes the datasets, training methods and performance measures used, as well as licensing information and specific ethical considerations.

Nondiscrimination: Minimizing Bias

AI models are trained by humans, often using data that is limited by size, scope and diversity. To ensure that all people and communities have the opportunity to benefit from this technology, it’s important to reduce unwanted bias in AI systems.

Beyond following government guidelines and antidiscrimination laws, trustworthy AI developers mitigate potential unwanted bias by looking for clues and patterns that suggest an algorithm is discriminatory, or involves the inappropriate use of certain characteristics. Racial and gender bias in data are well-known, but other considerations include cultural bias and bias introduced during data labeling. To reduce unwanted bias, developers might incorporate different variables into their models.

Synthetic datasets offer one solution to reduce unwanted bias in training data used to develop AI for autonomous vehicles and robotics. If data used to train self-driving cars underrepresents uncommon scenes such as extreme weather conditions or traffic accidents, synthetic data can help augment the diversity of these datasets to better represent the real world, helping improve AI accuracy.
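A sketch of that rebalancing logic, with a hypothetical generate_synthetic_scene helper standing in for a renderer such as Omniverse Replicator:

    from collections import Counter

    def generate_synthetic_scene(condition):
        # Hypothetical hook into a synthetic-data tool; in practice this
        # would render and label a scene for the requested condition.
        return {"image": f"synthetic_{condition}.png", "label": condition}

    # A skewed training set: rare conditions are badly underrepresented.
    labels = ["clear"] * 950 + ["heavy_rain"] * 30 + ["snow"] * 20
    counts = Counter(labels)
    target = max(counts.values())

    synthetic = []
    for condition, n in counts.items():
        # Top up each underrepresented condition until classes balance.
        synthetic += [generate_synthetic_scene(condition)
                      for _ in range(target - n)]

    print(f"Added {len(synthetic)} synthetic samples")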

NVIDIA Omniverse Replicator, a framework built on the NVIDIA Omniverse platform for creating and operating 3D pipelines and virtual worlds, helps developers set up custom pipelines for synthetic data generation. And by integrating the NVIDIA TAO Toolkit for transfer learning with Innotescus, a web platform for curating unbiased datasets for computer vision, developers can better understand dataset patterns and biases to help address statistical imbalances.

Learn more about trustworthy AI on NVIDIA.com and the NVIDIA Blog. For more on tackling unwanted bias in AI, watch this talk from NVIDIA GTC and attend the trustworthy AI track at the upcoming conference, taking place March 18-21 in San Jose, Calif., and online.

Read More

Expedite your Genesys Cloud Amazon Lex bot design with the Amazon Lex automated chatbot designer

Expedite your Genesys Cloud Amazon Lex bot design with the Amazon Lex automated chatbot designer

The rise of artificial intelligence (AI) has created opportunities to improve the customer experience in the contact center space. Machine learning (ML) technologies continually improve and power the contact center customer experience by providing solutions for capabilities like self-service bots, live call analytics, and post-call analytics. Self-service bots integrated with your call center can help you achieve decreased wait times, intelligent routing, decreased time to resolution through self-service functions or data collection, and improved net promoter scores (NPS). Some examples include a customer calling to check on the status of an order and receiving an update from a bot, or a customer needing to submit a renewal for a license and the chatbot collecting the necessary information, which it hands over to an agent for processing.

With Amazon Lex bots, you can use conversational AI capabilities to enable these capabilities within your call center. Amazon Lex uses automatic speech recognition (ASR) and natural language understanding (NLU) to understand the customer’s needs and assist them on their journey.

Genesys Cloud (an omni-channel orchestration and customer relationship platform) provides a contact center platform in a public cloud model that enables quick and simple integration of AWS Contact Center Intelligence (AWS CCI) to transform the modern contact center from a cost center into a profit center. As part of AWS CCI, Genesys Cloud integrates with Amazon Lex, which enables self-service, intelligent routing, and data collection capabilities.

When exploring AWS CCI capabilities with Amazon Lex and Genesys Cloud, you may be unsure of where to start on your bot design journey. To assist those who may be starting with a blank canvas, Amazon Lex provides the Amazon Lex automated chatbot designer. The automated chatbot designer uses ML to generate an initial bot design from your existing call transcripts, which you can then refine to launch conversational experiences faster. With the automated chatbot designer, Amazon Lex customers and partners have a straightforward and intuitive way of designing chatbots and can reduce bot design time from weeks to hours. However, the automated chatbot designer requires transcripts to be in a specific format that doesn’t align with Genesys Cloud transcript exports.

In this post, we show how you can implement an architecture using Amazon EventBridge, Amazon Simple Storage Service (Amazon S3), and AWS Lambda to automatically collect, transform, and load your Genesys call transcripts in the required format for the Amazon Lex automated chatbot designer. You can then run the automated chatbot designer on your transcripts, be given recommendations for bot design, and streamline your bot design journey.

Solution overview

The following diagram illustrates the solution architecture.

The solution workflow consists of the following steps:

  1. Genesys Cloud sends iterative transcript events to your EventBridge event bus.
  2. Lambda receives the iterative transcripts from EventBridge, determines when a conversation is complete, invokes the Transcript API within Genesys Cloud, and drops the full transcript into an S3 bucket.
  3. When a new full transcript is uploaded to Amazon S3, Lambda converts the Genesys Cloud formatted transcript into the required format for the Amazon Lex automated chatbot designer and copies it to an S3 bucket (a sketch of this transform follows the list).
  4. The Amazon Lex automated chatbot designer uses ML to build an initial bot design based on the provided Genesys Cloud transcripts.
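As an illustration of step 3, here is a hedged Lambda handler sketch in Python. The raw transcript schema (phrases, participant, text) and the target bucket name are simplified assumptions for illustration; the actual Genesys export and Amazon Lex designer formats carry more fields than shown here.

    import json
    import boto3

    s3 = boto3.client("s3")

    def handler(event, context):
        # Triggered when a full Genesys transcript lands in the raw
        # bucket; writes a simplified, designer-ready transcript to the
        # target bucket. Field names are illustrative assumptions.
        record = event["Records"][0]["s3"]
        bucket = record["bucket"]["name"]
        key = record["object"]["key"]

        raw = json.loads(s3.get_object(Bucket=bucket, Key=key)["Body"].read())

        # Map each utterance to a participant/content pair.
        transcript = [
            {"ParticipantId": p.get("participant", "CUSTOMER"),
             "Content": p.get("text", "")}
            for p in raw.get("phrases", [])
        ]

        s3.put_object(
            Bucket="transformed-transcripts-bucket",  # placeholder name
            Key=key,
            Body=json.dumps({"Transcript": transcript}),
        )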

Prerequisites

Before you deploy the solution, you must complete the following prerequisites:

  1. Set up your Genesys Cloud CX account and make sure that you are able to log in. For more information on setting up your account, refer to the Genesys documentation.
  2. Make sure that the right permissions are set for enabling and publishing transcripts from Genesys. For more information on setting up the required permissions, refer to Roles and permissions overview.
  3. If PCI and PII encryption is required for transcription, make sure it is set up in Genesys. For more information, refer to Are interaction transcripts encrypted when stored in the cloud.
  4. Set up an AWS account with the appropriate permissions.

Deploy the Genesys EventBridge integration

To enable the EventBridge integration with Genesys Cloud, complete the following steps:

  1. Log in to the Genesys Cloud environment.
  2. Choose Admin, Integrations, Add Integrations, and Amazon EventBridge Source.
  3. On the Configuration tab, provide the following information:
    1. For AWS Account ID, enter your AWS account ID.
    2. For AWS Account Region, enter the Region where you want EventBridge to be set up.
    3. For Event Source Suffix, enter a suffix (for example, genesys-eb-poc-demo).
  4. Save your configuration.
  5. On the EventBridge console, choose Integration in the navigation pane, then choose Partner event sources.

There should be an event source listed with a name like aws.partner/genesys.com/…/genesys-eb-poc-demo.

  6. Select the partner event source and choose Associate with event bus.

The status changes from Pending to Active. This sets up the EventBridge configuration for Genesys.

Next, you set up OAuth2 credentials in Genesys Cloud for authorizing the API call to get the final transcript.

  7. Navigate to the Genesys Cloud instance.
  8. Choose Admin, Integrations, and OAuth.
  9. Choose Add Client.
  10. On the Client Details tab, provide the following information:
    1. For App Name, enter a name (for example, TranscriptInvoke-creds).
    2. For Grant Types, select Client Credentials.

Make sure you’re using the right role that has access to invoke the Transcript API.

  11. Choose Save.

This generates new values for Client ID and Client Secret. Copy these values to use in the next section, where you configure the template for the solution.
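If you want to verify the credentials before deploying, Genesys Cloud’s client credentials grant can be exercised directly. The following sketch uses the requests library; the environment hostname depends on your Genesys Region (usw2.pure.cloud is the example used later in this post).

    import base64
    import requests

    CLIENT_ID = "your-client-id"        # from the OAuth client above
    CLIENT_SECRET = "your-client-secret"
    GENESYS_ENV = "usw2.pure.cloud"     # your organization's environment

    # Genesys Cloud's client credentials flow: the ID/secret pair is
    # sent as HTTP basic auth and exchanged for a bearer token.
    creds = base64.b64encode(
        f"{CLIENT_ID}:{CLIENT_SECRET}".encode()).decode()
    resp = requests.post(
        f"https://login.{GENESYS_ENV}/oauth/token",
        headers={"Authorization": f"Basic {creds}"},
        data={"grant_type": "client_credentials"},
    )
    token = resp.json()["access_token"]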

Deploy the solution

After you have set up the Genesys EventBridge integration, you can deploy an AWS Serverless Application Model (AWS SAM) template, which deploys the remainder of the architecture. To deploy the solution in your account, complete the following steps:

  1. Install the AWS SAM CLI if it isn’t already installed. For instructions, refer to Installing the AWS SAM CLI.
  2. Download the GitHub repo and unzip to your directory.
  3. Navigate to the genesys-to-lex-automated-chatbot-designer folder and run the following commands:
    sam build --use-container
    sam deploy --guided

The first command builds the source of your application. The second command packages and deploys your application to AWS, with a series of prompts:

  • Stack Name – Enter the name of the stack to deploy to AWS CloudFormation. This should be unique to your account and Region; a good starting point is something matching your project name.
  • AWS Region – Enter the Region you want to deploy your app to. Make sure it is deployed in the same Region as the EventBridge event bus.
  • Parameter GenesysBusname – Enter the bus name created when you configured the Genesys integration. The pattern of the bus name should look like aws.partner/genesys.com/*.
  • Parameter ClientId – Enter the client ID you copied earlier.
  • Parameter ClientSecret – Enter the client secret you copied earlier.
  • Parameter FileNamePrefix – Change the default file name prefix for the target transcript file in the raw S3 bucket or keep the default.
  • Parameter GenCloudEnv – Enter the cloud environment for your specific Genesys organization. Genesys is available in more than 15 Regions worldwide as of this writing, so this value is mandatory and should point to the environment where your organization is created in Genesys (for example, usw2.pure.cloud).
  • Confirm changes before deploy – If set to yes, any change sets will be shown to you before deployment for manual review. If set to no, the AWS SAM CLI will automatically deploy application changes.
  • Allow SAM CLI IAM role creation – Many AWS SAM templates, including this example, create AWS Identity and Access Management (IAM) roles that the included Lambda functions need in order to access AWS services. By default, these are scoped down to the minimum required permissions. To deploy a CloudFormation stack that creates or modifies IAM roles, you must provide the CAPABILITY_IAM value for capabilities. If permission isn’t provided through this prompt, you must explicitly pass --capabilities CAPABILITY_IAM to the sam deploy command to deploy this example.
  • Save arguments to samconfig.toml – If set to yes, your choices will be saved to a configuration file inside the project, so that in the future you can rerun sam deploy without parameters to deploy changes to your application.

After you deploy your AWS SAM application in your account, you can test that Genesys transcripts are being sent to your account and being transformed into the required format for the Amazon Lex automated chatbot designer.

Make a test call to validate the solution

After you have set up the Genesys EventBridge integration and deployed the preceding AWS SAM template, you can make test calls and validate that files are ending up in the S3 bucket for transformed files. At a high level, you need to perform the following steps:

  1. Make a test call to your Genesys instance to create a transcript.
  2. Wait a few minutes and check the TransformedTranscript bucket for the output (see the sketch below).
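For step 2, a short boto3 sketch can confirm the transformed files are arriving; the bucket name below is a placeholder, so substitute the one from your deployed stack’s outputs.

    import boto3

    s3 = boto3.client("s3")
    bucket = "transformed-transcripts-bucket"  # placeholder: use your stack's bucket

    # List whatever the transform Lambda has written so far.
    resp = s3.list_objects_v2(Bucket=bucket)
    for obj in resp.get("Contents", []):
        print(obj["Key"], obj["LastModified"])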

Run the automated chatbot designer

After you have a few days’ worth of transcripts saved in Amazon S3, you can run the automated chatbot designer through the Amazon Lex console using the steps in this section. For more information about the minimum and maximum number of turns for the service, refer to Prepare transcripts.

  1. On the Amazon Lex V2 console, choose Bots in the navigation pane.
  2. Choose Create bot.
  3. Select Start with transcripts as the creation method.
  4. Give the bot a name (for this example, InsuranceBot) and provide an optional description.
  5. Select Create a role with basic Amazon Lex permissions and use this as your runtime role.
  6. After you fill out the other fields, choose Next to proceed to the language configuration.
  7. Choose the language and voice for your interaction.
  8. Specify the Amazon S3 location of the transcripts that the solution has converted for you.
  9. Add additional local paths if you have a specific folder structure within your S3 bucket.
  10. Apply a filter (date range) for your input transcripts.
  11. Choose Done.

You can use the status bar on the Amazon Lex console to track the analysis. Within a few hours, the automated chatbot designer surfaces a chatbot design that includes user intents, sample phrases associated with those intents, and a list of all the information required to fulfill them. The amount of time it takes to complete the analysis depends on several factors, including the volume of transcripts and the complexity of the conversations. Typically, 600 lines of transcript are analyzed every minute.

  12. Choose Review to view the intents and slot types discovered by the automated chatbot designer.

The Intents tab lists all the intents along with sample phrases and slots, and the Slot types tab provides a list of all the slot types along with slot type values.

  13. Choose any of the intents to review the sample utterances and slots. For example, choose ChangePassword to view its utterances.
  14. Choose the Associated transcripts tab to review the conversations used to identify the intents.
  15. After you review the results, select the intents and slot types relevant to your use case and choose Add.

This adds the selected intents and slot types to the bot. You can now iterate on this design by making changes such as adding prompts, merging intents or slot types, and renaming slots.

You have now used the Amazon Lex automated chatbot designer to identify common intents, utterances mapped to those intents, and information that the chatbot needs to collect to fulfill certain business functions.

Clean up

When you’re finished, clean up your resources by using the following command within the AWS SAM CLI:

sam delete

Conclusion

This post showed you how to use the Genesys Cloud CX and EventBridge integration to send your Genesys CX transcripts to your AWS account, transform them, and use them with the Amazon Lex automated chatbot designer to create sample bots, intents, utterances, and slots. This architecture can help both first-time and current AWS CCI users onboard more chatbots using the Genesys CX and Amazon Lex integration, and it supports continuous improvement efforts where you want to compare your current intent design with the one produced by the Amazon Lex automated chatbot designer. For more information about other AWS CCI capabilities, see Contact Center Intelligence.


About the Authors

Joe Morotti is a Solutions Architect at Amazon Web Services (AWS), helping Enterprise customers across the Midwest US. He has held a wide range of technical roles and enjoys showing customers the art of the possible. In his free time, he enjoys spending quality time with his family, exploring new places, and overanalyzing his sports teams’ performance.

Anand Bose is a Senior Solutions Architect at Amazon Web Services, supporting ISV partners who build business applications on AWS. He is passionate about creating differentiated solutions that unlock cloud adoption for customers. Anand lives in Dallas, Texas and enjoys travelling.

Teri Ferris is responsible for architecting great customer experiences alongside business partners, leveraging Genesys technology solutions that enable Experience Orchestration for contact centers. In her role she advises on solution architecture, integrations, IVR, routing, reporting analytics, self-service, AI, outbound, mobile capabilities, omnichannel, social channels, digital, unified communications (UCaaS), and analytics, and how they can streamline the customer experience. Before Genesys, she held senior leadership roles at human resources, payroll, and learning management companies, including overseeing contact centers.

Read More