How AWS Sales uses generative AI to streamline account planning

Every year, AWS Sales personnel draft in-depth, forward-looking strategy documents for established AWS customers. These documents help the AWS Sales team to align with our customer growth strategy and to collaborate with the entire sales team on long-term growth ideas for AWS customers. These documents are internally called account plans (APs). In 2024, this activity took an account manager (AM) up to 40 hours per customer. This, combined with the similar time that support roles spent researching and writing growth plans for customers on the AWS Cloud, led to significant organizational overhead. To help improve this process, in October 2024 we launched an AI-powered account planning draft assistant for our sales teams, building on the success of Field Advisor, an internal sales assistant tool. This new capability uses Amazon Bedrock to help our sales teams create comprehensive and insightful APs in less time. Since its launch, thousands of sales teams have used the resulting generative AI-powered assistant to draft sections of their APs, saving time on each AP created.

In this post, we showcase how the AWS Sales product team built the generative AI account plans draft assistant.

Business use cases

The account plans draft assistant serves four primary use cases:

  • Account plan draft generation: Using Amazon Bedrock, we’ve made internal and external data sources available to generate draft content for key sections of the APs. This enables our sales teams to quickly create initial drafts for sections such as customer overviews, industry analysis, and business priorities, which previously required hours of research across the internet and relied on disparate internal AWS tools.
  • Data synthesis: The assistant can pull relevant information from multiple sources, including our customer relationship management (CRM) system, financial reports, news articles, and previous APs, to provide a holistic view of our customers.
  • Quality checks: Built-in quality assurance capabilities help ensure that APs meet internal standards for comprehensiveness, accuracy, and strategic alignment with our customers and business.
  • Customization: While providing AI-generated drafts, the product allows AMs to customize and refine the content by uploading proprietary documents to match their unique customer knowledge and strategic approach.

The account plans draft assistant loads when a user tries to create an AP, and users copy and paste each section they want to use into their final plan.

Account plans draft assistant UX

Our AMs report reduced time to write these documents, allowing them to focus more on high-value activities such as customer engagement and strategy development.

Here’s what some of our AMs had to say about their experience with the account plans draft assistant:

“The AI assistant saved me at least 15 hours on my latest enterprise account plan. It pulled together a great first draft, which I was then able to refine based on my own insights. This allowed me to spend more time actually engaging with my customer rather than doing research and writing.”

– Enterprise Account Manager

“As someone managing multiple mid-market accounts, I struggled to create in-depth plans for all my customers. The AI assistant now helps me rapidly generate baseline plans that I can then prioritize and customize. It’s a game-changer for serving my full portfolio of accounts.”

– Mid-market Account Manager

Amazon Q, Amazon Bedrock, and other AWS services underpin this experience, enabling us to use large language models (LLMs) and knowledge bases (KBs) to generate relevant, data-driven content for APs. Let’s explore how we built this AI assistant and some of our future plans.

Building the account plans draft assistant

When a user of the AWS internal CRM system initiates the workflow in Field Advisor, it triggers the account plans draft assistant capability through a pre-signed URL. The assistant then orchestrates a multi-source data collection process, performing web searches while also pulling account metadata from OpenSearch, Amazon DynamoDB, and Amazon Simple Storage Service (Amazon S3). After analyzing and combining this data with user-uploaded documents, the assistant uses Amazon Bedrock to generate the AP. When complete, a notification chain using Amazon Simple Queue Service (Amazon SQS) and our internal notifications service API gateway begins delivering updates through Slack direct messages and storing searchable records in OpenSearch for future reference.
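To make the notification step concrete, here is a minimal Python sketch of how a worker might publish a completion event to Amazon SQS once an AP draft has been generated, so that downstream consumers can send the Slack direct message and index a searchable record in OpenSearch. The queue URL, payload fields, and function name are hypothetical; the post does not describe the internal message contract.

```python
import json

import boto3

sqs = boto3.client("sqs")

# Hypothetical queue URL; the actual internal notification queue is not named in the post.
NOTIFICATION_QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/ap-notifications"


def notify_ap_ready(job_id: str, account_id: str, sections: list[str]) -> None:
    """Publish a completion event so downstream consumers can send a Slack DM
    and store a searchable record in OpenSearch."""
    message = {
        "jobId": job_id,
        "accountId": account_id,
        "sections": sections,
        "status": "COMPLETE",
    }
    sqs.send_message(
        QueueUrl=NOTIFICATION_QUEUE_URL,
        MessageBody=json.dumps(message),
    )
```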

The following diagram illustrates the high-level architecture of the account plans draft assistant.

Solution overview

We built the account plans draft assistant using the following key components:

  1. Amazon Bedrock: Provides programmatic (API) access to high-performing foundation models (FMs), along with vector search capabilities and metadata filtering through Amazon Bedrock Knowledge Bases. We populate an Amazon Bedrock knowledge base with sales enablement materials, historic APs, and other relevant documents curated by AWS Glue jobs (see item 4 for more on AWS Glue jobs).
  2. AWS Lambda: Supports two use cases (see the sketch after this list):
    1. The async resolver Lambda function interfaces with the front-end CRM client and issues async job IDs for the client to poll. This layer also handles input validation, user request throttling, and cache management.
    2. Worker Lambda functions perform the actual heavy lifting to create AP content. These functions work concurrently to generate different sections of APs using publicly available data, internal data, and curated data in Amazon Bedrock knowledge bases. They invoke various LLMs through Amazon Bedrock and store the final content in DynamoDB against the corresponding async job ID.
  3. DynamoDB: Maintains the state of each user request by tracking async job IDs, tracks throttling quota (global request count and per-user request count), and acts as a cache.
  4. AWS Glue jobs: Curate and transform data from various internal and external data sources. These AWS Glue jobs push data to internal data sources (APs, internal tooling team S3 buckets, and other internal services) and to Amazon Bedrock knowledge bases, facilitating high-quality output through Retrieval Augmented Generation (RAG).
  5. Amazon SQS: Enables us to decouple the management plane and data plane. This decoupling is crucial in allowing the data plane worker functions to concurrently process different sections of the APs and make sure that we can generate APs within specified times.
  6. Custom web frontend: A ReactJS-based micro-frontend architecture enables us to integrate directly into our CRM system for a seamless user experience.
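As a rough illustration of the async pattern described in item 2, the following Python sketch shows a resolver Lambda function that registers a job in DynamoDB, fans out one SQS message per AP section for the worker functions, and returns a job ID for the CRM client to poll. Table names, queue URLs, section names, and handler signatures are assumptions for illustration, not the internal implementation.

```python
import json
import uuid

import boto3

dynamodb = boto3.resource("dynamodb")
sqs = boto3.client("sqs")

# Hypothetical resource names; the internal table and queue are not named in the post.
jobs_table = dynamodb.Table("ap-draft-jobs")
SECTION_QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/ap-section-tasks"


def resolver_handler(event, context):
    """Async resolver: register a job, fan out one SQS task per AP section,
    and return a job ID that the CRM client polls."""
    job_id = str(uuid.uuid4())
    sections = event.get("sections", ["customer_overview", "industry_analysis"])

    jobs_table.put_item(Item={"jobId": job_id, "status": "IN_PROGRESS"})

    for section in sections:
        sqs.send_message(
            QueueUrl=SECTION_QUEUE_URL,
            MessageBody=json.dumps(
                {"jobId": job_id, "section": section, "accountId": event["accountId"]}
            ),
        )
    return {"jobId": job_id}


def status_handler(event, context):
    """Polled by the client to check whether the worker functions have finished."""
    item = jobs_table.get_item(Key={"jobId": event["jobId"]}).get("Item", {})
    return {"status": item.get("status", "UNKNOWN")}
```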

Data management

Our account plans draft assistant uses the out-of-the-box knowledge base management capabilities of Amazon Bedrock. Through its RAG architecture, we use semantic search and metadata filtering to retrieve relevant context from diverse sources: internal sales enablement materials, historic APs, SEC filings, news articles, executive engagements, and data from our CRM systems. The connectors built into Amazon Bedrock handle data ingestion from Amazon S3, relational database management systems (RDBMS), and third-party APIs, while its knowledge base capabilities enable us to filter and prioritize source documents when generating responses. This context-aware approach results in higher quality and more relevant content in our generated AP sections.
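The following minimal sketch shows what such a filtered retrieval call could look like with the Amazon Bedrock Knowledge Bases runtime API (boto3 `bedrock-agent-runtime`). The knowledge base ID and the `account_id` metadata attribute are hypothetical; the real identifiers and filter schema are internal.

```python
import boto3

bedrock_agent_runtime = boto3.client("bedrock-agent-runtime")

# Hypothetical knowledge base ID and metadata attribute.
KNOWLEDGE_BASE_ID = "EXAMPLEKBID"


def retrieve_account_context(query: str, account_id: str) -> list[str]:
    """Semantic search over the knowledge base, filtered to documents
    tagged with the target account."""
    response = bedrock_agent_runtime.retrieve(
        knowledgeBaseId=KNOWLEDGE_BASE_ID,
        retrievalQuery={"text": query},
        retrievalConfiguration={
            "vectorSearchConfiguration": {
                "numberOfResults": 5,
                "filter": {"equals": {"key": "account_id", "value": account_id}},
            }
        },
    )
    return [result["content"]["text"] for result in response["retrievalResults"]]
```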

Security and compliance

Security and compliance are paramount at AWS when dealing with data about our customers. We use AWS IAM Identity Center for enterprise single sign-on so that only authorized users can access the account plans draft assistant. Through Field Advisor, we apply various internal authorization mechanisms to help ensure that a user who is generating APs only accesses the data that they already have access to.

User experience

We built a custom web frontend using a micro-frontend approach that integrates directly into our CRM system, allowing AMs to access the account plans draft assistant without leaving their familiar work environment. The interface allows users to select which sections of APs they want to generate, provides options for customization, and notifies users to create their APs on time through Slack.

Looking ahead

While the account plans draft assistant has already demonstrated significant value, we’re continuing to enhance its capabilities. Our goal is to create a zero-touch account planner that sales teams can use to generate a full AP for a customer, incorporating best practices observed across our customers to provide sales teams with best-in-class strategies for engaging with customers. This would include:

  • Deeper integration with our purpose-built planning tools and assistance with account planning, such as automatically generating value maps and stakeholder maps.
  • Enhanced personalization to tailor content based on industry, account size, and individual user preferences.
  • Improved collaboration features, so that multiple sales team members can work together on refining AI-generated plans.
  • Expanded use of recommendations to provide “what next?” ideas to our sales teams to better serve our customers.

Conclusion

The account plans draft assistant, powered by Amazon Bedrock, has significantly streamlined our AP process, allowing our AWS Sales teams to create higher quality APs in a fraction of the time they previously needed. As we continue to refine and expand this capability, we’re excited to see how it will further enhance our ability to serve our customers and drive their success in the AWS Cloud.

If you’re interested in learning how generative AI can transform your sales function and its processes, reach out to your AWS account team to discuss how services such as Amazon Q and Amazon Bedrock can help you build similar solutions for your organization.


About the Authors

Saksham Kakar is a Sr. Product Manager (Technical) in the AWS Field Experiences (AFX) organization focused on developing products that enable AWS Sales teams to help AWS customers grow with Amazon. Prior to this, Saksham led large sales, strategy and operations teams across startups and Fortune 500 companies. Outside of work, he is an avid tennis player and amateur skier.

Vimanyu Aggarwal is a Senior Software Engineer in AWS Field Experiences (AFX) organization with over 10 years of industry experience. Over the last decade, Vimanyu has been focusing on building large-scale, complex distributed systems at various Fortune 500 organizations. Currently, he works with multiple teams within the AFX organization to deliver technical solutions that empower the $100 billion sales funnel. Outside of work, he likes to play board games, tinker with IoT, and explore nature.

Krishnachand Velaga is a Senior Manager for Product Management – Technical (PM-T) in the AWS Field Experiences (AFX) organization. He manages a team of seasoned PM-Ts and a suite of sales products that use generative AI to enable the AWS Sales organization to help AWS customers across the globe adopt, migrate, and grow on the AWS Cloud in line with their business needs and outcomes, while bolstering sales efficiency and productivity and reducing operational cost.

Scott Wilkinson is a Software Development Manager in the AWS Field Experiences (AFX) organization, where he leads a cross-functional engineering team developing tools that aggregate and productize data to power AWS customer insights. Prior to AWS, Scott worked for notable startups including Digg, eHarmony, and Nasty Gal in both leadership and software development roles. Outside of work, Scott is a musician (guitar and piano) and loves to cook French cuisine.

Read More

Shaping the future: OMRON’s data-driven journey with AWS

This post is co-written with Emrah Kaya and Xinyi Zhou from Omron Europe.

Data is one of the most critical assets of many organizations, which are constantly seeking ways to use their vast amounts of information to gain competitive advantages.

OMRON Corporation is a leading technology provider in industrial automation, healthcare, and electronic components. In their Shaping the Future 2030 (SF2030) strategic plan, OMRON aims to address diverse social issues, drive sustainable business growth, transform business models and capabilities, and accelerate digital transformation. At the heart of this transformation is the OMRON Data & Analytics Platform (ODAP), an innovative initiative designed to revolutionize how the company harnesses its data assets.

This post explores how OMRON Europe is using Amazon Web Services (AWS) to build its advanced ODAP and its progress toward harnessing the power of generative AI.

Challenges

By using advanced data and analytics capabilities, organizations can gain valuable insights into their operations, industry trends, and customer behaviors, leading to more informed strategies and decisions. This approach is particularly powerful when applied to mission-critical data such as enterprise resource planning (ERP) and customer relationship management (CRM) systems because these contain information about internal processes, supply chain management, and customer interactions. By analyzing their data, organizations can identify patterns in sales cycles, optimize inventory management, or help tailor products or services to meet customer needs more effectively. However, organizations often face significant challenges in realizing these benefits because of:

  • Data silos – Organizations often use multiple systems across regions or departments. Integrating these diverse sources to create a single source of truth is complex, making it difficult to generate unified reports or analyze cross-functional trends.
  • Data governance challenges – Maintaining consistent data governance across different systems is crucial but complex. Implementing uniform policies across different systems and departments presents significant hurdles.
  • Different formats and standards – Systems typically use varied data formats and structures. This disparity complicates data integration and cross-system analysis, requiring significant effort to reconcile and harmonize data for comprehensive insights.

OMRON Data & Analytics Platform

To address these challenges, OMRON Europe (hereinafter “OMRON”) decided to implement an advanced data and analytics platform, ODAP. This innovative solution was designed to serve as a centralized hub for specific data assets, breaking down the barriers between various data sources and systems.

The following diagram shows a simplified architecture and some of the services and architectural patterns used for ODAP.

ODAP aimed to seamlessly integrate data from multiple ERP and CRM systems in addition to other relevant data sources across the organization. Amazon AppFlow was used to facilitate the smooth and secure transfer of data from various sources into ODAP. Additionally, Amazon Simple Storage Service (Amazon S3) served as the central data lake, providing a scalable and cost-effective storage solution for the diverse data types collected from different systems. The robust security features provided by Amazon S3, including encryption and durability, were used to provide data protection. Finally, ODAP was designed to incorporate cutting-edge analytics tools and future AI-powered insights.

Some of these tools included AWS Cloud-based solutions, such as AWS Lambda and AWS Step Functions. Lambda enables serverless, event-driven data processing tasks, allowing for real-time transformations and calculations as data arrives. Step Functions complements this by orchestrating complex workflows, coordinating multiple Lambda functions, and managing error handling for sophisticated data processing pipelines. This enables OMRON to extract meaningful patterns and trends from its vast data repositories, supporting more informed decision-making at all levels of the organization.
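As a simplified sketch of the event-driven processing described above, the following Python Lambda handler normalizes a raw file when it lands in the data lake and writes the result to a curated prefix. Bucket names, the file format, and the transformation itself are assumptions for illustration; they are not details published by OMRON.

```python
import json
import urllib.parse

import boto3

s3 = boto3.client("s3")

# Hypothetical curated bucket name.
CURATED_BUCKET = "odap-curated-zone"


def handler(event, context):
    """Triggered by an S3 event when a raw newline-delimited JSON file arrives;
    normalizes field names and writes the result to a curated location."""
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])

        raw = s3.get_object(Bucket=bucket, Key=key)["Body"].read().decode("utf-8")
        rows = [json.loads(line) for line in raw.splitlines() if line.strip()]

        # Example normalization: lowercase field names so downstream queries
        # see a consistent schema across source systems.
        normalized = [{k.lower(): v for k, v in row.items()} for row in rows]

        s3.put_object(
            Bucket=CURATED_BUCKET,
            Key=f"curated/{key}",
            Body="\n".join(json.dumps(r) for r in normalized).encode("utf-8"),
        )
```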

OMRON’s data strategy, embodied in ODAP, also allowed the organization to unlock generative AI use cases focused on tangible business outcomes and enhanced productivity. A comprehensive approach to using artificial intelligence and machine learning (AI/ML) and generative AI includes a strong data strategy that can help provide high-quality and reliable data.

Embracing generative AI with Amazon Bedrock

The company has identified several use cases where generative AI can significantly impact operations, particularly in analytics and business intelligence (BI).

One key initiative is ODAPChat, an AI-powered chat-based assistant employees can use to interact with data using natural language queries. This tool democratizes data access across the organization, enabling even nontechnical users to gain valuable insights.

A standout application is the SQL-to-natural language capability, which translates complex SQL queries into plain English and vice versa, bridging the gap between technical and business teams. To power these advanced AI features, OMRON chose Amazon Bedrock. This fully managed service offers a range of foundation models (FMs), providing the flexibility to select the most suitable model for each use case. The straightforward implementation of Amazon Bedrock, coupled with its scalability to handle growing data volumes and user requests, made it an ideal choice for OMRON. The ability of Amazon Bedrock to support various models from different providers helps make sure that OMRON can always use the most advanced AI capabilities as they evolve.
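To illustrate that flexibility, here is a minimal sketch of an SQL-to-natural-language call using the Amazon Bedrock Converse API, where trying a different provider’s model is a one-line change to `model_id`. The model ID, prompt, and function name are illustrative assumptions rather than OMRON’s implementation.

```python
import boto3

bedrock_runtime = boto3.client("bedrock-runtime")


def explain_sql(sql: str, model_id: str = "anthropic.claude-3-haiku-20240307-v1:0") -> str:
    """Ask a foundation model to explain a SQL query in plain English.
    Swapping model_id is all it takes to use a different model on Amazon Bedrock."""
    response = bedrock_runtime.converse(
        modelId=model_id,
        messages=[
            {
                "role": "user",
                "content": [{"text": f"Explain what this SQL query does in plain English:\n{sql}"}],
            }
        ],
        inferenceConfig={"maxTokens": 300, "temperature": 0.2},
    )
    return response["output"]["message"]["content"][0]["text"]
```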

Crucially, the robust security features provided by Amazon Bedrock align perfectly with OMRON’s stringent data protection requirements. Some highlights include:

  • Fine-grained access controls
  • Networking security features such as encryption of data in transit and at rest, or the ability to use private virtual private clouds (VPCs), helping to make sure that sensitive business data remains secure even when being processed by AI models
  • Amazon Bedrock Guardrails

These strict security controls offer a comprehensive security approach that allows OMRON to innovate with AI while maintaining the highest standards of data governance and protection.

The following diagram shows a basic layout of how the solution works. It helps illustrate the main parts and how they work together to make the AI assistant do its job.

The system has three main sections:

  • User interface – Users engage with the chat interface hosted on AWS. Amazon Cognito handles the user authentication processes, providing secure access to the application.
  • Input processing backend – Amazon API Gateway receives incoming messages, which are then processed by containers running on Amazon Elastic Container Service (Amazon ECS). Chat conversations are preserved in Amazon DynamoDB for use in follow-up conversations. Amazon Bedrock takes care of generating AI responses, and tools are configured using LangChain, which helps determine how to handle different types of queries. When needed, the system can access an ODAP data warehouse to retrieve additional information.
  • Document management – Documents are securely stored in Amazon S3, and when new documents are added, a Lambda function processes them into chunks. These chunks are converted into embeddings using Amazon Bedrock, and the embeddings are stored in an Amazon OpenSearch Service vector store for semantic search (see the sketch after this list).
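The following sketch outlines the document-management flow under stated assumptions: a fixed-size chunking strategy, the Amazon Titan text embeddings model, and an OpenSearch client with a placeholder endpoint and index name (authentication and index mapping omitted for brevity). None of these specifics come from the post.

```python
import json

import boto3
from opensearchpy import OpenSearch

bedrock_runtime = boto3.client("bedrock-runtime")

# Placeholder endpoint and index name; authentication is omitted for brevity.
opensearch = OpenSearch(
    hosts=[{"host": "search-odapchat.example.com", "port": 443}], use_ssl=True
)
INDEX = "odap-documents"


def embed(text: str) -> list[float]:
    """Create an embedding for one chunk using an Amazon Bedrock embeddings model."""
    response = bedrock_runtime.invoke_model(
        modelId="amazon.titan-embed-text-v2:0",
        body=json.dumps({"inputText": text}),
    )
    return json.loads(response["body"].read())["embedding"]


def index_document(doc_id: str, text: str, chunk_size: int = 1000) -> None:
    """Split a document into chunks, embed each chunk, and store it in the
    OpenSearch vector index for semantic search."""
    chunks = [text[i : i + chunk_size] for i in range(0, len(text), chunk_size)]
    for n, chunk in enumerate(chunks):
        opensearch.index(
            index=INDEX,
            id=f"{doc_id}-{n}",
            body={"text": chunk, "embedding": embed(chunk)},
        )
```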

Results and future plans

The implementation of ODAP and ODAPChat on AWS has already yielded significant benefits for OMRON:

  • Optimization of reports, leading to more efficient and insightful analysis
  • SQL-to-natural language capabilities powered by generative AI, making data more accessible to nontechnical users
  • Increased business agility with infrastructure fully deployed in the cloud
  • Data democratization, enabling more employees to use data-driven insights

Looking ahead, OMRON plans to significantly expand its use of AWS services and further use generative AI capabilities. The company aims to integrate additional data sources, including other mission-critical systems, into ODAP. This expansion will be coupled with enhanced data governance measures to help promote data quality and compliance across the growing data solution.

OMRON is also exploring more advanced generative AI use cases. These initiatives will use the evolving capabilities provided by Amazon Bedrock to potentially incorporate advanced AI models and security features.

Conclusion

OMRON’s journey with AWS demonstrates the transformative power of cloud-based data solutions and generative AI in overcoming data silos and driving business innovation. By using AWS services such as Amazon AppFlow, Amazon S3, and Amazon Bedrock, OMRON has created a comprehensive, secure, and adaptable data and analytics platform that not only meets its current needs, but also positions the company for future growth and innovation.

As organizations across industries grapple with similar data challenges, OMRON’s story serves as an inspiring example of how embracing cloud technologies and AI can lead to significant business transformation and competitive advantage.


About the Authors

Emrah Kaya is Data Engineering Manager at Omron Europe and Platform Lead for the ODAP project. With his extensive background in cloud and data architecture, Emrah leads key technological advancement initiatives at OMRON, including artificial intelligence, machine learning, and data science.

Xinyi Zhou is a Data Engineer at Omron Europe, bringing her expertise to the ODAP team led by Emrah Kaya. She specializes in building efficient data pipelines and managing AWS infrastructure, while actively contributing to the implementation of new solutions that advance ODAP’s technological capabilities.

Emel Mendoza is a Senior Solutions Architect at AWS based in the Netherlands. With a passion for cloud migrations and application modernization, Emel helps organizations navigate their digital transformation journeys on AWS. Emel leverages his decade of experience to guide customers in adopting AWS services and architecting scalable, efficient solutions.

Jagdeep Singh Soni is a Senior Partner Solutions Architect at AWS based in the Netherlands. He uses his passion for Generative AI to help customers and partners build GenAI applications using AWS services. Jagdeep has 15 years of experience in innovation, experience engineering, digital transformation, cloud architecture and ML applications.

Read More

AI Workforce: using AI and Drones to simplify infrastructure inspections

Inspecting wind turbines, power lines, 5G towers, and pipelines is a tough job. It’s often dangerous, time-consuming, and prone to human error. That’s why we at Amazon Web Services (AWS) are working on AI Workforce—a system that uses drones and AI to make these inspections safer, faster, and more accurate.

This post is the first in a three-part series exploring AI Workforce, the AWS AI-powered drone inspection system. In this post, we introduce the concept and key benefits. The second post dives into the AWS architecture that powers AI Workforce, and the third focuses on the drone setup and integration.

In the following sections, we explain how AI Workforce enables asset owners, maintenance teams, and operations managers in industries such as energy and telecommunications to enhance safety, reduce costs, and improve efficiency in infrastructure inspections.

Challenges with traditional inspections

Inspecting infrastructure using traditional methods is a challenge. You need trained people and specialized equipment, and you often must shut things down during inspection. As an example, climbing a wind turbine in bad weather for an inspection can be dangerous. Plus, even the best human inspector can miss things. This can lead to bigger problems down the line, costing time and money.

Technicians inspecting wind turbine blades overlooking landscape.

How AI Workforce helps

AI Workforce is designed to change all that. We use autonomous drones equipped with advanced sensors and AI to do the inspections. This brings the following benefits:

  • Less risk for people – Drones do the dangerous work so people don’t have to. This makes inspections much safer.
  • Faster and more efficient – Drones can cover a lot of ground quickly, getting the job done faster.
  • Better data – Automated data collection and analysis means fewer mistakes and more consistent results. This allows for proactive maintenance.

What does AI Workforce look like in action? Users interact with a simple AI assistant and dashboard that displays near real-time drone inspections, detected issues, and AI-generated insights. The following figure shows an example of the user dashboard and drone conversation.

AIW user interface

The following figure is an example of drone 4K footage.

Solution overview

AI Workforce is built on a robust and scalable architecture using a wide array of AWS services. Security is paramount, and we adhere to AWS best practices across the layers. This includes:

  • Amazon API Gateway manages secure communication between various components, enforcing authentication and authorization
  • AWS Identity and Access Management (IAM) roles and policies verify least privilege access, limiting each component’s permissions to only what is necessary
  • Network security is implemented through virtual private clouds (VPCs), security groups, and network access control lists (ACLs), isolating the system and protecting it from unauthorized access
  • For video processing, we employ secure transfer protocols and encryption at rest and in transit

AI Workforce provides a robust API for managing drone operations, including flight planning, telemetry data, and anomaly detection. The following diagram outlines how different components interact.

Imagine a system where drones autonomously inspect critical infrastructure, capturing high-resolution video, analyzing potential defects with AI, and seamlessly integrating findings into business workflows. The AI Workforce architecture brings this vision to life, using AWS services across four key pillars.

Control plane: Secure drone communication and operations

Our journey begins with automated drone flights. Each drone follows predefined routes, with flight waypoints, altitude, and speed configured through an AWS API, using coordinates stored in Amazon DynamoDB. Once airborne, AWS IoT Core enables secure, bidirectional communication—allowing drones to receive real-time commands (like “take-off”, “begin flight ID = xxx”, or “land”), adjust flight paths, and transmit telemetry data back to AWS. To maintain robust security, AWS Lambda responds to Internet of Things (IoT) events, enabling immediate actions based on drone data, while Amazon GuardDuty continuously monitors for anomalies or potential security threats, such as unusual API activity or unauthorized access attempts, helping protect the integrity of drone operations and promoting secure operations.

In AI Workforce, AWS IoT Core serves as the primary entry point for real-time drone communication, handling telemetry data, command and control messaging, and secure bidirectional communication with drones. API Gateway plays a complementary role by acting as the main entry point for external applications, dashboards, and enterprise integrations. It is responsible for managing RESTful API calls related to flight planning, retrieving inspection results, and interacting with backend services like Amazon Relational Database Service (Amazon RDS) and AWS Step Functions. While drones communicate directly with AWS IoT Core, user-facing applications and automation workflows rely on API Gateway to access structured data and trigger specific actions within the AI Workforce ecosystem.
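As a rough sketch of the command path, the following Python snippet publishes a command to a drone over AWS IoT Core using the boto3 `iot-data` client. The MQTT topic naming scheme and payload shape are assumptions for illustration; they are not specified in the post.

```python
import json

import boto3

iot_data = boto3.client("iot-data")


def send_drone_command(drone_id: str, command: str, flight_id: str | None = None) -> None:
    """Publish a command such as "take-off", "begin flight", or "land" to a
    drone's command topic over AWS IoT Core."""
    payload = {"command": command}
    if flight_id:
        payload["flightId"] = flight_id
    iot_data.publish(
        topic=f"drones/{drone_id}/commands",  # hypothetical topic scheme
        qos=1,
        payload=json.dumps(payload),
    )
```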

AI/ML and generative AI: Computer vision and intelligent insights

As drones capture video footage, raw data is processed through AI-powered models running on Amazon Elastic Compute Cloud (Amazon EC2) instances. These computer vision models detect anomalies, classify damage types, and extract actionable insights—whether it’s spotting cracks on wind turbines or identifying corrosion on pipelines. Amazon SageMaker AI is at the core of our machine learning (ML) pipeline, training and deploying models for object detection, anomaly detection, and predictive maintenance.

We are also pioneering generative AI with Amazon Bedrock, enhancing our system’s intelligence. With natural language interactions, asset owners can ask questions like “What were the most critical defects detected last week?” and Amazon Bedrock generates structured reports based on inspection findings. It even aids in synthetic training data generation, refining our ML models for improved accuracy.

Data layer: Storing and managing inspection data

Every inspection generates vast amounts of data—high-resolution images, videos, and sensor readings. This information is securely stored in Amazon Simple Storage Service (Amazon S3), promoting durability and ease of access. Amazon S3 encrypts data at rest by default using server-side encryption (SSE), providing an additional layer of security without requiring manual configuration. Meanwhile, structured metadata and processed results are housed in Amazon RDS, enabling fast queries and integration with enterprise applications. Together, these services create a resilient data foundation, supporting both real-time analysis and historical trend monitoring.

Analytics and business: Automated workflows and business intelligence

Insights don’t stop at data collection—Step Functions orchestrates workflows that trigger automated actions. For example, if an AI model detects a critical defect, Step Functions can initiate a maintenance request in SAP, notify engineers, and schedule repairs without human intervention.
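A minimal sketch of that trigger, assuming a hypothetical state machine ARN and event shape, might look like the following; the workflow itself (creating the SAP work order, notifying engineers, and scheduling repairs) would be defined in Step Functions.

```python
import json

import boto3

sfn = boto3.client("stepfunctions")

# Hypothetical state machine ARN.
STATE_MACHINE_ARN = "arn:aws:states:eu-west-1:123456789012:stateMachine:defect-response"


def on_defect_detected(asset_id: str, defect_type: str, severity: str) -> str:
    """Start the automated maintenance workflow when the AI model flags a defect."""
    execution = sfn.start_execution(
        stateMachineArn=STATE_MACHINE_ARN,
        input=json.dumps(
            {"assetId": asset_id, "defectType": defect_type, "severity": severity}
        ),
    )
    return execution["executionArn"]
```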

For deeper analysis, Amazon QuickSight transforms raw inspection data into interactive dashboards, helping asset owners track infrastructure health, spot trends, and optimize maintenance strategies. With a clear visual representation of defects, decision-makers can act swiftly, minimizing downtime and maximizing operational efficiency.

The future of AI Workforce: Expanding drone capabilities

Beyond inspections, AI Workforce provides a robust Drone API, offering seamless integration for third-party applications. This API enables remote flight planning, telemetry monitoring, and anomaly detection—all within a scalable AWS environment.

With secure drone communication, powerful AI-driven insights, a robust data foundation, and business automation, AI Workforce is redefining infrastructure inspection, making it smarter, faster, and more efficient than ever before.

Benefits and impact on business operations

The deployment of AI Workforce delivers a wide range of tangible benefits for organizations managing critical infrastructure, particularly in the energy and telecommunications sectors. For example, it can automatically compare multiple inspections over time to detect longitudinal changes and identify progressive failures for proactive maintenance. Key benefits include:

  • Significant cost savings – By reducing the need for human labor, specialized equipment, and extensive logistical planning, AI Workforce can significantly lower inspection costs. Proactive maintenance based on early defect detection also prevents costly repairs and unplanned downtime.
  • Dramatically enhanced safety – Removing human personnel from hazardous environments drastically reduces the risk of accidents and injuries, creating a safer working environment.
  • Substantially improved efficiency – Automated drone inspections are significantly faster and more efficient than traditional methods, enabling more frequent inspections and faster turnaround times.
  • Data-driven decision-making – AI Workforce provides asset owners with comprehensive and accurate data, enabling them to make informed decisions about maintenance, repairs, and asset management.

Example AI Workforce use case in the industry sector

Picture an energy company responsible for maintaining a large wind farm. They deploy AI Workforce drones for regular inspections. The drones, autonomously navigating preprogrammed flight paths defined by coordinates stored in DynamoDB and controlled through REST API calls, are securely connected using AWS IoT Core.

During the flight, sensor data is processed at the edge and streamed to Amazon S3, with metadata stored in Amazon RDS. Computer vision algorithms analyze the video in real time. If an anomaly is detected, a Lambda function triggers a Step Functions workflow, which in turn interacts with their SAP system to generate a maintenance work order. Inspection data is aggregated and visualized in QuickSight dashboards, providing a comprehensive overview of the wind farm’s health.

SageMaker AI models analyze the data, predicting potential failures and informing proactive maintenance strategies. In the future, Amazon Bedrock might provide summarized reports and generate synthetic data to further enhance the system’s capabilities.

Conclusion

At AWS, we’re committed to driving innovation in AI-powered solutions for a wide range of industries. AI Workforce is a prime example of how we’re using cutting-edge technologies to transform how critical infrastructure is managed and maintained.

We’re building this workforce to help businesses operate more efficiently and safely. We’re open to collaborating with others who are interested in this space. If you’d like to learn more, feel free to reach out. We welcome the opportunity to discuss your specific needs and explore potential collaborations.


About the Author

Miguel Muñoz de Rivera González is the original designer and technical lead for the AI Workforce initiative at AWS, driving AI-powered drone solutions for safer, smarter, and cost-effective infrastructure inspections.

Read More

From Browsing to Buying: How AI Agents Enhance Online Shopping

Editor’s note: This post is part of the AI On blog series, which explores the latest techniques and real-world applications of agentic AI, chatbots and copilots. The series also highlights the NVIDIA software and hardware powering advanced AI agents, which form the foundation of AI query engines that gather insights and perform tasks to transform everyday experiences and reshape industries.

Online shopping puts a world of choices at people’s fingertips, making it convenient for them to purchase and receive orders — all from the comfort of their homes.

But too many choices can turn experiences from exciting to exhausting, leaving shoppers struggling to cut through the noise and find exactly what they need.

By tapping into AI agents, retailers can deepen their customer engagement, enhance their offerings and maintain a competitive edge in a rapidly shifting digital marketplace.

Every digital interaction results in new data being captured. This valuable customer data can be used to fuel generative AI and agentic AI tools that provide personalized recommendations and boost online sales. According to NVIDIA’s latest State of AI in Retail and Consumer-Packaged Goods report, 64% of respondents investing in AI for digital retail are prioritizing hyper-personalized recommendations.

Smart, Seamless and Personalized: The Future of Customer Experience

AI agents offer a range of benefits that significantly improve the retail customer experience, including:

  • Personalized Experiences: Using customer insights and product information, these digital assistants can deliver the expertise of a company’s best sales associate, stylist or designer — providing tailored product recommendations, enhancing decision-making, and boosting conversion rates and customer satisfaction.
  • Product Knowledge: AI agents enrich product catalogs with explanatory titles, enhanced descriptions and detailed attributes like size, warranty, sustainability and lifestyle uses. This makes products more discoverable and recommendations more personalized and informative, which increases consumer confidence.
  • Omnichannel Support: AI provides seamless integration of online and offline experiences, facilitating smooth transitions between digital and physical retail environments.
  • Virtual Try-On Capabilities: Customers can easily visualize products on themselves or in their homes in real time, helping improve product expectations and potentially lowering return rates.
  • 24/7 Availability: AI agents offer around-the-clock customer support across time zones and languages.

Real-World Applications of AI Agents in Retail

AI is redefining digital commerce, empowering retailers to deliver richer, more intuitive shopping experiences. From enhancing product catalogs with accurate, high-quality data to improving search relevance and offering personalized shopping assistance, AI agents are transforming how customers discover, engage with and purchase products online.

AI agents for catalog enrichment automatically enhance product information with consumer-focused attributes. These attributes can range from basic details like size, color and material to technical details such as warranty information and compatibility.

They also include contextual attributes, like sustainability, and lifestyle attributes, such as “for hiking.” AI agents can also integrate service attributes — including delivery times and return policies — making items more discoverable and relevant to customers while addressing common concerns to improve purchase results.

Amazon faced the challenge of ensuring complete and accurate product information for shoppers while reducing the effort and time required for sellers to create product listings. To address this, the company implemented generative AI using the NVIDIA TensorRT-LLM library. This technology allows sellers to input a product description or URL, and the system automatically generates a complete, enriched listing. The work helps sellers reach more customers and expand their businesses effectively while making the catalog more responsive and energy efficient.

AI agents for search tap into enriched data to deliver more accurate and contextually relevant search results. By employing semantic understanding and personalization, these agents better match customer queries with the right products, making the overall search experience faster and more intuitive.

Amazon Music has optimized its search capabilities using the Amazon SageMaker platform with NVIDIA Triton Inference Server and the NVIDIA TensorRT software development kit. This includes implementing vector search and transformer-based spell-correction models.

As a result, when users search for music — even with typos or vague terms — they can quickly find what they’re looking for. These optimizations, which make the search bar more effective and user friendly, have led to faster search times and 73% lower costs for Amazon Music.

AI agents for shopping assistants build on the enriched catalog and improved search functionality. They offer personalized recommendations and answer queries in a detailed, relevant, conversational manner, guiding shoppers through their buying journeys with a comprehensive understanding of products and user intent.

SoftServe, a leading IT advisor, has launched the SoftServe Gen AI Shopping Assistant, developed using the NVIDIA AI Blueprint for retail shopping assistants. SoftServe’s shopping assistant offers seamless and engaging shopping experiences by helping customers discover products and access detailed product information quickly and efficiently. One of its standout features is the virtual try-on capability, which allows customers to visualize how clothing and accessories look on them in real time.

Defining the Essential Traits of a Powerful AI Shopping Agent

Highly skilled AI shopping assistants are designed to be multimodal, understanding text- and image-based prompts, voice and more through large language models (LLMs) and vision language models. These AI agents can search for multiple items simultaneously, complete complicated tasks — such as creating a travel wardrobe — and answer contextual questions, like whether a product is waterproof or requires dry cleaning.

This high level of sophistication offers experiences akin to engaging with a company’s best sales associate, delivering information to customers in a natural, intuitive way.

Diagram showing NVIDIA technologies used to build agentic AI applications, such as NVIDIA AI Blueprints (top), NVIDIA NeMo (middle) and NVIDIA NIM microservices (bottom).
With software building blocks, developers can design an AI agent with various features.

The building blocks of a powerful retail shopping agent include:

  • Multimodal and Multi-Query Capabilities: These agents can process and respond to queries that combine text and images, making search processes more versatile and user friendly. They can also easily be extended to support other modalities such as voice.
  • Integration With LLMs: Advanced LLMs, such as the NVIDIA Llama Nemotron family, bring reasoning capabilities to AI shopping assistants, enabling them to engage in natural, humanlike interactions. NVIDIA NIM microservices provide industry-standard application programming interfaces for simple integration into AI applications, development frameworks and workflows (see the sketch after this list).
  • Management of Structured and Unstructured Data: NVIDIA NeMo Retriever microservices provide the ability to ingest, embed and understand retailers’ suites of relevant data sources, such as customer preferences and purchases, product catalog text and image data, and more, helping ensure AI agent responses are relevant, accurate and context-aware.
  • Guardrails for Brand Safe, On-Topic Conversations: NVIDIA NeMo Guardrails are implemented to help ensure that conversations with the shopping assistant remain safe and on topic, ultimately protecting brand values and bolstering customer trust.
  • State-of-the-Art Simulation Tools: The NVIDIA Omniverse platform and partner simulation technologies can help visualize products in physically accurate spaces. For example, customers looking to buy a couch could preview how the furniture would look in their own living room.
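As a small, hypothetical sketch of the LLM-integration building block, the snippet below calls an NVIDIA NIM LLM microservice through its OpenAI-compatible API. The base URL, model name, and the inlined catalog context are assumptions for illustration; a production assistant would retrieve that context through a component such as NeMo Retriever rather than hard-coding it.

```python
from openai import OpenAI

# A locally deployed NIM microservice exposes an OpenAI-compatible endpoint;
# the URL and model name below are placeholders.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

response = client.chat.completions.create(
    model="meta/llama-3.1-8b-instruct",
    messages=[
        {
            "role": "system",
            "content": "You are a retail shopping assistant. Answer using only the product catalog context provided.",
        },
        {
            "role": "user",
            "content": (
                "I need a waterproof hiking jacket under $150. "
                "Catalog context: TrailShell Jacket, $129, waterproof, sizes S-XL."
            ),
        },
    ],
    temperature=0.2,
)
print(response.choices[0].message.content)
```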

By using these key technologies, retailers can design AI shopping agents that exceed customer expectations, driving higher satisfaction and improved operational efficiency.

Retail organizations that harness AI agents are poised to experience evolving capabilities, such as enhanced predictive analytics for further personalized recommendations.

And integrating AI with augmented- and virtual-reality technologies is expected to create even more immersive and engaging shopping environments — delivering a future where shopping experiences are more immersive, convenient and customer-focused than ever.

Learn more about the AI Blueprint for retail shopping assistants.

Read More

NVIDIA Showcases Real-Time AI and Intelligent Media Workflows at NAB

Real-time AI is unlocking new possibilities in media and entertainment, improving viewer engagement and advancing intelligent content creation. 

At NAB Show, a premier conference for media and entertainment running April 5-9 in Las Vegas, NVIDIA will showcase how emerging AI tools and the technologies underpinning them help streamline workflows for streamers, content creators, sports leagues and broadcasters.  

Attendees can experience the power of the NVIDIA Blackwell platform, which serves as the foundation of NVIDIA Media2 — a collection of NVIDIA technologies including NVIDIA NIM microservices and NVIDIA AI Blueprints for live video analysis, accelerated computing platforms and generative AI software.   

Attendees can also see NVIDIA Holoscan for Media — an advanced real-time AI platform designed for live media workflows and applications — in action at the Dell booth, as well as experience the NVIDIA AI Blueprint for video search and summarization, which makes it easy to build and customize video analytics AI agents.  

NVIDIA will also present in these sessions: 

Driving Innovation With Partners  

Partners across the industry are showcasing innovative solutions using NVIDIA technologies to accelerate live media. 

Amazon Web Services (booth W1701) will collaborate with NVIDIA to showcase an esport racing challenge through a live cloud production. The professional-grade racing simulator allows users to analyze their performance through cutting-edge AI-powered insights and step into the spotlight for their own post-race interview. Other demos will offer a peek into the future of live cloud production and generative AI in sports broadcasting. 

Beamr (booth SL1730MR) will demonstrate how it’s driving AV1 adoption with GPU-accelerated video processing. Beamr’s technology, powered by the NVIDIA NVENC encoder, enables cost-efficient, high-quality and scalable AV1 transformation. 

Dell (booth SL4616) is collaborating with a wide range of partners to highlight their latest innovations in the media industry. Autodesk will feature its Flame visual effects software for AI-driven compositing; Avid will demonstrate real-time editing and AI metadata tagging on Dell Pro Max high-performance PCs; and Boris FX and RE:Vision Effects will showcase their motion-tracking, slow-motion interpolation and object-removal technologies — all running on NVIDIA accelerated computing. In addition, Speed Read AI will showcase the use of NVIDIA RTX-powered workstations to analyze scripts in seconds, while Arcitecta and Elements will demonstrate high-speed media collaboration and post-production workflows on Dell PowerScale storage.  

HP (booth SL3723) will showcase its desktop and mobile workstation portfolio with NVIDIA RTX PRO Blackwell GPUs, delivering cutting-edge AI performance in a variety of use cases. Attendees can also find HP’s newly announced AI solutions, the HP ZGX Nano AI Station G1n and HP ZGX Fury AI Station G1n, developed in collaboration with NVIDIA.  

Qvest (booth W2055) will spotlight two new AI solutions that help clients increase audience engagement, simplify insight gathering and streamline workflows. The Agentic Live Multi-Camera Video Event Extractor identifies, detects and extracts near-real-time events into structured outputs in an easily configurable, natural language, no-code interface, and the No-Code Media-Centric AI Agent Builder extracts meaningful structured data from unstructured media formats including video, images and complex documents. Both use NVIDIA NIM microservices, NVIDIA NeMo, NVIDIA Holoscan for Media, the NVIDIA AI Blueprint for video search and summarization and more. 

Monks (booth W2530) will announce its complete suite of products and services for the media and entertainment industry, designed to drive innovation, monetization and efficiency. Monks uses tools under NVIDIA Media2, such as NVIDIA NIM microservices and Holoscan for Media, to enable real-time audience feedback, AI-powered selective encoding and contextual content analysis for large archives. The company will also launch a new suite of vision language model service offerings with its strategic partner TwelveLabs.

Supermicro (booth W3713) will demonstrate the ease of setting up and running a complete AI video pipeline with WAN 2.1 and Adobe Premiere Pro, all running on the new high-performance Supermicro AS-531AW-TC workstation with an NVIDIA RTX PRO 6000 Blackwell Workstation Edition GPU. With RAVEL Orchestrate handling workstation and AI cluster orchestration, everything can run smoothly — from setup and deployment to user access and workload management.

Speechmatics (booth W2317) will demonstrate its speech-to-text technology, which taps into NVIDIA accelerated computing to deliver highly accurate, real-time transcription across multiple languages and use cases, from media production to broadcast captioning. 

Telestream (booth W1501) will showcase its waveform monitoring solution, which seeks to bridge the gap for cloud-native workflows with a microservices architecture that taps into NVIDIA Holoscan for Media. In collaboration with NVIDIA, Telestream will demonstrate the ability to introduce cloud-native waveform monitoring to replicate broadcast center and master control room capabilities for engineering and creative teams. 

TwelveLabs (booth W3921) will showcase its newest models, which are being trained in part on NVIDIA DGX Cloud, to bring state-of-the-art video understanding to the world’s largest sports teams, clubs and leagues. The company is currently developing models based on NVIDIA NIM microservices to bring media and entertainment customers highly efficient inference and easy integration with leading software frameworks and agentic applications. 

VAST Data (booth SL9213) will spotlight the VAST InsightEngine, a solution that securely ingests, processes, and retrieves all enterprise data in real time, in a demo powered by the NVIDIA AI Enterprise software platform. Developed in collaboration with the National Hockey League, the demo showcases instant access to an archive of over 550,000 hours of hockey game footage. The work is set to redefine sponsorship analytics and empower video producers to instantly search, edit and deliver dynamic broadcast clips — fueling hyper-personalized fan experiences.

Vizrt (booth W3031) will present its solution portfolio, which, when matched with NVIDIA accelerated computing and NVIDIA Maxine technology, simplifies complex processes to support the immersive talent reflections, shadow casting and 3D pose tracking of Reality Connect, in addition to Particle Effects, Talent Gesture Control, XR Draw and the AI Gaze Correction feature available in the TriCaster Vizion.

V-Nova (booth W1252 and W1454) will spotlight its 6DoF virtual-reality experiences with new immersive content — Sharkarma and Weightless in booth W1252 — and AI-accelerated optimization in booth W1454, demonstrating how NVIDIA NVENC and NVIDIA GPUs unlock incredible video quality, efficiency and performance for critical video, AI and VR streaming cloud applications.

Join NVIDIA at NAB Show 2025. 

Read More

Nintendo Switch 2 Leveled Up With NVIDIA AI-Powered DLSS and 4K Gaming

The Nintendo Switch 2, unveiled April 2, takes performance to the next level, powered by a custom NVIDIA processor featuring an NVIDIA GPU with dedicated RT Cores and Tensor Cores for stunning visuals and AI-driven enhancements.

With 1,000 engineer-years of effort across every element — from system and chip design to a custom GPU, application programming interfaces (APIs) and world-class development tools — the Nintendo Switch 2 brings major upgrades.

The new console enables up to 4K gaming in TV mode and up to 120 frames per second at 1080p in handheld mode. Nintendo Switch 2 also supports high dynamic range and AI upscaling to sharpen visuals and smooth gameplay.

AI and Ray Tracing for Next-Level Visuals

The new RT Cores bring real-time ray tracing, delivering lifelike lighting, reflections and shadows for more immersive worlds.

Tensor Cores power AI-driven features like Deep Learning Super Sampling (DLSS), boosting resolution for sharper details without sacrificing image quality.

Tensor Cores also enable AI-powered face tracking and background removal in video chat use cases, enhancing social gaming and streaming.

With millions of players worldwide, the Nintendo Switch has become a gaming powerhouse and home to Nintendo’s storied franchises. Its hybrid design redefined console gaming, bridging TV and handheld play.

More Power, Smoother Gameplay

With 10x the graphics performance of the Nintendo Switch, the Nintendo Switch 2 delivers smoother gameplay and sharper visuals.

  • Tensor Cores boost AI-powered graphics while keeping power consumption efficient.
  • RT Cores enhance in-game realism with dynamic lighting and natural reflections.
  • Variable refresh rate via NVIDIA G-SYNC in handheld mode ensures ultra-smooth, tear-free gameplay.

Tools for Developers, Upgrades for Players

Developers get improved game engines, better physics and optimized APIs for faster, more efficient game creation.

Powered by NVIDIA technologies, Nintendo Switch 2 delivers for both players and developers.

Read More

No Foolin’: GeForce NOW Gets 21 Games in April

GeForce NOW isn’t fooling around.

This month, 21 games are joining the cloud gaming library of over 2,000 titles. Whether chasing epic adventures, testing skills in competitive battles or diving into immersive worlds, members can jump into April’s arrivals, which are truly no joke.

Get ready to stream, play and conquer the eight games available this week. Members can also get ahead of the pack with advanced access to South of Midnight, streaming soon before launch.

Unleash the Magic

South of Midnight, an action-adventure game developed by Compulsion Games, offers advanced access for gamers who purchase its Premium Edition. Dive into the title’s hauntingly beautiful world before launch, exploring its rich Southern gothic setting and unique magical combat system while balancing magic with melee attacks.

South of Midnight Advanced Access on GeForce NOW
Step into the shadows.

Set in a mystical version of the American South, the game combines elements of magic, mystery and adventure, weaving a compelling story that draws players in. The endless opportunities for exploration and combat, along with deep lore and engaging characters, make the game a must-play for fans of the action-adventure genre.

With its blend of dark fantasy and historical influences, South of Midnight is poised to deliver a unique gaming experience that will leave players spellbound.

GeForce NOW members can be among the first to get advanced access to the game without the hassle of downloads or updates. With an Ultimate or Performance membership, experience the game’s haunting landscapes and cryptid encounters with the highest frame rates and lowest latency — no need for the latest hardware.

April Is Calling

Call of Duty Warzone Season 3 on GeForce NOW
Verdansk is back! Catch it in the cloud.

Verdansk, the original and iconic map from Call of Duty: Warzone, is making its highly anticipated return in the game’s third season, and available to stream on GeForce NOW. Known for its sprawling urban areas, rugged wilderness and points of interest like Dam and Superstore, Verdansk offers a dynamic battleground for intense combat. The map has been rebuilt from the ground up with key enhancements across audio, visuals and gameplay, getting back to basics and delivering nostalgia for fans.

Look for the following games available to stream in the cloud this week:

Here’s what to expect for April: 

  • South of Midnight (New release on Steam and Xbox, available on PC Game Pass, April 8)
  • Commandos Origins (New release on Steam and Xbox, available on PC Game Pass, April 9)
  • The Talos Principle: Reawakened (New release on Steam, April 10)
  • Night Is Coming (New release on Steam, April 14)
  • Mandragora: Whispers of the Witch Tree (New release on Steam, April 17)
  • Sunderfolk (New release on Steam, April 23)
  • Clair Obscur: Expedition 33 (New release on Steam and Xbox, available on PC Game Pass, April 24)
  • Tempest Rising (New release on Steam, April 24)
  • Aimlabs (Steam)
  • Backrooms: Escape Together (Steam)
  • Blood Strike (Steam) 
  • ContractVille (Steam)
  • EXFIL (Steam)

March Madness

In addition to the 14 games announced last month, 26 more joined the GeForce NOW library:

What are you planning to play this weekend? Let us know on X or in the comments below.

Read More

Real-world healthcare AI development and deployment—at scale

AI Revolution podcast | Episode 2 - Real-world healthcare AI development and deployment—at scale | outline illustration of Seth Hain, Peter Lee, Dr. Matthew Lungren

Two years ago, OpenAI’s GPT-4 kick-started a new era in AI. In the months leading up to its public release, Peter Lee, president of Microsoft Research, cowrote a book full of optimism for the potential of advanced AI models to transform the world of healthcare. What has happened since? In this special podcast series, The AI Revolution in Medicine, Revisited, Lee revisits the book, exploring how patients, providers, and other medical professionals are experiencing and using generative AI today while examining what he and his coauthors got right—and what they didn’t foresee. 

In this episode, Dr. Matthew Lungren and Seth Hain, leaders in the implementation of healthcare AI technologies and solutions at scale, join Lee to discuss the latest developments. Lungren, the chief scientific officer at Microsoft Health and Life Sciences, explores the creation and deployment of generative AI for automating clinical documentation and administrative tasks like clinical note-taking. Hain, the senior vice president of R&D at the healthcare software company Epic, focuses on the opportunities and challenges of integrating AI into electronic health records at global scale, highlighting AI-driven workflows, decision support, and Epic’s Cosmos project, which leverages aggregated healthcare data for research and clinical insights.


Learn more:

Meet Microsoft Dragon Copilot: Your new AI assistant for clinical workflow 
Microsoft Industry Blog | March 2025 

Unlocking next-generation AI capabilities with healthcare AI models 
Microsoft Industry Blog | October 2024 

Multimodal Generative AI: the Next Frontier in Precision Health 
Microsoft Research Forum | March 2024 

An Introduction to How Generative AI Will Transform Healthcare with Dr. Matthew Lungren 
LinkedIn Learning 

AI for Precision Health 
Video | July 2023 

CheXNet: Radiologist-Level Pneumonia Detection on Chest X-Rays with Deep Learning 
Publication | December 2017 

Epic Cosmos 
Homepage

The AI Revolution in Medicine: GPT-4 and Beyond 
Book | April 2023

Transcript

[MUSIC]  

[BOOK PASSAGE]   

PETER LEE: “It’s hard to convey the huge complexity of today’s healthcare system. Processes and procedures, rules and regulations, and financial benefits and risks all interact, evolve, and grow into a giant edifice of paperwork that is well beyond the capability of any one human being to master. This is where the assistance of an AI like GPT-4 can be not only useful—but crucial.”   

[END OF BOOK PASSAGE]  

[THEME MUSIC]  

This is The AI Revolution in Medicine, Revisited. I’m your host, Peter Lee.  

Shortly after OpenAI’s GPT-4 was publicly released, Carey Goldberg, Dr. Zak Kohane, and I published The AI Revolution in Medicine to help educate the world of healthcare and medical research about the transformative impact this new generative AI technology could have. But because we wrote the book when GPT-4 was still a secret, we had to speculate. Now, two years later, what did we get right, and what did we get wrong?   

In this series, we’ll talk to clinicians, patients, hospital administrators, and others to understand the reality of AI in the field and where we go from here.


[THEME MUSIC FADES] 

The passage I read at the top there is from Chapter 7 of the book, “The Ultimate Paperwork Shredder.”  

Paperwork plays a particularly important role in healthcare. It helps convey treatment information that supports patient care, and it’s also used to help demonstrate that providers are meeting regulatory responsibilities, among other things. But if we’re being honest, it’s taxing—for everyone—and it’s a big contributor to the burnout our clinicians are experiencing today. Carey, Zak, and I identified this specific pain point as one of the best early avenues to pursue as far as putting generative AI to good work in the healthcare space.  

In this episode, I’m excited to welcome Dr. Matt Lungren and Seth Hain to talk about matching technological advancements in AI to clinical challenges, such as the paperwork crisis, to deliver solutions in the clinic and in the health system back office.  

Matt is the chief scientific officer for Microsoft Health and Life Sciences, where he focuses on translating cutting-edge technology, including generative AI and cloud services, into innovative healthcare applications. He’s a clinical interventional radiologist and a clinical machine learning researcher doing collaborative research and teaching as an adjunct professor at Stanford University. His scientific work has led to more than 200 publications, including work on new computer vision and natural language processing approaches for healthcare.  

Seth is senior vice president of research and development at Epic, a leading healthcare software company specializing in electronic health record systems, also known as EHR, as well as other solutions for connecting clinicians and patients. During his 19 years at Epic, Seth has worked on enhancing the core analytics and other technologies in Epic’s platforms as well as their applications across medicine, bringing together his graduate training in mathematics and his dedication to better health.  

I’ve had the pleasure of working closely with both Matt and Seth. Matt, as a colleague here at Microsoft, really focused on our health and life sciences business. And Seth, as a collaborator at Epic, as we embark on the questions of how to integrate and deploy generative AI into clinical applications at scale.   

[TRANSITION MUSIC] 

Here’s my conversation with Dr. Matt Lungren:  

LEE: Matt, welcome. It’s just great to have you here. 

MATTHEW LUNGREN: Thanks so much, Peter. Appreciate being here. 

LEE: So, I’d like to just start just talking about you. You know, I had mentioned your role as the chief scientific officer for Microsoft Health and Life Sciences. Of course, that’s just a title. So, what the heck is that? What is your job exactly? And, you know, what does a typical day at work look like for you? 

LUNGREN: So, really what you could boil my work down to is essentially cross collaboration, right. We have a very large company, lots of innovation happening all over the place, lots of partners that we work with and then obviously this sort of healthcare mission.

And so, what innovations, what kind of advancements are happening that can actually solve clinical problems, right, and sort of kind of direct that. And we can go into some examples, you know, later. But then the other direction, too, is important, right. So, identifying problems that may benefit from a technologic application or solution and kind of translating that over into the, you know, pockets of innovation saying, “Hey, if you kind of tweaked it this way, this is something that would really help, you know, the clinical world.”  

And so, it’s really a bidirectional role. So, my day to day is … every day is a little different, to be honest with you. Some days it’s very much in the science and learning about new techniques. On the other side, though, it can be very much in the clinic, right. So, what are the pain points that we’re seeing? Where are the gaps in the solutions that we’ve already rolled out? And, you know, again, what can we do to make healthcare better broadly? 

LEE: So, you know, I think of you as a technologist, and, Matt, you and I actually are colleagues working together here at Microsoft. But you also do spend time in the clinic still, as well, is that right? 

LUNGREN: You know, initially it was kind of a … very much a non-negotiable for me … in sort of taking an industry role. I think like a lot of, you know, physicians, you know, we’re torn with the idea of like, hey, I spent 20 years training. I love what I do, you know, with a lot of caveats there in terms of some of the administrative burden and some of the hassle sometimes. But for the most part, I love what I do, and there’s no greater feeling than using something that you trained years to do and actually see the impact on a human life. It’s unbelievable, right.  

So, I think part of me was just, like, I didn’t want to let that part of my identity go. And frankly, as I often say, to this day, I walk by a fax machine in our office today, like in 2025.  

So just to be extra clear, it really grounds me in, like, yes, I love the possibilities. I love thinking about what we can do. But also, I have a very stark understanding of the reality on the ground, both in terms of the technology but also the burnout, right. The challenges that we’re facing in taking care of patients has gotten, you know, much, much more difficult in the last few years, and, you know, I like to think it keeps my perspective, yeah. 

LEE: You know, I think some listeners to this podcast might be surprised that we have doctors on staff in technical roles at Microsoft. How do you explain that to people? 

LUNGREN: [LAUGHS] Yeah, no, yeah, it is interesting. I would say that, you know, from, you know, the legacy Nuance [1] world, it wasn’t so far-fetched that you have physicians that were power users and eventually sort of, you know, became, “Hey, listen, I think this is a strategic direction; you should take it” or whatever. And certainly maybe in the last, I want to say, five years or so, I’ve seen more and more physicians who have, you know, taken the time, sometimes on their own, to learn some of the AI capabilities, learn some of the principles and concepts; and frankly, some are, you know, even coding solutions and leading companies.

So, I do think that that has shifted a bit in terms of like, “Hey, doctor, this is your lane, and over here, you know, here’s a technical person.” And I think that’s fused quite a bit more.  

But yeah, it is an unusual thing, I think, in sort of how we’ve constructed what at least my group does. But again, I can’t see any other way around some of the challenges.  

I think, you know, an anecdote I’d like to tell you, when I was running the AIMI [Artificial Intelligence in Medicine and Imaging] Center, you know, we were bringing the medical school together with the computer science department, right, at Stanford. And I remember one day a student, you know, very smart, came into my office, you know, a clinical day or something, and he’s like, is there just, like, a book or something where I can just learn medicine? Because, like, I feel like there’s a lot of, like, translation you have to do for me.  

It really raised an important insight, which is that you can learn the, you know, medicine, so to speak. You know, go to med school; you know, take the test and all that. But it really … you don’t really understand the practice of medicine until you are doing that.  

And in fact, I even push it a step further to say after training those first two or three years of … you are the responsible person; you can turn around, and there’s no one there. Like, you are making a decision. Getting used to that and then having a healthy respect for that actually I think provides the most educational value of anything in healthcare.  

LEE: You know, I think what you’re saying is so important because as I reflect on my own journey. Of course, I’m a computer scientist. I don’t have medical training, although at this point, I feel confident that I could pass a Step 1 medical exam.  

LUNGREN: I have no doubt. [LAUGHS] 

LEE: But I think that the tech industry, because of people like you, has progressed tremendously in having a more sophisticated and nuanced understanding of what actually goes on in clinic and also what goes on in the boardrooms of healthcare delivery organizations. And of course, at the end of the day, I think that’s really been your role.  

So roughly speaking, your job as an executive at a big tech company has been to understand what the technology platforms need to be, particularly with respect to machine learning, AI, and cloud computing, to best support healthcare. And so maybe let’s start pre-GPT-4, pre-ChatGPT, and tell us a little bit, you know, about maybe some of your proudest moments in getting advanced technologies like AI into the clinic. 

LUNGREN: You know, when I first started, so remember, like you go all the way back to about 2013, right, my first faculty job, and, you know, we’re building a clinical program and I, you know, I had a lot of interest in public health and building large datasets for pop [population] health, etc. But I was doing a lot of that, you know, sort of labeling to get those insights manually, right. So, like, I was the person that you’d probably look at now and say, “What are you doing?” Right?  

So … but I had a complete random encounter with Andrew Ng, who I didn’t know at the time, at Stanford. And I, you know, went to one of the seminars that he was holding at the Gates building, and, you know, they were talking about their performance on ImageNet. You know, cat and dog and, you know, tree, bush, whatever. And I remember sitting in kind of the back, and I think I maybe had my scrubs on at the time and just kind of like, what? Like, why … like, this … we could use this in healthcare, you know. [LAUGHS]  

But for me, it was a big moment. And I was like, this is huge, right. And as you remember, the deep learning really kind of started to show its stuff with, you know, Fei-Fei Li’s ImageNet stuff.

So anyway, we started the collaboration that actually became a NIDUS. And one of the first things we worked on, we just said, “Listen, one of the most common medical imaging examinations in the world is the chest x-ray.” Right? Two, three billion are done every year in the world, and so is that not a great place to start?

And of course, we had a very democratizing kind of mission. As you know, Andrew has done a lot of work in that space, and I had similar ambitions. And so, we really started to focus on bringing the, you know, the sort of the clinical and the CS together and see what could be done.  

So, we did CheXNet. And this is, remember this is around the time when, like, Geoffrey Hinton was saying things like we should stop training radiologists, and all this stuff was going on. [LAUGHTER] So there’s a lot of hype, and this is the narrow AI days just to remind the audience.  

LEE: How did you feel about that since you are a radiologist? 

LUNGREN: Well, it was so funny. So, Andrew is obviously very prolific on social media, and I was, who am I, right? So, I remember he tagged me. Well, first he said, “Matt, you need to get a Twitter account.” And I said OK. And he tagged me on the very first post of our, what we call, CheXNet that was kind of like the “Hello, World!” for this work.  

And I remember it was a clinical day. I had set my phone, as you do, outside the OR. I go in. Do my procedure. You know, hour or so, come back, my phone’s dead. I’m like, oh, that’s weird. Like I had a decent charge. So, you know, I plug it in. I turn it on. I had like hundreds of thousands of notifications because Andrew had tweeted out to his millions or whatever about CheXNet.  

And so, then of course, as you point out, I go to RSNA that year, which is our large radiology conference, and that Geoffrey Hinton quote had come out. And everyone’s looking at me like, “What are you doing, Matt?” You know, like, are you coming after our specialty? I’m like, “No, no,” that’s, [LAUGHS] you know, it’s a way to interpret it, but you have to take a much longer horizon view, right.  

LEE: Well, you know, we’re going to, just as an enticement for listeners to this podcast to listen to the very end, I’m going to pin you down toward the end on your assessment of whether Geoffrey Hinton will eventually be proven right or not. [LAUGHTER] But let’s take our time to get there.  

Now let’s go ahead and enter the generative AI era. When we were first exposed to what we now know of as GPT-4—this was before it was disclosed to the world—a small number of people at Microsoft and Microsoft Research were given access in order to do some technical assessment.  

And, Matt, you and I were involved very early on in trying to assess what might this technology mean for medicine. Tell us, you know, what was the first encounter with this new technology like for you?  

LUNGREN: It was the weirdest thing, Peter. Like … I joined that summer, so the summer before, you know, the actual GPT came out. I had literally no idea what I was getting into.  

So, I started asking it questions, you know, kind of general stuff, right. Just, you know, I was like, oh, all right, it’s pretty good. And so, then I would sort of go a little deeper. And eventually I got to the point where I’m asking questions that, you know, maybe there’s three papers on it in my community, and remember I’m a sub-sub specialist, right, pediatric interventional radiology. And the things that we do in vascular malformations and, you know, rare cancers are really, really strange and not very commonly known.  

And I kind of walked away from that—first I said, can I have this thing, right? [LAUGHS]  

But then I, you know, I don’t want to sound dramatic, but I didn’t sleep that well, if I’m being honest, for the first few nights. Partially because I couldn’t tell anybody, except for the few that I knew were involved, and partially because I just couldn’t wrap my head around how we went from what I was doing in LSTMs [long short-term memory networks], right, which was state of the artish at the time for NLP [natural language processing].  

And all of a sudden, I have this thing that is broadly, you know, domain experts, you know, representations of knowledge that there’s no way you could think of it would be in distribution for a normal approach to this.  

And so, I really struggled with it, honestly. Interpersonally, like, I would be like, uh, well, let’s not work on that. They’re like, why not? You were just excited about it last week. I’m like, I don’t know. I think that we could think of another approach later. [LAUGHS]  

And so yeah, when we were finally able to really look at some of the capabilities and really think clearly, it was really clear that we had a massive opportunity on our hands to impact healthcare in a way that was never possible before. 

LEE: Yeah, and at that time you were still a part of Nuance. Nuance, I think, was in the process of being acquired by Microsoft. Is that right?  

LUNGREN: That’s right.  

LEE: And so, of course, this was also a technology that would have profound and very direct implications for Nuance. How did you think about that? 

LUNGREN: Nuance, for those in the audience who don’t know, for 25 years was, sort of, the medical speech-to-text thing that all, you know, physicians used. But really the brass ring had always been … and I want to say going back to like 2013, 2014, Nuance had tried to figure out, OK, we see this pain point. Doctors are typing on their computers while they’re trying to talk to their patients, right.  

We should be able to figure out a way to get that ambient conversation turned into text that then, you know, accelerates the doctor … takes all the important information. That’s a really hard problem, right. You’re having a conversation with a patient about their knee pain, but you’re also talking about, you know, their cousin’s wedding and their next vacation and their dog is sick or whatever and all that gets recorded, right.  

And so, then you have to have the intelligence/context to be able to tease out what’s important for a note. And then it has to be at the performance level that a physician who, again, 20 years of training and education plus a huge, huge amount of, you know, need to get through his cases efficiently, that’s a really difficult problem.  

And so, for a long time, there was a human-in-the-loop aspect to doing this because you needed a human to say, “This transcript’s great, but here’s actually what needs to go on the note.” And that can’t scale, as you know.  

When the GPT-4, you know, model kind of, you know, showed what it was capable of, I think it was an immediate light bulb because there was no … you can ask any physician in your life, anyone in the audience, you know, what are your … what is the biggest pain point when you go to see your doctor? Like, “Oh, they don’t talk to me. They don’t look me in the eye. They’re rushing around trying to finish a note.”  

If we could get that off their plate, that’s a huge unlock, Peter. And I think that, again, as you know, it’s now led to so much more. But that was kind of the initial, I think, reaction. 
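For readers who want to picture the ambient documentation pattern Lungren describes, here is a minimal sketch, assuming hypothetical `asr` and `llm` callables rather than any vendor’s actual API: transcribe the exam-room conversation, ask a generative model to keep only the clinically relevant content as a draft note, and require an explicit clinician review step before anything is signed.

```python
# Illustrative sketch only; the helpers here are hypothetical placeholders, not DAX Copilot
# or any other product's interface.
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class DraftNote:
    text: str
    signed: bool = False


def transcribe_visit(audio_path: str, asr: Callable[[str], str]) -> str:
    """Speech-to-text step; `asr` stands in for whatever transcription service is used."""
    return asr(audio_path)


def draft_note(transcript: str, llm: Callable[[str], str]) -> DraftNote:
    """Ask a generative model to tease out what matters for the note and ignore small talk."""
    prompt = (
        "Draft a clinical SOAP note from this exam-room conversation. "
        "Include only clinically relevant details; omit small talk.\n\n"
        f"Transcript:\n{transcript}\n\nDraft note:"
    )
    return DraftNote(text=llm(prompt))


def review_and_sign(note: DraftNote, edits: Optional[str] = None) -> DraftNote:
    """Human-in-the-loop step: the clinician edits the draft and signs it explicitly."""
    if edits is not None:
        note.text = edits
    note.signed = True
    return note
```

The point of the sketch is the shape of the workflow: the model takes on the transcription and summarization labor, but a human remains the one who edits and signs the note.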

LEE: And so, maybe that gets us into our next set of questions, our next topic, which is about the book and all the predictions we made in the book. Because Carey, Zak, and I—actually we did make a prediction that this technology would have a huge impact on this problem of clinical note-taking.  

And so, you’re just right in the middle of that. You’re directly hands-on creating, I think, what is probably the most popular early product for doing exactly that. So, were we right? Were we wrong? What else do we need to understand about this? 

LUNGREN: No, you were right on. I think in the book, I think you called it like a paper shredder or something. I think you used a term like that. That’s exactly where the activity is right now and the opportunity.  

I’ve even taken that so far as to say that when folks are asking about what the technology is capable of doing, we say, well, listen, it’s going to save time before it saves lives. It’ll do both. But right now, it’s about saving time.  

It’s about peeling back the layers of the onion that if you, you know, put me in where I started medicine in 2003, and then fast-forward and showed me a day in the life of 2025, I would be shocked at what I was doing that wasn’t related to patient care, right. So, all of those layers that have been stacked up over the years, we can start finding ways to peel that back. And I think that’s exactly what we’re seeing.

And to your point, I think you mentioned this, too, which is, well, sure, we can do this transcript, and we can turn a note, but then we can do other things, right. We can summarize that in the patient’s language or education level of choice. We can pend orders. We can eventually get to a place of decision support. So, “Hey, did you think about this diagnosis, doctor?” Like those kinds of things.  

And all those things, I think you highlighted beautifully, and again, it sounds like with, you know, a lot of, right, just kind of guesswork and prediction, but those things are actually happening every single day right now.  

LEE: Well, so now, you know, in this episode, we’re really trying to understand, you know, where the technology industry is in delivering these kinds of things. And so from your perspective, you know, in the business that you’re helping to run here at Microsoft, you know, what are the things that are actually shipping as product versus things that clinicians are doing, let’s say, off label, just by using, say, ChatGPT on their personal mobile devices, and then what things aren’t happening? 

LUNGREN: Yeah. I’ll start with the shipping part because I think you, again, you know my background, right. Academic clinician, did a lot of research, hadn’t had a ton of product experience.  

In other words, like, you know, again, I’m happy to show you what benchmarks we beat or a new technique or, you know, get a grant to do all this, or even frankly, you know, talk about startups. But to actually have an audience that is accustomed to a certain level of performance for the solutions that they use, to be able to deliver something new at that same level of expectation, wow, that’s a big deal.  

And again, this is part of the learning by, you know, kind of being around this environment that we have, which is we have this, you know, incredibly focused, very experienced clinical product team, right.

And then I think on the other side, to your point about the general-purpose aspect of this, it’s no secret now, right, that, you know, this is a useful technology in a lot of different medical applications. And let’s just say that there’s a lot of knowledge that can be used, particularly by the physician community. And I think the most recent survey I saw was from the British Medical Journal, which said, hey, you know, which doctors are using … are you willing to tell us, you know, what you’re doing? And it turns out that folks are, what, 30% or so said that they were using it regularly in clinic [2]. And again, this is the general, this is the API or whatever off the shelf.

And then frankly, when they ask what they’re using it for, tends to be things like, “Hey, differential, like, help me fill in my differential or suggest … ” and to me, I think what that created, at least—and you’re starting to see this trend really accelerate in the US especially—is, well, listen, we can’t have everybody pulling out their laptops and potentially exposing, you know, patient information by accident or something to a public API.  

We have to figure this out, and so brilliantly, I think NYU [New York University] was one of the first. Now I think there’s 30 plus institutions that said, listen, “OK, we know this is useful to the entire community in the healthcare space.” Right? We know the administrators and nurses and everybody thinks this is great.  

We can’t allow this sort of to be a very loosey-goosey approach to this, right, given this sort of environment. So, what we’ll do is we’ll set up a HIPAA-compliant instance to allow anyone in the community—you know, in the health system—to use the models, and then whatever, the newest model comes, it gets hosted, as well.  

And what’s cool about that—and that’s happened now a lot of places—is that at the high level … first of all, people get to use it and experiment and learn. But at the high level, they’re actually seeing what are the common use cases. Because you could ask 15 people and you might get super long lists, and it may not help you decide what to operationalize in your health system.  

LEE: But let me ask you about that. When you observe that, are there times when you think, “Oh, some specific use cases that we’re observing in that sort of organic way need to be taken into specialized applications and made into products?” Or is it best to keep these things sort of, you know, open-chat-interface types of general-purpose platform?  

LUNGREN: Honestly, it’s both, and that’s exactly what we’re seeing. I’m most familiar with Stanford, kind of, the work that Nigam Shah leads on this. But he, he basically, … you know, there’s a really great paper that is coming out in JAMA, but basically saying, “Here’s what our workforce is using it for. Here are the things in the literature that would suggest what would be popular.”  

And some of those line up, like helping with a clinical diagnosis or documentation, but some of them don’t. But for the most part, the stuff that flies to the top, those are opportunities to operationalize and productize, etc. And I think that’s exactly what we’re seeing. 

LEE: So, let’s get into some of the specific predictions. We’ve, I think, beaten note-taking to death here. But there’s other kinds of paperwork, like filling out prior authorization request forms or referral letters, an after-visit note or summary to give instructions to patients, and so on. And these were all things that we were making guesses in our book might be happening. What’s the reality there? 

LUNGREN: I’ve seen every single one of those. In fact, I’ve probably seen a dozen startups too, right, doing exactly those things. And, you know, we touched a little bit on translation into the actual clinic. And that’s actually another thing that I used to kind of underappreciate, which is that, listen, you can have a computer scientist and a physician or nurse or whatever, like, give the domain expertise, and you think you’re ready to build something.  

The health IT [LAUGHS] is another part of that Venn diagram that’s so incredibly critical, and then exactly how are you going to bring that into the system. That’s a whole new ballgame. 

And so I do want to do a callout because the collaboration that we have with Epic is monumental because here, you have the system of record that most physicians, at least in the US, use. And they’re going to use an interface and they’re going to have an understanding of, hey, we know these are pain points, and so I think there’s some really, really cool, you know, new innovations that are coming out of the relationship that we have with Epic. And certainly the audience may be familiar with those, that I think will start to knock off a lot of the things that you predicted in your book relatively soon. 

LEE: I think most of the listeners to this podcast will know what Epic is. But for those that are unfamiliar with the health industry, and especially the technology foundation, Epic is probably the largest provider of electronic health record systems. And, of course, in collaboration with you and your team, they’ve been integrating generative AI quite a bit. Are there specific uses that Epic is making and deploying that get you particularly excited? 

LUNGREN: First of all, the ambient note generation, by the way, is integrated into Epic now. So like, you know, it’s not another screen, another thing for physicians. So that’s a huge, huge unlock in terms of the translation.

But then Epic themselves, so they have, I guess, on the last roadmap that they talked [about], more than 60, but the one that’s kind of been used now is this inbox response. 

So again, maybe someone might not be familiar with, why is it such a big deal? Well, if you’re a physician, you already have, you know, 20 patients to see that day and you got all those notes to do, and then Jevons paradox, right. So if you give me better access to my doctor, well, maybe I won’t make an appointment. I’m just going to send him a note and this is kind of this inbox, right.  

So then at the end of my day, I got to get all my notes done. And then I got to go through all the inbox messages I’ve received from all of my patients and make sure that they’re not like having chest pain and they’re blowing it off or something.  

Now that’s a lot of work and the cold start problem of like, OK, I have to respond to them. So Epic has leveraged this system to say, “Let me just draft a note for you,” understanding the context of, you know, what’s going on with the patient, etc. And you can edit that and sign it, right. So you can accelerate some of those … so that’s probably one I’m most excited about. But there’s so many right now. 

LEE: Well, I think I need to let you actually state the name of the clinical note-taking product that you’re associated with. Would you like to do that? [LAUGHS] 

LUNGREN: [LAUGHS] Sure. Yeah, it’s called DAX Copilot [3]. And for the record, it is the fastest-growing copilot in the Microsoft ecosystem. We’re very proud of that. Five hundred institutions already are using it, and millions of notes have already been created with it. And the feedback has been tremendous.

LEE: So, you sort of referred to this a little bit, you know, this idea of AI being a second set of eyes. So, doctor makes some decisions in diagnosis or kind of working out potential treatments or medication decisions. And in the book, you know, we surmise that, well, AI might not replace the doctor doing those things. It could but might not. But AI could possibly reduce errors if doctors and nurses are making decisions by just looking at those decisions and just checking them out. Is that happening at all, and what do you see the future there? 

LUNGREN: Yeah, I would say, you know, that’s kind of the jagged edge of innovation, right, where sometimes the capability gets ahead of the ability to, you know, operationalize that. You know, part of that is just related to the systems. The evidence has been interesting on this. So, like, you know this, our colleague Eric Horvitz has been doing a lot of work in sort of looking at physician, physician with GPT-4, let’s say, and then GPT-4 alone for a whole variety of things. You know, we’ve been saying to the world for a long time, particularly in the narrow AI days, that AI plus human is better than either alone. We’re not really seeing that bear out really that well yet in some of the research.  

But it is a signal to me and to the use case you’re suggesting, which is that if we let this system, in the right way, kind of handle a lot of the safety-net aspects of what we do but then also potentially take on some of the things that maybe are not that challenging or at least somewhat simple.  

And of course, this is really an interesting use case in my world, in the vision world, which is that we know these models are multimodal, right. They can process images and text. And what does that look like for pathologists or radiologists, where we do have a certain percentage of the things we look at in a given day are normal, right? Or as close to normal as you can imagine. So is there a way to do that? And then also, by the way, have a safety net.  

And so I think that this is an extremely active area right now. I don’t think we’ve figured out exactly how to have the human and AI model interact in this space yet. But I know that there’s a lot of attempts at it right now. 

LEE: Yeah, I think, you know, this idea of a true copilot, you know, a true collaborator, you know, I think is still something that’s coming. I think we’ve had a couple of decades of people being trained to think of computers as question-answering machines. Ask a question, get an answer. Provide a document, get a summary. And so on.  

But the idea that something might actually be this second set of eyes just assisting you all day continuously, I think, is a new mode of interaction. And we haven’t quite figured that out.  

Now, in preparation for this podcast, Matt, you said that you actually used AI to assist you in getting ready. [LAUGHS] Would you like to share what you learned by doing that? 

LUNGREN: Yeah, it’s very funny. So, like, you may have heard this term coined by Ethan Mollick called the “secret cyborg,” which is sort of referring to the phenomenon of folks using GPT, realizing it can actually help them a ton in all kinds of parts of their work, but not necessarily telling anybody that they’re using it, right.  

And so in a similar secret cyborgish way, I was like, “Well, listen, you know, I haven’t read your book in like a year. I recommend it to everybody. And [I need] just a refresher.” So what I did was I took your book, I put it into GPT-4, OK, and asked it to sort of talk about the predictions that you made.  

And then I took that and put it in the stronger reasoning model—in this case, the “deep research” capability from OpenAI that you and the audience may have just seen or heard of—and asked it to research all the current papers, you know, and blogs and whatever else and tell me like what was right, what was wrong in terms of the predictions. [LAUGHS]  

So it, actually, it was an incredible thing. It’s a, like, what, six or seven pages. It probably would have taken me two weeks, frankly, to do this amount of work.  

LEE: I’ll be looking forward to reading that in the New England Journal of Medicine shortly. 

LUNGREN: [LAUGHS] That’s right. Yeah, no, don’t, before this podcast comes out, I’ll submit it as an opinion piece. No. [LAUGHS] But, yeah, but I think on balance, incredibly insightful views. And I think part of that was, you know, your team that got together really had a lot of different angles on this. But, you know, and I think the only area that was, like, which I’ve observed as well, it’s just, man, this can do a lot for education.  

We haven’t seen … I don’t think we’re looking at this as a tutor. To your point, we’re kind of looking at it as a transactional in and out. But as we’ve seen in all kinds of data, both in low-, middle-income countries and even in Harvard, using this as a tutor can really accelerate your knowledge and in profound ways.  

And so that is probably one area where I think your prediction was maybe slightly even further ahead of the curve because I don’t think folks have really grokked that opportunity yet. 

LEE: Yeah, and for people who haven’t read the book, you know, the guess was that you might use this as a training aid if you’re an aspiring doctor. For example, you can ask GPT-4 to pretend to be a patient that presents a certain way and that you are the doctor that this patient has come to see. And so you have an interaction. And then when you say end of encounter, you ask GPT-4 to assess how well you did. And we thought that this might be a great training aid, and to your point, it seems not to have materialized.  

LUNGREN: There’s some sparks. You know, with, like, communication, end-of-life conversations that no physician loves to have, right. It’s very, very hard to train someone in those. I’ve seen some work done, but you’re right. It’s not quite hit mainstream yet. 

LEE: On the subject of things that we missed, one thing that you’ve been very, very involved in in the last several months has been in shipping products that are multimodal. So that was something I think that we missed completely. What is the current state of affairs for multimodal, you know, healthcare AI, medical AI? 

LUNGREN: Yeah, the way I like to explain it—and first of all, no fault to you, but this is not an area that, like, we were just so excited about the text use cases that I can’t fault you. But yeah, I mean, so if we look at healthcare, right, how we take care of patients today, as you know, the vast majority of the data in terms of just data itself is actually not in text, right. It’s going be in pathology and genomics and radiology, etc.  

And it seems like an opportunity here to watch this huge curve just go straight up in the general reasoning and frankly medical competency and capabilities of the models that are coming and continue to come but then to see that it’s not as proficient for medical-specific imaging and video and, you know, other data types. And that gap is, kind of, what I describe as the multimodal medical AI gap.  

We’re probably in GPT-2 land, right, for these other modality types versus the, you know, we’re now at o3, who knows where we’re going to go. At least in our view, we can innovate in that space.  

How do we help bring those innovations to the broader community to close that gap and see some of these use cases really start to accelerate in the multimodal world?  

And I think we’ve taken a pretty good crack at that. A lot of that is credit to the innovative work. I mean, MSR [Microsoft Research] was two or three years ahead of everyone else on a lot of this. And so how do we package that up in a way that the community can actually access and use? And so, we took a lot of what your group had done in, let’s just say, radiology or pathology in particular, and say, “OK, well, let’s put this in an ecosystem of other models.” Other groups can participate in this, but let’s put it in a platform where maybe I’m really competent in radiology or pathology. How do I connect those things together? How do I bring the general reasoner knowledge into a multimodal use case?  

And I think that’s what we’ve done pretty well so far. We have a lot of work to do still, but this is very, very exciting. We’re seeing just such a ton of interest in building with the tools that we put out there. 
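One way to read the platform idea Lungren outlines is as a specialist-plus-general-reasoner pattern: a modality-specific model turns an image into findings or an embedding, and a general-purpose model reasons over those findings alongside the text record. The sketch below is only an illustration of that pattern, with hypothetical `imaging_model` and `llm` callables, not Microsoft’s actual model catalog.

```python
# Sketch of a specialist imaging model feeding a general-purpose reasoner; both callables
# are hypothetical stubs used purely for illustration.
from typing import Callable


def multimodal_summary(
    image_path: str,
    clinical_history: str,
    imaging_model: Callable[[str], str],  # e.g., a radiology- or pathology-specific model
    llm: Callable[[str], str],            # a general-purpose reasoning model
) -> str:
    findings = imaging_model(image_path)  # e.g., "Right lower lobe opacity, no effusion."
    prompt = (
        "Combine the imaging findings with the clinical history and propose next steps. "
        "This is a draft for clinician review, not a final read.\n\n"
        f"Imaging findings:\n{findings}\n\n"
        f"Clinical history:\n{clinical_history}\n\nSummary:"
    )
    return llm(prompt)
```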

LEE: Well, I think how rapidly that’s advancing has been a surprise to me. So I think we’re running short on time. So two last questions to wrap up this conversation. The first one is, as we think ahead on AI in medicine, what do you think will be the biggest changes or make the biggest differences two years from now, five years from now, 10 years from now?

LUNGREN: This is really tough. OK. I think the two-year timeframe, I think we will have some autonomous agent-based workflows for a lot of the … what I would call undifferentiated heavy lifting in healthcare.  

And this is happening in, you know, the pharmaceutical industry, the payer … every aspect is sort of looking at their operations at a macro level: where are these big bureaucratic processes that largely involve text and where can we shrink those down and really kind of unlock a lot of our workforce to do things that might be more meaningful to the business? I think that’s my safe one.  

Going five years out, you know, I have a really difficult time grappling with this seemingly shrinking timeline to AGI [artificial general intelligence] that we hear from people who I would respect and certainly know more than me. And in that world, I think there’s only been one paper that I’ve seen that has attempted to say, what does that mean in healthcare when we have this?  

And the fact is, I actually don’t know. [LAUGHS] I wonder whether there’ll still be a gap in some modalities. Maybe there’ll be the ability to do new science, and all kinds of interesting things will come of that.  

But then if you go all the way to your 10-year, I do feel like we’re going to have systems that are acting autonomously in a variety of capacities, if I’m being honest.  

What I would like to see if I have any influence on some of this is, can we start to celebrate the closing of hospitals instead of opening them? Meaning that, can we actually start to address—at a personal, individual level—care? And maybe that’s outside the home, maybe that’s, you know, in a way that doesn’t have to use so many resources and, frankly, really be very reactive instead of proactive.  

I really want to see that. That’s been the vision of precision medicine for, geez, 20-plus years. I feel like we’re getting close to that being something we can really tackle. 

LEE: So, we talked about Geoff Hinton and his famous prediction that we would soon not have human radiologists. And of course, maybe he got the date wrong. So, let’s reset the date to 2028. So, Matt, do you think Geoff is right or wrong? 

LUNGREN: [LAUGHS] Yeah, so the way … I’m not going to dodge the question, but let me just answer this a different way.  

We have a clear line of sight to go from images to draft reports. That is unmistakable. And that’s now in 2025. How it will be implemented and what the implications of that will be, I think, will be heavily dependent on the health system or the incentive structure for where it’s deployed.  

So, if I’m trying to take a step back, back to my global health days, man, that can’t come fast enough. Because, you know, you have entire health systems, you know, in fact entire countries that have five, you know, medical imaging experts for the whole country, but they still need this to, you know, take care of patients.  

Zooming in on today’s crisis in the US, right, we have the burnout crisis just as much as the doctors who are seeing patients and write notes. We can’t keep up with the volume. In fact, we’re not training folks fast enough, so there is a push pull; there may be a flip to your point of autonomous reads across some segments of what we do.  

By 2028, I think that’s a reasonable expectation that we’ll have some form of that. Yes. 

LEE: I tend to agree, and I think things get reshaped, but it seems very likely that even far into the future we’ll have humans wanting to take care of other humans and be taken care of by humans.  

Matt, this has been a fantastic conversation, and, you know, I feel it’s always a personal privilege to have a chance to work with someone like you so keep it up. 

[TRANSITION MUSIC] 

LUNGREN: Thank you so much, Peter. Thanks for having me. 

LEE: I’m always so impressed when I talk to Matt, and I feel lucky that we get a chance to work together here at Microsoft. You know, one of the things that always strikes me whenever I talk to him is just how disruptive generative AI has been to a business like Nuance. Nuance has had clinical note-taking as part of their product portfolio for a long, long time. And so, you know, when generative AI comes along, it’s not only an opportunity for them, but also a threat because in a sense, it opens up the possibility of almost anyone being able to make clinical note-taking capabilities into products.  

It’s really interesting how Matt’s product, DAX Copilot, which since the time that we had our conversation has expanded into a full healthcare workflow product called Dragon Copilot, has really taken off in the marketplace and how many new competing AI products have also hit the market, and all in just two years, because of generative AI.  

The other thing, you know, that I always think about is just how important it is for these kinds of systems to work together and especially how they integrate into the electronic health record systems. This is something that Carey, Zak, and I didn’t really realize fully when we wrote our book. But you know, when you talk to both Matt and Seth, of course, we see how important it is to have that integration.  

Finally, what a great example of yet another person who is both a surgeon and a tech geek. [LAUGHS] People sometimes think of healthcare as moving very slowly when it comes to new technology, but people like Matt are actually making it happen much more quickly than most people might expect.  

Well, anyway, as I mentioned, we also had a chance to talk to Seth Hain, and so here’s my conversation with Seth:

LEE: Seth, thank you so much for joining.  

SETH HAIN: Well, Peter, it’s such an exciting time to sit down and talk about this topic. So much has changed in the last two years. Thanks for inviting me.  

LEE: Yeah, in fact, I think in a way both of our lives have been upended in many ways by the emergence of AI. [LAUGHTER]  

The traditional listeners of the Microsoft Research Podcast, I think for the most part, aren’t steeped in the healthcare industry. And so maybe we can just start with two things. One is, what is Epic, really? And then two, what is your job? What does the senior vice president for R&D at Epic do every day? 

HAIN: Yeah, well, let’s start with that first question. So, what is Epic? Most people across the world experience Epic through something we call MyChart. They might use it to message their physician. They might use it to check the lab values after they’ve gotten a recent test. But it’s an app on their phone, right, for connecting in with their doctors and nurses and really making them part of the care team.  

But the software we create here at Epic goes beyond that. It’s what runs in the clinic, what runs at the bedside, in the back office to help facilitate those different pieces of care, from collecting vital information at the bedside to helping place orders if you’re coming in for an outpatient visit, maybe with a kiddo with an earache, and capturing that note and record of what happened during that encounter, all the way through back-office encounters, back-office information for interacting with payers as an example.  

And so, we provide a suite of software that health systems and increasingly a broader set of the healthcare ecosystem, like payers and specialty diagnostic groups, use to connect with that patient at the center around their care. 

And my job is to help our applications across the company take advantage of those latest pieces of technology to help improve the efficiency of folks like clinicians in the exam room when you go in for a visit. We’ll get into, I imagine, some use cases like ambient conversations, capturing that conversation in the exam room to help drive some of that documentation.  

But then providing that platform for those teams to build those and then strategize around what to create next to help both the physicians be efficient and also the health systems. But then ultimately continuing to use those tools to advance the science of medicine. 

LEE: Right. You know, one thing that I explain to fellow technologists is that I think today health records are almost entirely digital. I think the last figures I saw is well over 99% of all health records are digital.  

But in the year 2001, fewer than 15% of health records were digital. They were literally in folders on paper in storerooms, and if you’re old enough, you might even remember seeing those storerooms.  

So, it’s been quite a journey. Epic and Epic’s competitors—though I think Epic is really the most important company—have really moved the entire infrastructure of record keeping and other communications in healthcare to a digital foundation.  

And I think one thing we’ll get into, of course, one of the issues that has really become, I think, a problem for doctors and nurses is the kind of clerical or paperwork, record-keeping, burden. And for that reason, Epic and Epic systems end up being a real focus of attention. And so, we’ll get into that in a bit here.  

HAIN: And I think that hits, just to highlight it, on both sides. There is both the need to capture documentation; there’s also the challenge in reviewing it.  

LEE: Yes.  

HAIN: The average medical record these days is somewhere between the length of Fahrenheit 451 and To Kill a Mockingbird. [LAUGHTER] So there’s a fair amount of effort going in on that review side, as well. 

LEE: Yeah, indeed. So much to get into there. But I would like to talk about encounters with AI. So obviously, I think there are two eras here: before the emergence of ChatGPT and what we now know of as generative AI and afterwards. And so, let’s take the former.  

Of course, you’ve been thinking about machine learning and health data probably for decades. Do you have a memory of how you got into this? Why did you get an interest in data analytics and machine learning in the first place? 

HAIN: Well, my background, as you noted, is in mathematics before I came to Epic. And the sort of patterns and what could emerge were always part of what drove that. Having done development and kind of always been around computers all my life, it was a natural transition as I came here.  

And I started by really focusing on, how do we scale systems for the very largest organizations, making sure they are highly available and also highly responsive? Time is critical in these contexts in regards to rapidly getting information to doctors and nurses.  

And then really in the, say, in the 2010s, there started to be an emergence of capabilities from a storage and compute perspective where we could begin to build predictive analytics models. And these were models that were very focused, right. It predicted the likelihood somebody would show up for an appointment. It predicted the likelihood that somebody may fall during an inpatient stay, as an example.  

And I think a key learning during that time period was thinking through the full workflow. What information was available at that point in time, right? At the moment somebody walks into the ED [emergency department], you don’t have a full picture to predict the likelihood that they may deteriorate during an inpatient encounter.  

And in addition to what information was available was, what can you do about it? And a key part of that was how do we help get the right people in the right point in time at the bedside to make an assessment, right? It was a human-in-the-loop type of workflow where, for example, you would predict deterioration in advance and have a nurse come to the bedside or a physician come to the bedside to assess.  

And I think that combination of narrowly focused predictive models with an understanding that to have them make an impact you had to think through the full workflow of where a human would make a decision was a key piece. 
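For readers outside health IT, the narrowly focused predictive models Hain describes look roughly like the sketch below: a small classifier trained on a handful of workflow features whose score does nothing on its own except route a human to the bedside or the phone. The features, training data, and threshold are invented for illustration.

```python
# Toy example of a narrow predictive model (appointment no-show risk) wired to a
# human-in-the-loop trigger; all numbers here are made up for illustration.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical historical features: [days_since_scheduled, prior_no_shows, distance_km]
X_train = np.array([[30, 2, 25], [2, 0, 3], [14, 1, 10], [1, 0, 1], [45, 3, 40], [7, 0, 5]])
y_train = np.array([1, 0, 1, 0, 1, 0])  # 1 = patient did not show up

model = LogisticRegression().fit(X_train, y_train)


def flag_for_outreach(features, threshold=0.6):
    """Return the predicted risk and whether to route the case to a person
    (e.g., scheduling staff placing a reminder call) rather than act automatically."""
    risk = model.predict_proba([features])[0, 1]
    return risk, risk >= threshold


risk, needs_outreach = flag_for_outreach([21, 2, 18])
print(f"no-show risk={risk:.2f}, trigger outreach={needs_outreach}")
```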

LEE: Obviously there is a positive human impact. And so, for sure, part of the thought process for these kinds of capabilities comes from that.  

But Epic is also a business, and you have to worry about, you know, what are doctors and clinics and healthcare systems willing to buy. And so how do you balance those two things, and do those two things ever come into conflict as you’re imagining what kinds of new capabilities and features and products to create? 

HAIN: Two, sort of, two aspects I think really come to mind. First off, generally speaking, we see analytics and AI as a part of the application. So, in that sense, it’s not something we license separately. We think that those insights and those pieces of data are part of what makes the application meaningful and impactful.  

At the scale that many of these health systems operate and the number of patients that they care for, as well as having tens of thousands of users in the system daily, one needs to think about the compute overhead … 

LEE: Yes. 

HAIN: … that these things cause. And so, in that regard, there is always a ROI assessment that is taking place to some degree around, what happens if this runs at full scale? And in a way, that really got accelerated as we went into the generative AI era.  

LEE: Right. OK. So, you mentioned generative AI. What was the first encounter, and what was that experience for you?

HAIN: So, in the winter of ’22 and into 2023, I started experimenting alongside you with what we at that time called DV3, or Davinci 3, and which eventually became GPT-4. And immediately, a few things became obvious. The tool was highly general purpose. One was able to, in putting in a prompt, have it sort of convert into the framing and context of a particular clinical circumstance and reason around that context. But I think the other thing that started to come to bear in that context was there was a fair amount of latent knowledge inside of it that was very, very different than anything we’d seen before. And, you know, there’s some examples from the Sparks of AGI paper from Microsoft Research, where a series of objects end up getting stacked together in the optimal way to build height. Just given the list of objects, it seems to have an understanding of physical space that it intuited from the training process, something we hadn’t seen anywhere before. So that was an entirely new capability that programmers now had access to.  

LEE: Well in fact, you know, I think that winter of 2022, and we’ll get into this, one of your projects that you’ve been running for quite a few years is something called Cosmos, which I find exceptionally interesting. And I was motivated to understand whether this type of technology could have an impact there.  

And so, I had to receive permission from both OpenAI and Microsoft to provide you with early access.  

When I did first show this technology to you, you must have had an emotional response, either skepticism or … I can’t imagine you just trusted, you know, trusted me to the extent of believing everything I was telling you. 

HAIN: I think there’s always a question of, what is it actually, right? It’s often easy to create demos. It’s often easy to show things in a narrow circumstance. And it takes getting your hands on it and really spending your 10,000 hours digging in and probing it in different ways to see just how general purpose it was.  

And so, the skepticism was really around, how applicable can this be broadly? And I think the second question—and we’re starting to see this play out now in some of the later models—was, is this just a language thing? Is it narrowly only focused on that? Or can we start to imagine other modalities really starting to factor into this? How will it impact basic sciences? Those sorts of things.

On a personal note, I mean, I had two kids at that point, now 14 and 12, and I wondered, what did this mean for them? What is the right thing for them to be studying? And so I remember sleepless nights on that topic, as well. 

LEE: OK, so now you get early access to this technology; you’re able to do some experimentation. I think one of the things that impressed me is just less than four months later at the major health tech industry conference, HIMSS, which also happened timing-wise to take place just after the public disclosure of GPT-4, Epic showed off some early prototype applications of generative AI. And so, describe what those were, and how did you choose what to try to do there? 

HAIN: Yeah, and we were at that point, we actually had the very first users live on that prototype, on that early version.  

And the key thing we’d focused on—we started this development in very, very late December, January of 2023—was a problem that its origins really were during the pandemic.  

So, during the pandemic, we started to see patients increasingly messaging their providers, nurses, and clinicians through MyChart, that patient portal I mentioned with about 190 million folks on it. And as you can imagine, that was a great opportunity in the context of COVID to limit the amount of direct contact between providers and patients while still getting their questions answered.  

But what we found as we came out of the pandemic was that folks preferred it regardless. And that messaging volume had stayed very, very high and was a time-consuming effort for folks.  

And so, the first use case we came out with was a draft message in the context of the message from the patient and understanding of their medical history using that medical record that we talked about.  

And the nurse or physician using the tool had two options. They could either click to start with that draft and edit it and then hit send, or they could go back to the old workflow and start with a blank text box and write it from their own memory as they preferred.

And so that was that very first use case. There were many more that we had started from a development perspective, but, yeah, we had that rolling out right in March of 2023 there with the first folks. 

LEE: So, I know from our occasional discussions that some things worked very well. In fact, this is a real product now for Epic. And it seems to be really a very, very popular feature now. I know from talking to you that a lot of things have been harder. And so, I’d like to dive into that. As a developer, a tech developer, you know, what’s been easy, what’s been hard, and what in your mind is still left to do in terms of the development of AI?

HAIN: Yeah. You know, the first thing that comes to mind, sort of starting foundationally, and we hinted at this earlier in our conversation, was that at that point in time, it was rather compute-intensive to run these models on a per-message basis. And so, there were always trade-offs we were making in regards to how many pieces of information we would send into the model and how much we would request back out of it.

The result of that was that while, kind of theoretically or even from a research perspective, we could achieve certain outcomes that were quite advanced, one had to think about where you make those trade-offs from a scalability perspective as you wanted to roll that out to a lot of folks. So …

LEE: Were you charging your customers more money for this feature? 

HAIN: Yeah, essentially the way that we handle that is there’s compute that’s required. As I mentioned, the feature is just part of our application. So, it’s just what they get with an upgrade.  

But that compute overhead is something that we needed to pass through to them. And so, it was something, particularly given both the staffing challenges, but also the margin pressures that health systems are feeling today, we wanted to be very cautious and careful about. 

LEE: And let’s put that on the stack because I do want to get into, from the selling perspective, that challenge and how you perceive health systems as a customer making those trade-offs. But let’s continue on the technical side here. 

HAIN: Yeah. On the technical side, it was a consideration, right. We needed to be thoughtful about how we used them. But going up a layer in the stack, at that time, there’s a lot of conversation in the industry around something called RAG, or retrieval-augmented generation.  

And the idea was, could you pull the relevant bits, the relevant pieces of the chart, into that prompt, that information you shared with the generative AI model, to be able to increase the usefulness of the draft that was being created? And that approach ended up proving and continues to be to some degree, although the techniques have greatly improved, somewhat brittle, right. You have a general-purpose technology that is drafting the response. 

But in many ways, you needed to, for a variety of pragmatic reasons, have somewhat brittle capability in regards to what you pulled into that approach. It tended to be pretty static. And I think this becomes one of the things that, looking forward, as these models have gotten a lot more efficient, we are and will continue to improve upon because, as you get a richer and richer amount of information into the model, it does a better job of responding.  
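To make the retrieval step concrete, here is a minimal, illustrative sketch of the kind of retrieval-augmented drafting flow described above; it is not Epic's implementation, and the `embed` and `generate` helpers are hypothetical stand-ins for whatever embedding and generative-model endpoints a real deployment would use.

```python
# Minimal RAG sketch: pull the most relevant chart snippets into the prompt
# before asking a generative model to draft a reply to a patient message.
# `embed` and `generate` are hypothetical stand-ins for real model endpoints.
import math
from typing import Callable, List, Sequence

def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(message: str, chart_sections: List[str],
             embed: Callable[[str], Sequence[float]], k: int = 3) -> List[str]:
    """Rank chart sections by similarity to the patient message and keep the top k."""
    query = embed(message)
    ranked = sorted(chart_sections, key=lambda s: cosine(embed(s), query), reverse=True)
    return ranked[:k]

def draft_reply(message: str, chart_sections: List[str],
                embed: Callable[[str], Sequence[float]],
                generate: Callable[[str], str]) -> str:
    """Build a prompt from the retrieved context and ask the model for a draft."""
    context = "\n\n".join(retrieve(message, chart_sections, embed))
    prompt = (
        "You are drafting a reply for a clinician to review and edit.\n"
        f"Relevant chart excerpts:\n{context}\n\n"
        f"Patient message:\n{message}\n\n"
        "Draft a brief, plain-language reply. Do not state a diagnosis."
    )
    return generate(prompt)  # the clinician reviews and edits before sending
```

The brittleness described above lives mostly in `retrieve`: a fixed `k` and a static notion of relevance decide what the model ever sees, so anything the retriever misses simply cannot show up in the draft.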

I think the third thing, and I think this is going to be something we’re going to continue to work through as an industry, was helping users understand and adapt to these circumstances. So many folks when they hear AI think, it will just magically do everything perfectly.  

And particularly early on with some of those challenges we’re talking about, it doesn’t. You know, if it’s helpful 85% of the time, that’s great, but it’s not going to be 100% of the time. And it’s interesting as we started, we do something we call immersion, where we always make sure that developers are right there elbow to elbow with the users of the software. 

And one of the things that I realized through that experience with some of the very early organizations like UCSD [UC San Diego] or University of Wisconsin here in Madison was that even when I’m responding to an email or a physician is responding to one of these messages from a patient, depending on the patient and depending on the person, they respond differently.  

In that context, there’s opportunity to continue to mimic that behavior as we go forward more deeply. And so, you learn a lot about, kind of, human behavior as you’re putting these use cases out into the world. 

LEE: So, you know, this increasing burden of electronic communications between doctors, nurses, and patients is centered in one part of Epic. I think that’s called your in-basket application, if I understand correctly.  

HAIN: That’s correct. 

LEE: But that also creates, I think, a reputational risk and challenge for Epic because as doctors feel overburdened by this and they’re feeling burnt out—and as we know, that’s a big issue—then they point to, you know, “Oh, I’m just stuck in this Epic system.”  

And I think a lot of the dissatisfaction about the day-to-day working lives of doctors and nurses then focuses on Epic. And so, to what extent do you see technologies like generative AI as, you know, a solution to that or contributing either positively or negatively to this? 

HAIN: You know, earlier I made the comment that in December, as we started to explore this technology, we realized there were a class of problems that now might have solutions that never did before.  

And as we’ve started to dig into those—and we now have about 150 different use cases under development, many of which are live across the roughly 350 health systems using them—one of the things we’ve started to find is that physicians, nurses, and others start to say it’s helping them move forward with their jobs.

And examples of this, obviously the draft of the in-basket message response is one, but using ambient voice recognition as a kind of new input into the software so that when a patient and a physician sit down in the exam room, the physician can start a recording and that conversation then ends up getting translated or summarized, if you will, including using medical jargon, into the note in the framework that the physician would typically write.  

Another one of those circumstances where they then review it, don’t need to type it out from scratch, for example, …  

LEE: Right. 

HAIN: … and can quickly move forward.  
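As a rough illustration of that ambient-documentation flow, the sketch below turns a visit transcript into a draft note organized into sections a physician might typically use; the SOAP-style section list and the `generate` call are assumptions for the example, not a description of the actual product.

```python
# Illustrative sketch: summarize an ambient visit transcript into a draft
# clinical note, organized into familiar sections, for physician review.
# `generate` is a hypothetical stand-in for a generative-model call.
from typing import Callable

NOTE_SECTIONS = ["Subjective", "Objective", "Assessment", "Plan"]  # example framework only

def draft_visit_note(transcript: str, generate: Callable[[str], str]) -> str:
    """Ask the model for a sectioned draft note grounded only in the transcript."""
    prompt = (
        "Summarize the following patient-physician conversation into a clinical note "
        f"with these sections: {', '.join(NOTE_SECTIONS)}. Use standard medical "
        "terminology and include only information stated in the conversation.\n\n"
        f"Conversation transcript:\n{transcript}"
    )
    return generate(prompt)  # the physician reviews and edits before signing
```

The important design point is the last step: the draft is an input to the physician's review, not something that is filed automatically.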

I think looking forward, you know, you brought up Cosmos earlier. It’s a suite of applications, but at its core is a dataset of about 300 million de-identified patients. And so using generative AI, we built research tools on top of it. And I bring that up because it’s a precursor of how that type of deep analytics can be put into context at the point of care. That’s what we see this technology more deeply enabling in the future. 

LEE: Yeah, when you are creating … so you said there are about 150 sort of integrations of generative AI going into different parts of Epic’s software products.  

When you are doing those developments and then you’re making a decision that something is going to get deployed, one thing that people might worry about is, well, these AI systems hallucinate. They have biases. There are unclear accountabilities, you know, maybe patient expectations.  

For example, if there’s a note drafted by AI that’s sent to a patient, does the patient have a right to know what was written by AI and what was written by the human doctor? So, can we run through how you have thought about those things?  

HAIN: I think one thing that is important context to set here for folks, and I think it’s often a point of confusion when I’m chatting with folks in public, is that their interaction with generative AI is typically through a chatbot, right. It’s something like ChatGPT or Bing or one of these other products where they’re essentially having a back-and-forth conversation. 

LEE: Right. 

HAIN: And that is a dramatically different experience than how we think it makes sense to embed into an enterprise set of applications.  

So, an example use case may be in the back office, where there are folks coding encounters. So, when a patient comes in, right, they have the conversation with the doctor, the doctor documents it, that encounter needs to be billed for, and those folks in the back office associate with that encounter a series of codes that provide information about how that billing should occur.

So, one of the things we did from a workflow perspective was add a selector pane to the screen that uses generative AI to suggest a likely code. Now, this suggestion runs the risk of hallucination. So, the question is, how do you build additional checks into the workflow that help the user verify it?

And so in this context, we always include a citation back to the part of the medical record that justifies or supports that code. So quickly on hover, the user can see, does this make sense before selecting it? And it’s those types of workflow pieces that we think are critical to using this technology as an aid to helping people make decisions faster, right. It’s similar to drafting documentation that we talked about earlier.  
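As one way to picture that pattern, the sketch below shows a suggestion object that always carries a citation back into the chart, plus a simple guard that drops any suggestion whose citation does not resolve; the data shapes and the `model_suggest` call are hypothetical, not Epic's actual interfaces.

```python
# Illustrative sketch: a billing-code suggestion that always carries a citation
# back to the chart text that supports it, so the user can verify it on hover.
# `model_suggest` is a hypothetical generative-model call, not a real API.
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class Citation:
    note_id: str   # which document in the encounter
    start: int     # character offsets into that document
    end: int

@dataclass
class CodeSuggestion:
    code: str        # e.g., a billing code
    rationale: str   # short model-generated justification
    citation: Citation

def verified_suggestions(encounter_notes: Dict[str, str],
                         model_suggest: Callable[[Dict[str, str]], List[CodeSuggestion]]
                         ) -> List[CodeSuggestion]:
    """Keep only suggestions whose citation points at real text in the encounter."""
    kept = []
    for s in model_suggest(encounter_notes):
        note = encounter_notes.get(s.citation.note_id)
        if note is not None and 0 <= s.citation.start < s.citation.end <= len(note):
            kept.append(s)  # the UI can show note[s.citation.start:s.citation.end] on hover
    return kept
```

Dropping suggestions whose citations do not resolve is one simple backstop against hallucinated codes; the human still makes the final selection.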

And it’s interesting because there’s a series of patterns that are … going back to the AI Revolution book you folks wrote two years ago, some of these are really highlighted there, right. This idea of things like a universal translator is a common pattern that we ended up applying across the applications. And in my mind, and this may sound a little bit strange, summarization is an example of translation: translating a very long series of information in a medical record into the context that an ED physician might care about, where they have three or four minutes to quickly review that very long chart.

And so, in that perspective, and back to your earlier comment, we added the summary into the workflow but always made sure that the full medical record was available to that user, as well. So, a lot of what we’ve done over the last couple of years has been to create a series of repeatable techniques in regards to both how to build the backend use cases, where to pull the information, feed it into the generative AI models.  

But then I think more importantly are the user experience design patterns to help mitigate those risks you talked about and to maintain consistency across the integrated suite of applications of how those are deployed.  

LEE: You might remember from our book, we had a whole chapter on reducing paperwork, and I think that’s been a lot of what we’ve been talking about. I want to get beyond that, but before transitioning, let’s get some numbers.  

So, you talked about messages drafted to patients, to be sent to patients. So, give a sense of the volume of what’s happening right now. 

HAIN: Oh, we are seeing, across the 300 and, I think it’s, 48 health systems that are now using generative AI—and to be clear, we have about 500 health systems we have the privilege of working with, each with many, many hospitals—tens of thousands of physicians and nurses using the software. That includes drafting a million-plus notes a month at this point, for example, as well as helping to generate a similar number of responses to patients.

The thing I’m increasingly excited about is the broader set of use cases that we’re seeing folks starting to deploy now. One of my favorites has been … it’s natural that as part of, for example, a radiology workflow, in studying that image, the radiologist made a note that it would be worth double-checking, say in six to eight months, that the patient have this area of their chest scanned. Something looks a little bit fishy there, but there’s not …

LEE: There’s not a definitive finding yet. 

HAIN: … there’s not a definitive finding at that point. Part of that workflow is that the patient’s physician places an order for that in the future. And so, we’re using generative AI to surface that note back to the physician and, with one click, allow them to place that order, helping that patient get better care.

That’s one example of dozens of use cases that are now live, both to help improve the care patients are getting but also help the workforce. So going back to the translation-summarization example, a nurse at the end of their shift needs to write up a summary of that shift for the next nurse for each … 

LEE: Right. 

HAIN: … each patient that they care for. Well, they’ve been documenting information in the chart over those eight or 12 hours, right.  

LEE: Yep, yep. 

HAIN: So, we can use that information to quickly draft that end-of-shift note for the nurse. They can verify it with those citations we talked about and make any additions or edits that they need and then complete their end of day far more efficiently.  

LEE: Right. OK. So now let’s get to Cosmos, which has been one of these projects that I think has been your baby for many years and has been something that has had a profound impact on my thinking about possibilities. So first off, what is Cosmos? 

HAIN: Well, just as an aside, I appreciate the thoughtful comments. There is a whole team of folks here that are really driving these projects forward. And a large part of that has been, as you brought up, both Cosmos as a foundational capability but then beginning to integrate it into applications. And that’s what those folks spend time on.  

Cosmos is this effort across hundreds of health systems that we have the privilege of working with to build out a de-identified dataset that today—and it climbs every day—has 300 million unique patient records in it.

And one of the interesting things about that structure is that, for example, if I end up in a hospital in Seattle and have that encounter documented at a health system in Seattle, a de-identified version of me still only shows up once in Cosmos, stitching together my information from here in Madison, Wisconsin, where Epic is at, with that extra data from Seattle. The result is these 300 million unique longitudinal records that have a deep history associated with them.
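As a rough sketch of that stitching idea, the example below groups encounters contributed by different health systems under a single one-way token per patient; real de-identification and record linkage are far more involved, and the `deid_token` helper here is purely illustrative.

```python
# Rough sketch: stitch encounters from multiple health systems into one
# de-identified longitudinal record per patient. `deid_token` is an
# illustrative stand-in for real de-identification and record linkage.
import hashlib
from collections import defaultdict
from typing import Dict, List, Tuple, TypedDict

class Encounter(TypedDict):
    date: str      # ISO date of the encounter
    system: str    # which health system contributed it
    summary: str   # de-identified clinical content

def deid_token(identity_fields: str, salt: str) -> str:
    """One-way token so the same person maps to the same record without storing identity."""
    return hashlib.sha256((salt + identity_fields).encode()).hexdigest()

def stitch(contributions: List[Tuple[str, Encounter]], salt: str) -> Dict[str, List[Encounter]]:
    """Group encounters from many systems under one token per de-identified patient."""
    records: Dict[str, List[Encounter]] = defaultdict(list)
    for identity_fields, encounter in contributions:
        records[deid_token(identity_fields, salt)].append(encounter)
    for encounters in records.values():
        encounters.sort(key=lambda e: e["date"])  # longitudinal ordering by date
    return dict(records)
```

The same person documented in Madison and in Seattle hashes to the same token, so their encounters land in one longitudinal record even though the identifying fields themselves are never stored.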

LEE: And just to be clear, a patient record might have hundreds or even thousands of individual, I guess what you would call, clinical records or elements. 

HAIN: That’s exactly right. It’s the breadth of information, from orders and allergies and blood pressures collected, for example, in an outpatient setting to cancer staging information that might have come through as part of an oncology visit. And it’s coming from a variety of sources. We exchange information about 10 million times a day between different health systems. And that full picture of the patient is available within Cosmos in that way.

LEE: So now why? Why Cosmos? 

HAIN: Why Cosmos? Well, the real ultimate aim is to put a deeply informed in-context perspective at the point of care. So, as a patient, if I’m in the exam room, it’s helpful for the physician and me to know what have similar patients like me experienced in this context. What was the result of that line of treatment, for example? 

Or as a doctor, if I’m looking at and working through a case that is relatively rare or strange to me, I might be able to connect—this is an example workflow we built called Look-Alikes—with another physician who has seen similar patients, or within the workflow see a list of likely diagnoses based on patients that have been in a similar context. And so, the design of Cosmos is to put those insights into the point of care in the context of the patient.

To facilitate those steps there, the first phase was building out a set of research tooling. So, we see dozens of papers a year being published by the health systems that we work with. Those that participate in Cosmos have access to it to do research on it. And so they use both a series of analytical and data science tools to do that analysis and then publish research. So, building up trust that way.  

LEE: The examples you gave are, like with Look-Alikes, it’s very easy, I think, for people outside of the healthcare world to imagine how that could be useful. So now why is GPT-4 or any generative AI relevant to this? 

HAIN: Well, so a couple of different pieces, right. Earlier we talked about—and I think this is the most important—how generative AI is able to cast things into a specific context. And so, in that way, we can use these tools to help both identify a cohort of patients similar to you when you’re in the exam room and then also help present that information back in a way that relates it to other research and understanding from the medical literature, to understand what those likely outcomes are.
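A minimal sketch of that cohort question, assuming a hypothetical `similarity` function over de-identified patient histories: rank the records closest to the current patient and tally the outcomes recorded for them.

```python
# Sketch of the "what happened to patients like me" question: rank the most
# similar de-identified records, then count the outcomes observed among them.
# `similarity` is a hypothetical scoring function over patient histories.
from collections import Counter
from typing import Callable, List, Tuple

def likely_outcomes(current_history: str,
                    cohort: List[Tuple[str, str]],  # (history text, recorded outcome)
                    similarity: Callable[[str, str], float],
                    k: int = 100) -> List[Tuple[str, int]]:
    """Return outcome counts among the k cohort records most similar to this patient."""
    ranked = sorted(cohort, key=lambda r: similarity(current_history, r[0]), reverse=True)
    return Counter(outcome for _, outcome in ranked[:k]).most_common()
```

Casting those counts back into the patient's specific context, alongside the relevant literature, is where the generative model comes in.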

I think more broadly, these tools and generative AI techniques in the transformer architecture envision a deeper understanding of sequences of events, sequences of words. And that starts to open up broader questions about what can really be understood about patterns and sequences of events in a patient’s journey.  

Which, if you didn’t know, is where the name Epic came from: just as a nation’s great, long journey is told through an epic story, so is a patient’s story.

LEE: So, we’re running up against our time together. And I always like to end with a more provocative question.  

HAIN: Certainly. 

LEE: And for you, I wanted to raise a question that I think we had asked ourselves in the very earliest days that we were sharing Davinci 3, what we now know of as GPT-4, with each other, which is, is there a world in the future because of AI where we don’t need electronic health records anymore? Is there a world in the future without EHR? 

HAIN: I think it depends on how you define EHR. I see a world coming where we need to manage a hybrid workforce, where there is a combination of humans and something folks are sometimes calling agents working in concert together to care for more and more of our … of the country and of the world. And there is and will need to be a series of tools to help orchestrate that hybrid workforce. And I think things like EHRs will transform into helping that operate … be operationally successful.  

But as a patient, I think there’s a very different opportunity that starts to be presented. And we’ve talked about kind of understanding things deeply in context. There’s also a real acceleration happening in science right now. And the possibility of bringing those second- and third-order effects of generative AI to the point of care, be that through the real-world evidence we were talking about with Cosmos or maybe personalized therapies that really are well matched to that individual. These generative AI techniques open the door for that, as well as for the full lifecycle of managing that from a healthcare perspective, all the way through monitoring after the fact.

And so, I think we’ll still be recording people’s stories. Their stories are relevant to them, and they can help inform the bigger picture. But I think the real question is, how do you put those in a broader context? And these tools open the door for a lot more. 

LEE: Well, that’s really a great vision for the future.  

[TRANSITION MUSIC] 

Seth, I always really learn so much talking to you, and thank you so much for this great chat. 

HAIN: Thank you for inviting me.   

LEE: I see Seth as someone on the very leading frontier of bringing generative AI to the clinic and into the healthcare back office and at the full scale of our massive healthcare system. It’s always impressive to me how thoughtful Seth has had to be about how to deploy generative AI into a clinical setting.  

And, you know, one thing that sticks out—and he made such a point of this—is, you know, generative AI in the clinical setting isn’t just a chatbot. They’ve had to really think of other ways that will guarantee that the human stays in the loop. And that’s of course exactly what Carey, Zak, and I had predicted in our book. In fact, we even had a full chapter of our book entitled “Trust but Verify,” which really spoke to the need in medicine to always have a human being directly involved in overseeing the process of healthcare delivery. 

One technical point that Carey, Zak, and I completely missed, on the other hand, in our book, was the idea of something that Seth brought up called RAG, which is retrieval-augmented generation. That’s the idea of giving AI access to a database of information and allowing it to use that database as it constructs its answers. And we heard from Seth how fundamental RAG is to a lot of the use cases that Epic is deploying. 

And finally, I continue to find Seth’s project called Cosmos to be a source of inspiration, and I’ve continued to urge every healthcare organization that has been collecting data to consider following a similar path. 

In our book, we spent a great deal of time focusing on the possibility that AI might be able to reduce or even eliminate a lot of the clerical drudgery that currently exists in the delivery of healthcare. We even had a chapter entitled “The Paperwork Shredder.” And we heard from both Matt and Seth that that has indeed been the early focus of their work.  

But we also saw in our book the possibility that AI could provide diagnoses, propose treatment options, be a second set of eyes to reduce medical errors, and in the research lab be a research assistant. And here in Epic’s Cosmos, we are seeing just the early glimpses that perhaps generative AI can actually provide new research possibilities in addition to assistance in clinical decision making and problem solving. On the other hand, that still seems to be for the most part in our future rather than something that’s happening at any scale today. 

But looking ahead to the future, we can still see the potential of AI helping connect healthcare delivery experiences to the advancement of medical knowledge. As Seth would say, the ability to connect bedside to the back office to the bench. That’s a pretty wonderful future that will take a lot of work and tech breakthroughs to make it real. But the fact that we now have a credible chance of making that dream happen for real, I think that’s pretty wonderful. 

[MUSIC TRANSITIONS TO THEME] 

I’d like to say thank you again to Matt and Seth for sharing their experiences and insights. And to our listeners, thank you for joining us. We have some really great conversations planned for the coming episodes, including a look at how patients are using generative AI for their own healthcare, as well as an episode on the laws, norms, and ethics developing around AI and health, and more. We hope you’ll continue to tune in.

Until next time.

[MUSIC FADES] 

[1] A provider of conversational, ambient, and generative AI, Nuance was acquired by Microsoft in March 2022. Nuance solutions and capabilities are now part of Microsoft Cloud for Healthcare.

[2] According to the survey, of the 20% of respondents who said they use generative AI in clinical practice, 29% reported using the technology for patient documentation and 28% said they use it for differential diagnosis.

[3] A month after the conversation was recorded, Microsoft Dragon Copilot was unveiled. Dragon Copilot combines and extends the capabilities of DAX Copilot and Dragon Medical One.


The post Real-world healthcare AI development and deployment—at scale appeared first on Microsoft Research.

Read More

PyTorch Day France 2025: Call For Proposals Open

PyTorch Day France 2025: Call For Proposals Open

We’re pleased to announce PyTorch Day France 2025, a dedicated gathering of the PyTorch community held 7 May 2025 in Paris, France. Proudly hosted by the PyTorch Foundation and co-located with GOSIM AI Paris 2025, this event will bring together developers, researchers, and practitioners driving innovation in open source AI and machine learning.

Whether you’re building cutting-edge models or contributing to the ecosystem, PyTorch Day France is your opportunity to connect, collaborate, and help shape the future of deep learning.

PT Day CFP

Why Attend?

Set in the vibrant atmosphere of STATION F, the world’s largest startup campus, PyTorch Day France will offer a full day of:

  • Insightful Technical Talks
  • Interactive Discussions
  • Engaging Poster Sessions

The event is designed to foster open exchange across the PyTorch ecosystem, providing a space to learn from peers, share practical insights, and explore the latest research and applications in AI.

Submit a Proposal

We are currently accepting proposals for talks. If you have a project, idea, or research story you’d like to share with the PyTorch community, we want to hear from you.

📩 Email your talk title and abstract to pytorchevents@linuxfoundation.org for consideration.

Registration

To register for PyTorch Day France, please visit the GOSIM AI Paris website, and use the code PYTORCHFRIEND to receive 25% off.

👉 https://paris2025.gosim.org/

We encourage early registration to secure your spot and ensure access to both PyTorch Day France and the broader GOSIM AI Paris programming.

Venue

STATION F
5 Parv. Alan Turing, 75013 Paris, France
A landmark of innovation and entrepreneurship in the heart of Paris.

Travel and Accommodations

Participants are responsible for their own travel and lodging. For those arriving internationally, Paris Charles de Gaulle Airport is approximately 38.4 km from STATION F. Additional information about accommodations and transportation may be available on the GOSIM AI Paris website.

Questions?

For any inquiries, please contact us at pytorchevents@linuxfoundation.org.

We look forward to welcoming the PyTorch community to Paris this May for a day of collaboration, learning, and open source AI innovation.

Read More

Interpreting and Improving Optimal Control Problems With Directional Corrections

Many robotics tasks, such as path planning or trajectory optimization, are formulated as optimal control problems (OCPs). The key to obtaining high performance lies in the design of the OCP’s objective function. In practice, the objective function consists of a set of individual components that must be carefully modeled and traded off such that the OCP has the desired solution. It is often challenging to balance multiple components to achieve the desired solution and to understand, when the solution is undesired, the impact of individual cost components. In this paper, we present a framework…Apple Machine Learning Research