GPT-4’s potential in shaping the future of radiology

This research paper is being presented at the 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP 2023), the premier conference on natural language processing and artificial intelligence.


In recent years, AI has been increasingly integrated into healthcare, bringing about new areas of focus and priority, such as diagnostics, treatment planning, and patient engagement. While AI’s contribution in certain fields like image analysis and drug interaction is widely recognized, its potential in natural language tasks within these newer areas presents an intriguing research opportunity.

One notable advancement in this area involves GPT-4’s impressive performance on medical competency exams and benchmark datasets. GPT-4 has also demonstrated potential utility in medical consultations, providing a promising outlook for healthcare innovation.

Progressing radiology AI for real problems

Our paper, “Exploring the Boundaries of GPT-4 in Radiology,” which we are presenting at EMNLP 2023, further explores GPT-4’s potential in healthcare, focusing on its abilities and limitations in radiology—a field that is crucial to disease diagnosis and treatment through imaging technologies like x-rays, computed tomography (CT), and magnetic resonance imaging (MRI). We collaborated with our colleagues at Nuance, a Microsoft company, whose solution, PowerScribe, is used by more than 80 percent of US radiologists. Together, we aimed to better understand the technology’s impact on radiologists’ workflow.

Our research included a comprehensive evaluation and error analysis framework to rigorously assess GPT-4’s ability to process radiology reports, including common language understanding and generation tasks in radiology, such as disease classification and findings summarization. This framework was developed in collaboration with a board-certified radiologist to tackle more intricate and challenging real-world scenarios in radiology and move beyond mere metric scores.

We also explored various zero-shot, few-shot, and chain-of-thought (CoT) prompting techniques for GPT-4 across different radiology tasks and experimented with approaches to improve the reliability of GPT-4 outputs. For each task, GPT-4’s performance was benchmarked against prior GPT-3.5 models and the respective state-of-the-art radiology models.
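
To make the prompting setup concrete, the following is a minimal zero-shot sketch using the OpenAI Python SDK. The model identifier, prompt wording, and report text are illustrative assumptions, not the exact configuration evaluated in the paper.

# Minimal zero-shot sketch of radiology findings summarization with GPT-4.
# The prompt and findings text below are illustrative placeholders only.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

findings = (
    "The cardiomediastinal silhouette is within normal limits. "
    "No focal consolidation, pleural effusion, or pneumothorax."
)

response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "system", "content": "You are a radiology reporting assistant."},
        {
            "role": "user",
            "content": "Summarize the following chest x-ray findings into a "
                       f"concise impression:\n\n{findings}",
        },
    ],
    temperature=0,  # deterministic output for more reproducible evaluation
)
print(response.choices[0].message.content)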

We found that GPT-4 demonstrates new state-of-the-art performance on some tasks, achieving about a 10-percent absolute improvement over existing models, as shown in Table 1. Surprisingly, we found radiology report summaries generated by GPT-4 to be comparable to, and in some cases even preferred over, those written by experienced radiologists, with one example illustrated in Table 2.

Table 1: Results overview. GPT-4 either outperforms or is on par with previous state-of-the-art (SOTA) multimodal LLMs.
Table 2. Examples where GPT-4 findings summaries are favored over existing manually written ones on the Open-i dataset. In both examples, GPT-4 outputs are more faithful and provide more complete details on the findings.

Another encouraging prospect for GPT-4 is its ability to automatically structure radiology reports, as schematically illustrated in Figure 1. These reports, which are based on a radiologist’s interpretation of medical images like x-rays and include the patient’s clinical history, are often complex and unstructured, making them difficult to interpret. Research shows that structuring these reports can improve standardization and consistency in disease descriptions, making them easier for other healthcare providers to interpret and more easily searchable for research and quality improvement initiatives. Additionally, using GPT-4 to structure and standardize radiology reports can further support efforts to augment real-world data (RWD) and its use for real-world evidence (RWE). This can complement more robust and comprehensive clinical trials and, in turn, accelerate the application of research findings in clinical practice.

Figure 1. Radiology report findings are input into GPT-4, which structures the findings into a knowledge graph and performs tasks such as disease classification, disease progression classification, or impression generation.

Beyond radiology, GPT-4’s potential extends to translating medical reports into more empathetic and understandable formats for patients and other health professionals. This innovation could revolutionize patient engagement and education, making it easier for patients and their carers to actively participate in their healthcare.



A promising path toward advancing radiology and beyond

When used with human oversight, GPT-4 also has the potential to transform radiology by assisting professionals in their day-to-day tasks. As we continue to explore this cutting-edge technology, there is great promise in extending our evaluation of GPT-4, investigating how its outputs can be verified more thoroughly, and finding ways to improve its accuracy and reliability.

Our research highlights GPT-4’s potential in advancing radiology and other medical specialties, and while our results are encouraging, they require further validation through extensive research and clinical trials. Nonetheless, the emergence of GPT-4 heralds an exciting future for radiology. It will take the entire medical community working alongside other stakeholders in technology and policy to determine the appropriate use of these tools and responsibly realize the opportunity to transform healthcare. We eagerly anticipate its transformative impact on patient care and safety.

Learn more about this work by visiting the Project MAIRA (Multimodal AI for Radiology Applications) page.

Acknowledgements 

We’d like to thank our coauthors: Qianchu Liu, Stephanie Hyland, Shruthi Bannur, Kenza Bouzid, Daniel C. Castro, Maria Teodora Wetscherek, Robert Tinn, Harshita Sharma, Fernando Perez-Garcia, Anton Schwaighofer, Pranav Rajpurkar, Sameer Tajdin Khanna, Hoifung Poon, Naoto Usuyama, Anja Thieme, Aditya V. Nori, Ozan Oktay 



AWS AI services enhanced with FM-powered capabilities

Artificial intelligence (AI) continues to transform how we do business and serve our customers. AWS offers a range of pre-trained AI services that provide ready-to-use intelligence for your applications. In this post, we explore the new AI service capabilities and how they are enhanced using foundation models (FMs).

We focus on the following major updates in this post across key AI services:

  • Amazon Transcribe now offers FM-powered language support across over 100 languages to unlock rich insights.
  • Amazon Transcribe Call Analytics now offers a new generative AI-powered summarization capability (in preview) that automates post-call summarization to improve contact center agent and manager productivity.
  • Amazon Personalize now uses an FM to generate more compelling content and product recommendations.
  • Amazon Lex now uses large language models (LLMs) to provide accurate and conversational responses to FAQs (in preview), going beyond task-oriented dialogue.

Amazon Transcribe expands language support and supercharges customer service productivity using FMs

In order to build global and inclusive speech-enabled applications that cater to users from diverse linguistic backgrounds, customers seek a truly global AI service that can understand and transcribe a wide array of languages with high accuracy. To help you scale globally, Amazon Transcribe now offers a speech FM-powered automatic speech recognition (ASR) system that expands support to over 100 languages.

FM-powered Amazon Transcribe delivers significant accuracy improvements of between 20% and 50% across most languages. Apart from accuracy improvements, the new ASR system delivers several differentiating features across all supported languages (over 100) related to ease of use, customization, user safety, and privacy. Examples include automatic punctuation, custom vocabulary, automatic language identification, speaker diarization, word-level confidence scores, and custom vocabulary filters. Enabled by the high accuracy of Amazon Transcribe across different accents and noise conditions, its support for a large number of languages, and its breadth of value-added features, thousands of enterprises will be empowered to unlock rich insights from their audio content, as well as increase the accessibility and discoverability of their audio and video content across various domains. All existing and new customers using Amazon Transcribe can experience the performance improvements out of the box, without any API changes.

Carbyne is a software company that develops cloud-based, mission-critical contact center solutions for emergency call responders. Carbyne’s mission is to help emergency responders save lives, and language cannot come in the way of their goals.

“AI-powered Carbyne Live Audio Translation is directly aimed at helping improve emergency response for the 68 million Americans who speak a language other than English at home, in addition to the up to 79 million foreign visitors to the country annually. By leveraging Amazon Transcribe’s new multilingual foundation model powered ASR, Carbyne will be even better equipped to democratize life-saving emergency services, because Every. Person. Counts.”

– Alex Dizengof, Co-Founder and CTO of Carbyne.

In a contact center, agents spend precious time after each call manually summarizing notes, which can impact their productivity and increase call wait times. Managers who have limited time to investigate calls and agent performance spend a significant amount of time listening to call recordings or reading entire transcripts while investigating caller issues. Amazon Transcribe Call Analytics now offers generative call summarization, a generative AI-powered capability that can automatically condense the entire interaction into a concise summary. For example, the following is a sample summary of a 10-minute phone call: “Customer reported that they didn’t receive their order even after 10 days from expected delivery date. The agent offered the customer a free replacement and $10 credit for future purchases. The agent will follow up with the customer in 2 days to confirm the receipt of the replacement order.”

This capability allows agents to spend more time talking to callers waiting in the queue rather than engaging in after-call work, thereby improving customer experience. Managers can review the call summary to quickly understand the context of an interaction without reading the whole transcript.

“With the AWS post-call analytics solution, Principal can currently conduct large-scale historical analytics to understand where customer experiences can be improved, generate actionable insights, and prioritize where to act. We look forward to exploring the post-call summarization feature using generative AI in Amazon Transcribe Call Analytics in order to enable our agents to focus their time and resources on engaging with customers, rather than on manual after-contact work.”

– Miguel Antonio Sanchez, Regional Chief Data Officer, Principal Financial Group.

Generative call summarization can be enabled when you create a Call Analytics job on the Amazon Transcribe console, which also displays the resulting summarized transcript.
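
Since the console screenshots are not reproduced here, the following is a minimal sketch of enabling the same capability through the API with boto3; the job name, S3 paths, and role ARN are placeholders, and the Summarization setting reflects our understanding of the preview API.

import boto3

transcribe = boto3.client("transcribe")

# Start a Call Analytics job with generative call summarization enabled.
# All names, paths, and ARNs below are placeholders.
transcribe.start_call_analytics_job(
    CallAnalyticsJobName="demo-call-with-summary",
    Media={"MediaFileUri": "s3://my-bucket/calls/example-call.wav"},
    OutputLocation="s3://my-bucket/analytics-output/",
    DataAccessRoleArn="arn:aws:iam::123456789012:role/TranscribeDataAccess",
    Settings={
        # Generates the abstractive post-call summary (preview capability).
        "Summarization": {"GenerateAbstractiveSummary": True}
    },
    ChannelDefinitions=[
        {"ChannelId": 0, "ParticipantRole": "AGENT"},
        {"ChannelId": 1, "ParticipantRole": "CUSTOMER"},
    ],
)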

Amazon Personalize enables hyper-personalization with FMs

Customers across industries such as retail and media & entertainment are increasingly looking to make content and recommended products more tailored to user interests in order to drive higher engagement. For instance, on streaming platforms, users see the standard “Because you watched” recommendations, and on ecommerce websites, “frequently bought together” is used as a generic tagline. To offer more personalized browsing experiences with titles such as “Rise and Shine” and “Love, laughter, and hijinks,” companies need to allocate resources to generate compelling taglines manually, which is tedious and time-consuming.

To help address this challenge, Amazon Personalize now offers the Content Generator—a new FM-powered capability that uses natural language to craft simple and engaging text that describes the thematic connections between recommended items. This enables companies to automatically generate engaging titles or email subject lines, to invite customers to click on videos or purchase items.

In addition, Amazon Personalize now offers Personalize on LangChain to power the journey of customers who want to build their own FM-based applications. With this integration, you can invoke Amazon Personalize, retrieve recommendations for a campaign or recommender, and seamlessly feed it into your FM-powered applications within the LangChain ecosystem.

“We are integrating generative AI with Amazon Personalize in order to deliver hyper-personalized experiences to our users. Amazon Personalize has helped us achieve high levels of automation in content customization. For instance, FOX Sports experienced a 400% increase in viewership content starts post-event when applied. Now, we are augmenting generative AI with Amazon Bedrock to our pipeline in order to help our content editors generate themed collections. We look forward to exploring features such as Amazon Personalize Content Generator and Personalize on LangChain to further personalize those collections for our users.”

– Daryl Bowden, Executive Vice President, Technology, Fox Corporation.

Amazon Lex offers FM-powered capabilities to build bots faster and improve containment

Driven by rising consumer demand for automated self-service, companies are prioritizing investments in conversational AI to optimize customer experience. To that end, AWS recently previewed Conversational FAQ (CFAQ), a new capability from Amazon Lex that answers frequently asked customer questions intelligently and at scale. Powered by FMs from Amazon Bedrock and approved knowledge sources, CFAQ enables companies to provide accurate, automated responses to common customer inquiries in a natural and engaging way. With this innovation, brands can deliver seamless self-service experiences that strengthen customer satisfaction and loyalty.

CFAQ simplifies bot development by eliminating the need to manually create intents, sample utterances, slots, and prompts to handle a wide range of frequently asked questions. It does so with a new intent type called QnAIntent that securely connects to knowledge sources like Amazon Bedrock, Amazon OpenSearch Service, and Amazon Kendra knowledge bases to retrieve the most relevant information to answer a question. Developers maintain control over response content, with the option to summarize retrieved information or use the authorized text as is. This allows highly regulated industries like financial services and healthcare to use CFAQ while ensuring responses use only compliant language. By streamlining access to relevant knowledge, CFAQ reduces the effort to build bots that handle common customer questions naturally and accurately.

Conclusion

AWS is constantly innovating on behalf of our customers. The latest set of advancements in AI services allows us to deliver more impactful capabilities that help organizations work smarter and provide personalized and intuitive experiences. To learn more about these launches, refer to the detailed posts that follow.


About the author

Bratin Saha is the Vice President of Artificial Intelligence and Machine Learning at AWS.


Elevate your self-service assistants with new generative AI features in Amazon Lex

In this post, we talk about how generative AI is changing the conversational AI industry by providing new customer and bot builder experiences, and the new features in Amazon Lex that take advantage of these advances.

As the demand for conversational AI continues to grow, developers are seeking ways to enhance their chatbots with human-like interactions and advanced capabilities such as FAQ handling. Recent breakthroughs in generative AI are leading to significant improvements in natural language understanding that make conversational systems more intelligent. By training large neural network models on datasets with trillions of tokens, AI researchers have developed techniques that allow bots to understand more complex questions, provide nuanced and more natural human-sounding responses, and handle a wide range of topics. With these new generative AI innovations, you can create virtual assistants that feel more natural, intuitive, and helpful during text- or voice-based self-service interactions.

The rapid progress in generative AI is bringing automated chatbots and virtual assistants significantly closer to the goal of having truly intelligent, free-flowing conversations. With further advances in deep learning and neural network techniques, conversational systems are poised to become even more flexible, relatable, and human-like. This new generation of AI-powered assistants can provide seamless self-service experiences across a multitude of use cases.

How Amazon Bedrock is changing the landscape of conversational AI

Amazon Bedrock is a user-friendly way to build and scale generative AI applications with foundation models (FMs). Amazon Bedrock offers an array of FMs from leading providers, so AWS customers have flexibility and choice to use the best models for their specific use case.

In today’s fast-paced world, we expect quick and efficient customer service from every business. However, providing excellent customer service can be significantly challenging when the volume of inquiries outpaces the human resources employed to address them. Businesses can overcome this challenge efficiently while also providing personalized customer service by taking advantage of advancements in generative AI powered by large language models (LLMs).

Over the years, AWS has invested in democratizing access to—and amplifying the understanding of—AI, machine learning (ML), and generative AI. LLMs can be highly useful in contact centers by providing automated responses to frequently asked questions, analyzing customer sentiment and intents to route calls appropriately, generating summaries of conversations to help agents, and even automatically generating emails or chat responses to common customer inquiries. By handling repetitive tasks and gaining insights from conversations, LLMs allow contact center agents to focus on delivering higher value through personalized service and resolving complex issues.

Improving the customer experience with conversational FAQs

Generative AI has tremendous potential to provide quick, reliable answers to commonly asked customer questions in a conversational manner. With access to authorized knowledge sources and LLMs, your existing Amazon Lex bot can provide helpful, natural, and accurate responses to FAQs, going beyond task-oriented dialogue. Our Retrieval Augmented Generation (RAG) approach allows Amazon Lex to harness both the breadth of knowledge available in repositories as well as the fluency of LLMs. You can simply ask your question in free-form, conversational language, and receive a natural, tailored response within seconds. The new conversational FAQ feature in Amazon Lex allows bot developers and conversation designers to focus on defining business logic rather than designing exhaustive FAQ-based conversation flows within a bot.

We are introducing a built-in QnAIntent that uses an LLM to query an authorized knowledge source and provide a meaningful and contextual response. In addition, developers can configure the QnAIntent to point to specific knowledge base sections, ensuring that only specific portions of the knowledge content are queried at runtime to fulfill user requests. This capability fulfills the need for highly regulated industries, such as financial services and healthcare, to provide responses only in compliant language. The conversational FAQ feature in Amazon Lex allows organizations to improve containment rates while avoiding the high costs of missed queries and human representative transfers.

Building an Amazon Lex bot using the descriptive bot builder

Building conversational bots from scratch is a time-consuming process that requires deep knowledge of how users interact with bots in order to anticipate potential requests and code appropriate responses. Today, conversation designers and developers spend many days writing code to handle all possible user actions (intents), the various ways users phrase their requests (utterances), and the information needed from the user to complete those actions (slots).

The new descriptive bot building feature in Amazon Lex uses generative AI to accelerate the bot building process. Instead of writing code, conversation designers and bot developers can now describe in plain English what they want the bot to accomplish (for example, “Take reservations for my hotel using name and contact info, travel dates, room type, and payment info”). Using only this simple prompt, Amazon Lex will automatically generate intents, training utterances, slots, prompts, and a conversational flow to bring the described bot to life. By providing a baseline bot design, this feature immensely reduces the time and complexity of building conversational chatbots, allowing the builder to reprioritize effort on fine-tuning the conversational experience.

By tapping into the power of generative AI with LLMs, Amazon Lex enables developers and non-technical users to build bots simply by describing their goal. Rather than meticulously coding intents, utterances, slots, and so on, developers can provide a natural language prompt, and Amazon Lex will automatically generate a basic bot flow ready for further refinement. This capability is initially available only in English. Developers can further customize the AI-generated bot as needed before deployment, saving many hours of manual development work.

Improving the user experience with assisted slot resolution

As consumers become more familiar with chatbots and interactive voice response (IVR) systems, they expect higher levels of intelligence baked into self-service experiences. Disambiguating conversational responses is imperative to success, as users expect more natural, human-like experiences. With rising consumer confidence in chatbot capabilities, there is also an expectation of elevated performance from natural language understanding (NLU). In the likely scenario that a semantically simple or complex utterance is not resolved properly to a slot, user confidence can dwindle. In such instances, an LLM can dynamically assist the existing Amazon Lex NLU model and ensure accurate slot resolution even when the user utterance is beyond the bounds of the slot model. In Amazon Lex, the assisted slot resolution feature provides the bot developer with yet another tool to increase containment.

During runtime, when NLU fails to resolve a slot during a conversational turn, Amazon Lex will call the LLM selected by the bot developer to assist with resolving the slot. If the LLM is able to provide a value upon slot retry, the user can continue with the conversation as normal. For example, if upon slot retry a bot asks “What city does the policy holder reside in?” and the user responds “I live in Springfield,” the LLM will resolve the value to “Springfield.” The supported slot types for this feature include AMAZON.City, AMAZON.Country, AMAZON.Number, AMAZON.Date, AMAZON.AlphaNumeric (without regex), AMAZON.PhoneNumber, and AMAZON.Confirmation. This feature is only available in English at the time of writing.

Improving the builder experience with training utterance generation

One of the pain points that bot builders and conversational designers often encounter is anticipating the variation and diversity of responses when invoking an intent or soliciting slot information. When a bot developer creates a new intent, sample utterances must be provided to train the ML model on the types of responses it can and should accept. It can often be difficult to anticipate the permutations of verbiage and syntax used by customers. With utterance generation, Amazon Lex uses foundation models such as Amazon Titan to generate training utterances with just one click, without the need for any prompt engineering.

Utterance generation uses the intent name, existing utterances, and optionally the intent description to generate new utterances with an LLM. Bot developers and conversational designers can edit or delete the generated utterances before accepting them. This feature works with both new and existing intents.

Conclusion

Recent advancements in generative AI have undoubtedly made automated consumer experiences better. With Amazon Lex, we are committed to infusing generative AI into every aspect of the builder and user experience. The features mentioned in this post are just the beginning—and we can’t wait to show you what is to come.

To learn more, refer to Amazon Lex Documentation, and try these features out on the Amazon Lex console.


About the authors

Anuradha Durfee is a Senior Product Manager on the Amazon Lex team and has more than 7 years of experience in conversational AI. She is fascinated by voice user interfaces and making technology more accessible through intuitive design.

Sandeep Srinivasan is a Senior Product Manager on the Amazon Lex team. As a keen observer of human behavior, he is passionate about customer experience. He spends his waking hours at the intersection of people, technology, and the future.


Amazon Transcribe announces a new speech foundation model-powered ASR system that expands support to over 100 languages

Amazon Transcribe is a fully managed automatic speech recognition (ASR) service that makes it straightforward for you to add speech-to-text capabilities to your applications. Today, we are happy to announce a next-generation multi-billion parameter speech foundation model-powered system that expands automatic speech recognition to over 100 languages. In this post, we discuss some of the benefits of this system, how companies are using it, and how to get started. We also provide an example of the transcription output below.

Transcribe’s speech foundation model is trained using best-in-class, self-supervised algorithms to learn the inherent universal patterns of human speech across languages and accents. It is trained on millions of hours of unlabeled audio data from over 100 languages. The training recipes are optimized through smart data sampling to balance the training data between languages, ensuring that traditionally under-represented languages also reach high accuracy levels.

Carbyne is a software company that develops cloud-based, mission-critical contact center solutions for emergency call responders. Carbyne’s mission is to help emergency responders save lives, and language can’t get in the way of their goals. Here is how they use Amazon Transcribe to pursue their mission:

“AI-powered Carbyne Live Audio Translation is directly aimed at helping improve emergency response for the 68 million Americans who speak a language other than English at home, in addition to the up to 79 million foreign visitors to the country annually. By leveraging Amazon Transcribe’s new multilingual foundation model powered ASR, Carbyne will be even better equipped to democratize life-saving emergency services, because Every. Person. Counts.”

– Alex Dizengof, Co-Founder and CTO of Carbyne.

By leveraging the speech foundation model, Amazon Transcribe delivers significant accuracy improvements of between 20% and 50% across most languages. On telephony speech, which is a challenging and data-scarce domain, the accuracy improvement is between 30% and 70%. In addition to substantial accuracy improvement, this large ASR model also delivers improvements in readability with more accurate punctuation and capitalization. With the advent of generative AI, thousands of enterprises are using Amazon Transcribe to unlock rich insights from their audio content. With significantly improved accuracy and support for over 100 languages, Amazon Transcribe will positively impact all such use cases. All existing and new customers using Amazon Transcribe in batch mode can access speech foundation model-powered speech recognition without needing any change to either the API endpoint or input parameters.

The new ASR system delivers several key features across all 100+ supported languages related to ease of use, customization, user safety, and privacy. These include automatic punctuation, custom vocabulary, automatic language identification, speaker diarization, word-level confidence scores, and custom vocabulary filters. The system’s expanded support for different accents, noise environments, and acoustic conditions enables you to produce more accurate outputs and thereby helps you effectively embed voice technologies in your applications.

Enabled by the high accuracy of Amazon Transcribe across different accents and noise conditions, its support for a large number of languages, and its breadth of value-added feature sets, thousands of enterprises will be empowered to unlock rich insights from their audio content, as well as increase the accessibility and discoverability of their audio and video content across various domains. For instance, contact centers transcribe and analyze customer calls to identify insights and subsequently improve customer experience and agent productivity. Content producers and media distributors automatically generate subtitles using Amazon Transcribe to improve content accessibility.

Get started with Amazon Transcribe

You can use the AWS Command Line Interface (AWS CLI), AWS Management Console, and various AWS SDKs for batch transcriptions and continue to use the same StartTranscriptionJob API to get performance benefits from the enhanced ASR model without needing to make any code or parameter changes on your end. For more information about using the AWS CLI and the console, refer to Transcribing with the AWS CLI and Transcribing with the AWS Management Console, respectively.

The first step is to upload your media files into an Amazon Simple Storage Service (Amazon S3) bucket, an object storage service built to store and retrieve any amount of data from anywhere. Amazon S3 offers industry-leading durability, availability, performance, security, and virtually unlimited scalability at very low cost. You can choose to save your transcript in your own S3 bucket, or have Amazon Transcribe use a secure default bucket. To learn more about using S3 buckets, see Creating, configuring, and working with Amazon S3 buckets.
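
As a minimal sketch of this flow with the AWS SDK for Python (Boto3), the following starts a batch job against a file already uploaded to Amazon S3; the bucket, file, and job names are placeholders.

import boto3

transcribe = boto3.client("transcribe")

# Same StartTranscriptionJob API as before; the upgraded ASR model is
# applied automatically. Bucket, key, and job names are placeholders.
transcribe.start_transcription_job(
    TranscriptionJobName="demo-multilingual-job",
    Media={"MediaFileUri": "s3://my-bucket/audio/interview.mp3"},
    MediaFormat="mp3",
    IdentifyLanguage=True,  # or set LanguageCode, for example "en-US"
    OutputBucketName="my-bucket",  # optional; omit to use the secure default
)

# Check job status (simplified; production code should poll with backoff).
job = transcribe.get_transcription_job(TranscriptionJobName="demo-multilingual-job")
print(job["TranscriptionJob"]["TranscriptionJobStatus"])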

Transcription output

Amazon Transcribe uses JSON representation for its output. It provides the transcription result in two different formats: text format and itemized format. Nothing changes with respect to the API endpoint or input parameters.

The text format provides the transcript as a block of text, whereas the itemized format provides the transcript in the form of time-ordered transcribed items, along with additional metadata per item. Both formats exist in parallel in the output file.

Depending on the features you select when creating the transcription job, Amazon Transcribe creates additional and enriched views of the transcription result. See the following example code:

{
    "jobName": "2x-speakers_2x-channels",
    "accountId": "************",
    "results": {
        "transcripts": [
            {
                "transcript": "Hi, welcome."
            }
        ],
        "speaker_labels": [
            {
                "channel_label": "ch_0",
                "speakers": 2,
                "segments": []
            },
            {
                "channel_label": "ch_1",
                "speakers": 2,
                "segments": []
            }
        ],
        "channel_labels": {
            "channels": [],
            "number_of_channels": 2
        },
        "items": [],
        "segments": []
    },
    "status": "COMPLETED"
}

The views are as follows:

  • Transcripts – Represented by the transcripts element, it contains only the text format of the transcript. In multi-speaker, multi-channel scenarios, concatenation of all transcripts is provided as a single block.
  • Speakers – Represented by the speaker_labels element, it contains the text and itemized formats of the transcript grouped by speaker. It’s available only when the multi-speakers feature is enabled.
  • Channels – Represented by the channel_labels element, it contains the text and itemized formats of the transcript, grouped by channel. It’s available only when the multi-channels feature is enabled.
  • Items – Represented by the items element, it contains only the itemized format of the transcript. In multi-speaker, multi-channel scenarios, items are enriched with additional properties, indicating speaker and channel.
  • Segments – Represented by the segments element, it contains the text and itemized formats of the transcript, grouped by alternative transcription. It’s available only when the alternative results feature is enabled.
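
To work with these views programmatically, here is a small sketch that reads the output shown above; the file name is a placeholder for a transcript JSON file downloaded from Amazon S3.

import json

# Load a completed transcription job's output file (downloaded from S3).
with open("transcript.json") as f:
    results = json.load(f)["results"]

# Full-text view: one block of text per transcript entry.
for entry in results["transcripts"]:
    print(entry["transcript"])

# Itemized view: time-ordered items with per-item metadata, when present.
for item in results.get("items", []):
    print(item)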

Conclusion

At AWS, we are constantly innovating on behalf of our customers. By extending the language support in Amazon Transcribe to over 100 languages, we enable our customers to serve users from diverse linguistic backgrounds. This not only enhances accessibility, but also opens up new avenues for communication and information exchange on a global scale. To learn more about the features discussed in this post, check out the Amazon Transcribe features page and the What’s New post.


About the authors

Sumit Kumar is a Principal Product Manager, Technical on the AWS AI Language Services team. He has 10 years of product management experience across a variety of domains and is passionate about AI/ML. Outside of work, Sumit loves to travel and enjoys playing cricket and lawn tennis.

Vivek Singh is a Senior Manager, Product Management on the AWS AI Language Services team. He leads the Amazon Transcribe product team. Prior to joining AWS, he held product management roles across various other Amazon organizations such as consumer payments and retail. Vivek lives in Seattle, WA, and enjoys running and hiking.


Drive hyper-personalized customer experiences with Amazon Personalize and generative AI

Today, we are excited to announce three launches that will help you enhance personalized customer experiences using Amazon Personalize and generative AI. Whether you’re looking for a managed solution or want to build your own, you can use these new capabilities to power your journey.

Amazon Personalize is a fully managed machine learning (ML) service that makes it easy for developers to deliver personalized experiences to their users. It enables you to improve customer engagement by powering personalized product and content recommendations in websites, applications, and targeted marketing campaigns, with no ML expertise required. Using recipes (algorithms prepared for specific use cases) provided by Amazon Personalize, you can offer diverse personalization experiences like “recommend for you”, “frequently bought together”, guidance on next best actions, and targeted marketing campaigns with user segmentation.

Generative AI is quickly transforming how enterprises do business. Gartner predicts that “by 2026, more than 80% of enterprises will have used generative AI APIs or models, or deployed generative AI-enabled applications in production environments, up from less than 5% in 2023.” While generative AI can quickly create content, it alone is not enough to provide a higher degree of personalization that adapts to the ever-changing and nuanced preferences of individual users. Many companies are actively seeking solutions to enhance user experience using Amazon Personalize and generative AI.

FOX Corporation (FOX) produces and distributes news, sports, and entertainment content.

“We are integrating generative AI with Amazon Personalize in order to deliver hyper-personalized experiences to our users. Amazon Personalize has helped us achieve high levels of automation in content customization. For instance, FOX Sports experienced a 400% increase in viewership content starts post-event when applied. Now, we are augmenting generative AI with Amazon Bedrock to our pipeline in order to help our content editors generate themed collections. We look forward to exploring features such as Amazon Personalize Content Generator and Personalize on LangChain to further personalize those collections for our users.”

– Daryl Bowden, Executive Vice President of Technology Platforms.

Announcing Amazon Personalize Content Generator to make recommendations more compelling

Amazon Personalize has launched Content Generator, a new generative AI-powered capability that helps companies make recommendations more compelling by identifying thematic connections between the recommended items. This capability can elevate the recommendation experience beyond standard phrases like “People who bought this also bought…” to more engaging taglines such as “Rise and Shine” for a breakfast food collection, enticing users to click and purchase.

To explore the impact of Amazon Personalize Content Generator in detail, let’s look at two examples.

Use case 1: Carousel titles for movie collections

A micro-genre is a specialized subcategory within a broader genre of film, music, or other forms of media. Streaming platforms use micro-genres to enhance user experience by allowing viewers or listeners to discover content that aligns with their specific tastes and interests. By recommending media content with micro-genres, streaming platforms cater to diverse preferences, ultimately increasing user engagement and satisfaction.

Now you can use Amazon Personalize Content Generator to create carousel titles for micro-genre collections. First, import your datasets of users’ interactions and items into Amazon Personalize for training, and upload a list of itemId values as your seed items. Next, create a batch inference job, selecting Themed recommendations with Content Generator on the Amazon Personalize console or setting batch-inference-job-mode to THEME_GENERATION in the API configuration, as sketched below.
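
The following is a minimal Boto3 sketch of such a job; the ARNs, S3 paths, job name, and item-name column are placeholders for your own resources.

import boto3

personalize = boto3.client("personalize")

# Create a batch inference job in theme-generation mode.
# ARNs, S3 paths, and the item-name column are placeholders.
personalize.create_batch_inference_job(
    jobName="micro-genre-themes",
    solutionVersionArn="arn:aws:personalize:us-west-2:123456789012:solution/similar-items/version",
    roleArn="arn:aws:iam::123456789012:role/PersonalizeS3Access",
    jobInput={"s3DataSource": {"path": "s3://my-bucket/seed-items.json"}},
    jobOutput={"s3DataDestination": {"path": "s3://my-bucket/themes/"}},
    batchInferenceJobMode="THEME_GENERATION",
    themeGenerationConfig={
        # Column in your items dataset that holds human-readable item names.
        "fieldsForThemeGeneration": {"itemName": "TITLE"}
    },
)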

As the batch inference output, you will get a set of similar items and a theme for each seed item. We also provide item-theme relevance scores that you can use to set a threshold and show only items that are strongly related to the theme. The following is an example of the output:

{"input":{"itemId":"40"},"output":{
"recommendedItems":["36","50","44","22","21","29","3","1","2","39"],
"theme":"Movies with a strong female lead",
"itemsThemeRelevanceScores":[0.19994527,0.183059963,0.17478035,0.1618133,0.1574806,0.15468733,0.1499242,0.14353688,0.13531424,0.10291852]}}

{"input":{"itemId":"43"},"output":{
"recommendedItems":["50","21","36","3","17","2","39","1","10","5"],
"theme":"Romantic movies for a cozy night in",
"itemsThemeRelevanceScores":[0.184988,0.1795761,0.11143453,0.0989443,0.08258403,0.07952615,0.07115086,0.0621634,-0.138913,-0.188913]}}
...

Subsequently, you can replace the generic phrase “More like X” with the output theme from Amazon Personalize Content Generator to make the recommendations more compelling.

Use case 2: Subject lines for marketing emails

Email marketing, although cost-effective, often struggles with low open rates and high unsubscribe rates. The decision to open an email critically depends on how attractive the subject line is, because it’s the first thing recipients see along with the sender’s name. However, scripting appealing subject lines can often be tedious and time-consuming.

Now with Amazon Personalize Content Generator, you can create compelling subject lines or headlines in the email body more efficiently, further personalizing your email campaigns. You follow the same process of data ingestion, training, and creating a batch inference job as in the previous use case. The following is an example of a marketing email that incorporates output from Amazon Personalize using Content Generator, including a set of recommended items and a generated subject line:

Subject: Cleaning Products That Will Make Your Life Sparkle!

Dear <user name>,
Are you ready to transform your cleaning routine into an effortless and enjoyable experience? Explore our top-tier selections:
Robot Vacuum Cleaners <picture>
Window Cleaning Kits <picture>
Scrub Brushes with Ergonomic Handles <picture>
Microfiber Cloths <picture>
Eco-Friendly Cleaning Sprays <picture>

These examples showcase how Amazon Personalize Content Generator can assist you in creating a more engaging browsing experience or a more effective marketing campaign. For more detailed instructions, refer to Themed batch recommendations.

Announcing LangChain integration to seamlessly integrate Amazon Personalize with the LangChain framework

LangChain is a powerful open-source framework that allows for integration with large language models (LLMs). LLMs are typically versatile but may struggle with domain-specific tasks where deeper context and nuanced responses are needed. LangChain empowers developers in such scenarios to build modules (agents/chains) for their specific generative AI tasks. They can also introduce context and memory into LLMs by connecting and chaining LLM prompts to solve for varying use cases.

We are excited to launch LangChain integration. With this new capability, builders can use the Amazon Personalize custom chain on LangChain to seamlessly integrate Amazon Personalize with generative AI solutions. Adding a personalized touch to generative AI solutions helps you create more tailored and relevant interactions with end-users. The following code snippet demonstrates how you can invoke Amazon Personalize, retrieve recommendations for a campaign or recommender, and seamlessly feed it into your generative AI applications within the LangChain ecosystem. You can also use this for sequential chains.

from langchain.utilities import AmazonPersonalize
from langchain.chains import AmazonPersonalizeChain
from langchain.llms.bedrock import Bedrock

# Point the Amazon Personalize client at an existing campaign or recommender.
recommender_arn = "<insert_arn>"
client = AmazonPersonalize(
    recommender_arn=recommender_arn,
    credentials_profile_name="default",
    region_name="us-west-2",
)

# Use an LLM hosted on Amazon Bedrock to turn recommendations into text.
bedrock_llm = Bedrock(model_id="anthropic.claude-v2", region_name="us-west-2")

# Create the Personalize chain and fetch recommendations for a user.
chain = AmazonPersonalizeChain.from_llm(llm=bedrock_llm, client=client)
response = chain({"user_id": "1"})

You can use this capability to craft personalized marketing copies, generate concise summaries for recommended content, recommend products or content in chatbots, and build next-generation customer experiences with your creativity.

Amazon Personalize now enables you to return metadata in inference responses to improve generative AI workflows

Amazon Personalize now improves your generative AI workflow by enabling you to return item metadata as part of the inference output. Getting recommendations along with metadata makes it more convenient to provide additional context to LLMs. This additional context, such as genre and product description, can help the models gain a deeper understanding of item attributes to generate more relevant content.

Amazon Personalize supports this capability for both custom recipes and domain optimized recommenders. When creating a campaign or a recommender, you can enable the option to return metadata with recommendation results, or adjust the setting by updating the campaign or recommender. You can select up to 10 metadata fields and 50 recommendation results to return metadata during an inference call, either through the Amazon Personalize API or the Amazon Personalize console.

The following is an example in the API:

## Create campaign with enabled metadata
example_name = 'metadata_response_enabled_campaign'
create_campaign_response = personalize.create_campaign(
    name = example_name,
    solutionVersionArn = example_solution_version_arn,
    minProvisionedTPS = 1,
    campaignConfig = {"enableMetadataWithRecommendations": True}
)

## GetRecommendations with metadata columns
metadataMap = {"ITEMS": ["genres", "num"]}
response = personalize_runtime.get_recommendations(
    campaignArn=example_campaign_arn,
    userId="0001",
    itemId="0002",
    metadataColumns=metadataMap,
    numResults=2,
)

## Example response with metadata
'itemList':
[
    {
        'itemId': '356',
        'metadata': {'genres': 'Comedy', 'num': '0.6103248'}
    },
    {
        'itemId': '260',
        'metadata': {'genres': 'Action|Adventure', 'num': '0.074548'}
    }
]

Conclusion

At AWS, we are constantly innovating on behalf of our customers. By introducing these new launches powered by Amazon Personalize and Amazon Bedrock, we will enrich every aspect of the builder and user experience, elevating efficiency and end-user satisfaction. To learn more about the capabilities discussed in this post, check out Amazon Personalize features and the Amazon Personalize Developer Guide.


About the Authors

Jingwen Hu is a Senior Technical Product Manager working with AWS AI/ML on the Amazon Personalize team. In her spare time, she enjoys traveling and exploring local food.

Pranav Agarwal is a Senior Software Engineer with AWS AI/ML and works on architecting software systems and building AI-powered recommender systems at scale. Outside of work, he enjoys reading, running, and ice-skating.

Rishabh Agrawal is a Senior Software Engineer working on AI services at AWS. In his spare time, he enjoys hiking, traveling, and reading.

Ashish Lal is a Senior Product Marketing Manager who leads product marketing for AI services at AWS. He has 9 years of marketing experience and has led the product marketing effort for intelligent document processing. He got his master’s in Business Administration at the University of Washington.


Build brand loyalty by recommending actions to your users with Amazon Personalize Next Best Action

Amazon Personalize is excited to announce the new Next Best Action (aws-next-best-action) recipe, which helps you determine the best actions to suggest to your individual users, enabling you to increase brand loyalty and conversion.

Amazon Personalize is a fully managed machine learning (ML) service that makes it effortless for developers to deliver highly personalized user experiences in real time. It enables you to improve customer engagement by powering personalized product and content recommendations in websites, applications, and targeted marketing campaigns. You can get started without any prior ML experience, using APIs to easily build sophisticated personalization capabilities in a few clicks. All your data is encrypted to be private and secure.

In this post, we show you how to use the Next Best Action recipe to personalize action recommendations based on each user’s past interactions, needs, and behavior.

Solution overview

With the rapid growth of digital channels and technology advances that make hyper-personalization more accessible, brands struggle to determine what actions will maximize engagement for each individual user. Brands either show the same actions to all users or rely on traditional user segmentation approaches to recommend actions to each user cohort. However, these approaches are no longer sufficient, because every user expects a unique experience and tends to abandon brands that don’t understand their needs. Furthermore, brands are unable to update the action recommendations in real time due to the manual nature of the process.

With Next Best Action, you can determine the actions that have the highest likelihood of engaging each individual user based on their preferences, needs, and history. Next Best Action takes the in-session interests of each user into account and provides action recommendations in real time. You can recommend actions such as enrolling in loyalty programs, signing up for a newsletter or magazine, exploring a new category, downloading an app, and other actions that encourage conversion. This will enable you to improve each user’s experience by providing them with recommendations on actions across their user journey that will help promote long-term brand engagement and revenue. It will also help improve your return on marketing investment by recommending the action that each user has a high likelihood of taking.

AWS Partners like Credera are excited by the personalization possibilities that the Amazon Personalize Next Best Action will unlock for their customers.

“Amazon Personalize is a world-class machine learning solution that enables companies to create meaningful customer experiences across a wide array of use cases without extensive rework or up-front implementation cost that is typically required of these types of solutions. We are really excited about the addition of the Next Best Action capability that will enable customers to provide personalized action recommendations, significantly improving their digital experiences and driving additional business value. Specifically, we expect anyone working within the retail or content space to see an improved experience for their customers and higher conversions as a direct result of using Amazon Personalize. We are extremely thrilled to be a launch partner with AWS on this release and looking forward to empowering businesses to drive ML-based personalized solutions with Next Best Action.”

– Jason Goth, Partner and Chief Technology Officer, Credera.

Example use cases

To explore the impact of this new feature in greater detail, let’s review an example by taking three users: A (User_id 11999), B (User_id 17141), and C (User_id 8103), who are in different stages of their user journey while making purchases on a website. We then see how Next Best Action suggests the optimal actions for each user based on their past interactions and preferences.

First, we look at the action interactions dataset to understand how users have interacted with actions in the past. The following example shows the three users and their different shopping patterns. User A is a frequent buyer and has shopped mostly in the “Beauty & Grooming” and “Jewelry” categories in the past. User B is a casual buyer who has made a few purchases in the “Electronics” category in the past, and User C is a new user on the website who has made their first purchase in the “Clothing” category.

| User Type | User_id | Actions | Action_Event_Type | Timestamp |
| --- | --- | --- | --- | --- |
| User A | 11999 | Purchase in “Beauty & Grooming” category | taken | 2023-09-17 20:03:05 |
| User A | 11999 | Purchase in “Beauty & Grooming” category | taken | 2023-09-18 19:28:38 |
| User A | 11999 | Purchase in “Beauty & Grooming” category | taken | 2023-09-20 17:49:52 |
| User A | 11999 | Purchase in “Jewelry” category | taken | 2023-09-26 18:36:16 |
| User A | 11999 | Purchase in “Beauty & Grooming” category | taken | 2023-09-30 19:21:05 |
| User A | 11999 | Download the mobile app | taken | 2023-09-30 19:29:35 |
| User A | 11999 | Purchase in “Jewelry” category | taken | 2023-10-01 19:35:47 |
| User A | 11999 | Purchase in “Beauty & Grooming” category | taken | 2023-10-04 19:19:34 |
| User A | 11999 | Purchase in “Jewelry” category | taken | 2023-10-06 20:38:55 |
| User A | 11999 | Purchase in “Beauty & Grooming” category | taken | 2023-10-10 20:17:07 |
| User B | 17141 | Purchase in “Electronics” category | taken | 2023-09-29 20:17:49 |
| User B | 17141 | Purchase in “Electronics” category | taken | 2023-10-02 00:38:08 |
| User B | 17141 | Purchase in “Electronics” category | taken | 2023-10-07 11:04:56 |
| User C | 8103 | Purchase in “Clothing” category | taken | 2023-09-26 18:30:56 |
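
For readers implementing this, the action interactions data above would be imported into a dataset whose schema resembles the following Avro-style sketch; the exact field names and types are assumptions, so confirm them against the Amazon Personalize documentation.

import json

# Illustrative Avro-style schema for an action interactions dataset.
# Field names (USER_ID, ACTION_ID, EVENT_TYPE, TIMESTAMP) are assumptions
# based on the table above; verify against the Amazon Personalize docs.
action_interactions_schema = {
    "type": "record",
    "name": "ActionInteractions",
    "namespace": "com.amazonaws.personalize.schema",
    "fields": [
        {"name": "USER_ID", "type": "string"},
        {"name": "ACTION_ID", "type": "string"},
        {"name": "EVENT_TYPE", "type": "string"},  # e.g., "taken"
        {"name": "TIMESTAMP", "type": "long"},
    ],
    "version": "1.0",
}
print(json.dumps(action_interactions_schema, indent=2))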

Traditionally, brands either show the same actions to all users or employ user segmentation strategies to recommend actions to their user base. The following table is an example of a brand showing the same set of actions to all users. These actions may or may not be relevant to the users, reducing their engagement with the brand.

| User Type | User_id | Action Recommendations | Rank of Action |
| --- | --- | --- | --- |
| User A | 11999 | Subscribe to Loyalty Program | 1 |
| User A | 11999 | Download the mobile app | 2 |
| User A | 11999 | Purchase in “Electronics” category | 3 |
| User B | 17141 | Subscribe to Loyalty Program | 1 |
| User B | 17141 | Download the mobile app | 2 |
| User B | 17141 | Purchase in “Electronics” category | 3 |
| User C | 8103 | Subscribe to Loyalty Program | 1 |
| User C | 8103 | Download the mobile app | 2 |
| User C | 8103 | Purchase in “Electronics” category | 3 |

Now let’s use Next Best Action to recommend actions for each user. After you define the actions eligible for recommendations, the aws-next-best-action recipe returns a ranked list of actions, personalized for each user, based on user propensity (the probability of a user taking a particular action, ranging from 0.0 to 1.0) and the value of that action, if provided. For the purpose of this post, we only consider user propensity.
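
As a concrete sketch of retrieving these rankings at runtime with Boto3, the snippet below calls the Personalize runtime; the campaign ARN is a placeholder, and the response shape reflects our understanding of the GetActionRecommendations API.

import boto3

personalize_runtime = boto3.client("personalize-runtime")

# Retrieve ranked action recommendations for one user. The ARN must point
# to a campaign trained with the aws-next-best-action recipe (placeholder).
response = personalize_runtime.get_action_recommendations(
    campaignArn="arn:aws:personalize:us-west-2:123456789012:campaign/next-best-action",
    userId="11999",
    numResults=3,
)

# Each recommended action carries an ID and a propensity-based score.
for action in response["actionList"]:
    print(action["actionId"], action.get("score"))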

In the following example, we see that for User A (frequent buyer), Subscribe to Loyalty Program is the top recommended action with a propensity score of 1.00, which means that this user is most likely to enroll in the loyalty program because they have made numerous purchases. Therefore, recommending the action Subscribe to Loyalty Program to User A has a high probability of increasing User A’s engagement.

| User Type | User_id | Action Recommendations | Rank of Action | Propensity Score |
| --- | --- | --- | --- | --- |
| User A | 11999 | Subscribe to Loyalty Program | 1 | 1.00 |
| User A | 11999 | Purchase in “Jewelry” category | 2 | 0.86 |
| User A | 11999 | Purchase in “Beauty & Grooming” category | 3 | 0.85 |
| User B | 17141 | Purchase in “Electronics” category | 1 | 0.78 |
| User B | 17141 | Subscribe to Loyalty Program | 2 | 0.71 |
| User B | 17141 | Purchase in “Smart Homes” category | 3 | 0.66 |
| User C | 8103 | Purchase in “Handbags & Shoes” category | 1 | 0.60 |
| User C | 8103 | Download the mobile app | 2 | 0.48 |
| User C | 8103 | Purchase in “Clothing” category | 3 | 0.46 |

Similarly, User B (the casual buyer persona) has a higher probability of continuing to purchase in the “Electronics” category and of buying new products in the similar “Smart Homes” category. Therefore, Next Best Action recommends prioritizing the actions Purchase in “Electronics” category and Purchase in “Smart Homes” category. This means that if you prompt User B to buy products in these two categories, it can lead to greater engagement. We also notice that the action Subscribe to Loyalty Program is recommended to User B, but with a lower propensity score of 0.71 compared to User A, whose propensity score is 1.00. This is because users who have a deeper history and are further along in their shopping journey benefit more from loyalty programs because of the added benefits, and are highly likely to interact more.

Finally, we see that the Next Best Action for User C is Purchase in “Handbags & Shoes” category, which is similar to their previous action, Purchase in “Clothing” category. We also see that the propensity score for Download the mobile app (0.48) is lower than that of Purchase in “Handbags & Shoes” category (0.60). This means that if you recommend that User C purchase products in a complementary category (“Handbags & Shoes”) rather than download the mobile app, they are more likely to stick with your brand and continue shopping in the future.

For more details on how to implement the Next Best Action (aws-next-best-action) recipe, refer to the documentation.

Conclusion

The new Next Best Action recipe in Amazon Personalize helps you recommend the right actions to the right users in real time based on their individual behavior and needs, enabling you to maximize user engagement and drive greater conversion rates.

For more information about Amazon Personalize, see the Amazon Personalize Developer Guide.


About the Authors

Shreeya Sharma is a Sr. Technical Product Manager working with AWS AI/ML on Amazon Personalize. She has a background in computer science engineering, technology consulting, and data analytics. In her spare time, she enjoys traveling, performing theatre, and trying out new adventures.

Pranesh Anubhav is a Senior Software Engineer for Amazon Personalize. He is passionate about designing machine learning systems to serve customers at scale. Outside of his work, he loves playing soccer and is an avid follower of Real Madrid.

Aniket Deshmukh is an Applied Scientist in AWS AI labs supporting Amazon Personalize. Aniket works in the general area of recommendation systems, contextual bandits, and multi-modal deep learning.

Read More

Medical Imaging AI Made Easier: NVIDIA Offers MONAI as Hosted Cloud Service

NVIDIA today launched a cloud service for medical imaging AI to further streamline and accelerate the creation of ground-truth data and training of specialized AI models through fully managed, cloud-based application programming interfaces.

NVIDIA MONAI cloud APIs — announced at the annual meeting of RSNA, the Radiological Society of North America, taking place this week in Chicago — provide an expedited path for developers and platform providers to integrate AI into their medical imaging offerings using pretrained foundation models and AI workflows for enterprises. The APIs are built on the open-source MONAI project founded by NVIDIA and King’s College London.

Medical imaging is critical across healthcare, making up approximately 90% of healthcare data. It’s used by radiologists and clinicians for screening, diagnosis and intervention, by biopharma researchers to evaluate how clinical trial patients respond to new drugs, and by medical device makers to provide real-time decision support.

The scale of work across each of these areas requires a medical imaging-specific AI factory — an enterprise-grade platform that delivers large-scale data management, creates ground-truth annotations, accelerates model development and establishes seamless AI application deployment.

With NVIDIA MONAI cloud APIs, solution providers can more easily integrate AI into their medical imaging platforms, enabling them to provide supercharged tools for radiologists, researchers and clinical trial teams to build domain-specialized AI factories. The APIs are available in early access through the NVIDIA DGX Cloud AI supercomputing service.

NVIDIA MONAI cloud APIs are integrated into Flywheel, a leading medical imaging data and AI platform that supports end-to-end workflows for AI development. Developers at medical image annotation companies including RedBrick AI and at machine learning operations (MLOps) platform providers including Dataiku are poised to integrate NVIDIA MONAI cloud APIs into their offerings.

Ready-to-Run Annotation and Training for Medical Imaging

Building efficient and cost-effective AI solutions requires a robust, domain-specialized development foundation that includes full-stack optimizations for software, scalable multi-node systems and state-of-the-art research. It also requires high-quality ground-truth data — which can be arduous and time-consuming to gather, particularly for 3D medical images that require a high level of expertise to annotate.

NVIDIA MONAI cloud APIs feature interactive annotation powered by the VISTA-3D (Vision Imaging Segmentation and Annotation) foundation model. It’s purpose-built for continuous learning, a capability that improves AI model performance based on user feedback and new data.

Trained on a dataset of annotated images from 3D CT scans from more than 4,000 patients, spanning various diseases and parts of the body, VISTA-3D accelerates the creation of 3D segmentation masks for medical image analysis. With continuous learning, the AI model’s annotation quality improves over time.

To further accelerate AI training, this release includes APIs that make it seamless to build custom models based on MONAI pretrained models. NVIDIA MONAI cloud APIs also include Auto3DSeg, which automates hyperparameter tuning and AI model selection for a given 3D segmentation task, simplifying the model development process.
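While the cloud APIs are in early access, the open-source MONAI project gives a feel for how little code an Auto3DSeg run requires. The following is a minimal sketch using MONAI’s AutoRunner; the datalist and data root paths are placeholders, and the cloud APIs may expose this functionality differently.

from monai.apps.auto3dseg import AutoRunner

# Minimal Auto3DSeg run: the runner analyzes the dataset, selects candidate
# architectures, tunes hyperparameters, trains, and ensembles the results
runner = AutoRunner(
    input={
        "modality": "CT",
        "datalist": "datalist.json",  # placeholder: JSON listing of images and labels
        "dataroot": "/data/ct_task",  # placeholder: folder containing the 3D volumes
    }
)
runner.run()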

NVIDIA researchers recently won four challenges at the MICCAI medical imaging conference using Auto3DSeg. These included AI models to analyze 3D CT scans of the kidneys and heart, brain MRIs and 3D ultrasounds of the heart.

Solutions Providers, Platform Builders Embrace NVIDIA MONAI Cloud APIs

Medical imaging solution providers and machine learning platforms are using NVIDIA MONAI cloud APIs to deliver valuable AI insights that accelerate their customers’ work.

Flywheel has integrated MONAI through NVIDIA AI Enterprise and is now offering NVIDIA MONAI cloud APIs to accelerate medical image curation, labeling analysis and training. The Minneapolis-based company’s centralized, cloud-based platform enables biopharma companies, life science organizations, healthcare providers and academic medical centers to identify and curate medical imaging data and train models for the development of trustworthy AI.

“NVIDIA MONAI cloud APIs lower the cost of building high-quality AI models for radiology, disease research and the evaluation of clinical trial data,” said Dan Marcus, chief scientific officer at Flywheel. “With the addition of cloud APIs for interactive annotation and automated segmentation, customers of our medical imaging AI platform can accelerate AI model development to more quickly deliver innovative solutions.”

Annotation and viewer solution providers, including RedBrick AI, Radical Imaging, V7 Labs, and Centaur Labs, will also use NVIDIA MONAI cloud APIs to bring AI-assisted annotation and training capabilities to market faster, without having to host and manage the AI infrastructure on their own.

RedBrick AI is integrating the VISTA-3D model available through NVIDIA MONAI cloud APIs to deliver interactive cloud annotation for its medical device customers that support distributed teams of clinicians.

“VISTA-3D allows our clients to rapidly build models across different modalities and conditions,” said Shivam Sharma, CEO of RedBrick AI. “The foundation model is generalizable, making it easy to fine-tune for various clinical applications with accurate, reliable segmentation results.”

To streamline enterprise AI model development, MLOps platform builders including Dataiku, ClearML and Weights & Biases are also investigating the use of NVIDIA MONAI cloud APIs.

Dataiku plans to integrate NVIDIA MONAI cloud APIs to further simplify AI model creation for medical imaging applications.

“With NVIDIA MONAI cloud APIs, Dataiku users would be able to easily use Auto3DSeg, a low-code option to accelerate the development of state-of-the-art segmentation models, through Dataiku’s web interface connected to an NVIDIA-hosted, GPU-accelerated service,” said Kelci Miclaus, global head of AI health and life sciences solutions at Dataiku. “This democratizes AI in biomedical imaging by extending the power to create and apply AI-driven workflows to both data and domain experts.”

Join the medical imaging innovators accelerating AI development with NVIDIA MONAI cloud APIs by signing up for early access.

Read More

Accelerating AI/ML development at BMW Group with Amazon SageMaker Studio

This post is co-written with Marc Neumann, Amor Steinberg and Marinus Krommenhoek from BMW Group.

The BMW Group – headquartered in Munich, Germany – is driven by 149,000 employees worldwide and manufactures in over 30 production and assembly facilities across 15 countries. Today, the BMW Group is the world’s leading manufacturer of premium automobiles and motorcycles, and provider of premium financial and mobility services. The BMW Group sets trends in production technology and sustainability as an innovation leader with an intelligent material mix, a technological shift towards digitalization, and resource-efficient production.

In an increasingly digital and rapidly changing world, BMW Group’s business and product development strategies rely heavily on data-driven decision-making. With that, the need for data scientists and machine learning (ML) engineers has grown significantly. These skilled professionals are tasked with building and deploying models that improve the quality and efficiency of BMW’s business processes and enable informed leadership decisions.

Data scientists and ML engineers require capable tooling and sufficient compute for their work. Therefore, BMW established a centralized ML/deep learning infrastructure on premises several years ago and continuously upgraded it. To pave the way for the growth of AI, BMW Group needed to make a leap regarding scalability and elasticity while reducing operational overhead, software licensing, and hardware management.

In this post, we will talk about how BMW Group, in collaboration with AWS Professional Services, built its Jupyter Managed (JuMa) service to address these challenges. JuMa is a service of BMW Group’s AI platform for its data analysts, ML engineers, and data scientists that provides a user-friendly workspace with an integrated development environment (IDE). It is powered by Amazon SageMaker Studio and provides JupyterLab for Python and Posit Workbench for R. This offering enables BMW ML engineers to perform code-centric data analytics and ML, increases developer productivity by providing self-service capability and infrastructure automation, and tightly integrates with BMW’s centralized IT tooling landscape.

JuMa is now available to all data scientists, ML engineers, and data analysts at BMW Group. The service streamlines ML development and production workflows (MLOps) across BMW by providing a cost-efficient and scalable development environment that facilitates seamless collaboration between data science and engineering teams worldwide. This results in faster experimentation and shorter idea validation cycles. Moreover, the JuMa infrastructure, which is based on AWS serverless and managed services, helps reduce operational overhead for DevOps teams and allows them to focus on enabling use cases and accelerating AI innovation at BMW Group.

Challenges of growing an on-premises AI platform

Prior to introducing the JuMa service, BMW teams worldwide were using two on-premises platforms that provided JupyterHub and RStudio environments. These platforms were too limited in CPU, GPU, and memory to allow AI to scale at BMW Group. Scaling them by acquiring more on-premises hardware and software licenses would have required significant up-front investment and high ongoing maintenance and support costs. On top of this, only limited self-service capabilities were available, requiring high operational effort from the DevOps teams. More importantly, the use of these platforms was misaligned with BMW Group’s IT cloud-first strategy. For example, teams using these platforms lacked an easy path to migrate their AI/ML prototypes into industrialized solutions running on AWS. In contrast, the data science and analytics teams already using AWS directly for experimentation also needed to build and operate their own AWS infrastructure while ensuring compliance with BMW Group’s internal policies, local laws, and regulations. This included a range of configuration and governance activities, from ordering AWS accounts and limiting internet access to using allow-listed packages and keeping Docker images up to date.

Overview of solution

JuMa is a fully managed, multi-tenant, security-hardened AI platform service built on AWS with SageMaker Studio at its core. By relying on AWS serverless and managed services as the main building blocks of the infrastructure, the JuMa DevOps team doesn’t need to worry about patching servers, upgrading storage, or managing any other infrastructure components. The service handles all those processes automatically, providing a powerful technical platform that is always up to date and ready to use.

JuMa users can effortlessly order a workspace via a self-service portal to create a secure and isolated development and experimentation environment for their teams. After a JuMa workspace is provisioned, the users can launch JupyterLab or Posit Workbench environments in SageMaker Studio with just a few clicks and start the development immediately, using the tools and frameworks they are most familiar with. JuMa is tightly integrated with a range of BMW Central IT services, including identity and access management, roles and rights management, BMW Cloud Data Hub (BMW’s data lake on AWS) and on-premises databases. These integrations help AI/ML teams seamlessly access required data, provided they are authorized to do so, without needing to build data pipelines. Furthermore, the notebooks can be integrated into the corporate Git repositories to collaborate using version control.

The solution abstracts away all technical complexities associated with AWS account management, configuration, and customization for AI/ML teams, allowing them to fully focus on AI innovation. The platform ensures that the workspace configuration meets BMW’s security and compliance requirements out of the box.

The following diagram describes the high-level context view of the architecture.

User journey

BMW AI/ML team members can order their JuMa workspace using BMW’s standard catalog service. After approval by the line manager, the ordered JuMa workspace is provisioned by the platform in a fully automated way. The workspace provisioning workflow includes the following steps (as numbered in the architecture diagram).

  1. A data scientist team orders a new JuMa workspace in BMW’s Catalog. JuMa automatically provisions a new AWS account for the workspace. This ensures full isolation between the workspaces following the federated model account structure mentioned in SageMaker Studio Administration Best Practices.
  2. JuMa configures a workspace (which is a SageMaker domain) that only allows predefined Amazon SageMaker features required for experimentation and development, specific custom kernels, and lifecycle configurations. It also sets up the required subnets and security groups that ensure the notebooks run in a secure environment.
  3. After the workspaces are provisioned, the authorized users log in to the JuMa portal and access the SageMaker Studio IDE within their workspace using a SageMaker pre-signed URL. Users can choose between opening a SageMaker Studio private space or a shared space. Shared spaces encourage collaboration between different members of a team who can work in parallel on the same notebooks, whereas private spaces provide a development environment for solitary workloads.
  4. Using the BMW data portal, users can request access to on-premises databases or data stored in BMW’s Cloud Data Hub, making it available in their workspace for development and experimentation, from data preparation and analysis to model training and validation.

After an AI model is developed and validated in JuMa, AI teams can use the MLOps service of the BMW AI platform to deploy it to production quickly and effortlessly. This service provides users with a production-grade ML infrastructure and pipelines on AWS using SageMaker, which can be set up in minutes with just a few clicks. Users simply need to host their model on the provisioned infrastructure and customize the pipeline to meet their specific use case needs. In this way, the AI platform covers the entire AI lifecycle at BMW Group.

JuMa features

Following AWS architecting best practices, the JuMa service was designed and implemented according to the AWS Well-Architected Framework. The architectural decisions for each Well-Architected pillar are described in detail in the following sections.

Security and compliance

To ensure full isolation between tenants, each workspace receives its own AWS account, where authorized users can collaborate on analytics tasks as well as on developing and experimenting with AI/ML models. The JuMa portal itself enforces isolation at runtime using policy-based isolation with AWS Identity and Access Management (IAM) and the JuMa user’s context. For more information about this strategy, refer to Run-time, policy-based isolation with IAM.

Data scientists can only access their domain through the BMW network via pre-signed URLs generated by the portal. Direct internet access is disabled within their domain. Their SageMaker domain privileges are built using Amazon SageMaker Role Manager personas to ensure least-privilege access to the AWS services needed for development, such as SageMaker, Amazon Athena, Amazon Simple Storage Service (Amazon S3), and AWS Glue. This role implements ML guardrails (such as those described in Governance and control), including enforcing that ML training runs either within an Amazon Virtual Private Cloud (Amazon VPC) or without internet access, and allowing only the use of JuMa’s custom, vetted, and up-to-date SageMaker images.
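For illustration, the following Boto3 sketch shows how a portal such as JuMa could mint a pre-signed Studio URL for an authorized user. The domain ID and user profile name are placeholders, and JuMa’s actual implementation details are not public.

import boto3

sagemaker = boto3.client("sagemaker")

# Placeholders: the portal resolves these from the authenticated user's context
response = sagemaker.create_presigned_domain_url(
    DomainId="d-xxxxxxxxxxxx",
    UserProfileName="data-scientist-1",
    ExpiresInSeconds=300,  # the URL itself expires quickly
    SessionExpirationDurationInSeconds=43200,  # the Studio session lasts up to 12 hours
)
print(response["AuthorizedUrl"])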

Because JuMa is designed for development, experimentation, and ad-hoc analysis, it implements retention policies to remove data after 30 days. To access data whenever needed and store it for the long term, JuMa seamlessly integrates with the BMW Cloud Data Hub and BMW on-premises databases.

Finally, JuMa supports multiple Regions to comply with specific local legal requirements that, for example, require data to be processed locally to preserve BMW’s data sovereignty.

Operational excellence

Both the JuMa platform backend and workspaces are implemented with AWS serverless and managed services. Using these services helps minimize the effort the BMW platform team spends maintaining and operating the end-to-end solution, striving toward a no-ops service. Both the workspace and portal are monitored using Amazon CloudWatch logs, metrics, and alarms to check key performance indicators (KPIs) and proactively notify the platform team of any issues. Additionally, the AWS X-Ray distributed tracing system is used to trace requests across multiple components and annotate CloudWatch logs with workspace-relevant context.

All changes to the JuMa infrastructure are managed and implemented through automation using infrastructure as code (IaC). This helps reduce manual effort and human error, increase consistency, and ensure reproducible and version-controlled changes across both the JuMa platform backend and the workspaces. Specifically, all workspaces are provisioned and updated through an onboarding process built on top of AWS Step Functions, AWS CodeBuild, and Terraform. Therefore, no manual configuration is required to onboard new workspaces to the JuMa platform.

Cost optimization

By using AWS serverless services, JuMa ensures on-demand scalability, pre-approved instance sizes, and a pay-as-you-go model for the resources used during development and experimentation activities, according to the AI/ML teams’ needs. To further optimize costs, the JuMa platform monitors and identifies idle resources within SageMaker Studio and shuts them down automatically to prevent expenses for unused resources.

Sustainability

JuMa replaces BMW’s two on-premises analytics and deep learning platforms, which consume a considerable amount of electricity and produce CO2 emissions even when not in use. By migrating AI/ML workloads from on premises to AWS and decommissioning these platforms, BMW will significantly reduce its environmental impact.

Furthermore, the auto shutdown of idle resources, the data retention policies, and the workspace usage reports provided to workspace owners all help further minimize the environmental footprint of running AI/ML workloads on AWS.

Performance efficiency

By using SageMaker Studio, BMW teams benefit from easy adoption of the latest SageMaker features that can help accelerate their experimentation. For example, they can use Amazon SageMaker JumpStart capabilities to access the latest pre-trained, open source models. Additionally, it helps reduce the effort AI/ML teams spend moving from experimentation to solution industrialization, because the development environment provides the same AWS core services, restricted to development capabilities.

Reliability

SageMaker Studio domains are deployed in a VPC-only mode to manage internet access and only allow access to intended AWS services. The network is deployed in two Availability Zones to protect against a single point of failure, achieving greater resiliency and availability of the platform to its users.

Changes to JuMa workspaces are automatically deployed and tested to development and integration environments, using IaC and CI/CD pipelines, before upgrading customer environments.

Finally, data stored in Amazon Elastic File System (Amazon EFS) for SageMaker Studio domains is retained for backup purposes even after volumes are deleted.

Conclusion

In this post, we described how BMW Group in collaboration with AWS ProServe developed a fully managed AI platform service on AWS using SageMaker Studio and other AWS serverless and managed services.

With JuMa, BMW’s AI/ML teams are empowered to unlock new business value by accelerating experimentation as well as time-to-market for disruptive AI solutions. Furthermore, by migrating from its on-premises platform, BMW can reduce the overall operational efforts and costs while also increasing sustainability and the overall security posture.

To learn more about running your AI/ML experimentation and development workloads on AWS, visit Amazon SageMaker Studio.


About the Authors

Marc Neumann is the head of the central AI Platform at BMW Group. He is responsible for developing and implementing strategies to use AI technology for business value creation across the BMW Group. His primary goal is to ensure that the use of AI is sustainable and scalable, meaning it can be consistently applied across the organization to drive long-term growth and innovation. Through his leadership, Neumann aims to position the BMW Group as a leader in AI-driven innovation and value creation in the automotive industry and beyond.

Amor Steinberg is a Machine Learning Engineer at BMW Group and the service lead of Jupyter Managed, a new service that aims to provide a code-centric analytics and machine learning workbench for engineers and data scientists at the BMW Group. His past experience as a DevOps Engineer at financial institutions gave him a unique understanding of the challenges that face banks in the European Union and of the balance between striving for technological innovation, complying with laws and regulations, and maximizing security for customers.

Marinus Krommenhoek is a Senior Cloud Solution Architect and a Software Developer at BMW Group. He is enthusiastic about modernizing the IT landscape with state-of-the-art services that add high value and are easy to maintain and operate. Marinus is a big advocate of microservices, serverless architectures, and agile working. He has a record of working with distributed teams across the globe within large enterprises.

Nicolas Jacob Baer is a Principal Cloud Application Architect at AWS ProServe with a strong focus on data engineering and machine learning, based in Switzerland. He works closely with enterprise customers to design data platforms and build advanced analytics and ML use cases.

Joaquin Rinaudo is a Principal Security Architect at AWS ProServe. He is passionate about building solutions that help developers improve their software quality. Prior to AWS, he worked across multiple domains in the security industry, from mobile security to cloud and compliance-related topics. In his free time, Joaquin enjoys spending time with family and reading science-fiction novels.

Shukhrat Khodjaev is a Senior Global Engagement Manager at AWS ProServe. He specializes in delivering impactful big data and AI/ML solutions that enable AWS customers to maximize their business value through data utilization.

Read More

Automating product description generation with Amazon Bedrock

In today’s ever-evolving world of ecommerce, the influence of a compelling product description cannot be overstated. It can be the decisive factor that turns a potential visitor into a paying customer or sends them clicking off to a competitor’s site. The manual creation of these descriptions across a vast array of products is a labor-intensive process, and it can slow down the velocity of new innovation. This is where Amazon Bedrock with its generative AI capabilities steps in to reshape the game. In this post, we dive into how Amazon Bedrock is transforming the product description generation process, empowering e-retailers to efficiently scale their businesses while conserving valuable time and resources.

Unlocking the power of generative AI in retail

Generative AI has captured the attention of boards and CEOs worldwide, prompting them to ask, “How can we leverage generative AI for our business?” One of the most promising applications of generative AI in ecommerce is using it to craft product descriptions. Retailers and brands have invested significant resources in testing and evaluating the most effective descriptions, and generative AI excels in this area.

Creating engaging and informative product descriptions for a vast catalog is a monumental task, especially for global ecommerce platforms. Manual translation and adaptation of product descriptions for each market consumes time and resources. This results in generic or incomplete descriptions, leading to reduced sales and customer satisfaction.

The power of Amazon Bedrock: AI-generated product descriptions

Amazon Bedrock is a fully managed service that simplifies generative AI development, offering high-performing foundation models (FMs) from leading AI companies like AI21 Labs, Anthropic, Cohere, Meta, Stability AI, and Amazon through a single API. It provides a comprehensive set of capabilities for building generative AI applications while ensuring privacy and security are maintained. With Amazon Bedrock, you can experiment with various FMs and customize them privately using techniques like fine-tuning and Retrieval Augmented Generation (RAG). The platform enables you to create managed agents for complex business tasks without the need for coding, such as booking travel, processing insurance claims, creating ad campaigns, and managing inventory.

For example, ecommerce platforms can initially generate basic product descriptions that include size, color, and price. However, Amazon Bedrock’s flexibility allows these descriptions to be fine-tuned to incorporate customer reviews, integrate brand-specific language, and highlight specific product features, resulting in tailored descriptions that resonate with the target audience. Moreover, Amazon Bedrock offers access to foundation models from Amazon and leading AI startups through an intuitive API, making the entire process seamless and efficient.

Using AI can have the following impact on the product description process:

  • Faster approvals – Vendors experience a streamlined process, moving from product listing to approval in under an hour, eliminating frustrating delays
  • Improved product listing velocity – When automated, your ecommerce marketplace sees a surge in product listings, offering consumers access to the latest merchandise nearly instantaneously
  • Future-proofing – By embracing cutting-edge AI, you secure your position as a forward-looking platform ready to meet evolving market demands
  • Innovation – This solution liberates teams from mundane tasks, allowing them to focus on higher-value work and fostering a culture of innovation

Solution overview

Before we dive into the technical details, let’s look at a high-level preview of what this solution offers. The solution allows you to create and manage product descriptions for your ecommerce platform. It empowers your platform to:

  • Generate descriptions from text – With the power of generative AI, Amazon Bedrock can convert plain text descriptions into vivid, informative, and captivating product descriptions.
  • Craft images – Beyond text, it can also craft images that align perfectly with the product descriptions, enhancing the visual appeal of your listings.
  • Enhance existing content – Do you have existing product descriptions that need a fresh perspective? Amazon Bedrock can take your current content and make it even more compelling and engaging.

This solution is available in the AWS Solutions Library. The accompanying README file contains all the information you need to get started, from requirements to deployment guidelines.

The system architecture comprises several core components:

  • UI portal – This is the user interface (UI) designed for vendors to upload product images.
  • Amazon Rekognition – Amazon Rekognition is an image analysis service that detects objects, text, and labels in images.
  • Amazon Bedrock – Foundation models in Amazon Bedrock use the labels detected by Amazon Rekognition to generate product descriptions.
  • AWS Lambda – AWS Lambda provides serverless compute for processing.
  • Product database – The central repository stores vendor products, images, labels, and generated descriptions. This could be any database of your choice. Note that in this solution, all of the storage is in the UI.
  • Admin portal – This portal provides oversight of the system and product listings, ensuring smooth operation. This is not part of the solution; we’ve added it for understanding.

The following diagram illustrates the flow of data and interactions within the system.

The workflow includes the following steps:

  1. The client initiates a request to the Amazon API Gateway REST API.
  2. Amazon API Gateway passes the request to AWS Lambda through a proxy integration.
  3. When operating on product image inputs, AWS Lambda calls Amazon Rekognition to detect objects in the image.
  4. AWS Lambda calls LLMs hosted by Amazon Bedrock, such as the Amazon Titan language models, to generate product descriptions.
  5. The response is passed back from AWS Lambda to Amazon API Gateway.
  6. Finally, HTTP response from Amazon API Gateway is returned to the client.

Example use case

Imagine a vendor uploads a product image of shoes, and Amazon Rekognition identifies key attributes like “white shoes,” “sneaker,” and “durable.” The Amazon Bedrock Titan AI takes this information and generates a product description like, “Here is a draft product description for a canvas running shoe based on the product photo: Introducing the Canvas Runner, the perfect lightweight sneaker for your active lifestyle. This running shoe features a breathable canvas upper with leather accents for a stylish, classic look. The lace-up design provides a secure fit, while the padded tongue and collar add comfort. Inside, a removable cushioned insole supports and comforts your feet. The EVA midsole absorbs shock with each step, reducing fatigue. Flex grooves in the rubber outsole ensure flexibility and traction. With its simple, retro-inspired style, the Canvas Runner seamlessly transitions from workouts to everyday wear. Whether you’re running errands or running miles, this versatile sneaker will keep you moving in comfort and style.”
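As a rough sketch of steps 3 and 4 of the workflow, the following Lambda-style code detects labels with Amazon Rekognition and passes them to a Titan text model on Amazon Bedrock. The model ID, prompt, and generation parameters are illustrative assumptions, not the solution’s exact configuration.

import json
import boto3

rekognition = boto3.client("rekognition")
bedrock_runtime = boto3.client("bedrock-runtime")

def generate_description(image_bytes):
    # Step 3: detect objects and attributes in the product image
    labels = rekognition.detect_labels(Image={"Bytes": image_bytes}, MaxLabels=10)
    label_names = [label["Name"] for label in labels["Labels"]]

    # Step 4: ask a Titan text model to draft a description from the labels
    prompt = (
        "Write an engaging product description for an item with these "
        "attributes: " + ", ".join(label_names)
    )
    body = json.dumps({
        "inputText": prompt,
        "textGenerationConfig": {"maxTokenCount": 512, "temperature": 0.7},
    })
    response = bedrock_runtime.invoke_model(
        modelId="amazon.titan-text-express-v1", body=body
    )
    result = json.loads(response["body"].read())
    return result["results"][0]["outputText"]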

Design details

Let’s explore the components in more detail:

  • User interface:
    • Front end – The front end of the vendor portal allows vendors to upload product images and displays product listings.
    • API calls – The portal communicates with the backend through APIs to process images and generate descriptions.
  • Amazon Rekognition:
    • Image analysis – Triggered by API calls, Amazon Rekognition analyzes images and detects objects, text, and labels.
    • Label output – It outputs label data derived from the analysis.
  • Amazon Bedrock:
    • NLP text generation – Amazon Bedrock uses the Amazon Titan natural language processing (NLP) model to generate textual descriptions.
    • Label integration – It takes the labels detected by Amazon Rekognition as input to generate product descriptions.
    • Style matching – Amazon Bedrock provides fine-tuning capabilities for Amazon Titan models to ensure that the generated descriptions match the style of the platform.
  • AWS Lambda:
    • Processing – Lambda handles the API calls to services.
  • Product database:
    • Flexible database – The product database is chosen based on customer preferences and requirements. Note this is not provided as part of the solution.

Additional capabilities

This solution goes beyond just generating product descriptions. It offers two more incredible options:

  • Image and description generation from text – With the power of generative AI, Amazon Bedrock can take text descriptions and create corresponding images along with detailed product descriptions (a sketch follows this list). Consider the potential:
    • Instantly visualizing products from text.
    • Automating image creation for large catalogs.
    • Enhancing customer experience with rich visuals.
    • Reducing content creation time and costs.
  • Description enhancement – If you already have existing product descriptions, Amazon Bedrock can enhance them. Simply supply the text and the prompt, and Amazon Bedrock will skillfully enhance and enrich the content, rendering it highly captivating and engaging for your customers.
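As a hedged sketch of the text-to-image option above, the following call uses the Amazon Titan Image Generator model on Amazon Bedrock. The model ID and request schema reflect our understanding of that model’s API, and the prompt and image size are illustrative.

import base64
import json
import boto3

bedrock_runtime = boto3.client("bedrock-runtime")

# Request schema assumed from the Titan Image Generator model documentation
body = json.dumps({
    "taskType": "TEXT_IMAGE",
    "textToImageParams": {"text": "A lightweight canvas running shoe on a white background"},
    "imageGenerationConfig": {"numberOfImages": 1, "height": 1024, "width": 1024},
})
response = bedrock_runtime.invoke_model(
    modelId="amazon.titan-image-generator-v1", body=body
)
images = json.loads(response["body"].read())["images"]  # base64-encoded images
with open("product.png", "wb") as f:
    f.write(base64.b64decode(images[0]))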

Conclusion

In the fiercely competitive world of ecommerce, staying at the forefront of innovation is imperative. Amazon Bedrock offers a transformative capability for e-retailers looking to enhance their product content, optimize their listing process, and drive sales. With the power of AI-generated product descriptions, businesses can create compelling, informative, and culturally relevant content that resonates deeply with customers. The future of ecommerce has arrived, and it’s driven by machine learning with Amazon Bedrock.

Are you ready to unlock the full potential of AI-powered product descriptions? Take the next step in revolutionizing your ecommerce platform. Visit the AWS Solutions Library and explore how Amazon Bedrock can transform your product descriptions, streamline your processes, and boost your sales. It’s time to supercharge your ecommerce with Amazon Bedrock!


About the Authors

Dhaval Shah is a Senior Solutions Architect at AWS, specializing in Machine Learning. With a strong focus on digital native businesses, he empowers customers to leverage AWS and drive their business growth. As an ML enthusiast, Dhaval is driven by his passion for creating impactful solutions that bring positive change. In his leisure time, he indulges in his love for travel and cherishes quality moments with his family.

Doug Tiffan is the Head of World Wide Solution Strategy for Fashion & Apparel at AWS. In his role, Doug works with Fashion & Apparel executives to understand their goals and align with them on the best solutions. Doug has over 30 years of experience in retail, holding several merchandising and technology leadership roles. Doug holds a BBA from Texas A&M University and is based in Houston, Texas.

Nikhil Sharma is a Solutions Architecture Leader at Amazon Web Services (AWS) where he and his team of Solutions Architects help AWS customers solve critical business challenges using AWS cloud technologies and services.

Kevin Bell is a Sr. Solutions Architect at AWS based in Seattle. He has been building things in the cloud for about 10 years. You can find him online as @bellkev on GitHub.

Nipun Chagari is a Principal Solutions Architect based in the Bay Area, CA. Nipun is passionate about helping customers adopt Serverless technology to modernize applications and achieve their business objectives. His recent focus has been on assisting organizations in adopting modern technologies to enable digital transformation. Apart from work, Nipun finds joy in playing volleyball, cooking and traveling with his family.

Marshall Bunch is a Solutions Architect at AWS helping North American customers design secure, scalable and cost-effective workloads in the cloud. His passion lies in solving age-old business problems where data and the newest technologies enable novel solutions. Beyond his professional pursuits, Marshall enjoys hiking and camping in Colorado’s beautiful Rocky Mountains.

Altaaf Dawoodjee is a Solutions Architect Leader who supports AdTech customers in the Digital Native Business (DNB) segment at Amazon Web Services (AWS). He has over 20 years of experience in technology and has deep expertise in analytics. He is passionate about helping drive successful business outcomes for his customers by leveraging the AWS cloud.

Scott Bell is a dynamic leader and innovator with 25+ years of technology management experience. He is passionate about leading and developing teams in providing technology to meet the challenges of global users and businesses. He has extensive experience in leading technology teams that provide global technology solutions supporting 35+ languages. He is also passionate about the way AI and generative AI transform businesses and support customers’ current unmet needs.

Sachin Shetti is a Principal Customer Solution Manager at AWS. He is passionate about helping enterprises succeed and realize significant benefits from cloud adoption, driving everything from basic migration to large-scale cloud transformation across people, processes, and technology. Prior to joining AWS, Sachin worked as a software developer for over 12 years and held multiple senior leadership positions leading technology delivery and transformation in healthcare, financial services, retail, and insurance. He has an Executive MBA and a Bachelor’s degree in Mechanical Engineering.

Read More

Optimizing costs for Amazon SageMaker Canvas with automatic shutdown of idle apps

Amazon SageMaker Canvas is a rich, no-code Machine Learning (ML) and Generative AI workspace that has allowed customers all over the world to more easily adopt ML technologies to solve old and new challenges thanks to its visual, no-code interface. It does so by covering the ML workflow end-to-end: whether you’re looking for powerful data preparation and AutoML, managed endpoint deployment, simplified MLOps capabilities, or ready-to-use models powered by AWS AI services and Generative AI, SageMaker Canvas can help you achieve your goals.

As companies of all sizes adopt SageMaker Canvas, customers asked for ways to optimize cost. As defined in the AWS Well-Architected Framework, a cost-optimized workload fully uses all resources, meets your functional requirements, and achieves an outcome at the lowest possible price point.

Today, we’re introducing a new way to further optimize costs for SageMaker Canvas applications. SageMaker Canvas now collects Amazon CloudWatch metrics that provide insight into app usage and idleness. Customers can use this information to automatically shut down idle SageMaker Canvas applications and avoid incurring unintended costs.

In this post, we’ll show you how to automatically shut down idle SageMaker Canvas apps to control costs by using a simple serverless architecture. Templates used in this post are available in GitHub.

Understanding and tracking costs

Education is always the first step toward understanding and controlling costs for any workload, either on premises or in the cloud. Let’s start by reviewing the SageMaker Canvas pricing model. In a nutshell, SageMaker Canvas has a pay-as-you-go pricing model, based on two dimensions:

  • Workspace instance – formerly known as session time, this is the cost associated with running the SageMaker Canvas app
  • AWS service charges – costs associated with training models, deploying endpoints, and generating inferences (the resources spun up by SageMaker Canvas)

Customers always have full control over the resources that are launched by SageMaker Canvas and can keep track of costs associated with the SageMaker Canvas app by using the AWS Billing and Cost Management service. For more information, refer to Manage billing and cost in SageMaker Canvas.

To limit the cost associated with workspace instances, as a best practice you must log out rather than just closing the browser tab. To log out, choose the Log out button on the left panel of the SageMaker Canvas app.

Automatically shutting down SageMaker Canvas applications

For IT administrators who are looking to provide automated controls for shutting down SageMaker Canvas applications and keeping costs under control, there are two approaches:

  1. Shut down applications on a schedule (for example, every day at 19:00 or every Friday at 18:00)
  2. Automatically shut down idle applications (for example, when the application hasn’t been used for 2 hours)

Shut down applications on a schedule

Canvas Scheduled Shutdown Architecture

Scheduled shutdown of SageMaker Canvas applications can be achieved with very little effort by using a cron expression (with an Amazon EventBridge cron rule) and a compute component (an AWS Lambda function) that calls the Amazon SageMaker DeleteApp API. This approach has been discussed in the Provision and manage ML environments with Amazon SageMaker Canvas using AWS CDK and AWS Service Catalog post, and implemented in the associated GitHub repository.

One of the advantages of the above architecture is that it is very simple to duplicate to achieve scheduled creation of the SageMaker Canvas app. By using a combination of scheduled creation and scheduled deletion, a cloud administrator can make sure that the SageMaker Canvas application is ready to be used whenever users start their business day (e.g., 9 AM on a work day), and that the app automatically shuts down at the end of the business day (e.g., 7 PM on a work day, and always during weekends). All you need to do is change the line of code calling the DeleteApp API to call CreateApp instead, and update the cron expression to reflect the desired app creation time.
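A minimal sketch of the Lambda function behind this schedule might look like the following. The domain ID and user profile name are placeholders that would normally come from environment variables or the EventBridge event payload.

import boto3

sagemaker = boto3.client("sagemaker")

def lambda_handler(event, context):
    # Placeholders: supply these via environment variables or the event payload
    domain_id = "d-xxxxxxxxxxxx"
    user_profile_name = "data-scientist-1"

    # Scheduled shutdown: delete the Canvas app regardless of activity
    sagemaker.delete_app(
        DomainId=domain_id,
        UserProfileName=user_profile_name,
        AppType="Canvas",
        AppName="default",  # the Canvas app name in a standard setup
    )
    # For scheduled creation instead, call sagemaker.create_app with the same
    # arguments and adjust the EventBridge cron expression accordingly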

While this approach is very easy to implement and test, a drawback of the suggested architecture is that it does not take into account whether an application is currently being used, shutting it down regardless of its activity status. Depending on the situation, this might cause friction with active users, who might suddenly see their session terminated.

You can retrieve the template associated with this architecture from the following GitHub repository:

Automatically shut down idle applications

Canvas Shutdown on Idle Architecture

Starting today, Amazon SageMaker Canvas emits CloudWatch metrics that provide insight into app usage and idleness. This allows an administrator to define a solution that reads the idleness metric, compares it against a threshold, and applies specific logic for automatic shutdown. A more detailed overview of the idleness metric emitted by SageMaker Canvas is provided later in this post.

To achieve automatic shutdown of SageMaker Canvas applications based on the idleness metric, we provide an AWS CloudFormation template. This template consists of three main components:

  1. An Amazon CloudWatch alarm, which runs a query to check the MAX value of the TimeSinceLastActive metric. If this value is greater than a threshold provided as input to the CloudFormation template, it triggers the rest of the automation. This query can be run on a single user profile, on a single domain, or across all domains. Depending on the level of control that you wish to have, you can use:
    1. the all-domains-all-users template, which checks this across all users and all domains in the region where the template is deployed
    2. the one-domain-all-users template, which checks this across all users in one domain in the region where the template is deployed
    3. the one-domain-one-user template, which checks this for one user profile, in one domain, in the region where the template is deployed
  2. The alarm state change creates an event on the default event bus in Amazon EventBridge, which has an Amazon EventBridge Rule set up to trigger an AWS Lambda function
  3. The AWS Lambda function identifies which SageMaker Canvas apps have been idle for longer than the specified threshold and deletes them with the DeleteApp API (a condensed sketch of this function follows the list).
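The sketch below reuses the metric query shown later in this post to find idle apps and then deletes them. The threshold value and the parsing of the series label (which carries the GROUP BY dimensions) are illustrative assumptions.

import datetime
import boto3

cloudwatch = boto3.client("cloudwatch")
sagemaker = boto3.client("sagemaker")
IDLE_THRESHOLD_SECONDS = 7200  # illustrative: 2 hours

def lambda_handler(event, context):
    # Query the idle time per domain and user (same query as later in this post)
    results = cloudwatch.get_metric_data(
        MetricDataQueries=[{
            "Id": "q1",
            "Expression": 'SELECT MAX(TimeSinceLastActive) FROM "/aws/sagemaker/Canvas/AppActivity" GROUP BY DomainId, UserProfileName',
            "Period": 900,
        }],
        StartTime=datetime.datetime.now() - datetime.timedelta(hours=1),
        EndTime=datetime.datetime.now(),
    )
    for series in results["MetricDataResults"]:
        if series["Values"] and max(series["Values"]) > IDLE_THRESHOLD_SECONDS:
            # Assumption: the series label carries the group values,
            # e.g. "d-xxxxxxxxxxxx user-profile-name"
            domain_id, user_profile_name = series["Label"].split()[:2]
            sagemaker.delete_app(
                DomainId=domain_id,
                UserProfileName=user_profile_name,
                AppType="Canvas",
                AppName="default",
            )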

You can retrieve the AWS CloudFormation templates associated with this architecture from the following GitHub repository:

How the SageMaker Canvas idleness metric works

SageMaker Canvas emits a TimeSinceLastActive metric in the /aws/sagemaker/Canvas/AppActivity namespace, which shows the number of seconds that the app has been idle with no user activity. We can use this new metric to trigger an automatic shutdown of the SageMaker Canvas app when it has been idle for a defined period. SageMaker Canvas exposes the TimeSinceLastActive with the following schema:

{
    "Namespace": "/aws/sagemaker/Canvas/AppActivity",
    "Dimensions": [
        [
            "DomainId",
            "UserProfileName"
        ]
    ],
    "Metrics": [
        {
            "Name": "TimeSinceLastActive",
            "Unit": "Seconds",
            "Value": 12345
        }
    ]
}

The key components of this metric are as follows:

  • Dimensions, in particular DomainId and UserProfileName, which allow an administrator to pinpoint which applications are idle across all domains and users
  • Value of the metric, which indicates the number of seconds since the last activity in the SageMaker Canvas application. SageMaker Canvas considers the following as activity:
    • Any action taken in the SageMaker Canvas application (clicking a button, transforming a dataset, generating an in-app inference, deploying a model)
    • Using a ready-to-use model or interacting with the Generative AI models using the chat interface
    • A batch inference scheduled to run at a specific time; for more information, refer to Manage automations

This metric can be read via Amazon CloudWatch APIs such as GetMetricData. For example, using the AWS SDK for Python (Boto3):

import boto3, datetime

cw = boto3.client('cloudwatch')

# Query the MAX idle time per domain and user profile,
# aggregated over 15-minute (900-second) periods
metric_data_results = cw.get_metric_data(
    MetricDataQueries=[
        {
            "Id": "q1",
            "Expression": 'SELECT MAX(TimeSinceLastActive) FROM "/aws/sagemaker/Canvas/AppActivity" GROUP BY DomainId, UserProfileName',
            "Period": 900
        }
    ],
    StartTime=datetime.datetime(2023, 1, 1),
    EndTime=datetime.datetime.now(),
    ScanBy='TimestampAscending'
)

The query extracts the MAX value of TimeSinceLastActive from the namespace associated with SageMaker Canvas after grouping these values by DomainId and UserProfileName.

Deploying and testing the auto-shutdown solution

To deploy the auto-shutdown stack, do the following:

  1. Download the AWS CloudFormation template that corresponds to the solution you want to implement from the above GitHub repository. Choose whether you want to implement the solution for all SageMaker Domains, for a single SageMaker Domain, or for a single user;
  2. Update template parameters:
    1. The idle timeout – the time (in seconds) that the SageMaker Canvas app is allowed to stay idle before it gets shut down; the default value is 2 hours
    2. The alarm period – the aggregation time (in seconds) used by the CloudWatch alarm to compute the idle timeout; the default value is 20 minutes
    3. (optional) SageMaker Domain ID and user profile name
  3. Deploy the CloudFormation stack to create the resources

Once deployed (this should take less than two minutes), the AWS Lambda function and Amazon CloudWatch alarm are configured to automatically shut down the Canvas app when it’s idle. To test the auto-shutdown solution, do the following:

  1. Make sure that the SageMaker Canvas app is running within the right domain and with the right user profile (if you have configured them).
  2. Stop using the SageMaker Canvas app and wait for the idle timeout period (by default, 2 hours)
  3. Check that the app is stopped after being idle for the threshold time by verifying that the CloudWatch alarm was triggered and, after triggering the automation, returned to the normal state.

In our test, we set the idle timeout period to two hours (7,200 seconds). In the following graph plotted by Amazon CloudWatch metrics, you can see that the SageMaker Canvas app emitted the TimeSinceLastActive metric until the threshold was met (1), which triggered the alarm. Once the alarm was triggered, the AWS Lambda function ran, deleting the app and bringing the metric back below the threshold (2).

Canvas Auto-shutdown Metrics Plot

Conclusion

In this post, we implemented an automated shutdown solution for idle SageMaker Canvas apps using AWS Lambda, an Amazon CloudWatch alarm, and the newly emitted idleness metric from SageMaker Canvas. Thanks to this solution, customers can not only optimize costs for their ML workloads but also avoid unintended charges for applications they forgot were running in their SageMaker domain.

We’re looking forward to seeing what new use cases and workloads customers can solve with the peace of mind brought by this solution. For more examples of how SageMaker Canvas can help you achieve your business goals, refer to the following posts:

To learn how you can run production-level workloads with Amazon SageMaker Canvas, refer to the following posts:


About the authors


Davide Gallitelli is a Senior Specialist Solutions Architect for AI/ML. He is based in Brussels and works closely with customers all around the globe that are looking to adopt Low-Code/No-Code Machine Learning technologies, and Generative AI. He has been a developer since he was very young, starting to code at the age of 7. He started learning AI/ML at university, and has fallen in love with it since then.


Huong Nguyen is a Sr. Product Manager at AWS. She is leading the data ecosystem integration for SageMaker, with 14 years of experience building customer-centric and data-driven products for both enterprise and consumer spaces.


Gunjan Garg is a Principal Engineer on the Amazon SageMaker team at AWS, providing technical leadership for the product. She has worked in several roles in the AI/ML org for the last 5 years and is currently focused on Amazon SageMaker Canvas.


Ziyao Huang is a Software Development Engineer with Amazon SageMaker Data Wrangler. He is passionate about building great products that make ML easy for customers. Outside of work, Ziyao likes to read and hang out with his friends.

Read More