NVIDIA GeForce RTX 50 Series Accelerates Adobe Premiere Pro and Media Encoder’s 4:2:2 Color Sampling

Video editing workflows are getting a lot more colorful.

Adobe recently announced massive updates to Adobe Premiere Pro (beta) and Adobe Media Encoder, including PC support for 4:2:2 video color editing.

The 4:2:2 color format is a game changer for professional video editors, as it retains nearly as much color information as 4:4:4 while greatly reducing file size. This improves color grading and chroma keying — using color information to isolate a specific range of hues — while maximizing efficiency and quality.

In addition, new NVIDIA GeForce RTX 5090 and 5080 laptops — built on the NVIDIA Blackwell architecture — are out now, accelerating 4:2:2 and advanced AI-powered features across video-editing workflows.

Adobe and other industry partners are attending NAB Show — a premier gathering of over 100,000 leaders in the broadcast, media and entertainment industries — running April 5-9 in Las Vegas. Professionals in these fields will come together for education, networking and exploring the latest technologies and trends.

Shed Some Color on 4:2:2

Consumer cameras limited to 4:2:0 chroma subsampling capture only a quarter of the color information of a full 4:4:4 signal. That’s acceptable for video playback in browsers, but professional video editors often rely on cameras that capture 4:2:2 color for its higher color fidelity.

With 4:2:2 support in the Premiere Pro beta, video data can now carry double the color information of 4:2:0 with just a 1.3x increase in raw file size. This unlocks several key benefits within professional video-production workflows:

Increased Color Accuracy: 10-bit 4:2:2 retains more color information compared with 8-bit 4:2:0, leading to more accurate color representation and better color grading results.

More Flexibility: The extra color data allows for increased flexibility during color correction and grading, enabling more nuanced adjustments and corrections.

Improved Keying: 4:2:2 is particularly beneficial for keying — including green screening — as it enables cleaner, more accurate extraction of the subject from the background, as well as cleaner edges of small keyed objects like hair.

Smaller File Sizes: Compared with 4:4:4, 4:2:2 reduces file sizes without significantly impacting picture quality, offering an optimal balance between quality and storage.
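
Those ratios are easy to sanity-check with a little arithmetic. Here’s a minimal Python sketch, assuming uncompressed frames and equal bit depth (the assumptions behind the roughly 1.3x figure above):

# Per-pixel sample counts: one full-resolution luma (Y) sample plus two chroma
# planes (Cb, Cr), each stored at a fraction of the luma resolution.
CHROMA_FRACTION = {"4:4:4": 1.0, "4:2:2": 0.5, "4:2:0": 0.25}

def samples_per_pixel(scheme: str) -> float:
    return 1.0 + 2 * CHROMA_FRACTION[scheme]

# 4:2:2 carries twice the chroma data of 4:2:0...
print(CHROMA_FRACTION["4:2:2"] / CHROMA_FRACTION["4:2:0"])      # 2.0
# ...at only about 1.33x the raw frame size...
print(samples_per_pixel("4:2:2") / samples_per_pixel("4:2:0"))  # ~1.33
# ...and two-thirds the size of full 4:4:4.
print(samples_per_pixel("4:2:2") / samples_per_pixel("4:4:4"))  # ~0.67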

Combining 4:2:2 support with NVIDIA hardware increases creative possibilities.

Advanced Video Editing

Prosumer-grade cameras from most major brands support HEVC and H.264 10-bit 4:2:2 formats to deliver superior image quality, manageable file sizes and the flexibility needed for professional video production.

GeForce RTX 50 Series GPUs paired with Microsoft Windows 11 provide GPU-powered decode acceleration for HEVC and H.264 10-bit 4:2:2 formats.

GPU-powered decode enables faster-than-real-time playback without stuttering, the ability to work with original camera media instead of proxies, smoother timeline responsiveness and reduced CPU load — freeing system resources for multi-app workflows and creative tasks.

The RTX 50 Series’ 4:2:2 hardware can decode up to six 4K 60-frames-per-second video sources on an RTX 5090-equipped Studio PC, enabling smooth multi-camera video-editing workflows in Adobe Premiere Pro.

Video exports are also accelerated with NVIDIA’s ninth-generation encoder and sixth-generation decoder.

NVIDIA and GeForce RTX Laptop GPU encoders and decoders.

In GeForce RTX 50 Series GPUs, the ninth-generation NVIDIA video encoder, NVENC, offers an 8% BD-BR (Bjøntegaard delta bit rate) improvement in video encoding efficiency when exporting to HEVC in Premiere Pro.

Adobe AI Accelerated

Adobe delivers an impressive array of advanced AI features for idea generation, enabling streamlined processes, improved productivity and opportunities to explore new artistic avenues — all accelerated by NVIDIA RTX GPUs.

For example, Adobe Media Intelligence, a feature in Premiere Pro (beta) and After Effects (beta), uses AI to analyze footage and apply semantic tags to clips. This lets users more easily and quickly find specific footage by describing its content, including objects, locations, camera angles and even transcribed spoken words.

Media Intelligence runs 30% faster on the GeForce RTX 5090 Laptop GPU compared with the GeForce RTX 4090 Laptop GPU.

In addition, the Enhance Speech feature in Premiere Pro (beta) improves the quality of recorded speech by filtering out unwanted noise and making the audio sound clearer and more professional. Enhance Speech runs 7x faster on GeForce RTX 5090 Laptop GPUs compared with the MacBook Pro M4 Max.

Visit Adobe’s Premiere Pro page to download a free trial of the beta and explore the slew of AI-powered features across the Adobe Creative Cloud and Substance 3D apps.

Unleash (AI)nfinite Possibilities

GeForce RTX 5090 and 5080 Series laptops deliver the largest-ever generational leap in portable performance for creating, gaming and all things AI.

They can run creative generative AI models such as Flux up to 2x faster with a smaller memory footprint, compared with the previous generation.

The laptops’ ninth-generation NVIDIA encoders elevate video editing and livestreaming workflows, and the GPUs come with NVIDIA DLSS 4 technology and up to 24GB of VRAM to tackle massive 3D projects.

NVIDIA Max-Q hardware technologies use AI to optimize every aspect of a laptop — the GPU, CPU, memory, thermals, software, display and more — to deliver incredible performance and battery life in thin and quiet devices.

All GeForce RTX 50 Series laptops include NVIDIA Studio platform optimizations, with over 130 GPU-accelerated content creation apps and exclusive Studio tools including NVIDIA Studio Drivers, tested extensively to enhance performance and maximize stability in popular creative apps.

The game-changing NVIDIA GeForce RTX 5090 and 5080 GPU laptops are available now.

Adobe will participate in the Creator Lab at NAB Show, offering hands-on training for editors to elevate their skills with Adobe tools. Attend a 30-minute session and try out Puget Systems laptops equipped with GeForce RTX 5080 Laptop GPUs to experience blazing-fast performance and demo new generative AI features.

Use NVIDIA’s product finder to explore available GeForce RTX 50 Series laptops with complete specifications.

New creative app updates and optimizations are powered by the NVIDIA Studio platform. Follow NVIDIA Studio on Instagram, X and Facebook. Access tutorials on the Studio YouTube channel and get updates directly in your inbox by subscribing to the Studio newsletter.

See notice regarding software product information.

Read More

The Role of Prosody in Spoken Question Answering

Spoken language understanding research to date has generally carried a heavy text perspective. Most datasets are derived from text, which is subsequently synthesized into speech, and most models typically rely on automatic transcriptions of speech. This is to the detriment of prosody: additional information carried by the speech signal beyond the phonetics of the words themselves, which is difficult to recover from text alone. In this work, we investigate the role of prosody in Spoken Question Answering. By isolating prosodic and lexical information on the SLUE-SQA-5 dataset, which consists of… (Apple Machine Learning Research)

Mutual Reinforcement of LLM Dialogue Synthesis and Summarization Capabilities for Few-Shot Dialogue Summarization

In this work, we propose Mutual Reinforcing Data Synthesis (MRDS) within LLMs to improve the few-shot dialogue summarization task. Unlike prior methods that require external knowledge, we mutually reinforce the LLM’s dialogue synthesis and summarization capabilities, allowing them to complement each other during training and enhance overall performance. The dialogue synthesis capability is enhanced by directed preference optimization with preference scoring from the summarization capability. The summarization capability is enhanced by the additional high-quality dialogue-summary paired data produced… (Apple Machine Learning Research)

Universally Instance-Optimal Mechanisms for Private Statistical Estimation

We consider the problem of instance-optimal statistical estimation under the constraint of differential privacy, where mechanisms must adapt to the difficulty of the input dataset. We prove a new instance-specific lower bound using a new divergence and show that it characterizes the local minimax optimal rates for private statistical estimation. We propose two new mechanisms that are universally instance-optimal for general estimation problems up to logarithmic factors. Our first mechanism, the total variation mechanism, builds on the exponential mechanism with stable approximations of the total… (Apple Machine Learning Research)

Modeling Speech Emotion With Label Variance and Analyzing Performance Across Speakers and Unseen Acoustic Conditions

Spontaneous speech emotion data usually contain perceptual grades, where graders assign an emotion score after listening to the speech files. Such perceptual grades introduce uncertainty in labels due to grader opinion variation. Grader variation is addressed by using consensus grades as ground truth, where the emotion with the highest vote is selected; as a consequence, this fails to consider ambiguous instances where a speech sample may contain multiple emotions, as captured through grader opinion uncertainty. We demonstrate that using the probability density function of the emotion grades as… (Apple Machine Learning Research)

Introducing AWS MCP Servers for code assistants (Part 1)

We’re excited to announce the open source release of AWS MCP Servers for code assistants — a suite of specialized Model Context Protocol (MCP) servers that bring Amazon Web Services (AWS) best practices directly to your development workflow. Our specialized AWS MCP servers combine deep AWS knowledge with agentic AI capabilities to accelerate development across key areas. Each AWS MCP Server focuses on a specific domain of AWS best practices, working together to provide comprehensive guidance throughout your development journey.

This post is the first in a series covering AWS MCP Servers. In this post, we walk through how these specialized MCP servers can dramatically reduce your development time while incorporating security controls, cost optimizations, and AWS Well-Architected best practices into your code. Whether you’re an experienced AWS developer or just getting started with cloud development, you’ll discover how to use AI-powered coding assistants to tackle common challenges such as complex service configurations, infrastructure as code (IaC) implementation, and knowledge base integration. By the end of this post, you’ll understand how to start using AWS MCP Servers to transform your development workflow and deliver better solutions, faster.

If you want to get started right away, skip ahead to the section “From concept to working code in minutes.”

AI is transforming how we build software, creating opportunities to dramatically accelerate development while improving code quality and consistency. Today’s AI assistants can understand complex requirements, generate production-ready code, and help developers navigate technical challenges in real time. This AI-driven approach is particularly valuable in cloud development, where developers need to orchestrate multiple services while maintaining security, scalability, and cost-efficiency.

Developers need code assistants that understand the nuances of AWS services and best practices. Specialized AI agents can address these needs by:

  • Providing contextual guidance on AWS service selection and configuration
  • Ensuring compliance with security best practices and regulatory requirements
  • Promoting efficient resource utilization and cost-effective solutions
  • Automating repetitive implementation tasks with AWS-specific patterns

This approach means developers can focus on innovation while AI assistants handle the undifferentiated heavy lifting of coding. Whether you’re using Amazon Q, Amazon Bedrock, or other AI tools in your workflow, AWS MCP Servers complement and enhance these capabilities with deep AWS-specific knowledge to help you build better solutions faster.

Model Context Protocol (MCP) is a standardized open protocol that enables seamless interaction between large language models (LLMs), data sources, and tools. This protocol allows AI assistants to use specialized tooling and to access domain-specific knowledge by extending the model’s capabilities beyond its built-in knowledge—all while keeping sensitive data local. Through MCP, general-purpose LLMs can now seamlessly access relevant knowledge beyond initial training data and be effectively steered towards desired outputs by incorporating specific context and best practices.
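
To make the protocol concrete, the following is a minimal MCP server sketch written with the open source Python MCP SDK (pip install mcp). The server name and tool here are hypothetical illustrations, not part of the AWS MCP Servers:

# A minimal MCP server over stdio using the Python MCP SDK's FastMCP helper.
# An MCP client can launch this script, discover the tool, and let the LLM call it.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("demo-server")  # hypothetical server name

@mcp.tool()
def greet(name: str) -> str:
    """Return a greeting for the given name."""
    return f"Hello, {name}!"

if __name__ == "__main__":
    mcp.run()  # serves requests over standard input/output by default

Everything a client sees (the tool’s name, signature, and docstring) comes from the function definition, which is what lets general-purpose LLMs discover and call domain-specific tools in a uniform way.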

Accelerate building on AWS

What if your AI assistant could instantly access deep AWS knowledge, understanding every AWS service, best practice, and architectural pattern? With MCP, we can transform general-purpose LLMs into AWS specialists by connecting them to specialized knowledge servers. This opens up exciting new possibilities for accelerating cloud development while maintaining security and following best practices.

Build on AWS in a fraction of the time, with best practices automatically applied from the first line of code. Skip hours of documentation research and immediately access ready-to-use patterns for complex services such as Amazon Bedrock Knowledge Bases. Our MCP Servers will help you write well-architected code from the start, implement AWS services correctly the first time, and deploy solutions that are secure, observable, and cost-optimized by design. Transform how you build on AWS today.

  • Enforce AWS best practices automatically – Write well-architected code from the start with built-in security controls, proper observability, and optimized resource configurations
  • Cut research time dramatically – Stop spending hours reading documentation. Our MCP Servers provide contextually relevant guidance for implementing AWS services correctly, addressing common pitfalls automatically
  • Access ready-to-use patterns instantly – Use pre-built AWS CDK constructs, Amazon Bedrock Agents schema generators, and Amazon Bedrock Knowledge Bases integration templates that follow AWS best practices from the start
  • Optimize cost proactively – Prevent over-provisioning as you design your solution by getting cost-optimization recommendations and generating a comprehensive cost report to analyze your AWS spending before deployment

To turn this vision into reality and make AWS development faster, more secure, and more efficient, we’ve created AWS MCP Servers: a suite of specialized servers that bring AWS best practices directly to your development workflow, combining deep AWS knowledge with AI capabilities to accelerate development across key areas. Each AWS MCP Server focuses on a specific domain of AWS best practices, working together to provide comprehensive guidance throughout your development journey.

Overview of domain-specific MCP Servers for AWS development

Our specialized MCP Servers are designed to cover distinct aspects of AWS development, each bringing deep knowledge to specific domains while working in concert to deliver comprehensive solutions:

  • Core – The foundation server that provides AI processing pipeline capabilities and serves as a central coordinator. It helps provide clear plans for building AWS solutions and can federate to other MCP servers as needed.
  • AWS Cloud Development Kit (AWS CDK) – Delivers AWS CDK knowledge with tools for implementing best practices, security configurations with cdk-nag, Powertools for AWS Lambda integration, and specialized constructs for generative AI services. It makes sure infrastructure as code (IaC) follows AWS Well-Architected principles from the start.
  • Amazon Bedrock Knowledge Bases – Enables seamless access to Amazon Bedrock Knowledge Bases so developers can query enterprise knowledge with natural language, filter results by data source, and use reranking for improved relevance.
  • Amazon Nova Canvas – Provides image generation capabilities using Amazon Nova Canvas through Amazon Bedrock, enabling the creation of visuals from text prompts and color palettes—perfect for mockups, diagrams, and UI design concepts.
  • Cost – Analyzes AWS service costs and generates comprehensive cost reports, helping developers understand the financial implications of their architectural decisions and optimize for cost-efficiency.

Prerequisites

To complete the solution, you need to have the following prerequisites in place:

  • uv package manager
  • Install Python using uv python install 3.13
  • AWS credentials with appropriate permissions
  • An MCP-compatible LLM client (such as Anthropic’s Claude for Desktop, Cline, Amazon Q CLI, or Cursor)

From concept to working code in minutes

You can download the AWS MCP Servers from GitHub or install them from PyPI. Here’s how to get started using your favorite code assistant with MCP support.

To install the MCP servers, add the following configuration to your MCP client’s settings file:

{
  "mcpServers": {
    "awslabs.core-mcp-server": {
      "command": "uvx",
      "args": [
        "awslabs.core-mcp-server@latest"
      ],
      "env": {
        "FASTMCP_LOG_LEVEL": "ERROR",
        "MCP_SETTINGS_PATH": "path to your mcp server settings"
      },
      "autoApprove": [],
      "disabled": false
    },
    "awslabs.bedrock-kb-retrieval-mcp-server": {
      "command": "uvx",
      "args": ["awslabs.bedrock-kb-retrieval-mcp-server@latest"],
      "env": {
        "AWS_PROFILE": "your-aws-profile",
        "AWS_REGION": "us-east-1"
      }
    },
    "awslabs.cdk-mcp-server": {
      "command": "uvx",
      "args": ["awslabs.cdk-mcp-server@latest"],
      "env": {
        "FASTMCP_LOG_LEVEL": "ERROR"
      }
    },
    "awslabs.cost-analysis-mcp-server": {
      "command": "uvx",
      "args": ["awslabs.cost-analysis-mcp-server@latest"],
      "env": {
        "AWS_PROFILE": "your-aws-profile",
        "FASTMCP_LOG_LEVEL": "ERROR"
      }
    },
    "awslabs.nova-canvas-mcp-server": {
      "command": "uvx",
      "args": ["awslabs.nova-canvas-mcp-server@latest"],
      "env": {
        "AWS_PROFILE": "your-aws-profile",
        "AWS_REGION": "us-east-1"
      }
    }
  }
}

AWS MCP Servers in action

Here’s how AWS MCP servers transform the development experience:

Developer: “I need to build an AI-powered chatbot using Amazon Bedrock that can answer questions from our company’s knowledge base. I also want to add a tool for the chatbot to call our internal API.”

Core: “I’ll help you build an Amazon Bedrock Knowledge Bases chatbot with API integration. Let’s create an architecture that uses Amazon Bedrock Agents with a custom action group to call your internal API.”

Core generates a comprehensive architecture diagram showing the knowledge base integration, Amazon Bedrock Agents configuration with action groups, API connectivity, and data flow between components.

AWS CDK: “Here’s the infrastructure code for your chatbot with the Amazon Bedrock Agents action group. I’ve included proper IAM roles, security controls, and Lambda Powertools for observability.”

// CDK code with Well-Architected best practices
 const knowledgeBase = new BedrockKnowledgeBase(this, 'CompanyKB', {
   embeddingModel: BedrockFoundationModel.TITAN_EMBED_TEXT_V1,
   vectorStore: new OpenSearchServerlessVectorStore(this, 'VectorStore', {
     encryption: OpenSearchEncryption.KMS,
     ebs: OpenSearchEbsOptions.provisioned(100, OpenSearchVolumeType.GP3)
   })
 });

 // Action Group for API integration
 const actionGroupSchema = new BedrockAgentSchema(this, 'ApiActionSchema', {
   schemaPath: 'schemas/api-action-schema.json',
   // Schema will be auto-generated from your Lambda code
   generateFromLambda: {
     lambdaPath: 'lambda/api-actions.py',
     outputPath: 'schemas/api-action-schema.json'
   }
 });

 const apiActionGroup = new BedrockAgentActionGroup(this, 'ApiActionGroup', {
   actionGroupName: 'InternalApiActions',
   description: 'Actions to interact with internal API',
   actionGroupExecutor: new BedrockAgentLambdaExecutor(apiActionsLambda),
   actionGroupSchema: actionGroupSchema
 });

The CDK MCP Server generates complete AWS CDK code to deploy the entire solution. It automatically runs cdk-nag to identify potential security issues and provides remediation steps for each finding, making sure that the infrastructure follows AWS Well-Architected best practices.
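
For reference, cdk-nag is an open source package you can also attach to a CDK app yourself. A minimal Python sketch, independent of the MCP Server, looks something like this:

# Minimal cdk-nag usage (pip install aws-cdk-lib cdk-nag): apply the
# AwsSolutionsChecks rule pack to every construct in the app, so findings
# surface as warnings/errors when you run `cdk synth`.
import aws_cdk as cdk
from cdk_nag import AwsSolutionsChecks

app = cdk.App()
stack = cdk.Stack(app, "ExampleStack")  # your constructs would live here

cdk.Aspects.of(app).add(AwsSolutionsChecks(verbose=True))

app.synth()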

Amazon Bedrock Knowledge Bases retrieval: “I’ve configured the optimal settings for your knowledge base queries, including proper reranking for improved relevance.”

Amazon Bedrock Knowledge Bases MCP Server demonstrates how to structure queries to the knowledge base for maximum relevance, provides sample code for filtering by data source, and shows how to integrate the knowledge base responses with the chatbot interface.

Amazon Nova Canvas: “To enhance your chatbot’s capabilities, I’ve created visualizations that can be generated on demand when users request data explanations.”

Amazon Nova Canvas MCP server generates sample images showing how Amazon Nova Canvas can create charts, diagrams, and visual explanations based on knowledge base content, making complex information more accessible to users.

Cost Analysis: “Based on your expected usage patterns, here’s the estimated monthly cost breakdown and optimization recommendations.”

The Cost Analysis MCP Server generates a detailed cost analysis report showing projected expenses for each AWS service, identifies cost optimization opportunities such as reserved capacity for Amazon Bedrock, and provides specific recommendations to reduce costs without impacting performance.

With AWS MCP Servers, what would typically take days of research and implementation is completed in minutes, with better quality, security, and cost-efficiency than manual development in that same time.

Best practices for MCP-assisted development

To maximize the benefits of MCP-assisted development while maintaining security and code quality, developers should follow these essential guidelines:

  • Always review generated code for security implications before deployment
  • Use MCP Servers as accelerators, not replacements for developer judgment and expertise
  • Keep MCP Servers updated with the latest AWS security best practices
  • Follow the principle of least privilege when configuring AWS credentials
  • Run security scanning tools on generated infrastructure code

Coming up in the series

This post introduced the foundations of AWS MCP Servers and how they accelerate AWS development through specialized, AWS-specific MCP servers. In upcoming posts, we’ll dive deeper into:

  • Detailed walkthroughs of each MCP server’s capabilities
  • Advanced patterns for integrating AWS MCP Servers into your development workflow
  • Real-world case studies showing AWS MCP Servers’ impact on development velocity
  • How to extend AWS MCP Servers with your own custom MCP servers

Stay tuned to learn how AWS MCP Servers can transform your specific AWS development scenarios and help you build better solutions faster. Visit our GitHub repository or PyPI to explore example implementations and get started today.


About the Authors

Jimin Kim is a Prototyping Architect on the AWS Prototyping and Cloud Engineering (PACE) team, based in Los Angeles. With specialties in Generative AI and SaaS, she loves helping her customers succeed in their business. Outside of work, she cherishes moments with her wife and three adorable calico cats.

Pranjali Bhandari is part of the Prototyping and Cloud Engineering (PACE) team at AWS, based in the San Francisco Bay Area. She specializes in Generative AI, distributed systems, and cloud computing. Outside of work, she loves exploring diverse hiking trails, biking, and enjoying quality family time with her husband and son.

Laith Al-Saadoon is a Principal Prototyping Architect on the Prototyping and Cloud Engineering (PACE) team. He builds prototypes and solutions using generative AI, machine learning, data analytics, IoT & edge computing, and full-stack development to solve real-world customer challenges. In his personal time, Laith enjoys the outdoors: fishing, photography, drone flights, and hiking.

Paul Vincent is a Principal Prototyping Architect on the AWS Prototyping and Cloud Engineering (PACE) team. He works with AWS customers to bring their innovative ideas to life. Outside of work, he loves playing drums and piano, talking with others through Ham radio, all things home automation, and movie nights with the family.

Justin Lewis leads the Emerging Technology Accelerator at AWS. Justin and his team help customers build with emerging technologies like generative AI by providing open source software examples to inspire their own innovation. He lives in the San Francisco Bay Area with his wife and son.

Anita Lewis is a Technical Program Manager on the AWS Emerging Technology Accelerator team, based in Denver, CO. She specializes in helping customers accelerate their innovation journey with generative AI and emerging technologies. Outside of work, she enjoys competitive pickleball matches, perfecting her golf game, and discovering new travel destinations.

Read More

Harness the power of MCP servers with Amazon Bedrock Agents

AI agents extend large language models (LLMs) by interacting with external systems, executing complex workflows, and maintaining contextual awareness across operations. Amazon Bedrock Agents enables this functionality by orchestrating foundation models (FMs) with data sources, applications, and user inputs to complete goal-oriented tasks through API integration and knowledge base augmentation. However, in the past, connecting these agents to diverse enterprise systems has created development bottlenecks, with each integration requiring custom code and ongoing maintenance—a standardization challenge that slows the delivery of contextual AI assistance across an organization’s digital ecosystem. This is a problem that you can solve by using Model Context Protocol (MCP), which provides a standardized way for LLMs to connect to data sources and tools.

Today, MCP provides agents with standard access to an expanding list of tools that they can use to accomplish a variety of tasks. In time, MCP can promote better discoverability of agents and tools through marketplaces, enable agents to share context and use common workspaces for better interaction, and scale agent interoperability across the industry.

In this post, we show you how to build an Amazon Bedrock agent that uses MCP to access data sources to quickly build generative AI applications. Using Amazon Bedrock Agents, your agent can be assembled on the fly with MCP-based tools as in this example:

InlineAgent(
    foundation_model="us.anthropic.claude-3-5-sonnet-20241022-v2:0",
    instruction="You are a friendly assistant for resolving user queries",
    agent_name="SampleAgent",
    action_groups=[
        ActionGroup(
            name="SampleActionGroup",
            mcp_clients=[mcp_client_1, mcp_client_2],
        )
    ],
).invoke(input_text="Convert 11am from NYC time to London time")

We showcase an example of building an agent to understand your Amazon Web Services (AWS) spend by connecting to AWS Cost Explorer, Amazon CloudWatch, and Perplexity AI through MCP. You can use the code referenced in this post to connect your agents to other MCP servers to address challenges for your business. We envision a world where agents have access to an ever-growing list of MCP servers that they can use for accomplishing a wide variety of tasks.

Model Context Protocol

Developed by Anthropic as an open protocol, MCP provides a standardized way to connect AI models to virtually any data source or tool. Using a client-server architecture, MCP enables developers to expose their data through lightweight MCP servers while building AI applications as MCP clients that connect to these servers. Through this architecture, MCP enables users to build more powerful, context-aware AI agents that can seamlessly access the information and tools they need. Whether you’re connecting to external systems or internal data stores or tools, you can now use MCP to interface with all of them in the same way. The client-server architecture of MCP enables your agent to access new capabilities as the MCP server updates without requiring any changes to the application code.

MCP architecture

MCP uses a client-server architecture that contains the following components and is shown in the following figure:

  • Host: An MCP host is a program or AI tool that requires access to data through the MCP protocol, such as Claude Desktop, an integrated development environment (IDE), or any other AI application.
  • Client: Protocol clients that maintain one-to-one connections with servers.
  • Server: Lightweight programs that expose capabilities through the standardized Model Context Protocol.
  • Local data sources: Your databases, local data sources, and services that MCP servers can securely access.
  • Remote services: External systems available over the internet through APIs that MCP servers can connect to.
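
To ground these roles, here is a minimal client-side sketch using the open source Python MCP SDK (pip install mcp); the server command and package name are placeholders for whatever MCP server you want to talk to:

# Spawn a stdio MCP server as a subprocess, initialize a session, and list
# the tools the server exposes.
import asyncio
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

server_params = StdioServerParameters(
    command="uvx",                    # placeholder launcher
    args=["your-mcp-server@latest"],  # hypothetical server package
)

async def main() -> None:
    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await session.list_tools()
            print([tool.name for tool in tools.tools])

asyncio.run(main())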

Let’s walk through how to set up Amazon Bedrock agents that take advantage of MCP servers.

Using MCP with Amazon Bedrock agents

In this post, we provide a step-by-step guide for how to connect your favorite MCP servers with Amazon Bedrock agents as Action Groups that an agent can use to accomplish tasks provided by the user. The AgentInlineSDK provides a straightforward way to create inline agents, containing a built-in MCP client implementation that provides you with direct access to tools delivered by an MCP server.

As part of creating an agent, the developer creates an MCP client specific to each MCP server that requires agent communication. When invoked, the agent determines which tools are needed for the user’s task; if MCP server tools are required, it uses the corresponding MCP client to request tool execution from that server. The user code doesn’t need to be aware of the MCP protocol because that’s handled by the MCP client provided by the InlineAgent code repository.

To orchestrate this workflow, you take advantage of the return control capability of Amazon Bedrock Agents. The following diagram illustrates the end-to-end flow of an agent handling a request that uses two tools. In the first flow, a Lambda-based action is taken, and in the second, the agent uses an MCP server.

Use case: transform how you manage your AWS spend across different AWS services including Amazon Bedrock

To show how an Amazon Bedrock agent can use MCP servers, let’s walk through a sample use case. Imagine asking questions like “Help me understand my Bedrock spend over the last few weeks” or “What were my EC2 costs last month across regions and instance types?” and getting a human-readable analysis of the data instead of raw numbers on a dashboard. The system interprets your intent and delivers precisely what you need—whether that’s detailed breakdowns, trend analyses, visualizations, or cost-saving recommendations. This is useful because what you’re interested in is insights rather than data. You can accomplish this using two MCP servers: a custom-built MCP server for retrieving the AWS spend data and an open source MCP server from Perplexity AI to interpret the data. You add these two MCP servers as action groups in an inline Amazon Bedrock agent. This gives you an AI agent that can transform the way you manage your AWS spend. All the code for this post is available in the GitHub repository.

Let’s walk through how this agent is created using inline agents. You can use inline agents to define and configure Amazon Bedrock agents dynamically at runtime. They provide greater flexibility and control over agent capabilities, enabling users to specify FMs, instructions, action groups, guardrails, and knowledge bases as needed without relying on pre-configured control plane settings. It’s worth noting that you can also orchestrate this behavior without inline agents by using RETURN_CONTROL with the InvokeAgent API.
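
For orientation, the direct API path looks roughly like the following boto3 sketch. The values are placeholders, and the MCP and return-of-control wiring built in this post is omitted:

# Hedged sketch: invoking an inline agent directly through the
# bedrock-agent-runtime InvokeInlineAgent API, without the InlineAgent SDK.
import boto3

client = boto3.client("bedrock-agent-runtime", region_name="us-east-1")

response = client.invoke_inline_agent(
    sessionId="example-session-1",  # placeholder session ID
    foundationModel="us.anthropic.claude-3-5-sonnet-20241022-v2:0",
    instruction="You are a friendly assistant for resolving user queries",
    inputText="Convert 11am from NYC time to London time",
)

# The completion arrives as a stream of events containing text chunks.
for event in response["completion"]:
    if "chunk" in event:
        print(event["chunk"]["bytes"].decode(), end="")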

MCP components in Amazon Bedrock Agents

  1. Host: This is the Amazon Bedrock inline agent. This agent adds MCP clients as action groups that can be invoked through RETURN_CONTROL when the user asks an AWS spend-related question.
  2. Client: You create two clients that establish one-to-one connections with their respective servers: a cost explorer client with specific cost server parameters and a Perplexity AI client with Perplexity server parameters.
  3. Servers: You create two MCP servers that each run locally on your machine and communicate to your application over standard input/output (alternatively, you could also configure the client to talk to remote MCP servers).
    1. An MCP server that retrieves the AWS spend data from Cost Explorer and Amazon CloudWatch Logs (for Amazon Bedrock model invocation log data).
    2. Perplexity AI MCP server to interpret the AWS spend data.
  4. Data sources: The MCP servers talk to remote data sources such as the Cost Explorer API, CloudWatch Logs, and the Perplexity AI search API.

Prerequisites

You need the following prerequisites to get started implementing the solution in this post:

  1. An AWS account
  2. Familiarity with FMs and Amazon Bedrock
  3. Install AWS Command Line Interface (AWS CLI) and set up credentials
  4. Python 3.11 or later
  5. AWS Cloud Development Kit (AWS CDK) CLI
  6. Enable model access for Anthropic’s Claude 3.5 Sonnet v2
  7. Your AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY, so that you can set them as environment variables for the server
  8. Docker installed and running on your computer, because the two MCP servers run as Docker containers

The MCP servers run locally on your computer and need to access AWS services and the Perplexity API. You can read more about AWS credentials in Manage access keys for IAM users. Make sure that your credentials include AWS Identity and Access Management (IAM) read access to Cost Explorer and CloudWatch. You can do this by using the AWSBillingReadOnlyAccess and CloudWatchReadOnlyAccess managed IAM policies. You can get the Perplexity API key from the Perplexity Sonar API page.

Steps to run

With the prerequisites in place, you’re ready to implement the solution.

  1. Navigate to the InlineAgent GitHub repository.
  2. Follow the setup steps.
  3. Navigate to the cost_explorer_agent folder. This folder contains the code for this post.
    cd examples/mcp/cost_explorer_agent

  4. Create a .env file in the cost_explorer_agent directory using the provided example file as a template:
    AWS_ACCESS_KEY_ID=
    AWS_SECRET_ACCESS_KEY=
    AWS_REGION=
    BEDROCK_LOG_GROUP_NAME=
    PERPLEXITY_API_KEY=

  5. Build the aws-cost-explorer-mcp server:
    git clone https://github.com/aarora79/aws-cost-explorer-mcp-server.git
    cd aws-cost-explorer-mcp-server/
    docker build -t aws-cost-explorer-mcp .

  6. You’re now ready to create an agent that can invoke these MCP servers to provide insights into your AWS spend. You can do this by running the python main.py command. The output will look like the following example. The agent connects with the two MCP servers and accesses their respective tools. Next, the agent lays out a plan to use specific tools sequentially, uses code interpreter to generate a chart showing cost distribution, and uses Cost Explorer and Perplexity API to provide information about Amazon Bedrock and the AWS account’s cost expenditure.
    Connected to server with tools: ['get_bedrock_daily_usage_stats', 'get_bedrock_hourly_usage_stats', 'get_ec2_spend_last_day', 'get_detailed_breakdown_by_day']
    Connected to server with tools: ['perplexity_ask']
    
    Thought: Let me break this down into parts:
    1. First, I'll ask about Amazon Bedrock using Perplexity
    2. Then I'll get the cost breakdown for the last 7 days
    3. Finally, I'll create a visualization of the spending data
    
    Tool use: perplexity_ask with these inputs: messages[[{role=user, content=What is Amazon Bedrock? Provide a concise explanation.}] (array)]
    
    Tool use: get_detailed_breakdown_by_day with these inputs: days[7 (integer)]
    
    Thought: Let me analyze the data and create a visualization of the top services by cost. I'll use Python to process the data and create a bar chart.
    
    Code interpreter:
    Generated code
    ```
    import pandas as pd
    import matplotlib.pyplot as plt
    ...
    ```
    
    Code interpreter output:
    aws_costs.png (image/png)
    Directory output created successfully
    
    Thought: Now I can provide a comprehensive answer about Amazon Bedrock and the cost analysis. Amazon Bedrock is a fully managed service that provides access to foundation models (FMs) from leading AI companies through a single API. Based on the cost analysis of the last 7 days, your top spending services were:
    
    1. Amazon Neptune: $24
    2. Amazon SageMaker: $567
    3. EC2 (Compute + Other): $454
    4. Amazon OpenSearch Service: $34
    5. Amazon Bedrock: $1235
    6. Amazon Q: $178

Implementation details

Now that you understand the output produced by an agent, let’s lift the curtain and review some of the important pieces of code that produce the output.

  1. Creating MCP clients: config.py defines the two MCP clients that talk to your two MCP servers.
    1. Server parameters are defined for the Cost Explorer and Perplexity clients. The solution uses StdioServerParameters, which configures how the client should communicate over standard input/output (stdio) streams and carries the parameters the server needs to access the required data through APIs.
      # Cost server parameters
      cost_server_params = StdioServerParameters(
          command="/usr/local/bin/docker",
          args=[
              "run",
              "-i",
              "--rm",
              "-e",
              "AWS_ACCESS_KEY_ID",
              "-e",
              "AWS_SECRET_ACCESS_KEY",
              "-e",
              "AWS_REGION",
              "-e",
              "BEDROCK_LOG_GROUP_NAME",
              "-e",
              "stdio",
              "aws-cost-explorer-mcp:latest",
          ],
          env={
              "AWS_ACCESS_KEY_ID": AWS_ACCESS_KEY_ID,
              "AWS_SECRET_ACCESS_KEY": AWS_SECRET_ACCESS_KEY,
              "AWS_REGION": AWS_REGION,
              "BEDROCK_LOG_GROUP_NAME": BEDROCK_LOG_GROUP_NAME,
          },
      )
      
      # Perplexity server parameters
      perplexity_server_params = StdioServerParameters(
          command="/usr/local/bin/docker",
          args=["run", "-i", "--rm", "-e", "PERPLEXITY_API_KEY", "mcp/perplexity-ask"],
          env={"PERPLEXITY_API_KEY": PERPLEXITY_API_KEY},
      )

    2. In main.py, the MCP server parameters are imported and used to create your two MCP clients.
      cost_explorer_mcp_client = await MCPClient.create(server_params=cost_server_params)
      perplexity_mcp_client = await MCPClient.create(server_params=perplexity_server_params)

  2. Configure agent action group: main.py creates the action group that combines the MCP clients into a single interface that the agent can access. This enables the agent to ask your application to invoke either of these MCP servers as needed through return of control.
    # Create action group with both MCP clients
    cost_action_group = ActionGroup(
        name="CostActionGroup",
        mcp_clients=[cost_explorer_mcp_client, perplexity_mcp_client]
    )

  3. Inline agent creation: The inline agent can be created with the following specifications:
    1. Foundation model: Configure your choice of FM to power your agent. This can be any model provided on Amazon Bedrock. This example uses Anthropic’s Claude 3.5 Sonnet model.
    2. Agent instruction: Provide instructions to your agent that contain the guidance and steps for orchestrating responses to user queries. These instructions anchor the agent’s approach to handling various types of queries.
    3. Agent name: Name of your agent.
    4. Action groups: Define the action groups that your agent can access. These can include single or multiple action groups, with each group having access to multiple MCP clients or AWS Lambda functions. As an option, you can configure your agent to use Code Interpreter to generate, run, and test code for your application.
# Create and invoke the inline agent
await InlineAgent(
    foundation_model="us.anthropic.claude-3-5-sonnet-20241022-v2:0",
    instruction="""You are a friendly assistant that is responsible for resolving user queries.
    
    You have access to search, cost tool and code interpreter. 
    
    """,
    agent_name="cost_agent",
    action_groups=[
        cost_action_group,
        {
            "name": "CodeInterpreter",
            "builtin_tools": {
                "parentActionGroupSignature": "AMAZON.CodeInterpreter"
            },
        },
    ],
).invoke(
    input_text="<user-query-here>"
)

You can use this example to build an inline agent on Amazon Bedrock that establishes connections with different MCP servers and groups their clients into a single action group for the agent to access.

Conclusion

MCP, Anthropic’s open protocol, offers a standardized way of connecting FMs to data sources, and now you can use this capability with Amazon Bedrock Agents. In this post, you saw an example of combining the power of Amazon Bedrock and MCP to build an application that offers a new perspective on understanding and managing your AWS spend.

Organizations can now offer their teams natural, conversational access to complex financial data while enhancing responses with contextual intelligence from sources like Perplexity. As AI continues to evolve, the ability to securely connect models to your organization’s critical systems will become increasingly valuable. Whether you’re looking to transform customer service, streamline operations, or gain deeper business insights, the Amazon Bedrock and MCP integration provides a flexible foundation for your next AI innovation. You can dive deeper on this MCP integration by exploring our code samples.

Here are some examples of what you can build by connecting your Amazon Bedrock Agents to MCP servers:

  • A multi-data source agent that retrieves data from different data sources such as Amazon Bedrock Knowledge Bases, SQLite, or even your local filesystem.
  • A developer productivity assistant agent that integrates with Slack and GitHub MCP servers.
  • A machine learning experiment tracking agent that integrates with the Opik MCP server from Comet ML for managing, visualizing, and tracking machine learning experiments directly within development environments.

What business challenges will you tackle with these powerful new capabilities?


About the authors

Mark Roy is a Principal Machine Learning Architect for AWS, helping customers design and build generative AI solutions. His focus since early 2023 has been leading solution architecture efforts for the launch of Amazon Bedrock, the flagship generative AI offering from AWS for builders. Mark’s work covers a wide range of use cases, with a primary interest in generative AI, agents, and scaling ML across the enterprise. He has helped companies in insurance, financial services, media and entertainment, healthcare, utilities, and manufacturing. Prior to joining AWS, Mark was an architect, developer, and technology leader for over 25 years, including 19 years in financial services. Mark holds six AWS certifications, including the ML Specialty Certification.

Eashan Kaushik is a Specialist Solutions Architect AI/ML at Amazon Web Services. He is driven by creating cutting-edge generative AI solutions while prioritizing a customer-centric approach to his work. Before this role, he obtained an MS in Computer Science from NYU Tandon School of Engineering. Outside of work, he enjoys sports, lifting, and running marathons.

Madhur Prashant is an AI and ML Solutions Architect at Amazon Web Services. He is passionate about the intersection of human thinking and generative AI. His interests lie in generative AI, specifically building solutions that are helpful and harmless, and most of all optimal for customers. Outside of work, he loves doing yoga, hiking, spending time with his twin, and playing the guitar.

Amit Arora is an AI and ML Specialist Architect at Amazon Web Services, helping enterprise customers use cloud-based machine learning services to rapidly scale their innovations. He is also an adjunct lecturer in the MS data science and analytics program at Georgetown University in Washington, D.C.

Andy Palmer is a Director of Technology for AWS Strategic Accounts. His teams provide Specialist Solutions Architecture skills across a number of specialty domain areas, including AI/ML, generative AI, data and analytics, security, network, and open source software. Andy and his team have been at the forefront of guiding our most advanced customers through their generative AI journeys and helping to find ways to apply these new tools to both existing problem spaces and net new innovations and product experiences.

Read More

Generate compliant content with Amazon Bedrock and ConstitutionalChain

Generative AI has emerged as a powerful tool for content creation, offering key benefits that can significantly enhance the efficiency and effectiveness of content production processes such as creating marketing materials, generating images, and moderating content. Constitutional AI and LangGraph’s reflection mechanisms represent two complementary approaches to making sure AI systems behave ethically: Anthropic embeds principles during training, while LangGraph applies them at inference time through reflection and self-correction. By implementing Constitutional AI with LangGraph, content creators can streamline their workflow while maintaining high standards of user-defined compliance and ethical integrity. This method not only reduces the need for extensive human oversight but also enhances the transparency and accountability of the AI content generation process.

In this post, we explore practical strategies for using Constitutional AI to produce compliant content efficiently and effectively, using Amazon Bedrock and LangGraph to build ConstitutionalChain for rapid content creation in highly regulated industries like finance and healthcare. Although AI offers significant productivity benefits, maintaining compliance with strict regulations is crucial. Manual validation of AI-generated content for regulatory adherence can be time-consuming and challenging. We also provide an overview of how Insagic, a Publicis Groupe company, integrated this concept into its existing healthcare marketing workflow using Amazon Bedrock. Insagic is a next-generation insights and advisory business that combines data, design, and dialogues to deliver actionable insights and transformational intelligence for healthcare marketers. It uses expertise from data scientists, behavior scientists, and strategists to drive better outcomes in the healthcare industry.

Understanding Constitutional AI

Constitutional AI is designed to align large language models (LLMs) with human values and ethical considerations. It works by integrating a set of predefined rules, principles, and constraints into the LLM’s core architecture and training process. This approach makes sure that the LLM operates within specified ethical and legal parameters, much like how a constitution governs a nation’s laws and actions.

The key benefits of Constitutional AI for content creation include:

  • Ethical alignment – Content generated using Constitutional AI is inherently aligned with predefined ethical standards
  • Legal compliance – The LLM is designed to operate within legal frameworks, reducing the risk of producing non-compliant content
  • Transparency – The principles guiding the LLM’s decision-making process are clearly defined and can be inspected
  • Reduced human oversight – By embedding ethical guidelines into the LLM, the need for extensive human review is significantly reduced

Let’s explore how you can harness the power of Constitutional AI to generate compliant content for your organization.

Solution overview

For this solution, we use Amazon Bedrock Knowledge Bases to store a repository of healthcare documents. We employ a Retrieval Augmented Generation (RAG) approach, first retrieving relevant context and synthesizing an answer based on the retrieved context, to generate articles based on the repository. We then use the open source orchestration framework LangGraph and ConstitutionalChain to generate, critique, and review prompts in an Amazon SageMaker notebook and develop an agentic workflow to generate compliance content. The following diagram illustrates this architecture.

This implementation demonstrates a sophisticated agentic workflow that not only generates responses based on a knowledge base but also employs a reflection technique to examine its outputs through ethical principles, allowing it to refine and improve its outputs. We upload a sample set of mental health documents to Amazon Bedrock Knowledge Bases and use those documents to write an article on mental health using a RAG-based approach. Later, we define a constitutional principle with a custom Diversity, Equity, and Inclusion (DEI) principle, specifying how to critique and revise responses for inclusivity.

Prerequisites

To deploy the solution, you need the following prerequisites:

Create an Amazon Bedrock knowledge base

To demonstrate this capability, we download a mental health article from the following GitHub repo and store it in Amazon S3. We then use Amazon Bedrock Knowledge Bases to index the articles. By default, Amazon Bedrock uses Amazon OpenSearch Serverless as a vector database. For full instructions to create an Amazon Bedrock knowledge base with Amazon S3 as the data source, see Create a knowledge base in Amazon Bedrock Knowledge Bases.

    1. On the Amazon Bedrock console, create a new knowledge base.
    2. Provide a name for your knowledge base and create a new IAM service role.
    3. Choose Amazon S3 as the data source and provide the S3 bucket storing the mental health article.
    4. Choose Amazon Titan Text Embeddings v2 as the embeddings model and OpenSearch Serverless as the vector store.
    5. Choose Create Knowledge Base.

Import statements and set up an Amazon Bedrock client

Follow the instructions provided in the README file in the GitHub repo. Clone the GitHub repo to make a local copy. We recommend running this code in a SageMaker JupyterLab environment. The following code imports the necessary libraries, including Boto3 for AWS services, LangChain components, and Streamlit. It sets up an Amazon Bedrock client and configures Anthropic’s Claude 3 Haiku model with specific parameters.

import boto3
from langchain_aws import ChatBedrock
…

bedrock_runtime = boto3.client(service_name="bedrock-runtime", region_name="us-east-1")
llm = ChatBedrock(client=bedrock_runtime, model_id="anthropic.claude-3-haiku-20240307-v1:0")
…..

Define Constitutional AI components

Next, we define a Critique class to structure the output of the critique process. Then we create prompt templates for critique and revision. Lastly, we set up chains using LangChain for generating responses, critiques, and revisions.

# LangChain Constitutional chain migration to LangGraph

class Critique(TypedDict):
    """Generate a critique, if needed."""

    critique_needed: Annotated[bool, ..., "Whether or not a critique is needed."]
    critique: Annotated[str, ..., "If needed, the critique."]

critique_prompt = ChatPromptTemplate.from_template(
    "Critique this response according to the critique request. "
…
)

revision_prompt = ChatPromptTemplate.from_template(
    "Revise this response according to the critique and revision request.\n\n"
    ….
)
chain = llm | StrOutputParser()
critique_chain = critique_prompt | llm.with_structured_output(Critique)
revision_chain = revision_prompt | llm | StrOutputParser()

Define a State class and refer to the Amazon Bedrock Knowledge Bases retriever

We define a LangGraph State class to manage the conversation state, including the query, principles, responses, and critiques:

# LangGraph State

class State(TypedDict):
    query: str
    constitutional_principles: List[ConstitutionalPrinciple]
    # The fields below are inferred from the description above
    # (the state also tracks responses and critiques):
    response: str
    critiques: List[str]

Next, we set up an Amazon Bedrock Knowledge Bases retriever to extract the relevant information. We refer to the Amazon Bedrock knowledge base we created earlier to create an article based on mental health documents. Make sure to update the knowledge base ID in the following code with the knowledge base you created in previous steps:

#-----------------------------------------------------------------
# Amazon Bedrock KnowledgeBase

from langchain_aws.retrievers import AmazonKnowledgeBasesRetriever

retriever = AmazonKnowledgeBasesRetriever(
    knowledge_base_id="W3NMIJXLUE",  # Change it to your Knowledge base ID
    …
)

Create LangGraph nodes and a LangGraph graph along with constitutional principles

The next section of code integrates graph-based workflow orchestration, ethical principles, and a user-friendly interface to create a sophisticated Constitutional AI model. The following diagram illustrates the workflow.

Workflow of start, retrieval augmented generation, critique and revise, and end.

It uses a StateGraph to manage the flow between RAG and critique/revision nodes, incorporating a custom DEI principle to guide the LLM’s responses. The system is presented through a Streamlit application, which provides an interactive chat interface where users can input queries and view the LLM’s initial responses, critiques, and revised answers. The application also features a sidebar displaying a graph visualization of the workflow and a description of the applied ethical principle. This comprehensive approach makes sure that the LLM’s outputs are not only knowledge-based but also ethically aligned by using customizable constitutional principles that guide a reflection flow (critique and revise), all while maintaining a user-friendly experience with features like chat history management and a clear chat option.
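
The full graph-construction code is in the GitHub repo; a minimal sketch of the wiring, using the standard LangGraph API and the node names from the diagram (the node bodies below are placeholders standing in for the retriever and chains defined earlier), might look like this:

# Wire the RAG and critique/revise steps into a two-node LangGraph workflow.
from langgraph.graph import StateGraph, START, END

def rag_node(state: State) -> dict:
    # Placeholder: retrieve context for state["query"] and draft a response.
    return {"response": "..."}

def critique_and_revise_node(state: State) -> dict:
    # Placeholder: critique the draft against each constitutional principle,
    # then revise it with the revision chain.
    return {"response": "..."}

workflow = StateGraph(State)
workflow.add_node("rag", rag_node)
workflow.add_node("critique_and_revise", critique_and_revise_node)
workflow.add_edge(START, "rag")
workflow.add_edge("rag", "critique_and_revise")
workflow.add_edge("critique_and_revise", END)

graph = workflow.compile()

This mirrors the linear start, RAG, critique and revise, end flow shown above; invoking the compiled graph with a query and a list of constitutional principles streams each node’s update in turn.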

Streamlit application

The Streamlit application component of this code creates an interactive and user-friendly interface for the Constitutional AI model. It sets up a side pane that displays a visualization of the LLM’s workflow graph and provides a description of the DEI principle being applied. The main interface features a chat section where users can input their queries and view the LLM’s responses.

# ------------------------------------------------------------------------
# Streamlit App

# Clear chat history function
def clear_screen():
    st.session_state.messages = [{"role": "assistant", "content": "How may I assist you today?"}]

with st.sidebar:
    st.subheader('Constitutional AI Demo')
…..
    ConstitutionalPrinciple(
        name="DEI Principle",
        critique_request="Analyze the content for any lack of diversity, equity, or inclusion. Identify specific instances where the text could be more inclusive or representative of diverse perspectives.",
        revision_request="Rewrite the content by incorporating critiques to be more diverse, equitable, and inclusive. Ensure representation of various perspectives and use inclusive language throughout."
    )
    """)
    st.button('Clear Screen', on_click=clear_screen)

# Store LLM generated responses
if "messages" not in st.session_state.keys():
    st.session_state.messages = [{"role": "assistant", "content": "How may I assist you today?"}]

# Chat Input - User Prompt 
if prompt := st.chat_input():
….

    with st.spinner(f"Generating..."):
        ….
    with st.chat_message("assistant"):
        st.markdown("**[initial response]**")
….
        st.session_state.messages.append({"role": "assistant", "content": "[revised response] " + generation['response']})

The application maintains a chat history, displaying both user inputs and LLM responses, including the initial response, any critiques generated, and the final revised response. Each step of the LLM’s process is clearly labeled and presented to the user. The interface also includes a Clear Screen button to reset the chat history. When processing a query, the application shows a loading spinner and displays the runtime, providing transparency into the LLM’s operation. This comprehensive UI design allows users to interact with the LLM while observing how constitutional principles are applied to refine the LLM’s outputs.

Test the solution using the Streamlit UI

In the Streamlit application, when a user inputs a query, the application initiates the process by creating and compiling the graph defined earlier. It then streams the execution of this graph, which includes the RAG and critique/revise steps. During this process, the application displays real-time updates for each node’s execution, showing the user what’s happening behind the scenes. The system measures the total runtime, providing transparency about the processing duration. When it’s complete, the application presents the results in a structured manner within the chat interface. It displays the initial LLM-generated response, followed by any critiques made based on the constitutional principles, and finally shows the revised response that incorporates these ethical considerations. This step-by-step presentation allows users to see how the LLM’s response evolves through the constitutional AI process, from initial generation to ethical refinement. As described in the GitHub README file, run the Streamlit application with the following commands:

pip install -r requirements.txt
streamlit run main.py

For details on using a Jupyter proxy to access the Streamlit application, refer to Build Streamlit apps in Amazon SageMaker Studio.

Modify the Studio URL, replacing lab? with proxy/8501/.

Chat interface showing the RAG and critique and revise steps.

How Insagic uses Constitutional AI to generate compliant content

Insagic uses real-world medical data to help brands understand people as patients and patients as people, enabling them to deliver actionable insights in the healthcare marketing space. Although generating deep insights in the health space can yield profound dividends, it must be done with consideration for compliance and the personal nature of health data. By defining federal guidelines as constitutional principles, Insagic makes sure that the content delivered by generative AI complies with federal guidelines for healthcare marketing.

Clean up

When you have finished experimenting with this solution, clean up your resources to prevent AWS charges from being incurred:

  1. Empty the S3 buckets.
  2. Delete the SageMaker notebook instance.
  3. Delete the Amazon Bedrock knowledge base.

Conclusion

This post demonstrated how to implement a sophisticated generative AI solution using Amazon Bedrock and LangGraph to generate compliant content. You can also integrate this workflow to generate responses based on a knowledge base and apply ethical principles to critique and revise its outputs, all within an interactive web interface. Insagic is looking at more ways to incorporate this into existing workflows by defining custom principles to achieve compliance goals.

You can expand this concept further by incorporating Amazon Bedrock Guardrails. Amazon Bedrock Guardrails and LangGraph Constitutional AI can create a comprehensive safety system by operating at different levels. Amazon Bedrock provides API-level content filtering and safety boundaries, and LangGraph implements constitutional principles in reasoning workflows. Together, they enable multi-layered protection through I/O filtering, topic restrictions, ethical constraints, and logical validation steps in AI applications.

Try out the solution for your own use case, and leave your feedback in the comments.


About the authors

Sriharsh Adari is a Senior Solutions Architect at Amazon Web Services (AWS), where he helps customers work backwards from business outcomes to develop innovative solutions on AWS. Over the years, he has helped multiple customers on data platform transformations across industry verticals. His core areas of expertise include Technology Strategy, Data Analytics, and Data Science. In his spare time, he enjoys playing sports, binge-watching TV shows, and playing Tabla.

David Min is a Senior Partner Sales Solutions Architect at Amazon Web Services (AWS) specializing in Generative AI, where he helps customers transform their businesses through innovative AI solutions. Throughout his career, David has helped numerous organizations across industries bridge the gap between cutting-edge AI technology and practical business applications, focusing on executive engagement and successful solution adoption.

Stephen Garth is a Data Scientist at Insagic, where he develops advanced machine learning solutions, including LLM-powered automation tools and deep clustering models for actionable consumer insights. With a strong background spanning software engineering, healthcare data science, and computational research, he is passionate about bringing his expertise in AI-driven analytics and large-scale data processing to drive solutions.

Chris Cocking specializes in scalable enterprise application design using multiple programming languages. With nearly 20 years of experience, he excels in LAMP and IIS environments, SEO strategies, and most recently designing agentic systems. Outside of work, Chris is an avid bassist and music lover, which helps fuel his creativity and problem-solving skills.
