Advertising agencies can use generative AI and text-to-image foundation models to create innovative ad creatives and content. In this post, we demonstrate how you can generate new images from existing base images using Amazon SageMaker, a fully managed service to build, train, and deploy ML models at scale. With this solution, businesses large and small can develop new, custom ad creatives much faster and at lower cost than ever before.
Consider the following scenario: a global automotive company needs marketing material for a new car design that is about to be released, and hires a creative agency known for providing advertising solutions to clients with strong brand equity. The car manufacturer is looking for low-cost ad creatives that display the model in diverse locations, colors, views, and perspectives while maintaining the manufacturer's brand identity. The creative agency can support their customer by using generative AI models within their secure AWS environment.
The solution is built with generative AI and text-to-image models in Amazon SageMaker. SageMaker is a fully managed machine learning (ML) service that makes it straightforward to build, train, and deploy ML models for any use case with fully managed infrastructure, tools, and workflows. Stable Diffusion is a text-to-image foundation model from Stability AI that powers the image generation process. ControlNet is a pre-trained model that works alongside Stable Diffusion, using an existing image to guide the generation of new images based on a prompt. Combining Stable Diffusion with ControlNet lets you take existing brand-specific content and develop new versions of it. The key benefits of developing the solution within AWS with Amazon SageMaker are:
- Privacy – Storing the data in Amazon Simple Storage Service (Amazon S3) and using SageMaker to host models allows you to adhere to security best practices within your AWS account while not exposing assets publicly.
- Scalability – The Stable Diffusion model, when deployed as a SageMaker endpoint, brings scalability by allowing you to configure instance sizes and number of instances. SageMaker endpoints also have auto scaling features and are highly available.
- Flexibility – When creating and deploying endpoints, SageMaker provides the flexibility to choose GPU instance types. Also, instances behind SageMaker endpoints can be changed with minimum effort as business needs change. AWS has also developed purpose-built chips, such as AWS Inferentia2, for high-performance, low-cost generative AI inference.
- Rapid innovation – Generative AI is a rapidly evolving domain in which new approaches and models are constantly being developed and released. Amazon SageMaker JumpStart regularly onboards new models along with foundation models.
- End-to-end integration – AWS allows you to integrate the creative process with any AWS service and develop an end-to-end process using fine-grained access control through AWS Identity and Access Management (IAM), notification through Amazon Simple Notification Service (Amazon SNS), and postprocessing with the event-driven compute service AWS Lambda.
- Distribution – When the new creatives are generated, AWS allows distributing the content across global channels in multiple Regions using Amazon CloudFront.
For this post, we use the following GitHub sample, which uses Amazon SageMaker Studio with foundation models (Stable Diffusion), prompts, computer vision techniques, and a SageMaker endpoint to generate new images from existing images. The following diagram illustrates the solution architecture.
The workflow contains the following steps:
- We store the existing content (images, brand styles, and so on) securely in S3 buckets.
- Within SageMaker Studio notebooks, computer vision techniques transform the original image into a monotone intermediate image that preserves the shape of the product (the car model) and removes color and background.
- The intermediate image acts as a control image for Stable Diffusion with ControlNet.
- We deploy a SageMaker endpoint with the Stable Diffusion text-to-image foundation model from SageMaker JumpStart and ControlNet on a preferred GPU-based instance size.
- Prompts describing new backgrounds and car colors along with the intermediate monotone image are used to invoke the SageMaker endpoint, yielding new images.
- New images are stored in S3 buckets as they’re generated.
Deploy ControlNet on SageMaker endpoints
To deploy the model to SageMaker endpoints, we must create a compressed file for each individual technique model artifact along with the Stable Diffusion weights, inference script, and NVIDIA Triton config file.
In the following code, we download the model weights for the different ControlNet techniques and Stable Diffusion 1.5 to the local directory as tar.gz files:
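As a minimal sketch of the packaging half of this step, the following uses only the Python standard library to pack a local model directory into a tar.gz file. It is not the code from the GitHub sample: the `package_model` helper and the directory layout in the usage note are hypothetical, and downloading the weights is left out.

```python
import tarfile
from pathlib import Path


def package_model(source_dir, archive_path):
    """Pack every file under source_dir into a gzip-compressed tar archive.

    Hypothetical helper for illustration; not part of the sample repository.
    """
    source = Path(source_dir)
    with tarfile.open(archive_path, "w:gz") as tar:
        for path in sorted(source.rglob("*")):
            if path.is_file():
                # Store paths relative to source_dir so the archive unpacks
                # directly into the model directory on the endpoint.
                arcname = path.relative_to(source).as_posix()
                tar.add(str(path), arcname=arcname)
    return archive_path
```

For example, `package_model("model", "model.tar.gz")` would archive an assumed local `model` folder that holds the weights and the inference script. Write the archive outside the folder being packed so that it does not include itself.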
To create the model pipeline, we define an inference.py script that SageMaker real-time endpoints will use to load and host the Stable Diffusion and ControlNet tar.gz files. The following is a snippet from inference.py that shows how the models are loaded and how the Canny technique is called:
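The loading and Canny steps might look roughly like the sketch below. It is an illustration rather than the sample's actual inference.py. It assumes the Hugging Face diffusers library, OpenCV, NumPy, and Pillow are available in the container, and the sub-directory names and function names are hypothetical. Heavy libraries are imported inside the functions so that the module itself loads without a GPU.

```python
import os

# Hypothetical sub-directory names inside the extracted model archive.
CONTROLNET_DIRS = {
    "canny": "controlnet-canny",
    "depth": "controlnet-depth",
    "hed": "controlnet-hed",
    "scribble": "controlnet-scribble",
}
SD_DIR = "stable-diffusion-v1-5"  # assumed folder for the Stable Diffusion 1.5 weights


def controlnet_path(model_dir, technique):
    """Return the folder holding the ControlNet weights for a technique."""
    if technique not in CONTROLNET_DIRS:
        raise ValueError(f"unknown ControlNet technique: {technique}")
    return os.path.join(model_dir, CONTROLNET_DIRS[technique])


def load_pipeline(model_dir, technique="canny"):
    """Load Stable Diffusion together with one ControlNet model onto the GPU."""
    import torch
    from diffusers import ControlNetModel, StableDiffusionControlNetPipeline

    controlnet = ControlNetModel.from_pretrained(
        controlnet_path(model_dir, technique), torch_dtype=torch.float16
    )
    pipe = StableDiffusionControlNetPipeline.from_pretrained(
        os.path.join(model_dir, SD_DIR),
        controlnet=controlnet,
        torch_dtype=torch.float16,
    )
    return pipe.to("cuda")


def canny_control_image(pil_image, low=100, high=200):
    """Turn a PIL image into a white-on-black edge map for ControlNet."""
    import cv2
    import numpy as np
    from PIL import Image

    edges = cv2.Canny(np.array(pil_image.convert("RGB")), low, high)
    # ControlNet expects a three-channel image, so repeat the edge map.
    return Image.fromarray(np.stack([edges] * 3, axis=-1))
```

With these pieces, a generation call would be along the lines of `pipe(prompt, image=canny_control_image(source), negative_prompt=negative_prompt).images[0]`. The thresholds 100 and 200 are illustrative defaults, not values taken from the sample.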
We deploy the SageMaker endpoint with the required instance size (GPU type) from the model URI:
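One way to sketch the deployment is with the low-level SageMaker API in boto3, shown below. This is an assumption about the approach, not the sample's code: the helper name, the default instance type `ml.g5.2xlarge`, and the choice to reuse one name for the model, endpoint configuration, and endpoint are all illustrative. The client is passed in so that the function can be exercised without AWS credentials.

```python
def deploy_endpoint(sm_client, name, image_uri, model_data_url, role_arn,
                    instance_type="ml.g5.2xlarge", instance_count=1):
    """Create a model, an endpoint configuration, and a real-time endpoint.

    sm_client is expected to be boto3.client("sagemaker"). The instance type
    is an illustrative GPU choice; pick the size that fits your workload.
    """
    # Register the container image and the model.tar.gz stored in Amazon S3.
    sm_client.create_model(
        ModelName=name,
        PrimaryContainer={"Image": image_uri, "ModelDataUrl": model_data_url},
        ExecutionRoleArn=role_arn,
    )
    # Describe the fleet that will serve the model.
    sm_client.create_endpoint_config(
        EndpointConfigName=name,
        ProductionVariants=[{
            "VariantName": "AllTraffic",
            "ModelName": name,
            "InstanceType": instance_type,
            "InitialInstanceCount": instance_count,
        }],
    )
    # Start provisioning; the call returns before the endpoint is in service.
    sm_client.create_endpoint(EndpointName=name, EndpointConfigName=name)
    return name
```

Because `create_endpoint` is asynchronous, a caller would typically wait with `sm_client.get_waiter("endpoint_in_service").wait(EndpointName=name)` before sending requests.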
Generate new images
Now that the model is deployed to a SageMaker endpoint, we can pass in our prompts and the original image we want to use as our baseline.
To define the prompt, we create a positive prompt, p_p, for what we’re looking for in the new image, and the negative prompt, n_p, for what is to be avoided:
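For illustration, the two prompts could be defined as plain strings. The positive prompt below is the one listed for the canny technique later in this post; the negative prompt is a hypothetical example, because the original value is not shown here.

```python
# Positive prompt: what the new image should contain
# (taken from the canny row of the technique comparison table).
p_p = (
    "metal orange colored car, complete car, colour photo, "
    "outdoors in a pleasant landscape, realistic, high quality"
)

# Negative prompt: what the model should steer away from.
# Hypothetical wording, chosen only for illustration.
n_p = "cartoon, drawing, low quality, blurry, distorted, cropped car"
```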
Finally, we invoke our endpoint with the prompt and source image to generate our new image:
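A minimal sketch of the invocation follows, assuming the endpoint accepts a JSON body and returns JSON. The field names (`prompt`, `negative_prompt`, `technique`, `image`, `generated_image`) are hypothetical, because the real request and response schema is defined by the inference script. The `invoke_endpoint` call itself is the standard SageMaker Runtime API.

```python
import base64
import json


def build_payload(image_bytes, prompt, negative_prompt, technique="canny"):
    """Serialize the prompts and the source image into a JSON request body.

    The field names are assumptions; match them to your inference script.
    """
    return json.dumps({
        "prompt": prompt,
        "negative_prompt": negative_prompt,
        "technique": technique,
        # Binary image data is base64-encoded so it can travel inside JSON.
        "image": base64.b64encode(image_bytes).decode("utf-8"),
    })


def generate_image(runtime_client, endpoint_name, image_bytes,
                   prompt, negative_prompt, technique="canny"):
    """Invoke the endpoint and return the generated image as raw bytes.

    runtime_client is expected to be boto3.client("sagemaker-runtime").
    """
    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Body=build_payload(image_bytes, prompt, negative_prompt, technique),
    )
    result = json.loads(response["Body"].read())
    # Assumed response field holding the base64-encoded output image.
    return base64.b64decode(result["generated_image"])
```

The returned bytes can then be written to a file or uploaded to an S3 bucket, in line with the last step of the workflow.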
Different ControlNet techniques
In this section, we compare the different ControlNet techniques and their effect on the resulting image. We use the following original image to generate new content using Stable Diffusion with ControlNet in Amazon SageMaker.
The following table shows how the technique output dictates what, from the original image, to focus on.
| Technique Name | Technique Output | Prompt |
|---|---|---|
| canny | A monochrome image with white edges on a black background. | metal orange colored car, complete car, colour photo, outdoors in a pleasant landscape, realistic, high quality |
| depth | A grayscale image with black representing deep areas and white representing shallow areas. | metal red colored car, complete car, colour photo, outdoors in pleasant landscape on beach, realistic, high quality |
| hed | A monochrome image with white soft edges on a black background. | metal white colored car, complete car, colour photo, in a city, at night, realistic, high quality |
| scribble | A hand-drawn monochrome image with white outlines on a black background. | metal blue colored car, similar to original car, complete car, colour photo, outdoors, breath-taking view, realistic, high quality, different viewpoint |
After you generate new ad creatives with generative AI, clean up any resources that won’t be used. Delete the data in Amazon S3 and stop any SageMaker Studio notebook instances to avoid incurring further charges. If you used SageMaker JumpStart to deploy Stable Diffusion as a SageMaker real-time endpoint, delete the endpoint either through the SageMaker console or SageMaker Studio.
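If you prefer to script the endpoint cleanup, a sketch using the boto3 SageMaker client could look like the following. The helper name is hypothetical and the client is passed in. It deletes the endpoint together with its endpoint configuration and models, which is a little more than the console steps described above.

```python
def delete_endpoint_resources(sm_client, endpoint_name):
    """Delete an endpoint plus the endpoint config and models behind it.

    sm_client is expected to be boto3.client("sagemaker").
    """
    # Look up the dependent resources before anything is deleted.
    config_name = sm_client.describe_endpoint(
        EndpointName=endpoint_name
    )["EndpointConfigName"]
    variants = sm_client.describe_endpoint_config(
        EndpointConfigName=config_name
    )["ProductionVariants"]
    model_names = [variant["ModelName"] for variant in variants]

    # Deleting the endpoint releases the instances, which stops the charges.
    sm_client.delete_endpoint(EndpointName=endpoint_name)
    sm_client.delete_endpoint_config(EndpointConfigName=config_name)
    for model_name in model_names:
        sm_client.delete_model(ModelName=model_name)
    return {"endpoint": endpoint_name, "config": config_name, "models": model_names}
```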
In this post, we used foundation models on SageMaker to create new images from existing images stored in Amazon S3. With these techniques, marketing, advertising, and other creative agencies can use generative AI tools to augment their ad creative process. To dive deeper into the solution and code shown in this demo, check out the GitHub repo.
Also, refer to Amazon Bedrock for use cases on generative AI, foundation models, and text-to-image models.
About the Authors
Sovik Kumar Nath is an AI/ML solution architect with AWS. He has extensive experience designing end-to-end machine learning and business analytics solutions in finance, operations, marketing, healthcare, supply chain management, and IoT. Sovik has published articles and holds a patent in ML model monitoring. He holds two master's degrees, from the University of South Florida and the University of Fribourg, Switzerland, and a bachelor's degree from the Indian Institute of Technology, Kharagpur. Outside of work, Sovik enjoys traveling, taking ferry rides, and watching movies.
Sandeep Verma is a Sr. Prototyping Architect with AWS. He enjoys diving deep into customer challenges and building prototypes for customers to accelerate innovation. He has a background in AI/ML, is the founder of New Knowledge, and is generally passionate about tech. In his free time, he loves traveling and skiing with his family.
Uchenna Egbe is an Associate Solutions Architect at AWS. He spends his free time researching herbs, teas, and superfoods, and how to incorporate them into his daily diet.
Mani Khanuja is an Artificial Intelligence and Machine Learning Specialist SA at Amazon Web Services (AWS). She helps customers use machine learning to solve their business challenges on AWS. She spends most of her time diving deep and teaching customers on AI/ML projects related to computer vision, natural language processing, forecasting, ML at the edge, and more. She is passionate about ML at the edge, so she has created her own lab with a self-driving kit and a prototype manufacturing production line, where she spends a lot of her free time.