New technical deep dive course: Generative AI Foundations on AWS

Generative AI Foundations on AWS is a new technical deep dive course that gives you the conceptual fundamentals, practical advice, and hands-on guidance to pre-train, fine-tune, and deploy state-of-the-art foundation models on AWS and beyond. Developed by AWS generative AI worldwide foundations lead Emily Webber, this free hands-on course and the supporting GitHub source code launched via AWS Youtube. If you are looking for a curated playlist of the top resources, concepts, and guidance to get up to speed on foundation models, and especially those that unlock generative capabilities in your data science and machine learning projects, then look no further.

During this 8-hour deep dive, you will be introduced to the key techniques, services, and trends that will help you understand foundation models from the ground up. This means breaking down theory, mathematics, and abstract concepts combined with hands-on exercises to gain functional intuition for practical application. Throughout the course, we focus on a wide spectrum of progressively complex generative AI techniques, giving you a strong base to understand, design, and apply your own models for the best performance. We’ll start with recapping foundation models, understanding where they come from, how they work, how they relate to generative AI, and what you can to do customize them. You’ll then learn about picking the right foundation model to suit your use case.

Once you’ve developed a strong contextual understanding of foundation models and how to use them, you’ll be introduced to the core subject of this course: pre-training new foundation models. You’ll learn why you’d want to do this as well as how and where it’s competitive. You’ll even learn how to use the scaling laws to pick the right model, dataset, and compute sizes. We’ll cover preparing training datasets at scale on AWS, including picking the right instances and storage techniques. We’ll cover fine-tuning your foundation models, evaluating recent techniques, and understanding how to run these with your scripts and models. We’ll dive into reinforcement learning with human feedback, exploring how to use it skillfully and at scale to truly maximize your foundation model performance.

Finally, you’ll learn how to apply theory to production by deploying your new foundation model on Amazon SageMaker, including across multiple GPUs and using top design patterns like retrieval augmented generation and chained dialogue. As an added bonus, we’ll walk you through a Stable Diffusion deep dive, prompt engineering best practices, standing up LangChain, and more.

More of a reader than a video consumer? You can check out my 15-chapter book “Pretrain Vision and Large Language Models in Python: End-to-end techniques for building and deploying foundation models on AWS,” which released May 31, 2023, with Packt publishing and is available now on Amazon. Want to jump right into the code? I’m with you—every video starts with a 45-minute overview of the key concepts and visuals. Then I’ll give you a 15-minute walkthrough of the hands-on portion. All of the example notebooks and supporting code will ship in a public repository, which you can use to step through on your own. Feel free to reach out to me on Medium, LinkedIn, GitHub, or through your AWS teams. Learn more about generative AI on AWS.

Happy trails!

Course outline

1. Introduction to Foundation Models

  • What are large language models and how do they work?
  • Where do they come from?
  • What are other types of generative AI?
  • How do you customize a foundation model?
  • How do you evaluate a Generative model?
  • Hands-on walk through: Foundation Models on SageMaker

Lesson 1 slides

Lesson 1 hands-on demo resources

2. Picking the right foundation model

  • Why starting with the right foundation model matters
  • Considering size
  • Considering accuracy
    • Considering ease-of-use
  • Considering licensing
  • Considering previous examples of this model working well in your industry
    • Considering external benchmarks

Lesson 2 slides

Lesson 2 hands-on demo resources

3. Using pretrained foundation models: prompt engineering and fine-tuning

  • The benefits of starting with a pre-trained foundation model
  • Prompt engineering:
    • Zero-shot
    • Single-shot
    • Few-shot
    • Summarization
      • Classification
    • Translation
  • Fine-tuning
    • Classic fine-tuning
    • Parameter efficient fine-tuning
    • Hugging Face’s new library
    • Hands-on walk through: prompt engineering and fine-tuning on SageMaker

Lesson 3 slides

Lesson 3 hands-on demo resources

4. Pretraining a new foundation model

  • Why would you want or need to create a new foundation model?
    • Comparing pretraining to fine-tuning
  • Preparing your dataset for pretraining
  • Distributed training on SageMaker: libraries, scripts, jobs, resources
  • Why and how to adapt a new script to SageMaker distributed training

Lesson 4 slides

Lesson 4 hands-on demo resources

5. Preparing data and training at scale

  • Options for prepping data at scale on AWS
  • Explain SageMaker job parallelism on CPU instances
  • Explain modes of sending data to SageMaker Training
  • Introduction to FSx for Lustre
  • Using FSx for Lustre at scale for SageMaker Training
  • Hands-on walk through: configuring Lustre for SageMaker Training

Lesson 5 slides

Lesson 5 hands-on demo resources

6. Reinforcement learning with human feedback

  • What is this technique and why do we care about it
  • How it gets around problems with subjectivity and objectivity through ranking human preferences at scale
  • How does it work?
  • How to do this with SageMaker Ground Truth
  • Updated reward modeling
  • Hands-on walk through: RLFH on SageMaker

Lesson 6 slides

Lesson 6 hands-on demo resources

7. Deploying a foundation model

  • Why do we want to deploy models?
  • Different options for deploying FM’s on AWS
  • How to optimize your model for deployment
  • Large model deployment container deep dive
  • Top configuration tips for deploying FM’s on SageMaker
  • Prompt engineering tips for invoking foundation models
  • Using retrieval augmented generation to mitigate hallucinations
  • Hands-on walk through: Deploying an FM on SageMaker

Lesson 7 slides

Lesson 7 hands-on demo resources


About the author

Emily Webber joined AWS just after SageMaker launched, and has been trying to tell the world about it ever since! Outside of building new ML experiences for customers, Emily enjoys meditating and studying Tibetan Buddhism.

Read More