Riding the Rays: Sunswift Racing Shines in World Solar Challenge Race

In the world’s largest solar race car event of the year, the University of New South Wales Sunswift Racing team is having its day in the sun.

The World Solar Challenge, which began some 35 years ago, attracts academic participants from across the globe. This year’s event drew nearly 100 competitors.

The race runs nearly 1,900 miles over the course of about four days and pits challengers in a battle not for speed but for greatest energy efficiency.

UNSW Sydney won the energy efficiency competition and crossed the finish line first, taking the Cruiser Cup with its Sunswift 7 vehicle, which utilizes NVIDIA Jetson Xavier NX for energy optimization. It was also the only competitor to race with 4 people on board and a remote mission control team.

“It’s a completely different proposition to say we can use the least amount of energy and arrive in Adelaide before anybody else, but crossing the line first is just about bragging rights,” said Richard Hopkins, project manager at Sunswift and a UNSW professor. Hopkins previously managed Formula 1 race teams in the U.K.

Race organizers bill the event, which cuts across the entire Australian continent on public roads — from Darwin in the north to Adelaide in the south — as the “world’s greatest innovation and engineering challenge contributing to a more sustainable mobility future.” It’s also become a launchpad for students pursuing career paths in the electric vehicle industry.

Like many of the competitors, UNSW is coming back after a three-year hiatus from the race due to the COVID-19 pandemic, making this year’s competition highly anticipated.

“Every single team member needs to understand what they’re doing and what their role is on the team and perform at the very best during those five-and-a-half days,” said Hopkins. “It is exhausting.”

All In on Energy Efficiency  

The race allows participants to start with a fully charged battery and to charge when the vehicles stop for the night at two locations. The remaining energy used, some 90%, comes from the sun and the vehicles’ solar panels.

UNSW’s seventh-generation Sunswift 7 runs algorithms to optimize for energy efficiency, essentially shutting down all nonessential computing to maximize battery life.

The solar electric vehicle relies on NVIDIA Jetson AI to give it an edge across its roughly 100 automotive monitoring and power management systems.

It can also factor in whether it should drive faster or slower based on weather forecasts. For instance, the car will urge the driver to go faster if it’s going to rain later in the day when conditions would force the car to slow down.

The Sunswift 7 vehicle was designed to mostly drive in a straight line from Darwin to Adelaide, and the object is to use the least amount of power outside of that mission, said Hopkins.

“Sunswift 7 late last year was featured in the Guinness Book of World Records for being the fastest electric vehicle for over 1,000 kilometers on a single charge of battery,” he said.

Jetson-Based Racers for Learning

The UNSW team created nearly 60 design iterations to improve on the aerodynamics of the vehicle. They used computational fluid dynamics modeling and ran simulations to analyze each version.

“We didn’t ever put the car through a physical wind tunnel,” said Hopkins.

The technical team has been working on a model to determine what speed the vehicle should be driven at for maximum energy conservation. “They’re working on taking in as many parameters as you can, given it’s really hard to get good driving data,” said Josh Bramley, technology manager at Sunswift Racing.

Sunswift 7 is running on the Robot Operating System (ROS) suite of software and relies on its NVIDIA Jetson module to process all the input from the sensors for analytics, which can be monitored by the remote pit crew back on campus at UNSW.

Jetson is used for all the control systems on the car, so data from the accelerator pedal, wheel sensors, solar current sensors and more is processed on it and analyzed for ways AI might help, said Bramley. The next version of the vehicle is expected to pack more AI, he added.

“A lot of the AI and computer vision will be coming for Sunswift 8 in the next solar challenge,” said Bramley.

More than 100 students are getting course credit for the Sunswift Racing team work, and many are interested in pursuing careers in electric vehicles, said Hopkins.

Past World Solar Challenge contestants have gone on to work at Tesla, SpaceX and Zipline.

Talk about a bright future.

Learn more about the NVIDIA Jetson platform for edge AI and robotics.

Read More

Schneider Electric leverages Retrieval Augmented LLMs on SageMaker to ensure real-time updates in their ERP systems

This post was co-written with Anthony Medeiros, Manager of Solutions Engineering and Architecture for North America Artificial Intelligence, and Blake Santschi, Business Intelligence Manager, from Schneider Electric. Additional Schneider Electric experts include Jesse Miller, Somik Chowdhury, Shaswat Babhulgaonkar, David Watkins, Mark Carlson and Barbara Sleczkowski. 

Enterprise Resource Planning (ERP) systems are used by companies to manage several business functions such as accounting, sales or order management in one system. In particular, they are routinely used to store information related to customer accounts. Different organizations within a company might use different ERP systems and merging them is a complex technical challenge at scale which requires domain-specific knowledge.

Schneider Electric is a leader in digital transformation of energy management and industrial automation. To best serve their customers’ needs, Schneider Electric needs to keep track of the links between related customers’ accounts in their ERP systems. As their customer base grows, new customers are added daily, and their account teams have to manually sort through these new customers and link them to the proper parent entity.

The linking decision is based on the most recent information available publicly on the Internet or in the media, and might be affected by recent acquisitions, market news or divisional re-structuring. An example of account linking would be to identify the relationship between Amazon and its subsidiary, Whole Foods Market [source].

Schneider Electric is deploying large language models for their capabilities in answering questions across various knowledge-specific domains, but an LLM’s knowledge is limited by the date on which it was trained. They addressed that challenge by using an open-source large language model available on Amazon SageMaker JumpStart in a Retrieval Augmented Generation setup, so it can process large amounts of external knowledge and surface corporate or public relationships among ERP records.

In early 2023, when Schneider Electric decided to automate part of its accounts linking process using artificial intelligence (AI), the company partnered with the AWS Machine Learning Solutions Lab (MLSL). With MLSL’s expertise in ML consulting and execution, Schneider Electric was able to develop an AI architecture that would reduce the manual effort in their linking workflows, and deliver faster data access to their downstream analytics teams.

Generative AI

Generative AI and large language models (LLMs) are transforming the way business organizations are able to solve traditionally complex challenges related to natural language processing and understanding. Some of the benefits offered by LLMs include the ability to comprehend large portions of text and answer related questions by producing human-like responses. AWS makes it easy for customers to experiment with and productionize LLM workloads by making many options available via Amazon SageMaker JumpStart, Amazon Bedrock, and Amazon Titan.

External Knowledge Acquisition

LLMs are known for their ability to compress human knowledge and have demonstrated remarkable capabilities in answering questions in various knowledge specific domains, but their knowledge is limited by the date the model has been trained. We address that information cutoff by coupling the LLM with a Google Search API to deliver a powerful Retrieval Augmented LLM (RAG) that addresses Schneider Electric’s challenges. The RAG is able to process large amounts of external knowledge pulled from the Google search and exhibit corporate or public relationships among ERP records.

See the following example:

Question: Who is the parent company of One Medical?
Google query: “One Medical parent company” → information → LLM
Answer: One Medical, a subsidiary of Amazon…

The preceding example (taken from the Schneider Electric customer database) concerns an acquisition that happened in February 2023 and thus would not be caught by the LLM alone due to knowledge cutoffs. Augmenting the LLM with Google search guarantees the most up-to-date information.

Flan-T5 model

In this project, we used the Flan-T5-XXL model from the Flan-T5 family of models.

The Flan-T5 models are instruction-tuned and therefore capable of performing various zero-shot NLP tasks. Our downstream task did not require a vast amount of world knowledge, but rather good question-answering performance given a context of text provided through search results; therefore, the 11B-parameter T5 model performed well.

JumpStart provides convenient deployment of this model family through Amazon SageMaker Studio and the SageMaker SDK. This includes Flan-T5 Small, Flan-T5 Base, Flan-T5 Large, Flan-T5 XL, and Flan-T5 XXL. Furthermore, JumpStart provides a few versions of Flan-T5 XXL at different levels of quantization. We deployed Flan-T5-XXL to an endpoint for inference using Amazon SageMaker Studio Jumpstart.

Path to Flan-T5 SageMaker JumpStart

Retrieval Augmented LLM with LangChain

LangChain is a popular and fast-growing framework that allows the development of applications powered by LLMs. It is based on the concept of chains, which are combinations of different components designed to improve the functionality of LLMs for a given task. For instance, it allows us to customize prompts and integrate LLMs with different tools like external search engines or data sources. In our use case, we used the Google Serper component to search the web and deployed the Flan-T5-XXL model available on Amazon SageMaker Studio JumpStart. LangChain performs the overall orchestration and allows the search result pages to be fed into the Flan-T5-XXL instance.

The Retrieval-Augmented Generation (RAG) consists of two steps:

  1. Retrieval of relevant text chunks from external sources
  2. Augmentation of the chunks with context in the prompt given to the LLM.

For Schneider Electric’ use-case, the RAG proceeds as follows:

  1. The given company name is combined with a question (such as “Who is the parent company of X?”, where X is the given company) and passed as a Google query using the Serper API.
  2. The extracted information is combined with the prompt and original question and passed to the LLM for an answer.

The following diagram illustrates this process.

RAG Workflow

Use the following code to create an endpoint:

# Spin FLAN-T5-XXL Sagemaker Endpoint
llm = SagemakerEndpoint(...)
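
For reference, a fuller instantiation might look like the following sketch. It assumes a deployed SageMaker JumpStart Flan-T5 endpoint (the endpoint name and region shown are placeholders) and uses LangChain’s LLMContentHandler to serialize requests and parse responses:

import json

from langchain.llms.sagemaker_endpoint import SagemakerEndpoint, LLMContentHandler

class FlanT5ContentHandler(LLMContentHandler):
	# Translate between LangChain prompts and the Flan-T5 endpoint payload
	content_type = "application/json"
	accepts = "application/json"

	def transform_input(self, prompt: str, model_kwargs: dict) -> bytes:
		return json.dumps({"text_inputs": prompt, **model_kwargs}).encode("utf-8")

	def transform_output(self, output: bytes) -> str:
		response = json.loads(output.read().decode("utf-8"))
		return response["generated_texts"][0]

# Placeholder endpoint name and region; replace with your deployed endpoint
llm = SagemakerEndpoint(
	endpoint_name="jumpstart-flan-t5-xxl",
	region_name="us-east-1",
	model_kwargs={"max_length": 200},
	content_handler=FlanT5ContentHandler())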

Instantiate search tool:

from langchain.agents import Tool
from langchain.utilities import GoogleSerperAPIWrapper

search = GoogleSerperAPIWrapper()
search_tool = Tool(
	name="Search",
	func=search.run,
	description="useful for when you need to ask with search",
	verbose=False)

In the following code, we chain together the retrieval and augmentation components:

my_template = """
Answer the following question using the information. n
Question : {question}? n
Information : {search_result} n
Answer: """
prompt_template = PromptTemplate(
	input_variables=["question", 'search_result'],
	template=my_template)
question_chain = LLMChain(
	llm=llm,
	prompt=prompt_template,
	output_key="answer")

def search_and_reply_company(company):
	# Retrieval
	search_result = search_tool.run(f"{company} parent company")
	# Augmentation
	output = question_chain({
		"question":f"Who is the parent company of {company}?",
		"search_result": search_result})
	return output["answer"]

search_and_reply_company("Whole Foods Market")
"Amazon"

The Prompt Engineering

The combination of the context and the question is called the prompt. We noticed that the blanket prompt we used (variations around asking for the parent company) performed well for most public sectors (domains) but didn’t generalize well to education or healthcare since the notion of parent company is not meaningful there. For education, we used “X” while for healthcare we used “Y”.

To enable this domain specific prompt selection, we also had to identify the domain a given account belongs to. For this, we also used a RAG where a multiple choice question “What is the domain of {account}?” as a first step, and based on the answer we inquired on the parent of the account using the relevant prompt as a second step. See the following code:

my_template_options = """
Answer the following question using the information. n
Question :  {question}? n
Information : {search_result} n
Options :n {options} n
Answer:
"""

prompt_template_options = PromptTemplate(
	input_variables=["question", 'search_result', 'options'],
	template=my_template_options)
question_chain = LLMChain(
	llm=llm,
	prompt=prompt_template_options,
	output_key="answer")
	
my_options = """
- healthcare
- education
- oil and gas
- banking
- pharma
- other domain """

def search_and_reply_domain(company):
	search_result = search_tool.run(f"{company} ")
	output = question_chain({
		"question":f"What is the domain of {company}?",
		"search_result": search_result,
		"options":my_options})
	return output["answer"]

search_and_reply_domain("Exxon Mobil")
"oil and gas"

The sector-specific prompts boosted the overall accuracy from 55% to 71%. Overall, the effort and time invested in developing effective prompts appear to significantly improve the quality of the LLM responses.

RAG with tabular data (SEC-10k)

SEC 10-K filings, filed annually by publicly traded companies, are another reliable source of information about subsidiaries and subdivisions. These filings are available directly on SEC EDGAR or through the CorpWatch API.

We assume the information is given in tabular format. Below is a pseudo CSV dataset that mimics the original format of the SEC 10-K dataset. It is possible to merge multiple CSV data sources into a combined pandas DataFrame; a minimal sketch of this merge follows the table:

# A pseudo dataset similar by schema to the CorpWatch API dataset
df.head()

index	relation_id		source_cw_id	target_cw_id	parent		subsidiary
  1		90				22569           37				AMAZON		WHOLE FOODS MARKET
873		1467			22569			781				AMAZON		TWITCH
899		1505			22569			821				AMAZON		ZAPPOS
900		1506			22569			821				AMAZON		ONE MEDICAL
901		1507			22569			821				AMAZON		WOOT!
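
As a concrete illustration of the merge step mentioned above, the following minimal sketch (the file pattern is hypothetical) reads several CSV exports that share the schema shown and concatenates them into a single DataFrame:

import glob

import pandas as pd

# Hypothetical file pattern for relation CSV exports sharing the schema above
csv_paths = sorted(glob.glob("corpwatch_relations_*.csv"))

# Read each export and concatenate into one DataFrame for the agent to query
frames = [pd.read_csv(path) for path in csv_paths]
df = pd.concat(frames, ignore_index=True)
df.head()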

LangChain provides an abstraction layer for pandas through create_pandas_dataframe_agent. There are two key advantages to using LangChain/LLMs for this task:

  1. Once spun up, it allows a downstream consumer to interact with the dataset in natural language rather than code
  2. It is more robust to misspellings and different ways of naming accounts.

We spin up the endpoint as above and create the agent:

from langchain.agents import create_pandas_dataframe_agent

# Create pandas dataframe agent
agent = create_pandas_dataframe_agent(llm, df, verbose=True)

In the following code, we query for the parent/subsidiary relationship and the agent translates the query into pandas language:

# Example 1
query = "Who is the parent of WHOLE FOODS MARKET?"
agent.run(query)

#### output
> Entering new AgentExecutor chain...
Thought: I need to find the row with WHOLE FOODS MARKET in the subsidiary column
Action: python_repl_ast
Action Input: df[df['subsidiary'] == 'WHOLE FOODS MARKET']
Observation:
source_cw_id	target_cw_id	parent		subsidiary
22569			37				AMAZON		WHOLE FOODS MARKET
Thought: I now know the final answer
Final Answer: AMAZON
> Finished chain.
# Example 2
query = "Who are the subsidiaries of Amazon?"
agent.run(query)
#### output
> Entering new AgentExecutor chain...
Thought: I need to find the row with source_cw_id of 22569
Action: python_repl_ast
Action Input: df[df['source_cw_id'] == 22569]
...
Thought: I now know the final answer
Final Answer: The subsidiaries of Amazon are Whole Foods Market, Twitch, Zappos, One Medical, Woot!...
> Finished chain.
'The subsidiaries of Amazon are Whole Foods Market, Twitch, Zappos, One Medical, Woot!.'

Conclusion

In this post, we detailed how we used building blocks from LangChain to augment an LLM with search capabilities, in order to uncover relationships between Schneider Electric’s customer accounts. We extended the initial pipeline to a two-step process with domain identification before using a domain specific prompt for higher accuracy.

In addition to the Google Search query, datasets that detail corporate structures, such as SEC 10-K filings, can be used to further augment the LLM with trustworthy information. The Schneider Electric team will also be able to extend and design their own prompts, mimicking the way they classify some public sector accounts, further improving the accuracy of the pipeline. These capabilities will enable Schneider Electric to maintain up-to-date and accurate organizational structures of their customers, and unlock the ability to do analytics on top of this data.


About the Authors

Anthony Medeiros is a Manager of Solutions Engineering and Architecture at Schneider Electric. He specializes in delivering high-value AI/ML initiatives to many business functions within North America. With 17 years of experience at Schneider Electric, he brings a wealth of industry knowledge and technical expertise to the team.

Blake Sanstchi is a Business Intelligence Manager at Schneider Electric, leading an analytics team focused on supporting the Sales organization through data-driven insights.

Joshua Levy is a Senior Applied Science Manager in the Amazon Machine Learning Solutions Lab, where he helps customers design and build AI/ML solutions to solve key business problems.

Kosta Belz is a Senior Applied Scientist with AWS MLSL with focus on Generative AI and document processing. He is passionate about building applications using Knowledge Graphs and NLP. He has around 10 years of experience in building Data & AI solutions to create value for customers and enterprises.

Aude Genevay is an Applied Scientist in the Amazon GenAI Incubator, where she helps customers solve key business problems through ML and AI. She previously was a researcher in theoretical ML and enjoys applying her knowledge to deliver state-of-the-art solutions to customers.

Md Sirajus Salekin is an Applied Scientist at AWS Machine Learning Solution Lab. He helps AWS customers to accelerate their business by building AI/ML solutions. His research interests are multimodal machine learning, generative AI, and ML applications in healthcare.

Zichen Wang, PhD, is a Senior Applied Scientist in AWS. With several years of research experience in developing ML and statistical methods using biological and medical data, he works with customers across various verticals to solve their ML problems.

Anton Gridin is a Principal Solutions Architect supporting Global Industrial Accounts, based out of New York City. He has more than 15 years of experience building secure applications and leading engineering teams.

Read More

DLSS 3.5 With Ray Reconstruction Now Available in NVIDIA Omniverse

The highly anticipated NVIDIA DLSS 3.5 update, including Ray Reconstruction for NVIDIA Omniverse — a platform for connecting and building custom 3D tools and apps — is now available.

RTX Video Super Resolution (VSR) will be available with tomorrow’s NVIDIA Studio Driver release — which also supports the DLSS 3.5 update in Omniverse and is free for RTX GPU owners. The version 1.5 update delivers greater overall graphical fidelity, upscaling for native videos and support for GeForce RTX 20 Series GPUs.

NVIDIA Creative Director and visual effects producer Sabour Amirazodi returns In the NVIDIA Studio to share his Halloween-themed project: a full projection mapping show on his house, featuring haunting songs, frightful animation, spooky props and more.

Creators can join the #SeasonalArtChallenge by submitting harvest- and fall-themed pieces through November.

The latest Halloween-themed Studio Standouts video features ghouls, creepy monsters, haunted hospitals and dimly lit homes, and is not for the faint of heart.

Remarkable Ray Reconstruction

NVIDIA DLSS 3.5 — featuring Ray Reconstruction — enhances ray-traced image quality on GeForce RTX GPUs by replacing hand-tuned denoisers with an NVIDIA supercomputer-trained AI network that generates higher-quality pixels in between sampled rays.

Previewing content in the viewport, even with high-end hardware, can sometimes offer less than ideal image quality, as traditional denoisers require hand-tuning for every scene.

With DLSS 3.5, the AI neural network recognizes a wide variety of scenes, producing high-quality preview images and drastically reducing time spent rendering scenes.

NVIDIA Omniverse and the USD Composer app — featuring the Omniverse RTX Renderer — specialize in real-time preview modes, offering ray-tracing inference and higher-quality previews while building and iterating.

The feature can be enabled by opening “Render Settings” under “Ray Tracing,” opening the “Direct Lighting” tab and ensuring “New Denoiser (experimental)” is turned on.

The ‘Haunted Sanctuary’ Returns

Sabour Amirazodi’s “home-made” installation, Haunted Sanctuary, has become an annual tradition, much to the delight of his neighbors.

Crowds form to watch the spectacular Halloween light show.

Amirazodi begins by staging props, such as pumpkins and skeletons, around his house.

Physical props add to the spooky atmosphere.

Then he carefully positions his projectors — building protective casings to keep them both safe and blended into the scene.

Amirazodi custom builds, paints and welds his projector cases to match the Halloween-themed decor.

“In the last few years, I’ve rendered 32,862 frames of 5K animation out of the Octane Render Engine. The loop has now become 21 minutes long, and the musical show is another 28 minutes!” — Sabour Amirazodi

Building a virtual scene onto a physical object requires projection mapping, so Amirazodi used NVIDIA GPU-accelerated MadMapper software and its structured light-scan feature to map custom visuals onto his house. He achieved this by connecting a DSLR camera to his mobile workstation, which was powered by an NVIDIA RTX A5000 GPU.

He used the camera to shoot a series of lines and capture photos. He then translated them into an image from the projector’s point of view on which to base a 3D model. Basic camera-matching tools found in Cinema 4D helped recreate the scene. Afterward, Amirazodi applied various mapping and perspective correction edits.

Projection mapping requires matching the virtual world with real-world specifications, done in Cinema 4D.

Next, Amirazodi animated and rigged the characters. GPU acceleration in the viewport enabled smooth interactivity with complex 3D models.

“I like having a choice between several third-party NVIDIA GPU-accelerated 3D renderers, such as V-Ray, OctaneRender and Redshift in Cinema 4D,” noted Amirazodi.

“I switched to NVIDIA graphics cards in 2017. GPUs are the only way to go for serious creators.” — Sabour Amirazodi

Amirazodi then spent hours on his RTX 6000 workstation creating and rendering out all the animations, assembling them in Adobe After Effects and compositing them on the scanned canvas in MadMapper. There, he crafted individual scenes to render out as chunks and assembled them in Adobe Premiere Pro. Remarkably, he repeated this workflow for every projector.

Once satisfied with the sequences, Amirazodi encoded everything using Adobe Media Encoder and loaded them onto BrightSign digital players — all networked to run the show synchronously.

Amirazodi used the advantages of GPU acceleration to streamline his workflow — saving him countless hours. “After Effects has numerous plug-ins that are GPU-accelerated — plus, Adobe Premiere Pro and Media Encoder use the new dual encoders found in the Ada generation of NVIDIA RTX 6000 GPUs, cutting my export times in half,” he said.

Smooth timeline movement in Adobe Premiere Pro assisted by the NVIDIA RTX A6000 GPU.

Amirazodi’s careful efforts are all in the Halloween spirit — creating a hauntingly memorable experience for his community.

“The hard work and long nights all become worth it when I see the smile on my kids’ faces and all the joy it brings to the entire neighborhood,” he reflected.

NVIDIA Creative Director Sabour Amirazodi.

Discover more of Amirazodi’s work on IMDb.

Follow NVIDIA Studio on Instagram, Twitter and Facebook. Access tutorials on the Studio YouTube channel and get updates directly in your inbox by subscribing to the Studio newsletter. 

Read More

AMD Extends Support for PyTorch Machine Learning Development on Select RDNA™ 3 GPUs with ROCm™ 5.7

Researchers and developers working with Machine Learning (ML) models and algorithms using PyTorch can now use AMD ROCm 5.7 on Ubuntu® Linux® to tap into the parallel computing power of the Radeon™ RX 7900 XTX and the Radeon™ PRO W7900 graphics cards which are based on the AMD RDNA™ 3 GPU architecture.

A client solution built on these two high-end GPUs enables a local, private, and cost-effective workflow for ML training and inference for those who previously relied on cloud-based solutions alone.

ML Development on Desktop

Accelerate Machine Learning With PyTorch On Your Desktop

  • A local PC or workstation system running PyTorch with a Radeon 7900 series GPU presents a capable, yet affordable solution to address these growing workflow challenges thanks to large GPU memory sizes of 24GB and even 48GB.

Unified Software Stack For The Desktop And The Datacenter

  • The latest AMD ROCm 5.7 software stack for GPU programming unlocks the massively parallel compute power of these RDNA™ 3 architecture-based GPUs for use with PyTorch, one of the leading ML frameworks. The same unified software stack also supports the CDNA™ GPU architecture of the AMD Instinct™ MI series accelerators.
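
As a quick illustration (not part of the original announcement), the following sketch shows one way to confirm that a ROCm build of PyTorch sees a Radeon GPU; on ROCm, the GPU is exposed through PyTorch’s torch.cuda device API:

import torch

# On a ROCm build of PyTorch, AMD GPUs are addressed through the "cuda" device API
print(torch.__version__)              # ROCm builds typically carry a +rocm suffix
print(torch.cuda.is_available())      # True when the Radeon GPU is visible
print(torch.cuda.get_device_name(0))  # e.g. Radeon RX 7900 XTX or Radeon PRO W7900

# Small matrix multiply on the GPU as a smoke test
x = torch.randn(2048, 2048, device="cuda")
y = x @ x.T
print(y.shape, y.device)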

Freedom To Customize

  • The AMD ROCm platform is primarily Open-Source Software (OSS). It allows developers the freedom to customize and tailor their GPU software for their own needs while collaborating with a community of other developers, and helping each other find solutions in an agile, flexible, and rapid manner. The AMD ROCm platform’s goal is to allow users to maximize their GPU hardware investment. The AMD ROCm platform is designed to help develop, test, and deploy GPU accelerated HPC, AI, scientific computing, CAD, and other applications in a free, open source, integrated and secure software ecosystem.

As the industry moves towards an ecosystem that supports a broad set of systems, frameworks and accelerators, AMD is determined to continue to make AI more accessible to PyTorch developers and researchers that benefit from a local client-based setup for ML development using RDNA™ 3 architecture-based desktop GPUs.

Learn More

https://www.amd.com/en/developer/resources/ml-radeon.html

Download Software

https://www.amd.com/en/support/linux-drivers

Visit the Documentation Portal to get started training ML models on your local desktop

https://rocm.docs.amd.com/projects/radeon/en/latest/

Prerequisites

https://rocm.docs.amd.com/projects/radeon/en/latest/docs/prerequisites.html

How to Guide

https://rocm.docs.amd.com/projects/radeon/en/latest/docs/install/howto.html

© 2023 Advanced Micro Devices, Inc. All rights reserved. AMD, the AMD Arrow logo, CDNA, Radeon, ROCm, and combinations thereof are trademarks of Advanced Micro Devices, Inc. Linux® is the registered trademark of Linus Torvalds in the U.S. and other countries. Microsoft and Windows are registered trademarks of Microsoft Corporation in the US and/or other countries. PyTorch, the PyTorch logo and any related marks are trademarks of The Linux Foundation. TensorFlow, the TensorFlow logo and any related marks are trademarks of Google Inc. Ubuntu and the Ubuntu logo are registered trademarks of Canonical Ltd. Other product names used in this publication are for identification purposes only and may be trademarks of their respective owners.

Radeon™ AI technology is compatible with all AMD Radeon 7000 Series graphics cards and newer. Please check with your system manufacturer for feature availability prior to purchase. GD-232.

  1. Based on AMD internal measurements, November 2022, comparing the Radeon RX 7900 XTX at 2.5GHz boost clock with 96 CUs issuing 2X the Bfloat16 math operations per clocks vs. the RX 6900 XT GPU at 2.25 GHz boost clock and 80 CUs issue 1X the Bfloat16 math operations per clock. RX-821

Read More

DELPHI: Data for Evaluating LLMs’ Performance in Handling Controversial Issues

*=Equal Contributors
Controversy is a reflection of our zeitgeist, and an important aspect of any discourse. The rise of large language models (LLMs) as conversational systems has increased public reliance on these systems for answers to their various questions. Consequently, it is crucial to systematically examine how these models respond to questions pertaining to ongoing debates. However, few datasets exist that provide human-annotated labels reflecting contemporary discussions. To foster research in this area, we propose a novel construction of a controversial questions…
Apple Machine Learning Research

Use AWS PrivateLink to set up private access to Amazon Bedrock

Amazon Bedrock is a fully managed service provided by AWS that offers developers access to foundation models (FMs) and the tools to customize them for specific applications. It allows developers to build and scale generative AI applications using FMs through an API, without managing infrastructure. You can choose from various FMs from Amazon and leading AI startups such as AI21 Labs, Anthropic, Cohere, and Stability AI to find the model that’s best suited for your use case. With the Amazon Bedrock serverless experience, you can quickly get started, easily experiment with FMs, privately customize them with your own data, and seamlessly integrate and deploy them into your applications using AWS tools and capabilities.

Customers are building innovative generative AI applications using Amazon Bedrock APIs with their own proprietary data. When accessing Amazon Bedrock APIs, customers want a mechanism to set up a data perimeter without exposing their data to the internet, so they can mitigate potential threat vectors from internet exposure. The Amazon Bedrock VPC endpoint powered by AWS PrivateLink allows you to establish a private connection between the VPC in your account and the Amazon Bedrock service account. It enables VPC instances to communicate with service resources without the need for public IP addresses.

In this post, we demonstrate how to set up private access on your AWS account to access Amazon Bedrock APIs over VPC endpoints powered by PrivateLink to help you build generative AI applications securely with your own data.

Solution overview

You can use generative AI to develop a diverse range of applications, such as text summarization, content moderation, and other capabilities. When building such generative AI applications using FMs or base models, customers want to generate responses without going over the public internet, and often based on their proprietary data that may reside in their enterprise databases.

In the following diagram, we depict an architecture to set up your infrastructure to read your proprietary data residing in Amazon Relational Database Service (Amazon RDS) and augment the Amazon Bedrock API request with product information when answering product-related queries from your generative AI application. Although we use Amazon RDS in this diagram for illustration purposes, you can test the private access of the Amazon Bedrock APIs end to end using the instructions provided in this post.

The workflow steps are as follows:

  1. AWS Lambda running in your private VPC subnet receives the prompt request from the generative AI application.
  2. Lambda makes a call to the proprietary RDS database, augments the prompt query context (for example, adding product information), and invokes the Amazon Bedrock API with the augmented query request (a minimal sketch of this call follows the list).
  3. The API call is routed to the Amazon Bedrock VPC endpoint that is associated to the VPC endpoint policy with Allow permissions to Amazon Bedrock APIs.
  4. The Amazon Bedrock service API endpoint receives the API request over PrivateLink without traversing the public internet.
  5. You can change the Amazon Bedrock VPC endpoint policy to Deny permissions to validate that Amazon Bedrock APIs calls are denied.
  6. You can also privately access Amazon Bedrock APIs over the VPC endpoint from your corporate network through an AWS Direct Connect gateway.
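
To make step 2 concrete, the following is a minimal, illustrative sketch of such a Lambda handler. The model choice, prompt format, and product lookup are assumptions for illustration, and the RDS call is stubbed out; the boto3 bedrock-runtime call resolves through the VPC endpoint when the function runs in the private subnets configured later in this post:

import json

import boto3

# The bedrock-runtime client resolves to the VPC endpoint when this function
# runs in the private subnets configured later in this post
bedrock_runtime = boto3.client("bedrock-runtime")

def lambda_handler(event, context):
    question = event.get("prompt", "Tell me about this product.")

    # Placeholder for the proprietary RDS lookup described in step 2
    product_info = "Example product: 20V cordless drill, 2Ah battery, 2-year warranty."

    # Text-completions prompt format for anthropic.claude-instant-v1
    prompt = f"\n\nHuman: {product_info}\n\n{question}\n\nAssistant:"
    body = json.dumps({"prompt": prompt, "max_tokens_to_sample": 300})

    response = bedrock_runtime.invoke_model(
        modelId="anthropic.claude-instant-v1",
        contentType="application/json",
        accept="application/json",
        body=body)
    completion = json.loads(response["body"].read())["completion"]
    return {"statusCode": 200, "body": completion}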

Prerequisites

Before you get started, make sure you have the following prerequisites:

  • An AWS account
  • An AWS Identity and Access Management (IAM) federation role with access to do the following:
    • Create, edit, view, and delete VPC network resources
    • Create, edit, view and delete Lambda functions
    • Create, edit, view and delete IAM roles and policies
    • List foundation models and invoke the Amazon Bedrock foundation model
  • For this post, we use the us-east-1 Region
  • Request foundation model access via the Amazon Bedrock console

Set up the private access infrastructure

In this section, we set up the infrastructure such as VPC, private subnets, security groups, and Lambda function using an AWS CloudFormation template.

Use the following template to create the infrastructure stack Bedrock-GenAI-Stack in your AWS account.

The CloudFormation template creates the following resources on your behalf:

  • A VPC with two private subnets in separate Availability Zones
  • Security groups and routing tables
  • IAM role and policies for use by Lambda, Amazon Bedrock, and Amazon Elastic Compute Cloud (Amazon EC2)

Set up the VPC endpoint for Amazon Bedrock

In this section, we use Amazon Virtual Private Cloud (Amazon VPC) to set up the VPC endpoint for Amazon Bedrock to facilitate private connectivity from your VPC to Amazon Bedrock.

  1. On the Amazon VPC console, under Virtual private cloud in the navigation pane, choose Endpoints.
  2. Choose Create endpoint.
  3. For Name tag, enter bedrock-vpce.
  4. Under Services, search for bedrock-runtime and select com.amazonaws.<region>.bedrock-runtime.
  5. For VPC, specify the VPC Bedrock-GenAI-Project-vpc that you created through the CloudFormation stack in the previous section.
  6. In the Subnets section, select the Availability Zones and choose the corresponding subnet IDs from the drop-down menu.
  7. For Security groups, select the security group with the group name Bedrock-GenAI-Stack-VPCEndpointSecurityGroup- and description Allow TLS for VPC Endpoint.

A security group acts as a virtual firewall for your instance to control inbound and outbound traffic. Note that this VPC endpoint security group only allows traffic originating from the security group attached to your VPC private subnets, adding a layer of protection.

  1. Choose Create endpoint.
  2. In the Policy section, select Custom and enter the following least privilege policy to ensure only certain actions are allowed on the specified foundation model resource, arn:aws:bedrock:*::foundation-model/anthropic.claude-instant-v1 for a given principal (such as Lambda function IAM role).
    {
    	"Version": "2012-10-17",
    	"Statement": [
    		{
    		    "Action": [
    		        "bedrock:InvokeModel"
    		        ],
    		    "Resource": [
    		        "arn:aws:bedrock:*::foundation-model/anthropic.claude-instant-v1"
    		        ],
    		    "Effect": "Allow",
    		    "Principal": {
                    "AWS": "arn:aws:iam::<accountid>:role/GenAIStack-Bedrock"
                }
    		}
    	]
    }

It may take up to 2 minutes until the interface endpoint is created and the status changes to Available. You can refresh the page to check the latest status.
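
If you prefer to script this step instead of using the console, the following sketch creates an equivalent interface endpoint with boto3. The VPC, subnet, and security group IDs are placeholders, and the policy mirrors the least-privilege policy above:

import json

import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Action": ["bedrock:InvokeModel"],
        "Resource": ["arn:aws:bedrock:*::foundation-model/anthropic.claude-instant-v1"],
        "Effect": "Allow",
        "Principal": {"AWS": "arn:aws:iam::<accountid>:role/GenAIStack-Bedrock"}
    }]
}

# Placeholder VPC, subnet, and security group IDs from the CloudFormation stack
response = ec2.create_vpc_endpoint(
    VpcEndpointType="Interface",
    VpcId="vpc-0123456789abcdef0",
    ServiceName="com.amazonaws.us-east-1.bedrock-runtime",
    SubnetIds=["subnet-0aaaaaaaaaaaaaaaa", "subnet-0bbbbbbbbbbbbbbbb"],
    SecurityGroupIds=["sg-0123456789abcdef0"],
    PrivateDnsEnabled=True,
    PolicyDocument=json.dumps(policy),
    TagSpecifications=[{
        "ResourceType": "vpc-endpoint",
        "Tags": [{"Key": "Name", "Value": "bedrock-vpce"}]
    }])
print(response["VpcEndpoint"]["VpcEndpointId"])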

Set up the Lambda function over private VPC subnets

Complete the following steps to configure the Lambda function:

  1. On the Lambda console, choose Functions in the navigation pane.
  2. Choose the function gen-ai-lambda-stack-BedrockTestLambdaFunction-XXXXXXXXXXXX.
  3. On the Configuration tab, choose Permissions in the left pane.
  4. Under Execution role, choose the link for the role gen-ai-lambda-stack-BedrockTestLambdaFunctionRole-XXXXXXXXXXXX.

You’re redirected to the IAM console.

  1. In the Permissions policies section, choose Add permissions and choose Create inline policy.
  2. On the JSON tab, modify the policy as follows:
    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "eniperms",
                "Effect": "Allow",
                "Action": [
                    "ec2:CreateNetworkInterface",
                    "ec2:DescribeNetworkInterfaces",
                    "ec2:DeleteNetworkInterface",
                    "ec2:*VpcEndpoint*"
                ],
                "Resource": "*"
            }
        ]
    }

  3. Choose Next.
  4. For Policy name, enter enivpce-policy.
  5. Choose Create policy.
  6. Add the following inline policy (provide your source VPC endpoints) for restricting Lambda access to Amazon Bedrock APIs only via VPC endpoints:
    {
        "Id": "lambda-bedrock-sourcevpce-access-only",
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
    		   "bedrock:ListFoundationModels",
                    "bedrock:InvokeModel"
                ],
                "Resource": "*",
                "Condition": {
                    "ForAnyValue:StringEquals": {
                        "aws:sourceVpce": [
                            "vpce-<bedrock-runtime-vpce>"
                        ]
                    }
                }
            }
        ]
    } 

  7. On the Lambda function page, on the Configuration tab, choose VPC in the left pane, then choose Edit.
  8. For VPC, choose Bedrock-GenAI-Project-vpc.
  9. For Subnets, choose the private subnets.
  10. For Security groups, choose gen-ai-lambda-stack-SecurityGroup- (the security group for the Amazon Bedrock workload in private subnets).
  11. Choose Save.

Test private access controls

Now you can test the private access controls (Amazon Bedrock APIs over VPC endpoints).

  1. On the Lambda console, choose Functions in the navigation pane.
  2. Choose the function gen-ai-lambda-stack-BedrockTestLambdaFunction-XXXXXXXXXXXX.
  3. On the Code tab, choose Test.

You should see the following response from the Amazon Bedrock API call (Status: Succeeded).

  1. To deny access to Amazon Bedrock APIs over VPC endpoints, navigate to the Amazon VPC console.
  2. Under Virtual private cloud in the navigation pane, choose Endpoints.
  3. Choose your VPC endpoint and navigate to the Policy tab.

Currently, the VPC endpoint policy is set to Allow.

  1. To deny access, choose Edit Policy.
  2. Change Allow to Deny and choose Save.

It may take up to 2 minutes for the policy for the VPC endpoint to update.

{
	"Version": "2012-10-17",
	"Statement": [
		{
		    "Action": [
		        "bedrock:InvokeModel"
		        ],
		    "Resource": [
		        "arn:aws:bedrock:*::foundation-model/anthropic.claude-instant-v1"
		        ],
		    "Effect": "Deny",
		    "Principal": {
                "AWS": "arn:aws:iam::<accountid>:role/GenAIStack-Bedrock"
            }
		}
	]
}
  1. Return to the Lambda function page and on the Code tab, choose Test.

As shown in the following screenshot, the access request to Amazon Bedrock over the VPC endpoint was denied (Status: Failed).

Through this testing process, we demonstrated how traffic from your VPC to the Amazon Bedrock API endpoint is traversing over the PrivateLink connection and not through the internet connection.

Clean up

Follow these steps to avoid incurring future charges:

  1. Clean up the VPC endpoints.
  2. Clean up the VPC.
  3. Delete the CloudFormation stack.

Conclusion

In this post, we demonstrated how to set up and operationalize a private connection between a generative AI workload deployed on your customer VPC and Amazon Bedrock using an interface VPC endpoint powered by PrivateLink. When using the architecture discussed in this post, the traffic between your customer VPC and Amazon Bedrock will not leave the Amazon network, ensuring your data is not exposed to the public internet and thereby helping with your compliance requirements.

As a next step, try the solution out in your account and share your feedback.


About the Authors

Ram Vittal is a Principal ML Solutions Architect at AWS. He has over 3 decades of experience architecting and building distributed, hybrid, and cloud applications. He is passionate about building secure and scalable AI/ML and big data solutions to help enterprise customers with their cloud adoption and optimization journey to improve their business outcomes. In his spare time, he rides his motorcycle and walks with his 3-year-old Sheepadoodle!

Ray Khorsandi is an AI/ML specialist at AWS, supporting strategic customers with AI/ML best practices. With an M.Sc. and Ph.D. in Electrical Engineering and Computer Science, he leads enterprises to build secure, scalable AI/ML and big data solutions to optimize their cloud adoption. His passions include computer vision, NLP, generative AI, and MLOps. Ray enjoys playing soccer and spending quality time with family.

Michael Daniels is an AI/ML Specialist at AWS. His expertise lies in building and leading AI/ML and generative AI solutions for complex and challenging business problems, which is enhanced by his Ph.D. from the Univ. of Texas and his M.Sc. in Computer Science specialization in Machine Learning from the Georgia Institute of Technology. He excels in applying cutting-edge cloud technologies to innovate, inspire, and transform industry-leading organizations, while also effectively communicating with stakeholders at any level or scale. In his spare time, you can catch Michael skiing or snowboarding in the mountains.

Read More

Deploy and fine-tune foundation models in Amazon SageMaker JumpStart with two lines of code

We are excited to announce a simplified version of the Amazon SageMaker JumpStart SDK that makes it straightforward to build, train, and deploy foundation models. The code for prediction is also simplified. In this post, we demonstrate how you can use the simplified SageMaker Jumpstart SDK to get started with using foundation models in just a couple of lines of code.

For more information about the simplified SageMaker JumpStart SDK for deployment and training, refer to Low-code deployment with the JumpStartModel class and Low-code fine-tuning with the JumpStartEstimator class, respectively.

Solution overview

SageMaker JumpStart provides pre-trained, open-source models for a wide range of problem types to help you get started with machine learning (ML). You can incrementally train and fine-tune these models before deployment. JumpStart also provides solution templates that set up infrastructure for common use cases, and executable example notebooks for ML with Amazon SageMaker. You can access the pre-trained models, solution templates, and examples through the SageMaker JumpStart landing page in Amazon SageMaker Studio or use the SageMaker Python SDK.

To demonstrate the new features of the SageMaker JumpStart SDK, we show you how to use the pre-trained Flan T5 XL model from Hugging Face for text generation for summarization tasks. We also showcase how, in just a few lines of code, you can fine-tune the Flan T5 XL model for summarization tasks. You can use any other model for text generation like Llama2, Falcon, or Mistral AI.

You can find the notebook for this solution using Flan T5 XL in the GitHub repo.

Deploy and invoke the model

Foundation models hosted on SageMaker JumpStart have model IDs. For the full list of model IDs, refer to Built-in Algorithms with pre-trained Model Table. For this post, we use the model ID of the Flan T5 XL text generation model. We instantiate the model object and deploy it to a SageMaker endpoint by calling its deploy method. See the following code:

from sagemaker.jumpstart.model import JumpStartModel

# Replace with larger model if needed
pretrained_model = JumpStartModel(model_id="huggingface-text2text-flan-t5-base")
pretrained_predictor = pretrained_model.deploy()

Next, we invoke the model to create a summary of the provided text using the Flan T5 XL model. The new SDK interface makes it straightforward for you to invoke the model: you just need to pass the text to the predictor and it returns the response from the model as a Python dictionary.

text = """Summarize this content - Amazon Comprehend uses natural language processing (NLP) to extract insights about the content of documents. It develops insights by recognizing the entities, key phrases, language, sentiments, and other common elements in a document. Use Amazon Comprehend to create new products based on understanding the structure of documents. For example, using Amazon Comprehend you can search social networking feeds for mentions of products or scan an entire document repository for key phrases. 
You can access Amazon Comprehend document analysis capabilities using the Amazon Comprehend console or using the Amazon Comprehend APIs. You can run real-time analysis for small workloads or you can start asynchronous analysis jobs for large document sets. You can use the pre-trained models that Amazon Comprehend provides, or you can train your own custom models for classification and entity recognition. """
query_response = pretrained_predictor.predict(text)
print(query_response["generated_text"])

The following is the output of the summarization task:

Understand how Amazon Comprehend works. Use Amazon Comprehend to analyze documents.

Fine-tune and deploy the model

The SageMaker JumpStart SDK provides you with a new class, JumpStartEstimator, which simplifies fine-tuning. You can provide the location of fine-tuning data and optionally pass validations datasets as well. After you fine-tune the model, use the deploy method of the Estimator object to deploy the fine-tuned model:

from sagemaker.jumpstart.estimator import JumpStartEstimator

estimator = JumpStartEstimator(
    model_id=model_id,
)
estimator.set_hyperparameters(instruction_tuned="True", epoch="3", max_input_length="1024")
estimator.fit({"training": train_data_location})
finetuned_predictor = estimator.deploy()

Customize the new classes in the SageMaker SDK

The new SDK makes it straightforward to deploy and fine-tune JumpStart models by defaulting many parameters. You still have the option to override the defaults and customize the deployment and invocation based on your requirements. For example, you can customize input payload format type, instance type, VPC configuration, and more for your environment and use case.

The following code shows how to override the instance type while deploying your model:

finetuned_predictor = estimator.deploy(instance_type='ml.g5.2xlarge')
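
As another illustration of overriding the defaults, the following sketch attaches the endpoint to your own VPC. The subnet and security group IDs are placeholders, and the exact parameter surface may vary by SageMaker SDK version, so consult the JumpStartModel documentation for your release:

from sagemaker.jumpstart.model import JumpStartModel

# Placeholder subnet and security group IDs; replace with values from your VPC
vpc_model = JumpStartModel(
    model_id="huggingface-text2text-flan-t5-base",
    vpc_config={
        "Subnets": ["subnet-0123456789abcdef0"],
        "SecurityGroupIds": ["sg-0123456789abcdef0"]})
vpc_predictor = vpc_model.deploy(instance_type="ml.g5.2xlarge")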

The SageMaker JumpStart SDK deploy function will automatically select a default content type and serializer for you. If you want to change the format type of the input payload, you can use serializers and content_types objects to introspect the options available to you by passing the model_id of the model you are working with. In the following code, we set the payload input format as JSON by setting JSONSerializer as serializer and application/json as content_type:

from sagemaker import serializers
from sagemaker import content_types

serializer_options = serializers.retrieve_options(model_id=model_id, model_version=model_version)
content_type_options = content_types.retrieve_options(model_id=model_id, model_version=model_version)

pretrained_predictor.serializer = serializers.JSONSerializer()
pretrained_predictor.content_type = 'application/json'

Next, you can invoke the Flan T5 XL model for the summarization task with a payload of the JSON format. In the following code, we also pass inference parameters in the JSON payload for making responses more accurate:

from sagemaker import serializers

input_text= """Summarize this content - Amazon Comprehend uses natural language processing (NLP) to extract insights about the content of documents. It develops insights by recognizing the entities, key phrases, language, sentiments, and other common elements in a document. Use Amazon Comprehend to create new products based on understanding the structure of documents. For example, using Amazon Comprehend you can search social networking feeds for mentions of products or scan an entire document repository for key phrases.
You can access Amazon Comprehend document analysis capabilities using the Amazon Comprehend console or using the Amazon Comprehend APIs. You can run real-time analysis for small workloads or you can start asynchronous analysis jobs for large document sets. You can use the pre-trained models that Amazon Comprehend provides, or you can train your own custom models for classification and entity recognition. """

parameters = {
    "max_length": 600,
    "num_return_sequences": 1,
    "top_p": 0.01,
    "do_sample": False,
}

payload = {"text_inputs": input_text, **parameters} #JSON Input format

pretrained_predictor.serializer = serializers.JSONSerializer()
query_response = pretrained_predictor.predict(payload)
print(query_response["generated_texts"][0])

If you’re looking for more ways to customize the inputs and other options for hosting and fine-tuning, refer to the documentation for the JumpStartModel and JumpStartEstimator classes.

Conclusion

In this post, we showed you how you can use the simplified SageMaker JumpStart SDK for building, training, and deploying task-based and foundation models in just a few lines of code. We demonstrated the new classes like JumpStartModel and JumpStartEstimator using the Hugging Face Flan T5-XL model as an example. You can use any of the other SageMaker JumpStart foundation models for use cases such as content writing, code generation, question answering, summarization, classification, information retrieval, and more. To see the whole list of models available with SageMaker JumpStart, refer to Built-in Algorithms with pre-trained Model Table. SageMaker JumpStart also supports task-specific models for many popular problem types.

We hope the simplified interface of the SageMaker JumpStart SDK will help you get started quickly and enable you to deliver faster. We look forward to hearing how you use the simplified SageMaker JumpStart SDK to create exciting applications!


About the authors

Evan Kravitz is a software engineer at Amazon Web Services, working on SageMaker JumpStart. He is interested in the confluence of machine learning with cloud computing. Evan received his undergraduate degree from Cornell University and master’s degree from the University of California, Berkeley. In 2021, he presented a paper on adversarial neural networks at the ICLR conference. In his free time, Evan enjoys cooking, traveling, and going on runs in New York City.

Rachna Chadha is a Principal Solution Architect AI/ML in Strategic Accounts at AWS. Rachna is an optimist who believes that ethical and responsible use of AI can improve society in the future and bring economic and social prosperity. In her spare time, Rachna likes spending time with her family, hiking, and listening to music.

Jonathan Guinegagne is a Senior Software Engineer with Amazon SageMaker JumpStart at AWS. He got his master’s degree from Columbia University. His interests span machine learning, distributed systems, and cloud computing, as well as democratizing the use of AI. Jonathan is originally from France and now lives in Brooklyn, NY.

Dr. Ashish Khetan is a Senior Applied Scientist with Amazon SageMaker built-in algorithms and helps develop machine learning algorithms. He got his PhD from University of Illinois Urbana-Champaign. He is an active researcher in machine learning and statistical inference, and has published many papers in NeurIPS, ICML, ICLR, JMLR, ACL, and EMNLP conferences.

Read More

Silicon Volley: Designers Tap Generative AI for a Chip Assist

A research paper released today describes ways generative AI can assist one of the most complex engineering efforts: designing semiconductors.

The work demonstrates how companies in highly specialized fields can train large language models (LLMs) on their internal data to build assistants that increase productivity.

Few pursuits are as challenging as semiconductor design. Under a microscope, a state-of-the-art chip like an NVIDIA H100 Tensor Core GPU (above) looks like a well-planned metropolis, built with tens of billions of transistors, connected on streets 10,000x thinner than a human hair.

Multiple engineering teams coordinate for as long as two years to construct one of these digital megacities.

Some groups define the chip’s overall architecture, some craft and place a variety of ultra-small circuits, and others test their work. Each job requires specialized methods, software programs and computer languages.

A Broad Vision for LLMs

“I believe over time large language models will help all the processes, across the board,” said Mark Ren, an NVIDIA Research director and lead author on the paper.

Bill Dally, NVIDIA’s chief scientist, announced the paper today in a keynote at the International Conference on Computer-Aided Design, an annual gathering of hundreds of engineers working in the field called electronic design automation, or EDA.

“This effort marks an important first step in applying LLMs to the complex work of designing semiconductors,” said Dally at the event in San Francisco. “It shows how even highly specialized fields can use their internal data to train useful generative AI models.”

ChipNeMo Surfaces

The paper details how NVIDIA engineers created for their internal use a custom LLM, called ChipNeMo, trained on the company’s internal data to generate and optimize software and assist human designers.

Long term, engineers hope to apply generative AI to each stage of chip design, potentially reaping significant gains in overall productivity, said Ren, whose career spans more than 20 years in EDA.

After surveying NVIDIA engineers for possible use cases, the research team chose three to start: a chatbot, a code generator and an analysis tool.

Initial Use Cases

The latter — a tool that automates the time-consuming tasks of maintaining updated descriptions of known bugs — has been the most well-received so far.

A prototype chatbot that responds to questions about GPU architecture and design helped many engineers quickly find technical documents in early tests.

Animation of a generative AI code generator using an LLM
A code generator will help designers write software for a chip design.

A code generator in development (demonstrated above)  already creates snippets of about 10-20 lines of software in two specialized languages chip designers use. It will be integrated with existing tools, so engineers have a handy assistant for designs in progress.

Customizing AI Models With NVIDIA NeMo

The paper mainly focuses on the team’s work gathering its design data and using it to create a specialized generative AI model, a process portable to any industry.

As its starting point, the team chose a foundation model and customized it with NVIDIA NeMo, a framework for building, customizing and deploying generative AI models that’s included in the NVIDIA AI Enterprise software platform. The selected NeMo model sports 43 billion parameters, a measure of its capability to understand patterns. It was trained using more than a trillion tokens, the words and symbols in text and software.

ChipNeMo provides an example of how one deeply technical team refined a pretrained model with its own data.

The team then refined the model in two training rounds, the first on about 24 billion tokens of its internal design data and the second on a mix of about 130,000 conversation and design examples.
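Teams outside NVIDIA's stack could approximate the same two-round recipe, continued pretraining on domain text followed by supervised fine-tuning on curated examples, with generic open-source tooling. The sketch below uses Hugging Face Transformers as a stand-in for the NeMo-based workflow; the tiny placeholder model, file paths and hyperparameters are illustrative, not taken from the paper.

```python
# Sketch of a two-stage refinement: (1) domain-adaptive pretraining on raw
# internal text, (2) supervised fine-tuning on conversation/design examples.
# Placeholder model and data paths; not the ChipNeMo configuration.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

base = "gpt2"  # placeholder; the paper's work starts from a 43B-parameter NeMo model
tok = AutoTokenizer.from_pretrained(base)
tok.pad_token = tok.eos_token
model = AutoModelForCausalLM.from_pretrained(base)

def tokenize(batch):
    return tok(batch["text"], truncation=True, max_length=1024)

# Stage 1 data: raw internal design documents and code, one example per line.
domain = load_dataset("text", data_files={"train": "design_corpus/*.txt"})["train"]
domain = domain.map(tokenize, batched=True, remove_columns=["text"])

# Stage 2 data: instruction/chat examples already rendered to a "text" field.
sft = load_dataset("json", data_files={"train": "sft_examples.jsonl"})["train"]
sft = sft.map(tokenize, batched=True, remove_columns=["text"])

collator = DataCollatorForLanguageModeling(tok, mlm=False)
for data, outdir in [(domain, "ckpt_dapt"), (sft, "ckpt_sft")]:
    Trainer(model=model,
            args=TrainingArguments(output_dir=outdir, num_train_epochs=1,
                                   per_device_train_batch_size=1),
            train_dataset=data, data_collator=collator).train()
```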

The work is among several examples of generative AI research and proofs of concept in the semiconductor industry that are just beginning to emerge from the lab.

Sharing Lessons Learned

One of the most important lessons Ren’s team learned is the value of customizing an LLM.

On chip-design tasks, custom ChipNeMo models with as few as 13 billion parameters match or exceed performance of even much larger general-purpose LLMs like LLaMA2 with 70 billion parameters. In some use cases, ChipNeMo models were dramatically better.

Along the way, users need to exercise care in what data they collect and how they clean it for use in training, he added.

Finally, Ren advises users to stay abreast of the latest tools that can speed and simplify the work.

NVIDIA Research has hundreds of scientists and engineers worldwide focused on topics such as AI, computer graphics, computer vision, self-driving cars and robotics. Other recent projects in semiconductors include using AI to design smaller, faster circuits and to optimize placement of large blocks.

Enterprises looking to build their own custom LLMs can get started today with the NeMo framework, available from GitHub and the NVIDIA NGC catalog.

Read More

Teachers in India help Microsoft Research design AI tool for creating great classroom content

Teachers are the backbone of any educational system. They are not just educators; they are indispensable navigators, mentors, and leaders. Teachers around the world face many challenges, which vary from country to country or even within a city or town. But some challenges are universal, including time management, classroom organization, and creating effective lesson plans.

Advances in AI present new opportunities to enhance teachers’ abilities and empower students to learn more effectively. That’s the goal of a new project from Microsoft Research, which uses generative AI to help teachers quickly develop personalized learning experiences, design assignments, create hands-on activities, and more, while giving them back hours of time that they spend on daily planning today.

Shiksha copilot is a research project and an interdisciplinary collaboration between Microsoft Research India and teams across Microsoft. Shiksha (Sanskrit: शिक्षा, IAST and ISO: śikṣā) means “instruction, lesson, learning, study of skill”. The project aims to improve learning outcomes and empower teachers to create comprehensive, age-appropriate lesson plans combining the best available online resources, including textbooks, videos, classroom activities, and student assessment tools. To help curate these resources, the project team built a copilot (an AI-powered digital assistant) centered on teachers’ specific needs, which were identified at the start through multiple interviews and workshops.

Working with Sikshana Foundation, a local non-governmental organization focused on improving public education, the researchers are piloting this program at several public schools in and around Bengaluru, India, to build and improve the underlying tools. This post gives an overview of the project, including interviews with three teachers who have used Shiksha copilot in their own classrooms.



A road map for teachers

A lesson plan is like a road map charting what students need to learn and how to efficiently cover the material during class time. It includes three key components:

  • Objectives for student learning, based on grade level and subject
  • Teaching and learning tactics, including tutorials and activities to help students understand the topic
  • Strategies to assess student understanding, both in class and through homework

Parimala H V teaches science in grades 6-8 at Government Higher Primary School, Santhe Beedhi in Bengaluru. She teaches in the local language, Kannada, and in English. For each class she teaches, she spends an hour or more each day scanning textbooks and printed materials to put together an effective lesson plan. She also searches the internet for ideas, but sifting through the growing body of online content could take just as long. Often she would work till midnight planning the next day’s activities, which left her feeling tired and stressed.

“Lesson planning can be a struggle, but it’s very important,” Parimala said. “If the planning goes well, everything goes well.”

With Shiksha copilot, Parimala was able to develop a complete lesson plan in 60 to 90 seconds, instead of 60 to 90 minutes. The simple interface asks basic questions about the curriculum, language of delivery, grade level, and subject. It then compiles engaging learning materials to achieve the teacher’s classroom objectives. Parimala finds better ideas and hands-on activities using Shiksha copilot than through other online tools. She feels well rested and better prepared for her day, which also makes her happier in the classroom. And with the time she saves, she can focus more on coaching her students and improving her teaching practices.
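As a rough illustration only (none of this is the Shiksha copilot codebase), the handful of inputs the interface asks for and the plan it assembles could be modeled and turned into a grounded prompt along these lines; every name below is hypothetical.

```python
# Hypothetical data model for a lesson-plan request and result, plus a prompt
# builder; illustrative only, not the project's actual implementation.
from dataclasses import dataclass, field
from typing import List

@dataclass
class LessonRequest:
    curriculum: str   # e.g. the state syllabus the class follows
    language: str     # e.g. "Kannada" or "English"
    grade: int        # e.g. 7
    subject: str      # e.g. "Science"
    topic: str        # chapter or topic to teach

@dataclass
class LessonPlan:
    objectives: List[str] = field(default_factory=list)    # what students should learn
    activities: List[str] = field(default_factory=list)    # tutorials, videos, hands-on work
    assessments: List[str] = field(default_factory=list)   # in-class checks and homework

def build_prompt(req: LessonRequest) -> str:
    """Turn the teacher's answers into a curriculum-grounded request for the model."""
    return (f"Create a grade-{req.grade} {req.subject} lesson plan on '{req.topic}' "
            f"in {req.language}, aligned with the {req.curriculum} syllabus. "
            "Include learning objectives, activities, and assessments.")
```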


“I was thrilled to have the opportunity to use Shiksha copilot,” Parimala said. “It could be very useful for new teachers just learning their profession. I think it could revolutionize the way teachers teach.” 

Parimala H.V., Teacher, Government Higher Primary School, Santhe Beedhi

At Parimala’s school and others in the Bengaluru area, teachers face some significant challenges. Classrooms can have up to 70 students of varying abilities. Teachers often need to prepare lessons and give instruction in both English and Kannada. As the COVID-19 pandemic brought about remote learning on a large scale, technology began to rapidly change how teachers and students interact. Most students now have computers or smartphones, expanding teachers’ options, but that also makes it harder to keep students focused on a traditional classroom blackboard.

“These children are addicted to their mobile phones and social media. If I use the ‘chalk and talk’ method in class, they may get bored,” said Gireesh K S, who relies heavily on his blackboard to teach math and physics at Government High School, Jalige. Gireesh has used web search tools to find digital resources like interactive PowerPoint slides that will hold his students’ attention longer. With Shiksha copilot, he can zero in more quickly on videos or classroom activities that help him connect better with all 40+ students in his class.

“Here lies the teacher’s job. The teacher has to select whichever activity, whichever video, or whichever questions to use,” Gireesh said. “There are so many questions and videos (to choose from), but as a teacher for my class, I know my students. So, I have to select the suitable ones.”

Other learning platforms were less flexible and less dynamic, returning static content options that were not always useful for a diverse group of learners. Shiksha copilot, on the other hand, does a much better job of customizing and adapting its recommendations based on teacher input, Gireesh said.

“Shiksha copilot is very easy to use when compared to other AI we have tried, because it is mapped with our own syllabus and our own curriculum.”

Gireesh K S, Teacher, Government High School, Jalige


Behind the technology

Designing and building Shiksha copilot required several technological innovations. Educational content is largely multimodal, spanning text, images, tables, videos, charts, and interactive materials, so creating engaging learning experiences calls for generative AI models with unified multimodal capabilities. These experiences are also most impactful when delivered in native languages, which requires improving the multilingual capabilities of generative AI models.

Shiksha copilot includes a range of powerful features that address those challenges and enhance the educational experience. It’s grounded in specific curricula and learning objectives, to ensure that all generated content aligns with desired educational outcomes, according to Akshay Nambi, principal researcher at Microsoft Research. “This grounding is enabled by ingesting relevant data with the help of state-of-the-art optical character recognition (OCR), computer vision (CV) and generative AI models. It was also important to use natural language and support voice-based interactions while including options for English and Kannada speakers,” Nambi said.
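The grounding Nambi describes is, at its core, a retrieve-then-generate step: find the curriculum passages relevant to the teacher's request and constrain the model to them. A minimal sketch of that idea, with a placeholder embedding function and hypothetical names rather than the project's actual components:

```python
# Illustrative retrieval-grounding sketch: answer only from curriculum passages
# that are most similar to the teacher's query. `embed()` is a placeholder.
import numpy as np
from typing import List

def embed(texts: List[str]) -> np.ndarray:
    """Hypothetical stand-in for a sentence-embedding model (one vector per text)."""
    raise NotImplementedError

def retrieve(query: str, passages: List[str], k: int = 3) -> List[str]:
    """Return the k passages most similar to the query by cosine similarity."""
    vecs = embed(passages)                       # shape: (num_passages, dim)
    q = embed([query])[0]                        # shape: (dim,)
    sims = vecs @ q / (np.linalg.norm(vecs, axis=1) * np.linalg.norm(q) + 1e-9)
    return [passages[i] for i in np.argsort(-sims)[:k]]

def grounded_prompt(query: str, passages: List[str]) -> str:
    """Build a prompt that restricts the model to the retrieved curriculum excerpts."""
    context = "\n\n".join(retrieve(query, passages))
    return (f"Using only the curriculum excerpts below, {query}\n\n"
            f"Curriculum excerpts:\n{context}")
```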

Shiksha copilot supports connectivity to both public and private resource content, enabling educators to tap into a vast array of materials and tailor them to their unique teaching requirements. Shiksha copilot can be accessed through different modalities, such as WhatsApp, Telegram, and web applications, enabling seamless integration with teachers’ current workflows.

To help create content more quickly and efficiently, the system leverages semantic caching with LLMs. Storing and reusing previously processed educational content reduces the computational resources required to deliver a scalable, affordable copilot experience. Throughout development, the project team followed established protocols regarding safety, reliability and trustworthiness.
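Semantic caching here means checking whether a new request is close enough in meaning to one that has already been answered and, if so, reusing the stored result instead of calling the LLM again. A minimal sketch, assuming a placeholder embedding function and an illustrative similarity threshold:

```python
# Illustrative semantic cache: reuse a stored response when a new query's
# embedding is sufficiently similar to an earlier one. `embed()` is a placeholder.
import numpy as np
from typing import List, Optional, Tuple

def embed(text: str) -> np.ndarray:
    """Hypothetical stand-in for an embedding model (returns one vector)."""
    raise NotImplementedError

class SemanticCache:
    def __init__(self, threshold: float = 0.92):
        self.threshold = threshold
        self.entries: List[Tuple[np.ndarray, str]] = []   # (query embedding, response)

    def lookup(self, query: str) -> Optional[str]:
        """Return a cached response if a semantically similar query was seen before."""
        q = embed(query)
        for vec, response in self.entries:
            sim = float(vec @ q / (np.linalg.norm(vec) * np.linalg.norm(q) + 1e-9))
            if sim >= self.threshold:
                return response                            # cache hit: skip the LLM call
        return None

    def store(self, query: str, response: str) -> None:
        """Remember a newly generated response for future reuse."""
        self.entries.append((embed(query), response))
```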

“Extensive prompt designing, testing and rigorous responsible AI procedures, including content filtering and moderation, red team assessments and jailbreaking simulations, have been deployed to maximize safety and reliability. These measures are in place so that Shiksha copilot consistently produces factual and trustworthy content,” said Tanuja Ganu, principal research SDE manager at Microsoft Research.

Convincing the skeptics

Before the initial workshop, some teachers expressed skepticism about using AI for lesson planning. Students already have multiple digital learning tools, and for Mahalakshmi A, who teaches science in grades 4-8 at the rural Government Higher Primary School, Basavana Halli, outside Bengaluru, the value of such tools for teachers was less clear. During a two-hour initial workshop session, however, Mahalakshmi found she could easily use Shiksha copilot to create multiple lesson plans that would work well in her classroom.


“I felt very happy because it’s a totally different concept. Before now, I could see that technology could work for the students. But this is the first time that it felt like the teachers also had a tool for themselves.”

Mahalakshmi A., Teacher, Government Higher Primary School, Basavana Halli

Mahalakshmi could also see how the content assembled using Shiksha copilot would make her class more interesting for her students, which is an important goal. “Instead of giving them the same problems, the same experiments, and the same videos, we make learning interesting. And then they learn what we call shashwatha kalike, or permanent learning. With Shiksha copilot, we can make that permanent learning happen in our classroom,” she added.

Next steps

The initial pilot program for Shiksha copilot is underway at more than 10 schools in and around Bengaluru. The goal is to let the teachers experience how Shiksha copilot can best be used in their daily workflows to improve learning experiences and collect feedback. The early response has been highly positive, with teachers expressing great satisfaction in both the quality of the content generated and the time savings. To build on this successful pilot, researchers are gearing up to scale Shiksha copilot in schools across the state of Karnataka and beyond, in collaboration with Sikshana Foundation.

This copilot is being developed as part of Project VeLLM (Universal Empowerment with Large Language Models) at Microsoft Research India. VeLLM’s goal is to make inclusive and accessible copilots available to everyone by building a platform for developing population-scale copilots. Inclusive copilots must address various real-world challenges, such as a multilingual user base, varied skillsets, limited devices and connectivity, domain-specific understanding, guardrails, and safety principles. Shiksha is the first copilot developed using the VeLLM platform. The VeLLM team is working with collaborators across diverse domains, such as agriculture and healthcare, to develop tailored domain-specific copilot experiences utilizing the platform and addressing associated research problems. 

To learn more about the project or collaboration opportunities, email the team at shikshacopilot@microsoft.com.

The Shiksha copilot team and collaborators (from left to right): Meena Elapulli (Microsoft Research), Ishaan Watts (Microsoft Research), Kavyansh Chourasia (Microsoft Research), Gireesh K.S. (GHPS, Tumkur), Srujana V S (Microsoft Research), Tanuja Ganu (Microsoft Research), Mahalakshmi A (GHPS, Basavana Halli), Parimala H.V. (GHPS, Santhe Beedi), Ravi R (GHPS, Gowdahalli), Maruthi K.R. (GHPS, Anedoddi), Smitha Venkatesh (Sikshana Foundation), Akshay Nambi (Microsoft Research), Somnath Kumar (Microsoft Research), Yash Gadhia (Microsoft Research), Sanchit Gupta (Microsoft Research)


Read More