Learn how Amazon Pharmacy created their LLM-based chatbot using Amazon SageMaker

Amazon Pharmacy is a full-service pharmacy on Amazon.com that offers transparent pricing, clinical and customer support, and free delivery right to your door. Customer care agents play a crucial role in quickly and accurately retrieving pharmacy-related information in real time, including prescription clarifications and transfer status, order and dispensing details, and patient profile information. Amazon Pharmacy provides a chat interface where customers (patients and doctors) can talk online with customer care representatives (agents). One challenge agents face is finding precise information when answering customers’ questions, because the diversity, volume, and complexity of healthcare processes (such as explaining prior authorizations) can be daunting. Finding the right information, summarizing it, and explaining it takes time, slowing down the speed at which patients are served.

To tackle this challenge, Amazon Pharmacy built a generative AI question answering (Q&A) chatbot assistant to empower agents to retrieve information with natural language searches in real time, while preserving the human interaction with customers. The solution is HIPAA compliant, ensuring customer privacy. In addition, agents submit their feedback on the machine-generated answers back to the Amazon Pharmacy development team, so it can be used for future model improvements.

In this post, we describe how Amazon Pharmacy implemented its customer care agent assistant chatbot solution using AWS AI products, including foundation models in Amazon SageMaker JumpStart to accelerate its development. We start by highlighting the overall experience of the customer care agent with the addition of the large language model (LLM)-based chatbot. Then we explain how the solution uses the Retrieval Augmented Generation (RAG) pattern for its implementation. Finally, we describe the product architecture. This post demonstrates how generative AI is integrated into an already working application in a complex and highly regulated business, improving the customer care experience for pharmacy patients.

The LLM-based Q&A chatbot

The following figure shows the process flow of a patient contacting Amazon Pharmacy customer care via chat (Step 1). Agents use a separate internal customer care UI to ask questions to the LLM-based Q&A chatbot (Step 2). The customer care UI then sends the request to a service backend hosted on AWS Fargate (Step 3), where the queries are orchestrated through a combination of models and data retrieval processes, collectively known as the RAG process. This process is the heart of the LLM-based chatbot solution and its details are explained in the next section. At the end of this process, the machine-generated response is returned to the agent, who can review the answer before providing it back to the end-customer (Step 4). It should be noted that agents are trained to exercise judgment and use the LLM-based chatbot solution as a tool that augments their work, so they can dedicate their time to personal interactions with the customer. Agents also label the machine-generated response with their feedback (for example, positive or negative). This feedback is then used by the Amazon Pharmacy development team to improve the solution (through fine-tuning or data improvements), forming a continuous cycle of product development with the user (Step 5).

Process flow and high level architecture

The following figure shows an example from a Q&A chatbot and agent interaction. Here, the agent was asking about a claim rejection code. The Q&A chatbot (Agent AI Assistant) answers the question with a clear description of the rejection code. It also provides the link to the original documentation for the agents to follow up, if needed.

Example screenshot from Q&A chatbot

Accelerating the ML model development

In the previous figure depicting the chatbot workflow, we skipped the details of how the initial version of the Q&A chatbot models was trained. To do this, the Amazon Pharmacy development team benefited from SageMaker JumpStart, which allowed the team to experiment quickly with different models, run different benchmarks and tests, and fail fast as needed. Failing fast is a practice in which scientists and developers quickly build a solution that is as realistic as possible and learn from the effort to improve it in the next iteration. After the team decided on the model and performed any necessary fine-tuning and customization, they used SageMaker hosting to deploy the solution. Reusing the foundation models in SageMaker JumpStart saved the development team months of work that would otherwise have been needed to train models from scratch.

The RAG design pattern

One core part of the solution is the use of the Retrieval Augmented Generation (RAG) design pattern for implementing Q&A solutions. The first step in this pattern is to identify a set of known question and answer pairs, which is the initial ground truth for the solution. The next step is to convert the questions into a representation better suited to similarity search, called an embedding (we embed a higher-dimensional object into a hyperplane with fewer dimensions). This is done through an embedding-specific foundation model. These embeddings are used as indexes to the answers, much like how a database index maps a primary key to a row. We’re now ready to support new queries coming from the customer. As explained previously, customers send their queries to agents, who then interface with the LLM-based chatbot. Within the Q&A chatbot, the query is converted to an embedding and then used as a search key against the index built in the previous step. The matching criterion is based on a similarity search, such as FAISS or Amazon OpenSearch Service (for more details, refer to Amazon OpenSearch Service’s vector database capabilities explained).

When there are matches, the top answers are retrieved and used as the prompt context for the generative model. This corresponds to the second step in the RAG pattern, the generative step. In this step, the prompt is sent to the LLM (the generator foundation model), which composes the final machine-generated response to the original question. This response is provided back through the customer care UI to the agent, who validates the answer, edits it if needed, and sends it back to the patient. The following diagram illustrates this process.

RAG flow
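To make the two RAG steps concrete, the following is a minimal, illustrative sketch (not the Amazon Pharmacy implementation). The embed and generate functions are placeholders for calls to the embedding and generator foundation model endpoints, and FAISS is used as the similarity index; the example question and answer strings are made up.

```python
# Illustrative RAG sketch: index embedded questions with FAISS, retrieve the
# closest known answers, and build a prompt for the generator LLM.
import numpy as np
import faiss

def embed(texts):
    """Placeholder for the embedding foundation model endpoint."""
    raise NotImplementedError

def generate(prompt):
    """Placeholder for the generator foundation model endpoint."""
    raise NotImplementedError

# Step 1: build the index from the known question/answer pairs (the ground truth).
qa_pairs = [
    ("What is a prior authorization?", "A prior authorization is ..."),
    ("How do I check a prescription transfer status?", "To check a transfer status ..."),
]
question_vectors = np.asarray(embed([q for q, _ in qa_pairs]), dtype="float32")
index = faiss.IndexFlatL2(question_vectors.shape[1])
index.add(question_vectors)

# Step 2: at query time, embed the agent's question, retrieve the top matches,
# and let the LLM compose the final answer from that context.
def answer(query, k=3):
    query_vector = np.asarray(embed([query]), dtype="float32")
    _, matches = index.search(query_vector, k)
    context = "\n".join(qa_pairs[i][1] for i in matches[0])
    prompt = f"Answer the question using only this context:\n{context}\n\nQuestion: {query}"
    return generate(prompt)
```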

Managing the knowledge base

As we learned with the RAG pattern, the first step in performing Q&A consists of retrieving the data (the question and answer pairs) to be used as context for the LLM prompt. This data is referred to as the chatbot’s knowledge base. Examples of this data are Amazon Pharmacy internal standard operating procedures (SOPs) and information available in Amazon Pharmacy Help Center. To facilitate the indexing and the retrieval process (as described previously), it’s often useful to gather all this information, which may be hosted across different solutions such as in wikis, files, and databases, into a single repository. In the particular case of the Amazon Pharmacy chatbot, we use Amazon Simple Storage Service (Amazon S3) for this purpose because of its simplicity and flexibility.
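As a simple illustration of consolidating the knowledge base, the following hypothetical snippet copies exported documents into a single S3 bucket; the bucket name and local folder are made up.

```python
# Illustrative only: gather exported knowledge base documents into one S3 bucket.
from pathlib import Path
import boto3

s3 = boto3.client("s3")
KNOWLEDGE_BASE_BUCKET = "example-pharmacy-chatbot-knowledge-base"  # hypothetical name

for doc in Path("exported_docs").glob("**/*.txt"):
    s3.upload_file(str(doc), KNOWLEDGE_BASE_BUCKET, f"sops/{doc.name}")
```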

Solution overview

The following figure shows the solution architecture. The customer care application and the LLM-based Q&A chatbot are deployed in their own VPCs for network isolation. The connection between the VPC endpoints is realized through AWS PrivateLink, guaranteeing their privacy. The Q&A chatbot likewise has its own AWS account for role separation, isolation, and ease of monitoring for security, cost, and compliance purposes. The Q&A chatbot orchestration logic is hosted in Fargate with Amazon Elastic Container Service (Amazon ECS). To set up PrivateLink, a Network Load Balancer proxies the requests to an Application Load Balancer, which terminates the end-client TLS connection and hands requests off to Fargate. The primary storage service is Amazon S3. As mentioned previously, the related input data is ingested into the desired format inside the Q&A chatbot account and persisted in S3 buckets.

Solutions architecture

When it comes to the machine learning (ML) infrastructure, Amazon SageMaker is at the center of the architecture. As explained in the previous sections, two models are used, an embedding model and an LLM, and they are hosted on two separate SageMaker endpoints. By using the SageMaker data capture feature, we can log all inference requests and responses for troubleshooting purposes, with the necessary privacy and security constraints in place. The feedback collected from agents is stored in a separate S3 bucket.
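As a rough sketch of how the Fargate-hosted orchestration layer might call the two SageMaker endpoints, the snippet below uses the SageMaker runtime API; the endpoint names and payload formats are assumptions and depend on the deployed models.

```python
# Illustrative only: call the embedding endpoint and the LLM endpoint in turn.
import json
import boto3

runtime = boto3.client("sagemaker-runtime")

def invoke(endpoint_name, payload):
    response = runtime.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Body=json.dumps(payload),
    )
    return json.loads(response["Body"].read())

# Hypothetical endpoint names and payloads.
embedding = invoke("qa-embedding-endpoint", {"inputs": "What is rejection code 75?"})
completion = invoke("qa-llm-endpoint", {"inputs": "<prompt assembled from retrieved context>"})
```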

The Q&A chatbot is designed to be a multi-tenant solution that supports additional health products from Amazon Health Services, such as Amazon Clinic. To support this, the solution is deployed with AWS CloudFormation templates for infrastructure as code (IaC), allowing different knowledge bases to be used.

Conclusion

This post presented the technical solution for Amazon Pharmacy generative AI customer care improvements. The solution consists of a question answering chatbot implementing the RAG design pattern on SageMaker and foundation models in SageMaker JumpStart. With this solution, customer care agents can assist patients more quickly, while providing precise, informative, and concise answers.

The architecture uses modular microservices with separate components for knowledge base preparation and loading, chatbot (orchestration) logic, embedding indexing and retrieval, LLM content generation, and feedback supervision. The latter is especially important for ongoing model improvements. The foundation models in SageMaker JumpStart are used for fast experimentation, with model serving handled by SageMaker endpoints. Finally, the HIPAA-compliant chatbot server is hosted on Fargate.

In summary, we saw how Amazon Pharmacy is using generative AI and AWS to improve customer care while prioritizing responsible AI principles and practices.

You can start experimenting with foundation models in SageMaker JumpStart today to find the right foundation models for your use case and start building your generative AI application on SageMaker.


About the author

Burak Gozluklu is a Principal AI/ML Specialist Solutions Architect located in Boston, MA. He helps global customers adopt AWS technologies and specifically AI/ML solutions to achieve their business objectives. Burak has a PhD in Aerospace Engineering from METU, an MS in Systems Engineering, and a post-doc in system dynamics from MIT in Cambridge, MA. Burak is passionate about yoga and meditation.

Jangwon Kim is a Sr. Applied Scientist at Amazon Health Store & Tech. He has expertise in LLM, NLP, Speech AI, and Search. Prior to joining Amazon Health, Jangwon was an applied scientist at Amazon Alexa Speech. He is based out of Los Angeles.

Alexandre Alves is a Sr. Principal Engineer at Amazon Health Services, specializing in ML, optimization, and distributed systems. He helps deliver wellness-forward health experiences.

Nirvay Kumar is a Sr. Software Dev Engineer at Amazon Health Services, leading architecture within Pharmacy Operations after many years in Fulfillment Technologies. With expertise in distributed systems, he has cultivated a growing passion for AI’s potential. Nirvay channels his talents into engineering systems that solve real customer needs with creativity, care, security, and a long-term vision. When not hiking the mountains of Washington, he focuses on thoughtful design that anticipates the unexpected. Nirvay aims to build systems that withstand the test of time and serve customers’ evolving needs.

Read More

Keeping an eye on your cattle using AI technology

At Amazon Web Services (AWS), not only are we passionate about providing customers with a variety of comprehensive technical solutions, but we’re also keen on deeply understanding our customers’ business processes. We adopt a third-party perspective and objective judgment to help customers sort out their value propositions, collect pain points, propose appropriate solutions, and create the most cost-effective and usable prototypes to help them systematically achieve their business goals.

This method is called working backwards at AWS. It means putting aside technology and solutions, starting from the expected results of customers, confirming their value, and then deducing what needs to be done in reverse order before finally implementing a solution. During the implementation phase, we also follow the concept of minimum viable product and strive to quickly form a prototype that can generate value within a few weeks, and then iterate on it.

Today, let’s review a case study in which AWS and New Hope Dairy collaborated to build a smart farm on the cloud. From this blog post, you can gain a deeper understanding of what AWS can provide for building a smart farm and how to build smart farm applications on the cloud with AWS experts.

Project background

Milk is a nutritious beverage. In consideration of national health, China has been actively promoting the development of the dairy industry. According to data from Euromonitor International, the sale of dairy products in China reached 638.5 billion RMB in 2020 and is expected to reach 810 billion RMB in 2025. In addition, the compound annual growth rate in the past 14 years has also reached 10 percent, showing rapid development.

On the other hand, as of 2022, most of the revenue in the Chinese dairy industry still comes from liquid milk. Sixty percent of the raw milk is used for liquid milk and yogurt, and another 20 percent is milk powder—a derivative of liquid milk. Only a very small amount is used for highly processed products such as cheese and cream.

Liquid milk is a lightly processed product and its output, quality, and cost are closely linked to raw milk. This means that if the dairy industry wants to free capacity to focus on producing highly processed products, create new products, and conduct more innovative biotechnology research, it must first improve and stabilize the production and quality of raw milk.

As a dairy industry leader, New Hope Dairy has been thinking about how to improve the efficiency of its ranch operations and increase the production and quality of raw milk. New Hope Dairy hopes to use the third-party perspective and technological expertise of AWS to facilitate innovation in the dairy industry. With support and promotion from Liutong Hu, VP and CIO of New Hope Dairy, the AWS customer team began to organize operations and potential innovation points for the dairy farms.

Dairy farm challenges

AWS is an expert in the field of cloud technology, but to implement innovation in the dairy industry, professional advice from dairy subject matter experts is necessary. Therefore, we conducted several in-depth interviews with Liangrong Song, the Deputy Director of Production Technology Center of New Hope Dairy, the ranch management team, and nutritionists to understand some of the issues and challenges facing the farm.

First is taking inventory of reserve cows

The cattle on the ranch are divided into two types: dairy cows and reserve cows. Dairy cows are mature and continuously produce milk, while reserve cows have not yet reached milking age. Large and medium-sized farms usually provide reserve cows with a larger open activity area to create a more comfortable growing environment.

However, both dairy cows and reserve cows are assets of the farm and need to be inventoried monthly. Dairy cows are milked every day, and because they are relatively still during milking, inventory tracking is easy. However, reserve cows are in an open space and roam freely, which makes it inconvenient to inventory them. Each time inventory is taken, several workers count the reserve cows repeatedly from different areas, and finally, the numbers are checked. This process consumes one to two days for several workers, and often there are problems with aligning the counts or uncertainties about whether each cow has been counted.

Significant time can be saved if we have a way to inventory reserve cows quickly and accurately.

Second is identifying lame cattle

Currently, most dairy companies use a breed named Holstein to produce milk. Holsteins are the black and white cows most of us are familiar with. Despite most dairy companies using the same breed, there are still differences in milk production quantity and quality among different companies and ranches. This is because the health of dairy cows directly affects milk production.

However, cows cannot express discomfort on their own like humans can, and it isn’t practical for veterinarians to give thousands of cows physical examinations regularly. Therefore, we have to use external indicators to quickly judge the health status of cows.

smart ranch with aws

The external indicators of a cow’s health include body condition score and lameness degree. Body condition score is largely related to the cow’s body fat percentage and is a long-term indicator, while lameness is a short-term indicator caused by leg problems or foot infections and other issues that affect the cow’s mood, health, and milk production. Additionally, adult Holstein cows can weigh over 500 kg, which can cause significant harm to their feet if they aren’t stable. Therefore, when lameness occurs, veterinarians should intervene as soon as possible.

According to a 2014 study, the proportion of severely lame cows in China can be as high as 31 percent. Although the situation might have improved since the study, the veterinarian count on farms is extremely limited, making it difficult to monitor cows regularly. When lameness is detected, the situation is often severe, and treatment is time-consuming and difficult, and milk production is already affected.

If we have a way to timely detect lameness in cows and prompt veterinarians to intervene at the mild lameness stage, the overall health and milk production of the cows will increase, and the performance of the farm will improve.

Lastly, there is feed cost optimization

Within the livestock industry, feed is the biggest variable cost. To ensure the quality and inventory of feed, farms often need to purchase feed ingredients from domestic and overseas suppliers and deliver them to feed formulation factories for processing. There are many types of modern feed ingredients, including soybean meal, corn, alfalfa, oat grass, and so on, which means that there are many variables at play. Each type of feed ingredient has its own price cycle and price fluctuations. During significant fluctuations, the total cost of feed can fluctuate by more than 15 percent, causing a significant impact.

Feed costs fluctuate, but dairy product prices are relatively stable over the long term. Consequently, under otherwise unchanged conditions, the overall profit can fluctuate significantly purely due to feed cost changes.

To avoid this fluctuation, it’s necessary to consider storing more ingredients when prices are low. But stocking also needs to consider whether the price is genuinely at the trough and what quantity of feed should be purchased according to the current consumption rate.

If we have a way to precisely forecast feed consumption and combine it with the overall price trend to suggest the best time and quantity of feed to purchase, we can reduce costs and increase efficiency on the farm.

It’s evident that these issues are directly related to the customer’s goal of improving farm operational efficiency, and the corresponding levers are freeing up labor, increasing production, and reducing costs. Through discussions on the difficulty and value of solving each issue, we chose increasing production as the starting point and prioritized solving the problem of lame cows.

Research

Before discussing technology, research had to be conducted. The research was jointly conducted by the AWS customer team; the AWS Generative AI Innovation Center, which managed the machine learning algorithm models; AWS AI Shanghai Lablet, which provided algorithm consultation on the latest computer vision research; and the expert farming team from New Hope Dairy. The research was divided into several parts:

  • Understanding the traditional paper-based identification method of lame cows and developing a basic understanding of what lame cows are.
  • Confirming existing solutions, including those used in farms and in the industry.
  • Conducting farm environment research to understand the physical situation and limitations.

Through studying materials and observing on-site videos, the teams gained a basic understanding of lame cows. Readers can also get a basic idea of the posture of lame cows through the animated image below.

Lame Cows

In contrast to a relatively healthy cow.

healthy cow

Lame cows have visible differences in posture and gait compared to healthy cows.

Regarding existing solutions, most ranches rely on visual inspection by veterinarians and nutritionists to identify lame cows. In the industry, there are solutions that use wearable pedometers and accelerometers for identification, as well as solutions that use partitioned weighbridges for identification, but both are relatively expensive. For the highly competitive dairy industry, we need to minimize identification costs and the costs and dependence on non-generic hardware.

After discussing and analyzing the information with ranch veterinarians and nutritionists, the AWS Generative AI Innovation Center experts decided to use computer vision (CV) for identification, relying only on ordinary hardware: civilian surveillance cameras, which don’t add any additional burden to the cows and reduce costs and usage barriers.

After deciding on this direction, we visited a medium-sized farm with thousands of cows on site, investigated the ranch environment, and determined the location and angle of camera placement.

Initial proposal

Now, for the solution. The core of our CV-based solution consists of the following steps:

  • Cow identification: Identify multiple cows in a single frame of video and mark the position of each cow.
  • Cow tracking: While video is recording, we need to continuously track cows as the frames change and assign a unique number to each cow.
  • Posture marking: Reduce the dimensionality of cow movements by converting cow images to marked points.
  • Anomaly identification: Identify anomalies in the marked points’ dynamics.
  • Lame cow algorithm: Normalize the anomalies to obtain a score to determine the degree of cow lameness.
  • Threshold determination: Obtain a threshold based on expert inputs.

According to the judgment of the AWS Generative AI Innovation Center experts, the first few steps are generic requirements that can be solved using open-source models, while the latter steps require us to use mathematical methods and expert intervention.

Difficulties in the solution

To balance cost and performance, we chose the YOLOv5l model, a medium-sized pre-trained model for cow recognition, with an input width of 640 pixels, which provides good value for this scene.
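As a hedged sketch of this step, the snippet below loads a pretrained YOLOv5l model through PyTorch Hub and detects cows in a single frame; the frame path and confidence threshold are placeholders, and the production model would be further fine-tuned and hosted on SageMaker.

```python
# Illustrative only: detect cows in one video frame with a pretrained YOLOv5l model.
import torch

model = torch.hub.load("ultralytics/yolov5", "yolov5l", pretrained=True)
model.conf = 0.4  # minimum detection confidence (placeholder value)

results = model("frame_000123.jpg", size=640)  # 640-pixel input width
detections = results.pandas().xyxy[0]
cows = detections[detections["name"] == "cow"]
print(cows[["xmin", "ymin", "xmax", "ymax", "confidence"]])
```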

While YOLOv5 is responsible for recognizing and tagging cows in a single image, in reality, videos consist of multiple images (frames) that change continuously. YOLOv5 cannot identify that cows in different frames belong to the same individual. To track and locate a cow across multiple images, another model called SORT is needed.

SORT stands for simple online and realtime tracking, where online means it considers only the current and previous frames to track without consideration of any other frames, and realtime means it can identify the object’s identity immediately.

After the development of SORT, many engineers implemented and optimized it, leading to the development of OC-SORT, which considers the appearance of the object, DeepSORT (and its upgraded version, StrongSORT), which includes human appearance, and ByteTrack, which uses a two-stage association linker to consider low-confidence recognition. After testing, we found that for our scene, DeepSORT’s appearance tracking algorithm is more suitable for humans than for cows, and ByteTrack’s tracking accuracy is slightly weaker. As a result, we ultimately chose OC-SORT as our tracking algorithm.
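To give a feel for what a tracker adds on top of per-frame detection, here is a deliberately simplified, illustrative association step (not OC-SORT itself): each new detection is matched to the previous frame’s track with the highest box overlap, and unmatched detections start new track IDs.

```python
# Simplified illustration of SORT-style frame-to-frame association (not OC-SORT).

def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def associate(tracks, detections, iou_threshold=0.3):
    """Greedily assign detections to existing track IDs; unmatched boxes get new IDs."""
    assignments = {}
    next_id = max(tracks, default=-1) + 1
    for det in detections:
        best_id, best_iou = None, iou_threshold
        for track_id, box in tracks.items():
            overlap = iou(box, det)
            if track_id not in assignments and overlap > best_iou:
                best_id, best_iou = track_id, overlap
        if best_id is None:
            best_id, next_id = next_id, next_id + 1
        assignments[best_id] = det
    return assignments  # {track_id: box for the current frame}
```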

Next, we use DeepLabCut (DLC for short) to mark the skeletal points of the cows. DLC is a markerless model, which means that although different points, such as the head and limbs, might have different meanings, they are all just points for DLC, which only requires us to mark the points and train the model.

This leads to a new question: how many points should we mark on each cow and where should we mark them? The answer to this question affects the workload of marking, training, and subsequent inference efficiency. To solve this problem, we must first understand how to identify lame cows.

Based on our research and the inputs of our expert clients, lame cows in videos exhibit the following characteristics:

  • An arched back: The neck and back are curved, forming a triangle with the root of the neck bone (arched-back).
  • Frequent nodding: Each step can cause the cow to lose balance or slip, resulting in frequent nodding (head bobbing).
  • Unstable gait: The cow’s gait changes after a few steps, with slight pauses (gait pattern change).

Comparison between healthy cow and lame cow

With regard to neck and back curvature as well as nodding, experts from the AWS Generative AI Innovation Center determined that marking only seven points on each cow (one on the head, one at the base of the neck, and five on the back) is enough for good identification. Because these keypoints are identified frame by frame, we should also be able to recognize unstable gait patterns.

Next, we use mathematical expressions to represent the identification results and form algorithms.

Human identification of these problems isn’t difficult, but precise algorithms are required for computer identification. For example, how does a program know the degree of curvature of a cow’s back given a set of cow back coordinate points? How does it know if a cow is nodding?

In terms of back curvature, we first consider treating the cow’s back as an angle and then we find the vertex of that angle, which allows us to calculate the angle. The problem with this method is that the spine might have bidirectional curvature, making the vertex of the angle difficult to identify. This requires switching to other algorithms to solve the problem.

key-points-of-a-cow
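The following is a minimal sketch of the angle-based idea, assuming the seven keypoints are (x, y) pixel coordinates ordered from the head to the last back point; it picks the most protruding back point as the angle’s vertex, which is one simple way to handle the vertex-finding problem mentioned above. The example coordinates and thresholds are made up.

```python
# Illustrative back-curvature measure from seven (x, y) keypoints (head first).
import numpy as np

def back_angle(keypoints):
    """Angle (degrees) at the most protruding back point; ~180 means a flat back."""
    pts = np.asarray(keypoints, dtype=float)
    head, tail = pts[0], pts[-1]
    dx, dy = tail - head
    rel = pts[1:-1] - head
    # Perpendicular distance of each intermediate point from the head-to-tail line.
    dists = np.abs(dx * rel[:, 1] - dy * rel[:, 0]) / (np.hypot(dx, dy) + 1e-9)
    vertex = pts[1:-1][np.argmax(dists)]
    v1, v2 = head - vertex, tail - vertex
    cos_theta = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2) + 1e-9)
    return np.degrees(np.arccos(np.clip(cos_theta, -1.0, 1.0)))

# Made-up example coordinates: a smaller angle indicates a more arched back.
keypoints = [(120, 80), (180, 95), (230, 110), (280, 118), (330, 120), (380, 118), (430, 112)]
print(round(back_angle(keypoints), 1))
```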

In terms of nodding, we first considered using the Fréchet distance to determine if the cow is nodding by comparing the difference in the curve of the cow’s overall posture. However, the problem is that the cow’s skeletal points might be displaced, causing significant distance between similar curves. To solve this problem, we need to take out the position of the head relative to the recognition box and normalize it.

After normalizing the position of the head, we encountered a new problem. In the image that follows, the graph on the left shows the change in the position of the cow’s head. We can see that due to recognition accuracy issues, the position of the head point will constantly shake slightly. We need to remove these small movements and find the relatively large movement trend of the head. This is where some knowledge of signal processing is needed. By using a Savitzky-Golay filter, we can smooth out a signal and obtain its overall trend, making it easier for us to identify nodding, as shown by the orange curve in the graph on the right.

key points curve
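As a small sketch of this smoothing step, the snippet below applies a Savitzky-Golay filter to a synthetic head-height signal; the window length and polynomial order are illustrative values, not the ones used in the project.

```python
# Illustrative only: smooth a noisy head-position signal to expose the nodding trend.
import numpy as np
from scipy.signal import savgol_filter

rng = np.random.default_rng(0)
frames = np.arange(300)
# Synthetic signal: a slow nodding motion plus small keypoint jitter.
head_y = 0.1 * np.sin(frames / 12.0) + rng.normal(scale=0.02, size=frames.size)

trend = savgol_filter(head_y, window_length=31, polyorder=3)

# Direction changes of the smoothed trend approximate individual nods.
direction_changes = int(np.sum(np.diff(np.sign(np.diff(trend))) != 0))
print(f"approximate nod-related direction changes: {direction_changes}")
```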

Additionally, after dozens of hours of video recognition, we found that some cows flagged with extremely high back curvature did not actually have hunched backs. Further investigation revealed that most of the cows used to train the DLC model were black or black and white, with few cows that were mostly white or close to pure white, so the model misplaced keypoints on cows with large white areas on their bodies, as shown by the red arrow in the figure below. This can be corrected through further model training.

In addition to solving the preceding problems, there were other generic problems that needed to be solved:

  • There are two paths in the video frame, and cows in the distance might also be recognized, causing problems.
  • The paths in the video also have a certain curvature, and a cow’s body length appears shorter when the cow is on the sides of the path, making its posture prone to misidentification.
  • Due to the overlap of multiple cows or occlusion from the fence, the same cow might be identified as two cows.
  • Due to tracking parameters and occasional frame skipping by the camera, cows cannot always be tracked correctly, resulting in ID confusion issues.

In the short term, in line with the agreement with New Hope Dairy to deliver a minimum viable product and then iterate on it, these problems can usually be handled by outlier-detection algorithms combined with confidence filtering; when they can’t, the affected data is treated as invalid, which requires us to perform additional training and continuously iterate our algorithms and models.

In the long term, AWS AI Shanghai Lablet suggested future experiments to solve the preceding problems based on their object-centric research: Bridging the Gap to Real-World Object-Centric Learning and Self-supervised Amodal Video Object Segmentation. Besides invalidating outlier data, the issues can also be addressed by developing more precise object-level models for pose estimation, amodal segmentation, and supervised tracking. However, traditional vision pipelines for these tasks typically require extensive labeling. Object-centric learning focuses on tackling the binding problem of pixels to objects without additional supervision. The binding process not only provides information on the location of objects but also results in robust and adaptable object representations for downstream tasks. Because the object-centric pipeline focuses on self-supervised or weakly supervised settings, we can improve performance without significantly increasing labeling costs for our customers.

After solving a series of problems and combining the scores given by the farm veterinarian and nutritionist, we have obtained a comprehensive lameness score for cows, which helps us identify cows with different degrees of lameness such as severe, moderate, and mild, and can also identify multiple body posture attributes of cows, helping further analysis and judgment.

Within weeks, we developed an end-to-end solution for identifying lame cows. The hardware camera for this solution cost only 300 RMB, and the Amazon SageMaker batch inference, when using the g4dn.xlarge instance, took about 50 hours for 2 hours of video, totaling only 300 RMB. When it enters production, if five batches of cows are detected per week (assuming about 10 hours), and including the rolling saved videos and data, the monthly detection cost for a medium-sized ranch with several thousand cows is less than 10,000 RMB.

Currently, our machine learning model process is as follows:

  1. Raw video is recorded.
  2. Cows are detected and identified.
  3. Each cow is tracked, and key points are detected.
  4. Each cow’s movement is analyzed.
  5. A lameness score is determined.

identification process

Model deployment

The preceding sections described the machine learning solution for identifying lame cows. Now, we need to deploy these models on SageMaker, as shown in the following figure.

Architecture diagram
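The post doesn’t share the deployment code, but as a hedged sketch, a batch transform job on ml.g4dn.xlarge (the instance type mentioned earlier) could look roughly like the following; the bucket paths, entry point script, and payload format are assumptions.

```python
# Illustrative only: run the lameness-scoring models as a SageMaker batch transform job.
from sagemaker import get_execution_role
from sagemaker.pytorch import PyTorchModel

role = get_execution_role()  # assumes a SageMaker environment; otherwise pass an IAM role ARN

model = PyTorchModel(
    model_data="s3://example-bucket/models/lameness-model.tar.gz",  # hypothetical artifact
    role=role,
    entry_point="inference.py",  # hypothetical handler wrapping detection, tracking, and scoring
    framework_version="1.13",
    py_version="py39",
)

transformer = model.transformer(
    instance_count=1,
    instance_type="ml.g4dn.xlarge",
    output_path="s3://example-bucket/lameness-scores/",
)

transformer.transform(
    data="s3://example-bucket/video-frames/",  # pre-extracted frames or clips
    content_type="application/x-image",
    wait=True,
)
```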

Business implementation

Of course, what we’ve discussed so far is just the core of our technical solution. To integrate the entire solution into the business process, we also must address the following issues:

  • Data feedback: For example, we must provide veterinarians with an interface to filter and view lame cows that need to be processed and collect data during this process to use as training data.
  • Cow identification: After a veterinarian sees a lame cow, they also need to know the cow’s identity, such as its number and pen.
  • Cow positioning: In a pen with hundreds of cows, quickly locate the target cow.
  • Data mining: For example, find out how the degree of lameness affects feeding, rumination, rest, and milk production.
  • Data-driven: For example, identify the genetic, physiological, and behavioral characteristics of lame cows to achieve optimal breeding and reproduction.

Only by addressing these issues can the solution truly solve the business problem, and the collected data can generate long-term value. Some of these problems are system integration issues, while others are technology and business integration issues. We will share further information about these issues in future articles.

Summary

In this article, we briefly explained how the AWS Customer Solutions team innovates quickly based on the customer’s business. This mechanism has several characteristics:

  • Business led: Prioritize understanding the customer’s industry and business processes on site and in person before discussing technology, and then delve into the customer’s pain points, challenges, and problems to identify important issues that can be solved with technology.
  • Immediately available: Provide a simple but complete and usable prototype directly to the customer for testing, validation, and rapid iteration within weeks, not months.
  • Minimal cost: Minimize or even eliminate the customer’s costs before the value is truly validated, avoiding concerns about the future. This aligns with the AWS frugality leadership principle.

In our collaborative innovation project with the dairy industry, we not only started from the business perspective to identify specific business problems with business experts, but also conducted on-site investigations at the farm and factory with the customer. We determined the camera placement on site, installed and deployed the cameras, and deployed the video streaming solution. Experts from the AWS Generative AI Innovation Center dissected the customer’s requirements and developed the algorithm, and a solutions architect then engineered the end-to-end solution around it.

With each inference, we could obtain thousands of decomposed and tagged cow walking videos, each with the original video ID, cow ID, lameness score, and various detailed scores. The complete calculation logic and raw gait data were also retained for subsequent algorithm optimization.

Lameness data can not only be used for early intervention by veterinarians, but also combined with milking machine data for cross-analysis, providing an additional validation dimension and answering some additional business questions, such as: What are the physical characteristics of cows with the highest milk yield? What is the effect of lameness on milk production in cows? What is the main cause of lame cows, and how can it be prevented? This information will provide new ideas for farm operations.

The story of identifying lame cows ends here, but the story of farm innovation has just begun. In subsequent articles, we will continue to discuss how we work closely with customers to solve other problems.


About the Authors


Hao Huang
is an applied scientist at the AWS Generative AI Innovation Center. He specializes in computer vision (CV) and visual-language models (VLMs). Recently, he has developed a strong interest in generative AI technologies and has already collaborated with customers to apply these cutting-edge technologies to their business. He is also a reviewer for AI conferences such as ICCV and AAAI.


Peiyang He
is a senior data scientist at the AWS Generative AI Innovation Center. She works with customers across a diverse spectrum of industries to solve their most pressing and innovative business needs leveraging GenAI/ML solutions. In her spare time, she enjoys skiing and traveling.


Xuefeng Liu
leads a science team at the AWS Generative AI Innovation Center in the Asia Pacific and Greater China regions. His team partners with AWS customers on generative AI projects, with the goal of accelerating customers’ adoption of generative AI.


Tianjun Xiao
is a senior applied scientist at the AWS AI Shanghai Lablet, co-leading the computer vision efforts. Presently, his primary focus lies in the realms of multimodal foundation models and object-centric learning. He is actively investigating their potential in diverse applications, including video analysis, 3D vision and autonomous driving.


Zhang Dai
is an AWS senior solutions architect for the China Geo Business Sector. He helps companies of various sizes achieve their business goals by providing consultancy on business processes, user experience, and cloud technology. He is a prolific blog writer and also the author of two books: The Modern Autodidact and Designing Experience.


Jianyu Zeng
is a senior customer solutions manager at AWS, whose responsibility is to support customers, such as New Hope group, during their cloud transition and assist them in realizing business value through cloud-based technology solutions. With a strong interest in artificial intelligence, he is constantly exploring ways to leverage AI to drive innovative changes in our customer’s businesses.


Carol Tong Min
is a senior business development manager, responsible for Key Accounts in GCR GEO West, including two important enterprise customers: Jiannanchun Group and New Hope Group. She is customer obsessed, and always passionate about supporting and accelerating customers’ cloud journey.

Nick Jiang is a senior specialist seller on the AI/ML SSO team in China. He focuses on bringing innovative AI/ML solutions to customers and helping them build AI-related workloads on AWS.

Read More

Personalize your search results with Amazon Personalize and Amazon OpenSearch Service integration

Amazon Personalize has launched a new integration with Amazon OpenSearch Service that enables you to personalize search results for each user and assists in predicting their search needs. The Amazon Personalize Search Ranking plugin within OpenSearch Service allows you to improve the end-user engagement and conversion from your website and app search by taking advantage of the deep learning capabilities offered by Amazon Personalize. This feature is also available with self-managed OpenSearch.

Search is crucial in engaging users because it brings high-intent traffic from individuals seeking specific products or categories. Previously, customers found it challenging to capitalize on this traffic and provide relevant search results to their users due to infrastructure limitations or lack of ML expertise. This led to increased instances of users failing to find the items they were searching for. With the Amazon Personalize Search Ranking plugin, customers of OpenSearch Service version 2.9.0 or later can go beyond the traditional keyword matching approach and boost relevant items in an individual user’s search results based on their interests, context, and past interactions in real time. You can also fine-tune the level of personalization for every search query to ensure flexibility and control over the search experience.

AWS Partners like Cognizant are excited by the personalization possibilities that the Amazon Personalize Search Ranking plugin will unlock for their media and retail customers.

“Amazon Personalize has been proven to be highly impactful for many businesses with its cost-effective and streamlined implementation. With the release of the new Amazon Personalize Search Ranking plugin within Amazon OpenSearch Service, we can now rapidly deploy and implement real-time user personalization to search results. We are highly confident that it will deliver improved customer experience and satisfaction as well as increase conversion and clickthrough rates by two to three times. Personalized search is a differentiator, especially for media and retail platforms. We are really excited to be a launch partner with AWS on this release and are looking forward to helping businesses deliver personalized search solutions powered by Amazon Personalize.”

– Andy Huang, Head of AI/ML at Cognizant Servian.

In this post, we show you how search results get personalized based on the user and how they vary when you adjust the personalization weight. We specify a value closer to zero to place less emphasis on personalization, and specify a value closer to 1 to re-rank search results based on a higher level of personalization.

Example use cases

To explore the impact of this new feature in greater detail, let’s review an example using a dataset from the Retail Demo Store.

First, we use OpenSearch Service to get search results for the search query “Grooming.” When the personalization weight is set to 0.0, no personalization takes place. As shown in the following table, the top five search results from OpenSearch Service show the grooming items with a higher gender affinity towards women (refer to the Gender_Affinity column, where M stands for male and F stands for female).

| Rank | Item_ID | Item_Name | Description | Gender_Affinity |
|------|---------|-----------|-------------|-----------------|
| 1 | 1bcb66c4-ee9d-4c0c-ba53-168cb243569f | Women’s Grooming Kit | A must-have in every bathroom | F |
| 2 | f91ec34f-a08e-4408-8bb0-592bdd09375c | Besto Hairbrush for Women | Soft brush for everyday use | F |
| 3 | 4296626c-fbb0-42b4-9a50-b6c6c16095f3 | Makeup Brush Kit | This nifty makeup brush kit is essential in ev… | F |
| 4 | 09920b2e-4e07-41f7-aca6-47744777a2a7 | Trendy Razor | A must-have in every bathroom | F |
| 5 | 39945ad0-57c9-4c28-a69c-532d5d167202 | Makeup Brushes | Makeup brushes for every bathroom | F |
| 6 | 1bfbe5c7-6f02-4465-82f1-6083a4b302c0 | Premium Men’s Razor | Razor for every bathroom | M |
| 7 | 6d5b3f03-ade6-42f7-969d-acd1f2162332 | 5-Blade Razor for Men | Razor for every bathroom | M |
| 8 | 83095a08-2968-4275-a375-4fab404df7ac | Fusion5 Razers for Men | Razor for every bathroom | M |
| 9 | afdd9c41-2762-45bf-b6a7-e3fb8f1b34ba | Minimalistic Razor | A must-have in every bathroom | M |
| 10 | 5dbc7cb7-39c5-4795-9064-d1655d78b3ca | Razor Brand for Men | Razor for every bathroom | M |

Let’s suppose that a user with gender M (male) performs a search using the same query for “Grooming.” When the personalization weight is set to 0.3, the items with a gender affinity towards men get a subtle boost in ranking. In this example, Premium Men’s Razor, which was originally ranked number 6 in the previous table by OpenSearch Service, gets boosted to rank 2 in the updated table. Similarly, Razor Brand for Men shows up higher in position (rank 6) despite being the lowest-ranked item in the previous table.

| Rank | Item_ID | Item_Name | Description | Gender_Affinity |
|------|---------|-----------|-------------|-----------------|
| 1 | 1bcb66c4-ee9d-4c0c-ba53-168cb243569f | Women’s Grooming Kit | A must-have in every bathroom | F |
| 2 | 1bfbe5c7-6f02-4465-82f1-6083a4b302c0 | Premium Men’s Razor | Razor for every bathroom | M |
| 3 | f91ec34f-a08e-4408-8bb0-592bdd09375c | Besto Hairbrush for Women | Soft brush for everyday use | F |
| 4 | 4296626c-fbb0-42b4-9a50-b6c6c16095f3 | Makeup Brush Kit | This nifty makeup brush kit is essential in ev… | F |
| 5 | 09920b2e-4e07-41f7-aca6-47744777a2a7 | Trendy Razor | A must-have in every bathroom | F |
| 6 | 5dbc7cb7-39c5-4795-9064-d1655d78b3ca | Razor Brand for Men | Razor for every bathroom | M |
| 7 | 39945ad0-57c9-4c28-a69c-532d5d167202 | Makeup Brushes | Makeup brushes for every bathroom | F |
| 8 | afdd9c41-2762-45bf-b6a7-e3fb8f1b34ba | Minimalistic Razor | A must-have in every bathroom | M |
| 9 | 83095a08-2968-4275-a375-4fab404df7ac | Fusion5 Razers for Men | Razor for every bathroom | M |
| 10 | 6d5b3f03-ade6-42f7-969d-acd1f2162332 | 5-Blade Razor for Men | Razor for every bathroom | M |

Next, we fine-tune the personalization weight to a value of 0.8 to get more personalized search results for “Grooming.” In the following table, the top four items in the search results are highly suited for men. Premium Men’s Razor and Razor Brand for Men shoot up further in rank. We also see other grooming items such as Minimalistic Razor and Fusion5 Razers for Men surfaced at the top of the search results even though they had a lower ranking in our first query.

| Rank | Item_ID | Item_Name | Description | Gender_Affinity |
|------|---------|-----------|-------------|-----------------|
| 1 | 1bfbe5c7-6f02-4465-82f1-6083a4b302c0 | Premium Men’s Razor | Razor for every bathroom | M |
| 2 | 5dbc7cb7-39c5-4795-9064-d1655d78b3ca | Razor Brand for Men | Razor for every bathroom | M |
| 3 | afdd9c41-2762-45bf-b6a7-e3fb8f1b34ba | Minimalistic Razor | A must-have in every bathroom | M |
| 4 | 83095a08-2968-4275-a375-4fab404df7ac | Fusion5 Razers for Men | Razor for every bathroom | M |
| 5 | 1bcb66c4-ee9d-4c0c-ba53-168cb243569f | Women’s Grooming Kit | A must-have in every bathroom | F |
| 6 | f91ec34f-a08e-4408-8bb0-592bdd09375c | Besto Hairbrush for Women | Soft brush for everyday use | F |
| 7 | 6d5b3f03-ade6-42f7-969d-acd1f2162332 | 5-Blade Razor for Men | Razor for every bathroom | M |
| 8 | 09920b2e-4e07-41f7-aca6-47744777a2a7 | Trendy Razor | A must-have in every bathroom | F |
| 9 | 39945ad0-57c9-4c28-a69c-532d5d167202 | Makeup Brushes | Makeup brushes for every bathroom | F |
| 10 | 4296626c-fbb0-42b4-9a50-b6c6c16095f3 | Makeup Brush Kit | This nifty makeup brush kit is essential in ev… | F |

For more details on how to implement personalized search with OpenSearch Service, refer to Personalizing search results from OpenSearch.
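To make the weight parameter concrete, here is a hedged sketch of creating a search pipeline with the plugin’s re-ranking processor and applying it to a query. The processor and field names below are assumptions about the plugin’s configuration and may differ, so confirm them against the documentation linked above.

```python
# Illustrative only: apply Amazon Personalize re-ranking to OpenSearch results.
from opensearchpy import OpenSearch

client = OpenSearch(hosts=["https://example-domain.us-east-1.es.amazonaws.com:443"])

pipeline = {
    "response_processors": [
        {
            "personalized_search_ranking": {  # assumed processor name
                "campaign_arn": "arn:aws:personalize:us-east-1:111122223333:campaign/example",
                "item_id_field": "Item_ID",
                "recipe": "aws-personalized-ranking",
                "weight": 0.3,  # 0.0 = no personalization, 1.0 = fully personalized ranking
                "iam_role_arn": "arn:aws:iam::111122223333:role/opensearch-personalize-access",
                "aws_region": "us-east-1",
            }
        }
    ]
}
client.transport.perform_request("PUT", "/_search/pipeline/personalized", body=pipeline)

results = client.transport.perform_request(
    "GET",
    "/products/_search",
    params={"search_pipeline": "personalized"},
    body={"query": {"match": {"Item_Name": "Grooming"}}},
)
```

In practice, the user for whom results are re-ranked is also supplied with each query (the plugin documentation describes how to pass the user ID), which is how the preceding examples produce different rankings for different users.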

Conclusion

With the new Amazon Personalize Search Ranking plugin, customers of both self-managed OpenSearch and OpenSearch Service v2.9 and above can boost relevant items in their search results by including signals from each user’s history, context, and preferences. The plugin enables you to exercise greater control over the level of personalization for each user and query type, and improve the overall search experience for your users.

For more details on Amazon Personalize, refer to the Amazon Personalize Developer Guide.


About the Authors


Shreeya Sharma
is a Sr. Technical Product Manager working with AWS AI/ML on the Amazon Personalize team. She has a background in computer science engineering, technology consulting, and data analytics.

Ketan Kulkarni is a Software Development Engineer with the Amazon Personalize team focused on building AI-powered recommender systems at scale. In his spare time, he enjoys reading and traveling.

Prashant Mishra is a Software Development Engineer on the Amazon Personalize team.

Branislav Kveton is a Principal Scientist at AWS AI Labs. He proposes, analyzes, and applies algorithms that learn incrementally, run in real time, and converge to near optimal solutions as the number of observations increases.

Read More

People of AI: Season 2

Posted by Ashley Oldacre

If you are joining us for the first time, you can binge listen to our amazing 8 episodes from Season 1 wherever you get your podcasts.

We are back for another season of People of AI with a new lineup of incredible guests! I am so excited to introduce my new co-host Luiz Gustavo Martins as we meet inspiring people with interesting stories in the field of Artificial Intelligence.

Last season we focused on the incredible journeys that our guests took to get into the field of AI. Through our stories, we highlighted that no matter who you are, what your interests are, or what you work on, there is a place for anyone to get into this field. We also explored how much more accessible the technology has become over the years, as well as the importance of building AI-related products responsibly and ethically. It is easier than ever to use tools, platforms and services powered by machine learning to leverage the benefits of AI, and break down the barrier of entry.

For season 2, we will feature amazing conversations, focusing on Generative AI! Specifically, we will be discussing the explosive growth of Generative AI tools and the major technology shift that has happened in recent months. We will dive into various topics to explore areas where Generative AI can contribute tremendous value, as well as boost both productivity and economic growth. We will also continue to explore the personal paths and career development of this season’s guests as they share how their interest in technology was sparked, how they worked hard to get to where they are today, and explore what it is that they are currently working on.

Starting today, we will release one new episode of season 2 per week. Listen to the first episode on the People of AI site or wherever you get your podcasts. And stay tuned for later in the season when we premiere our first video podcasts as well!

  • Episode 1: meet your hosts, Ashley and Gus and learn about Generative AI, Bard and the big shift that has dramatically changed the industry. 
  • Episode 2: meet Sunita Verma, a long-time Googler, as she shares her personal journey from Engineering to CS, and into Google. As an early pioneer of AI and Google Ads, we will talk about the evolution of AI and how Generative AI will transform the way we work. 
  • Episode 3: meet Sayak Paul, a Google Developer Expert (GDE) as we explore what it means to be a GDE and how to leverage the power of your community through community contributions. 
  • Episode 4: meet Crispin Velez, the lead for Cloud’s Vertex AI as we dig into his experience in Cloud working with customers and partners on how to integrate and deploy AI. We also learn how he grew his AI developer community in LATAM from scratch. 
  • Episode 5: meet Joyce Shen, venture capital/private equity investor. She shares her fascinating career in AI and how she has worked with businesses to spot AI talent, incorporate AI technology into workflows and implement responsible AI into their products. 
  • Episode 6: meet Anne Simonds and Brian Gary, founders of Muse https://www.museml.com. Join us as we talk about their recent journeys into AI and their new company which uses the power of Generative AI to spark creativity. 
  • Episode 7: meet Tulsee Doshi, product lead for Google’s Responsible AI efforts as we discuss the development of Google-wide resources and best practices for developing more inclusive, diverse, and ethical algorithm driven products. 
  • Episode 8: meet Jeanine Banks, Vice President and General Manager of Google Developer X and Head of Developer Relations. Join us as we debunk AI and get down to what Generative AI really is, how it has changed over the past few months and will continue to change the developer landscape. 
  • Episode 9: meet Simon Tokumine, Director of Product Management at Google. We will talk about how AI has brought us into the era of task-oriented products and is fueling a new community of makers.

Listen now to the first episode of Season 2. We can’t wait to share the stories of these exceptional People of AI with you!

This podcast is sponsored by Google. Any remarks made by the speakers are their own and are not endorsed by Google.

Read More

Goal Representations for Instruction Following

A longstanding goal of the field of robot learning has been to create generalist agents that can perform tasks for humans. Natural language has the potential to be an easy-to-use interface for humans to specify arbitrary tasks, but it is difficult to train robots to follow language instructions. Approaches like language-conditioned behavioral cloning (LCBC) train policies to directly imitate expert actions conditioned on language, but require humans to annotate all training trajectories and generalize poorly across scenes and behaviors. Meanwhile, recent goal-conditioned approaches perform much better at general manipulation tasks, but do not enable easy task specification for human operators. How can we reconcile the ease of specifying tasks through LCBC-like approaches with the performance improvements of goal-conditioned learning?

Striking Performance: Large Language Models up to 4x Faster on RTX With TensorRT-LLM for Windows

Generative AI is one of the most important trends in the history of personal computing, bringing advancements to gaming, creativity, video, productivity, development and more.

And GeForce RTX and NVIDIA RTX GPUs, which are packed with dedicated AI processors called Tensor Cores, are bringing the power of generative AI natively to more than 100 million Windows PCs and workstations.

Today, generative AI on PC is getting up to 4x faster via TensorRT-LLM for Windows, an open-source library that accelerates inference performance for the latest AI large language models, like Llama 2 and Code Llama. This follows the announcement of TensorRT-LLM for data centers last month.

NVIDIA has also released tools to help developers accelerate their LLMs, including scripts that optimize custom models with TensorRT-LLM, TensorRT-optimized open-source models and a developer reference project that showcases both the speed and quality of LLM responses.

TensorRT acceleration is now available for Stable Diffusion in the popular Web UI by Automatic1111 distribution. It speeds up the generative AI diffusion model by up to 2x over the previous fastest implementation.

Plus, RTX Video Super Resolution (VSR) version 1.5 is available as part of today’s Game Ready Driver release — and will be available in the next NVIDIA Studio Driver, releasing early next month.

Supercharging LLMs With TensorRT

LLMs are fueling productivity — engaging in chat, summarizing documents and web content, drafting emails and blogs — and are at the core of new pipelines of AI and other software that can automatically analyze data and generate a vast array of content.

TensorRT-LLM, a library for accelerating LLM inference, gives developers and end users the benefit of LLMs that can now operate up to 4x faster on RTX-powered Windows PCs.

At higher batch sizes, this acceleration significantly improves the experience for more sophisticated LLM use — like writing and coding assistants that output multiple, unique auto-complete results at once. The result is accelerated performance and improved quality that lets users select the best of the bunch.

TensorRT-LLM acceleration is also beneficial when integrating LLM capabilities with other technology, such as in retrieval-augmented generation (RAG), where an LLM is paired with a vector library or vector database. RAG enables the LLM to deliver responses based on a specific dataset, like user emails or articles on a website, to provide more targeted answers.

To show this in practical terms, when the question “How does NVIDIA ACE generate emotional responses?” was asked of the LLaMa 2 base model, it returned an unhelpful response.

Better responses, faster.

Conversely, using RAG with recent GeForce news articles loaded into a vector library and connected to the same Llama 2 model not only returned the correct answer — using NeMo SteerLM — but did so much quicker with TensorRT-LLM acceleration. This combination of speed and proficiency gives users smarter solutions.

TensorRT-LLM will soon be available to download from the NVIDIA Developer website. TensorRT-optimized open source models and the RAG demo with GeForce news as a sample project are available at ngc.nvidia.com and GitHub.com/NVIDIA.

Automatic Acceleration

Diffusion models, like Stable Diffusion, are used to imagine and create stunning, novel works of art. Image generation is an iterative process that can take hundreds of cycles to achieve the perfect output. When done on an underpowered computer, this iteration can add up to hours of wait time.

TensorRT is designed to accelerate AI models through layer fusion, precision calibration, kernel auto-tuning and other capabilities that significantly boost inference efficiency and speed. This makes it indispensable for real-time applications and resource-intensive tasks.

And now, TensorRT doubles the speed of Stable Diffusion.

Compatible with the most popular distribution, WebUI from Automatic1111, Stable Diffusion with TensorRT acceleration helps users iterate faster and spend less time waiting on the computer, delivering a final image sooner. On a GeForce RTX 4090, it runs 7x faster than the top implementation on Macs with an Apple M2 Ultra. The extension is available for download today.

The TensorRT demo of a Stable Diffusion pipeline provides developers with a reference implementation on how to prepare diffusion models and accelerate them using TensorRT. This is the starting point for developers interested in turbocharging a diffusion pipeline and bringing lightning-fast inferencing to applications.
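For orientation, the standard TensorRT Python builder flow looks roughly like the sketch below. This is a generic illustration rather than the demo code itself; unet.onnx and unet.plan are placeholder file names, and FP16 is used as an example precision.

```python
# Generic TensorRT builder flow: parse an ONNX model (for example, an exported
# diffusion UNet) and build an FP16 engine. This is a sketch of the standard
# TensorRT Python API, not the NVIDIA demo code; file names are placeholders.
import tensorrt as trt

logger = trt.Logger(trt.Logger.INFO)
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
)
parser = trt.OnnxParser(network, logger)

with open("unet.onnx", "rb") as f:
    if not parser.parse(f.read()):
        for i in range(parser.num_errors):
            print(parser.get_error(i))
        raise SystemExit("ONNX parsing failed")

config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.FP16)  # enable FP16 kernels where supported

# Layer fusion and kernel auto-tuning happen inside the build step.
serialized_engine = builder.build_serialized_network(network, config)
with open("unet.plan", "wb") as f:
    f.write(serialized_engine)
```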

Video That’s Super

AI is improving everyday PC experiences for all users. Streaming video — from nearly any source, like YouTube, Twitch, Prime Video, Disney+ and countless others — is among the most popular activities on a PC. Thanks to AI and RTX, it’s getting another update in image quality.

RTX VSR is a breakthrough in AI pixel processing that improves the quality of streamed video content by reducing or eliminating artifacts caused by video compression. It also sharpens edges and details.

Available now, RTX VSR version 1.5 further improves visual quality with updated models, de-artifacts content played at its native resolution and adds support for RTX GPUs based on the NVIDIA Turing architecture, including both professional RTX and GeForce RTX 20 Series GPUs.

Retraining the VSR AI model helped it learn to accurately identify the difference between subtle details and compression artifacts. As a result, AI-enhanced images more accurately preserve details during the upscaling process. Finer details are more visible, and the overall image looks sharper and crisper.

RTX Video Super Resolution v1.5 improves detail and sharpness.

New with version 1.5 is the ability to de-artifact video played at the display’s native resolution. The original release only enhanced video when it was being upscaled. Now, for example, 1080p video streamed to a 1080p resolution display will look smoother as heavy artifacts are reduced.

RTX VSR now de-artifacts video played at its native resolution.

RTX VSR 1.5 is available today for all RTX users in the latest Game Ready Driver. It will be available in the upcoming NVIDIA Studio Driver, scheduled for early next month.

RTX VSR is among the NVIDIA software, tools, libraries and SDKs — like those mentioned above, plus DLSS, Omniverse, AI Workbench and others — that have helped bring over 400 AI-enabled apps and games to consumers.

The AI era is upon us. And RTX is supercharging it at every step of its evolution.

NVIDIA RTX Video Super Resolution Update Enhances Video Quality, Detail Preservation and Expands to GeForce RTX 20 Series GPUs

NVIDIA today announced an update to RTX Video Super Resolution (VSR) that delivers greater overall graphical fidelity with preserved details, artifact reduction for videos played at native resolution, and support for GeForce RTX 20 Series desktop and laptop GPUs.

For AI assists from RTX VSR and more — from enhanced creativity and productivity to blisteringly fast gaming — check out the RTX for AI page.

Plus, this week In the NVIDIA Studio, Twitch personality Runebee shares her inspiration, streaming tips and how she uses AI and RTX GPU acceleration.

And don’t forget to join the #SeasonalArtChallenge by submitting spooky Halloween-themed art in October and harvest- and fall-themed pieces in November. For inspiration, check out the hauntingly adorable work of artists like iryna.blender3d on Twitter.

The Super RTX VSR Update 1.5

RTX VSR’s AI model has been retrained to more accurately identify the difference between subtle details and compression artifacts to better preserve image details during the upscaling process. Finer details are more visible, and the overall image looks sharper and crisper than before.

RTX VSR v1.5 improves detail and sharpness.

RTX VSR version 1.5 will also de-artifact videos played at their native resolution — prior, only upscaled video could be enhanced. Providing a leap in graphical fidelity for laptop owners with 1080p screens, the updated RTX VSR makes 1080p resolution, which is popular for content and displays, look smoother at its native resolution, even with heavy artifacts.

RTX VSR now de-artifacts video played at native resolution.

And with expanded RTX VSR support, owners of GeForce RTX 20 Series GPUs can benefit from the same AI-enhanced video as those using RTX 30 and 40 Series GPUs.

RTX VSR 1.5 is available as part of the latest Game Ready Driver, available for download today. Content creators downloading NVIDIA Studio Drivers — designed to enhance features, reduce repetitiveness and dramatically accelerate creative workflows — can install the driver with RTX VSR when it releases in early November.

Runebee-lievable Streaming

Runebee has been livestreaming for over 10 years, providing a space for viewers to hang out and talk about games, movies or whatever else is going on in life. Over the years, she’s realized how common a desire for escapism is.

“Things aren’t always sunshine and rainbows, so it’s nice to have some company that can help take your mind off things,” said Runebee.

Runebee has amassed over 100K followers on Twitch, YouTube and Instagram, crediting her success to thorough preparation of her setup. Her technology-forward approach ensures efficiency and reliability — allowing her focus to be on performance.

“There’s a lot of planning involved in streaming, but at the end of the day, hitting the ‘start streaming’ button is the most important step, and NVIDIA GPU-acceleration is a massive factor in allowing it to go as smoothly as it does,” said Runebee.

“I never thought I’d have this smooth of a stream just by upgrading to a GeForce RTX 40 Series GPU.” – Runebee

OBS is Runebee’s preferred open-source software for video recording and livestreaming on Twitch. For maximum efficiency, Runebee deploys her GeForce RTX 4080 GPU, taking advantage of the eighth-generation NVIDIA encoder, NVENC, to handle video encoding on dedicated hardware, which frees up the rest of the GPU and CPU for the game and the livestream.

“Streaming games and running OBS used to kill my CPU, and NVENC has taken so much stress off,” said Runebee. “I was hardly even able to stream PC games until I switched to NVENC.”

For livestreamers, RTX 40 Series GPUs offer real-time AV1 hardware encoding, providing a 40% efficiency boost compared to H.264 and delivering higher quality than competing GPUs.
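As a hedged aside, AV1 hardware encoding is also reachable from common tools such as FFmpeg. The snippet below assumes an FFmpeg build that includes the av1_nvenc encoder and an RTX 40 Series GPU; the file names and bitrate are placeholders chosen for illustration.

```python
# Illustrative only: transcode a recording with NVENC AV1 via FFmpeg.
# Assumes an FFmpeg build with av1_nvenc support and an RTX 40 Series GPU;
# input/output file names and bitrate are placeholders.
import subprocess

subprocess.run(
    [
        "ffmpeg",
        "-i", "stream_recording.mp4",   # source recording
        "-c:v", "av1_nvenc",            # hardware AV1 encoder on RTX 40 Series
        "-b:v", "8M",                   # target bitrate; AV1 holds quality at lower rates than H.264
        "-c:a", "copy",                 # leave audio untouched
        "stream_recording_av1.mp4",
    ],
    check=True,
)
```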

“As I started building more PCs with NVIDIA GPUs, I never had a reason to switch!” – Runebee

Runebee can export recordings of her livestreams with Adobe Premiere Pro in half the normally required time thanks to GeForce RTX 40 Series dual encoders working together, dividing the work evenly to double output.

The dual encoders are capable of recording up to 8K, 60-frames-per-second content in real time via GeForce Experience and OBS Studio.

Always looking to improve her livestreaming process, Runebee plans on experimenting with the NVIDIA Broadcast app, which transforms any room into a home studio by upgrading standard webcams, microphones and speakers into premium smart devices using the power of AI.

Runebee encourages those interested in livestreaming to at least give their potential passion project a shot. “It’s a great way to meet tons of new friends, become more articulate at describing the things you love — be it games or movies — and cultivate a community to share your passions with,” she said.

Twitch livestreamer Runebee’s setup.

Follow Runebee on Twitch.

Follow NVIDIA Studio on Instagram, Twitter and Facebook. Access tutorials on the Studio YouTube channel and get updates directly in your inbox by subscribing to the Studio newsletter. See notice regarding software product information.
