US Healthcare System Deploys AI Agents, From Research to Rounds


The U.S. healthcare system is adopting digital health agents to harness AI across the board, from research laboratories to clinical settings.

The latest AI-accelerated tools — on display at the NVIDIA AI Summit taking place this week in Washington, D.C. — include NVIDIA NIM, a collection of cloud-native microservices that support AI model deployment and execution, and NVIDIA NIM Agent Blueprints, a catalog of pretrained, customizable workflows. 

These technologies are already in use in the public sector to advance the analysis of medical images, aid the search for new therapeutics and extract information from massive PDF databases containing text, tables and graphs. 

For example, researchers at the National Cancer Institute, part of the National Institutes of Health (NIH), are using several AI models built with NVIDIA MONAI for medical imaging — including the VISTA-3D NIM foundation model for segmenting and annotating 3D CT images. A team at NIH’s National Center for Advancing Translational Sciences (NCATS) is using the NIM Agent Blueprint for generative AI-based virtual screening to reduce the time and cost of developing novel drug molecules.

With NVIDIA NIM and NIM Agent Blueprints, medical researchers across the public sector can jump-start their adoption of state-of-the-art, optimized AI models to accelerate their work. The pretrained models are customizable based on an organization’s own data and can be continually refined based on user feedback.

NIM microservices and NIM Agent Blueprints are available at ai.nvidia.com and accessible through a wide variety of cloud service providers, global system integrators and technology solutions providers. 

Building With NIM Agent Blueprints

Dozens of NIM microservices and a growing set of NIM Agent Blueprints are available for developers to experience and download for free. They can be deployed in production with the NVIDIA AI Enterprise software platform.

  • The blueprint for generative virtual screening for drug discovery brings together three NIM microservices to help researchers search and optimize libraries of small molecules to identify promising candidates that bind to a target protein.
  • The multimodal PDF data extraction blueprint uses NVIDIA NeMo Retriever NIM microservices to extract insights from enterprise documents, helping developers build powerful AI agents and chatbots.
  • The digital human blueprint supports the creation of interactive, AI-powered avatars for customer service. These avatars have potential applications in telehealth and nonclinical aspects of patient care, such as scheduling appointments, filling out intake forms and managing prescriptions.
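Hosted NIM microservices are invoked over HTTP, and the language-model endpoints follow an OpenAI-compatible chat schema. As a minimal sketch of what a request looks like — the endpoint URL and model id below are placeholders, not guaranteed endpoints; consult ai.nvidia.com for the actual catalog and schemas:

```python
import json

# Placeholder endpoint and model id -- check ai.nvidia.com for the real ones.
NIM_CHAT_URL = "https://integrate.api.nvidia.com/v1/chat/completions"

def build_nim_request(prompt: str, model: str = "example/placeholder-model") -> dict:
    """Assemble an OpenAI-compatible chat request body for a NIM microservice."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.2,
        "max_tokens": 256,
    }

body = build_nim_request("Summarize the intake form fields a scheduling avatar needs.")
print(json.dumps(body, indent=2))
```

In practice this body would be POSTed with an API key; because the schema is OpenAI-compatible, existing client libraries and chatbot frameworks can usually point at a NIM endpoint with only a URL and model-name change.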

Two new NIM microservices for drug discovery are now available on ai.nvidia.com to help researchers understand how proteins bind to target molecules, a crucial step in drug design. By conducting more of this preclinical research digitally, scientists can narrow down their pool of drug candidates before testing in the lab — making the discovery process more efficient and less expensive. 

With the AlphaFold2-Multimer NIM microservice, researchers can accurately predict protein structures from their sequences in minutes, reducing the need for time-consuming tests in the lab. The RFdiffusion NIM microservice uses generative AI to design novel proteins that are promising drug candidates because they’re likely to bind with a target molecule. 

NCATS Accelerates Drug Discovery Research

ASPIRE, a research laboratory at NCATS, is evaluating the NIM Agent Blueprint for virtual screening and is using RAPIDS, a suite of open-source software libraries for GPU-accelerated data science, to accelerate its drug discovery research. Using the cuGraph library for graph data analytics and cuDF library for accelerating data frames, the lab’s researchers can map chemical reactions across the vast unknown chemical space. 

The NCATS informatics team reported that with NVIDIA AI, processes that used to take hours on CPU-based infrastructure are now done in seconds.
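Part of what makes that speedup practical is that cuDF mirrors the pandas API, so moving a CPU data-frame workload onto GPUs is often a one-line import change. A minimal sketch of the pattern, with invented reaction data and a CPU fallback where no GPU is available:

```python
try:
    import cudf as xdf  # GPU-accelerated DataFrame library from RAPIDS
except ImportError:
    import pandas as xdf  # same API on CPU, so the code below is unchanged

# Toy reaction table; the columns and values are invented for illustration.
reactions = xdf.DataFrame({
    "reactant": ["A", "A", "B", "C"],
    "product": ["B", "C", "D", "D"],
    "yield_pct": [62.0, 18.5, 91.2, 40.0],
})

# Group-by aggregations like this are the kind of operation cuDF
# parallelizes across all available GPU cores.
best = reactions.groupby("reactant")["yield_pct"].max().reset_index()
print(best)
```

Because the two libraries share an API, teams can prototype on laptops with pandas and deploy the identical code on GPU infrastructure with cuDF.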

Massive quantities of healthcare data — including research papers, radiology reports and patient records — are unstructured and locked in PDF documents, making it difficult for researchers to quickly search for information. 

The Genetic and Rare Diseases Information Center, also run by NCATS, is exploring using the PDF data extraction blueprint to develop generative AI tools that enhance the center’s ability to glean information from previously unsearchable databases. These tools will help answer questions from those affected by rare diseases.

“The center analyzes data sources spanning the National Library of Medicine, the Orphanet database and other institutes and centers within the NIH to answer patient questions,” said Sam Michael, chief information officer of NCATS. “AI-powered PDF data extraction can make it massively easier to extract valuable information from previously unsearchable databases.”  
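The blueprint pairs NeMo Retriever embedding models with a vector database; as a deliberately simplified, language-level illustration of the retrieval step (not the blueprint's actual API), a toy term-overlap scorer over extracted document chunks looks like this:

```python
from collections import Counter

def overlap_score(query: str, chunk: str) -> int:
    """Count shared terms between query and chunk -- a toy stand-in for the
    embedding similarity a NeMo Retriever pipeline would compute."""
    q, c = Counter(query.lower().split()), Counter(chunk.lower().split())
    return sum(min(q[t], c[t]) for t in q)

# Chunks as they might come out of the PDF-extraction step (invented text).
chunks = [
    "Orphanet lists estimated prevalence figures for each rare disease",
    "Appointments can be scheduled through the patient portal",
]
query = "rare disease prevalence"
best_chunk = max(chunks, key=lambda ch: overlap_score(query, ch))
print(best_chunk)
```

A production pipeline replaces the scorer with dense embeddings and approximate nearest-neighbor search, then hands the retrieved chunks to a language model to compose the answer.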

Mi-NIM-al Effort, Maximum Benefit: Getting Started With NIM 

A growing number of startups, cloud service providers and global systems integrators include NVIDIA NIM microservices and NIM Agent Blueprints as part of their platforms and services, making it easy for federal healthcare researchers to get started.   

Abridge, an NVIDIA Inception startup and NVentures portfolio company, was recently awarded a contract from the U.S. Department of Veterans Affairs to help transcribe and summarize clinical appointments, reducing the burden on doctors to document each patient interaction.

The company uses NVIDIA TensorRT-LLM to accelerate AI inference and NVIDIA Triton Inference Server for deploying its audio-to-text and content summarization models at scale, some of the same technologies that power NIM microservices.

The NIM Agent Blueprint for virtual screening is now available through AWS HealthOmics, a purpose-built service that helps customers orchestrate biological data analyses. 

Amazon Web Services (AWS) is a partner of the NIH Science and Technology Research Infrastructure for Discovery, Experimentation, and Sustainability (STRIDES) Initiative, which aims to modernize the biomedical research ecosystem by reducing economic and process barriers to accessing commercial cloud services. NVIDIA and AWS are collaborating to make NIM Agent Blueprints broadly accessible to the biomedical research community. 

ConcertAI, another NVIDIA Inception member, is an oncology AI technology company focused on research and clinical standard-of-care solutions. The company is integrating NIM microservices, NVIDIA CUDA-X microservices and the NVIDIA NeMo platform into its suite of AI solutions for large-scale clinical data processing, multi-agent models and clinical foundation models. 

NVIDIA NIM microservices are supporting ConcertAI’s high-performance, low-latency AI models through its CARA AI platform. Use cases include clinical trial design, optimization and patient matching — as well as solutions that can help boost the standard of care and augment clinical decision-making.

Global systems integrator Deloitte is bringing the NIM Agent Blueprint for virtual screening to its customers worldwide. With Deloitte Atlas AI, the company can help clients at federal health agencies easily use NIM to adopt and deploy the latest generative AI pipelines for drug discovery. 

Experience NVIDIA NIM microservices and NIM Agent Blueprints today.

NVIDIA AI Summit Highlights Healthcare Innovation

At the NVIDIA AI Summit in Washington, NVIDIA leaders, customers and partners are presenting over 50 sessions highlighting impactful work in the public sector. 

Register for a free virtual pass to hear in the summit sessions how healthcare researchers are accelerating innovation with NVIDIA-powered AI. 

Watch the AI Summit special address by Bob Pette, vice president of enterprise platforms at NVIDIA.




Accelerated Computing Key to Yale’s Quantum Research


A recently released joint research paper by Yale, Moderna and NVIDIA reviews how techniques from quantum machine learning (QML) may enhance drug discovery methods by better predicting molecular properties.

Ultimately, this could lead to the more efficient generation of new pharmaceutical therapies.

The review also emphasizes that the key tool for exploring these methods is GPU-accelerated simulation of quantum algorithms.

The study focuses on how future quantum neural networks can use quantum computing to enhance existing AI techniques.

Applied to the pharmaceutical industry, these advances offer researchers the ability to streamline complex tasks in drug discovery.

Researching how such quantum neural networks impact real-world use cases like drug discovery requires intensive, large-scale simulations of future noiseless quantum processing units (QPUs).

This is just one example of how, as quantum computing scales up, an increasing number of challenges are only approachable with GPU-accelerated supercomputing.

The review article explores how NVIDIA’s CUDA-Q quantum development platform provides a unique tool for running such multi-GPU accelerated simulations of QML workloads.

The study also highlights CUDA-Q’s ability to simulate multiple QPUs in parallel. This is a key ability for studying realistic large-scale devices, which, in this particular study, also allowed for the exploration of quantum machine learning tasks that batch training data.

Many of the QML techniques covered by the review — such as hybrid quantum convolutional neural networks — also require CUDA-Q’s ability to write programs interweaving classical and quantum resources.

The increased reliance on GPU supercomputing demonstrated in this work is the latest example of NVIDIA’s growing involvement in developing useful quantum computers.

NVIDIA plans to further highlight its role in the future of quantum computing at the SC24 conference, Nov. 17-22 in Atlanta.


A Not-So-Secret Agent: NVIDIA Unveils NIM Blueprint for Cybersecurity


Artificial intelligence is transforming cybersecurity with new generative AI tools and capabilities that were once the stuff of science fiction. And like many of the heroes in science fiction, they’re arriving just in time.

AI-enhanced cybersecurity can detect and respond to potential threats in real time — often before human analysts even become aware of them. It can analyze vast amounts of data to identify patterns and anomalies that might indicate a breach. And AI agents can automate routine security tasks, freeing up human experts to focus on more complex challenges.

All of these capabilities start with software, so NVIDIA has introduced an NVIDIA NIM Agent Blueprint for container security that developers can adapt to meet their own application requirements.

The blueprint uses NVIDIA NIM microservices, the NVIDIA Morpheus cybersecurity AI framework, NVIDIA cuVS and NVIDIA RAPIDS accelerated data analytics to help accelerate analysis of common vulnerabilities and exposures (CVEs) at enterprise scale — from days to just seconds.

All of this is included in NVIDIA AI Enterprise, a cloud-native software platform for developing and deploying secure, supported production AI applications.

Deloitte Secures Software With NVIDIA AI

Deloitte is among the first to incorporate the NVIDIA NIM Agent Blueprint for container security, which supports agentic analysis of open-source software to help enterprises build secure AI, into its cybersecurity solutions. The blueprint can help enterprises enhance and simplify cybersecurity by improving efficiency and reducing the time needed to identify threats and potential adversarial activity.

“Cybersecurity has emerged as a critical pillar in protecting digital infrastructure in the U.S. and around the world,” said Mike Morris, managing director, Deloitte & Touche LLP. “By incorporating NVIDIA’s NIM Agent Blueprint into our cybersecurity solutions, we’re able to offer our clients improved speed and accuracy in identifying and mitigating potential security threats.”

Securing Software With Generative AI

Vulnerability detection and resolution is a top use case for generative AI in software delivery, according to IDC(1).

The NIM Agent Blueprint for container security includes everything an enterprise developer needs to build and deploy customized generative AI applications for rapid vulnerability analysis of software containers.

Software containers incorporate large numbers of packages and releases, some of which may be subject to security vulnerabilities. Traditionally, security analysts would need to review each of these packages to understand potential security exploits across any software deployment.

These manual processes are tedious, time-consuming and error-prone. They’re also difficult to automate effectively because of the complexity of aligning software packages, dependencies, configurations and the operating environment.
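At its core, the automated step matches a container's package inventory against known vulnerabilities. The sketch below is deliberately naive — the CVE entries are invented, and real scanners must also resolve version ranges, backported patches and the runtime environment, which is exactly what makes simple automation error-prone:

```python
# Toy CVE feed mapping package name -> (first fixed version, advisory id).
# Entries are invented; real pipelines ingest feeds such as the NVD.
CVE_FEED = {
    "libexample": ((1, 4, 2), "CVE-XXXX-0001"),
    "fastparse": ((0, 9, 0), "CVE-XXXX-0002"),
}

def audit(installed: dict) -> list:
    """Flag advisories for installed packages older than their fix version."""
    hits = []
    for name, version in installed.items():
        if name in CVE_FEED:
            fixed_in, advisory = CVE_FEED[name]
            if version < fixed_in:  # tuple comparison as a crude stand-in for semver
                hits.append(advisory)
    return hits

container = {"libexample": (1, 3, 0), "fastparse": (1, 0, 0), "other": (2, 0, 0)}
print(audit(container))
```

The generative AI approach described next replaces this brittle exact matching with models that can read advisories, changelogs and configuration in natural language to judge whether a flagged package is actually exploitable in context.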

With generative AI, cybersecurity applications can rapidly digest and decipher information across a wide range of data sources, including natural language, to better understand the context in which potential vulnerabilities could be exploited.

Enterprises can then create cybersecurity AI agents that take action on this generative AI intelligence. The NIM Agent Blueprint for container security enables quick, automatic and actionable CVE risk analysis using large language models and retrieval-augmented generation for agentic AI applications. It helps developers and security teams protect software with AI, improving accuracy and efficiency while streamlining the set of potential issues escalated to human analysts for investigation.

Blueprints for Cybersecurity Success

The new NVIDIA NIM Agent Blueprint for container security includes the NVIDIA Morpheus cybersecurity AI framework to reduce the time and cost associated with identifying, capturing and acting on threats. This brings a new level of security to the data center, cloud and edge.

The GPU-accelerated, end-to-end AI framework enables developers to create optimized applications for filtering, processing and classifying large volumes of streaming cybersecurity data.

Built on NVIDIA RAPIDS software, Morpheus accelerates data processing workloads at enterprise scale. It uses the power of RAPIDS cuDF for fast and efficient data operations, ensuring downstream pipelines harness all available GPU cores for complex agentic AI tasks.

Morpheus also extends human analysts’ capabilities by automating real-time analysis and responses, producing synthetic data to train AI models that identify risks accurately, and running what-if scenarios.

The NVIDIA NIM Agent Blueprint for container security is available now. Learn more in the NVIDIA AI Summit DC special address.

(1) Source: IDC, GenAI Awareness, Readiness, and Commitment: 2024 Outlook — GenAI Plans and Implications for External Services Providers, AI-Ready Infrastructure, AI Platforms, and GenAI Applications US52023824, April 2024


From Concept to Compliance, MITRE Digital Proving Ground Will Accelerate Validation of Autonomous Vehicles


The path to safe, widespread autonomous vehicles is going digital.

MITRE — a government-sponsored nonprofit research organization — today announced its partnership with Mcity at the University of Michigan to develop a virtual and physical autonomous vehicle (AV) validation platform for industry deployment.

As part of this collaboration, announced during the NVIDIA AI Summit in Washington, D.C., MITRE will use Mcity’s simulation tools and a digital twin of its Mcity Test Facility, a real-world AV test environment in its Digital Proving Ground (DPG). The joint platform will deliver physically based sensor simulation enabled by NVIDIA Omniverse Cloud Sensor RTX APIs.

By combining these simulation capabilities with the MITRE DPG reporting framework, developers will be able to perform exhaustive testing in a simulated world to safely validate AVs before real-world deployment.

The current regulatory environment for AVs is highly fragmented, posing significant challenges for widespread deployment. Today, companies navigate a patchwork of city, state and federal regulations without a clear path to large-scale deployment. MITRE and Mcity aim to address this ambiguity with comprehensive validation resources open to the entire industry.

Mcity currently operates a 32-acre mock city for automakers and researchers to test their technology. Mcity is also building a digital framework around its physical proving ground to provide developers with AV data and simulation tools.

Raising Safety Standards

One of the largest gaps in the regulatory framework is the absence of universally accepted safety standards that the industry and regulators can rely on.

The lack of common standards leaves regulators with limited tools to verify AV performance and safety in a repeatable manner, while companies struggle to demonstrate the maturity of their AV technology. The ability to do so is crucial in the wake of public road incidents, where AV developers need to demonstrate the reliability of their software in a way that is acceptable to both industry and regulators.

Efforts like the National Highway Traffic Safety Administration’s New Car Assessment Program (NCAP) have been instrumental in setting benchmarks for vehicle safety in traditional automotive development. However, NCAP is insufficient for AV evaluation, where measures of safety go beyond crash tests to the complexity of real-time decision-making in dynamic environments.

Additionally, traditional road testing presents inherent limitations, as it exposes vehicles to real-world conditions but lacks the scalability needed to prove safety across a wide variety of edge cases. It’s particularly difficult to test rare and dangerous scenarios on public roads without significant risk.

By providing both physical and digital resources to validate AVs, MITRE and Mcity will be able to offer a safe, universally accessible solution that addresses the complexity of verifying autonomy.

Physically Based Sensor Simulation

A core piece of this collaboration is sensor simulation, which models the physics and behavior of cameras, lidars, radars and ultrasonic sensors on a physical vehicle, as well as how these sensors interact with their surroundings.

Sensor simulation enables developers to train against and test rare and dangerous scenarios — such as extreme weather conditions, sudden pedestrian crossings or unpredictable driver behavior — safely in virtual settings.

In collaboration with regulators, AV companies can use sensor simulation to recreate a real-world event, analyze their system’s response and evaluate how their vehicle performed — accelerating the validation process.

Moreover, simulation tests are repeatable, meaning developers can track improvements or regressions in the AV stack over time. This means AV companies can provide quantitative evidence to regulators to show that their system is evolving and addressing safety concerns.
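That repeatability comes from making every source of randomness in a scenario seed-controlled, so the same scenario id always reproduces the same events. A minimal sketch of the idea — this is not the Mcity or Omniverse API, just the underlying pattern:

```python
import random

def generate_scenario(seed: int, n_pedestrians: int = 3) -> list:
    """Deterministically place pedestrians in a 100 m x 100 m test area."""
    rng = random.Random(seed)  # a local RNG avoids hidden global state
    return [(rng.uniform(0, 100), rng.uniform(0, 100)) for _ in range(n_pedestrians)]

# The same seed reproduces the scenario exactly, so a fix in the AV stack
# can be re-tested against the identical situation; a new seed gives a
# fresh variation for coverage.
assert generate_scenario(42) == generate_scenario(42)
assert generate_scenario(42) != generate_scenario(43)
```

Scaling this up, a regression suite becomes a library of seeds and scenario parameters, and "tracking improvements over time" reduces to re-running the same seeds against each new software build.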

Bridging Industry and Regulators

MITRE and its ecosystem are actively developing the Digital Proving Ground platform to facilitate industry-wide standards and regulations.

The platform will be an open and accessible national resource for accelerating safe AV development and deployment, providing a trusted simulation test environment.

Mcity will contribute simulation infrastructure, a digital twin and the ability to seamlessly connect virtual and physical worlds with NVIDIA Omniverse, an open platform enabling system developers to build physical AI and robotic system simulation applications. By integrating this virtual proving ground into DPG, the collaboration will also accelerate the development and use of advanced digital engineering and simulation for AV safety assurance.

Mcity’s simulation tools will connect to Omniverse Cloud Sensor RTX APIs and render a Universal Scene Description (USD) model of Mcity’s physical proving ground. DPG will be able to access this environment, simulate the behavior of vehicles and pedestrians in a realistic test environment and use the DPG reporting framework to explain how the AV performed.

This testing will then be replicated on the physical Mcity proving ground to create a comprehensive feedback loop.

The Road Ahead

As developers, automakers and regulators continue to collaborate, the industry is moving closer to a future where AVs can operate safely and at scale. The establishment of a repeatable testbed for validating safety — across real and simulated environments — will be critical to gaining public trust and regulatory approval, bringing the promise of AVs closer to reality.


SETI Institute Researchers Engage in World’s First Real-Time AI Search for Fast Radio Bursts


This summer, scientists supercharged their tools in the hunt for signs of life beyond Earth.

Researchers at the SETI Institute became the first to apply AI to the real-time direct detection of faint radio signals from space. Their advances in radio astronomy are available for any field that applies accelerated computing and AI.

“We’re on the cusp of a fundamentally different way of analyzing streaming astronomical data, and the kinds of things we’ll be able to discover with it will be quite amazing,” said Andrew Siemion, Bernard M. Oliver Chair for SETI at the SETI Institute, a group formed in 1984 that now includes more than 120 scientists.

The SETI Institute operates the Allen Telescope Array (pictured above) in Northern California. It’s a cutting-edge telescope used in the search for extraterrestrial intelligence (SETI) as well as for the study of intriguing transient astronomical events such as fast radio bursts.

Germinating AI

The seed of the latest project was planted more than a decade ago. Siemion attended a talk at the University of California, Berkeley, about an early version of machine learning, a classifier that analyzed radio signals like the ones his team gathered from deep space.

“I was really impressed, and realized the ways SETI researchers detected signals at the time were rather naive,” said Siemion, who earned his Ph.D. in astrophysics at Berkeley.

The researchers started connecting with radio experts in conferences outside the field of astronomy. There, they met Adam Thompson, who leads a group of developers at NVIDIA.

“We explained our challenges searching the extremely wide bandwidth of signals from space at high data rates,” Siemion said.

SETI Institute researchers had been using NVIDIA GPUs for years to accelerate the algorithms that separate signals from background noise. Now they thought there was potential to do more.

A Demo Leads to a Pilot

It took time — in part due to the coronavirus pandemic — but earlier this year, Thompson showed Siemion’s team a new product, NVIDIA Holoscan, a platform for processing real-time sensor data from scientific instruments.

Siemion’s team decided to build a trial application with Holoscan on the NVIDIA IGX edge computing platform that, if successful, could radically change the way the SETI Institute worked.

The institute collaborates with Breakthrough Listen, another SETI Institute research program, headquartered at the University of Oxford, that uses dozens of radio telescopes to collect and store mountains of data, later analyzed in separate processes using GPUs. Each telescope and analysis employs separate, custom-built programs.

“We wanted to create something that would really push our capabilities forward,” Siemion said. “We envisioned a streaming solution that in a more general way takes real-time data from telescopes and brings it directly into the GPUs to do AI inference on it.”

Pointing at the Stars

In a team effort, Luigi Cruz, a staff engineer at the SETI Institute, developed the real-time data reception and inference pipeline using the Holoscan SDK, while Peter Ma, a Breakthrough Listen collaborator, built and trained an AI model to detect fast radio bursts, one of many radio phenomena tracked by astronomers. Wael Farah, Allen Telescope Array project scientist, provided key contributions to the scientific aspects of the study.

They linked the combined real-time Holoscan pipeline, running on an NVIDIA IGX Orin platform, to 28 antennas pointed at the Crab Nebula. Over 15 hours, they gathered more than 90 billion data packets on signals across 5 GHz of spectrum.

Their system captured and analyzed in real time nearly the full 100Gbps of data from the experiment, twice the previous speed the astronomers had achieved. What’s more, they saw how the same code could be used with any telescope to detect all sorts of signals.
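Those figures are mutually consistent, as a quick back-of-the-envelope check shows. The packet rate and average packet size below are derived from the quoted numbers, not stated in the article, and assume the 100 Gbps rate was sustained for the full window:

```python
packets = 90e9         # "more than 90 billion data packets"
seconds = 15 * 3600    # the 15-hour observation window
line_rate_bps = 100e9  # "nearly the full 100Gbps"

rate_pps = packets / seconds                           # packets per second
avg_packet_bytes = line_rate_bps * seconds / 8 / packets  # implied packet size

print(f"{rate_pps / 1e6:.2f} million packets per second")
print(f"~{avg_packet_bytes:.0f} bytes per packet on average")
```

The implied average of roughly 7,500 bytes per packet is in the range of jumbo Ethernet frames, which is typical for high-rate instrument data capture.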

‘It’s Like a Magic Wand’

The test was “fantastically successful,” said Siemion. “It’s hard to overstate the transformative potential of Holoscan for radio astronomy because it’s like we’ve been given a magic wand to get all our data from telescopes into accelerated computers that are ideally suited for AI.”

He called the direct memory access in NVIDIA GPUs “a game changer.”

Rather than throw away some of its data to enable more efficient processing — as it did in the past — institute researchers can keep and analyze all of it, fast.

“It’s a profound change in how radio astronomy is done,” he said. “Now we have a viable path to a very different way of using telescopes with smart AI software, and if we do that in a scalable way the opportunities for discovery will be legion.”

Scaling Up the Pilot

The team plans to scale up its pilot software and deploy it in all the radio telescopes it currently uses across a dozen sites. It also aims to share the capability in collaborations with astronomers worldwide.

“Our intent is to bring this to larger international observatories with thousands of users and uses,” Siemion said.

The partnerships extend to globally distributed arrays of telescopes now under construction that promise to increase by an order of magnitude the kinds of signals space researchers can detect.

Sharing the Technology Broadly

Collaboration has been a huge theme for Siemion since 2015, when he became principal investigator for Breakthrough Listen.

“We voraciously collaborate with anyone we can find,” he said in a video interview from the Netherlands, where he was meeting local astronomers.

Work with NVIDIA was just one part of efforts that involve companies and governments across technical and scientific disciplines.

“The engineering talent at NVIDIA is world class … I can’t say enough about Adam and the Holoscan team,” he said.

The software opens a big door to technical collaborations.

“Holoscan lets us tap into a developer community far larger than those in astronomy with complementary skills,” he said. “It will be exciting to see if, say, a cancer algorithm could be repurposed to look for a novel astronomical source and vice versa.”

It’s one more way NVIDIA and its customers are advancing AI for the benefit of all.


TSMC and NVIDIA Transform Semiconductor Manufacturing With Accelerated Computing


TSMC, the world leader in semiconductor manufacturing, is moving to production with NVIDIA’s computational lithography platform, called cuLitho, to accelerate manufacturing and push the limits of physics for the next generation of advanced semiconductor chips.

A critical step in the manufacture of computer chips, computational lithography is involved in the transfer of circuitry onto silicon. It requires complex computation — involving electromagnetic physics, photochemistry, computational geometry, iterative optimization and distributed computing. A typical foundry dedicates massive data centers for this computation, and yet this step has traditionally been a bottleneck in bringing new technology nodes and computer architectures to market.

Computational lithography is also the most compute-intensive workload in the entire semiconductor design and manufacturing process. It consumes tens of billions of hours per year on CPUs in the leading-edge foundries. A typical mask set for a chip can take 30 million or more hours of CPU compute time, necessitating large data centers within semiconductor foundries. With accelerated computing, 350 NVIDIA H100 Tensor Core GPU-based systems can now replace 40,000 CPU systems, accelerating production time, while reducing costs, space and power.
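The quoted figures convey the scale of the consolidation; a quick check, taking the article's numbers at face value and assuming ideal parallelism (which real foundry schedulers will not achieve):

```python
cpu_systems = 40_000   # CPU systems replaced, per the article
gpu_systems = 350      # NVIDIA H100 Tensor Core GPU-based systems
mask_cpu_hours = 30e6  # CPU-hours for a typical mask set

consolidation = cpu_systems / gpu_systems
wall_clock_days = mask_cpu_hours / cpu_systems / 24  # if perfectly parallel

print(f"~{consolidation:.0f} CPU systems consolidated per GPU system")
print(f"~{wall_clock_days:.1f} days per mask set on the full CPU fleet")
```

Even spread across the entire 40,000-system fleet, a single mask set ties up about a month of wall-clock time, which is why this step has been a bottleneck for new technology nodes.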

NVIDIA cuLitho brings accelerated computing to the field of computational lithography. Moving cuLitho to production is enabling TSMC to accelerate the development of next-generation chip technology, just as current production processes are nearing the limits of what physics makes possible.

“Our work with NVIDIA to integrate GPU-accelerated computing in the TSMC workflow has resulted in great leaps in performance, dramatic throughput improvement, shortened cycle time and reduced power requirements,” said Dr. C.C. Wei, CEO of TSMC, at the GTC conference earlier this year.

NVIDIA has also developed algorithms to apply generative AI to enhance the value of the cuLitho platform. A new generative AI workflow has been shown to deliver an additional 2x speedup on top of the accelerated processes enabled through cuLitho.

The application of generative AI enables creation of a near-perfect inverse mask or inverse solution to account for diffraction of light involved in computational lithography. The final mask is then derived by traditional and physically rigorous methods, speeding up the overall optical proximity correction process by 2x.

The use of optical proximity correction in semiconductor lithography is now three decades old. While the field has benefited from numerous contributions over this period, rarely has it seen a transformation quite as rapid as the one provided by the twin technologies of accelerated computing and AI. These together allow for the more accurate simulation of physics and the realization of mathematical techniques that were once prohibitively resource-intensive.

This enormous speedup of computational lithography accelerates the creation of every single mask in the fab, which speeds the total cycle time for developing a new technology node. More importantly, it makes possible new calculations that were previously impractical.

For example, while inverse lithography techniques have been described in the scientific literature for two decades, an accurate realization at full chip scale has been largely precluded because the computation takes too long. With cuLitho, that’s no longer the case. Leading-edge foundries will use it to ramp up inverse and curvilinear solutions that will help create the next generation of powerful semiconductors.

Image courtesy of TSMC.


Pittsburgh Steels Itself for Innovation With Launch of NVIDIA AI Tech Community


Serving as a bridge for academia, industry and public-sector groups to partner on artificial intelligence innovation, NVIDIA is launching its inaugural AI Tech Community in Pittsburgh, Pennsylvania.

Collaborations with Carnegie Mellon University and the University of Pittsburgh, as well as startups, enterprises and organizations based in the “city of bridges,” are part of the new NVIDIA AI Tech Community initiative, announced today during the NVIDIA AI Summit in Washington, D.C.

The initiative aims to supercharge public-private partnerships across communities rich with potential for enabling technological transformation using AI.

Two NVIDIA joint technology centers will be established in Pittsburgh to tap into expertise in the region.

NVIDIA’s Joint Center with Carnegie Mellon University (CMU) for Robotics, Autonomy and AI will equip higher-education faculty, students and researchers with the latest technologies and boost innovation in the fields of AI and robotics.

NVIDIA’s Joint Center with the University of Pittsburgh for AI and Intelligent Systems will focus on computational opportunities across the health sciences, including applications of AI in clinical medicine and biomanufacturing.

CMU — the nation’s No. 1 AI university, according to U.S. News & World Report — has pioneered work in autonomous vehicles and natural language processing.

CMU’s Robotics Institute, the world’s largest university-affiliated robotics research group, brings a diverse group of more than a thousand faculty, staff, students, post-doctoral fellows and visitors together to solve humanity’s toughest challenges through robotics.

The University of Pittsburgh — an R1 research university at the forefront of innovation — ranks No. 6 among U.S. universities in research funding from the National Institutes of Health, with more than $1 billion in research expenditures in fiscal year 2022, and No. 14 among U.S. universities granted utility patents.

The university has a long history of learning-technology innovations that are interdisciplinary and conducted within research-practice partnerships. By prioritizing inclusivity and practical experience without technical barriers, Pitt is leading the way in democratizing AI education in healthcare and medicine.

By working with these universities, NVIDIA aims to accelerate innovation, commercialization and operationalization in physical AI, robotics and autonomous systems, building a technical community that spans the nation — and the globe.

These centers will tap into NVIDIA’s full-stack AI platform and accelerated computing expertise to gear up tomorrow’s technology leaders for next-generation innovation.

Establishing the Centers for AI Development 

Generative AI and accelerated computing are transforming workflows across use cases. Three key AI platforms form the engine behind this transformation: NVIDIA DGX for AI training, NVIDIA Omniverse for simulation and NVIDIA Jetson for edge computing.

Through the new centers and public-sector-sponsored research opportunities, NVIDIA will provide CMU and Pitt with access to these and more of the company’s latest AI software and frameworks — such as NVIDIA Isaac Lab for robot learning, NVIDIA Isaac Sim for designing and testing robots, NVIDIA NeMo for custom generative AI and NVIDIA NIM microservices, available through the NVIDIA AI Enterprise software platform.

Advanced NVIDIA technological support can help accelerate the research groups’ workflows and enhance the scalability and resiliency of their AI applications.

In addition, the universities will have access to certain generative AI, data science and accelerated computing resources through the NVIDIA Deep Learning Institute, which provides training to meet diverse learning needs and upskill students and developers in AI.

“Pairing Carnegie Mellon University’s existing deep expertise and resources in AI and robotics with NVIDIA’s cutting-edge platform, software and tools has tremendous potential to power Pittsburgh’s already vibrant innovation ecosystem,” said Theresa Mayer, vice president for research at CMU. “This unique collaboration will accelerate innovation, commercialization and operationalization of robotics and autonomy, advancing the best impacts of AI on society.”

“Pitt has a long history and extraordinary research strengths in life sciences and learning sciences,” said Rob A. Rutenbar, senior vice chancellor for research at the University of Pittsburgh. “By focusing on computational and AI opportunities across these ‘meds and eds’ areas, we plan to leverage our collaboration with NVIDIA to explore new ways to connect these breakthroughs to improved health and education outcomes for everybody.”

Fostering Cross-Industry Collaboration

As part of the AI Tech Community initiative, NVIDIA is also increasing its engagement with Pittsburgh-based members of the NVIDIA Inception program for cutting-edge AI startups and the NVIDIA Connect program for software development companies and service providers.

For example, Inception member Lovelace AI is developing AI solutions using NVIDIA accelerated computing and CUDA to enhance the analysis of kinetic data, providing predictive analytics and actionable insights for national security customers.

Skild AI, a startup founded by two Carnegie Mellon professors, is developing a scalable robotics foundation model, called Skild Brain, that can easily adapt across hardware and tasks.

Skild AI is exploring NVIDIA Isaac Lab, a unified, modular framework for robot learning built on the NVIDIA Isaac Sim reference application for designing, simulating and training AI-based robots.

NVIDIA is also engaging with Pittsburgh’s broader robotics ecosystem through its collaborations with the Pittsburgh Robotics Network — which speeds the commercialization of robotics, AI and other advanced technologies — and technology accelerators like AlphaLab and the Robotics Factory at Innovation Works, which supports startups based in the city that are focused on AI, robotics and autonomy.

And through its Deep Learning Institute, which has trained more than 650,000 people, NVIDIA is committed to furthering AI workforce development worldwide.

Learn more about how NVIDIA is propelling the next era of computing in higher education and research, including at the NVIDIA AI Summit, running through Oct. 9. NVIDIA Vice President of Developer Programs Greg Estes will discuss scaling AI skills and economic growth through public-private collaboration.

Featured image courtesy of Wikimedia Commons.


PyTorch Foundation Technical Advisory Council Elects New Leadership


We are pleased to announce the first-ever Chair and Vice Chair of the PyTorch Foundation’s Technical Advisory Council (TAC): Luca Antiga as the Chair and Jiong Gong as Vice Chair. Both leaders bring extensive experience and deep commitment to the PyTorch community, and they are set to guide the TAC in its mission to foster an open, diverse, and innovative PyTorch technical community.

Meet the New Leadership

Luca Antiga

Luca Antiga has been the CTO at Lightning AI since 2022. He is an early contributor to PyTorch core and co-authored “Deep Learning with PyTorch” (published by Manning). He started his journey as a researcher in bioengineering and later co-founded Orobix, a company focused on building and deploying AI in production settings.

“I am looking forward to taking on the role of the chair of the PyTorch TAC,” says Luca. “As the TAC chair, I will ensure effective, timely topic selection and enhance visibility of technical needs from the board members and from the ecosystem at large. I will strive for directional, cohesive messaging throughout the transition of PyTorch from Meta to the Linux Foundation.”

Jiong Gong

Jiong Gong is a Principal Engineer and SW Architect for PyTorch Optimization from Intel. He serves as one of the PyTorch CPU module maintainers and is an active contributor to the TorchInductor CPU backend.

“I plan to further strengthen the collaboration between PyTorch developers and hardware vendors, promoting innovation and performance optimization across various hardware platforms, enhancing PyTorch ecosystem and streamlining the decision-making process,” says Jiong. “I am honored to serve as the vice chair of the TAC.”

What Does the TAC Do?

The PyTorch Foundation’s TAC provides a forum for technical communication, leadership, and collaboration for the PyTorch Foundation. The committee is composed of members of the PyTorch Foundation and holds open meetings once a month that anyone in the community can attend. It provides thought leadership on technical topics, knowledge sharing, and a forum to discuss issues with other technical experts in the community.

New TAC Webpage

Stay connected with the PyTorch Foundation’s Technical Advisory Council (TAC) by visiting our new TAC webpage. Here you can find the TAC members, where to view upcoming meeting agendas, access presentations, attend public meetings, watch meeting recordings and participate in discussions on key technical topics.

Plus stay tuned on our blog for regular updates from the PyTorch Foundation TAC leadership.


Foxconn to Build Taiwan’s Fastest AI Supercomputer With NVIDIA Blackwell


NVIDIA and Foxconn are building Taiwan’s largest supercomputer, marking a milestone in the island’s AI advancement.

The project, Hon Hai Kaohsiung Super Computing Center, revealed Tuesday at Hon Hai Tech Day, will be built around NVIDIA’s groundbreaking Blackwell architecture and feature the GB200 NVL72 platform, which includes a total of 64 racks and 4,608 Tensor Core GPUs.

With expected performance of more than 90 exaflops for AI workloads, the machine would easily rank as the fastest in Taiwan.

Foxconn plans to use the supercomputer, once operational, to power breakthroughs in cancer research, large language model development and smart city innovations, positioning Taiwan as a global leader in AI-driven industries.

Foxconn’s “three-platform strategy” focuses on smart manufacturing, smart cities and electric vehicles. The new supercomputer will play a pivotal role in supporting Foxconn’s ongoing efforts in digital twins, robotic automation and smart urban infrastructure, bringing AI-assisted services to urban areas like Kaohsiung.

Construction has started on the new supercomputer, which will be housed in Kaohsiung, Taiwan. The first phase is expected to be operational by mid-2025, with full deployment targeted for 2026.

The project will integrate NVIDIA technologies such as the NVIDIA Omniverse and Isaac robotics platforms for AI and digital twin applications, helping transform manufacturing processes.

“Powered by NVIDIA’s Blackwell platform, Foxconn’s new AI supercomputer is one of the most powerful in the world, representing a significant leap forward in AI computing and efficiency,” said Foxconn Vice President and Spokesperson James Wu.

The GB200 NVL72 is a state-of-the-art data center platform optimized for AI and accelerated computing.

Each rack features 36 NVIDIA Grace CPUs and 72 NVIDIA Blackwell GPUs connected via NVIDIA’s NVLink technology, delivering 130TB/s of bandwidth.
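As a quick sanity check, the per-rack and total figures quoted in this article are consistent with each other:

```python
# Arithmetic check of the GB200 NVL72 figures quoted above:
# 64 racks, each with 36 Grace CPUs and 72 Blackwell GPUs.
racks = 64
gpus_per_rack = 72
cpus_per_rack = 36

print(racks * gpus_per_rack)  # 4608 Tensor Core GPUs, matching the article
print(racks * cpus_per_rack)  # 2304 Grace CPUs
```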

NVIDIA NVLink Switch allows the 72-GPU system to function as a single, unified GPU. This makes it ideal for training large AI models and executing complex inference tasks in real time on trillion-parameter models.

Taiwan-based Foxconn, officially known as Hon Hai Precision Industry Co., is the world’s largest electronics manufacturer, known for producing a wide range of products, from smartphones to servers, for the world’s top technology brands.

With a vast workforce and manufacturing facilities across the globe, Foxconn plays a key role in supplying the world’s technology infrastructure. It is a leader in smart manufacturing and one of the pioneers of industrial AI, digitalizing its factories in NVIDIA Omniverse.

Foxconn was also one of the first companies to use NVIDIA NIM microservices in the development of domain-specific large language models, or LLMs, embedded into a variety of internal systems and processes in its AI factories for smart manufacturing, smart electric vehicles and smart cities.

The Hon Hai Kaohsiung Super Computing Center is part of a growing global network of advanced supercomputing facilities powered by NVIDIA. This network includes several notable installations across Europe and Asia.

These supercomputers represent a significant leap forward in computational power, putting NVIDIA’s cutting-edge technology to work to advance research and innovation across various scientific disciplines.

Learn more about Hon Hai Tech Day.


Build a generative AI Slack chat assistant using Amazon Bedrock and Amazon Kendra


Despite the proliferation of information and data in business environments, employees and stakeholders often find themselves searching for information and struggling to get their questions answered quickly and efficiently. This can lead to productivity losses, frustration, and delays in decision-making.

A generative AI Slack chat assistant can help address these challenges by providing a readily available, intelligent interface for users to interact with and obtain the information they need. By using the natural language processing and generation capabilities of generative AI, the chat assistant can understand user queries, retrieve relevant information from various data sources, and provide tailored, contextual responses.

By harnessing the power of generative AI and the Amazon Web Services (AWS) offerings Amazon Bedrock, Amazon Kendra, and Amazon Lex, this solution provides a sample architecture for building an intelligent Slack chat assistant that can streamline information access, enhance user experiences, and drive productivity and efficiency within organizations.

Why use Amazon Kendra for building a RAG application?

Amazon Kendra is a fully managed service that provides out-of-the-box semantic search capabilities for state-of-the-art ranking of documents and passages. You can use Amazon Kendra to quickly build high-accuracy generative AI applications on enterprise data and source the most relevant content and documents to maximize the quality of your Retrieval Augmented Generation (RAG) payload, yielding better large language model (LLM) responses than using conventional or keyword-based search solutions. Amazon Kendra offers simple-to-use deep learning search models that are pre-trained on 14 domains and don’t require machine learning (ML) expertise. Amazon Kendra can index content from a wide range of sources, including databases, content management systems, file shares, and web pages.

Further, the FAQ feature in Amazon Kendra complements the broader retrieval capabilities of the service, allowing the RAG system to seamlessly switch between providing prewritten FAQ responses and dynamically generating responses by querying the larger knowledge base. This makes it well-suited for powering the retrieval component of a RAG system, allowing the model to access a broad knowledge base when generating responses. By integrating the FAQ capabilities of Amazon Kendra into a RAG system, the model can use a curated set of high-quality, authoritative answers for commonly asked questions. This can improve the overall response quality and user experience, while also reducing the burden on the language model to generate these basic responses from scratch.

This solution balances retaining customizations in terms of model selection, prompt engineering, and adding FAQs with not having to deal with word embeddings, document chunking, and other lower-level complexities typically required for RAG implementations.

Solution overview

The chat assistant is designed to assist users by answering their questions and providing information on a variety of topics. It is intended as an internal-facing Slack tool that helps employees and stakeholders find the information they need.

The architecture uses Amazon Lex for intent recognition, AWS Lambda for processing queries, Amazon Kendra for searching through FAQs and web content, and Amazon Bedrock for generating contextual responses powered by LLMs. By combining these services, the chat assistant can understand natural language queries, retrieve relevant information from multiple data sources, and provide humanlike responses tailored to the user’s needs. The solution showcases the power of generative AI in creating intelligent virtual assistants that can streamline workflows and enhance user experiences based on model choices, FAQs, and modifying system prompts and inference parameters.

Architecture diagram

The following diagram illustrates a RAG approach where the user sends a query through the Slack application and receives a generated response based on the data indexed in Amazon Kendra. In this post, we use Amazon Kendra Web Crawler as the data source and include FAQs stored on Amazon Simple Storage Service (Amazon S3). See Data source connectors for a list of supported data source connectors for Amazon Kendra.

ML-16837-arch-diag

The step-by-step workflow for the architecture is the following:

  1. The user sends a query such as What is the AWS Well-Architected Framework? through the Slack app.
  2. The query goes to Amazon Lex, which identifies the intent.
  3. Currently two intents are configured in Amazon Lex (Welcome and FallbackIntent).
  4. The welcome intent is configured to respond with a greeting when a user enters a greeting such as “hi” or “hello.” The assistant responds with “Hello! I can help you with queries based on the documents provided. Ask me a question.”
  5. The fallback intent is fulfilled with a Lambda function.
    1. The Lambda function searches Amazon Kendra FAQs through the search_Kendra_FAQ method by taking the user query and Amazon Kendra index ID as inputs. If there’s a match with a high confidence score, the answer from the FAQ is returned to the user.
      import boto3

      def search_Kendra_FAQ(question, kendra_index_id):
          """
          This function takes in the question from the user and checks whether it matches one of the Kendra FAQs.
          :param question: The question the user is asking, entered via the frontend input text box.
          :param kendra_index_id: The Kendra index containing the documents and FAQs.
          :return: If found in the FAQs, returns a (text, url) tuple with the answer and any relevant link.
                   If not, returns (False, False) so the caller can fall back to kendra_retrieve_document.
          """
          kendra_client = boto3.client('kendra')
          response = kendra_client.query(IndexId=kendra_index_id, QueryText=question, QueryResultTypeFilter='QUESTION_ANSWER')
          for item in response['ResultItems']:
              score_confidence = item['ScoreAttributes']['ScoreConfidence']
              # Take answers only from FAQs that have a very high confidence score
              if score_confidence == 'VERY_HIGH' and len(item['AdditionalAttributes']) > 1:
                  text = item['AdditionalAttributes'][1]['Value']['TextWithHighlightsValue']['Text']
                  url = "None"
                  if item['DocumentURI'] != '':
                      url = item['DocumentURI']
                  return (text, url)
          return (False, False)

    2. If there isn’t a match with a high enough confidence score, relevant documents from Amazon Kendra with a high confidence score are retrieved through the kendra_retrieve_document method and sent to Amazon Bedrock to generate a response as the context.
      import boto3

      def kendra_retrieve_document(question, kendra_index_id):
          """
          This function takes in the question from the user and retrieves relevant passages based on the default PageSize of 10.
          :param question: The question the user is asking, entered via the frontend input text box.
          :param kendra_index_id: The Kendra index containing the documents and FAQs.
          :return: Returns the context to be sent to the LLM and the document URIs to be returned as relevant data sources.
          """
          kendra_client = boto3.client('kendra')
          documents = kendra_client.retrieve(IndexId=kendra_index_id, QueryText=question)
          text = ""
          uris = set()
          for item in documents['ResultItems']:
              score_confidence = item['ScoreAttributes']['ScoreConfidence']
              # Keep only passages retrieved with high or very high confidence
              if score_confidence in ('VERY_HIGH', 'HIGH'):
                  text += item['Content'] + "\n"
                  uris.add(item['DocumentURI'])
          return (text, uris)

    3. The response is generated from Amazon Bedrock with the invokeLLM method. The following is a snippet of the invokeLLM method within the fulfillment function. Read more on inference parameters and system prompts to modify parameters that are passed into the Amazon Bedrock invoke model request.
      import boto3
      import json

      def invokeLLM(question, context, modelId):
          """
          This function takes in the question from the user, along with the Kendra responses as context, to generate an answer
          for the user on the frontend.
          :param question: The question the user is asking, entered via the frontend input text box.
          :param context: The text from the Kendra document retrieve query, used as context to generate a better answer.
          :param modelId: The ID of the Amazon Bedrock model to invoke.
          :return: Returns the final answer that will be provided to the end user of the application who asked the original
          question.
          """
          # Set up the Bedrock runtime client
          bedrock = boto3.client('bedrock-runtime')

          # Body of data with parameters that is passed into the Bedrock invoke model request
          body = json.dumps({"max_tokens": 350,
                  "system": "You are a truthful AI assistant. Your goal is to provide informative and substantive responses to queries based on the documents provided. If you do not know the answer to a question, you truthfully say you do not know.",
                  "messages": [{"role": "user", "content": "Answer this user query: " + question + " with the following context: " + context}],
                  "anthropic_version": "bedrock-2023-05-31",
                  "temperature": 0,
                  "top_k": 250,
                  "top_p": 0.999})

          # Invoke the Bedrock model with your specifications
          response = bedrock.invoke_model(body=body, modelId=modelId)
          # The body of the response that was generated
          response_body = json.loads(response.get('body').read())
          # The 'content' field is a list of content blocks; the answer text is in the first block
          answer = response_body.get('content')[0].get('text')
          # Return the answer, which ultimately gets returned to the end user
          return answer

    4. Finally, the response generated from Amazon Bedrock, along with the relevant referenced URLs, is returned to the end user.
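The overall fallback fulfillment logic above can be sketched as a small routing function: try the FAQs first, fall back to retrieval plus generation, and return a fixed message when nothing relevant is found. This is an illustrative sketch, not the actual Lambda code from the solution; the helper names mirror the snippets in this post, and the stub lookups below stand in for the real Amazon Kendra and Amazon Bedrock calls.

```python
# Sketch of the fallback-intent routing described in the workflow above.
def fulfill_fallback(question, faq_lookup, doc_lookup, llm):
    """Route a query: FAQ hit -> canned answer; else RAG; else fallback text."""
    answer, url = faq_lookup(question)          # search_Kendra_FAQ in the post
    if answer:
        return f"{answer} (source: {url})"
    context, uris = doc_lookup(question)        # kendra_retrieve_document
    if context:
        return llm(question, context)           # invokeLLM
    return "No relevant documents found"

# Stub lookups for illustration only (the real ones call AWS APIs):
faqs = {"Which AWS service has 11 nines of durability?":
        ("Amazon S3", "https://aws.amazon.com/s3/")}

def faq_lookup(q):
    return faqs.get(q, (False, False))

def doc_lookup(q):
    return ("", set())  # pretend no indexed documents match

def llm(q, ctx):
    return f"Generated answer for: {q}"

print(fulfill_fallback("Which AWS service has 11 nines of durability?",
                       faq_lookup, doc_lookup, llm))
# -> Amazon S3 (source: https://aws.amazon.com/s3/)
print(fulfill_fallback("What is Amazon Polly?", faq_lookup, doc_lookup, llm))
# -> No relevant documents found
```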

    When selecting websites to index, adhere to the AWS Acceptable Use Policy and other AWS terms. Remember that you can only use Amazon Kendra Web Crawler to index your own web pages or web pages that you have authorization to index; using it to aggressively crawl websites or web pages you don’t own is not considered acceptable use. Visit the Amazon Kendra Web Crawler data source guide to learn more about using the web crawler as a data source.

    Supported features

    The chat assistant supports the following features:

    1. Support for the following Anthropic models on Amazon Bedrock:
      • claude-v2
      • claude-3-haiku-20240307-v1:0
      • claude-instant-v1
      • claude-3-sonnet-20240229-v1:0
    2. Support for FAQs and the Amazon Kendra Web Crawler data source
    3. Returns FAQ answers only if the confidence score is VERY_HIGH
    4. Retrieves only documents from Amazon Kendra that have a HIGH or VERY_HIGH confidence score
    5. If documents with a high confidence score aren’t found, the chat assistant returns “No relevant documents found”

    Prerequisites

    To deploy the solution, you need the following prerequisites:

    • Basic knowledge of AWS
    • An AWS account with access to Amazon S3 and Amazon Kendra
    • An S3 bucket to store your documents. For more information, see Step 1: Create your first S3 bucket and the Amazon S3 User Guide.
    • A Slack workspace to integrate the chat assistant
    • Permission to install Slack apps in your Slack workspace
    • Seed URLs for the Amazon Kendra Web Crawler data source
      • You’ll need authorization to crawl and index any websites provided
    • AWS CloudFormation for deploying the solution resources

    Build a generative AI Slack chat assistant

    To build a Slack application, use the following steps:

    1. Request model access on Amazon Bedrock for all Anthropic models
    2. Create an S3 bucket in the us-east-1 (N. Virginia) AWS Region.
    3. Upload the AIBot-LexJson.zip and SampleFAQ.csv files to the S3 bucket
    4. Launch the CloudFormation stack in the us-east-1 (N. Virginia) AWS Region to create the solution resources
    5. Enter a Stack name of your choice
    6. For S3BucketName, enter the name of the S3 bucket created in Step 2
    7. For S3KendraFAQKey, enter the name of the SampleFAQs uploaded to the S3 bucket in Step 3
    8. For S3LexBotKey, enter the name of the Amazon Lex .zip file uploaded to the S3 bucket in Step 3
    9. For SeedUrls, enter up to 10 URLs for the web crawler as a comma delimited list. In the example in this post, we give the publicly available Amazon Bedrock service page as the seed URL
    10. Leave the rest as defaults and choose Next. Choose Next again on the Configure stack options page
    11. Acknowledge by selecting the box and choose Submit, as shown in the following screenshot
      ML-16837-cfn-checkbox
    12. Wait for the stack creation to complete
    13. Verify all resources are created
    14. Test on the AWS Management Console for Amazon Lex
      1. On the Amazon Lex console, choose your chat assistant ${YourStackName}-AIBot
      2. Choose Intents
      3. Choose Version 1 and choose Test, as shown in the following screenshot
        ML-16837-lex-version1
      4. Select the AIBotProdAlias and choose Confirm, as shown in the following screenshot. If you want to make changes to the chat assistant, you can use the draft version, publish a new version, and assign the new version to the AIBotProdAlias. Learn more about Versioning and Aliases.
      5. Test the chat assistant with questions such as, “Which AWS service has 11 nines of durability?” and “What is the AWS Well-Architected Framework?” and verify the responses. The following table shows that there are three FAQs in the sample .csv file.
        _question | _answer | _source_uri
        Which AWS service has 11 nines of durability? | Amazon S3 | https://aws.amazon.com/s3/
        What is the AWS Well-Architected Framework? | The AWS Well-Architected Framework enables customers and partners to review their architectures using a consistent approach and provides guidance to improve designs over time. | https://aws.amazon.com/architecture/well-architected/
        In what Regions is Amazon Kendra available? | Amazon Kendra is currently available in the following AWS Regions: Northern Virginia, Oregon, and Ireland | https://aws.amazon.com/about-aws/global-infrastructure/regional-product-services/
      6. The following screenshot shows the question “Which AWS service has 11 nines of durability?” and its response. You can observe that the response is the same as in the FAQ file and includes a link.
        ML-16837-Q1inLex
      7. Based on the pages you have crawled, ask a question in the chat. For this example, the publicly available Amazon Bedrock page was crawled and indexed. The following screenshot shows the question, “What are agents in Amazon Bedrock?” and a generated response that includes relevant links.
        ML-16837-Q2inLex
    15. For integration of the Amazon Lex chat assistant with Slack, see Integrating an Amazon Lex V2 bot with Slack. Choose the AIBotProdAlias under Alias in the Channel Integrations.
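    For reference, the three FAQs shown in the table above follow a simple three-column layout. The snippet below is a hypothetical reconstruction assuming Amazon Kendra's basic CSV FAQ format (a header row of `_question`, `_answer`, `_source_uri`); generating the file with Python's csv module avoids quoting mistakes in answers that contain commas:

```python
import csv
import io

# One of the three FAQs shown earlier, written in the basic CSV FAQ layout
rows = [
    {"_question": "Which AWS service has 11 nines of durability?",
     "_answer": "Amazon S3",
     "_source_uri": "https://aws.amazon.com/s3/"},
]

buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["_question", "_answer", "_source_uri"])
writer.writeheader()
writer.writerows(rows)
print(buf.getvalue())
```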

    Run sample queries to test the solution

    1. In Slack, go to the Apps section. In the dropdown menu, choose Manage and select Browse apps.
      ML-16837-slackBrowseApps
    2. Search for ${AIBot} in App Directory and choose the chat assistant. This will add the chat assistant to the Apps section in Slack. You can now start asking questions in the chat. The following screenshot shows the question “Which AWS service has 11 nines of durability?” and its response. You can observe that the response is the same as in the FAQ file and includes a link.
      ML-16837-Q1slack
    3. The following screenshot shows the question, “What is the AWS Well-Architected Framework?” and its response.
      ML-16837-Q2slack
    4. Based on the pages you have crawled, ask a question in the chat. For this example, the publicly available Amazon Bedrock page was crawled and indexed. The following screenshot shows the question, “What are agents in Amazon Bedrock?” and a generated response that includes relevant links.
      ML-16837-Q3slack
    5. The following screenshot shows the question, “What is amazon polly?” Because there is no Amazon Polly documentation indexed, the chat assistant responds with “No relevant documents found,” as expected.
      ML-16837-Q4slack

    These examples show how the chat assistant retrieves documents from Amazon Kendra and provides answers based on the documents retrieved. If no relevant documents are found, the chat assistant responds with “No relevant documents found.”

    Clean up

    To clean up the resources created by this solution:

    1. Delete the CloudFormation stack by navigating to the CloudFormation console
    2. Select the stack you created for this solution and choose Delete
    3. Confirm the deletion by entering the stack name in the provided field. This will remove all the resources created by the CloudFormation template, including the Amazon Kendra index, Amazon Lex chat assistant, Lambda function, and other related resources.

    Conclusion

    This post describes the development of a generative AI Slack application powered by Amazon Bedrock and Amazon Kendra. This is designed to be an internal-facing Slack chat assistant that helps answer questions related to the indexed content. The solution architecture includes Amazon Lex for intent identification, a Lambda function for fulfilling the fallback intent, Amazon Kendra for FAQ searches and indexing crawled web pages, and Amazon Bedrock for generating responses. The post walks through the deployment of the solution using a CloudFormation template, provides instructions for running sample queries, and discusses the steps for cleaning up the resources. Overall, this post demonstrates how to use various AWS services to build a powerful generative AI–powered chat assistant application.

    This solution demonstrates the power of generative AI in building intelligent chat assistants and search assistants. Explore the generative AI Slack chat assistant: Invite your teams to a Slack workspace and start getting answers to your indexed content and FAQs. Experiment with different use cases and see how you can harness the capabilities of services like Amazon Bedrock and Amazon Kendra to enhance your business operations. For more information about using Amazon Bedrock with Slack, refer to Deploy a Slack gateway for Amazon Bedrock.


    About the authors

    Kruthi Jayasimha Rao is a Partner Solutions Architect with a focus on AI and ML. She provides technical guidance to AWS Partners in following best practices to build secure, resilient, and highly available solutions in the AWS Cloud.

    Mohamed Mohamud is a Partner Solutions Architect with a focus on Data Analytics. He specializes in streaming analytics, helping partners build real-time data pipelines and analytics solutions on AWS. With expertise in services like Amazon Kinesis, Amazon MSK, and Amazon EMR, Mohamed enables data-driven decision-making through streaming analytics.
