How Citibot’s chatbot search engine uses AI to find more answers

This is a guest blog post by Francisco Zamora and Nicholas Burden at TensorIoT and Bratton Riley at Citibot. In their own words, “TensorIoT is an AWS Advanced Consulting Partner with competencies in IoT, Machine Learning, Industrial IoT and Retail. Founded by AWS alums, they have delivered end-to-end IoT and Machine Learning solutions to customers across the globe. Citibot provides tools for citizens and their governments to use for efficient and effective communication and civic change.”

Citibot is a technology company that builds AI-powered chat solutions for local governments from Fort Worth, Texas to Arlington, Virginia. With Citibot, local residents can quickly get answers to city-related questions, report issues, and receive real-time alerts via text responses. To power these interactions, Citibot uses Amazon Lex, a service for building conversational interfaces for text and voice applications. Citibot built a chatbot to handle basic call queries, which allows government employees to allocate more time to higher-impact community actions.

The challenges imposed by the COVID-19 pandemic surfaced the need for public organizations to have scalable, self-service tools that can quickly provide reliable information to their constituents. With COVID-19, Citibot call centers saw a dramatic uptick in wait times and call abandonments as citizens tried to get information about virus prevention and unemployment insurance. To increase the flexibility and robustness of their chatbot to new query types, Citibot looked to add a general search capability. Citibot wanted a solution that could outperform third-party solutions and effectively use curated FAQ content and recently published data from multiple websites, such as those of the CDC and federal, state, and local governments.

The following image shows screenshots of sample Citibot conversations.

To design this general search solution, Citibot chose TensorIoT, an AWS Advanced Consulting Partner that specializes in serverless application development. TensorIoT developed a solution that included TensorIoT’s Web Connector Tool and Amazon Kendra, an enterprise search service. TensorIoT’s Web Connector Tool, built natively on AWS, enabled Amazon Kendra to index the content of target web pages and to act as a fallback search when Amazon Lex intents can’t provide an answer.

This new chatbot search solution helped local citizens quickly find the answers they needed and reduced wait times by up to 90%. This in turn decreased the volume of interactions handled by city officials, eased uncertainty within communities, and allowed municipal governments to focus on keeping their communities safe. As offices closed due to the pandemic, this solution provided a contactless way for residents without internet access to search for information on government websites at any time through their phones.

The following diagram illustrates the architecture for Citibot’s general search solution.

How it all came together

First, TensorIoT deployed a custom Amazon Lex search intent that is triggered when the chatbot receives a question or utterance it can’t answer. The team used AWS Lambda to develop the intent’s dialog and fulfillment code hooks to manage the conversation flow and fulfillment APIs. This new search intent was developed, tested, and merged into the dev version of Citibot to ensure all the original intents worked properly.
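To make the fallback flow concrete, here is a minimal sketch of a fulfillment code hook in Python. It is not TensorIoT's actual code: it assumes the Amazon Lex (V1) Lambda event and response format, and the `search` callable and the apology text are hypothetical stand-ins for the real search backend and wording.

```python
from typing import Callable, Optional


def close(session_attributes: dict, text: str) -> dict:
    """Build a Lex (V1-format) 'Close' response carrying a plain-text message."""
    return {
        "sessionAttributes": session_attributes,
        "dialogAction": {
            "type": "Close",
            "fulfillmentState": "Fulfilled",
            "message": {"contentType": "PlainText", "content": text},
        },
    }


def handle_search_intent(event: dict, search: Callable[[str], Optional[str]]) -> dict:
    """Fulfill the fallback search intent.

    `search` is a hypothetical callable that takes the user's question and
    returns answer text, or None when nothing relevant was found.
    """
    session = event.get("sessionAttributes") or {}
    question = (event.get("inputTranscript") or "").strip()
    answer = search(question) if question else None
    if answer is None:
        # Assumed fallback wording; the real bot's message is not in the post.
        return close(session, "Sorry, I could not find an answer to that.")
    return close(session, answer)
```

Passing the search function in as an argument keeps the handler easy to test with a stub before it is wired to a real index.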

Second, TensorIoT needed to create a search query index. They chose Amazon Kendra because it can integrate a variety of data sources and data types into Citibot’s existing technology stack. The TensorIoT and Citibot development teams determined a target group of government data sources, including the CDC website for COVID-19 data and multiple city websites for municipal data, that are checked on a routine basis. This helps the chatbot access the most recent guidelines about the virus and social distancing.

The following diagram illustrates the data sources used for Citibot’s general search solution.

Next, the teams researched the optimal format type and data storage containers for saving information and connecting to Amazon Kendra. TensorIoT knew that Amazon Kendra is trained to systematically process and index data sources to derive meaning from a variety of data formats, such as .pdf, .csv, and .html files. To increase the processing efficiency of Amazon Kendra, the TensorIoT team intelligently partitioned the data into queryable information chunks that could be relayed back to the users. The TensorIoT approach used a combination of .csv, .pdf, and .html files to provide complete data, giving a solid foundation for product build and development.

The TensorIoT team then developed a versatile Web Connector using Node.js and the JavaScript library Cheerio to crawl trusted websites and deposit that information into the data stores. Because COVID-19-related information changes frequently, TensorIoT created an Amazon DynamoDB table to store all the websites to routinely index for updated information.
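The extraction step of such a connector can be illustrated in a few lines. The sketch below is not the Web Connector itself, which was written with Cheerio on the JavaScript side; it swaps in Python's standard-library `html.parser` to show the same idea of pulling a title and visible text out of a fetched page. The class and function names are hypothetical.

```python
from html.parser import HTMLParser

# Tags whose contents are not visible page text.
SKIPPED_TAGS = {"script", "style", "noscript"}


class PageText(HTMLParser):
    """Collect the <title> and the visible text of an HTML page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.chunks = []
        self._skip_depth = 0
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED_TAGS:
            self._skip_depth += 1
        elif tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag in SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1
        elif tag == "title":
            self._in_title = False

    def handle_data(self, data):
        text = " ".join(data.split())  # collapse runs of whitespace
        if not text or self._skip_depth:
            return
        if self._in_title:
            self.title += text
        else:
            self.chunks.append(text)


def extract_page(html: str) -> dict:
    """Return the title and visible text of an already-downloaded page."""
    parser = PageText()
    parser.feed(html)
    parser.close()
    return {"title": parser.title, "text": "\n".join(parser.chunks)}
```

Fetching the pages and writing the results to storage are left out on purpose, since the post does not describe those details.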

With the additional information from the targeted websites, the TensorIoT and Citibot teams decided to use Amazon Simple Storage Service (Amazon S3) buckets for data storage. Amazon Kendra provides machine learning (ML)-powered search capabilities for all unstructured data stored in AWS and offers easy-to-use native connectors for popular sources like Amazon S3, SharePoint, Salesforce, ServiceNow, Amazon RDS databases, and OneDrive. By unifying the extracted .html pages and .pdf files from the CDC website in the same S3 bucket, the development team could sync the index to the data source, providing readily available data. They also used Amazon Kendra to extract metadata files from the scraped .html pages, which provided additional file attributes such as city names to further improve answer results.
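As an illustration of how a city attribute can travel with a document, the following sketch builds a metadata file in the JSON layout that the Amazon Kendra S3 data source reads, stored as `<document>.metadata.json` alongside the document. The custom `city` attribute name and the sample values are assumptions, and a custom attribute like this would first have to be defined on the index.

```python
import json


def metadata_key(document_key: str) -> str:
    """S3 key of the metadata file that accompanies a document."""
    return document_key + ".metadata.json"


def build_metadata(title: str, source_uri: str, city: str) -> dict:
    """Metadata for one scraped HTML page.

    `_source_uri` is a reserved Kendra attribute; `city` is a hypothetical
    custom attribute used here for illustration.
    """
    return {
        "Title": title,
        "ContentType": "HTML",
        "Attributes": {
            "_source_uri": source_uri,
            "city": city,
        },
    }


# Example with made-up values: the JSON body to upload next to the page.
body = json.dumps(
    build_metadata("Trash pickup", "https://example.gov/trash", "Arlington"),
    indent=2,
)
```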

The following image shows an example of the attributes that Citibot could use to tune search results.

Without any model training, TensorIoT and Citibot could point Amazon Kendra at their content stores and start receiving specific answers to natural language queries (such as, “How can I protect myself from Covid-19?”) by extracting the answer from the most relevant document.
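To show what "extracting the answer" can look like on the consuming side, here is a small sketch that picks one result out of an Amazon Kendra `Query` response, represented as a plain dictionary. The field names follow the `Query` response shape; the preference order (curated FAQ match, then extracted answer, then document excerpt) is an assumption for illustration, not the team's documented logic.

```python
from typing import Optional

# Assumed preference: curated FAQ answers first, then extracted answers,
# then plain document excerpts.
TYPE_PRIORITY = {"QUESTION_ANSWER": 0, "ANSWER": 1, "DOCUMENT": 2}


def pick_answer(response: dict) -> Optional[dict]:
    """Return the preferred result as text plus link, or None if empty."""
    items = response.get("ResultItems", [])
    if not items:
        return None
    # min() keeps the first item among ties, so Kendra's own ranking
    # is preserved within each result type.
    best = min(items, key=lambda item: TYPE_PRIORITY.get(item.get("Type"), 3))
    return {
        "query_id": response.get("QueryId"),
        "result_id": best.get("Id"),
        "text": best.get("DocumentExcerpt", {}).get("Text", ""),
        "link": best.get("DocumentURI", ""),
    }
```

Keeping the query and result IDs next to the answer text makes it possible to attach user feedback to that exact result later.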

To test the solution, the engineers ran sample event scripts with test inputs that allowed them to verify if all the sample questions were being answered successfully. TensorIoT tested and confirmed that each question or utterance returned an answer with a valid text excerpt and link. Additionally, the team used a negative feedback API that flagged answers users had downvoted and gave Citibot the ability to revisit the search answers that were voted as unhelpful. This data helps drive continuous improvement around the answers provided by the index for specific questions.
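A downvote can be relayed to Amazon Kendra through its `SubmitFeedback` API. The sketch below only builds the request parameters as a dictionary, so it runs without AWS credentials; the function name and sample IDs are hypothetical, and the commented call at the end assumes a boto3 `kendra` client.

```python
def build_negative_feedback(index_id: str, query_id: str, result_id: str) -> dict:
    """Parameters for a SubmitFeedback call marking one result as unhelpful."""
    return {
        "IndexId": index_id,
        "QueryId": query_id,  # QueryId returned by the original Query call
        "RelevanceFeedbackItems": [
            {"ResultId": result_id, "RelevanceValue": "NOT_RELEVANT"}
        ],
    }


params = build_negative_feedback("example-index-id", "q-1", "r-2")
# With boto3 installed and credentials configured, this could be sent as:
#   boto3.client("kendra").submit_feedback(**params)
```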

For curated content search, the developers could also upload a .csv file of FAQs to provide direct answers to the most commonly asked questions. For Citibot, TensorIoT used this feature to fill in the specific answers for municipal information questions, and added a .csv file with relevant questions and answers (Q&A) that required a complete search engine microservice. Using these features brings numerous benefits, including accuracy, simplicity, and connectivity.
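A curated FAQ file of this kind can be generated from question and answer pairs. The sketch below assumes Amazon Kendra's basic CSV FAQ layout of one row per entry with question, answer, and source URI columns and no header row; the sample entry is invented for illustration.

```python
import csv
import io


def faqs_to_csv(faqs) -> str:
    """Serialize (question, answer, source_uri) tuples as FAQ CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for question, answer, source_uri in faqs:
        # The csv module quotes fields that contain commas or quotes.
        writer.writerow([question, answer, source_uri])
    return buffer.getvalue()


faq_csv = faqs_to_csv([
    ("When is trash pickup?", "Mondays, weekly.", "https://example.gov/trash"),
])
```

Generating the file with the `csv` module rather than by string concatenation avoids broken rows when an answer contains commas or quotation marks.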

In just a few weeks, TensorIoT also built and added custom query logic and feedback submission APIs to the Amazon Lex bot, giving users better answers without requiring human interaction or extensive searching. Amazon Kendra exposes its services through APIs, such as the submit feedback API, which allows end-users to interact with search results. The team used the custom Amazon Lex intent and Lambda to handle the incoming queries and create a powerful search service.

The following image shows how the solution uses Amazon Lex and Lambda.

The TensorIoT solution was designed so Citibot can effortlessly add new cities to the service and disseminate information to their respective communities. The next challenge for the TensorIoT team was using city-specific information to provide more relevant search results. Combined with the additional session and request attributes of Amazon Lex, TensorIoT provided Amazon Kendra with search filters to refine the data query with specific city information. If no city was stated, the system defaulted to the call location of the user. With TensorIoT’s custom search intent deployed, search filter in place, data sources filled, and APIs built, the team started to integrate this search engine into the existing chatbot product.
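The city filter described above maps naturally onto the `AttributeFilter` parameter of the Amazon Kendra `Query` API. The following sketch builds the query parameters from Amazon Lex session attributes and falls back to a default city. The `city` attribute key, the session attribute name, and the way the default is derived from the caller's location are all assumptions.

```python
from typing import Optional


def build_city_query(
    index_id: str,
    question: str,
    session_attributes: Optional[dict],
    default_city: str,
) -> dict:
    """Query parameters restricted to documents tagged with one city.

    `default_city` stands in for the city inferred from the user's call
    location when the conversation did not name one.
    """
    city = (session_attributes or {}).get("city") or default_city
    return {
        "IndexId": index_id,
        "QueryText": question,
        "AttributeFilter": {
            "EqualsTo": {"Key": "city", "Value": {"StringValue": city}}
        },
    }
```

For example, `build_city_query("example-index-id", "Is the library open?", {"city": "Arlington"}, "Fort Worth")` filters on Arlington, while passing empty session attributes filters on Fort Worth.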

Deployment

To deploy this TensorIoT solution, the development teams integrated the new Amazon Lex custom search intent with Citibot and tested the bot’s ability to successfully answer queries. Using a sample phone number provided by Citibot through Twilio, TensorIoT used SMS to validate the returned results for each utterance.

With Amazon Kendra, the TensorIoT team eliminated the need for a third-party search engine and could focus on creating an automated solution for gathering information. After the chatbot was updated, the team redeployed the service with a version upgrade of the software development kit. The upgraded chatbot now uses the search power of Amazon Kendra to answer more questions for users based on the curation of document content. The resulting informational Citibot stands above the prior tools the cities had used.

Storing information in a curated content form is especially useful when combining Amazon Lex and Amazon Kendra. Amazon Kendra is perfect for customized information retrieval that is ultimately communicated to the end-user through agentless voice interactions of Amazon Lex.

Conclusion

This use case demonstrates how TensorIoT used multiple AWS services to add value in solution development. Beyond COVID-19, cities can continue to use the Amazon Kendra-powered chatbot to provide fast access to information about public facility hours, road closures, and events. Depending on your use case, you can easily customize the subject matter of the Amazon Kendra index to provide information for emerging user needs.

The TensorIoT search engine proved to be a powerful solution to a modern-day problem, allowing communities to stay informed and connected through text. Although the primary purpose of this application was to enhance customer support services, the solution is applicable to searching internal knowledge bases for schools, banks, local businesses, and non-profit organizations. With AWS and TensorIoT, companies like Citibot can use new and powerful technologies such as Amazon Kendra to improve their existing chatbot solutions.

About the Authors

Francisco Zamora is a Software Engineer at TensorIoT.

Nicholas Burden is a Technical Evangelist at TensorIoT.

Bratton Riley is the CEO at Citibot.
