Inspiration
Most customer service bots are faceless and interchangeable. Hotels in Las Vegas are deeply committed to their theatrical identity and that extends to services they provide, yet their current chat bots are generic. So our goal is to rebrand this new face of hospitality so it does not break this long kept theatrical identity. Meet Jester, a costumer service chat bot who stays in character while assisting with any general questions and has the ability to detect or be explicitly told to escalate to a human contact.
What it does
Essentially Jester is a character-driven AI who's information and responses are supported by a RAG database and prompt guiding presets to insure that the agent does not hallucinate or answer inadequately.
How we built it
Jester is composed of two main parts, its frontend and its backend.
The frontend is a static react page that renders all the relevant html needed and posts users inputs to the backend via API post requests. Written entirely in typescript enhanced by react and stylized by tailwind it provides all the relevant functionality to demo how Jester can properly service the end user.
The core of Jester is in its backend, written in python Jester leverages the featherless.ai API to orchestrate multi-modal communication and provide specific services such as vision or text requests. To prevent hallucinations and guide the agent's responses we implemented a knowledge base system that holds relevant information for each hotel, things like current entertainment, food places, and hours of operation. This knowledge base is read as chunks by an ingest script ran at the start of the backend server, once this completes our RAG database is then constructed using ChromaDb and embedding functions ,namely ONNXMiniLM_L6_V2, to store the data from the knowledge base. The backend service then waits for the user's input from the frontend to be posted, once obtained using fastAPI the information is cleaned then passed into a prompt generation sequence, where we select the model, user's requirements, and then package the characters personality until the final prompt is submitted to the model. The output of the model's thought process is then thrown away and the final response is outputted to the user user once it is posted by the backend. To maintain context in the discussion a python dictionary stores the session id and relevant information.
Challenges we ran into
Throughout this project we faced a few bottlenecks at each phase of development. Our first major bottleneck was figuring out the API interaction and hosting method for our frontend and backend. We ended up selecting render to host our backend as a webservice and frontend as a static site. Then we encountered an issue neither of us thought about, model parameter intake and what that means for our service's response time. We did not want to ruin user experience in both efficiency and effectiveness by using a small language model nor did we want to burn their token limit in one question using something Kimi-K2. Our other restraint is we wanted to support image submission as this increases the amount of services we can provide support for, so we ended up focusing on models around 70 billion parameters. Lastly, the thing that took us the longest was figuring out the user interface and accessibility features that our chat bot can support without taking away from the visual influence and atmosphere. Overall, we had a good amount of minor issues throughout this project, but these were by far the most challenging.
Accomplishments that we're proud of
This our first somewhat full stack project that we have ever made and deployed successfully, we also have never gotten a chance to really interact with language models in this way, so this was a very successful event for us. We are also very happy with how we organized our code infrastructure to support the actual flow of the application, this helped us divide and conquer a lot of the project too.
What we learned
We learned a lot about https requests, most commonly error codes such as status 402, how to properly manage dependencies between two essentially separate cores, prompt guiding and language inference techniques, and the impact planning has on coding efficiency.
What's next for Jester
Three things that can be improved in the future for Jester is changing context storage from a python dictionary to an actual relational database, provide much more dense information in the knowledge base and implement metadata filtering so that agents do not receive information on a hotel they do not represent, and add even more safeguards, escalation detection methods and refine current prompt injections so that responses are more focused.
Built With
- chromadb
- docker
- featherless
- openai
- python
- react
- render
- tailwind
- typescript
Log in or sign up for Devpost to join the conversation.