Inspiration
Truth on the internet is hard to find, and topics often carry a publisher bias. We were pretty shocked to see that professors and journalists don't use LLMs to get their information. Upon surveying a few professors, we found that the reasons they can't use LLMs on a daily basis are trust and citation. Because LLMs learn from all the data on the internet, it's hard for them to tell fact from fiction. LLMs also often have a query bias, where the sentiment of the answer depends on the sentiment of the query itself.
What it does
With Cited RAG, you are in control of the data sources. You can create chats that each have a separate context, then upload documents and add trusted URLs to them. Chat with these trusted sources and get cited results.
How we built it
Powered by Google's Vertex AI, we load the data sources into a vector database and retrieve matching context, which is then fed into the prompt. In the prompt, we add guardrails so that the LLM never searches outside its context.
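As a rough illustration of the retrieve-then-prompt flow described above, here is a minimal TypeScript sketch. It stands in an in-memory array for the vector database and takes embeddings as plain number arrays. The names `Chunk`, `cosine`, `retrieve`, and `buildPrompt`, as well as the guardrail wording, are assumptions made for illustration; they are not the project's actual code, nor the Chroma or Vertex AI APIs.

```typescript
// Hypothetical shape for a stored piece of a trusted source.
interface Chunk {
  id: string;          // citation label, e.g. "S1"
  source: string;      // document name or trusted URL
  text: string;        // the passage itself
  embedding: number[]; // vector produced by some embedding model
}

// Cosine similarity between two equal-length vectors (0 if either is all zeros).
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Return the k chunks most similar to the query embedding, best match first.
function retrieve(query: number[], chunks: Chunk[], k: number): Chunk[] {
  return chunks
    .slice() // copy so the caller's array is not reordered
    .sort((x, y) => cosine(query, y.embedding) - cosine(query, x.embedding))
    .slice(0, k);
}

// Build a prompt that tells the model to stay inside the retrieved context.
function buildPrompt(question: string, context: Chunk[]): string {
  const sources = context
    .map((c) => `[${c.id}] (${c.source}) ${c.text}`)
    .join("\n");
  return [
    "Answer using ONLY the sources below.",
    "Cite the source id in square brackets after every claim.",
    'If the sources do not contain the answer, reply "Not found in the provided sources."',
    "",
    "Sources:",
    sources,
    "",
    `Question: ${question}`,
  ].join("\n");
}
```

In a real deployment the sort over every chunk would be replaced by the vector database's own nearest-neighbor query, and the returned string would be sent to the model.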
Challenges we ran into
We had to make sure that the LLM would not pull answers from its training data or make them up on its own.
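One way to catch answers that stray outside the supplied context is to check the citations in the model's reply against the sources that were actually retrieved. The sketch below is an assumption about how such a check could look, not something the write-up says the project does; `checkCitations`, `CitationReport`, and the `[S1]`-style citation format are hypothetical.

```typescript
// Hypothetical result of comparing cited ids with the retrieved source ids.
interface CitationReport {
  cited: string[];       // ids cited in the answer that were in the context
  unknown: string[];     // ids cited in the answer that were NOT in the context
  hasCitations: boolean; // false when the answer cites nothing at all
}

// Scan an answer for bracketed ids such as [S1] and classify each one.
function checkCitations(answer: string, allowedIds: string[]): CitationReport {
  const cited: string[] = [];
  const unknown: string[] = [];
  const pattern = /\[([A-Za-z0-9_-]+)\]/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(answer)) !== null) {
    const id = match[1];
    const bucket = allowedIds.indexOf(id) !== -1 ? cited : unknown;
    if (bucket.indexOf(id) === -1) bucket.push(id); // record each id once
  }
  return {
    cited,
    unknown,
    hasCitations: cited.length + unknown.length > 0,
  };
}
```

An answer with a non-empty `unknown` list, or with no citations at all, could be rejected or regenerated. A check like this only verifies the labels; it cannot confirm that the cited passage actually supports the claim.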
Accomplishments that we're proud of
Getting an AI project deployed and working.
What we learned
- How LLMs and RAG work
- Deploying ML projects
- Using Vector Databases
- Vertex AI
What's next for Cited RAG
- WebSockets
- News Sources
- Quoted Citing
Built With
- chroma
- chroma-db
- google-cloud
- koa
- next.js
- node.js
- react.js
- vertex-ai