What is it
The app demonstrates a legal RAG (retrieval-augmented generation) system. It splits legal documents into paragraphs and stores them in a vector database. When a user provides a feature description, the system embeds it and compares it against the stored legal paragraphs using MaxSim similarity search. The top-matching paragraphs are passed to a reranker, and the reranked results are used to construct prompts for an LLM. The LLM then generates an output explaining whether the feature is geo-compliant, along with its reasoning and references. The results can be validated against ground-truth data and evaluation metrics.
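The retrieval step can be illustrated with a minimal sketch. It assumes that "MaxSim" refers to late-interaction scoring, where each text is represented by several token-level vectors and the score is the sum, over query vectors, of the best cosine similarity to any paragraph vector. The function names `maxsim` and `top_k` and the toy vectors are hypothetical and are not taken from the project's code; in the app itself, embeddings would come from a model and the paragraphs from the vector database.

```python
import math


def normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def maxsim(query_vecs, doc_vecs):
    """Assumed MaxSim definition: for each query vector, take the highest
    cosine similarity to any document vector, then sum those maxima."""
    if not query_vecs or not doc_vecs:
        return 0.0
    q = [normalize(v) for v in query_vecs]
    d = [normalize(v) for v in doc_vecs]
    return sum(
        max(sum(a * b for a, b in zip(qv, dv)) for dv in d)
        for qv in q
    )


def top_k(query_vecs, corpus, k=3):
    """Return the ids of the k paragraphs with the highest MaxSim score.

    `corpus` maps a paragraph id to its list of vectors (hypothetical layout).
    """
    ranked = sorted(
        corpus.items(),
        key=lambda item: maxsim(query_vecs, item[1]),
        reverse=True,
    )
    return [pid for pid, _ in ranked[:k]]


if __name__ == "__main__":
    # Toy 2-dimensional vectors standing in for real embeddings.
    query = [[1.0, 0.0], [0.0, 1.0]]
    corpus = {
        "para_a": [[1.0, 0.0], [0.0, 1.0]],
        "para_b": [[1.0, 0.0]],
        "para_c": [[-1.0, 0.0]],
    }
    print(top_k(query, corpus, k=2))
```

The ids returned by `top_k` would then be handed to the reranker before prompt construction. A brute-force scan like this is only practical for small corpora; a vector database performs the equivalent search with an index.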
Challenges and Future Ideas
- parsing and chunking PDFs was very difficult
- explore the use of active learning and human-in-the-loop (HITL) review
- automatically update the knowledge base
- integrate with GitHub Discussions
Built With
- chromadb
- fastapi
- langchain
- nextjs
- python