Inspiration
Whether it's a 30-page contract or a mysterious medical bill, legal, financial, and medical jargon often works as a shield that hides predatory terms and billing errors from the typical consumer. The documents that affect your life the most, whether through your health, debt, or rights, are often the hardest to read. We built ClearDocs, an AI-powered second pair of eyes that doesn't just summarize your documents but hunts for red flags and suggests solutions, all in plain English that anyone can read.
What it does
- Protects Privacy: Automatically redacts PII (personally identifiable information) before any text is sent to the AI
- Decodes Complexity: Generates a summary of the document's purpose and costs involved in plain English
- Identifies Red Flags: Hunts for predatory clauses or billing errors and provides solutions
- Verifies Evidence: Maps each red flag to a quote from the source text and highlights it in the source PDF, so the user can see exactly where the problem is and what it says
How we built it
- Document Processing: The PyMuPDF4LLM library converts uploaded PDFs to markdown, which preserves structure such as headings and tables for the AI. The text then goes through ftfy, a library that repairs broken encoding and garbled symbols. Finally, private information is redacted with the scrubadub library.
- AI: Before the main AI, Gemini, sees the text, the system uses LSA (latent semantic analysis) through Sumy to identify and extract the most significant sentences. This 'compresses' the document to a manageable size while retaining most of the original meaning. Gemini (model 3-flash-preview) then generates a response containing the summary, flags, and solutions, which is parsed into JSON.
- API: The entire tool is wrapped in FastAPI to return the response to the frontend.
- Frontend: The frontend is built using Vite, TypeScript, and TailwindCSS.
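The backend steps above can be sketched roughly as follows. This is a minimal illustration, not our exact code: the function names, the default `sentence_count`, and the `summary`/`flags`/`solutions` keys are assumptions made for the example. The third-party imports sit inside the functions so the JSON helper can be tried on its own.

```python
import json

FENCE = "`" * 3  # markdown code fence the model sometimes wraps around JSON
REQUIRED_KEYS = ("summary", "flags", "solutions")  # hypothetical schema


def pdf_to_redacted_text(pdf_path: str) -> str:
    """PDF -> markdown -> encoding repair -> PII redaction."""
    import ftfy
    import pymupdf4llm
    import scrubadub

    markdown = pymupdf4llm.to_markdown(pdf_path)
    repaired = ftfy.fix_text(markdown)
    # scrubadub replaces detected PII with placeholders.
    return scrubadub.clean(repaired)


def compress(text: str, sentence_count: int = 30) -> str:
    """Keep the sentences that LSA ranks as most significant."""
    from sumy.nlp.tokenizers import Tokenizer
    from sumy.parsers.plaintext import PlaintextParser
    from sumy.summarizers.lsa import LsaSummarizer

    parser = PlaintextParser.from_string(text, Tokenizer("english"))
    sentences = LsaSummarizer()(parser.document, sentence_count)
    return " ".join(str(sentence) for sentence in sentences)


def parse_model_reply(raw: str) -> dict:
    """Turn the model's text reply into a dict, tolerating a code fence."""
    body = raw.strip()
    if body.startswith(FENCE):
        # Drop the opening fence line, then everything from the closing fence.
        body = body.split("\n", 1)[1] if "\n" in body else ""
        body = body.rsplit(FENCE, 1)[0]
    data = json.loads(body)
    missing = [key for key in REQUIRED_KEYS if key not in data]
    if missing:
        raise ValueError(f"model reply is missing keys: {missing}")
    return data
```

Only the output of `compress(pdf_to_redacted_text(path))` would be placed in the prompt, so the model never sees unredacted text.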
Challenges we ran into
- Regex: Finding the right libraries and regular expressions for redacting PII
- AI models: Due to memory, API, and financial constraints, we had to opt for smaller, lightweight models that are free. Our original plan was the OpenAI API and Llama for the response, along with Microsoft Presidio and spaCy for PII redaction, but we had no money, resources, or time for that. Even so, our current tech stack might be the better option because it is so lightweight.
- Deployment: Even though we tried our best to keep memory usage low, it was still too high for the free tier on Render.
- Mismatched sentences: Because the text is redacted, it can be hard to find the source sentence that serves as evidence for each risk. Ex: the redacted "contact {{phone}} to arrange payment" will not match the original "contact 1234567890 to arrange payment". However, we hold this line, as we should never give personal information to the AI.
- Prompt engineering: It's hard to get AI to output exactly what you want. Through trial and error, we got it to format its responses in a specific manner so that we can process them consistently.
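One possible way around the mismatched-sentence problem is to treat each placeholder in the redacted quote as a wildcard when searching the original text. The sketch below is an idea rather than what ClearDocs currently does; the helper names and the 80-character limit are assumptions. The matching runs locally after the model has replied, so the AI still never sees the personal information.

```python
import re
from typing import Optional, Tuple

PLACEHOLDER = re.compile(r"\{\{.*?\}\}")  # matches {{phone}}, {{EMAIL}}, ...
GAP = r"\s*.{1,80}?\s*"  # a placeholder may stand for up to 80 characters


def quote_to_pattern(redacted_quote: str) -> "re.Pattern[str]":
    """Build a regex in which every placeholder matches any short run of text."""
    literal_parts = PLACEHOLDER.split(redacted_quote)
    # Escape the literal words and allow flexible whitespace between them.
    escaped = [
        r"\s+".join(re.escape(word) for word in part.split())
        for part in literal_parts
    ]
    return re.compile(GAP.join(escaped))


def find_evidence(redacted_quote: str, source_text: str) -> Optional[Tuple[int, int]]:
    """Return the (start, end) span of the quote in the unredacted source."""
    match = quote_to_pattern(redacted_quote).search(source_text)
    return match.span() if match else None


source = "Please contact 1234567890 to arrange payment."
span = find_evidence("contact {{phone}} to arrange payment", source)
if span is not None:
    print(source[span[0]:span[1]])
```

The returned span could then be used to highlight the sentence in the source document.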
Accomplishments that we're proud of
- Design: The design is beautiful, and we are very proud of it.
- Full-stack App: We built an asynchronous backend along with a reactive Vite frontend that handles input and output.
What we learned
- AI API: We had trained our own models before, but this was our first time using an API to access a public model.
- Data Processing: It is important to process data before giving it to AI, whether by redacting private information for security, converting it to markdown, or building a compound AI system that chains multiple AIs.
- FastAPI: For some of us, this was our first time using FastAPI. Through it, we learned about HTTP requests and the REST API style.
What's next for ClearDocs
We have many ideas for features that we can add:
- Confidence Score: Label each flag/risk with a confidence score based on how closely it matches a corresponding sentence from the source PDF.
- Risk Rating: Rate each flag/risk on a scale from 1 to 100, depending on how important it is to address.
- Alternatives: Specifically for contracts like housing contracts, we can have the AI find alternatives to the contract a user shows, providing more options.
- Chat Interface: An interactive chat where the user can ask more questions about their documents and their problems.
- Display PDF: A separate page displaying the outputs of the AI, along with a preview of the annotated PDF.
- Deployment: Due to financial and memory constraints, we were unable to deploy our project and could only run it locally, so getting it hosted is still on the list.
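As a rough idea of how the Confidence Score feature could work, the sketch below scores a flagged quote against every sentence in the source and keeps the best match. It uses Python's standard `difflib` for string similarity; the function name and the simple sentence splitter are assumptions for illustration, and a real version would likely need a more robust splitter.

```python
import re
from difflib import SequenceMatcher
from typing import Tuple


def confidence(quote: str, source_text: str) -> Tuple[float, str]:
    """Return (score between 0 and 1, best-matching source sentence)."""
    # Naive split on sentence-ending punctuation followed by whitespace.
    sentences = [
        s.strip() for s in re.split(r"(?<=[.!?])\s+", source_text) if s.strip()
    ]
    best_score, best_sentence = 0.0, ""
    for sentence in sentences:
        score = SequenceMatcher(None, quote.lower(), sentence.lower()).ratio()
        if score > best_score:
            best_score, best_sentence = score, sentence
    return best_score, best_sentence


source = (
    "Rent is due on the first. "
    "A late fee of $500 applies after one day. "
    "Pets are allowed."
)
score, sentence = confidence("A late fee of $500 applies after one day.", source)
print(round(score, 2), sentence)
```

A score close to 1 would mean the flag quotes the document almost word for word, while a low score would suggest the AI paraphrased or the quote could not be found.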
Built With
- fastapi
- python
- react
- tailwindcss
- typescript