- Problem Statement In the United States, insurance companies deny 250 million claims per year. When a patient receives a denial, they get a code like CO-4 with no explanation. Patients don't know what it means, clinicians waste hours decoding policy documents, and 90% of valid appeals are never filed because the process is too complex and deadlines are missed.
- Solution Overview ClearCare is a healthcare decision intelligence system that reads insurance policy PDFs, traces any denial code to its exact source rule using RAG (Retrieval-Augmented Generation), and produces two explanations at once — one in clinical language for the clinician, one in plain English for the patient. It then drafts a formal appeal letter and sends it from the patient's own Gmail via Auth0 Token Vault, so the appeal carries the patient's real legal identity rather than coming from a generic service address.
Inspiration
Every year, insurance companies in the United States deny over 250 million claims. Behind each denial is a specific rule buried deep inside a policy document that nobody reads — not the patient, often not even the clinic's billing department. What the patient receives instead is a code. CO-4. CO-11. OA-23. A two-to-five character string that tells them nothing.

I watched this play out firsthand. Patients had legitimate, medically necessary procedures denied because of a prior authorization technicality they had no way of knowing about in advance. Clinicians spent hours manually cross-referencing dense policy PDFs trying to build an appeal — time they should be spending on patient care. And in the end, studies show that over 90% of valid appeals are never filed, not because they would fail, but because the process is too opaque, too time-consuming, and the deadlines are easy to miss.

What makes this problem particularly frustrating is that the information to fight back already exists. It's in the policy document. The rule that caused the denial is written down. The appeals process is documented. The deadline is specified. Everything a patient needs to overturn a wrongful denial is technically available — it's just locked inside a document nobody can practically navigate under real-world time pressure.

I built ClearCare to close that information gap entirely: not just explain what happened, but actually hand the patient and clinician the tools to fight back — the explanation, the letter, and the path to submit it — in minutes, not days.
What it does
ClearCare is a healthcare decision intelligence platform that turns an opaque insurance denial into an actionable appeal in three steps.

Step 1 — Policy ingestion. A clinician uploads their insurance company's policy PDF once through the Policy Parser. ClearCare extracts every page of text, strips any protected health information before it touches an external API, chunks the document intelligently, embeds every chunk using Gemini's embedding model, and stores the result in a local ChromaDB vector database. From this point forward, every rule in that policy is instantly searchable by semantic meaning — not just keyword.

Step 2 — Denial tracing. When a patient receives a denial, they or the clinician enters the denial code and a brief description into the Denial Tracer. ClearCare runs a cosine similarity search against the embedded policy, retrieves the most relevant chunks, and sends them to Gemini 2.5 Flash with a carefully structured dual-output prompt. In a single LLM call, the system produces two structurally different explanations simultaneously: one written in clinical terminology for the provider, referencing the exact section and page number from the policy; one written in plain conversational English for the patient, explaining what the rule means and what their rights are. Every verifiable claim — section numbers, dollar thresholds, deadline counts — is then cross-checked against the source chunks by a hallucination guard. If fewer than 70% of verifiable claims can be confirmed in the retrieved text, the system flags the response with a warning rather than presenting it as fact.

Step 3 — Appeal drafting and submission. ClearCare generates a formal appeal letter citing the exact policy rule, the medical necessity of the procedure, and the grounds for reconsideration.
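The chunking in Step 1 can be sketched in a few lines. This is a minimal illustration of fixed-size word chunking with overlap, using the 800-word / 100-word figures from this writeup; the function name is hypothetical, not the actual ClearCare code:

```python
def chunk_words(text: str, chunk_size: int = 800, overlap: int = 100) -> list[str]:
    """Split text into word-based chunks; consecutive chunks share `overlap` words."""
    words = text.split()
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # final chunk reached the end; avoid emitting a pure-overlap tail
    return chunks
```

Because each chunk repeats the last 100 words of the previous one, a rule that straddles a chunk boundary always appears whole in at least one chunk.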
The letter is sent directly from the patient's own Gmail account using Auth0 Token Vault — not from a generic service address — because in healthcare, the legal identity of who sends the appeal matters as much as the content of the letter. A 30-day appeal deadline reminder is automatically created in Google Calendar, with alerts set 7 days and 1 day before expiry. Every action taken in the system — PDF upload, denial trace, appeal draft, email sent — is logged to a HIPAA-compliant audit trail with timestamps and IP addresses, with no PHI stored in the logs.
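The reminder math behind the deadline tracking is simple enough to show. This toy version (hypothetical names, standard library only) computes the 30-day deadline and the 7-day and 1-day alert dates described above:

```python
from datetime import date, timedelta

APPEAL_WINDOW_DAYS = 30
ALERT_OFFSETS_DAYS = (7, 1)  # alerts 7 days and 1 day before expiry

def appeal_schedule(denial_date: date) -> dict:
    """Return the appeal deadline and the alert dates leading up to it."""
    deadline = denial_date + timedelta(days=APPEAL_WINDOW_DAYS)
    alerts = [deadline - timedelta(days=d) for d in ALERT_OFFSETS_DAYS]
    return {"deadline": deadline, "alerts": alerts}
```

For a denial dated January 1, 2025, this yields a January 31 deadline with alerts on January 24 and January 30; in production those dates feed the Google Calendar event's reminder overrides.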
How we built it
Backend — FastAPI with Python 3.11, deployed on Render. The API is structured around three independent agents, each handling one stage of the pipeline and communicating through well-defined interfaces rather than a monolithic chain.

RAG Pipeline (Agent 1 — Policy Parser) — Built from scratch without LangChain or any abstraction framework. PyMuPDF handles PDF text extraction. Text is chunked at 800 words with a 100-word overlap — a configuration arrived at through iteration, not convention: smaller chunks lost context across paragraph boundaries, while larger chunks degraded retrieval precision. The 100-word overlap ensures that rules spanning chunk boundaries are always fully retrievable. Each chunk is embedded using the Gemini Embedding API (free tier, no local model, no PyTorch dependency) and stored in ChromaDB with cosine similarity indexing. Before any chunk reaches the embedding step, a regex-based PHI stripper removes SSNs, phone numbers, email addresses, dates of birth, MRNs, and common name patterns — so no patient data ever reaches an external API.

Decision Tracer + Hallucination Guard (Agent 2) — On a denial query, the system runs a cosine similarity search returning the top-k most relevant chunks. These chunks, combined with the denial text, are sent to Gemini 2.5 Flash under a strict dual-output JSON schema that forces the model to produce both the clinician explanation and the patient explanation in a single inference call. The prompt explicitly instructs the model to cite section numbers and page references from the retrieved text only. A post-generation verification pass then extracts all verifiable claims using regex patterns and checks each one against the original source chunks. Any claim that cannot be verified triggers a confidence warning visible in the UI. Retries are handled by Tenacity with exponential backoff.
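A regex-based PHI stripper of the kind described can be sketched as below. These patterns are illustrative only — the real stripper covers more formats (MRNs, dates of birth in several styles, common name patterns) and these placeholder tokens are my own:

```python
import re

# Illustrative patterns only; the production list is broader.
PHI_PATTERNS = [
    (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "[SSN]"),
    (re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"), "[PHONE]"),
    (re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"), "[EMAIL]"),
    (re.compile(r"\b\d{2}/\d{2}/\d{4}\b"), "[DOB]"),
]

def strip_phi(text: str) -> str:
    """Replace PHI-shaped substrings with placeholder tokens before any external API call."""
    for pattern, token in PHI_PATTERNS:
        text = pattern.sub(token, text)
    return text
```

Running the stripper before embedding means the external Gemini API only ever sees redacted text, which is the property the architecture depends on.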
Every agent call is traced by Langfuse with latency, token count, confidence score, and chunk retrieval metadata — tracing is silently disabled if Langfuse keys are not configured, so the system degrades gracefully.

Communication Agent (Agent 3) — Email delivery uses Auth0 Token Vault. Rather than storing a user's Google OAuth refresh token (a security liability), Token Vault exchanges the user's Auth0 session token for a short-lived Google API token at send time. The appeal letter goes out from the patient's actual Gmail address — this is critical because insurance companies treat appeal letters from institutional service addresses differently from patient-initiated correspondence. Calendar events are created via the Google Calendar API with two reminder triggers. If Token Vault is not configured (as in the demo environment), the system falls back to Resend for email and a downloadable ICS file for calendar — the architecture degrades gracefully at every layer.

Frontend — React 18 with Vite, deployed on Netlify. Role-based routing separates the Clinician Dashboard (Policy Parser, Denial Tracer, Appeal Drafter, Audit Log) from the Patient Portal (Denial Tracer, Appeal Drafter). Authentication uses a Supabase JWT stored in sessionStorage — not localStorage — so tokens are cleared when the tab closes. Sessions auto-expire after 15 minutes of inactivity, a standard requirement for clinical workstation compliance. Clinician accounts require an org code at signup to prevent unauthorized access to the policy management features.

Security — PHI stripping before external API calls, Supabase JWT validation on every protected route, role enforcement at the API layer (not just the frontend), a 15-minute inactivity timeout, sessionStorage token storage, a full audit log with no PHI, and CORS locked to known origins.
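The "silently disabled if keys are not configured" behavior is a small pattern worth showing. This is a generic sketch of the fallback, not the actual Langfuse client wiring — the class and function names here are hypothetical:

```python
import os

class NoopTracer:
    """Swallows tracing calls so the rest of the pipeline never branches on config."""
    def trace(self, name: str, **metadata) -> None:
        pass  # nothing recorded; same interface as the real tracer

def make_tracer(real_factory, env=None):
    """Build the real tracer only when both Langfuse keys are present; else degrade to a no-op."""
    env = os.environ if env is None else env
    if env.get("LANGFUSE_PUBLIC_KEY") and env.get("LANGFUSE_SECRET_KEY"):
        return real_factory()
    return NoopTracer()
```

Calling code always does `tracer.trace(...)` unconditionally; the no-op object absorbs the calls in unconfigured environments, which is what makes the degradation invisible to the rest of the system.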
Challenges we ran into
The hallucination guard was by far the hardest part. The naive approach — just prompting the model to "only use information from the provided text" — does not work reliably. The model will still occasionally synthesize plausible-sounding section numbers or deadline lengths that don't appear in the retrieved chunks. The solution was a post-generation verification pass: extract every verifiable claim in the output using pattern matching, check each one character-by-character against the source chunks, and compute a match ratio. Below 70%, the response is flagged; above 70%, the confidence score is shown. This approach isn't perfect — semantic paraphrasing can fail the check even when the underlying fact is correct — but it catches the most dangerous failure mode: invented specifics presented as policy fact.

Getting Auth0 Token Vault to work reliably required careful debugging of the OAuth scope chain. The token exchange needs to happen at request time, not at login time, which means the user's Auth0 session must still be valid when the appeal is sent. Session management on the frontend (Supabase) and the backend (Auth0) had to be kept in sync.

ChromaDB on Render's free tier resets its in-memory state on every redeploy, so policies need to be re-indexed after each deployment. This is a known limitation documented in the README; the production fix is Pinecone for persistent vector storage.
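The core of that verification pass can be sketched as follows. The 70% threshold and the claim types (section references, dollar thresholds, deadline counts) come from this writeup; the specific regexes and helper name are illustrative, and the real extractor is richer:

```python
import re

CLAIM_PATTERNS = [
    r"[Ss]ection\s+\d+(?:\.\d+)*",   # section references, e.g. "Section 4.2"
    r"\$\d[\d,]*(?:\.\d{2})?",       # dollar thresholds, e.g. "$1,500"
    r"\b\d+\s+days?\b",              # deadline counts, e.g. "30 days"
]

def verify_claims(answer: str, source_chunks: list[str], threshold: float = 0.7):
    """Extract verifiable claims from the answer and check each against the sources.

    Returns (match_ratio, flagged); flagged is True when fewer than `threshold`
    of the extracted claims appear verbatim in the retrieved chunks.
    """
    source_text = " ".join(source_chunks)
    claims = [m.group(0) for p in CLAIM_PATTERNS for m in re.finditer(p, answer)]
    if not claims:
        return 1.0, False  # nothing verifiable to check
    ratio = sum(1 for c in claims if c in source_text) / len(claims)
    return ratio, ratio < threshold
```

The verbatim substring check is exactly what makes the guard strict: a paraphrased but correct claim can fail, while an invented "Section 9.9" reliably does.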
Accomplishments that we're proud of
Building the entire RAG pipeline from scratch — no LangChain, no LlamaIndex, no abstraction layer — means every component is fully understood, debuggable, and tunable. The chunking parameters, the retrieval strategy, the embedding model choice — all of these were deliberate decisions, not framework defaults.

The dual-audience explanation from a single LLM call using a forced JSON schema is technically elegant and genuinely useful. One inference call, two structurally different outputs targeting completely different readers: the clinical explanation uses ICD codes, section references, and technical billing language, while the patient explanation avoids jargon entirely. Both are grounded in the same retrieved source chunks.

Auth0 Token Vault replacing a full OAuth implementation is architecturally clean. No refresh tokens stored, no token rotation logic written, no Google API credentials managed on the backend — Token Vault handles all of it. The appeal letter carries the patient's real identity, not the service's.

The system is fully deployed end-to-end, live, and demoed in the video. Not a prototype. Not a mockup. A working system a clinician can use today.
What we learned
RAG quality is determined almost entirely by chunking strategy and retrieval tuning, not by the LLM. Switching from GPT-4 to Gemini didn't meaningfully change output quality once the retrieval was right, but changing chunk size from 400 words to 800 words with overlap made the difference between the system finding the right rule and confidently returning an unrelated paragraph.

In healthcare specifically, the identity of the sender matters legally. An appeal letter from a patient's own Gmail is treated differently by insurance companies than one from a third-party service. This is why Auth0 Token Vault became a core architectural feature rather than a convenience — it's not about personalization, it's about legal standing.

Hallucination in RAG systems is not solved by retrieval alone. Even with the right chunks in the context, models will occasionally synthesize specifics that weren't there. The post-generation verification pass is not optional in a healthcare context — it's the difference between a useful tool and a liability.
What's next for ClearCare — Healthcare Decision Intelligence
- Replace ChromaDB with Pinecone for persistent vector storage that survives redeployment.
- Add OCR via the Google Vision API to support scanned PDFs, which make up a significant portion of real-world insurance policy documents.
- Replace the regex-based PHI stripper with Microsoft Presidio for NLP-based entity detection that handles edge cases regex cannot.
- Expand to support multiple insurance companies simultaneously, with per-payer policy isolation.
- Build a bulk denial analysis dashboard so hospital billing departments can identify systemic denial patterns across hundreds of claims.
- Explore fine-tuning a smaller model on insurance policy language to reduce reliance on a general-purpose LLM for domain-specific retrieval tasks.
Built With
- auth0-token-vault
- chromadb
- fastapi
- gemini-api
- gmail-api
- google-calendar-api
- javascript
- jwt
- langfuse
- netlify
- pydantic
- pymupdf
- python
- react
- render
- resend
- supabase
- tenacity
- vite