🩺 Inspiration
Every year, insurers run a quiet math problem on millions of patients.
In 2024, they denied 85 million in-network claims on the ACA marketplace. Fewer than 1% were ever appealed [1]. And when patients actually fight back? 81.7% of appealed prior authorization denials get overturned [2].
Let that sink in. The fight works. Most people just never fight.
"The denial is never the last word legally. For most people, it is the last word practically."
Why? Because insurers have entire teams of lawyers, AI models, and clinical reviewers designing the denial. Patients have a letter, a deadline, and hold music. That asymmetry is by design. It keeps denials profitable.
We built Overturn to flip the table.
⚡ What it does
Overturn is an autonomous agent swarm that fights insurance denials for you.
You drop in a denial letter. Sixty seconds later, a certified-mail appeal is in the USPS pipeline. Real tracking number. Real medical guideline cited. Real legal precedent cited. Real clinical harm spelled out. All while you were reading this sentence.
🏥 The flow
| Step | What happens |
|---|---|
| 📨 Read | Extract ICD-10, CPT codes, denial reason from the letter |
| 🩻 Triage | Score clinical urgency. Pick the right legal framework |
| 🧠 Research | Pull medical guidelines + legal precedents in parallel |
| ⚠️ Harm Model | Translate denial into clinical consequences, not dollar amounts |
| ✍️ Draft | Formal appeal citing both kinds of authority |
| 📬 File | Lob returns a certified mail tracking number |
Patient taps approve. Overturn does the rest.
🎯 Why this is healthcare, not legaltech
Every other "appeal generator" on the internet stops at "here's a draft letter, good luck." They treat denials like paperwork problems. Denials are medical problems.
Three pieces make Overturn a healthcare project:
🧪 Clinical Guideline Grounding
The medical brain.
Retrieval over real medical literature. Not summaries. Not paraphrases. The actual guidelines:
- 📘 ACR Appropriateness Criteria (American College of Radiology)
- 🧠 AAN Guidelines (American Academy of Neurology)
- 🎗️ NCCN Guidelines (National Comprehensive Cancer Network)
- 🛡️ USPSTF Recommendations (U.S. Preventive Services Task Force)
Every appeal we draft cites real medical authority. Not just case law.
💊 Care Impact Framing
Medical reasoning.
The agent argues patient harm, not patient finances. A denied MRI isn't "$4,237 in out-of-pocket costs." It's:
"Estimated 6 to 14 week diagnostic delay. Opioid exposure extends with each week. Conservative management already failed at 12 weeks per the patient's PT records."
That's the argument that wins. Not the dollar figure.
🚨 Clinical Urgency Triage
Medical decision-making.
Not every denial is equal. An elective MRI is not an insulin denial. Overturn treats them differently:
| Tier | What it looks like | What the agent does |
|---|---|---|
| 🟢 Routine | Elective imaging, PT sessions | Standard appeal flow |
| 🟡 Urgent | Suspected disc herniation, chronic pain | Appeal + surface alternative providers |
| 🔴 Emergent | Insulin, chemo, oxygen | Crisis pathway: emergency fill laws, parallel filing, acts in minutes |
An insulin denial isn't a routine denial with a faster timer. It's a medical emergency. Overturn knows the difference.
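
The tier table above can be read as a routing function. What follows is a minimal sketch under stated assumptions: the tier names mirror the table, but `UrgencyTier`, `CarePathway`, `routeByUrgency`, and the action labels are hypothetical and not taken from the Overturn codebase.

```typescript
// Hypothetical types; tier names mirror the table above.
type UrgencyTier = "routine" | "urgent" | "emergent";

type CarePathway = {
  readonly tier: UrgencyTier;
  readonly actions: readonly string[];
};

// Exhaustive switch: adding a tier without handling it fails to compile.
function routeByUrgency(tier: UrgencyTier): CarePathway {
  switch (tier) {
    case "routine":
      return { tier, actions: ["standard-appeal"] };
    case "urgent":
      return { tier, actions: ["standard-appeal", "surface-alternative-providers"] };
    case "emergent":
      // The crisis pathway runs first; the appeal is filed in parallel.
      return { tier, actions: ["emergency-fill-lookup", "parallel-filing", "standard-appeal"] };
    default: {
      const unreachable: never = tier;
      throw new Error(`Unhandled tier: ${String(unreachable)}`);
    }
  }
}
```

Keeping the emergent branch as its own case, rather than a flag on the routine one, reflects the point that an insulin denial takes a different path and not just a shorter timer.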
📖 One real case, start to finish
Meet Maria. 67 years old. Chronic low back pain for two years. Her doctor suspects a disc herniation and orders an MRI.
Anthem denies: "not medically necessary."
Maria, like 99% of patients, doesn't appeal. The opioid prescription continues. The disc gets worse. She waits.
Now imagine Maria had Overturn. Here is what happens in 3 seconds:
🩺 Triage: urgent. Estimated 6 to 14 week diagnostic delay. Opioid exposure extends.
📘 Guideline: ACR Appropriateness Criteria rates MRI lumbar 8 of 9 for this exact presentation.
⚖️ Precedent: 11 prior California IMR overturns of the identical Anthem denial on record.
✍️ Draft: formal appeal generated, both citations inline, clinical harm framed.
✅ Filed certified mail. Tracking number returned.
Care Continuity Score: 82 / 100
We don't optimize for appeal wins. We optimize for continuity of care.
🏗️ How we built it
Six specialized TypeScript services. Each on its own port. Each with persistent state. All coordinating through Redis pub/sub with shared state in Postgres + pgvector.
Not one LLM wearing six hats. A real swarm.
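
To make the coordination pattern concrete, here is a minimal sketch of typed events moving between agents. It uses an in-memory bus as a stand-in for Redis pub/sub so that it runs on its own; messages are still serialized to strings, as they would be over Redis. The channel names, payload fields, and `createBus` helper are all hypothetical.

```typescript
// Hypothetical event map: channel name -> payload shape.
type SwarmEvents = {
  "denial.ingested": { caseId: string; letterText: string };
  "denial.triaged": { caseId: string; tier: "routine" | "urgent" | "emergent" };
};

type Channel = keyof SwarmEvents;
type Handler<K extends Channel> = (payload: SwarmEvents[K]) => void;

type Bus = {
  subscribe: <K extends Channel>(channel: K, handler: Handler<K>) => void;
  publish: <K extends Channel>(channel: K, payload: SwarmEvents[K]) => void;
};

// In-memory stand-in for Redis pub/sub (assumption: one process, no network).
function createBus(): Bus {
  const listeners = new Map<Channel, Array<(raw: string) => void>>();

  function subscribe<K extends Channel>(channel: K, handler: Handler<K>): void {
    const list = listeners.get(channel) ?? [];
    // A real service would validate the parsed message against a schema here.
    list.push((raw) => handler(JSON.parse(raw) as SwarmEvents[K]));
    listeners.set(channel, list);
  }

  function publish<K extends Channel>(channel: K, payload: SwarmEvents[K]): void {
    const raw = JSON.stringify(payload);
    for (const listener of listeners.get(channel) ?? []) {
      listener(raw);
    }
  }

  return { subscribe, publish };
}
```

Because the channel name selects the payload type, publishing a triage event without a `tier` is a compile-time error rather than a runtime surprise in a downstream agent.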
🔧 The agents
| Agent | Job |
|---|---|
| 🕵️ Watcher | Ingests Gmail, USPS Informed Delivery, Knot card transactions |
| 🎯 Triage | K2 Think V2 extracts codes, scores urgency, picks legal framework |
| 🧠 Dual Researcher | pgvector search across legal + clinical collections in parallel |
| ✍️ Drafter | Structured output forces both citation slots to be filled |
| 📬 Care Pathway + Submitter | Routes by urgency. Files via Lob, Phaxio, or OpenClaw |
| ⏰ Escalator | Periodic sweep. Wakes on deadlines. Drafts external reviews |
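
The Escalator's periodic sweep can be pictured as a pure function over tracked appeals. This is a sketch only: the `TrackedAppeal` shape, the status values, and the lookahead parameter are hypothetical, and the real agent reads its state from Postgres rather than an array.

```typescript
// Hypothetical record shape for an appeal the swarm is tracking.
type TrackedAppeal = {
  caseId: string;
  deadline: Date;
  status: "filed" | "resolved" | "escalated";
};

const DAY_MS = 24 * 60 * 60 * 1000;

// Returns filed appeals whose deadline is overdue or falls within the
// lookahead window, soonest deadline first.
function findDueForEscalation(
  appeals: readonly TrackedAppeal[],
  now: Date,
  lookaheadDays: number,
): TrackedAppeal[] {
  const cutoff = now.getTime() + lookaheadDays * DAY_MS;
  return appeals
    .filter((a) => a.status === "filed" && a.deadline.getTime() <= cutoff)
    .sort((l, r) => l.deadline.getTime() - r.deadline.getTime());
}
```

Passing `now` in as an argument keeps the sweep deterministic and easy to test; a service would call it from a timer and hand each result to the drafting step.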
⚙️ The stack
| Layer | Tech |
|---|---|
| Runtime | Bun 1.x |
| Language | TypeScript strict + zod |
| HTTP | Hono |
| Database | Postgres 17 + pgvector |
| Events | Redis pub/sub |
| LLM Core | K2 Think V2 (MBZUAI 70B reasoning model) |
| LLM Backup | Claude Sonnet 4.6 |
| Mail API | Lob (certified mail with tracking) |
| Portals | OpenClaw (Eragon) |
| iMessage | Spectrum (Photon) |
| Transactions | Knot (TransactionLink) |
| Infra | Dedalus Machines (one per agent, persistent) |
| Quality | SonarQube Cloud |
| Velocity | Enter.pro credits |
📚 The data that makes it work
Legal precedents: California's DMHC Independent Medical Review dataset. Every IMR decision publicly reported since 2001. We filtered to medical necessity overturns and embedded the findings text into pgvector. When the Researcher retrieves a precedent, it's real.
Clinical guidelines: four different medical bodies in four completely different formats:
- 🩻 ACR publishes appropriateness ratings on a 1-to-9 scale.
- 🧠 AAN uses narrative evidence summaries.
- 🎗️ NCCN builds decision trees and flowcharts.
- 🛡️ USPSTF assigns letter grades A, B, C, D, I.
Normalizing all four into one pgvector collection with consistent metadata for cross-condition retrieval was one of the harder problems we solved. Worth it. Every appeal now reaches across all four bodies simultaneously.
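
One way to picture that normalization is a discriminated union of source formats mapped onto a single chunk shape. This is a sketch under assumptions, not the project's schema: `SourceRecord`, `GuidelineChunk`, `toEmbedText`, `normalize`, and every field name are hypothetical.

```typescript
// Hypothetical input shapes, one per publishing body.
type SourceRecord =
  | { body: "ACR"; topic: string; procedure: string; rating: number } // 1 to 9
  | { body: "AAN"; topic: string; summary: string } // narrative evidence
  | { body: "NCCN"; topic: string; pathway: readonly string[] } // flattened decision path
  | { body: "USPSTF"; topic: string; statement: string; grade: "A" | "B" | "C" | "D" | "I" };

// One shape for the shared collection; the source record rides along as metadata.
type GuidelineChunk = {
  body: SourceRecord["body"];
  topic: string;
  embedText: string;
  original: SourceRecord;
};

// Renders each format as plain text suitable for embedding.
function toEmbedText(record: SourceRecord): string {
  switch (record.body) {
    case "ACR":
      return `${record.procedure} for ${record.topic}: ACR appropriateness rating ${record.rating} of 9`;
    case "AAN":
      return `${record.topic}: ${record.summary}`;
    case "NCCN":
      return `${record.topic}: ${record.pathway.join(" -> ")}`;
    case "USPSTF":
      return `${record.topic}: ${record.statement} (USPSTF grade ${record.grade})`;
  }
}

function normalize(record: SourceRecord): GuidelineChunk {
  return {
    body: record.body,
    topic: record.topic,
    embedText: toEmbedText(record),
    original: record,
  };
}
```

Keeping `original` untouched means a citation can still quote the rating, grade, or pathway exactly as the publishing body stated it.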
📏 Repo discipline
We wrote these rules on hour one and held them for the full 36:
- ❌ No classes (except Error)
- ❌ No `any` types
- ❌ No default exports
- ❌ No barrel files
- ✅ All LLM calls go through `shared/llm.ts`
- ✅ All events go through `shared/events.ts`
- ✅ All env validation through `shared/config.ts` with zod
- ✅ Tests live next to source
- ✅ Conventional commits only
The payoff: the swarm feels like a system, not a hack. Adding the sixth agent took 2 hours. The first one took 8.
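
The env-validation rule is the easiest of these to illustrate. Below is a dependency-free sketch of the idea behind a single config module. The project uses zod for this; the hand-rolled `loadConfig`, the `ConfigError` class, and the variable names `DATABASE_URL`, `REDIS_URL`, and `PORT` are hypothetical stand-ins.

```typescript
// The one kind of class the rules allow: an Error subclass.
class ConfigError extends Error {
  readonly problems: readonly string[];
  constructor(problems: readonly string[]) {
    super(`Missing or invalid env vars: ${problems.join(", ")}`);
    this.name = "ConfigError";
    this.problems = problems;
  }
}

// Hypothetical config shape.
type Config = { databaseUrl: string; redisUrl: string; port: number };

// Validates everything up front and reports all problems at once.
function loadConfig(env: Record<string, string | undefined>): Config {
  const problems: string[] = [];
  const databaseUrl = env["DATABASE_URL"] ?? "";
  const redisUrl = env["REDIS_URL"] ?? "";
  const port = Number(env["PORT"] ?? "");

  if (databaseUrl === "") problems.push("DATABASE_URL");
  if (redisUrl === "") problems.push("REDIS_URL");
  if (!Number.isInteger(port) || port <= 0) problems.push("PORT");

  if (problems.length > 0) throw new ConfigError(problems);
  return { databaseUrl, redisUrl, port };
}
```

Each service would call something like `loadConfig(process.env)` once at startup, so a misconfigured agent fails immediately instead of partway through an appeal.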
🔥 Challenges we ran into
🎯 Saying no. Nine sponsor tracks were within reach. On Saturday afternoon we cut Regeneron's clinical trials integration even though the scaffold existed. Shipping it half-wired would have weakened the core demo. Hard call. Right call. The cuts are a feature, not a failure.
🗃️ Seeding real precedents. The demo moment where the Researcher returns a real legal citation only works if real citations exist. We used California DMHC's public IMR dataset, filtered to medical necessity overturns, chunked the findings text, embedded into pgvector. Getting the top-1 retrieval to actually match took multiple rounds of query-side prompt tuning. Retrieval quality isn't glamorous, but it is the product.
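
For readers unfamiliar with vector retrieval, the ranking step looks roughly like this. It is an in-memory illustration, not the project's query: in Postgres the ordering would come from a pgvector distance operator, and the `PrecedentChunk` shape, `cosineSimilarity`, and `topK` are hypothetical names.

```typescript
// Hypothetical shape for an embedded chunk of IMR findings text.
type PrecedentChunk = { id: string; text: string; embedding: readonly number[] };

function cosineSimilarity(a: readonly number[], b: readonly number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0; // fallback only satisfies strict index checks
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Scores every chunk against the query and keeps the k most similar.
function topK(
  query: readonly number[],
  chunks: readonly PrecedentChunk[],
  k: number,
): PrecedentChunk[] {
  return chunks
    .map((chunk) => ({ chunk, score: cosineSimilarity(query, chunk.embedding) }))
    .sort((l, r) => r.score - l.score)
    .slice(0, k)
    .map((scored) => scored.chunk);
}
```

The ranking itself is simple; the tuning effort described above goes into what the query text looks like before it is embedded.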
🧬 Four formats, one retrieval layer. ACR, AAN, NCCN, USPSTF each publish in ways that do not speak to each other. Appropriateness ratings vs narrative evidence vs decision trees vs letter grades. We ended up building a normalization layer that made all four query-compatible while preserving the original format in metadata. Not sexy. Essential.
⚖️ Forcing dual-source drafting. Our first Drafter prompts leaned heavily on whichever chunk came back first. If legal won, the appeal sounded like a lawyer. If clinical won, it sounded like a radiologist. Neither was right. The fix: structured output with explicit clinical-authority and legal-authority slots so K2 literally cannot generate an appeal unless both are filled.
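
The two-slot contract can be sketched as a parser that refuses any draft missing either authority. This is a hand-rolled illustration under assumptions: the field names `clinicalAuthority` and `legalAuthority`, the `Citation` shape, and `parseDraft` are hypothetical, and the real Drafter enforces the constraint through the model's structured output.

```typescript
// Hypothetical shapes for the Drafter's structured output.
type Citation = { source: string; quote: string };
type AppealDraft = { clinicalAuthority: Citation; legalAuthority: Citation; body: string };
type ParseResult = { ok: true; draft: AppealDraft } | { ok: false; errors: string[] };

function isCitation(value: unknown): value is Citation {
  if (typeof value !== "object" || value === null) return false;
  const record = value as Record<string, unknown>;
  const source = record["source"];
  const quote = record["quote"];
  return (
    typeof source === "string" && source.trim() !== "" &&
    typeof quote === "string" && quote.trim() !== ""
  );
}

// Accepts the model's raw JSON only when both authority slots are filled.
function parseDraft(raw: string): ParseResult {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return { ok: false, errors: ["not valid JSON"] };
  }
  if (typeof parsed !== "object" || parsed === null) {
    return { ok: false, errors: ["not an object"] };
  }
  const obj = parsed as Record<string, unknown>;
  const clinical = obj["clinicalAuthority"];
  const legal = obj["legalAuthority"];
  const body = obj["body"];

  if (isCitation(clinical) && isCitation(legal) && typeof body === "string" && body.trim() !== "") {
    return { ok: true, draft: { clinicalAuthority: clinical, legalAuthority: legal, body } };
  }
  const errors: string[] = [];
  if (!isCitation(clinical)) errors.push("clinicalAuthority missing or empty");
  if (!isCitation(legal)) errors.push("legalAuthority missing or empty");
  if (typeof body !== "string" || body.trim() === "") errors.push("body missing or empty");
  return { ok: false, errors };
}
```

A rejected draft can be sent back to the model with the error list, which removes the failure mode where whichever chunk arrived first set the tone of the whole appeal.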
🚨 Urgency tier calibration. Insulin = emergent. Elective imaging = routine. Easy. But what about a medication for a controlled but progressive condition? What about pain management on day 90? These are clinical ethics questions disguised as engineering problems. We iterated until the tier boundaries were defensible.
🤖 Portal automation edge cases. OpenClaw works beautifully on our test portal. Real insurer portals have CAPTCHAs, session timeouts, and anti-bot measures we'd fight in production. We scoped portal submission to one known-cooperative flow and kept Lob certified mail as the primary channel. Scope discipline over demo flashiness.
🏆 Accomplishments we're proud of
- 📮 A real certified mail tracking number, live on stage, in under 2 seconds. Not a draft. Not a summary. A filed appeal.
- 🧠 Dual-source retrieval across 4 clinical guideline bodies + 1 legal precedent database. Every appeal cites both. We found no other tool that does this.
- 🚦 Urgency triage that actually changes behavior. The agent makes clinical decisions, not just paperwork decisions.
- 💊 Care impact framing in clinical language, not financial. The medical argument is the strong argument.
- 🕸️ Six genuinely coordinating agents on Dedalus Machines. Not one LLM pretending to be six. The Escalator already runs a periodic sweep with real endpoints.
- 🚀 Full portability. `bash scripts/setup.sh` spins up the entire stack on macOS, Linux, or WSL2. Zero host dependencies.
🎓 What we learned
Legal reasoning is clinical reasoning in disguise. Every strong health insurance appeal argues medicine and law together. We started building a letter drafter and realized we were building a two-headed expert system.
Retrieval quality beats model size. K2 Think V2 is a strong reasoning model. But appeal quality was gated by whether the Researcher pulled the right precedent, not by raw LLM intelligence. The data is the product.
Agent orchestration is 80% plumbing. Once clean event schemas existed, adding a new agent took 2 hours. The first took 8. Invest in infrastructure early and it pays you back every hour after.
Urgency tiers are ethical decisions. Calling a denial "emergent" vs "urgent" changes what the agent does next. We learned to be conservative on boundaries and explicit about the clinical reasoning behind each tier.
Real action beats pretty demos. A Lob tracking number returned live beats any slide deck. Judges stop thinking "cool prototype" and start thinking "this actually works."
Saying no is a feature. We had 9 sponsor tracks available. We picked the ones our core build actually touched. The project is stronger for what we cut.
🚀 What's next for Overturn
| Feature | Why it matters |
|---|---|
| 🧬 Clinical trial matching | When care is denied, match patients to active trials via ClinicalTrials.gov |
| 🏥 Medicare Advantage + Medicaid | Highest prior-auth denial volume in the country |
| 🧑‍⚕️ Provider-side flow | Physicians and their staff spend 13 hrs/week on PA paperwork (AMA 2024). We can give that time back |
| ⏰ Longer-horizon escalation | Framework-specific 30/60/180-day deadlines + external review drafting |
| 🗺️ Expanded precedent DB | NY DFS, Maryland IRO, CMS Medicare Appeals Council, specialty guidelines (AHA, ADA, APA) |
| 👪 Caregiver trust model | Spectrum already puts us in iMessage with adult children. Formalize caregiver permissions next |
Every patient deserves a lawyer AND a doctor in their corner.
Overturn is both.
📚 References
[1] Kaiser Family Foundation. Claims Denials and Appeals in ACA Marketplace Plans in 2024. Published March 24, 2026.
[2] Kaiser Family Foundation. Medicare Advantage Insurers Made Nearly 53 Million Prior Authorization Determinations in 2024. Published January 28, 2026.
[3] American Medical Association. 2024 Prior Authorization Physician Survey (1,000 practicing physicians, December 2024). https://www.ama-assn.org/system/files/prior-authorization-survey.pdf
Built With
- biome
- bun
- claude
- colima
- dedalus-machines
- docker
- hono
- ioredis
- k2-think-v2
- knot
- lob
- openclaw
- pgvector
- postgresql
- redis
- sonarqube
- spectrum
- typescript
- zod