Inspiration

Every year, 19% of U.S. health-insurance claims are denied outright. Fewer than 1% of those denials are ever appealed. The pattern repeats elsewhere: one in nine claims is rejected in India (IRDAI data), and two-thirds of UK disability denials are overturned on appeal. The system incentivizes denial; it is working exactly as designed.

The insight: a challenged denial often falls. The work of challenging it almost never gets done.

We built Juro because every patient deserves a hearing, and no one should need a lawyer to get one.


What We Built

Juro is a deployed, adversarial AI tribunal for denied health-insurance claims. It runs four specialized AI agents in one shared "courtroom," each arguing a different role:

  • Advocate (claude-haiku-4-5): Marshals clinical facts and provisions that support coverage
  • Scrutinizer (claude-haiku-4-5): Argues the insurer's case—exclusions, gaps, cost
  • Evidence (claude-haiku-4-5): Produces the statute or guideline that resolves disputed facts
  • Adjudicator (claude-sonnet-4-6): Runs the hearing, recommends overturn or uphold

Both sides are fully argued in real time before any decision is made. A human always delivers the final ruling, which is stored in a tamper-evident, hash-chained record.

Once the ruling is entered, the system generates a citation-grounded appeal letter in seconds.

Live demo: https://juro-eta.vercel.app/


How We Built It

Architecture

Frontend: Next.js 14 (TypeScript, Tailwind CSS), deployed on Vercel. Live reviewer console streams the hearing turn-by-turn.

Orchestration: Anthropic Band SDK + LangGraph deterministic state machine. Four agents argue by @mention relay; guaranteed-post mechanism ensures no turn is skipped.

AI Models:

  • Light models (claude-haiku-4-5) handle fast argumentation
  • Heavy model (claude-sonnet-4-6) chairs and adjudicates—right-sized reasoning per role
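The seat-to-model split above can be written down as a small lookup table. This is only an illustrative sketch: the model names come from the role list in this write-up, while `MODEL_FOR_SEAT` and `model_for` are hypothetical names, not Juro's actual configuration.

```python
# Seat-to-model table taken from the role list in this write-up.
# The dictionary and helper names are hypothetical.
MODEL_FOR_SEAT = {
    "advocate": "claude-haiku-4-5",
    "scrutinizer": "claude-haiku-4-5",
    "evidence": "claude-haiku-4-5",
    "adjudicator": "claude-sonnet-4-6",
}


def model_for(seat: str) -> str:
    """Look up the model for a seat, failing loudly on an unknown seat."""
    try:
        return MODEL_FOR_SEAT[seat.lower()]
    except KeyError:
        raise ValueError(f"unknown seat: {seat!r}") from None
```

Keeping the mapping in one place means a seat can be moved to a different model without touching the relay logic.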

Voice Layer: Deepgram integration lets patients speak their denial letter aloud; auto-transcribed to text before hearing starts. No typing, no forms.

Backend API: Python + FastAPI. Async agent dispatch, Pydantic schemas for type safety.

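Audit & Integrity: SHA-256 hash chain. Each turn's hash incorporates the prior turn's digest, and the final digest is sealed as the ROOT. Any edit to an earlier turn breaks the chain from that point on, which a verification pass detects immediately.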
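A minimal sketch of a hash chain of this kind, using only the Python standard library. The turn fields, the all-zero genesis digest, and the JSON serialization are assumptions made for illustration; the write-up does not specify how Juro serializes a turn.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder digest that precedes the first turn


def turn_digest(prev_digest: str, turn: dict) -> str:
    """Hash one turn together with the digest of the turn before it."""
    # Canonical JSON so the same turn always produces the same bytes.
    payload = json.dumps(turn, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_digest + payload).encode("utf-8")).hexdigest()


def seal(turns: list[dict]) -> list[str]:
    """Return one digest per turn; the last digest is the sealed root."""
    digests = []
    prev = GENESIS
    for turn in turns:
        prev = turn_digest(prev, turn)
        digests.append(prev)
    return digests


def verify(turns: list[dict], digests: list[str]) -> bool:
    """Recompute the chain and compare it with the stored digests."""
    return seal(turns) == digests


transcript = [
    {"role": "advocate", "text": "The treatment meets the plan's medical-necessity terms."},
    {"role": "scrutinizer", "text": "The plan excludes this procedure."},
]
digests = seal(transcript)
root = digests[-1]
```

Because each digest depends on the one before it, changing any earlier turn changes every later digest, including the root, so `verify` returns `False` for an edited transcript.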

Storage: Supabase (PostgreSQL) for persistent hearing records, rulings, appeal letters per case.

Auth: Clerk for reviewer identity, locked to ruling record.

Key Technical Decisions

  1. Adversarial by design: Two opposing agents must exhaust arguments before recommendation. Structural guarantee of balanced analysis.

  2. Cross-model tribunal: Haiku for debate (speed + cost), Sonnet for adjudication (reasoning depth). Each seat has the right model.

  3. Guaranteed-post relay: LangGraph enforces turn order. If an agent fails, a retry fires before the next agent is invoked. No silent failures.

  4. Hash-chained record: Not a feature—it's the architecture. The transcript is either intact or broken. No in-between. Auditable by construction.

  5. Human-in-the-loop, non-negotiable: AI recommends. A person always delivers the ruling before appeal letter is generated. System is gated.

  6. Real law, every citation: Arguments land on actual statutes, regulations, case law, and clinical guidelines (ERISA claims procedure, 29 CFR 2560.503-1; ACA §2719, implemented at 45 CFR 147.136; Jimmo v. Sebelius; ACR Appropriateness Criteria; NCCN guidelines). Reviewers can fact-check every claim.


Challenges & Solutions

Challenge 1: Deterministic Multi-Agent Relay

Problem: Four agents arguing in one room—how do you guarantee no turn is silent? How do you prevent agent A's response from corrupting agent B's input?

Solution: LangGraph state machine with forced handoffs. Each agent's output becomes the input to the next. State is immutable between turns, so one agent cannot alter what another has already said. Retry logic is baked in: if an agent fails, we retry before advancing, so a turn is never silently skipped.
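The retry-before-advance idea can be sketched in plain Python. This is a stand-in that does not use LangGraph or the Band SDK; `run_relay`, `TurnFailed`, and the toy agents are hypothetical names, and the retry limit of three is an assumption.

```python
from typing import Callable

# An agent reads the transcript so far and returns its turn as text.
Agent = Callable[[tuple[str, ...]], str]


class TurnFailed(RuntimeError):
    """Raised when an agent produces no turn after all retries."""


def run_relay(agents: list[tuple[str, Agent]], max_attempts: int = 3) -> tuple[str, ...]:
    # A tuple is immutable, so an agent cannot edit earlier turns in place.
    transcript: tuple[str, ...] = ()
    for name, agent in agents:
        for _ in range(max_attempts):
            try:
                reply = agent(transcript)
            except Exception:
                reply = ""  # treat an error like an empty turn and retry
            if reply.strip():
                transcript = transcript + (f"{name}: {reply}",)
                break
        else:
            # Fail loudly instead of moving on past a silent seat.
            raise TurnFailed(f"{name} produced no turn after {max_attempts} attempts")
    return transcript


calls = {"n": 0}


def flaky_scrutinizer(transcript: tuple[str, ...]) -> str:
    """Toy agent that fails on its first call to exercise the retry path."""
    calls["n"] += 1
    if calls["n"] == 1:
        raise TimeoutError("simulated model timeout")
    return "The policy excludes this treatment."


hearing = run_relay([
    ("advocate", lambda t: "The treatment is medically necessary."),
    ("scrutinizer", flaky_scrutinizer),
])
```

The next agent is only invoked once the current one has posted, and an agent that never posts stops the hearing with an explicit error.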

Challenge 2: Making Arguments Credible

Problem: An AI can sound confident saying false things. If Juro cites a statute incorrectly, the whole hearing collapses.

Solution: Agent system prompts include retrieval constraints. Every legal argument must cite a real statute or guideline. We validate against a curated library (ERISA, ACA, IRDAI, NCCN). Arguments without citations are rejected at generation time.
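One way such a citation check could look, as a sketch under stated assumptions: the three-entry library below is a hypothetical sample rather than Juro's curated library, and the square-bracket citation convention is invented here for illustration.

```python
import re

# Hypothetical sample of a curated library; the real one is not shown here.
CURATED_CITATIONS = {
    "29 CFR 2560.503-1",
    "45 CFR 147.136",
    "Jimmo v. Sebelius",
}

# Assumed convention: agents wrap each citation in square brackets.
CITATION_PATTERN = re.compile(r"\[([^\[\]]+)\]")


def check_argument(text: str) -> tuple[bool, list[str]]:
    """Accept an argument only if it cites something and every citation is known.

    Returns (accepted, unknown_citations).
    """
    cited = [c.strip() for c in CITATION_PATTERN.findall(text)]
    unknown = [c for c in cited if c not in CURATED_CITATIONS]
    return bool(cited) and not unknown, unknown
```

An argument with no citation, or with a citation outside the library, is rejected, and the caller gets back the unrecognized citations so the agent can be asked to try again.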

Challenge 3: Keeping the Hearing Under 90 Seconds

Problem: With four agents arguing back-and-forth, latency compounds. Network → LLM → Supabase write → next agent invoke = slow.

Solution: Async FastAPI dispatch. All I/O is non-blocking. Parallel writes where safe. Light models (Haiku) for argumentation—they're fast. Heavy model (Sonnet) only at adjudication stage, where reasoning depth matters.
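A rough sketch of how a write can overlap with the next model call using `asyncio`. The `call_model` and `save_turn` coroutines are stand-ins that only sleep; they are not the Anthropic or Supabase clients, and keying each write by turn number is an assumption about what makes the parallel write safe.

```python
import asyncio

ROLES = ("advocate", "scrutinizer", "evidence", "adjudicator")


async def call_model(role: str, prompt: str) -> str:
    """Stand-in for a model call; a real version would await the LLM API."""
    await asyncio.sleep(0.01)
    return f"{role} responds to: {prompt[:40]}"


async def save_turn(store: dict, seq: int, role: str, text: str) -> None:
    """Stand-in for a database write, keyed by turn number."""
    await asyncio.sleep(0.01)
    store[seq] = (role, text)


async def run_hearing(opening: str) -> list[tuple[str, str]]:
    store: dict[int, tuple[str, str]] = {}
    writes = []
    prompt = opening
    for seq, role in enumerate(ROLES):
        reply = await call_model(role, prompt)  # turns themselves stay sequential
        # Start persisting this turn without waiting for it to finish, so the
        # write overlaps with the next agent's model call.
        writes.append(asyncio.create_task(save_turn(store, seq, role, reply)))
        prompt = reply
    await asyncio.gather(*writes)  # make sure every write has landed
    return [store[i] for i in sorted(store)]


records = asyncio.run(run_hearing("Claim denied as not medically necessary."))
```

The debate order is unchanged, since each agent still waits for the previous reply; only the persistence step is taken off the critical path.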

Challenge 4: The Human Gate

Problem: How do you prevent the system from outputting an appeal letter without human sign-off, while still making the process frictionless?

Solution: System-level constraint. Appeal letter generation is gated on human ruling input. The UI doesn't show the "generate letter" button until a ruling is entered. Clerk auth ties the reviewer's identity to the record.
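A server-side version of the gate might look like the sketch below, so the rule holds even if the UI check is bypassed. All names here (`Hearing`, `record_ruling`, `generate_appeal_letter`, `RulingRequired`) are hypothetical, and the letter body is a placeholder string rather than Juro's generated text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Hearing:
    case_id: str
    recommendation: str                 # what the Adjudicator agent suggested
    ruling: Optional[str] = None        # set only by a human reviewer
    reviewer_id: Optional[str] = None   # identity supplied by the auth layer


class RulingRequired(PermissionError):
    """Raised when a letter is requested before a human ruling exists."""


def record_ruling(hearing: Hearing, reviewer_id: str, ruling: str) -> None:
    """Store the human ruling together with the reviewer who made it."""
    if ruling not in ("overturn", "uphold"):
        raise ValueError("ruling must be 'overturn' or 'uphold'")
    if not reviewer_id:
        raise ValueError("a reviewer identity is required")
    hearing.ruling = ruling
    hearing.reviewer_id = reviewer_id


def generate_appeal_letter(hearing: Hearing) -> str:
    """Refuse to produce a letter until a reviewer has entered a ruling."""
    if hearing.ruling is None or hearing.reviewer_id is None:
        raise RulingRequired(f"case {hearing.case_id} has no human ruling yet")
    return (
        f"Appeal letter for case {hearing.case_id} "
        f"(ruling: {hearing.ruling}, reviewer: {hearing.reviewer_id})"
    )
```

The AI recommendation is stored separately from the ruling, so the letter function can only ever act on the human decision.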

Challenge 5: Deepgram Integration for Low-Literacy Users

Problem: Rural patients in India may not read English or type easily. Text-based denial letters are inaccessible.

Solution: Deepgram voice ingestion. The patient speaks the denial letter in Hindi or English, Deepgram transcribes it, and the text is fed into the hearing. This lowers the literacy barrier: no typing or form-filling is required.


What We Learned

  1. Adversarial design is not optional. Single-pass systems (one model, one perspective) will always miss half the case. Debate structures force balanced reasoning.

  2. Hash chains are powerful. Once you commit to hashing every turn, the record becomes evidence. You can't edit your way out of a bad ruling.

  3. Right-sizing models matters. Using Sonnet for everything is overkill and expensive. Haiku for debate, Sonnet for adjudication—same quality, half the cost, twice the speed.

  4. Band SDK is built for exactly this. Four agents in one room, arguing by @mention, with deterministic ordering—that's Band's superpower. LangGraph handles state; Band handles the relay.

  5. Humans have to own the call. The moment you try to automate the final decision, you've lost credibility. A person must deliver the ruling.


Next Steps

Phase 1 (0–3 months): Onboard 3–5 patient advocacy organizations. Add IRDAI & India Insurance Ombudsman statute library. Hindi-language agent prompts.

Phase 2 (3–12 months): Open Hearing API to healthtech partners. Contingency billing infrastructure. Expand to disability claims, prior authorization. SOC 2 Type I audit.

Phase 3 (12–24 months): "Adjudication Layer"—any rules-based dispute (veterans benefits, unemployment appeals) configurable via YAML.


One-Line Takeaway

Payers automated the denial. We automated the counter.

Built With

  • accessibility
  • anthropic
  • api
  • applications
  • artificial
  • development
  • fastapi
  • full-stack
  • healthcare
  • insurance
  • intelligence
  • llm
  • microsoft-band
  • multi-agent
  • python
  • regulatory
  • sdk
  • systems
  • technology