Inspiration

Every hospital discharge involves the same manual burden: a physician writes a clinical summary, reconciles the medication list against what the patient arrived on, checks for drug interactions, and produces patient-friendly instructions. This takes 60–90 minutes per patient. Medication errors at discharge are a leading cause of preventable readmissions — each costing a hospital £3,000–£8,000 and triggering quality penalty flags. We built Discharge Copilot to reduce that burden while keeping clinicians in control at every step.

What it does

One command — "Prepare discharge for this patient" — triggers a coordinated three-agent workflow that produces a complete draft discharge packet in under 60 seconds:

  • Clinical discharge summary — structured draft for the medical record
  • Medication reconciliation report — flags drug interactions, duplicates, omissions, and allergy conflicts
  • Patient instructions — plain English at 6th-grade reading level
  • Full audit trail — request ID, per-agent timing, error codes, clinical metrics

All output is clearly marked DRAFT — FOR PHYSICIAN REVIEW ONLY. The system cannot be used for patient care without explicit physician sign-off.

How we built it

Three specialist A2A agents, each with a distinct role and risk tolerance:

Orchestrator (port 8001) — the entry point. Receives the discharge command from the Prompt Opinion platform, fetches FHIR patient data, generates the clinical summary, delegates to both downstream agents via A2A, assembles the final packet, and enforces the physician review workflow.
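The fan-out step can be sketched roughly as follows. This is a minimal illustration, not the project's code: `call_agent` is a hypothetical stand-in for a real A2A request, the agent names and packet keys are invented for the example, and running the two downstream calls concurrently is an assumption.

```python
import asyncio
import time
import uuid


async def call_agent(name, payload, delay):
    """Hypothetical stand-in for an A2A call; a real client would send an HTTP request."""
    await asyncio.sleep(delay)  # simulate network and model latency
    return {"agent": name, "status": "ok", "patient_id": payload["patient_id"]}


async def timed(name, coro, audit):
    """Await one agent call and record its status and duration in the audit list."""
    start = time.perf_counter()
    try:
        result, status = await coro, "ok"
    except Exception as exc:  # a failed agent is recorded, not silently dropped
        result, status = None, f"error: {exc}"
    elapsed_ms = round((time.perf_counter() - start) * 1000, 1)
    audit.append({"agent": name, "status": status, "elapsed_ms": elapsed_ms})
    return result


async def prepare_discharge(patient_id):
    """Delegate to both downstream agents and assemble a draft packet."""
    audit = []
    payload = {"patient_id": patient_id}
    safety, instructions = await asyncio.gather(
        timed("safety", call_agent("safety", payload, 0.02), audit),
        timed("translator", call_agent("translator", payload, 0.01), audit),
    )
    return {
        "request_id": str(uuid.uuid4()),
        "status": "DRAFT",  # every packet starts as a draft for physician review
        "safety_report": safety,
        "patient_instructions": instructions,
        "audit": audit,
    }


packet = asyncio.run(prepare_discharge("patient-123"))
```

Wrapping each call in `timed` keeps the per-agent timing and status in one place, so a slow or failing agent still shows up in the audit list.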

Safety & Reconciliation Agent (port 8002) — a hybrid pharmacist model. A hardcoded rule engine runs first, deterministically detecting known drug interactions, duplicates, and allergy conflicts across 13 high-risk drug pairs. No LLM is involved in this rule-based detection, so these flags carry no hallucination risk. Claude then adds a narrative reconciliation summary and may raise issues outside the rule set; those are clearly tagged as llm_inferred with MEDIUM confidence. Every flag includes detection_method and confidence fields.
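A minimal sketch of the hybrid tagging idea, using stdlib dataclasses rather than the project's Pydantic models. The two drug pairs are illustrative examples only, not the project's 13-pair table; the `rule_engine` method name and `HIGH` confidence value for rule hits are assumptions, as are the function and field names other than `detection_method`, `confidence`, and `llm_inferred`.

```python
from collections import Counter
from dataclasses import dataclass

# Illustrative pairs only (assumption); the real rule table is not shown in this write-up.
INTERACTION_RULES = {
    frozenset({"warfarin", "aspirin"}): "increased bleeding risk",
    frozenset({"lisinopril", "spironolactone"}): "risk of hyperkalemia",
}


@dataclass(frozen=True)
class SafetyFlag:
    kind: str
    drugs: tuple
    detail: str
    detection_method: str  # "rule_engine" (assumed name) or "llm_inferred"
    confidence: str        # "HIGH" for rule hits is an assumption


def run_rule_engine(medications, allergies):
    """Deterministic pass: duplicates, known interaction pairs, allergy conflicts."""
    meds = [m.strip().lower() for m in medications]
    flags = []
    for drug, count in Counter(meds).items():
        if count > 1:
            flags.append(SafetyFlag("duplicate", (drug,), f"listed {count} times",
                                    "rule_engine", "HIGH"))
    unique = sorted(set(meds))
    for i, first in enumerate(unique):
        for second in unique[i + 1:]:
            detail = INTERACTION_RULES.get(frozenset({first, second}))
            if detail:
                flags.append(SafetyFlag("interaction", (first, second), detail,
                                        "rule_engine", "HIGH"))
    allergy_set = {a.strip().lower() for a in allergies}
    for drug in unique:
        if drug in allergy_set:
            flags.append(SafetyFlag("allergy_conflict", (drug,), "documented allergy",
                                    "rule_engine", "HIGH"))
    return flags


def tag_llm_findings(findings):
    """Wrap LLM-suggested issues so they are never mistaken for rule-engine hits."""
    return [SafetyFlag(f["kind"], tuple(f["drugs"]), f["detail"], "llm_inferred", "MEDIUM")
            for f in findings]


flags = run_rule_engine(["Warfarin", "aspirin", "aspirin"], ["penicillin"])
```

Because the rule pass never calls the model, its output is reproducible for a given medication list; anything the model adds goes through `tag_llm_findings` and arrives with a lower confidence label.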

Patient Translator Agent (port 8003) — takes the physician-verified medication list and rewrites it for patient comprehension at a 6th-grade reading level. Outcome-focused: dosing clarity, warning signs, follow-up steps.

Key technical decisions:

  • All inter-agent communication uses strict Pydantic typed schemas — judges can inspect the exact JSON payload between agents
  • DATA_INCOMPLETE error code — if medication data is missing from FHIR, discharge is blocked and the clinician is told exactly what to verify
  • Full audit trail — every request generates a DischargeAuditTrail with per-agent timestamps and statuses
  • Clinician override loop — POST /override captures physician corrections; GET /learning/patterns surfaces improvement trends over time
  • WorkflowState machine — every packet moves through DRAFT → PENDING_PHYSICIAN_REVIEW → PHYSICIAN_APPROVED → READY_FOR_EHR_INSERTION
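The state machine in the last bullet could look something like the sketch below. The four state names come from the list above; the `advance` helper, the `InvalidTransition` exception, and the rule that only a physician role may approve are assumptions made for illustration.

```python
from enum import Enum


class WorkflowState(str, Enum):
    DRAFT = "DRAFT"
    PENDING_PHYSICIAN_REVIEW = "PENDING_PHYSICIAN_REVIEW"
    PHYSICIAN_APPROVED = "PHYSICIAN_APPROVED"
    READY_FOR_EHR_INSERTION = "READY_FOR_EHR_INSERTION"


# Each state has exactly one legal successor, so no step can be skipped.
NEXT_STATE = {
    WorkflowState.DRAFT: WorkflowState.PENDING_PHYSICIAN_REVIEW,
    WorkflowState.PENDING_PHYSICIAN_REVIEW: WorkflowState.PHYSICIAN_APPROVED,
    WorkflowState.PHYSICIAN_APPROVED: WorkflowState.READY_FOR_EHR_INSERTION,
}


class InvalidTransition(Exception):
    """Raised when a packet is asked to skip or repeat a workflow step."""


def advance(current, target, actor_role):
    """Return the new state, or raise if the move is not allowed (hypothetical helper)."""
    if NEXT_STATE.get(current) is not target:
        raise InvalidTransition(f"{current.value} -> {target.value} is not allowed")
    # Assumption: approval must come from a user acting in a physician role.
    if target is WorkflowState.PHYSICIAN_APPROVED and actor_role != "physician":
        raise InvalidTransition("only a physician can approve a discharge packet")
    return target


state = advance(WorkflowState.DRAFT, WorkflowState.PENDING_PHYSICIAN_REVIEW, "system")
```

Encoding the transitions as a lookup table makes the review step structural: there is no code path from DRAFT to READY_FOR_EHR_INSERTION that bypasses approval.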

Tech stack: Python, FastAPI, Anthropic Claude API, FHIR R4, Prompt Opinion A2A, ngrok

Challenges we faced

Medication safety framing — the biggest challenge was designing the Safety Agent so it doesn't feel like "an LLM pretending to be a pharmacist." The solution was the hybrid model: a rule engine for detecting known issues, the LLM for the narrative and for lower-confidence suggestions outside the rule set, with every flag explicitly tagged with its detection method and confidence level.

Error handling discipline — we deliberately chose a DATA_INCOMPLETE error code that actively blocks discharge over graceful degradation. Healthcare systems must fail safely, not silently.
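A fail-safe check of this kind might be sketched as below. Only the DATA_INCOMPLETE code comes from the project; the `DischargeBlocked` exception, the `check_medication_data` function, and the specific fields it inspects on FHIR R4 MedicationRequest resources are assumptions chosen for the example.

```python
class DischargeBlocked(Exception):
    """Raised instead of returning a partial packet (hypothetical exception type)."""

    def __init__(self, error_code, missing):
        super().__init__(f"{error_code}: {'; '.join(missing)}")
        self.error_code = error_code
        self.missing = missing  # tells the clinician exactly what to verify


def check_medication_data(bundle):
    """Return the MedicationRequest resources, or block if any look incomplete."""
    resources = [entry.get("resource", {}) for entry in bundle.get("entry", [])]
    requests = [r for r in resources if r.get("resourceType") == "MedicationRequest"]
    missing = []
    if not requests:
        missing.append("no MedicationRequest resources found")
    for index, request in enumerate(requests):
        # In R4 the drug is given as medicationCodeableConcept or medicationReference.
        if not (request.get("medicationCodeableConcept") or request.get("medicationReference")):
            missing.append(f"MedicationRequest[{index}]: medication not identified")
        if not request.get("dosageInstruction"):
            missing.append(f"MedicationRequest[{index}]: dosage instruction missing")
    if missing:
        raise DischargeBlocked("DATA_INCOMPLETE", missing)
    return requests


incomplete_bundle = {
    "resourceType": "Bundle",
    "entry": [{"resource": {"resourceType": "MedicationRequest",
                            "medicationCodeableConcept": {"text": "metformin"}}}],
}
try:
    check_medication_data(incomplete_bundle)
    blocked = None
except DischargeBlocked as exc:
    blocked = exc
```

Raising an exception rather than returning a partial result means no downstream agent can run on an incomplete medication list by accident.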

Inter-agent data contracts — ensuring the typed Pydantic schemas were strict enough to be trustworthy but flexible enough to handle FHIR data quality variation across different patient records.

Language calibration — every line of output had to use language that supports clinician decision-making without overclaiming. "Flags potential discrepancies" not "detects interactions." "Draft for review" not "safe medication list."

What we learned

That the hardest part of healthcare AI is not the AI — it's the governance. Knowing when to block rather than proceed. Making confidence levels visible. Building the physician review step into the architecture rather than bolting it on. These decisions matter more than model choice.

Built With

  • anthropic-claude-api-(claude-opus-4-5)
  • fastapi
  • fhir-r4
  • httpx
  • mcp
  • ngrok
  • openai
  • prompt-opinion-a2a-protocol
  • pydantic
  • python