Inspiration

Hospital discharge is one of the most dangerous moments in a patient's care journey — and one of the most operationally broken. A single post-discharge plan typically requires a hospital coordinator to spend three days on faxes and phone calls across the patient's pharmacy, home-health agency, payer, and downstream specialists. The same FHIR context gets re-keyed at every hop. Errors are common. Care gets delayed. Industry estimates put the total US prior-authorization burden in the tens of billions annually — commonly cited figures land around $25–35B per year across transaction costs, physician practice burden, and care delays.

We asked: what if instead of a coordinator with a phone, the workflow ran as a real-time conversation between AI agents — each one belonging to its own organization, each one reasoning over its own state, all of them coordinating through open protocols?

What it does

Overture is a multi-agent system that composes a complete 30-day post-discharge care plan in real time. The user sends a single clinical request to the Discharge Orchestrator. The orchestrator:

  1. Reads the patient's chart from FHIR via an MCP server (encounter, active conditions, medications, service requests).
  2. Opens parallel A2A conversations with two more independent agents on separate servers — a specialty pharmacy and a home-health agency. Each agent has its own Gemini-powered LLM, its own state, and no shared memory.
  3. Negotiates blockers in real time. If the pharmacy reports a medication backorder, the orchestrator decides on a clinically appropriate substitute, gets prescriber confirmation, and updates the plan. If home health can't cover a service window, it surfaces that as a real blocker.
  4. Composes a US Core-compliant CarePlan, validates it against hl7.fhir.us.core via the MCP server, and writes the resource back to the FHIR server.
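
The parallel fan-out in step 2 and the blocker handling in step 3 can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: ask_pharmacy and ask_home_health are hypothetical stubs standing in for real A2A client calls, and the reply shape (agent, status) is an assumption made for the example.

```python
import asyncio

# Hypothetical stand-ins for A2A calls. A real client would send a message
# to each remote agent's server and await its reply.
async def ask_pharmacy(request: dict) -> dict:
    await asyncio.sleep(0.01)  # simulate network latency
    return {"agent": "pharmacy", "status": "backorder", "drug": request["drug"]}

async def ask_home_health(request: dict) -> dict:
    await asyncio.sleep(0.01)
    return {"agent": "home_health", "status": "ok"}

async def fan_out(request: dict, timeout_s: float = 5.0) -> dict:
    """Query both agents concurrently and collect anything that blocks the plan."""
    results = await asyncio.wait_for(
        asyncio.gather(
            ask_pharmacy(request),
            ask_home_health(request),
            return_exceptions=True,  # one failing agent must not hide the other's reply
        ),
        timeout=timeout_s,
    )
    replies, blockers = [], []
    for result in results:
        if isinstance(result, Exception):
            blockers.append({"agent": "unknown", "reason": repr(result)})
            continue
        replies.append(result)
        if result["status"] != "ok":
            blockers.append({"agent": result["agent"], "reason": result["status"]})
    return {"replies": replies, "blockers": blockers}
```

Calling asyncio.run(fan_out({"drug": "labetalol"})) returns both replies plus a single pharmacy blocker, which is the point at which the orchestrator would start negotiating a substitute.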

The output is a validated, queryable FHIR resource id — not a chatbot summary. Any production EHR (Epic, Cerner, athenahealth) can ingest it directly.
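
A cheap structural pre-check can catch obvious omissions before the resource goes to the real validator. The sketch below is an assumption about what such a check could look like, not the MCP server's implementation. The function name precheck_careplan is hypothetical, and the check is no substitute for validating against the full hl7.fhir.us.core package.

```python
# Canonical URL of the US Core CarePlan profile.
US_CORE_CAREPLAN = "http://hl7.org/fhir/us/core/StructureDefinition/us-core-careplan"

def precheck_careplan(resource: dict) -> list[str]:
    """Return a list of structural problems; an empty list means 'worth validating'."""
    problems = []
    if resource.get("resourceType") != "CarePlan":
        problems.append("resourceType must be 'CarePlan'")
    if US_CORE_CAREPLAN not in resource.get("meta", {}).get("profile", []):
        problems.append("meta.profile is missing the US Core CarePlan profile")
    if not resource.get("text", {}).get("div"):
        problems.append("text.div is missing or empty")
    if "activities" in resource:
        # Easy typo: the FHIR R4 element is 'activity' (singular).
        problems.append("'activities' is not a CarePlan field; use 'activity'")
    for field in ("status", "intent", "subject"):  # required by base FHIR R4 CarePlan
        if field not in resource:
            problems.append(f"{field} is required")
    return problems
```

Returning a list of messages instead of raising on the first problem lets the orchestrator fix everything in one pass before retrying.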

Demo patient: Sarah Chen, postpartum day 3 with severe preeclampsia, prescribed labetalol 200mg BID. In the demo, the pharmacy reports labetalol on backorder; the orchestrator substitutes nifedipine ER 30mg, gets confirmation, and updates the plan. Total time: ~90 seconds.

How we built it

Architecture: 3 A2A agents + 1 MCP server, all powered by Google Gemini, talking over open protocols.

  • Discharge Orchestrator: Google ADK agent running Gemini 2.5 Flash, with custom tool wrappers. It is the single entry point that Prompt Opinion (PO) talks to.
  • Pharmacy Agent and Home Health Agent: independent ADK agents on separate servers, each running its own Gemini 2.5 Flash instance. Each owns its own state file (5+ realistic clinical scenarios per agent: postpartum hypertension, cardiac anticoagulation, heart failure, orthopedic post-op, diabetes titration). Both are reached over A2A.
  • CarePlan Composer (MCP server): FastMCP server exposing 6 tools for FHIR reads, US Core CarePlan validation, and write-back.
  • FHIR context forwarding uses the SHARP / Prompt Opinion FHIR Context Extension. The token chains correctly across all three hops: PO → orchestrator → MCP → PO's FHIR server.
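
The idea behind forwarding FHIR context can be illustrated with a small sketch. Everything named here is an assumption for illustration: the metadata keys fhir_url and fhir_token, the FhirContext class, and both helper functions are hypothetical and do not reflect the actual field names defined by the SHARP / Prompt Opinion extension. The only standard parts are the Bearer authorization scheme and the application/fhir+json media type.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class FhirContext:
    base_url: str
    access_token: str

def extract_context(metadata: dict) -> FhirContext:
    """Pull the FHIR context out of an inbound request's metadata.

    The key names are hypothetical placeholders, not the extension's real ones.
    """
    try:
        return FhirContext(metadata["fhir_url"].rstrip("/"), metadata["fhir_token"])
    except KeyError as missing:
        raise ValueError(f"inbound request lacks FHIR context field {missing}") from None

def fhir_request(ctx: FhirContext, resource_type: str, params: dict) -> dict:
    """Describe the outbound FHIR search, carrying the caller's token unchanged."""
    return {
        "url": f"{ctx.base_url}/{resource_type}",
        "headers": {
            "Authorization": f"Bearer {ctx.access_token}",
            "Accept": "application/fhir+json",
        },
        "params": params,
    }
```

Each hop extracts the context once and passes the same immutable object downstream, so no service mints or caches its own credentials.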

Stack: Python, FastAPI, Google Gemini 2.5 Flash (via Google ADK (Agent Development Kit) + LiteLLM), A2A SDK, FastMCP, httpx, FHIR R4, US Core. Deployed on Render with a Cloudflare Tunnel for the local MCP during development.

Challenges we ran into

  • Multi-tunnel networking on free tiers. Free ngrok only allows one reserved domain — when we tried to expose four services, two of them got load-balanced onto the same URL and PO's traffic randomly hit the wrong service. We solved this by splitting infra: ngrok for the public orchestrator, Cloudflare Tunnel for the local MCP, and direct localhost calls for the leaf agents.
  • LLM non-determinism on multi-step tool use. On some runs the orchestrator would refuse to call any tools and surface a fabricated "I encountered an error" message instead of working through its five-step workflow. We tightened the orchestrator's instruction with explicit step-by-step rules and built fallback prompt variants that reliably anchor the model to its tool roster.
  • US Core CarePlan validation is strict. A missing meta.profile or text.div, or a single wrong field name (activities instead of activity), causes a silent validation failure. We built a combined ValidateAndWriteCarePlan MCP tool to catch issues before they hit the FHIR server.
  • Render free-tier cold starts (30–60 seconds per service) made the demo flaky over the public deployment. We mitigated with a pre-warm script and kept the local development loop as the reliable demo path.
  • A CAREPLAN_MCP_URL double-path bug where our orchestrator appended /mcp to a URL that already had /mcp on the end, producing /mcp/mcp 404s. Caught it with a curl probe and a one-line env-var fix.
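
The double-path bug in the last item is easy to guard against in code as well as in configuration. A minimal sketch, assuming a hypothetical helper named mcp_endpoint that the orchestrator would call when reading CAREPLAN_MCP_URL:

```python
def mcp_endpoint(base_url: str, path: str = "/mcp") -> str:
    """Return base_url ending in exactly one `path` segment.

    Accepts the env var with or without the suffix (and with or without a
    trailing slash), so both spellings resolve to the same endpoint.
    """
    trimmed = base_url.rstrip("/")
    if trimmed.endswith(path):
        return trimmed
    return trimmed + path
```

With this in place, https://host.example and https://host.example/mcp/ both resolve to https://host.example/mcp, so the value of the environment variable no longer has to match what the code expects.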

Accomplishments we're proud of

  • A working multi-protocol architecture that uses MCP and A2A for the right things — MCP for tool/data access, A2A for agent-to-agent dialogue. Not buzzword-stacking; each protocol earns its place.
  • A genuine medication-substitution moment in the demo where the orchestrator independently decides to substitute nifedipine for labetalol when the pharmacy reports a backorder. No human involvement, no script.
  • Real US Core-compliant CarePlan write-back to FHIR, with a queryable resource id at the end of every successful run.
  • Three independent Gemini-powered agents on separate servers — proving the protocol layer can scale to N parties without architectural change.

What we learned

  • The hard part of multi-agent systems isn't the LLM — it's the protocol seams. Token forwarding, scope handling, error propagation, timeout budgets across protocol boundaries. Get those right and the agents do their job. Get them wrong and nothing works.
  • Gemini 2.5 Flash is fast enough to keep a multi-agent conversation feeling synchronous even with 3 LLMs in the loop and 5+ FHIR calls per request. Latency was rarely the bottleneck.
  • Be honest about the canned vs. live boundary: in our system the conversation is live, but the pharmacy/home-health state is fixed scenario data. Clear framing of what's real vs. what's a stand-in for production APIs makes the demo more credible, not less.
  • US Core compliance is the moat. A CarePlan that's not valid is just JSON. A validated one is healthcare infrastructure.
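
One way to handle timeout budgets across protocol boundaries is to carry a single deadline through the whole request and derive each per-call timeout from what is left. The sketch below is an illustration under that assumption. The Deadline class is hypothetical and is not part of ADK, the A2A SDK, or FastMCP.

```python
import time

class Deadline:
    """A fixed time budget shared by every downstream call in one request."""

    def __init__(self, budget_s: float, clock=time.monotonic):
        self._clock = clock  # injectable so the class can be tested without sleeping
        self._expires_at = clock() + budget_s

    def remaining(self) -> float:
        """Seconds left in the budget, never negative."""
        return max(0.0, self._expires_at - self._clock())

    def timeout_for(self, cap_s: float) -> float:
        """Timeout for the next call: its own cap or the remaining budget, whichever is smaller."""
        left = self.remaining()
        if left <= 0:
            raise TimeoutError("request budget exhausted before the call started")
        return min(cap_s, left)
```

An orchestrator could create one Deadline per user request and pass deadline.timeout_for(...) as the timeout of each A2A or MCP call, so a slow first hop shortens later ones instead of letting the total run past what the caller will wait for.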

What's next

  • A payer agent to close the loop on prior authorization end-to-end (current demo includes pharmacy benefit checks but not full PA negotiation).
  • Live integrations with real pharmacy benefit managers and home-health scheduling APIs to replace the canned scenario state.
  • Production deployment with named Cloudflare Tunnels / Cloud Run for stable URLs and no cold starts.
  • More clinical scenarios — cardiac, orthopedic, oncology discharge patterns — to demonstrate breadth beyond postpartum care.
  • A clean EHR plugin so a discharge clinician can launch the workflow from inside Epic / Cerner without leaving the chart.
