Overture

Validated US Core CarePlan with FHIR resource id — written back to the FHIR server in 90 seconds

Inspiration

Hospital discharge is one of the most dangerous moments in a patient's care journey — and one of the most operationally broken. A single post-discharge plan typically requires a hospital coordinator to spend three days on faxes and phone calls across the patient's pharmacy, home-health agency, payer, and downstream specialists. The same FHIR context gets re-keyed at every hop. Errors are common. Care gets delayed. Industry estimates put the total US prior-authorization burden in the tens of billions annually — commonly cited figures land around $25–35B per year across transaction costs, physician practice burden, and care delays.

We asked: what if instead of a coordinator with a phone, the workflow ran as a real-time conversation between AI agents — each one belonging to its own organization, each one reasoning over its own state, all of them coordinating through open protocols?

What it does

Overture is a multi-agent system that composes a complete 30-day post-discharge care plan in real time. The user sends a single clinical request to the Discharge Orchestrator. The orchestrator:

Reads the patient's chart from FHIR via an MCP server (encounter, active conditions, medications, service requests).
Opens parallel A2A conversations with two more independent agents on separate servers — a specialty pharmacy and a home-health agency. Each agent has its own Gemini-powered LLM, its own state, and no shared memory.
Negotiates blockers in real time. If the pharmacy reports a medication backorder, the orchestrator decides on a clinically appropriate substitute, gets prescriber confirmation, and updates the plan. If home health can't cover a service window, it surfaces that as a real blocker.
Composes a US Core compliant CarePlan, validates it against hl7.fhir.us.core via the MCP server, and writes the resource back to the FHIR server.
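As an illustration of the compose-then-validate step, here is a minimal sketch of a local structural pre-check that could run before the resource is handed to the validator. It is not the project's actual validation code: the function name precheck_careplan and the exact set of checks are assumptions made for this example, and it only catches cheap structural slips (a missing meta.profile, a missing text.div, or an activities key where FHIR R4 expects activity). Full profile validation still belongs to the MCP server.

```python
from typing import Any

# Canonical URL of the US Core CarePlan profile.
US_CORE_CAREPLAN = "http://hl7.org/fhir/us/core/StructureDefinition/us-core-careplan"


def precheck_careplan(resource: dict[str, Any]) -> list[str]:
    """Hypothetical helper: return a list of problems; empty means the cheap checks passed."""
    problems: list[str] = []
    if resource.get("resourceType") != "CarePlan":
        problems.append("resourceType must be 'CarePlan'")
    # The profile claim is what tells a validator which rules to apply.
    profiles = resource.get("meta", {}).get("profile", [])
    if US_CORE_CAREPLAN not in profiles:
        problems.append("meta.profile is missing the US Core CarePlan profile")
    # US Core expects a human-readable narrative on the CarePlan.
    if not resource.get("text", {}).get("div"):
        problems.append("text.div (narrative) is missing")
    # Easy typo: the FHIR R4 element is singular.
    if "activities" in resource:
        problems.append("found 'activities'; the FHIR R4 element is 'activity'")
    for field in ("status", "intent", "subject", "category"):
        if field not in resource:
            problems.append(f"required element '{field}' is missing")
    return problems
```

A non-empty result can be surfaced to the orchestrator as a readable error instead of a failed write.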

The output is a validated, queryable FHIR resource id — not a chatbot summary. Any production EHR (Epic, Cerner, athenahealth) can ingest it directly.

Demo patient: Sarah Chen, postpartum day 3 with severe preeclampsia, prescribed labetalol 200mg BID. In the demo, the pharmacy reports labetalol on backorder; the orchestrator substitutes nifedipine ER 30mg, gets confirmation, and updates the plan. Total time: ~90 seconds.

How we built it

Architecture: 3 A2A agents + 1 MCP server, all powered by Google Gemini, talking over open protocols.

Discharge Orchestrator: a Google ADK agent running Gemini 2.5 Flash with custom tool wrappers. It is the single entry point that Prompt Opinion (PO) talks to.
Pharmacy Agent and Home Health Agent: independent ADK agents on separate servers, each running its own Gemini 2.5 Flash instance. Each owns its own state file (5+ realistic clinical scenarios per agent: postpartum hypertension, cardiac anticoagulation, heart failure, orthopedic post-op, diabetes titration). Both are reached over A2A.
CarePlan Composer (MCP server): a FastMCP server exposing 6 tools for FHIR reads, US Core CarePlan validation, and write-back.
FHIR context forwarding uses the SHARP / Prompt Opinion FHIR Context Extension. The token chains correctly through the full path: PO → orchestrator → MCP → PO's FHIR server.
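The orchestrator's parallel fan-out to the two leaf agents can be sketched with plain asyncio. This is a simplified illustration, not the ADK or A2A SDK code we run: ask_agent is a hypothetical stand-in for one A2A round trip (the sleep simulates network latency), and the agent names and timeout value are assumptions. The point is the shape: both conversations start at once, and a failure on one side becomes an explicit blocker rather than crashing the run.

```python
import asyncio


async def ask_agent(name: str, request: str, delay: float) -> dict:
    """Hypothetical stand-in for one A2A round trip to a leaf agent."""
    await asyncio.sleep(delay)  # simulated network and model latency
    return {"agent": name, "request": request, "status": "ok"}


async def fan_out(request: str, timeout: float = 5.0) -> dict[str, dict]:
    """Query both leaf agents concurrently and collect replies keyed by agent name."""
    calls = {
        "pharmacy": ask_agent("pharmacy", request, 0.05),
        "home_health": ask_agent("home_health", request, 0.05),
    }
    # gather runs the calls concurrently; return_exceptions keeps one failure
    # from cancelling the other conversation.
    results = await asyncio.wait_for(
        asyncio.gather(*calls.values(), return_exceptions=True), timeout
    )
    replies: dict[str, dict] = {}
    for name, result in zip(calls, results):
        if isinstance(result, Exception):
            # Surface the failure as a blocker the orchestrator can reason about.
            replies[name] = {"agent": name, "status": "blocked", "reason": str(result)}
        else:
            replies[name] = result
    return replies


if __name__ == "__main__":
    print(asyncio.run(fan_out("labetalol 200mg BID, 30-day supply")))
```

Because the two calls overlap, total wait time is roughly that of the slower agent rather than the sum of both.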

Stack: Python, FastAPI, Google Gemini 2.5 Flash (via Google Agent Development Kit (ADK) + LiteLLM), A2A SDK, FastMCP, httpx, FHIR R4, US Core. Deployed on Render; during development the local MCP server was exposed through a Cloudflare Tunnel.

Challenges we ran into

Multi-tunnel networking on free tiers. Free ngrok only allows one reserved domain — when we tried to expose four services, two of them got load-balanced onto the same URL and PO's traffic randomly hit the wrong service. We solved this by splitting infra: ngrok for the public orchestrator, Cloudflare Tunnel for the local MCP, and direct localhost calls for the leaf agents.
LLM non-determinism on multi-step tool use. On some runs the orchestrator would refuse to call any tools and surface a fabricated "I encountered an error" message instead of working through its five-step workflow. We tightened the orchestrator's instruction with explicit step-by-step rules and built fallback prompt variants that reliably anchor the model to its tool roster.
US Core CarePlan validation is strict. A missing meta.profile, a missing text.div, or a single wrong field name (activities instead of activity) causes a silent validation failure. We built a combined ValidateAndWriteCarePlan MCP tool to catch issues before they hit the FHIR server.
Render free-tier cold starts (30–60 seconds per service) made the demo flaky over the public deployment. We mitigated with a pre-warm script and kept the local development loop as the reliable demo path.
A CAREPLAN_MCP_URL double-path bug. Our orchestrator appended /mcp to a URL that already ended in /mcp, producing /mcp/mcp and 404s. We caught it with a curl probe and fixed it with a one-line env-var change.
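The /mcp/mcp class of bug can also be avoided in code by making the path join idempotent. The sketch below is one way to do that, not what we shipped (our fix was the env-var change): the helper name mcp_endpoint and the example URL are made up for illustration.

```python
def mcp_endpoint(base_url: str, path: str = "/mcp") -> str:
    """Hypothetical helper: return base_url ending in exactly one copy of path."""
    base = base_url.rstrip("/")
    suffix = "/" + path.strip("/")
    # Only append the path when the configured URL does not already end with it.
    if base.endswith(suffix):
        return base
    return base + suffix


if __name__ == "__main__":
    # Both spellings of the env var resolve to the same endpoint.
    print(mcp_endpoint("https://example.test"))
    print(mcp_endpoint("https://example.test/mcp/"))
```

With this in place, CAREPLAN_MCP_URL can be set with or without the trailing /mcp and the orchestrator resolves the same endpoint.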

Accomplishments we're proud of

A working multi-protocol architecture that uses MCP and A2A for the right things — MCP for tool/data access, A2A for agent-to-agent dialogue. Not buzzword-stacking; each protocol earns its place.
A genuine medication-substitution moment in the demo where the orchestrator independently decides to substitute nifedipine for labetalol when the pharmacy reports a backorder. No human picks the substitute, and nothing is scripted.
Real US Core compliant CarePlan write-back to FHIR, with a queryable resource id at the end of every successful run.
Three independent Gemini-powered agents on separate servers, which suggests the protocol layer can extend to N parties without architectural change.

What we learned

The hard part of multi-agent systems isn't the LLM — it's the protocol seams. Token forwarding, scope handling, error propagation, timeout budgets across protocol boundaries. Get those right and the agents do their job. Get them wrong and nothing works.
Gemini 2.5 Flash is fast enough to keep a multi-agent conversation feeling synchronous even with 3 LLMs in the loop and 5+ FHIR calls per request. Latency was rarely the bottleneck.
Be honest about the canned vs. live boundary: in our system the conversation is live, but the pharmacy/home-health state is fixed scenario data. Clear framing of what's real vs. what's a stand-in for production APIs makes the demo more credible, not less.
US Core compliance is the moat. A CarePlan that's not valid is just JSON. A validated one is healthcare infrastructure.

What's next

A payer agent to close the loop on prior authorization end-to-end (current demo includes pharmacy benefit checks but not full PA negotiation).
Live integrations with real pharmacy benefit managers and home-health scheduling APIs to replace the canned scenario state.
Production deployment with named Cloudflare Tunnels / Cloud Run for stable URLs and no cold starts.
More clinical scenarios — cardiac, orthopedic, oncology discharge patterns — to demonstrate breadth beyond postpartum care.
A clean EHR plugin so a discharge clinician can launch the workflow from inside Epic / Cerner without leaving the chart.