## Inspiration
Care managers and primary care teams already know which patients are overdue for screenings — HEDIS dashboards and EHR alerts have done that for two decades. The bottleneck isn't detecting the gap; it's the 20 minutes per patient spent writing the follow-up. A nurse drafting outreach for 40 patients on a Friday afternoon is doing the same translation work over and over: take a structured clinical fact ("last HbA1c 65 months ago"), rewrite it as warm, sixth-grade-reading-level English, personalize it to this patient, and decide which channel to use.
That's where the AI factor lives. A rule engine can flag the gap deterministically. It cannot write the message. We built Care Gap Closer to draw exactly that line: rules find the gap, Gemini writes the message.
## What it does
Care Gap Closer is a two-piece system that drops into the Prompt Opinion platform:
- An MCP server (`care-gap-mcp`) exposing five tools:
  - `SummarizePatient`, `ListActiveConditions`, `ListRecentObservations` — deterministic FHIR queries
  - `FindCareGaps` — the rule engine. It reads the patient's FHIR record, applies four USPSTF-aligned rules, and returns structured gaps with evidence. Gemini then authors a one-sentence clinician-facing rationale for each gap based on the evidence dict.
  - `DraftOutreachMessage` — takes a gap object and asks Gemini for an SMS (<160 chars) and/or a portal message (three short paragraphs) in plain, patient-friendly language.
- An A2A v1 agent (`care-gap-agent`) that uses those tools. The agent has no FHIR tools of its own; it owns the reasoning (which tool to call, threading evidence between tools, surfacing a concise summary) and delegates all data access and content authorship to the MCP server.
The rule engine is 100% data-driven: every USPSTF rule, every SNOMED/ICD-10/LOINC/CPT code set, and every threshold lives in YAML under `care_gap_mcp/knowledge_base/`. Every LLM prompt lives in Markdown under `care_gap_mcp/prompts/`. A clinician can review or edit a rule without reading Python.
### Rules implemented (USPSTF-aligned, conservative)
- Diabetes A1c overdue — active DM (SNOMED 44054006 or ICD-10 E10/E11/E13) + no LOINC 4548-4 in 6 mo
- Hypertension BP overdue — active HTN + no LOINC 8480-6 in 12 mo
- Colorectal screening overdue — age 45–75 + no colonoscopy in 10 y / FIT in 1 y
- Mammography overdue — female, age 40–74 + no mammogram in 24 mo
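To make the data-driven idea concrete, one rule in `care_gap_rules.yaml` might look like this. This is a hedged sketch — the field names below are plausible, not the repo's actual schema:

```yaml
# Hypothetical shape of one rule in knowledge_base/care_gap_rules.yaml.
# Field names are illustrative; the real schema may differ.
- id: diabetes_a1c_overdue
  description: Active diabetes with no HbA1c result in the last 6 months
  trigger:
    condition_codes:           # resolved via terminology.yaml
      snomed: ["44054006"]
      icd10_prefixes: ["E10", "E11", "E13"]
  gap:
    missing_observation:
      loinc: ["4548-4"]        # Hemoglobin A1c
      lookback_months: 6
  rationale_template: >
    Active diabetes on the problem list; last HbA1c was
    {months_since_last} months ago (target: every 6 months).
```

The point is that a clinician can audit the trigger codes and the lookback window without ever opening a Python file.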
### Standards used
- MCP (Model Context Protocol, Anthropic) — five tools registered on a FastMCP server
- SHARP-on-MCP (Prompt Opinion) — the `ai.promptopinion/fhir-context` capability extension declares required SMART scopes; FHIR context arrives as `X-FHIR-Server-URL` / `X-FHIR-Access-Token` / `X-Patient-ID` headers
- A2A v1 (Google) — agent card with `supportedInterfaces`, nested `apiKeySecurityScheme`, FHIR extension under `capabilities.extensions`, scopes in `params.scopes`
- FHIR R4 (HL7) — Patient, Condition, Observation, Procedure resources
## How we built it
We started from the Prompt Opinion reference repos — `po-fastmcp` for the MCP server and `po-adk-python` for the A2A agent — gutted the sample tools, and kept the shared infrastructure (the FHIR context bridge in `shared/middleware.py` + `shared/fhir_hook.py`, the A2A v1 forward-compat layer in `shared/app_factory.py`). The hard plumbing was already solved by Prompt Opinion; we focused on the use case.
The architecture intentionally splits responsibility:
```
PO portal ──A2A JSON-RPC──► care-gap-agent ──MCP Streamable HTTP──► care-gap-mcp ──FHIR R4──► SMART Health IT
message metadata            ADK + Gemini    SHARP headers           FastMCP + rules           public Synthea
(FHIR context)              (reasoning)     (auth + scopes)         (auth + LLM)              sandbox
```
The agent does Gemini-driven reasoning — picking the right tool, threading the gap object from `find_care_gaps` into `draft_outreach_message`. The MCP server does Gemini-driven authorship — rationale + outreach copy off structured evidence. Each Gemini call has a narrow, well-defined job.
We externalized everything that judges or clinicians might want to audit:
- `knowledge_base/care_gap_rules.yaml` — the rules themselves, with `rationale_template` fallbacks for when the LLM is rate-limited
- `knowledge_base/terminology.yaml` — coding sets with `exact` vs `prefix` match modes
- `prompts/rationale_system.md`, `prompts/outreach_sms.md`, `prompts/outreach_portal.md`, `prompts/tone_guide.md` — every LLM prompt, in plain Markdown
- `prompts/agent_instruction.md` — the agent's workflow
- `routing/rules.yaml` — explicit intent → tool-chain mapping with documented hand-off points to future agents (scheduling, CDS)
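The routing file might look something like this — an illustrative sketch only; the actual keys in `routing/rules.yaml` may differ:

```yaml
# Hypothetical shape of routing/rules.yaml: intent → tool chain.
intents:
  find_gaps:
    tools: [SummarizePatient, FindCareGaps]
  draft_outreach:
    tools: [FindCareGaps, DraftOutreachMessage]
    requires: gap_object        # threaded from the previous tool call
handoffs:
  scheduling:
    agent: scheduling-agent     # declared today, not yet wired
    trigger: patient_accepts_visit
```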
The whole stack is deployed on Render: two web services from one `render.yaml` blueprint, connected via the MCP service's public URL.
## Challenges we ran into
1. The MCP client handshake. Our first agent-side MCP client was hand-rolled — single-shot HTTP POSTs to tools/call. It got 400 Bad Request on every call. FastMCP's Streamable HTTP transport requires the full initialize → notifications/initialized handshake before tools/call works. We rewrote the client around the official mcp Python SDK (streamablehttp_client + ClientSession), and the SHARP headers ride along on every HTTP request in that session, including initialize — so the server has FHIR context throughout.
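For reference, the handshake the SDK performs under the hood boils down to three JSON-RPC messages. The shapes follow the MCP specification; the protocol version string and tool arguments below are illustrative:

```json
{"jsonrpc": "2.0", "id": 1, "method": "initialize",
 "params": {"protocolVersion": "2025-03-26", "capabilities": {},
            "clientInfo": {"name": "care-gap-agent", "version": "0.1.0"}}}

{"jsonrpc": "2.0", "method": "notifications/initialized"}

{"jsonrpc": "2.0", "id": 2, "method": "tools/call",
 "params": {"name": "FindCareGaps", "arguments": {"patient_id": "example"}}}
```

Skipping the first two messages is exactly what earned our hand-rolled client its 400s.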
2. ADK tool signatures. Google ADK requires tool_context: ToolContext to be the last positional parameter. We had it first on two tools, which leaked it into the JSON schema sent to Gemini as a callable param. Reordering fixed it. We also dropped Optional[X] = None defaults on LLM-facing params — ADK's schema generation is more reliable with required strings.
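A minimal sketch of the corrected shape. The `ToolContext` class below is a placeholder standing in for `google.adk.tools.ToolContext`, and the function body is illustrative:

```python
class ToolContext:
    """Placeholder for google.adk.tools.ToolContext (sketch only)."""


# LLM-facing params are required strings; tool_context comes LAST so it
# stays out of the JSON schema ADK generates for Gemini.
def draft_outreach_message(gap_json: str, channel: str,
                           tool_context: ToolContext) -> dict:
    """gap_json and channel appear in the tool schema; tool_context is
    injected by the framework and hidden from the model."""
    return {"gap": gap_json, "channel": channel}
```

With `tool_context` first, ADK treated it as a model-callable parameter; moving it last (and dropping `Optional[...] = None` defaults) produced clean schemas.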
3. The A2A v1 spec drift. The installed a2a-sdk (0.3.x) was still on the pre-v1 schema. Prompt Opinion's portal expects v1: supportedInterfaces replacing url + preferredTransport, nested apiKeySecurityScheme, params.scopes on FHIR extensions. We kept Prompt Opinion's forward-compat shim (AgentCardV1 / AgentExtensionV1 in shared/app_factory.py) until the SDK ships native v1 support.
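An agent card in the expected v1 shape looks roughly like this — the field names match the v1 changes described above, but the values (URL, scheme name, scopes) are hypothetical:

```json
{
  "name": "care-gap-agent",
  "supportedInterfaces": [
    {"transport": "JSONRPC", "url": "https://example.onrender.com/a2a"}
  ],
  "securitySchemes": {
    "portalKey": {"apiKeySecurityScheme": {"in": "header", "name": "X-API-Key"}}
  },
  "capabilities": {
    "extensions": [
      {
        "uri": "ai.promptopinion/fhir-context",
        "params": {"scopes": ["patient/Condition.read", "patient/Observation.read"]}
      }
    ]
  }
}
```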
4. The 406 health-check loop on Render. MCP-over-HTTP returns 406 Not Acceptable on any GET that lacks Accept: text/event-stream. Setting healthCheckPath: /mcp made Render's prober flood the service with GETs and refuse to mark it Live. Removed the path; Render falls back to "process is listening on $PORT" as the liveness signal.
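The fix amounts to leaving `healthCheckPath` out of the blueprint. A trimmed sketch — service names, commands, and the env var are illustrative, not the repo's exact `render.yaml`:

```yaml
services:
  - type: web
    name: care-gap-mcp
    runtime: python
    startCommand: python -m care_gap_mcp.server
    # No healthCheckPath: /mcp answers plain GETs with 406, so we rely on
    # Render's default "process is listening on $PORT" liveness signal.
  - type: web
    name: care-gap-agent
    runtime: python
    startCommand: python -m care_gap_agent.server
    envVars:
      - key: MCP_SERVER_URL            # the MCP service's public URL
        value: https://care-gap-mcp.onrender.com/mcp
```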
5. The Gemini 503 mid-demo. Mid-validation, Gemini's API briefly returned 503s. Instead of degrading silently, our rule engine fell back to each rule's `rationale_template` in `care_gap_rules.yaml`. The patient still saw a gap, the clinician still saw a sentence, and the demo kept moving. That fallback path was deliberate; seeing it exercised felt great.
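The fallback pattern is simple enough to sketch. `call_gemini`, the rule dict shape, and the evidence keys below are hypothetical stand-ins for the real code:

```python
# Sketch of the LLM-with-template-fallback pattern. call_gemini and the
# rule/evidence shapes are illustrative, not the repo's actual code.
def author_rationale(rule: dict, evidence: dict, call_gemini) -> str:
    """Prefer a Gemini-written rationale; fall back to the YAML template
    on any API failure (e.g. a 503) so the gap still renders."""
    try:
        return call_gemini(rule, evidence)
    except Exception:
        # rationale_template uses str.format placeholders filled from evidence
        return rule["rationale_template"].format(**evidence)


# Example: a failing LLM call exercises the deterministic fallback.
rule = {"rationale_template":
        "Last HbA1c was {months_since_last} months ago (target: every 6)."}
evidence = {"months_since_last": 65}

def gemini_down(rule, evidence):
    raise RuntimeError("503 Service Unavailable")

print(author_rationale(rule, evidence, gemini_down))
# → Last HbA1c was 65 months ago (target: every 6).
```

Because the template is pure `str.format` over the structured evidence, the fallback can never invent a gap — it can only restate what the rule engine already found.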
## Accomplishments that we're proud of
- Drew a clean line between rules and authorship. The LLM never invents a gap — only the YAML rule engine can flag one. The LLM only authors copy off structured evidence. That's the right shape for healthcare AI.
- Data-driven rule engine. Adding a new USPSTF rule is a YAML edit, not a Python change. Reviewers and clinicians can audit the rules without reading code.
- Zero FHIR tokens in any prompt. Credentials live in HTTP headers (MCP) and session state (agent), never in the Gemini context window. We verified by logging prompt payloads.
- Real Synthea patient, end-to-end. Verified against the SMART Health IT public sandbox patient Danae Kshlerin (61F, active diabetes). Found three gaps (A1c overdue 65 months, colorectal screening, mammography), Gemini wrote a 146-char SMS that passed the tone guide (no "overdue" or "gap" words).
- Graceful LLM failure. When Gemini 503s, the YAML `rationale_template` takes over and the user-facing message still parses. We saw this fire live in testing.
- One blueprint deploy. `render.yaml` brings up both services with persistent HTTPS URLs that the Prompt Opinion portal can register directly.
## What we learned
- MCP + A2A is a great division of labor. MCP is for stateless, reusable tools (FHIR queries, terminology lookups). A2A is for stateful, reasoning agents that thread tools together. Doing both lets you reuse the MCP server from any agent — ours or somebody else's on the marketplace.
- The "AI factor" is authorship, not detection. Rule engines have flagged care gaps for decades. What an LLM uniquely adds is the per-patient, per-channel writing — clinician rationale, sixth-grade SMS, longer portal copy. Keep the LLM on the authorship side of the line and the system stays trustworthy.
- Externalize prompts and rules from day one. We started with everything inline in Python and refactored mid-build into YAML + Markdown. The refactor unblocked clinical review, made the demo more legible to judges, and cost us nothing at runtime (`@lru_cache` on file reads).
- SHARP is the right healthcare-MCP pattern. Sending FHIR context in HTTP headers (not the prompt, not the tool arguments) is a small spec decision with big safety implications. It made our integration trivial and our audit story clean.
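The caching trick is just stdlib. A sketch — the `prompts/` path and per-file layout are assumed from the repo description:

```python
from functools import lru_cache
from pathlib import Path

PROMPT_DIR = Path("prompts")  # e.g. care_gap_mcp/prompts/ in the repo


@lru_cache(maxsize=None)
def load_prompt(name: str) -> str:
    """Read a Markdown prompt once per process; repeat calls are free."""
    return (PROMPT_DIR / f"{name}.md").read_text()
```

Externalized files stay editable by clinicians, while the cache keeps the hot path as fast as an inline string.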
## What's next for Care Gap Closer for Healthcare
### Short term (weeks)
- More USPSTF/HEDIS rules: lipid screening, statin therapy for ASCVD, depression screening (PHQ-9), tobacco cessation counseling, vaccinations
- ADA / ACC / USPSTF clinical guideline coverage as separate rule packs (each loadable from its own YAML)
- A `BulkFindCareGaps` MCP tool that runs the engine against a panel of patients for the daily care-manager pre-huddle
### Medium term (months)
- Spanish, Mandarin, Hindi outreach variants — same rule, swap the prompt locale
- Health-literacy adaptation: detect reading-level mismatch and re-author at 4th-grade if the patient's portal data suggests it
- Hand-off to a scheduling A2A agent (declared today in `routing/rules.yaml`, not yet wired) so the closer can not just suggest the visit but book it
- Hand-off to a clinical-decision-support agent for gaps where the answer isn't simply "screen" (e.g. statin: which dose? who's contraindicated?)
### Longer term
- Closed-loop measurement: did the outreach lead to a completed screen? Feed outcomes back into the rule weights and the prompt selection.
- Bring-your-own-rule: clinicians upload a custom YAML rule pack at their org level; the engine validates and runs it.
- Pluggable evidence sources beyond FHIR: payer claims for screening completion, patient-reported outcomes, wearable signals.
The platform is built; the rules and prompts are the part that grows.