Inspiration

The rapid advancement of LLMs has unlocked enormous potential for autonomous healthcare agents capable of assisting clinicians, summarizing patient histories, and automating administrative workflows. However, two obstacles stand between AI models and actual healthcare data: security and interoperability.

LLMs speak JSON, but healthcare speaks FHIR (Fast Healthcare Interoperability Resources). Furthermore, giving an AI agent direct, raw access to a hospital's EHR system is a serious HIPAA liability. We realized that for healthcare AI to be viable, we needed a secure, intelligent translation layer: middleware that lets AI agents interact safely with patient data while strictly enforcing authorization, context boundaries, and audit logging. That realization led to the FHIR-MCP Gateway.

What it does

The FHIR-MCP Gateway is a production-ready middleware server that connects LLM agents securely to EHR systems (like Epic, Cerner, or HAPI FHIR). It leverages the new Model Context Protocol (MCP) to expose complex FHIR APIs as simple, natural-language-friendly tools.

Instead of writing complex OAuth flows and FHIR Bundle pagination logic, an AI agent simply calls a tool like get_patient or search_conditions. The gateway intercepts this call and:

Handles Auth: Retrieves or refreshes SMART on FHIR access tokens from a sub-millisecond Redis cache.

Injects Context: Uses SHARP (Secure Healthcare Agent Request Protocol) headers to auto-inject the correct patient context so the agent can never accidentally query the wrong patient.

Normalizes Data: Converts paginated, deeply-nested FHIR XML/JSON Bundles into flat, agent-friendly JSON representations.

Logs Audits: Writes a content-free, HIPAA-safe trace of the action to a PostgreSQL audit log.

Summarizes: Calls Claude 3 Haiku to translate raw clinical JSON into a plain-English medical narrative before returning it to the agent.
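The "Normalizes Data" step above can be sketched for Condition resources as follows. This is a minimal illustration, not the gateway's actual adapter: the field paths follow FHIR R4, but the flat output shape and the function name are our own invention.

```python
# Sketch of the Bundle-normalization step for Condition resources.
# Field paths follow FHIR R4; the flat output shape is illustrative.
def normalize_conditions(bundle: dict) -> list[dict]:
    records = []
    for entry in bundle.get("entry", []):
        resource = entry.get("resource", {})
        if resource.get("resourceType") != "Condition":
            continue  # skip OperationOutcome or other mixed entries
        coding = (resource.get("code", {}).get("coding") or [{}])[0]
        records.append({
            "id": resource.get("id"),
            "code": coding.get("code"),
            "display": coding.get("display"),
            "onset": resource.get("onsetDateTime"),
        })
    return records

bundle = {
    "resourceType": "Bundle",
    "type": "searchset",
    "entry": [{"resource": {
        "resourceType": "Condition",
        "id": "c1",
        "code": {"coding": [{"code": "E11.9", "display": "Type 2 diabetes"}]},
        "onsetDateTime": "2021-03-01",
    }}],
}
print(normalize_conditions(bundle)[0]["display"])  # → Type 2 diabetes
```

The agent never sees `entry[].resource.code.coding[0]` nesting, only the flat record.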

How we built it

We engineered the gateway with a zero-tolerance policy for framework bloat. Every architectural choice was deliberate:

Protocol Layer: Built on FastMCP (a Python SDK) to handle all JSON-RPC 2.0 communication, giving us auto-generated tool schemas via Pydantic v2 without writing boilerplate.

Auth & Cache: Implemented full SMART on FHIR with PKCE via authlib, backed by a lightning-fast Redis 7 cache to eliminate redundant OAuth roundtrips.

Data Flow: Used asynchronous HTTPX to fetch FHIR data concurrently without thread-blocking.

Observability: Integrated OpenTelemetry to export distributed traces to Jaeger, giving us pinpoint visibility into the latency of multi-agent tool chains.

Persistence: Used SQLAlchemy async with PostgreSQL for non-blocking, high-throughput audit logging.
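For reference, an MCP `tools/call` request in the JSON-RPC 2.0 framing that FastMCP handles for us looks roughly like this (the tool name and arguments here are illustrative):

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "get_patient",
    "arguments": { "patient_id": "example-123" }
  }
}
```

FastMCP generates the matching tool schema from Python type hints via Pydantic, so the gateway never hand-writes these envelopes.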

Challenges we ran into

The hardest challenge was context leakage. When multiple autonomous agents make concurrent, async requests to the gateway, it is critical that Agent A's token is never used to fetch Agent B's patient data. We solved this by implementing a custom SHARP middleware that binds the patient ID and FHIR tokens to a Python ContextVar, keeping the context safely isolated within each asyncio task.
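The isolation guarantee can be demonstrated with a stdlib-only sketch (variable and function names here are illustrative, not the actual SHARP middleware):

```python
import asyncio
from contextvars import ContextVar

# Illustrative name; the real middleware binds FHIR tokens as well.
current_patient: ContextVar[str] = ContextVar("current_patient")

async def handle_request(agent_id: str, patient_id: str, seen: dict) -> None:
    current_patient.set(patient_id)
    await asyncio.sleep(0)  # yield so the other agent's task runs in between
    seen[agent_id] = current_patient.get()  # still this task's own value

async def main() -> dict:
    seen: dict = {}
    # gather() wraps each coroutine in a Task with its own context copy,
    # so one agent's set() can never leak into the other's get().
    await asyncio.gather(
        handle_request("agent-a", "patient-001", seen),
        handle_request("agent-b", "patient-002", seen),
    )
    return seen

seen = asyncio.run(main())
print(seen)  # → {'agent-a': 'patient-001', 'agent-b': 'patient-002'}
```

Even though both tasks interleave at the `await`, each one reads back only the patient it set.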

Additionally, handling FHIR Bundle pagination invisibly to the agent was tricky. LLMs have context window limits, so returning a 500-page FHIR Bundle breaks the agent. We had to build an intelligent FHIR Client Adapter that automatically handles pagination, enforces strict limits (max_resources), and normalizes the output.
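A simplified version of that pagination loop looks like this; `fetch` stands in for the real async HTTPX call (here, any callable returning a parsed Bundle dict), and the names are illustrative rather than the adapter's actual API:

```python
# Sketch of the FHIR Client Adapter's pagination loop with a hard cap.
def fetch_all(fetch, first_url: str, max_resources: int = 100) -> list[dict]:
    resources, url = [], first_url
    while url and len(resources) < max_resources:
        bundle = fetch(url)
        for entry in bundle.get("entry", []):
            if len(resources) >= max_resources:
                break  # enforce the cap mid-page, not just per page
            resources.append(entry["resource"])
        # FHIR Bundles advertise the next page via a link with relation "next".
        url = next((link["url"] for link in bundle.get("link", [])
                    if link.get("relation") == "next"), None)
    return resources

# Two fake pages of three resources each, capped at four.
pages = {
    "/Condition?page=1": {
        "entry": [{"resource": {"id": str(i)}} for i in range(3)],
        "link": [{"relation": "next", "url": "/Condition?page=2"}],
    },
    "/Condition?page=2": {
        "entry": [{"resource": {"id": str(i)}} for i in range(3, 6)],
        "link": [],
    },
}
result = fetch_all(pages.__getitem__, "/Condition?page=1", max_resources=4)
print(len(result))  # → 4
```

The agent receives at most `max_resources` items no matter how many pages the server holds, which is what keeps a 500-page Bundle out of the context window.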

Accomplishments that we're proud of

The Plugin Architecture: We built a Python metaclass ToolRegistry that automatically discovers new tools on startup. Adding a new FHIR resource tool now takes exactly one file and under 30 minutes.
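The auto-discovery idea can be sketched in a few lines (class and attribute names here are invented for illustration; the real registry also handles module discovery on startup):

```python
# A metaclass that registers every concrete tool subclass at class-creation
# time, so importing a tool module is enough to make the tool available.
class ToolRegistry(type):
    tools: dict[str, type] = {}

    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        if bases:  # skip the abstract base class itself
            ToolRegistry.tools[namespace.get("tool_name", name.lower())] = cls

class BaseTool(metaclass=ToolRegistry):
    tool_name = "base"

# Each new FHIR resource tool is just one subclass in one file.
class GetPatient(BaseTool):
    tool_name = "get_patient"

class SearchConditions(BaseTool):
    tool_name = "search_conditions"

print(sorted(ToolRegistry.tools))  # → ['get_patient', 'search_conditions']
```

No registration call, no config entry: defining the class is the registration.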

HIPAA-Safe Auditing by Design: Our PostgreSQL audit trail logs that a tool was called, but never logs the actual clinical data returned. It’s a structurally secure approach to logging.
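Structurally, a content-free audit row can be sketched like this (field names are our illustration, not the actual PostgreSQL schema the gateway persists via SQLAlchemy):

```python
import hashlib
from dataclasses import dataclass, asdict
from datetime import datetime, timezone

# Illustrative shape of a content-free audit record: it captures *that*
# a tool ran and for which (hashed) patient reference, never the
# clinical payload itself.
@dataclass
class AuditRecord:
    agent_id: str
    tool_name: str
    patient_ref_hash: str  # one-way hash, never the raw identifier
    status: str
    timestamp: str

def audit(agent_id: str, tool_name: str, patient_ref: str,
          status: str = "ok") -> dict:
    ref_hash = hashlib.sha256(patient_ref.encode()).hexdigest()[:16]
    return asdict(AuditRecord(agent_id, tool_name, ref_hash, status,
                              datetime.now(timezone.utc).isoformat()))

row = audit("agent-a", "get_patient", "Patient/example-123")
assert "resource" not in row  # no field exists to hold clinical content
print(sorted(row))
```

Because the record type simply has no field for the returned resource, leaking PHI into the log is a type error rather than a policy violation.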

The Premium Documentation Suite: We didn't just build the backend—we designed a high-performance, interactive, and visually stunning web documentation suite (complete with dynamic sequence diagrams and tech stack breakdowns) that rivals enterprise SaaS products.

What we learned

We learned that while standardizing APIs is hard, standardizing AI tool schemas is even harder. Finding the right balance between giving the LLM enough context to be useful, but not so much data that it overwhelms the token limit, required extensive iteration. We also learned the massive performance benefits of caching OAuth tokens in Redis—dropping authorization latency from ~800ms to under 1ms on repeated agent calls.
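The caching pattern itself is simple. Here is an in-process stand-in for the Redis layer (in production this is redis-py, with `setex` on write and `get` on read; the names below are illustrative):

```python
import time

# Dict-based stand-in for the Redis token cache, keeping the sketch
# self-contained. Values carry an expiry deadline like Redis TTLs do.
_cache: dict[str, tuple[str, float]] = {}

def cache_token(key: str, token: str, ttl_seconds: float) -> None:
    _cache[key] = (token, time.monotonic() + ttl_seconds)

def get_token(key: str):
    entry = _cache.get(key)
    if entry and entry[1] > time.monotonic():
        return entry[0]  # cache hit: no OAuth round trip
    return None  # miss or expired: do the full SMART on FHIR exchange

cache_token("agent-a:fhir", "eyJ-example-token", ttl_seconds=300)
print(get_token("agent-a:fhir"))  # → eyJ-example-token
```

Every hit replaces an ~800 ms authorization round trip with a sub-millisecond lookup, which is where the latency win comes from.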

What's next for FHIR-MCP Gateway

Write-Back Capabilities: Currently, the gateway focuses heavily on READ and SEARCH operations. The next step is building safe, human-in-the-loop workflows for POST and PUT operations (e.g., allowing an agent to draft a clinical note or place a preliminary order).

Local LLM Support: Swapping out the Claude 3 Haiku summarizer for a lightweight, locally hosted medical model (like Llama 3 or MedAlpaca) to ensure zero data ever leaves the VPC.

Advanced Rate Limiting: Implementing token-bucket rate limiting per agent ID to prevent aggressive LLM loops from accidentally DDoS-ing the hospital's FHIR server.
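The rate-limiting idea sketches easily (parameters below are illustrative; a production version would keep one bucket per agent ID, likely in Redis):

```python
import time

# Minimal token-bucket limiter: tokens refill continuously at `rate`
# per second up to `capacity`; each allowed call spends one token.
class TokenBucket:
    def __init__(self, rate: float, capacity: float):
        self.rate, self.capacity = rate, capacity
        self.tokens, self.last = capacity, time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # an aggressive agent loop gets throttled here

bucket = TokenBucket(rate=5, capacity=2)
print([bucket.allow() for _ in range(3)])  # → [True, True, False]
```

Bursts up to `capacity` pass immediately; a runaway loop beyond the refill rate is rejected instead of hammering the hospital's FHIR server.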

Built With

Python · FastMCP · Pydantic v2 · authlib · Redis 7 · HTTPX · OpenTelemetry · Jaeger · SQLAlchemy · PostgreSQL · Claude 3 Haiku