Clinical Decision Support — Hybrid AI MCP

By one widely cited estimate, medical errors kill about 250,000 Americans every year, which would make them the third leading cause of death. Adverse drug events send 1.3 million people to the ER annually. And with 40% of elderly patients on five or more medications, clinicians face an overwhelming cognitive burden at the point of care.

We saw most AI healthcare submissions doing the same thing: dump FHIR data into an LLM and hope it doesn't hallucinate a fake drug interaction or invent a stroke risk score. That's not acceptable in clinical
decision support. A physician needs to audit every number.

So we built the opposite.

## What it does

Clinical Decision Support MCP Server exposes 9 production-grade clinical tools via the Model Context Protocol, working on top of any FHIR R4 endpoint:

generate_patient_summary — Aggregates demographics, conditions, medications, labs, allergies, and encounters into a clinician-ready narrative.
calculate_risk_scores — CHA2DS2-VASc (stroke risk), HEART (chest pain), MELD-Na (liver severity) — computed from FHIR data using published, peer-reviewed formulas.
check_drug_interactions — AI pharmacist analyzing polypharmacy with severity, mechanism, and clinical recommendations.
check_contraindications — Prescribing safety cross-referencing conditions, allergies, current meds, and renal/hepatic function with PASS / CAUTION / CONTRAINDICATED verdicts.
interpret_lab_results — LOINC-keyed reference range flagging plus AI clinical interpretation.
suggest_care_plan — Evidence-based recommendations citing AHA/ACC, ADA, KDIGO, AASLD guidelines.
parse_clinical_notes — NLP extraction of diagnoses, medications, procedures, vitals from unstructured documents.
FindPatientId / GetPatientAge — Foundational lookup utilities.
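
As a rough illustration of the LOINC-keyed flagging step in interpret_lab_results, the sketch below looks up a reference range by LOINC code and returns a flag. The function name, the `Flag` type, and the two ranges are assumptions made for this example, not the project's actual table; real reference ranges vary by laboratory, age, and sex.

```typescript
// Illustrative sketch only. The ranges below are placeholder adult values,
// not the project's reference table.
type Flag = "LOW" | "NORMAL" | "HIGH" | "UNKNOWN";

interface ReferenceRange {
  name: string;
  unit: string;
  low: number;
  high: number;
}

// Keyed by LOINC code so the lookup does not depend on free-text lab names.
const REFERENCE_RANGES: Record<string, ReferenceRange> = {
  "2823-3": { name: "Potassium", unit: "mmol/L", low: 3.5, high: 5.0 },
  "2951-2": { name: "Sodium", unit: "mmol/L", low: 135, high: 145 },
};

function flagLabResult(loincCode: string, value: number): Flag {
  const range: ReferenceRange | undefined = REFERENCE_RANGES[loincCode];
  // Codes without a known range are reported as UNKNOWN rather than guessed.
  if (!range) return "UNKNOWN";
  if (value < range.low) return "LOW";
  if (value > range.high) return "HIGH";
  return "NORMAL";
}
```

Only the flag is computed deterministically here; any narrative interpretation would be layered on top of it.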

## How we built it

The core architecture is hybrid AI — and it's the key differentiator:

Deterministic Layer: For every clinical score, we use the exact published formula. CHA2DS2-VASc follows Lip et al. (Chest 2010). MELD-Na uses the validated logarithmic formula from Kim et al. (N Engl J Med 2008). Because these are plain arithmetic in code, they cannot hallucinate.
AI Layer (Claude): Adds contextual interpretation, identifies clinically relevant findings, and produces clinician-ready narratives — on top of verified data, not in place of it.
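
To make the deterministic layer concrete, here is a minimal sketch of a CHA2DS2-VASc calculation that returns a point-by-point breakdown alongside the total. The point values are the published ones; the input shape and all names are assumptions for illustration, and the step that derives these booleans from FHIR Condition resources is left out.

```typescript
// Sketch under stated assumptions: the input has already been extracted
// from FHIR data. Field and function names are hypothetical.
interface Cha2ds2VascInput {
  age: number;
  female: boolean;
  congestiveHeartFailure: boolean;
  hypertension: boolean;
  diabetes: boolean;
  priorStrokeTiaOrThromboembolism: boolean;
  vascularDisease: boolean;
}

interface ScoreItem {
  criterion: string;
  points: number;
}

interface ScoreResult {
  total: number;
  breakdown: ScoreItem[];
}

function cha2ds2Vasc(p: Cha2ds2VascInput): ScoreResult {
  // Age contributes 2 points at 75 or older, 1 point at 65 to 74.
  const agePoints = p.age >= 75 ? 2 : p.age >= 65 ? 1 : 0;
  const breakdown: ScoreItem[] = [
    { criterion: "Congestive heart failure", points: p.congestiveHeartFailure ? 1 : 0 },
    { criterion: "Hypertension", points: p.hypertension ? 1 : 0 },
    { criterion: "Age (>=75: 2, 65-74: 1)", points: agePoints },
    { criterion: "Diabetes mellitus", points: p.diabetes ? 1 : 0 },
    { criterion: "Prior stroke/TIA/thromboembolism", points: p.priorStrokeTiaOrThromboembolism ? 2 : 0 },
    { criterion: "Vascular disease", points: p.vascularDisease ? 1 : 0 },
    { criterion: "Sex category (female)", points: p.female ? 1 : 0 },
  ];
  // The total is always the sum of the itemized lines, so it can be audited.
  const total = breakdown.reduce((sum, item) => sum + item.points, 0);
  return { total, breakdown };
}
```

For example, a 78-year-old woman with hypertension and diabetes scores 2 + 1 + 1 + 1 = 5, and each of those points appears as its own line in `breakdown`.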

Stack:

TypeScript + Express + @modelcontextprotocol/sdk on Node.js 20 (Alpine Docker)
Anthropic Claude (Haiku) for AI interpretation with exponential-backoff retry
FHIR R4 via Axios with parallel resource fetching (Promise.allSettled for graceful degradation)
SHARP Extension headers (x-fhir-server-url, x-fhir-access-token, x-patient-id) — patient ID is preferred from headers to prevent LLM hallucination of fake IDs
Deployed on Render with self-ping keep-alive
Published on the Prompt Opinion Marketplace
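
The exponential-backoff retry mentioned in the stack can be sketched as a small generic wrapper. This is an assumption about the general shape rather than the project's code: `withRetry`, `RetryOptions`, and the default values are hypothetical, and a real version would likely retry only on transient errors such as rate limits.

```typescript
// Hypothetical helper: retries an async operation, doubling the delay
// after each failed attempt (base, 2x base, 4x base, ...).
interface RetryOptions {
  maxAttempts: number;
  baseDelayMs: number;
}

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

async function withRetry<T>(
  operation: () => Promise<T>,
  options: RetryOptions = { maxAttempts: 3, baseDelayMs: 500 },
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < options.maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      // Do not wait after the final attempt; just surface the error.
      if (attempt < options.maxAttempts - 1) {
        await sleep(options.baseDelayMs * 2 ** attempt);
      }
    }
  }
  throw lastError;
}
```

A call to the model would then be wrapped as `withRetry(() => callModel(prompt))`, where `callModel` stands in for whatever client function the server uses.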

The platform is a spiritual successor to CDS Hooks: it builds on the same SMART on FHIR ecosystem that Josh Mandel helped bring to certified EHRs in the US, now extended to the agentic AI era through MCP.

## Challenges we ran into

Patient ID hallucination: In early testing, the LLM would invent FHIR patient IDs. We fixed this by making FhirDataService.getPatientId() strictly prefer the SHARP header over any tool argument.
Render cold starts: The free tier spins down after 15 minutes of inactivity and takes about 50 seconds to spin back up. We added a self-ping via RENDER_EXTERNAL_URL every 4 minutes, which was critical for judging windows.
Polypharmacy deduplication: Patients can have the same drug in both MedicationRequest and MedicationStatement. We dedupe by RxNorm code before passing to the interaction checker.
Graceful degradation: If a FHIR resource fetch fails, we don't fail the whole tool — Promise.allSettled lets us return partial summaries with clear "not available" markers.
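
Three of the fixes above can be sketched as small functions: header-first patient ID resolution, RxNorm-based deduplication, and partial results via Promise.allSettled. Apart from the `x-patient-id` header, every name here is hypothetical, and the real FhirDataService may be structured quite differently.

```typescript
// Header-first resolution: a model-supplied argument is only a fallback.
// Express lowercases incoming header names, hence the lowercase key.
function resolvePatientId(
  headers: Record<string, string | undefined>,
  toolArgPatientId?: string,
): string {
  const fromHeader = headers["x-patient-id"]?.trim();
  if (fromHeader) return fromHeader;
  const fromArg = toolArgPatientId?.trim();
  if (fromArg) return fromArg;
  throw new Error("No patient ID in SHARP header or tool arguments");
}

interface MedicationEntry {
  source: "MedicationRequest" | "MedicationStatement";
  rxNormCode: string;
  display: string;
}

// Keeps the first entry seen for each RxNorm code, preserving input order.
function dedupeByRxNorm(meds: MedicationEntry[]): MedicationEntry[] {
  const seen = new Map<string, MedicationEntry>();
  for (const med of meds) {
    if (!seen.has(med.rxNormCode)) seen.set(med.rxNormCode, med);
  }
  return Array.from(seen.values());
}

// Runs all fetchers in parallel; a failed fetch becomes a marker string
// instead of rejecting the whole summary.
async function fetchSections(
  fetchers: Record<string, () => Promise<string>>,
): Promise<Record<string, string>> {
  const names = Object.keys(fetchers);
  const settled = await Promise.allSettled(names.map((name) => fetchers[name]()));
  const summary: Record<string, string> = {};
  settled.forEach((result, i) => {
    summary[names[i]] = result.status === "fulfilled" ? result.value : "not available";
  });
  return summary;
}
```

The design choice in `resolvePatientId` is that a header value always wins, so an ID invented by the model can never override the one supplied by the host platform.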

## Accomplishments we're proud of

49 automated tests — 100% passing — covering every deterministic clinical calculation
Zero vendor lock-in — works with any FHIR R4 endpoint (Epic, Cerner, HAPI, open-source). A rural community health center gets the same clinical intelligence as Cleveland Clinic.
Audit-grade reasoning — every risk score has a point-by-point breakdown a physician can verify
Production-ready — Dockerized, health monitoring, parallel queries, exponential backoff retry, clinical disclaimers on every response
Synthetic data only — no PHI ever processed; demo uses HAPI FHIR sandbox + Synthea-generated patients

## What we learned

Determinism is non-negotiable for clinical scoring. LLMs can interpret, but they cannot be trusted to calculate. The hybrid pattern is the only responsible architecture.
MCP is the right primitive for clinical interoperability. Tool definitions map cleanly to clinical workflows.
FHIR-native > vendor-specific. Most major EHRs now expose a FHIR R4 endpoint. Building on FHIR means we're EHR-portable from day one.
AI in healthcare needs guardrails baked into the architecture, not bolted on. Disclaimers, source attribution, and deterministic verification have to be in the response shape itself.
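
One way to read "guardrails in the response shape" is a response type in which the disclaimer and the source attribution are required fields and the itemized breakdown is checked against the total before anything is returned. The sketch below is an assumption about what such a shape could look like; the field names, the disclaimer wording, and `buildResponse` are all hypothetical.

```typescript
interface ScoreLine {
  criterion: string;
  points: number;
}

interface DeterministicResult {
  score: number;
  breakdown: ScoreLine[];
  source: string; // citation for the published formula
}

interface ClinicalToolResponse {
  tool: string;
  deterministic: DeterministicResult | null;
  aiInterpretation: string | null;
  disclaimer: string; // required, so no response can omit it
}

// Placeholder wording, not the project's actual disclaimer text.
const DISCLAIMER =
  "For clinical decision support only. Not a substitute for professional clinical judgment.";

function buildResponse(
  tool: string,
  deterministic: DeterministicResult | null,
  aiInterpretation: string | null,
): ClinicalToolResponse {
  if (deterministic) {
    // Deterministic verification: the breakdown must add up to the score.
    const sum = deterministic.breakdown.reduce((acc, line) => acc + line.points, 0);
    if (sum !== deterministic.score) {
      throw new Error(`Breakdown sums to ${sum}, but score is ${deterministic.score}`);
    }
  }
  return { tool, deterministic, aiInterpretation, disclaimer: DISCLAIMER };
}
```

Because the checks live in the constructor of the response rather than in each tool, a new tool inherits them by default.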

## What's next

Pediatric-specific CDS — weight-based dosing, age-adjusted reference ranges
Integration with pharmacological databases for fully deterministic drug interaction checking (no LLM in the loop for high-risk interactions)
PELD scoring for pediatric liver disease
Multi-language support for global health equity — rural community clinics anywhere can plug in
Per-tool-call pricing on the Prompt Opinion Marketplace ($0.01–$0.05/call) for sustainable scaling

Built on open standards — because clinical reasoning should be both deterministic and intelligent.

Built With

anthropic-claude
axios
clinical-decision-support
docker
express.js
fhir-r4
hapi-fhir
healthcare-ai
model-context-protocol
node.js
render
sharp-extensions
smart-on-fhir
synthea
typescript

Updates

avadh-pro Dobariya started this project — May 11, 2026 12:04 PM EDT
