Clinical Decision Orchestrator

Complete patient assessment. Zero delay.

💡 Inspiration

Every day, clinicians face an impossible cognitive load — juggling vitals, drug charts, lab results, screening schedules, and dietary restrictions for dozens of patients simultaneously. A single missed drug interaction or an overlooked early warning score can cascade into a life-threatening situation.

We asked ourselves: what if a doctor could summon the collective reasoning of five specialist minds in the time it takes to read a single chart?

The inspiration came from watching real ward rounds — the rushed handovers, the post-it notes on whiteboards, the junior doctor Googling a drug interaction at 2 AM. We knew AI could do better. Not replace the clinician, but amplify them — giving them a structured, evidence-based second opinion in seconds, for every patient, every time.

That vision became the Clinical Decision Orchestrator.

🏥 What It Does

The Clinical Decision Orchestrator is a multi-agent AI system that delivers a complete clinical patient assessment by coordinating 5 specialist agents in an intelligent cascade — all triggered by a single natural language query.

Ask: "Give me a full assessment for Ravi Sharma"
Get: A complete clinical decision report in seconds, covering:

| Agent | Output |
| --- | --- |
| 🔴 Clinical Triage Agent | NEWS2 score (0–20), risk level, immediate action |
| 💊 Medication Safety Agent | Drug interactions, allergy conflicts, severity ratings |
| 🧬 Diagnosis Support Agent | Top 3 differentials with likelihood percentages |
| 📋 Care Gap Agent | Overdue screenings, vaccinations, USPSTF gaps |
| 🥗 Nutrition & Diet Safety Agent | Drug-food interactions, personalised dietary plan |

The NEWS2 (National Early Warning Score 2) is computed as:

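$$\text{NEWS2} = \sum_{i=1}^{7} s_i, \qquad s_i \in \{0, 1, 2, 3\}$$

Where $s_i$ is the score for each of the seven parameters: respiratory rate, oxygen saturation, supplemental oxygen, systolic BP, heart rate, consciousness, and temperature. The parameters are not weighted. Supplemental oxygen scores either 0 (air) or 2 (oxygen), which gives the 0–20 range. An aggregate score $\geq 7$ triggers an urgent or emergency clinical response.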
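As an illustration, the scoring can be sketched as a set of threshold bands per parameter. This is a minimal sketch, not the project's actual code: the names `Vitals`, `band`, `news2_scores`, and `risk_level` are hypothetical, the thresholds follow the published NEWS2 chart using SpO2 Scale 1 only, and respiratory rate, SpO2, blood pressure, and heart rate are assumed to be whole numbers. The NEWS2 chart also scores supplemental oxygen, so the sketch takes seven inputs.

```python
from dataclasses import dataclass


@dataclass
class Vitals:
    resp_rate: int      # breaths per minute
    spo2: int           # oxygen saturation, %
    on_oxygen: bool     # receiving supplemental oxygen
    systolic_bp: int    # mmHg
    pulse: int          # beats per minute
    alert: bool         # False for new confusion or V/P/U on the ACVPU scale
    temperature: float  # degrees Celsius, one decimal place


def band(value, bands):
    """Return the score of the first (upper_bound, score) band containing value."""
    for upper, score in bands:
        if value <= upper:
            return score
    raise ValueError(f"value {value} not covered by bands")


INF = float("inf")
RESP = [(8, 3), (11, 1), (20, 0), (24, 2), (INF, 3)]
SPO2 = [(91, 3), (93, 2), (95, 1), (INF, 0)]  # SpO2 Scale 1 only
SBP = [(90, 3), (100, 2), (110, 1), (219, 0), (INF, 3)]
PULSE = [(40, 3), (50, 1), (90, 0), (110, 1), (130, 2), (INF, 3)]
TEMP = [(35.0, 3), (36.0, 1), (38.0, 0), (39.0, 1), (INF, 2)]


def news2_scores(v):
    """Score each of the seven NEWS2 parameters separately."""
    return {
        "resp_rate": band(v.resp_rate, RESP),
        "spo2": band(v.spo2, SPO2),
        "oxygen": 2 if v.on_oxygen else 0,
        "systolic_bp": band(v.systolic_bp, SBP),
        "pulse": band(v.pulse, PULSE),
        "consciousness": 0 if v.alert else 3,
        "temperature": band(v.temperature, TEMP),
    }


def news2_total(v):
    """Aggregate NEWS2 score: an unweighted sum of the parameter scores."""
    return sum(news2_scores(v).values())


def risk_level(v):
    """Map the aggregate score (and any single score of 3) to a risk band."""
    scores = news2_scores(v)
    total = sum(scores.values())
    if total >= 7:
        return "HIGH"
    if total >= 5:
        return "MEDIUM"
    if 3 in scores.values():
        return "LOW-MEDIUM"
    return "LOW"
```

Keeping the per-parameter scores alongside the total makes it possible to show which vital sign drove the result, not only the final number.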

🔧 How We Built It

The system is built on a modern, standards-compliant healthcare AI stack:

Agent Layer

5 specialist agents coordinated via the A2A (Agent-to-Agent) protocol using Prompt Opinion
Each agent has a defined input/output contract and calls only the MCP tools it needs
The Orchestrator manages sequencing — triage runs first so its risk level informs every downstream agent
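The sequencing rule above can be sketched with plain asyncio. This is an illustration only: `triage_agent`, `downstream_agent`, and `run_cascade` are hypothetical stand-ins, the A2A transport is omitted, and running the four downstream agents concurrently is an assumption (the only stated constraint is that triage runs first).

```python
import asyncio


async def triage_agent(patient_id):
    # Hypothetical stand-in: a real agent would compute NEWS2 from FHIR vitals.
    return {"agent": "triage", "risk_level": "HIGH"}


async def downstream_agent(name, patient_id, risk_level):
    # Hypothetical stand-in for a downstream specialist agent.
    return {"agent": name, "risk_level_seen": risk_level}


DOWNSTREAM = ["medication_safety", "diagnosis_support", "care_gap", "nutrition"]


async def run_cascade(patient_id):
    # Triage must finish first so its risk level can be handed to the rest.
    triage = await triage_agent(patient_id)
    risk = triage["risk_level"]
    # The remaining agents receive the same risk level and run concurrently.
    results = await asyncio.gather(
        *(downstream_agent(name, patient_id, risk) for name in DOWNSTREAM)
    )
    return [triage, *results]
```

Calling `asyncio.run(run_cascade("example-patient"))` returns the triage result followed by one result per downstream agent, in the order listed in `DOWNSTREAM`.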

Tool Layer

14 MCP (Model Context Protocol) tools spanning vitals retrieval, drug interaction lookup, diagnosis fetching, nutritional alert generation, and more
Tools are scoped per agent — no agent has access beyond its clinical domain
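One simple way to enforce per-agent scoping is an allowlist checked before every tool call. The sketch below assumes that approach. The tool names, the `TOOL_SCOPES` table, and `call_tool` are hypothetical and do not reflect the project's actual 14 tools or any MCP SDK API.

```python
# Hypothetical allowlist: which tools each agent may call.
TOOL_SCOPES = {
    "triage": {"get_vitals"},
    "medication_safety": {"get_medications", "get_allergies", "check_interactions"},
    "diagnosis_support": {"get_conditions", "get_vitals"},
    "care_gap": {"get_screening_history"},
    "nutrition": {"get_medications", "get_dietary_alerts"},
}


class ToolAccessError(PermissionError):
    """Raised when an agent requests a tool outside its clinical domain."""


def call_tool(agent, tool, registry, **kwargs):
    # Refuse the call before touching the registry if the tool is out of scope.
    if tool not in TOOL_SCOPES.get(agent, set()):
        raise ToolAccessError(f"{agent} is not allowed to call {tool}")
    return registry[tool](**kwargs)
```

Checking the allowlist in one place means a misconfigured prompt cannot widen an agent's access by naming a tool it was never granted.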

Data Layer

FHIR R4 (Fast Healthcare Interoperability Resources) as the data standard
Resources used: Patient, Observation, MedicationRequest, Condition, AllergyIntolerance
HAPI FHIR public sandbox as the development server
SHARP (Secure Healthcare Context Propagation) for passing clinical context securely across agents
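For illustration, the five resource types can be requested with standard FHIR search URLs. This sketch only builds the URLs and makes no network calls. It assumes the public HAPI R4 base URL `https://hapi.fhir.org/baseR4`, and the helper names and the chosen filters (`category=vital-signs`, `status=active`) are assumptions rather than the project's actual queries.

```python
from urllib.parse import urlencode

# Assumed base URL of the public HAPI FHIR R4 test server.
FHIR_BASE = "https://hapi.fhir.org/baseR4"


def search_url(resource_type, patient_id, **params):
    """Build a FHIR search URL filtered to one patient."""
    query = {"patient": patient_id, **params}
    return f"{FHIR_BASE}/{resource_type}?{urlencode(query)}"


def patient_resource_urls(patient_id):
    """Return one URL per FHIR resource type used in an assessment."""
    return {
        "Patient": f"{FHIR_BASE}/Patient/{patient_id}",
        "Observation": search_url("Observation", patient_id, category="vital-signs"),
        "MedicationRequest": search_url("MedicationRequest", patient_id, status="active"),
        "Condition": search_url("Condition", patient_id),
        "AllergyIntolerance": search_url("AllergyIntolerance", patient_id),
    }
```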

Standards

NEWS2 — NHS deterioration scoring
USPSTF — Preventive care gap guidelines
ICD-10, RxNorm, LOINC — Coding interoperability

⚠️ All patient data is 100% synthetic. No real Protected Health Information (PHI) was used at any point. The architecture is designed with HIPAA requirements in mind.

😤 Challenges We Ran Into

This was not a smooth build. We hit walls — real ones.

1. FHIR Data Format Hell

FHIR R4 is powerful, but the data we pulled from the public HAPI sandbox was inconsistent and deeply nested. Vital signs came back in different Observation structures depending on the LOINC code: the same conceptual field could arrive as valueQuantity, component.valueQuantity, or valueCodeableConcept, and handling all three cost us a great deal of time. We had to write a normalisation layer just to get reliable vitals into the NEWS2 calculator.
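A normalisation step of this kind might look like the sketch below, which flattens the three value shapes into one `{LOINC code: value}` mapping. It is a minimal illustration under stated assumptions: `extract_values` is a hypothetical helper, Observations are plain dicts parsed from JSON, and units, `dataAbsentReason`, and other value types are ignored.

```python
def _codes(codeable):
    """Collect every coding code from a FHIR CodeableConcept dict."""
    return {c.get("code") for c in (codeable or {}).get("coding", []) if c.get("code")}


def extract_values(observation):
    """Return {code: value} for the Observation and each of its components."""
    found = {}

    def record(node):
        if "valueQuantity" in node:
            value = node["valueQuantity"].get("value")
        elif "valueCodeableConcept" in node:
            concept = node["valueCodeableConcept"]
            # Prefer the free text, then fall back to the first coding.
            value = concept.get("text") or next(
                (c.get("display") or c.get("code") for c in concept.get("coding", [])),
                None,
            )
        else:
            return  # panel-level node with no value of its own
        if value is None:
            return
        for code in _codes(node.get("code")):
            found[code] = value

    record(observation)
    for component in observation.get("component", []):
        record(component)
    return found
```

A blood pressure panel, for example, carries no top-level value, so only its systolic and diastolic components end up in the result.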

2. MCP Connection Instability

Connecting 14 MCP tools across 5 agents meant 14 potential failure points. We faced intermittent connection timeouts, tool registration failures, and context propagation errors — especially when agents tried to call tools in rapid succession. Implementing retry logic and graceful degradation (returning partial results rather than crashing the full cascade) was a hard-won lesson.
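The retry and graceful degradation pattern described above can be sketched as a small wrapper. This is an assumption-laden illustration: `call_with_retry` is a hypothetical helper, the tool is any Python callable, and only `TimeoutError` and `ConnectionError` are treated as transient.

```python
import time


def call_with_retry(tool, *args, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call a tool with exponential backoff; never raise on transient failure."""
    last_error = None
    for attempt in range(attempts):
        try:
            return {"ok": True, "data": tool(*args)}
        except (TimeoutError, ConnectionError) as exc:
            last_error = exc
            # Back off before the next attempt, but not after the last one.
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)
    # Degrade gracefully: report the failure instead of crashing the cascade.
    return {"ok": False, "data": None, "error": str(last_error)}
```

Because the wrapper always returns a dict, the caller can assemble a partial report and mark the missing section as unavailable. The `sleep` parameter exists so the delay can be replaced in tests.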

3. Agent Sequencing & Context Passing

The Orchestrator needs to pass the triage risk level into the Diagnosis, Care Gap, and Nutrition agents. Getting that context to flow cleanly through the A2A protocol without duplication or loss — especially when one agent timed out — required careful state management we hadn't fully anticipated at the start.
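One way to keep context from being duplicated or lost is to thread an immutable context object through each agent call and record timeouts explicitly. The sketch below assumes that design. `ClinicalContext` and `run_agent` are hypothetical names, and the A2A message format is not modelled.

```python
import asyncio
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ClinicalContext:
    patient_id: str
    risk_level: str     # set once by triage, never mutated downstream
    results: tuple = () # (agent_name, payload) pairs, in completion order
    failed: tuple = ()  # names of agents that timed out


async def run_agent(ctx, name, agent, timeout):
    """Run one agent and return a new context; the input context is unchanged."""
    try:
        payload = await asyncio.wait_for(agent(ctx), timeout)
    except asyncio.TimeoutError:
        # Record the failure but keep everything gathered so far.
        return replace(ctx, failed=ctx.failed + (name,))
    return replace(ctx, results=ctx.results + ((name, payload),))
```

Since every step returns a fresh context, a timed-out agent cannot leave half-written state behind, and the triage risk level stays available to the agents that follow.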

4. Drug Interaction Coverage Gaps

RxNorm-based interaction lookups are only as good as the database behind them. Several edge-case combinations (especially for post-surgical polypharmacy patients) returned no data, forcing us to handle null results gracefully while still surfacing a clinically useful warning.
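The key point is that "no data" must not be reported as "no interaction". A minimal sketch of that distinction follows. The in-memory table stands in for a real RxNorm-backed lookup, the single entry and its severity label are illustrative only, and all names are hypothetical.

```python
# Illustrative table only; a real system would query an interaction database.
KNOWN_INTERACTIONS = {
    frozenset({"warfarin", "aspirin"}): {
        "severity": "major",
        "description": "Increased bleeding risk.",
    },
}


def lookup_interaction(drug_a, drug_b):
    """Return an interaction record, or None when the pair is not covered."""
    return KNOWN_INTERACTIONS.get(frozenset({drug_a.lower(), drug_b.lower()}))


def interaction_finding(drug_a, drug_b, lookup=lookup_interaction):
    record = lookup(drug_a, drug_b)
    if record is None:
        # Absence of data is surfaced as its own status, with a warning.
        return {
            "pair": (drug_a, drug_b),
            "status": "NO_DATA",
            "severity": "unknown",
            "message": "No interaction data available; manual review advised.",
        }
    return {
        "pair": (drug_a, drug_b),
        "status": "FOUND",
        "severity": record["severity"],
        "message": record["description"],
    }
```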

5. Designing for Clinician Trust

Pure accuracy isn't enough — clinicians need to trust the output. Formatting the report so that evidence for and against each diagnosis was explicit, severity levels were colour-coded, and immediate actions were unambiguous took more iteration than the underlying logic.

🏆 Accomplishments That We're Proud Of

✅ 5-agent cascade completing in seconds — what would take a clinical team 15+ minutes to manually compile
✅ Full FHIR R4 compliance — interoperable with real hospital systems in principle
✅ NEWS2 implementation matching NHS clinical standards exactly
✅ Zero PHI — privacy-safe from day one, not bolted on afterwards
✅ Natural language interface — clinicians can query in plain English, no structured input required
✅ A unified clinical report that a doctor can act on immediately, without needing to cross-reference five different tools

📚 What We Learned

FHIR is a standard, not a guarantee. Implementations vary wildly. Never assume two FHIR servers will return the same structure for the same resource type.
Multi-agent systems fail at the seams. The hardest bugs weren't inside individual agents — they were in the handoffs between them. Context propagation, error handling across agents, and partial failure recovery deserve as much engineering attention as the core logic.
Clinical formatting is a UX problem. Showing a NEWS2 score of 8 is useless without the why. We learned to design outputs for the cognitive state of a busy clinician at 3 AM — scannable, prioritised, and action-oriented.
Synthetic data is harder to make realistic than it looks. Generating believable patient profiles with coherent vitals, plausible drug combinations, and realistic care gaps took significant effort.
Agent trust hierarchies matter. When agents disagree (e.g., triage says LOW risk but diagnosis flags a concerning differential), the system needs a principled way to surface the conflict rather than silently resolve it.
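As an illustration of surfacing a disagreement instead of resolving it silently, the sketch below flags concerning differentials when triage reports low risk. The field names (`likelihood`, `concerning`), the 0.2 threshold, and `find_conflicts` are all assumptions made for the example, not the project's actual rules.

```python
RISK_ORDER = {"LOW": 0, "LOW-MEDIUM": 1, "MEDIUM": 2, "HIGH": 3}


def find_conflicts(triage_risk, differentials, likelihood_threshold=0.2):
    """Return a list of conflicts to show the clinician; empty if none."""
    conflicts = []
    # Only a low triage risk can contradict a concerning differential.
    if RISK_ORDER[triage_risk] > RISK_ORDER["LOW-MEDIUM"]:
        return conflicts
    for diff in differentials:
        if diff["concerning"] and diff["likelihood"] >= likelihood_threshold:
            conflicts.append({
                "triage_risk": triage_risk,
                "diagnosis": diff["name"],
                "likelihood": diff["likelihood"],
                "note": "Triage risk is low but a concerning differential was flagged.",
            })
    return conflicts
```

The function only reports the conflict. Deciding which agent to believe is left to the clinician reading the report.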

🚀 What's Next for Clinical Decision Orchestrator

Real EHR integration — connecting to live hospital systems via HL7 FHIR APIs (with full governance and consent frameworks)
Batch ward round mode — assess all patients in an ICU or ward simultaneously, prioritised by NEWS2 score
Temporal reasoning — tracking how a patient's risk trajectory evolves over hours and days, not just point-in-time snapshots
Voice interface — letting clinicians query during ward rounds hands-free
Specialist expansion — adding agents for radiology correlation, surgical risk scoring (e.g. P-POSSUM), and mental health flagging
Audit trail & explainability — every clinical recommendation logged with the evidence chain, for medico-legal defensibility
Clinician feedback loop — letting doctors rate outputs so the system improves over time