Agentic Medical Team

Inspiration

Every day, clinicians spend hours manually cross-referencing lab results, medication lists, imaging reports, and clinical notes scattered across fragmented EHR systems — time stolen from patient care. We saw a specialist physician spend 45 minutes piecing together a patient summary before a ward round. That shouldn't happen in 2026. We asked: what if an AI could function as a virtual attending physician — one that never sleeps, never misses a drug interaction, and synthesises an entire patient record in seconds?

What it does

Our system is a production-grade clinical decision support platform built on Google's Agent2Agent (A2A) protocol. A clinician types a natural-language question into Prompt Opinion, and our orchestrator automatically activates the right specialist agents.

Eight AI agents collaborate in real time:

Patient Records — pulls demographics, conditions, and medications from a live FHIR R4 server
Lab Diagnostics — interprets blood work and flags abnormal trends
Pharmacist — detects drug interactions with severity ratings (major / moderate / minor)
Radiology — surfaces imaging findings
Clinical Notes — summarises discharge notes and clinical history
Surgical Planning — assesses pre-operative risk
Attending Doctor — generates evidence-based treatment recommendations
MDT Coordination — synthesises everything into a multi-disciplinary team brief

The response is grounded in real FHIR data, validated by output guardrails, and returned in seconds — not hours.

How we built it

A2A v1 Protocol — a single Orchestrator endpoint on port 8003 handles all agent communication, publishing a standards-compliant agent card at /.well-known/agent-card.json that Prompt Opinion discovers automatically
CrewAI hierarchical crew — eight specialist agents run in-process; a keyword classifier performs dynamic task routing, activating only the agents relevant to each query
FHIR R4 integration — credentials travel securely in A2A message metadata, never touching the LLM prompt
RAG memory — ChromaDB with sentence-transformers gives every agent persistent long-term memory across sessions
Patient chat history — stored in PostgreSQL and prepended to each query for full conversation continuity
Output guardrails — a custom validator checks for hallucination indicators, verifies clinical claims are grounded in FHIR data, prevents PII/token leakage, and sanitises output before it reaches the clinician
One-command deployment — the entire stack runs with docker-compose up
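
To make the dynamic routing step concrete, here is a minimal sketch of a keyword classifier that picks a subset of agents for a query. It is an illustration under stated assumptions, not the project's actual classifier: the agent names follow the list in this write-up, while the keyword sets, the `route` function, and the fallback to the Patient Records agent are all hypothetical.

```python
import re

# Hypothetical keyword table. Agent names mirror the write-up;
# the keywords themselves are illustrative assumptions.
AGENT_KEYWORDS = {
    "patient_records": {"demographics", "conditions", "history"},
    "lab_diagnostics": {"lab", "labs", "blood", "creatinine", "potassium"},
    "pharmacist": {"drug", "interaction", "interactions", "dose"},
    "radiology": {"imaging", "xray", "ct", "mri", "scan"},
    "clinical_notes": {"discharge", "note", "notes"},
    "surgical_planning": {"surgery", "surgical", "operative"},
}

# Assumed fallback when no keyword matches.
DEFAULT_AGENTS = ["patient_records"]


def route(query: str) -> list[str]:
    """Return the agents whose keyword set overlaps the query tokens."""
    tokens = set(re.findall(r"[a-z0-9]+", query.lower()))
    selected = [
        name for name, keywords in AGENT_KEYWORDS.items() if tokens & keywords
    ]
    return selected or list(DEFAULT_AGENTS)


if __name__ == "__main__":
    print(route("Any drug interactions with warfarin?"))
    print(route("Summarise the latest blood work and CT scan"))
```

A plain keyword match like this is cheap and easy to audit, but it also shows why tuning is hard: a query that uses a synonym missing from the table falls through to the default.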

Challenges we ran into

A2A spec compliance — implementing the full v1 agent card schema (supportedInterfaces, SMART-on-FHIR scopes, security schemes) correctly so Prompt Opinion could auto-discover our agents took significant iteration
FHIR credential security — ensuring credentials flowed through A2A metadata and were never serialised into LLM prompts required a careful architecture review
Dynamic task routing accuracy — tuning the keyword classifier to correctly activate the right subset of agents without over- or under-firing across diverse clinical queries
Hallucination guardrails — defining reliable heuristics for grounding checks without blocking legitimate clinical language that lacks explicit FHIR citations
Agent coherence in hierarchical crew — getting eight agents to produce a unified, non-contradictory output required careful prompt engineering and result synthesis in the MDT agent
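
The guardrail trade-off above can be illustrated with a small sketch. It assumes two simple heuristics that are not taken from the project's validator: redacting bearer-style tokens with a regular expression, and flagging any number in the answer that does not appear among values extracted from FHIR data. The names `redact_tokens`, `ungrounded_numbers`, and `validate` are hypothetical.

```python
import re

# Assumed pattern for bearer-style tokens; real secret formats vary.
TOKEN_PATTERN = re.compile(r"\bBearer\s+[A-Za-z0-9._\-]+")
NUMBER_PATTERN = re.compile(r"\d+(?:\.\d+)?")


def redact_tokens(text: str) -> str:
    """Replace anything that looks like a bearer token with a placeholder."""
    return TOKEN_PATTERN.sub("[REDACTED]", text)


def ungrounded_numbers(answer: str, fhir_values: set[str]) -> list[str]:
    """Return numbers in the answer that are absent from the FHIR values."""
    return [n for n in NUMBER_PATTERN.findall(answer) if n not in fhir_values]


def validate(answer: str, fhir_values: set[str]) -> tuple[str, list[str]]:
    """Redact tokens first, then report numeric claims with no FHIR backing."""
    cleaned = redact_tokens(answer)
    return cleaned, ungrounded_numbers(cleaned, fhir_values)


if __name__ == "__main__":
    text = "Potassium is 5.9 mmol/L, up from 4.1."
    print(validate(text, {"5.9"}))
```

A numeric check of this kind catches invented lab values but says nothing about qualitative statements, which is where blocking legitimate clinical language becomes a risk.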

Accomplishments that we're proud of

Built a fully A2A v1-compliant multi-agent system that Prompt Opinion discovers and invokes with zero manual wiring
Achieved a 40–60% latency reduction through dynamic task routing, since irrelevant agents are never invoked
Delivered end-to-end FHIR security — no credentials, no PHI, no API tokens ever leak into an LLM context window
Built production-grade guardrails covering hallucination detection, PII scrubbing, and clinical grounding checks
The entire system — eight agents, RAG, PostgreSQL, ChromaDB — deploys from a single docker-compose up command
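
One way to keep credentials out of the LLM context is to split each incoming message into a prompt built only from the question text and a separate credential object built only from metadata. The sketch below assumes hypothetical metadata keys (`fhir_url`, `fhir_token`) and hypothetical names (`FhirContext`, `split_message`); the project's real message shape may differ.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class FhirContext:
    """Credentials for the FHIR server; never passed to prompt-building code."""

    base_url: str
    token: str = field(repr=False)  # keep the secret out of logs and reprs

    def auth_headers(self) -> dict:
        return {"Authorization": f"Bearer {self.token}"}


def split_message(text: str, metadata: dict) -> tuple[str, FhirContext]:
    """Build the prompt from text only and the FHIR context from metadata only."""
    # Hypothetical metadata keys, chosen for illustration.
    ctx = FhirContext(base_url=metadata["fhir_url"], token=metadata["fhir_token"])
    prompt = f"Clinician question: {text}"
    return prompt, ctx


if __name__ == "__main__":
    prompt, ctx = split_message(
        "Any interactions with warfarin?",
        {"fhir_url": "https://fhir.example.org", "fhir_token": "secret-123"},
    )
    print(prompt)
    print(ctx)  # the token is omitted from the repr
```

Because the prompt is built from `text` alone, a reviewer can confirm by reading one function that nothing from the metadata reaches the model.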

What we learned

The A2A protocol is genuinely powerful for healthcare: the agent card's skill taxonomy maps naturally to clinical specialties, and the metadata channel is the right place for sensitive credentials
Dynamic routing matters more than raw model quality — sending only relevant context to each specialist agent produced more accurate, faster responses than a single monolithic prompt
Guardrails are non-negotiable in clinical AI — even well-grounded models occasionally hedge with language that could mislead a clinician; automated output validation is essential, not optional
FHIR R4 is rich but inconsistent: real-world FHIR servers often return incomplete resources or omit them entirely, so resilient tooling that degrades gracefully is critical
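
As an illustration of graceful degradation, the sketch below summarises Observation resources from a FHIR Bundle that has already been parsed into a Python dict, and tolerates missing entries, codes, and values. The field names (`entry`, `resource`, `code.text`, `code.coding[].display`, `valueQuantity`) follow FHIR R4; the function name `summarise_observations` and the fallback wording are assumptions made for this example.

```python
def summarise_observations(bundle: dict) -> list[str]:
    """Turn a FHIR Bundle of Observations into readable lines, skipping gaps."""
    lines = []
    for entry in bundle.get("entry") or []:
        resource = entry.get("resource") or {}
        if resource.get("resourceType") != "Observation":
            continue  # ignore empty entries and other resource types
        code = resource.get("code") or {}
        codings = code.get("coding") or [{}]
        name = code.get("text") or codings[0].get("display") or "unnamed observation"
        quantity = resource.get("valueQuantity") or {}
        value = quantity.get("value")
        if value is None:
            # Say explicitly that the value is absent instead of guessing.
            lines.append(f"{name}: value not recorded")
        else:
            unit = quantity.get("unit", "")
            lines.append(f"{name}: {value} {unit}".strip())
    return lines


if __name__ == "__main__":
    sample = {
        "entry": [
            {"resource": {"resourceType": "Observation",
                          "code": {"text": "Potassium"},
                          "valueQuantity": {"value": 5.9, "unit": "mmol/L"}}},
            {"resource": {"resourceType": "Observation",
                          "code": {"coding": [{"display": "Creatinine"}]}}},
            {},
        ]
    }
    for line in summarise_observations(sample):
        print(line)
```

Reporting "value not recorded" keeps the gap visible to the downstream agent, which is safer than silently dropping the observation.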

What's next for Agentic Medical Team

SMART-on-FHIR OAuth — replace API key auth with full SMART launch flow for EHR-embedded deployment
Streaming responses — implement A2A streaming so clinicians see agent outputs appear incrementally, reducing perceived latency
Additional specialist agents — Cardiology, Oncology, and Psychiatry agents are already scoped
Audit trail & explainability — every clinical recommendation linked back to the specific FHIR resource that grounded it, for regulatory compliance
Multi-patient dashboard — extend beyond single-patient Q&A to ward-level triage and escalation prioritisation
Clinical trial matching — integrate a trial eligibility agent that cross-references patient conditions against open trials in real time

Built With

agents
crewai
docker
docker-compose
fhir-r4
google-a2a-sdk
google-adk
google-gemini
httpx
litellm
pgvector
postgresql
python
rag
render
sentence-transformers
smart-on-fhir
uvicorn

Updates

Rishabh Singh started this project — May 11, 2026 12:30 PM EDT
