MediTwin AI

Inspiration

Healthcare professionals carry an impossible cognitive load. In a single patient encounter, a clinician must recall the patient's full history, interpret labs, check for drug interactions, review imaging, and synthesize a treatment plan — often in minutes, often alone.

We kept asking: what if every clinician had a full team of specialists available instantly, for every patient, every time?

That question became MediTwin AI — a digital twin of the patient, analyzed in real time by eight AI specialists working in parallel. The name is deliberate: not a single chatbot, but a twin of the patient that lives in the system and can be interrogated from every clinical angle simultaneously.


What It Does

MediTwin AI is a multi-agent clinical decision support system. When a clinician submits a patient query, eight specialized AI agents activate in parallel:

| Agent | Role |
| --- | --- |
| Patient Context Agent | Ingests and normalizes FHIR R4 patient data |
| Diagnosis Agent | Generates a ranked differential diagnosis with confidence scores |
| Lab Analysis Agent | Flags abnormal values, trends, and critical findings |
| Drug Safety Agent | Checks every medication against allergies and contraindications |
| Imaging Triage Agent | Analyzes chest X-rays and surfaces key findings |
| Digital Twin Agent | Simulates treatment outcomes and risk trajectories |
| Explanation Agent | Synthesizes all findings into a unified clinical narrative |
| Conversational AI Agent | Natural language interface with access to all tools, capable of answering any clinical question about the patient |

All results stream to the clinician in real time via Server-Sent Events (SSE): findings appear as they emerge, not after a long wait. Every response includes a human_review_required flag and explicit safety warnings for critical findings.
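As a rough illustration of the streaming format, the sketch below frames one agent result as a Server-Sent Events message. The event name `agent_result` and every payload field other than `human_review_required` are assumptions made for this example, not MediTwin's actual schema.

```python
import json


def format_sse(event: str, payload: dict) -> str:
    """Frame one payload as a Server-Sent Events message.

    An SSE message is a block of "field: value" lines ended by a blank line.
    """
    data = json.dumps(payload, separators=(",", ":"))
    return f"event: {event}\ndata: {data}\n\n"


# Hypothetical payload for a single agent card; only human_review_required
# is a field named in this write-up.
message = format_sse(
    "agent_result",
    {
        "agent": "lab_analysis",
        "summary": "Example abnormal value flagged",
        "human_review_required": True,
    },
)
print(message, end="")
```

A browser `EventSource` or a fetch-based reader on the frontend can dispatch on the `event` name and parse the `data` line as JSON.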


How I Built It

The architecture is a microservices system — each agent runs as an independent FastAPI service on its own port, orchestrated by a central coordinator.

  • Backend: Eight Python FastAPI services, each with its own LangChain/LangGraph agent. The Orchestrator calls all relevant agents in parallel using async HTTP and merges the results into a unified response.

  • Frontend: A React 19 + Vite application with a real-time SSE consumer. The UI renders each agent's card as results stream in, with markdown rendering for rich clinical output.

  • Memory & State: LangGraph's PostgreSQL checkpointer gives each conversation thread persistent memory. The Conversational AI agent can reference earlier exchanges naturally — "as we discussed about the labs earlier..." — because every message is stored and replayed.

  • Infrastructure: The entire system runs in Docker Compose — PostgreSQL, Redis, ChromaDB, and all eight agent containers brought up with a single command.

  • LLM: Google Gemini 2.5 Flash powers every agent, chosen for its speed and large context window — essential when reasoning over full FHIR patient records.
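The parallel fan-out described in the Backend bullet could look roughly like the sketch below. It is a minimal illustration using only `asyncio`: the two agent functions are stand-ins for the real async HTTP calls to each service, and `fan_out` and the result shape are hypothetical names chosen for the example.

```python
import asyncio
from typing import Awaitable, Callable

AgentCall = Callable[[str], Awaitable[dict]]


async def fan_out(agents: dict[str, AgentCall], query: str) -> dict[str, dict]:
    """Call every agent concurrently and merge results keyed by agent name."""
    names = list(agents)
    results = await asyncio.gather(*(agents[name](query) for name in names))
    return dict(zip(names, results))


# Stand-ins for the async HTTP calls to each agent service (hypothetical).
async def diagnosis_agent(query: str) -> dict:
    await asyncio.sleep(0.02)
    return {"findings": [f"differential for: {query}"]}


async def drug_safety_agent(query: str) -> dict:
    await asyncio.sleep(0.01)
    return {"findings": ["no contraindications found"]}


merged = asyncio.run(
    fan_out(
        {"diagnosis": diagnosis_agent, "drug_safety": drug_safety_agent},
        "chest pain",
    )
)
print(merged)
```

Because `asyncio.gather` runs the calls concurrently, total latency is close to that of the slowest agent, not the sum of all of them.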


Challenges I Ran Into

Safety by design. In medicine, a wrong answer isn't just incorrect — it's dangerous. We had to build the system so that critical findings (unsafe drug combinations, sepsis indicators, critical lab values) are always surfaced with explicit urgent language and never buried or softened. Defining what "safe output" looks like for a clinical AI was harder than building the agents themselves.
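One way to picture the "never buried or softened" rule is a post-processing step that runs on the merged findings before anything reaches the clinician. The sketch below is an assumption-heavy illustration: the `Finding` and `ClinicalResponse` types, the `URGENT: ` prefix, and the rule that any critical finding sets `human_review_required` to true are all invented for the example.

```python
from dataclasses import dataclass, field

URGENT_PREFIX = "URGENT: "  # hypothetical wording for urgent-language enforcement


@dataclass
class Finding:
    agent: str
    text: str
    critical: bool = False


@dataclass
class ClinicalResponse:
    findings: list[Finding] = field(default_factory=list)
    human_review_required: bool = False


def enforce_safety(findings: list[Finding]) -> ClinicalResponse:
    """Put critical findings first, label them urgently, and require review."""
    critical = [f for f in findings if f.critical]
    routine = [f for f in findings if not f.critical]
    labelled = [
        Finding(
            f.agent,
            f.text if f.text.startswith(URGENT_PREFIX) else URGENT_PREFIX + f.text,
            True,
        )
        for f in critical
    ]
    # Assumed rule: any critical finding forces human review.
    return ClinicalResponse(labelled + routine, human_review_required=bool(critical))


response = enforce_safety(
    [
        Finding("lab_analysis", "Example routine value within range"),
        Finding("drug_safety", "Example contraindication flagged", critical=True),
    ]
)
print(response.findings[0].text, response.human_review_required)
```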

Connection stability at scale. Running eight services against a shared PostgreSQL instance under real query load exposed a subtle but serious bug: our AsyncPostgresSaver setup held a single database connection with no keepalives. After a long enough idle period that connection was silently dropped, and the next query crashed the server with "server closed the connection unexpectedly". We fixed this by replacing the single connection with an AsyncConnectionPool with TCP keepalives, max_idle eviction, and 60-second reconnect retry logic.
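A pooled setup along these lines might be configured as in the sketch below. The numeric values are illustrative, and mapping the "60-second reconnect retry" to psycopg_pool's `reconnect_timeout` is an assumption, as are the helper names `pool_settings` and `make_checkpointer`. The third-party imports are kept inside the function so the settings can be inspected without a database or the packages installed.

```python
def pool_settings() -> dict:
    """Keyword arguments for psycopg_pool.AsyncConnectionPool (illustrative values)."""
    return {
        "min_size": 1,
        "max_size": 10,
        "max_idle": 300.0,          # close connections above min_size idle for 5 minutes
        "reconnect_timeout": 60.0,  # keep retrying a failed connection for up to 60 s
        "open": False,              # open explicitly inside the running event loop
        "kwargs": {
            # libpq TCP keepalive parameters
            "keepalives": 1,
            "keepalives_idle": 30,
            "keepalives_interval": 10,
            "keepalives_count": 5,
            # connection settings the LangGraph Postgres checkpointer expects
            "autocommit": True,
            "prepare_threshold": 0,
        },
    }


async def make_checkpointer(conninfo: str):
    """Open a pool and wrap it in AsyncPostgresSaver.

    Requires psycopg, psycopg_pool and langgraph-checkpoint-postgres.
    """
    from psycopg.rows import dict_row
    from psycopg_pool import AsyncConnectionPool
    from langgraph.checkpoint.postgres.aio import AsyncPostgresSaver

    settings = pool_settings()
    settings["kwargs"]["row_factory"] = dict_row
    pool = AsyncConnectionPool(conninfo, **settings)
    await pool.open()
    checkpointer = AsyncPostgresSaver(pool)
    await checkpointer.setup()  # creates the checkpoint tables if they are missing
    return pool, checkpointer
```

The caller is responsible for `await pool.close()` on shutdown, for example from the FastAPI lifespan handler.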

Streaming across a multi-agent pipeline. Getting token-by-token LLM output and tool lifecycle events to stream correctly to the frontend — through FastAPI's SSE layer, through LangGraph's event system, without either blocking the other — took significant iteration to get right.
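The "without either blocking the other" requirement can be illustrated with a small merge of two async streams through a queue. This is a generic `asyncio` sketch, not the project's LangGraph integration: `token_stream` and `tool_events` are hypothetical stand-ins for LLM token output and tool lifecycle events, and the event dictionaries are invented for the example.

```python
import asyncio
from typing import AsyncIterator

_DONE = object()  # sentinel marking the end of one source stream


async def merge_streams(*sources: AsyncIterator[dict]) -> AsyncIterator[dict]:
    """Yield events from all sources as soon as each one arrives."""
    queue: asyncio.Queue = asyncio.Queue()

    async def pump(source: AsyncIterator[dict]) -> None:
        try:
            async for event in source:
                await queue.put(event)
        finally:
            await queue.put(_DONE)

    tasks = [asyncio.create_task(pump(s)) for s in sources]
    remaining = len(tasks)
    try:
        while remaining:
            event = await queue.get()
            if event is _DONE:
                remaining -= 1
            else:
                yield event
    finally:
        for task in tasks:
            task.cancel()


# Hypothetical stand-ins for token output and tool lifecycle events.
async def token_stream() -> AsyncIterator[dict]:
    for token in ["Lab", " results", " reviewed"]:
        await asyncio.sleep(0.01)
        yield {"type": "token", "text": token}


async def tool_events() -> AsyncIterator[dict]:
    await asyncio.sleep(0.015)
    yield {"type": "tool_start", "tool": "lab_lookup"}


async def collect() -> list[dict]:
    return [event async for event in merge_streams(token_stream(), tool_events())]


events = asyncio.run(collect())
print(events)
```

Each source is pumped by its own task, so a slow tool call does not hold back tokens that are already available. The merged iterator can then be wrapped in an SSE response.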

First time with Docker and microservices. Managing eight services, their inter-dependencies, startup order, shared volumes, and network configuration from scratch was a steep but rewarding learning curve.


Accomplishments That I'm Proud Of

  • Built a production-grade multi-agent system — not a demo wrapper, but a real orchestrated pipeline with independent services, streaming, memory, and fault tolerance
  • Real-time SSE streaming across the entire stack — the clinician sees findings the moment they are ready, agent by agent
  • Safety-first architecture — drug contraindication checks, explicit human_review_required flags, and urgent-language enforcement baked into the system, not added as an afterthought
  • Persistent conversation memory — the Conversational AI agent remembers the full thread across sessions, stored in PostgreSQL, surfaced instantly
  • Taking an idea from zero to a fully containerized, multi-agent clinical AI in hackathon time

What I Learned

  • Microservices are not just about splitting code — they are about designing failure boundaries. When one agent is slow or unavailable, the rest of the system continues. That resilience had to be intentional.
  • Idle database connections are a silent killer — we learned this the hard way under real query load and came away with a much deeper understanding of connection pool management, TCP keepalives, and psycopg internals.
  • LangGraph's checkpointer gives you real memory — not simulated context stuffing, but genuine per-thread state that persists across restarts. Once we understood how to drive it correctly, the conversational experience became dramatically more natural.
  • Streaming UX changes how AI feels — showing results as they arrive, token by token, makes the system feel alive and responsive even when the underlying computation takes several seconds. Latency perception is a UX problem as much as an engineering one.
  • Working with Docker Compose at this scale for the first time, including managing service dependencies, shared networking, and environment configuration, was one of the most practically valuable things we took away.
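The failure-boundary idea from the first lesson can be sketched as a per-agent wrapper that turns a timeout or an exception into a degraded result, so the other agents still report. The agent stand-ins, the status strings, and the 50 ms timeout are assumptions for illustration only.

```python
import asyncio
from typing import Awaitable, Callable

AgentCall = Callable[[str], Awaitable[dict]]


async def call_with_boundary(
    name: str, call: AgentCall, query: str, timeout_s: float
) -> dict:
    """Run one agent; convert a timeout or error into a degraded result."""
    try:
        result = await asyncio.wait_for(call(query), timeout=timeout_s)
        return {"agent": name, "status": "ok", "result": result}
    except asyncio.TimeoutError:
        return {"agent": name, "status": "timeout", "result": None}
    except Exception as exc:  # keep one agent's failure from sinking the rest
        return {"agent": name, "status": "error", "result": None, "detail": str(exc)}


# Hypothetical agents: one healthy, one slow, one failing.
async def fast_agent(query: str) -> dict:
    return {"findings": [f"checked: {query}"]}


async def slow_agent(query: str) -> dict:
    await asyncio.sleep(1.0)
    return {"findings": []}


async def broken_agent(query: str) -> dict:
    raise RuntimeError("service unavailable")


async def run_all(query: str) -> list[dict]:
    agents = {"labs": fast_agent, "imaging": slow_agent, "twin": broken_agent}
    return await asyncio.gather(
        *(call_with_boundary(n, c, query, timeout_s=0.05) for n, c in agents.items())
    )


outcomes = asyncio.run(run_all("fever"))
print({o["agent"]: o["status"] for o in outcomes})
```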

What's Next for MediTwin AI

  • Voice interface — clinicians should be able to query the system hands-free at the bedside
  • EHR integration — direct FHIR R4 pull from Epic, Cerner, and other major systems so patient data flows in automatically
  • Longitudinal monitoring — the Digital Twin agent tracks a patient over time and proactively surfaces deterioration signals before they become critical
  • Explainability layer — every agent recommendation linked back to the specific data point and clinical guideline that drove it, so clinicians can audit the reasoning
  • Multi-patient dashboard — triage view across an entire ward, with the system flagging which patients need attention most urgently
  • Regulatory pathway — pursuing FDA Software as a Medical Device (SaMD) classification, with the safety architecture we built as the foundation
