Apollo — AI Ward Round Assistant

Apollo listens to doctors' ward rounds, transcribes speech in real time, extracts structured medical actions (medication changes, lab orders, nursing tasks), and stages them for one-click approval into the EHR.

Inspiration

Ward rounds are fast, noisy, and information-dense. Clinicians constantly context-switch between talking, thinking, and clicking through the EHR to answer questions like:

“How have the vitals been trending?”

“What are the latest labs?”

“What meds changed since yesterday?”

Then the plan is spoken aloud, but often captured late (or imperfectly), leading to missed tasks, unclear monitoring instructions, and documentation drift.

We wanted to build something that feels like a real assistant during rounds: hands-free, immediate, and safe.

What Apollo does

Apollo is not a typical “scribe.” It’s a workflow generator:

Voice command → EHR retrieval

The doctor can say “Show latest labs” or “Show vitals trend” and Apollo pulls the relevant data and displays it as quick cards, with no navigation.

Plan capture → structured outputs (not a summary)

As the doctor discusses next steps, Apollo produces:

Tasks (owner role, due time, priority)

Monitoring instructions (what + frequency)

Draft orders (kept minimal in MVP to stay safe and reliable)

Evidence-first safety rule

Every output item must include evidence:

a transcript pointer (character range / timestamp), and optionally

an EHR field reference

If evidence or required details are missing (e.g., an antibiotic without dose, route, or frequency), Apollo marks the item “Needs clarification” instead of guessing.

Clinician verification → Sync/export

The clinician can edit, approve, or reject items, then sync the approved payload to the EHR (or export JSON as structured write-back).
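The evidence-first rule described above can be sketched roughly as follows. This is an illustration under assumptions, not the project's actual schema: the names `Evidence`, `DraftItem`, `gate`, and the `REQUIRED_ORDER_FIELDS` list are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical: fields an order needs before it can be staged for approval.
REQUIRED_ORDER_FIELDS = ("drug", "dose", "route", "frequency")


@dataclass
class Evidence:
    start: int                       # character offset where the span begins
    end: int                         # character offset where the span ends (exclusive)
    ehr_field: Optional[str] = None  # optional EHR field reference


@dataclass
class DraftItem:
    kind: str                        # "task", "monitoring", or "order"
    details: dict
    evidence: Optional[Evidence] = None
    status: str = "draft"
    missing: list = field(default_factory=list)


def gate(item: DraftItem, transcript: str) -> DraftItem:
    """Mark the item "needs_clarification" rather than guessing missing parts."""
    ev = item.evidence
    if ev is None or not (0 <= ev.start < ev.end <= len(transcript)):
        item.missing.append("evidence")
    if item.kind == "order":
        item.missing += [f for f in REQUIRED_ORDER_FIELDS if not item.details.get(f)]
    item.status = "needs_clarification" if item.missing else "ready_for_review"
    return item
```

An order that names only the drug would come back with `missing` listing dose, route, and frequency, so the review UI can ask for them instead of filling them in.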

How we built it

Frontend (Lovable + React UI)

Built a review-focused cockpit:

Draft outputs grouped by category

Evidence links that highlight the exact transcript span

Approve/Reject/Edit with strict gating

Approved payload preview + JSON export

Backend functions (Lovable Cloud)

Secure speech-to-text integration using ElevenLabs, with the API key stored as an environment variable:

backend functions read Deno.env.get("ELEVENLABS_API_KEY")

Plan structuring and reasoning powered by Claude (strict JSON outputs with safety constraints).

Database-backed patient context and synthetic EHR data for vitals/labs/medications to keep the workflow fully functional end-to-end.

Intent routing

We route spoken segments into three buckets:

Command (EHR retrieval): only triggers with a wake phrase (e.g., “Jarvis, …”) to avoid accidental triggers in multi-speaker rounds

Plan commit (structuring): triggered by explicit doctor markers (e.g., “Plan: …” / “Jarvis, capture plan”)

Context: captured for transparency but does not create tasks/orders
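A minimal sketch of the three-bucket routing, assuming a simple prefix match on the wake phrase and on the "Plan:" marker. The function name `classify_segment` is hypothetical, and the wake phrase is a parameter because this write-up uses both "Jarvis" and "Apollo" in its examples.

```python
import re


def classify_segment(text: str, wake: str = "Apollo") -> str:
    """Return "command", "plan", or "context" for one spoken segment."""
    s = text.strip()
    # The wake phrase must be a whole word at the start of the segment.
    wake_prefix = re.match(rf"{re.escape(wake)}\b[\s,:]*", s, flags=re.IGNORECASE)
    if wake_prefix:
        rest = s[wake_prefix.end():]
        # "<wake>, capture plan" is a commit marker, not a retrieval command.
        if re.match(r"capture\s+plan\b", rest, flags=re.IGNORECASE):
            return "plan"
        return "command"
    if re.match(r"plan\s*:", s, flags=re.IGNORECASE):
        return "plan"
    # Everything else is kept for transparency but creates no tasks or orders.
    return "context"
```

Requiring the wake phrase at the start of a segment is what keeps a bystander's passing remark from being treated as a command in this sketch.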

What we learned

“Evidence-first” changes everything. It’s the difference between a helpful draft and a risky hallucination machine.

The hardest part isn’t generating text—it’s generating the right structure that maps to real clinical workflows (tasks, monitoring, orders) and is easy to review.

Multi-speaker environments require product decisions, not just better models: wake phrases and commit markers are simple, effective safety levers.

Challenges we faced

Ambient vs. safe: Truly ambient capture is valuable, but in a real ward round multiple people speak. We designed explicit triggers so only the clinician-of-record can create actions.

Schema discipline: LLM outputs can drift. We enforced strict JSON schemas, runtime validation, and approval gating to keep outputs predictable.

Avoiding “order complexity”: Medication orders explode in edge cases (dose, route, frequency, allergies). We kept orders minimal and used “Needs clarification” instead of guessing.
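The schema discipline point can be illustrated with a small runtime validator. This is a sketch using only the standard library; the item shape (`kind`, `text`, `evidence`) and the function name `parse_items` are assumptions, since the real schema is not shown here.

```python
import json

# Hypothetical schema: required keys and their types for one draft item.
ITEM_SCHEMA = {"kind": str, "text": str, "evidence": dict}
ALLOWED_KINDS = {"task", "monitoring", "order"}


def parse_items(raw: str) -> tuple[list, list]:
    """Split model output into (valid_items, errors); never raise on bad output."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [], [f"not valid JSON: {exc.msg}"]
    if not isinstance(data, list):
        return [], ["top level must be a list of items"]

    valid, errors = [], []
    for i, item in enumerate(data):
        if not isinstance(item, dict):
            errors.append(f"item {i}: not an object")
            continue
        problems = [
            f"item {i}: '{key}' missing or not {typ.__name__}"
            for key, typ in ITEM_SCHEMA.items()
            if not isinstance(item.get(key), typ)
        ]
        extra = set(item) - set(ITEM_SCHEMA)
        if extra:
            problems.append(f"item {i}: unexpected keys {sorted(extra)}")
        kind = item.get("kind")
        if isinstance(kind, str) and kind not in ALLOWED_KINDS:
            problems.append(f"item {i}: unknown kind '{kind}'")
        if problems:
            errors.extend(problems)
        else:
            valid.append(item)
    return valid, errors
```

Only items in the first list would reach the review screen; the second list can be logged or used to re-prompt the model.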

What’s next

SMART-on-FHIR integration for real EHR write-back (orders, tasks, and note drafts)

Stronger identity + diarization (doctor-worn mic / voiceprint) for safer ambient capture

Audit logs, role-based task routing (nurse/resident/pharmacy), and monitoring templates per specialty

Architecture

┌─────────────── Browser (React + Vite :8080) ───────────────┐
│                                                             │
│  PatientSidebar ─► PlanStructurer ─► EHRQuickView          │
│                        │                  │                 │
│               useAmbientRound      VitalsGraphModal         │
│              (mic → 15s chunks)    (recharts trends)        │
│                        │                                    │
└────────────────────────┼────────────────────────────────────┘
                         │ /api (Vite proxy)
┌────────────────────────▼────────────────────────────────────┐
│              FastAPI Backend (:8000)                         │
│                                                             │
│  /api/transcribe ──────────► Deepgram Nova-3 (STT)          │
│  /api/process-transcript ──► Kimi K2.5 (clinical reasoning) │
│  /api/patients/* ──────────► SQLite (Synthea EHR data)      │
│  /api/patients/{id}/flags ─► Dynamic vital sign alerts      │
│                                                             │
│  Background: flag evaluator daemon (every 15s)              │
└─────────────────────────────────────────────────────────────┘
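The background flag evaluator in the diagram could look something like the sketch below. The limits are placeholder numbers for illustration only (not clinical guidance and not the project's actual rules), and `evaluate_flags`, `start_flag_daemon`, `read_latest`, and `publish` are hypothetical names.

```python
import threading

# Placeholder limits for illustration only; not clinical guidance.
LIMITS = {"heart_rate": (50, 120), "spo2": (92, 100), "systolic_bp": (90, 180)}


def evaluate_flags(vitals: dict) -> list:
    """Return one flag per vital that falls outside its (low, high) limits."""
    flags = []
    for name, value in vitals.items():
        if name not in LIMITS:
            continue
        low, high = LIMITS[name]
        if value < low:
            flags.append({"vital": name, "value": value, "level": "low"})
        elif value > high:
            flags.append({"vital": name, "value": value, "level": "high"})
    return flags


def start_flag_daemon(read_latest, publish, interval_s: float = 15.0) -> threading.Event:
    """Re-evaluate flags every `interval_s` seconds until the returned event is set."""
    stop = threading.Event()

    def loop():
        # Event.wait() returns False on timeout, so the loop runs once per interval.
        while not stop.wait(interval_s):
            publish(evaluate_flags(read_latest()))

    threading.Thread(target=loop, daemon=True).start()
    return stop
```

Calling `stop.set()` on the returned event ends the loop, which is convenient for tests and for clean shutdown.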

Voice Commands

Say "Apollo" during a round to trigger commands:

Utterance                         Action
"Apollo, show vitals"             Switches to vitals tab
"Apollo, show labs"               Switches to labs tab
"Apollo, show meds"               Switches to medications tab
"Let's continue furosemide..."    Captured as plan item
Everything else                   Clinical context → sent to AI for extraction
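As a rough sketch of how the "show" commands listed above might map to tabs: the tab names follow the table, while the keyword lists and the function name `parse_command` are assumptions rather than the contents of `commandParser.ts`.

```python
import re
from typing import Optional

# Tab names come from the table above; the keyword lists are an assumption.
TAB_KEYWORDS = {
    "vitals": ("vitals", "vital signs"),
    "labs": ("labs", "lab results"),
    "medications": ("meds", "medications"),
}


def parse_command(utterance: str, wake: str = "Apollo") -> Optional[str]:
    """Return the tab to open, or None if this is not a recognised command."""
    m = re.match(
        rf"\s*{re.escape(wake)}\b[\s,]*show\s+(.+)", utterance, flags=re.IGNORECASE
    )
    if not m:
        return None
    target = m.group(1).lower()
    for tab, keywords in TAB_KEYWORDS.items():
        if any(k in target for k in keywords):
            return tab
    return None
```

Utterances that return `None` fall through to the plan or context handling rather than triggering a tab switch.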

Data Pipeline

  1. Mic → 15-second WebM chunks → POST /api/transcribe → Deepgram STT
  2. Classify each sentence: command / plan / context
  3. On round end: regex structurer (instant) + Kimi K2.5 (async) extract draft items
  4. Doctor reviews each item (approve/reject)
  5. Sync approved items → backend (med changes, nursing notes, meeting log)
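Step 3 pairs an instant regex pass with a slower model pass. One way that could work is sketched below; the regex, the merge-by-(action, drug) rule, and the names `regex_pass`, `extract_drafts`, `model_pass`, and `on_update` are all assumptions, with `model_pass` standing in for the LLM call.

```python
import asyncio
import re

# Hypothetical pattern: "start|continue|stop <drug>" as a medication change.
MED_PATTERN = re.compile(r"\b(start|continue|stop)\s+([a-z]+)", re.IGNORECASE)


def regex_pass(transcript: str) -> list:
    """Instant first pass: pull obvious medication actions with a regex."""
    return [
        {"action": m.group(1).lower(), "drug": m.group(2).lower(),
         "span": (m.start(), m.end()), "source": "regex"}
        for m in MED_PATTERN.finditer(transcript)
    ]


async def extract_drafts(transcript: str, model_pass, on_update) -> list:
    """Show regex drafts at once, then merge in the slower model results."""
    drafts = regex_pass(transcript)
    on_update(list(drafts))                    # the UI can render immediately
    seen = {(d["action"], d["drug"]) for d in drafts}
    for item in await model_pass(transcript):  # stand-in for the LLM call
        if (item["action"], item["drug"]) not in seen:
            drafts.append({**item, "source": "model"})
    on_update(list(drafts))
    return drafts
```

The `span` recorded by the regex pass doubles as the transcript pointer that the review screen can highlight.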

Project Structure

src/
├── components/jarvis/
│   ├── PlanStructurer.tsx      # Main orchestrator (round lifecycle, review UI, sync)
│   ├── PatientSidebar.tsx      # Patient list + recording toggle
│   ├── EHRQuickView.tsx        # Tabbed vitals/labs/meds display
│   ├── VitalsGraphModal.tsx    # Full-screen 14-day trend charts
│   ├── NursingNotesPanel.tsx   # Nursing notes slide-out
│   ├── MedicalHistoryPanel.tsx # Patient timeline
│   ├── PharmacyQueue.tsx       # Medication order verification
│   ├── CommandHistory.tsx      # Recent voice command log
│   ├── ReviewTab.tsx           # AI plan review with evidence
│   ├── RoundTab.tsx            # Transcript input + vitals display
│   ├── LiveContextFeed.tsx     # Real-time segment feed
│   └── BenchmarkTab.tsx        # Precision/recall scorer
├── hooks/
│   ├── useAmbientRound.ts      # Core: mic → STT → classify → route
│   └── usePatientData.ts       # Patient fetch (API + mock fallback)
├── lib/
│   ├── planStructurer.ts       # Two-pass regex extraction engine
│   ├── kimiClient.ts           # API client (Kimi, flags, sync, notes)
│   ├── commandParser.ts        # Voice command → EHR action mapping
│   ├── transcriptObfuscator.ts # Reversible PHI masking
│   ├── fakeVitalsGenerator.ts  # Seeded PRNG for demo vital signs
│   └── orderExtractor.ts       # Transcript → pharmacy orders
├── pages/
│   ├── Index.tsx               # Home → PlanStructurer
│   ├── FakeEHR.tsx             # Mock EHR for testing
│   └── PrescriptionsPage.tsx   # Prescription table
├── types/jarvis.ts             # Patient, VitalSign, LabResult, etc.
├── data/
│   ├── mockEHR.ts              # 3 demo patients with full clinical data
│   ├── demoTranscripts.ts      # Sample round transcripts
│   └── benchmarkCases.ts       # Ground-truth test cases
└── sourced-data/real_data/     # Python backend
    ├── api.py                  # FastAPI server (all endpoints)
    ├── kimi_client.py          # Kimi K2.5 transcript → structured JSON
    ├── monitoring.py           # Vital sign flag evaluator
    ├── nursing_notes.py        # Nursing note CRUD
    ├── medication_changes.py   # Med change tracking
    ├── meetings.py             # Round meeting logging
    ├── plans.py                # Plan of action CRUD
    ├── sync_operations.py      # Undo-capable sync tracking
    ├── audit_log.py            # Compliance audit trail
    ├── demo_sample.db          # SQLite demo database
    └── requirements.txt        # Python deps
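The tree lists `transcriptObfuscator.ts` as reversible PHI masking. A minimal Python sketch of that idea follows, assuming the PHI strings (names, record numbers) are already known from the patient record; `mask_phi`, `unmask_phi`, and the `[PHI_n]` token format are hypothetical, and the sketch assumes the transcript does not already contain such tokens.

```python
def mask_phi(text: str, phi_values: list) -> tuple[str, dict]:
    """Replace each known PHI string with a token; return masked text and the key."""
    mapping = {}
    # Longest first so "John Smith" is masked before "John".
    ordered = sorted(set(phi_values), key=lambda v: (-len(v), v))
    for n, value in enumerate(ordered, start=1):
        token = f"[PHI_{n}]"
        if value and value in text:
            text = text.replace(value, token)
            mapping[token] = value
    return text, mapping


def unmask_phi(text: str, mapping: dict) -> str:
    """Restore the original strings using the mapping from mask_phi."""
    for token, value in mapping.items():
        text = text.replace(token, value)
    return text
```

The mapping stays on the trusted side, so only the masked transcript needs to be sent to an external model, and the structured output can be unmasked afterwards.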

Getting Started

Prerequisites

  • Node.js 18+, Python 3.10+
  • API keys: Deepgram, Kimi K2.5 (Moonshot AI)

Setup

# Frontend
npm install
npm run dev                    # → http://localhost:8080

# Backend
cd src/sourced-data/real_data
pip install -r requirements.txt
uvicorn api:app --port 8000 --reload

Environment Variables (.env.local)

DEEPGRAM_API_KEY=...
KIMIK2_API_KEY=...
VITE_SUPABASE_URL=...
VITE_SUPABASE_PUBLISHABLE_KEY=...
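On the Python side, a fail-fast check for the two backend keys might look like the sketch below. It assumes the variables are already present in the process environment; how `.env.local` gets loaded is not shown here, and `load_backend_keys` is a hypothetical helper.

```python
import os

REQUIRED_KEYS = ("DEEPGRAM_API_KEY", "KIMIK2_API_KEY")


def load_backend_keys(env=os.environ) -> dict:
    """Return the backend API keys, raising early if any is unset or empty."""
    missing = [k for k in REQUIRED_KEYS if not env.get(k)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {k: env[k] for k in REQUIRED_KEYS}
```

Checking at startup surfaces a missing key immediately rather than on the first transcription request.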

Key API Endpoints

Method  Endpoint                                Description
POST    /api/transcribe                         Audio → Deepgram STT
POST    /api/process-transcript                 Transcript → Kimi K2.5 extraction
GET     /api/patients                           List patients
GET     /api/patients/{id}/flags                Dynamic monitoring flags
POST    /api/patients/{id}/medication-changes   Record med change
POST    /api/patients/{id}/nursing-notes        Create nursing note
POST    /api/meetings                           Log round meeting
POST    /api/sync-operations                    Sync with undo support
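The "sync with undo support" idea can be sketched as a log of batches, each of which can be reverted once. This is an in-memory illustration under assumptions, not the code in `sync_operations.py`; the class `SyncLog` and its methods are hypothetical.

```python
import itertools


class SyncLog:
    """In-memory stand-in for an undo-capable sync store."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.operations = {}  # op_id -> {"items": [...], "undone": bool}

    def sync(self, approved_items: list) -> int:
        """Record one batch of approved items and return its operation id."""
        op_id = next(self._ids)
        self.operations[op_id] = {"items": list(approved_items), "undone": False}
        return op_id

    def undo(self, op_id: int) -> list:
        """Mark a batch as undone and return the items that should be reverted."""
        op = self.operations[op_id]
        if op["undone"]:
            raise ValueError(f"operation {op_id} already undone")
        op["undone"] = True
        return op["items"]

    def active_items(self) -> list:
        """Items from every batch that has not been undone."""
        return [
            item
            for op in self.operations.values()
            if not op["undone"]
            for item in op["items"]
        ]
```

Keeping undone batches in the log, rather than deleting them, preserves a trail of what was synced and later reverted.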

Tech Stack

Frontend: React 18, TypeScript, Vite, Tailwind CSS, shadcn/ui, Recharts

Backend: FastAPI, SQLite (Synthea synthetic EHR), Deepgram Nova-3, Kimi K2.5 (Moonshot AI), Playwright

Built With

  • antigravity
  • claudecode
  • lovable