Inspiration
Roughly one in twenty cases of sepsis, stroke, and heart attack are missed at the triage desk of the emergency department. The failure is not nurse competence. It is structural. A triage nurse has approximately ninety seconds with each arriving patient, during which they must listen to the chief complaint, capture vital signs, run several validated risk scores in their head, and assign an acuity level that determines whether the patient waits or is seen immediately. The cognitive load is enormous, and the cost of a single missed case can be a life. Existing electronic health records do not help with this moment. They are designed for the doctor downstream, not the nurse upstream. Triage itself is the gap, and the most consequential ninety seconds in the entire emergency department workflow has no decision support. We built Triage to fill that gap, without ever taking the nurse out of the loop.
What it does
Triage is a clinical decision support system that runs alongside the nurse during the patient interview. The application captures the spoken conversation through a live diarized transcript that separates nurse and patient in real time. It extracts a structured ClinicalSnapshot from natural language, including the chief complaint, vital signs, neurological findings, relevant history, and current medications. It then runs a battery of peer‑reviewed bedside scoring rules including qSOFA for sepsis, NEWS2 for general deterioration, BE‑FAST for stroke, HEART for chest pain, and Wells for pulmonary embolism. The output is a one‑screen handoff packet for the receiving clinician containing the Emergency Severity Index level, the specific scoring rules that fired with their inputs visible, and a ranked differential diagnosis with the reasoning shown. The nurse stays in charge. The system never assigns acuity unilaterally, never recommends treatment, and never replaces the clinician's judgment. Its job is to make sure that whatever the nurse already saw is preserved into the doctor's first look, and that any score‑triggering pattern in the conversation is surfaced rather than missed. In a tested sepsis scenario, Triage produces a complete handoff in roughly six seconds.
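To make the idea of a structured ClinicalSnapshot concrete, here is a minimal sketch of what such a record could look like in Python. The field names, the `VitalSigns` grouping, and the `snapshot_from_dict` helper are all assumptions made for illustration; only the five categories (chief complaint, vital signs, neurological findings, history, medications) come from the description above.

```python
from dataclasses import dataclass, field
from typing import Optional, Tuple


@dataclass(frozen=True)
class VitalSigns:
    # Every vital is optional: it may not have been spoken or measured yet.
    heart_rate_bpm: Optional[int] = None
    respiratory_rate_per_min: Optional[int] = None
    systolic_bp_mmhg: Optional[int] = None
    spo2_percent: Optional[int] = None
    temperature_c: Optional[float] = None


@dataclass(frozen=True)
class ClinicalSnapshot:
    # Hypothetical field names; the real schema is not shown in this write-up.
    chief_complaint: str
    vitals: VitalSigns = field(default_factory=VitalSigns)
    neuro_findings: Tuple[str, ...] = ()
    history: Tuple[str, ...] = ()
    medications: Tuple[str, ...] = ()


def snapshot_from_dict(raw: dict) -> ClinicalSnapshot:
    """Build a snapshot from extractor output, rejecting off-schema input."""
    complaint = raw.get("chief_complaint")
    if not isinstance(complaint, str) or not complaint.strip():
        raise ValueError("chief_complaint is required")
    # Unknown vital-sign keys raise TypeError instead of being silently dropped.
    vitals = VitalSigns(**raw.get("vitals", {}))
    return ClinicalSnapshot(
        chief_complaint=complaint.strip(),
        vitals=vitals,
        neuro_findings=tuple(raw.get("neuro_findings", ())),
        history=tuple(raw.get("history", ())),
        medications=tuple(raw.get("medications", ())),
    )
```

Keeping the record immutable and leaving unmeasured vitals as `None` means a downstream score can tell "normal" apart from "not captured", which matters when a handoff has to show its inputs.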
How we built it
The front end is Next.js 14 with the App Router and TypeScript, deployed to Vercel. Voice capture and live transcription run on Deepgram's nova‑2‑medical model, which produces a diarized stream that separates nurse from patient in real time. Structured extraction from the transcript runs on Anthropic's Claude Sonnet 4.6 with forced tool‑use, so the extraction agent must answer every request with a ClinicalSnapshot‑shaped tool call rather than drifting into prose. The scoring engine is deliberately deterministic Python running on Vercel's serverless platform. Every threshold in every score is hard‑coded against the published clinical literature, so the result for a given snapshot is reproducible and auditable. Session state and rate counters are persisted in Vercel KV. The differential diagnosis agent runs as a separate Claude call once the deterministic scores are in hand, which constrains the language model to ranking and explanation rather than to the scoring itself. The architecture is multi‑agent in the strict sense: each agent owns a bounded responsibility and a typed contract with the next stage of the pipeline, rather than a single agent making many calls.
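As an illustration of what a deterministic, hard-coded scoring rule can look like, here is a minimal qSOFA sketch. The three criteria and the positive cut-off of 2 are the published qSOFA definition from the Sepsis-3 consensus (JAMA, 2016); the function name, the result type, and the handling of missing inputs are assumptions, not the project's actual code.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Published qSOFA criteria (Sepsis-3 consensus definitions, JAMA 2016).
RESP_RATE_THRESHOLD = 22     # breaths/min, criterion met at >= 22
SYSTOLIC_BP_THRESHOLD = 100  # mmHg, criterion met at <= 100
GCS_NORMAL = 15              # altered mentation when GCS < 15
POSITIVE_AT = 2              # score of 2 or more counts as positive


@dataclass(frozen=True)
class QsofaResult:
    score: int
    positive: bool
    fired: Tuple[str, ...]    # criteria that were met
    missing: Tuple[str, ...]  # criteria that could not be evaluated


def qsofa(respiratory_rate: Optional[int],
          systolic_bp: Optional[int],
          gcs: Optional[int]) -> QsofaResult:
    """Score qSOFA; a missing input never counts as a met criterion."""
    criteria = (
        ("respiratory_rate >= 22", respiratory_rate,
         lambda v: v >= RESP_RATE_THRESHOLD),
        ("systolic_bp <= 100", systolic_bp,
         lambda v: v <= SYSTOLIC_BP_THRESHOLD),
        ("gcs < 15", gcs, lambda v: v < GCS_NORMAL),
    )
    fired, missing = [], []
    for name, value, is_met in criteria:
        if value is None:
            missing.append(name)  # surfaced to the nurse, not guessed
        elif is_met(value):
            fired.append(name)
    score = len(fired)
    return QsofaResult(score, score >= POSITIVE_AT, tuple(fired), tuple(missing))
```

For example, `qsofa(24, 95, 15)` returns a score of 2 with `positive=True` and lists the respiratory rate and blood pressure criteria as fired. Because the function is pure and the thresholds are module constants, the same inputs always produce the same result, which is the property the paragraph above relies on for auditability.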
Challenges we ran into
Three problems demanded the most engineering time. The first was the prompt injection surface that opens up the moment patient speech becomes language model input. A panicked family member can say anything, and any of it could, if relayed verbatim into a prompt, redirect the extraction agent. We addressed this with strict input sanitization, schema‑locked tool‑use that constrains the model's output shape, and explicit refusal of instruction‑like content addressed to the assistant. The second challenge was ephemeral Deepgram token minting. Long‑lived API keys cannot be exposed to the browser, and the live streaming flow expects credentials at session start. We built a short‑lived token endpoint that mints per‑session credentials on the server with strict expiry, so a captured browser session cannot be replayed later. The third challenge was deriving the Emergency Severity Index level from heterogeneous score outputs. qSOFA, NEWS2, BE‑FAST, HEART, and Wells return different output shapes (some binary, some ordinal, some continuous), and ESI is a five‑level acuity scale that must reflect the most concerning signal across all of them. We built a deterministic adjudication layer that maps each scoring outcome to an ESI floor and then takes the most acute floor as the recommended level, with the full chain of reasoning visible to the nurse.
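The adjudication idea described for the third challenge can be sketched as follows: each score is mapped to an optional ESI floor, and the most acute floor (the lowest ESI number) becomes the recommended level. The specific floor values below are placeholders chosen for illustration and are not the mapping Triage uses; the qSOFA cut-off of 2 and the NEWS2 aggregate bands of 5 and 7 are the published trigger points for those scores. All function and type names are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class EsiFloor:
    rule: str    # which scoring rule produced this floor
    level: int   # ESI level, 1 (most acute) to 5 (least acute)
    reason: str  # human-readable explanation shown to the nurse


def floor_from_qsofa(score: int) -> Optional[EsiFloor]:
    # Placeholder mapping: a positive qSOFA sets a floor of ESI 2.
    if score >= 2:
        return EsiFloor("qSOFA", 2, f"qSOFA {score} >= 2")
    return None


def floor_from_news2(total: int) -> Optional[EsiFloor]:
    # Placeholder mapping onto the NEWS2 medium (5-6) and high (>= 7) bands.
    if total >= 7:
        return EsiFloor("NEWS2", 2, f"NEWS2 {total} >= 7")
    if total >= 5:
        return EsiFloor("NEWS2", 3, f"NEWS2 {total} in 5-6")
    return None


def floor_from_befast(any_sign_present: bool) -> Optional[EsiFloor]:
    # Placeholder mapping: any positive BE-FAST sign sets a floor of ESI 2.
    if any_sign_present:
        return EsiFloor("BE-FAST", 2, "at least one BE-FAST sign present")
    return None


def adjudicate(
    floors: Iterable[Optional[EsiFloor]],
) -> Tuple[Optional[int], List[EsiFloor]]:
    """Return the most acute floor and the full chain, most acute first."""
    chain = sorted((f for f in floors if f is not None), key=lambda f: f.level)
    if not chain:
        return None, []  # no score fired; the nurse's own judgment stands
    return chain[0].level, chain
```

With `adjudicate([floor_from_qsofa(2), floor_from_news2(5), floor_from_befast(False)])`, the recommended level is 2 and the chain lists the qSOFA floor ahead of the NEWS2 floor. Returning the whole chain rather than only the minimum keeps the reasoning visible, and returning `None` when nothing fires avoids implying a recommendation the scores do not support.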
Accomplishments that we're proud of
The accomplishment we care about most is the safety architecture. Every threshold in every scoring rule cites a specific piece of published clinical literature, and the citation is visible in the application alongside the score that used it. The system never originates a clinical claim that is not traceable to a peer‑reviewed source.
What we learned
Two lessons stand out. The first is that multi‑agent is not the same as multi‑call. The temptation in a generative AI hackathon is to chain language model calls and describe the result as an agentic system. The real architectural discipline is to define bounded responsibilities with typed contracts at each handoff, so that the system fails predictably and the failure modes are auditable.
What's next for Triage
Four enhancements are queued for the next iteration. The first is dedicated red‑flag detection that runs as a third agent, scanning the transcript for specific high‑acuity phrase patterns (sudden onset weakness, worst headache of life, chest pain radiating to the jaw) that may not be fully captured by the current scoring rules but warrant immediate clinician notification regardless. The second is comprehensive request logging and an audit trail, since any production clinical decision support tool must be able to reconstruct exactly what the system told the clinician at the moment of triage, both for quality improvement and for medicolegal review. The third is rate limiting and cost controls, both to protect the deployment against abuse and to make the operational economics realistic for a hospital department that runs hundreds of triage encounters per day. The fourth and most ambitious is electronic health record integration, so that the structured ClinicalSnapshot Triage produces can flow directly into the patient's chart on arrival rather than requiring re‑entry by the receiving clinician. Each of these is a separate engineering project, but together they are the path from a hackathon demonstration to a tool that an emergency department could actually deploy at the front door.
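As a rough sketch of how the planned red-flag detection might start, the three example phrases above can be expressed as regular expressions over the transcript text. This is only one possible approach and is an assumption, not the planned design: the pattern wording, the labels, and the `find_red_flags` name are hypothetical, and simple patterns like these would miss paraphrases that a dedicated agent would be expected to catch.

```python
import re
from typing import List, NamedTuple


class RedFlag(NamedTuple):
    label: str
    matched_text: str


# Hypothetical patterns for the three example phrases; each match stays
# within one sentence by refusing to cross '.', '?' or '!'.
RED_FLAG_PATTERNS = {
    "sudden_onset_weakness": re.compile(
        r"\bsudden(?:ly)?\b[^.?!]{0,40}\bweak(?:ness)?\b", re.IGNORECASE
    ),
    "worst_headache_of_life": re.compile(
        r"\bworst headache\b[^.?!]{0,20}"
        r"\b(?:of (?:my|his|her|their) life|ever)\b",
        re.IGNORECASE,
    ),
    "chest_pain_radiating_to_jaw": re.compile(
        r"\bchest\b[^.?!]{0,60}\bjaw\b", re.IGNORECASE
    ),
}


def find_red_flags(transcript: str) -> List[RedFlag]:
    """Return the first match for each pattern, in pattern order."""
    flags = []
    for label, pattern in RED_FLAG_PATTERNS.items():
        match = pattern.search(transcript)
        if match:
            flags.append(RedFlag(label, match.group(0)))
    return flags
```

Returning the matched text alongside the label lets the notification quote what the patient actually said, in keeping with the principle that every surfaced signal shows its inputs.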
Built With
- anthropic-claude
- deepgram
- nextjs
- python
- react
- tailwindcss
- typescript
- vercel
- vercel-kv