Inspiration
Emergency triage fails in exactly the place it cannot afford to fail: rare, critical patients.
In most ED datasets, ESI 1 (Resuscitation) cases are under 2% of visits. That creates an accuracy paradox. A model can look "good" on paper by predicting moderate acuity most of the time, while still missing the patients who need immediate intervention.
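To make the paradox concrete, here is a minimal sketch with made-up numbers. The label counts and the "model" are illustrative assumptions, not Frostbyte's data or behavior: the hypothetical model is right on every visit except that it downgrades every ESI 1 patient to ESI 2.

```python
# Illustrative label mix for 1,000 visits (made-up counts, not MIMIC-IV-ED data).
true_esi = [1] * 15 + [2] * 185 + [3] * 500 + [4] * 250 + [5] * 50

# Hypothetical model: correct everywhere, except every ESI 1 is downgraded to ESI 2.
pred_esi = [2 if t == 1 else t for t in true_esi]


def accuracy(y_true, y_pred):
    """Fraction of visits where the predicted level equals the true level."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, level):
    """Of the visits that truly are `level`, the fraction predicted as `level`."""
    preds_for_level = [p for t, p in zip(y_true, y_pred) if t == level]
    return sum(p == level for p in preds_for_level) / len(preds_for_level)


print(f"overall accuracy: {accuracy(true_esi, pred_esi):.3f}")
print(f"ESI 1 recall:     {recall(true_esi, pred_esi, 1):.3f}")
```

With these counts the model scores 98.5% accuracy while its ESI 1 recall is zero, which is why we track per-class recall on the critical level rather than accuracy alone.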
We built Frostbyte to confront that directly. Our goal was not just a classifier, but a real-time triage copilot that is multimodal, explainable, uncertainty-aware, and explicitly human-supervised.
What it does
Frostbyte is a real-time emergency triage decision-support system.
It predicts Emergency Severity Index (ESI) levels by combining three clinical modalities:
- structured vitals
- free-text chief complaint
- optional medical imagery
For each patient, Frostbyte outputs:
- predicted ESI level
- confidence and uncertainty signals
- SHAP-based feature attribution (why the model made that call)
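One common way to turn class probabilities into confidence and uncertainty signals is sketched below. This is an assumption for illustration, not necessarily how Frostbyte computes them: the function name, the normalized-entropy measure, and the 0.5 review threshold are all hypothetical.

```python
import math


def uncertainty_signals(probs, review_threshold=0.5):
    """Summarize a probability vector over ESI 1..5 (assumed to sum to 1).

    review_threshold is a hypothetical cutoff on normalized entropy.
    """
    confidence = max(probs)
    # Shannon entropy; zero-probability classes contribute nothing.
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    # Divide by the maximum possible entropy so the value lies in [0, 1].
    normalized_entropy = entropy / math.log(len(probs))
    return {
        "predicted_esi": probs.index(confidence) + 1,  # list index 0 is ESI 1
        "confidence": confidence,
        "normalized_entropy": normalized_entropy,
        "needs_review": normalized_entropy > review_threshold,
    }


print(uncertainty_signals([0.94, 0.03, 0.01, 0.01, 0.01]))
print(uncertainty_signals([0.30, 0.28, 0.22, 0.12, 0.08]))
```

A peaked distribution yields low entropy and no review flag; a flat one yields high entropy and is flagged for clinician attention.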
It also streams grounded clinical guidance through a dual-path retrieval pipeline (text similarity + vitals similarity), retrieves similar historical cases, supports clinician override with rationale logging, and includes an MCI mode for rapid batch triage during surge scenarios.
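The dual-path idea can be sketched in plain Python as a blend of two similarity scores. This is a simplified stand-in, not the Chroma-backed pipeline itself: the field names, the equal weighting, and the toy vectors are assumptions, and real vitals would need to be normalized before comparison.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve_similar(query, cases, text_weight=0.5, k=2):
    """Rank historical cases by a weighted blend of text and vitals similarity."""
    scored = []
    for case in cases:
        text_sim = cosine(query["text_emb"], case["text_emb"])
        vitals_sim = cosine(query["vitals"], case["vitals"])
        score = text_weight * text_sim + (1 - text_weight) * vitals_sim
        scored.append((score, case["id"]))
    scored.sort(reverse=True)  # highest blended score first
    return [case_id for _, case_id in scored[:k]]


# Toy two-dimensional vectors standing in for embeddings and normalized vitals.
history = [
    {"id": "case-A", "text_emb": [1.0, 0.0], "vitals": [1.0, 0.0]},
    {"id": "case-B", "text_emb": [0.0, 1.0], "vitals": [1.0, 0.0]},
    {"id": "case-C", "text_emb": [0.0, 1.0], "vitals": [0.0, 1.0]},
]
patient = {"text_emb": [1.0, 0.0], "vitals": [1.0, 0.0]}
print(retrieve_similar(patient, history))
```

Blending the two paths lets a case surface when its vitals match even if the chief complaint is worded differently, and vice versa.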
How we built it
We built Frostbyte as a full-stack, low-latency system:
- Frontend: Next.js HUD-style triage dashboard
- Inference backend: Rust + Axum for orchestration and hot-path reliability
- Model serving: LightGBM loaded through Rust FFI (to keep core inference outside Python)
- ML sidecar: Python FastAPI for ClinicalBERT embeddings, ResNet-50 image features, SHAP explainability, Chroma retrieval, and RAG generation
The training base combines:
- 197 real ED patients from MIMIC-IV-ED
- 1,000 physiologically grounded synthetic patients distributed across all five ESI levels
The three input modalities (vitals, chief-complaint text, and optional imagery) are fused into a 22-feature representation for triage prediction.
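Because imagery is optional, the fused vector has to keep a fixed width whether or not an image is present. A minimal sketch of one way to do that follows. The block size, the zero-fill strategy, and the presence flag are assumptions for illustration, and the toy dimensions do not reflect the actual 22-feature layout.

```python
from typing import Optional, Sequence

IMAGE_DIM = 4  # hypothetical width of the reduced image feature block


def fuse_features(
    vitals: Sequence[float],
    text_feats: Sequence[float],
    image_feats: Optional[Sequence[float]] = None,
) -> list:
    """Concatenate modalities into one fixed-width vector.

    A missing image becomes a zero block plus a has_image flag of 0.0, so the
    model can tell "no image" apart from "image with all-zero features".
    """
    if image_feats is None:
        image_block = [0.0] * IMAGE_DIM
        has_image = 0.0
    else:
        if len(image_feats) != IMAGE_DIM:
            raise ValueError(f"expected {IMAGE_DIM} image features")
        image_block = list(image_feats)
        has_image = 1.0
    return list(vitals) + list(text_feats) + image_block + [has_image]


with_image = fuse_features([98.6, 110.0], [0.2, 0.7], [0.1, 0.3, 0.5, 0.9])
without_image = fuse_features([98.6, 110.0], [0.2, 0.7])
print(len(with_image), len(without_image))
```

Both calls return vectors of the same length, so a tree model such as LightGBM sees a consistent schema for every patient.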
Challenges we ran into
1) Class imbalance in safety-critical data
Critical cases are rare. That makes aggregate accuracy misleading. We had to design data balancing around clinical realism, not random oversampling.
2) Multimodal asymmetry
Not every patient has an image. We designed the pipeline so optional modalities do not break inference or skew predictions.
3) Cross-stack reliability
Rust, Python, and React had to behave like one system under demo conditions. Contract drift and startup brittleness were real engineering risks.
4) Trust and accountability
In healthcare, "prediction only" is not enough. We had to build explainability, uncertainty flags, override flow, and audit logging as first-class features, not add-ons.
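As an illustration of treating override and audit logging as first-class, here is a minimal sketch of an override record that cannot be created without a rationale. The field names and the in-memory JSON-lines log are hypothetical, not Frostbyte's actual schema or storage.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class OverrideRecord:
    """One immutable audit entry (hypothetical schema)."""
    patient_id: str
    model_esi: int
    clinician_esi: int
    clinician_id: str
    rationale: str
    timestamp: str


def record_override(log, patient_id, model_esi, clinician_esi, clinician_id, rationale):
    """Append an override to the audit log; reject overrides with no rationale."""
    if not rationale.strip():
        raise ValueError("an override requires a rationale")
    record = OverrideRecord(
        patient_id=patient_id,
        model_esi=model_esi,
        clinician_esi=clinician_esi,
        clinician_id=clinician_id,
        rationale=rationale.strip(),
        timestamp=datetime.now(timezone.utc).isoformat(),
    )
    log.append(json.dumps(asdict(record)))  # one JSON object per entry
    return record


audit_log = []
record_override(audit_log, "pt-001", 3, 1, "rn-42", "Airway compromise on exam")
print(audit_log[0])
```

Keeping the model's prediction next to the clinician's decision in each entry is what makes later audit analysis possible.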
Accomplishments that we're proud of
We delivered a complete triage workflow, not a notebook model.
Frostbyte can:
- predict acuity in real time
- explain its reasoning per patient
- show confidence and uncertainty
- retrieve similar historical cases
- generate grounded next-step guidance
- preserve clinician authority through override + audit trail
- scale to batch triage in MCI mode
In evaluation, we achieved 90% overall accuracy, with 1.00 precision and 1.00 recall on ESI 1: no resuscitation case was missed, and no patient was wrongly flagged as ESI 1.
What we learned
A healthcare model can be statistically impressive and clinically unsafe at the same time. We learned to optimize for harm-aware performance, not just aggregate metrics.
We also learned that explainability and override are not "nice-to-have" UX features. They are core product requirements for real clinical trust.
Finally, we learned that architecture matters: separating low-latency inference from heavy preprocessing let us keep both speed and flexibility without sacrificing reliability.
What's next
Our next steps are focused on clinical realism and deployment readiness:
- expand real-patient coverage and modality depth
- run broader triage simulations and stress testing
- improve calibration and uncertainty behavior under edge cases
- deepen audit analytics and human-in-the-loop workflows
We are also continuing development of a research-forward successor to the current production fusion path: a cross-attention multimodal model, evaluated against the same safety-critical triage benchmarks.
The direction is unchanged: faster triage, clearer reasoning, stronger accountability, with clinicians always in control.
Note: I am calling it Frostbyte (I know, I know, it is literally the name of the hackathon; trust me, I am not trying to be too meta here) because writing out the full name, MultiModal AI Triage Assistant, every single time just doesn't make sense.