Inspiration
Crowded triage lines, limited cardiology slots, and inconsistent risk estimates motivated this project. I wanted a tiny, transparent tool that gives clinicians a calibrated probability (not just a score), explains why each prediction was made, and surfaces fairness gaps, so they can route patients more efficiently and safely.
What it does
- Predicts a calibrated probability that heart disease is present from 13 routine inputs.
- Maps probability into Low / Medium / High bands using a documented demo policy (Low < 7%, High ≥ 35%).
- Explains each decision (SHAP or robust fallbacks) with signed bars.
- Monitors fairness via simple slice AUCs (sex, age buckets, chest-pain type).
- Shows model quality (ROC, PR, calibration curve, AUC 95% CI).
- Scores uploaded CSVs in batch and lets you push any dataset row directly into Triage.
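The banding policy above can be sketched as a small pure function. The thresholds are the demo defaults stated in this post; the function name is illustrative, not the project's actual code:

```python
def risk_band(probability: float, low: float = 0.07, high: float = 0.35) -> str:
    """Map a calibrated probability to a triage band.

    Demo policy: Low < 7%, High >= 35%, Medium otherwise.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError(f"probability must be in [0, 1], got {probability}")
    if probability < low:
        return "Low"
    if probability >= high:
        return "High"
    return "Medium"

# A calibrated 30% reads as roughly 30 of 100 similar patients.
print(risk_band(0.30))  # Medium
```

Because the probabilities are calibrated, these cutoffs carry a real per-100-patients meaning rather than being arbitrary score breakpoints.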
How we built it
- Data: Kaggle Heart Disease dataset; I deduplicated it to 302 unique rows so duplicate records could not leak across the train/validation/test splits.
- Modeling: ColumnTransformer (scale numeric + one-hot categoricals) → Random Forest (final) and Logistic Regression (baseline) → isotonic calibration with CalibratedClassifierCV on a held-out fold → 20% test evaluation.
- Explainability: shap.TreeExplainer for RF, shap.LinearExplainer for LR, with coefficient/importance or directional ablation fallbacks.
- App: Streamlit multipage UI (Triage, Explanations, Fairness, Model Quality, Data Explorer, Batch Scoring).
- Thresholds: I documented a principled selection method (capacity/PPV for High, miss-rate/NPV for Low) and used 7% / 35% as sensible demo defaults.
Challenges we ran into
- Tiny dataset → risk of overfitting; required restrained RF tuning and held-out calibration.
- Explaining calibrated models → I had to unwrap CalibratedClassifierCV to explain the underlying estimator cleanly.
- UI clarity → making calibration/bands and “per-100 patients” interpretation obvious to non-ML users.
- Data hygiene → removing duplicates so train/validation/test splits were honest.
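The unwrapping mentioned above can be sketched like this. The attribute holding the inner model is `estimator` in recent scikit-learn and `base_estimator` in older releases, so both are tried; the helper name is illustrative:

```python
import numpy as np
from sklearn.calibration import CalibratedClassifierCV
from sklearn.ensemble import RandomForestClassifier

# Toy fit so there is a wrapper to unwrap.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = (X[:, 0] > 0).astype(int)
calibrated = CalibratedClassifierCV(
    RandomForestClassifier(random_state=0), method="isotonic", cv=3)
calibrated.fit(X, y)

def unwrap(calibrated_clf):
    """Return the underlying estimator inside a fitted CalibratedClassifierCV.

    Explanations (e.g. shap.TreeExplainer) run on this raw model, not on the
    calibrated wrapper, whose predict_proba is not tree-structured.
    """
    inner = calibrated_clf.calibrated_classifiers_[0]
    return getattr(inner, "estimator", None) or getattr(inner, "base_estimator", None)

rf = unwrap(calibrated)
# rf is a plain RandomForestClassifier, suitable for shap.TreeExplainer(rf).
```

Note the explanations then describe the uncalibrated scores; the displayed probability still comes from the calibrated wrapper.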
Accomplishments that we're proud of
- End-to-end, reproducible pipeline with artifacts and a clean demo app.
- Calibrated probabilities that make thresholds meaningful, not arbitrary.
- Clear, signed local explanations and quick fairness slices that surface operational watchpoints.
- Lightweight app that runs anywhere without PHI.
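The fairness slices mentioned above amount to computing AUC per subgroup. A minimal sketch, with illustrative column names and synthetic data:

```python
import numpy as np
import pandas as pd
from sklearn.metrics import roc_auc_score

def slice_aucs(df: pd.DataFrame, group_col: str,
               y_col: str = "y", p_col: str = "prob") -> pd.Series:
    """AUC per subgroup; a group with only one class present gets NaN."""
    def safe_auc(g):
        return roc_auc_score(g[y_col], g[p_col]) if g[y_col].nunique() > 1 else float("nan")
    return pd.Series({key: safe_auc(group) for key, group in df.groupby(group_col)})

# Toy example: two 'sex' groups with synthetic labels and probabilities.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "sex": rng.choice(["F", "M"], 200),
    "y": rng.integers(0, 2, 200),
})
df["prob"] = np.where(df["y"] == 1,
                      rng.uniform(0.4, 1.0, 200),
                      rng.uniform(0.0, 0.6, 200))
print(slice_aucs(df, "sex"))
```

A large gap between slice AUCs is an operational watchpoint, not proof of bias; small slices also make the estimates noisy.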
What we learned
- Calibration changes the conversation: 30% means ~30/100, which clinicians understand.
- Thresholds must be tied to capacity, PPV/NPV, and miss tolerance, not gut feel.
- SHAP with calibrated models requires care; transparency beats raw accuracy in triage tools.
- Small UX choices (vertical bars, per-100 wording) dramatically improve comprehension.
What's next for HeartRisk Assist - Cardiac Risk Triage (Medi-Hack 2025)
- Train on-site EHR cohorts (e.g., MIMIC-IV/PhysioNet) with IRB/governance.
- Add an auto-threshold finder (capacity/PPV and miss-rate/NPV rules, decision curves) and cohort-specific bands.
- Prospective A/B tests, drift/bias monitoring, and clinician-in-the-loop feedback.
- Expand features (labs/meds), improve accessibility, and package for hospital deployment.
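One way the planned auto-threshold finder could work is sketched below. The rules mirror the documented selection method (capacity for High, miss rate for Low), but the function, its targets, and the toy validation data are assumptions, not the project's actual code:

```python
import numpy as np

def pick_thresholds(y_true, probs, high_capacity=0.2, low_max_miss=0.02):
    """Pick High/Low cutoffs from validation labels and calibrated probabilities.

    High: the smallest cutoff whose flag rate (fraction routed to High) fits
    within clinic capacity. Low: the largest cutoff whose miss rate among
    patients below it stays under the tolerance. Targets are illustrative.
    """
    y_true = np.asarray(y_true)
    probs = np.asarray(probs)
    grid = np.unique(probs)  # candidate cutoffs, ascending

    # High band: flag at most `high_capacity` of patients.
    high = next((t for t in grid if (probs >= t).mean() <= high_capacity), grid[-1])

    # Low band: keep the below-threshold event rate under `low_max_miss`.
    low = 0.0
    for t in grid:
        below = probs < t
        if below.any() and y_true[below].mean() <= low_max_miss:
            low = t
    return low, high

# Illustrative validation data: negatives score low, positives score high.
rng = np.random.default_rng(1)
y_val = np.array([0] * 50 + [1] * 50)
p_val = np.concatenate([rng.uniform(0.0, 0.5, 50), rng.uniform(0.5, 1.0, 50)])
low, high = pick_thresholds(y_val, p_val)
```

Cohort-specific bands would simply rerun this on each cohort's validation slice.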
Team
Solo: Sweety Seelam
Built With
- altair
- github
- joblib
- jupyternotebook
- kaggle
- machine-learning
- matplotlib
- numpy
- pandas
- python
- scikit-learn
- shap
- streamlit