Clinically Interpretable Heart Disease Risk Stratification

Target class distribution of heart disease outcomes.
Correlation of selected clinical features with heart disease.
Global SHAP analysis of the Random Forest model.
Model-based cardiovascular risk stratification using predicted probabilities.
Calibration curve assessing probability reliability of the Random Forest model.

Inspiration

Cardiovascular disease is a leading cause of death, yet many machine learning approaches for prediction focus mainly on improving accuracy while ignoring whether the results are interpretable or usable in real clinical settings.

This project was inspired by the gap between predictive performance and clinical trust. In early screening scenarios, models need to be transparent, reliable, and aligned with medical reasoning. The goal was to explore whether simple, interpretable models could still provide strong performance while producing explanations that clinicians could reasonably trust.

What it does

The project predicts the risk of heart disease using structured clinical data and stratifies individuals into low, medium, and high cardiovascular risk groups.

Instead of producing only a binary prediction, the model outputs calibrated risk probabilities and explains which clinical features contribute most to each prediction. This allows the system to support early screening and triage decisions, rather than acting as a black-box classifier.

How I built it

The workflow began with exploratory data analysis to assess data quality, feature behavior, and class balance. A logistic regression model was first trained as an interpretable baseline to capture linear relationships.
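As a rough illustration of the baseline step, the sketch below fits a standardized logistic regression and ranks features by coefficient magnitude. It is not the project's actual code: the data comes from scikit-learn's make_classification as a stand-in for the clinical table, and the feature names are hypothetical placeholders.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the clinical dataset (assumption: the real data is not shown here).
X, y = make_classification(n_samples=600, n_features=8, n_informative=5, random_state=0)
feature_names = [f"feature_{i}" for i in range(X.shape[1])]  # hypothetical names

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Standardizing first puts all coefficients on a comparable scale.
baseline = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
baseline.fit(X_train, y_train)

# Rank features by absolute coefficient; the sign gives the direction of the association.
coefs = baseline.named_steps["logisticregression"].coef_[0]
ranked = sorted(zip(feature_names, coefs), key=lambda item: abs(item[1]), reverse=True)
for name, coef in ranked:
    print(f"{name}: {coef:+.3f}")

test_accuracy = baseline.score(X_test, y_test)
print(f"held-out accuracy: {test_accuracy:.3f}")
```

Because the inputs are standardized, each coefficient can be read as the change in log-odds per one standard deviation of that feature, which is what makes this kind of baseline easy to compare against clinical expectations.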

To model non-linear interactions, a Random Forest classifier was then trained and evaluated using ROC-AUC, recall, and confusion matrices, with an emphasis on minimizing false negatives.
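The evaluation described above can be sketched as follows. This is a minimal example on synthetic data, not the project's code; the 0.3 screening threshold and the helper name evaluate_at are hypothetical and only illustrate how lowering the decision threshold trades false positives for fewer false negatives.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix, recall_score, roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the clinical dataset (assumption).
X, y = make_classification(n_samples=600, n_features=8, n_informative=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

forest = RandomForestClassifier(n_estimators=200, random_state=0)
forest.fit(X_train, y_train)
proba = forest.predict_proba(X_test)[:, 1]  # predicted probability of the positive class

# ROC-AUC measures ranking quality and does not depend on a threshold.
auc = roc_auc_score(y_test, proba)


def evaluate_at(threshold):
    """Hypothetical helper: recall and error counts at a given decision threshold."""
    pred = (proba >= threshold).astype(int)
    tn, fp, fn, tp = confusion_matrix(y_test, pred, labels=[0, 1]).ravel()
    return {
        "threshold": threshold,
        "recall": recall_score(y_test, pred),
        "fn": int(fn),
        "fp": int(fp),
    }


default_cut = evaluate_at(0.5)
screening_cut = evaluate_at(0.3)  # hypothetical lower threshold for screening use

print(f"ROC-AUC: {auc:.3f}")
print(default_cut)
print(screening_cut)
```

A lower threshold can never increase the number of false negatives, so in a screening setting the choice comes down to how many extra false positives are acceptable.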

SHAP was applied to interpret model predictions and verify alignment with established clinical understanding. Finally, predicted probabilities were calibrated and used to stratify individuals into clinically meaningful risk groups.
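The calibration and stratification step could look roughly like the sketch below. The calibration method (isotonic, via scikit-learn's CalibratedClassifierCV) and the cut-offs of 0.30 and 0.70 are assumptions chosen for illustration; the write-up does not state which method or thresholds were used, and the data is synthetic.

```python
import numpy as np
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the clinical dataset (assumption).
X, y = make_classification(n_samples=600, n_features=8, n_informative=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Wrap the forest so its probabilities are recalibrated with cross-validation.
# Isotonic regression is an assumed choice; method="sigmoid" is the other built-in option.
calibrated = CalibratedClassifierCV(
    RandomForestClassifier(n_estimators=100, random_state=0),
    method="isotonic",
    cv=5,
)
calibrated.fit(X_train, y_train)
risk = calibrated.predict_proba(X_test)[:, 1]

# Hypothetical cut-offs: below 0.30 is low, 0.30 up to 0.70 is medium, 0.70 and above is high.
LOW_MAX, HIGH_MIN = 0.30, 0.70
labels = np.array(["low", "medium", "high"])
groups = labels[np.digitize(risk, [LOW_MAX, HIGH_MIN])]

# Observed event rate per group; it should rise from low to high if stratification is useful.
for name in labels:
    mask = groups == name
    if mask.any():
        print(f"{name}: n={mask.sum()}, observed rate={y_test[mask].mean():.2f}")
```

In practice the thresholds would need to be set with clinical input rather than picked by hand as they are here.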

Challenges I ran into

One challenge was balancing model performance with interpretability. Increasing complexity can improve accuracy, but it often reduces transparency, which is critical in healthcare settings.

Another challenge was ensuring that predicted probabilities were reliable enough to be used for risk stratification, which required calibration analysis rather than relying solely on ranking metrics like ROC-AUC.
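A small simulation shows why ROC-AUC alone cannot confirm that probabilities are reliable. The sketch below uses simulated probabilities rather than project results: it distorts a perfectly calibrated score with a monotonic transform, which leaves the ranking (and therefore ROC-AUC) unchanged while the Brier score and the calibration curve clearly degrade.

```python
import numpy as np
from sklearn.calibration import calibration_curve
from sklearn.metrics import brier_score_loss, roc_auc_score

rng = np.random.default_rng(0)

# Simulated, perfectly calibrated probabilities: each outcome is drawn with exactly that probability.
p_true = rng.uniform(0.02, 0.98, size=5000)
y = (rng.uniform(size=p_true.size) < p_true).astype(int)

# A monotonic distortion keeps the ordering of patients but changes the probability values.
p_distorted = p_true ** 3

auc_true = roc_auc_score(y, p_true)
auc_distorted = roc_auc_score(y, p_distorted)
brier_true = brier_score_loss(y, p_true)
brier_distorted = brier_score_loss(y, p_distorted)

print(f"ROC-AUC  calibrated={auc_true:.4f}  distorted={auc_distorted:.4f}")
print(f"Brier    calibrated={brier_true:.4f}  distorted={brier_distorted:.4f}")

# Calibration curve for the distorted scores: observed event rate vs. mean predicted probability.
frac_pos, mean_pred = calibration_curve(y, p_distorted, n_bins=10)
for observed, predicted in zip(frac_pos, mean_pred):
    print(f"predicted={predicted:.2f}  observed={observed:.2f}")
```

The distorted scores systematically under-predict risk even though they rank patients identically, which is exactly the failure a threshold-based risk grouping would inherit.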

Accomplishments that I am proud of

Achieving strong predictive performance (ROC-AUC ≈ 0.93) using interpretable models
Demonstrating consistent clinical patterns across correlation analysis, model coefficients, feature importance, and SHAP explanations
Moving beyond binary classification by implementing risk stratification and probability calibration
Framing the model explicitly as a decision-support tool rather than a black-box system

What I learned

This project reinforced the importance of interpretability and calibration in healthcare machine learning. High accuracy alone is not sufficient when model outputs influence screening and triage decisions.

I also learned that well-chosen, simple models combined with careful evaluation and explanation can be more valuable than overly complex approaches in clinical contexts.

What's next for Clinically Interpretable Heart Disease Risk Stratification

Future work could include validating the models across additional cardiovascular datasets to assess robustness and generalizability. Integrating temporal ECG features and exploring subgroup performance across age and sex groups could further improve clinical relevance.
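One way the subgroup analysis might start is by computing recall separately for each group. The sketch below is purely illustrative: the labels, predictions, and the sex column are simulated, and the 15% error rate is an arbitrary assumption, not a project result.

```python
import numpy as np
from sklearn.metrics import recall_score

rng = np.random.default_rng(0)
n = 400

# Simulated subgroup attribute and outcomes (assumption: stand-ins for real patient data).
sex = rng.choice(["female", "male"], size=n)
y_true = rng.integers(0, 2, size=n)

# Simulated predictions that disagree with the truth about 15% of the time.
flip = rng.uniform(size=n) < 0.15
y_pred = np.where(flip, 1 - y_true, y_true)

# Recall per subgroup: large gaps would flag groups where cases are missed more often.
subgroup_recall = {}
for group in np.unique(sex):
    mask = sex == group
    subgroup_recall[str(group)] = recall_score(y_true[mask], y_pred[mask])

for group, value in subgroup_recall.items():
    print(f"{group}: recall={value:.2f}")
```

The same loop could be repeated over age bands, and over calibration metrics as well as recall.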

Ultimately, the goal is to extend this work toward a more comprehensive and trustworthy decision-support system for cardiovascular risk screening.

Built With

googlecolab
matplotlib
numpy
pandas
python
scikit-learn
seaborn
shap

Updates

Saravana Priyaa C R started this project — Feb 02, 2026 01:52 PM EST
