HeartRisk AI: Explainable & Trustworthy CVD Prediction

Inspiration

Cardiovascular disease (CVD) remains the leading cause of death worldwide. Even so, most people lack clarity about which factors drive their personal risk and which measurable actions could reduce it. Existing risk calculators typically output a single numerical score with little or no explanation or guidance.

This project was inspired by the need for a transparent and user-centric AI solution that not only predicts cardiovascular risk but also clearly explains the underlying drivers and demonstrates how modifiable health factors can improve long-term outcomes.

What It Does

HeartRisk AI estimates an individual’s 10-year CVD risk and goes beyond traditional prediction tools by:

Identifying the most influential clinical and lifestyle risk factors

Quantifying potential risk reduction through improvements in modifiable variables

Providing actionable, interpretable insights instead of a black-box score

Supporting both complete clinical records and limited-input scenarios

Additionally, the system enables population-level risk evaluation using real-world cardiovascular datasets and interactive individual-level analysis.
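
The risk-reduction idea listed above can be sketched as a "what-if" comparison: score a person as they are, change one or more modifiable inputs, and score again. This is only an illustration. The logistic risk function, its coefficients, and the feature names (age, systolic_bp, smoker) are made-up assumptions standing in for a trained model, not the project's actual model or data.

```python
import math

# Hypothetical coefficients for illustration only; not the project's model.
COEFFS = {"age": 0.06, "systolic_bp": 0.02, "smoker": 0.7}
INTERCEPT = -8.0


def predict_risk(person):
    """Toy logistic risk score standing in for a trained model."""
    z = INTERCEPT + sum(COEFFS[k] * person[k] for k in COEFFS)
    return 1.0 / (1.0 + math.exp(-z))


def what_if(person, changes):
    """Return (baseline risk, risk after changes, absolute reduction)."""
    baseline = predict_risk(person)
    modified = predict_risk({**person, **changes})
    return baseline, modified, baseline - modified


person = {"age": 60, "systolic_bp": 150, "smoker": 1}
baseline, modified, reduction = what_if(person, {"smoker": 0, "systolic_bp": 130})
print(f"baseline={baseline:.3f} after={modified:.3f} reduction={reduction:.3f}")
```

Reporting the absolute reduction alongside both risks keeps the output readable for non-technical users: it shows where they start, where they could end up, and the difference.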

How We Built It

The system follows a dual-model architecture:

Clinical Model: Uses a rich set of medical features for high-precision predictions in clinical or research environments

Lightweight Model: Operates with fewer, commonly available inputs to ensure accessibility in constrained settings
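
One way the dual-model idea could be wired together is a small routing step that picks a model based on which inputs are present. The sketch below is an assumption about how such routing might look, not the project's code; the feature sets and the `choose_model` helper are hypothetical.

```python
# Hypothetical feature sets; the project's real inputs are not listed here.
CLINICAL_FEATURES = {
    "age", "systolic_bp", "total_cholesterol", "hdl", "smoker", "diabetes",
}
LIGHTWEIGHT_FEATURES = {"age", "systolic_bp", "smoker"}


def choose_model(record):
    """Pick which model can score a record, based on its non-missing fields."""
    available = {key for key, value in record.items() if value is not None}
    if CLINICAL_FEATURES <= available:
        return "clinical"
    if LIGHTWEIGHT_FEATURES <= available:
        return "lightweight"
    missing = sorted(LIGHTWEIGHT_FEATURES - available)
    raise ValueError(f"missing required inputs: {missing}")


full = {"age": 60, "systolic_bp": 150, "total_cholesterol": 220,
        "hdl": 45, "smoker": 1, "diabetes": 0}
partial = {"age": 60, "systolic_bp": 150, "smoker": 1, "hdl": None}
print(choose_model(full), choose_model(partial))
```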

Both models are trained using XGBoost, with explicit handling of class imbalance and probability calibration for reliable risk estimation.
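
The write-up does not say which imbalance or calibration techniques were used, so the following is a sketch under assumptions. A common choice with XGBoost is to set its `scale_pos_weight` parameter to the ratio of negative to positive examples, and a simple way to check calibration is the Brier score together with a reliability table. The labels and probabilities below are toy values, and the helper names are hypothetical.

```python
def scale_pos_weight(labels):
    """Ratio of negatives to positives, a common value for XGBoost's
    scale_pos_weight parameter on imbalanced binary data."""
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0:
        raise ValueError("no positive examples")
    return negatives / positives


def brier_score(labels, probs):
    """Mean squared error between predicted probabilities and outcomes."""
    return sum((p - y) ** 2 for y, p in zip(labels, probs)) / len(labels)


def reliability_bins(labels, probs, n_bins=5):
    """Per non-empty bin: (mean predicted risk, observed event rate, count)."""
    bins = [[] for _ in range(n_bins)]
    for y, p in zip(labels, probs):
        index = min(int(p * n_bins), n_bins - 1)
        bins[index].append((y, p))
    table = []
    for members in bins:
        if members:
            mean_pred = sum(p for _, p in members) / len(members)
            observed = sum(y for y, _ in members) / len(members)
            table.append((mean_pred, observed, len(members)))
    return table


# Toy data: 8 negatives and 2 positives.
labels = [0] * 8 + [1, 1]
probs = [0.1] * 8 + [0.8, 0.6]
weight = scale_pos_weight(labels)
score = brier_score(labels, probs)
table = reliability_bins(labels, probs)
print(weight, round(score, 3), table)
```

In a well-calibrated model, the mean predicted risk and the observed event rate in each bin sit close together; large gaps suggest the probabilities need recalibration before being shown to users.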

Explainability comes from SHAP-based feature attribution and counterfactual “what-if” analysis, and the system is deployed as an interactive Streamlit application.
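
To show what a SHAP-style attribution means, here is a brute-force computation of exact Shapley values for a tiny model. It is not how the `shap` library works internally (that library uses much faster algorithms for tree models) and not the project's code; the scoring function, feature names, and baseline values are hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, instance, baseline):
    """Exact Shapley values; features outside a coalition take baseline values."""
    names = list(instance)
    n = len(names)

    def value(coalition):
        x = {k: (instance[k] if k in coalition else baseline[k]) for k in names}
        return predict(x)

    phi = {}
    for name in names:
        others = [k for k in names if k != name]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                members = set(subset)
                total += weight * (value(members | {name}) - value(members))
        phi[name] = total
    return phi


def toy_score(x):
    """Hypothetical raw risk score with one smoker-by-blood-pressure interaction."""
    return (0.02 * x["age"] + 0.01 * x["systolic_bp"] + 0.5 * x["smoker"]
            + 0.005 * x["smoker"] * (x["systolic_bp"] - 120))


instance = {"age": 60, "systolic_bp": 150, "smoker": 1}
baseline = {"age": 50, "systolic_bp": 120, "smoker": 0}
phi = shapley_values(toy_score, instance, baseline)
print({name: round(v, 3) for name, v in phi.items()})
```

The attributions add up to the difference between the person's score and the baseline score, which is what lets an interface say how much each factor contributes to the gap.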

Challenges

Managing severe class imbalance in medical datasets

Producing well-calibrated probabilities suitable for medical decision support

Translating complex ML explanations for non-technical users

Designing medically realistic simulations

Ensuring dataset consistency across training and evaluation
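
For the dataset-consistency challenge, one possible guard is to check every record against the training column list before scoring, so that missing, extra, or reordered features fail loudly instead of silently shifting values. This is a sketch, not the project's code; `align_features` and the column names are hypothetical.

```python
def align_features(record, training_columns):
    """Return values in training order, or raise if the columns differ."""
    missing = [c for c in training_columns if c not in record]
    extra = [c for c in record if c not in training_columns]
    if missing or extra:
        raise ValueError(f"missing={missing} unexpected={extra}")
    return [record[c] for c in training_columns]


training_columns = ["age", "systolic_bp", "smoker"]
row = align_features({"smoker": 1, "age": 60, "systolic_bp": 150}, training_columns)
print(row)
```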

Accomplishments

Built a transparent and explainable CVD risk prediction system

Delivered actionable risk-reduction insights

Designed a scalable dual-model framework

Deployed a functional interactive web application

Ensured reproducibility with structured notebooks and saved artifacts

What We Learned

Accuracy alone is insufficient in healthcare AI: interpretability and calibration are equally critical. Even limited data can yield meaningful insights when used carefully, and explanations of how a prediction was reached can help users trust it.

What’s Next

Future work includes large-scale clinical validation, personalized lifestyle guidance, wearable data integration, fairness analysis, and collaboration with healthcare researchers.

Built With

google-colab
jupyter
numpy
pandas
python
scikit-learn
shap
streamlit
xgboost

Updates

Chetan Jangid started this project — Feb 01, 2026 08:40 AM EST
