CardioInsight AI: Explainable CVD Risk Prediction

Inspiration

Cardiovascular disease (CVD) is a leading cause of death worldwide, yet many people discover their risk only after symptoms appear. We were inspired by a simple question:

“What if people could understand their future CVD risk early — and also know why that risk exists and how to reduce it?”

Most existing systems only give a risk score, which is hard for non-medical users to trust or act upon. We wanted to build a system that not only predicts risk but also explains the contributing factors and provides actionable insights in a simple, understandable way.

What it does

CardioInsight AI is an AI-powered cardiovascular risk prediction system that:

Predicts 10-year CVD risk percentage

Classifies users into Low Risk / High Risk

Explains which factors contribute most to the risk

Simulates “what-if” scenarios to show how reducing certain factors (like BP, BMI, cholesterol, smoking) can lower future risk

Works with both:

Complete medical data

Limited real-world user data

The system is designed as a decision-support tool, not as a substitute for a medical diagnosis.

How we built it

🧠 Data Strategy

Training Dataset: We trained our models using the Framingham Heart Study dataset, a well-known clinical dataset used in cardiovascular research.

Evaluation / Hackathon Dataset: The hackathon-provided cardiovascular dataset was used during inference to validate model behavior in a real-world scenario.

🤖 Dual-Model Architecture (Key Design Choice)

We built two separate models to handle different real-world situations:

Model A – Full Clinical Model

Uses all available medical features

Designed for hospitals or detailed health records

Higher accuracy with comprehensive data

Model B – Lightweight Practical Model

Uses only 8 commonly available features, including Age, Gender, BMI, Blood Pressure, Cholesterol, Glucose, and Smoking

Designed for public health tools, surveys, or limited data scenarios

Faster and more accessible

👉 This dual-model approach makes the system flexible and realistic, instead of assuming perfect medical data.
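As a rough illustration of how a request might be routed between the two models, here is a minimal sketch. The feature names, the extra clinical fields, and the `choose_model` helper are all assumptions made for this example; the write-up does not give the actual column names or routing logic.

```python
from typing import Mapping, Optional

# Hypothetical names for the commonly available inputs that Model B relies on.
LIGHT_FEATURES = (
    "age", "gender", "bmi", "blood_pressure",
    "cholesterol", "glucose", "smoking",
)

# Hypothetical extra clinical fields that only Model A would need.
FULL_ONLY_FEATURES = ("diabetes", "bp_medication", "heart_rate")


def choose_model(record: Mapping[str, Optional[float]]) -> str:
    """Return "full", "light", or "insufficient" for one user record."""

    def has_all(names) -> bool:
        # A field counts as present only if it exists and is not None.
        return all(record.get(name) is not None for name in names)

    if not has_all(LIGHT_FEATURES):
        return "insufficient"  # not enough data even for the lightweight model
    if has_all(FULL_ONLY_FEATURES):
        return "full"          # complete record: use Model A
    return "light"             # basic fields only: fall back to Model B
```

Routing on field availability keeps the decision explicit, so a record with missing clinical values is scored by the lightweight model instead of being filled with guessed values.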

⚙️ Technical Approach

XGBoost classifier for strong performance on tabular medical data

Handled class imbalance using scale_pos_weight

Hyperparameter tuning with RandomizedSearchCV

Probability calibration using CalibratedClassifierCV

Threshold optimization focused on high recall (minimizing missed high-risk cases)
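Two of the steps above can be sketched without any ML library: deriving a value for scale_pos_weight from the label counts, and picking a decision threshold that keeps recall above a target. This is a minimal sketch, not the project's code. The helper names and the 0.9 recall target are assumptions; the negative-to-positive ratio is the commonly suggested starting value for XGBoost's scale_pos_weight parameter.

```python
from typing import Sequence


def compute_scale_pos_weight(labels: Sequence[int]) -> float:
    """Ratio of negative to positive examples, a common starting value."""
    positives = sum(1 for y in labels if y == 1)
    if positives == 0:
        raise ValueError("need at least one positive example")
    return (len(labels) - positives) / positives


def recall_at(labels: Sequence[int], probs: Sequence[float], threshold: float) -> float:
    """Share of true high-risk cases flagged when predicting 1 for p >= threshold."""
    positives = sum(1 for y in labels if y == 1)
    if positives == 0:
        raise ValueError("need at least one positive example")
    hits = sum(1 for y, p in zip(labels, probs) if y == 1 and p >= threshold)
    return hits / positives


def pick_threshold(labels: Sequence[int], probs: Sequence[float],
                   min_recall: float = 0.9) -> float:
    """Highest threshold whose recall on validation data is still >= min_recall."""
    # Recall can only grow as the threshold drops, so scan from high to low
    # and stop at the first candidate that meets the target.
    for threshold in sorted(set(probs), reverse=True):
        if recall_at(labels, probs, threshold) >= min_recall:
            return threshold
    return 0.0  # only reached when probs is empty
```

The threshold should be chosen on held-out, calibrated probabilities rather than on the training data, otherwise the recall it promises is likely to be optimistic.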

🔍 Explainability

Integrated SHAP (SHapley Additive Explanations) to:

Identify top risk-increasing factors

Provide human-readable explanations
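The step from SHAP values to readable text can be sketched as below. The sketch assumes the per-feature SHAP values for one person have already been computed (for example with the shap library's tree explainer) and only shows the formatting step. The function name, wording of the sentences, and `top_k` default are assumptions for illustration.

```python
from typing import List, Sequence


def explain_top_factors(feature_names: Sequence[str],
                        shap_values: Sequence[float],
                        top_k: int = 3) -> List[str]:
    """Turn one person's SHAP values into sentences about risk-increasing factors."""
    if len(feature_names) != len(shap_values):
        raise ValueError("feature_names and shap_values must have the same length")

    # Positive SHAP values push the prediction toward higher risk.
    increasing = [(name, value)
                  for name, value in zip(feature_names, shap_values)
                  if value > 0]
    # Largest contribution first.
    increasing.sort(key=lambda pair: pair[1], reverse=True)

    return [
        f"{name} raises your estimated risk (contribution {value:+.2f})"
        for name, value in increasing[:top_k]
    ]
```

The contribution is reported in whatever units the explainer uses (typically log-odds for a tree-based classifier), so the number is best read as a relative ranking, not as percentage points of risk.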

🔁 What-If Simulation

Simulated medically safe improvements

Example: “If systolic BP is reduced from 150 to 120 mmHg, risk decreases by X%”
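A what-if comparison of this kind can be sketched as a function that scores the same profile twice, once as-is and once with the proposed changes. Everything here is an assumption for illustration: `what_if`, the `bounds` argument used to keep changes inside a plausible range, and especially `toy_risk`, which is a made-up logistic formula standing in for the trained model and has no clinical meaning.

```python
import math
from typing import Callable, Dict, Optional, Tuple

Profile = Dict[str, float]


def what_if(predict_risk: Callable[[Profile], float],
            profile: Profile,
            changes: Profile,
            bounds: Optional[Dict[str, Tuple[float, float]]] = None) -> Dict[str, float]:
    """Compare predicted risk before and after applying the proposed changes."""
    modified = dict(profile)  # copy so the caller's profile is not mutated
    for name, value in changes.items():
        if bounds and name in bounds:
            low, high = bounds[name]
            value = min(max(value, low), high)  # clip to the allowed range
        modified[name] = value

    baseline = predict_risk(profile)
    after = predict_risk(modified)
    return {"baseline": baseline, "modified": after,
            "absolute_change": after - baseline}


def toy_risk(profile: Profile) -> float:
    """Stand-in for the real model: invented coefficients, NOT clinical."""
    score = (-7.0 + 0.05 * profile["age"]
             + 0.02 * profile["systolic_bp"]
             + 0.7 * profile["smoker"])
    return 1.0 / (1.0 + math.exp(-score))


if __name__ == "__main__":
    person = {"age": 55, "systolic_bp": 150, "smoker": 1}
    result = what_if(toy_risk, person, {"systolic_bp": 120})
    print(f"risk changes by {result['absolute_change']:+.1%} (absolute)")
```

Passing the model in as a callable means the same helper could serve either model, and the bounds make it harder for a simulation to suggest a value outside the range the team considers reasonable.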

Challenges we ran into

Handling imbalanced medical data where high-risk cases are rare

Avoiding data leakage during calibration and cross-validation

Making AI explanations understandable for non-technical users

Mapping hackathon dataset features to clinical equivalents safely

Ensuring “what-if” suggestions remain medically reasonable

Accomplishments that we're proud of

Built a fully explainable AI system, not just a black-box predictor

Successfully implemented a dual-model architecture

Added actionable insights, not just predictions

Maintained strong performance while prioritizing recall

Designed the system to work in real-world, imperfect data conditions

What we learned

In healthcare AI, explainability is as important as accuracy

A single model is often not enough for real-world deployment

Risk prediction becomes meaningful only when users understand why

Small, interpretable improvements can have a big impact on trust

What's next for CardioInsight AI: Explainable CVD Risk Prediction

Deploy as a web application for public access

Integrate with wearable or EHR data

Add time-based risk progression tracking

Collaborate with healthcare professionals for clinical validation

Expand to include preventive recommendations aligned with guidelines

Built With

github
joblib
jupyter/colab
numpy
pandas
python
scikit-learn
shap
streamlit
streamlit-cloud
xgboost

Updates

Chetan Jangid started this project — Feb 01, 2026 05:01 AM EST
