Inspiration

Early cardiovascular risk often goes undetected; we wanted to see how far structured clinical data alone can go toward providing an early warning.

What it does

Predicts cardiovascular disease risk from basic demographic, physiological, and lifestyle features using machine learning.

How we built it

We benchmarked linear models, neural networks, and tree-based ensembles, ultimately optimizing an XGBoost classifier.
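A benchmark of this kind can be sketched roughly as follows. This is a minimal illustration, not our actual pipeline: it assumes scikit-learn is installed, uses synthetic data from `make_classification` in place of the real dataset, and swaps in scikit-learn's `GradientBoostingClassifier` as a stand-in for XGBoost. The `benchmark` helper, the candidate names, and all hyperparameters are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def benchmark(X, y, cv=5):
    """Return mean cross-validated accuracy for each candidate model family."""
    candidates = {
        # Linear baseline; scaling helps the solver converge.
        "logistic_regression": make_pipeline(
            StandardScaler(), LogisticRegression(max_iter=1000)
        ),
        # Small neural network; also sensitive to feature scale.
        "mlp": make_pipeline(
            StandardScaler(),
            MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0),
        ),
        # Tree ensemble (stand-in for XGBoost); no scaling needed.
        "gradient_boosting": GradientBoostingClassifier(random_state=0),
    }
    return {
        name: cross_val_score(model, X, y, cv=cv, scoring="accuracy").mean()
        for name, model in candidates.items()
    }


# Synthetic tabular data (assumption: NOT the project's dataset).
X, y = make_classification(
    n_samples=500, n_features=11, n_informative=6, random_state=0
)
scores = benchmark(X, y)
for name, acc in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {acc:.3f}")
```

Cross-validating every candidate on the same folds keeps the comparison fair; the ranking on synthetic data says nothing about the ranking on real clinical data.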

Challenges we ran into

Aligning model choice with tabular medical data and avoiding overfitting in neural networks.

Accomplishments that we’re proud of

Achieved ~73% accuracy and demonstrated that gradient-boosted trees outperform the neural networks we tested on this dataset.

What we learned

Model–data fit matters more than model complexity, especially in tabular medical problems.

What’s next for CVD Detection using Machine Learning

Improving recall, adding explainability (SHAP), and validating the model on external clinical datasets.
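To show why recall deserves separate attention from accuracy, here is a small sketch in plain Python. The labels, predicted probabilities, and thresholds are made-up toy values, and the `confusion_counts` and `metrics` helpers are hypothetical names, not part of our code. It illustrates one possible lever, lowering the decision threshold, which can raise recall even when accuracy stays the same.

```python
def confusion_counts(y_true, y_prob, threshold=0.5):
    """Count true/false positives and negatives at a given threshold."""
    tp = fp = tn = fn = 0
    for truth, prob in zip(y_true, y_prob):
        pred = 1 if prob >= threshold else 0
        if pred == 1 and truth == 1:
            tp += 1
        elif pred == 1 and truth == 0:
            fp += 1
        elif pred == 0 and truth == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def metrics(y_true, y_prob, threshold=0.5):
    """Return accuracy, recall, and precision at a given threshold."""
    tp, fp, tn, fn = confusion_counts(y_true, y_prob, threshold)
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
    }


# Toy data (assumption): 1 = disease present, 0 = absent.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_prob = [0.9, 0.6, 0.45, 0.3, 0.55, 0.4, 0.2, 0.1]

default = metrics(y_true, y_prob, threshold=0.5)
lowered = metrics(y_true, y_prob, threshold=0.35)
print(default)  # accuracy 0.625, recall 0.5
print(lowered)  # accuracy 0.625, recall 0.75
```

In this toy case the lower threshold catches one more true case at the cost of one more false alarm, a trade-off that is often acceptable in screening but should be checked on held-out data.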
