Inspiration
Predicting cardiovascular disease is a high-stakes problem: a false negative (telling someone they’re fine when they’re not) can be much worse than a false positive.
What it does
In the notebook, I:
Loaded two different cohorts / datasets (one with a cardio target, one with HeartDisease).
Trained an XGBoost classifier on each cohort.
For each dataset, I reported ROC-AUC, log-loss, the confusion matrix, and the full classification report (precision, recall, F1).
Instead of using only the default threshold (0.5), I tested four thresholds: 0.5, 0.3, 0.1, and 0.01.
This allowed me to clearly observe the trade-off between:
Recall → Catch more at-risk patients (reduce false negatives)
Precision → Avoid too many false alarms
This threshold analysis is central to the project.
Tested feature removal (dropping columns such as gender, alco, or a chest pain category) to see whether performance stays stable with fewer features.
Added evaluation visuals such as the ROC curve, plus extra checks such as the calibration curve and the precision-recall curve.
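To make the two headline metrics listed above concrete, here is a minimal, dependency-free sketch of how ROC-AUC and log-loss can be computed by hand. The labels and probabilities are invented for illustration and are not taken from either cohort; in the notebook these numbers would more plausibly come from scikit-learn's `roc_auc_score` and `log_loss`.

```python
import math

def roc_auc(y_true, y_prob):
    """Probability that a random positive is scored above a random negative (ties count half)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def log_loss(y_true, y_prob, eps=1e-15):
    """Mean negative log-likelihood of the true labels under the predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # clip so log(0) never occurs
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

# Toy data, invented for illustration (not project results).
y_true = [0, 0, 1, 1]
y_prob = [0.1, 0.4, 0.35, 0.8]

auc = roc_auc(y_true, y_prob)
ll = log_loss(y_true, y_prob)
print(f"ROC-AUC: {auc:.2f}, log-loss: {ll:.3f}")
```

ROC-AUC only depends on how the scores rank patients, while log-loss also penalizes confident wrong probabilities, which is why it is useful to report both.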
How we built it
Everything was built step-by-step inside a Jupyter notebook.
Data loading
Imported two different cardiovascular datasets
Dataset 1: cardio target
Dataset 2: HeartDisease target
Cleaned and encoded categorical variables
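As an illustration of the encoding step, here is a small sketch of one-hot encoding a categorical column in plain Python. The column name `ChestPainType`, its category codes, and the sample rows are assumptions made up for this example; in a notebook the same result would more likely come from a single call to `pandas.get_dummies`.

```python
def one_hot(rows, column):
    """Replace a categorical column with one 0/1 indicator column per observed category."""
    categories = sorted({row[column] for row in rows})
    encoded = []
    for row in rows:
        # Copy every field except the categorical one being encoded.
        new_row = {k: v for k, v in row.items() if k != column}
        for cat in categories:
            new_row[f"{column}_{cat}"] = int(row[column] == cat)
        encoded.append(new_row)
    return encoded

# Hypothetical rows; names and values are illustrative only.
rows = [
    {"Age": 54, "ChestPainType": "ATA", "HeartDisease": 0},
    {"Age": 61, "ChestPainType": "ASY", "HeartDisease": 1},
    {"Age": 47, "ChestPainType": "NAP", "HeartDisease": 0},
]

encoded = one_hot(rows, "ChestPainType")
for row in encoded:
    print(row)
```

Each category becomes its own indicator column, so a single chest pain category can later be dropped on its own during the feature-removal tests.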
Model training
Used XGBoost (XGBClassifier)
Stratified train/validation split
Hyperparameter tuning with grid search
Optimization metric: ROC-AUC
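The split and the search space described above can be sketched without any libraries. This is only an illustration under stated assumptions: the 20% validation fraction, the seed, and the grid values are hypothetical (the parameter names `max_depth`, `learning_rate`, and `n_estimators` are real `XGBClassifier` parameters, but the notebook's actual grid is not given here), and the model fitting itself is omitted.

```python
import random
from collections import defaultdict
from itertools import product

def stratified_split(labels, val_fraction=0.2, seed=42):
    """Return (train_idx, val_idx) so each class keeps roughly the same share in both parts."""
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    rng = random.Random(seed)
    train_idx, val_idx = [], []
    for idxs in by_class.values():
        rng.shuffle(idxs)
        n_val = round(len(idxs) * val_fraction)  # per-class validation size
        val_idx.extend(idxs[:n_val])
        train_idx.extend(idxs[n_val:])
    return sorted(train_idx), sorted(val_idx)

# Toy labels: 30% positive, invented for illustration.
labels = [1] * 30 + [0] * 70
train_idx, val_idx = stratified_split(labels)

# Hypothetical grid: a grid search tries every combination of these values.
param_grid = {
    "max_depth": [3, 5],
    "learning_rate": [0.05, 0.1],
    "n_estimators": [200, 400],
}
candidates = [dict(zip(param_grid, values)) for values in product(*param_grid.values())]

# Each candidate would be fitted on the training rows and scored by ROC-AUC
# on held-out data; the best-scoring combination is kept. Fitting is omitted here.
print(len(train_idx), len(val_idx), len(candidates))
```

Stratifying matters here because the positive class share is preserved in the validation set, so ROC-AUC and recall are measured on a realistic mix of at-risk and healthy patients.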
Evaluation
Beyond ROC-AUC, I focused on:
Confusion matrix analysis
Classification report
Precision-Recall curve
Calibration curve
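For readers unfamiliar with the calibration check listed above, here is a minimal sketch of what a calibration curve measures: predictions are grouped into probability bins, and the mean predicted probability in each bin is compared with the observed share of positives. The data and the choice of five equal-width bins are assumptions for illustration; scikit-learn's `calibration_curve` performs a similar computation.

```python
def calibration_bins(y_true, y_prob, n_bins=5):
    """Return (mean predicted probability, observed positive rate) for each non-empty bin."""
    bins = [[] for _ in range(n_bins)]
    for y, p in zip(y_true, y_prob):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        bins[idx].append((y, p))
    curve = []
    for members in bins:
        if not members:
            continue  # skip empty bins
        mean_pred = sum(p for _, p in members) / len(members)
        frac_pos = sum(y for y, _ in members) / len(members)
        curve.append((mean_pred, frac_pos))
    return curve

# Toy data, invented for illustration (not project results).
y_true = [0, 0, 0, 1, 0, 1, 1, 1]
y_prob = [0.05, 0.15, 0.30, 0.35, 0.55, 0.65, 0.85, 0.95]

curve = calibration_bins(y_true, y_prob, n_bins=5)
for mean_pred, frac_pos in curve:
    print(f"predicted {mean_pred:.3f} -> observed {frac_pos:.2f}")
```

A well-calibrated model produces points close to the diagonal, which matters when a threshold such as 0.1 is meant to be read as "roughly a 10% risk".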
Threshold experimentation
Instead of keeping only the default 0.5 threshold, I tested four values: 0.5, 0.3, 0.1, and 0.01.
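The threshold experiment can be sketched as follows. This is a minimal illustration on invented labels and probabilities, not the notebook's code or results; it simply recomputes the confusion counts, precision, and recall at each of the four thresholds.

```python
def confusion_at(y_true, y_prob, threshold):
    """Count TP, FP, FN, TN when predicting positive for probabilities >= threshold."""
    tp = fp = fn = tn = 0
    for y, p in zip(y_true, y_prob):
        pred = int(p >= threshold)
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn

def sweep(y_true, y_prob, thresholds=(0.5, 0.3, 0.1, 0.01)):
    """Compute confusion counts, precision, and recall for each threshold."""
    results = {}
    for t in thresholds:
        tp, fp, fn, tn = confusion_at(y_true, y_prob, t)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        results[t] = {"tp": tp, "fp": fp, "fn": fn, "tn": tn,
                      "precision": precision, "recall": recall}
    return results

# Toy data, invented for illustration: 4 at-risk patients, 6 healthy.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_prob = [0.90, 0.45, 0.20, 0.05, 0.60, 0.35, 0.15, 0.08, 0.02, 0.005]

results = sweep(y_true, y_prob)
for t, r in results.items():
    print(f"threshold={t}: recall={r['recall']:.2f}, precision={r['precision']:.2f}, "
          f"false negatives={r['fn']}, false positives={r['fp']}")
```

On this toy data, lowering the threshold from 0.5 to 0.01 removes every false negative but raises the number of false alarms, which is the recall versus precision trade-off the project focuses on.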
Looking for additional data
To reduce dataset bias, I explored:
Additional cardiovascular datasets (cardio2, cardio3, etc.)
What's next for gabbag
Multi-dataset alignment
Built With
- jupyter
- python