Inspiration

Cardiovascular diseases are high-stakes problems: a false negative (telling someone they’re fine when they’re not) can be much worse than a false positive.

What it does

In the notebook, I:

Loaded two different cohorts / datasets (one with a cardio target, one with HeartDisease).

Trained an XGBoost classifier on each cohort.

For each dataset, I reported ROC-AUC, log loss, the confusion matrix, and the full classification report (precision, recall, F1).

Rather than relying only on the default decision threshold (0.5), I compared four thresholds: 0.5, 0.3, 0.1, and 0.01.

This allowed me to clearly observe the trade-off between:

Recall → Catch more at-risk patients (reduce false negatives)

Precision → Avoid too many false alarms

This threshold analysis is central to the project.

Tested feature removal (dropping columns such as gender, alco, or a chest pain category) to see whether performance stays stable once fewer, possibly noisy, features are used.

Added evaluation visuals such as the ROC curve, plus extra checks such as the calibration curve and the precision-recall curve.

How we built it

Everything was built step-by-step inside a Jupyter notebook.

Data loading

Imported two different cardiovascular datasets

Dataset 1: cardio target

Dataset 2: HeartDisease target

Cleaned and encoded categorical variables

Model training

Used XGBoost (XGBClassifier)

Stratified train/validation split

Hyperparameter tuning with GridSearch

Optimization metric: ROC-AUC
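The training steps above (stratified split, grid search, ROC-AUC as the selection metric) can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the data is synthetic, the parameter grid and the helper name tune_classifier are hypothetical, and scikit-learn's GradientBoostingClassifier stands in for XGBClassifier so the sketch only needs scikit-learn.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import GridSearchCV, train_test_split


def tune_classifier(estimator, param_grid, X, y, seed=0):
    """Hypothetical helper: stratified split, then grid search scored by ROC-AUC."""
    # stratify=y keeps the positive/negative ratio the same in both parts.
    X_train, X_val, y_train, y_val = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed
    )
    # Cross-validated grid search on the training part only.
    search = GridSearchCV(estimator, param_grid, scoring="roc_auc", cv=3)
    search.fit(X_train, y_train)
    # The best model is refit on the training part; score it on held-out data.
    val_prob = search.predict_proba(X_val)[:, 1]
    return search.best_params_, roc_auc_score(y_val, val_prob)


# Synthetic stand-in data; the real cohorts are not reproduced here.
X, y = make_classification(n_samples=400, n_features=8, random_state=0)

# Assumed, deliberately tiny grid so the example runs quickly.
best_params, val_auc = tune_classifier(
    GradientBoostingClassifier(n_estimators=30, random_state=0),
    {"max_depth": [2, 3], "learning_rate": [0.05, 0.1]},
    X,
    y,
)
print(best_params, round(val_auc, 3))
```

Because XGBClassifier follows the scikit-learn estimator interface, it should be possible to pass XGBClassifier() in place of the stand-in model, with a grid over its own parameters such as max_depth and learning_rate.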

Evaluation

Beyond ROC-AUC, I focused on:

Confusion matrix analysis

Classification report

Precision-Recall curve

Calibration curve
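Of the checks above, the calibration curve is probably the least familiar, so here is a small sketch of the idea behind it: group predictions into probability bins and compare the average predicted probability with the observed positive rate in each bin. The function name calibration_table, the bin count, and the sample numbers are illustrative assumptions, not values from the notebook.

```python
def calibration_table(y_true, y_prob, n_bins=5):
    """Bin predictions into equal-width probability bins.

    Returns one tuple per non-empty bin:
    (bin_low, bin_high, count, mean_predicted, observed_rate).
    """
    bins = [[] for _ in range(n_bins)]
    for label, prob in zip(y_true, y_prob):
        # A probability of exactly 1.0 is placed in the last bin.
        index = min(int(prob * n_bins), n_bins - 1)
        bins[index].append((label, prob))

    table = []
    for i, members in enumerate(bins):
        if not members:
            continue  # skip empty bins instead of dividing by zero
        count = len(members)
        mean_predicted = sum(prob for _, prob in members) / count
        observed_rate = sum(label for label, _ in members) / count
        table.append((i / n_bins, (i + 1) / n_bins, count, mean_predicted, observed_rate))
    return table


# Made-up labels and predicted probabilities, for illustration only.
y_true = [0, 0, 1, 0, 1, 1, 0, 1, 1, 1]
y_prob = [0.05, 0.10, 0.15, 0.30, 0.35, 0.55, 0.70, 0.75, 0.90, 0.95]

for low, high, count, mean_predicted, observed_rate in calibration_table(y_true, y_prob):
    print(f"[{low:.1f}, {high:.1f}) n={count} predicted={mean_predicted:.3f} observed={observed_rate:.3f}")
```

For a well-calibrated model the two columns are close in every bin. scikit-learn offers calibration_curve in sklearn.calibration for the same kind of comparison.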

Threshold experimentation

Rather than keeping only the default 0.5 threshold, I compared four thresholds: 0.5, 0.3, 0.1, and 0.01.
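To make the effect of the threshold concrete, here is a minimal sketch of a threshold sweep over predicted probabilities. The labels and probabilities are made up for illustration, and the helper names (confusion_counts, precision_recall, sweep_thresholds) are hypothetical; only the four threshold values come from the project.

```python
def confusion_counts(y_true, y_prob, threshold):
    """Return (tp, fp, fn, tn), predicting positive when prob >= threshold."""
    tp = fp = fn = tn = 0
    for label, prob in zip(y_true, y_prob):
        predicted = 1 if prob >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def precision_recall(tp, fp, fn):
    """Precision and recall, defined as 0.0 when the denominator is zero."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def sweep_thresholds(y_true, y_prob, thresholds=(0.5, 0.3, 0.1, 0.01)):
    """Compute confusion counts, precision, and recall for each threshold."""
    rows = []
    for threshold in thresholds:
        tp, fp, fn, tn = confusion_counts(y_true, y_prob, threshold)
        precision, recall = precision_recall(tp, fp, fn)
        rows.append({
            "threshold": threshold, "tp": tp, "fp": fp, "fn": fn, "tn": tn,
            "precision": precision, "recall": recall,
        })
    return rows


# Made-up labels (1 = disease) and predicted probabilities, for illustration only.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_prob = [0.90, 0.60, 0.35, 0.05, 0.40, 0.20, 0.12, 0.08, 0.02, 0.005]

for row in sweep_thresholds(y_true, y_prob):
    print(row)
```

On this toy data, lowering the threshold from 0.5 to 0.01 takes false negatives from 2 to 0 while false positives rise from 0 to 5, which is the recall-versus-precision trade-off described earlier.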

Looking for additional data

To reduce dataset bias, I explored:

Additional cardiovascular datasets (cardio2, cardio3, etc.)

What's next for gabbag

Multi-dataset alignment
