Longitudinal MRI-Based Alzheimer’s Risk Prediction

Inspiration

Clinical decisions are usually based on multiple MRI scans collected over time rather than on a single snapshot. The core idea of this project was to take data from multiple scans into account. I wanted to explore whether including temporal features (e.g., cognitive decline, brain volume changes, visit frequency) could produce an accurate and reliable model.

What it does

It takes a patient's demographic and clinical data, along with data from one or more MRI scans. A first model, a RandomForestClassifier, predicts the CDR (Clinical Dementia Rating) value, which is then fed as an input column to an XGBoost classifier (XGBClassifier) for the Demented/Non-Demented prediction. For longitudinal data, feature engineering converts each subject's multiple visits into temporal features. After training, it reports feature importances and SHAP values, which show which features contribute most to the predictions.
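As a rough illustration of the longitudinal feature engineering step, here is a minimal pandas sketch that collapses visit-level rows into one row of temporal features per subject. The column names (`subject_id`, `visit`, `days_since_baseline`, `mmse`, `nwbv`) and the toy values are assumptions made for this example, not the project's actual schema or data.

```python
import pandas as pd

# Hypothetical visit-level table: one row per subject per MRI visit.
# Column names and values are illustrative assumptions.
visits = pd.DataFrame({
    "subject_id": ["S1", "S1", "S1", "S2", "S2"],
    "visit": [1, 2, 3, 1, 2],
    "days_since_baseline": [0, 400, 800, 0, 600],
    "mmse": [29, 27, 24, 30, 30],           # cognitive test score
    "nwbv": [0.75, 0.74, 0.72, 0.80, 0.79],  # normalized brain volume
})


def to_subject_features(df: pd.DataFrame) -> pd.DataFrame:
    """Collapse visit rows into one row of temporal features per subject."""
    # Sort so that "first" and "last" mean earliest and latest visit.
    df = df.sort_values(["subject_id", "visit"])
    out = df.groupby("subject_id").agg(
        n_visits=("visit", "count"),
        followup_days=("days_since_baseline", "max"),
        baseline_nwbv=("nwbv", "first"),
        last_nwbv=("nwbv", "last"),
        baseline_mmse=("mmse", "first"),
        last_mmse=("mmse", "last"),
    )
    # Change between the first and last visit for each measurement.
    out["nwbv_change"] = out["last_nwbv"] - out["baseline_nwbv"]
    out["mmse_change"] = out["last_mmse"] - out["baseline_mmse"]
    return out.drop(columns=["last_nwbv", "last_mmse"]).reset_index()


features = to_subject_features(visits)
print(features)
```

Each subject ends up with a single row, which also makes a subject-level train-test split straightforward.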

How we built it

We trained and evaluated ensemble-based classifiers, namely Random Forest and XGBoost models, using subject-wise train-test splitting to prevent data leakage. The RandomForestClassifier is trained on the cross-sectional data, while the XGBoost classifier is trained on the longitudinal data after feature extraction.
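The two-stage idea can be sketched roughly as follows. This is not the project's code: the data is synthetic, the feature layout is an assumption, and scikit-learn's GradientBoostingClassifier is used as a stand-in for the XGBoost classifier so the example depends only on scikit-learn (xgboost's XGBClassifier exposes the same fit/predict_proba interface and could be swapped in).

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier

rng = np.random.default_rng(0)

# Stage 1: synthetic cross-sectional data, features -> CDR level.
# CDR levels are encoded as strings because a value such as 0.5 is not
# accepted as a class label when passed as a float.
X_cross = rng.normal(size=(200, 3))
cdr_cross = np.where(X_cross[:, 0] > 0.5, "0.5", "0")
stage1 = RandomForestClassifier(n_estimators=50, random_state=0)
stage1.fit(X_cross, cdr_cross)

# Stage 2: synthetic longitudinal data with the same three shared features
# plus two hypothetical temporal features (e.g. volume change, visit count).
X_shared = rng.normal(size=(150, 3))
X_temporal = rng.normal(size=(150, 2))
y_demented = (X_shared[:, 0] + X_temporal[:, 0] > 0.5).astype(int)

# Predict CDR with the stage-1 model and append it as an extra input column.
pred_cdr = stage1.predict(X_shared).astype(float)
X_stage2 = np.column_stack([X_shared, X_temporal, pred_cdr])

# Stand-in for the XGBoost classifier (assumption, see note above).
stage2 = GradientBoostingClassifier(random_state=0)
stage2.fit(X_stage2, y_demented)
proba = stage2.predict_proba(X_stage2)[:, 1]
print(X_stage2.shape, proba[:5])
```

Because the stage-1 model here is fit on a separate table, its predictions on the longitudinal rows are not contaminated by the stage-2 labels. If both stages shared the same rows, out-of-fold predictions would be the safer choice.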

Challenges we ran into

Information leakage: Early versions inadvertently used future visits during training. Using pipelines and applying a subject-level split when building the longitudinal features resolved this problem.

Calibration trade-offs: In some cases, calibration improved (lowered) the Brier score at the expense of ROC-AUC, which highlighted the metric trade-offs that arise in practice.

Deployment readiness: To make the notebook code deployable, it was divided into multiple smaller sections.
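The subject-level split mentioned under information leakage can be sketched with scikit-learn's GroupShuffleSplit, which keeps all rows that share a group label on the same side of the split. The subject IDs, features, and labels below are made up for illustration.

```python
import numpy as np
from sklearn.model_selection import GroupShuffleSplit

# Hypothetical visit-level data: several rows per subject.
subject_ids = np.array(
    ["S1", "S1", "S1", "S2", "S2", "S3", "S3", "S4", "S5", "S5"]
)
X = np.arange(len(subject_ids) * 2).reshape(-1, 2)  # placeholder features
y = np.array([0, 0, 1, 0, 0, 1, 1, 0, 1, 1])        # placeholder labels

# Split by subject, not by row, so no subject appears in both sets.
splitter = GroupShuffleSplit(n_splits=1, test_size=0.4, random_state=42)
train_idx, test_idx = next(splitter.split(X, y, groups=subject_ids))

train_subjects = set(subject_ids[train_idx])
test_subjects = set(subject_ids[test_idx])
print("train:", sorted(train_subjects))
print("test:", sorted(test_subjects))
```

A plain row-wise split could place one visit of a subject in training and a later visit of the same subject in testing, which is the kind of leakage described above.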

Accomplishments that we're proud of

The final longitudinal XGBoost model achieved an ROC-AUC of 0.982 and shows strong probabilistic calibration. Its Brier score of 0.077 (lower is better, 0 is perfect) indicates that the predicted probabilities are close to the observed outcomes. Feature importance and SHAP analysis reveal that visit frequency, socioeconomic status, and baseline brain volume are among the most influential factors.
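To show how the two reported metrics differ, here is a small sketch on toy numbers (not the project's predictions). ROC-AUC depends only on how the scores rank positives against negatives, while the Brier score is the mean squared difference between predicted probabilities and the 0/1 outcomes. The sigmoid rescaling below is only a stand-in for a calibration step, chosen because it preserves the ranking.

```python
import numpy as np
from sklearn.metrics import brier_score_loss, roc_auc_score

# Toy labels and compressed scores that cluster around 0.5.
y_true = np.array([0, 0, 1, 0, 1, 1])
raw = np.array([0.40, 0.52, 0.50, 0.48, 0.55, 0.60])

# Strictly increasing rescaling: pushes scores away from 0.5
# without changing their order.
sharpened = 1.0 / (1.0 + np.exp(-20.0 * (raw - 0.5)))

auc_raw = roc_auc_score(y_true, raw)
auc_sharp = roc_auc_score(y_true, sharpened)
brier_raw = brier_score_loss(y_true, raw)
brier_sharp = brier_score_loss(y_true, sharpened)

print(f"ROC-AUC: raw={auc_raw:.3f} rescaled={auc_sharp:.3f}")
print(f"Brier:   raw={brier_raw:.3f} rescaled={brier_sharp:.3f}")
```

In this toy case the ranking, and therefore ROC-AUC, is unchanged while the Brier score drops. Calibrators that are not strictly increasing, such as isotonic regression, can map different scores to the same value and so can also shift ROC-AUC.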

Built With

optuna
pandas
python
scikit-learn
xgboost

Updates

Aarya Dave started this project — Dec 31, 2025 07:12 AM EST
