Inspiration
Alzheimer’s diagnosis using MRI has advanced rapidly, but most AI systems stop at classification — telling what stage the disease is today, not what might happen tomorrow. During competitions and research reviews, I noticed a major gap:
Clinicians struggle not with labeling disease stages, but with anticipating progression early enough to intervene.
Existing MRI models treat each scan as an isolated snapshot, ignoring the temporal story of neurodegeneration.
Genetic risk models exist separately, but are rarely integrated into a clinically interpretable decision flow.
This motivated us to build a system that not only classifies MRI scans but also provides explainable spatial risk signals (via Grad-CAM) and genetic predisposition insights, so neurologists can see where the model is looking and how risky the underlying biological profile is, even without paired patient IDs.
Our vision is to evolve competition AI from a grading tool into a clinical insight engine — reliable, leak-free, explainable, and neurologist-friendly.
What it does
1. MRI Intelligence (Image CNN)
1. Uses an EfficientNet CNN backbone to encode 84,000+ MRI slices into deep feature embeddings.
2. Predicts one of 4 clinical dementia stages:
NonDemented, VeryMildDemented, MildDemented, ModerateDemented
3. Produces confidence scores (softmax probability) for each prediction.
4. Generates Grad-CAM spatial attention heatmaps from the final convolutional layer, resized smoothly to 224×224.
5. Overlays heatmaps on MRI slices to show clinically meaningful disease focus regions, not pooled artifacts.
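Once the final convolutional layer's activations and the gradients of the predicted class score are available, the Grad-CAM step above reduces to a small computation. The sketch below shows that core in plain Python on nested lists. In a PyTorch pipeline these values would typically come from forward and backward hooks, and the resize to 224×224 would use bilinear interpolation. The function name `grad_cam_map` and the toy inputs are illustrative assumptions, not the project's code.

```python
def grad_cam_map(activations, gradients):
    """Compute a normalized Grad-CAM map.

    activations, gradients: lists of K channel maps, each an H x W nested list.
    Returns an H x W nested list with values in [0, 1].
    """
    height, width = len(activations[0]), len(activations[0][0])

    # Channel weights: global-average-pool the gradients of each channel.
    weights = [
        sum(sum(row) for row in grad) / (height * width)
        for grad in gradients
    ]

    # Weighted sum of activation maps, followed by ReLU.
    cam = [
        [
            max(0.0, sum(w * act[i][j] for w, act in zip(weights, activations)))
            for j in range(width)
        ]
        for i in range(height)
    ]

    # Scale to [0, 1] so the map can be rendered as a heatmap.
    peak = max(max(row) for row in cam)
    if peak > 0:
        cam = [[value / peak for value in row] for row in cam]
    return cam


# Toy example: 2 channels on a 2x2 feature map (made-up numbers).
toy_activations = [
    [[1.0, 0.0], [0.0, 2.0]],
    [[0.0, 3.0], [1.0, 0.0]],
]
toy_gradients = [
    [[1.0, 1.0], [1.0, 1.0]],      # mean +1.0: channel supports the class
    [[-1.0, -1.0], [-1.0, -1.0]],  # mean -1.0: channel opposes the class
]
heatmap = grad_cam_map(toy_activations, toy_gradients)
```

The ReLU keeps only regions that push the score toward the predicted class, which is why the second channel's strong activation at the top-right is suppressed in the toy example.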
2. Genetic Intelligence (Tabular Risk Model)
1. Loads large-scale NIAGADS AD genetic association data.
2. Computes the effect size (odds ratio, OR) and the -log10(p-value) signal, then infers a risk probability.
3. Converts probabilities into clinically decoded risk levels (High/Moderate/Low).
4. No fabricated patient pairing: genetic insights are fused only at inference, as independent risk evidence.
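The genetic steps above can be sketched for a single variant as follows. The log odds ratio and -log10(p) features follow the description. The `risk_level` cutoffs (0.66 and 0.33) and the function names are hypothetical placeholders, since the write-up does not state how probabilities map to High/Moderate/Low. The probability itself would come from the tabular model (a Random Forest, according to the build notes), which is not reproduced here.

```python
import math


def association_features(odds_ratio, p_value):
    """Return (log odds ratio, -log10 p) for one variant."""
    if odds_ratio <= 0 or not (0 < p_value <= 1):
        raise ValueError("odds_ratio must be > 0 and p_value in (0, 1]")
    effect = math.log(odds_ratio)    # > 0 means a risk-increasing allele
    signal = -math.log10(p_value)    # larger means stronger evidence
    return effect, signal


def risk_level(probability, high=0.66, moderate=0.33):
    """Map a risk probability to a label. Cutoffs are assumed, not from the source."""
    if probability >= high:
        return "High"
    if probability >= moderate:
        return "Moderate"
    return "Low"


# Made-up variant: OR = 2.0, p = 1e-8.
effect, signal = association_features(odds_ratio=2.0, p_value=1e-8)
```

Taking the logarithm of the odds ratio makes protective (OR < 1) and risk-increasing (OR > 1) variants symmetric around zero, which is usually easier for a tabular model to use than the raw ratio.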
3. Late Fusion Clinical Insight
Outputs a 3-panel grid per sample:
Original MRI
Grad-CAM heatmap
Overlay (CNN attention + MRI anatomy)
And prints:
MRI predicted label
MRI confidence %
Genetic risk level %
Source folder/file path
True label (from test split)
Batch comparison-ready visuals
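A minimal sketch of how the printed summary could be assembled: a softmax over the CNN logits gives the confidence, and the genetic risk is attached as independent evidence rather than being paired to the patient. The function `format_report`, the logits, and the file path are made up for illustration, and the class order is assumed to match the stage list given earlier.

```python
import math

CLASSES = ["NonDemented", "VeryMildDemented", "MildDemented", "ModerateDemented"]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    shift = max(logits)
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def format_report(logits, genetic_level, genetic_prob, path, true_label):
    """Build the per-sample text summary (hypothetical layout)."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return "\n".join([
        f"MRI predicted label: {CLASSES[best]}",
        f"MRI confidence: {probs[best] * 100:.2f}%",
        f"Genetic risk level: {genetic_level} ({genetic_prob * 100:.1f}%)",
        f"Source: {path}",
        f"True label: {true_label}",
    ])


report = format_report(
    logits=[0.2, 3.1, 0.4, -1.0],               # made-up logits
    genetic_level="Moderate",
    genetic_prob=0.48,
    path="test/VeryMildDemented/example.jpg",   # placeholder path
    true_label="VeryMildDemented",
)
print(report)
```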
This fusion helps clinicians answer:
Does the model focus on real anatomical risk areas?
Is the prediction biologically supported by genetic risk signals?
Is there any data leakage? (No, as shown by zero overlap between the train, validation, and test splits.)
Is the model stable across samples and batches?
How we built it
💻 Development Environment
Platform: Kaggle Notebook (cloud kernel, GPU enabled)
Frameworks: Python, PyTorch, TorchVision, OpenCV, Matplotlib (no forced colors), TQDM, Scikit-learn, Missingno
Hardware: The EfficientNet encoder was trained entirely on a single GPU (cuda:0); genetic risk uses a Random Forest tabular classifier.
Training Pipeline
Loaded 84,384 MRI images
Extracted labels and file paths into a DataFrame
Applied filename-level deduplication
Created fixed 70/15/15 train/val/test split with stratification
Ran a leakage check confirming that no file appears in more than one split:
Train ∩ Val = ∅
Train ∩ Test = ∅
Val ∩ Test = ∅
Built DataLoaders (batch size 16)
Trained EfficientNet CNN for 10 epochs, logging loss, accuracy, and AUC (OVR)
Evaluated final MRI model on test set:
Accuracy = 99.71%
Macro-F1 = 99.73%
AUC (OVR) = 0.9999
Computed genetic risk separately using a TSV data pipeline.
Implemented Grad-CAM with smooth upsampling for correct anatomical attention visuals.
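The deduplication, stratified split, and leakage check from the pipeline above can be sketched with the standard library alone. This is an illustration under assumptions, not the project's code: the function names, the seed, and the synthetic filenames are invented, and a real pipeline would more likely use scikit-learn's `train_test_split` with its `stratify` argument.

```python
import random
from collections import defaultdict


def stratified_split(items, fractions=(0.70, 0.15, 0.15), seed=42):
    """Split (filename, label) pairs into train/val/test, per label.

    Duplicated filenames are dropped first (first occurrence wins).
    The test split takes whatever remains after train and val.
    """
    seen, by_label = set(), defaultdict(list)
    for filename, label in items:
        if filename not in seen:
            seen.add(filename)
            by_label[label].append(filename)

    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    train, val, test = [], [], []
    for label in sorted(by_label):
        names = by_label[label]
        rng.shuffle(names)
        n_train = int(len(names) * fractions[0])
        n_val = int(len(names) * fractions[1])
        train += [(n, label) for n in names[:n_train]]
        val += [(n, label) for n in names[n_train:n_train + n_val]]
        test += [(n, label) for n in names[n_train + n_val:]]
    return train, val, test


def assert_no_leakage(train, val, test):
    """Raise AssertionError if any filename appears in more than one split."""
    names = [{n for n, _ in split} for split in (train, val, test)]
    assert not names[0] & names[1], "Train and Val overlap"
    assert not names[0] & names[2], "Train and Test overlap"
    assert not names[1] & names[2], "Val and Test overlap"


# Synthetic data: 100 files per class plus one duplicated filename.
data = [(f"{label}_{i}.jpg", label) for label in ("A", "B") for i in range(100)]
data.append(("A_0.jpg", "A"))
train, val, test = stratified_split(data)
assert_no_leakage(train, val, test)
```

Note that a filename-level check only guards against the same file landing in two splits. If several slices come from the same scan or subject, a subject-level split would be needed to rule out leakage between them.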
📊 Visualization
3-panel Grad-CAM grid for 10 random test images
Confidence histogram plots
Top SNP effect size bar graph
Built With
- kaggle
- python