Inspiration
Alzheimer's Disease (AD) is a silent crisis. By the time clinical symptoms appear, significant brain damage has often already occurred. While researching the field, we found a critical gap: Deep Learning models can detect AD with high accuracy, but they suffer from the "Black Box" problem. Clinicians cannot trust an algorithm that says "Diagnosis: Alzheimer's" without explaining why.
What it does
NeuroSight is an end-to-end computer vision pipeline that analyzes brain MRI scans and classifies the progression of Alzheimer's into four stages: non-demented, very mild, mild, and moderate. Unlike standard classifiers, NeuroSight includes an explainability engine. It generates visual "heatmaps" overlaid on the MRI, showing which regions of the brain most influenced the model's decision. This lets doctors check whether the model is looking at relevant atrophy (shrinkage) or just noise.
How we built it
Our tech stack centers on Python and TensorFlow/Keras:
- Data Engineering: We worked with a dataset of over 6,000 MRI scans. To handle it efficiently, we converted the data to Parquet format and built a custom "lazy loading" data generator. This let us stream data to the GPU in small batches, preventing memory overflows.
- The Brain (Model): We used transfer learning with the VGG16 architecture, initialized with weights pre-trained on ImageNet so the model starts out with basic visual features. Training ran in two phases: first training a custom classification head, then fine-tuning the top convolutional layers to adapt to brain tissue textures.
- The Eyes (Interpretability): We implemented Grad-CAM (Gradient-weighted Class Activation Mapping). This algorithm uses the gradients of the predicted class score with respect to the final convolutional layer's feature maps to visualize the "regions of interest" that drove the diagnosis.
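The two-phase training strategy described above might look roughly like the sketch below. Everything beyond "VGG16 with ImageNet weights, head first, then top convolutional layers" is an assumption made for illustration: the 128x128 input size, the head layout, the learning rates, freezing the whole backbone during phase 1, and treating VGG16's `block5` layers as the "top" layers. The helper names `build_model` and `unfreeze_top_block` are hypothetical.

```python
from tensorflow import keras

NUM_CLASSES = 4  # non-demented, very mild, mild, moderate


def build_model(input_shape=(128, 128, 3), weights="imagenet"):
    """Phase 1 setup: frozen VGG16 backbone plus a small trainable head."""
    base = keras.applications.VGG16(
        weights=weights, include_top=False, input_shape=input_shape
    )
    base.trainable = False  # assumption: only the new head learns in phase 1

    inputs = keras.Input(shape=input_shape)
    x = base(inputs, training=False)
    x = keras.layers.GlobalAveragePooling2D()(x)
    x = keras.layers.Dense(256, activation="relu")(x)  # assumed head size
    x = keras.layers.Dropout(0.5)(x)
    outputs = keras.layers.Dense(NUM_CLASSES, activation="softmax")(x)

    model = keras.Model(inputs, outputs)
    model.compile(
        optimizer=keras.optimizers.Adam(1e-3),
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model, base


def unfreeze_top_block(model, base, prefix="block5"):
    """Phase 2 setup: make only the last VGG16 conv block trainable."""
    base.trainable = True
    for layer in base.layers:
        layer.trainable = layer.name.startswith(prefix)
    # Recompile so the new trainable flags take effect; use a much
    # smaller learning rate to avoid wrecking the pre-trained filters.
    model.compile(
        optimizer=keras.optimizers.Adam(1e-5),
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model
```

In use, one would call `model.fit(...)` once after `build_model` and again after `unfreeze_top_block`, feeding both calls from the same data generator.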
Challenges we ran into
- The RAM Crash: Early in the project, attempting to load all 6,000 high-res images into RAM caused our environment to crash repeatedly. We had to pivot from simple array loading to a custom Keras Sequence generator that reads from Parquet files on the fly.
- The "Very Mild" Boundary: Our confusion matrix revealed that the model struggled most to distinguish between "Non-Demented" and "Very Mild Demented." This reflects the real-world biological difficulty of diagnosing early-stage AD, where anatomical changes are subtle.
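The idea behind the Sequence-style generator from the RAM crash fix can be shown without any Keras or Parquet dependency. The sketch below is a simplified stand-in, not our actual loader: `LazyBatchLoader`, `read_rows`, and `fake_reader` are hypothetical names, and the 0-255 pixel scaling is an assumption. In a real pipeline the class would subclass `keras.utils.Sequence` and `read_rows` would read a slice of rows from a Parquet file.

```python
import math

import numpy as np


class LazyBatchLoader:
    """Sequence-style loader: only one batch is held in memory at a time."""

    def __init__(self, num_rows, batch_size, read_rows):
        self.num_rows = num_rows
        self.batch_size = batch_size
        # Hypothetical callable: (start, stop) -> (images, labels).
        self.read_rows = read_rows

    def __len__(self):
        # Number of batches per epoch; the last batch may be smaller.
        return math.ceil(self.num_rows / self.batch_size)

    def __getitem__(self, index):
        if not 0 <= index < len(self):
            raise IndexError(index)
        start = index * self.batch_size
        stop = min(start + self.batch_size, self.num_rows)
        images, labels = self.read_rows(start, stop)
        # Assumption: pixels are stored as 0-255 integers.
        images = np.asarray(images, dtype="float32") / 255.0
        return images, np.asarray(labels)


def fake_reader(start, stop):
    """Stand-in for a Parquet read: blank 8x8 'scans' with cycling labels."""
    n = stop - start
    return np.zeros((n, 8, 8, 1), dtype="uint8"), np.arange(start, stop) % 4
```

Because each batch is read only when it is indexed, peak memory depends on `batch_size` rather than on the size of the dataset.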
Accomplishments that we're proud of
- Achieving 81.4% validation accuracy, significantly outperforming baseline models.
- Successfully implementing Grad-CAM. Seeing the heatmaps light up around the ventricles and hippocampus was a huge win: it suggested that our model had learned real biology and not just background noise.
- Building a data pipeline designed to scale to millions of images without crashing.
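For readers curious how a Grad-CAM heatmap is computed, here is a minimal NumPy sketch of the core step. It assumes the activations of the final convolutional layer and the gradients of the predicted class score with respect to them have already been extracted from the network (for example with `tf.GradientTape`); the function name `grad_cam_heatmap` is ours for illustration, and the final scaling to [0, 1] is a common display convention rather than part of the algorithm itself.

```python
import numpy as np


def grad_cam_heatmap(activations, gradients):
    """Combine feature maps into one heatmap.

    activations, gradients: arrays of shape (H, W, K) for a single image,
    taken at the final convolutional layer.
    """
    # One importance weight per channel: the spatially averaged gradient.
    weights = gradients.mean(axis=(0, 1))                       # shape (K,)
    # Weighted sum of the feature maps over the channel axis.
    cam = np.tensordot(activations, weights, axes=([2], [0]))   # shape (H, W)
    # ReLU keeps only regions that push the class score up.
    cam = np.maximum(cam, 0.0)
    # Scale to [0, 1] for display; leave an all-zero map untouched.
    peak = cam.max()
    return cam / peak if peak > 0 else cam
```

The resulting low-resolution map is then resized to the MRI's dimensions and blended over the scan as a color overlay.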
What we learned
- Data Engineering is Key: The model is only as good as the pipeline feeding it. Moving to Parquet/Generators saved the project.
- Trust > Accuracy: In medical AI, an explainable model with 81% accuracy is often more valuable than a black-box model with 90% accuracy.
What's next for NeuroSight
- Addressing Class Imbalance: We plan to use SMOTE (Synthetic Minority Over-sampling Technique) to generate more examples of the "Moderate" class, which was underrepresented.
- Multimodal Integration: Combining these MRI visuals with tabular clinical data (age, genetic markers like APOE4) to improve the detection of "Very Mild" cases.
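As a rough illustration of what SMOTE does, the sketch below generates synthetic minority samples by interpolating between a sample and one of its nearest minority-class neighbours. It is a simplified NumPy version for explanation only, not code from the project: `smote_sketch` is a hypothetical name, and it assumes each scan has already been reduced to a feature vector (flattened pixels or an embedding). In practice the `SMOTE` class from the imbalanced-learn library (`imblearn.over_sampling`) would be the more likely choice.

```python
import numpy as np


def smote_sketch(minority, n_new, k=5, seed=0):
    """Return n_new synthetic samples for an (n, d) minority-class array."""
    rng = np.random.default_rng(seed)
    minority = np.asarray(minority, dtype=float)
    n = len(minority)
    if n < 2:
        raise ValueError("need at least two minority samples to interpolate")
    k = min(k, n - 1)

    # Pairwise squared distances; a sample is never its own neighbour.
    diffs = minority[:, None, :] - minority[None, :, :]
    dists = (diffs ** 2).sum(axis=2)
    np.fill_diagonal(dists, np.inf)
    neighbours = np.argsort(dists, axis=1)[:, :k]        # shape (n, k)

    # Pick a random base sample and one of its k nearest neighbours.
    base = rng.integers(0, n, size=n_new)
    picked = neighbours[base, rng.integers(0, k, size=n_new)]

    # Place the new point somewhere on the segment between the two.
    gap = rng.random((n_new, 1))
    return minority[base] + gap * (minority[picked] - minority[base])
```

Oversampling like this should be applied to the training split only, so that no synthetic points derived from validation scans leak into training.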
Built With
- keras
- machine-learning
- pandas
- python
- tensorflow
- vgg16
