Inspiration
Alzheimer's Disease (AD) is a devastating neurodegenerative disorder where early diagnosis is critical yet often delayed until significant damage has occurred. We were driven by the observation that current diagnostics are siloed, with genetic risk factors and neuroimaging findings rarely analyzed together. Our goal was to bridge this gap by building a tool that not only predicts cognitive status but also provides a cohesive and transparent patient profile to help clinicians intervene earlier.
What it does
NeuroGen is a multimodal AI pipeline that fuses genetic data, in the form of single-nucleotide polymorphisms (SNPs), with MRI neuroimaging to improve the precision of Alzheimer's diagnosis. The system ingests and processes heterogeneous biomedical data to classify patients into cognitive stages such as Non-Demented, Mild, and Moderate. It emphasizes explainable AI rather than operating as a black box: SHAP identifies key genetic risk factors, while Grad-CAM visualizes the specific brain regions, such as the hippocampus, that influence each diagnosis. The entire pipeline is deployed through a Gradio web interface that lets clinicians upload patient data and receive instant, interpretable diagnostic reports.
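The Grad-CAM step described above boils down to a short computation: pool the gradients of the class score over each convolutional feature map, use them to weight the activation maps, and keep only the positive contributions. Here is a minimal NumPy sketch of that idea; the array shapes and names are illustrative assumptions, not the project's actual implementation.

```python
import numpy as np

def grad_cam_heatmap(activations, gradients):
    """Grad-CAM sketch: weight each activation map by its pooled gradient, then ReLU.

    activations: (C, H, W) feature maps from the last conv layer
    gradients:   (C, H, W) gradients of the class score w.r.t. those maps
    """
    weights = gradients.mean(axis=(1, 2))             # (C,) global-average-pooled gradients
    cam = np.tensordot(weights, activations, axes=1)  # (H, W) weighted sum over channels
    cam = np.maximum(cam, 0.0)                        # keep only positive influence
    if cam.max() > 0:
        cam /= cam.max()                              # normalize to [0, 1] for overlay
    return cam

# Toy example: 4 channels of 8x8 feature maps (stand-ins for real conv outputs)
rng = np.random.default_rng(0)
acts = rng.random((4, 8, 8)).astype(np.float32)
grads = rng.standard_normal((4, 8, 8)).astype(np.float32)
heatmap = grad_cam_heatmap(acts, grads)
```

In practice the resulting heatmap is upsampled to the MRI slice's resolution and overlaid on it, which is how regions like the hippocampus become visible in the report.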
How we built it
We developed an end-to-end pipeline in Python, with PyTorch as the core deep learning framework. Pandas and NumPy handled the raw genetic variant data, with robust preprocessing to manage missing values and outliers, while MRI scans were processed using standard image transformations. For modeling, convolutional neural networks handled the imaging data, and a separate neural architecture was designed for the tabular genetic data. Interpretability was achieved by integrating the SHAP library for genetic feature importance and implementing Grad-CAM to generate MRI heatmaps. Deployment was handled through Gradio, providing a seamless interface between the models and clinical users.
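The two-branch design described above can be sketched as a small PyTorch module: a CNN branch for MRI slices, an MLP branch for the SNP table, and a classifier over their concatenated features. Layer sizes, input shapes, and the three-class output are illustrative assumptions, not the project's actual configuration.

```python
import torch
import torch.nn as nn

class MultimodalNet(nn.Module):
    """Late-fusion sketch: CNN features (MRI) concatenated with MLP features (SNPs)."""

    def __init__(self, n_snps: int, n_classes: int = 3):
        super().__init__()
        self.cnn = nn.Sequential(                        # MRI branch: 1-channel 2D slices
            nn.Conv2d(1, 8, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(8, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),       # -> (B, 16)
        )
        self.mlp = nn.Sequential(                        # tabular SNP branch
            nn.Linear(n_snps, 32), nn.ReLU(),
            nn.Linear(32, 16), nn.ReLU(),                # -> (B, 16)
        )
        self.head = nn.Linear(16 + 16, n_classes)        # classifier on fused features

    def forward(self, mri, snps):
        fused = torch.cat([self.cnn(mri), self.mlp(snps)], dim=1)
        return self.head(fused)

model = MultimodalNet(n_snps=50)
logits = model(torch.randn(2, 1, 64, 64), torch.randn(2, 50))
```

Concatenating pooled features like this is the simplest (late) fusion strategy; it keeps each branch independently debuggable, which matters when the two modalities have such different preprocessing needs.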
Challenges we ran into
One of the major challenges was data heterogeneity. Combining high-dimensional MRI data with tabular genetic SNP data required careful pipeline design to normalize and process vastly different data structures simultaneously. Data quality posed another difficulty: real-world genetic data is often noisy and incomplete, requiring robust imputation and normalization strategies. Additionally, balancing interpretability with performance was challenging. Significant effort was spent ensuring that Grad-CAM visualizations highlighted biologically meaningful regions such as the hippocampus and ventricles rather than irrelevant artifacts.
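An imputation-plus-normalization step of the kind mentioned above can be sketched in a few lines of Pandas. This is a minimal illustration, not the project's actual strategy; the genotypes are coded 0/1/2 with missing calls as NaN, and the column names use well-known APOE SNP IDs purely as placeholders.

```python
import numpy as np
import pandas as pd

# Hypothetical SNP table: genotype dosage coded 0/1/2, missing calls as NaN.
df = pd.DataFrame({
    "rs429358": [0, 1, np.nan, 2, 1],
    "rs7412":   [2, np.nan, 1, 0, 0],
})

# Impute each SNP's missing calls with its mode (the most common genotype)...
imputed = df.apply(lambda col: col.fillna(col.mode().iloc[0]))

# ...then z-score normalize each column so the tabular model sees comparable scales.
normalized = (imputed - imputed.mean()) / imputed.std()
```

Mode imputation is a reasonable default for categorical-like genotype codes; for outliers, a clipping or winsorizing step would slot in between the two stages shown here.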
Accomplishments that we’re proud of
We successfully built a complete working prototype that transforms raw heterogeneous biomedical data into a deployable diagnostic tool. The model demonstrated biological validity, with Grad-CAM visualizations confirming reliance on brain regions historically associated with Alzheimer’s pathology, reinforcing clinical relevance. By deploying the system through a Gradio interface, we made advanced AI-driven healthcare tools accessible to non-technical users, contributing to the democratization of medical AI.
What we learned
This project highlighted the power of multimodal data, showing that combining genetics and imaging provides a more holistic understanding of Alzheimer’s disease than either modality alone. We also learned that trust in healthcare AI requires transparency, and that explainability tools such as SHAP and Grad-CAM are essential for real-world adoption. Finally, we gained insight into the complexity and promise of feature fusion, identifying future opportunities such as intermediate fusion strategies and the use of 3D convolutional neural networks.