AlzFusion: Multi-Modal Alzheimer's Disease Prediction System

"Genetics Meets Imaging, AI Predicts Alzheimer's"

About the Project

What Inspired Us

Alzheimer's disease affects millions worldwide, and early detection is crucial for effective intervention. Current diagnostic approaches often rely on single data modalities—either genetic risk factors or brain imaging—but clinicians in practice use both. We were inspired to create an AI system that mirrors this real-world approach by fusing multiple data sources to provide more accurate and comprehensive predictions.

The challenge of combining fundamentally different data types—genetic variants (tabular) and MRI brain scans (images)—presented an exciting opportunity to explore multi-modal deep learning and attention mechanisms, cutting-edge techniques that could unlock new insights in medical AI.

What We Learned

Building AlzFusion taught us several valuable lessons:

Multi-Modal Fusion is Powerful: Combining genetic and imaging data significantly improves prediction accuracy compared to single-modality approaches. The attention mechanism revealed fascinating relationships between specific genetic markers and brain imaging patterns.
Attention Mechanisms Provide Interpretability: Unlike simple concatenation, our cross-modal attention mechanism allows us to understand which genetic variants are most relevant to which brain regions—providing valuable insights for researchers and clinicians.
Data Alignment Challenges: Working with real-world medical data taught us the importance of robust data handling. We implemented flexible data loaders that gracefully handle mismatched dataset sizes and missing data.
Transfer Learning Works: Using pre-trained ResNet18 for MRI encoding dramatically improved performance and training efficiency, demonstrating the value of transfer learning in medical imaging.
End-to-End Training Matters: Jointly training both encoders and the fusion mechanism outperformed training them separately, showing the importance of learning modality-specific representations in the context of fusion.

How We Built It

Our project follows a systematic, research-driven approach:

Phase 1: Architecture Design

We designed a dual-stream architecture with:

Genetic Encoder: Fully connected neural network (256→128→64) with batch normalization and dropout
MRI Encoder: ResNet18 backbone with custom projection head (512→256→64)
Cross-Modal Attention: Bidirectional attention mechanism learning genetic-imaging relationships
Fusion & Classification: Multi-layer fusion network with 9-class output

Phase 2: Data Pipeline

Implemented custom PyTorch Dataset class handling both genetic features and MRI images
Created robust data loaders with automatic size alignment
Added data augmentation for MRI images (flips, rotations) to improve generalization

Phase 3: Training Infrastructure

Built comprehensive training loop with validation, checkpointing, and early stopping
Integrated TensorBoard for real-time monitoring
Implemented learning rate scheduling and gradient clipping for stable training

Phase 4: Evaluation & Visualization

Created detailed metrics system (accuracy, precision, recall, F1-score)
Built confusion matrix visualization
Added attention weight visualization for interpretability

Phase 5: Production-Ready Scripts

Modular codebase with clear separation of concerns
Command-line interfaces for training, evaluation, and inference
Comprehensive error handling and logging

Challenges We Faced

Dataset Size Mismatch: The genetic dataset (6,346 samples) and MRI dataset (5,120 samples) had different sizes. We solved this by implementing flexible data alignment that uses the minimum available size while maintaining data integrity.
Memory Constraints: Training with both modalities required significant GPU memory. We optimized by:
- Using mixed precision training
- Implementing efficient data loading with proper batching
- Using gradient accumulation for larger effective batch sizes
Feature Alignment: Ensuring genetic features (130 dimensions) and MRI embeddings (64 dimensions) could be effectively fused required careful architecture design. The attention mechanism solved this by learning optimal alignments.
Class Imbalance: The dataset had uneven class distributions. We addressed this by:
- Using weighted loss functions
- Implementing stratified sampling
- Focusing on macro-averaged metrics
Interpretability: Making the "black box" neural network interpretable was crucial for medical applications. The attention mechanism provides visualizations showing which genetic-imaging relationships the model finds important.

Key Innovations

Bidirectional Cross-Modal Attention: Unlike unidirectional attention, our model learns relationships in both directions—how genetics influence imaging features and vice versa.
Modular Architecture: Each component (genetic encoder, MRI encoder, fusion) can be trained independently or jointly, providing flexibility for different use cases.
Production-Ready Pipeline: Complete system from data loading to inference, suitable for real-world deployment.

Built With

Languages & Frameworks

Python 3.8+ - Core programming language
PyTorch 2.0+ - Deep learning framework
Torchvision - Pre-trained models and image transforms

Libraries & Tools

NumPy - Numerical computations
Pandas - Data manipulation
Scikit-learn - Machine learning utilities
Matplotlib & Seaborn - Visualization
Pillow - Image processing
PyArrow - Parquet file handling
Category Encoders - Feature encoding
TensorBoard - Training visualization

Architecture & Techniques

ResNet18 - Pre-trained CNN backbone for MRI encoding
Attention Mechanisms - Cross-modal attention for fusion
Transfer Learning - Leveraging ImageNet pre-trained weights
Batch Normalization - Training stability
Dropout - Regularization
Learning Rate Scheduling - Adaptive learning rate optimization

Development Tools

Git - Version control
Jupyter Notebooks - Data exploration and prototyping
VS Code / PyCharm - IDE

Cloud & Deployment (Optional)

CUDA - GPU acceleration
Docker (for containerization, if deployed)

Built With

Updates

Harsha k started this project — Dec 07, 2025 07:03 AM EST

Leave feedback in the comments!

Log in or sign up for Devpost to join the conversation.