AlzFusion: Multi-Modal Alzheimer's Disease Prediction System
"Genetics Meets Imaging, AI Predicts Alzheimer's"
About the Project
What Inspired Us
Alzheimer's disease affects millions worldwide, and early detection is crucial for effective intervention. Current diagnostic approaches often rely on single data modalities—either genetic risk factors or brain imaging—but clinicians in practice use both. We were inspired to create an AI system that mirrors this real-world approach by fusing multiple data sources to provide more accurate and comprehensive predictions.
The challenge of combining fundamentally different data types—genetic variants (tabular) and MRI brain scans (images)—presented an exciting opportunity to explore multi-modal deep learning and attention mechanisms, cutting-edge techniques that could unlock new insights in medical AI.
What We Learned
Building AlzFusion taught us several valuable lessons:
Multi-Modal Fusion is Powerful: Combining genetic and imaging data significantly improves prediction accuracy compared to single-modality approaches. The attention mechanism revealed fascinating relationships between specific genetic markers and brain imaging patterns.
Attention Mechanisms Provide Interpretability: Unlike simple concatenation, our cross-modal attention mechanism allows us to understand which genetic variants are most relevant to which brain regions—providing valuable insights for researchers and clinicians.
Data Alignment Challenges: Working with real-world medical data taught us the importance of robust data handling. We implemented flexible data loaders that gracefully handle mismatched dataset sizes and missing data.
Transfer Learning Works: Using pre-trained ResNet18 for MRI encoding dramatically improved performance and training efficiency, demonstrating the value of transfer learning in medical imaging.
End-to-End Training Matters: Jointly training both encoders and the fusion mechanism outperformed training them separately, showing the importance of learning modality-specific representations in the context of fusion.
How We Built It
Our project follows a systematic, research-driven approach:
Phase 1: Architecture Design
We designed a dual-stream architecture with:
- Genetic Encoder: Fully connected neural network (256→128→64) with batch normalization and dropout
- MRI Encoder: ResNet18 backbone with custom projection head (512→256→64)
- Cross-Modal Attention: Bidirectional attention mechanism learning genetic-imaging relationships
- Fusion & Classification: Multi-layer fusion network with 9-class output
Phase 2: Data Pipeline
- Implemented custom PyTorch
Datasetclass handling both genetic features and MRI images - Created robust data loaders with automatic size alignment
- Added data augmentation for MRI images (flips, rotations) to improve generalization
Phase 3: Training Infrastructure
- Built comprehensive training loop with validation, checkpointing, and early stopping
- Integrated TensorBoard for real-time monitoring
- Implemented learning rate scheduling and gradient clipping for stable training
Phase 4: Evaluation & Visualization
- Created detailed metrics system (accuracy, precision, recall, F1-score)
- Built confusion matrix visualization
- Added attention weight visualization for interpretability
Phase 5: Production-Ready Scripts
- Modular codebase with clear separation of concerns
- Command-line interfaces for training, evaluation, and inference
- Comprehensive error handling and logging
Challenges We Faced
Dataset Size Mismatch: The genetic dataset (6,346 samples) and MRI dataset (5,120 samples) had different sizes. We solved this by implementing flexible data alignment that uses the minimum available size while maintaining data integrity.
Memory Constraints: Training with both modalities required significant GPU memory. We optimized by:
- Using mixed precision training
- Implementing efficient data loading with proper batching
- Using gradient accumulation for larger effective batch sizes
Feature Alignment: Ensuring genetic features (130 dimensions) and MRI embeddings (64 dimensions) could be effectively fused required careful architecture design. The attention mechanism solved this by learning optimal alignments.
Class Imbalance: The dataset had uneven class distributions. We addressed this by:
- Using weighted loss functions
- Implementing stratified sampling
- Focusing on macro-averaged metrics
Interpretability: Making the "black box" neural network interpretable was crucial for medical applications. The attention mechanism provides visualizations showing which genetic-imaging relationships the model finds important.
Key Innovations
Bidirectional Cross-Modal Attention: Unlike unidirectional attention, our model learns relationships in both directions—how genetics influence imaging features and vice versa.
Modular Architecture: Each component (genetic encoder, MRI encoder, fusion) can be trained independently or jointly, providing flexibility for different use cases.
Production-Ready Pipeline: Complete system from data loading to inference, suitable for real-world deployment.
Built With
Languages & Frameworks
- Python 3.8+ - Core programming language
- PyTorch 2.0+ - Deep learning framework
- Torchvision - Pre-trained models and image transforms
Libraries & Tools
- NumPy - Numerical computations
- Pandas - Data manipulation
- Scikit-learn - Machine learning utilities
- Matplotlib & Seaborn - Visualization
- Pillow - Image processing
- PyArrow - Parquet file handling
- Category Encoders - Feature encoding
- TensorBoard - Training visualization
Architecture & Techniques
- ResNet18 - Pre-trained CNN backbone for MRI encoding
- Attention Mechanisms - Cross-modal attention for fusion
- Transfer Learning - Leveraging ImageNet pre-trained weights
- Batch Normalization - Training stability
- Dropout - Regularization
- Learning Rate Scheduling - Adaptive learning rate optimization
Development Tools
- Git - Version control
- Jupyter Notebooks - Data exploration and prototyping
- VS Code / PyCharm - IDE
Cloud & Deployment (Optional)
- CUDA - GPU acceleration
- Docker (for containerization, if deployed)
Log in or sign up for Devpost to join the conversation.