Inspiration

Alzheimer’s disease is often diagnosed too late, after significant and irreversible neurological damage has already occurred. At the same time, EEG provides a non-invasive, scalable, and relatively low-cost signal that has strong potential for early detection, yet remains underutilized.

We were particularly motivated by the gap between powerful machine learning models and their lack of interpretability in healthcare settings. Many existing approaches rely on deep learning systems that function as black boxes, making them difficult for clinicians to trust.

Our goal was to build a system that is not only accurate, but also grounded in neuroscience and capable of explaining its predictions in a meaningful way.

What it does

NeuroSignal is an EEG-based machine learning system that detects Alzheimer’s disease from brain signals.

Given raw EEG recordings, the system:
• Cleans and preprocesses the signal
• Extracts clinically meaningful features
• Aggregates patterns at the patient level
• Outputs:
  • A probability score for Alzheimer’s disease
  • A final classification (Alzheimer’s vs Control)

In addition, the system provides interpretability by identifying which neurological features contributed most to each prediction.

How we built it

We designed a full end-to-end pipeline combining signal processing and machine learning:

Preprocessing
• Bandpass filtering between 0.5–45 Hz
• Downsampling from 500 Hz to 128 Hz
• Removal of noisy or invalid signal segments
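The filtering and downsampling steps can be sketched roughly as follows. This is a minimal illustration, not our exact code: the Butterworth filter family, the filter order, the 19-channel layout, and the function name `preprocess` are all assumptions, and the data is synthetic. Segment rejection is not covered here.

```python
from math import gcd

import numpy as np
from scipy.signal import butter, sosfiltfilt, resample_poly

FS_IN = 500    # original sampling rate in Hz (from the write-up)
FS_OUT = 128   # target sampling rate in Hz (from the write-up)

def preprocess(eeg, fs_in=FS_IN, fs_out=FS_OUT, band=(0.5, 45.0), order=4):
    """Bandpass-filter and downsample an array of shape (n_channels, n_samples)."""
    # Zero-phase Butterworth bandpass; filter family and order are assumptions.
    sos = butter(order, band, btype="bandpass", fs=fs_in, output="sos")
    filtered = sosfiltfilt(sos, eeg, axis=-1)
    # 500 Hz * 32 / 125 = 128 Hz; resample_poly applies its own anti-aliasing filter.
    g = gcd(fs_in, fs_out)
    return resample_poly(filtered, fs_out // g, fs_in // g, axis=-1)

rng = np.random.default_rng(0)
raw = rng.standard_normal((19, 10 * FS_IN))   # 10 s of synthetic 19-channel data
clean = preprocess(raw)
print(clean.shape)
```

Because 500 and 128 share no simple integer ratio, a rational resampler (up by 32, down by 125) is used here instead of plain decimation.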

Windowing
• EEG signals segmented into overlapping 30-second windows
• Only high-quality windows retained
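A rough sketch of the windowing step is shown below. The 30-second length and 128 Hz rate come from the description above; the 50% overlap, the quality rule in `is_good`, and its thresholds are hypothetical placeholders, since the write-up does not specify them.

```python
import numpy as np

FS = 128          # sampling rate after downsampling (from the write-up)
WIN_SEC = 30      # window length in seconds (from the write-up)
OVERLAP = 0.5     # assumption: the overlap fraction is not stated in the write-up

def make_windows(eeg, fs=FS, win_sec=WIN_SEC, overlap=OVERLAP):
    """Split (n_channels, n_samples) into (n_windows, n_channels, win_len)."""
    win_len = int(win_sec * fs)
    step = int(win_len * (1 - overlap))
    starts = range(0, eeg.shape[1] - win_len + 1, step)
    return np.stack([eeg[:, s:s + win_len] for s in starts])

def is_good(window, max_ptp=150.0, min_std=1e-3):
    """Hypothetical quality rule: reject flat or very high-amplitude windows."""
    ptp = np.ptp(window, axis=-1)      # peak-to-peak amplitude per channel
    std = window.std(axis=-1)          # near-zero std indicates a flat channel
    return bool(np.all(ptp < max_ptp) and np.all(std > min_std))

rng = np.random.default_rng(0)
eeg = 10.0 * rng.standard_normal((19, 120 * FS))   # 2 min of synthetic data
eeg[0, 100 * FS] = 1000.0                          # inject one artifact at t = 100 s

windows = make_windows(eeg)
kept = np.stack([w for w in windows if is_good(w)])
print(windows.shape, kept.shape)
```

With these settings, the two windows that cover the injected artifact are dropped and the remaining five are kept.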

Feature Engineering
We extracted features grounded in neuroscience:
• Relative band power (delta, theta, alpha, beta, gamma)
• Band ratios (theta/alpha, delta/alpha, slowing index)
• Spectral entropy (signal complexity)
• Hjorth parameters (activity, mobility, complexity)
• Functional connectivity (coherence between brain regions)
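The features listed above could be computed along these lines for a single window. Treat this as a sketch under assumptions: the band edges, the Welch segment length, the slowing index definition (delta + theta over alpha + beta), and all function names are illustrative choices that the write-up does not specify.

```python
import numpy as np
from scipy.signal import welch, coherence

FS = 128  # sampling rate after downsampling (from the write-up)
# Assumed band edges in Hz; the write-up names the bands but not their limits.
BANDS = {"delta": (0.5, 4), "theta": (4, 8), "alpha": (8, 13),
         "beta": (13, 30), "gamma": (30, 45)}

def spectral_features(x, fs=FS):
    """Relative band power, band ratios, and normalized spectral entropy."""
    freqs, psd = welch(x, fs=fs, nperseg=4 * fs)
    keep = (freqs >= 0.5) & (freqs < 45)
    freqs, psd = freqs[keep], psd[keep]
    p = psd / psd.sum()                       # normalized spectrum
    feats = {name: float(p[(freqs >= lo) & (freqs < hi)].sum())
             for name, (lo, hi) in BANDS.items()}
    feats["theta_alpha"] = feats["theta"] / feats["alpha"]
    feats["delta_alpha"] = feats["delta"] / feats["alpha"]
    # Assumed definition of the slowing index.
    feats["slowing_index"] = ((feats["delta"] + feats["theta"])
                              / (feats["alpha"] + feats["beta"]))
    nz = p[p > 0]
    feats["spectral_entropy"] = float(-(nz * np.log(nz)).sum() / np.log(len(p)))
    return feats

def hjorth(x):
    """Hjorth activity, mobility, and complexity of a 1-D signal."""
    dx, ddx = np.diff(x), np.diff(x, n=2)
    activity = np.var(x)
    mobility = np.sqrt(np.var(dx) / activity)
    complexity = np.sqrt(np.var(ddx) / np.var(dx)) / mobility
    return float(activity), float(mobility), float(complexity)

def band_coherence(x, y, band, fs=FS):
    """Mean magnitude-squared coherence between two channels within a band."""
    freqs, cxy = coherence(x, y, fs=fs, nperseg=4 * fs)
    lo, hi = band
    return float(cxy[(freqs >= lo) & (freqs < hi)].mean())

# Synthetic 30 s window: a 10 Hz (alpha) rhythm plus noise on two channels.
rng = np.random.default_rng(0)
t = np.arange(30 * FS) / FS
x = np.sin(2 * np.pi * 10 * t) + 0.5 * rng.standard_normal(t.size)
y = np.sin(2 * np.pi * 10 * t + 0.3) + 0.5 * rng.standard_normal(t.size)

feats = spectral_features(x)
hj = hjorth(x)
coh = band_coherence(x, y, BANDS["alpha"])
print(feats["alpha"], hj, coh)
```

On this synthetic signal, alpha dominates the relative band power, which is the kind of sanity check that helps confirm the spectral features behave as expected before applying them to real recordings.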

Aggregation
• Window-level features aggregated into subject-level statistics (mean and variance)
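As a small worked example, aggregation turns a subject's (windows × features) matrix into a single vector of per-feature means followed by per-feature variances. The function name `aggregate_subject` and the use of population variance (NumPy's default, `ddof=0`) are assumptions.

```python
import numpy as np

def aggregate_subject(window_features):
    """Collapse (n_windows, n_features) into [means..., variances...]."""
    window_features = np.asarray(window_features, dtype=float)
    return np.concatenate([window_features.mean(axis=0),
                           window_features.var(axis=0)])  # ddof=0 (assumption)

# Three windows, two features, for one hypothetical subject.
windows = np.array([[0.2, 1.0],
                    [0.4, 3.0],
                    [0.6, 5.0]])
subject_vector = aggregate_subject(windows)
print(subject_vector)
```

Each subject then contributes exactly one row to the training matrix, which is what makes a subject-level split straightforward later on.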

Model
• Elastic-Net Logistic Regression
• Feature selection using SelectKBest
• Class balancing to address dataset imbalance
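One way to assemble these pieces with scikit-learn is sketched below. The hyperparameters (`k`, `l1_ratio`, `C`), the `f_classif` scoring function, the added `StandardScaler`, and the use of `class_weight="balanced"` for class balancing are assumptions, and the data is synthetic. The last part shows how per-prediction feature contributions can be read off a linear model as coefficient times standardized feature value.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

def build_model(k=10, l1_ratio=0.5, C=1.0):
    """Scaler -> SelectKBest -> elastic-net logistic regression (values assumed)."""
    return make_pipeline(
        StandardScaler(),
        SelectKBest(score_func=f_classif, k=k),
        LogisticRegression(penalty="elasticnet", solver="saga",
                           l1_ratio=l1_ratio, C=C,
                           class_weight="balanced", max_iter=5000),
    )

# Synthetic stand-in for the subject-level feature matrix (38 subjects).
rng = np.random.default_rng(0)
y = np.array([1] * 22 + [0] * 16)            # made-up class split
X = rng.standard_normal((38, 40))
X[:, :3] += 1.5 * y[:, None]                 # make three features informative
names = np.array([f"feat_{i}" for i in range(X.shape[1])])

model = build_model().fit(X, y)
proba = model.predict_proba(X)[:, 1]         # probability of the positive class

# Per-prediction contributions for the first subject.
scaler = model.named_steps["standardscaler"]
selector = model.named_steps["selectkbest"]
clf = model.named_steps["logisticregression"]
z = selector.transform(scaler.transform(X[:1]))[0]
contrib = clf.coef_[0] * z                   # sums (plus intercept) to the logit
order = np.argsort(-np.abs(contrib))
top_features = names[selector.get_support()][order][:3]
print(top_features, contrib[order][:3])
```

Because the contributions add up to the model's log-odds, each prediction can be decomposed exactly into named features, which is the main appeal of a linear model here.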

Validation
• Subject-level 5-fold cross-validation
• Threshold tuning using out-of-fold predictions
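A minimal sketch of this validation scheme follows, assuming one row per subject so that a stratified 5-fold split is automatically subject-level. The synthetic data, the pipeline hyperparameters, the threshold grid, and the choice of balanced accuracy as the tuning criterion are assumptions. Keeping scaling and feature selection inside the pipeline means they are refit on each training fold only.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import balanced_accuracy_score
from sklearn.model_selection import StratifiedKFold, cross_val_predict
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic subject-level data: one row per subject (38 subjects).
rng = np.random.default_rng(0)
y = np.array([1] * 22 + [0] * 16)            # made-up class split
X = rng.standard_normal((38, 40))
X[:, :3] += 1.5 * y[:, None]                 # make three features informative

model = make_pipeline(
    StandardScaler(),
    SelectKBest(f_classif, k=10),            # refit inside every training fold
    LogisticRegression(penalty="elasticnet", solver="saga", l1_ratio=0.5,
                       class_weight="balanced", max_iter=5000),
)

# Each subject is held out exactly once; its probability comes from a model
# that never saw that subject during fitting.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
oof = cross_val_predict(model, X, y, cv=cv, method="predict_proba")[:, 1]

# Pick the decision threshold that maximizes balanced accuracy on the
# out-of-fold probabilities (criterion and grid are assumptions).
thresholds = np.linspace(0.05, 0.95, 19)
scores = [balanced_accuracy_score(y, (oof >= t).astype(int)) for t in thresholds]
best_threshold = float(thresholds[int(np.argmax(scores))])
best_score = max(scores)
print(best_threshold, best_score)
```

If rows were individual windows rather than subjects, a grouped splitter such as scikit-learn's `StratifiedGroupKFold` with subject IDs as groups would be the analogous way to keep all of a subject's windows in the same fold.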

Challenges we ran into

• Limited dataset size (38 subjects): required careful regularization and feature selection to avoid overfitting
• Risk of data leakage: initial approaches using window-level splits led to inflated performance; we corrected this by enforcing strict subject-level validation
• Noisy EEG signals: required robust preprocessing and filtering to ensure signal quality
• Balancing interpretability and performance: more complex models could increase accuracy but reduce transparency

Accomplishments that we’re proud of

• Built a complete EEG-based diagnostic pipeline from raw signals to predictions
• Achieved strong performance with a balanced accuracy of 0.82
• Maintained full interpretability using a transparent model
• Demonstrated alignment between model features and known neurological biomarkers
• Avoided common pitfalls such as data leakage

What we learned

• Proper validation strategy is critical, especially with biomedical data
• Simpler, well-regularized models can outperform complex ones on small datasets
• Feature engineering grounded in domain knowledge is extremely powerful
• Interpretability is not just a bonus, but a necessity in healthcare applications

What’s next for NeuroSignal

• Validate the model on larger and more diverse clinical datasets
• Extend the system to detect earlier stages such as mild cognitive impairment (MCI)
• Explore hybrid models combining deep learning with interpretability
• Develop a real-time EEG analysis interface for clinical use
• Improve robustness across different EEG acquisition systems
