GitHub

https://github.com/iponugoti/ADDetection.git

Final Writeup

https://docs.google.com/document/d/1SCbFpK-Bs91rMj2miwl_qgmh24kHxR_4KWgi8l112AQ/edit?usp=sharing

Check-In 3 Reflection

https://docs.google.com/document/d/1pXjM1QICj1LkUdQwTd_9xf2-nmvbLKQvCMOlHJOqnn4/edit?usp=sharing

Who

Karis Ma, jma78

Timothy Pyon, tpyon

Isha Ponugoti, iponugot

Raima Islam, rislam9

Introduction

According to the National Institute of Neurological Disorders and Stroke, approximately 50 million Americans are affected by neurodegenerative diseases each year, yet missed and delayed diagnoses of such disorders remain extremely common. Among these, Alzheimer’s Disease (AD) is one of the most common and costly conditions – without a slowdown in AD diagnoses or a dramatic increase in early treatment, aggregate care costs for AD patients are expected to grow from $203 billion in 2013 to $1.2 trillion in 2050. There is a clear need for improvement in AD treatment and identification. Although no cure for AD exists, medications are available that can delay its progression, and it is important to get these medications to patients early to improve their quality of life and delay AD as much as possible. A paper by Golovanevsky et al. used a deep learning approach with clinical, genetic, and imaging data to diagnose AD. Our proposed extension of that paper will build a model with improved diagnostic accuracy and move toward early detection of AD in order to provide patients with the best care possible. We propose an analysis that builds upon this work by diversifying data sources to other genres included in the ADNI database, potentially including but not limited to PET scans, clinical notes, and biomarkers.

Related Work

As data becomes more abundant and multimodal deep learning techniques grow more robust, the application of deep neural networks to AD detection continues to gain traction. Since launching in 2004, the Alzheimer’s Disease Neuroimaging Initiative (ADNI) – the most comprehensive effort to identify neuroimaging measures and biomarkers associated with cognitive and functional changes in healthy elderly, Mild Cognitive Impairment (MCI), and AD subjects – has significantly enhanced data accessibility.

Research in the field has explored various combinations of data modalities and methods. The most prevalent combinations include MRI, PET, and clinical data [Bucholc et al., Abuhmed et al.], followed by MRI, Single Nucleotide Polymorphism (SNP), and clinical data [Venugopalan et al., Golovanevsky et al.]. These studies have contributed to our baseline understanding of multimodal approaches in AD detection.

A notable advancement since the publication of MADDi (the multimodal attention-based model of Golovanevsky et al.) is the Multi-Modal Mixing Transformer (3MT), developed by Liu et al. This study introduced a multi-modal classification approach that handles incomplete data, avoiding the need to discard samples when one or more data modalities are unavailable for a given patient. The model, which combines Deep Learning (DL) with attention mechanisms and uses MRI, SNP, and clinical data, is distinctive in being trained on the ADNI dataset and then tested on both ADNI and the Australian Imaging, Biomarker & Lifestyle Flagship Study of Ageing (AIBL) without any fine-tuning or retraining. Using a combination of Convolutional Neural Networks (CNNs), transformers, and Cross-Modal Transformers (CMTs), the study reported an unprecedented accuracy of 99.4%.

Building on these findings, our study explores whether a combination of MRI, SNP, clinical, and PET data can achieve similarly high accuracy while improving generalizability. To tackle the generalizability challenge in deep-learning-based AD detection, our research extends beyond the ADNI dataset to include testing and validation on both AIBL and the ADNI Department of Defense (ADNIDOD) datasets. Furthermore, we implement techniques to manage incomplete data sources – specifically, masking missing modalities – to enhance the robustness and applicability of our model across diverse datasets. Ultimately, our objective is to fill the existing gap in the generalizability of AD detection models by integrating multiple data sources and addressing the issues associated with incomplete data.

Public implementations of the paper:

  • MADDi is open source (in TensorFlow)

Data

We have put in a request to the ADNI database (https://adni.loni.usc.edu/data-samples/data-types/) for ADNI, AIBL, and ADNIDOD data. If our ADNI data request is accepted, we plan to draw diverse forms of data from that database. This would include various types of data that were not used in the original paper, potentially including but not limited to PET scans, clinical notes, and biomarkers. If the ADNI team grants us access to other novel data sources, we would love to include that as well. Since a variable number of participants consented to data collection for each type of data, we will likely further refine our data types to ensure that we have a statistically significant amount of observations. Other complications might also necessitate shifts in the data sources used – after all, a large part of the project's goal is to explore the use of an existing model with different data modalities and types. However, we will need access to the ADNI dataset to fully evaluate the feasibility of different data types.

If we are not granted this access, we will pivot to one of several other databases concerning neurodegenerative diseases. Potential alternatives include the BrainLat project (https://www.nature.com/articles/s41597-023-02806-8), which compiles open-source multimodal neuroimaging data on neurodegeneration, and the OASIS Brains Datasets (https://oasis-brains.org/#data), which contain, across various iterations, data including but not limited to neuroimaging, clinical, and cognitive measures.

We are currently unsure of the size of the data we will be granted access to, but based on the number of observations in the original paper, our dataset is unlikely to contain an overly large number of discrete samples (<2,500 per data type, and often in the ballpark of 200-300). We will definitely need to perform preprocessing, which will differ for each type of data. For instance, image data will likely require convolution to extract essential features, while clinical notes may require natural language processing to standardize the varied observations of different clinicians into a uniform corpus. Biomarkers and other kinds of genetic data may require additional filtering to determine the markers most relevant to AD diagnosis, since genetic data is often large and unwieldy to begin with. Further, we will have to aggregate and standardize samples across the three studies (ADNI, AIBL, and ADNIDOD). Overall, we may need some clinical knowledge to perform this preprocessing – in addition to determining the most appropriate genetic information to use, we might have to perform tasks such as isolating key medical terms in clinical notes.
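As a minimal sketch of the cross-study aggregation step described above: the toy records and column names (`subject_id`, `mmse`, `study`) below are hypothetical assumptions – the real ADNI, AIBL, and ADNIDOD schemas will differ – but the pattern of renaming columns onto one shared schema and z-scoring features within each study to reduce batch effects is the kind of harmonization we expect to need:

```python
import pandas as pd

# Hypothetical toy records; real ADNI/AIBL/ADNIDOD column names will differ.
adni = pd.DataFrame({"subject_id": ["A1", "A2"], "mmse": [24, 29],
                     "study": ["ADNI", "ADNI"]})
aibl = pd.DataFrame({"patient": ["B7", "B8"], "MMSE_score": [28, 21]})

# Map the AIBL columns onto the ADNI naming scheme before merging.
aibl = aibl.rename(columns={"patient": "subject_id", "MMSE_score": "mmse"})
aibl["study"] = "AIBL"

combined = pd.concat([adni, aibl], ignore_index=True)

# Z-score each clinical feature within its study to reduce batch effects.
combined["mmse_z"] = combined.groupby("study")["mmse"].transform(
    lambda s: (s - s.mean()) / s.std())
```

The same rename-concat-standardize pattern would extend to any clinical feature shared across the studies.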

Methodology

We are training a multi-class classification model to detect AD, mild cognitive impairment (MCI), and no cognitive decline using imaging (MRI and PET scans), genetic, and clinical data. We will use both self-attention and cross-modal attention to capture interactions between modalities. Most of the model architecture will be adopted from the paper and translated into PyTorch – this is justifiable given that the main goal of our project is to explore the utility of the paper’s findings against different forms of data. We have also spoken with one of the paper’s authors (Michal Golovanevsky, a TA for this course) about translating it into PyTorch, and have been advised that this is both feasible and likely more appropriate than the TensorFlow the paper was originally written in. Given that the model architecture is not overly complex, we should be able to run it locally. If our preprocessing steps add computational needs, we will look into department resources such as OSCAR or Hydra.
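To illustrate the cross-modal attention component, here is a minimal PyTorch sketch in which one modality's embeddings attend over another's. The embedding size, head count, and tensor shapes are illustrative assumptions, not values taken from the paper:

```python
import torch
import torch.nn as nn

class CrossModalAttention(nn.Module):
    """One modality (query) attends over another (context).

    Sketch only: embed_dim and num_heads are assumed values,
    not hyperparameters from the original paper.
    """
    def __init__(self, embed_dim=64, num_heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(embed_dim, num_heads,
                                          batch_first=True)
        self.norm = nn.LayerNorm(embed_dim)

    def forward(self, query_mod, context_mod):
        # query_mod attends over context_mod (e.g. clinical over MRI).
        attended, _ = self.attn(query_mod, context_mod, context_mod)
        # Residual connection plus layer norm, as is standard for attention.
        return self.norm(query_mod + attended)

# Example: clinical embeddings attending to MRI embeddings.
clinical = torch.randn(8, 1, 64)   # (batch, tokens, embed_dim)
mri = torch.randn(8, 1, 64)
fusion = CrossModalAttention()
out = fusion(clinical, mri)        # same shape as the query: (8, 1, 64)
```

Self-attention is the special case where a modality attends over itself (`fusion(x, x)`).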

We predict the hardest part of implementing our model will be data aggregation and preprocessing. Because we are aggregating ADNI, AIBL, and ADNIDOD data, we will have to ensure each source is preprocessed similarly to increase model accuracy. Further, since our dataset differs from that of the original paper (2021 ADNI data), it will take some trial and error to bring the model up to the same high accuracy. Additionally, as we combine modalities, we may lose a significant amount of data. Unlike the original paper, we plan to mask a modality with zeros when it is missing for a specific patient; this keeps missing values from propagating down the layers and allows prediction even with incomplete data.
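The zero-masking idea above can be sketched as follows. The `mask_missing` helper, modality count, and embedding size are all hypothetical, chosen only for illustration:

```python
import torch

def mask_missing(embeddings, present):
    """Zero out per-modality embeddings for patients missing that modality.

    embeddings: (batch, n_modalities, embed_dim) stacked modality embeddings
    present:    (batch, n_modalities) boolean, True where the modality exists
    """
    # Broadcasting the (batch, n_modalities, 1) mask zeros whole rows.
    return embeddings * present.unsqueeze(-1).float()

# Two patients, three modalities (e.g. MRI, clinical, genetic), embed_dim 4.
embeddings = torch.ones(2, 3, 4)
present = torch.tensor([[True, True, False],
                        [True, False, True]])
masked = mask_missing(embeddings, present)
# Rows corresponding to absent modalities are now all zeros.
```

Downstream attention layers then receive a zero vector for an absent modality instead of the sample being dropped entirely.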

Metrics

  • Base goal: Match the accuracy reported in the paper (96.88%)
  • Target goal: Reach an accuracy within 5% of our base goal while utilizing a different dataset, and find an optimal set of modalities to use with the model
  • Reach goal: Achieve similar accuracy in situations where data is missing

Ethics

Interpretability and transparency: In medical applications of deep learning, interpretability is crucial to ensure both clinician and patient confidence. Our Alzheimer's Disease (AD) detection system utilizes a multimodal model with attention mechanisms, which can inherently function as a "black box." To mitigate this, it is important to develop methods that offer insights into the model's decision-making processes. This transparency is essential not only for trust but also for clinicians to appropriately utilize and verify the AI-generated diagnoses.

Representation and fairness: The Alzheimer’s Disease Neuroimaging Initiative (ADNI) dataset, like many medical datasets across the field, suffers from a lack of diversity and is not representative of the entire population that needs treatment. Approximately 80% of its participants are white and college-educated. This underrepresentation can skew the AI model's performance, potentially leading to higher diagnostic accuracy for certain demographic groups over others. This disparity could inadvertently perpetuate existing healthcare inequalities, leading to poorer health outcomes for underrepresented groups. To address this, we plan to diversify training data as much as possible (through AIBL, ADNIDOD, and potentially BrainLat) or develop methods to adjust for these imbalances in the model's training process.

Ethical implications of early detection: Finally, although earlier AD detection would drastically improve patient outcomes by allowing for earlier intervention, it also raises ethical concerns. For example, knowledge of a likely future decline in cognitive health can impact a patient's life and mental health profoundly. Additionally, there are potential implications for how such information could be used by insurance companies, possibly affecting coverage and costs for the patient. Therefore, it is critical to handle such data and its implications with extreme care, ensuring that patients fully understand the consequences of early AD detection. Moreover, policies and safeguards should be in place to prevent misuse of this sensitive health information.

Division of Work

  • ADNI data request: Isha
  • Preprocessing:
    • Imaging data: Tim (MRI) and Raima (PET)
    • Clinical data: Karis
    • Genetic data: Isha
  • Model implementation:
    • Training: Tim and Karis
    • Validation: Isha and Raima
    • Testing: Isha and Raima
  • Final writeup: everyone
