The GitHub link in the "Try It Out" section leads to a repository with all of our reports, model cards, and code for this hackathon. Any written documents needed for this submission can be found in the docs/ branch.
Inspiration
Alzheimer's disease affects millions of people around the world, yet early detection is uncommon. The symptoms are subtle and are often mistaken for normal aging. After learning how long it takes a patient to get a clear diagnosis, we decided to build two research models that could help detect patterns in clinical, biomarker, and MRI data that humans might miss or ignore.
What it does
This webpage classifies a subject based on the data the user provides. There are two models: an MRI-focused one and a clinical/biomarker-focused one. The MRI model takes an image, converts it to a 128x128x1 grayscale image, and then classifies it into one of 4 levels of dementia. The clinical/biomarker model takes multiple numerical inputs from the user and then classifies the subject as cognitively normal, mildly cognitively impaired, or having Alzheimer's disease.
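The MRI preprocessing step described above could look roughly like the sketch below. It is an illustration, not the code from our repository: the function name `preprocess_mri`, the nearest-neighbour resize, the assumption of 8-bit input, and the scaling to [0, 1] are all assumptions made for this example.

```python
import numpy as np

def preprocess_mri(image: np.ndarray, size: int = 128) -> np.ndarray:
    """Turn an H x W (grayscale) or H x W x 3 (RGB) 8-bit array
    into a size x size x 1 float array with values in [0, 1].

    Hypothetical helper: resize method and scaling are assumptions.
    """
    img = np.asarray(image, dtype=np.float32)
    if img.ndim == 3:
        # Collapse colour channels into one luminance channel (BT.601 weights).
        img = img[..., :3] @ np.array([0.299, 0.587, 0.114], dtype=np.float32)
    # Nearest-neighbour resize: choose a source row/column for each target pixel.
    rows = np.arange(size) * img.shape[0] // size
    cols = np.arange(size) * img.shape[1] // size
    img = img[rows][:, cols]
    # Scale 8-bit intensities to [0, 1] and add the trailing channel axis.
    return (img / 255.0)[..., np.newaxis]
```

Calling `preprocess_mri(scan)` on a 2-D or RGB array returns an array of shape `(128, 128, 1)`, which is the input shape the MRI model expects.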
How we built it
We trained two separate models: an XGBoost classifier for the clinical/biomarker data and a custom CNN for the MRI images. The MRI model learns spatial patterns in the images to classify them into 4 dementia stages, while the clinical model uses an ensemble of gradient-boosted decision trees to assign the subject to one of three classes based on patterns in the numerical inputs. Both models were then evaluated on a held-out test set to check accuracy and other metrics, and we tied everything together with a Streamlit interface.
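As a rough illustration of the held-out evaluation mentioned above, here is a minimal sketch of a stratified split that keeps each class proportionally represented in the test set. The function name `stratified_holdout`, the 20% test fraction, and the seed are assumptions for this example; they are not necessarily the settings we used.

```python
import numpy as np

def stratified_holdout(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx) so that every class contributes
    roughly `test_fraction` of its samples to the test set.

    Hypothetical helper: fraction and seed are illustrative defaults.
    """
    labels = np.asarray(labels)
    rng = np.random.default_rng(seed)
    train_parts, test_parts = [], []
    for cls in np.unique(labels):
        # Shuffle the indices of this class, then carve off the test share.
        idx = rng.permutation(np.flatnonzero(labels == cls))
        n_test = max(1, int(round(test_fraction * len(idx))))
        test_parts.append(idx[:n_test])
        train_parts.append(idx[n_test:])
    return np.concatenate(train_parts), np.concatenate(test_parts)
```

The test indices are set aside before training and only used for the final metrics, so the reported scores reflect data the model has not seen.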
Challenges we ran into
One major challenge we ran into was finding the right dataset for the clinical/biomarker data. We decided to move away from the superficial data that is common on Kaggle and to look for real, professional datasets instead. However, no dataset that we found had everything we needed. After a bit of digging we eventually found the ADNI (Alzheimer's Disease Neuroimaging Initiative) dataset, and from there we had to apply and get approved before gaining access.
Accomplishments that we're proud of
For both of our models we managed to get very high accuracy, recall, and weighted F1 scores, showing that the models classify well and rarely produce false positives or false negatives. To achieve this we had to add noise, merge on different bases, and tune the parameters of our models, but in the end we got good results.
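For readers unfamiliar with these metrics, the sketch below shows how accuracy, weighted recall, and weighted F1 can be computed from a confusion matrix. It is a self-contained NumPy illustration with a hypothetical function name (`weighted_metrics`), not our evaluation script, and the labels in the usage example are made up.

```python
import numpy as np

def weighted_metrics(y_true, y_pred, n_classes):
    """Accuracy, support-weighted recall, and support-weighted F1.

    Hypothetical helper; labels must be integers in [0, n_classes).
    """
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    # Confusion matrix: rows are true classes, columns are predicted classes.
    cm = np.zeros((n_classes, n_classes), dtype=int)
    np.add.at(cm, (y_true, y_pred), 1)
    tp = np.diag(cm).astype(float)
    support = cm.sum(axis=1)    # number of true samples per class
    predicted = cm.sum(axis=0)  # number of predictions per class
    zeros = np.zeros_like(tp)
    recall = np.divide(tp, support, out=zeros.copy(), where=support > 0)
    precision = np.divide(tp, predicted, out=zeros.copy(), where=predicted > 0)
    denom = precision + recall
    f1 = np.divide(2 * precision * recall, denom, out=zeros.copy(), where=denom > 0)
    weights = support / support.sum()
    return {
        "accuracy": float(tp.sum() / cm.sum()),
        "weighted_recall": float(recall @ weights),
        "weighted_f1": float(f1 @ weights),
    }

# Made-up labels for three classes, purely for illustration.
scores = weighted_metrics([0, 0, 1, 1, 2, 2], [0, 1, 1, 1, 2, 0], n_classes=3)
```

Note that support-weighted recall always equals accuracy, so weighted F1 is the figure that also reflects per-class precision.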
What we learned
First of all, through our initial research we learned what Alzheimer's disease is and how much early detection matters. We also learned how to properly find and apply for access to professional datasets. Finally, we learned how to use XGBoost and other tree-based methods for our models and optimizations.
What's next for Early Detection of Alzheimer's Through MRI and Clinical Data
As we continue this project, we plan to combine both of our models into one final multimodal model. We expect this to improve accuracy and reliability while reducing the overgeneralization that can show up in the smaller, independent models. We would also continue looking for more datasets so we could add different variables such as speech and sleep quality.
Built With
- adni
- numpy
- pandas
- python
- scikit-learn
- streamlit
- tensorflow
- xgboost