Predicting Alzheimer's-Causing Mutations in Females

Inspiration

Alzheimer’s disease is the most common cause of dementia, a condition that causes loss of cognitive brain function and affects over 55 million people worldwide. The disease is generally not known to have a single genetic cause. Instead, it is influenced by many genes as well as environmental factors and lifestyle. Even so, about two-thirds of the people with Alzheimer’s are women. Using bioinformatics tools, we investigate the risk of developing Alzheimer’s through genomics. This lets us better understand the genetic factors contributing to Alzheimer’s in women and may provide insights into developing targeted therapies.

What it does

Our model aims to predict whether a genetic sequence contains mutations with a high likelihood of leading to Alzheimer's disease. The input is a genetic sequence, and the output is either 0 or 1, where 1 indicates a high probability that the sequence contains Alzheimer's-related mutations and 0 indicates that it does not.
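As a rough illustration of the input and output described above, the sketch below wraps a scoring function so that a sequence goes in and 0 or 1 comes out. The names `validate_sequence` and `predict_label`, the 0.5 threshold, and the allowed alphabet are assumptions made for this example, not details of our actual code, and `score_fn` stands in for whatever trained model produces the probability.

```python
from typing import Callable

# Assumed alphabet: the four DNA bases plus N for unknown positions.
VALID_BASES = set("ACGTN")


def validate_sequence(seq: str) -> str:
    """Normalize a DNA sequence to upper case and reject unexpected characters."""
    seq = seq.strip().upper()
    if not seq:
        raise ValueError("sequence is empty")
    unexpected = set(seq) - VALID_BASES
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    return seq


def predict_label(seq: str,
                  score_fn: Callable[[str], float],
                  threshold: float = 0.5) -> int:
    """Return 1 if score_fn rates the sequence at or above threshold, else 0.

    score_fn is a placeholder for a trained model that maps a sequence
    to a probability between 0 and 1 (hypothetical interface).
    """
    probability = score_fn(validate_sequence(seq))
    return int(probability >= threshold)


# Toy usage with stand-in scoring functions instead of a real model.
high_risk = predict_label("acgtacgt", lambda s: 0.9)
low_risk = predict_label("ACGTACGT", lambda s: 0.1)
```

The threshold is kept as a parameter because the cutoff for "high probability" is a design choice that would need to be tuned on validation data.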

How we built it

Our prediction approach uses machine learning. We filtered data from the NIAGADS Data Sharing Service (DSS) to keep only mutations with p-values less than 0.05, then preprocessed neutral (unmutated) genetic data from GenomeKit by applying transformations derived from that mutation data. Finally, we used Borzoi from gReLU, combined with a random forest classifier, to train a model that predicts whether an input genetic sequence contains Alzheimer's-related mutations.
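The filtering and transformation steps described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not our actual pipeline: the `Variant` record, its 0-based `position` field, and the helper names are hypothetical, the reference string is a toy example rather than data from GenomeKit, and the real NIAGADS files have their own schema. The Borzoi and random forest stages are not shown.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Variant:
    """Hypothetical record for one mutation and its association p-value."""
    position: int    # 0-based offset into the reference sequence (assumption)
    ref: str         # allele expected in the reference
    alt: str         # allele that replaces it
    p_value: float


def filter_significant(variants: List[Variant], alpha: float = 0.05) -> List[Variant]:
    """Keep only variants whose p-value is strictly below alpha."""
    return [v for v in variants if v.p_value < alpha]


def apply_variant(reference: str, variant: Variant) -> str:
    """Return a copy of the reference with the variant's alt allele substituted."""
    end = variant.position + len(variant.ref)
    observed = reference[variant.position:end]
    if observed != variant.ref:
        raise ValueError(
            f"reference has {observed!r} at {variant.position}, expected {variant.ref!r}"
        )
    return reference[:variant.position] + variant.alt + reference[end:]


def build_examples(reference: str,
                   variants: List[Variant],
                   alpha: float = 0.05) -> List[Tuple[str, int]]:
    """Label the unmutated sequence 0 and each significantly mutated copy 1."""
    examples = [(reference, 0)]
    for variant in filter_significant(variants, alpha):
        examples.append((apply_variant(reference, variant), 1))
    return examples


# Toy data: one variant passes the p < 0.05 filter, the other does not.
reference = "ACGTACGTAC"
variants = [
    Variant(position=2, ref="G", alt="T", p_value=0.01),
    Variant(position=5, ref="C", alt="A", p_value=0.20),
]
examples = build_examples(reference, variants)
```

Checking the reference allele before substituting guards against coordinate mix-ups, such as 0-based versus 1-based positions, which are easy to introduce when combining data sources.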

Challenges we ran into

The major challenges we faced were setting up the environment and working within limited training time. We also had to quickly learn how genomics and genetic sequences relate to Alzheimer’s disease, a complex and unfamiliar topic for us. Through teamwork and by seeking support, we were able to overcome these obstacles and move forward.

Accomplishments that we're proud of

We’re proud of building a model and of the teamwork, problem-solving, and innovative thinking that went into it. Our goal was to create a solution with real-world impact, with a particular focus on empowering underrepresented groups like women in tech and on addressing critical challenges in healthcare.

What we learned

We learned that improving disease-causing mutation prediction relies on better data quality, refining models, and increasing accessibility. Enhancing datasets and model accuracy enables more reliable predictions, while broader accessibility allows for practical use in predicting diseases like Alzheimer’s and identifying harmful mutations, aiding early detection and personalized care.

What's next for Predicting Disease Causing Mutations

The next steps for predicting disease-causing mutations involve improving the quality and completeness of training data, refining the model for better accuracy and interpretability, and making it accessible to clinicians and researchers through user-friendly platforms. The model should be capable of predicting whether an individual is at risk for diseases like Alzheimer’s and identifying the exact location of harmful mutations in the genome. By enhancing the data, model, and accessibility, this approach can provide powerful tools for early detection and personalized treatment.