I wanted to encourage people to get screened earlier than they think they need to. The idea is that predicting the age at which someone was diagnosed with cancer gives a reference point. From there, the predictor could be extended to people who do not have cancer but carry some number N of mutations: given the rate at which mutations accumulate, it could estimate when they are at risk, so they get screened earlier and can be treated earlier.
What it does
Given a genome sequence, it predicts the age at which someone was diagnosed with cancer.
How I built it
Using PyTorch, I built several different models (an RNN, a feed-forward NN, and logistic and linear classifiers) to see how each would behave on the TCGA dataset.
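For a rough idea of what one of these models could look like, here is a minimal feed-forward sketch in PyTorch. It is not the project's actual code. The feature count, hidden size, and number of age bins are made-up values, the input is random stand-in data rather than TCGA, and treating the task as classification over age bins is an assumption.

```python
import torch
from torch import nn

# Hypothetical sizes, chosen only for illustration.
NUM_FEATURES = 1000  # e.g. one 0/1 flag per gene: mutated or not
HIDDEN = 64
NUM_AGE_BINS = 6     # assumed: age at diagnosis grouped into bins


class AgeBinClassifier(nn.Module):
    """Feed-forward network mapping a mutation vector to age-bin logits."""

    def __init__(self, num_features=NUM_FEATURES, hidden=HIDDEN,
                 num_bins=NUM_AGE_BINS):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(num_features, hidden),
            nn.ReLU(),
            nn.Linear(hidden, num_bins),
        )

    def forward(self, x):
        return self.net(x)


torch.manual_seed(0)
model = AgeBinClassifier()

# Random stand-in batch, NOT real TCGA data.
x = torch.randint(0, 2, (32, NUM_FEATURES)).float()
y = torch.randint(0, NUM_AGE_BINS, (32,))

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

losses = []
for step in range(20):
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)  # logits vs. integer bin labels
    loss.backward()
    optimizer.step()
    losses.append(loss.item())

logits = model(x)  # shape: (batch, NUM_AGE_BINS)
```

Swapping the `nn.Sequential` body for a single `nn.Linear` layer would give the logistic-style baseline, which makes it easy to compare models on the same data loader.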
Challenges I ran into
Debugging PyTorch and getting the layer sizes to pool down correctly. In hindsight I should have used a CNN: about 99 percent of DNA is the same from person to person, so a convolution over the sequence could filter out the shared parts. Reading in the data from TCGA was also a challenge.
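Getting layer sizes to line up is mostly arithmetic, so it helps to compute them before building the network. The sketch below uses the output-length formula given in the PyTorch `Conv1d` and `MaxPool1d` documentation. The helper name `out_length`, the sequence length, the channel count, and the layer stack are all hypothetical and only show how a CNN over a sequence would pool down.

```python
def out_length(length, kernel_size, stride=1, padding=0, dilation=1):
    """Output length of a 1D convolution or pooling layer.

    Same formula as in the PyTorch Conv1d / MaxPool1d docs:
    floor((L + 2*padding - dilation*(kernel_size - 1) - 1) / stride + 1)
    """
    return (length + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


# Hypothetical stack: (name, kernel_size, stride, padding).
# Note: MaxPool1d defaults its stride to kernel_size, so it is passed explicitly here.
layers = [
    ("conv1", 7, 1, 3),
    ("pool1", 4, 4, 0),
    ("conv2", 5, 1, 2),
    ("pool2", 5, 5, 0),
]

SEQ_LEN = 1000      # assumed input length
OUT_CHANNELS = 16   # assumed channel count of the last conv layer

length = SEQ_LEN
sizes = {}
for name, kernel_size, stride, padding in layers:
    length = out_length(length, kernel_size, stride, padding)
    sizes[name] = length

# Input size of the first fully connected layer after flattening.
flat_features = OUT_CHANNELS * length
```

With these made-up numbers, `flat_features` is the value to pass as `in_features` to the first `nn.Linear` after the convolutional part.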
Accomplishments that I'm proud of
It works with 80 percent accuracy.
What I learned
A lot of ML
What's next for Cancer Predictor
Use it to predict cancer ahead of time, for people who have not been diagnosed yet, instead of relying on data that already comes with labels.