Inspiration

Chronic myelogenous leukemia affects 2 in every 100,000 people. Treatments exist, but once a patient stops treatment it is hard to predict whether they will relapse. The idea behind this project is to detect which genes have an impact on relapse.

What it does

First, we use the T1K genotyper to infer the alleles of the KIR genes. We then process that data into the clean tables needed to train a predictive model for deep remission (RM profunda). Once we have the results, we analyze which alleles might help achieve this remission faster.
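As a rough illustration of the table-building step, the sketch below turns per-sample allele calls into a sample-by-allele presence/absence table that a model could train on. It assumes the genotyper output has already been parsed into simple (sample, allele) pairs; the function name `build_allele_table`, the patient IDs, and the allele calls are hypothetical and are not taken from the project's data or from T1K's actual output format.

```python
from collections import defaultdict


def build_allele_table(calls):
    """Turn (sample, allele) pairs into a 0/1 table with one row per sample.

    Returns the sorted list of allele columns and a dict mapping each
    sample to its row of presence (1) / absence (0) flags.
    """
    # Every distinct allele becomes one column, in a stable sorted order.
    alleles = sorted({allele for _, allele in calls})

    # Collect the set of alleles carried by each sample.
    carried = defaultdict(set)
    for sample, allele in calls:
        carried[sample].add(allele)

    rows = {
        sample: [1 if allele in carried[sample] else 0 for allele in alleles]
        for sample in sorted(carried)
    }
    return alleles, rows


# Hypothetical calls for illustration only, not real patient data.
calls = [
    ("patient_1", "KIR2DL1*003"),
    ("patient_1", "KIR3DL1*001"),
    ("patient_2", "KIR2DL1*003"),
    ("patient_3", "KIR3DL1*001"),
    ("patient_3", "KIR2DS4*001"),
]

columns, table = build_allele_table(calls)
for sample, row in table.items():
    print(sample, dict(zip(columns, row)))
```

A presence/absence encoding is only one possible choice; allele copy number or gene-level features could be encoded the same way with a different cell value.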

How we built it

We wrote the data-processing scripts in Python, and for the predictive model we used gradient boosting from the scikit-learn library.
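A minimal sketch of what such a model could look like is shown below, using scikit-learn's `GradientBoostingClassifier` with default hyperparameters. Everything about the data is an assumption: the features are randomly generated 0/1 allele flags, the column names (`allele_A` to `allele_D`) are placeholders, and the remission label is synthetic, built to depend mostly on `allele_A`. The project's actual features, labels, and tuning are not described here.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the cleaned table: rows are patients,
# columns are 0/1 flags for hypothetical alleles.
rng = np.random.default_rng(0)
allele_names = ["allele_A", "allele_B", "allele_C", "allele_D"]
X = rng.integers(0, 2, size=(200, len(allele_names)))

# Synthetic label: follows allele_A, flipped for roughly 10% of patients.
flip = rng.random(200) < 0.1
y = np.where(flip, 1 - X[:, 0], X[:, 0])

# Hold out a quarter of the patients to estimate accuracy.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

model = GradientBoostingClassifier(random_state=0)
model.fit(X_train, y_train)
accuracy = model.score(X_test, y_test)

# Rank alleles by the model's impurity-based feature importance.
ranking = sorted(
    zip(allele_names, model.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)

print(f"held-out accuracy: {accuracy:.2f}")
for name, importance in ranking:
    print(f"{name}: {importance:.3f}")
```

Feature importances like these only indicate which columns the model relied on; they suggest candidate alleles to look at and do not by themselves show a biological effect.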

Challenges we ran into

We had some issues with the T1K model and had to post-process its output to achieve greater accuracy. Running the actual genotyping with T1K also took a considerable amount of time, which we were quite short of. The same goes for training the predictive model.

Accomplishments that we're proud of

We were able to solve all the challenges we encountered. Given the limited time we had, we are quite proud of our overall solution, especially the allele genotyping.

What we learned

We learned several biology concepts, such as genetics and how alleles work, and how to apply our technical knowledge to biology. Modern science is interdisciplinary, so learning about different topics is very useful.

What's next for Allelator

With more time, we could work with larger datasets to better tune the models and make other optimizations.
