Genome-wide association analyses have uncovered tens of thousands of regions in our genome associated with human traits and disease susceptibility. Recently, this approach has begun to be used to study how our genetic makeup influences drug response.
What it does
StayinAlive applies time-to-event regression to genetic data from drug trials in order to identify genetic variants associated with drug response. In addition, we've built an interactive visualization tool that conveys the results in a way that's understandable to clinical specialists.
How we've built it
StayinAlive is written in Python. We parse the genetic data using the cyvcf2 library and perform regression using the lifelines package. In addition, we have written a support tool for running StayinAlive in high-performance computing environments, which we tested on a 300-CPU cluster. The results are visualized with Bokeh, an interactive visualization library for Python: we created Kaplan-Meier curves that let users interactively inspect the details of each individual data point.
Challenges we ran into
It was difficult to find a working Python package for creating interactive visualizations of the results. For example, the built-in Kaplan-Meier plotting function in lifelines only produced static plots. Making these plots interactive with plotly proved challenging because the data objects used by the two packages were incompatible. We used Bokeh instead, which proved effective.
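One way to sidestep incompatible data objects is to compute the Kaplan-Meier estimate as plain Python values that any plotting library can consume. The sketch below is an illustration under that assumption, not the code we used: `kaplan_meier` is a hypothetical helper, and the six patients are made-up data.

```python
def kaplan_meier(durations, events):
    """Return (time, at_risk, n_events, survival) for each distinct event time.

    durations: follow-up time per patient
    events:    1 if the event was observed, 0 if the patient was censored
    """
    paired = sorted(zip(durations, events))
    at_risk = len(paired)
    survival = 1.0
    steps = []
    i = 0
    while i < len(paired):
        t = paired[i][0]
        n_events = 0
        n_removed = 0
        # Group all patients who share the same follow-up time.
        while i < len(paired) and paired[i][0] == t:
            n_events += paired[i][1]
            n_removed += 1
            i += 1
        if n_events:
            # Product-limit update: multiply by the fraction surviving time t.
            survival *= 1 - n_events / at_risk
            steps.append((t, at_risk, n_events, survival))
        # Both events and censored patients leave the risk set after time t.
        at_risk -= n_removed
    return steps


# Made-up example: six patients, two of them censored.
steps = kaplan_meier([5, 6, 6, 2, 4, 4], [1, 0, 1, 1, 1, 0])
for t, at_risk, n_events, survival in steps:
    print(f"t={t}  at risk={at_risk}  events={n_events}  S(t)={survival:.3f}")
```

Each tuple carries the numbers a hover tooltip would need (time, number at risk, number of events, survival probability), so the list can be turned into columns for an interactive plot.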
Accomplishments that we're proud of
In 36 hours we wrote a time-to-event analysis package that outperforms existing tools in terms of speed while matching them in output. This will make it possible to apply time-to-event analysis to larger patient cohorts.
We learned a tremendous amount from collaborating with our teammates and other hackers. We learned how to use powerful plotting and data science libraries such as Bokeh and NumPy, and were able to use them effectively despite having very little previous experience with them.
What we learned
We explored various Python packages, especially plotly and Bokeh. We also learned about a branch of statistics called "survival analysis". We learned how multidisciplinary hacking can be: in this project alone we drew on statistics, human health, genetics and computer science. We also learned about the power of collaboration to tackle scientific challenges.
What's next for StayinAlive
We're hoping to improve the StayinAlive tool so that it can be used by the genetics community. One of our teammates is planning to use the package in their research.