Genome-wide association analyses have uncovered tens of thousands of regions in our genome associated with human traits and disease susceptibility. More recently, this approach has begun to be used to study how our genetic makeup influences drug response.

What it does

StayinAlive applies time-to-event regression analysis to genetic data from drug trials, making it possible to identify genetic variants associated with drug response. In addition, we've built an interactive visualization tool that conveys the results in a way that's understandable to clinical specialists.

How we've built it

StayinAlive is written in Python. We parse the genetic data using the cyvcf2 library and perform the regression using the lifelines package. We have also written a support tool for running StayinAlive in high-performance computing environments, which we tested on a 300-CPU cluster. Results are visualized with Bokeh, an interactive visualization library for Python: we created Kaplan-Meier curves that let users interactively inspect the details of each individual data point.

Challenges we ran into

It was difficult to find a working Python package for creating interactive visualizations of the results. For example, the built-in Kaplan-Meier plotting function in lifelines only produces static plots. Making these plots interactive with Plotly proved challenging because the data objects of the two packages were incompatible. We used Bokeh instead, which proved effective.
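For readers unfamiliar with Kaplan-Meier curves, the estimate behind such a plot can be sketched in plain Python. This is an illustrative implementation of the standard product-limit estimator, not the code used in StayinAlive; the function name `kaplan_meier` and the toy data are assumptions. The resulting list of (time, survival) points is the kind of plain data that any plotting library can draw as a step curve.

```python
def kaplan_meier(durations, events):
    """Return the Kaplan-Meier curve as a list of (time, survival) points.

    durations: follow-up time per patient
    events:    1 if the event was observed, 0 if censored
    """
    at_risk = len(durations)
    survival = 1.0
    curve = [(0.0, 1.0)]
    for t in sorted(set(durations)):
        # Count observed events and censored patients at time t.
        died = sum(1 for d, e in zip(durations, events) if d == t and e)
        censored = sum(1 for d, e in zip(durations, events) if d == t and not e)
        if died:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1.0 - died / at_risk
            curve.append((t, survival))
        # Everyone seen at time t leaves the risk set afterwards.
        at_risk -= died + censored
    return curve


# Toy data (assumption): five patients, two of them censored.
curve = kaplan_meier([5, 8, 8, 12, 15], [1, 1, 0, 1, 0])
for time, surv in curve:
    print(time, round(surv, 3))
```

Censored patients never lower the curve directly; they only shrink the risk set, which makes each later event count for more.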

Accomplishments that we're proud of

In 36 hours, we wrote a time-to-event analysis package that outperforms existing tools in terms of speed while matching their output. This will make it possible to apply time-to-event analysis to larger patient cohorts.

We learned a tremendous amount from collaborating with our teammates and other hackers. We learned how to use powerful plotting and data science libraries such as Bokeh and NumPy, and were able to use them effectively despite having very little previous experience with them.

What we learned

We've explored various Python packages, especially Plotly and Bokeh. We've also learned about a branch of statistics called "survival analysis", and about how multidisciplinary hacking can be: in this project alone we learned about statistics, human health, genetics and computer science. Finally, we learned about the power of collaboration in tackling scientific challenges.

What's next for StayinAlive

We're hoping to improve StayinAlive so that it can be used by the genetics community. One of our teammates is planning to use the package in research.
