Recent spikes in influenza cases in the United States, together with the broader scientific struggle against rapidly evolving viruses, inspired us to pursue an algorithmic, computational approach to the problem.
What it does
Using genomic data on hundreds of influenza strains from the past 30 years, our algorithms tackle the problem of vaccine production in two ways. First, our statistical analysis of particular strains over time enables us to identify common sites of mutation in a given influenza strain from year to year. This tool helps researchers visualize the changes occurring in the genomic sequences of these viral strains. By understanding the mutations that change the virus the most, they can develop more precise viral cocktails for vaccines based on previous vaccines.
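As a rough illustration of how mutation-prone sites could be ranked, the sketch below scores each position of a set of aligned sequences by Shannon entropy and returns the most variable ones. This is not necessarily the statistic our tool uses; the function names (`site_entropy`, `variable_sites`) and the toy sequences are made up for the example, and it assumes the sequences have already been aligned to equal length.

```python
from collections import Counter
from math import log2


def site_entropy(column):
    """Shannon entropy (in bits) of the residues observed at one alignment column."""
    counts = Counter(column)
    total = len(column)
    return -sum((n / total) * log2(n / total) for n in counts.values())


def variable_sites(aligned_seqs, top_n=3):
    """Rank alignment positions by entropy, most variable first.

    `aligned_seqs` must be equal-length strings (already aligned).
    Returns a list of (position, entropy) tuples with 0-indexed positions.
    """
    if len({len(s) for s in aligned_seqs}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    scores = [
        (pos, site_entropy(column))
        for pos, column in enumerate(zip(*aligned_seqs))
    ]
    scores.sort(key=lambda item: item[1], reverse=True)
    return scores[:top_n]


# Toy, made-up sequences standing in for one gene segment sampled in different years.
toy = ["ATGAAC", "ATGAAT", "ATCAAG", "ATGAAA"]
hotspots = variable_sites(toy, top_n=2)
```

Here the last position differs in every sequence, so it ranks first, followed by the position where a single sequence carries a substitution. Entropy is only one possible choice; a simple mismatch count against a reference strain would work similarly.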
The second tool is a predictive machine learning model that uses logistic regression to help government officials and vaccine developers predict the transmissibility and epidemiological characteristics of a particular viral strain over the next few years. Using historical data from previous influenza pandemics, this tool is designed to predict mutations that could significantly alter the disease characteristics of influenza in a given year. In particular, it draws on large sets of strain genomic data sampled from over 37 countries in the lead-up to the 2009 global flu pandemic.
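To make the logistic regression idea concrete, here is a minimal sketch trained with plain batch gradient descent. It is an illustration under stated assumptions, not our production model: the two features (fraction of hotspot sites mutated, fraction of countries reporting the strain), the labels, and the hyperparameters are all hypothetical.

```python
from math import exp


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + exp(-z))
    ez = exp(z)
    return ez / (1.0 + ez)


def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(w, b, x):
    """Estimated probability that a strain belongs to the positive class."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


# Hypothetical features per strain:
# [fraction of hotspot sites mutated, fraction of countries reporting the strain]
X = [[0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.7, 0.8], [0.8, 0.6], [0.9, 0.9]]
y = [0, 0, 0, 1, 1, 1]  # made-up labels: 1 = high-transmissibility season

w, b = fit_logistic(X, y)
p_low = predict_proba(w, b, [0.15, 0.15])
p_high = predict_proba(w, b, [0.85, 0.85])
```

On this toy data the model assigns a lower probability to the lightly mutated strain than to the heavily mutated one. With real data, feature design and validation against held-out seasons would matter far more than the optimizer.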
How we built it
Challenges we ran into
Parsing and statistically processing the data from the NCBI database was probably the most difficult aspect of our hack. We also had trouble developing the machine learning model and identifying the most accurate data set for projecting future outbreak probabilities.
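Much of the parsing work comes down to turning sequence records into something analyzable. As a minimal sketch, assuming the data is in FASTA format (a format NCBI offers for sequence downloads), a parser might look like the following. The function name `parse_fasta` and the sample headers are hypothetical.

```python
def parse_fasta(text):
    """Return a dict mapping each header line (without '>') to its sequence."""
    records = {}
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records[header] = "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            # Sequence lines may be wrapped; normalize case and collect them.
            chunks.append(line.upper())
    if header is not None:
        records[header] = "".join(chunks)
    return records


# Made-up sample input with wrapped and lowercase sequence lines.
sample = """>strain_A hypothetical header
atgaac
ttg
>strain_B hypothetical header
ATGAAT
"""
records = parse_fasta(sample)
```

This version keeps every record in memory, which is fine for a hackathon-sized data set; a generator would be a better fit for very large downloads.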
Accomplishments that we're proud of
Developed an original algorithm for rigorously processing and statistically analyzing NCBI genomic data. Developed an interactive user interface that serves as a foundation for a research database.
What we learned
Parsing data and using REST APIs to retrieve and handle it. Formatting and data visualization on the website. Developing a machine learning model (logistic regression).
What's next for Flulytics
Our next step is to build out the platform more extensively. One goal is to interface better with the NCBI genomic database and extract key statistics that could be useful to researchers. The other is to begin training our model continuously as new data flows in.
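One way continuous training could work is an online update, where each newly labeled sample nudges the model with a single stochastic gradient step instead of triggering a full retrain. The sketch below is only an illustration of that idea; the class name `OnlineLogistic`, the learning rate, and the toy data are assumptions.

```python
from math import exp


class OnlineLogistic:
    """Logistic regression updated one sample at a time (hypothetical helper)."""

    def __init__(self, n_features, lr=0.1):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = lr

    def predict_proba(self, x):
        z = sum(wj * xj for wj, xj in zip(self.w, x)) + self.b
        z = max(-30.0, min(30.0, z))  # clamp so exp() stays well behaved
        return 1.0 / (1.0 + exp(-z))

    def update(self, x, y):
        """Apply one stochastic gradient step for a single labeled sample."""
        err = self.predict_proba(x) - y
        self.w = [wj - self.lr * err * xj for wj, xj in zip(self.w, x)]
        self.b -= self.lr * err


model = OnlineLogistic(n_features=1)
before = model.predict_proba([1.0])
for _ in range(50):
    model.update([1.0], 1)  # the same made-up positive sample arriving repeatedly
after = model.predict_proba([1.0])
```

After the repeated positive samples, the predicted probability for that input rises above its starting value of 0.5. In practice new data would also need periodic re-validation, since an online model can drift if recent samples are unrepresentative.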
Future Business Models
Model 1: Business to Business: One model we could operate under is a technology licensing deal with vaccine producers and pharmaceutical companies. By helping them optimize their production lines and improve profit in the short and long term, our technology would give companies the tools to grow.
Model 2: Business to Government: Another potential business model we are exploring is contracting with the United States government or government entities. Our technology could help groups like the CDC and NIH predict the severity of future crises, avert major public health emergencies, and prepare adequately for future outbreaks.