Inspiration
Imputation is a technique for increasing the amount of data available for genetic studies. Because of cost constraints, the vast majority of these studies sequence only a sparse subset of each subject's genome and then use a database of fully sequenced genomes to fill in the most likely intervening sequences. Most imputation is done through "imputation servers": services backed by private genetic databases, to which researchers upload their data and from which they receive imputed genomes. Although this arrangement keeps the underlying data from being directly accessible, it is still vulnerable to reconstruction attacks, in which carefully designed sets of uploaded data are used to reveal the underlying genetic data.
Differential privacy methods are well-established techniques for protecting against reconstruction attacks. They add noise to the dataset that is designed not to interfere too greatly with the accuracy of statistical summaries.
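As an illustration of how noise can protect individual records while keeping summaries usable, here is a minimal sketch of randomized response, a standard differentially private mechanism for binary data. It is an assumption for illustration only, not necessarily the mechanism Safe Impute uses: the class name, the method names, and the encoding of each allele as 0 or 1 are all hypothetical. Each bit is reported truthfully with probability e^ε / (1 + e^ε) and flipped otherwise, and the population-level frequency can still be recovered by correcting for the known flip rate.

```java
import java.util.Random;

// Hypothetical sketch: randomized response on alleles encoded as 0/1.
class RandomizedResponseSketch {

    // Probability of reporting the true bit for a given privacy budget epsilon.
    // epsilon = 0 gives a fair coin (no information); larger epsilon means less noise.
    static double keepProbability(double epsilon) {
        return Math.exp(epsilon) / (1.0 + Math.exp(epsilon));
    }

    // Flip each allele independently with probability 1 - keepProbability(epsilon).
    static int[] privatize(int[] alleles, double epsilon, Random rng) {
        double keep = keepProbability(epsilon);
        int[] noisy = new int[alleles.length];
        for (int i = 0; i < alleles.length; i++) {
            noisy[i] = rng.nextDouble() < keep ? alleles[i] : 1 - alleles[i];
        }
        return noisy;
    }

    // Unbiased estimate of the true frequency of 1s, computed from noisy bits.
    // Requires epsilon > 0, otherwise the correction divides by zero.
    static double estimateFrequency(int[] noisy, double epsilon) {
        double keep = keepProbability(epsilon);
        double observed = 0.0;
        for (int bit : noisy) {
            observed += bit;
        }
        observed /= noisy.length;
        // E[observed] = f * keep + (1 - f) * (1 - keep); solve for f.
        return (observed - (1.0 - keep)) / (2.0 * keep - 1.0);
    }
}
```

With a small ε, any single reported allele is close to a coin flip, yet the corrected frequency estimate over many records stays near the true value. This is the trade-off between privacy and summary accuracy described above.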
What it does
Although imputation and differential privacy are both well-established techniques, we were unable to find applications of differential privacy to protecting genetic information. This project demonstrates one way that differential privacy could be applied to protect sensitive genetic data while still maintaining acceptable imputation accuracy. The web app lets the user explore different configuration options for differential privacy and compare the effectiveness of different methods.
How we built it
We construct manageable and plausible genetic sequences by simulating meiosis (the process that combines genetic sequences during sexual reproduction) over the course of multiple generations. We then apply different levels of noise to the dataset and measure the difference in imputation quality.
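The simulation step can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the project's actual code: haplotypes are arrays of 0/1 alleles, crossovers occur independently between adjacent sites at a fixed rate, the population size is constant, and parents are drawn at random. All class, method, and parameter names are hypothetical. The `concordance` helper shows one simple way imputation quality could be scored, as the fraction of sites where an imputed sequence matches the true one.

```java
import java.util.Random;

// Hypothetical sketch: multi-generation haplotype simulation with crossover.
class MeiosisSketch {

    // Build one gamete by copying from one parental haplotype and switching
    // to the other whenever a crossover occurs between adjacent sites.
    static int[] recombine(int[] hapA, int[] hapB, double crossoverRate, Random rng) {
        int[] gamete = new int[hapA.length];
        boolean fromA = rng.nextBoolean();
        for (int i = 0; i < hapA.length; i++) {
            if (i > 0 && rng.nextDouble() < crossoverRate) {
                fromA = !fromA;
            }
            gamete[i] = fromA ? hapA[i] : hapB[i];
        }
        return gamete;
    }

    // Evolve a fixed-size population of haplotypes for several generations.
    // Each child haplotype recombines two haplotypes drawn at random.
    static int[][] simulate(int[][] founders, int generations, double crossoverRate, Random rng) {
        int[][] population = founders;
        for (int g = 0; g < generations; g++) {
            int[][] next = new int[population.length][];
            for (int c = 0; c < next.length; c++) {
                int[] a = population[rng.nextInt(population.length)];
                int[] b = population[rng.nextInt(population.length)];
                next[c] = recombine(a, b, crossoverRate, rng);
            }
            population = next;
        }
        return population;
    }

    // Fraction of sites at which the imputed haplotype agrees with the truth.
    static double concordance(int[] truth, int[] imputed) {
        if (truth.length != imputed.length) {
            throw new IllegalArgumentException("haplotypes must have equal length");
        }
        int matches = 0;
        for (int i = 0; i < truth.length; i++) {
            if (truth[i] == imputed[i]) {
                matches++;
            }
        }
        return (double) matches / truth.length;
    }
}
```

Comparing the concordance of imputation run against the original panel with the concordance against a noised panel gives one number per noise level, which is the kind of comparison described above.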
Challenges we ran into
The main challenges were the complexity of generating genetic datasets and applying our privacy algorithm to them effectively.
Accomplishments that we're proud of
We were able to adapt to new frameworks, such as Spring Boot, while also applying abstract mathematical concepts to a real-world problem.
What we learned
We learned new mathematical approaches to protecting user data, along with algorithms for implementing them.
What's next for Safe Impute
Next, we plan to implement a greater variety of imputation and differential privacy methods in Safe Impute and to evaluate them on more realistic data.
Built With
- amazon-web-services
- java
- next.js
- spring
- tailwindcss
- typescript