Inspiration
HaploRecstr is a C++ haplotype reconstruction program. The method is based on the algorithm introduced by Rastas et al. [Rastas, P., Koivisto, M. et al. (2005). Algorithms in Bioinformatics: 5th International Workshop. Berlin, Heidelberg: Springer. 145-151]
What it does
The program uses a Hidden Markov Model (HMM) to model the genotype data. By default, the model is initialized by scanning the data to select the major alleles and assign initial parameter values. The EM algorithm then optimizes the likelihood function, and the Viterbi algorithm reconstructs the haplotype data.
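To make the last step concrete, here is a minimal sketch of Viterbi decoding for a generic discrete HMM, working in log space to avoid underflow. It is not HaploRecstr's actual code: the `Hmm` struct, its field names, and the `viterbi` function are hypothetical, and the real model decodes haplotype pairs from genotypes rather than a single state sequence.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical generic HMM with log-probabilities (not HaploRecstr's layout).
struct Hmm {
    std::vector<double> logInit;                // logInit[s]: log P(first state = s)
    std::vector<std::vector<double>> logTrans;  // logTrans[from][to]
    std::vector<std::vector<double>> logEmit;   // logEmit[s][symbol]
};

// Returns the most likely state sequence for the observed symbols.
std::vector<int> viterbi(const Hmm& hmm, const std::vector<int>& obs) {
    const std::size_t T = obs.size();
    const std::size_t S = hmm.logInit.size();
    if (T == 0 || S == 0) return std::vector<int>();

    const double negInf = -std::numeric_limits<double>::infinity();
    // score[t][s]: best log-probability of any path ending in state s at step t.
    std::vector<std::vector<double>> score(T, std::vector<double>(S, negInf));
    // back[t][s]: predecessor state on that best path.
    std::vector<std::vector<int>> back(T, std::vector<int>(S, 0));

    for (std::size_t s = 0; s < S; ++s) {
        score[0][s] = hmm.logInit[s] + hmm.logEmit[s][static_cast<std::size_t>(obs[0])];
    }
    for (std::size_t t = 1; t < T; ++t) {
        const std::size_t symbol = static_cast<std::size_t>(obs[t]);
        for (std::size_t to = 0; to < S; ++to) {
            for (std::size_t from = 0; from < S; ++from) {
                const double cand = score[t - 1][from] + hmm.logTrans[from][to];
                if (cand > score[t][to]) {
                    score[t][to] = cand;
                    back[t][to] = static_cast<int>(from);
                }
            }
            score[t][to] += hmm.logEmit[to][symbol];
        }
    }

    // Pick the best final state, then follow the back-pointers.
    std::size_t best = 0;
    for (std::size_t s = 1; s < S; ++s) {
        if (score[T - 1][s] > score[T - 1][best]) best = s;
    }
    std::vector<int> path(T);
    path[T - 1] = static_cast<int>(best);
    for (std::size_t t = T - 1; t > 0; --t) {
        path[t - 1] = back[t][static_cast<std::size_t>(path[t])];
    }
    return path;
}
```

The running time is proportional to T * S * S for T observations and S states, so the number of states per marker is the main cost driver on large datasets.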
The program takes genotype data as input. It outputs: 1) a set of reconstructed haplotypes; 2) a summary of the frequencies of all possible haplotypes in the population, sorted in descending order.
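As a rough illustration of the second output, the sketch below tallies haplotypes and sorts them by frequency in descending order. It assumes haplotypes are represented as strings of allele characters such as "0101", and the function name `haplotypeFrequencies` is hypothetical; the program's own representation may differ. For simplicity it only lists haplotypes that actually occur in the input.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

typedef std::pair<std::string, double> HapFreq;

// Counts each distinct haplotype and returns (haplotype, frequency) pairs,
// most frequent first. Ties keep alphabetical order (from std::map iteration).
std::vector<HapFreq> haplotypeFrequencies(const std::vector<std::string>& haplotypes) {
    std::map<std::string, std::size_t> counts;
    for (std::size_t i = 0; i < haplotypes.size(); ++i) {
        ++counts[haplotypes[i]];
    }

    std::vector<HapFreq> freqs;
    for (std::map<std::string, std::size_t>::const_iterator it = counts.begin();
         it != counts.end(); ++it) {
        const double f = static_cast<double>(it->second) /
                         static_cast<double>(haplotypes.size());
        freqs.push_back(HapFreq(it->first, f));
    }

    // stable_sort preserves the alphabetical order among equal frequencies.
    std::stable_sort(freqs.begin(), freqs.end(),
                     [](const HapFreq& a, const HapFreq& b) { return a.second > b.second; });
    return freqs;
}
```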
How I built it
C++
Challenges
3-dimensional matrix manipulation and time efficiency when processing large datasets
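One common way to handle both problems is to store the 3-dimensional matrix in a single contiguous buffer and compute the index by hand, which avoids nested allocations and keeps inner loops cache-friendly. The sketch below is only an assumption about how this could be done: the `Array3D` class is hypothetical, and what the three dimensions hold (for example marker, state, and allele) is not specified here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 3-D array of doubles backed by one flat std::vector.
// Element (i, j, k) lives at offset (i * d1 + j) * d2 + k, so the last
// index varies fastest in memory.
class Array3D {
public:
    Array3D(std::size_t d0, std::size_t d1, std::size_t d2, double init = 0.0)
        : d0_(d0), d1_(d1), d2_(d2), data_(d0 * d1 * d2, init) {}

    double& at(std::size_t i, std::size_t j, std::size_t k) {
        assert(i < d0_ && j < d1_ && k < d2_);  // bounds check in debug builds
        return data_[(i * d1_ + j) * d2_ + k];
    }

    double at(std::size_t i, std::size_t j, std::size_t k) const {
        assert(i < d0_ && j < d1_ && k < d2_);
        return data_[(i * d1_ + j) * d2_ + k];
    }

    std::size_t size() const { return data_.size(); }

private:
    std::size_t d0_, d1_, d2_;
    std::vector<double> data_;
};
```

Looping with `k` innermost walks memory sequentially, which tends to matter more for speed on large datasets than the choice of container.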
Accomplishments that I'm proud of
It's our first attempt at implementing the EM algorithm for an HMM
What's next for HapRecstr
Improve phasing accuracy and reduce memory usage