As a team of students with backgrounds in data science, medicine and mathematics, we decided to tackle the problem of the virus's mutations. Currently the virus is mutating quite heavily, but the good thing is that some genes (e.g. gene S) are mutating much less. With mathematical modelling and machine learning we aim to foresee these changes and help develop vaccine therapies targeted at genes that have a lower probability of future mutations. During this weekend we managed to polish our current model and develop numerous data visualisation tools to help us see where exactly our work should go next. Many of them can be seen in the presentation.

What it does

Our finished product will be a web-based tool. Using this tool, any laboratory working on a COVID-19 vaccine will be able to check the chance of the virus mutating in a given gene, or even in part of a gene. Additionally, we will provide information on whether an antigen should be added to a vaccine in order to make the vaccine more successful.

How I built it

We based our primary analysis on Python. We are building stochastic models using Markov chains, and we are currently developing an ML model to substantially improve our accuracy.
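To give a feel for the Markov chain idea, here is a minimal sketch in plain Python. It is not our actual model: it assumes a first-order chain over the four nucleotides, assumes every site evolves independently, and the function names (`estimate_transition_matrix`, `mutation_probability`) and the toy sequences are made up for illustration. It estimates substitution probabilities from aligned pairs of earlier and later sequences, then asks how likely a site is to differ from its starting base after a number of steps.

```python
from collections import Counter

NUCLEOTIDES = "ACGT"


def estimate_transition_matrix(pairs, pseudocount=1.0):
    """Estimate P[a][b]: probability that base a is seen as base b one step later.

    pairs: iterable of (earlier_sequence, later_sequence) strings that are
    already aligned to the same length (an assumption of this sketch).
    """
    counts = Counter()
    for earlier, later in pairs:
        if len(earlier) != len(later):
            raise ValueError("sequences must be aligned to the same length")
        for a, b in zip(earlier, later):
            # Skip gaps and ambiguous bases such as '-' or 'N'.
            if a in NUCLEOTIDES and b in NUCLEOTIDES:
                counts[(a, b)] += 1

    matrix = {}
    for a in NUCLEOTIDES:
        # The pseudocount keeps unseen substitutions from getting probability 0.
        row_total = sum(counts[(a, b)] + pseudocount for b in NUCLEOTIDES)
        matrix[a] = {
            b: (counts[(a, b)] + pseudocount) / row_total for b in NUCLEOTIDES
        }
    return matrix


def step(distribution, matrix):
    """Advance a probability distribution over bases by one transition."""
    return {
        b: sum(distribution[a] * matrix[a][b] for a in NUCLEOTIDES)
        for b in NUCLEOTIDES
    }


def mutation_probability(base, matrix, steps):
    """Probability that a site starting as `base` holds another base after `steps`."""
    distribution = {b: 1.0 if b == base else 0.0 for b in NUCLEOTIDES}
    for _ in range(steps):
        distribution = step(distribution, matrix)
    return 1.0 - distribution[base]


if __name__ == "__main__":
    # Toy aligned sequences, invented for this example (not real genome data).
    toy_pairs = [("ACGTACGTAC", "ACGTACGTAT"), ("GGCATACGTT", "GGCATACGTT")]
    matrix = estimate_transition_matrix(toy_pairs)
    for base in NUCLEOTIDES:
        print(base, round(mutation_probability(base, matrix, steps=5), 3))
```

A real pipeline would need proper sequence alignment, far more data, and probably position-specific rates, since different genes mutate at different speeds.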

Challenges I ran into

There were a lot of challenges along the way. Firstly, as students it was not easy to get hold of precise SARS-CoV-2 genomes. Secondly, the analysis and model building were sometimes very challenging for us. Fortunately, we managed to overcome these two problems and many more, connect with a few partners and advisors, and we are now working constantly to deliver the full product as fast as possible.

Accomplishments that I'm proud of

I could list them for a long time. I'm very proud of our team, and the things that we managed to build from scratch still amaze me. But we still have a way to go: our algorithms can be greatly improved, and we will definitely not settle until the work is finished.

What I learned

We learned a lot, both from working as a team and managing each other's time and resources, and from reading research papers and getting to know our enemy, SARS-CoV-2, better. Overall, the experience from this project taught us a great deal and we are proud of it.

What's next for Covid Genomics

In the near future, we plan to expand our mutation prediction model. Using machine learning, we aim to predict more precisely how any mutation can affect protein folding and whether it will make the virus resistant to a drug or vaccine.


In the future, we will certainly need much more computing power to develop ML models and for tasks such as phylogenetic analysis. Moreover, we need contacts at pharmaceutical companies so that we can work with them directly and learn exactly what they need. We also lack funding for our research.

Value after crisis

We are planning to publish our results in a paper, if not a series of papers. These papers should come in handy during the analysis of future viruses, especially if another pandemic occurs.


The potential impact of our project is substantial. If implemented correctly, our research can greatly improve the chances of developing a vaccine and other drugs to fight COVID-19.
