Until a few weeks ago there were hardly any useful infographics, spread models or curated datasets about Covid-19 available. This made it hard to communicate the gravity of the situation to friends and decision makers, and difficult for researchers to simply get started with analysing data and developing machine learning models. Richard Leibrandt therefore started this project at #CodeVsCovid19. His aim was to build unified, curated datasets for various countries that combine Covid-19 case counts with population details and political actions, and to use them to forecast the future spread so that both decision makers and individuals can make their own adjustments.
What it does so far
The entire project has three parts:
- A set of data preparation tools that download publicly available datasets and harmonize them so that they can be merged easily.
- Models that are able to forecast the spread of the virus.
- A webpage which displays the course of the spread at country level, regional level (smaller than a country, e.g. "Kanton") and county level. The webpage will show both the historical course and the future course as forecast by the predictive models.
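As an illustration of the first part, here is a minimal sketch of how two differently shaped sources might be harmonized and merged with pandas. The unified column names, the helper `harmonize`, the source layouts and all values are assumptions made up for this example; they are not the project's actual schema or data.

```python
import pandas as pd

# Hypothetical unified schema; the project's real column names are not stated.
UNIFIED_COLUMNS = ["date", "country", "region", "cases"]


def harmonize(raw, column_map, country):
    """Rename source-specific columns to the unified schema and normalise types."""
    df = raw.rename(columns=column_map)
    df["country"] = country
    if "region" not in df.columns:
        df["region"] = "ALL"  # placeholder for sources with country-level data only
    df["date"] = pd.to_datetime(df["date"])
    df["cases"] = pd.to_numeric(df["cases"], errors="coerce").fillna(0).astype(int)
    return df[UNIFIED_COLUMNS]


# Two made-up sources with different layouts (illustrative values, not real data).
swiss_raw = pd.DataFrame(
    {"Datum": ["2020-03-01", "2020-03-02"], "Kanton": ["ZH", "ZH"], "Faelle": [5, 8]}
)
italy_raw = pd.DataFrame(
    {"data": ["2020-03-01", "2020-03-02"], "totale_casi": ["100", "150"]}
)

unified = pd.concat(
    [
        harmonize(swiss_raw, {"Datum": "date", "Kanton": "region", "Faelle": "cases"}, "CH"),
        harmonize(italy_raw, {"data": "date", "totale_casi": "cases"}, "IT"),
    ],
    ignore_index=True,
)

# Placeholder population figures, only to show the merge step.
population = pd.DataFrame({"country": ["CH", "IT"], "population": [1_000_000, 2_000_000]})
merged = unified.merge(population, on="country", how="left")
merged["cases_per_100k"] = merged["cases"] / merged["population"] * 100_000
```

Once every source shares the same columns and types, adding population details (or, similarly, political actions) becomes a plain merge on a common key.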
What we managed to do in the Hackathon
We worked on all three parts to provide an end-to-end prototype that showcases how such projects can be done and motivates continued work on the project:
- For some countries (Switzerland, USA, UK, Italy) the data preparation is finished; others are close to finishing.
- Different models were discussed theoretically; one was implemented.
- The webpage will be up and running (but for now, possibly only for historical data).
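The writeup does not say which model was implemented. Purely as an illustration of what a simple spread forecast can look like with the SciPy stack, the sketch below fits a logistic growth curve to synthetic cumulative case counts and extrapolates it. The choice of a logistic curve, the function `logistic`, the parameter values and the data are all assumptions for this example.

```python
import numpy as np
from scipy.optimize import curve_fit


def logistic(t, k, r, t0):
    """Logistic growth: k = final size, r = growth rate, t0 = inflection day."""
    return k / (1.0 + np.exp(-r * (t - t0)))


# Synthetic, noise-free "observations" generated from known parameters
# (illustrative only, not real case data).
days = np.arange(30, dtype=float)
true_params = (10_000.0, 0.25, 20.0)
observed = logistic(days, *true_params)

# Rough initial guess: final size above the current maximum, moderate growth rate.
initial_guess = (observed.max() * 2, 0.2, days.mean())
params, _ = curve_fit(logistic, days, observed, p0=initial_guess, maxfev=10_000)

# Extrapolate two weeks beyond the observed window.
future_days = np.arange(30, 44, dtype=float)
forecast = logistic(future_days, *params)
```

A fit like this is only as good as its assumptions: real case data is noisy, and political actions change the growth rate over time, which is one reason to combine case counts with the other curated data.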
How we built it
We used the Python data science stack: Python, SciPy, NumPy and pandas. The webpage runs on the Oracle cloud platform, thanks to the help of one very capable Oracle employee who was a member of our team.
Challenges we ran into
Finding data for the countries was not simple, and cleaning it and defining standards took quite a bit of time. With 16 people we were a fairly large and heterogeneous team, which made coordination challenging. Virtual communication added to this: without face-to-face contact it is harder to coordinate, to overcome the timidity to speak up in calls, and to keep an overview of who is doing what.
What we are proud of and what we learned
Even though we were so many people and so heterogeneous, we were still able to function as a group. We are proud that nobody threw in the towel, that everyone stayed motivated, and that we worked together harmoniously. We learned how best to cooperate in a virtual team spread around the world: hold regular e-meetings, and clarify and assign tasks to each group member. The task is enormous, and since we had varying levels of expertise we needed to help each other. Even though nobody knew anyone else before the hackathon, we managed to do this smoothly, which is something to be proud of. Some people learned about software engineering, some about data science, others about managing, and we all had a lot of fun.
What's next for Spread Modelling
We want to continue the project, whether at another hackathon or not, and definitely as an ongoing effort.