EveryoneCounts - The Social Distancing Dashboard
Social distancing is one of the most important measures to reduce the spread of contagious diseases. Our solution makes social distancing measurable and therefore more tangible for all stakeholders in society.
The project continuously collects publicly available data sources and aggregates them into one common data pool. This pool is visualized in our dashboard. Every visitor can use it to easily check how political measures are affecting social distancing in different regions. All information is also available via an API so that other projects can connect to our data.
Our drive: citizens, politicians and data scientists
The coronavirus is spreading exponentially and has a significant impact on every single member of our society. Actors face different problems when implementing measures that restrict public life as we knew it. This MVP focuses primarily on Germany, its citizens, politicians and data scientists. Further iterations should include other stakeholders, such as companies, and other countries as well.
We created three focus personas:
Björn Bürger - Citizen
He follows the new social distancing routines, stays at home and watches TV. After two days he wants to know:
- "Will my behaviour change anything?"
- "Are other people doing it as well?"
Reiner Klein - Mayor of a typical mid-sized German city
With more and more crisis meetings in his itinerary, Reiner is very worried about the health of the inhabitants of his city. He frequently asks himself:
- "Do the measures work? Are there any trends?"
- "Do we need adjustments? Do we need further restrictions?"
Franziska Bartels - Dedicated data scientist
While working from home, Franziska is thinking about using her skills to help with the crisis. She wants to know:
- "Where can I get the necessary data to develop helpful applications for others?"
"Our vision is to actively help slow the spread of the coronavirus. We help by making the effect of political social distancing measures measurable."
We provide an intuitive, data-based tool that makes social distancing measurable and tangible for all social actors. Users do not need a deep understanding of data analytics. We use a relative score as a quick visualization. It is calculated as the current value relative to the average in a comparable scenario before the crisis. The scale currently runs from 0 to 1, or 0% to 100%. 0% would indicate total isolation without any contacts, while 100% would indicate no change in behaviour. Different data sources are available for the visualization.
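The relative score described above can be illustrated with a minimal sketch. It assumes the score is a plain ratio of the current value to the pre-crisis average; the function name `relative_score` and the example numbers are hypothetical and not taken from the project code.

```python
def relative_score(current: float, baseline: float) -> float:
    """Return the current activity as a fraction of the pre-crisis baseline."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return current / baseline


# Hypothetical numbers: 300 pedestrians counted today versus a
# pre-crisis average of 1200 on a comparable day.
score = relative_score(300, 1200)
print(f"{score:.0%}")  # prints "25%"
```

A score of 25% would mean that activity has dropped to a quarter of its usual level. Values above 100% are possible with this ratio if activity exceeds the baseline; how the dashboard handles that case is not covered here.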
The click-journey below shows an example of the GUI:
On the website the user can search for different regions. A coloured map directly reports on the effectiveness of social distancing in the desired region:
Many users might be interested in comparing regions, for example to see how their region compares to other places (with different measures). Several regions can therefore be selected for easy comparison.
To understand the effectiveness of the measures, it is important to explore trends. Our GUI can deliver daily scores as well as time series of historical data. Interested users can select different sources in a dropdown menu to explore the data themselves. Future features might include weather data or information about infections, shifted by a delay for the incubation period.
We support the use of the aggregated data for third-party analytics through an API. It is already available for large parts of the data and will be rolled out further. We announce progress on this work and newly implemented data sources on our usual channels.
Why do we need #EveryoneCounts?
Every persona above has their own needs. See what they are interested in:
All the requirements of the stakeholders above are addressed by our dashboard and the API. Unlike pseudo-anonymized mobile connection data, our data cannot be used to identify individuals.
What was achieved during the #WirVsVirus hackathon?
It started with a tweet by Philip. He pointed out that publicly available data can be used to quantify social distancing. With the help of a few other people, he used the #WirVsVirus hackathon to expand the idea. The goal was an interactive chart that included as many sources as possible.
Friday was the first day of the #WirVsVirus hackathon. Over the weekend the team identified 7 data sources, integrated them, and built a fitting IT infrastructure and an intuitive frontend with a free API. Some analysis examples can be seen on the project's Twitter account.
A fully functional API is our next goal. We are currently developing all interfaces to satisfy power-users like our persona Franziska.
The first iteration of our dashboard is complete and supplies information to our persona Björn Bürger. We are working on improving the user experience with more functions (comparison, inclusion of infection rates, weather, political measures, ...).
Short list of future tasks:
- Publication of data via public API
- Continuous development of the dashboard to improve the context of the data:
  - political measures
  - weather data
  - infection progress
- Implementation of other data sources and ideas (German)
- Securing financial support to keep the project running long term
The code for the technical implementation of EveryoneCounts - the social distancing dashboard is available on GitHub: https://github.com/socialdistancingdashboard/virushack/. A short overview is presented below.
The data pool is the heart of EveryoneCounts. It is based on several anonymized data sets from different sources. Custom scripts crawl these sets and store them as JSON files in an S3 bucket. Another script transfers them into a MongoDB database. This step includes mapping the data points to GPS coordinates and thus to countries, states and counties. During the hackathon a daily granularity was deemed sufficient, as this is also the finest temporal resolution needed in the frontend. The data is then displayed using the Streamlit Python framework.
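The transfer step described above can be sketched roughly as follows. This is an illustration using only the standard library, not the project's actual scripts: the record fields (`timestamp`, `lat`, `lon`, `value`), the `COUNTY_CENTROIDS` table and the nearest-centroid matching are all assumptions made for the example. The S3 and MongoDB calls are left out entirely.

```python
import json
import math
from datetime import datetime

# Hypothetical county centroids (name -> (lat, lon)). A real lookup would
# use official county geometries instead of nearest-centroid matching.
COUNTY_CENTROIDS = {
    "Köln": (50.94, 6.96),
    "München": (48.14, 11.58),
}


def nearest_county(lat: float, lon: float) -> str:
    """Pick the county whose centroid is closest in plain lat/lon distance."""
    return min(
        COUNTY_CENTROIDS,
        key=lambda name: math.dist((lat, lon), COUNTY_CENTROIDS[name]),
    )


def to_document(raw_json: str, source: str) -> dict:
    """Turn one crawled JSON record into a database-ready document."""
    record = json.loads(raw_json)
    timestamp = datetime.fromisoformat(record["timestamp"])
    return {
        "source": source,
        "date": timestamp.date().isoformat(),  # daily granularity
        "county": nearest_county(record["lat"], record["lon"]),
        "value": record["value"],
    }


# Made-up example record, shaped as it might arrive from a crawler.
raw = '{"timestamp": "2020-03-21T14:00:00", "lat": 50.93, "lon": 6.95, "value": 412}'
doc = to_document(raw, source="hystreet")
print(doc)
```

The resulting dictionary has the shape one might insert into a document database. Plain Euclidean distance on latitude and longitude is a crude simplification that is only used here to keep the sketch short.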
Our first focus was on sources that are publicly available. We wanted a strong, direct connection between the data and the activity of people in public spaces. More data sets are on their way to being implemented. Suggestions are always welcome.
hystreet.com collects data on pedestrians passing through shopping streets using laser scanners. They currently monitor 117 places in 57 cities in Germany. They opened their API for us and deliver hourly rates for pedestrian traffic. Historical data reaches back several years for comparison.
Many cities have counting stations for cyclists. The data is published by Eco-Compteur. The data for the 42 German stations is aggregated into our database. Example: http://eco-public.com/public2/?id=100004595
Deutsche Bahn has the biggest railroad network in Germany. They supply information about their connections via an API. This information can be used to see whether stops are skipped or connections are cancelled. An example is the change in connections in February:
- The diagrams show an increase in cancelled connections on 2020-02-08 and 2020-02-09. This was caused by the storm "Sabine".
- In the last part of the graph, the increase can be explained by coronavirus-related problems.
Google Maps can show how popular places are, for example to see whether museums or restaurants are crowded when planning events. This was the basis for the initial idea and the first data source for the project.
The Fraunhofer IoT-Reallabor Lemgo Digital collects data in the city of Lemgo, Germany. The data set includes pedestrian frequencies, noise and traffic data at different places in Lemgo. Real-time data is acquired by the IOSB-INA and processed in the Urban Data Platform by FIWARE. An HTTPS request returns a CSV file for us to use.
World Air Quality Project
Changes in air quality can point to increased or decreased traffic. These sensors can therefore be used to quantify social distancing measures. 402 stations of the World Air Quality Project are currently used for the German part of our data sets. The data set includes parameters like the local concentration of nitrogen oxides, sulphur dioxide, carbon monoxide, particulates and ozone, as well as weather data like temperature, humidity and air pressure.
There are many openly available webcams that show public places in our cities. Image detection can be used to count pedestrians. Data is collected on an hourly basis.
Aggregation and data processing
Every data source supplies different values at a different granularity. Therefore the data has to be aggregated and transferred into a database with a coherent location data structure. For now we have decided on a daily temporal granularity and an aggregation by county. The reference data is based on the same weekday, so that the change relative to "normal" can be derived.
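The aggregation described above could look roughly like the following sketch. It assumes records arrive as `(county, day, value)` tuples and that the reference is the mean of earlier days falling on the same weekday. The function names, the county and all numbers are hypothetical and not taken from the project code.

```python
from collections import defaultdict
from datetime import date
from statistics import mean


def daily_totals(records):
    """Sum (county, day, value) records into one total per county and day."""
    totals = defaultdict(float)
    for county, day, value in records:
        totals[(county, day)] += value
    return dict(totals)


def weekday_baseline(totals, county, target_day, reference_days):
    """Average the totals of reference days sharing target_day's weekday."""
    values = [
        totals[(county, day)]
        for day in reference_days
        if day.weekday() == target_day.weekday() and (county, day) in totals
    ]
    if not values:
        raise ValueError("no reference data for this weekday")
    return mean(values)


# Hypothetical counts for one county (all numbers are made up).
records = [
    ("Lippe", date(2020, 2, 1), 500),   # Saturday before the crisis
    ("Lippe", date(2020, 2, 1), 300),   # same Saturday, second reading
    ("Lippe", date(2020, 2, 8), 1200),  # Saturday before the crisis
    ("Lippe", date(2020, 2, 3), 2000),  # Monday, ignored for a Saturday
    ("Lippe", date(2020, 3, 21), 250),  # Saturday during the crisis
]
totals = daily_totals(records)
today = date(2020, 3, 21)
reference = [date(2020, 2, 1), date(2020, 2, 3), date(2020, 2, 8)]
baseline = weekday_baseline(totals, "Lippe", today, reference)
change = totals[("Lippe", today)] / baseline
print(f"{change:.0%} of a normal Saturday")
```

Comparing only against the same weekday keeps regular weekly patterns, such as busy Saturdays in shopping streets, from being mistaken for a change in behaviour.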
Read only, work in progress.
- Kineo.ai -> Server
- Google -> Google Maps API Credits
- Hystreet -> API
- Fraunhofer IOSB-INA -> Data of Lemgo Digital, data processing, video
- Eco Compteur -> Access to cyclist counting station data
Overview of completed sub-goals
- Explorative analysis of Hystreet data: https://github.com/socialdistancingdashboard/virushack/blob/master/hystreet/mean_profiles/hystreet_eda.md
- Explorative analysis of train connection data: https://github.com/socialdistancingdashboard/virushack/blob/master/zugdaten/README.md
- ML analysis from public webcams to count pedestrians: https://github.com/socialdistancingdashboard/virushack/tree/master/WebcamCounter
Social Distancing Dashboard Team for the WirVsVirusHackathon: https://devpost.com/software/12-social-distancing-dashboard