COVID-19 dashboards are everywhere, but this time we had an opportunity to explore a different kind of insight than the usual infection and death statistics, although we took inspiration from the existing COVID-19 dashboards out there.
What it does
It ingests country-wide news data and outputs a map view of the "Corona scare-level": how strongly "Corona" is trending in specific areas, or in other words, how scared the public is of Corona.
How we built it
We split the work into two main branches:
Data analytics - Exploration, clean-up, and location tagging (a German NLP model extracts "location" named entities). We extract COVID-related words using our tailor-made term list, which aims for minimal false positives, keep only the relevant articles, and then divide their count by the total number of articles for each location. We enriched the result with public health data to look for correlations. We also looked at IBM Watson Tone Analyzer but ran out of time.
User interface - Journalist Dashboard - A web app built with ReactJS and the d3 visualization library. It offers a map view and a statistics view based on the output of the analytics step, plus tag clouds. The app is hosted on IBM Cloud.
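A minimal Python sketch of the ratio described in the data analytics branch (relevant articles divided by total articles per location), assuming each article has already been tagged with a location. The term list `COVID_TERMS` and the function names are hypothetical placeholders, not the project's actual list or code, and whole-token matching is a simplification that would miss German compounds such as "Coronakrise".

```python
from collections import defaultdict

# Hypothetical term list for illustration only; the real tailor-made
# list used in the project is not shown in this write-up.
COVID_TERMS = {"corona", "coronavirus", "covid", "covid-19", "sars-cov-2"}


def is_covid_related(text, terms=COVID_TERMS):
    """Return True if any term appears as a whole token in the text.

    Matching whole tokens (rather than substrings) is one simple way to
    keep false positives low.
    """
    tokens = {tok.strip(".,;:!?\"'()").lower() for tok in text.split()}
    return not terms.isdisjoint(tokens)


def scare_level_by_location(articles):
    """Compute the share of COVID-related articles per location.

    `articles` is an iterable of (location, text) pairs, where the
    location is assumed to come from a prior NER tagging step.
    """
    total = defaultdict(int)
    related = defaultdict(int)
    for location, text in articles:
        total[location] += 1
        if is_covid_related(text):
            related[location] += 1
    # Every location in `total` has at least one article, so no division by zero.
    return {loc: related[loc] / total[loc] for loc in total}


if __name__ == "__main__":
    # Made-up sample articles, for illustration only.
    sample = [
        ("Berlin", "Corona Fallzahlen steigen."),
        ("Berlin", "Neuer Flughafen eröffnet."),
        ("München", "Covid-19: Schulen bleiben geschlossen."),
    ]
    print(scare_level_by_location(sample))
```

The resulting dictionary maps each location to a value between 0 and 1, which is the kind of per-area number a map view could color by.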
Challenges we ran into
- In terms of innovation, the challenge was fairly straightforward, so we looked for ways to add more value for the journalist (for example, numerical statistics and a tag cloud alongside the map view).
- We focused on the provided articles. We had to read up on NLP tagging and explore the area, which was challenging, and all of the articles were in German rather than English.
- Collaborating remotely.
Accomplishments that we're proud of
- Successfully tagged around 7k articles and completed the PoC for the Journalist Dashboard.
What we learned
- Working with a lot of data, NLP, data exploration, and working with maps in d3.js.
What's next for Corona Scare-Level
- There is still a long way to go in research and engineering to make it work in real time, digesting millions of articles, tweets, and more.