COVID-19 is one of the worst global crises in recent years, and curbing its impact requires a coordinated effort from governments down to individuals. Understanding society-wide trends and public sentiment is crucial to the success of this effort. For this reason, we want to provide a single platform where all relevant analytics are easily accessible, so that policy-makers, researchers, and individuals can better understand how the public perceives the current crisis.
What it does
Our CoroNow dashboard provides a graphical interface for accessing information and analytics about the virus outbreak. Specifically, our project does the following: it tracks the popularity of relevant hashtags on Twitter and generates a near-real-time dynamic word cloud; it uses Google Cloud machine learning functionality to track general sentiment towards the economy, public health, or the government in tweets addressing the virus; and it provides a fast news search and aggregation service.
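To illustrate the hashtag popularity tracking, here is a minimal sketch of how tag counts for a word cloud could be computed from a batch of tweet texts. It is not our production code: the function names `count_hashtags` and `top_tags` are hypothetical, and the choice to count each tag once per tweet and to ignore case is an assumption made for this example.

```python
import re
from collections import Counter

# Matches a '#' followed by word characters, e.g. "#COVID19".
HASHTAG_RE = re.compile(r"#(\w+)")


def count_hashtags(tweets):
    """Count how many tweets mention each hashtag (case-insensitive).

    Assumption for this sketch: a tag repeated inside one tweet
    counts only once, so spammy tweets do not inflate popularity.
    """
    counts = Counter()
    for text in tweets:
        tags = {tag.lower() for tag in HASHTAG_RE.findall(text)}
        counts.update(tags)
    return counts


def top_tags(tweets, n=3):
    """Return the n most popular tags as (tag, count) pairs."""
    return count_hashtags(tweets).most_common(n)
```

The resulting counts can then be mapped to font sizes by whatever word cloud component the frontend uses.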
How we built it
We divided the team into two task forces: two of us worked on the sentiment analysis/machine learning part, while the other three worked on building the frontend and backend of the website, including its integration with Firebase.
Challenges we ran into
One major challenge was implementing the sentiment analysis model. Initially, we intended to apply the domain-adapted-atsc model described in https://arxiv.org/pdf/1908.11860.pdf. However, we ran into a few problems when integrating it into our data pipeline, most notably a script that failed to transform the input data into the correct format, and we did not have enough time to fully debug it. Instead, we opted to use Google Cloud's pre-trained sentiment analysis and text classification functionality.
Another notable challenge was in the backend and in Firebase, where we needed to frequently transform data between various formats, which sometimes resulted in bugs that were only caught later. In the frontend, we had to make a lot of design choices, both in functionality and in UI logic. For example, one problem we encountered was making sure scrolling is disabled when certain menus or windows are open.
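One way to catch format bugs like these earlier is to validate records at the point where they are converted, instead of passing loose dictionaries around. The sketch below is only an illustration of that idea, not what we shipped: the `TagCount` record, its fields, and the `to_doc`/`from_doc` names are hypothetical, and the document shape is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

REQUIRED_FIELDS = {"tag", "count", "hour"}


@dataclass(frozen=True)
class TagCount:
    """Hypothetical record: how often a tag appeared in a given hour."""
    tag: str
    count: int
    hour: datetime

    def to_doc(self):
        """Convert to a plain dict suitable for a document store."""
        return {"tag": self.tag, "count": self.count,
                "hour": self.hour.isoformat()}

    @classmethod
    def from_doc(cls, doc):
        """Parse a stored dict, failing loudly on malformed data."""
        missing = REQUIRED_FIELDS - set(doc)
        if missing:
            raise ValueError(f"missing fields: {sorted(missing)}")
        count = doc["count"]
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(count, bool) or not isinstance(count, int):
            raise ValueError(f"count must be an int, got {count!r}")
        return cls(tag=str(doc["tag"]), count=count,
                   hour=datetime.fromisoformat(doc["hour"]))
```

With a check like this, a count that was accidentally stored as a string raises an error at the conversion boundary instead of surfacing later in the frontend.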
Accomplishments that we are proud of
We have an interactive, dynamically updating word cloud that adjusts itself to the window size. In addition, the historical trend of each hashtag can be displayed by clicking on that tag in the word cloud. Most importantly, we are very proud that our website is up and running on its own: it periodically fetches new tweets, performs data analysis and aggregation, and propagates the updated data from Firebase all the way to the frontend.
What I've learned
Personally, I have learned a lot about PyTorch and the BERT model, even though we did not end up using them. I also learned a lot about how to use Firebase and Google Cloud in a real production environment.
What's next for CoroNow
First, we want to change the time window for popularity tracking from hourly to daily, so that we can better observe how the popularity of tags changes in relation to the spread of the disease and filter out the fluctuation in user activity over the course of a day. We also seek to eventually apply the domain-adapted-atsc model, in order to achieve more accurate and targeted sentiment analysis on keywords. In addition, we want to continue improving our website's graphical design to make it look more modern and intuitive. Finally, we hope to introduce more data analytics functionality beyond hashtag popularity and sentiment analysis to provide more insight into the situation.
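The move from hourly to daily tracking mentioned first in this list amounts to summing the hourly counts of each tag per calendar day. A minimal sketch, assuming the hourly data is available as `(tag, hour, count)` tuples (this input shape and the `daily_totals` name are hypothetical):

```python
from collections import defaultdict
from datetime import datetime


def daily_totals(hourly):
    """Sum hourly tag counts into per-day totals.

    hourly: iterable of (tag, hour, count), where hour is a datetime.
    Returns a dict keyed by (tag, date) with the summed count.
    """
    totals = defaultdict(int)
    for tag, hour, count in hourly:
        totals[(tag, hour.date())] += count
    return dict(totals)
```

Note that `hour.date()` uses whatever timezone the timestamps carry, so the day boundaries depend on how the hourly data was recorded.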