What it does

The app has three main features:

1️⃣ Interactive Map - Shows the heat and saturation of the number of cases, deaths, and vaccination progress based on the user's geographical location.

2️⃣ Data Prediction - Graphs the platform's predicted number of COVID cases for the next seven days relative to the previous seven days.

3️⃣ Search Bar Comparison - Users can search for and compare the counties they wish to track. If a starred county has a spike in cases or its case count rises above the threshold, the user is notified.

How we built it

➡️ Forecasting a COVID-19 spike in a county: First, the COVID-19 data was scraped from the NYT GitHub repository. The tabular data records daily COVID-19 statistics for each county over a period of time. We then created a new feature called cases_delta, the daily change in the number of cases, computed as the difference between the number of cases on a given day and on the day before. We defined the risk of a COVID-19 spike over the next 7 days as the ratio between the expected number of new cases in the next 7 days and the number of new cases in the previous 7 days. Both sliding windows are 7 days long because each then covers exactly one week, so weekly seasonality affects both windows equally and does not distort the ratio. For the time series regression model we used the Facebook Prophet library. The Prophet model is fitted on the historical data (i.e. cases_delta) with a daily timestamp. We adopted a piecewise linear trend and left it to Prophet to find the change points automatically.
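The two definitions above (cases_delta and the 7-day risk ratio) can be sketched in a few lines of Python. This is only an illustration, not our production code: it assumes the scraped counts are cumulative, the function names and the sample numbers are made up, and the forecast list stands in for the output of the Prophet model.

```python
from typing import List


def daily_deltas(cumulative: List[int]) -> List[int]:
    """cases_delta: each day's cumulative count minus the previous day's."""
    return [today - yesterday for yesterday, today in zip(cumulative, cumulative[1:])]


def spike_risk(forecast_next_7: List[float], actual_prev_7: List[float]) -> float:
    """Expected new cases over the next 7 days divided by new cases over the previous 7 days."""
    if len(forecast_next_7) != 7 or len(actual_prev_7) != 7:
        raise ValueError("both windows must cover exactly 7 days")
    previous_total = sum(actual_prev_7)
    if previous_total <= 0:
        raise ValueError("risk is undefined when the previous window has no new cases")
    return sum(forecast_next_7) / previous_total


# Made-up cumulative counts for one county (8 days give 7 daily deltas).
cumulative_cases = [100, 110, 125, 130, 150, 165, 170, 190]
cases_delta = daily_deltas(cumulative_cases)

# Stand-in for the model forecast of daily new cases over the next 7 days.
forecast = [15.0] * 7

risk = spike_risk(forecast, cases_delta)
print(f"risk ratio: {risk:.2f}")
```

A ratio above 1 means more new cases are expected next week than were recorded last week.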

➡️ Estimating risk from neighbouring counties: A county does not exist in isolation, so we decided to develop a machine learning model that takes its neighbours into account. We investigated linear regression and a multilayer perceptron (MLP), evaluated their performance with mean squared error (MSE), and found neither satisfactory. We therefore trained a ridge regression model, with the optimal hyperparameter setting obtained through grid search cross-validation. We wrote more about how we trained our ML model in the linked Medium article.
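As a rough illustration of ridge regression tuned by grid search cross-validation, here is a dependency-free Python sketch. Everything specific in it is an assumption made for the example: the features (the risks of two neighbouring counties), the target (the tracked county's risk), the made-up data, the alpha grid, the 3 folds, and the omission of an intercept. It is not the model described in the Medium article.

```python
from typing import Dict, List, Sequence

Matrix = List[List[float]]


def solve(a: Matrix, b: List[float]) -> List[float]:
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def fit_ridge(xs: Matrix, ys: List[float], alpha: float) -> List[float]:
    """Closed-form ridge weights (X^T X + alpha I)^-1 X^T y, without an intercept."""
    d = len(xs[0])
    gram = [[sum(row[i] * row[j] for row in xs) + (alpha if i == j else 0.0)
             for j in range(d)] for i in range(d)]
    xty = [sum(row[i] * y for row, y in zip(xs, ys)) for i in range(d)]
    return solve(gram, xty)


def mse(xs: Matrix, ys: List[float], w: List[float]) -> float:
    """Mean squared error of the linear predictions xs @ w against ys."""
    preds = [sum(wi * xi for wi, xi in zip(w, row)) for row in xs]
    return sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(ys)


def grid_search_cv(xs: Matrix, ys: List[float], alphas: Sequence[float], k: int = 3) -> float:
    """Return the alpha with the lowest mean validation MSE over k interleaved folds."""
    scores: Dict[float, float] = {}
    for alpha in alphas:
        fold_errors = []
        for fold in range(k):
            train = [i for i in range(len(ys)) if i % k != fold]
            valid = [i for i in range(len(ys)) if i % k == fold]
            w = fit_ridge([xs[i] for i in train], [ys[i] for i in train], alpha)
            fold_errors.append(mse([xs[i] for i in valid], [ys[i] for i in valid], w))
        scores[alpha] = sum(fold_errors) / k
    return min(scores, key=scores.get)


# Made-up data: each row holds the risks of two neighbouring counties,
# and the target is the tracked county's risk.
xs = [[1.0, 1.2], [0.8, 1.1], [1.5, 0.9], [1.1, 1.0], [0.9, 1.4],
      [1.3, 0.7], [1.0, 0.8], [1.2, 1.3], [0.7, 1.0]]
ys = [0.6 * a + 0.4 * b for a, b in xs]

alphas = [0.01, 0.1, 1.0, 10.0]
best_alpha = grid_search_cv(xs, ys, alphas)
weights = fit_ridge(xs, ys, best_alpha)
print("best alpha:", best_alpha, "weights:", weights)
```

In practice a library implementation would replace the hand-written solver; the sketch only shows how the penalty strength is picked by comparing validation error across folds.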

Challenges we ran into

➡️ Finding an ML model that was accurate

Accomplishments that we're proud of

🏆 Training our model to update in real time and scrape data from the NYT GitHub

What we learned

A lot!

We developed a proof-of-concept COVID tracker that could be developed further into a mobile application. The tracker forecasts the risk of a COVID surge over the next 7 days for the tracked county, which the user sets in the app. This risk is defined as the ratio of the predicted number of new cases in the next 7 days to the actual number of new cases in the previous 7 days. The case numbers were forecast with the Facebook Prophet library. A model was fitted for each county, accounting for the trend and seasonality of that county's past daily case numbers.
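Written out as a formula, the risk definition reads as follows. The symbols are our own shorthand for this write-up and are not taken from the code: $c_t$ is the actual number of new cases on day $t$, $\hat{c}_t$ is the predicted number, and $T$ is the most recent day with data.

```latex
\mathrm{risk}_T = \frac{\sum_{t=1}^{7} \hat{c}_{T+t}}{\sum_{t=0}^{6} c_{T-t}}
```

The ratio is only defined when the previous 7 days contain at least one new case.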

We found that the predicted risk of each county is correlated with the predicted risks of its neighboring counties. This suggests that the flow of people between counties affects the predicted risk in the tracked county. To account for this, we trained a multilayer perceptron (MLP) model for the tracked county so that it learns the influence of neighboring counties on the predicted risk. The goal is for the MLP model to learn how trustworthy the fitted Prophet curve is for each neighboring county.

What's next for COVID Tracker

🎯 Develop a fully functional app so that the general public can use it

🎯 Add links and resources to the app
