CovidFreeMe

Click here for the live website. The site provides real-time information about the spread of the coronavirus in your community, along with tools to help you inform your friends.

Inspiration

Even at a time of crisis, it is hard to see how important one's actions are for the collective good. There are many people and employers who refuse to participate in social distancing to help flatten the curve and save lives. For example, see this video or this other video (There are way too many of these).

This is where CovidFreeMe comes in.

We want to tackle this problem by displaying visualizations and extrapolated data that clearly show the importance of social distancing, and by giving users the means to inform their friends and employers about the need to social distance. We also have a service that allows users to email or text their friends to encourage social distancing, or their employers to push them to adopt quarantine-friendly policies (e.g. allowing employees to work from home).

The only way we can tackle the problem of containing a global pandemic is if everyone is on the same page and does their part.

What it does

Our website shows users the current number of coronavirus cases and the predicted number of cases over the next 30 days. It also displays the availability of hospitals in your county or state for the next month to show that they are filling up, and helps you find directions to them in case of illness. These visualizations demonstrate the urgency of social distancing. Users can then use our personalized email template to write to their loved ones to remind them to social distance, or to employers who are not heeding the call to let employees work from home.

How we built it

Most projects that attempt to use data to map the number of cases take it from the Johns Hopkins University dataset, which experts have cast some doubt on due to inconsistent reporting. So instead, we spent part of the weekend building a data pipeline that uses web scraping to pull data from the most reliable sources available and merges it into one dataset.

We decided to use data from the ECDC for worldwide figures because it was easier to clean using Python libraries such as Pandas. We also took data from Harvard and Definitive Healthcare to determine the number of hospital beds in each area, cleaned it, and used it to calculate bed occupancy. Once we finished cleaning the data, we uploaded it to a MongoDB server for our frontend to fetch.
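As a rough illustration of the bed occupancy step, the sketch below joins per-region case counts with per-region bed counts and estimates what fraction of beds would be needed. The field names, the `RegionRecord` type, and the `hospitalization_rate` parameter are assumptions made for this example; they are not taken from our actual pipeline, which worked on Pandas data frames.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class RegionRecord:
    """One merged row: hypothetical field names, not the project's schema."""
    region: str
    active_cases: int
    total_beds: int


def merge_cases_and_beds(cases: dict[str, int], beds: dict[str, int]) -> list[RegionRecord]:
    """Inner-join two region-keyed tables, dropping regions missing from either."""
    shared_regions = sorted(cases.keys() & beds.keys())
    return [RegionRecord(r, cases[r], beds[r]) for r in shared_regions]


def bed_occupancy(record: RegionRecord, hospitalization_rate: float) -> float:
    """Estimate the fraction of beds needed by active cases.

    hospitalization_rate is an assumed input (the share of active cases
    that need a bed); it is not a figure from the project.
    """
    if record.total_beds <= 0:
        raise ValueError(f"no bed data for {record.region}")
    beds_needed = record.active_cases * hospitalization_rate
    # Cap at 1.0 so a region over capacity simply reads as full.
    return min(beds_needed / record.total_beds, 1.0)
```

Regions that appear in only one source are dropped here rather than guessed at, which is one simple way to handle inconsistent location names across datasets.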

We tried various machine learning models to predict (separately) the number of confirmed coronavirus cases, given the latitude and longitude and the previous numbers of deceased, confirmed, and recovered cases by region. After trying SVMs, linear and logistic regression (with polynomial feature expansion), and XGBoost, we settled on a three-layer neural network with the rectified linear (ReLU) activation function to perform regression. We then uploaded the predictions to the same MongoDB database for the frontend to fetch.
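To make the model shape concrete, here is a minimal sketch of the forward pass of a three-layer regression network with ReLU activations, written in plain Python. It assumes "three-layer" means three fully connected layers with a linear output; the weights in `EXAMPLE_LAYERS` are hand-picked placeholders rather than trained values, and training itself is left out.

```python
from typing import List, Tuple

Vector = List[float]
Matrix = List[Vector]
Layer = Tuple[Matrix, Vector]  # (weights, biases)


def relu(values: Vector) -> Vector:
    """Rectified linear activation: negative values become zero."""
    return [max(0.0, v) for v in values]


def dense(inputs: Vector, weights: Matrix, biases: Vector) -> Vector:
    """One fully connected layer: each row of weights feeds one output unit."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


def predict(features: Vector, layers: List[Layer]) -> float:
    """Forward pass: ReLU after every layer except the last (linear output)."""
    activations = features
    for index, (weights, biases) in enumerate(layers):
        activations = dense(activations, weights, biases)
        if index < len(layers) - 1:
            activations = relu(activations)
    return activations[0]


# Placeholder weights for a 2 -> 2 -> 2 -> 1 network (illustrative only).
EXAMPLE_LAYERS: List[Layer] = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]),
    ([[1.0, 1.0], [-1.0, 0.0]], [0.0, 0.0]),
    ([[2.0, 3.0]], [1.0]),
]
```

The final layer is left linear because the target is a case count rather than a class label, which is the usual choice for regression.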

We built a frontend using Flask and made API calls to ArcGIS and the Google Maps APIs for data visualizations. In particular, we use the Google Maps Places Autocomplete API to help users enter their location quickly and easily. We then use the Google Maps Geocoding API to look up the latitude and longitude of the user's location. With this data, we use the Google Maps Places API Nearby Search library to find nearby hospitals. Lastly, we display these hospitals using the Google Maps JavaScript API. We also built an email service using the SendGrid API that lets users send templated emails to their loved ones or employers. In addition, we made an SMS service using the Twilio API so that users can also reach their contacts by phone.
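The location lookup chain described above (address, then coordinates, then nearby hospitals) can be sketched as a pair of request builders for the Geocoding and Nearby Search web service endpoints. This is only an illustration of the order of calls: the helper names and the default search radius are assumptions, and our site may have made the equivalent calls from the browser through the JavaScript libraries instead.

```python
from urllib.parse import urlencode

GEOCODE_URL = "https://maps.googleapis.com/maps/api/geocode/json"
NEARBY_URL = "https://maps.googleapis.com/maps/api/place/nearbysearch/json"


def geocode_request(address: str, api_key: str) -> str:
    """Build the Geocoding request URL for a free-form address."""
    return f"{GEOCODE_URL}?{urlencode({'address': address, 'key': api_key})}"


def extract_lat_lng(geocode_response: dict) -> tuple:
    """Pull (lat, lng) of the first match out of a parsed Geocoding response."""
    location = geocode_response["results"][0]["geometry"]["location"]
    return location["lat"], location["lng"]


def nearby_hospitals_request(lat: float, lng: float, api_key: str,
                             radius_m: int = 10000) -> str:
    """Build the Nearby Search request URL for hospitals around a point.

    radius_m is an assumed default, not a value from the project.
    """
    params = {
        "location": f"{lat},{lng}",
        "radius": radius_m,
        "type": "hospital",
        "key": api_key,
    }
    return f"{NEARBY_URL}?{urlencode(params)}"
```

Each URL would then be fetched with an HTTP client and the JSON body passed on; the builders are kept separate from the network calls so they can be checked without an API key.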

Challenges we ran into

Working remotely was a difficult challenge for us because we could not coordinate as well as if we were working together in person. We needed to be proficient in communicating via Slack and scheduling Zoom meetings to keep up to date.

There was some difficulty working with the data because the locations of confirmed cases are not always consistent and columns are sometimes missing. This was the most time-consuming part of the development process.

Our machine learning models started off with very poor accuracy. It took us a while to realize that there was a mistake in the data pipeline. This mistake, along with some communication overhead within the team, cost us a lot of time: the teammate running the predictions was not aware of the mistake in the data until a few hours later, so the rest of the team had to wait for him to get back and rerun the models.

On the frontend side of things, it was a challenge to coordinate work on the website because we wanted it to look cohesive and concise. To keep our style consistent, we made mockups of the website in Figma so that all of us were on the same page.

Accomplishments that we're proud of

Despite the difficulty of not being able to meet in person, we were able to coordinate and deploy a polished and useful website. We were able to use neural networks to extrapolate from the case data and show users how this relates to their own county.

What we learned

We should have spent more time at the beginning scheduling checkpoint meetings. A lot of time was wasted when one of us was waiting for another person to respond. Every bug, mistake, or miscommunication cost us ten times as much time as it would have at an in-person hackathon.

What's next for CovidFreeMe

We would like to add more metrics for a more comprehensive overview of how hospitals are handling the COVID-19 outbreak. It would also help to add visualizations that compare and contrast the outcomes of no controls versus social distancing. This would help users see more directly how many lives they can save by choosing to stay at home. We plan to launch this to the world after perfecting our data and models, so that everyone has access to this tool and it can make the greatest impact possible.