CheckIfCovid

Images: High-level overview; Analysis from actual data; Architecture

(previously called "FindTheCluster")

Inspiration

Getting tested for COVID-19 can be challenging because testing kits are in limited supply and healthcare systems face an overwhelming patient load. Many individuals are only mildly symptomatic or asymptomatic, so they are often overlooked and deprioritized for testing.

What it does

CheckIfCovid is a self-reporting symptom survey that calculates the probability of a COVID-19 infection based on the participant’s input. It works by training a model on the data collected from other participants' responses and on data from confirmed COVID-19 cases, then using that model to estimate the likelihood that the reported symptoms are associated with COVID-19.
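One way to turn reported symptoms into a probability is a weighted score passed through a logistic function. The sketch below is illustrative only and is not the project's actual model: the symptom names, the weights, and the intercept are all hypothetical placeholders that a real model would learn from the survey and confirmed-case data.

```python
import math

# Hypothetical per-symptom weights (assumption, not from the project).
# A trained model would learn these from data.
SYMPTOM_WEIGHTS = {
    "fever": 1.2,
    "dry_cough": 1.0,
    "loss_of_smell": 1.8,
    "shortness_of_breath": 1.4,
}
# Hypothetical intercept: gives a low baseline when no symptoms are reported.
BIAS = -3.0


def infection_probability(reported_symptoms):
    """Map a collection of symptom names to a value in (0, 1)."""
    # Unknown symptoms contribute nothing to the score.
    score = BIAS + sum(SYMPTOM_WEIGHTS.get(s, 0.0) for s in reported_symptoms)
    return 1.0 / (1.0 + math.exp(-score))


if __name__ == "__main__":
    print(round(infection_probability({"fever", "dry_cough"}), 3))
```

With this shape, each additional recognized symptom raises the estimate, and symptoms the model does not know about leave it unchanged.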

Features

Instant assessment - Immediately know the probability that you have COVID-19 based on self-reported symptoms
Geolocated symptoms - Plot all reported symptoms on a map and identify clusters. Have you ever wondered whether other occupants of your building are also experiencing a persistent cough?
Privacy secured - No personally identifiable information will be stored.
Third-party integration - Integrate with existing health systems through our API. If you own a product, we encourage you to integrate (and anonymize) your data.
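The privacy and integration features above both depend on anonymizing records before they are stored or shared. A minimal sketch of that step follows, assuming a record is a flat dictionary. The field names (`name`, `email`, `phone`, `address`, `lat`, `lon`) and the `anonymize` helper are hypothetical, since the real survey schema is not shown here.

```python
# Hypothetical PII field names (assumption; the real schema may differ).
PII_FIELDS = {"name", "email", "phone", "address"}


def anonymize(record, precision=2):
    """Return a copy of record without PII fields and with coarsened coordinates."""
    # Build a new dict so the caller's record is left untouched.
    cleaned = {k: v for k, v in record.items() if k not in PII_FIELDS}
    # Rounding to 2 decimal places coarsens latitude to roughly 1 km.
    for key in ("lat", "lon"):
        if key in cleaned:
            cleaned[key] = round(cleaned[key], precision)
    return cleaned


if __name__ == "__main__":
    sample = {"name": "A", "lat": 47.60621, "lon": -122.33207, "symptoms": ["fever"]}
    print(anonymize(sample))
```

Coarsening coordinates still allows neighborhood-level clustering while making it harder to tie a report to a single address.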

How do we collect data

Survey - Self-reported symptoms
Scrapers - We get data of confirmed COVID-19 cases from CSSEGIS, ECDC, GISAID, KAGGLE.

Benefits

Provide data to the government to run more targeted testing on identified clusters.
Direct more information and resources to the identified clusters.
Raise awareness among occupants of the identified clusters.
Predict the probable locations of the next outbreaks.

Teams

We are currently organized into 3 teams, each with a corresponding team page:

Survey App (https://github.com/orgs/findthecluster/teams/survey-app)
API App (https://github.com/orgs/findthecluster/teams/api-app)
Data Science (https://github.com/orgs/findthecluster/teams/data-science)

Slack room: https://app.slack.com/client/T0103C6KMKM/C010RKNQ5QC

Challenges

We lacked a dataset of cases who reported symptoms and then tested negative. We therefore relied mainly on unsupervised learning to estimate whether a case is likely to test positive, given its symptoms and its proximity to one of the identified clusters.
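The clustering and proximity idea can be sketched with a plain K-means over report coordinates, followed by a distance lookup to the nearest cluster center. This is a simplified illustration under stated assumptions, not the team's pipeline: it treats (lat, lon) pairs as flat Euclidean coordinates, seeds the centroids with the first k points, and uses the hypothetical helper names `kmeans` and `distance_to_nearest_cluster`.

```python
import math


def kmeans(points, k, iterations=20):
    """Plain k-means on (lat, lon) pairs, seeded with the first k points."""
    centroids = [tuple(p) for p in points[:k]]
    for _ in range(iterations):
        # Assign every point to its nearest centroid.
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            groups[nearest].append(p)
        # Move each centroid to the mean of its group; keep it if the group is empty.
        centroids = [
            (sum(x for x, _ in g) / len(g), sum(y for _, y in g) / len(g)) if g else c
            for g, c in zip(groups, centroids)
        ]
    return centroids


def distance_to_nearest_cluster(point, centroids):
    """Euclidean distance from a report to the closest cluster center."""
    return min(math.dist(point, c) for c in centroids)


if __name__ == "__main__":
    reports = [(0, 0), (10, 10), (0, 1), (10, 11), (1, 0), (11, 10)]
    centers = kmeans(reports, k=2)
    print(centers, distance_to_nearest_cluster((0.5, 0.5), centers))
```

A small distance to a cluster center could then feed into the likelihood estimate alongside the symptom data. For real geographic data, a great-circle distance would be more accurate than the flat approximation used here.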

Created by

Built the survey app that communicates with our API. I was also the project manager for the survey team.

Mark Santiago
Software Engineer
Hospital medicine physician in Seattle, soon-to-be Infectious Disease fellow at Johns Hopkins. Medical consulting, writing, design, research

Diana Zhong, MD
Hospital medicine physician in Seattle, soon-to-be Infectious Disease fellow at Johns Hopkins. Illustration, game design, writing, research.
Worked on the API team and helped create the S3 utility that lets my teammates convert various data objects to CSV files, upload them to S3 buckets, and download data from S3 buckets.

Glory Adedayo
I am a computer science undergraduate and I'm passionate about computer programming especially for building powerful web applications.
Worked on building the survey app front end.
Contributed to brainstorming ideas for the survey form to improve data collection for processing.

Siddharth Pathak
I worked on identifying and building the ML model that identifies the clusters of symptomatic patients. The data was challenging, but it was a good learning experience.

Mohamed Abdelrehim
Worked as API and infrastructure lead. Helped design and implement the API service and the survey data storage pipeline using DynamoDB. Also helped set up the AWS account and onboard users and services, and helped design the complete architecture for the project using AWS services.

Rashnil Chaturvedi
Identified the ML model specification tailored to the dataset we pulled, indexed symptoms, web-scraped datasets at the province/county level and stored them on S3, and ran the prediction model that determines the infection probability of a survey respondent whose overall symptom score is above a certain threshold. I was also Product Manager of the Data Science team and coordinated communications.

Maria Christina Kalogera
Worked on a batch job that parses CSV data, normalizes the symptoms within those datasets, and stores them on S3 for later use by the ML team.

Set up CI/CD for the API.

Dominik Einkemmer
Focused on ML pipeline. Specifically, pre-processing of data for ML modelling.

Yaakov Bressler
Part of the Data Science team, implementing the K-means algorithm and visualizing the cluster spots.

Sebastian-Sye Klute
Fire starter. Product Manager for the concept, providing vision and guidance to the teams. Actor.

Efren Macasaet
Full stack engineer turned Product Manager, Expedia Group
Prince Owusu Attah
I'm an interaction designer working to solve complex human-computer interactions.
Jason Lee
Tech founder, hackathoner, researcher, mixologist, sports enthusiast, thought-leader, and lover of all things impossible.
Benjamin Von Wong