It can be difficult to stay informed and up to date with all the public health guidelines, and we lack a solid infrastructure for predicting and tracking COVID-19. We were inspired by recent research published in Science of the Total Environment, where scientists correlated high degrees of COVID-19 atmospheric diffusion, transmission, and lethality with high levels of particulate matter pollution (PM2.5). We believe that this connection can help communities make better decisions in the interest of public health.
What it does
A mobile Arduino device equipped with a dust sensor and a Bluetooth shield collects data on particulate matter pollution. The device is easy to replicate and can be attached to public transportation such as local bikes, scooters, and buses. The data collected by the Arduino is transmitted to the user's phone over Bluetooth using the 1Sheeld app, which then uploads it to a real-time, publicly available Internet of Things (IoT) platform. The real-time data is fed into an artificial neural network (ANN), which predicts pollution levels for the next 24 hours. The predicted pollution level is correlated with COVID-19 transmission risk and sent to our mobile iOS app, JAAN. The app then alerts users, based on their location, if a COVID-19 hotspot is likely to develop.
How I built it
The Arduino device was built around the Waveshare dust sensor, which uses an optical sensor to detect dust particles larger than 0.8 microns. The Arduino is also equipped with a 1Sheeld Bluetooth board, which communicates with the 1Sheeld app on a nearby phone. This device was tested by attaching it to a bike and riding around my neighborhood. The ANN was created using Python in a Jupyter Notebook. It was trained using supervised learning on publicly available air quality data from NYC Open Data. We trained it multiple times with different batch sizes and numbers of epochs, and analyzed the results with a linear regression model to find the best parameters. The 24-hour prediction is made by applying the model to a batch of current data. Only one hour's worth of data is needed to make the first prediction. Each subsequent prediction uses the previous prediction as its input, and this step repeats recursively until the full 24-hour forecast is complete.
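The recursive 24-hour prediction described above can be sketched roughly as follows. This is a minimal illustration, not our actual notebook code: `recursive_forecast` and `toy_model` are hypothetical names, and `toy_model` is only a stand-in for the trained ANN so that the example runs on its own.

```python
from typing import Callable, List


def recursive_forecast(predict_next: Callable[[float], float],
                       current_reading: float,
                       hours: int = 24) -> List[float]:
    """Roll a one-step-ahead model forward, one prediction per hour."""
    forecast: List[float] = []
    value = current_reading
    for _ in range(hours):
        # The previous prediction becomes the input for the next step.
        value = predict_next(value)
        forecast.append(value)
    return forecast


def toy_model(pm25: float) -> float:
    """Hypothetical stand-in for the trained ANN (assumption, not the real model)."""
    return 0.9 * pm25 + 1.0


# One current reading is enough to start the 24-step forecast.
predictions = recursive_forecast(toy_model, current_reading=12.0)
```

Because every step feeds on the previous output, any error in an early prediction carries forward into the later hours, which is one reason the accuracy of the one-step model matters so much.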
Challenges I ran into
We initially coded our mobile iOS app in Swift. However, we found that the backend was more complicated than anticipated. Being beginners, we decided to finish our prototype app in Figma to properly convey the user experience we envisioned. Additionally, while we were not new to Python or Arduino, integrating all of the components of the infrastructure, such as the ANN and the IoT platform, required a certain degree of creativity. We researched many different software tools before deciding on an optimized data flow.
Accomplishments that I'm proud of
After multiple rounds of training and parameter adjustment, we were able to tune the ANN so that the predictions made by the sequential model have a root mean square error of 0.03. This error is very small and suggests a high degree of accuracy in the prediction. This was no easy task and required hours of researching data sets, trying new configurations, and reading up on AI infrastructure.
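For readers unfamiliar with the metric, root mean square error can be computed as in the short sketch below. The function name `rmse` and the sample values are illustrative assumptions, not data from our project.

```python
import math
from typing import Sequence


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean square error between two equal-length sequences."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and the same length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up values, only to show the call.
example_error = rmse([0.0, 0.0], [3.0, 4.0])
```

RMSE is expressed in the same units as the values being predicted, so it is best read alongside the scale of the pollution data the model was trained on.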
What I learned
Although our final product did not include Swift, we strengthened our background in the language and improved our understanding of object-oriented coding. Additionally, we learned a lot about statistical analysis while assessing the accuracy of our AI. Lastly, through the different workshops and panels we attended, we learned a lot about how technology does not have to be limited to a computer science perspective and can be interfaced with public health, liberal arts, engineering, and so much more.
What's next for JAAN
We would like to realize our Figma prototype in Swift in order to complete the flow of data. Additionally, we would like to strengthen the ANN's predictive model by adding more inputs, such as temperature, precipitation, and other pollutants, and possibly to correlate other factors.