We want water quality information to be accessible to everyday drinkers. Residents are often unaware of violations involving primary and secondary contaminants. Historically, water violations have not always been addressed, and residents may not find out until the effects are already harmful. We want our tool to be interactive and intuitive.
What it does
Our program is an interactive map that displays water stations in California (over 15,000 statewide) within a short radius of the user's location. After entering a location, the user can select nearby water sources and investigate recent water reports for the area. Each report focuses on active violations and on recent measurements of primary and secondary chemicals that exceed the Maximum Contaminant Level (MCL).
How we built it
Data parsing and curation were done in Python with pandas. We used the address, city, and name of each water source to query the Google Places API and obtain its exact latitude and longitude. The web app runs on Flask, and the map interface is built with Leaflet.js. The location logic, which finds the water sources nearest to the user, uses Redis to speed up computation.
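The nearest-source lookup can be illustrated with a minimal plain-Python sketch. This is a stand-in for the idea, not the Redis-backed logic the project actually used; the function names, the `lat`/`lon`/`name` keys, the sample coordinates, and the 30 km radius are all assumptions made for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def sources_within(user_lat, user_lon, sources, radius_km):
    """Return (distance_km, name) pairs inside the radius, nearest first."""
    hits = []
    for s in sources:
        d = haversine_km(user_lat, user_lon, s["lat"], s["lon"])
        if d <= radius_km:
            hits.append((d, s["name"]))
    hits.sort()
    return hits


# Hypothetical water sources; names and coordinates are made up.
sources = [
    {"name": "Source B", "lat": 38.54, "lon": -121.74},
    {"name": "Source A", "lat": 38.58, "lon": -121.49},
    {"name": "Source C", "lat": 34.05, "lon": -118.24},
]

nearby = sources_within(38.57, -121.48, sources, radius_km=30)
nearby_names = [name for _, name in nearby]
```

A linear scan like this touches every source on every request, which is the kind of cost that motivates moving the lookup into a faster store once the list reaches 15,000 entries.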
Challenges we ran into
The dataset posed many challenges. We first had to find the correct dataset. Then we had to use the Google Places API to retrieve the exact latitude and longitude coordinates for the water sources. We also had to aggregate multiple data sources: chemical measurements for water quality, the Maximum Contaminant Level for each chemical, and the primary/secondary labeling of the chemicals. The runtime of the map with 15,000 water sources was also a concern, so we used Redis to speed up this computation.
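The aggregation step described above might look roughly like the following pandas sketch. The column names, source IDs, and numeric values are placeholders invented for illustration; they are not taken from the project's dataset and should not be read as regulatory limits.

```python
import pandas as pd

# Hypothetical chemical measurements, one row per source and chemical.
measurements = pd.DataFrame({
    "source_id": [1, 1, 2],
    "chemical": ["ARSENIC", "IRON", "ARSENIC"],
    "value": [12.0, 100.0, 4.0],
})

# Hypothetical lookup table: limit and primary/secondary label per chemical.
limits = pd.DataFrame({
    "chemical": ["ARSENIC", "IRON"],
    "mcl": [10.0, 300.0],
    "category": ["primary", "secondary"],
})

# Attach the limit and label to every measurement. A left join keeps
# measurements whose chemical has no known limit (their mcl becomes NaN).
report = measurements.merge(limits, on="chemical", how="left")

# Flag measurements above the Maximum Contaminant Level.
# Comparisons against NaN are False, so unknown limits are never flagged.
report["exceeds_mcl"] = report["value"] > report["mcl"]

violations = report[report["exceeds_mcl"]]
```

Here `violations` keeps only the rows that a per-source report would surface, along with the `category` column that separates primary from secondary contaminants.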
Accomplishments that we're proud of
We are proud of the data analysis, interactive map, and the scale of big data computation that this project required!
What we learned
We learned how to integrate a web app built by multiple programmers, how to work through data analysis and curation with an unfamiliar dataset, and what caveats come with this type of data.
What's next for How's My Water?
We would like to create a rating system to make the information more accessible to the user, for example differently colored pins for different danger levels, an A-to-F grade, or a red-to-green color scale.
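One way such a rating could work is sketched below. Nothing here is implemented in the project: the scoring rule (primary violations weighted double), the score cutoffs, and the pin colors are all hypothetical choices that would need to be tuned against real data.

```python
# Hypothetical grade bands: (maximum score, letter grade, pin color).
GRADE_BANDS = [
    (0, "A", "green"),
    (1, "B", "yellowgreen"),
    (3, "C", "yellow"),
    (5, "D", "orange"),
]


def grade_source(primary_violations, secondary_violations):
    """Map violation counts to a (grade, color) pair.

    Assumption: a primary violation counts twice as much as a secondary one.
    """
    score = 2 * primary_violations + secondary_violations
    for max_score, grade, color in GRADE_BANDS:
        if score <= max_score:
            return grade, color
    return "F", "red"  # anything above the last band


clean = grade_source(0, 0)
mixed = grade_source(1, 1)
severe = grade_source(3, 2)
```

The returned color could then be passed to the map layer to style each pin.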