SemantAqua is a tool that collects water quality data from different organizations (e.g. Environmental Protection Agency, United States Geological Survey) so that the data can be queried and presented in a more approachable manner than traditional data access mechanisms allow. Using regulation information from different states, the portal automatically classifies water sources as polluted or not. Queries can be tailored along multiple facets: data source, water regulations, water characteristics, and health concerns. With the "regulations" facet, community members can apply different state-level regulations to water data from their local communities to compare how strict different regulatory bodies are. With the "health concern" facet, users can specify a health concern, and the portal will return only the polluted sites that contain pollutants known to be related to that concern. For example, if a user is curious about contaminants related to circulatory system problems, the portal knows that arsenic, among other things, can cause circulatory problems and will return sites known to contain arsenic. SemantAqua also provides trend tools that show how pollutant concentrations or quantities change over time, along with changes in regulated levels, for all characteristics (e.g. acidity, temperature) and pollutants (e.g. arsenic, lead). Furthermore, the portal allows users to comment on measurement sites and discuss information related to the captured data. To aid communication between users and governmental sponsors, the portal presents links to the federal and state agencies where water quality issues in the local community can be reported, based on the ZIP code currently active in the system. Users can also take advantage of smartphones and the geolocation services available on these devices to find polluted water sites near their current location.
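The "regulations" and "health concern" facets described above can be illustrated with a minimal Python sketch. This is not SemantAqua's implementation, which relies on ontologies and a reasoner; the state names, threshold values, health-effect mapping, and the names `Measurement`, `violations`, and `polluted_sites` are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

# Hypothetical per-state limits in mg/L; NOT actual regulatory values.
REGULATIONS = {
    "StateA": {"arsenic": 0.010, "lead": 0.015},
    "StateB": {"arsenic": 0.005, "lead": 0.015},
}

# Hypothetical mapping from a health concern to related pollutants.
HEALTH_EFFECTS = {
    "circulatory": {"arsenic"},
    "neurological": {"lead"},
}


@dataclass(frozen=True)
class Measurement:
    site: str
    characteristic: str
    value: float  # mg/L


def violations(measurements, regulation):
    """Return the measurements that exceed the chosen regulation's limits."""
    limits = REGULATIONS[regulation]
    return [
        m for m in measurements
        if m.characteristic in limits and m.value > limits[m.characteristic]
    ]


def polluted_sites(measurements, regulation, health_concern=None):
    """Return sorted site names that are polluted under the regulation,
    optionally keeping only pollutants tied to a health concern."""
    hits = violations(measurements, regulation)
    if health_concern is not None:
        relevant = HEALTH_EFFECTS[health_concern]
        hits = [m for m in hits if m.characteristic in relevant]
    return sorted({m.site for m in hits})


# Made-up sample data.
SAMPLE = [
    Measurement("site-1", "arsenic", 0.008),
    Measurement("site-2", "lead", 0.020),
    Measurement("site-3", "arsenic", 0.002),
]

print(polluted_sites(SAMPLE, "StateA"))                 # ['site-2']
print(polluted_sites(SAMPLE, "StateB"))                 # ['site-1', 'site-2']
print(polluted_sites(SAMPLE, "StateB", "circulatory"))  # ['site-1']
```

Applying the two hypothetical regulations to the same data shows how a stricter arsenic limit flags an additional site, which is the kind of comparison the "regulations" facet is meant to support.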
The technical highlights of SemantAqua are that it: 1) captures the semantics of domain knowledge using a family of modular ontologies, 2) detects water pollution using an intelligent reasoner, 3) integrates data following Linked Data principles, and 4) preserves provenance metadata using the Proof Markup Language (PML). This work was advised by Joanne S. Luciano and Deborah L. McGuinness at the Tetherless World Constellation, Rensselaer Polytechnic Institute.
