Hack the Bay Chesapeake Bay Water Quality Hackathon

Challenge 1: Develop a Restoration Case Study (Time Series / Visualization Challenge)

Using data from CMC, the Chesapeake Bay Program, and supplementary sources, tell a story about how water quality has changed over time in the Chesapeake Bay watershed.


  1. Jupyter Notebook
  2. Presentation PDF
  3. Presentation Video

The Chesapeake Monitoring Cooperative (CMC) and the Chesapeake Bay Program (CBP) collect a variety of water quality measurements over time and across the entire Chesapeake Bay watershed. Because the watershed is so important to the tourism and fishing industries and is so heavily affected by land use and population, these monitoring efforts were created to build a picture of its long-term health.

The Middle Potomac-Catoctin and Rapidan-Upper Rappahannock HUC8s were selected for focus because they interface directly with the Potomac River, which feeds into the Chesapeake Bay, and because they contain a variety of land uses that reflects the overall variety of the Chesapeake Bay watershed. Within this region, focus was placed on the HUC12s that contained benthic quality measurements from the past three years. One of them, the Little Seneca Creek subwatershed, was the only HUC12 in the region that showed improvement in benthic quality over the past three years.

Using these benthic measurements, a time series of mean benthic rating per HUC12 can be displayed for both Little Seneca Creek and the overall Middle Potomac-Catoctin and Rapidan-Upper Rappahannock subwatersheds. However, plotting the measurement locations for each year shows that the measurements are generally not taken in the same locations year after year. For example, in 2006 a wider variety of locations was sampled, and this led to a severe decrease in the mean benthic rating for that year, even though the overall benthic quality of the subwatershed may not have changed in that way. To construct a consistent time series, it is necessary to examine only the locations with repeated measurements.
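The effect of changing sample locations can be illustrated with a minimal sketch. It is not the notebook's actual code: the record layout `(site_id, year, benthic_rating)`, the function names, and all values are hypothetical, and a site is assumed to count as "repeated" when it was sampled in at least two distinct years.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (site_id, year, benthic_rating). Values are illustrative only.
records = [
    ("A", 2005, 3.0),
    ("A", 2006, 3.0),
    ("A", 2007, 4.0),
    ("B", 2005, 4.0),
    ("B", 2007, 5.0),
    ("C", 2006, 1.0),  # sampled in a single year only
]


def repeated_sites(rows, min_years=2):
    """Return the set of sites sampled in at least `min_years` distinct years."""
    years_by_site = defaultdict(set)
    for site, year, _ in rows:
        years_by_site[site].add(year)
    return {site for site, years in years_by_site.items() if len(years) >= min_years}


def yearly_mean(rows):
    """Return the mean rating per year over the given rows."""
    ratings_by_year = defaultdict(list)
    for _, year, rating in rows:
        ratings_by_year[year].append(rating)
    return {year: mean(vals) for year, vals in sorted(ratings_by_year.items())}


keep = repeated_sites(records)
filtered = [row for row in records if row[0] in keep]

all_means = yearly_mean(records)      # mean over every sampled location
repeat_means = yearly_mean(filtered)  # mean over repeated locations only
```

In this made-up data, the one-off site "C" pulls the 2006 mean down when all locations are included, while the mean over repeated locations is unaffected by it.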

Plotting the time series of each individual repeated location offers a more detailed picture than the aggregated mean for the entire watershed, and makes it possible to locate the specific sites that may be improving or degrading over time. The same analysis can be extended to the entire Chesapeake Bay watershed to display all time series of repeated measurements, which can then be clustered and divided to locate areas experiencing improvement or degradation.

The time series generated by restricting to repeated measurements can be differenced and summed to determine the overall trend of each site (positive for an improving benthic rating, negative for a degrading one). Mapping this trend across the entire Chesapeake Bay watershed shows which regions are experiencing positive or negative changes, directing attention to places that are benefitting from best management practices and places that may need to implement them. Alongside this map of benthic rating trends, the locations of the CBP and CMC water quality measurements can be mapped and compared. Overall, there appears to be poor overlap between the benthic measurements and the water quality data, which complicates any attempt to correlate water quality variables such as pH, water temperature, and salinity with the benthic macroinvertebrate ratings. It is recommended that the CBP and CMC add more water quality monitoring sites where benthic quality measurements are taken in order to provide a holistic view of watershed health.
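The differencing-and-summing step described above can be sketched as follows. This is an illustration under assumptions, not the notebook's implementation: the record layout `(site_id, year, benthic_rating)`, the function name `site_trends`, and the sample values are all hypothetical.

```python
from collections import defaultdict

# Hypothetical repeated-site records: (site_id, year, benthic_rating).
records = [
    ("A", 2007, 4.0),
    ("A", 2005, 3.0),
    ("A", 2006, 3.0),
    ("B", 2005, 4.0),
    ("B", 2007, 2.0),
    ("C", 2006, 1.0),  # a single observation has no trend
]


def site_trends(rows):
    """Sum the year-to-year differences in rating for each site.

    Positive values indicate an improving rating, negative values a degrading one.
    Sites with fewer than two observations are skipped.
    """
    series = defaultdict(list)
    for site, year, rating in rows:
        series[site].append((year, rating))

    trends = {}
    for site, observations in series.items():
        observations.sort()  # order each site's observations by year
        ratings = [rating for _, rating in observations]
        if len(ratings) < 2:
            continue
        diffs = [later - earlier for earlier, later in zip(ratings, ratings[1:])]
        trends[site] = sum(diffs)
    return trends


trends = site_trends(records)
```

Note that the sum of consecutive differences telescopes to the last rating minus the first, so this measure captures the net change at a site and ignores fluctuations in between.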


The following data sources were used in this analysis:

  1. CBP and CMC Water Quality Data
  2. Benthic Macroinvertebrates Data
  3. USGS Watershed Boundary Dataset

The following Python libraries were used in this analysis:

  1. pandas
  2. NumPy
  3. GeoPandas
  4. Cartopy
  5. Matplotlib
