There has been ample work on how land use and land cover change affects water quality in the Chesapeake Bay Watershed (CBW), and recent studies have shown that spatial location is one of the most important factors determining pollution loads to the bay. One important spatial factor is near-surface bedrock geology, which comes into contact with groundwater, streams, and rivers throughout the CBW. Some rock types contain minerals and other components that are reactive towards water, and can thus change the chemistry of the water flowing through the watershed and into the Chesapeake Bay. Geologic maps display information on the spatial distribution of different rock types in a given region. However, traditional geologic maps emphasize age and stratigraphic relationships between rock types rather than their chemical reactivity. A study by the USGS devised a lithogeochemical geologic map where the map units are classified based on their composition, mineralogy, and texture. In this way, we can see the spatial distribution of rock types based on their potential effects on water chemistry.

This partial hackathon submission utilizes the lithogeochemical map produced by the USGS along with water quality data from the Chesapeake Bay Program and Chesapeake Monitoring Cooperative to determine if this lithogeochemical classification can be of use in predicting water quality across the CBW. As a partial submission, the main goal of this project is to draw attention to the importance of rock geochemistry in understanding and restoring the Chesapeake Bay Watershed, and to be a starting point for further studies that will utilize this information.

What it does

I performed hypothesis tests on the geology and water quality data, using nitrate as an example, to determine whether pollution loads differed significantly among water sampling stations located in different lithogeochemical regions. This can help researchers and policy makers understand whether geology is an important factor controlling water quality in the CBW. Using a lithogeochemical map rather than a traditional geologic map makes it easier to understand which physico-chemical mechanisms throughout the watershed are shaping water quality, because the map is devised with these processes in mind. This analysis can help decision makers decide where and how to plan land use based on geology.

How I built it

This preliminary analysis was built entirely in Python. The geopandas package was used to read in the geospatial data and to spatially join the water quality data to the geology data. The pandas package was used to manipulate dataframes, and matplotlib and seaborn were used for data visualization. Finally, the scipy package was used to perform the statistical hypothesis tests. The test used was the Mann-Whitney U test, a non-parametric test that does not assume the data are normally distributed. The p-values from this test were compared against a significance level of 0.05 to determine whether the differences in nitrate concentrations between lithogeochemical rock types were statistically significant.
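The testing step described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the column names (rock_type, nitrate_mg_l), the unit labels (unit_A, unit_B, unit_C), the helper pairwise_mannwhitney, and all nitrate values are made up for the example. The rock_type column stands in for the label each station would receive from the spatial join.

```python
from itertools import combinations

import pandas as pd
from scipy.stats import mannwhitneyu

ALPHA = 0.05  # significance level stated in the write-up


def pairwise_mannwhitney(df, group_col, value_col, alpha=ALPHA):
    """Run a two-sided Mann-Whitney U test for every pair of groups.

    Hypothetical helper for illustration only.
    """
    # One Series of measurements per group, with missing values dropped.
    groups = {name: g[value_col].dropna() for name, g in df.groupby(group_col)}
    rows = []
    for a, b in combinations(sorted(groups), 2):
        stat, p_value = mannwhitneyu(groups[a], groups[b], alternative="two-sided")
        rows.append(
            {
                "group_a": a,
                "group_b": b,
                "U": stat,
                "p_value": p_value,
                "significant": p_value < alpha,
            }
        )
    return pd.DataFrame(rows)


# Synthetic nitrate values (assumed units: mg/L), NOT real monitoring data.
# unit_A is deliberately higher; unit_B and unit_C overlap.
stations = pd.DataFrame(
    {
        "rock_type": ["unit_A"] * 6 + ["unit_B"] * 6 + ["unit_C"] * 6,
        "nitrate_mg_l": [
            4.1, 5.3, 6.0, 4.8, 5.5, 6.2,
            1.0, 1.4, 0.9, 1.2, 1.6, 1.1,
            1.3, 0.8, 1.5, 1.0, 1.7, 1.2,
        ],
    }
)

results = pairwise_mannwhitney(stations, "rock_type", "nitrate_mg_l")
print(results)
```

With these synthetic values, the pairs involving unit_A fall below the 0.05 threshold while the unit_B and unit_C pair does not, which mirrors the kind of mixed outcome reported in this write-up. Note that this sketch applies no correction for running many pairwise tests.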

Challenges I ran into

Working solo was my biggest challenge, as I had no teammates to bounce ideas off or to answer my questions. My knowledge of predictive modeling is limited, which severely limited the progress of this project. Working with three-dimensional data posed an additional challenge: because the results depend on time as well as on spatial position both above and below ground, it is difficult to determine how accurate they are.

Accomplishments that I'm proud of

  • Utilizing a unique geological dataset. This nontraditional geologic map can bring a new perspective to modeling geographical influences on nutrient pollution and water quality in the CBW.
  • Working alone and figuring out solutions to problems that I encountered on my own.
  • Taking advantage of my knowledge as a geochemist and combining it with data science to perform an analysis.

What I learned

There are statistically significant differences in nitrate loads between some rock types classified lithogeochemically. This suggests that rock lithogeochemistry could be a useful predictor of water quality in the CBW when incorporated into predictive machine learning models. Other rock types do not have significantly different nitrate loads, and the reasons for this can be investigated further by determining which characteristics make these rock types react similarly.

What's next for A Potential Litho-geochemical Predictor of Pollution Loads

  • Collaboration: Team up with data scientists and machine learning scientists to build a predictive model around geology
  • Utilize land use/land cover data sets to deconvolve other possible factors affecting water quality in differing lithogeochemical regions
  • Clean up data more so that temporal changes are taken into account
  • Determine how rock type affects other water quality indicators such as phosphate and pH
  • Understand why certain rock types do not have significantly different nitrate loads
  • Incorporate surficial rock types and groundwater data into analysis
  • Incorporate lithogeochemical rock types into predictive models of Chesapeake Bay water quality
