Inspiration
Monarch butterfly populations are an important indicator of ecosystem health, and they have been declining. Without pollinators, our food supply would be in jeopardy. We set out to investigate whether air pollution, measured by the Air Quality Index (AQI), and temperature might play a role in this decline. To do so, we used Monarch butterfly sighting data from Journey North and air quality and temperature readings from the EPA.
What it does
This project maps Monarch butterfly sightings across U.S. counties from 1996–2024 and compares them with local air quality and temperature data. It aims to identify correlations between declining Monarch populations and environmental trends.
How we built it
Using Python (Pandas and Matplotlib), SQL, and Node.js, we integrated the datasets and visualized the data on maps. We also analyzed correlations between sightings and two environmental factors, AQI and temperature.
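As a rough illustration of the integration step, here is a minimal Pandas sketch that totals sightings per county and year and joins them to AQI readings. The column names (`county_fips`, `year`, `count`, `median_aqi`) and the toy values are assumptions made for this example, not the project's actual schema or data.

```python
import pandas as pd

# Toy data for illustration only; column names and values are assumptions.
sightings = pd.DataFrame({
    "county_fips": ["06037", "06037", "06037", "48201"],
    "year": [2019, 2019, 2020, 2019],
    "count": [3, 5, 2, 7],
})
aqi = pd.DataFrame({
    "county_fips": ["06037", "48201"],
    "year": [2019, 2019],
    "median_aqi": [72.0, 58.0],
})

# Total sightings per county and year.
yearly = sightings.groupby(["county_fips", "year"], as_index=False)["count"].sum()

# A left join keeps county-years that have sightings but no AQI reading;
# their median_aqi is NaN, which makes the gaps visible instead of dropping them.
merged = yearly.merge(aqi, on=["county_fips", "year"], how="left")
print(merged)
```

Keeping unmatched rows as NaN is one possible design choice; it lets later steps decide explicitly whether to drop or fill them.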
Statistical Analysis Used:
- Choropleth Map Visualization
- Standard deviation and data normalization
- Cohen’s d (effect size: the standardized difference between the means of two groups)
- Linear Correlation Coefficient and Correlation Matrix
- Time-Series Cross-Correlation
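To make a few of the measures listed above concrete, here is a minimal sketch of z-score normalization, Cohen's d, and a lagged cross-correlation using NumPy and Pandas. The function names are hypothetical helpers written for this example, and the pooled-standard-deviation form of Cohen's d is an assumption; it is not necessarily the exact variant we used.

```python
import numpy as np
import pandas as pd

def zscore(values):
    """Normalize values to mean 0 and sample standard deviation 1."""
    s = pd.Series(values, dtype=float)
    return (s - s.mean()) / s.std(ddof=1)

def cohens_d(a, b):
    """Cohen's d using the pooled sample standard deviation."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * a.var(ddof=1) + (nb - 1) * b.var(ddof=1)) / (na + nb - 2)
    return (a.mean() - b.mean()) / np.sqrt(pooled_var)

def lagged_correlation(x, y, max_lag):
    """Pearson correlation between x[t] and y[t + lag] for each lag."""
    x = pd.Series(x, dtype=float)
    y = pd.Series(y, dtype=float)
    # shift(-lag) aligns y[t + lag] with x[t]; NaN pairs are ignored by corr().
    return pd.Series({lag: x.corr(y.shift(-lag)) for lag in range(-max_lag, max_lag + 1)})

# Toy series: y repeats x one step later, so the correlation peaks at lag 1.
x = [1, 3, 2, 5, 4, 6, 8, 7]
y = [0] + x[:-1]
lags = lagged_correlation(x, y, max_lag=2)
d = cohens_d([2, 4, 6], [1, 3, 5])
```

In this setup a positive lag means `y` trails `x`, so a peak at a positive lag would suggest that changes in the first series tend to precede changes in the second.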
Challenges we ran into
Data gaps: we faced challenges in filling data gaps and inferring county information from Journey North’s city-based data. Sighting data collection began in 1996, when internet use was not yet common, so the early data is sparse. Collection increased from 2010 onward, but we saw drops in years such as 2020, when people were under COVID lockdowns. Because sighting reports are voluntary, the consistency and reliability of the data over time are open to question.
Inconsistent air quality reporting and lack of matching temperature readings over time also made it difficult to draw firm conclusions in certain areas.
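The county inference mentioned above can be sketched as a lookup join. This is only an illustration under stated assumptions: the `city_to_county` table is a hypothetical stand-in for whatever city-to-county reference is available, and it treats each city as belonging to a single county, which is not always true in practice.

```python
import pandas as pd

# Toy sighting records with inconsistent city spelling (illustrative only).
sightings = pd.DataFrame({
    "city": ["Austin", "austin ", "Springfield", "Nowhereville"],
    "state": ["TX", "TX", "IL", "TX"],
})

# Hypothetical lookup table; assumes one county per (city, state) pair.
city_to_county = pd.DataFrame({
    "city": ["Austin", "Springfield"],
    "state": ["TX", "IL"],
    "county": ["Travis", "Sangamon"],
})

def with_city_key(df):
    """Add a normalized join key so 'austin ' and 'Austin' match."""
    out = df.copy()
    out["city_key"] = out["city"].str.strip().str.lower()
    return out

# indicator=True adds a _merge column that flags rows with no county match.
matched = with_city_key(sightings).merge(
    with_city_key(city_to_county)[["city_key", "state", "county"]],
    on=["city_key", "state"],
    how="left",
    indicator=True,
)
unmatched = matched[matched["_merge"] == "left_only"]
print(unmatched[["city", "state"]])
```

Listing the unmatched rows separately makes it possible to review them by hand instead of silently losing those sightings.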
Accomplishments that we're proud of
We successfully created an integrated map that displays Monarch sightings for each U.S. county. Our analysis did not reveal a notable pattern linking pollution to Monarch populations.
What we learned
We learned how to use an API for data collection, how to handle inconsistencies and gaps in very large datasets, and how to use the Pandas library to work with data tables in Python.
What's next for The Journey North DataThon
We plan to analyze air pollution in more depth by looking at fine-grained readings of the pollutants from which the Air Quality Index is derived (ground-level ozone, particle pollution, carbon monoxide, nitrogen dioxide, and sulfur dioxide) to build a more comprehensive model. Eventually, it would also be interesting to incorporate the impact of pesticide use in the counties studied and to analyze whether the earlier stages of the butterfly life cycle are affected by any of these factors more than the adult populations are.
Built With
- .csv
- .txt
- api
- epa
- geopandas
- javascript
- json
- jupyter
- matplotlib
- node.js
- numpy
- pandas
- python
- sql
- sqlite
- us-census-bureau
- web-scraping
