Inspiration
The monarch butterfly is a flagship species that serves as an educational ambassador to the public about migration. Its ecological and cultural significance underscores the importance of supporting this pollinator and its role in maintaining biodiversity in the United States.
What it does
We wrote scripts to scrape global adult monarch butterfly sighting web pages and refine the data to capture unique US city and state combinations. Then, an OpenAI API was used to assign counties to each unique combination and integrate that data into the master data set. This data was then used to create ridgeline plots of adult monarch butterfly sightings over time per state, showing localized migration patterns through the seasons.
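The refinement step can be sketched roughly as follows. This is a minimal illustration, not our actual script: the `city` and `state` field names, the `unique_city_state` function, and the two-letter state codes are all assumptions made for the example.

```python
def unique_city_state(sightings, us_states):
    """Return unique (city, state) pairs for US sightings, in first-seen order.

    `sightings` is a list of dicts with hypothetical "city" and "state" keys;
    `us_states` is a set of two-letter state codes used to drop non-US rows.
    """
    seen = set()
    result = []
    for row in sightings:
        # Normalize whitespace and case so "austin " and "Austin" match.
        city = row["city"].strip().title()
        state = row["state"].strip().upper()
        if not city or state not in us_states:
            continue  # skip rows with no city or a non-US region
        key = (city, state)
        if key not in seen:
            seen.add(key)
            result.append(key)
    return result
```

Each pair in the returned list only needs to be sent for a county lookup once, and the result can then be joined back onto every sighting that shares that city and state.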
We also analyzed air quality data. While a broad decrease in greenhouse gases was notable, the correlation between butterfly population and various air pollutants differed by region.
Using agricultural census data, we created a script that generates a sampling distribution of counties for a particular crop in a given state via bootstrapping. This generates a list of counties that the crop is most likely to have originated from while still preserving uncertainty, cautiously avoiding overly assumptive imputation.
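One way the bootstrapping idea could look in code is shown below. It is a sketch under stated assumptions rather than our exact script: the `(county, acres)` record layout, the `bootstrap_county_shares` name, and the 95% percentile interval are all chosen for illustration.

```python
import random
from collections import defaultdict


def bootstrap_county_shares(records, n_boot=1000, seed=0):
    """Bootstrap each county's share of a crop's acreage within one state.

    `records` is a list of (county, acres) rows with positive acres
    (a hypothetical layout for the census data).
    """
    rng = random.Random(seed)
    counties = sorted({county for county, _ in records})
    shares = defaultdict(list)
    for _ in range(n_boot):
        # Resample rows with replacement, keeping the original sample size.
        sample = rng.choices(records, k=len(records))
        total = sum(acres for _, acres in sample)
        acres_by_county = defaultdict(float)
        for county, acres in sample:
            acres_by_county[county] += acres
        for county in counties:
            shares[county].append(acres_by_county[county] / total)
    summary = {}
    for county in counties:
        values = sorted(shares[county])
        summary[county] = {
            "mean": sum(values) / n_boot,
            # Percentile interval taken from the bootstrap distribution.
            "low": values[int(0.025 * (n_boot - 1))],
            "high": values[int(0.975 * (n_boot - 1))],
        }
    return summary
```

Reporting an interval for every county, rather than assigning the crop to a single most likely county, is what keeps the uncertainty visible downstream.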
How we built it
As a team with diverse backgrounds (computer science, mathematics, and statistics), each hacker used the language they were most comfortable with, namely Python or R. VSCode was the IDE of choice to maintain team-wide version control. Generative AI tools such as Copilot, ChatGPT, and Claude were used to assist in coding tasks and debugging.
Challenges we ran into
Setting up the virtual environment and GitHub was our first hurdle. We also had to relate databases that shared no common keys, which required stepwise translation from one dataset to the next.
Accomplishments that we're proud of
Tenacity in pursuing solutions in the face of an extremely challenging problem! We all stayed onsite all night to keep generating ideas and chipping away at the tasks. Getting data visualizations of migration patterns working in CODAP and with various Python packages was a huge confidence boost that helped keep us going.
What we learned
Many of our analyses failed to reject the null hypothesis, a result we are ethically bound to stand by. While we can by no means declare that there are no negative impacts on monarch butterfly populations, we resist the urge to run repeated investigations until a positive result is obtained. The replicability crisis in scientific research is due in part to a positive-results bias that motivates investigators to root out correlations that may not definitively exist.
Built With
- chatgpt
- claude
- copilot
- r
- vscode
