Inspiration
In the U.S., more than 17 million people live in food deserts, which are areas with limited access to healthy, affordable food. Without that access, residents often rely on cheap, convenient, unhealthy food, which contributes to poor health. By analyzing data, we can identify key causes and indicators of food deserts and help government programs direct resources to the areas that need them most.
What it does
We identified and visualized trends in the data to show the relationships between LILA (low income and low access) status and other factors, including median income, poverty rates, and the number of SNAP-authorized stores.
How we built it
We used Python with pandas, Matplotlib, and seaborn to clean, process, and visualize our dataset. We first cleaned the data by removing null values and irrelevant columns. We then created scatter plots and a density map of the United States.
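The cleaning step described above can be sketched roughly as follows. This is a minimal illustration, not our actual pipeline: the toy rows, the column names other than "FIPS", and the `clean` helper are all assumptions made up for the example.

```python
import pandas as pd

# Hypothetical toy data; values and most column names are illustrative only.
raw = pd.DataFrame({
    "FIPS": ["01001", "01003", "01005", "01007"],
    "MedianIncome": [58000.0, None, 34000.0, 47000.0],
    "PovertyRate": [12.1, 10.4, 27.9, None],
    "Notes": ["a", "b", "c", "d"],  # stands in for an irrelevant column
})


def clean(df, keep):
    """Keep only the listed columns, then drop rows with a null in any of them."""
    return df[keep].dropna().reset_index(drop=True)


cleaned = clean(raw, ["FIPS", "MedianIncome", "PovertyRate"])
print(cleaned)
```

Selecting columns before calling `dropna` means a null in a column we were going to discard anyway does not cost us a row.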
Challenges we ran into
We had a hard time cleaning our data, as we kept getting row explosions when joining our datasets. We fixed this by merging on "FIPS" instead of "COUNTY" + "STATE". We also struggled to find correlations between variables in our datasets. We resolved this by computing a matrix of correlation coefficients between every pair of variables to see which were most strongly correlated.
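Here is a small sketch of both fixes, under stated assumptions. It shows one way a row explosion can happen (a join key that is not unique in either table), not necessarily the exact cause in our data. The tables, the FIPS values, and every column name other than "FIPS", "COUNTY", and "STATE" are hypothetical.

```python
import pandas as pd

# Hypothetical tables where several rows share a county and state,
# but each row has its own unique FIPS code.
access = pd.DataFrame({
    "FIPS": ["01001020100", "01001020200", "01003010100"],
    "COUNTY": ["Autauga", "Autauga", "Baldwin"],
    "STATE": ["AL", "AL", "AL"],
    "LILA": [1, 0, 1],
})
income = pd.DataFrame({
    "FIPS": ["01001020100", "01001020200", "01003010100"],
    "COUNTY": ["Autauga", "Autauga", "Baldwin"],
    "STATE": ["AL", "AL", "AL"],
    "MedianIncome": [41000, 62000, 38000],
    "PovertyRate": [24.0, 9.5, 27.0],
})

# Non-unique key: each Autauga row matches both Autauga rows (2 x 2 = 4),
# plus one Baldwin match, so 3 input rows become 5.
exploded = access.merge(income, on=["COUNTY", "STATE"])
print(len(exploded))  # 5

# Unique key: one output row per input row. validate="one_to_one" makes
# pandas raise an error if the key is duplicated on either side.
merged = access.merge(
    income.drop(columns=["COUNTY", "STATE"]),
    on="FIPS",
    validate="one_to_one",
)
print(len(merged))  # 3

# Pairwise Pearson correlations, then rank variables by the strength
# of their correlation with LILA.
corr = merged[["LILA", "MedianIncome", "PovertyRate"]].corr()
ranked = corr["LILA"].drop("LILA").abs().sort_values(ascending=False)
print(ranked)
```

Comparing `len()` before and after a merge, or passing `validate`, is a cheap way to catch this kind of problem early. The correlation matrix can also be passed to seaborn's `heatmap` for a visual overview.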
Accomplishments that we're proud of
We are proud of our visualization of high-risk areas on the U.S. map. Processing the data and building this visualization went beyond what we knew at the start, and we learned a lot along the way.
What we learned
We learned that some variables we expected to correlate did not, because confounding factors were involved.
What's next for Food Desert Analysis
We hope to identify more of the confounding factors behind our correlations, so that our analysis and suggestions are more accurate and representative.
Built With
- matplotlib
- pandas
- python
- scikit-learn
- seaborn

