Outline your ideation and development process.

We were inspired to create a tool to help small farmers, such as those who run local farmers markets, because they lack the resources that big chains and grocery stores have access to. We wanted to create an intuitive and convenient platform that helps farmers understand the impact that various temperatures can have on their crops. We decided to let farmers visualize the entire growing season, since crops are affected more by weather over time than by conditions in any immediate period.

We started by collecting our initial data, filtering datasets to the five Midwestern states in the desired year range. We additionally filtered to the months of May through September to focus specifically on the main growing season. Once we had collected our data, we used the SQL Editor and Genie to consolidate it into one main table that we could compare with the provided agricultural data. For the initial model training, we focused on a smaller dataset of one county from each state while we were still cleaning our larger dataset. We decided that a linear regression ML model was best suited after observing that weather trends in the Midwest were approximately linear during the growing season. It was also the model that gave us the best R^2 value.

Finally, to create the dashboard, we added the visualization widgets that we felt best represented the data at hand and populated the charts with the desired dataset. We increased interactivity and ease of use by showing additional information, such as the potential cause of identified anomalies, when the end user hovers over a visualization.
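The filtering and model-fitting steps described above could be sketched roughly as follows. This is a minimal illustration in plain Python, not the project's actual notebook code: the state list, the column names (`state`, `date`, `temp_f`), the choice of day of year as the single feature, and the sample rows are all assumptions made for the example.

```python
from datetime import date

# Hypothetical values: the write-up does not name the five states.
STATES = {"IA", "IL", "IN", "MN", "WI"}
GROWING_MONTHS = range(5, 10)  # May through September


def in_growing_season(row):
    """Keep rows from the chosen states that fall in May-September."""
    return row["state"] in STATES and row["date"].month in GROWING_MONTHS


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(xs, ys, slope, intercept):
    """Coefficient of determination; values closer to 1 mean a better fit."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Synthetic rows for illustration only (not real NCEI data).
rows = [
    {"state": "IA", "date": date(2020, 4, 15), "temp_f": 50.0},
    {"state": "IA", "date": date(2020, 5, 15), "temp_f": 60.0},
    {"state": "IA", "date": date(2020, 6, 15), "temp_f": 70.0},
    {"state": "IA", "date": date(2020, 7, 15), "temp_f": 75.0},
    {"state": "TX", "date": date(2020, 7, 15), "temp_f": 95.0},
    {"state": "IA", "date": date(2020, 10, 15), "temp_f": 55.0},
]

season = [r for r in rows if in_growing_season(r)]
xs = [r["date"].timetuple().tm_yday for r in season]  # day of year as the feature
ys = [r["temp_f"] for r in season]

slope, intercept = fit_line(xs, ys)
score = r_squared(xs, ys, slope, intercept)
```

Comparing `score` across candidate models is one way to pick between them, since a higher R^2 means the model explains more of the variance in the data.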

Share how you used Databricks and what was great or frustrating about the platform

Databricks proved really useful for working through large datasets. We found Notebooks particularly helpful for this, and we appreciated the flagship Databricks Assistant, which helped us debug our Python scripts and gave us direction in building the project overall. One frustration was that we had to reshare each new data table or file we made, and it was unclear where our created tables ended up. We frequently had to search through our catalog to find our files, and it got quite messy.

Credit any public frameworks, APIs, or external tools used

We used the external database of the National Centers for Environmental Information to get county-specific temperature, precipitation, drought, and cold/hot day periods. We also used the US Census Data API to convert latitudes/longitudes to US counties.
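As an illustration of the latitude/longitude-to-county lookup, the sketch below assumes the Census Bureau Geocoder's coordinates endpoint. The endpoint, its query parameters, and the response field names shown here are assumptions for the example and may differ from the exact service this project called; the helper names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint: the Census Geocoder "geographies by coordinates" service.
GEOCODER_URL = "https://geocoding.geo.census.gov/geocoder/geographies/coordinates"


def build_lookup_url(lat, lon):
    """Build the request URL; the service takes x = longitude, y = latitude."""
    params = {
        "x": lon,
        "y": lat,
        "benchmark": "Public_AR_Current",
        "vintage": "Current_Current",
        "format": "json",
    }
    return f"{GEOCODER_URL}?{urlencode(params)}"


def county_from_response(payload):
    """Return (GEOID, NAME) of the first county in a parsed response, or None."""
    counties = payload.get("result", {}).get("geographies", {}).get("Counties", [])
    if not counties:
        return None
    first = counties[0]
    return first.get("GEOID"), first.get("NAME")


def lookup_county(lat, lon, timeout=10):
    """Call the service over the network and extract the county."""
    with urlopen(build_lookup_url(lat, lon), timeout=timeout) as resp:
        return county_from_response(json.load(resp))
```

Keeping the URL building and response parsing separate from the network call makes it easier to cache results per station, since each weather station's coordinates only need to be resolved once.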
