Inspiration

We were drawn to TouchBistro’s challenge because of our love for restaurants—and, of course, food! When exploring the problem sets, we were particularly inspired by the one involving weather. It presented an exciting opportunity to integrate datasets collected externally with TouchBistro’s data and uncover relevant insights and connections to real-life factors.

What it does

Climalytics is a model that predicts net sales and the count of each order type (dine-in, takeout, delivery, bartab, online order) from several factors:

  • Internal: restaurant ID (concept is optional)
  • Weather: temperature, level of rain and snow, and daylight
  • Day: whether it is a weekend and/or a holiday

How we built it

  • Data analysis and manipulation: Analyzed and combined the datasets using Python (pandas and NumPy)
  • Geographical data: Used geopy to add the location data needed for the weather API calls
  • Weather data: Obtained open-source weather data from Open-Meteo
  • Holiday data: Obtained public holiday data from Nager.Date
  • Model: scikit-learn
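
As a rough illustration of how the pieces above can be combined, here is a minimal pandas sketch that joins daily sales with weather and adds weekend and holiday flags. The column names, sample values, and holiday list are all assumptions made for this example; they are not taken from TouchBistro's data or from our actual notebook.

```python
import pandas as pd

# Hypothetical daily sales per restaurant (column names are assumptions).
sales = pd.DataFrame({
    "restaurant_id": [1, 1, 2],
    "date": pd.to_datetime(["2023-12-23", "2023-12-25", "2023-12-25"]),
    "net_sales": [1200.0, 300.0, 4500.0],
})

# Hypothetical daily weather per restaurant, e.g. as fetched from a weather API.
weather = pd.DataFrame({
    "restaurant_id": [1, 1, 2],
    "date": pd.to_datetime(["2023-12-23", "2023-12-25", "2023-12-25"]),
    "temperature_c": [-2.0, -5.5, 1.0],
    "rain_mm": [0.0, 0.0, 3.2],
    "snow_cm": [1.5, 4.0, 0.0],
    "daylight_h": [8.9, 8.9, 9.0],
})

# Hypothetical holiday list, e.g. as returned by a public-holiday API.
holidays = pd.to_datetime(["2023-12-25"])

# Attach the weather of the same restaurant and day to each sales row.
features = sales.merge(weather, on=["restaurant_id", "date"], how="left")

# Day-level flags: Saturday (5) and Sunday (6) count as the weekend.
features["is_weekend"] = features["date"].dt.dayofweek >= 5
features["is_holiday"] = features["date"].isin(holidays)

print(features)
```

A left join keeps every sales row even when weather is missing for that day, which makes gaps in the external data easy to spot.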

Challenges we ran into

  • Dataset: The data only included transactions from July to December. Because we were measuring weather effects, that was not enough to draw firm conclusions about how colder weather and snowfall affect sales. The July-to-December window also limited the number of holidays we could add to the dataset.
  • Business scales: The businesses differ in size, which affected both our visualizations and our model training. Data from some businesses kept skewing the results for others, even when we used weights and percentages.
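
To make the scale problem concrete, here is a small sketch of one common option: expressing each day's sales relative to that restaurant's own average. This is only an illustration on made-up numbers with assumed column names, not a description of the weighting we actually used.

```python
import pandas as pd

# Made-up data: restaurant 2 is far larger than restaurant 1.
df = pd.DataFrame({
    "restaurant_id": [1, 1, 2, 2],
    "net_sales": [100.0, 300.0, 4000.0, 6000.0],
})

# Average sales of each row's own restaurant, aligned to the original rows.
restaurant_mean = df.groupby("restaurant_id")["net_sales"].transform("mean")

# Relative sales: 1.0 means "a typical day for this restaurant".
df["relative_sales"] = df["net_sales"] / restaurant_mean

print(df)
```

After this step both restaurants vary around 1.0, so a busy day at a small restaurant and a busy day at a large one look comparable.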

Accomplishments that we're proud of

  • This was our first time seriously working with models and multiple datasets, and making multivariate predictions was really satisfying. We are particularly proud of getting multiple output values from a single model, which we did not know was possible.
  • Extracting data from public sources was pretty cool. We did not know how much of this data was available or how varied the ways of calling the APIs were, and we are happy we got to learn this during the hackathon.
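
As an illustration of the multi-output prediction mentioned in the first point, here is a minimal scikit-learn sketch in which one model predicts six targets at once. The data is synthetic, the feature and target layout is an assumption, and `LinearRegression` is used only because it accepts a 2-D target out of the box; it is not necessarily the estimator we ended up with.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)

# Synthetic stand-ins for six inputs, e.g. temperature, rain, snow,
# daylight, is_weekend, is_holiday (layout is an assumption).
X = rng.random((200, 6))

# Synthetic stand-ins for six outputs, e.g. net sales plus five order counts.
true_weights = rng.random((6, 6))
Y = X @ true_weights + 0.01 * rng.standard_normal((200, 6))

# Fitting on a 2-D target makes the single model predict every column of Y.
model = LinearRegression()
model.fit(X[:150], Y[:150])

# One row of predictions per held-out sample, one column per target.
pred = model.predict(X[150:])
print(pred.shape)
```

Each row of `pred` holds all six predicted values for one day, so a single `predict` call covers net sales and every order type together.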

What we learned

  • We learned that there will always be more factors influencing the data than we can include in our model, so we needed to decide which ones to prioritize when training it.
  • We also learned that there are many ways of doing the same thing. We each had our own way of making API calls or building regression models, and we discovered that many more approaches achieve similar results.

What's next for Climalytics

  • Train it on data from a longer period to see whether the model needs more fine-tuning
  • Run a testing period to see how useful the model ends up being to clients
  • Try more types of models on the data to better compare their accuracy