Flopping the Nuts

Inspiration

We wanted to find an application of the PoC data that both assisted individual PoCs and advanced ABInBev's operations across the board. While exploring the data, we realized that we could measure the volume consumed from both draught and bottle sales. This led us on our quest to understand the vertical and horizontal beer markets through the lens of consumption data.

What it does

FtN predicts the sales of ABInBev products based on factors such as location, type of venue, demographics, weather conditions, day of week, and day of year. With FtN, ABInBev sales representatives can stay proactive when advertising to PoCs. Currently, FtN predicts consumption of Budweiser Draught, Stella Draught, Bud Light Draught, Labatt 50 Draught, Budweiser Bottle, and Bud Light Bottle.

How we built it

We created a Neural Network using unsupervised learning and semi-supervised feature selection. First, we cleaned and filtered the data in MySQL and reorganized it to identify potential predictors of consumption. We then analyzed each potential factor for statistical significance. We tested the predictive power of the type of establishment with a Chi-Squared Goodness of Fit Test, and the day of the week and the weather with the Pearson Correlation Coefficient. Other variables, such as the quality of the lines and the amount of non-beer items sold, were also tested but showed no relationship or predictive strength.
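As an illustration of the correlation screening described above, here is a minimal sketch of the Pearson Correlation Coefficient in plain Python. The function name `pearson_r` and the temperature and volume figures are hypothetical and are not taken from the project data; the team's actual tooling for this step is not described.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations, and the two sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical daily values (not project data): temperature in Celsius
# and draught volume in litres for one PoC.
temperature_c = [12.0, 15.0, 18.0, 21.0, 24.0]
draught_litres = [30.0, 34.0, 33.0, 41.0, 45.0]
r = pearson_r(temperature_c, draught_litres)
```

A value of `r` near +1 or -1 would suggest keeping the factor as a candidate predictor, while a value near 0 would suggest dropping it.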

Once the factors were narrowed down, we divided the 4 months of consumption data into 3 months of training data and 1 month of testing data. The resulting Neural Network predicted the consumption of the ABInBev products with over 70% accuracy.
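A chronological split like the one described above can be sketched as follows. The record layout, the dates, and the `split_by_date` helper are assumptions made for illustration; the write-up does not give the actual schema or date range.

```python
from datetime import date


def split_by_date(records, cutoff):
    """Put records dated before `cutoff` in train and the rest in test."""
    train = [r for r in records if r["day"] < cutoff]
    test = [r for r in records if r["day"] >= cutoff]
    return train, test


# Hypothetical records spanning four months (not project data).
records = [
    {"day": date(2017, 1, 15), "litres": 40.0},
    {"day": date(2017, 2, 15), "litres": 38.5},
    {"day": date(2017, 3, 15), "litres": 44.0},
    {"day": date(2017, 4, 15), "litres": 47.5},
]

# The first three months train the model; the fourth month is held out.
train, test = split_by_date(records, cutoff=date(2017, 4, 1))
```

Splitting by date rather than at random keeps the held-out month strictly later than the training data, which matches how the model would be used to predict future consumption.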

Challenges we ran into

The PoS records were split across four files, so we had to create a master PoS file with over 3 million records. Its size made consumption analysis computationally expensive whenever we related data across the various data sources.
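The team combined the files in MySQL. Purely as an illustration of the same idea, the sketch below concatenates CSV parts that share a header using Python's standard `csv` module. The column names and rows are hypothetical, and `merge_csv_parts` is not part of the project.

```python
import csv
import io


def merge_csv_parts(parts):
    """Concatenate CSV parts that share a header into one list of row dicts."""
    merged = []
    header = None
    for part in parts:
        reader = csv.DictReader(part)
        if header is None:
            header = reader.fieldnames
        elif reader.fieldnames != header:
            raise ValueError("all parts must share the same header")
        merged.extend(reader)
    return merged


# Hypothetical file contents (not project data); in-memory strings stand in
# for the PoS files so the example is self-contained.
part_a = io.StringIO("poc_id,product,quantity\n1,Budweiser Bottle,3\n")
part_b = io.StringIO("poc_id,product,quantity\n2,Bud Light Bottle,5\n")
master = merge_csv_parts([part_a, part_b])
```

With millions of rows, loading everything into a list this way would be memory-hungry, which is one reason a database is the more practical choice at that scale.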

When analyzing the PoC type data for statistical significance, one challenge was the uneven sample sizes across PoC types. There were over 40 Bar/Pubs but only 1 Adult PoC. Creating expected consumption values for PoC types with small sample sizes required a bit of out-of-the-box thinking: we used a Chi-Square test with expected values weighted by each type's proportion of the sample instead of the product of averages.
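One way to read the weighted approach described above: under the null hypothesis that PoC type has no effect, each type's expected consumption is the total consumption multiplied by that type's share of the PoCs. The sketch below computes the Chi-Square statistic on that basis. It is an interpretation, not the team's actual code; the "Restaurant" category and all volumes are hypothetical.

```python
def weighted_chi_square(observed, sample_sizes):
    """Chi-square statistic with expected values proportional to sample size."""
    if observed.keys() != sample_sizes.keys():
        raise ValueError("observed and sample_sizes need the same PoC types")
    total_observed = sum(observed.values())
    total_pocs = sum(sample_sizes.values())
    statistic = 0.0
    for poc_type, count in sample_sizes.items():
        if count <= 0:
            raise ValueError("every PoC type needs at least one PoC")
        # Expected consumption if every PoC consumed the same amount.
        expected = total_observed * count / total_pocs
        statistic += (observed[poc_type] - expected) ** 2 / expected
    return statistic


# Hypothetical numbers of PoCs per type and total litres consumed per type.
sample_sizes = {"Bar/Pub": 40, "Restaurant": 9, "Adult": 1}
observed = {"Bar/Pub": 8200.0, "Restaurant": 1500.0, "Adult": 300.0}
chi2 = weighted_chi_square(observed, sample_sizes)
```

The statistic would then be compared against a Chi-Square distribution with one fewer degree of freedom than the number of PoC types (2 in this example).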

Accomplishments that we're proud of

The effort and technical work that went into analyzing variables and testing potential factors for statistical significance in a semi-supervised feature selection process paid off when our Neural Network reached over 70% accuracy despite the limited number of data points.

What we learned

We learned how to distribute tasks among team members to emphasize our strengths and cover for our weaknesses. We came in as a team with a wide set of skills: Database Administrators, a Statistician, and Computer Scientists. It was confusing at first, but we were able to set up a supply chain to accomplish tasks efficiently.

What's next for Flop the Nuts

FtN can grow in two primary ways. First, gathering additional data points should increase the accuracy of the Neural Network. Second, additional analysis of potential factors can be performed, and additional PoCs could be added to the sample to allow for better testing of statistical significance. The FtN predictive model could also be used to generate a suggested beer purchase from distributors for each PoC.

Submitted to

Hack the World - New York
- Winner First Place - US$10,000

Created by

I handled the feature selection by analyzing the data, picking variables to test, and then determining if the selected variables had a statistically significant impact on beer consumption.

Ahmed Elborolosy
I worked on implementing and tweaking the Neural Network in Python, as well as researching various aspects of optimizing the NN.

Adarsha Subick
I cleaned up the data using MySQL and prepared it so that it could be analyzed for the feature selection. I also set up the server for our web app and worked on the web app itself.

NaHyun Kim
I worked on loading the data into MySQL, performing the necessary ETL functions, and producing datasets for further analysis down the pipeline. I was also heavily involved in strategizing and finding a path for the project.

Aritra Datta
Gul Ahmed
rubin Taipi