The holiday season is the pinnacle of retail shopping for both consumers and retail businesses. As a result of the antics of frivolous gift-givers, demand for certain products skyrockets during this period. After the holiday season, however, many of those products see a sharp dip in demand, and businesses often lose money if they are left with excess inventory. On the consumer side, parents and non-parents alike are desperately looking for more gifts in a short amount of time. We believe that we have found a way to solve both problems at once using an untapped and readily available resource.

What it does

Our project consists of three main parts. The first analyzes sales and advertising data from past years to predict which products will see the biggest drop-off in sales in the coming year. The second is a function that takes a list of products and their categories, representing a shopper's cart, and returns items that are both related to the cart and expected to fall in demand after the holiday season. The third displays these recommendations on the point-of-sale system (likely a self-checkout kiosk) and lets the shopper add a recommended item to the cart, pay for it on the spot, and enter an address so that it is delivered straight to their door.

We chose the point-of-sale system because we believe it is one of the few areas in business today that is not being used for recommendations or advertising of any sort.

We believe that the shipping component accomplishes several goals. First, the shopper can simply add the item to the cart and pay for it immediately, without having to search for it online and enter payment information there. Second, the software can make inferences from whatever is in the cart at that moment, without saving any important personal information in an account. Other possible benefits would require additional research, such as introducing less technologically experienced older generations to online shopping.

Challenges we ran into

The main challenge that we ran into was finding a public dataset with the information we needed to predict which products would drop in demand. After scouring possible datasets for hours, we settled on one with about 52,000 instances from online shoppers at supermarkets across the world. This data allowed us to determine which "sub-categories" had the largest drop-off within their respective categories.

There are still a few issues in generalizing from this dataset. For example, the data was taken from online shoppers, while our project is intended to run on a POS system in a store. We were also unable to build a very complex model because the dataset offered few usable features. Even so, we were able to find the sub-categories with the largest drop in demand after the holiday season and recommend specific products to the user. Access to a better dataset would allow us to make a more comprehensive decision about which products to recommend. We also faced a serious time crunch, since our team consisted of only two hackers.

How we built it

The first step consisted of cleaning the data in Excel, R, and Python (it took a lot of cleaning to make this even remotely usable) and building a model that uses the previous three holiday and post-holiday seasons to predict the biggest drop-offs after the 2014 holiday season (where our dataset ends). The only approach we found workable was to train the model on the previous three years of data, using holiday-season sales as the predictor (without an intercept) and post-holiday-season sales as the response. We predicted the drop-offs as shown below.

| Sub-Category | Demand Decline |
| --- | --- |
| Accessories | 0.724769981 |
| Appliances | 0.282612646 |
| Art | 0.485109323 |
| Binders | 0.668074624 |
| Bookcases | 0.360555273 |
| Chairs | 0.482811107 |
| Copiers | 0.439933478 |
| Envelopes | 0.516173106 |
| Fasteners | 0.585601794 |
| Furnishings | 0.497862974 |
| Labels | 0.616935128 |
| Machines | 0.414805008 |
| Paper | 0.557486368 |
| Phones | 0.52579451 |
| Storage | 0.492096984 |
| Supplies | 0.409004446 |
| Tables | 0.500396498 |
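As a rough illustration of the no-intercept fit described above, here is a minimal Python sketch. It is not our original code: it assumes one fit per sub-category with one data point per year, and it assumes the demand decline is reported as one minus the fitted slope. The function names and the sample numbers are hypothetical.

```python
def fit_through_origin(holiday_sales, post_holiday_sales):
    """Least-squares slope for post = beta * holiday, with no intercept."""
    sxx = sum(x * x for x in holiday_sales)
    if sxx == 0:
        raise ValueError("holiday sales are all zero; slope is undefined")
    sxy = sum(x * y for x, y in zip(holiday_sales, post_holiday_sales))
    return sxy / sxx


def predicted_decline(holiday_sales, post_holiday_sales):
    """ASSUMPTION: decline is the fraction of holiday sales that is lost,
    i.e. 1 - beta, where beta is the slope fitted through the origin."""
    return 1.0 - fit_through_origin(holiday_sales, post_holiday_sales)


# Made-up yearly totals for one sub-category (three prior seasons).
example_holiday = [100.0, 200.0, 300.0]
example_post = [50.0, 100.0, 150.0]
example_decline = predicted_decline(example_holiday, example_post)
```

With these made-up numbers, post-holiday sales are exactly half of holiday sales in every year, so the fitted slope is 0.5 and the predicted decline is 0.5.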

We then found the biggest demand decline per category, as shown below.

| Furniture | Office Supplies | Electronics |
| --- | --- | --- |
| Fasteners | Binders | Accessories |
| 58.5% demand decline | 66.8% demand decline | 72.5% demand decline |
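Picking the largest decline per category can be sketched in a few lines of Python. The decline values below are copied from the sub-category table above and rounded to four places. Only the category of each of the three winning sub-categories comes from this write-up; the category assigned to every other sub-category in `CATEGORY_OF` is an assumption made for illustration.

```python
# Predicted declines from the sub-category table (rounded to 4 places).
DECLINE = {
    "Fasteners": 0.5856, "Tables": 0.5004, "Chairs": 0.4828,
    "Binders": 0.6681, "Labels": 0.6169, "Paper": 0.5575,
    "Accessories": 0.7248, "Phones": 0.5258, "Copiers": 0.4399,
}

# ASSUMPTION: apart from Fasteners, Binders, and Accessories, these
# category assignments are illustrative and not taken from the dataset.
CATEGORY_OF = {
    "Fasteners": "Furniture", "Tables": "Furniture", "Chairs": "Furniture",
    "Binders": "Office Supplies", "Labels": "Office Supplies",
    "Paper": "Office Supplies",
    "Accessories": "Electronics", "Phones": "Electronics",
    "Copiers": "Electronics",
}


def top_decline_per_category(decline, category_of):
    """Return {category: (sub_category, decline)} with the largest decline."""
    best = {}
    for sub_category, value in decline.items():
        category = category_of[sub_category]
        if category not in best or value > best[category][1]:
            best[category] = (sub_category, value)
    return best


top = top_decline_per_category(DECLINE, CATEGORY_OF)
```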

This leads us to the second step: building a function that takes in the cart of an example customer in the 2014 holiday season and returns the products that should be advertised. The function takes a list of product IDs from the user's cart (obtained by scanning the items at the register), finds the most common category in the cart, looks up the sub-category with the highest demand decline in that category, and returns three random products from that sub-category to recommend to the customer.

Finally, we built a primitive mock-up of how the POS system would look with our advertisements, based on the products returned from the function. If the user clicks on one of the advertisements, they are taken to a screen where they can type in their address to receive the shipment. The cart is also updated so that the user can pay for the shipment along with the rest of the items they are buying in the store. Because the mock-up is just a PowerPoint hyperlink, this flow can only be performed once, but in practice it could be performed multiple times.
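The recommendation function described above could look roughly like the following Python sketch. It is an illustration under stated assumptions, not our exact code: the product IDs and the `CATALOG` contents are hypothetical, and `TOP_DECLINE` simply restates the per-category results reported earlier.

```python
import random
from collections import Counter

# HYPOTHETICAL catalog: product ID -> (category, sub-category).
CATALOG = {
    "P001": ("Office Supplies", "Paper"),
    "P002": ("Office Supplies", "Labels"),
    "P003": ("Office Supplies", "Binders"),
    "P004": ("Office Supplies", "Binders"),
    "P005": ("Office Supplies", "Binders"),
    "P006": ("Office Supplies", "Binders"),
    "P010": ("Furniture", "Chairs"),
    "P020": ("Electronics", "Accessories"),
}

# Sub-category with the highest predicted decline in each category.
TOP_DECLINE = {
    "Furniture": "Fasteners",
    "Office Supplies": "Binders",
    "Electronics": "Accessories",
}


def recommend(cart_ids, catalog, top_decline, k=3, rng=random):
    """Return up to k random product IDs from the highest-decline
    sub-category of the cart's most common category."""
    categories = Counter(catalog[pid][0] for pid in cart_ids if pid in catalog)
    if not categories:
        return []  # nothing recognizable in the cart
    top_category = categories.most_common(1)[0][0]
    target = top_decline.get(top_category)
    candidates = [pid for pid, (_, sub) in catalog.items() if sub == target]
    return rng.sample(candidates, min(k, len(candidates)))


# Two Office Supplies items and one Furniture item -> recommend Binders.
picks = recommend(["P001", "P002", "P010"], CATALOG, TOP_DECLINE,
                  rng=random.Random(0))
```

Passing a seeded `random.Random` instance, as in the last line, makes the picks repeatable for a demo; leaving `rng` at its default uses the module-level generator.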

Accomplishments that we're proud of

We are proud of ourselves for working through adversity during this project. We filtered through so many datasets that lacked a key component, created so many models that predicted ridiculous values, and spent hours cleaning the data set attempting to use it to create new parameters, all while balancing other responsibilities throughout the weekend. We truly put everything that we had into this project, and we could not be prouder of our work.

What we learned

We learned quite a few things. We learned that good data can still be scarce. We learned that taking a break and looking at the problem from another point of view can make a world of difference. Most importantly, we were reminded that hard work really can pay off.

What's next for Demand-Based POS Recommendations

The next steps for Demand-Based POS Recommendations are clear: build a better interface that actually runs on a POS system, and get a large amount of data from one of NCR's supermarkets, grocery stores, or any other retail stores where customers are expected to buy multiple items at once. We believe that if we had the right data from a large API (we found that nearly all retailers' APIs were private), our model would be able to work on a much larger and more efficient scale. Since our project uses data that is likely already available in these APIs, it should not be too expensive to recreate our model on a larger scale.
