Inspiration

As we all know, 2020 has been an unfortunate year, with the Covid pandemic spreading around the world. Hygiene matters more than ever.

A coincidental incident happened after the data was released. When one of our team members walked into a parking lot last week (fully masked and socially distanced, of course), they saw an attached note that read 'need a refill, please call...'. To avoid awkward scenarios like this, our project focuses on optimizing dispenser distribution for Tork’s clients, reducing cleaning labor and improving the user experience.

What it does

We decomposed bottom-line retail revenue as 'Number of Locations * Dispensers/Location * Refill Frequency * Refill Amount * Price'. Since Refill Amount and Number of Locations are fixed, we aim to increase total revenue by increasing Dispensers/Location, Refill Frequency, and Price.
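To make the decomposition concrete, here is a minimal Python sketch of the product. The function name, parameter names, and every number are hypothetical and chosen only for illustration; they are not taken from the Tork dataset.

```python
def retail_revenue(n_locations, dispensers_per_location,
                   refill_frequency, refill_amount, price):
    """Revenue as the product of the five factors in the decomposition."""
    return (n_locations * dispensers_per_location
            * refill_frequency * refill_amount * price)


# Made-up baseline: 100 locations, 4 dispensers each, 12 refills per period,
# 2.0 units per refill, 5.0 currency units per unit of product.
baseline = retail_revenue(100, 4, 12, 2.0, 5.0)

# Same customer base, one extra dispenser per location.
with_extra_dispenser = retail_revenue(100, 5, 12, 2.0, 5.0)

uplift = with_extra_dispenser / baseline - 1
print(f"baseline={baseline:.0f}, uplift={uplift:.0%}")
```

Because the factors multiply, a relative increase in any one lever (here, dispensers per location) carries through to total revenue by the same percentage, as long as the other factors stay constant.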

To turn this goal into a well-defined machine learning problem, we aim to identify locations that need more dispensers and to use data-driven evidence to convince our customers to adopt higher-quality products. The final model judges whether the current number of dispensers is sufficient and predicts the optimal number of dispensers for incoming customers.
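As a rough illustration of these two predictions, the sketch below trains a classifier (is the current count sufficient?) and a regressor (how many dispensers should a new location get?). It assumes scikit-learn is installed. The features (daily visitors, floor area, installed dispensers), the labeling rule, and all data are synthetic and hypothetical; this is not our actual notebook code.

```python
import random

from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor

random.seed(0)


def needed_dispensers(visitors, area):
    """Made-up rule used only to generate synthetic labels."""
    return max(1, round(visitors / 250 + area / 200))


# Synthetic locations: (daily_visitors, floor_area_m2, installed_dispensers).
locations = [
    (random.randint(50, 2000), random.randint(20, 500), random.randint(1, 10))
    for _ in range(300)
]

# Classifier: is the currently installed count sufficient? (1 = yes, 0 = no)
X_clf = [[v, a, n] for v, a, n in locations]
y_clf = [int(n >= needed_dispensers(v, a)) for v, a, n in locations]
sufficiency_model = RandomForestClassifier(n_estimators=50, random_state=0)
sufficiency_model.fit(X_clf, y_clf)

# Regressor: how many dispensers should a new location get?
X_reg = [[v, a] for v, a, _ in locations]
y_reg = [needed_dispensers(v, a) for v, a, _ in locations]
count_model = RandomForestRegressor(n_estimators=50, random_state=0)
count_model.fit(X_reg, y_reg)

# Query both models for one hypothetical location.
is_sufficient = int(sufficiency_model.predict([[1200, 300, 2]])[0])
recommended = float(count_model.predict([[1200, 300]])[0])
print(is_sufficient, round(recommended))
```

Keeping the two questions in separate models lets the classifier use the installed count as a feature, while the regressor only sees information that is available for a customer who has no dispensers yet.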

How I built it

We started with data cleaning and feature engineering. Based on the traffic and dispenser data, we also defined metrics to evaluate dispenser sufficiency. Using the interquartile range (IQR) method for outlier detection, we identified locations where additional dispensers need to be installed. We built two Random Forest models: a regression model to predict the number of dispensers, and a binary classification model to evaluate whether the current number of dispensers is sufficient.
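The IQR step can be sketched with the Python standard library alone. The metric used here (visits per dispenser), the location names, and the values are hypothetical stand-ins for the sufficiency metrics we defined; the 1.5 multiplier is the conventional IQR fence, assumed for this example.

```python
import statistics

# Hypothetical sufficiency metric: average daily visits per installed dispenser.
visits_per_dispenser = {
    "A": 100, "B": 110, "C": 120, "D": 130,
    "E": 140, "F": 150, "G": 160, "H": 900,
}

# Quartiles of the metric across all locations.
q1, _median, q3 = statistics.quantiles(visits_per_dispenser.values(), n=4)
iqr = q3 - q1

# Conventional upper fence: anything above it counts as an outlier.
upper_fence = q3 + 1.5 * iqr

# Locations whose dispensers are unusually heavily used.
flagged = sorted(
    name for name, load in visits_per_dispenser.items() if load > upper_fence
)
print(flagged)
```

Only the upper fence is checked because an unusually high load per dispenser is the signal that a location is under-served; a lower-fence check would instead point to locations with more dispensers than their traffic needs.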

See the detailed implementation in our Google Colab notebook, and feel free to leave a comment! XD

Challenges I ran into

Despite having only 7 days, as a team of 3 we distilled a well-defined business problem from a convoluted dataset, performed the analysis, and delivered actionable insights. We constantly reviewed our ideas and made corrections along the way, at one point rewriting the notebook almost from scratch. I am glad we had a challenging yet inspiring 7 days, persevered through disagreements, and built lasting bonds!

Accomplishments that I'm proud of

We are proud of the data visualizations we created throughout the analysis. They help explain our methodology and findings to audiences of all technical backgrounds. Moreover, we are proud of the clear logic running through our analysis and believe our project will have a real impact on Tork.

What I learned

We learned how to formulate a business question around the available data in order to generate actionable insights. We also learned the importance of thinking outside the box and connecting different pieces of information.

What's next for Data Dispenser Optimization

  • More data and features, please! More qualitative data, such as cleaner feedback and customer business types, should improve model robustness.
  • Alternative models, maybe? XGBoost or Long Short-Term Memory (LSTM) networks could be a good fit, since the data is a time-series flow with seasonality.
