Inspiration
We aimed to create a lightweight, scalable model to help small businesses optimize sales revenue using data-driven insights. By focusing on a realistic dataset size, we ensured that any restaurant could implement similar methodologies to build their own predictive models.
What it does
Our model predicts relative sales revenue for individual venues based on features like time of day, day of the week, order duration, and tip percentage. Instead of absolute predictions, it provides a scaled revenue estimate, allowing businesses to gauge performance trends.
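As a rough illustration of how the features named above could be derived from a raw order record, here is a minimal sketch. The function name `order_features`, its argument names, and the sample order are hypothetical and not taken from the project's dataset.

```python
from datetime import datetime


def order_features(opened_at, closed_at, subtotal, tip):
    """Derive time-of-day, weekday, duration, and tip features for one order.

    All argument names are illustrative assumptions about the raw data.
    """
    opened = datetime.fromisoformat(opened_at)
    closed = datetime.fromisoformat(closed_at)
    return {
        "hour_of_day": opened.hour,
        "day_of_week": opened.weekday(),  # Monday = 0, Sunday = 6
        "order_duration_min": (closed - opened).total_seconds() / 60,
        # Guard against a zero subtotal to avoid division by zero.
        "tip_pct": tip / subtotal if subtotal > 0 else 0.0,
    }


# Made-up order: opened at 3:10 PM on a Tuesday, closed 45 minutes later.
features = order_features("2024-03-05T15:10:00", "2024-03-05T15:55:00", 40.0, 6.0)
print(features)
```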
How we built it
- Data Preprocessing: Merged datasets, handled missing values using KNN imputation, and applied one-hot/target encoding for categorical data.
- Feature Engineering: Added insights like seasonality, weekday trends, and tip percentage while removing redundant features.
- Model Selection: Experimented with LSTMs but found that XGBoost provided the best accuracy and efficiency.
- Training & Evaluation: Used Grid Search for hyperparameter tuning and Group K-Fold cross-validation to validate performance.
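The training steps above can be sketched roughly as follows. This is not the project's actual code. It uses synthetic data, a made-up parameter grid, and scikit-learn's `GradientBoostingRegressor` as a stand-in for XGBoost so that the example depends only on scikit-learn and NumPy; `xgboost.XGBRegressor` follows the same estimator interface and could be swapped in. The encoding steps are omitted because the synthetic features are already numeric.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.impute import KNNImputer
from sklearn.model_selection import GridSearchCV, GroupKFold
from sklearn.pipeline import Pipeline

rng = np.random.default_rng(0)

# Synthetic stand-in data: 200 orders spread across 10 hypothetical venues.
n_rows = 200
venue_ids = rng.integers(0, 10, size=n_rows)
X = np.column_stack([
    rng.integers(0, 24, size=n_rows),   # hour of day
    rng.integers(0, 7, size=n_rows),    # day of week
    rng.uniform(5, 90, size=n_rows),    # order duration in minutes
    rng.uniform(0, 0.3, size=n_rows),   # tip percentage
]).astype(float)
y = 0.05 * X[:, 0] + 2.0 * X[:, 3] + rng.normal(0, 0.1, size=n_rows)

# Blank out roughly 5% of the feature values so the imputer has work to do.
X[rng.random(X.shape) < 0.05] = np.nan

# Imputation lives inside the pipeline so it is refit on each training fold.
pipeline = Pipeline([
    ("impute", KNNImputer(n_neighbors=5)),
    ("model", GradientBoostingRegressor(random_state=0)),
])

# Illustrative grid only; the project's real search space is not documented.
param_grid = {
    "model__n_estimators": [50, 100],
    "model__max_depth": [2, 3],
}

# GroupKFold keeps every order from a venue in the same fold, so each
# validation score reflects venues the model did not see during training.
search = GridSearchCV(
    pipeline,
    param_grid,
    cv=GroupKFold(n_splits=5),
    scoring="neg_root_mean_squared_error",
)
search.fit(X, y, groups=venue_ids)

best_rmse = -search.best_score_
print(search.best_params_, round(best_rmse, 3))
```

Grouping the folds by venue is the part that matters most here: with a plain K-Fold split, orders from the same venue would land in both the training and validation sets, which tends to make the validation error look better than it would be on a new venue.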
Challenges we ran into
- Handling High Variability: Sales revenue fluctuates due to external factors like market trends and policies, making prediction challenging.
- Balancing Accuracy & Efficiency: Finding a model that was both lightweight and powerful led us away from deep learning solutions.
- Ensuring Generalization: Needed to scale revenue per venue to make predictions meaningful across different business sizes.
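One way to scale revenue per venue is a within-venue z-score, sketched below. The write-up does not say which scaling formula was used, so the z-score, the function name `scale_per_venue`, and the sample venues are all assumptions for illustration.

```python
from collections import defaultdict
from statistics import mean, pstdev


def scale_per_venue(rows):
    """Z-score revenue within each venue: (revenue - venue mean) / venue std.

    `rows` is a list of (venue_id, revenue) pairs; output order matches input.
    """
    by_venue = defaultdict(list)
    for venue_id, revenue in rows:
        by_venue[venue_id].append(revenue)

    stats = {v: (mean(vals), pstdev(vals)) for v, vals in by_venue.items()}

    scaled = []
    for venue_id, revenue in rows:
        mu, sigma = stats[venue_id]
        # A venue with constant revenue has sigma == 0; map it to 0.0.
        scaled.append((revenue - mu) / sigma if sigma > 0 else 0.0)
    return scaled


# Two made-up venues whose revenue differs by a factor of ten.
rows = [
    ("small_cafe", 100.0), ("small_cafe", 200.0), ("small_cafe", 300.0),
    ("big_bistro", 1000.0), ("big_bistro", 2000.0), ("big_bistro", 3000.0),
]
scaled = scale_per_venue(rows)
print(scaled)
```

After scaling, both venues produce the same values despite the tenfold gap in raw revenue, which is what lets a single model compare a slow day at a large venue with a slow day at a small one.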
Accomplishments that we're proud of
- Built an accurate, efficient model that small businesses can easily adopt.
- Identified key sales drivers such as time of day (3 PM) and weekday patterns (Tuesdays).
- Achieved an RMSE of 0.436, demonstrating strong performance given the dataset’s complexity.
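For readers unfamiliar with the metric, RMSE is the square root of the mean squared difference between predicted and actual values, expressed in the same units as the target (here, scaled revenue). A minimal sketch follows, with made-up numbers that are not project results. It also computes the error of a trivial "always predict the mean" baseline, which is one common reference point for judging whether an RMSE is good.

```python
from math import sqrt


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and the same length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return sqrt(sum(squared_errors) / len(y_true))


# Illustrative scaled-revenue values only.
y_true = [-1.0, 0.0, 1.0, 2.0]
y_pred = [-0.5, 0.0, 0.5, 1.0]

model_error = rmse(y_true, y_pred)

# Baseline: predict the mean of the targets for every row.
mean_target = sum(y_true) / len(y_true)
baseline_error = rmse(y_true, [mean_target] * len(y_true))

print(round(model_error, 3), round(baseline_error, 3))
```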
What we learned
- Feature selection is crucial: Removing redundant data and engineering meaningful features improves model interpretability.
- Scaling matters: Standardizing revenue per venue made the model more generalizable.
- Classical models like XGBoost can outperform deep learning in cases with structured, tabular data.
What's next for TouchBistro Challenge
- Expand dataset coverage to incorporate more venues for broader generalization.
- Enhance feature selection by exploring additional external factors like local events and promotions.
- Improve interpretability by providing actionable recommendations for businesses based on model insights.
Built With
- google-colab
- python
