Inspiration

As a student pursuing a B.S. in Data Science, I was tasked with a project to apply my theoretical knowledge to a real-world business problem. I didn't want to just analyze a dataset; I wanted to find a genuine pain point. I started talking to local business owners—from corner grocers to cafe managers—and I kept hearing the same story, a story of a silent killer of profits and customer trust: inventory management.

They spoke of the "haunting ghost of lost sales" from an empty shelf, the "financial drain of spoilage" from an overstocked backroom, and the constant, nagging "what if" questions that kept them up at night. "What if I run a promotion next Tuesday and it rains? Should I order more?" This wasn't just an inefficiency; it was the central, defining challenge of their business. That discovery lit a fire in me. This was no longer just a project; it became a mission.

The catalyst was this very hackathon: "Build Real ML Web Apps: No Wrappers, Just Real Models." It was the perfect call to action, challenging me to stop just analyzing the problem and start building the solution. RETAILPILOT is the result of that mission—an intelligent partner designed to bring the power of predictive AI to every retailer, big or small.

What it does

RETAILPILOT is an ML-powered intelligence hub for smart inventory management. It moves beyond simple historical reporting to offer a full suite of predictive tools, allowing retailers to anticipate future trends and proactively manage their stock. Users can:

  • Analyze Trends: Visualize sales and stock-out trends over time, switching between Daily, Weekly, and Monthly views to identify seasonal patterns.
  • Identify Key Performers & Problems: Instantly discover top-performing products and stores, and, more importantly, pinpoint which products suffer the most from stock-outs.
  • Evaluate Model Performance: Transparently assess how the core ML models perform on held-out historical test data that they never saw during training. This section provides key metrics like R², MAE, and RMSE, building trust by showing exactly how accurate the underlying models are.
  • Simulate "What-If" Scenarios with Manual Prediction: This is the strategic heart of the application. It answers the crucial "what if" questions by allowing users to manually input conditions—a future date, a planned promotion, a specific weather forecast—and instantly receive a full sales, stock-out, and severity prediction. It turns the user from a passive observer into an active strategist.
  • Generate Multi-Day Forecasts: With a single click, users receive an automated multi-day forecast for any store-product combination, predicting three critical metrics:
    1. Sales Forecasting: The expected sales amount.
    2. Stock-Out Hour Prediction: The forecasted number of hours an item will be unavailable.
    3. Stock Severity Classification: An intuitive, four-tier severity level (Fully Stocked, Mild, Moderate, Severe).
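The three metrics on the Model Evaluation view can be illustrated with a short, dependency-free sketch. This is not the app's actual code; the function name `regression_metrics` and the sample numbers are made up for illustration.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return MAE, RMSE and R² for two equal-length sequences of numbers."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n             # mean absolute error
    ss_res = sum(e * e for e in errors)               # residual sum of squares
    rmse = math.sqrt(ss_res / n)                      # root mean squared error
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    # R² is undefined when the targets are constant; report NaN in that case.
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    return {"mae": mae, "rmse": rmse, "r2": r2}

# Illustrative numbers only, not results from the project.
actual = [10.0, 12.0, 8.0, 15.0]
predicted = [11.0, 11.0, 9.0, 14.0]
metrics = regression_metrics(actual, predicted)
```

MAE and RMSE are in the same units as the target (sales amount or stock-out hours), while R² is unitless, which is why showing all three together gives a fuller picture of accuracy.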

How I built it

RETAILPILOT is engineered as a full-stack ML application, designed for performance, accuracy, and intuitive interaction.

  1. Data Foundation: The project began with rigorous feature engineering on the FreshRetailNet-50K dataset. I created lag features (sales from previous days/weeks) and rolling averages to give the models a sense of memory and momentum, which was crucial for time-series prediction.
  2. Pre-processing Pipeline: To avoid slow, on-the-fly data cleaning, I built a pre-processing pipeline that cleaned the raw data, engineered the features, and saved the final, model-ready DataFrames as highly efficient Parquet files.
  3. The Predictive Core: I ran a comprehensive gauntlet, testing a suite of models (RandomForest, GradientBoosting, XGBoost, LightGBM) for each of the three predictive tasks. The final architecture uses a trio of specialized models, each chosen for its optimal balance of speed and accuracy:
    • Sales Forecaster: An XGBoost Regressor.
    • Stock-Out Predictor: An XGBoost Regressor, carefully tuned with sample weights to handle data imbalance.
    • Severity Classifier: A LightGBM Classifier, chosen for its incredible speed.
  4. Forecasting & "What-If" Engines: I implemented two distinct prediction engines. The Forecasting Engine uses an iterative loop to predict future days sequentially. The Manual Prediction Engine features a robust feature assembly pipeline that takes direct user inputs from the UI, combines them with historical data, and constructs a complete feature vector on the fly for the models to score.
  5. User Interface: The entire application is powered by Streamlit, with a strong focus on performance and user experience.
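To make steps 1 and 4 concrete, here is a minimal sketch of lag and rolling-average features combined with an iterative forecasting loop. It is only an illustration under stated assumptions: the lag choices, the window size, the helper names, and `toy_model` (a stand-in for a trained regressor) are all hypothetical, and the sketch assumes each prediction is fed back in as history for the next day. On a pandas DataFrame the same features would typically come from `shift` and `rolling(...).mean()`.

```python
from statistics import mean

LAGS = (1, 7)        # hypothetical lag choices: yesterday and one week ago
ROLL_WINDOW = 7      # hypothetical rolling-average window, in days

def build_features(history):
    """Build lag and rolling-mean features from past daily sales only."""
    if len(history) < max(max(LAGS), ROLL_WINDOW):
        raise ValueError("not enough history for the requested lags")
    features = {f"lag_{k}": history[-k] for k in LAGS}
    features[f"roll_mean_{ROLL_WINDOW}"] = mean(history[-ROLL_WINDOW:])
    return features

def iterative_forecast(history, predict, horizon):
    """Forecast `horizon` days, feeding each prediction back in as history."""
    extended = list(history)   # copy so the caller's list is not mutated
    forecasts = []
    for _ in range(horizon):
        y_hat = predict(build_features(extended))
        forecasts.append(y_hat)
        extended.append(y_hat)  # the prediction becomes tomorrow's "lag_1"
    return forecasts

def toy_model(features):
    # Stand-in for a trained regressor: blends the weekly lag and rolling mean.
    return 0.5 * features["lag_7"] + 0.5 * features["roll_mean_7"]

past_sales = [10.0, 12.0, 11.0, 13.0, 12.0, 14.0, 12.0]
forecast = iterative_forecast(past_sales, toy_model, horizon=3)
```

One consequence of this design is that errors can compound: a poor prediction on day 1 becomes an input for day 2, so longer horizons are usually less reliable than shorter ones.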

Challenges I ran into

A project of this ambition is defined by its challenges. RETAILPILOT tested me on two distinct fronts: the computational rigor of the backend and the user-centric design of the frontend.

The Backend Gauntlet: Choosing the Right Tools for Three Different Jobs

My first task was to select the optimal model for each of the three predictive tasks. I ran a comprehensive gauntlet, testing a suite of powerful models including RandomForest, GradientBoosting, XGBoost, and LightGBM for each problem.

  • For Sales Forecasting, it was a close race, but the XGBoost Regressor ultimately won due to its superior balance of high accuracy and manageable model size—a critical consideration for a web application.
  • For Stock Severity Classification, speed was paramount. I needed a model that could deliver a classification in milliseconds. Here, the LightGBM Classifier was the undisputed champion, offering an incredible combination of speed, low memory footprint, and strong performance.

But the greatest modeling challenge, the one that truly forged the project's direction, was the Stock-Out Hour Predictor. My initial, academically "correct" approach was to treat it as a multi-class classification problem with 17 classes (0-16 hours). This was a computational disaster. Training sessions on cloud GPUs would run for over six hours before crashing. The few times I trained a model on a small subset, the model file was a monstrous 4 GB, with projections for the full dataset exceeding a completely undeployable 10 GB. I had hit the wall of impracticality.

This forced a pragmatic pivot. I abandoned classification and reframed it as a regression problem. The new challenge was the data's severe imbalance (most values were 0). I overcame this by engineering a sample_weights strategy during the XGBoost training, teaching it to pay extraordinary attention to the rare but critical instances of stock-outs. This journey taught me that the best model isn't just the one with the highest score on paper, but the one that performs reliably and realistically within the constraints of the final application.
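One simple way to build such weights is sketched below. The exact weighting scheme used in RETAILPILOT is not described above, so this "balanced" scheme (zero and non-zero rows each contribute the same total weight) and the function name `stockout_sample_weights` are assumptions for illustration only.

```python
from collections import Counter

def stockout_sample_weights(targets):
    """Weight each row inversely to how common its group (zero vs. non-zero) is."""
    groups = ["zero" if t == 0 else "nonzero" for t in targets]
    counts = Counter(groups)
    n, n_groups = len(targets), len(counts)
    # Balanced weighting: every group ends up with the same total weight,
    # so the rare stock-out rows are not drowned out by the zeros.
    return [n / (n_groups * counts[g]) for g in groups]

# Illustrative stock-out hours: mostly zeros, with two rare stock-outs.
hours = [0, 0, 0, 0, 0, 0, 0, 0, 3, 5]
weights = stockout_sample_weights(hours)

# With xgboost installed, the weights would be passed at fit time, e.g.
#   XGBRegressor().fit(X, hours, sample_weight=weights)
```

In this example each zero row gets weight 0.625 and each stock-out row gets 2.5, so an error on a stock-out row costs the model four times as much as an error on a zero row.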

Frontend: From Static Reports to a Strategic Co-Pilot

Initially, the application was good at showing what had happened (Analytics) and proving the models were good (what was then called the "Prediction" page). But I realized a critical gap: the user couldn't act on the insights. They couldn't ask their own questions. This led to two major evolutions.

First, I recognized that showing predictions on a fixed test set wasn't true forecasting. To be transparent and build trust, I renamed that section to Model Evaluation. Its purpose became clear: to benchmark the AI's accuracy, not to predict the live future.

This clarity created the space for the app's most powerful feature: Manual Prediction. The challenge was no longer just about models, but about user experience. How could I design a form that captured dozens of complex features (weather, holidays, historical sales) without overwhelming the user? The solution was an intuitive UI that pre-fills sensible defaults based on the selected store and product, allowing the user to tweak only the variables they care about. Building the backend pipeline to instantly assemble these inputs into a feature vector the models could understand was a significant design challenge that bridged the gap between raw data and human intuition.
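The idea of pre-filled defaults plus user overrides can be sketched as follows. The feature names, `FEATURE_ORDER`, and `assemble_features` are hypothetical and far smaller than the real feature set; the point is only that the final vector must follow the exact column order the models were trained on.

```python
# Hypothetical feature names, in the column order the model was trained on.
FEATURE_ORDER = ["promo_flag", "temperature_c", "is_holiday", "lag_1_sales"]

def assemble_features(defaults, overrides):
    """Merge user overrides onto store/product defaults and order them for the model."""
    unknown = set(overrides) - set(FEATURE_ORDER)
    if unknown:
        raise ValueError(f"unknown features: {sorted(unknown)}")
    merged = {**defaults, **overrides}   # user input wins over defaults
    missing = [name for name in FEATURE_ORDER if name not in merged]
    if missing:
        raise ValueError(f"missing features: {missing}")
    return [float(merged[name]) for name in FEATURE_ORDER]

# Defaults would come from historical data for the selected store and product.
store_defaults = {
    "promo_flag": 0,
    "temperature_c": 21.5,
    "is_holiday": 0,
    "lag_1_sales": 34.0,
}
# The user only changes what they care about: "what if I run a promotion?"
vector = assemble_features(store_defaults, {"promo_flag": 1})
```

Rejecting unknown or missing features up front keeps a typo in the form layer from silently producing a misaligned vector.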

Finally, taming the memory beast of the 4-million-row dataset was a constant battle. Initially, my application would crash on startup due to memory errors. My first solution was to perform all the heavy data cleaning and feature engineering offline and save the final, processed data as optimized Parquet files. This solved the processing bottleneck but not the memory issue. The breakthrough came from using st.session_state. Streamlit reruns the script on every user interaction, so I engineered the app to load the large, pre-processed datasets and the ML models into st.session_state only once, when a user's session begins, and to reuse them on every later rerun. This was the key to stabilizing the application: the data is no longer reloaded and re-allocated on each interaction, and the app runs smoothly without crashing.
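The load-once pattern can be sketched without Streamlit by treating the session state as a dict-like object. `get_or_load` and `load_sales_data` are hypothetical names, and the plain dict below stands in for `st.session_state`, which supports the same `in` check and item assignment.

```python
def get_or_load(state, key, loader):
    """Return state[key], calling loader() only the first time the key is needed."""
    if key not in state:
        state[key] = loader()   # the expensive work happens once per session
    return state[key]

load_calls = []

def load_sales_data():
    # Stand-in for reading a large pre-processed Parquet file.
    load_calls.append(1)
    return {"rows": 3}

session = {}   # plain dict standing in for st.session_state

# Each Streamlit rerun would execute this line again, but only the first
# call actually loads anything.
first = get_or_load(session, "sales_data", load_sales_data)
second = get_or_load(session, "sales_data", load_sales_data)
```

Session state is kept per user session, so each session holds its own copy of whatever is stored there. Streamlit also offers `st.cache_resource` for objects shared across sessions, which may be worth considering for read-only models if many users connect at once.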

Accomplishments that I am proud of

  • Building a Full-Stack ML Application: I'm incredibly proud of building a complete, end-to-end system—from raw data pre-processing and rigorous model training to a fully functional, deployed web application.
  • Creating a True "What-If" Engine: The Manual Prediction feature is my biggest accomplishment. It elevates the app from a passive dashboard to an active, strategic decision-making tool, directly answering the core questions of business owners.
  • Solving the Impractical Model Problem: Overcoming a model size projected to exceed 10 GB by pivoting my approach from classification to a weighted regression was a major breakthrough. It represents the kind of real-world problem-solving that defines a data scientist.
  • Taming the Memory Issues: Successfully engineering the app to handle massive datasets in a memory-constrained environment using a strategic combination of offline pre-processing and st.session_state was a significant technical accomplishment.

What I learned

  1. The Chasm Between Theory and Reality: I learned to navigate the vast difference between a textbook algorithm and a real-world, production-ready system. Engineering for constraints—like memory, deployment size, and inference speed—is just as important as algorithmic accuracy.
  2. The Criticality of a Data Strategy: This project hammered home that you must have a clear strategy for handling large data. The decision to pre-process data offline and use st.session_state for in-app memory management was more crucial than any single model choice.
  3. The Art of Translation: I learned that the final, most important step is translating complex numerical output into a simple, actionable human insight. The goal is not just to be right; it's to be understood.

What's next for RETAILPILOT

This hackathon submission is a powerful proof-of-concept, but it is merely the blueprint for a revolutionary vision. Based on my initial research and feedback from the business community, the potential is immense. The future of RETAILPILOT is a production-scale SaaS platform that will:

  • Become the living, beating heart of a store's data ecosystem, integrating directly with live Point-of-Sale (POS) and inventory systems for real-time analysis.
  • Incorporate a full MLOps pipeline for automatic, weekly model retraining, ensuring the AI is constantly learning and adapting to shifting customer behavior and market trends.
  • Move from Prediction to Prescription. It will not only warn "Severe shortage of Product X is likely in two days," but will advise, "Order 50 units of Product X now for delivery tomorrow to prevent an estimated $500 in lost sales."
  • Democratize AI for all. This level of predictive intelligence is typically the exclusive domain of corporate giants. RETAILPILOT aims to level the playing field, making this game-changing power accessible and affordable for the small and medium-sized businesses that are the backbone of our communities.
