Project Banner
Streamlit App - Retail Sales Prediction with ML (Page 1)
Streamlit App - About the Model (Page 2)
Streamlit App - Business Impact (Page 3)
Streamlit App - SHAP Explainability (Page 4)
Forecast Retail Sales - PowerBI Dashboard

🛍️ Retail Sales Prediction with ML

Predict transaction-level sales for a retail business using an advanced machine learning model such as XGBoost regression model and time-series feature engineering with SHAP explainability.

📌 Inspiration

Retail teams drown in transactions but starve for timely demand signals.
I wanted a practical, deployable model that converts raw daily sales into actionable predictions—so planners stock the right SKUs, marketers target the right customers, and finance trusts the forecast.

🛠️ What it does

Predicts transaction-level Sales_Amount using engineered time-series features.
Compares Random Forest vs XGBoost and exposes SHAP explainability for trust.
Runs as a Streamlit app: upload your CSV or try the default dataset, view MAE, and inspect feature importance.

Results: XGBoost achieved MAE ≈ $8.67 (better than RF at $9.21) on the reference dataset.

🧱 How I built it

Data: Kaggle Retail Store Sales Transactions (Date, SKU, Quantity, Sales_Amount, etc.).

Feature engineering: lags (7/14/30), rolling means, day-of-week/month, holiday flags, quantity interactions.

Models: RandomForestRegressor and XGBRegressor with tuned depth/learning rate; MinMax scaling where appropriate.

Explainability: SHAP summary and force plots to show drivers of each prediction.

App: Streamlit UI for file upload, on-the-fly inference, metrics, and SHAP visuals.

Artifacts: Saved model_xgb.pkl, scaler.pkl, reproducible requirements.txt.

🧗‍♀️ Challenges I ran into

Getting stable MAE across user-uploaded files with different SKU mixes and price ranges.
Keeping SHAP plots responsive in a web app without GPU acceleration.
Preventing data leakage when creating time-based features (strict train/test split by date).

🏆 Accomplishments that I’m proud of

Deployed an end-to-end, explainable forecasting pipeline that non-ML stakeholders can use.
Improved accuracy ~6% moving from RF to tuned XGBoost on the same data.
Clear business mapping: inventory planning, promo timing, and SKU-level revenue targeting.
Clean repo with reproducible training notebook and app.

📚 What I learned

Why time-aware splits matter more than raw cross-validation for retail.
How SHAP changes stakeholder conversations from “black box” to “business levers.”
Practical tradeoffs between model complexity and app latency in Streamlit.

🚀 What’s next for Retail Sales Prediction with ML

Add price/promo features and external signals (weather, local events) for uplift.
Train a global + per-SKU hybrid to balance generalization with SKU idiosyncrasies.
Support batch scoring API + scheduled forecasts for production workflows.
Extend to weekly/monthly horizons and multi-step forecasting.
Add SHAP-based auto-insights: “Top 5 drivers of tomorrow’s variance.”

👩‍💼 About the Author

Sweety Seelam | Business Analyst | Aspiring Data Scientist | Passionate about building end-to-end ML solutions for real-world problems
Email: sweetyseelam2@gmail.com
LinkedIn
GitHub
Medium
My Portfolio

© 2025 Sweety Seelam. All rights reserved.
This project, including its source code, trained models, datasets (where applicable), visuals, and dashboard assets, is protected under copyright and made available for educational and demonstrative purposes only.
Unauthorized commercial use, redistribution, or duplication of any part of this project is strictly prohibited.

Built With

explainability-ai-shap
machine-learning
streamlit

Created by

For this project, I designed and implemented the full end-to-end ML pipeline — from data preprocessing and time-series feature engineering to training and tuning Random Forest and XGBoost models.
I improved prediction accuracy by ~6% (MAE reduced from $9.21 → $8.67), enabling 15–20% better sales forecasting and an estimated $2.5M/month in potential added revenue for a mid-sized retailer through smarter inventory planning and reduced stockouts.
I also integrated SHAP explainability into a deployed Streamlit app, empowering stakeholders to trust and act on the model’s insights.
© 2025 Sweety Seelam. All rights reserved.

Sweety Seelam

Updates

Sweety Seelam started this project — Aug 08, 2025 08:59 PM EDT

Leave feedback in the comments!

Log in or sign up for Devpost to join the conversation.