📦 Multi-SKU Demand Forecasting Engine

Gaurav Bhatia — MSc Data Science, AI & Digital Business | GISMA University, Berlin

🧠 What This Project Does

A production-style demand forecasting engine that compares three machine learning and statistical models — ARIMA, Facebook Prophet, and XGBoost — across 12 real product SKUs using 5 years of weekly sales data.

The project is deployed as an interactive Streamlit dashboard where users can select any SKU, choose a model, and adjust the forecast horizon — seeing live MAE, RMSE, and MAPE metrics instantly.

💡 Inspired by real inventory challenges at Bhatia Traders (Chandigarh, India), where applying demand forecasting reduced excess stock by 10%.

🎯 Business Problem

Businesses that cannot accurately forecast demand face two costly problems:

Overstocking → capital tied up in unsold inventory
Stockouts → lost sales and unhappy customers

This engine tackles both by forecasting weekly demand per SKU with multiple models and selecting the best performer — enabling smarter purchasing and replenishment decisions.
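
As a minimal sketch of the "select the best performer" step, the snippet below picks the model with the lowest MAPE for each SKU. The column names (`sku`, `model`, `mape`) and the scores are placeholders chosen for illustration; they are assumptions, not the actual layout or contents of reports/model_comparison.csv.

```python
import pandas as pd

# Placeholder scores for illustration only; these are NOT the project's results.
# Column names are assumed, not taken from reports/model_comparison.csv.
results = pd.DataFrame(
    {
        "sku": ["SKU_01", "SKU_01", "SKU_01", "SKU_02", "SKU_02", "SKU_02"],
        "model": ["ARIMA", "Prophet", "XGBoost"] * 2,
        "mape": [14.0, 6.0, 7.0, 16.0, 8.0, 5.0],
    }
)

# For each SKU, keep the row whose MAPE is lowest.
best_rows = results.loc[results.groupby("sku")["mape"].idxmin()]
best_model_per_sku = best_rows.set_index("sku")["model"].to_dict()
print(best_model_per_sku)
```

Selecting per SKU rather than globally lets different products use different models when their demand patterns differ.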

🏗️ Project Architecture

demand-forecasting-engine/
│
├── P1_Demand_Forecasting_Gaurav.ipynb   # Main notebook (run top to bottom)
├── app/
│   └── streamlit_app.py                 # Interactive dashboard
├── data/
│   └── weekly_sales.csv                 # Prepared weekly data (auto-generated)
├── models/
│   └── SKU_XX_xgboost.pkl              # Saved XGBoost models per SKU
├── reports/
│   └── model_comparison.csv            # Full results table
├── requirements.txt
└── README.md

🔬 Methodology

Step 1 — Data Preparation

Raw daily sales data aggregated to weekly granularity
Filtered to top 12 best-selling SKUs from Store 1
5-year date range (2013–2017), ~260 weekly observations per SKU
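
The preparation steps above could look roughly like the sketch below. It assumes the Kaggle train.csv columns are `date`, `store`, `item`, and `sales`, and that "best-selling" means highest total units sold. The function name `prepare_weekly_sales` and the output column names are hypothetical, and the demo uses a tiny synthetic frame rather than the real data.

```python
import pandas as pd

def prepare_weekly_sales(daily: pd.DataFrame, store: int = 1, top_n: int = 12) -> pd.DataFrame:
    """Aggregate daily sales to weekly totals for the top-selling items of one store."""
    df = daily.loc[daily["store"] == store].copy()
    df["date"] = pd.to_datetime(df["date"])

    # Rank items by total units sold and keep the top_n best sellers.
    top_items = df.groupby("item")["sales"].sum().nlargest(top_n).index
    df = df[df["item"].isin(top_items)]

    # Sum daily sales into calendar weeks (weeks ending on Sunday) per item.
    weekly = (
        df.groupby(["item", pd.Grouper(key="date", freq="W")])["sales"]
        .sum()
        .reset_index()
        .rename(columns={"date": "week", "sales": "weekly_sales"})
    )
    return weekly

# Synthetic demo data: four full Monday-to-Sunday weeks, constant daily sales per item.
dates = pd.date_range("2013-01-07", periods=28, freq="D")
frames = [
    pd.DataFrame({"date": dates, "store": store, "item": item, "sales": units})
    for store, item, units in [(1, 1, 1), (1, 2, 2), (1, 3, 3), (2, 1, 100)]
]
daily = pd.concat(frames, ignore_index=True)

weekly = prepare_weekly_sales(daily, store=1, top_n=2)
print(weekly)
```

On the real file, the same call with `top_n=12` would be applied to `pd.read_csv("train.csv")`.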

Step 2 — Feature Engineering (for XGBoost)

Lag features: demand from 1, 2, 4, 8, 13, 26, 52 weeks ago
Rolling statistics: mean & std deviation over 4, 8, 13, 26-week windows
Calendar features: week of year, month, quarter, year
Strict time-based train/test split (no data leakage)
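
The sketch below shows one way the features listed above could be built without leakage. It is an illustration under assumptions, not the repository's features.py: the function name `build_features`, the column names, and the synthetic series are all hypothetical. The key point is that rolling statistics are computed on a series shifted by one week, so each row only sees past demand.

```python
import numpy as np
import pandas as pd

LAGS = [1, 2, 4, 8, 13, 26, 52]   # weeks back, as listed above
WINDOWS = [4, 8, 13, 26]          # rolling window lengths in weeks

def build_features(series: pd.Series) -> pd.DataFrame:
    """Build lag, rolling, and calendar features for one SKU's weekly series."""
    feats = pd.DataFrame({"y": series})

    # Lag features: demand observed `lag` weeks before the target week.
    for lag in LAGS:
        feats[f"lag_{lag}"] = series.shift(lag)

    # Shift by one week first so every rolling window ends at the previous week.
    past = series.shift(1)
    for window in WINDOWS:
        feats[f"roll_mean_{window}"] = past.rolling(window).mean()
        feats[f"roll_std_{window}"] = past.rolling(window).std()

    # Calendar features derived from the DatetimeIndex.
    idx = series.index
    feats["week_of_year"] = idx.isocalendar().week.astype("int64").to_numpy()
    feats["month"] = idx.month
    feats["quarter"] = idx.quarter
    feats["year"] = idx.year

    # Rows without a full 52-week history contain NaN and are dropped.
    return feats.dropna()

# Synthetic weekly series (values 0..119) used only to make the behaviour easy to check.
index = pd.date_range("2013-01-06", periods=120, freq="W")
series = pd.Series(np.arange(120, dtype=float), index=index)
features = build_features(series)

# Time-based split: the last 12 weeks are the test set, everything earlier is training data.
train, test = features.iloc[:-12], features.iloc[-12:]
print(len(train), len(test))
```

Because the split is by position in time rather than random, every training row precedes every test row.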

Step 3 — Models Compared

| Model | Approach | Strengths |
| --- | --- | --- |
| ARIMA | Statistical time series | Interpretable, handles trends |
| Facebook Prophet | Decomposition-based | Handles seasonality & holidays automatically |
| XGBoost + Optuna | Gradient boosting + HPO | Strong accuracy, uses engineered features |

Step 4 — Evaluation

Test set: last 12 weeks held out per SKU
Metrics: MAE, RMSE, MAPE%
Hyperparameter tuning: Optuna with TimeSeriesSplit cross-validation (20 trials)
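
For reference, the three metrics and the 12-week hold-out can be sketched as follows. The helper names `holdout_split` and `evaluate_forecast` are hypothetical, and the numbers in the demo are made up to keep the arithmetic easy to follow.

```python
import numpy as np

def holdout_split(values, test_weeks: int = 12):
    """Split a chronologically ordered sequence into train and the last `test_weeks` points."""
    return values[:-test_weeks], values[-test_weeks:]

def evaluate_forecast(actual, forecast) -> dict:
    """Return MAE, RMSE, and MAPE (in percent) for one forecast."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    errors = actual - forecast
    return {
        "MAE": float(np.mean(np.abs(errors))),
        "RMSE": float(np.sqrt(np.mean(errors ** 2))),
        # MAPE divides by the actual values, so it is undefined when any actual is zero.
        "MAPE%": float(np.mean(np.abs(errors / actual)) * 100),
    }

# Made-up example: two weeks of actual demand versus forecast.
metrics = evaluate_forecast(actual=[100, 200], forecast=[110, 180])
print(metrics)

# Hold-out on a made-up 24-week history: the last 12 weeks become the test set.
history = list(range(100, 124))
train, test = holdout_split(history, test_weeks=12)
print(len(train), len(test))
```

RMSE penalises large misses more heavily than MAE, which is why the two are reported together.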

📊 Results

Evaluated on the last 12 weeks of data per SKU (held-out test set), averaged across all 12 SKUs:

| Model | Avg MAE | Avg RMSE | Avg MAPE (%) |
| --- | --- | --- | --- |
| ARIMA | 71.44 | 90.11 | 15.49 |
| Prophet | 29.34 | 35.94 | 6.07 ✅ |
| XGBoost | 32.96 | 40.24 | 6.61 |

Key finding: Prophet achieved the lowest average MAPE (6.07%), narrowly ahead of XGBoost (6.61%); both are well below the 10–15% range often quoted as a benchmark for demand forecasting. Prophet's strong performance suggests clear yearly seasonality in this dataset, which it models natively. ARIMA, which uses no engineered features, lagged behind at 15.49%, roughly 2.5 times Prophet's error (equivalently, Prophet's MAPE is about 61% lower than ARIMA's).
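
The relative gaps between the models can be checked directly from the average MAPE values in the table. The short calculation below uses only those three reported numbers; the variable names are arbitrary.

```python
# Average MAPE (%) per model, copied from the results table.
mape = {"ARIMA": 15.49, "Prophet": 6.07, "XGBoost": 6.61}

best = min(mape, key=mape.get)

# How many times larger each model's MAPE is than the best model's.
ratios = {model: value / mape[best] for model, value in mape.items()}

# Relative reduction in MAPE when moving from ARIMA to the best model, in percent.
reduction_vs_arima = (1 - mape[best] / mape["ARIMA"]) * 100

for model, ratio in ratios.items():
    print(f"{model}: {ratio:.2f}x the MAPE of {best}")
print(f"{best} MAPE is {reduction_vs_arima:.1f}% lower than ARIMA's")
```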

🖥️ Streamlit Dashboard Features

SKU selector — switch between any of the 12 product SKUs
Model selector — compare ARIMA, Prophet, XGBoost side by side
Adjustable test horizon — slider from 4 to 24 weeks
Live metrics — MAE, RMSE, MAPE update instantly
Interactive Plotly chart — hover to see exact forecast vs actual values

🛠️ Tech Stack

| Category | Tools |
| --- | --- |
| Language | Python 3.10+ |
| Forecasting | ARIMA (statsmodels), Facebook Prophet, XGBoost |
| Hyperparameter Tuning | Optuna |
| Feature Engineering | Pandas, NumPy |
| Visualisation | Plotly, Matplotlib |
| Dashboard | Streamlit |
| Notebook | Google Colab / Jupyter |

🚀 How to Run

Option 1 — Google Colab (Recommended)

Open P1_Demand_Forecasting_Gaurav.ipynb in Google Colab
Run cells top to bottom (Shift+Enter or Runtime → Run All)
Download train.csv from Kaggle and upload when prompted
Launch the Streamlit dashboard in the final cell — you'll get a public ngrok URL

Option 2 — Run Locally

git clone https://github.com/gauravbhatia-bit/demand-forecasting-engine
cd demand-forecasting-engine
pip install -r requirements.txt
python data/prepare_data.py
streamlit run app/streamlit_app.py

Dataset

Download train.csv from Kaggle:
👉 https://www.kaggle.com/competitions/demand-forecasting-kernels-only/data

🔑 Key Learnings

Prophet outperformed XGBoost (6.07% vs 6.61% MAPE) — counterintuitive but explained by strong yearly seasonality in the dataset, which Prophet models natively without feature engineering
Optuna saves significant time compared with manual grid search: about 20 trials were enough to find well-performing hyperparameters
Avoiding data leakage is critical in time series: every lag and rolling feature is shifted so that it uses only observations from before the week being predicted
MAPE alone can be misleading for low-demand SKUs — always report MAE and RMSE alongside it
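
A small worked example of the last point, using made-up numbers: two SKUs with the same absolute miss of 5 units per week produce very different MAPE values because their demand levels differ.

```python
def mae(actual, forecast):
    """Mean absolute error in units."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)

def mape(actual, forecast):
    """Mean absolute percentage error in percent (actual values must be non-zero)."""
    return sum(abs(a - f) / abs(a) for a, f in zip(actual, forecast)) / len(actual) * 100

# Made-up weekly demand: a high-volume SKU and a low-volume SKU, each missed by 5 units.
high_actual, high_forecast = [500, 500], [495, 505]
low_actual, low_forecast = [10, 10], [5, 15]

print(mae(high_actual, high_forecast), mape(high_actual, high_forecast))
print(mae(low_actual, low_forecast), mape(low_actual, low_forecast))
```

Both SKUs have an MAE of 5 units, yet the low-volume SKU's MAPE is fifty times larger, so MAPE on its own would make that forecast look far worse than its impact in units.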

📁 Related Projects

Bhatia Traders Sales Analysis — EDA & inventory analysis using Python/Pandas
Inventory Alert System — SQL-based real-time stock monitoring

👤 About Me

I'm Gaurav Bhatia, currently pursuing an MSc in Data Science, AI & Digital Business at GISMA University of Applied Sciences in Berlin, Germany.

Before moving into data science, I managed operations at Bhatia Traders — a sauces & condiments business in Chandigarh, India — where I applied Python and SQL to solve real inventory and supply chain challenges. That hands-on experience is what drives my interest in building practical, business-focused data tools.

I'm actively looking for Data Analyst / Data Science internship opportunities in Berlin and Germany.
Feel free to reach out on LinkedIn or via email at gauravbhatia.gb6@gmail.com.

📄 License

MIT License — free to use, modify, and distribute with attribution.
