Sales Forecasting for a Bakery Branch

About

This is a fork of the original team effort for the OpenCampus.sh course Introduction to Data Science and Machine Learning (2025/2026). All code was developed independently by Ulf Wendel; no code from other team members was reused.

Repository Link

https://github.com/nixnutz/course_intro_ds_and_ml

Description

This project forecasts sales for a bakery branch from historical sales data covering July 1, 2013, to July 30, 2018, to inform inventory and staffing decisions. We predict future sales for six product categories: Bread, Rolls, Croissants, Confectionery, Cakes, and Seasonal Bread (Saisonbrot). We start with a baseline linear regression model (OLS) that captures the fundamental trends, then move to a neural network intended to pick up patterns the linear model misses and to improve forecast accuracy. The project covers data preparation, visualization with bar charts that include confidence intervals, and model tuning. Models are assessed on test data from August 1, 2018, to July 30, 2019, using the Mean Absolute Percentage Error (MAPE) for each product category.
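The chronological split described above can be sketched as follows. This is a minimal illustration, not the code from the notebooks: the row layout `(date, category id, sales value)`, the function name `split_by_date`, and the sample rows are assumptions made for this example. Only the date ranges come from the description.

```python
from datetime import date

# Date ranges as stated in the description.
TRAIN_START, TRAIN_END = date(2013, 7, 1), date(2018, 7, 30)
TEST_START, TEST_END = date(2018, 8, 1), date(2019, 7, 30)


def split_by_date(rows):
    """Split rows chronologically; rows outside both ranges are dropped.

    Each row is assumed to be a tuple (day, category_id, sales).
    """
    train = [r for r in rows if TRAIN_START <= r[0] <= TRAIN_END]
    test = [r for r in rows if TEST_START <= r[0] <= TEST_END]
    return train, test


# Made-up rows for illustration only.
rows = [
    (date(2013, 7, 1), 1, 148.8),
    (date(2018, 7, 30), 2, 512.3),
    (date(2018, 7, 31), 2, 498.0),  # lies between the two stated ranges
    (date(2018, 8, 1), 1, 160.2),
    (date(2019, 7, 30), 5, 270.5),
]
train, test = split_by_date(rows)
print(len(train), "training rows,", len(test), "test rows")
```

Splitting by date rather than at random keeps every test day strictly after every training day, which matches how a forecast would be used in practice.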

Task Type

Regression

Results Summary

Best Model

Neural Network Model Version 1.2 (balanced_v1_2.ipynb)

Evaluation Metrics

  • Primary metric: MAPE (Mean Absolute Percentage Error)
  • Business target: Precision gain (revenue-weighted improvement)

Model Performance (v1.2)

Metric                                                   | Value
-------------------------------------------------------- | ---------
Training MAPE                                            | 17.33%
Validation MAPE                                          | 18.54%
Validation MAE                                           | 30.51 EUR
Seasonal Bread (Saisonbrot) wMAPE (sales months only)    | 31.25%

Precision Gain vs. Baseline

Model          | Precision Gain (€/year vs. OLS) | Incremental Gain
-------------- | ------------------------------- | ----------------
OLS (baseline) | 0                               | -
NN 1.0         | +100,374                        | -
NN 1.1         | +106,936                        | +6,560 vs. 1.0
NN 1.2         | +109,306                        | +2,370 vs. 1.1

Results by Product Category (v1.2)

Category           | MAPE / wMAPE
------------------ | --------------
Bread (1)          | 18.88%
Rolls (2)          | 10.95%
Croissant (3)      | 19.93%
Confectionery (4)  | 23.33%
Cake (5)           | 14.79%
Seasonal Bread (6) | 31.25% (wMAPE)

Note: For detailed breakdowns and analysis, see 3_Model/.
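Seasonal Bread is reported as wMAPE on sales months only. A plausible reason is that plain MAPE is undefined on days with zero actual sales, while wMAPE divides the total absolute error by the total actual sales. The sketch below shows that calculation under stated assumptions: the function names, the choice of November and December as sales months, and the sample data are all hypothetical and not taken from the notebooks.

```python
from datetime import date


def wmape(actual, predicted):
    """Weighted MAPE in percent: total absolute error over total actual sales."""
    total_actual = sum(abs(a) for a in actual)
    if total_actual == 0:
        raise ValueError("wMAPE is undefined when total actual sales are zero")
    total_error = sum(abs(a - p) for a, p in zip(actual, predicted))
    return 100.0 * total_error / total_actual


def wmape_in_months(days, actual, predicted, months):
    """Compute wMAPE using only the days whose month is in `months`."""
    kept = [(a, p) for d, a, p in zip(days, actual, predicted) if d.month in months]
    return wmape([a for a, _ in kept], [p for _, p in kept])


# Made-up data for illustration only.
days = [date(2018, 11, 5), date(2018, 12, 3), date(2019, 3, 4)]
actual = [80.0, 120.0, 0.0]      # no seasonal sales in March
predicted = [60.0, 130.0, 5.0]
SALES_MONTHS = {11, 12}          # hypothetical sales months

seasonal_score = wmape_in_months(days, actual, predicted, SALES_MONTHS)
print(f"wMAPE (sales months only): {seasonal_score:.2f}%")
```

Restricting the evaluation to sales months keeps long stretches of zero sales from dominating the score for a product that is only sold part of the year.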

Why Precision Gain?

The models have largely exhausted the information in the dataset; further generic optimization risks fitting noise. We therefore defined a fictitious business goal, precision gain, and optimize for revenue per product group, prioritizing Seasonal Bread (Saisonbrot), where the highest gains are expected. See 3_Model/ for details.
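The exact definition of precision gain lives in 3_Model/ and is not reproduced here. As a rough illustration only, the sketch below assumes one possible reading: the reduction in total absolute forecast error (in EUR) relative to the OLS baseline, scaled to one year. The function name, the scaling by 365 days, and the sample numbers are all assumptions, not the project's method.

```python
def precision_gain_per_year(actual, baseline_pred, model_pred, n_days):
    """Hypothetical precision gain: yearly reduction in absolute error vs. a baseline.

    Assumes all values are daily sales in EUR covering `n_days` days.
    """
    if n_days <= 0:
        raise ValueError("n_days must be positive")
    baseline_error = sum(abs(a - b) for a, b in zip(actual, baseline_pred))
    model_error = sum(abs(a - m) for a, m in zip(actual, model_pred))
    # Positive values mean the model is closer to actual sales than the baseline.
    return (baseline_error - model_error) * 365 / n_days


# Made-up numbers for illustration only (two days of data).
actual = [100.0, 200.0]
ols_pred = [130.0, 150.0]   # baseline misses by 30 and 50 EUR
nn_pred = [110.0, 190.0]    # model misses by 10 and 10 EUR

gain = precision_gain_per_year(actual, ols_pred, nn_pred, n_days=2)
print(f"Precision gain: {gain:+,.0f} EUR/year")
```

Under this reading, a baseline compared against itself yields a gain of 0, which is consistent with the OLS row in the table above.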

Setup Instructions

GitHub Codespaces is the recommended setup method for this project. It provides a consistent development environment without requiring local configuration.

See SETUP.md for detailed instructions on:

  • GitHub Codespaces setup (recommended)
  • Connecting VS Code (or Cursor, a VS Code derivative) to your Codespace
  • Local development setup
  • Troubleshooting

Documentation

  1. Data Import and Preparation
  2. Dataset Characteristics
  3. Baseline Model
  4. Model Definition and Evaluation
