"Turning raw retail data into crystal-clear sales foresight."
🎯 Powered by: XGBoost, LightGBM, CatBoost, Prophet, Optuna, and CUDA
💼 Engineered by: Vishnupriyan P R
SaleFore AI is a modular, production-ready, GPU-accelerated machine learning system that delivers highly accurate sales forecasting for retail and e-commerce businesses. With automated preprocessing, feature engineering, model training, and Power BI-ready outputs — this suite was built to replace traditional forecasting pipelines with intelligent automation and accuracy.
- 🔁 Automated Feature Engineering – Temporal, cyclical, rolling, lag, and store-level features
- 🧠 Ensemble Models – Uses XGBoost, CatBoost, LGBM, RF, ElasticNet, Prophet
- ⚙️ Hyperparameter Tuning – Smart optimization using Optuna
- ⚡ GPU Support – Full GPU-acceleration (RTX tested)
- 🧪 Time Series Validation – Non-leaky splits using TimeSeriesSplit
- 📦 Modular Pipeline – Train once, reuse with ease; full model persistence
- 📊 Power BI Compatible Output – Exported forecasts ready to be visualized
- 🔒 Failsafe CPU fallback – No GPU? No problem.
SaleFore-AI/
├── data/
│ ├── raw/ # Unprocessed datasets
│ ├── processed/ # Cleaned and feature-rich data
│ └── forecasts/ # Final forecast results
├── models/ # Trained models and artifacts
├── outputs/ # Metrics, logs, plots, evals
├── src/
│ ├── config.py # Constants, file paths, parameters
│ ├── preprocess.py # All cleaning + feature engineering logic
│ ├── train_model.py # Model training and tuning
│ ├── forecast.py # Inference pipeline for future dates
│ └── utils.py # Common utilities
└── requirements.txt # Dependencies list
git clone https://github.com/vishnupriyanpr183207/SaleFore-AI
cd SaleFore-AI
pip install -r requirements.txtSet paths for:
RAW_DATA_PATH,PROCESSED_DATA_PATHFORECAST_OUTPUT_PATHMODEL_SAVE_DIR, etc.
Also configure:
- GPU flags
- Forecasting horizon
- Train/val/test split logic
- Feature flags (e.g., enable/disable Prophet)
- Handles missing data, removes outliers
- Generates rolling mean, lag, day-of-week, month, holiday, promo, event, etc.
- Trains multiple models in parallel (LightGBM, CatBoost, Prophet, XGBoost, ElasticNet, RF)
- Optimizes using Optuna with early stopping
- Exports:
model.pkl,metrics.json,importance.png
- Loads trained models
- Predicts N future time steps
- Saves output in Power BI-compatible CSV
outputs/
├── forecast_results.csv # Final predictions
├── model_metrics.json # Accuracy, RMSE, MAPE, etc.
├── plots/
│ ├── feature_importance.png
│ └── prediction_vs_actual.png
- Import
forecast_results.csvin Power BI or Tableau - Connect model metrics
- Build dynamic trend visualizations & what-if simulators
erDiagram
RAW_DATA ||--o{ PROCESSED_DATA : cleaned_from
PROCESSED_DATA ||--o{ FEATURES : generates
FEATURES ||--|| CONFIG : uses
FEATURES ||--o{ MODEL_TRAINING : input_for
MODEL_TRAINING ||--|{ MODELS : generates
MODEL_TRAINING ||--o{ VALIDATION_RESULTS : outputs
MODELS ||--o{ FORECASTS : used_for
CONFIG ||--|| FORECASTS : controls
FORECASTS ||--o{ PLOTS : visualized_as
FORECASTS ||--o{ METRICS : evaluated_by
METRICS ||--|| PLOTS : illustrated_with
RAW_DATA {
string store_id
string date
int sales
int promo_flag
}
PROCESSED_DATA {
string store_id
string date
float clean_sales
int is_holiday
string weekday
}
FEATURES {
float rolling_mean
float lag_7
int day_of_week
bool is_event
}
CONFIG {
string model_type
int forecast_horizon
string gpu_flag
}
MODEL_TRAINING {
datetime start_time
datetime end_time
string model_name
}
MODELS {
string file_path
string format
datetime trained_on
}
FORECASTS {
date future_date
float predicted_sales
string model_used
}
METRICS {
float rmse
float mape
float r2_score
}
PLOTS {
string file_path
string type
}
Made with 💓 Vishnupriyan P R
📧 priyanv783@gmail.com
🔗 GitHub: vishnupriyanpr
|
|
MIT License – use, share, and scale freely.