(from Kaggle Competition)
Rossmann is a chain of stores in Europe. This project comes from the Kaggle competition Rossmann Store Sales. The goal is to predict Rossmann stores' sales from the provided data, such as promotions, competition, and holidays.
In this project, I use Python for data exploration, visualization, and feature engineering. The Python version is 3.5.5 (Windows x64). The packages involved include:
- Jupyter Notebook
- numpy
- pandas
- seaborn
- matplotlib
- xgboost
- To run this project, create a subdirectory named input. It must contain the data downloaded from Kaggle and unzipped locally: train.csv and test.csv.
- In Jupyter Notebook, load "Rossmann Store project for report Eng.ipynb" and execute it. Execution takes about 7 hours.
- The result is written to "xgboost_rossman_submission.csv", which can be uploaded to Kaggle.
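Because the notebook runs for hours, it may be worth confirming that the data files are in place before starting. The following is a minimal sketch, not part of the notebook: the helper name `missing_inputs` is hypothetical, and the directory and file names are the ones listed in the steps above.

```python
from pathlib import Path

# File names taken from the run instructions above.
REQUIRED_FILES = ("train.csv", "test.csv")


def missing_inputs(input_dir="input"):
    """Return the required data files that are not present in input_dir.

    Hypothetical helper for illustration; the notebook itself does not
    define it.
    """
    base = Path(input_dir)
    return [name for name in REQUIRED_FILES if not (base / name).is_file()]
```

Calling `missing_inputs()` from the project root returns an empty list when both files are present, and the names of the missing files otherwise.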
Results:
- Local test set RMSPE: 0.109707
- Kaggle private score: 0.11518
- Kaggle public score: 0.11636
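For reference, RMSPE (root mean square percentage error) can be computed as in the sketch below. It assumes the competition's convention of ignoring rows where the true sales value is 0. The function name `rmspe` is chosen here for illustration and is not necessarily the one used in the notebook.

```python
import numpy as np


def rmspe(y_true, y_pred):
    """Root mean square percentage error, skipping rows with zero true sales."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    # Zero-sales rows are excluded so the percentage error stays defined.
    mask = y_true != 0
    pct_error = (y_true[mask] - y_pred[mask]) / y_true[mask]
    return float(np.sqrt(np.mean(pct_error ** 2)))
```

For example, predicting 110 and 180 for true sales of 100 and 200 gives a 10% error on each row, so the RMSPE is 0.1. Lower values are better.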