This repository contains the code and analysis for a sales prediction project. The project utilizes a dataset obtained from Kaggle, which can be found here.
The main goal of this project is to predict sales based on advertising data. The following steps were performed:
- Exploratory Data Analysis (EDA): A thorough exploration of the dataset was conducted to understand its structure and characteristics.
- Data Preprocessing: The dataset underwent preprocessing to handle missing values, encode categorical variables, and scale numerical features.
- Data Visualization: Various visualizations were created to gain insights into the relationships between different variables in the dataset.
- Model Building:
  - Linear Regression: A linear regression model was implemented to predict sales based on the features in the dataset.
  - Random Forest Regression: A random forest regression model was also trained and evaluated for comparison.
- Results Comparison: The results of the linear regression and random forest regression models were compared to determine the effectiveness of each model in predicting sales.
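The model building and comparison steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: it runs on synthetic data rather than the Kaggle dataset, and the three feature columns, the coefficients, the 80/20 split, the scaling step, and the choice of RMSE and R² as metrics are all assumptions made for the example.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def make_synthetic_ads(n_rows=200, seed=0):
    """Generate stand-in data. This is NOT the Kaggle dataset."""
    rng = np.random.default_rng(seed)
    # Three hypothetical advertising-spend features (assumed for illustration).
    X = rng.uniform(0, 100, size=(n_rows, 3))
    # Hypothetical linear relationship plus Gaussian noise.
    y = 3.0 + 0.05 * X[:, 0] + 0.1 * X[:, 1] + rng.normal(0, 1.0, n_rows)
    return X, y


def compare_models(X, y, seed=0):
    """Fit both models on one train/test split and return their test metrics."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=seed
    )
    models = {
        # Scaling is applied inside a pipeline so it is fit on training data only.
        "linear_regression": make_pipeline(StandardScaler(), LinearRegression()),
        "random_forest": RandomForestRegressor(n_estimators=100, random_state=seed),
    }
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        pred = model.predict(X_test)
        scores[name] = {
            "rmse": float(np.sqrt(mean_squared_error(y_test, pred))),
            "r2": float(r2_score(y_test, pred)),
        }
    return scores


if __name__ == "__main__":
    X, y = make_synthetic_ads()
    for name, s in compare_models(X, y).items():
        print(f"{name}: RMSE={s['rmse']:.3f}, R2={s['r2']:.3f}")
```

Evaluating both models on the same held-out split keeps the comparison fair. To try this on the real data, replace `make_synthetic_ads` with code that loads the downloaded file and selects the feature and target columns.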
To replicate or further explore the analysis, follow these steps:

    git clone https://github.com/pelinozden/Sales_Prediction_Linear_Regression_Random_Forest_Regression.git
    cd Sales_Prediction_Linear_Regression_Random_Forest_Regression
    conda activate your_environment_name

The dataset used in this project can be found on Kaggle here. Make sure to download it and place it in the data/ directory before running the notebook.
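Before opening the notebook, it can help to confirm that the dataset actually landed in data/. The helper below is a hypothetical sketch, not part of this repository: `find_csv_files` is a name made up for this example, and it assumes the Kaggle download is a CSV file, which the text above does not state.

```python
from pathlib import Path


def find_csv_files(data_dir="data"):
    """Return the CSV files in data_dir sorted by name, or [] if the directory is missing."""
    directory = Path(data_dir)
    if not directory.is_dir():
        return []
    return sorted(
        p for p in directory.iterdir()
        if p.is_file() and p.suffix.lower() == ".csv"
    )


if __name__ == "__main__":
    files = find_csv_files()
    if files:
        print("Found dataset file(s):", ", ".join(f.name for f in files))
    else:
        print("No CSV found in data/; download the dataset from Kaggle first.")
```

Run it from the repository root so that the relative data/ path resolves correctly.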
This project demonstrates the process of exploring, analyzing, and predicting sales based on advertising data. The comparison of linear regression and random forest regression models provides insights into the predictive performance of each approach.
For any questions or suggestions, feel free to open an issue or contact me.
Happy coding!