House price prediction using Kaggle's Advanced Regression Techniques dataset. EDA reveals key features (OverallQual, GarageArea) driving SalePrice across 1460 homes. Features missing value analysis, correlations, and visualization.

rivu-intel45/house-prices-regression

🏡 House Prices Exploratory Data Analysis (EDA)

Python Pandas Seaborn


📊 Overview

Discover the hidden patterns behind what drives house prices.
This project performs an in‑depth exploratory data analysis (EDA) on the
Kaggle House Prices – Advanced Regression Techniques dataset using Python and popular data science libraries.


πŸ“ Dataset

  • Rows: 1,460 Β  | Β  Columns: 81
  • Source: Kaggle competition (House Prices – Advanced Regression Techniques)
  • Target variable: SalePrice

🚀 Quick Start

  1. Clone the repository and enter it:
    git clone https://github.com/rivu-intel45/house-prices-regression.git
    cd house-prices-regression

  2. Install dependencies:
    pip install -r requirements.txt

  3. Launch Jupyter, then open house-regression.ipynb and run all cells:
    jupyter notebook

  4. Add data (if needed)
    If train.csv is not present, download it from Kaggle
    and place it in an input/ folder (or update the path in the notebook).
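The data-loading step can be sketched in Python as follows. This is a minimal sketch, assuming the input/ folder and train.csv file name mentioned in step 4; `load_train` is a hypothetical helper for illustration, not a function taken from the notebook.

```python
from pathlib import Path

import pandas as pd


def load_train(data_dir="input", filename="train.csv"):
    """Load the training CSV, failing with a clear message if it is absent.

    data_dir and filename are assumptions based on the README; adjust them
    if the notebook reads the file from a different location.
    """
    path = Path(data_dir) / filename
    if not path.exists():
        raise FileNotFoundError(
            f"{path} not found. Download train.csv from the Kaggle "
            "competition page and place it in this folder."
        )
    return pd.read_csv(path)
```

If the file matches the dataset described above, `load_train().shape` should report 1,460 rows and 81 columns.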


🧐 EDA Highlights

  • Target distribution
    Distribution and spread of SalePrice, including skewness and outliers.

  • Missing value analysis
    Bar plots to visualize columns with the highest proportion of missing data.

  • Key feature relationships
      • GarageArea vs. SalePrice
      • OverallQual vs. SalePrice
      • Impact of SaleType on prices

  • Visual insights
      • Histograms, KDE plots, and boxplots for numeric features
      • Scatter plots for important feature–price relationships

  • Major findings
      • Higher OverallQual and GarageArea are strongly associated with higher SalePrice.
      • Some SaleType categories correspond to notable price outliers.
      • Several categorical features contain substantial missing values that need careful handling.
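The numeric side of these highlights can be approximated with a few pandas helpers. This is a rough sketch rather than the notebook's actual code: the function names are hypothetical, and the log1p comparison is an added illustration that the README does not mention.

```python
import numpy as np
import pandas as pd


def missing_ratio(df):
    """Fraction of missing values per column, highest first (only columns with gaps)."""
    ratio = df.isna().mean()
    return ratio[ratio > 0].sort_values(ascending=False)


def target_skew(df, target="SalePrice"):
    """Skewness of the target, raw and after log1p (the transform is an assumption)."""
    return {
        "raw": float(df[target].skew()),
        "log1p": float(np.log1p(df[target]).skew()),
    }


def top_correlations(df, target="SalePrice", n=10):
    """Numeric features ranked by absolute Pearson correlation with the target."""
    corr = df.select_dtypes(include="number").corr()[target].drop(target)
    order = corr.abs().sort_values(ascending=False).index
    return corr.reindex(order).head(n)
```

With the training data loaded into a DataFrame `df`, `missing_ratio(df).head(10)` gives the values behind the missing-data bar plot, and `top_correlations(df)` shows where features such as OverallQual and GarageArea rank against SalePrice.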


📦 Requirements

  • Python 3.8+
  • pandas
  • numpy
  • matplotlib
  • seaborn

🤝 Contributions

Pull requests are welcome.
For larger changes, please open an issue first to discuss what you would like to add or modify.


Happy analyzing! 🏠📈
