AndryADSM/Predicting-House-Prices

Predicting House Prices

🌐 Check this project on my website!

Go to Kaggle.

Files

  • 'house-prices-with-feature-analyzer.ipynb' is the Python notebook where all the work was done. It runs correctly only inside Kaggle.
  • 'submission.csv' is the final output submitted to the Kaggle competition.

The code (.ipynb) and output files are also available on Kaggle. Note that the notebook runs correctly only inside Kaggle.


📌 Type

Kaggle Competition, Regression.

⚜️ Domain

Real Estate, House Prices.

💻 Technologies

  • Python (Kaggle Notebook)
    • pandas
    • numpy
    • sklearn
    • matplotlib
    • seaborn

🕹️ Skills

  • Machine Learning
  • Data Preprocessing
  • Feature Engineering
  • Data Visualization
  • Data Analysis

🏘️ Worked on the Kaggle competition "House Prices - Advanced Regression Techniques", where I predicted the sale price of 1459 houses using Python 🐍, training on a dataset of 1460 records with 79 features.

🔎 Performed Exploratory Data Analysis (EDA), examining missing values, distributions, counts, correlations and more, relying heavily on pandas, matplotlib and seaborn.

📊 Created a "Feature Analyzer" for EDA, which shows relevant information and plots for a given feature, categorical or numerical, to give quick insights. It is built on matplotlib and seaborn.
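As a rough illustration of the kind of per-feature summary such an analyzer might report, here is a minimal sketch. It is not the notebook's actual implementation: the function name `analyze_feature`, the returned fields, and the toy data are all assumptions made for this example, and the plotting part is left out.

```python
import pandas as pd


def analyze_feature(df, column):
    """Return a small summary dict for one column (hypothetical helper)."""
    s = df[column]
    info = {
        "dtype": str(s.dtype),
        "missing": int(s.isna().sum()),   # number of NaN entries
        "unique": int(s.nunique()),       # distinct non-missing values
    }
    if pd.api.types.is_numeric_dtype(s):
        info["kind"] = "numerical"
        info["mean"] = float(s.mean())
        info["median"] = float(s.median())
        info["skew"] = float(s.skew())
    else:
        info["kind"] = "categorical"
        # Most frequent categories and their counts
        info["top_counts"] = s.value_counts().head(5).to_dict()
    return info


# Toy data, made up for the example
toy = pd.DataFrame({
    "LotArea": [8450, 9600, None, 11250],
    "Street": ["Pave", "Pave", "Grvl", None],
})

print(analyze_feature(toy, "LotArea"))
print(analyze_feature(toy, "Street"))
```

A summary like this could then be paired with a histogram for numerical features or a count plot for categorical ones.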

🧹 Used pandas, numpy and sklearn for cleaning and preprocessing: changing data types, ordinal encoding, dummy variables, extensive feature engineering 🛠️ and more.
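The two encoding steps mentioned above can be sketched with pandas alone. This is only an illustration on a toy frame: the column names, the quality ordering, and the values are assumptions chosen for the example, not taken from the notebook.

```python
import pandas as pd

# Toy frame; column names and values are illustrative assumptions
df = pd.DataFrame({
    "ExterQual": ["TA", "Gd", "Ex", "Fa"],
    "Neighborhood": ["NAmes", "CollgCr", "NAmes", "OldTown"],
    "GrLivArea": [1200, 1800, 2400, 900],
})

# Ordinal encoding: the categories have a natural order, so map them to integers
quality_order = {"Po": 0, "Fa": 1, "TA": 2, "Gd": 3, "Ex": 4}
df["ExterQual"] = df["ExterQual"].map(quality_order).astype("int64")

# Dummies: no natural order, so expand into one 0/1 column per category
df = pd.get_dummies(df, columns=["Neighborhood"], dtype=int)

print(df)
```

Ordinal encoding keeps a single column and preserves the ranking, while dummies avoid implying an order between unrelated categories at the cost of extra columns.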

🤖 Tested different models, including several from sklearn such as RandomForestRegressor and GradientBoostingRegressor, tuned with GridSearchCV. CatBoostRegressor turned out to be the best model.
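A grid search over a gradient boosting model might look roughly like the sketch below. The synthetic data, the parameter grid, and the built-in RMSE scoring are assumptions for illustration only; the notebook's actual grid and scorer are not shown here.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import GridSearchCV

# Synthetic regression data standing in for the house-price features
X, y = make_regression(n_samples=200, n_features=10, noise=10.0, random_state=0)

# Small illustrative grid; a real search would likely cover more values
param_grid = {"n_estimators": [50, 100], "max_depth": [2, 3]}

search = GridSearchCV(
    GradientBoostingRegressor(random_state=0),
    param_grid,
    scoring="neg_root_mean_squared_error",  # sklearn maximizes, hence the negated RMSE
    cv=3,
)
search.fit(X, y)

print(search.best_params_)
print(-search.best_score_)  # flip the sign back to get a plain RMSE
```

The same pattern applies to RandomForestRegressor by swapping the estimator and adjusting the grid to its parameters.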

🧾 Evaluated performance with a custom scorer, RMSLE (root mean squared logarithmic error), and got 0.12236, which ranks within the top 10% of competitors 🏆.
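For reference, RMSLE is commonly defined as

$$\mathrm{RMSLE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\bigl(\log(1+\hat{y}_i) - \log(1+y_i)\bigr)^2}$$

A minimal sketch of wrapping it as an sklearn scorer follows. The function name `rmsle`, the use of `log1p`, and the clipping of negative predictions are assumptions for this example and may differ from the notebook's scorer.

```python
import numpy as np
from sklearn.metrics import make_scorer


def rmsle(y_true, y_pred):
    """Root mean squared logarithmic error (lower is better)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    # Assumption: clip negative predictions to 0 so log1p stays defined
    y_pred = np.clip(y_pred, 0, None)
    diff = np.log1p(y_pred) - np.log1p(y_true)
    return float(np.sqrt(np.mean(diff ** 2)))


# greater_is_better=False makes sklearn negate the value, since it maximizes scores
rmsle_scorer = make_scorer(rmsle, greater_is_better=False)

print(rmsle([100000, 200000], [110000, 190000]))
```

Because the error is measured on a log scale, a given percentage miss costs the same on a cheap house as on an expensive one.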

