
The project implements regression models to estimate rental prices of properties across the USA.


allorenz/rental-price-estimator


Introduction

The objective of this project is to predict the monthly rental prices of real estate properties across the USA. The dataset, Apartment for Rent Classified (2019, UCI Machine Learning Repository, https://doi.org/10.24432/C5X623), consists of approximately 100,000 entries. Each entry describes a property in detail, including its size (square footage), number of bathrooms, and amenities (e.g., air conditioning, garage, pool). This project uses these features to build predictive models for rental pricing with Linear Regression, XGBRegressor, and a Neural Network.

Model Evaluation Results

| Model | Split | Encoding | MSE | RMSE | MAE | STD | R² |
|---|---|---|---|---|---|---|---|
| LinearRegression | 60/20/20 | target encoding | 232525.45 | 482.21 | 259.83 | 860.21 | 0.75 |
| LinearRegression | 80/20 | target encoding | 232844.02 | 482.54 | 259.77 | 860.21 | 0.75 |
| LinearRegression | 60/20/20 | label encoding | 556765.25 | 746.17 | 453.85 | 860.21 | 0.28 |
| LinearRegression | 80/20 | label encoding | 556278.57 | 745.84 | 454.03 | 860.21 | 0.28 |
| NeuralNet | 60/20/20 | target encoding | 235137.10 | 484.91 | 268.31 | 860.21 | NaN |
| NeuralNet | 80/20 | target encoding | 245613.92 | 495.59 | 263.95 | 860.21 | NaN |
| NeuralNet | 60/20/20 | label encoding | 595048.13 | 771.39 | 548.00 | 860.21 | NaN |
| NeuralNet | 80/20 | label encoding | 563128.43 | 750.42 | 514.33 | 860.21 | NaN |
| XGBRegressor | 60/20/20 | target encoding | 171560.15 | 414.20 | 210.52 | 860.21 | 0.83 |
| XGBRegressor | 80/20 | target encoding | 149103.65 | 386.14 | 207.05 | 860.21 | 0.83 |
| XGBRegressor | 60/20/20 | label encoding | 228073.42 | 477.57 | 244.76 | 860.21 | 0.78 |
| XGBRegressor | 80/20 | label encoding | 213766.41 | 462.35 | 244.42 | 860.21 | 0.78 |
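The metric columns follow the standard definitions, and RMSE is the square root of MSE. The sketch below shows one way to compute them with the Python standard library. It is an illustration, not the project's evaluation code: the function name `regression_metrics` and the sample rents are made up, and it assumes the STD column is the (population) standard deviation of the observed prices.

```python
import math
from statistics import mean, pstdev


def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE, target STD and R^2 for paired observations."""
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mse = mean(e * e for e in errors)
    mae = mean(abs(e) for e in errors)
    y_bar = mean(y_true)
    ss_res = sum(e * e for e in errors)              # residual sum of squares
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)   # total sum of squares
    return {
        "mse": mse,
        "rmse": math.sqrt(mse),
        "mae": mae,
        "std": pstdev(y_true),  # assumption: STD is measured on the target
        "r2": 1 - ss_res / ss_tot,
    }


# Made-up monthly rents and predictions, for illustration only.
y_true = [1000.0, 1500.0, 2000.0, 2500.0]
y_pred = [1100.0, 1400.0, 2100.0, 2400.0]
metrics = regression_metrics(y_true, y_pred)

# Consistency check against the table: RMSE is the square root of MSE.
best_rmse = round(math.sqrt(149103.65), 2)
```

For the best row of the table, the square root of the reported MSE (149103.65) rounds to the reported RMSE (386.14).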

Conclusion

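  • XGBRegressor (80/20) with target encoding has the lowest MSE, RMSE, and MAE of all configurations and ties for the highest R² (0.83), making it the best-performing model.
  • Linear Regression with target encoding performs far better than with label encoding (R² 0.75 vs. 0.28), but it still lags behind the XGBRegressor models.
  • Neural Network models do not improve on Linear Regression: with target encoding their RMSE is 484.91 to 495.59, compared with about 482 for Linear Regression. No R² was recorded for them (NaN).
  • Target-encoded data outperforms label-encoded data for every model and split.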
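To make the difference between the two encodings concrete, here is a minimal sketch in plain Python. The README does not say how the encodings were implemented, so everything here is an assumption for illustration: the helper names `label_encode` and `target_encode`, the `states` column, and the prices are hypothetical, and the target encoding is the simplest per-category mean without smoothing.

```python
from collections import defaultdict


def label_encode(categories):
    """Map each distinct category to an integer (sorted order, arbitrary scale)."""
    mapping = {c: i for i, c in enumerate(sorted(set(categories)))}
    return [mapping[c] for c in categories], mapping


def target_encode(categories, targets):
    """Map each category to the mean target over its rows (no smoothing)."""
    sums, counts = defaultdict(float), defaultdict(int)
    for c, t in zip(categories, targets):
        sums[c] += t
        counts[c] += 1
    mapping = {c: sums[c] / counts[c] for c in sums}
    return [mapping[c] for c in categories], mapping


# Hypothetical categorical column and rents, for illustration only.
states = ["CA", "TX", "CA", "NY", "TX"]
prices = [2400.0, 1200.0, 2600.0, 3000.0, 1400.0]

label_codes, label_map = label_encode(states)
target_codes, target_map = target_encode(states, prices)
```

Label encoding assigns integers whose order carries no pricing information, whereas target encoding places each category on the price scale. That may explain why a linear model gains the most from it. In practice the category means should be fitted on the training split only, so that validation and test prices do not leak into the features.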

Taking the standard deviation ($860.21) as the threshold, the performance of the XGBRegressor (80/20) with target encoding is deemed sufficient. The model's RMSE ($386.14) and MAE ($207.05) are well below the standard deviation, indicating relatively good performance. An MAE of $207.05 is still a noticeable deviation; it can be attributed to the inherent complexity and variability of the market, as well as to limitations in the available data and the assumptions made by the model.
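The comparison against the standard deviation can be made explicit by expressing each error as a fraction of it. The short calculation below uses the values from the XGBRegressor (80/20, target encoding) row of the results table; the variable names are illustrative.

```python
# Values taken from the results table (XGBRegressor, 80/20, target encoding).
std = 860.21
rmse = 386.14
mae = 207.05

# Error expressed as a fraction of the standard deviation.
rmse_ratio = rmse / std
mae_ratio = mae / std

print(f"RMSE / STD = {rmse_ratio:.2f}")
print(f"MAE  / STD = {mae_ratio:.2f}")
```

The RMSE is roughly 45% of the standard deviation and the MAE roughly 24%.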

The Neural Network did not yield significant improvements, likely because of the limited amount of data. This suggests that Linear Regression and XGBRegressor are better suited to smaller datasets.
