GitHub - allorenz/rental-price-estimator: The project implements regression models to estimate rental prices of properties across the USA.

Introduction

The objective of this project is to predict the monthly rental prices of real estate properties across the USA. The dataset, Apartment for Rent Classified (2019, UCI Machine Learning Repository, https://doi.org/10.24432/C5X623), consists of approximately 100,000 entries. It provides details about each property, such as size (square footage) and number of bathrooms, along with additional features such as amenities (e.g., air conditioning, garage, pool). This project uses these features to develop accurate predictive models for rental pricing with Linear Regression, XGB Regressor, and a Neural Network.

Model Evaluation Results

Model	Split	Encoding	MSE	RMSE	MAE	STD	R^2
LinearRegression	(60/20/20)	target encoding	232525.45	482.21	259.83	860.21	0.75
LinearRegression	(80/20)	target encoding	232844.02	482.54	259.77	860.21	0.75
LinearRegression	(60/20/20)	label encoding	556765.25	746.17	453.85	860.21	0.28
LinearRegression	(80/20)	label encoding	556278.57	745.84	454.03	860.21	0.28
NeuralNet	(60/20/20)	target encoding	235137.10	484.91	268.31	860.21	NaN
NeuralNet	(80/20)	target encoding	245613.92	495.59	263.95	860.21	NaN
NeuralNet	(60/20/20)	label encoding	595048.13	771.39	548.00	860.21	NaN
NeuralNet	(80/20)	label encoding	563128.43	750.42	514.33	860.21	NaN
XGBRegressor	(60/20/20)	target encoding	171560.15	414.20	210.52	860.21	0.83
XGBRegressor	(80/20)	target encoding	149103.65	386.14	207.05	860.21	0.83
XGBRegressor	(60/20/20)	label encoding	228073.42	477.57	244.76	860.21	0.78
XGBRegressor	(80/20)	label encoding	213766.41	462.35	244.42	860.21	0.78
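
For readers who want to reproduce the error columns, the following is a minimal sketch of how MSE, RMSE, MAE, and R² relate to each other. It is not taken from the project's `src` folder; the function name `regression_metrics` and the sample rents are made up for illustration.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE and R^2 for paired actual and predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    ss_res = sum(e * e for e in errors)           # residual sum of squares
    mse = ss_res / n                              # mean squared error
    mae = sum(abs(e) for e in errors) / n         # mean absolute error

    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    # R^2 is undefined when all true values are identical
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")

    return {"MSE": mse, "RMSE": math.sqrt(mse), "MAE": mae, "R2": r2}


# Hypothetical monthly rents in USD, for illustration only
actual = [1000.0, 1500.0, 2000.0, 2500.0]
predicted = [1100.0, 1400.0, 2100.0, 2300.0]
metrics = regression_metrics(actual, predicted)
print(metrics)
```

RMSE is always the square root of MSE, which is a quick consistency check for any row of the table (for example, the square root of 149103.65 is about 386.14).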

Conclusion

XGBRegressor (80/20) with target encoding achieves the lowest MSE, RMSE, and MAE of all configurations and ties for the highest R² (0.83), making it the best-performing model.
Linear Regression models with target encoding perform better than those with label encoding (R² of 0.75 versus 0.28), but they still lag behind the XGBRegressor models.
Neural Network models do not provide meaningful improvements: their errors are higher than those of Linear Regression in every configuration.
Target-encoded data outperforms label-encoded data for every model.
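
The difference between the two encodings can be illustrated with a small sketch. This is an assumption about how they might be implemented, not the project's actual code: the helper names, the `cities` column, and the rents are hypothetical, and mean target encoding is only one common variant.

```python
from collections import defaultdict


def label_encode(categories):
    """Map each distinct category to an arbitrary integer (alphabetical order)."""
    mapping = {cat: i for i, cat in enumerate(sorted(set(categories)))}
    return [mapping[c] for c in categories], mapping


def target_encode(categories, targets):
    """Map each category to the mean target value observed for that category."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for c, t in zip(categories, targets):
        sums[c] += t
        counts[c] += 1
    mapping = {c: sums[c] / counts[c] for c in sums}
    return [mapping[c] for c in categories], mapping


# Hypothetical categorical feature and monthly rents in USD
cities = ["Austin", "Boston", "Austin", "Denver", "Boston"]
rents = [1500.0, 2800.0, 1700.0, 1900.0, 3000.0]

label_codes, label_map = label_encode(cities)
target_codes, target_map = target_encode(cities, rents)
print(label_codes)
print(target_codes)
```

Label codes carry no information about price (the integer order is arbitrary), whereas target codes place each category on the same scale as the rent, which may explain why a linear model benefits so much from them. In practice the target mapping would be learned on the training split only, to avoid leaking test prices into the features.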

Taking the standard deviation ($860.21) as the threshold, the performance of XGBRegressor (80/20) with target encoding is deemed sufficient. The model's RMSE ($386.14) and MAE ($207.05) are well below the standard deviation, indicating relatively good performance. An MAE of $207.05 is still a noticeable deviation, which can be attributed to the inherent complexity and variability of the rental market, as well as to limitations in the available data and the assumptions made by the model.
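
As a rough illustration of the threshold comparison, the sketch below expresses the errors as a fraction of the standard deviation, using the XGBRegressor (80/20) target-encoding row of the results table. The helper name `error_to_std_ratio` is hypothetical. The rationale is that a model that always predicts the mean price has an RMSE equal to the standard deviation of the prices it is evaluated on, so a ratio well below 1 indicates the model does better than that baseline; the README does not state on which subset the standard deviation was computed.

```python
def error_to_std_ratio(error, std):
    """Express an error metric as a fraction of the target's standard deviation."""
    return error / std


STD = 860.21  # standard deviation from the results table (USD)

# Values from the XGBRegressor (80/20), target encoding row
rmse_ratio = error_to_std_ratio(386.14, STD)
mae_ratio = error_to_std_ratio(207.05, STD)

print(f"RMSE / STD = {rmse_ratio:.2f}")
print(f"MAE / STD  = {mae_ratio:.2f}")
```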

The Neural Network did not yield significant improvements, likely due to the limited amount of data. This suggests that Linear Regression and XGB Regressor are more suitable for smaller datasets.

