A machine-learning web application that estimates residential property prices in Mumbai. It uses Ridge Regression inside a Scikit-Learn preprocessing pipeline to model price from location, area, and bedroom count (BHK), giving users instant, data-driven valuations.
---
Unlike basic calculators that use simple averages, this project implements a full Supervised Learning Pipeline:
- Data Processing:
  - Cleaned a dataset of 5,000+ real Mumbai listings.
  - Handled high-cardinality location data (300+ unique regions) using One-Hot Encoding.
  - Scaled numerical features (Area, BHK) using StandardScaler to standardize them to zero mean and unit variance.
- The Model:
  - Algorithm: Ridge Regression (Linear Least Squares with L2 Regularization).
  - Why Ridge? Chosen over standard Linear Regression to handle multicollinearity between location features and prevent overfitting.
  - Performance: Achieved an R² Score of 0.84, significantly outperforming the baseline.
- Deployment:
  - Wrapped the model in a Streamlit web interface.
  - Deployed via a CI/CD pipeline on Streamlit Community Cloud.
- Language: Python
- Machine Learning: Scikit-Learn (Ridge, Pipeline, ColumnTransformer)
- Data Manipulation: Pandas, NumPy
- Web Framework: Streamlit
- Model Persistence: Joblib
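
The processing and modelling steps listed above can be sketched with the Scikit-Learn components named in the stack. This is a minimal illustration, not the contents of train_advanced.py: the data is synthetic, and the column names (`location`, `area`, `bhk`, `price`), the per-location premiums, and `alpha=1.0` are all assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic stand-in for the cleaned listings (values are illustrative only).
rng = np.random.default_rng(0)
n = 500
premium = {"Andheri West": 4.0e6, "Bandra West": 9.0e6, "Powai": 5.0e6, "Thane West": 2.0e6}
locations = rng.choice(list(premium), size=n)
bhk = rng.integers(1, 5, size=n)
area = 350.0 * bhk + rng.normal(0, 60, size=n)
price = (
    pd.Series(locations).map(premium).to_numpy()
    + 20_000 * area
    + rng.normal(0, 5e5, size=n)
)
df = pd.DataFrame({"location": locations, "area": area, "bhk": bhk, "price": price})

# One-hot encode the categorical column and standardize the numeric ones.
# handle_unknown="ignore" encodes locations unseen during fit as all zeros.
preprocess = ColumnTransformer(
    transformers=[
        ("loc", OneHotEncoder(handle_unknown="ignore"), ["location"]),
        ("num", StandardScaler(), ["area", "bhk"]),
    ]
)

# Chain preprocessing and the L2-regularized linear model into one estimator.
model = Pipeline(steps=[("prep", preprocess), ("ridge", Ridge(alpha=1.0))])

X = df[["location", "area", "bhk"]]
y = df["price"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

model.fit(X_train, y_train)
r2 = model.score(X_test, y_test)  # R² on the held-out split
print(f"Held-out R²: {r2:.2f}")
```

Keeping the encoder and scaler inside the pipeline means they are fitted on the training split only and are saved together with the model, so the same transformations are applied at prediction time.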
├── app.py # The main Streamlit web application
├── train_advanced.py # The training script (Data cleaning -> Pipeline -> Model Save)
├── model_advanced.pkl # The trained serialized model file
├── cleaned_data_v2.csv # Processed dataset used for the Location Dropdown
├── requirements.txt # Dependencies for deployment
└── README.md # Documentation
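
As a rough sketch of how these files could fit together at prediction time: the serialized pipeline is loaded with Joblib, the CSV supplies the location choices, and user inputs are passed to the model as a one-row DataFrame. The helper names `load_artifacts` and `predict_price`, the column names, and the toy data are hypothetical and not taken from app.py. The demo writes small stand-in files to a temporary directory so it can run on its own.

```python
import tempfile
from pathlib import Path

import joblib
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import Ridge
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler


def load_artifacts(model_path, data_path, location_col="location"):
    """Load the saved pipeline and the sorted unique locations (hypothetical helper)."""
    model = joblib.load(model_path)
    locations = sorted(pd.read_csv(data_path)[location_col].dropna().unique())
    return model, locations


def predict_price(model, location, area, bhk):
    """Build a one-row frame with the training columns and return the estimate."""
    row = pd.DataFrame([{"location": location, "area": area, "bhk": bhk}])
    return float(model.predict(row)[0])


# Stand-in artifacts: a tiny toy dataset and a pipeline fitted on it.
workdir = Path(tempfile.mkdtemp())
toy = pd.DataFrame(
    {
        "location": ["Powai", "Powai", "Bandra West", "Bandra West"],
        "area": [600.0, 900.0, 650.0, 1000.0],
        "bhk": [1, 2, 1, 2],
        "price": [1.2e7, 1.8e7, 2.5e7, 4.0e7],  # illustrative values only
    }
)
pipeline = Pipeline(
    steps=[
        (
            "prep",
            ColumnTransformer(
                transformers=[
                    ("loc", OneHotEncoder(handle_unknown="ignore"), ["location"]),
                    ("num", StandardScaler(), ["area", "bhk"]),
                ]
            ),
        ),
        ("ridge", Ridge(alpha=1.0)),
    ]
)
pipeline.fit(toy[["location", "area", "bhk"]], toy["price"])

model_path = workdir / "model_advanced.pkl"
data_path = workdir / "cleaned_data_v2.csv"
joblib.dump(pipeline, model_path)
toy.to_csv(data_path, index=False)

# What the app side might do: load once, then predict from user inputs.
model, locations = load_artifacts(model_path, data_path)
estimate = predict_price(model, locations[0], area=750.0, bhk=2)
print(locations, round(estimate))
```

In a Streamlit script, `locations` would typically feed a widget such as `st.selectbox`, and the result of `predict_price` would be shown to the user.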
If you want to run this project on your own machine:
- Clone the repository
git clone https://github.com/[YOUR-USERNAME]/mumbai-house-prices.git
cd mumbai-house-prices
- Install dependencies
pip install -r requirements.txt
- Run the app
streamlit run app.py
This tool is the MVP for a larger vision: PropShare, a fractional real estate investment platform.
- Phase 1 (Done): Price Discovery Engine.
- Phase 2 (In Progress): "Undervalued Deal" Alert System using Anomaly Detection.
- Phase 3: Fractional Investment Marketplace.
Yogin
- LinkedIn: www.linkedin.com/in/yogin-langalia