This project applies unsupervised machine learning to segment customers based on their behavior and demographics. The goal is to help businesses tailor marketing strategies and personalize experiences for different customer groups.
- Dataset: Mall Customer Segmentation Data
- Problem Type: Clustering (Unsupervised ML)
- Algorithm: K-Means
- Tools: Python, Pandas, Scikit-learn, Matplotlib, Seaborn
- Python
- Pandas, NumPy
- Scikit-learn
- Matplotlib, Seaborn
- Google Colab
Data Preprocessing (Label Encoding + Standard Scaling)
↓
Elbow Method to determine optimal K
↓
K-Means Clustering (with n_clusters=5)
↓
3D Visualization of customer segments
↓
Final CSV of customer segments for business use
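The workflow above can be sketched roughly as follows. This is a minimal illustration rather than the notebook's actual code: the data is a synthetic stand-in for `customers.csv`, and the column names, the K range of 1 to 10, `n_init=10`, and `random_state=42` are all assumptions. The 3D visualization step is left out to keep the sketch short.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import LabelEncoder, StandardScaler

# Synthetic stand-in for data/customers.csv (column names are assumptions).
rng = np.random.default_rng(42)
n = 200
df = pd.DataFrame({
    "Gender": rng.choice(["Male", "Female"], size=n),
    "Age": rng.integers(18, 70, size=n),
    "Annual Income (k$)": rng.integers(15, 140, size=n),
    "Spending Score (1-100)": rng.integers(1, 101, size=n),
})

# 1. Preprocessing: label-encode the categorical column, then standardize
#    every feature to zero mean and unit variance.
features = df.copy()
features["Gender"] = LabelEncoder().fit_transform(features["Gender"])
X = StandardScaler().fit_transform(features)

# 2. Elbow method: record the inertia (within-cluster sum of squares)
#    for each candidate K; the "elbow" is where the decrease flattens.
inertias = {}
for k in range(1, 11):
    model = KMeans(n_clusters=k, n_init=10, random_state=42).fit(X)
    inertias[k] = model.inertia_

# 3. Final model with the K used in this project.
kmeans = KMeans(n_clusters=5, n_init=10, random_state=42)
df["Cluster"] = kmeans.fit_predict(X)

# 4. Export the labelled customers for business use.
df.to_csv("customers_clustered.csv", index=False)
```

Plotting `inertias` against K (for example with Matplotlib) gives the elbow curve; on the real dataset the choice of 5 comes from reading that curve, not from this synthetic data.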
- K-Means effectively grouped customers into 5 behavior-based clusters
- Spending Score and Annual Income showed the most distinctive patterns
- The clusters help identify target groups such as:
  - High income, low spenders
  - Low income, high spenders
  - Balanced/mid-range customers
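One way to arrive at group descriptions like these is to average income and spending per cluster. The sketch below uses a tiny hand-made table; the values, the column names, and the `Cluster` column are illustrative assumptions, not results from this project.

```python
import pandas as pd

# Hand-made example of a clustered table (values are illustrative only).
clustered = pd.DataFrame({
    "Annual Income (k$)": [85, 90, 20, 25, 55, 60],
    "Spending Score (1-100)": [15, 10, 80, 90, 50, 45],
    "Cluster": [0, 0, 1, 1, 2, 2],
})

# Mean income and spending score per cluster, plus the cluster size.
profile = (
    clustered.groupby("Cluster")[["Annual Income (k$)", "Spending Score (1-100)"]]
    .mean()
    .round(1)
)
profile["Customers"] = clustered.groupby("Cluster").size()

print(profile)
```

In this toy table, cluster 0 reads as "high income, low spenders", cluster 1 as "low income, high spenders", and cluster 2 as mid-range.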
customer-segmentation/
├── data/
│   └── customers.csv
├── images/
│   ├── 2d_clusters.png
│   └── elbow_method.png
├── notebooks/
│   └── 01_customer_segmentation.ipynb
├── customers_clustered.csv
└── README.md
- Upload the dataset (`customers.csv`) to your project
- Run the Jupyter/Colab notebook
- The clustered output is saved to `customers_clustered.csv`
- Use PCA or t-SNE for dimensionality reduction
- Build an interactive dashboard with Streamlit
- Apply DBSCAN or Hierarchical clustering for comparison
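As a possible starting point for two of these ideas, the sketch below projects scaled features to 2D with PCA and compares K-Means against hierarchical (Ward) clustering using the silhouette score. None of this is part of the current project: the data comes from scikit-learn's `make_blobs`, and the choice of silhouette score as the comparison metric is an assumption. DBSCAN is omitted because its `eps` and `min_samples` would need tuning on the real data.

```python
from sklearn.cluster import AgglomerativeClustering, KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the scaled customer features (assumption).
X_raw, _ = make_blobs(n_samples=200, centers=5, n_features=4, random_state=42)
X = StandardScaler().fit_transform(X_raw)

# PCA: reduce to two components, e.g. for a 2D scatter plot of the clusters.
X_2d = PCA(n_components=2, random_state=42).fit_transform(X)

# Fit both algorithms with the same number of clusters and score each
# labelling; a higher silhouette score means better-separated clusters.
models = {
    "K-Means": KMeans(n_clusters=5, n_init=10, random_state=42),
    "Hierarchical (Ward)": AgglomerativeClustering(n_clusters=5),
}
scores = {
    name: silhouette_score(X, model.fit_predict(X))
    for name, model in models.items()
}

for name, score in scores.items():
    print(f"{name}: silhouette = {score:.3f}")
```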
- Email: kanishtyagi123@gmail.com
- LinkedIn
- GitHub