Customer segmentation for Ready, Steady Ride using K-Means (k=4) across weather, behavior and time perspectives - NOVA IMS ML project

DiogoGAndrade/ready-steady-ride-customer-segmentation

Ready, Steady Ride – Customer Segmentation (Clustering)

Second Machine Learning project developed for the Machine Learning course at NOVA IMS (2022/2023).

🎯 Objective

Identify customer segments for Ready, Steady Ride (ride-sharing) using unsupervised learning.
The goal is to uncover weather-driven and behavioral patterns to guide marketing and resource allocation.

🧩 Dataset

  • ~9,600 observations and 17 features
  • Weather (Temperature, Humidity, WindSpeed, WeatherForecast), behavior (Registered/Non-registered users), and time (Hour, WorkingDay, Holiday, DayofWeek, Month)
  • Unsupervised task (no target)

⚠️ The original course dataset is not shared due to licensing. A synthetic dataset (customers_sample.csv) is included under data/raw/, mimicking the original structure for demonstration and reproducibility.

🧠 Methods & Workflow

  • Data exploration & visualization; value harmonization and outlier handling
  • Feature engineering & selection: aggregate totals per month and per day added as features; Spearman correlation used to drop highly correlated or uninformative features
  • Scaling: Robust Scaler (centers on the median and scales by the interquartile range, so outliers have less influence)
  • Clustering: K-Means and K-Prototypes (self-study)
  • Model selection: elbow method (inertia), Ward-linkage dendrogram from hierarchical clustering, and silhouette score
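
The selection, scaling and model-selection steps above can be sketched roughly as follows. This is a minimal illustration and not the notebook's actual code: it assumes scikit-learn and pandas are installed, uses toy blob data in place of the real features, and the helper names (drop_correlated, score_k), the column names and the 0.9 correlation threshold are all hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import RobustScaler


def drop_correlated(df, threshold=0.9):
    """Drop one feature from every pair whose |Spearman rho| exceeds threshold."""
    corr = df.corr(method="spearman").abs()
    # Keep only the upper triangle so each pair is inspected once.
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    to_drop = [col for col in upper.columns if (upper[col] > threshold).any()]
    return df.drop(columns=to_drop)


def score_k(X, k_values, seed=0):
    """Return {k: (inertia, silhouette)} for K-Means fitted at each k."""
    scores = {}
    for k in k_values:
        model = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(X)
        scores[k] = (model.inertia_, silhouette_score(X, model.labels_))
    return scores


# Toy stand-in for the real features (assumption): four well-separated blobs
# plus one redundant column that is perfectly rank-correlated with feat_a.
points, _ = make_blobs(
    n_samples=400,
    centers=[[0, 0], [10, 0], [0, 10], [10, 10]],
    cluster_std=0.5,
    random_state=0,
)
df = pd.DataFrame(points, columns=["feat_a", "feat_b"])
df["feat_a_copy"] = df["feat_a"] * 2 + 1

selected = drop_correlated(df)                # removes feat_a_copy
X = RobustScaler().fit_transform(selected)    # median / IQR scaling
scores = score_k(X, range(2, 8))
best_k = max(scores, key=lambda k: scores[k][1])

for k, (inertia, sil) in scores.items():
    print(f"k={k}  inertia={inertia:.1f}  silhouette={sil:.3f}")
print("best k by silhouette:", best_k)
```

Inertia always tends to fall as k grows, so the elbow is read off the printed inertia column by eye, while the silhouette gives a single number to compare across k. On real data the two criteria may disagree, which is presumably why the project also consulted a Ward dendrogram.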

Perspectives considered

  1. Weather conditions – Temperature, Humidity, WindSpeed, etc.
  2. Customer behavior – Registered / Non-registered, totals per month/day.
  3. Temporal patterns – Hour of day, Holiday, WorkingDay, DayofWeek.
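
One way to read the three perspectives is as three column groups that are clustered separately before any combined run. The sketch below illustrates that idea only; the README does not spell out how the perspectives were combined, so the column names, the use of k = 4 for every perspective, the random placeholder data and the helper cluster_by_perspective are all assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import RobustScaler

# Hypothetical column names based on the feature list; real CSV headers may differ.
PERSPECTIVES = {
    "weather": ["Temperature", "Humidity", "WindSpeed"],
    "behavior": ["Registered", "NonRegistered"],
    "time": ["Hour", "WorkingDay", "Holiday"],
}


def cluster_by_perspective(df, perspectives, k=4, seed=0):
    """Fit one K-Means model per column group and return one label column per group."""
    labels = {}
    for name, cols in perspectives.items():
        X = RobustScaler().fit_transform(df[cols])
        model = KMeans(n_clusters=k, n_init=10, random_state=seed)
        labels[name] = model.fit_predict(X)
    return pd.DataFrame(labels, index=df.index)


# Random placeholder data, only to make the sketch runnable.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "Temperature": rng.normal(18, 8, n),
    "Humidity": rng.uniform(20, 100, n),
    "WindSpeed": rng.gamma(2.0, 5.0, n),
    "Registered": rng.poisson(120, n).astype(float),
    "NonRegistered": rng.poisson(30, n).astype(float),
    "Hour": rng.integers(0, 24, n),
    "WorkingDay": rng.binomial(1, 0.7, n),
    "Holiday": rng.binomial(1, 0.05, n),
})

labels = cluster_by_perspective(df, PERSPECTIVES)
print(labels.head())
# Cross-tabulating two perspectives shows how their segments overlap.
print(pd.crosstab(labels["weather"], labels["behavior"]))
```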

📈 Results

  • Final solution: K-Means with k = 4 clusters (global run after combining perspectives).
  • Cluster profiles (examples):
    • Cold and humid, low engagement (few non-registered users; modest monthly and daily totals)
    • Moderate weather, high registered activity
    • Warm weather, high engagement (registered + non-registered)
    • Warm but more humid, moderate engagement
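
Profiles like these are typically read off a table of per-cluster feature means. A minimal sketch of such a table, assuming pandas and a label vector from a fitted model; the helper profile_clusters, the column names and the four hand-made rows are illustrative only and are not project results.

```python
import pandas as pd


def profile_clusters(df, labels, features):
    """Return cluster sizes and per-cluster feature means, one row per cluster."""
    grouped = df[features].groupby(labels)
    profile = grouped.mean()
    profile.insert(0, "size", grouped.size())
    return profile


# Tiny hand-made example (not real data): two cold/low-usage rows, two warm/high-usage rows.
df = pd.DataFrame({
    "Temperature": [5.0, 7.0, 25.0, 27.0],
    "Registered": [10.0, 20.0, 100.0, 120.0],
})
labels = pd.Series([0, 0, 1, 1], name="cluster")

profile = profile_clusters(df, labels, ["Temperature", "Registered"])
print(profile)
```

Comparing each row against the overall feature means is what turns raw centroids into labels such as "warm weather, high engagement".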

🚀 Action Plan (business)

  • Loyalty program to convert frequent non-registered users into registered ones.
  • Weather-based promotions timed to the conditions in which each segment shows high propensity.
  • Resource allocation tuned by hour of day and by each cluster's demand profile.

🛠️ How to run

pip install -r requirements.txt
jupyter lab
# open notebooks/ML1_Group18_Clustering_Notebook.ipynb
