Data-driven analysis of user churn patterns to optimize retention strategies and enhance user engagement

mslawsky/waze-user-analytics

Waze User Retention Analytics 🚘

Waze User Churn Prevention Dashboard
This project enables data-driven insights for understanding and preventing user churn, focusing on optimizing user retention through behavioral pattern analysis.

Dashboard Overview


Executive Summary & Key Findings 📄

Executive Summary


Strategic Insights: User Behavior Patterns 🔎

Our analysis revealed key patterns in user churn behavior:

  1. Usage Intensity (82% Retention vs. 18% Churn): ❤️‍🔥

    • Churned Users: Higher intensity (~3 more drives/month)
    • Retained Users: More consistent usage patterns
    • Strategic Focus: Balance between engagement and sustainability
  2. Activity Concentration (Key Metrics): ↕️

    • Drive Patterns: Churned users average 698 km per driving day
    • Time Distribution: Retained users show 2x more active days
    • Resource Focus: Target high-intensity users with specialized features
  3. Platform Distribution: ↔️

    • 64.48% iPhone users
    • 35.52% Android users
    • Platform Impact: No significant difference in churn rates

Business Impact 💥

  • Goal: Reduce user churn rate (currently 18%)
  • Potential Impact: Target high-risk user segments
  • Resource Allocation Model: Focus on user experience optimization

Statistical Analysis: Device Impact on User Engagement 📊

Hypothesis Testing for Business Decision-Making

Device Usage Analysis

We conducted hypothesis testing to determine if device type affects user engagement:

  1. Research Question: 🔍

    • Is there a statistically significant difference in ride frequency between iPhone and Android users?
    • Null Hypothesis: No difference in mean rides between platforms
    • Alternative Hypothesis: Significant difference exists in mean rides
  2. Statistical Insights: 📈

    • Two-sample t-test performed (p-value = 0.1434)
    • Failed to reject null hypothesis at α = 0.05
    • Key Finding: Device type does not significantly impact ride frequency
  3. Business Implications: 💡

    • No evidence that ride frequency differs by platform, which is consistent with a platform-agnostic user experience
    • Resource allocation should focus on usage patterns rather than device-specific solutions
    • Comparable cross-platform engagement supports the current development approach
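The test described above can be sketched with the standard library alone. This is a minimal illustration, assuming the unequal-variance (Welch) form of the two-sample t-test, which the analysis does not specify, and using a normal approximation for the p-value that is only reasonable for large samples. The sample data is randomly generated, not the project's data; with SciPy installed, `scipy.stats.ttest_ind(a, b, equal_var=False)` is the usual way to run the same test.

```python
import math
import random
from statistics import NormalDist, mean, variance


def welch_t_test(sample_a, sample_b):
    """Return (t statistic, approximate two-sided p-value) for two samples."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error of each mean; variances are not assumed equal.
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t_stat = (mean(sample_a) - mean(sample_b)) / math.sqrt(se2_a + se2_b)
    # Normal approximation to the t distribution (large samples only).
    p_value = 2 * (1 - NormalDist().cdf(abs(t_stat)))
    return t_stat, p_value


# Synthetic ride counts for illustration only, NOT the project's data.
rng = random.Random(42)
iphone_rides = [rng.gauss(67, 20) for _ in range(2000)]
android_rides = [rng.gauss(66, 20) for _ in range(2000)]

alpha = 0.05
t_stat, p_value = welch_t_test(iphone_rides, android_rides)
decision = "reject" if p_value < alpha else "fail to reject"
print(f"t = {t_stat:.3f}, p = {p_value:.4f}: {decision} the null hypothesis")
```

A p-value at or above α, such as the reported 0.1434, leads to the "fail to reject" branch.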

Implementation Strategy 📋

  • Develop retention strategies focusing on usage frequency rather than device type
  • Maintain cross-platform consistency in future feature development
  • Weight device type appropriately in churn prediction models

Phase 2: Advanced Regression Analysis 🔬

Project Overview

Following our initial exploratory analysis, we built a binomial logistic regression model to predict user churn more precisely.

Regression Model

Model Performance 📊

  • Dataset Size: 14,299 users
  • Model Accuracy: 82.55%
  • Precision: 54.44%
  • Recall: 9.66%

Key Insights from Regression Analysis 🧠

  1. Engagement Intensity

    • Each additional day a user opens the app is associated with roughly 10% lower churn risk
    • Strategic Recommendation: Develop features encouraging daily app engagement (traffic updates, gas prices, road alerts)
  2. User Segmentation

    • Professional Drivers: 7.6% churn rate
    • Non-Professional Drivers: 19.9% churn rate
    • Strategic Recommendation: Create segment-specific retention tactics
  3. Proactive Retention Strategies

    • Develop an early warning system to identify at-risk users
    • Focus on enhancing features that demonstrate value during shorter, routine drives
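A per-day effect like the ~10% figure above is typically read off a logistic regression coefficient by exponentiating it, which gives the multiplicative change in the odds of churn. The sketch below assumes that is how the figure was derived, which this README does not state, and the coefficient value is hypothetical, chosen only to produce a change of about 10%.

```python
import math


def odds_change_percent(coefficient):
    """Percent change in the odds of churn per one-unit increase in a feature."""
    return (math.exp(coefficient) - 1) * 100


# Hypothetical log-odds coefficient for "days the app was opened";
# not taken from the fitted model.
activity_days_coef = -0.105

change = odds_change_percent(activity_days_coef)
print(f"{change:.1f}% change in churn odds per additional active day")
```

Note that this describes odds rather than probability; the two are close only when the churn probability is small.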

Phase 3: Advanced Machine Learning Models 🤖

Project Overview

Building on our regression analysis, we developed and compared tree-based machine learning models to further improve churn prediction accuracy.

Model Comparison Results 📊

| Model | Precision | Recall | F1 | Accuracy |
|---|---|---|---|---|
| Random Forest | 0.501428 | 0.105772 | 0.174212 | 0.822240 |
| XGBoost | 0.407280 | 0.183973 | 0.253176 | 0.807553 |
| Random Forest Validation | 0.446281 | 0.106509 | 0.171975 | 0.818182 |
| XGBoost Validation | 0.413934 | 0.199211 | 0.268975 | 0.808042 |
| XGBoost Test | 0.439689 | 0.222880 | 0.295812 | 0.811888 |

Key Improvements

  • XGBoost Performance: Successfully identified 22.3% of churning users (2.3x improvement over regression)
  • Precision-Recall Balance: At 44% precision, roughly 4 of every 9 users flagged as at-risk actually churn
  • Model Stability: Consistent performance across validation and test sets

Most Important Predictors

The XGBoost model identified these top factors influencing churn:

  1. Driving speed/efficiency (km_per_hour)
  2. Navigation to favorite places (total_navigations_fav1)
  3. User tenure (n_days_after_onboarding)
  4. Recent engagement intensity (percent_sessions_in_last_month)
  5. Time spent driving (duration_minutes_drives)

XGBoost Feature Importance

Feature Engineering Impact

Engineered features accounted for 6 of the top 10 predictors, highlighting the importance of domain knowledge in model development. They include:

  • km_per_hour: Driving efficiency metrics
  • percent_sessions_in_last_month: Recent engagement intensity
  • km_per_driving_day: Distance driven on active days
  • km_per_drive: Average trip distance
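The four engineered features listed above could be derived from raw per-user columns roughly as follows. This is a sketch under assumptions: apart from `duration_minutes_drives`, the raw column names (`driven_km_drives`, `driving_days`, `drives`, `sessions`, `total_sessions`) and the exact formulas are guesses at how the features were defined, and the example record is made up.

```python
def safe_divide(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def engineer_features(user):
    """Return a copy of one user record with the four engineered features added."""
    features = dict(user)
    hours_driven = user["duration_minutes_drives"] / 60
    features["km_per_hour"] = safe_divide(user["driven_km_drives"], hours_driven)
    features["km_per_driving_day"] = safe_divide(
        user["driven_km_drives"], user["driving_days"]
    )
    features["km_per_drive"] = safe_divide(user["driven_km_drives"], user["drives"])
    features["percent_sessions_in_last_month"] = safe_divide(
        user["sessions"], user["total_sessions"]
    )
    return features


# Made-up record for illustration; raw column names are assumptions.
example_user = {
    "sessions": 40,
    "total_sessions": 200,
    "drives": 30,
    "driving_days": 10,
    "driven_km_drives": 600.0,
    "duration_minutes_drives": 720.0,
}

engineered = engineer_features(example_user)
for name in ("km_per_hour", "km_per_driving_day", "km_per_drive",
             "percent_sessions_in_last_month"):
    print(f"{name}: {engineered[name]}")
```

Guarding the divisions matters because users with zero driving days or zero drives would otherwise raise an error or produce infinite ratios.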

Confusion Matrix Analysis

Confusion Matrix

| | Predicted Not Churned | Predicted Churned |
|---|---|---|
| Actual Not Churned | 2209 (True Negatives) | 144 (False Positives) |
| Actual Churned | 394 (False Negatives) | 113 (True Positives) |

Our model correctly identified 113 of the 507 churning users (22.3% recall) while keeping the false positive rate low: 144 of 2,353 retained users (about 6.1%) were incorrectly flagged.
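As a cross-check, the reported metrics can be recomputed directly from the four confusion matrix counts. The short sketch below uses only the counts given above; the variable names are illustrative.

```python
# Counts copied from the confusion matrix.
true_negatives, false_positives = 2209, 144
false_negatives, true_positives = 394, 113

total = true_negatives + false_positives + false_negatives + true_positives
precision = true_positives / (true_positives + false_positives)
recall = true_positives / (true_positives + false_negatives)
f1 = 2 * precision * recall / (precision + recall)
accuracy = (true_positives + true_negatives) / total
false_positive_rate = false_positives / (false_positives + true_negatives)

print(f"precision={precision:.6f} recall={recall:.6f} f1={f1:.6f}")
print(f"accuracy={accuracy:.6f} false_positive_rate={false_positive_rate:.4f}")
```

These values agree with the XGBoost Test row of the model comparison table.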

Implementation Recommendations

  1. Deploy the XGBoost model to score current users for churn risk
  2. Implement targeted retention campaigns focused on:
    • Improving navigation efficiency and time-saving features
    • Encouraging setup and use of favorite destinations
    • Creating special engagement programs at critical tenure points
  3. Monitor model performance monthly and refine as needed

XGBoost Optimal Parameters

Our model performed best with these parameters:

  • Learning rate: 0.2
  • Max depth: 5
  • Min child weight: 5
  • Number of estimators: 300
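The parameters above map onto keyword arguments of xgboost's scikit-learn wrapper. The sketch below shows one way to hold them as configuration; `build_model` is a hypothetical helper, and the `objective` and `random_state` values are assumptions that this README does not report.

```python
# Hyperparameters as reported in the list above.
XGB_PARAMS = {
    "learning_rate": 0.2,
    "max_depth": 5,
    "min_child_weight": 5,
    "n_estimators": 300,
}


def build_model(random_state=42):
    """Create an unfitted XGBClassifier; requires the xgboost package."""
    # Imported lazily so the parameter dict is usable without xgboost installed.
    from xgboost import XGBClassifier

    # objective and random_state are assumptions, not reported values.
    return XGBClassifier(
        objective="binary:logistic",
        random_state=random_state,
        **XGB_PARAMS,
    )


print(XGB_PARAMS)
```

Keeping the tuned values in one dictionary makes it easier to log them alongside each monthly performance check.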

Next Steps 🚀

  • Improve model predictive capability
  • Conduct qualitative user research
  • Develop personalized re-engagement strategies

Project Documentation 📄

Business Intelligence Documents 📑

Data Analysis Process 📶

Data Files 📂


Dashboard Development 📊

  1. Data Integration & Cleaning 💾

    • Standardized user activity metrics
    • Validated data completeness (700 records with missing labels addressed)
    • Normalized driving metrics
    • Cross-referenced device data
  2. Metric Development 📈

    • User Activity Patterns
    • Drive Intensity Metrics
      • Kilometers per drive
      • Drives per active day
      • Total activity days
    • Platform Usage Statistics
    • Churn Probability Indicators
  3. Visualization Strategy 🖼️

    • User behavior pattern tracking
    • Cross-platform comparison
    • Temporal usage analysis
    • Churn risk indicators
  4. Statistical Analysis 📉

    • Hypothesis test formulation
    • Descriptive statistics computation
    • Two-sample t-testing methodology
    • Statistical significance interpretation
  5. Machine Learning Integration 🧮

    • XGBoost model integration for real-time churn probability
    • Feature importance visualization
    • Risk segmentation dashboard
    • Prediction accuracy monitoring
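Risk segmentation as mentioned in the list above can be as simple as bucketing each user's predicted churn probability into tiers. The following is a minimal sketch: the cutoffs, tier names, user IDs, and scores are all hypothetical, and in practice the cutoffs would be tuned against the precision and recall trade-off.

```python
def risk_tier(churn_probability, medium_cutoff=0.3, high_cutoff=0.6):
    """Map a predicted churn probability to a coarse risk tier."""
    if not 0.0 <= churn_probability <= 1.0:
        raise ValueError("churn_probability must be between 0 and 1")
    if churn_probability >= high_cutoff:
        return "high"
    if churn_probability >= medium_cutoff:
        return "medium"
    return "low"


# Hypothetical user IDs and scores, standing in for a fitted model's output.
scores = {"user_a": 0.12, "user_b": 0.45, "user_c": 0.81}
tiers = {user_id: risk_tier(p) for user_id, p in scores.items()}
print(tiers)
```

Lowering the cutoffs flags more users, which raises recall at the cost of precision.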

Implementation Recommendations 📋

  1. Immediate Actions ✅

    • Deploy XGBoost model to identify high-risk users
    • Develop targeted retention strategies focusing on driving efficiency
    • Encourage favorite destination setup for new users
    • Create re-engagement campaigns for users showing recent activity decline
  2. Resource Optimization ➕

    • Implement features that improve navigation efficiency
    • Develop specialized engagement programs based on user tenure
    • Create features that enhance favorite place functionality
    • Focus on creating value for both professional and casual drivers

Next Steps 🚀

  • Enhance model with additional data sources
  • Develop automated retention campaign system
  • Create a user risk score API for integration with marketing tools
  • Implement A/B testing framework for retention strategies
  • Expand model to predict specific churn timeframes

Contact ✉️

For inquiries about this analysis:


© Melissa Slawsky 2025. All Rights Reserved.
This repository contains proprietary analysis.

Published Project URL: Waze User Retention Dashboard


Additional Technical Documentation 📄

  1. Python Analysis Notebooks

  2. Model Deployment Resources
