Waze User Churn Prevention Dashboard
This project uses behavioral pattern analysis to understand and prevent user churn, with the goal of improving user retention.
Our analysis revealed key patterns in user churn behavior:
- Usage Intensity (82% Retention vs. 18% Churn):
  - Churned Users: Higher intensity (~3 more drives/month)
  - Retained Users: More consistent usage patterns
  - Strategic Focus: Balance between engagement and sustainability
- Activity Concentration (Key Metrics):
  - Drive Patterns: Churned users average 698 km per driving day
  - Time Distribution: Retained users show 2x more active days
  - Resource Focus: Target high-intensity users with specialized features
- Platform Distribution:
  - 64.48% iPhone users
  - 35.52% Android users
  - Platform Impact: No significant difference in churn rates
- Goal: Reduce user churn rate (currently 18%)
- Potential Impact: Target high-risk user segments
- Resource Allocation Model: Focus on user experience optimization
We conducted hypothesis testing to determine if device type affects user engagement:
- Research Question:
  - Is there a statistically significant difference in ride frequency between iPhone and Android users?
  - Null Hypothesis: No difference in mean rides between platforms
  - Alternative Hypothesis: Significant difference exists in mean rides
- Statistical Insights:
  - Two-sample t-test performed (p-value = 0.1434)
  - Failed to reject null hypothesis at α = 0.05
  - Key Finding: Device type does not significantly impact ride frequency
- Business Implications:
  - Platform-agnostic user experience successfully maintained
  - Resource allocation should focus on usage patterns rather than device-specific solutions
  - Consistent cross-platform performance validates current development approach
- Develop retention strategies focusing on usage frequency rather than device type
- Maintain cross-platform consistency in future feature development
- Weight device type appropriately in churn prediction models
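The hypothesis test described above can be sketched with the standard library alone. This is a minimal illustration, not the analysis code: it assumes Welch's unequal-variance form of the two-sample t-test, approximates the p-value with a normal distribution (reasonable only for large samples), and uses made-up drive counts. The names `welch_t_test`, `iphone_drives`, and `android_drives` are hypothetical.

```python
import math
from statistics import NormalDist, mean, variance


def welch_t_test(sample_a, sample_b):
    """Return (t statistic, two-sided p-value) for a difference in means.

    Uses Welch's unequal-variance statistic. The p-value comes from a
    normal approximation, which is only accurate for large samples.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    std_error = math.sqrt(variance(sample_a) / n_a + variance(sample_b) / n_b)
    t_stat = (mean(sample_a) - mean(sample_b)) / std_error
    p_value = 2 * (1 - NormalDist().cdf(abs(t_stat)))
    return t_stat, p_value


# Hypothetical drive counts per user, for illustration only.
iphone_drives = [62, 71, 55, 80, 67, 73, 59, 66]
android_drives = [64, 69, 58, 77, 70, 61, 72, 65]

t_stat, p_value = welch_t_test(iphone_drives, android_drives)
reject_null = p_value < 0.05  # compare against alpha = 0.05
print(f"t = {t_stat:.3f}, p = {p_value:.3f}, reject null: {reject_null}")
```

With the full dataset, `scipy.stats.ttest_ind(a, b, equal_var=False)` would give an exact t-distribution p-value instead of the approximation used here.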
Following our initial exploratory analysis, we fit a binomial logistic regression model to predict user churn.
- Dataset Size: 14,299 users
- Model Accuracy: 82.55%
- Precision: 54.44%
- Recall: 9.66%
- Engagement Intensity
  - Each additional day a user opens the app reduces churn risk by ~10%
  - Strategic Recommendation: Develop features encouraging daily app engagement (traffic updates, gas prices, road alerts)
- User Segmentation
  - Professional Drivers: 7.6% churn rate
  - Non-Professional Drivers: 19.9% churn rate
  - Strategic Recommendation: Create segment-specific retention tactics
- Proactive Retention Strategies
  - Develop an early warning system to identify at-risk users
  - Focus on enhancing features that demonstrate value during shorter, routine drives
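As a rough illustration of how a figure like the ~10% per active day could be read off a logistic regression, the sketch below converts a coefficient into a percent change in churn odds. It assumes the reported effect refers to odds rather than probability, and the coefficient value `-0.105` is hypothetical, chosen only because it yields roughly a 10% reduction. The function name is also hypothetical.

```python
import math


def percent_change_in_odds(coefficient, delta=1.0):
    """Percent change in the odds of churn for a `delta` increase in a feature.

    A logistic regression coefficient is a change in log-odds, so
    exponentiating it gives the multiplicative change in odds.
    """
    return (math.exp(coefficient * delta) - 1) * 100


# Hypothetical coefficient for "days the app was opened" (not from the report).
activity_days_coef = -0.105

per_day = percent_change_in_odds(activity_days_coef)
per_week = percent_change_in_odds(activity_days_coef, delta=7)
print(f"One extra active day: {per_day:.1f}% change in churn odds")
print(f"Seven extra active days: {per_week:.1f}% change in churn odds")
```

Note that the effect compounds multiplicatively: seven extra active days do not reduce the odds by 70%.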
Building on our regression analysis, we developed and compared tree-based machine learning models to further improve churn prediction accuracy.
| Model | Precision | Recall | F1 | Accuracy |
|---|---|---|---|---|
| Random Forest | 0.501428 | 0.105772 | 0.174212 | 0.822240 |
| XGBoost | 0.407280 | 0.183973 | 0.253176 | 0.807553 |
| Random Forest Validation | 0.446281 | 0.106509 | 0.171975 | 0.818182 |
| XGBoost Validation | 0.413934 | 0.199211 | 0.268975 | 0.808042 |
| XGBoost Test | 0.439689 | 0.222880 | 0.295812 | 0.811888 |
- XGBoost Performance: Successfully identified 22.3% of churning users (2.3x improvement over regression)
- Precision-Recall Balance: At 44% precision, roughly four in nine users flagged as at-risk actually churn, which keeps retention outreach reasonably targeted
- Model Stability: Consistent performance across validation and test sets
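To make the comparison concrete, here is a minimal sketch that picks the validation-stage model with the highest recall and reproduces the improvement factor over the regression model. Using recall as the selection metric is an assumption for illustration, and `pick_champion` is a hypothetical helper; the scores are copied from the table and the regression results.

```python
# Validation scores copied from the results table.
validation_scores = {
    "Random Forest": {"precision": 0.446281, "recall": 0.106509, "f1": 0.171975},
    "XGBoost": {"precision": 0.413934, "recall": 0.199211, "f1": 0.268975},
}


def pick_champion(scores, metric="recall"):
    """Return the model name with the highest value for the given metric."""
    return max(scores, key=lambda name: scores[name][metric])


champion = pick_champion(validation_scores)

# Test-set recall of XGBoost versus the logistic regression recall (9.66%).
xgb_test_recall = 0.222880
regression_recall = 0.0966
recall_improvement = xgb_test_recall / regression_recall

print(f"Champion by recall: {champion}")
print(f"Recall improvement over regression: {recall_improvement:.1f}x")
```

Selecting on precision instead would favor Random Forest, so the choice of metric should follow the cost of missing a churner versus contacting a user who would have stayed.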
The XGBoost model identified these top factors influencing churn:
- Driving speed/efficiency (km_per_hour)
- Navigation to favorite places (total_navigations_fav1)
- User tenure (n_days_after_onboarding)
- Recent engagement intensity (percent_sessions_in_last_month)
- Time spent driving (duration_minutes_drives)
Engineered features accounted for 6 of the top 10 predictors, highlighting the importance of domain knowledge in model development. Examples include:
- km_per_hour: Driving efficiency metrics
- percent_sessions_in_last_month: Recent engagement intensity
- km_per_driving_day: Distance driven on active days
- km_per_drive: Average trip distance
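The engineered features listed above are simple ratios, and a minimal sketch of how they might be derived follows. The raw field names `driven_km_drives`, `drives`, `driving_days`, `sessions`, and `total_sessions` are assumptions about the input schema (only `duration_minutes_drives` appears in this README), and treating a zero denominator as `0.0` is an illustrative choice rather than the project's documented handling.

```python
def safe_divide(numerator, denominator):
    """Divide, returning 0.0 when the denominator is zero (assumed convention)."""
    return numerator / denominator if denominator else 0.0


def engineer_features(user):
    """Derive ratio features from one user's raw activity record."""
    hours_driven = user["duration_minutes_drives"] / 60
    return {
        "km_per_hour": safe_divide(user["driven_km_drives"], hours_driven),
        "km_per_driving_day": safe_divide(user["driven_km_drives"], user["driving_days"]),
        "km_per_drive": safe_divide(user["driven_km_drives"], user["drives"]),
        "percent_sessions_in_last_month": safe_divide(user["sessions"], user["total_sessions"]),
    }


# Hypothetical user record, for illustration only.
example_user = {
    "driven_km_drives": 1200.0,
    "duration_minutes_drives": 1800.0,
    "drives": 60,
    "driving_days": 12,
    "sessions": 80,
    "total_sessions": 200.0,
}

features = engineer_features(example_user)
for name, value in features.items():
    print(f"{name}: {value:.2f}")
```

On a full dataset the same ratios would normally be computed column-wise in pandas; the per-record form here just keeps the example dependency-free.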
| | Predicted Not Churned | Predicted Churned |
|---|---|---|
| Actual Not Churned | 2209 (True Negatives) | 144 (False Positives) |
| Actual Churned | 394 (False Negatives) | 113 (True Positives) |
Our model correctly identified 113 of 507 churning users (22.3% recall) while keeping the false positive rate low: 144 of 2,353 retained users (about 6.1%) were incorrectly flagged.
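The headline metrics can be recomputed directly from the four confusion matrix counts. The sketch below is a small worked check using those counts; the function name `classification_metrics` is hypothetical.

```python
def classification_metrics(tn, fp, fn, tp):
    """Compute standard binary classification metrics from confusion matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
        "accuracy": (tp + tn) / (tn + fp + fn + tp),
        "false_positive_rate": fp / (fp + tn),
    }


# Counts taken from the XGBoost test-set confusion matrix.
metrics = classification_metrics(tn=2209, fp=144, fn=394, tp=113)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

The resulting precision, recall, F1, and accuracy agree with the XGBoost Test row of the model comparison table.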
- Deploy the XGBoost model to score current users for churn risk
- Implement targeted retention campaigns focused on:
  - Improving navigation efficiency and time-saving features
  - Encouraging setup and use of favorite destinations
  - Creating special engagement programs at critical tenure points
- Monitor model performance monthly and refine as needed
Our model performed best with these parameters:
- Learning rate: 0.2
- Max depth: 5
- Min child weight: 5
- Number of estimators: 300
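For reference, these settings could be captured as a keyword dictionary. The keys below follow the argument names of `XGBClassifier` in the xgboost scikit-learn wrapper; the `objective` and `random_state` entries are assumptions added for a reproducible binary classifier and are not stated in the results.

```python
# Tuned values reported above, keyed by XGBClassifier argument names.
best_params = {
    "learning_rate": 0.2,
    "max_depth": 5,
    "min_child_weight": 5,
    "n_estimators": 300,
}

# Assumed extras, not taken from the reported results.
assumed_params = {"objective": "binary:logistic", "random_state": 42}

model_kwargs = {**assumed_params, **best_params}
print(model_kwargs)

# With xgboost installed, the model could then be built as:
#   model = xgboost.XGBClassifier(**model_kwargs)
```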
- Improve model predictive capability
- Conduct qualitative user research
- Develop personalized re-engagement strategies
- Strategy Document (PDF)
- Project & Stakeholder Requirements (PDF)
- EDA Results (PDF)
- Statistical Analysis Report (PDF)
- Dashboard Mockup (Image)
- Regression Model Report (PDF)
- Machine Learning Model Report (PDF)
Data Files
- Waze User Activity Data
- Churn Analysis Results
- Platform Usage Patterns
- Combined Analysis Results
- Regression Analysis
- Feature Importance Analysis
- Confusion Matrix
- Data Integration & Cleaning
  - Standardized user activity metrics
  - Validated data completeness (700 records with missing labels addressed)
  - Normalized driving metrics
  - Cross-referenced device data
- Metric Development
  - User Activity Patterns
  - Drive Intensity Metrics
  - Kilometers per drive
  - Drives per active day
  - Total activity days
  - Platform Usage Statistics
  - Churn Probability Indicators
- Visualization Strategy
  - User behavior pattern tracking
  - Cross-platform comparison
  - Temporal usage analysis
  - Churn risk indicators
- Statistical Analysis
  - Hypothesis test formulation
  - Descriptive statistics computation
  - Two-sample t-testing methodology
  - Statistical significance interpretation
- Machine Learning Integration
  - XGBoost model integration for real-time churn probability
  - Feature importance visualization
  - Risk segmentation dashboard
  - Prediction accuracy monitoring
- Immediate Actions
  - Deploy XGBoost model to identify high-risk users
  - Develop targeted retention strategies focusing on driving efficiency
  - Encourage favorite destination setup for new users
  - Create re-engagement campaigns for users showing recent activity decline
- Resource Optimization
  - Implement features that improve navigation efficiency
  - Develop specialized engagement programs based on user tenure
  - Create features that enhance favorite place functionality
  - Focus on creating value for both professional and casual drivers
- Enhance model with additional data sources
- Develop automated retention campaign system
- Create a user risk score API for integration with marketing tools
- Implement A/B testing framework for retention strategies
- Expand model to predict specific churn timeframes
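One way the risk segmentation and user risk score ideas above might fit together is to map each predicted churn probability to a tier that marketing tools can act on. The sketch below is illustrative only: the thresholds `0.6` and `0.3`, the tier labels, and the user IDs are all hypothetical, and real cut-offs would need to be tuned against campaign capacity and the model's precision at each threshold.

```python
def risk_tier(churn_probability, high=0.6, medium=0.3):
    """Map a predicted churn probability to a coarse risk tier.

    The default thresholds are placeholders, not values from the analysis.
    """
    if not 0.0 <= churn_probability <= 1.0:
        raise ValueError("churn_probability must be between 0 and 1")
    if churn_probability >= high:
        return "high"
    if churn_probability >= medium:
        return "medium"
    return "low"


# Hypothetical model outputs keyed by user ID.
scored_users = {"user_a": 0.72, "user_b": 0.41, "user_c": 0.08}

tiers = {user_id: risk_tier(p) for user_id, p in scored_users.items()}
print(tiers)
```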
For inquiries about this analysis:
© Melissa Slawsky 2025. All Rights Reserved.
This repository contains proprietary analysis.
Published Project URL: Waze User Retention Dashboard
- Python Analysis Notebooks
- Model Deployment Resources