Orange Telecom | Churn Detection and Intervention Delivery Program
Orange Telecom loses an estimated $21,000 per month per 1,000 customers to churn that is structurally detectable 30-60 days before it happens. The gap is not the churn itself, it is the absence of a system to detect behavioral warning signals, route customers to the right intervention, and measure whether the intervention worked. This system predicts individual churn probability, segments at-risk customers by the behavioral driver behind their risk, and maps each driver to a specific retention action, shifting decisions from reactive to proactive. I structured delivery as a phased program: batch scoring and two A/B-tested interventions in Phase 1, real-time event-triggered intervention in Phase 2, with a full program delivery plan covering stakeholder map, RACI, risk register, and success metrics. Built on 3,333 Orange Telecom customer records. 14.5% churn rate. Three quantified retention strategies.
| Metric | Value |
|---|---|
| Best Model | Random Forest |
| ROC-AUC | 0.941 |
| PR-AUC | 0.906 |
| Churn Prevented per 1,000 Customers | 130 |
| Business Cost at Optimal Threshold | $6,300 per 667-customer cohort |
| Top Churn Segment | International Plan Holders: 38.5% churn, 2.64x lift |
Run python run_pipeline.py to regenerate all metrics from source.
| Artifact | Purpose |
|---|---|
| Program Delivery Plan | Stakeholder map, RACI, phased rollout, risk register, OKRs |
| Executive Brief | 1-page summary for VP-level audience |
| Retention Strategy Specifications | Per-strategy: owner, acceptance criteria, A/B test design, KPIs |
| ML System Design | Technical architecture and delivery considerations |
| Interview Preparation | ML system design Q&A and TPM question bank |
Customers with an active international plan churn at 38.5%, 2.64x the baseline rate. Two sub-segments drive this: heavy users who feel overcharged, and underusers paying for a service they barely use.
Action: Proactive plan optimization call. For underusers: offer downgrade with loyalty credit. For heavy users: offer loyalty pricing or rate reduction. Expected impact: ~12 churners prevented per 1,000 customers/month at 30% capture rate. Delivery owner: Customer Operations Director. Activates Phase 1, Week 6.
Customers with 3 or more service calls churn at 31.7%, 2.18x baseline. The inflection point is between calls 2 and 3. By the 3rd unresolved contact, the customer has started evaluating alternatives.
Action: Proactive outreach after the 2nd service call in a 30-day window, before the 3rd. Expected impact: ~17 churners prevented per 1,000 customers/month. Delivery owners: CRM Platform Lead (real-time trigger infrastructure) and Customer Operations Director (agent workflow). Requires real-time event streaming, activates Phase 2, Week 14. Batch interim (flag previous day's 2-contact customers) activates Phase 1, Week 8.
Top-quartile daytime users (over 220 min/month) churn at 27.5%, 1.89x baseline. Heavy daytime users face compounding per-minute billing exposure. The friction is structural.
Action: Proactively offer unlimited or capped-rate day plan to flagged high-usage customers. Expected impact: 20-30% churn reduction in segment with plan conversion. Risk: Revenue cannibalization. Mitigate by only offering to customers above churn threshold (0.24+). Delivery owner: Customer Operations Director. Activates Phase 1, Week 7.
| Phase | Weeks | Scope |
|---|---|---|
| Phase 0: Foundation | 1-3 | Legal gate, CRM integration spec, staging validation, baseline measurement |
| Phase 1: Batch Scoring | 4-10 | Model deployed, daily batch scoring, CRM integration, Strategies 1 and 3 as A/B tests |
| Phase 2: Real-Time Trigger | 11-24 | Event streaming, inference API, Strategy 2 A/B test, Phase 1 readout, model retraining |
| Phase 3: Operational | 25+ | BAU ownership transfers to Head of Data and Analytics |
Full program delivery plan: docs/program_delivery_plan.md
Raw Data (3,333 records, 20 features)
|
+-- Data Quality Checks
| +-- Leakage detection: charge cols dropped (r=1.000 with minutes)
| +-- Class imbalance audit (85.5% / 14.5%)
| +-- Stratified 80/20 re-split (not the Kaggle split)
|
+-- Feature Engineering (+11 behavioral features)
| +-- Usage intensity (avg call duration per period)
| +-- Distress signals (high_service_caller, intl_plan_underuser)
| +-- Engagement distribution (day_to_total_minutes_ratio)
|
+-- Model Training (GridSearchCV, 5-fold CV)
| +-- Logistic Regression -> interpretable linear baseline
| +-- Random Forest -> non-linear ensemble (selected)
| +-- Gradient Boosting -> sequential error correction comparator
|
+-- Evaluation (cost-optimized threshold)
| +-- Precision / Recall / F1 / ROC-AUC / PR-AUC
| +-- Business cost: FN=$500, FP=$50 -> RF optimal threshold 0.24
|
+-- Explainability
+-- Feature importance (Random Forest impurity)
+-- SHAP values (TreeExplainer)
+-- Segment churn analysis (5 behavioral segments)
| Model | Purpose | CV ROC-AUC | PR-AUC | Business Cost | Trade-offs |
|---|---|---|---|---|---|
| Logistic Regression | Interpretable baseline | 0.836 | 0.532 | $15,550 | Assumes linear separability |
| Random Forest | Production model | 0.912 | 0.906 | $6,300 | Less interpretable than LR |
| Gradient Boosting | Sequential error correction comparator | 0.916 | 0.882 | $9,250 | Hyperparameter-sensitive |
Why PR-AUC over ROC-AUC: At 14.5% churn rate, ROC-AUC is optimistic because the dominant class is easily classified. PR-AUC measures performance on the minority class, the one that matters.
Why threshold 0.24 not 0.5: False negatives cost 10x more than false positives ($500 vs $50). Threshold optimization minimizes total dollar cost, not misclassification rate.
.github/
workflows/ci.yml # Pytest on push and pull request
.gitignore
src/
config.py # Paths, cost model, constants
utils.py # Logging, I/O helpers
data_pipeline.py # Load, quality checks, stratified split
feature_engineering.py # Behavioral features, ratios, risk score
modeling.py # Model registry, GridSearchCV, training
evaluation.py # Enterprise metrics, threshold optimization
explainability.py # Feature importance, SHAP, segment analysis
docs/
program_delivery_plan.md # Stakeholder map, RACI, phased roadmap, risk register
executive_brief.md # 1-page business summary
ml_system_design.md # Full system design + delivery considerations
decision_recommendations.md # 3 retention strategies with A/B test specs
interview_presentation.md # ML system design Q&A + TPM question bank
outputs/
charts/ # ROC curves, SHAP summary, feature importance
model_metrics/ # model_comparison.csv, all_model_metrics.json
data/
churn-bigml-80.csv
churn-bigml-20.csv
tests/
test_data_pipeline.py # 8 tests
test_feature_engineering.py # 6 tests
test_evaluation.py # 5 tests
notebooks/
exploratory_analysis.ipynb # Clean EDA
original_notebook.ipynb # Original student project (reference only)
run_pipeline.py # End-to-end pipeline runner
requirements.txt
git clone https://github.com/nammnjoshii/Customer-Retention-Intelligence-System-From-Churn-Prediction-to-Intervention-Strategy
cd Customer-Retention-Intelligence-System-From-Churn-Prediction-to-Intervention-Strategy
python3 -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
pip install -r requirements.txt
python run_pipeline.pyRun tests:
pytest tests/ -vOutputs generated in outputs/charts/ and outputs/model_metrics/
Charge columns dropped. All 4 charge columns have r=1.000 with their corresponding minute columns (charge = rate x minutes). They are billing artifacts, not features. The original notebook identified "Total day charge" as the top predictor, that was an artifact of including a derived variable, not a real behavioral driver.
PR-AUC over ROC-AUC. At 14.5% churn rate, a model predicting "never churns" achieves 85.5% accuracy and a misleading ROC-AUC. PR-AUC reflects true minority-class performance.
Stratified re-split. The Kaggle pre-split is arbitrary. A stratified 80/20 split guarantees class proportions are preserved, producing a representative holdout evaluation.
Cost-optimized threshold. Default 0.5 threshold is wrong when a missed churner costs 10x more than a false alarm. The system grid-searches thresholds to minimize total dollar cost.
Program delivery plan: docs/program_delivery_plan.md Executive brief: docs/executive_brief.md Full system design: docs/ml_system_design.md Retention strategy specifications: docs/decision_recommendations.md Interview preparation: docs/interview_presentation.md