
# Telco Customer Churn Prediction


An end-to-end machine learning pipeline for predicting customer churn, built with scikit-learn, XGBoost, LightGBM, CatBoost, FastAPI, Next.js, and deployed on Google Cloud Run and Vercel.




## Live Deployments

| Service | URL | Description |
| --- | --- | --- |
| Frontend | churn-prediction-frontend.vercel.app | Next.js dashboard |
| API | churn-prediction-api.run.app | FastAPI prediction service |
| API Docs | `/docs` | Interactive Swagger UI |
| Health Check | `/health` | Service health status |

## System Architecture

### High-Level Architecture

```mermaid
graph TB
    subgraph "Data Layer"
        A[IBM Telco Dataset] --> B[Data Loader]
        B --> C[Data Cleaner]
        C --> D[Feature Engineer]
    end

    subgraph "ML Pipeline"
        D --> E[Data Splitter]
        E --> F[Model Training]
        F --> G[Model Evaluation]
        G --> H[Best Model Selection]
    end

    subgraph "Deployment"
        H --> I[Docker Container]
        I --> J[Google Artifact Registry]
        J --> K[Google Cloud Run]
    end

    subgraph "Frontend"
        L[Next.js App] --> M[Vercel]
        M --> K
    end

    subgraph "CI/CD"
        N[GitHub Actions] --> I
        N --> L
    end
```

### Request Flow

```mermaid
sequenceDiagram
    participant User
    participant Frontend as Frontend (Vercel)
    participant API as API (Cloud Run)
    participant Model as ML Model

    User->>Frontend: Enter customer data
    Frontend->>API: POST /predict
    API->>Model: Load and predict
    Model-->>API: Prediction result
    API-->>Frontend: JSON response
    Frontend-->>User: Display churn probability
```

## Model Performance

**Best Model:** Logistic Regression (ROC-AUC: 0.8417)

### Model Comparison

| Rank | Model | ROC-AUC | F1-Score | Recall | Precision | Accuracy |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | Logistic Regression | 0.8417 | 0.6173 | 0.7914 | 0.5060 | 0.7395 |
| 2 | MLP Neural Network | 0.8411 | 0.5892 | 0.6310 | 0.5528 | 0.7722 |
| 3 | Random Forest | 0.8407 | 0.6309 | 0.7540 | 0.5423 | 0.7658 |
| 4 | CatBoost | 0.8403 | 0.6198 | 0.8021 | 0.5051 | 0.7388 |
| 5 | LightGBM | 0.8357 | 0.6256 | 0.7460 | 0.5386 | 0.7630 |
| 6 | Voting Ensemble | 0.8332 | 0.6168 | 0.7273 | 0.5351 | 0.7587 |
| 7 | XGBoost | 0.8209 | 0.6168 | 0.7059 | 0.5477 | 0.7672 |
| 8 | SVM | 0.7962 | 0.6192 | 0.7914 | 0.5086 | 0.7417 |
| 9 | Decision Tree | 0.7573 | 0.5661 | 0.7273 | 0.4634 | 0.7040 |
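The F1-Score column is the harmonic mean of the precision and recall columns, so the table can be sanity-checked directly (values below are the Logistic Regression row):

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Logistic Regression row: precision 0.5060, recall 0.7914
print(round(f1_score(0.5060, 0.7914), 4))  # → 0.6173, matching the table
```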

### Model Performance Visualization

```mermaid
xychart-beta
    title "Model ROC-AUC Comparison"
    x-axis ["LR", "MLP", "RF", "CatBoost", "LGBM", "Voting", "XGB", "SVM", "DT"]
    y-axis "ROC-AUC Score" 0.70 --> 0.90
    bar [0.8417, 0.8411, 0.8407, 0.8403, 0.8357, 0.8332, 0.8209, 0.7962, 0.7573]
```

### Top Churn Predictors (SHAP Feature Importance)

```mermaid
xychart-beta
    title "SHAP Feature Importance"
    x-axis ["Contract-Tenure", "Tenure-Charge", "Fiber Optic", "Monthly Charges", "E-Check"]
    y-axis "Importance Score" 0 --> 0.70
    bar [0.61, 0.24, 0.23, 0.21, 0.15]
```

## Business Metrics

- **Dataset:** 7,043 customers (IBM Telco dataset)
- **Churn Rate:** 26.5%
- **Model Recall:** ~79% (identifies roughly 4 out of 5 churners)
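Back-of-envelope, those figures translate into absolute customer counts as follows (a rough sketch from the stated rate and the Logistic Regression recall, not pipeline output):

```python
customers = 7043
churn_rate = 0.265
recall = 0.7914  # Logistic Regression recall from the comparison table

churners = round(customers * churn_rate)  # expected churners in the data
caught = round(churners * recall)         # churners the model would flag
print(churners, caught)  # → 1866 1477
```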

## Quick Start

### Prerequisites

- Python 3.10+
- Node.js 18+ (for the frontend)
- Docker (optional)

### Installation

```bash
# Clone the repository
git clone https://github.com/WWI2196/telecom-churn-prediction.git
cd telecom-churn-prediction

# Create a virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt
```

### Train Models

```bash
# Run the full training pipeline
python src/train_pipeline.py

# Or run a comprehensive model comparison
python src/compare_all_models.py
```

### Start API Server

```bash
# Launch the FastAPI server
uvicorn src.api.main:app --reload --port 8000

# Access API docs at http://localhost:8000/docs
```

### Run Frontend

```bash
cd frontend
npm install
npm run dev

# Access the dashboard at http://localhost:3000
```

## Project Structure

```text
telecom-churn-prediction/
├── .github/workflows/          # CI/CD pipelines
│   ├── ci.yml                  # Main CI/CD pipeline
│   ├── mlops.yml               # MLOps pipeline
│   └── frontend.yml            # Frontend deployment
├── config/
│   └── config.yaml             # Central configuration
├── data/
│   ├── raw/                    # Original dataset
│   ├── processed/              # Cleaned data
│   └── interim/                # Intermediate files
├── docs/                       # Documentation
│   ├── CI_CD_SETUP.md          # CI/CD guide
│   ├── MLOPS_PIPELINE.md       # MLOps documentation
│   ├── CONVERGENCE_GUIDE.md    # ML convergence guide
│   └── FEATURE_SELECTION_GUIDE.md
├── frontend/                   # Next.js dashboard
│   ├── src/
│   │   ├── app/                # App router pages
│   │   ├── components/         # UI components
│   │   └── lib/                # Utilities and API client
│   └── package.json
├── models/
│   ├── baseline/               # Classic ML models
│   ├── advanced/               # Boosting and DL models
│   └── best_model/             # Production model
├── reports/
│   ├── figures/                # Visualizations
│   └── metrics/                # Performance metrics
├── src/
│   ├── api/                    # FastAPI application
│   ├── data/                   # Data processing
│   ├── features/               # Feature engineering
│   ├── models/                 # Model classes
│   ├── visualization/          # Plotting utilities
│   └── train_pipeline.py       # Training orchestrator
├── tests/                      # Unit tests
├── Dockerfile.api              # API container
├── Dockerfile.streamlit        # Dashboard container
├── docker-compose.yml          # Local development
└── requirements.txt
```

## CI/CD Pipeline

### Pipeline Overview

```mermaid
flowchart LR
    subgraph "CI/CD Pipeline"
        A[Push to Main] --> B[Lint]
        B --> C[Test]
        C --> D[Security Scan]
        D --> E[Build Docker]
        E --> F[Trivy Scan]
        F --> G[Push to GHCR]
        G --> H[Push to Artifact Registry]
        H --> I[Deploy to Cloud Run]
    end
```

### Detailed Pipeline Stages

```mermaid
flowchart TB
    subgraph "Stage 1: Code Quality"
        A1[Flake8 Linting]
        A2[Black Formatting]
        A3[isort Import Sorting]
    end

    subgraph "Stage 2: Testing"
        B1[Unit Tests]
        B2[Coverage Report]
        B3[Upload to Codecov]
    end

    subgraph "Stage 3: Security"
        C1[pip-audit]
        C2[Dependency Check]
    end

    subgraph "Stage 4: Build"
        D1[Build API Image]
        D2[Build Dashboard Image]
        D3[Save Artifacts]
    end

    subgraph "Stage 5: Container Security"
        E1[Trivy Container Scan]
        E2[Trivy IaC Scan]
        E3[SARIF Upload]
    end

    subgraph "Stage 6: Push"
        F1[Push to GHCR]
        F2[Tag with SHA]
    end

    subgraph "Stage 7: Deploy"
        G1[Push to Artifact Registry]
        G2[Deploy to Cloud Run]
        G3[Health Check]
    end

    A1 --> B1
    A2 --> B1
    A3 --> B1
    B1 --> C1
    C1 --> D1
    D1 --> E1
    E1 --> F1
    F1 --> G1
```

### Workflow Files

| Workflow | File | Trigger | Purpose |
| --- | --- | --- | --- |
| CI/CD Pipeline | `ci.yml` | Push to main/develop | Build, test, deploy API |
| MLOps Pipeline | `mlops.yml` | Push, weekly schedule | Train and validate models |
| Frontend CI/CD | `frontend.yml` | Changes to `frontend/` | Deploy to Vercel |

### GitHub Secrets Required

| Secret | Description |
| --- | --- |
| `GCP_PROJECT_ID` | Google Cloud project ID |
| `GCP_SA_KEY` | Service account JSON key |
| `VERCEL_TOKEN` | Vercel deployment token |
| `VERCEL_ORG_ID` | Vercel organization ID |
| `VERCEL_PROJECT_ID` | Vercel project ID |

### GitHub Variables

| Variable | Description |
| --- | --- |
| `API_URL` | Production API URL for the frontend |

## MLOps Pipeline

### Pipeline Overview

```mermaid
flowchart TB
    subgraph "Data Ingestion"
        A[Download Dataset] --> B[Data Validation]
        B --> C[Data Cleaning]
    end

    subgraph "Feature Engineering"
        C --> D[Feature Creation]
        D --> E[Feature Selection]
        E --> F[Data Splitting]
    end

    subgraph "Model Training"
        F --> G[Baseline Models]
        F --> H[Advanced Models]
        G --> I[Model Evaluation]
        H --> I
    end

    subgraph "Model Selection"
        I --> J{ROC-AUC >= 0.75?}
        J -->|Yes| K[Save Best Model]
        J -->|No| L[Training Failed]
    end

    subgraph "Deployment"
        K --> M[Build Container]
        M --> N[Deploy to Cloud Run]
    end

    subgraph "Monitoring"
        N --> O[Performance Tracking]
        O --> P[Drift Detection]
        P -->|Drift Detected| A
    end
```
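The model-selection gate above reduces to a few lines of logic. A minimal sketch (model names and scores taken from the comparison table; the threshold mirrors `thresholds.roc_auc_minimum` in `config.yaml`):

```python
ROC_AUC_MINIMUM = 0.75  # thresholds.roc_auc_minimum in config.yaml

def select_best_model(results: dict[str, float]) -> str:
    """Pick the model with the highest ROC-AUC; fail the run below threshold."""
    best_name, best_score = max(results.items(), key=lambda kv: kv[1])
    if best_score < ROC_AUC_MINIMUM:
        raise RuntimeError(
            f"Training failed: best ROC-AUC {best_score:.4f} < {ROC_AUC_MINIMUM}"
        )
    return best_name

scores = {"Logistic Regression": 0.8417, "MLP": 0.8411, "Random Forest": 0.8407}
print(select_best_model(scores))  # → Logistic Regression
```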

### Training Pipeline Detail

```mermaid
flowchart LR
    subgraph "Data Processing"
        A[Raw Data<br/>7,043 rows] --> B[Clean Data<br/>Remove nulls]
        B --> C[Engineer Features<br/>+15 new features]
    end

    subgraph "Splitting"
        C --> D[Train Set<br/>60%]
        C --> E[Validation Set<br/>20%]
        C --> F[Test Set<br/>20%]
    end

    subgraph "Training"
        D --> G[Train 9 Models]
        E --> H[Hyperparameter Tuning]
        G --> H
    end

    subgraph "Evaluation"
        H --> I[Evaluate on Test]
        F --> I
        I --> J[Select Best Model]
    end
```
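The 60/20/20 split can be sketched as shuffling the row indices and slicing. This is a simplified illustration; the real pipeline presumably stratifies by the churn label:

```python
import random

def three_way_split(n_rows: int, val_frac: float = 0.2,
                    test_frac: float = 0.2, seed: int = 42):
    """Shuffle row indices, then carve off test and validation sets."""
    idx = list(range(n_rows))
    random.Random(seed).shuffle(idx)
    n_test = int(n_rows * test_frac)
    n_val = int(n_rows * val_frac)
    test = idx[:n_test]
    val = idx[n_test:n_test + n_val]
    train = idx[n_test + n_val:]
    return train, val, test

train, val, test = three_way_split(7043)
print(len(train), len(val), len(test))  # → 4227 1408 1408
```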

### Model Training Workflow

```mermaid
stateDiagram-v2
    [*] --> DataIngestion
    DataIngestion --> DataValidation
    DataValidation --> DataCleaning
    DataCleaning --> FeatureEngineering
    FeatureEngineering --> DataSplitting
    DataSplitting --> ModelTraining
    ModelTraining --> ModelEvaluation
    ModelEvaluation --> ThresholdCheck
    ThresholdCheck --> ModelDeployment: Pass
    ThresholdCheck --> TrainingFailed: Fail
    ModelDeployment --> Monitoring
    Monitoring --> [*]
    TrainingFailed --> [*]
```

## API Documentation

### Endpoints

| Method | Endpoint | Description |
| --- | --- | --- |
| GET | `/health` | Health check and model status |
| POST | `/predict` | Single customer prediction |
| POST | `/predict/batch` | Batch predictions |
| GET | `/docs` | Interactive API documentation |

### Request/Response Flow

```mermaid
sequenceDiagram
    participant Client
    participant FastAPI
    participant Validator
    participant Model
    participant Response

    Client->>FastAPI: POST /predict
    FastAPI->>Validator: Validate input
    Validator-->>FastAPI: Valid
    FastAPI->>Model: Load model
    Model->>Model: Preprocess features
    Model->>Model: Generate prediction
    Model-->>FastAPI: Prediction result
    FastAPI->>Response: Format JSON
    Response-->>Client: Return prediction
```

### Example Request

```bash
curl -X POST "https://<your-service>.run.app/predict" \
  -H "Content-Type: application/json" \
  -d '{
    "tenure": 12,
    "MonthlyCharges": 75.50,
    "TotalCharges": 906.0,
    "Contract": "Month-to-month",
    "InternetService": "Fiber optic",
    "PaymentMethod": "Electronic check",
    "gender": "Male",
    "SeniorCitizen": 0,
    "Partner": "No",
    "Dependents": "No",
    "PhoneService": "Yes",
    "MultipleLines": "No",
    "OnlineSecurity": "No",
    "OnlineBackup": "No",
    "DeviceProtection": "No",
    "TechSupport": "No",
    "StreamingTV": "No",
    "StreamingMovies": "No",
    "PaperlessBilling": "Yes"
  }'
```
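The same request can be issued from Python using only the standard library (the URL stays a placeholder, as in the curl example):

```python
import json
import urllib.request

API_URL = "https://<your-service>.run.app/predict"  # placeholder, as above

customer = {
    "tenure": 12,
    "MonthlyCharges": 75.50,
    "TotalCharges": 906.0,
    "Contract": "Month-to-month",
    "InternetService": "Fiber optic",
    "PaymentMethod": "Electronic check",
    # ... remaining fields as in the curl example
}

req = urllib.request.Request(
    API_URL,
    data=json.dumps(customer).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)

# Executing the call requires a live service:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```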

### Example Response

```json
{
  "customer_id": "CUST-001",
  "churn_probability": 0.72,
  "churn_prediction": "Yes",
  "risk_level": "High",
  "confidence": "85%"
}
```
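The `risk_level` field is a coarse bucketing of the churn probability. The cut-offs below are illustrative assumptions, not the service's actual thresholds:

```python
def risk_level(churn_probability: float) -> str:
    """Bucket a churn probability into a coarse risk label (illustrative cut-offs)."""
    if churn_probability >= 0.6:
        return "High"
    if churn_probability >= 0.3:
        return "Medium"
    return "Low"

print(risk_level(0.72))  # → High, as in the example response
```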

## Docker Deployment

### Local Development

```bash
# Build and run all services
docker-compose up -d

# Services available:
# - API:       http://localhost:8000
# - Dashboard: http://localhost:8501
```

### Container Architecture

```mermaid
graph TB
    subgraph "Docker Compose Stack"
        A[docker-compose.yml]

        subgraph "API Container"
            B[FastAPI]
            C[Uvicorn]
            D[ML Model]
        end

        subgraph "Dashboard Container"
            E[Streamlit]
            F[Visualization]
        end

        subgraph "Monitoring Stack"
            G[Prometheus]
            H[Grafana]
        end
    end

    A --> B
    A --> E
    A --> G
    B --> D
    G --> B
    H --> G
```

### Production Deployment

```bash
# Full MLOps stack with monitoring
docker-compose -f docker-compose.monitoring.yml up -d

# Services:
# - API:        http://localhost:8000
# - Dashboard:  http://localhost:8501
# - Prometheus: http://localhost:9090
# - Grafana:    http://localhost:3000
```
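A deployment isn't really done until `/health` answers. A small stdlib polling helper, as a sketch (retry count and delay are assumptions, not pipeline settings):

```python
import time
import urllib.error
import urllib.request

def wait_until_healthy(url: str, attempts: int = 10, delay: float = 3.0) -> bool:
    """Poll a health endpoint until it returns HTTP 200 or attempts run out."""
    for _ in range(attempts):
        try:
            with urllib.request.urlopen(url, timeout=5) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, TimeoutError):
            pass  # not up yet; retry after the delay
        time.sleep(delay)
    return False

# wait_until_healthy("http://localhost:8000/health")
```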

## Monitoring

### Monitoring Architecture

```mermaid
graph LR
    subgraph "Application"
        A[FastAPI] --> B[Metrics Endpoint]
    end

    subgraph "Collection"
        B --> C[Prometheus]
    end

    subgraph "Visualization"
        C --> D[Grafana]
    end

    subgraph "Alerting"
        C --> E[AlertManager]
        E --> F[Email/Slack]
    end
```

### Metrics Tracked

```mermaid
mindmap
  root((Metrics))
    Model Performance
      ROC-AUC
      Precision
      Recall
      F1-Score
    API Health
      Request Rate
      Latency P50/P95
      Error Rate
      Active Connections
    Data Quality
      Feature Drift
      Prediction Distribution
      Missing Values
    Infrastructure
      CPU Usage
      Memory Usage
      Container Health
```
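The latency percentiles in that map (P50/P95) are plain order statistics. Prometheus estimates them from histograms; a minimal pure-Python version just shows the idea:

```python
import math

def percentile(samples: list, pct: float) -> float:
    """Nearest-rank percentile: smallest sample covering pct% of the distribution."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]

latencies_ms = [12, 15, 11, 240, 18, 14, 620, 16, 13, 17]
print(percentile(latencies_ms, 50), percentile(latencies_ms, 95))  # → 15 620
```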

### Alert Rules

| Alert | Condition | Severity |
| --- | --- | --- |
| Model Accuracy Drop | ROC-AUC < 0.75 | Critical |
| High API Latency | P95 > 500 ms | Warning |
| Data Drift Detected | PSI > 0.2 | Warning |
| High Error Rate | > 5% of requests | Critical |
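The PSI in the drift rule is the Population Stability Index, which compares a feature's live distribution against its training-time baseline over matching bins. A compact sketch over pre-binned fractions (bin choices are up to the pipeline; the sample distributions below are made up):

```python
import math

def psi(expected: list, actual: list, eps: float = 1e-6) -> float:
    """Population Stability Index over matching bins of two distributions."""
    assert len(expected) == len(actual)
    total = 0.0
    for e, a in zip(expected, actual):
        e, a = max(e, eps), max(a, eps)  # guard against empty bins
        total += (a - e) * math.log(a / e)
    return total

baseline = [0.25, 0.25, 0.25, 0.25]  # training-time feature distribution
today = [0.10, 0.20, 0.30, 0.40]     # live traffic distribution
print(round(psi(baseline, today), 3))  # → 0.228, above the 0.2 alert threshold
```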

## Documentation

| Document | Description |
| --- | --- |
| CI/CD Setup Guide | Complete CI/CD pipeline documentation |
| MLOps Pipeline Guide | ML pipeline architecture and configuration |
| Convergence Guide | Understanding model convergence |
| Feature Selection Guide | Feature engineering documentation |
| Models Guide | Model comparison and usage |
| API Reference | Interactive API documentation (see the deployed `/docs` endpoint) |

## Configuration

Edit `config/config.yaml` to customize:

```yaml
# Model settings
models:
  random_state: 42

# Data splitting
splitting:
  test_size: 0.2
  validation_size: 0.2

# Optimization
optimization:
  n_trials: 100
  cv_folds: 5

# Thresholds
thresholds:
  roc_auc_minimum: 0.75
```

## Testing

```bash
# Run all tests
pytest tests/ -v

# Run with coverage
pytest tests/ --cov=src --cov-report=html

# Run a specific test file
pytest tests/unit/test_data_loader.py -v
```
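New tests follow the plain pytest style. For instance, a hypothetical check on the TotalCharges cleaning step (blank strings for brand-new customers are a well-known quirk of this IBM dataset; the function name here is illustrative, not the repository's actual API):

```python
def coerce_total_charges(raw: list):
    """Convert TotalCharges strings to floats; blanks (new customers) become None."""
    return [float(v) if v.strip() else None for v in raw]

def test_blank_total_charges_become_none():
    assert coerce_total_charges(["906.0", " ", "29.85"]) == [906.0, None, 29.85]
```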

## License

This project is licensed under the MIT License; see the `LICENSE` file for details.

