An end-to-end machine learning pipeline for predicting customer churn, built with scikit-learn, XGBoost, LightGBM, CatBoost, FastAPI, Next.js, and deployed on Google Cloud Run and Vercel.
- Live Deployments
- System Architecture
- Model Performance
- Quick Start
- Project Structure
- CI/CD Pipeline
- MLOps Pipeline
- API Documentation
- Docker Deployment
- Monitoring
- Documentation
- License
## Live Deployments

| Service | URL | Description |
|---|---|---|
| Frontend | churn-prediction-frontend.vercel.app | Next.js Dashboard |
| API | churn-prediction-api.run.app | FastAPI Prediction Service |
| API Docs | /docs | Interactive Swagger UI |
| Health Check | /health | Service Health Status |
## System Architecture

```mermaid
graph TB
subgraph "Data Layer"
A[IBM Telco Dataset] --> B[Data Loader]
B --> C[Data Cleaner]
C --> D[Feature Engineer]
end
subgraph "ML Pipeline"
D --> E[Data Splitter]
E --> F[Model Training]
F --> G[Model Evaluation]
G --> H[Best Model Selection]
end
subgraph "Deployment"
H --> I[Docker Container]
I --> J[Google Artifact Registry]
J --> K[Google Cloud Run]
end
subgraph "Frontend"
L[Next.js App] --> M[Vercel]
M --> K
end
subgraph "CI/CD"
N[GitHub Actions] --> I
N --> L
end
```
```mermaid
sequenceDiagram
participant User
participant Frontend as Frontend (Vercel)
participant API as API (Cloud Run)
participant Model as ML Model
User->>Frontend: Enter customer data
Frontend->>API: POST /predict
API->>Model: Load and predict
Model-->>API: Prediction result
API-->>Frontend: JSON response
Frontend-->>User: Display churn probability
```
## Model Performance

| Rank | Model | ROC-AUC | F1-Score | Recall | Precision | Accuracy |
|---|---|---|---|---|---|---|
| 1 | Logistic Regression | 0.8417 | 0.6173 | 0.7914 | 0.5060 | 0.7395 |
| 2 | MLP Neural Network | 0.8411 | 0.5892 | 0.6310 | 0.5528 | 0.7722 |
| 3 | Random Forest | 0.8407 | 0.6309 | 0.7540 | 0.5423 | 0.7658 |
| 4 | CatBoost | 0.8403 | 0.6198 | 0.8021 | 0.5051 | 0.7388 |
| 5 | LightGBM | 0.8357 | 0.6256 | 0.7460 | 0.5386 | 0.7630 |
| 6 | Voting Ensemble | 0.8332 | 0.6168 | 0.7273 | 0.5351 | 0.7587 |
| 7 | XGBoost | 0.8209 | 0.6168 | 0.7059 | 0.5477 | 0.7672 |
| 8 | SVM | 0.7962 | 0.6192 | 0.7914 | 0.5086 | 0.7417 |
| 9 | Decision Tree | 0.7573 | 0.5661 | 0.7273 | 0.4634 | 0.7040 |
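For reference, a minimal sketch of how these metrics can be computed with scikit-learn, assuming a fitted classifier `model` and held-out arrays `X_test`, `y_test` (illustrative names, not the project's exact evaluation code):

```python
# Sketch: the comparison metrics above via scikit-learn.
from sklearn.metrics import (accuracy_score, f1_score, precision_score,
                             recall_score, roc_auc_score)

y_pred = model.predict(X_test)              # hard labels for F1/recall/precision
y_prob = model.predict_proba(X_test)[:, 1]  # churn-class probability for ROC-AUC

print({
    "roc_auc": roc_auc_score(y_test, y_prob),
    "f1": f1_score(y_test, y_pred),
    "recall": recall_score(y_test, y_pred),
    "precision": precision_score(y_test, y_pred),
    "accuracy": accuracy_score(y_test, y_pred),
})
```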
```mermaid
xychart-beta
title "Model ROC-AUC Comparison"
x-axis ["LR", "MLP", "RF", "CatBoost", "LGBM", "Voting", "XGB", "SVM", "DT"]
y-axis "ROC-AUC Score" 0.70 --> 0.90
bar [0.8417, 0.8411, 0.8407, 0.8403, 0.8357, 0.8332, 0.8209, 0.7962, 0.7573]
```
```mermaid
xychart-beta
title "SHAP Feature Importance"
x-axis ["Contract-Tenure", "Tenure-Charge", "Fiber Optic", "Monthly Charges", "E-Check"]
y-axis "Importance Score" 0 --> 0.70
bar [0.61, 0.24, 0.23, 0.21, 0.15]
```
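A minimal sketch of how SHAP importances like these can be derived with the `shap` library, assuming a fitted tree-based model `model` and a feature DataFrame `X`; the names and the class-handling detail are assumptions:

```python
# Sketch: global SHAP feature importance for a tree model.
import numpy as np
import shap

explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)
if isinstance(shap_values, list):   # some binary classifiers return one array per class
    shap_values = shap_values[1]    # keep the churn class

# Global importance = mean absolute SHAP value per feature
importance = np.abs(shap_values).mean(axis=0)
for name, score in sorted(zip(X.columns, importance), key=lambda t: -t[1])[:5]:
    print(f"{name}: {score:.2f}")
```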
- Dataset: 7,043 customers (IBM Telco dataset)
- Churn Rate: 26.5%
- Model Recall: 80%+ (identifies 4 out of 5 churners)
## Quick Start

### Prerequisites

- Python 3.10+
- Node.js 18+ (for frontend)
- Docker (optional)
```bash
# Clone repository
git clone https://github.com/WWI2196/telecom-churn-prediction.git
cd telecom-churn-prediction
# Create virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
```

```bash
# Run the full training pipeline
python src/train_pipeline.py
# Or run comprehensive model comparison
python src/compare_all_models.py
```

```bash
# Launch FastAPI server
uvicorn src.api.main:app --reload --port 8000
# Access API docs at http://localhost:8000/docs
```

```bash
cd frontend
npm install
npm run dev
# Access dashboard at http://localhost:3000
```

## Project Structure

```
telecom-churn-prediction/
├── .github/workflows/ # CI/CD pipelines
│ ├── ci.yml # Main CI/CD pipeline
│ ├── mlops.yml # MLOps pipeline
│ └── frontend.yml # Frontend deployment
├── config/
│ └── config.yaml # Central configuration
├── data/
│ ├── raw/ # Original dataset
│ ├── processed/ # Cleaned data
│ └── interim/ # Intermediate files
├── docs/ # Documentation
│ ├── CI_CD_SETUP.md # CI/CD guide
│ ├── MLOPS_PIPELINE.md # MLOps documentation
│ ├── CONVERGENCE_GUIDE.md # ML convergence guide
│ └── FEATURE_SELECTION_GUIDE.md
├── frontend/ # Next.js dashboard
│ ├── src/
│ │ ├── app/ # App router pages
│ │ ├── components/ # UI components
│ │ └── lib/ # Utilities and API client
│ └── package.json
├── models/
│ ├── baseline/ # Classic ML models
│ ├── advanced/ # Boosting and DL models
│ └── best_model/ # Production model
├── reports/
│ ├── figures/ # Visualizations
│ └── metrics/ # Performance metrics
├── src/
│ ├── api/ # FastAPI application
│ ├── data/ # Data processing
│ ├── features/ # Feature engineering
│ ├── models/ # Model classes
│ ├── visualization/ # Plotting utilities
│ └── train_pipeline.py # Training orchestrator
├── tests/ # Unit tests
├── Dockerfile.api # API container
├── Dockerfile.streamlit # Dashboard container
├── docker-compose.yml # Local development
└── requirements.txt
```
## CI/CD Pipeline

```mermaid
flowchart LR
subgraph "CI/CD Pipeline"
A[Push to Main] --> B[Lint]
B --> C[Test]
C --> D[Security Scan]
D --> E[Build Docker]
E --> F[Trivy Scan]
F --> G[Push to GHCR]
G --> H[Push to Artifact Registry]
H --> I[Deploy to Cloud Run]
end
```
```mermaid
flowchart TB
subgraph "Stage 1: Code Quality"
A1[Flake8 Linting]
A2[Black Formatting]
A3[isort Import Sorting]
end
subgraph "Stage 2: Testing"
B1[Unit Tests]
B2[Coverage Report]
B3[Upload to Codecov]
end
subgraph "Stage 3: Security"
C1[pip-audit]
C2[Dependency Check]
end
subgraph "Stage 4: Build"
D1[Build API Image]
D2[Build Dashboard Image]
D3[Save Artifacts]
end
subgraph "Stage 5: Container Security"
E1[Trivy Container Scan]
E2[Trivy IaC Scan]
E3[SARIF Upload]
end
subgraph "Stage 6: Push"
F1[Push to GHCR]
F2[Tag with SHA]
end
subgraph "Stage 7: Deploy"
G1[Push to Artifact Registry]
G2[Deploy to Cloud Run]
G3[Health Check]
end
A1 --> B1
A2 --> B1
A3 --> B1
B1 --> C1
C1 --> D1
D1 --> E1
E1 --> F1
F1 --> G1
```
| Workflow | File | Trigger | Purpose |
|---|---|---|---|
| CI/CD Pipeline | `ci.yml` | Push to main/develop | Build, test, deploy API |
| MLOps Pipeline | `mlops.yml` | Push, weekly schedule | Train and validate models |
| Frontend CI/CD | `frontend.yml` | Changes to `frontend/` | Deploy to Vercel |
Required GitHub secrets:

| Secret | Description |
|---|---|
| `GCP_PROJECT_ID` | Google Cloud project ID |
| `GCP_SA_KEY` | Service account JSON key |
| `VERCEL_TOKEN` | Vercel deployment token |
| `VERCEL_ORG_ID` | Vercel organization ID |
| `VERCEL_PROJECT_ID` | Vercel project ID |
Repository variables:

| Variable | Description |
|---|---|
| `API_URL` | Production API URL for frontend |
## MLOps Pipeline

```mermaid
flowchart TB
subgraph "Data Ingestion"
A[Download Dataset] --> B[Data Validation]
B --> C[Data Cleaning]
end
subgraph "Feature Engineering"
C --> D[Feature Creation]
D --> E[Feature Selection]
E --> F[Data Splitting]
end
subgraph "Model Training"
F --> G[Baseline Models]
F --> H[Advanced Models]
G --> I[Model Evaluation]
H --> I
end
subgraph "Model Selection"
I --> J{ROC-AUC >= 0.75?}
J -->|Yes| K[Save Best Model]
J -->|No| L[Training Failed]
end
subgraph "Deployment"
K --> M[Build Container]
M --> N[Deploy to Cloud Run]
end
subgraph "Monitoring"
N --> O[Performance Tracking]
O --> P[Drift Detection]
P -->|Drift Detected| A
end
```
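The quality gate in the diagram boils down to a threshold check before the model is saved. A hedged sketch, where `results` (model name to ROC-AUC) and `models` are illustrative inputs and the threshold mirrors `thresholds.roc_auc_minimum` in `config/config.yaml`:

```python
# Sketch of the ROC-AUC promotion gate; not the pipeline's exact code.
import joblib

ROC_AUC_MINIMUM = 0.75  # mirrors thresholds.roc_auc_minimum in config.yaml

def select_best_model(results: dict[str, float], models: dict, out_path: str) -> str:
    """Pick the highest-ROC-AUC model and persist it only if it clears the gate."""
    name, score = max(results.items(), key=lambda item: item[1])
    if score < ROC_AUC_MINIMUM:
        raise RuntimeError(f"Training failed: best ROC-AUC {score:.4f} < {ROC_AUC_MINIMUM}")
    joblib.dump(models[name], out_path)  # e.g., into models/best_model/
    return name
```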
```mermaid
flowchart LR
subgraph "Data Processing"
A[Raw Data<br/>7,043 rows] --> B[Clean Data<br/>Remove nulls]
B --> C[Engineer Features<br/>+15 new features]
end
subgraph "Splitting"
C --> D[Train Set<br/>60%]
C --> E[Validation Set<br/>20%]
C --> F[Test Set<br/>20%]
end
subgraph "Training"
D --> G[Train 9 Models]
E --> H[Hyperparameter Tuning]
G --> H
end
subgraph "Evaluation"
H --> I[Evaluate on Test]
F --> I
I --> J[Select Best Model]
end
```
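The 60/20/20 split above can be produced with two stratified `train_test_split` calls; `X` and `y` stand in for the engineered features and churn labels:

```python
# Sketch of the 60/20/20 split shown in the diagram.
from sklearn.model_selection import train_test_split

# Carve off the 20% test set first
X_rest, X_test, y_rest, y_test = train_test_split(
    X, y, test_size=0.20, stratify=y, random_state=42)

# Split the remaining 80% into 60% train / 20% validation (0.25 * 0.80 = 0.20)
X_train, X_val, y_train, y_val = train_test_split(
    X_rest, y_rest, test_size=0.25, stratify=y_rest, random_state=42)
```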
```mermaid
stateDiagram-v2
[*] --> DataIngestion
DataIngestion --> DataValidation
DataValidation --> DataCleaning
DataCleaning --> FeatureEngineering
FeatureEngineering --> DataSplitting
DataSplitting --> ModelTraining
ModelTraining --> ModelEvaluation
ModelEvaluation --> ThresholdCheck
ThresholdCheck --> ModelDeployment: Pass
ThresholdCheck --> TrainingFailed: Fail
ModelDeployment --> Monitoring
Monitoring --> [*]
TrainingFailed --> [*]
```
## API Documentation

| Method | Endpoint | Description |
|---|---|---|
| GET | `/health` | Health check and model status |
| POST | `/predict` | Single customer prediction |
| POST | `/predict/batch` | Batch predictions |
| GET | `/docs` | Interactive API documentation |
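A minimal Python client sketch using `requests`; the base URL is a placeholder and the payload is abbreviated (the full field list appears in the request example below):

```python
# Sketch: calling the prediction API; replace BASE_URL with the deployed URL.
import requests

BASE_URL = "https://<your-service>.run.app"  # placeholder

# Health check
print(requests.get(f"{BASE_URL}/health", timeout=10).json())

# Single prediction (abbreviated; send every field from the curl example below)
customer = {
    "tenure": 12,
    "MonthlyCharges": 75.50,
    "Contract": "Month-to-month",
    # ... remaining customer fields ...
}
print(requests.post(f"{BASE_URL}/predict", json=customer, timeout=10).json())
```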
```mermaid
sequenceDiagram
participant Client
participant FastAPI
participant Validator
participant Model
participant Response
Client->>FastAPI: POST /predict
FastAPI->>Validator: Validate input
Validator-->>FastAPI: Valid
FastAPI->>Model: Load model
Model->>Model: Preprocess features
Model->>Model: Generate prediction
Model-->>FastAPI: Prediction result
FastAPI->>Response: Format JSON
Response-->>Client: Return prediction
```
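A condensed sketch of this flow as a FastAPI endpoint. The model path, field list, and risk thresholds are assumptions for illustration, not the service's exact implementation:

```python
# Sketch of the /predict flow: validate -> load model -> predict -> JSON.
import joblib
import pandas as pd
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()
model = joblib.load("models/best_model/model.joblib")  # assumed artifact path

class CustomerData(BaseModel):
    tenure: int
    MonthlyCharges: float
    TotalCharges: float
    Contract: str
    # ... remaining fields as in the request example below ...

@app.post("/predict")
def predict(customer: CustomerData):
    features = pd.DataFrame([customer.model_dump()])  # Pydantic v2
    probability = float(model.predict_proba(features)[0, 1])
    if probability >= 0.7:      # illustrative risk bands
        risk = "High"
    elif probability >= 0.4:
        risk = "Medium"
    else:
        risk = "Low"
    return {
        "churn_probability": round(probability, 2),
        "churn_prediction": "Yes" if probability >= 0.5 else "No",
        "risk_level": risk,
    }
```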
Example request:

```bash
curl -X POST "https://<your-service>.run.app/predict" \
-H "Content-Type: application/json" \
-d '{
"tenure": 12,
"MonthlyCharges": 75.50,
"TotalCharges": 906.0,
"Contract": "Month-to-month",
"InternetService": "Fiber optic",
"PaymentMethod": "Electronic check",
"gender": "Male",
"SeniorCitizen": 0,
"Partner": "No",
"Dependents": "No",
"PhoneService": "Yes",
"MultipleLines": "No",
"OnlineSecurity": "No",
"OnlineBackup": "No",
"DeviceProtection": "No",
"TechSupport": "No",
"StreamingTV": "No",
"StreamingMovies": "No",
"PaperlessBilling": "Yes"
}'
```

Example response:

```json
{
"customer_id": "CUST-001",
"churn_probability": 0.72,
"churn_prediction": "Yes",
"risk_level": "High",
"confidence": "85%"
}
```

## Docker Deployment

```bash
# Build and run all services
docker-compose up -d
# Services available:
# - API: http://localhost:8000
# - Dashboard: http://localhost:8501
```

```mermaid
graph TB
subgraph "Docker Compose Stack"
A[docker-compose.yml]
subgraph "API Container"
B[FastAPI]
C[Uvicorn]
D[ML Model]
end
subgraph "Dashboard Container"
E[Streamlit]
F[Visualization]
end
subgraph "Monitoring Stack"
G[Prometheus]
H[Grafana]
end
end
A --> B
A --> E
A --> G
B --> D
G --> B
H --> G
```
```bash
# Full MLOps stack with monitoring
docker-compose -f docker-compose.monitoring.yml up -d
# Services:
# - API: http://localhost:8000
# - Dashboard: http://localhost:8501
# - Prometheus: http://localhost:9090
# - Grafana: http://localhost:3000
```

## Monitoring

```mermaid
graph LR
subgraph "Application"
A[FastAPI] --> B[Metrics Endpoint]
end
subgraph "Collection"
B --> C[Prometheus]
end
subgraph "Visualization"
C --> D[Grafana]
end
subgraph "Alerting"
C --> E[AlertManager]
E --> F[Email/Slack]
end
```mermaid
mindmap
root((Metrics))
Model Performance
ROC-AUC
Precision
Recall
F1-Score
API Health
Request Rate
Latency P50/P95
Error Rate
Active Connections
Data Quality
Feature Drift
Prediction Distribution
Missing Values
Infrastructure
CPU Usage
Memory Usage
Container Health
```
| Alert | Condition | Severity |
|---|---|---|
| Model Accuracy Drop | ROC-AUC < 0.75 | Critical |
| High API Latency | P95 > 500ms | Warning |
| Data Drift Detected | PSI > 0.2 | Warning |
| High Error Rate | > 5% requests | Critical |
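The drift alert is based on the Population Stability Index. A self-contained sketch of a PSI check (the binning strategy is an illustrative choice; the 0.2 threshold mirrors the table above):

```python
# Sketch: PSI between a baseline (training) distribution and live data.
import numpy as np

def population_stability_index(expected: np.ndarray, actual: np.ndarray,
                               bins: int = 10) -> float:
    edges = np.histogram_bin_edges(expected, bins=bins)
    expected_pct = np.histogram(expected, bins=edges)[0] / len(expected)
    actual_pct = np.histogram(actual, bins=edges)[0] / len(actual)
    # Clip to avoid log(0) in empty bins
    expected_pct = np.clip(expected_pct, 1e-6, None)
    actual_pct = np.clip(actual_pct, 1e-6, None)
    return float(np.sum((actual_pct - expected_pct) * np.log(actual_pct / expected_pct)))

# Flag drift when PSI exceeds 0.2, matching the alert condition:
# if population_stability_index(train_scores, live_scores) > 0.2: ...
```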
## Documentation

| Document | Description |
|---|---|
| CI/CD Setup Guide | Complete CI/CD pipeline documentation |
| MLOps Pipeline Guide | ML pipeline architecture and configuration |
| Convergence Guide | Understanding model convergence |
| Feature Selection Guide | Feature engineering documentation |
| Models Guide | Model comparison and usage |
| API Reference | Interactive API documentation (see deployed /docs endpoint) |
## Configuration

Edit `config/config.yaml` to customize:

```yaml
# Model settings
models:
  random_state: 42

# Data splitting
splitting:
  test_size: 0.2
  validation_size: 0.2

# Optimization
optimization:
  n_trials: 100
  cv_folds: 5

# Thresholds
thresholds:
  roc_auc_minimum: 0.75
```
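A minimal sketch of reading these settings with PyYAML; the key paths match the file above, while the loader itself is illustrative:

```python
# Sketch: loading config/config.yaml.
import yaml

with open("config/config.yaml") as f:
    config = yaml.safe_load(f)

test_size = config["splitting"]["test_size"]       # 0.2
n_trials = config["optimization"]["n_trials"]      # 100
min_auc = config["thresholds"]["roc_auc_minimum"]  # 0.75
```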
## Testing

```bash
# Run all tests
pytest tests/ -v

# Run with coverage
pytest tests/ --cov=src --cov-report=html

# Run specific test file
pytest tests/unit/test_data_loader.py -v
```

## Acknowledgments

- IBM Telco Customer Churn Dataset
- SHAP: SHapley Additive exPlanations
- Optuna: Hyperparameter Optimization
- FastAPI Documentation
- Google Cloud Run
## License

This project is licensed under the MIT License; see the LICENSE file for details.