A practical implementation of uplift modeling for customer targeting optimization. This project demonstrates a complete uplift modeling workflow, from synthetic data generation to financial impact analysis.
Uplift modeling (also called heterogeneous treatment effect estimation) predicts the incremental value of targeting each customer rather than the probability of conversion. The key insight is that not all customers respond equally to marketing:
- Persuadables: Will convert only if contacted (positive uplift)
- Sure Things: Would convert without contact (no need to contact)
- Lost Causes: Won't convert regardless of contact (wasted effort)
- Sleeping Dogs: Will NOT convert if contacted, but would convert without contact (negative uplift - avoid!)
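The four segments can be read as the four combinations of a customer's two potential outcomes (converts if contacted, converts if not contacted). The sketch below is only illustrative: the function names are hypothetical, and in real data just one of the two outcomes is ever observed per customer, which is why uplift has to be estimated rather than looked up.

```python
def segment(converts_if_contacted: bool, converts_if_not_contacted: bool) -> str:
    """Map a customer's two potential outcomes to one of the four segments."""
    if converts_if_contacted and not converts_if_not_contacted:
        return "Persuadable"    # uplift = +1
    if converts_if_contacted and converts_if_not_contacted:
        return "Sure Thing"     # uplift = 0
    if not converts_if_contacted and not converts_if_not_contacted:
        return "Lost Cause"     # uplift = 0
    return "Sleeping Dog"       # uplift = -1


def individual_uplift(converts_if_contacted: bool, converts_if_not_contacted: bool) -> int:
    """Difference between the contacted and non-contacted outcomes."""
    return int(converts_if_contacted) - int(converts_if_not_contacted)
```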
| File | Description |
|---|---|
| uplift_modeling.py | Main implementation with data generation, model training, and evaluation |
| uplift_modeling.ipynb | Rendered script in Jupyter notebook format |
| uplift_data.csv | Generated synthetic dataset |
| README.md | This file |
- Generates 10,000 synthetic customers with RFM features (Recency, Frequency, Monetary)
- Includes demographic features (age, gender)
- Creates 5 hidden customer segments with known true uplift values
- Simulates A/B test with 50/50 treatment assignment
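A minimal, standard-library sketch of how such a dataset could be simulated. The segment names, baseline rates, uplift values, feature ranges, and column names below are all hypothetical placeholders, not the values used in `uplift_modeling.py`. For brevity the features are drawn independently of the segment; a realistic generator would tie them together so that a model can recover the segment from the features.

```python
import random

# Hypothetical segment table: (name, baseline conversion rate, true uplift).
SEGMENTS = [
    ("persuadable", 0.05, 0.15),
    ("sure_thing", 0.40, 0.00),
    ("lost_cause", 0.02, 0.00),
    ("sleeping_dog", 0.30, -0.10),
    ("mild_responder", 0.10, 0.03),
]


def generate_customers(n=10_000, seed=42):
    """Simulate a randomized A/B test over n synthetic customers."""
    rng = random.Random(seed)
    rows = []
    for customer_id in range(n):
        name, base_rate, true_uplift = rng.choice(SEGMENTS)
        treatment = int(rng.random() < 0.5)  # 50/50 random assignment
        p_convert = base_rate + treatment * true_uplift
        rows.append({
            "customer_id": customer_id,
            "recency_days": rng.randint(1, 365),
            "frequency": rng.randint(1, 20),
            "monetary": round(rng.uniform(10.0, 500.0), 2),
            "age": rng.randint(18, 80),
            "gender": rng.choice(["F", "M"]),
            "segment": name,              # hidden from the models
            "true_uplift": true_uplift,   # kept only for evaluation
            "treatment": treatment,
            "conversion": int(rng.random() < p_convert),
        })
    return rows
```

Because the true uplift is stored alongside each row, predicted uplift can later be checked against ground truth, which is not possible with real campaign data.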
Three uplift modeling approaches are compared:
- S-learner (SoloModel): Single model with treatment as a feature
- T-learner (TwoModels): Separate models for treatment and control groups
- Class Transformation (CVT): Transforms the problem to classification
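To make the difference between the approaches concrete, here is a toy sketch in which the "model" is just a conversion rate per discrete customer bucket, standing in for a real classifier. The `bucket`, `treatment`, and `conversion` keys are assumed names for illustration. With such a lookup-table model the S-learner gives the same result as the T-learner, so only the T-learner and the class transformation are shown.

```python
from collections import defaultdict


def _rate_by_key(pairs):
    """Return {key: mean of values} for an iterable of (key, value) pairs."""
    totals = defaultdict(lambda: [0, 0])
    for key, value in pairs:
        totals[key][0] += value
        totals[key][1] += 1
    return {key: total / count for key, (total, count) in totals.items()}


def t_learner_uplift(rows):
    """Fit one rate table per group and subtract control from treated."""
    treated = _rate_by_key(
        (r["bucket"], r["conversion"]) for r in rows if r["treatment"] == 1
    )
    control = _rate_by_key(
        (r["bucket"], r["conversion"]) for r in rows if r["treatment"] == 0
    )
    return {b: treated[b] - control[b] for b in treated.keys() & control.keys()}


def class_transformation_uplift(rows):
    """Model Z = 1 when (treated and converted) or (control and not converted).

    Under 50/50 assignment, uplift = 2 * P(Z = 1 | x) - 1.
    """
    z_rate = _rate_by_key(
        (r["bucket"], int(r["conversion"] == r["treatment"])) for r in rows
    )
    return {b: 2 * p - 1 for b, p in z_rate.items()}
```

The class transformation identity relies on balanced treatment assignment, which matches the 50/50 split used in the simulated A/B test.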
- Qini AUC: Measures ranking quality of uplift predictions
- Uplift@K: Average uplift in top K% of customers
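As a rough illustration of Uplift@K, the sketch below uses one common definition: take the top K fraction of customers by predicted uplift, then subtract the control conversion rate from the treated conversion rate within that slice. The function name and signature are assumptions for this example, and libraries differ in the exact variant they compute.

```python
def uplift_at_k(scores, treatment, outcome, k=0.3):
    """Treated minus control conversion rate among the top-k scored customers."""
    n_top = max(1, int(len(scores) * k))
    # Indices of the n_top customers with the highest predicted uplift.
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:n_top]
    treated = [outcome[i] for i in top if treatment[i] == 1]
    control = [outcome[i] for i in top if treatment[i] == 0]
    if not treated or not control:
        raise ValueError("top-k slice needs both treated and control customers")
    return sum(treated) / len(treated) - sum(control) / len(control)
```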
- Profit curves showing expected profit at different targeting fractions
- Optimal contact percentage calculation
- ROI comparison vs mass mailing
uv sync
uv run uplift_modeling.py

The individual treatment effect: τ(x) = P(Y=1|X=x, T=1) - P(Y=1|X=x, T=0)
Similar to a ROC curve, but for uplift: it plots cumulative uplift against the fraction of the population targeted. The area under the curve (Qini AUC) measures ranking quality.
Shows expected profit as a function of contact fraction, accounting for:
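One way to compute such a curve, sketched under assumptions: customers are ranked by predicted uplift, and at each cutoff the curve value is the number of treated conversions minus the control conversions rescaled to the treated group size. The AUC here is the trapezoidal area above the random-targeting diagonal. Normalization conventions vary between libraries, so the numbers from this sketch need not match those reported by the project.

```python
def qini_curve(scores, treatment, outcome):
    """Return (fractions, values) of the Qini curve, starting at (0, 0)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    n_t = n_c = y_t = y_c = 0
    xs, ys = [0.0], [0.0]
    for rank, i in enumerate(order, start=1):
        if treatment[i] == 1:
            n_t += 1
            y_t += outcome[i]
        else:
            n_c += 1
            y_c += outcome[i]
        # Rescale control conversions to the size of the treated group so far.
        scaled_control = y_c * n_t / n_c if n_c else 0.0
        xs.append(rank / len(order))
        ys.append(y_t - scaled_control)
    return xs, ys


def qini_auc(scores, treatment, outcome):
    """Trapezoidal area under the Qini curve minus the random baseline."""
    xs, ys = qini_curve(scores, treatment, outcome)
    area = sum(
        (xs[i] - xs[i - 1]) * (ys[i] + ys[i - 1]) / 2 for i in range(1, len(xs))
    )
    random_area = ys[-1] / 2  # straight line from (0, 0) to (1, ys[-1])
    return area - random_area
```

A positive value means the ranking places responsive customers earlier than random ordering would; a negative value means the ranking is worse than random.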
- Cost per contact
- Value per conversion (margin)
The optimal targeting fraction balances contact cost against incremental conversions.
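The trade-off can be sketched as follows, assuming each contacted customer contributes `predicted uplift × value per conversion − cost per contact` to expected profit. The function and parameter names are hypothetical, and the sketch relies on predicted rather than true uplift, so it is only as good as the model behind the scores.

```python
def profit_curve(predicted_uplift, cost_per_contact, value_per_conversion):
    """Cumulative expected profit when contacting customers in uplift order.

    Returns a list of (fraction_contacted, expected_profit) points.
    """
    ranked = sorted(predicted_uplift, reverse=True)
    curve = [(0.0, 0.0)]
    profit = 0.0
    for rank, uplift in enumerate(ranked, start=1):
        profit += uplift * value_per_conversion - cost_per_contact
        curve.append((rank / len(ranked), profit))
    return curve


def optimal_fraction(curve):
    """Return the (fraction, profit) point with the highest expected profit."""
    return max(curve, key=lambda point: point[1])
```

The last point of the curve corresponds to contacting everyone, so comparing it with the optimum gives the gain over mass mailing. Profit stops growing once the marginal customer's predicted uplift falls below `cost_per_contact / value_per_conversion`.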