This repository contains a complete, production-ready pipeline for predicting customer churn. It demonstrates a journey from initial data analysis and feature engineering to model training, deployment, and robust monitoring. The project emphasizes a data-centric approach and showcases an innovative vision for an LLM-augmented MLOps pipeline.
The final model is a CatBoost classifier trained on a challenging, synthetically engineered dataset designed to mirror real-world complexity.
To get started with the project, run the following commands:

```bash
pip install -r requirements-dev.txt
jupyter notebook 1_data_preprocessing.ipynb
```

- End-to-End Workflow: Covers every step from data preprocessing and feature engineering to model training and deployment.
- Advanced Model Training: Utilizes CatBoost for high performance and Optuna for hyperparameter optimization.
- Model Explainability: Integrates SHAP for understanding model predictions.
- Production-Ready API: A FastAPI application (`app.py`) serves the trained model for real-time predictions.
- Containerized Deployment: Includes a `Dockerfile` for easy containerization and deployment.
- Robust Monitoring: Implements a monitoring script using `evidently` to detect data drift and ensure model health.
- Innovative MLOps Concept: Proposes a next-generation, LLM-augmented monitoring system for automated root cause analysis and proactive testing.
The repository is organized into notebooks, scripts, and artifacts, providing a clear and reproducible workflow.
- `1_data_preprocessing.ipynb`: Notebook for data loading, cleaning, extensive feature engineering, and creating the final, analysis-ready dataset.
- `2_model_training.ipynb`: Notebook for model training, hyperparameter optimization (using Optuna), evaluation, and SHAP-based explainability.
- `3_deployment_and_monitoring.ipynb`: Notebook that defines the production API with FastAPI, documents the scalable architecture, and implements the final, robust monitoring script with `evidently`.
- `app.py`: The Python script for the FastAPI prediction service.
- `Dockerfile`: Defines the container for deploying the FastAPI application.
- `requirements-deploy.txt` / `requirements-dev.txt`: Python dependencies for deployment and development, respectively.
- `.env.example`: A template for providing API keys for LLM providers used by the advanced monitoring features.
- `artifacts/`: Directory containing all output files, including the trained model, preprocessor, evaluation reports, and data files.
- `README_SUMMARY.md`: A detailed technical report summarizing the project's journey, decisions, and architecture.
This project proposes a forward-thinking MLOps design where Large Language Models (LLMs) are used to create a self-analyzing system. Instead of just flagging data drift, the pipeline can:
- Detect & Export: Automatically run an `evidently` monitoring pipeline and export a machine-readable `drift_report.json`.
- Reason & Analyze: Use an LLM agent to perform a root cause analysis on the drift report, identifying the "why" behind the issue.
- Act & Triage: Programmatically create Jira tickets and Slack alerts based on the LLM's structured analysis.
- Test & Qualify: Generate new, targeted test cases with an "Adversarial Tester" LLM and use an "LLM Judge" to get qualitative insights into model performance on new data cohorts.
This creates a full, automated loop: Detect -> Reason -> Recommend -> Test -> Qualify, representing a next-generation approach to building and maintaining machine learning systems.
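A skeleton of that loop might look like the following. The `call_llm_analyst` function, the action strings, and the report fields are stubs and assumptions standing in for the real LLM, Jira, and Slack integrations, and the JSON shape is not `evidently`'s exact export schema.

```python
import json

def call_llm_analyst(drift_report: dict) -> dict:
    """Stub for the Reason step; a real system would prompt an LLM with the
    drift report and parse a structured JSON answer back."""
    drifted = [c["column"] for c in drift_report.get("columns", []) if c.get("drifted")]
    return {
        "root_cause": f"Drift detected in: {', '.join(drifted) or 'none'}",
        "severity": "high" if len(drifted) >= 2 else "low",
    }

def triage(analysis: dict) -> list[str]:
    """Turn the structured analysis into actions (Jira/Slack stand-ins)."""
    actions = [f"slack:alert:{analysis['root_cause']}"]
    if analysis["severity"] == "high":
        actions.append("jira:create_ticket:investigate-drift")
    return actions

def run_monitoring_loop(report_json: str) -> list[str]:
    # Detect & Export happens upstream (evidently writes drift_report.json);
    # this function only covers Reason & Analyze and Act & Triage.
    report = json.loads(report_json)
    return triage(call_llm_analyst(report))
```

Because the LLM's answer is forced into a structured dict, the triage step stays deterministic and unit-testable even though the analysis text itself is model-generated.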
The project is designed with scalability in mind, proposing an architecture that can handle millions of daily predictions using technologies like Kafka for data streaming, Spark/Dask for distributed processing, and a Kubernetes-hosted Triton Inference Server for high-performance, auto-scaling model serving.
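At that scale, a streaming consumer would typically micro-batch incoming events before calling the inference server, so Triton sees fewer, larger requests. Below is a small, self-contained batching helper; the Kafka/Triton wiring is left as comments because the topic name, model name, and client setup are assumptions, not part of this repository.

```python
from typing import Iterable, Iterator

def micro_batches(events: Iterable[dict], batch_size: int) -> Iterator[list[dict]]:
    """Group a stream of prediction requests into fixed-size batches so the
    inference server handles fewer, larger calls."""
    batch: list[dict] = []
    for event in events:
        batch.append(event)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # flush the final partial batch
        yield batch

# Sketch of the surrounding wiring (hypothetical names, not runnable as-is):
#   consumer = confluent_kafka.Consumer({...})   # subscribed to a churn-events topic
#   for batch in micro_batches(poll_events(consumer), 256):
#       triton_client.infer("churn_model", inputs_from(batch))
```

A fixed batch size is the simplest policy; a production consumer would usually also flush on a timeout so low-traffic periods do not delay predictions indefinitely.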