This repository demonstrates how to integrate agentic AI into MLOps workflows using LangChain, Hugging Face models, and MLflow.
The pipeline simulates data drift detection, risk assessment, and automatic retraining, while logging all results to MLflow for transparency and reproducibility.
- LLM-driven agents: Uses a large language model (LLaMA-3.1-8B) from Hugging Face via LangChain.
- Data drift detection: Implements a Kolmogorov–Smirnov (KS) test to detect drift between old and new data distributions.
- Risk assessment: Compares current accuracy to historical accuracy via a simple bias score.
- Model retraining: Retrains a PyTorch model on simulated new data if drift is detected and risk is low.
- Experiment tracking: Logs metrics, parameters, and models with MLflow for reproducibility and visualization.
```
agentmlops/
├── agent-mlops-pipeline.py   # main Python script
├── requirements.txt          # project dependencies
├── .env                      # template for the Hugging Face API token
└── README.md                 # documentation (this file)
```
Follow these steps to set up and run the project on your local system.
```bash
git clone https://github.com/your-username/agentmlops.git
cd agentmlops

# Create a virtual environment
python3 -m venv venv

# Activate (macOS/Linux)
source venv/bin/activate
# Activate (Windows PowerShell)
venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt
```
- Create a `.env` file in the same directory.
- Add your Hugging Face access token to `.env`: `HUGGINGFACEHUB_API_TOKEN="your_token_here"`
🔑 You need access to the LLaMA-3.1-8B model to run this demo.
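For reference, here is a minimal sketch of how the script can pick up the token at runtime, assuming it uses the `python-dotenv` package (check `requirements.txt` to confirm):

```python
import os

from dotenv import load_dotenv  # assumes python-dotenv is installed

load_dotenv()  # reads key=value pairs from .env into the process environment

token = os.environ.get("HUGGINGFACEHUB_API_TOKEN")
if not token:
    raise RuntimeError("HUGGINGFACEHUB_API_TOKEN is not set; add it to .env")
```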
Verify your environment if needed:

```bash
python --version
pip list
```
Run the script:
```bash
python agent-mlops-pipeline.py
```
Expected output (example):
```
Deterministic pipeline result: ✅ Drift detected. Retrained model safely.
```
To visualize results with MLflow, open another terminal and start the UI:
```bash
mlflow ui
```
Then go to http://127.0.0.1:5000 in your browser.
You will see:
- Parameters:
  - `deterministic_agent_response` (drift/risk/retrain result)
  - `llm_agent_response` (LLM agent output, if available)
- Metrics:
  - `accuracy` (simulated model accuracy)
  - `historical_accuracy` (baseline reference)
- Artifacts:
  - the retrained model (logged via MLflow if retraining is triggered)
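For orientation, here is a minimal sketch of how such values are typically logged with MLflow. The run name and the stand-in values are assumptions; the parameter and metric names mirror the list above:

```python
import mlflow

# Stand-in values; the real pipeline computes these.
deterministic_agent_response = "Drift detected. Retrained model safely."
accuracy, historical_accuracy = 0.87, 0.90

with mlflow.start_run(run_name="agent-mlops-demo"):  # run name is an assumption
    # Parameters hold free-form strings describing the agent's decision.
    mlflow.log_param("deterministic_agent_response", deterministic_agent_response)
    # Metrics are numeric, so MLflow can chart them across runs.
    mlflow.log_metric("accuracy", accuracy)
    mlflow.log_metric("historical_accuracy", historical_accuracy)
```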
The pipeline is built from the following components (the three tool functions are sketched in code after this list):

- `detect_drift` → applies a KS test between old and new predictions.
- `assess_risk` → compares current accuracy to historical accuracy.
- `retrain_model` → retrains a simple PyTorch model on new data.
- LangChain Agent → orchestrates the tools via natural-language queries.
- MLflow logging → ensures reproducibility and experiment tracking.
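Here is a rough sketch of what these tools might look like internally. The signatures, thresholds, and training loop are illustrative assumptions rather than the script's exact code; `numpy`, `scipy`, and `torch` are assumed to be installed:

```python
import numpy as np
import torch
import torch.nn as nn
from scipy.stats import ks_2samp

DRIFT_P_VALUE = 0.05   # assumed significance threshold for the KS test
RISK_TOLERANCE = 0.10  # assumed maximum acceptable accuracy drop

def detect_drift(old_preds: np.ndarray, new_preds: np.ndarray) -> bool:
    """Two-sample KS test: a small p-value means the distributions differ."""
    _, p_value = ks_2samp(old_preds, new_preds)
    return p_value < DRIFT_P_VALUE

def assess_risk(accuracy: float, historical_accuracy: float) -> bool:
    """Low risk if current accuracy has not dropped too far below baseline."""
    return (historical_accuracy - accuracy) <= RISK_TOLERANCE

def retrain_model(X: torch.Tensor, y: torch.Tensor, epochs: int = 100) -> nn.Module:
    """Fit a simple linear model on the new data."""
    model = nn.Linear(X.shape[1], 1)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        optimizer.zero_grad()
        loss = loss_fn(model(X), y)
        loss.backward()
        optimizer.step()
    return model
```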
The workflow is as follows:
- Generate dummy old and new datasets.
- Detect drift (the shift is simulated with added noise).
- If drift is detected:
  - Assess the risk.
  - Retrain the model if it is safe.
- Log results to MLflow.
- Optionally, query the pipeline via the LLM agent for its reasoning steps (see the agent sketch below).
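To make that last step concrete, here is a hedged sketch of how the tools could be wired into a LangChain agent. The agent type, model id, and lambda tool bodies are assumptions, and LangChain's agent APIs have shifted across versions, so adapt this to your installed release:

```python
from langchain.agents import AgentType, Tool, initialize_agent
from langchain_huggingface import HuggingFaceEndpoint  # assumes langchain-huggingface is installed

# Each pipeline step becomes a named tool; the lambdas are stand-ins for the
# real functions in agent-mlops-pipeline.py.
tools = [
    Tool(name="detect_drift", func=lambda _: "drift detected",
         description="Run a KS test between old and new predictions."),
    Tool(name="assess_risk", func=lambda _: "risk is low",
         description="Compare current accuracy against the historical baseline."),
    Tool(name="retrain_model", func=lambda _: "model retrained",
         description="Retrain the PyTorch model on the new data."),
]

# The repo id is an assumption; your HUGGINGFACEHUB_API_TOKEN must have access.
llm = HuggingFaceEndpoint(repo_id="meta-llama/Llama-3.1-8B-Instruct", temperature=0.1)

agent = initialize_agent(tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True)
agent.run("Check for data drift and retrain the model only if it is safe.")
```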
- This demo uses synthetic data and a simple linear model.
- It is designed to illustrate the integration of agentic AI with monitoring, retraining, and experiment tracking.
- Extend it with real datasets and production-grade models for applied research.
This repository is shared for academic and research purposes.