An industrialized ML pipeline that transforms an ML model into a scalable, tested, and containerized microservice.
- ⚡ Quick Start
- 🎯 Project Purpose
- 📁 Project Structure
- 🛠️ Technical Documentation
- 🔄 CI/CD & Quality Control
- 🚀 Future & Tech Stack
If you have Docker installed, you can spin up the entire ecosystem with a single command:
```bash
docker compose -f config/docker-compose.yml up --build
```
- API: http://localhost:5000
- UI: http://localhost:8501
This project demonstrates production-ready MLOps practices rather than focusing solely on achieving state-of-the-art model performance. The Wisconsin Breast Cancer dataset is used as a proof-of-concept to validate the MLOps infrastructure.
The goal is to showcase best practices in:
- Reproducible ML Pipelines: Using scikit-learn pipelines for consistent preprocessing and inference
- API Design: Building robust REST APIs with proper validation and error handling
- Containerization: Multi-service architecture with Docker Compose
- CI/CD: Automated quality gates, testing, and deployment pipelines
- Code Quality: Type checking, linting, and comprehensive testing
The real question this project answers:
"How do I ensure my ML model works the same way in production as it does in development?"
This project solves this through container immutability and environment parity.
```
production-ready-mlops-workflow/
├── .github/workflows/   # 🔄 CI/CD: Quality gates & automated deployment
├── config/              # 🐳 Docker: Multi-container orchestration
├── data/                # 📊 Dataset storage (raw & processed)
├── models/              # 🧠 Trained model artifacts (joblib)
├── notebooks/           # 📓 EDA and experimentation
├── reports/             # 📈 Generated metrics and figures
├── src/                 # 🛠️ Source code
│   ├── app.py           # 🌐 Inference API (Flask)
│   ├── schemas.py       # ✅ Data validation (Pydantic)
│   └── model/           # 🚂 Training and inference logic
├── tests/               # 🧪 Unit & Integration test suite
└── pyproject.toml       # 📦 Dependency management (uv)
```
- Python 3.12+
- Docker and Docker Compose (for containerized deployment)
- `uv` (recommended) or `pip` for dependency management
`uv` is a fast Python package installer written in Rust, offering faster dependency resolution and installation.
Using uv (Recommended)
- Clone the repository:
  ```bash
  git clone https://github.com/anibalrojosan/production-ready-mlops-workflow
  cd production-ready-mlops-workflow
  ```
- Install `uv` globally (if needed):
  ```bash
  # On macOS/Linux:
  curl -LsSf https://astral.sh/uv/install.sh | sh
  # Or via pip:
  pip install uv
  ```
- Install dependencies:
  ```bash
  uv sync --all-groups
  ```
- Activate the virtual env (optional):
  ```bash
  source .venv/bin/activate
  ```
  Note: you can run commands using `uv run` if you don't want to activate the virtual env.
Using pip (Alternative)
- Create and activate a virtual environment:
  ```bash
  python -m venv .venv
  # On Windows:
  .\.venv\Scripts\Activate.ps1
  # On Linux/macOS:
  source .venv/bin/activate
  ```
- Upgrade pip:
  ```bash
  pip install --upgrade pip
  ```
- Install dependencies:
  Option A: using the `requirements.txt` (recommended for production).
  ```bash
  pip install -r requirements.txt
  ```
  Option B: using the `pyproject.toml` (recommended for development).
  ```bash
  pip install .
  ```
The project includes comprehensive tests with a coverage requirement of 80%+.
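The actual test suite lives in `tests/`; for a flavor of the style, a unit test against the Flask API might look like the following. This is a minimal sketch, assuming `src/app.py` exports the Flask application object as `app`; it is not the project's actual test code.

```python
# Illustrative sketch only -- the project's real tests live in tests/.
import pytest

from src.app import app  # assumes the Flask app object is exported as `app`


@pytest.fixture
def client():
    """Flask test client, so endpoints can be exercised without a live server."""
    app.config["TESTING"] = True
    with app.test_client() as test_client:
        yield test_client


def test_health_check_returns_ok(client):
    """GET / should report service and model status with HTTP 200."""
    response = client.get("/")
    assert response.status_code == 200


def test_predict_rejects_malformed_payload(client):
    """Pydantic validation should turn a bad request body into a 4xx error."""
    response = client.post("/predict", json={"not_a_feature": "oops"})
    assert 400 <= response.status_code < 500
```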
Run all tests:
```bash
uv run pytest
```
Run with verbose output:
```bash
uv run pytest -v
```
Run with coverage report:
```bash
uv run pytest --cov=src --cov-report=term-missing
```
Train the ML pipeline and save the model artifact:
```bash
uv run python -m src.model.model_training
```
This will:
- Load and preprocess the data
- Train a Random Forest classifier within a scikit-learn pipeline
- Evaluate the pipeline
- Save the complete trained pipeline to `models/model.joblib`
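In outline, the training step reduces to something like the sketch below. This is illustrative only; the real data loading, preprocessing steps, and evaluation logic live in `src/model/`.

```python
# Minimal sketch of the training flow; the real implementation lives in src/model/.
from pathlib import Path

import joblib
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Load the Wisconsin Breast Cancer dataset (the real script may read it from data/ instead).
X, y = load_breast_cancer(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Preprocessing and model travel together in one Pipeline, so inference
# always applies exactly the same transformations as training.
pipeline = Pipeline([
    ("scaler", StandardScaler()),
    ("classifier", RandomForestClassifier(random_state=42)),
])
pipeline.fit(X_train, y_train)
print(f"Test accuracy: {pipeline.score(X_test, y_test):.3f}")

# Persist the whole pipeline (not just the model) as a single artifact.
Path("models").mkdir(exist_ok=True)
joblib.dump(pipeline, "models/model.joblib")
```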
Note: The model must be trained before running the API.
Start the Flask API locally:
```bash
uv run python -m src.app
```
The API will be accessible at http://127.0.0.1:5000/
The API exposes a POST /predict endpoint that accepts features as JSON and returns the prediction with probabilities. It also includes a GET / health check endpoint to verify service and model status.
For full validation details and data structures, refer to the Pydantic schemas in src/schemas.py.
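For orientation, a schema in that style might look like the following; the field names and constraints here are illustrative placeholders, not the actual definitions from `src/schemas.py`.

```python
# Illustrative sketch in the style of src/schemas.py; field names are hypothetical.
from pydantic import BaseModel, Field


class PredictionRequest(BaseModel):
    """Validated request body for POST /predict."""
    mean_radius: float = Field(..., gt=0, description="Mean radius of cell nuclei")
    mean_texture: float = Field(..., gt=0, description="Mean texture of cell nuclei")
    # ... one field per model feature ...


class PredictionResponse(BaseModel):
    """Response body: predicted class plus class probabilities."""
    prediction: int
    probabilities: dict[str, float]
```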
Example using test scripts:
- Linux/macOS:
  ```bash
  ./tests/integration/bash_test.sh
  ```
- Windows PowerShell:
  ```powershell
  .\tests\integration\powershell_test.ps1
  ```
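If you prefer Python, a rough equivalent of those scripts might look like this sketch; the feature names in the payload are placeholders for the real ones defined in `src/schemas.py`.

```python
# Rough Python equivalent of the shell test scripts; feature names are placeholders.
import requests

BASE_URL = "http://127.0.0.1:5000"

# Health check: verifies the service is up and the model is loaded.
health = requests.get(f"{BASE_URL}/", timeout=5)
print(health.status_code, health.json())

# Prediction request: the payload must match the Pydantic schema in src/schemas.py.
payload = {"mean_radius": 14.1, "mean_texture": 19.3}  # placeholder features
response = requests.post(f"{BASE_URL}/predict", json=payload, timeout=5)
print(response.status_code, response.json())
```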
The Streamlit application provides an interactive web interface for making predictions:
```bash
streamlit run src/streamlit_app.py
```
Ensure the Flask API is running first. The UI will open at http://localhost:8501.
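Under the hood, the UI is a thin client that delegates inference to the API. The pattern is roughly the following simplified sketch; it is not the actual contents of `src/streamlit_app.py`, and the input fields are placeholders.

```python
# Simplified sketch of the UI-to-API pattern; not the actual src/streamlit_app.py.
import requests
import streamlit as st

st.title("Breast Cancer Prediction")

# Collect feature values from the user (placeholder features).
mean_radius = st.number_input("Mean radius", min_value=0.0, value=14.0)
mean_texture = st.number_input("Mean texture", min_value=0.0, value=19.0)

if st.button("Predict"):
    # Delegate inference to the Flask API rather than loading the model here.
    response = requests.post(
        "http://localhost:5000/predict",
        json={"mean_radius": mean_radius, "mean_texture": mean_texture},
        timeout=5,
    )
    if response.ok:
        st.json(response.json())
    else:
        st.error(f"API error: {response.status_code}")
```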
The project uses Docker Compose to orchestrate both the Flask API and Streamlit UI services.
```bash
docker compose -f config/docker-compose.yml up --build -d
```
This will:
- Build optimized images using multi-stage Docker builds
- Start both API and Streamlit services
- Make API available at http://localhost:5000/
- Make Streamlit UI available at http://localhost:8501/
Stop the services:
```bash
docker compose -f config/docker-compose.yml down
```
The project implements a continuous integration pipeline that acts as a quality filter (Quality Gates):
- Static Analysis: `ruff` for linting and `mypy` for strict typing.
- Automated Testing: `pytest` with a minimum coverage requirement of 80%.
- Container Security: Multi-stage Docker builds for lightweight and secure images.
- Integration Tests: Endpoint validation in isolated containers before deployment.
The workflow defined in `.github/workflows/main.yml` includes:
- Linting: `ruff` for code style and quality
- Type Checking: `mypy` for static type analysis
- Testing: `pytest` with coverage reporting
- Coverage Requirement: 80% minimum (pipeline fails if below)
- Train Model: Train and save the model artifact
- Build Docker Images: Create optimized container images
- Push to Docker Hub: Store images in the registry
- Integration Testing: Test services in isolated containers
- Health Checks: Verify API and UI endpoints (see the sketch below)
- Cleanup: Remove test containers
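For illustration, the health-check step amounts to polling the two endpoints until they respond. A hypothetical version of such a check is sketched below; the actual logic is defined in the workflow file.

```python
# Hypothetical CI health check; the actual checks live in .github/workflows/main.yml.
import sys
import time

import requests

SERVICES = {
    "api": "http://localhost:5000/",
    "ui": "http://localhost:8501/",
}


def wait_until_healthy(name: str, url: str, retries: int = 10, delay: float = 3.0) -> bool:
    """Poll an endpoint until it returns HTTP 200 or retries are exhausted."""
    for _ in range(retries):
        try:
            if requests.get(url, timeout=5).status_code == 200:
                print(f"{name} is healthy at {url}")
                return True
        except requests.ConnectionError:
            pass  # container may still be starting up
        time.sleep(delay)
    return False


if not all(wait_until_healthy(name, url) for name, url in SERVICES.items()):
    sys.exit(1)  # fail the pipeline if any service never came up
```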
This ensures that only tested and validated code reaches production.
🔮 Future Improvements
Potential enhancements to further strengthen the MLOps workflow:
- Model Versioning: Implement MLflow for experiment tracking and model registry.
- Monitoring: Add model performance monitoring and drift detection.
- A/B Testing: Framework for comparing model versions in production.
- Feature Store: Centralized feature management for multiple models.
- Automated Retraining: Scheduled retraining based on data drift or performance degradation.
📚 Technologies Used
- ML Stack: `scikit-learn` (Pipeline & Models), `pandas`, `joblib`.
- Backend & UI: `Flask` (Inference API), `Streamlit` (Interactive Dashboard).
- Modern Tooling: `uv` (Package Manager), `ruff` (Linter), `mypy` (Type Checking), `pydantic` (Validation).
- Testing: `pytest`, `pytest-mock`, `pytest-cov`.
- Infrastructure: `Docker`, `Docker Compose`, `GitHub Actions`.
This project was developed with ❤️ by Anibal Rojo as a proof of concept for a real-world MLOps pipeline.
Remember: The value of this project is in the engineering practices, not the model metrics. These practices ensure your ML models work reliably in production, regardless of the problem domain or dataset complexity.