This repository provides an end-to-end MLOps pipeline for managing, tracking, and automating machine learning experiments.
The project integrates MLflow, Weights & Biases (W&B), SQL for experiment analysis, and CI/CD automation.
MLOps_Project/
├── data/ # Raw and processed datasets
├── mlruns/ # MLflow tracking logs
├── models/ # Saved models
├── notebook/ # Jupyter notebooks for analysis
├── sql_queries/ # SQL scripts for MLflow experiments analysis
├── src/ # Core source code
│ ├── clean_data/ # Data preprocessing scripts
│ ├── download_data/ # Data downloading scripts
│ ├── feature_engineering/ # Feature transformation scripts
│ ├── model_training/ # Model training scripts
│ ├── model_deployment/ # API for model deployment
│ └── utils/ # Utility functions
├── sweeps/ # W&B sweep scripts for hyperparameter tuning
├── wandb/ # Weights & Biases logs
├── config/ # Project configuration files
├── .github/workflows/ # CI/CD pipeline
├── .gitignore # Ignore unnecessary files
├── environment.yaml # Conda environment dependencies
├── remove_russian_comments.py # Script to remove Russian comments from the code
├── requirements.txt # Python dependencies
└── README.md # Project documentation
- MLflow – Experiment tracking and model registry
- Weights & Biases (W&B) – Logging and hyperparameter sweeps
- PostgreSQL – SQL for tracking and querying experiments
- XGBoost – Machine learning model
- Python – Main programming language
- GitHub Actions ⚙️ – CI/CD automation
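To illustrate how the W&B hyperparameter sweeps listed above might be set up for an XGBoost model, here is a minimal sketch. The metric name `val_rmse`, the search ranges, the project name, and the `launch_sweep` helper are all assumptions made for this example; they are not taken from `sweeps/sweep.py`.

```python
"""Hypothetical W&B sweep definition for an XGBoost model (illustrative only)."""

# Sweep configuration in the dictionary format that W&B sweeps accept.
# The metric name and all search ranges are assumptions, not values from this repo.
SWEEP_CONFIG = {
    "method": "bayes",  # other supported methods: "grid", "random"
    "metric": {"name": "val_rmse", "goal": "minimize"},
    "parameters": {
        "max_depth": {"values": [3, 5, 7, 9]},
        "learning_rate": {
            "distribution": "log_uniform_values",
            "min": 0.01,
            "max": 0.3,
        },
        "n_estimators": {"distribution": "int_uniform", "min": 100, "max": 600},
        "subsample": {"distribution": "uniform", "min": 0.6, "max": 1.0},
    },
}


def launch_sweep(train_fn, project="mlops_project", count=10):
    """Register the sweep with W&B and run `count` trials of `train_fn`.

    `train_fn` is a hypothetical training function that calls `wandb.init()`,
    reads hyperparameters from `wandb.config`, and logs `val_rmse`.
    """
    import wandb  # imported lazily so the config can be inspected without wandb

    sweep_id = wandb.sweep(sweep=SWEEP_CONFIG, project=project)
    wandb.agent(sweep_id, function=train_fn, project=project, count=count)
    return sweep_id
```

Keeping the configuration as a plain dictionary makes it easy to review the search space in a pull request before any trial is launched.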
git clone https://github.com/your-username/mlops_project.git
cd mlops_project
conda env create -f environment.yaml
conda activate mlops_env
OR
python -m venv venv
source venv/bin/activate # On macOS/Linux
venv\Scripts\activate # On Windows
pip install -r requirements.txt
python src/clean_data/run.py
python src/model_training/run.py
python sweeps/sweep.py
mlflow ui --host 0.0.0.0 --port 5000
Then open http://localhost:5000 in your browser.
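The cleaning, training, and sweep scripts above can also be chained from a single Python entry point. The following is only a sketch: the script paths come from the commands above, while `PIPELINE_STEPS`, `build_commands`, and `run_pipeline` are hypothetical names that do not exist in the repository.

```python
import subprocess
import sys

# Pipeline steps in the order they are run above (paths taken from the README).
PIPELINE_STEPS = [
    ("clean data", "src/clean_data/run.py"),
    ("train model", "src/model_training/run.py"),
    ("run sweep", "sweeps/sweep.py"),
]


def build_commands(python=sys.executable):
    """Return the command line for each step, using the given interpreter."""
    return [[python, script] for _, script in PIPELINE_STEPS]


def run_pipeline(dry_run=False):
    """Run every step in order and stop at the first failure."""
    for (name, _), cmd in zip(PIPELINE_STEPS, build_commands()):
        print(f"[{name}] {' '.join(cmd)}")
        if not dry_run:
            # check=True raises CalledProcessError on a non-zero exit code,
            # so later steps never run on top of a failed one.
            subprocess.run(cmd, check=True)
```

Calling `run_pipeline(dry_run=True)` from the repository root prints the commands without executing them, which is a cheap way to check the order before a long training run.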
✅ MLflow & W&B integration
✅ SQL experiment analysis
✅ CI/CD with GitHub Actions
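As a rough illustration of the SQL experiment analysis feature, the sketch below ranks finished runs by their best metric value. It rests on several assumptions: an in-memory SQLite database stands in for PostgreSQL so the example runs anywhere, the `runs`, `params`, and `metrics` tables are a simplified schema loosely modelled on a database-backed MLflow tracking store, and all rows are invented sample data. The actual scripts in `sql_queries/` may look quite different.

```python
import sqlite3

# In-memory SQLite stands in for PostgreSQL (assumption for portability).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE runs (run_uuid TEXT PRIMARY KEY, name TEXT, status TEXT);
    CREATE TABLE params (run_uuid TEXT, key TEXT, value TEXT);
    CREATE TABLE metrics (run_uuid TEXT, key TEXT, value REAL, step INTEGER);
    """
)

# Invented sample data: two finished runs and one failed run.
conn.executemany(
    "INSERT INTO runs VALUES (?, ?, ?)",
    [("a1", "baseline", "FINISHED"), ("b2", "deep_trees", "FINISHED"), ("c3", "crashed", "FAILED")],
)
conn.executemany(
    "INSERT INTO params VALUES (?, ?, ?)",
    [("a1", "max_depth", "3"), ("b2", "max_depth", "8"), ("c3", "max_depth", "12")],
)
conn.executemany(
    "INSERT INTO metrics VALUES (?, ?, ?, ?)",
    [("a1", "rmse", 0.52, 0), ("a1", "rmse", 0.41, 1),
     ("b2", "rmse", 0.47, 0), ("b2", "rmse", 0.36, 1),
     ("c3", "rmse", 0.90, 0)],
)

# Best (lowest) RMSE per finished run, joined with the max_depth parameter.
BEST_RUNS_SQL = """
SELECT r.name, p.value AS max_depth, MIN(m.value) AS best_rmse
FROM runs r
JOIN metrics m ON m.run_uuid = r.run_uuid AND m.key = 'rmse'
JOIN params p ON p.run_uuid = r.run_uuid AND p.key = 'max_depth'
WHERE r.status = 'FINISHED'
GROUP BY r.run_uuid, r.name, p.value
ORDER BY best_rmse ASC
"""

best_runs = conn.execute(BEST_RUNS_SQL).fetchall()
for name, max_depth, best_rmse in best_runs:
    print(f"{name}: max_depth={max_depth}, best rmse={best_rmse}")
```

The query itself uses only standard SQL, so under the same schema assumptions it should carry over to PostgreSQL with a different connection object.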
This project is distributed under the MIT License. Feel free to use the code! 🚀
💻 GitHub Repository: Evgenii Matveev
🌐 Portfolio: Data Science Portfolio
📌 LinkedIn: Evgenii Matveev
🔥 If you like this project, don't forget to star ⭐ the repository! 🔥