This repository demonstrates an end-to-end implementation of hyperparameter tuning using Bayesian optimization with Optuna and MLflow for experiment tracking.
The project provides a framework for tuning machine learning models systematically rather than by manual trial and error. It leverages:
- MLflow for experiment tracking, model versioning, and artifact management
- Optuna for efficient hyperparameter optimization with Bayesian search
- Scikit-learn for model training and evaluation
- Hyperparameter tuning using Bayesian optimization
- Experiment tracking and model versioning with MLflow
- Model evaluation and metrics visualization
- Support for multiple optimization objectives
- Integration with popular ML frameworks
This project requires Python 3 and the following packages (included in requirements.txt):
- mlflow
- optuna
- scikit-learn
- pandas
- numpy
- matplotlib
- Clone this repository
- Install dependencies:
pip install -r requirements.txt
- Configure your experiment parameters in the configuration file
- Run hyperparameter optimization as specified in the Jupyter notebook:
notebooks/mlflow_experiments_bayesiansearch_hp_tuning.ipynb
This notebook implements Bayesian optimization with Optuna to tune RandomForestClassifier hyperparameters including:
- n_estimators
- max_features
- max_depth
- max_samples
- bootstrap
- min_samples_split
- min_samples_leaf
- Track experiments using MLflow:
mlflow ui --host 0.0.0.0 --port 5000
Access the MLflow UI at http://localhost:5000/
- Data preprocessing and feature engineering
- Define model architecture and tunable parameters
- Run Optuna optimization with Bayesian search
- Track all experiments with MLflow
- Select the best model based on evaluation metrics
- Export and save the optimized model
The project also uses other tools listed in requirements.txt:
- Pandas and NumPy for data manipulation
- Matplotlib for visualization
MIT License
- Pima Indians Diabetes Database for model training and evaluation
- MLflow and Optuna open-source communities