
Copilot AI commented Dec 9, 2025

Summary

Extends MLflow tracking to sklearn-based classification tasks (SklearnClassification and PatchSklearnClassification). Previously, only Lightning-based tasks had MLflow support, which limited experiment traceability for sklearn models.

Implementation:

  • Added MLflow logger infrastructure to the base Task class with log_hyperparameters(), log_metrics(), and log_artifact() methods (see the sketch after this list)
  • Integrated logging into the sklearn classification tasks during training and report generation
  • Logs model parameters, validation accuracy, and git metadata, and uploads confusion matrices and result files
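
A minimal sketch of what this base-class infrastructure could look like. The method names come from this PR, but the constructor signature, the guard attribute, and the MLflow calls are assumptions for illustration, not the actual quadra implementation:

from typing import Any, Optional

from pytorch_lightning.loggers.mlflow import MLFlowLogger


class Task:
    """Base task with optional MLflow tracking (hypothetical sketch)."""

    def __init__(self, mlflow_logger: Optional[MLFlowLogger] = None):
        # With no logger configured, every log_* call is a silent no-op.
        self.mlflow_logger = mlflow_logger

    def log_hyperparameters(self, params: dict[str, Any]) -> None:
        if self.mlflow_logger is not None:
            self.mlflow_logger.log_hyperparams(params)

    def log_metrics(self, metrics: dict[str, float]) -> None:
        if self.mlflow_logger is not None:
            self.mlflow_logger.log_metrics(metrics)

    def log_artifact(self, path: str) -> None:
        if self.mlflow_logger is not None:
            # MLFlowLogger exposes the underlying MlflowClient and run id.
            self.mlflow_logger.experiment.log_artifact(self.mlflow_logger.run_id, path)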

Usage:

logger:
  mlflow:
    _target_: pytorch_lightning.loggers.mlflow.MLFlowLogger
    experiment_name: ${core.name}
    tracking_uri: ${oc.env:MLFLOW_TRACKING_URI}

core:
  upload_artifacts: true
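
For reference, Hydra instantiates the logger entry above roughly like the following hand-written equivalent (the experiment name is a placeholder; Hydra/OmegaConf resolve ${core.name} and ${oc.env:MLFLOW_TRACKING_URI} at runtime):

import os

from pytorch_lightning.loggers.mlflow import MLFlowLogger

logger = MLFlowLogger(
    experiment_name="my_experiment",  # placeholder for ${core.name}
    tracking_uri=os.environ["MLFLOW_TRACKING_URI"],
)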

Benefits:

  • Feature parity with Lightning tasks for sklearn models
  • Full experiment tracking with hyperparameters, metrics, and artifacts
  • Git commit tracking for reproducibility
  • 100% backward compatible (optional feature)

Type of Change

  • New feature (non-breaking change that adds functionality)

Checklist

  • I have tested my changes locally and they work as expected. (Added parametrized tests for both SklearnClassification and PatchSklearnClassification verifying logger initialization, hyperparameter/metric logging, and artifact upload)
  • I have added unit tests for my changes, or updated existing tests if necessary.
  • I have updated the documentation, if applicable.
  • I have installed pre-commit and run locally for my code changes.

Additional Information (Optional)

Files Modified:

  • quadra/tasks/base.py (+128 lines) - MLflow logger infrastructure
  • quadra/tasks/classification.py (+47 lines) - SklearnClassification integration
  • quadra/tasks/patch.py (+36 lines) - PatchSklearnClassification integration
  • tests/tasks/test_mlflow_sklearn.py (+115 lines) - Comprehensive test coverage

What Gets Logged:

  • Hyperparameters: model params, classifier type, device, precision, git info, config choices
  • Metrics: validation accuracy per split (sklearn) or overall (patch)
  • Artifacts: confusion matrices, results, configs, reconstruction outputs

All logging methods safely handle cases where MLflow is not configured (no-op behavior).
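
Putting it together, a run could use these methods as sketched below. This reuses the hypothetical Task class from the Implementation section; the dataset, model, and the file:./mlruns fallback are illustrative assumptions rather than quadra code:

import os

from pytorch_lightning.loggers.mlflow import MLFlowLogger
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

logger = MLFlowLogger(
    experiment_name="sklearn_demo",
    tracking_uri=os.environ.get("MLFLOW_TRACKING_URI", "file:./mlruns"),
)
task = Task(mlflow_logger=logger)  # Task from the sketch above

X, y = load_iris(return_X_y=True)
X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)
model = LogisticRegression(max_iter=200).fit(X_train, y_train)

task.log_hyperparameters({"classifier": type(model).__name__, **model.get_params()})
task.log_metrics({"val_accuracy": float(accuracy_score(y_val, model.predict(X_val)))})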

Original prompt

This section details the original issue to resolve.

<issue_title>Integrate mlflow in sklearn classification training</issue_title>
<issue_description>### Description & Motivation

Right now we can track models on MLflow for all Lightning-based tasks, but we can't do it for sklearn-based classification and patch classification.

It would be a nice addition to have this feature for these two tasks as well, as it would improve the traceability of our models.</issue_description>

Comments on the Issue (you are @copilot in this section)



Copilot AI changed the title [WIP] Integrate mlflow in sklearn classification training Integrate MLflow tracking for sklearn classification tasks Dec 9, 2025
Copilot AI requested a review from lorenzomammana December 9, 2025 15:10
