Conversation

auguste-probabl (Contributor) commented Apr 2, 2025

This PR makes it possible to pass a list of CrossValidationReports to the ComparisonReport (which currently only accepts EstimatorReports).

For now, plots like Comparison.metrics.roc_curve are not supported for ComparisonReports holding CrossValidationReports.


Example:

>>> from sklearn.datasets import make_classification
>>> from sklearn.linear_model import LogisticRegression
>>> from skore import ComparisonReport, CrossValidationReport
>>> X, y = make_classification(random_state=42)
>>> estimator_1 = LogisticRegression()
>>> estimator_2 = LogisticRegression(C=2)  # Different regularization
>>> report_1 = CrossValidationReport(estimator_1, X, y)
>>> report_2 = CrossValidationReport(estimator_2, X, y)
>>> report = ComparisonReport({"model1": report_1, "model2": report_2})
>>> report.metrics.report_metrics(aggregate=None)
                                                    Value
Metric       Label / Average Estimator Split
Precision    0               model1    Split #0  1.000000
                                       Split #1  1.000000
                                       Split #2  1.000000
                                       Split #3  1.000000
                                       Split #4  0.909091
...                                                   ...
Predict time                 model2    Split #0  0.000185
                                       Split #1  0.000177
                                       Split #2  0.000187
                                       Split #3  0.000177
                                       Split #4  0.000188

[80 rows x 1 columns]
>>> report.metrics.report_metrics()
                                            mean       std
Metric       Label / Average Estimator
Precision    0               model1     0.981818  0.040656
                             model2     1.000000  0.000000
             1               model1     0.981818  0.040656
                             model2     0.981818  0.040656
Recall       0               model1     0.980000  0.044721
                             model2     0.980000  0.044721
             1               model1     0.980000  0.044721
                             model2     1.000000  0.000000
ROC AUC                      model1     0.996000  0.008944
                             model2     0.996000  0.008944
Brier score                  model1     0.022532  0.013130
                             model2     0.018731  0.014566
Fit time                     model1     0.001959  0.000559
                             model2     0.002603  0.000185
Predict time                 model1     0.000198  0.000014
                             model2     0.000183  0.000005

Closes #1414

Todo:

  • Coverage
  • mypy
  • Docs probably
  • Fix bug with progress bars not getting cleaned up
  • Format tables properly, as described in #1414 (feat(ComparisonReport): Be able to compare several CrossValidationReport)
  • Make plots work (deferred)
    • Add NotImplementedError to plots
  • Test what happens when aggregate is passed to ComparisonReport[EstimatorReport]
  • Test what happens when X, y is passed to ComparisonReport[CrossValidationReport]
  • Add doctests to ComparisonReport showcasing ComparisonReport[CrossValidationReport]
  • Add timings
  • Check whether some estimators share the same name and, if so, disambiguate them when initializing report.report_names_ (see the sketch after this list)
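
A hypothetical sketch of the last item above; `dedupe_names` is an invented helper, not code from this PR:

```
# Hypothetical helper, not part of the PR: append an index when several
# estimators share the same display name.
from collections import Counter

def dedupe_names(names):
    counts = Counter(names)
    seen = Counter()
    out = []
    for name in names:
        if counts[name] > 1:
            seen[name] += 1
            out.append(f"{name}_{seen[name]}")
        else:
            out.append(name)
    return out

print(dedupe_names(["LogisticRegression", "LogisticRegression", "SVC"]))
# ['LogisticRegression_1', 'LogisticRegression_2', 'SVC']
```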

github-actions bot (Contributor) commented Apr 2, 2025

Documentation preview @ 19b6926

auguste-probabl force-pushed the cross-validation-comparison branch 2 times, most recently from 7f0839b to ee1be03 on April 4, 2025 10:09
auguste-probabl force-pushed the cross-validation-comparison branch 2 times, most recently from 4a3ab4e to d71886e on April 4, 2025 16:01
auguste-probabl (Contributor, Author) commented Apr 7, 2025

Discussed IRL: the code common to ComparisonReport and CrossValidationComparisonReport should be factored into a BaseComparisonReport.
It makes more sense for us to have just ComparisonReport, which can take either EstimatorReports or CVReports.

Comment on lines +189 to +203
if len(set(id(report) for report in reports_list)) < len(reports_list):
    raise ValueError("Expected reports to be distinct objects")
auguste-probabl (Contributor, Author) commented Apr 11, 2025

This new constraint results from #1536. If the same report is passed twice, resetting the progress bar at the end of the first CVReport computation (see below) also clears the second CVReport's progress object, since both are the same object; when the second computation starts, its progress object is None, whereas it should be the ComparisonReport's progress object.

self_obj._parent_progress = None
self_obj._progress_info = None
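
A minimal sketch of the aliasing problem, with invented names rather than skore's actual code:

```
# Invented names; only the aliasing behaviour is the point.
class FakeCVReport:
    _parent_progress = None

report = FakeCVReport()
reports_list = [report, report]  # the same object twice

parent_progress = object()  # stands in for the ComparisonReport's progress
for r in reports_list:
    r._parent_progress = parent_progress  # set up front for every entry

# End of the first computation: the reset shown above runs...
reports_list[0]._parent_progress = None
# ...and the second entry has lost its progress too, being the same object.
assert reports_list[1]._parent_progress is None
```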

auguste-probabl (Contributor, Author) commented:

Note that right now this constraint is only really useful when comparing CVReports, not when comparing EstimatorReports. I didn't put it in the if-block because I can imagine that one day EstimatorReports might have more progress bars of their own.

@@ -65,7 +67,7 @@ def test_comparison_report_without_testing_data(binary_classification_model):
estimator, X_train, _, y_train, _ = binary_classification_model
estimator_report = EstimatorReport(estimator, X_train=X_train, y_train=y_train)

report = ComparisonReport([estimator_report, estimator_report])
auguste-probabl (Contributor, Author) commented:

This is necessary because of the new constraint that all reports must be distinct; see the sketch below.
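
A self-contained sketch of how such a test reads under the new constraint (stand-in data, not the exact test code):

```
# Two *distinct* report objects, even over identical estimators, are accepted;
# only passing the very same object twice is rejected.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from skore import ComparisonReport, EstimatorReport

X, y = make_classification(random_state=42)
report_1 = EstimatorReport(LogisticRegression(), X_train=X, y_train=y)
report_2 = EstimatorReport(LogisticRegression(), X_train=X, y_train=y)
report = ComparisonReport([report_1, report_2])  # distinct objects: OK
```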

auguste-probabl force-pushed the cross-validation-comparison branch 2 times, most recently from 218335a to 33d1d82 on April 11, 2025 10:06
thomass-dev pushed a commit that referenced this pull request Apr 11, 2025
This originates from a bug when implementing comparison of
CrossValidationReport in #1512

Because CrossValidationReports hold a `_parent_progress`, when a
ComparisonReport creates a progress bar to iterate over
CrossValidationReports, the outer progress bar conflicts with the inner
progress bars, and rich refuses to proceed.

The solution is for the ComparisonReport to explicitly set its inner
CrossValidationReports' progress instance, so that in total there is
only one progress instance.

But before this change, the progress instance was sometimes owned by a
`CrossValidationReport.metrics` accessor. This is a problem because
accessors are re-instantiated whenever they are accessed, so their state
cannot be modified from the parent.

The solution this change implements is to remove all `progress`-related
attributes from all accessors, and to ensure that the progress instance
is only owned by the Report object, not by any of its accessors.
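
A minimal sketch of the ownership pattern described above, using rich directly and invented class names rather than skore's internals:

```
# Invented class names; only the single-Progress ownership pattern is the point.
from rich.progress import Progress

class Child:
    def __init__(self, name):
        self.name = name
        self._progress = None  # always injected by the parent, never created here

    def compute(self):
        task = self._progress.add_task(self.name, total=3)
        for _ in range(3):
            self._progress.update(task, advance=1)

class Parent:
    def __init__(self, children):
        self.children = children

    def compute(self):
        # The only live Progress instance; opening a second one would make
        # rich raise LiveError ("Only one live display may be active at once").
        with Progress() as progress:
            outer = progress.add_task("compare", total=len(self.children))
            for child in self.children:
                child._progress = progress  # inject the shared instance
                child.compute()
                progress.update(outer, advance=1)

Parent([Child("model1"), Child("model2")]).compute()
```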
auguste-probabl force-pushed the cross-validation-comparison branch 2 times, most recently from cd3f989 to 1e157f4 on April 11, 2025 13:06
auguste-probabl force-pushed the cross-validation-comparison branch from 4b376f5 to 19b6926 on April 29, 2025 16:57
glemaitre self-requested a review on April 30, 2025 08:27
glemaitre (Member) left a comment:

Let's merge now. @auguste-probabl, could you open an issue to track the progress-bar improvement/refactoring so it isn't forgotten?

glemaitre merged commit de10944 into probabl-ai:main on Apr 30, 2025
27 checks passed
github-merge-queue bot pushed a commit that referenced this pull request Apr 30, 2025
Muhammad-Rebaal pushed a commit to Muhammad-Rebaal/skore that referenced this pull request May 5, 2025
Signed-off-by: Auguste Baum <[email protected]>
Co-authored-by: Guillaume Lemaitre <[email protected]>
Muhammad-Rebaal pushed a commit to Muhammad-Rebaal/skore that referenced this pull request May 5, 2025
Muhammad-Rebaal pushed a commit to Muhammad-Rebaal/skore that referenced this pull request May 5, 2025
…les (probabl-ai#1623)

Following probabl-ai#1512

With the help of @auguste-probabl:

Co-authored-by: auguste-probabl <[email protected]>

fix: Fix `cannot cache unpickable configuration value` warning - Sphinx make html process (probabl-ai#1584)

chore(dependencies): GITHUB-ACTIONS: Bump MishaKav/pytest-coverage-comment from 1.1.53 to 1.1.54 (probabl-ai#1627)

Bumps [MishaKav/pytest-coverage-comment](https://github.com/mishakav/pytest-coverage-comment) from 1.1.53 to 1.1.54.

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

fix: Set size limit of `DiskCacheStorage` from 1 GB to unlimited (probabl-ai#1625)

Fixes failing pipelines from
probabl-ai#1617.

---

https://grantjenks.com/docs/diskcache/tutorial.html#settings
https://grantjenks.com/docs/diskcache/tutorial.html#eviction-policies

```
size_limit, default one gigabyte. The maximum on-disk size of the cache.

cull_limit, default ten. The maximum number of keys to cull when adding a new item.

    Set to zero to disable automatic culling.

eviction_policy, default “least-recently-stored”. The setting to determine [eviction policy](https://grantjenks.com/docs/diskcache/tutorial.html#tutorial-eviction-policies).

    "none" disables cache evictions. Caches will grow without bound.
```
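
A minimal sketch of lifting the limit, assuming the settings quoted above and an arbitrary cache directory:

```
# Based on the diskcache settings quoted above; the directory is arbitrary.
from diskcache import Cache

cache = Cache(
    "/tmp/skore-cache",      # hypothetical location
    eviction_policy="none",  # "none" disables evictions; cache grows unbounded
    cull_limit=0,            # zero disables automatic culling on insert
)
```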

docs: Avoid failure when pytorch load weights from TextEncoder with parallelism (probabl-ai#1617)

This is a workaround for a failure that sometimes happens when
requesting `transform` from `TextEncoder` in parallel.

In short, PyTorch's weight loading sometimes fails: some tensors are
loaded on a meta device (metadata only) and the actual weights are not
loaded. I have not yet found the root cause of why the weights are not
loaded properly.

In the meantime, we can avoid this issue by storing the weights. We
should monitor that loading the language model several times does not
blow up the RAM on the GitHub Actions runners. Otherwise, we will most
probably need to deactivate the parallelism.

Co-authored-by: Thomas S. <[email protected]>
auguste-probabl deleted the cross-validation-comparison branch on May 13, 2025 09:34