Conversation

auguste-probabl (Contributor) commented Apr 2, 2025

This PR makes it possible to pass a list of CrossValidationReports to the ComparisonReport (which currently only accepts EstimatorReports).

For now, plots like Comparison.metrics.roc_curve are not supported for ComparisonReports holding CrossValidationReports.


Example:

>>> from sklearn.datasets import make_classification
>>> from sklearn.linear_model import LogisticRegression
>>> from skore import ComparisonReport, CrossValidationReport
>>> X, y = make_classification(random_state=42)
>>> estimator_1 = LogisticRegression()
>>> estimator_2 = LogisticRegression(C=2)  # Different regularization
>>> report_1 = CrossValidationReport(estimator_1, X, y)
>>> report_2 = CrossValidationReport(estimator_2, X, y)
>>> report = ComparisonReport({"model1": report_1, "model2": report_2})
>>> report.metrics.report_metrics(aggregate=None)
                                                    Value
Metric       Label / Average Estimator Split
Precision    0               model1    Split #0  1.000000
                                       Split #1  1.000000
                                       Split #2  1.000000
                                       Split #3  1.000000
                                       Split #4  0.909091
...                                                   ...
Predict time                 model2    Split #0  0.000185
                                       Split #1  0.000177
                                       Split #2  0.000187
                                       Split #3  0.000177
                                       Split #4  0.000188

[80 rows x 1 columns]
>>> report.metrics.report_metrics()
                                            mean       std
Metric       Label / Average Estimator
Precision    0               model1     0.981818  0.040656
                             model2     1.000000  0.000000
             1               model1     0.981818  0.040656
                             model2     0.981818  0.040656
Recall       0               model1     0.980000  0.044721
                             model2     0.980000  0.044721
             1               model1     0.980000  0.044721
                             model2     1.000000  0.000000
ROC AUC                      model1     0.996000  0.008944
                             model2     0.996000  0.008944
Brier score                  model1     0.022532  0.013130
                             model2     0.018731  0.014566
Fit time                     model1     0.001959  0.000559
                             model2     0.002603  0.000185
Predict time                 model1     0.000198  0.000014
                             model2     0.000183  0.000005

Closes #1414

Todo:

  • Coverage
  • mypy
  • Docs probably
  • Fix bug with progress bars not getting cleaned up
  • Format tables properly, as described in #1414 (feat(ComparisonReport): Be able to compare several CrossValidationReport)
  • Make plots work (deferred)
    • Add NotImplementedError to plots
  • Test what happens when aggregate is passed to ComparisonReport[EstimatorReport]
  • Test what happens when X, y is passed to ComparisonReport[CrossValidationReport]
  • Add doctests to ComparisonReport showcasing ComparisonReport[CrossValidationReport]
  • Add timings
  • Check whether some estimators share the same name and, if so, disambiguate them when initializing report.report_names_ (see the sketch after this list)
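
A hypothetical sketch of the last item above; `dedupe_names` is an invented helper, not code from this PR:

```
# Hypothetical helper, not part of the PR: append an index when several
# estimators share the same display name.
from collections import Counter

def dedupe_names(names):
    counts = Counter(names)
    seen = Counter()
    out = []
    for name in names:
        if counts[name] > 1:
            seen[name] += 1
            out.append(f"{name}_{seen[name]}")
        else:
            out.append(name)
    return out

print(dedupe_names(["LogisticRegression", "LogisticRegression", "SVC"]))
# ['LogisticRegression_1', 'LogisticRegression_2', 'SVC']
```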

github-actions bot (Contributor) commented Apr 2, 2025

Documentation preview @ 19b6926

auguste-probabl force-pushed the cross-validation-comparison branch 2 times, most recently from 7f0839b to ee1be03 on April 4, 2025 10:09
auguste-probabl force-pushed the cross-validation-comparison branch 2 times, most recently from 4a3ab4e to d71886e on April 4, 2025 16:01
auguste-probabl (Contributor, Author) commented Apr 7, 2025

Discussed IRL: the code common to ComparisonReport and CrossValidationComparisonReport should be factored into a BaseComparisonReport.
It makes more sense for us to have just ComparisonReport, which can take either EstimatorReports or CVReports.

Comment on lines +189 to +203
if len(set(id(report) for report in reports_list)) < len(reports_list):
    raise ValueError("Expected reports to be distinct objects")
auguste-probabl (Contributor, Author) commented Apr 11, 2025

This new constraint results from #1536. If the same report is passed twice, resetting the progress bar at the end of the first CVReport computation (see below) also clears the second CVReport's progress object, since both are the same object; when the second computation starts, its progress object is None, whereas it should be the ComparisonReport's progress object.

self_obj._parent_progress = None
self_obj._progress_info = None
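
A minimal sketch of the aliasing problem, with invented names rather than skore's actual code:

```
# Invented names; only the aliasing behaviour is the point.
class FakeCVReport:
    _parent_progress = None

report = FakeCVReport()
reports_list = [report, report]  # the same object twice

parent_progress = object()  # stands in for the ComparisonReport's progress
for r in reports_list:
    r._parent_progress = parent_progress  # set up front for every entry

# End of the first computation: the reset shown above runs...
reports_list[0]._parent_progress = None
# ...and the second entry has lost its progress too, being the same object.
assert reports_list[1]._parent_progress is None
```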

auguste-probabl (Contributor, Author) commented:

Note that right now this constraint is only really useful when comparing CVReports, not when comparing EstimatorReports. I didn't put it in the if-block because I can imagine that one day EstimatorReports might have more progress bars of their own.

@@ -65,7 +67,7 @@ def test_comparison_report_without_testing_data(binary_classification_model):
estimator, X_train, _, y_train, _ = binary_classification_model
estimator_report = EstimatorReport(estimator, X_train=X_train, y_train=y_train)

report = ComparisonReport([estimator_report, estimator_report])
auguste-probabl (Contributor, Author) commented:

This is necessary because of the new constraint that all reports must be distinct; see the sketch below.
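
A self-contained sketch of how such a test reads under the new constraint (stand-in data, not the exact test code):

```
# Two *distinct* report objects, even over identical estimators, are accepted;
# only passing the very same object twice is rejected.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from skore import ComparisonReport, EstimatorReport

X, y = make_classification(random_state=42)
report_1 = EstimatorReport(LogisticRegression(), X_train=X, y_train=y)
report_2 = EstimatorReport(LogisticRegression(), X_train=X, y_train=y)
report = ComparisonReport([report_1, report_2])  # distinct objects: OK
```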

auguste-probabl force-pushed the cross-validation-comparison branch 2 times, most recently from 218335a to 33d1d82 on April 11, 2025 10:06
thomass-dev pushed a commit that referenced this pull request Apr 11, 2025
This originates from a bug when implementing comparison of
CrossValidationReport in #1512

Because CrossValidationReports hold a `_parent_progress`, when a
ComparisonReport creates a progress bar to iterate over
CrossValidationReports, the outer progress bar conflicts with the inner
progress bars, and rich refuses to proceed.

The solution is for the ComparisonReport to explicitly set its inner
CrossValidationReports' progress instance, so that in total there is
only one progress instance.

But before this change, the progress instance was sometimes owned by a
`CrossValidationReport.metrics` accessor. This is a problem because
accessors are re-instantiated whenever they are accessed, so their state
cannot be modified from the parent.

The solution this change implements is to remove all `progress`-related
attributes from all accessors, and to ensure that the progress instance
is only owned by the Report object, not by any of its accessors.
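
A minimal sketch of the ownership pattern described above, using rich directly and invented class names rather than skore's internals:

```
# Invented class names; only the single-Progress ownership pattern is the point.
from rich.progress import Progress

class Child:
    def __init__(self, name):
        self.name = name
        self._progress = None  # always injected by the parent, never created here

    def compute(self):
        task = self._progress.add_task(self.name, total=3)
        for _ in range(3):
            self._progress.update(task, advance=1)

class Parent:
    def __init__(self, children):
        self.children = children

    def compute(self):
        # The only live Progress instance; opening a second one would make
        # rich raise LiveError ("Only one live display may be active at once").
        with Progress() as progress:
            outer = progress.add_task("compare", total=len(self.children))
            for child in self.children:
                child._progress = progress  # inject the shared instance
                child.compute()
                progress.update(outer, advance=1)

Parent([Child("model1"), Child("model2")]).compute()
```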
auguste-probabl force-pushed the cross-validation-comparison branch 2 times, most recently from cd3f989 to 1e157f4 on April 11, 2025 13:06
auguste-probabl force-pushed the cross-validation-comparison branch from 4b376f5 to 19b6926 on April 29, 2025 16:57
glemaitre self-requested a review on April 30, 2025 08:27
glemaitre (Member) left a comment:

Let's merge now. @auguste-probabl, could you open an issue to track the progress-bar improvement/refactoring so it isn't forgotten?

glemaitre merged commit de10944 into probabl-ai:main on Apr 30, 2025
27 checks passed
github-merge-queue bot pushed a commit that referenced this pull request Apr 30, 2025
Muhammad-Rebaal pushed a commit to Muhammad-Rebaal/skore that referenced this pull request May 5, 2025
Signed-off-by: Auguste Baum <[email protected]>
Co-authored-by: Guillaume Lemaitre <[email protected]>
Muhammad-Rebaal pushed a commit to Muhammad-Rebaal/skore that referenced this pull request May 5, 2025
Muhammad-Rebaal pushed a commit to Muhammad-Rebaal/skore that referenced this pull request May 5, 2025
…les (probabl-ai#1623)

Following probabl-ai#1512

With the help of @auguste-probabl:

Co-authored-by: auguste-probabl <[email protected]>

fix: Fix `cannot cache unpickable configuration value` warning - Sphinx make html process (probabl-ai#1584)

chore(dependencies): GITHUB-ACTIONS: Bump MishaKav/pytest-coverage-comment from 1.1.53 to 1.1.54 (probabl-ai#1627)

Bumps [MishaKav/pytest-coverage-comment](https://github.com/mishakav/pytest-coverage-comment) from 1.1.53 to 1.1.54.

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

fix: Set size limit of `DiskCacheStorage` from 1 GB to unlimited (probabl-ai#1625)

Fixes failing pipelines from
probabl-ai#1617.

---

https://grantjenks.com/docs/diskcache/tutorial.html#settings
https://grantjenks.com/docs/diskcache/tutorial.html#eviction-policies

```
size_limit, default one gigabyte. The maximum on-disk size of the cache.

cull_limit, default ten. The maximum number of keys to cull when adding a new item.

    Set to zero to disable automatic culling.

eviction_policy, default “least-recently-stored”. The setting to determine [eviction policy](https://grantjenks.com/docs/diskcache/tutorial.html#tutorial-eviction-policies).

    "none" disables cache evictions. Caches will grow without bound.
```
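
A minimal sketch of lifting the limit, assuming the settings quoted above and an arbitrary cache directory:

```
# Based on the diskcache settings quoted above; the directory is arbitrary.
from diskcache import Cache

cache = Cache(
    "/tmp/skore-cache",      # hypothetical location
    eviction_policy="none",  # "none" disables evictions; cache grows unbounded
    cull_limit=0,            # zero disables automatic culling on insert
)
```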

docs: Avoid failure when pytorch load weights from TextEncoder with parallelism (probabl-ai#1617)

This is a workaround for a failure that sometimes happens when
requesting `transform` from `TextEncoder` in parallel.

In short, PyTorch's weight loading sometimes fails: some tensors are
loaded on a meta device (metadata only) and the actual weights are not
loaded. I have not yet found the root cause of why the weights are not
loaded properly.

In the meantime, we can avoid this issue by storing the weights. We
should monitor that loading the language model several times does not
blow up the RAM on the GitHub Actions runners. Otherwise, we will most
probably need to deactivate the parallelism.

Co-authored-by: Thomas S. <[email protected]>
auguste-probabl deleted the cross-validation-comparison branch on May 13, 2025 09:34