
Conversation

@sobiya-22 commented Jun 9, 2025

Closes #1747

Hi @glemaitre, @MarieSacksick,
Added unit tests for:

  • data_source parameter functionality in ROC curve
  • ComparisonReport display verification

@glemaitre (Member)

@sobiya-22 Thanks for the PR, but this is not really what we would like as a test.

To give an example, in EstimatorReport, we test the behaviour of data_source with the following test:

https://github.com/probabl-ai/skore/blob/main/skore/tests/unit/sklearn/plot/roc_curve/test_estimator.py#L113-L158

Here, I would expect you to modify skore/tests/unit/sklearn/plot/roc_curve/test_comparison_estimator.py to add a similar test for the case where a ComparisonReport is built from EstimatorReports, checking that the data_source parameter has the expected behaviour.

We can limit the scope of this PR to this specific behaviour, but I think we need to extend the same test to the test_comparison_cross_validation.py file, and most probably to the other types of visualization (PR curve and prediction-error).
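
For illustration, here is a minimal sketch of the kind of test being requested, assuming skore's public EstimatorReport/ComparisonReport constructors and the metrics.roc(data_source=...) accessor; the setup and assertions below are only indicative, not the actual test from the referenced file:

```python
# Minimal sketch (not the actual test): assumes skore's EstimatorReport /
# ComparisonReport API and the metrics.roc(data_source=...) accessor.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

from skore import ComparisonReport, EstimatorReport


def test_roc_data_source_comparison_estimator():
    X, y = make_classification(random_state=42)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

    # Build a ComparisonReport from two EstimatorReports.
    reports = [
        EstimatorReport(
            estimator,
            X_train=X_train, y_train=y_train, X_test=X_test, y_test=y_test,
        )
        for estimator in (LogisticRegression(), DecisionTreeClassifier(random_state=42))
    ]
    comparison = ComparisonReport(reports)

    # "train" and "test" should use the data stored in each sub-report ...
    display_train = comparison.metrics.roc(data_source="train")
    display_test = comparison.metrics.roc(data_source="test")
    # ... while "X_y" should use the data passed explicitly.
    display_external = comparison.metrics.roc(
        data_source="X_y", X=X_test[:30], y=y_test[:30]
    )

    # Different data sources should yield different curves.
    assert not display_train.frame().equals(display_test.frame())
    assert not display_test.frame().equals(display_external.frame())
```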

@sobiya-22 (Author)

Hi @glemaitre,
Thank you for the feedback! I've now implemented the data_source parameter tests for the ROC curve visualization in the following files:

  • skore/tests/unit/sklearn/plot/roc_curve/test_comparison_estimator.py: added test_data_source_binary_classification() and test_data_source_multiclass_classification()
  • skore/tests/unit/sklearn/plot/roc_curve/test_comparison_cross_validation.py: added the corresponding tests for the CrossValidationReport scenarios

Both test files now cover the data_source parameter behaviour ("train", "test", and "X_y" options), following the same pattern as the EstimatorReport tests you referenced.

Remaining tasks:
  • PR curve visualization tests (precision-recall curve)
  • Prediction-error visualization tests

Before extending to the PR curve and prediction-error visualizations, could you please review the current ROC curve tests to confirm that the approach and test structure are on the right track? Once they match your expectations, I'll proceed with the remaining tests.
Thanks!

@glemaitre (Member)

> Before extending to the PR curve and prediction-error visualizations, could you please review the current ROC curve tests to confirm that the approach and test structure are on the right track?

Indeed, those should be implemented in separate PRs to keep the scope focused. I'll do a review.

github-actions bot commented Jun 10, 2025

Coverage

Coverage Report for skore/
File                                                 Stmts   Miss  Cover  Missing
venv/lib/python3.13/site-packages/skore
   __init__.py                                          23      0   100%
   _config.py                                           28      0   100%
   exceptions.py                                          4      4     0%  4, 15, 19, 23
venv/lib/python3.13/site-packages/skore/project
   __init__.py                                           2      0   100%
   project.py                                           49      0   100%
   summary.py                                           74      0   100%
   widget.py                                           138      5    96%  375–377, 447–448
venv/lib/python3.13/site-packages/skore/sklearn
   __init__.py                                           7      0   100%
   _base.py                                            169     14    91%  45, 58, 126, 129, 182, 185–186, 188–191, 224, 227–228
   find_estimators.py                                   27      0   100%
   find_ml_task.py                                      61      0   100%
   types.py                                             26      1    96%  26
venv/lib/python3.13/site-packages/skore/sklearn/_comparison
   __init__.py                                           5      0   100%
   metrics_accessor.py                                 204      3    98%  175, 255, 1298
   report.py                                           105      0   100%
   utils.py                                             55      0   100%
venv/lib/python3.13/site-packages/skore/sklearn/_cross_validation
   __init__.py                                           5      0   100%
   metrics_accessor.py                                 209      1    99%  248
   report.py                                           122      1    99%  466
venv/lib/python3.13/site-packages/skore/sklearn/_estimator
   __init__.py                                           7      0   100%
   feature_importance_accessor.py                      143      2    98%  216–217
   metrics_accessor.py                                 382      8    97%  195, 197, 204, 295, 364, 368, 383, 418
   report.py                                           164      2    98%  438–439
venv/lib/python3.13/site-packages/skore/sklearn/_plot
   __init__.py                                           2      0   100%
   base.py                                               5      0   100%
   style.py                                             28      0   100%
   utils.py                                             118      5    95%  50, 74–76, 80
venv/lib/python3.13/site-packages/skore/sklearn/_plot/metrics
   __init__.py                                           6      0   100%
   confusion_matrix.py                                  69      4    94%  90, 98, 120, 228
   precision_recall_curve.py                           260      5    98%  459, 559, 563, 623, 743
   prediction_error.py                                 215      5    97%  179, 186, 422, 505, 687
   roc_curve.py                                        272      5    98%  386, 509, 617, 626, 819
   summarize.py                                          7      0   100%
venv/lib/python3.13/site-packages/skore/sklearn/train_test_split
   __init__.py                                           0      0   100%
   train_test_split.py                                  49      0   100%
venv/lib/python3.13/site-packages/skore/sklearn/train_test_split/warning
   __init__.py                                           8      0   100%
   high_class_imbalance_too_few_examples_warning.py     17      1    94%  80
   high_class_imbalance_warning.py                      18      0   100%
   random_state_unset_warning.py                        10      0   100%
   shuffle_true_warning.py                              10      1    90%  46
   stratify_is_set_warning.py                           10      0   100%
   time_based_column_warning.py                         21      1    95%  73
   train_test_split_warning.py                           4      0   100%
venv/lib/python3.13/site-packages/skore/utils
   __init__.py                                           6      2    66%  8, 13
   _accessor.py                                         52      2    96%  67, 108
   _environment.py                                      27      0   100%
   _fixes.py                                             8      0   100%
   _index.py                                             5      0   100%
   _logger.py                                           22      4    81%  15–17, 19
   _measure_time.py                                     10      0   100%
   _parallel.py                                         38      3    92%  23, 33, 124
   _patch.py                                            13      5    61%  21, 23–24, 35, 37
   _progress_bar.py                                     45      0   100%
   _show_versions.py                                    33      2    93%  65–66
   _testing.py                                          46      0   100%
TOTAL                                                 3443     86    97%

Tests   Skipped   Failures   Errors   Time
873     5 💤      0 ❌       0 🔥     6m 58s ⏱️

github-actions bot commented Jun 10, 2025

Documentation preview @ 7c08f39

sobiya-22 force-pushed the issue-1747-data-source-tests branch from 6572c5e to 7c08f39 on June 11, 2025 at 01:46
@sobiya-22 (Author) commented Jun 11, 2025

Hi @glemaitre,
I've now removed self-explanatory comments as suggested.
Please let me know if anything else needs to be addressed.
Once approved, I’ll proceed with the separate PRs for the PR curve and prediction-error visualizations as you recommended.

@sobiya-22 (Author)

Hi @glemaitre, shall I proceed with separate PRs for the PR curve and prediction-error visualizations?

@MarieSacksick (Contributor) left a comment

Almost good to me :)!
To be sure that data_source is really used in the plot, and not just always the train or the test set, can you check that something other than the data_source parameter actually changes in the results, please?

assert display.ax_[0].spines["right"].get_visible()


def test_data_source_binary_classification(pyplot, binary_classification_data_no_split):

Can you check that the results actually change when changing the data_source, please?
For instance, you can use a subsample of X and y as the X_y data source so you don't have to create a new dataset, and then check that the DataFrames output by display.frame() are different, using assert not together with the equals method.
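
A rough sketch of what that check could look like; the report, X_test, and y_test names stand in for whatever the fixture actually provides:

```python
# Rough sketch; `report`, `X_test`, `y_test` are placeholders for the fixture's objects.
display_test = report.metrics.roc(data_source="test")
# Reuse a subsample of the test set as the external "X_y" source,
# so no new dataset is needed.
display_subsample = report.metrics.roc(data_source="X_y", X=X_test[:50], y=y_test[:50])

# If data_source is really honoured, the underlying frames must differ.
assert not display_test.frame().equals(display_subsample.frame())
```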

assert all(0 <= auc <= 1 for auc in auc_values)


def test_data_source_multiclass_classification(

Same as above: can you check that the outputs are different, please?

assert display.ax_.get_title() == "ROC Curve"


def test_data_source_binary_classification(pyplot, binary_classification_data):

Same as in the previous file: check that the outputs are different.

Successfully merging this pull request may close: chore: Add tests for data_source parameter and the display of ComparisonReport