@auguste-probabl
Contributor

Added support for calculating permutation importance at different stages of a pipeline.
One can choose to compute feature importance either at the start of the pipeline or at the end.
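As a rough illustration of the idea, here is a minimal sketch written against plain scikit-learn rather than the skore API. The helper name `permutation_importance_at` and its `at` parameter are hypothetical and are not taken from this PR; the sketch only assumes that a fitted `Pipeline` can be sliced and that `sklearn.inspection.permutation_importance` is applied to the remaining steps.

```python
from sklearn.datasets import make_regression
from sklearn.inspection import permutation_importance
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler


def permutation_importance_at(pipeline, X, y, at=0, **kwargs):
    """Permute features as they exist just before step `at` of a fitted pipeline.

    Hypothetical helper for illustration only (not the skore implementation).
    at=0 permutes the raw input features; at=-1 permutes the features that
    the final estimator receives.
    """
    n_steps = len(pipeline)
    if not -n_steps <= at < n_steps:
        raise IndexError(f"`at` must be in [{-n_steps}, {n_steps - 1}]; got {at}")
    at %= n_steps  # turn a negative index into its non-negative equivalent
    if at == 0:
        return permutation_importance(pipeline, X, y, **kwargs)
    # Apply the fitted steps before `at`, then score only the remaining steps.
    X_at = pipeline[:at].transform(X)
    return permutation_importance(pipeline[at:], X_at, y, **kwargs)


X, y = make_regression(n_samples=200, n_features=3, noise=1.0, random_state=0)
pipeline = make_pipeline(
    StandardScaler(),
    PolynomialFeatures(degree=2, include_bias=False),
    Ridge(),
).fit(X, y)

# Importance of the 3 raw input features (start of the pipeline).
start = permutation_importance_at(pipeline, X, y, at=0, n_repeats=5, random_state=0)
# Importance of the 9 engineered features seen by the Ridge model (end of the pipeline).
end = permutation_importance_at(pipeline, X, y, at=-1, n_repeats=5, random_state=0)
```

At the start of the pipeline the importances refer to the original columns, while at the end they refer to whatever the preprocessing steps produced, so the two results generally have different lengths.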

Closes #1398

Supersedes #1888

Co-authored-by: waridrox [email protected]

@github-actions
Contributor

github-actions bot commented Aug 28, 2025

Coverage

Coverage Report for skore/
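| File | Stmts | Miss | Cover | Missing |
|---|---|---|---|---|
| **skore/src/skore** | | | | |
| `__init__.py` | 23 | 0 | 100% | |
| `_config.py` | 31 | 0 | 100% | |
| `exceptions.py` | 4 | 4 | 0% | 4, 15, 19, 23 |
| **skore/src/skore/_sklearn** | | | | |
| `__init__.py` | 6 | 0 | 100% | |
| `_base.py` | 198 | 14 | 92% | 45, 58, 127, 130, 183, 186–187, 189–192, 225, 228–229 |
| `find_ml_task.py` | 61 | 0 | 100% | |
| `types.py` | 27 | 1 | 96% | 28 |
| **skore/src/skore/_sklearn/_comparison** | | | | |
| `__init__.py` | 7 | 0 | 100% | |
| `feature_importance_accessor.py` | 35 | 2 | 94% | 88, 107 |
| `metrics_accessor.py` | 178 | 3 | 98% | 173, 253, 1215 |
| `report.py` | 107 | 0 | 100% | |
| `utils.py` | 54 | 0 | 100% | |
| **skore/src/skore/_sklearn/_cross_validation** | | | | |
| `__init__.py` | 9 | 0 | 100% | |
| `data_accessor.py` | 45 | 3 | 93% | 134, 137, 140 |
| `feature_importance_accessor.py` | 24 | 0 | 100% | |
| `metrics_accessor.py` | 182 | 1 | 99% | 244 |
| `report.py` | 135 | 1 | 99% | 487 |
| **skore/src/skore/_sklearn/_estimator** | | | | |
| `__init__.py` | 9 | 0 | 100% | |
| `data_accessor.py` | 66 | 1 | 98% | 82 |
| `feature_importance_accessor.py` | 168 | 2 | 98% | 251–252 |
| `metrics_accessor.py` | 356 | 8 | 97% | 200, 202, 209, 300, 369, 373, 388, 423 |
| `report.py` | 165 | 2 | 98% | 448–449 |
| **skore/src/skore/_sklearn/_plot** | | | | |
| `__init__.py` | 3 | 0 | 100% | |
| `base.py` | 98 | 6 | 93% | 61–62, 224–226, 230 |
| `utils.py` | 77 | 0 | 100% | |
| **skore/src/skore/_sklearn/_plot/data** | | | | |
| `__init__.py` | 2 | 0 | 100% | |
| `table_report.py` | 185 | 1 | 99% | 706 |
| **skore/src/skore/_sklearn/_plot/metrics** | | | | |
| `__init__.py` | 6 | 0 | 100% | |
| `confusion_matrix.py` | 70 | 4 | 94% | 92, 100, 122, 230 |
| `feature_importance_display.py` | 67 | 21 | 68% | 88, 121–122, 124, 142–146, 148–155, 158–160, 162 |
| `metrics_summary_display.py` | 8 | 0 | 100% | |
| `precision_recall_curve.py` | 281 | 5 | 98% | 455, 555, 559, 619, 751 |
| `prediction_error.py` | 227 | 5 | 97% | 179, 186, 422, 505, 705 |
| `roc_curve.py` | 294 | 8 | 97% | 387, 510, 515, 616, 621, 625, 694, 834 |
| **skore/src/skore/_sklearn/train_test_split** | | | | |
| `__init__.py` | 0 | 0 | 100% | |
| `train_test_split.py` | 58 | 0 | 100% | |
| **skore/src/skore/_sklearn/train_test_split/warning** | | | | |
| `__init__.py` | 8 | 0 | 100% | |
| `high_class_imbalance_too_few_examples_warning.py` | 19 | 1 | 94% | 83 |
| `high_class_imbalance_warning.py` | 20 | 0 | 100% | |
| `random_state_unset_warning.py` | 10 | 0 | 100% | |
| `shuffle_true_warning.py` | 9 | 0 | 100% | |
| `stratify_is_set_warning.py` | 10 | 0 | 100% | |
| `time_based_column_warning.py` | 21 | 0 | 100% | |
| `train_test_split_warning.py` | 3 | 0 | 100% | |
| **skore/src/skore/_utils** | | | | |
| `__init__.py` | 6 | 2 | 66% | 8, 13 |
| `_accessor.py` | 90 | 3 | 96% | 34, 146, 190 |
| `_environment.py` | 27 | 1 | 96% | 40 |
| `_fixes.py` | 8 | 0 | 100% | |
| `_index.py` | 5 | 0 | 100% | |
| `_logger.py` | 22 | 4 | 81% | 15–17, 19 |
| `_measure_time.py` | 10 | 0 | 100% | |
| `_parallel.py` | 38 | 3 | 92% | 23, 33, 124 |
| `_patch.py` | 13 | 5 | 61% | 21, 23–24, 35, 37 |
| `_progress_bar.py` | 46 | 0 | 100% | |
| `_repr_html.py` | 8 | 0 | 100% | |
| `_show_versions.py` | 38 | 0 | 100% | |
| `_testing.py` | 55 | 0 | 100% | |
| **skore/src/skore/project** | | | | |
| `__init__.py` | 2 | 0 | 100% | |
| `project.py` | 48 | 0 | 100% | |
| `summary.py` | 75 | 1 | 98% | 120 |
| `widget.py` | 187 | 0 | 100% | |
| **TOTAL** | 4044 | 112 | 97% | |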

| Tests | Skipped | Failures | Errors | Time |
|---|---|---|---|---|
| 1101 | 5 💤 | 0 ❌ | 0 🔥 | 4m 21s ⏱️ |

@github-actions
Contributor

github-actions bot commented Aug 28, 2025

Documentation preview @ 2cbc1f1

@glemaitre
Member

Since we still don't have permutation importance across the different reporters, there is no documentation to change in the user guide.

Member

@glemaitre glemaitre left a comment


It looks good. I would only suggest amending the example `plot_feature_importance.py`, where we can demonstrate the feature. We have a Ridge model with some feature importance, and we can show that we can compute it with index 0 and -1.

@auguste-probabl
Contributor Author

auguste-probabl commented Oct 15, 2025

> amend the example called plot_feature_importance.py where we can demonstrate the feature. We have a Ridge model with some feature importance and we can show that we can compute with index 0 and -1.

TODO: I found a bug while doing this: at a certain step, the permutation computation fails because it does not accept sparse matrices. Fixed.
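For readers who want to reproduce this kind of situation, here is a minimal sketch in plain scikit-learn. It is an assumption about the failure mode, not the fix applied in this PR: it supposes that an intermediate step (here `OneHotEncoder`, chosen for illustration) emits a SciPy sparse matrix, and that densifying that matrix before calling `sklearn.inspection.permutation_importance` is an acceptable workaround for small data.

```python
import numpy as np
from scipy import sparse
from sklearn.inspection import permutation_importance
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder

# Deterministic toy data: two categorical columns with 3 levels each.
idx = np.arange(120)
X = np.column_stack([idx % 3, (idx // 3) % 3])
rng = np.random.default_rng(0)
y = X[:, 0] + rng.normal(scale=0.1, size=len(idx))

pipeline = make_pipeline(OneHotEncoder(), Ridge()).fit(X, y)

# Features as seen by the final estimator; OneHotEncoder returns a sparse matrix by default.
X_encoded = pipeline[:-1].transform(X)
was_sparse = sparse.issparse(X_encoded)

# Assumed workaround: convert to a dense array before permuting columns.
if was_sparse:
    X_encoded = X_encoded.toarray()

result = permutation_importance(
    pipeline[-1], X_encoded, y, n_repeats=5, random_state=0
)
```

Densifying is only reasonable when the transformed matrix fits comfortably in memory; for wide one-hot or text features a different strategy would be needed.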

Now I need to figure out how to showcase the new feature in the example, hopefully in a way that doesn't disrupt the flow.

Member

@glemaitre glemaitre left a comment


Here it comes ;). Sorry for the delay.

@glemaitre glemaitre self-requested a review October 30, 2025 09:24
Member

@glemaitre glemaitre left a comment


With those changes, you are going to cover the two missing lines, and it is a possible pipeline that we should be supporting.

@auguste-probabl
Contributor Author

Now it's `feature_names = estimator.feature_names_in_` that is not covered.

glemaitre previously approved these changes Oct 30, 2025
Member

@glemaitre glemaitre left a comment


Just the last comment. Otherwise, LGTM.

@glemaitre glemaitre added this pull request to the merge queue Oct 31, 2025
Merged via the queue into probabl-ai:main with commit 2cba17f Oct 31, 2025
32 checks passed
github-merge-queue bot pushed a commit that referenced this pull request Oct 31, 2025
Following #1988.

Synchronize `pyproject.toml` and `.pre-commit-config.yaml` to let `mypy`
work outside `pre-commit`.
@auguste-probabl auguste-probabl deleted the push-szluysnvkxtt branch November 20, 2025 09:49
Development

Successfully merging this pull request may close these issues.

enh: Show permutation at different stage of a pipeline

4 participants