@waldie11
Contributor

@waldie11 waldie11 commented Dec 18, 2024

PR Description

When _return_root_paths uses ignore_nosub, it now benefits from exploring only the directories within root/sub-* instead of walking the whole tree and subsequently applying the regex filter to the visited files.

get_entity_vals is extended by an argument include_match that introduces a prefilter on a regex path match. This is a whitelist approach, and it extends both functionality and performance compared to an equivalent blacklist approach with the existing ignore_dirs option.
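
To illustrate the difference between the two strategies, here is a minimal stdlib-only sketch. It is not the mne-bids implementation; the helper names `walk_then_filter` and `glob_then_walk` and the file names are made up for this illustration.

```python
import re
import tempfile
from pathlib import Path


def walk_then_filter(root, pattern):
    """Visit every file under root, then keep paths matching a regex."""
    regex = re.compile(pattern)
    return sorted(
        p
        for p in Path(root).rglob("*")
        if p.is_file() and regex.search(p.relative_to(root).as_posix())
    )


def glob_then_walk(root, include_match):
    """Only descend into top-level directories matching the glob."""
    found = []
    # Strip a trailing slash so the glob behaves the same on all Python versions.
    for subdir in Path(root).glob(include_match.rstrip("/")):
        if subdir.is_dir():
            found.extend(p for p in subdir.rglob("*") if p.is_file())
    return sorted(found)


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for rel in [
        "sub-01/eeg/sub-01_task-rest_eeg.vhdr",
        "derivatives/clean/sub-01/eeg/sub-01_task-rest_eeg.vhdr",
        "sourcedata/raw/dump.bin",
    ]:
        (root / rel).parent.mkdir(parents=True, exist_ok=True)
        (root / rel).touch()

    slow = walk_then_filter(root, r"^sub-[^/]+/")  # touches all three files
    fast = glob_then_walk(root, "sub-*/")  # never enters derivatives/ or sourcedata/
    same_result = slow == fast
    n_hits = len(fast)
```

Both helpers return the same single file here; the second one simply never lists the contents of `derivatives/` or `sourcedata/`.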

Merge checklist

Maintainer, please confirm the following before merging.
If applicable:

  • All comments are resolved
  • This is not your own PR
  • All CIs are happy
  • PR title starts with [MRG]
  • whats_new.rst is updated
  • New contributors have been added to CITATION.cff
  • PR description includes phrase "closes <#issue-number>"

@welcome

welcome bot commented Dec 18, 2024

Hello! 👋 Thanks for opening your first pull request here!
Please read the contributor guide, and please follow the steps outlined in the "Instructions for first-time contributors" section therein. ❤️ We will try to get back to you soon. 🚴🏽‍♂️

@codecov

codecov bot commented Dec 23, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 97.41%. Comparing base (b75e4ca) to head (df82998).

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1355      +/-   ##
==========================================
+ Coverage   97.37%   97.41%   +0.03%     
==========================================
  Files          40       40              
  Lines        8966     9008      +42     
==========================================
+ Hits         8731     8775      +44     
+ Misses        235      233       -2     


@sappelhoff
Member

thanks @waldie11! we have a bit of a backlog right now so please don't worry if we take a bit longer to get back to you.

@waldie11
Contributor Author

waldie11 commented Jan 2, 2025

@sappelhoff happy new year!
no worries, as i had seen some of the api changes in python-mne bringing down the test suite, i already expected priorities to be elsewhere.

@waldie11 waldie11 force-pushed the rework_mne_bids_path_match branch from 4863705 to c71a41f Compare January 2, 2025 17:00
Member

@sappelhoff sappelhoff left a comment

Hey @waldie11 could you please add an example for the feature that you introduce? Either by modifying an actual example that already exists, or by adding a numpy docstr example below the ones that already exist.

Could you please also resolve the conflicts and follow the steps for first-time contributors here? --> #1354

@waldie11 waldie11 force-pushed the rework_mne_bids_path_match branch from 102eb3d to 0480497 Compare February 3, 2025 13:09
@waldie11 waldie11 force-pushed the rework_mne_bids_path_match branch from 12c4f89 to def6ab2 Compare February 3, 2025 13:49
@waldie11
Contributor Author

waldie11 commented Feb 3, 2025

Hi @sappelhoff ,

I think the real meat is rather the rework of _return_root_paths. This is not exactly an API function, so it feels a bit weird to write an example.

get_entity_vals seems to undergo some changes in the upcoming 0.17 anyhow. I extended the docstring a bit for my feature contribution. Do you think this is sufficient? None of the ignore_ parameters has an example either. The feature helps when filtering a bids_path that has deep directory trees for sourcedata, derivatives, subjects, or who-knows-what, where one wants to select beforehand which directories to look in.

my_dataset/
  derivatives/
    downsampled/
      sub-01/
        micr/
          sub-01_sample-01_res-4x_TEM.png
          sub-01_sample-01_res-4x_TEM.json
  sub-01/
    micr/
      sub-01_sample-01_TEM.png
      sub-01_sample-01_TEM.json

(BIDS v1.8.0-1 p. 214)
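
For the tree above, a rough stdlib sketch of how many files a walk touches with and without a top-level prefilter. The function `count_visited` and its `include_match` argument are made up for this illustration and only mimic the idea, not the mne-bids code.

```python
import os
import tempfile
from fnmatch import fnmatch
from pathlib import Path

TREE = [
    "derivatives/downsampled/sub-01/micr/sub-01_sample-01_res-4x_TEM.png",
    "derivatives/downsampled/sub-01/micr/sub-01_sample-01_res-4x_TEM.json",
    "sub-01/micr/sub-01_sample-01_TEM.png",
    "sub-01/micr/sub-01_sample-01_TEM.json",
]


def count_visited(root, include_match=None):
    """Count files os.walk lists, optionally pruning top-level directories."""
    visited = 0
    for step, (_, dirnames, filenames) in enumerate(os.walk(root)):
        if step == 0 and include_match is not None:
            # Prune in place so os.walk never enters non-matching directories.
            dirnames[:] = [d for d in dirnames if fnmatch(d, include_match)]
        visited += len(filenames)
    return visited


with tempfile.TemporaryDirectory() as tmp:
    for rel in TREE:
        path = Path(tmp) / rel
        path.parent.mkdir(parents=True, exist_ok=True)
        path.touch()
    visited_all = count_visited(tmp)
    visited_sub_only = count_visited(tmp, include_match="sub-*")
```

On this small tree the unfiltered walk lists four files and the prefiltered walk lists two; the gap grows with the size of `derivatives/` and `sourcedata/`.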

@waldie11
Contributor Author

waldie11 commented Feb 3, 2025

Idk how far you are interested in diving into this:

In test_path_benchmark I created a dummy BIDS-compliant tree. If I browse this artificial tree with get_entity_vals, I get a performance gain of about one order of magnitude by using include_match="sub-*/" compared to ignore_dirs=["derivatives", "sourcedata"].

This benchmark is eating up some CI runtime though. I tried keeping it lightweight.

@waldie11 waldie11 force-pushed the rework_mne_bids_path_match branch from 66b4e1c to 824649e Compare February 3, 2025 21:16
@sappelhoff
Copy link
Member

I will make time for this in the next few days, by the weekend at the latest. Thanks for your patience!

Member

@sappelhoff sappelhoff left a comment

Thanks for your work @waldie11! I left a few comments and suggestions to be addressed, but I think we can then go ahead and merge it!

Comment on lines 185 to 186
for i in equal_length_subj_ids(np.arange(1, 20, dtype=int)):
    for j in range(1, 9):
Member

where are the 20 and 9 coming from? as with the 17 above, better define these as variables within the test, potentially with a quick inline comment, or even better, a descriptive var name

n_test_subs = 20
...
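
A fuller version of that idea could look like the sketch below. It is only an illustration: `zero_padded_ids` is a hypothetical stand-in for the PR's `equal_length_subj_ids` helper, whose exact behavior is not shown here, and the constant names are just suggestions. Note that `np.arange(1, 20)` yields the ids 1 to 19, so the loop covers 19 subjects, and `range(1, 9)` covers 8 sessions.

```python
N_SUBJECTS = 19  # np.arange(1, 20) in the test yields ids 1..19
N_SESSIONS = 8  # range(1, 9) in the test yields sessions 1..8


def zero_padded_ids(n):
    """Hypothetical stand-in: ids 1..n as strings of equal length."""
    width = len(str(n))
    return [f"{i:0{width}d}" for i in range(1, n + 1)]


subjects = zero_padded_ids(N_SUBJECTS)
sessions = range(1, N_SESSIONS + 1)
# Every (subject, session) combination the nested loops would create.
pairs = [(sub, ses) for sub in subjects for ses in sessions]
```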

for datatype in ["eeg", "meg"]:
    bids_subdir = BIDSPath(
        subject=i,
        session="0" + str(j),
Member

please use f-string formatting whenever possible.
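
For example, assuming `j` is the integer session counter from the loop, a zero-padding format spec gives the same labels as the concatenation for 1 to 8 (a small sketch, not the final test code):

```python
# "0" + str(j) and f"{j:02d}" agree for the single-digit sessions used here.
for j in range(1, 9):
    assert "0" + str(j) == f"{j:02d}"

sessions = [f"{j:02d}" for j in range(1, 9)]

# Unlike "0" + str(j), the format spec keeps a fixed width of two past 9.
concatenated = "0" + str(12)
formatted = f"{12:02d}"
```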

Comment on lines +279 to +265
# Clean up
shutil.rmtree(tmp_bids_root)
Member

is this not done automatically by pytest?
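
A sketch of what that could look like, assuming the test takes its directory from pytest's built-in `tmp_path` fixture (the test name and file names below are made up). pytest creates a fresh directory per test and, by default, only keeps the base temporary directories of the last three runs, so no manual `shutil.rmtree` is needed.

```python
import tempfile
from pathlib import Path


def test_tree_is_created(tmp_path):
    """pytest injects tmp_path, a fresh per-test pathlib.Path directory."""
    sub_dir = tmp_path / "sub-01" / "eeg"
    sub_dir.mkdir(parents=True)
    (sub_dir / "sub-01_task-rest_eeg.vhdr").touch()
    assert len(list(tmp_path.rglob("*.vhdr"))) == 1
    # No cleanup call here: pytest prunes old temporary directories itself.


# Outside pytest, the same function can be exercised with a manual temp dir.
with tempfile.TemporaryDirectory() as tmp:
    test_tree_is_created(Path(tmp))
    ran_ok = True
```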

@waldie11 waldie11 force-pushed the rework_mne_bids_path_match branch from 970751c to 7399903 Compare February 20, 2025 17:29
@waldie11
Contributor Author

Thanks for your kind improvements @sappelhoff !

Member

@sappelhoff sappelhoff left a comment

Thanks for the contribution!

@sappelhoff sappelhoff merged commit 7b04694 into mne-tools:main Feb 20, 2025
6 of 21 checks passed
@welcome

welcome bot commented Feb 20, 2025

🎉 Congrats on merging your first pull request! 🥳 Looking forward to seeing more from you in the future! 💪
