FIX: Match only within relative path when indexer is validating#859
FIX: Match only within relative path when indexer is validating#859effigies merged 8 commits intobids-standard:masterfrom
Conversation
Codecov Report
@@ Coverage Diff @@
## master #859 +/- ##
==========================================
- Coverage 86.29% 86.03% -0.27%
==========================================
Files 35 35
Lines 3954 3973 +19
Branches 1026 1031 +5
==========================================
+ Hits 3412 3418 +6
- Misses 353 363 +10
- Partials 189 192 +3
Continue to review full report at Codecov.
|
effigies
left a comment
There was a problem hiding this comment.
Looks sensible. Need to look at this CI failure...
|
It'd be nice to write some unit tests, but I'm not sure I have the time :S |
|
What about: def test__check_path_matches_patterns():
hidden_ds = "/home/user/.cache/datasets/myds"
failing_path = f"{hidden_ds}/myfile"
sensible_exclusions = [re.compile(r"/.*")]
assert _check_path_matches_patterns(failing_path, sensible_exclusions) is False
assert _check_path_matches_patterns(failing_path, sensible_exclusions, root=hidden_ds) is True |
|
Also please rebase on master to fix tests. |
606c091 to
13ab11c
Compare
|
Seems to be breaking expectations. |
bids/layout/index.py
Outdated
| if isinstance(patt, (str, Path)): | ||
| if str(patt) in path: |
There was a problem hiding this comment.
@effigies I hope I did not misinterpreted these two lines.
Here is the string matching pattern:
- Shouldn't we allow strings, not just Paths?
- Shouldn't we just check the presence of the substring, rather than the exact match?
There was a problem hiding this comment.
Finally, the changes were much more involved - probably this doesn't have much relevance now.
|
Okay, I seem to have opened a can of worms. The implementation of path matching and validation was pretty brittle, almost tailored to the few tests we have here. Although I haven't done an exhaustive revision of the truth table of possible combinations of |
adelavega
left a comment
There was a problem hiding this comment.
I haven't had time to test this out myself but looks reasonable.
Taking a step back force_index seems like a bit a mistake of the API, as it complicates things quite a bit. I also don't like how rules for what is indexed or excluded by default is in the code, rather than driven the BIDS schema-- but that will be a future effort.
Either way those comments don't have bearing on this PR and I think in the meantime this is good, and I'm glad we have more coverage on this.
I totally second this thought 👍 |
|
I've been thinking about @adelavega's feelings against That should go in a different PR, though. @effigies care to give this a final check? |
|
@oesteban about such an API breaking change, you should also be aware about #863 We will be eventually replacing At that point, we will allow a fair amount of API breaking changes so that would be a good time for that. The plan is to also having the current |
|
For now, it could be useful to:
|
|
That works for me |
|
Will merge if tests pass. Thanks for your patience. |
Make sure bids-standard/pybids#859 is available.
One major problem of the current behavior is, if one wants to exclude dot-folders and dot-files then a pattern like
re.compile(r"/\.")will also match patterns above the root folder. E.g., the path/home/username/.cache/dataset/sub-01would match, as well as/home/username/.cache/dataset/.datalad. However, we only the latter to be filtered out.Resolves: #858.