Commit 44dc3cd
Improve regex detection for the drive_sep_replace default (#6417)
I imported an album in which one track was named `1:00 AM - Clear` and
another `12:00 AM - Clear` (just two examples).
See: [Animal Crossing: New Horizons
OST](https://musicbrainz.org/release/263f7ed3-60c2-4251-ac7d-6da3f8691256)
After import, the former was renamed `1_00 AM - Clear` and the latter
`12;00 AM - Clear`. Notice the inconsistency in how the `:` was
replaced.
I did not make use of the (hidden) `drive_sep_replace` setting. These
were my `replace` settings:
```
replace: # prevent file name incompatibility
    '[\s]' : ' ' # standardize whitespace
    '["`‘’“”]' : "'" # standardize quotes
    '[\u002D\u2010-\u2015\u2E3A]' : '-' # standardize dashes
    '[\u2E3B\uFE58\uFE63\uFF0D]' : '-' # standardize dashes
    '[\xAD]' : '-' # standardize dashes
    '[\\\|\/]' : ' ' # slashes, pipe > space
    '[:]' : ';' # colon > semicolon
    '[<>]' : '-' # chevrons > dashes
    '[\?\*]' : '' # remove restricted characters
    '[\x00-\x1F\x7F]' : '' # remove basic control characters
    '[\x80-\x9F]' : '' # remove extra control characters
    '^\.' : '' # remove leading period
    '\.$' : '' # remove trailing period
    '^\s+' : '' # remove leading space
    '\s+$' : '' # remove trailing space
```
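I traced the issue to an overly generic regex for drive separator
detection: `1:` looks like a drive prefix to it, while `12:` does not and
falls through to my `replace` rules. I'm on macOS, so drive separators are
irrelevant to me anyway (I worked around it by adding
`drive_sep_replace: ';'` to my settings), but I think the detection could
be improved regardless.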
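
The inconsistency can be reproduced in isolation. This is a minimal sketch, not beets' actual code: `sanitize` is a hypothetical helper, and it assumes that the drive-separator step runs before the `replace` rules and substitutes `_` (inferred from the `1_00` result above).

```python
import re

# Assumptions for illustration: the pattern quoted in this PR and the
# replacement character inferred from the observed "1_00" result.
DRIVE_SEP_PATTERN = re.compile(r"^\w:")
DRIVE_SEP_REPLACE = "_"


def sanitize(name: str) -> str:
    """Hypothetical stand-in for the two steps that touch the colon."""
    # Step 1: treat "<one word character>:" at the start as a drive prefix.
    if DRIVE_SEP_PATTERN.match(name):
        name = name[0] + DRIVE_SEP_REPLACE + name[2:]
    # Step 2: the `'[:]' : ';'` rule handles any colon that is left.
    return re.sub(r"[:]", ";", name)


print(sanitize("1:00 AM - Clear"))   # 1_00 AM - Clear
print(sanitize("12:00 AM - Clear"))  # 12;00 AM - Clear
```

Only the single-digit hour matches the drive pattern, which is why the two tracks end up with different replacement characters.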
This PR improves the regex that detects drive separators. Instead of
matching any single word character followed by a colon (`^\w:`), we look
for a letter, followed by a colon, followed by a backslash
(`^[a-zA-Z]:\\`).
The regex logic is solid, but I am not able to test this on a real
Windows environment.
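
Short of a real Windows environment, the two patterns can at least be compared directly with Python's `re` module. This is only a sketch: the sample strings are illustrative, and `OLD` and `STRICT` are names chosen here, not identifiers from the code base.

```python
import re

OLD = re.compile(r"^\w:")             # any word character, then a colon
STRICT = re.compile(r"^[a-zA-Z]:\\")  # a letter, a colon, then a backslash

# Illustrative inputs: a Windows-style path, the track title from this
# report, and an underscore (which \w also accepts).
samples = ["C:\\Music", "1:00 AM - Clear", "_:x"]

results = {
    s: (OLD.match(s) is not None, STRICT.match(s) is not None)
    for s in samples
}
for sample, (old, strict) in results.items():
    print(f"{sample!r}: old={old}, strict={strict}")
```

The old pattern accepts all three samples, while the stricter one accepts only the Windows-style path.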
~Still have to add an entry to the changelog, will do so soon.~
# Update
Initially this commit failed the
`MoveTest.test_move_file_with_colon_alt_separator` test because that test
exercises the logic with a `C:DOS` path. So I had to make the pattern less
restrictive again and drop the backslash check (`^[a-zA-Z]:`). I would
argue the test itself should be amended (test with `C:\DOS` instead),
but that's not up to me.
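
The trade-off of the relaxed pattern can be checked the same way. In this sketch, `looks_like_drive` is a hypothetical helper name, and the inputs are the ones discussed in this description.

```python
import re

# The relaxed pattern from this update: a letter followed by a colon.
RELAXED = re.compile(r"^[a-zA-Z]:")


def looks_like_drive(path: str) -> bool:
    """Return True if the path starts with '<letter>:'."""
    return RELAXED.match(path) is not None


checks = {
    p: looks_like_drive(p)
    for p in ["C:DOS", "C:\\DOS", "1:00 AM - Clear", "a:b"]
}
for path, matched in checks.items():
    print(f"{path!r}: {matched}")
```

Both `C:DOS` and `C:\DOS` are detected and the digit-led title is not, but `a:b` is still treated as a drive prefix.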
As a result, my case of "1:00 AM" being replaced incorrectly is still
resolved, but other hypothetical cases like "a:b" would still not be
covered due to an arguably incorrect test limiting a more precise regex.
2 files changed, +5 -2 lines changed