Skip to content

Commit 44dc3cd

Browse files
authored
Improve regex detection for the drive_sep_replace default (#6417)
I imported an album where a track had the name `1:00 AM - Clear` and another track named `12:00 AM - Clear` (just two examples). See: [Animal Crossing: New Horizons OST](https://musicbrainz.org/release/263f7ed3-60c2-4251-ac7d-6da3f8691256) After import, the former was renamed `1_00 AM - Clear`, and the latter `12;00 AM - Clear`. Notice the inconsistency of how the `:` was replaced. I did not make use of the (hidden) `drive_sep_replace` setting. These were my `replace` settings: ``` replace: # prevent file name incompatibiliy '[\s]' : ' ' # standardize whitespace '["`‘’“”]' : "'" # standardize quotes '[\u002D\u2010-\u2015\u2E3A]' : '-' # standardize dashes '[\u2E3B\uFE58\uFE63\uFF0D]' : '-' # standardize dashes '[\xAD]' : '-' # standardize dashes '[\\\|\/]' : ' ' # slashes, pipe > space '[:]' : ';' # colon > semicolon '[<>]' : '-' # chevrons > dashes '[\?\*]' : '' # remove restricted characters '[\x00-\x1F\x7F]' : '' # remove basic control characters '[\x80-\x9F]' : '' # remove extra control characters '^\.' : '' # remove leading period '\.$' : '' # remove trailing period '^\s+' : '' # remove leading space '\s+$' : '' # remove trailing space ``` I found the issue to be too generic regex for drive separator detection. I'm on macOS, so this is irrelevant to me anyway (and I got around it by adding `drive_sep_replace: ';'` to my settings), but regardless, I think this could be improved. This PR improves the regex to detect drive separators. Instead of merely looking for any first character followed by a colon (`^\w:`), we look for a letter, followed by a colon, followed by a backslash instead (`^[a-zA-Z]:\\`). The regex logic is solid, but I am not able to test this on a real Windows environment. ~Still have to add an entry to the changelog, will do so soon.~ # Update Initially this commit failed the `MoveTest.test_move_file_with_colon_alt_separator` test because it checks the logic using a `C:DOS` path. So I had to make the logic less restrictive again, not checking for a backslash (`^[a-zA-Z]:`). I would argue the test itself should be amended (test with `C:\DOS` instead), but that's not up to me. As a result, my case of "1:00 AM" being replaced incorrectly is still resolved, but other hypothetical cases like "a:b" would still not be covered due to an arguably incorrect test limiting a more precise regex.
2 parents 80d08ed + fbe9495 commit 44dc3cd

File tree

2 files changed

+5
-2
lines changed

2 files changed

+5
-2
lines changed

beets/dbcore/db.py

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -159,8 +159,8 @@ def _get_formatted(self, model: Model, key: str) -> str:
159159
sep_repl: str = beets.config["path_sep_replace"].as_str()
160160
sep_drive: str = beets.config["drive_sep_replace"].as_str()
161161

162-
if re.match(r"^\w:", value):
163-
value = re.sub(r"(?<=^\w):", sep_drive, value)
162+
if re.match(r"^[a-zA-Z]:", value):
163+
value = re.sub(r"(?<=[a-zA-Z]):", sep_drive, value)
164164

165165
for sep in (os.path.sep, os.path.altsep):
166166
if sep:

docs/changelog.rst

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -16,6 +16,9 @@ Unreleased
1616
Bug fixes
1717
~~~~~~~~~
1818

19+
- :ref:`replace`: Made ``drive_sep_replace`` regex logic more precise to prevent
20+
edge-case mismatches (e.g., a song titled "1:00 AM" would incorrectly be
21+
considered a Windows drive path).
1922
- :doc:`plugins/fish`: Fix AttributeError. :bug:`6340`
2023

2124
..

0 commit comments

Comments
 (0)