Description
Could we please have documented somewhere a reference implementation in Python for the glob part that complies with the mandatory requirements of the PEP? (maybe an attachment? Or something in the PyPA docs?)
The original intention of "let's document whatever stdlib's glob do, so that we can implement it in other languages" was generally agreed in the Discourse thread. However, there was a significant departure from this original intention to something that require a lot more validations which are not implemented by Python's stdlib itself.
Setuptools received something similar to the following in a contribution to setuptools: Validate license-files glob patterns by cdce8p · Pull Request #4841 · pypa/setuptools · GitHub (thanks @cdce8p)
import os
import re
from glob import glob
def find_pattern(pattern: str) -> list[str]:
"""
>>> find_pattern("/LICENSE.MIT")
Traceback (most recent call last):
...
ValueError: Pattern '/LICENSE.MIT' should be relative...
>>> find_pattern("../LICENSE.MIT")
Traceback (most recent call last):
...
ValueError: Pattern '../LICENSE.MIT' cannot contain '..'...
>>> find_pattern("LICEN{CSE*")
Traceback (most recent call last):
...
ValueError: Pattern 'LICEN{CSE*' contains invalid characters...
"""
if ".." in pattern:
raise ValueError(f"Pattern {pattern!r} cannot contain '..'")
if pattern.startswith((os.sep, "/")) or ":\\" in pattern:
raise ValueError(
f"Pattern {pattern!r} should be relative and must not start with '/'"
)
if re.match(r'^[\w\-\.\/\*\?\[\]]+$', pattern) is None:
raise ValueError(
f"Pattern '{pattern}' contains invalid characters. "
"https://packaging.python.org/en/latest/specifications/pyproject-toml/#license-files"
)
found = glob(pattern, recursive=True)
if not found:
raise ValueError(f"Pattern '{pattern}' did not match any files.")
return found
Is it enough/complete/correct? (at first glance I would say yes by looking at the text of the PEP, but I would like a second opinion).
/cc @befeleme
Activity