Skip to content

Conversation

@richardxia
Copy link
Contributor

This is a WIP, but I'm submitting it in case if the unit tests are still useful. As reported in #470 (comment), the glob() method no longer works in Python 3.13, returning an empty result regardless of the actual contents in Artifactory.

I haven't gotten this 100% working yet, but I've gotten a little bit closer to fixing it in Python 3.13. It looks like the pathlib.Path class changed a lot in Python 3.13 w.r.t. globbing behavior, where in Python 3.12 and lower, there was a full reimplementation of globbing inside the pathlib module, but in Python 3.13, it now delegates to the actual glob package.

One specific issue is that pathlib._abc.PurePathBase depends on the abstract glob._Globber class, but pathlib.PurePath overrides it to depend on glob._StringGlobber. _StringGlobber is problematic because it explicitly uses os.lstat and os.scandir instead of delegating to the Path subclass. What this means is that in Python 3.13, when you call ArtifactoryPath.glob(), you eventually end up calling os.scandir(), which fails because the path does not actually exist on disk locally.

The reason that ArtifactoryPath.glob() returns the empty list rather than throwing an exception is because in Python 3.13, any OSErrors that occur while calling glob() are ignored.

I have a partial fix, where I modified ArtifactoryPath to explicitly override the glob() method and delegate to pathlib._abc.PathBase's implementation rather than pathlib.Path's implementation (at least in Python 3.13). In the unit test I added, I tried to create mock Artifactory API response data that would correspond to a file tree that looks like:

.index/
com/
    foo
    bar

My unit test tries to run glob("**/*"), and in all Pythons < 3.13, it returns the four results of .index/, com/, com/foo, and com/bar. However, in Python 3.13, it currently is only returning the two files and not the directories on their own. I am not sure if my mock data is incorrect yet, or if there is some other Path API that I need to override.

I'm not sure if I'll have time to work on this for the rest of the day (US Pacific time), but I should have time to continue looking in the coming days. In the meantime, I encourage anyone else to take a look at the failures as well.

artifactory.py Outdated
if IS_PYTHON_3_13_OR_NEWER:
import glob

_globber = glob._Globber
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

where is this used ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In Python 3.13, the pathlib._abc.PurePathBase class uses it in the _glob_selector() method, and the _glob_selector() method is called by the glob() method. The pathlib.Path class also calls _glob_selector() in its own definition of the glob() method.

PurePathBase defines the variable in https://github.com/python/cpython/blob/v3.13.2/Lib/pathlib/_abc.py#L121, and Path overrides it in https://github.com/python/cpython/blob/v3.13.2/Lib/pathlib/_local.py#L105.

If we don't want to depend on a private API like this, then I'm afraid the only option in Python 3.13 would be to completely reimplement the glob() method from scratch.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you please add the comment here then ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I've added a comment, but I also ended up figuring out the remaining issue, which required me to change this to a custom subclass of _Globber, so I've changed line a bit more significantly in 1bcbd59

@richardxia
Copy link
Contributor Author

I believe I've figured out the remaining issue on Python 3.13, and I've pushed up a fix in 1bcbd59.

Unfortunately, this required subclassing the private glob._Globber class and overriding one of the methods with a copy of the original implementation but with a slight tweak to the math. I've tried to document this in some very detailed comments, but summarize the issue, the upstream implementation of _Globber doesn't work properly for non-string paths, since it wants to be able to add a trailing slash for the purposes of computing where a regex match should start looking for child files/directories, and when you try to add that slash to a Path subclass, it ends up stripping it away. I had to copy the upstream implementation of the method but modify to add 1 to the regex start position in order for the rest of the code to work properly.

Please let me know if this approach is OK or if there is some other way we should resolve this problem. On the plus side, the unit test I added is now passing for Python 3.13, so this at least works for that test case.

Copy link
Contributor

@offa offa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have test this PR on Python 3.13 and 3.12. The sorting of the results is different (shouldn't be relevant), but the result list contains the same entries. Thanks! 👍

Copy link
Member

@allburov allburov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It becomes more complex and complex to support all the version :(
Need to think and check how others do it 🤔

@richardxia
Copy link
Contributor Author

Thanks again for all the review. I believe I've addressed all comments, so please let me know if there's anything I else I can do.

@relativityspace-jsmith
Copy link

Tested latest version, it works to upgrade an existing app to python 3.13!

@allburov allburov merged commit 643bc4f into devopshq:python-3-13 Mar 18, 2025
6 checks passed
allburov added a commit that referenced this pull request Apr 8, 2025
* Python 3.13 compatibility. (#473)

Between Python 3.12 and Python 3.13, the internal structure of pathlib
changed, causing the Artifactory URL parsing to fail. `pathlib.PurePath`
previously had a private class attribute, `_flavour`, which has now been
renamed to `parser` and made into a public and documented API.

This renames `_flavour` to `parser` in the ArtifactoryPath subclasses,
but it leaves around a `_flavour` class attribute that is aliased to
`parser` as a compatibility shim for older versions of Python.

One other breakage between Python 3.12 and 3.13 is that the artifactory
package attempted to import `posixpath` via the `pathlib` package.
`posixpath` was never meant to be a publicly accessible attribute of
`pathlib`, as `posixpath` is its own top-level package in the standard
library. The `pathlib` code was significantly restructured, causing the
`posixpath` module to no longer be accessible under `pathlib`. We fix
this in artifactory by directly importing the top-level `posixpath`
package.

Finally, this adds Python 3.13 to the package metadata in setup.py and
the tox and GitHub Actions configuration files so that it is officially
declared as a supported Python version and tested in CI.

* Replace _make_child_relpath() with public joinpath() (#475)

* Add unit test for `glob()` and partially fix Python 3.13 case. (#476)

* Add unit test for `glob()` and partially fix Python 3.13 case.

* Add custom subclass of _Globber to fix remaining issues with glob behavior.

* Guard _globber override with Python version check.

Co-authored-by: allburov <[email protected]>

* Rearrange `if` statement to prefer early returns.

Co-authored-by: allburov <[email protected]>

* Add comment linking to original code for recursive_selector().

---------

Co-authored-by: allburov <[email protected]>

---------

Co-authored-by: Richard Xia <[email protected]>
Co-authored-by: offa <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

5 participants