feat: data and pyi files in the venv #2936
This adds the necessary `.dist-info` files into the mix and should get us reasonably close to handling 99% of the cases.

The expected differences from this and a `venv` built by `uv` would be:
* Shared libraries are stored in `<package>.libs` in `uv` venvs. This can be achieved in `rules_python` by changing the `installer` settings in the `wheel_installer/wheel.py#unzip` function.
* The `RECORD` files are excluded from the `venv`s for better cache hit rate in `bazel`, however I am not sure if we should do that for actual wheels that are downloaded from the internet.

Tested:
- [x] Building the `//docs` and manually checking the symlinks.
- [ ] Unit tests

Work towards bazel-contrib#2156
I am thinking I should also link other things passed as I think excluding anything outside |
I'll push some of the changes later, but I am having a hard time writing a test that would fulfill some of the requirements in the PR todo list.
re: "src" field name: Let's pick a better name. "src" is an established name in bazel/starlark, so I don't think it should be overloaded.
- package
- dist
- link_group
- group_name
- group
- group_key1 (e.g. "simple"), group_key2 (e.g. "1.0.0")
- dist, version (simple, 1.0)
?
The basic change in logic here is associating a set of paths with a dist-info name, right? And only one of those sets of paths will be used? The idea being, given:
("foo", <foo v1 paths>)
("foo", <foo v2 paths>)
Only one of those tuples of info is used.
@@ -682,15 +682,30 @@ def _create_venv_symlinks(ctx, venv_dir_map):
def _build_link_map(entries):
# dict[str kind, dict[str rel_path, str link_to_path]]
link_map = {}

# Here we store venv paths by package
The type description makes it clear they're keyed by package, so no need to restate it in prose.
# Here we store venv paths by package
# If we detect that we are adding a dist-info for an already existing package
# we need to pop all of the previous symlinks from the link_map
This changes the semantics to last-wins. We want first-wins so that the topological ordering has meaning.
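To make the first-wins request concrete, here is a minimal Python sketch (not the PR's Starlark). The `Entry` tuple, its `dist` and `version` fields, and the `build_link_map_first_wins` helper are hypothetical names used only for illustration; the sketch assumes entries arrive in the same topological order the depset would provide.

```python
from collections import namedtuple

# Hypothetical stand-in for the PR's symlink entry; field names are assumptions.
Entry = namedtuple("Entry", ["dist", "version", "venv_path", "link_to_path"])


def build_link_map_first_wins(entries):
    """Return {venv_path: link_to_path}, honoring the first dist version seen."""
    link_map = {}
    chosen_version = {}  # dist name -> version of the first entry seen for it
    for entry in entries:
        # setdefault only stores the version when the dist has not been seen yet.
        winner = chosen_version.setdefault(entry.dist, entry.version)
        if winner != entry.version:
            # A different version of this dist already won; drop this entry.
            continue
        # Never overwrite an existing path: the earlier entry keeps it.
        link_map.setdefault(entry.venv_path, entry.link_to_path)
    return link_map


entries = [
    Entry("foo", "1.0", "foo/__init__.py", "repo_v1/foo/__init__.py"),
    Entry("foo", "2.0", "foo/__init__.py", "repo_v2/foo/__init__.py"),
    Entry("foo", "2.0", "foo/extra.py", "repo_v2/foo/extra.py"),
    Entry("bar", "0.1", "bar/__init__.py", "repo_bar/bar/__init__.py"),
    Entry("foo", "1.0", "foo/util.py", "repo_v1/foo/util.py"),
]
result = build_link_map_first_wins(entries)
```

With this input, every `foo` path comes from the `1.0` entries and `foo/extra.py` is absent, because nothing is ever popped or overwritten once a dist has been claimed.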
pkg_venv_paths = pkg_map.setdefault(entry.src, {}).setdefault(entry.kind, [])
pkg_venv_paths.append(entry.venv_path)

# We overwrite duplicates by design. The dependency closer to the
Similar to above: if an entry is being overwritten, it means the first-wins semantics aren't occurring.
def _get_imports_and_venv_symlinks(ctx, semantics):
imports = depset()
venv_symlinks = depset()
if VenvsSitePackages.is_enabled(ctx):
venv_symlinks = _get_venv_symlinks(ctx)
dist_info_metadata = _get_distinfo_metadata(ctx)
nit: move the get_distinfo_metadata() call inside the _get_venv_symlinks function. The value isn't used outside that function and that function already has the necessary inputs to call it.
package = normalize_name(package)

repo_runfiles_dirname = runfiles_root_path(ctx, dist_info_metadata.short_path).partition("/")[0]
venv_symlinks.append(VenvSymlinkEntry(
There needs to be a comment somewhere around here that the particular ordering matters -- the dist-info must come before the other symlinks to ensure things are properly respected.
venv_path = dist_info_dir,
))

for src in ctx.files.srcs + ctx.files.data:
Something seems wrong here, but I can't quite put my finger on it.
Files in `srcs` need the `__init__.py` logic applied, which detects namespace package boundaries so the proper paths to create can be auto-detected.
Files in `data` shouldn't be part of that logic. If the file is within a directory covered by something in `srcs` (i.e. `*.py`), then there's no need to look at the data file. If the data file isn't in a directory covered by something in `srcs`, then it shouldn't treat a `.so`/`.pyi`/`.pyc` file as some sort of boundary marker.
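A rough Python sketch of that split, under stated assumptions: paths are site-packages-relative, a "package root" is simplified to a directory that has an `__init__.py` while its parent does not, and linking an uncovered file individually is a guess at the desired behavior rather than something settled in this thread. The helper names `package_roots`, `is_covered`, and `plan_links` are hypothetical.

```python
import posixpath


def package_roots(src_paths):
    """Directories holding an __init__.py whose parent directory holds none."""
    init_dirs = {
        posixpath.dirname(path)
        for path in src_paths
        if posixpath.basename(path) == "__init__.py" and posixpath.dirname(path)
    }
    return {d for d in init_dirs if posixpath.dirname(d) not in init_dirs}


def is_covered(path, roots):
    """True if path lives under one of the package root directories."""
    return any(path.startswith(root + "/") for root in roots)


def plan_links(src_paths, data_paths):
    """Return the venv paths to symlink; only srcs can define boundaries."""
    roots = package_roots(src_paths)  # data files are deliberately ignored here
    links = sorted(roots)
    for path in list(src_paths) + list(data_paths):
        if not is_covered(path, roots):
            # Assumption: files outside every package root are linked one by one.
            links.append(path)
    return links


srcs = ["foo/__init__.py", "foo/sub/__init__.py", "foo/mod.py", "ns/pkg/__init__.py"]
data = ["foo/_native.so", "foo/mod.pyi", "other/lib.so"]
links = plan_links(srcs, data)
```

Here `foo/_native.so` and `foo/mod.pyi` need no entry of their own because the `foo` directory link already covers them, while `other/lib.so` is linked as a single file and does not turn `other` into a boundary.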
@@ -67,6 +67,11 @@ the venv to create the path under.

A runfiles-root relative path that `venv_path` will symlink to. If `None`,
it means to not create a symlink.
""",
"src": """
No comment on renaming this field?
This adds the remaining files into the venv and should get us
reasonably close to handling 99% of the cases.

The expected differences from this and a `venv` built by `uv` would be:
* The `RECORD` files are excluded from the `venv`s for better cache hit rate in `bazel`, however I am not sure if we should do that for actual wheels that are downloaded from the internet.

Tested:
- Building `//docs` and manually checking the symlinks.
- `.pyi` files get included.
- `.dist_info` gets included.

Work towards #2156