-
-
Notifications
You must be signed in to change notification settings - Fork 594
feat: data and pyi files in the venv #2936
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from all commits
9dd7589
1381bf8
42e5059
d59f85a
714b245
31e16ee
e0affec
c844aa0
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -682,15 +682,30 @@ def _create_venv_symlinks(ctx, venv_dir_map): | |
def _build_link_map(entries): | ||
# dict[str kind, dict[str rel_path, str link_to_path]] | ||
link_map = {} | ||
|
||
# Here we store venv paths by package | ||
# dict[str package, dict[str kind, list[str venv_path]]] | ||
pkg_map = {} | ||
for entry in entries: | ||
kind = entry.kind | ||
kind_map = link_map.setdefault(kind, {}) | ||
if entry.venv_path in kind_map: | ||
# We ignore duplicates by design. The dependency closer to the | ||
# binary gets precedence due to the topological ordering. | ||
continue | ||
else: | ||
kind_map[entry.venv_path] = entry.link_to_path | ||
|
||
# If we detect that we are adding a dist-info for an already existing package | ||
# we need to pop all of the previous symlinks from the link_map | ||
Comment on lines
+693
to
+694
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This changes the semantics to last-wins. We want first-wins so that the topological ordering has meaning. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Could you describe this in terms of what the end user should be able to do? I think we actually want the last wins? I am a little fuzzy about last vs first here. I added a test to what I think should be possible, so maybe we can discuss there. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. A binary should be able to add a dependency closer to it to override behavior of further away targets. "first" refers to topological ordering: the first item is closer. So given this:
Where foo_v1 and foo_v1 both produce a venv symlink for Then foo_v1's entry should be used and foo_v2's ignored. This is because the serialized order of bin's targets (i.e venv_symlinks.to_list()) follows topological order: foo_v1, middle, foo_v2 There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I've added this sort of test and I've found that the current logic is what I need. That is why I have |
||
if entry.venv_path.endswith(".dist-info") and entry.src in pkg_map: | ||
# dist-info will come always first | ||
for kind, dir_paths in pkg_map.pop(entry.src).items(): | ||
for dir_path in dir_paths: | ||
link_map[kind].pop(dir_path) | ||
aignas marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
pkg_venv_paths = pkg_map.setdefault(entry.src, {}).setdefault(entry.kind, []) | ||
pkg_venv_paths.append(entry.venv_path) | ||
|
||
# We overwrite duplicates by design. The dependency closer to the | ||
aignas marked this conversation as resolved.
Show resolved
Hide resolved
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Similar to above: if an entry is being overwritten, it means the first-wins semantics aren't occurring. |
||
# binary gets precedence due to the topological ordering. | ||
# | ||
# This allows us to store only one version of the dist-info that is needed | ||
kind_map[entry.venv_path] = entry.link_to_path | ||
|
||
# An empty link_to value means to not create the site package symlink. | ||
# Because of the topological ordering, this allows binaries to remove | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -67,6 +67,11 @@ the venv to create the path under. | |
|
||
A runfiles-root relative path that `venv_path` will symlink to. If `None`, | ||
it means to not create a symlink. | ||
""", | ||
"src": """ | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. No comment on renaming this field? |
||
:type: str | None | ||
|
||
Represents the PyPI package that the code originates from. | ||
""", | ||
"venv_path": """ | ||
:type: str | ||
|
@@ -298,17 +303,6 @@ This field is currently unused in Bazel and may go away in the future. | |
|
||
A depset with `topological` ordering. | ||
|
||
|
||
Tuples of `(runfiles_path, site_packages_path)`. Where | ||
* `runfiles_path` is a runfiles-root relative path. It is the path that | ||
has the code to make importable. If `None` or empty string, then it means | ||
to not create a site packages directory with the `site_packages_path` | ||
name. | ||
* `site_packages_path` is a path relative to the site-packages directory of | ||
the venv for whatever creates the venv (typically py_binary). It makes | ||
the code in `runfiles_path` available for import. Note that this | ||
is created as a "raw" symlink (via `declare_symlink`). | ||
|
||
:::{include} /_includes/experimental_api.md | ||
::: | ||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -41,6 +41,7 @@ load( | |
"runfiles_root_path", | ||
) | ||
load(":flags.bzl", "AddSrcsToRunfilesFlag", "PrecompileFlag", "VenvsSitePackages") | ||
load(":normalize_name.bzl", "normalize_name") | ||
load(":precompile.bzl", "maybe_precompile") | ||
load(":py_cc_link_params_info.bzl", "PyCcLinkParamsInfo") | ||
load(":py_info.bzl", "PyInfo", "VenvSymlinkEntry", "VenvSymlinkKind") | ||
|
@@ -206,16 +207,32 @@ Source files are no longer added to the runfiles directly. | |
::: | ||
""" | ||
|
||
def _get_distinfo_metadata(ctx): | ||
data = ctx.files.data or [] | ||
for d in data: | ||
# work on case insensitive FSes | ||
if d.basename.lower() != "metadata": | ||
continue | ||
|
||
if d.dirname.endswith(".dist-info"): | ||
return d | ||
|
||
return None | ||
|
||
def _get_imports_and_venv_symlinks(ctx, semantics): | ||
imports = depset() | ||
venv_symlinks = depset() | ||
if VenvsSitePackages.is_enabled(ctx): | ||
venv_symlinks = _get_venv_symlinks(ctx) | ||
dist_info_metadata = _get_distinfo_metadata(ctx) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. nit: move the get_distinfo_metadata() call inside the _get_venv_symlinks function. The value isn't used outside that function and that function already has the necessary inputs to call it. |
||
venv_symlinks = _get_venv_symlinks( | ||
ctx, | ||
dist_info_metadata, | ||
) | ||
else: | ||
imports = collect_imports(ctx, semantics) | ||
return imports, venv_symlinks | ||
|
||
def _get_venv_symlinks(ctx): | ||
def _get_venv_symlinks(ctx, dist_info_metadata): | ||
imports = ctx.attr.imports | ||
if len(imports) == 0: | ||
fail("When venvs_site_packages is enabled, exactly one `imports` " + | ||
|
@@ -254,16 +271,41 @@ def _get_venv_symlinks(ctx): | |
repo_runfiles_dirname = None | ||
dirs_with_init = {} # dirname -> runfile path | ||
venv_symlinks = [] | ||
for src in ctx.files.srcs: | ||
if src.extension not in PYTHON_FILE_EXTENSIONS: | ||
continue | ||
package = None | ||
if dist_info_metadata: | ||
# in order to be able to have replacements in the venv, we have to add a | ||
# third value into the venv_symlinks, which would be the normalized | ||
# package name. This allows us to ensure that we can replace the `dist-info` | ||
# directories by checking if the package key is there. | ||
dist_info_dir = paths.basename(dist_info_metadata.dirname) | ||
package, _, _suffix = dist_info_dir.rpartition(".dist-info") | ||
package, _, _version = package.rpartition("-") | ||
package = normalize_name(package) | ||
|
||
repo_runfiles_dirname = runfiles_root_path(ctx, dist_info_metadata.short_path).partition("/")[0] | ||
venv_symlinks.append(VenvSymlinkEntry( | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. There needs to be a comment somewhere around here that the particular ordering matters -- the dist-info must come before the other symlinks to ensure things are properly respected. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is no longer true. It can come anywhere, I think. |
||
kind = VenvSymlinkKind.LIB, | ||
link_to_path = paths.join(repo_runfiles_dirname, site_packages_root, dist_info_dir), | ||
src = package, | ||
venv_path = dist_info_dir, | ||
)) | ||
|
||
for src in ctx.files.srcs + ctx.files.data: | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Something seems wrong here, but I can't quite put my finger on it. Files in srcs have to have the init.py logic applied to detect namespace package boundaries to auto-detect the proper paths to create. Files in data shouldn't be part of that logic. If the file is within a directory covered by something in srcs (i.e *.py), then there's no need to look at the data file. If the data file isn't in a directory covered by something in srcs, then it shouldn't treat a .so/.pyi/.pyc file as some sort of boundary marker. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Talking about special treatment, I don't think I understand why we need it. We could just treat everything as just files and interleave the directories? There is nothing special in my mind here. Beforehand we had the auto-generated There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
You mean, for every file, create a venv_symlink entry? That won't scale because it doubles the number of files a binary has to materialize, when all that's really needed is e.g. one symlink to a directory. However, because of namespace packages, we can't simply use the directories directly under There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. What if we treat everything as
I am wondering how we can do this? |
||
path = _repo_relative_short_path(src.short_path) | ||
if not path.startswith(site_packages_root): | ||
continue | ||
path = path.removeprefix(site_packages_root) | ||
dir_name, _, filename = path.rpartition("/") | ||
|
||
if dir_name and filename.startswith("__init__."): | ||
if src.extension not in PYTHON_FILE_EXTENSIONS: | ||
if dir_name.endswith(".dist-info"): | ||
# we have already handled the stuff | ||
pass | ||
elif dir_name: | ||
# TODO @aignas 2025-05-30: is this the right way? | ||
dirs_with_init[dir_name] = None | ||
repo_runfiles_dirname = runfiles_root_path(ctx, src.short_path).partition("/")[0] | ||
elif dir_name and filename.startswith("__init__."): | ||
dirs_with_init[dir_name] = None | ||
repo_runfiles_dirname = runfiles_root_path(ctx, src.short_path).partition("/")[0] | ||
elif not dir_name: | ||
|
@@ -274,6 +316,7 @@ def _get_venv_symlinks(ctx): | |
venv_symlinks.append(VenvSymlinkEntry( | ||
kind = VenvSymlinkKind.LIB, | ||
link_to_path = paths.join(repo_runfiles_dirname, site_packages_root, filename), | ||
src = package, | ||
venv_path = filename, | ||
)) | ||
|
||
|
@@ -295,6 +338,7 @@ def _get_venv_symlinks(ctx): | |
venv_symlinks.append(VenvSymlinkEntry( | ||
kind = VenvSymlinkKind.LIB, | ||
link_to_path = paths.join(repo_runfiles_dirname, site_packages_root, dirname), | ||
src = package, | ||
venv_path = dirname, | ||
)) | ||
return venv_symlinks | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,14 @@ | ||
load("@rules_python//python:py_library.bzl", "py_library") | ||
|
||
package(default_visibility = ["//visibility:public"]) | ||
|
||
py_library( | ||
name = "simple_v1", | ||
srcs = glob(["site-packages/**/*.py"]), | ||
data = glob( | ||
["**/*"], | ||
exclude = ["site-packages/**/*.py"], | ||
), | ||
experimental_venvs_site_packages = "@rules_python//python/config_settings:venvs_site_packages", | ||
imports = [package_name() + "/site-packages"], | ||
) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
inside is v1 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
__version__ = "1.0.0" |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
# Intentionally empty |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,15 @@ | ||
load("@rules_python//python:py_library.bzl", "py_library") | ||
|
||
package(default_visibility = ["//visibility:public"]) | ||
|
||
py_library( | ||
name = "simple_v2", | ||
srcs = glob(["site-packages/**/*.py"]), | ||
data = glob( | ||
["**/*"], | ||
exclude = ["site-packages/**/*.py"], | ||
), | ||
experimental_venvs_site_packages = "@rules_python//python/config_settings:venvs_site_packages", | ||
imports = [package_name() + "/site-packages"], | ||
pyi_srcs = glob(["**/*.pyi"]), | ||
) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
inside is v1 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
Intentionally empty | ||
rickeylev marked this conversation as resolved.
Show resolved
Hide resolved
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
__version__ = "2.0.0" |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
# Intentionally empty |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The type description makes it clear they're keyed by package, so no need to restate it in prose.