
Replace VenvPex with PEX's native --sh-boot #21177

Open
@huonw


Is your feature request related to a problem? Please describe.

Pants has a concept of a VenvPex, used for running internal tools and similar processes, that pokes into PEX internals to minimise start-up time. I think PEX now supports this more directly via its --sh-boot argument.

Removing the custom code would make Pants simpler and more reliable.

Describe the solution you'd like

  1. Verify a --sh-boot-built PEX behaves similarly to the VenvPex
  2. Replace uses of VenvPex in this repository with --sh-boot-built PEXes
  3. Deprecate VenvPex and its support to help plugin authors migrate
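For step 1, one part of "behaves similarly" is start-up overhead. A rough way to compare it is to time repeated invocations of each executable and take the median. The following is only a sketch under stated assumptions: the helper name `median_startup_seconds` is hypothetical, it measures wall-clock time of a whole process run rather than PEX bootstrap time alone, and it does not check functional equivalence.

```python
import statistics
import subprocess
import sys
import time
from typing import List


def median_startup_seconds(argv: List[str], runs: int = 10) -> float:
    """Run `argv` `runs` times and return the median wall-clock duration in seconds.

    Hypothetical helper for illustration; not part of Pants or PEX.
    """
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        # Discard output so terminal I/O does not skew the measurement.
        subprocess.run(
            argv,
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        durations.append(time.perf_counter() - start)
    return statistics.median(durations)


def compare_startup(baseline: List[str], candidate: List[str], runs: int = 10) -> float:
    """Return candidate median minus baseline median, in seconds."""
    return median_startup_seconds(candidate, runs) - median_startup_seconds(baseline, runs)
```

Here `baseline` would be the command line that runs a tool through the existing VenvPex shim and `candidate` the command line for the same tool built with `--sh-boot`; both are placeholders that depend on how the PEXes are built. A warm-up run before measuring is advisable, since the first invocation may populate the PEX_ROOT.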

Describe alternatives you've considered

N/A

Additional context

Reference code:

  • Support --venv mode for internal PEXes. #11557
  • docs (relevant code surrounds it)
    # VenvPex is motivated by improving performance of Python tools by eliminating traditional PEX
    # file startup overhead.
    #
    # To achieve the minimal overhead (on the order of 1ms) we discard:
    # 1. Using Pex default mode:
    # Although this does reduce initial tool execution overhead, it still leaves a minimum
    # O(100ms) of overhead per subsequent tool invocation. Fundamentally, Pex still needs to
    # execute its `sys.path` isolation bootstrap code in this case.
    # 2. Using the Pex `venv` tool:
    # The idea here would be to create a tool venv as a Process output and then use the tool
    # venv as an input digest for all tool invocations. This was tried and netted ~500ms of
    # overhead over raw venv use.
    #
    # Instead we use Pex's `--venv` mode. In this mode you can run the Pex file and it will create a
    # venv on the fly in the PEX_ROOT as needed. Since the PEX_ROOT is a named_cache, we avoid the
    # digest materialization overhead present in 2 above. Since the venv is naturally isolated we
    # avoid the `sys.path` isolation overhead of Pex itself present in 1 above.
    #
    # This does leave O(50ms) of overhead though for the PEX bootstrap code to detect an already
    # created venv in the PEX_ROOT and re-exec into it. To eliminate this overhead we execute the
    # `pex` venv script in the PEX_ROOT directly. This is not robust on its own though, since the
    # named caches store might be pruned at any time. To guard against that case we introduce a shim
    # bash script that checks to see if the `pex` venv script exists in the PEX_ROOT and re-creates
    # the PEX_ROOT venv if not. Using the shim script to run Python tools gets us down to the ~1ms
    # of overhead we currently enjoy.
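The guard described in the last paragraph of that comment can be sketched as follows. This is an illustration in Python of the check-then-recreate logic only, not the actual shim, which the comment describes as a bash script. The names `resolve_tool_argv` and `recreate_venv` are hypothetical, and how the venv is actually re-created is left to the caller.

```python
from pathlib import Path
from typing import Callable, List


def resolve_tool_argv(
    venv_script: Path,
    recreate_venv: Callable[[], None],
    args: List[str],
) -> List[str]:
    """Return the argv that runs a tool through its venv script.

    If the script is missing (for example because the named cache was
    pruned), call `recreate_venv` once to rebuild it before returning.
    Hypothetical helper for illustration; not the Pants implementation.
    """
    if not venv_script.is_file():
        # Slow path: the venv is gone, so rebuild it before running.
        recreate_venv()
    if not venv_script.is_file():
        raise FileNotFoundError(f"venv script still missing after re-creation: {venv_script}")
    # Fast path: run the venv script directly, skipping the PEX bootstrap.
    return [str(venv_script), *args]
```

The point of this shape is that the common case costs a single file-existence check, while the re-creation step runs only after the cache has been pruned.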

PEX code: pex-tool/pex#1721

There are some issues that might be resolved by eliminating this:
