Replace VenvPex with PEX's native --sh-boot #21177

Open
@huonw

Is your feature request related to a problem? Please describe.

Pants has a concept of a VenvPex, used for running internal tools and similar processes, that pokes into PEX internals to minimise start-up time. I think PEX now supports this more directly via its --sh-boot argument.

Removing the custom code would make Pants simpler and more reliable.

Describe the solution you'd like

  1. Verify a --sh-boot-built PEX behaves similarly to the VenvPex
  2. Replace uses of VenvPex in this repository with --sh-boot-built PEXes
  3. Deprecate VenvPex and its support to help plugin authors migrate
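
Step 1 could start with a rough start-up timing comparison. The following is only a sketch of one way to do that: `median_startup_seconds` is a hypothetical helper, not part of Pants or PEX, and it measures whole-process wall-clock time rather than isolating the bootstrap overhead.

```python
import statistics
import subprocess
import time
from typing import Sequence


def median_startup_seconds(argv: Sequence[str], runs: int = 10) -> float:
    """Return the median wall-clock time, in seconds, of running `argv` to completion.

    Hypothetical helper for illustration only; not Pants or PEX API.
    """
    if runs < 1:
        raise ValueError("runs must be at least 1")
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        # Discard output so terminal I/O does not skew the measurement.
        subprocess.run(
            argv,
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)
```

Calling it once with the VenvPex shim command and once with a `--sh-boot`-built PEX of the same tool (both paths and any tool arguments being placeholders supplied by whoever runs the check) would give two numbers to compare. The first invocation of each should be run separately beforehand so that one-off venv creation is not counted.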

Describe alternatives you've considered

N/A

Additional context

Reference code:

  • Support --venv mode for internal PEXes. #11557
  • docs (relevant code surrounds it)
    # VenvPex is motivated by improving performance of Python tools by eliminating traditional PEX
    # file startup overhead.
    #
    # To achieve the minimal overhead (on the order of 1ms) we discard:
    # 1. Using Pex default mode:
    # Although this does reduce initial tool execution overhead, it still leaves a minimum
    # O(100ms) of overhead per subsequent tool invocation. Fundamentally, Pex still needs to
    # execute its `sys.path` isolation bootstrap code in this case.
    # 2. Using the Pex `venv` tool:
    # The idea here would be to create a tool venv as a Process output and then use the tool
    # venv as an input digest for all tool invocations. This was tried and netted ~500ms of
    # overhead over raw venv use.
    #
    # Instead we use Pex's `--venv` mode. In this mode you can run the Pex file and it will create a
    # venv on the fly in the PEX_ROOT as needed. Since the PEX_ROOT is a named_cache, we avoid the
    # digest materialization overhead present in 2 above. Since the venv is naturally isolated we
    # avoid the `sys.path` isolation overhead of Pex itself present in 1 above.
    #
    # This does leave O(50ms) of overhead though for the PEX bootstrap code to detect an already
    # created venv in the PEX_ROOT and re-exec into it. To eliminate this overhead we execute the
    # `pex` venv script in the PEX_ROOT directly. This is not robust on its own though, since the
    # named caches store might be pruned at any time. To guard against that case we introduce a shim
    # bash script that checks to see if the `pex` venv script exists in the PEX_ROOT and re-creates
    # the PEX_ROOT venv if not. Using the shim script to run Python tools gets us down to the ~1ms
    # of overhead we currently enjoy.
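
The guard described in the last paragraph of that comment can be sketched roughly as follows. This is an illustration in Python, not the actual shim (which the comment says is a bash script): `shim_argv`, `venv_script`, and `recreate_venv` are hypothetical names, and the function returns the command to run instead of exec'ing it so that the logic is easy to test.

```python
import os
from pathlib import Path
from typing import Callable, List, Sequence


def shim_argv(
    venv_script: Path,
    recreate_venv: Callable[[], None],
    args: Sequence[str],
) -> List[str]:
    """Return the argv for running the venv's `pex` script directly.

    Hypothetical sketch: if the script is missing (for example because the
    named cache was pruned), `recreate_venv` is called first to rebuild it.
    """
    if not os.access(venv_script, os.X_OK):
        # Slow path: the venv is absent, so rebuild it before running.
        recreate_venv()
        if not os.access(venv_script, os.X_OK):
            raise FileNotFoundError(f"venv script was not re-created: {venv_script}")
    # Fast path: run the venv script directly, skipping the PEX bootstrap.
    return [str(venv_script), *args]
```

In a real shim the returned command would be exec'd, and `recreate_venv` would run the PEX file itself so that it rebuilds the venv in the PEX_ROOT. Whether `--sh-boot` covers both the fast path and the pruned-cache case is what step 1 above would need to confirm.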

PEX code: pex-tool/pex#1721

There are some issues that might be resolved by eliminating this:
