Skip to content

Wheelable production install #1039

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 13 commits into
base: main
Choose a base branch
from

Conversation

davidlatwe
Copy link
Contributor

What's changed

  • Able to perform production install with pip + wheel
  • The entry_points is removed from setup.py

So, except creating virtual env and the welcome messages, setup.py is no different than the install.py. Now I think you could have rez production install with simple pip install rez on any environment, virtual or not. And pip uninstall rez can wipe out the installation entirely.

Tested on my Mac and Windows, looks promising.

Now the setup.py is able to make binary scripts by itself and
install them as data_files under rez production dir 'bin/rez'
(or 'Scripts/rez' on Windows). However, console_scripts 'rez'
is conflicting with the rez production dir 'rez'.

Since the previous install.py already removing those scripts
before doing the *patch*, console_scripts should be ignored
in new setup.py.

See AcademySoftwareFoundation#625 and AcademySoftwareFoundation#662 for reason
why adding console_scripts in the first place.
@nerdvegas
Copy link
Contributor

Hey David, yeah it looks promising but we have to tread very carefully with a change like this.

My concern is that this gets us closer to a pip-based production install (which would be amazing btw), however unless I'm mistaken, it still falls short:

  • What if you rez-env into an env with a different major python version? It might work by chance, but now you're exposing a python-X-based executable within a python-Y based env. Seems like a lot of scope for things to go wrong.
  • Right now this will probably work, but only because rez has those embedded vendor libs. I'd like to move away from them at some point, but then we'd be stuffed - we have no guarantee that standard python module lookup will work as expected in a resolved env
  • Plugins will be a problem. We do need to improve the plugin mechanism at some point, and I see part of that as having to install plugins directly into the rez venv. Obviously you can't do that in a pip-based install.

With these things in mind, is it really possible to ever have a safe rez installation via pip? And if not, why make setup.py more complicated and less standard?

@davidlatwe davidlatwe mentioned this pull request Mar 16, 2021
@davidlatwe
Copy link
Contributor Author

Thanks Allan :)

Yeah, I thought the bottom lines are:

  • Rez bin scripts MUST run with Python interpreter flag -E
  • Rez bin scripts MUST NOT live under Python "bin" dir

And we are safe, the rest are the responsibility of users for using pip instead of install.py. :P

And if not, why make setup.py more complicated and less standard?

About this, I thought at least it's better then the current entry_points in setup.py ?

I wasn't meant to get rid of install.py, but to make the pip install have a closer result as install.py does. But just to be sure, when comparing "Pip-based install" and the "production install", does that just mean the differences between entry_points scripts and the -E flag enabled script, or there are something else ?

@nerdvegas
Copy link
Contributor

nerdvegas commented Mar 16, 2021 via email

@davidlatwe
Copy link
Contributor Author

davidlatwe commented Mar 18, 2021

TL;DR

I have come up an approach that use shell scripts to replace binary executable files. And with PEP-427, those shell scripts can always locate correct Python interpreter to run. Which means, the production .whl can be platform independent. However, this approach will make rez must be distributed via wheel, and in rez's functionality, this breaks the previous forwarding script on Windows (no .exe).

Not sure about this kind of big change is acceptable or not.

Install scripts to bin/rez

Looks like there is no way you can intervene where scripts should be installed without giving --install-script flag to setuptools, so providing custom setuptool command class is a must. Luckily, I found there is a much simpler way to add sub-directory for rez bin scripts compares to previous commits.

Ship binary scripts (dead end)

I finally, aware that it was wrong to pre-generated binary scripts into wheel, since those scripts are already been hard-coded with the local executable path.

Then I found that PEP-427 says, wheel can rewrite shebang #!python to point to the correct interpreter on script installation. So I tried making script with distlib.scripts.ScriptMaker and set maker.executable = "python", in result, the python shebang is pointed correctly after installation, however, the interpreter flag -E is removed. So it seems, there's no chance that we can install scripts with interpreter flags. But even it did preserve the flag, the .whl is not platform independent.

To workaround that, I even thought about installing fake entry that will run distlib.scripts.ScriptMaker on first launch to create a real rez bin tool to replace it. But the production install will be ruined if not calling them with Rez venv's python on first launch. :(

No binary scripts

Turns out, the wheel isn't only rewriting shebang for python script files, it rewrites all files that lives in {distribution}-{version}.data/scripts/. (except .exe). So I came up with this :

  • Add shebang #!python into rez production install validation file .rez_production_install for wheel to rewrite
  • Replace each binary script with one shell script and one python script, when calling the shell script, it will read the shebang from .rez_production_install to get Python location and run the same named python script to launch rez tool.
#!/bin/bash
export _rez_python=$(head -1 $(dirname $0)/.rez_production_install)
${_rez_python:2} -E $(dirname $0)/rez-env_.py "$@"

Drawbacks

This change seems not affecting Linux much, but a huge different on Windows is that there are no .exe, which means you cannot do subprocess.Popen(["rez-env"]) but require to subprocess.Popen(["rez-env.cmd"]) or, use rez.utils.execution.Popen(["rez-env"]). And those pre-existing rez suite forwarding script on Windows will not work for same reason.

One other limitation is that rez MUST be distributed with wheel, so I added a pyproject.toml which is the only and correct way to make pip to install wheel before installing rez.

Oh and, the Windows docker image for testing, might need to be updated as well ? Since I have also changed the .github/docker/rez-win-py/entrypoint.ps1.


Although this breaks rez suite on Windows, but I do like to live without those binaries. Maybe make the install.py stay as it was, to get original rez tools ?

What do you think ?

@instinct-vfx
Copy link
Contributor

I am flooded with work currently so i am not sure when i can get around to actually do some testing. That said we have been running Rez through batch wrappers and are trying to get rid of them completely. There are many weird and unexpected side effects of doing so so i would be hesitant. An example would be escaping environment variable tokens. I have a launcher batch file that wrappers the Nuke executable in 8(!) percentage characters to escape and work. That's %%%%%%%%NUKE_EXE%%%%%%%%. There are multiple reasons contributing to that. But the culprit is layered wrappers.

To circumvent this issue another thing we did in the past was maintain a patched custom build of python+setuptools that does not have the full path in shims, but only python. But that's a lot of work to maintain and also has some possibly unexpected implications.

On the other hand it is a far reaching issue that i would love to see resolved. As mentioned on slack the same issue also affects every executable created by installing a pip package. e.g. if you install bumpversion through rez-pip, the resulting bumpversion.exe will be hardwired to the very python package you created it with. With an absolute path embedded. Any ideas would be really really welcome.

We are currently switching to powershell completely. That might bring new possibilities to handle these kinds of situation. But that would also possibly make packages shell dependent.

@davidlatwe
Copy link
Contributor Author

Thanks @instinct-vfx !

An example would be escaping environment variable tokens.

That indeed is something that I haven't fully tested yet.

That said we have been running Rez through batch wrappers and are trying to get rid of them completely.

Sounds like the binary executable solved the "env var token escaping" problem for older version of Rez ?

To be honest I am not very familiar with this kind of issues, but would love to dig down to the past to see what those Rez's "batch wrappers" looks like, were they part of this git repo ?

As mentioned on slack the same issue also affects every executable created by installing a pip package. e.g. if you install bumpversion through rez-pip, the resulting bumpversion.exe will be hardwired to the very python package you created it with. With an absolute path embedded.

So this will not be a problem for new rez installation approach, correct ? At least the absolute path is written in the editable production install validation file.

We are currently switching to powershell completely.

I actually can't do that, the powershell execution has been blocked in my studio completely. :(

@davidlatwe
Copy link
Contributor Author

davidlatwe commented Mar 26, 2021

So I am gonna closing this one now, so no one needs to wast too much time on this and the PR won't hang (the outcome of this PR works just like the old ways in my tests, but with that drawback on Windows and the time it needs for others to test, starts making me worried).

For what it's worth, I'll leaving some notes about what have I learnt on my way here.

While I was digging this hole, I found there were some attempts trying to make pip able to take Python interpreter flag -E into account when installing scripts from pip/wheel, but all failed or stopped :
pypa/pip#1847
pypa/pip#3758
pypa/pip#3759

And from my understanding, the deep reason for that is there's no proper section for storing this kind of information in wheel's metadata. Which may require to address a PEP for that.

Then I found this, pipx, which provides isolated environment for Python apps and that's almost the same as what Rez's post install script does. But it is build on top of pip, so no -E either. However, that could be a suitable place for implementing -E flagged console script without updating wheel's standard, since the goal of that project is to ship Python apps into a production environment hence having additional post install action from there make sense.

But

Before closing this, I'd like to point out one thing that might worth to keep and continue with another PR. Which is commit 132765a.

The production install validation in rez.cli._entry_opints is different from rez.system. In rez.system, it lookup for the validation file starting from the rez module with a presumed directory structure to locate. Which limits the Rez's console scripts' install location.

So I wonder, if there's a validation file which already been verified on entrance, do we still need to crawl the directory structure to verify it again ?

Could be better if Rez remember the production_bin_path on the way in if validation file is found, without having another hidden validation process at another code path.

# in rez.cli._entry_opints

def check_production_install():
    path = os.path.dirname(sys.argv[0])
    filepath = os.path.join(path, ".rez_production_install")

    if os.path.exists(filepath):
        try:
            import rez
            rez.production_bin_path = path  # remember this access is valid
        except ImportError:
            pass
    else:
        sys.stderr.write(
            "Pip-based rez installation detected. blah blah blah.."
        )
# in rez.system

def rez_bin_path(self):
    """Get path containing rez binaries, or None if no binaries are
    available, or Rez is not a production install.
    """
    import rez
    if rez.production_bin_path:
        return rez.production_bin_path
    # continue the old ways or just return None
    ...

@JeanChristopheMorinPerso JeanChristopheMorinPerso added rez-pip ingesting py pkgs into rez (pip, wheels, etc) installer labels Jul 2, 2022
@maxnbk maxnbk requested review from nerdvegas and a team as code owners November 15, 2022 23:58
@JeanChristopheMorinPerso JeanChristopheMorinPerso requested a review from a team as a code owner April 4, 2023 15:30
@JeanChristopheMorinPerso
Copy link
Member

JeanChristopheMorinPerso commented Sep 9, 2023

I was going through our old PRs and stumbled on this PR, and more specifically your latest changes and this comment you made.

No binary scripts

Turns out, the wheel isn't only rewriting shebang for python script files, it rewrites all files that lives in {distribution}-{version}.data/scripts/. (except .exe). So I came up with this :

  • Add shebang #!python into rez production install validation file .rez_production_install for wheel to rewrite
  • Replace each binary script with one shell script and one python script, when calling the shell script, it will read the shebang from .rez_production_install to get Python location and run the same named python script to launch rez tool.
#!/bin/bash
export _rez_python=$(head -1 $(dirname $0)/.rez_production_install)
${_rez_python:2} -E $(dirname $0)/rez-env_.py "$@"

I see exactly where you are going with this and I think I know I we could this even further. What you did there is pretty clever, but a little bit fragile.

We could improve this by replacing the wrapper with a real executable. This could be done by leveraging https://bitbucket.org/vinay.sajip/simple_launcher/src/master/. This is what pip, and all other installers (including https://github.com/pypa/installer) use to create the entry point console scripts at install time.

The way it works is kind of weird, but it's pretty clever. You can see how it can be used at https://github.com/pypa/installer/blob/main/src/installer/scripts.py#L159-L164. You basically have to you compile simple_launcher to generate what's in https://github.com/pypa/installer/tree/main/src/installer/_scripts, and then to create the console scripts executable, all you have to do is to store the content of the appropriate simple_launcher executable and append the shebang on one line + the content of the console script (the python snippet) at the end. You name that file <something>.exe and you get an executable console script on Windows.

In our case, what we could do is slightly modify simple_launcher to make it read the shebang from .rez_production_install (based on the path of the executable, like https://stackoverflow.com/a/933996).

Our wheels wouldn't be pure, but at least it would fix the last remaining problem with your approach.

We could also maybe look at https://github.com/python/cpython/blob/main/PC/launcher.c. conda uses a variant of that file to do something similar. See https://github.com/conda/conda-build/blob/main/conda_build/launcher_sources/cpython-launcher-c-mods-for-setuptools.3.7.patch. Though, that launcher works differently then simple_launcher. You basically don't embed the shebang and the python script in the executable. You basically name the launcher the way you want, and it expects to find a file named <name>-script.py on its side (see https://github.com/conda/conda-build/blob/bf8caf548b951bf0a06320f77e5019bb891ea106/conda_build/windows.py#L63-L70 or https://github.com/conda/conda-build/blob/bf8caf548b951bf0a06320f77e5019bb891ea106/conda_build/_link.py#L107).

And setuptools' own launcher: https://github.com/pypa/setuptools/blob/main/launcher.c. They use https://github.com/pypa/setuptools/blob/main/tools/build_launchers.py to build the launchers.

@JeanChristopheMorinPerso
Copy link
Member

JeanChristopheMorinPerso commented Sep 10, 2023

And it turns out that https://bitbucket.org/vinay.sajip/simple_launcher/src/master/launcher.c has a mode where we don't need to append the script (so similar to the launcher embedded in CPython). See https://bitbucket.org/vinay.sajip/simple_launcher/src/2e1c7592574c4f42062fd3a5b1051ec02da4b543/launcher.c#lines-44. From what I see, when APPENDED_ARCHIVE is not defined, the launcher will simply look for <name>-script.py and it will load that file and read the shebang from it. So I think we wouldn't even need to modify the launcher to do what we want to do! (correction, we would need to comment out https://bitbucket.org/vinay.sajip/simple_launcher/src/2e1c7592574c4f42062fd3a5b1051ec02da4b543/launcher.c#lines-38).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
installer rez-pip ingesting py pkgs into rez (pip, wheels, etc)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants