Description
Issue
When creating a virtual environment, the base prefix of the virtual environment should be the same as the base prefix of the Python that is creating it - there should be no additional symlink resolution steps.
This behaviour is seen in venv, and virtualenv behaved the same way until #2686, which started resolving symlinks.
Why is this important?
In big (scientific) institutions, it is common to have a network-mounted filesystem which can be accessed from all managed machines. This could be to mount a homespace, or to mount some data etc. (I've seen both). To scale this up, it is necessary to have multiple filesystem servers all using the same underlying storage. Machines are then clustered to point to different servers, but the user doesn't know which server a machine is talking to. When combining this with something such as autofs, you find that the server being contacted is in the path (e.g. /nfs/some-machine/), and to smooth this out across machines, the managed machines get a canonical symlink (e.g. /project/{my-project}, which symlinks to the specific machine mountpoint). Essentially:
machine1:
ls -ltr /project/my-project -> /nfs/nfs-server1
machine2:
ls -ltr /project/my-project -> /nfs/nfs-server2
With this setup, you can create a virtual environment in /project/my-project which works on both machines if and only if the symlink is not resolved. This is the behaviour of CPython, and also the behaviour of venv and virtualenv<20.25.1.
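To illustrate why resolving the symlink breaks this setup, here is a minimal sketch that simulates the two machines inside a temporary directory. The directory names (nfs-server1, nfs-server2, project) are stand-ins for the mounts described above, and the helper `recorded_paths` is hypothetical, not part of virtualenv. It only contrasts `os.path.abspath` (symlink kept) with `os.path.realpath` (symlink resolved):

```python
import os
import tempfile


def recorded_paths(link_path):
    """Return the (unresolved, resolved) forms of the same path."""
    return os.path.abspath(link_path), os.path.realpath(link_path)


with tempfile.TemporaryDirectory() as root:
    # Normalise the temp dir itself so only our own symlink matters.
    root = os.path.realpath(root)
    server1 = os.path.join(root, "nfs-server1")
    server2 = os.path.join(root, "nfs-server2")
    os.mkdir(server1)
    os.mkdir(server2)

    # "machine1": the canonical link points at server1.
    project = os.path.join(root, "project")
    os.symlink(server1, project)
    unresolved_1, resolved_1 = recorded_paths(project)

    # "machine2": the same canonical link points at server2.
    os.remove(project)
    os.symlink(server2, project)
    unresolved_2, resolved_2 = recorded_paths(project)

print("unresolved:", unresolved_1, unresolved_2)
print("resolved:  ", resolved_1, resolved_2)
```

The unresolved path is identical on both simulated machines, whereas the resolved path names a specific server. A path recorded in its resolved form on machine1 therefore would not exist on machine2.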
Reproducer
Written in the form of a pytest test (with validation against venv as well):
import os
import pathlib
import sys
import subprocess
import typing

import pytest


def read_venv_config(venv_prefix: pathlib.Path) -> typing.Dict[str, str]:
    venv_cfg_path = venv_prefix / 'pyvenv.cfg'
    cfg = {}
    with venv_cfg_path.open('rt') as cfg_fh:
        for line in cfg_fh:
            name, _, value = line.partition('=')
            cfg[name.strip()] = value.strip()
    return cfg


@pytest.mark.parametrize("venv_impl", ["venv", "virtualenv"])
def test_symlink_python(tmp_path: pathlib.Path, venv_impl: str) -> None:
    py_link = tmp_path / "some-other-prefix"
    pathlib.Path(py_link).symlink_to(sys.base_prefix)
    dest_venv = tmp_path / "some-venv"
    py_bin = py_link / 'bin' / 'python'
    subprocess.run([py_bin, '-m', venv_impl, str(dest_venv)], check=True)
    cfg = read_venv_config(dest_venv)
    py_base_prefix = subprocess.check_output([py_bin, '-c', 'import sys; print(sys.base_prefix)'], text=True).strip()
    # Get the link of the py bin. Don't recursively resolve this (like pathlib.resolve would do)
    py_bin_link = os.readlink(dest_venv / 'bin' / 'python')
    assert py_bin_link == str(py_link / 'bin' / 'python')
    assert str(py_link) == py_base_prefix
    assert py_base_prefix + '/bin' == cfg['home']
The test passes with virtualenv 20.25.0 and fails in later versions with:
> assert py_base_prefix + '/bin' == cfg['home']
E AssertionError: assert '/tmp/pytest-of-pelson/pytest-401/test_symlink_python_virtualenv0/some-other-prefix/bin' == '/path/to/my/environment/bin'
E - /path/to/my/environment/bin
E + /tmp/pytest-of-pelson/pytest-401/test_symlink_python_virtualenv0/some-other-prefix/bin
It is worth noting that the implementation of #2686 has a bug: the symlink at dest_venv / 'bin' / 'python' points to sys.base_prefix / 'bin' / 'python', and not to the resolved symlink location of sys.base_prefix. Therefore the home value is inconsistent with the symlink that is created by virtualenv. The test validates this.
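The inconsistency can also be checked on an existing environment without running the full test. The sketch below uses two hypothetical helpers, `read_home` and `home_matches_python_link`, which are not part of virtualenv or venv. It assumes a POSIX layout where bin/python is a symlink (not a copy) with an absolute target:

```python
import os
import pathlib


def read_home(venv_prefix: pathlib.Path) -> str:
    """Return the `home` value recorded in pyvenv.cfg."""
    with (venv_prefix / "pyvenv.cfg").open("rt") as cfg_fh:
        for line in cfg_fh:
            name, _, value = line.partition("=")
            if name.strip() == "home":
                return value.strip()
    raise KeyError("home")


def home_matches_python_link(venv_prefix: pathlib.Path) -> bool:
    """True if `home` equals the directory of the bin/python link target.

    os.readlink reads a single level of the link, so any symlinks
    further along the target path are deliberately left unresolved.
    """
    target = os.readlink(venv_prefix / "bin" / "python")
    return os.path.dirname(target) == read_home(venv_prefix)
```

A call such as `home_matches_python_link(pathlib.Path("some-venv"))` returning False would indicate the mismatch described above, where home has been resolved but the link target has not.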
Implications
#2686 closed two issues, #2682 and #2684.
To be honest, I'm not sure what #2682 is asking for. Perhaps it is a request for behaviour that differs from that of venv and the standard library (wrt. sys.base_prefix) @mayeut.
For #2684, I also don't fully understand why this is a virtualenv issue (this is my problem, not a problem with the issue itself) - and somebody who knows what the correct behaviour should be would need to chime in (perhaps @ofek?).