Description
For various reasons, it is sometimes useful to add a direct wheel URL to the requirements set for a layer. For example:
[[frameworks]]
name = "torch-cu124-win-x86@0"
runtime = "cpython3.11-win-x86@1"
requirements = [
"torch @ https://download.pytorch.org/whl/cu124/torch-2.6.0%2Bcu124-cp311-cp311-win_amd64.whl",
"accelerate",
]
platforms = ["win_amd64"]
This technique turns out to have a few issues:
- using a direct URL reference here means that uv pip compile has to download the entire wheel just to query its dependency metadata
- it's not obvious that the direct URL reference should include the optional hash fragment in the URL string
- it's not obvious that the wheel reference should be accompanied by a regular version pin for the affected library (torch==2.6.0 in the example) so layers that depend on this one inherit the relevant version constraint
- the wheels involved are often going to be platform specific, which means messing about with environment markers, or defining the same layer multiple times, once for each platform (the example shows the latter approach)
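To illustrate the second and third points, here is a minimal Python sketch of how both requirement strings could be derived from a single wheel URL under the current format. The function name requirement_entries is hypothetical (it is not part of venvstacks), the digest is a placeholder, and the file name parsing assumes a wheel name without a build tag.

```python
from urllib.parse import unquote, urlsplit


def requirement_entries(wheel_url: str, sha256: str) -> list[str]:
    """Build the direct URL reference and matching version pin for one wheel.

    Hypothetical helper for illustration only; not part of venvstacks.
    """
    # Wheel file names follow "{name}-{version}-{python}-{abi}-{platform}.whl"
    # (an optional build tag after the version is not handled in this sketch).
    filename = unquote(urlsplit(wheel_url).path.rsplit("/", 1)[-1])
    name, version = filename.split("-")[:2]
    # Drop the local version label ("+cu124") so the pin reads "torch==2.6.0",
    # matching the pin mentioned in the list above.
    public_version = version.partition("+")[0]
    return [
        # The hash goes in the URL fragment of the direct reference
        f"{name} @ {wheel_url}#sha256={sha256}",
        # The plain pin lets dependent layers inherit the version constraint
        f"{name}=={public_version}",
    ]


if __name__ == "__main__":
    url = (
        "https://download.pytorch.org/whl/cu124/"
        "torch-2.6.0%2Bcu124-cp311-cp311-win_amd64.whl"
    )
    placeholder_digest = "0" * 64  # not a real hash
    for entry in requirement_entries(url, placeholder_digest):
        print(entry)
```

Both strings would then go into the layer's requirements list alongside any other entries such as accelerate.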
At least some of these issues could potentially be mitigated by adding a dedicated wheel_overrides key, allowing layer declarations along the lines of the following:
[[frameworks]]
name = "torch-cu124"
runtime = "cpython3.11"
requirements = [
"accelerate",
"torch==2.6.0",
]
wheel_overrides = [
# TOML 1.0 inline tables cannot span lines, so each override stays on one line
{project="torch", archive_url="https://download.pytorch.org/whl/cu124/torch-2.6.0%2Bcu124-cp311-cp311-win_amd64.whl", archive_hashes={sha256="..."}, platforms=["win_amd64"]},
{project="torch", archive_url="https://download.pytorch.org/whl/cu124/torch-2.6.0%2Bcu124-cp311-cp311-linux_x86_64.whl", archive_hashes={sha256="..."}, platforms=["linux_x86_64"]},
]
venvstacks would add the appropriate direct URL references for each layer to the input requirements for that layer (this is necessary to ensure the hashes in the compiled requirements include the hash of the specified installation artifact).
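As a rough sketch of that expansion step (not the actual venvstacks implementation), the following Python function takes a layer's declared requirements and its wheel_overrides entries as plain dictionaries and returns the input requirements for one target platform. The function name apply_wheel_overrides and the choice to match platforms by exact string comparison are assumptions made for illustration; the digests are placeholders.

```python
def apply_wheel_overrides(requirements, wheel_overrides, target_platform):
    """Return the requirements plus direct URL references for one platform.

    Hypothetical helper: the field names mirror the proposed wheel_overrides
    entries, but the matching rules here are an assumption.
    """
    combined = list(requirements)
    for override in wheel_overrides:
        # Skip overrides that do not apply to the platform being built
        if target_platform not in override["platforms"]:
            continue
        # Use the first declared hash as the URL fragment
        algorithm, digest = next(iter(override["archive_hashes"].items()))
        combined.append(
            f'{override["project"]} @ {override["archive_url"]}#{algorithm}={digest}'
        )
    return combined


if __name__ == "__main__":
    base = "https://download.pytorch.org/whl/cu124/"
    overrides = [
        {
            "project": "torch",
            "archive_url": base + "torch-2.6.0%2Bcu124-cp311-cp311-win_amd64.whl",
            "archive_hashes": {"sha256": "a" * 64},  # placeholder digest
            "platforms": ["win_amd64"],
        },
        {
            "project": "torch",
            "archive_url": base + "torch-2.6.0%2Bcu124-cp311-cp311-linux_x86_64.whl",
            "archive_hashes": {"sha256": "b" * 64},  # placeholder digest
            "platforms": ["linux_x86_64"],
        },
    ]
    declared = ["accelerate", "torch==2.6.0"]
    for line in apply_wheel_overrides(declared, overrides, "linux_x86_64"):
        print(line)
```

Keeping the plain torch==2.6.0 pin in the declared requirements means the same layer definition serves every platform, with only the appended direct reference varying.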
We may want to add a hash-archive helper subcommand to generate hashes from direct URL references. Given the PyTorch use case, the algorithm for that could include the following potential trick:
- if the URL domain is download.pytorch.org, partition the trailing segment of the URL at the + (%2B) symbol, then reverse partition the first segment of that at the last - symbol
- query the resulting URL (https://download.pytorch.org/whl/cu124/torch for the above example)
- look for an href to the original file name, and get the hash from that instead of downloading the file and hashing it locally
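The steps above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names project_index_url and hash_from_index_page are hypothetical, the index page is assumed to publish each hash as the URL fragment of the matching link (for example #sha256=...), and fetching the page itself is left out so the sketch stays free of network access.

```python
from html.parser import HTMLParser
from urllib.parse import unquote, urlsplit


def project_index_url(wheel_url):
    """Derive the per-project index URL from a PyTorch wheel URL."""
    if urlsplit(wheel_url).hostname != "download.pytorch.org":
        return None  # the trick only applies to this domain
    base, _, filename = wheel_url.rpartition("/")
    # "torch-2.6.0%2Bcu124-..." -> "torch-2.6.0" (partition at "+")
    name_and_version = unquote(filename).partition("+")[0]
    # "torch-2.6.0" -> "torch" (reverse partition at the last "-")
    project = name_and_version.rpartition("-")[0]
    return f"{base}/{project}"


class _HashFinder(HTMLParser):
    """Record the URL fragment of the link that points at a given file."""

    def __init__(self, filename):
        super().__init__()
        self.filename = filename
        self.hash_fragment = None

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        link = urlsplit(dict(attrs).get("href") or "")
        linked_name = unquote(link.path).rsplit("/", 1)[-1]
        if linked_name == self.filename and link.fragment:
            self.hash_fragment = link.fragment


def hash_from_index_page(page_html, wheel_url):
    """Return e.g. "sha256=..." for the wheel, or None if no link matches."""
    filename = unquote(urlsplit(wheel_url).path.rsplit("/", 1)[-1])
    finder = _HashFinder(filename)
    finder.feed(page_html)
    return finder.hash_fragment
```

A caller would fetch the page at project_index_url(wheel_url), pass its text to hash_from_index_page, and fall back to downloading and hashing the wheel locally whenever either function returns None.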