The initial layer archiving mode is a fully concrete "deployment" archiving model: each layer definition in the stack specification corresponds to a deployment artifact that must be installed on the target system. If one package in a layer needs to be updated, the entire layer needs to be republished.
The advantage of this model is that the deployment artifacts are self-contained: if you have the required artifacts, you can create the environment on a target system without any further need for internet access. It also entirely avoids the need to have Python package installation tools installed in the deployed environments.
While this mechanism works, it comes with assorted practical complications around managing the versioning and potential parallel installation of the framework layers, as well as ensuring all the required framework layers for a given application layer are installed prior to attempting to use it. There's also runtime complexity around ensuring all the deployed environments are accessible for Python imports and for shared object loading.
However, there's a potential way to decouple the actual deployment artifacts from the stack definitions used to determine the lock files for each layer. Specifically:
- runtime layers change to ship with `uv` as the only additional package installed in the runtime environment. The other runtime dependency declarations are still used to constrain the locking of any upper layers, but they're no longer shipped in the runtime layer itself.
- a `uv.toml` configuration file definition is also included in the runtime archive, defining how to retrieve individual packages based on the lock time config (a hypothetical sketch of this file appears after the list). This config file will also point `uv`'s caching definition to a defined location (defaulting to `var/cache/uv` inside the runtime layer archive, as that should consistently be on the same file system as the installed application layer environments, which is needed for `uv`'s package caching mechanism to be fully effective)
- framework layers no longer produce deployment artifacts at all. They're solely used to constrain the other layers that depend on them to a common set of package versions at lock time.
- the full requirements lock file is added to the built application layer archives
- application layers change to deploy conventional isolated virtual environments with no `sys.path` and dynamic library loading path customisations. Their post-installation scripts still adjust the virtual environment for its deployment location, and then do something new: they use the base environment's `uv` installation (and its `uv` config file) to install the locked dependencies for that application environment, relying on `uv`'s caching mechanism to ensure only one copy of each required package is downloaded and installed (see the installation sketch below).
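To make the `uv.toml` bullet above concrete, here's a hypothetical sketch of a build-time helper that could emit the bundled config file. The index URL, the helper name, and the relative cache path resolution are all assumptions for illustration; `cache-dir` and the `[pip]` table's `index-url` setting are real `uv` configuration options:

```python
# Hypothetical build-time helper that bundles a uv.toml into a runtime
# layer directory before archiving. The config keys are real uv settings;
# the URL, paths, and helper name are illustrative assumptions.
from pathlib import Path

UV_CONFIG = """\
# Keep uv's package cache inside the runtime layer so it lands on the
# same filesystem as the application environments it will populate
# (relative path resolution here is an assumption).
cache-dir = "./var/cache/uv"

# Install from a controlled index server (placeholder URL).
[pip]
index-url = "https://packages.example.com/simple"
"""

def bundle_uv_config(runtime_layer_dir: Path) -> Path:
    config_path = runtime_layer_dir / "uv.toml"
    config_path.write_text(UV_CONFIG, encoding="utf-8")
    return config_path
```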
`uv` can be used as the installer in this approach, as these are regular isolated virtual environments, so they won't be affected by astral-sh/uv#2500.
The environments deployed this way will also be complete (with their RECORD files and executable wrappers fully populated), rather than having certain files that potentially encode absolute paths (or their hashes) deleted.
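As a sketch of what that post-installation step could look like (not the project's actual script; the lock file name, POSIX environment layout, and `UV_CONFIG_FILE` wiring are assumptions, while `uv pip sync` and its `--python` option are real):

```python
# Hedged sketch of the new post-installation step: use the runtime
# layer's uv (and its bundled config) to populate an application
# environment from its lock file.
import os
import subprocess
from pathlib import Path

def install_locked_requirements(app_env: Path, runtime_env: Path) -> None:
    uv = runtime_env / "bin" / "uv"            # uv shipped in the runtime layer
    uv_config = runtime_env / "uv.toml"        # bundled index + cache settings
    lock_file = app_env / "requirements.txt"   # full lock file from the archive
    subprocess.run(
        [
            str(uv), "pip", "sync",
            "--python", str(app_env / "bin" / "python"),
            str(lock_file),
        ],
        env={**os.environ, "UV_CONFIG_FILE": str(uv_config)},
        check=True,
    )
```

Using `uv pip sync` rather than `uv pip install` would also remove any packages not listed in the lock file, keeping the deployed environment exactly matched to it.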
Deployments still need to ensure the right runtime is installed for each application layer, but there's no need to identify and install the individual framework layers. Framework versioning also gets a lot easier, since the published application layer archives don't have a runtime dependency on the framework layers in this archiving model - the framework layers are just expressing a set of dependencies that are intentionally being kept to the same versions across different application environments.
This approach also intrinsically allows partial dependency sharing even as layers are updated: if an updated application layer changes only a few of its pinned packages, all of the unchanged packages are installed from the already-populated shared cache rather than being downloaded again.
Compared to the benefits, the list of (mostly arguable) downsides is relatively short:
- needs a Python package installer at installation time (true, but those are rarely far away given `ensurepip` in the standard library, so having one preinstalled in the base runtime environments shouldn't be that significant of a concern)
- needs access to an index server at installation time (true, but it would be possible to make that a controlled server, rather than installing directly from public index servers)
- needs to perform additional work at installation time (arguable, as it doesn't take many saved package downloads from the improved sharing across layer versions to compensate for a lot of local package installations from the shared package cache)
- needs application environments to be installed on the same filesystem as their base runtime environments (true, but these are already required to be installed in the same target folder, so this shouldn't be that significant of a new concern)
- managing the package cache so it doesn't grow excessively large over time may be a hassle (true, but this problem also exists when it comes to managing previously installed framework layers that no application layers are currently referencing; a possible maintenance sketch follows this list)
- ??? anything else I've missed?
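On the cache management point above: `uv` has built-in cache maintenance commands, so the cleanup could be scripted as part of routine deployment maintenance rather than handled manually. A minimal sketch, reusing the assumed runtime layout from the earlier examples:

```python
# Hedged sketch: prune unused entries from the shared uv package cache.
# "uv cache prune" is a real uv subcommand; the layout is assumed.
import os
import subprocess
from pathlib import Path

def prune_package_cache(runtime_env: Path) -> None:
    uv = runtime_env / "bin" / "uv"          # uv shipped in the runtime layer
    uv_config = runtime_env / "uv.toml"      # points cache-dir at the shared cache
    subprocess.run(
        [str(uv), "cache", "prune"],
        env={**os.environ, "UV_CONFIG_FILE": str(uv_config)},
        check=True,
    )
```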