Description
What's the problem this feature will solve?
Splitting out this comment (#11478 (comment)) into its own issue.
There remains one further avenue to investigate from #7819: amortizing/memoizing certain subproblems across separate invocations of the resolver. While I was able to identify one compartmentalized improvement in #7729, I still think pip (and every other package manager right now) spends most of its time recomputing many of the same constraints against each other, over and over again, for most hot-path use cases. For example, after identifying the subset of wheels that are compatible with a given interpreter, that information can be recorded and cached indefinitely by the local pip client, which could then jump immediately to the subset of compatible wheels in future resolves.
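To make that wheel-compatibility example concrete, here is a minimal sketch (not pip's actual implementation; the cache location and helper names are hypothetical) of memoizing which wheel filenames are compatible with the running interpreter, using the `packaging` library that pip already vendors:

```python
import json
from pathlib import Path

from packaging.tags import sys_tags
from packaging.utils import parse_wheel_filename

# Hypothetical cache location; a real implementation would key this
# per-interpreter, since the supported tag set differs across interpreters.
CACHE_PATH = Path("~/.cache/pip-wheel-compat.json").expanduser()

def compatible_wheels(wheel_filenames):
    """Filter wheel filenames by interpreter compatibility, memoizing verdicts."""
    supported = {str(tag) for tag in sys_tags()}
    cache = json.loads(CACHE_PATH.read_text()) if CACHE_PATH.exists() else {}
    compatible = []
    for filename in wheel_filenames:
        if filename not in cache:
            _name, _version, _build, tags = parse_wheel_filename(filename)
            # A wheel's tags are fixed by its filename, so for a fixed
            # interpreter this verdict never changes and can be cached forever.
            cache[filename] = any(str(tag) in supported for tag in tags)
        if cache[filename]:
            compatible.append(filename)
    CACHE_PATH.parent.mkdir(parents=True, exist_ok=True)
    CACHE_PATH.write_text(json.dumps(cache))
    return compatible
```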
Describe the solution you'd like
My vision is for pip to eventually perform most resolves entirely locally, only calling out to pypi when it's unable to find a satisfying set and it determines that a newer release of some package might provide one. I think this should be feasible without needing to store anything close to the magnitude of all the package contents on pypi, because cargo does exactly that right now, and its index implementation has some useful comments (https://github.com/rust-lang/cargo/blob/263d24baca913d7ace29e2f59a3254e47272fd6b/src/cargo/sources/registry/index.rs#L25-L30):
```rust
//! One important aspect of the index is that we want to optimize the "happy
//! path" as much as possible. Whenever you type `cargo build` Cargo will
//! *always* reparse the registry and learn about dependency information. This
//! is done because Cargo needs to learn about the upstream crates.io crates
//! that you're using and ensure that the preexisting `Cargo.lock` still matches
//! the current state of the world.
```
As detailed in that file, cargo essentially just serializes a list of the dependency relationships to json in order to publish its index, which is exactly how I implemented the persistent dependency resolution cache in #7819 that multiplied the effect of the `fast-deps` technique alone (https://github.com/cosmicexplorer/pip/blob/40cec94281aa9165c689dd4bb69f953e2fd5fb06/src/pip/_internal/resolution/legacy/resolver.py#L171-L304).
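For illustration, a stripped-down version of that idea might look like the following. This is only a sketch under the assumption of a flat JSON file keyed by pinned, checksummed candidates; the file layout and function names are illustrative, and the #7819 prototype linked above is richer:

```python
import json
from pathlib import Path

# Hypothetical flat-file layout for the persisted dependency relationships.
CACHE_FILE = Path("~/.cache/pip-resolve-deps.json").expanduser()

def load_cache():
    return json.loads(CACHE_FILE.read_text()) if CACHE_FILE.exists() else {}

def cache_key(name, version, sha256):
    # Releases on pypi are immutable, so this key never needs invalidation.
    return f"{name}=={version}#sha256={sha256}"

def record(cache, name, version, sha256, requires_dist):
    """Mirror one candidate's dependency list into the local cache."""
    cache[cache_key(name, version, sha256)] = list(requires_dist)
    CACHE_FILE.parent.mkdir(parents=True, exist_ok=True)
    CACHE_FILE.write_text(json.dumps(cache))
```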
According to this hypothesis, while `fast-deps` (and now PEP 658) gives the resolver the dependency relationships needed to iterate over a single candidate without actually downloading the many-megabyte zip file, pip still has to pay the cost of network calls to move the resolver forward for each candidate it evaluates, and if a candidate is discarded it has to pay that cost again until the resolver finds a satisfying dependency set. In that same file, cargo describes generating an index over another index to amortize the graph traversal performed during a cargo resolve, while still being able to incorporate any updates to the index that have occurred since the last resolve. There's nothing special cargo is doing here with rust that we can't do in python; as they note:
```rust
//! Consequently, Cargo "null builds" (the index that Cargo adds to each build
//! itself) need to be fast when accessing the index. The primary performance
//! optimization here is to avoid parsing JSON blobs from the registry if we
//! don't need them. Most secondary optimizations are centered around removing
//! allocations and such, but avoiding parsing JSON is the #1 optimization.
```
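Translated to python, cargo's "avoid reparsing on the happy path" trick could look something like this sketch, which layers a pre-parsed binary cache over the JSON source of truth (all paths and names here are hypothetical, not anything pip does today):

```python
import json
import pickle
from pathlib import Path

JSON_INDEX = Path("~/.cache/pip-resolve-deps.json").expanduser()  # source of truth
PARSED_INDEX = JSON_INDEX.with_suffix(".pickle")                  # pre-parsed layer

def load_index():
    if not JSON_INDEX.exists():
        return {}
    if (PARSED_INDEX.exists()
            and PARSED_INDEX.stat().st_mtime >= JSON_INDEX.stat().st_mtime):
        # Happy path: a single binary deserialization, no JSON parsing at all.
        return pickle.loads(PARSED_INDEX.read_bytes())
    # Slow path: the JSON changed since we last parsed it; reparse and re-index.
    index = json.loads(JSON_INDEX.read_text())
    PARSED_INDEX.write_bytes(pickle.dumps(index))
    return index
```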
Alternative Solutions
In pex-tool/pex#2175 my goal for pex was to reduce the i/o and computation spent compressing very large amounts of third-party code we do not control. I figured out a hack to reduce that effort, but as @jsirois points out, it's often more robust and maintainable in the long term to develop standard protocols designed for the task at hand than to rely on clever hacks to optimize existing ones. In that case, while I found it was possible to cache traditional `--layout zipapp` pex files with a clever hack, I learned that the `--layout packed` format had been designed from the ground up to enable that cacheability. Analogously, @uranusjr spent the time to develop PEP 658 to avoid the need for clever hacks like `fast-deps` to get the information pip requires. In both cases, we were able to leverage features of the zip file format to put off standardization for a while, but we're in a better place now that more thoughtful infrastructure has been laid down.
spack currently performs an analogous process, deserializing and processing all of the dependency constraints annotated in spack package recipes upon each spack invocation (cc @tgamblin @alalazo). In contrast to pip, though, spack's main package repository is maintained as part of the spack codebase itself, so it can always expect the luxury of loading everything from the local filesystem. Still, it demonstrates that amortizing dependency graph traversal while incorporating updates from the source of truth is an issue intrinsic to package resolution, as opposed to the problem solved by `fast-deps`, which was a temporary workaround for slowly-evolving packaging standards.
Additional context
Pip shares Cargo's assumption that each individual release of any package is immutable, so the dependency relationships that `fast-deps`/PEP 658 obtain from each wheel can be cached and reused indefinitely after they are first computed. My hypothesis is that we can improve resolve performance by:
- (short term) mirroring every lookup of `package==version#checksum => [dependency requirements]` that pip extracts from pypi during a normal resolve into a local database of some form, to be reused whenever pip evaluates that installation candidate in future resolves (see the sqlite sketch after this list).
  - As noted in pex-tool/pex#2175 (consume packed wheel cache in zipapp creation), every new cache implies a tradeoff of speedup against disk space. I suspect this will be a small enough dataset to avoid that concern, though, especially if we restrict ourselves to just indexing the packages pip has seen before instead of trying to contain all of pypi at once.
  - As with cargo, there are some dependency relationships that are not immutable, like editable installs or the most recent release of a package, and we would need to be able to incorporate that kind of information too.
- (longer term) optimizing the local dependency index and the new resolver in tandem.
  - Maybe this is the hubris of a beginner, but I am under the impression that pip would be able to perform most resolutions in the blink of an eye if it didn't need to intersperse filesystem and/or network i/o between every step of the resolver's graph walk.
  - The reason PEP 658 works is that it cuts out the ceremony required to access the dependency relationships for a given installation candidate. I think if we continue to pare down what we cache and retain, from entire wheels to just the information we need for a resolve (similar to discussions of what to include in `pip install --report`), we can turn resolution into more and more of a compute-bound rather than an i/o-bound problem.
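As referenced in the short-term bullet above, here is one sketch of what that local lookup database could look like, using `sqlite3` from the standard library. The schema and function names are hypothetical, and mutable relationships like editable installs would deliberately never be recorded:

```python
import sqlite3

def open_dep_db(path="pip-deps.sqlite3"):
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS deps (
               name TEXT, version TEXT, sha256 TEXT, requirement TEXT,
               PRIMARY KEY (name, version, sha256, requirement)
           )"""
    )
    return conn

def lookup(conn, name, version, sha256):
    """Return cached requirements, or None on a miss (fall back to PEP 658)."""
    rows = conn.execute(
        "SELECT requirement FROM deps WHERE name = ? AND version = ? AND sha256 = ?",
        (name, version, sha256),
    ).fetchall()
    # A hit means this candidate can be evaluated with zero network i/o.
    return [r[0] for r in rows] or None

def record(conn, name, version, sha256, requirements):
    # Only immutable, checksummed releases qualify for indefinite caching;
    # editable installs and "latest release" queries must bypass this table.
    conn.executemany(
        "INSERT OR IGNORE INTO deps VALUES (?, ?, ?, ?)",
        [(name, version, sha256, req) for req in requirements],
    )
    conn.commit()
```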