Make scheduler work with shallow realisations without fetching unneeded store objects #11928

Open
@Ericson2314

Description

Shallow realisations map a basic derivation (one with no inputDrvs) and an output name to a content address.
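As a rough illustration of that mapping, here is a minimal Python sketch. It is not Nix's actual data model: `BasicDerivation`, `key`, `lookup`, and the `ca:...` strings are hypothetical names invented for this example, and the hashing scheme is a stand-in.

```python
from dataclasses import dataclass
import hashlib


@dataclass(frozen=True)
class BasicDerivation:
    """Hypothetical stand-in for a basic derivation: it has only
    content-addressed inputs and no inputDrvs."""
    name: str
    input_srcs: frozenset

    def key(self) -> str:
        # Stand-in for the derivation hash; the real scheme differs.
        payload = self.name + "|" + ",".join(sorted(self.input_srcs))
        return hashlib.sha256(payload.encode()).hexdigest()


# (basic derivation key, output name) -> content address of that output
shallow_realisations = {}


def lookup(drv: BasicDerivation, output: str):
    """Return the content address recorded for this output, or None."""
    return shallow_realisations.get((drv.key(), output))


compiler_b = BasicDerivation("CompilerB", frozenset({"ca:CompilerA"}))
shallow_realisations[(compiler_b.key(), "out")] = "ca:CompilerB"

print(lookup(compiler_b, "out"))
```

The point of the sketch is only that the key never mentions another derivation, so a lookup needs no store objects to be present locally.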

Suppose we have a dependency graph like CompilerA -> CompilerB -> Library. These are only build-time dependencies: the output of each build does not refer to the thing it was built with. For the sake of argument, CompilerA is "plain old data" (like a bootstrap binary) and is just uploaded as-is.

Suppose we have built both derivations and uploaded the results and their shallow realisations, but not deep realisations, to a remote store.
Now, in another store, configured to substitute from that remote store, one tries to build Library.

Currently, this will happen:

  1. Want to obtain Library
  2. There is no deep realisation in the cache keyed by the unresolved derivation
  3. We don't know of any content-addressed store object to try downloading
  4. Wants to build Library
  5. Want to obtain CompilerB
  6. Finds shallow trace for CompilerB (since CompilerA is plain old data, CompilerB's derivation is already resolved)
  7. Downloads CompilerB
  8. Resolves Library derivation
  9. Finds shallow trace for Library derivation
  10. Downloads Library

This works, but note that we downloaded CompilerB even though it is not in the runtime closure of Library.
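The first sequence of steps can be mimicked with a toy model. This is only a sketch of the behaviour described above, not the scheduler's real code: `Drv`, `DRVS`, `DEEP`, `SHALLOW`, and `obtain_current` are hypothetical names, and derivations are keyed by name rather than by hash.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Drv:
    name: str
    input_srcs: tuple = ()  # inputs that are already content-addressed
    input_drvs: tuple = ()  # names of derivations whose outputs are inputs


DRVS = {
    "CompilerB": Drv("CompilerB", input_srcs=("ca:CompilerA",)),
    "Library": Drv("Library", input_drvs=("CompilerB",)),
}

# Remote cache contents (assumed): no deep realisations, only shallow ones,
# keyed by (derivation name, resolved inputs).
DEEP = {}
SHALLOW = {
    ("CompilerB", ("ca:CompilerA",)): "ca:CompilerB",
    ("Library", ("ca:CompilerB",)): "ca:Library",
}

downloads = []


def obtain_current(name):
    """Obtain an output the way the first list describes: every inputDrv
    output is downloaded before the derivation can be resolved."""
    if name in DEEP:  # step 2: always misses in this scenario
        downloads.append(DEEP[name])
        return DEEP[name]
    drv = DRVS[name]
    inputs = list(drv.input_srcs)
    for dep in drv.input_drvs:
        inputs.append(obtain_current(dep))  # steps 5-7: fetches the input
    path = SHALLOW[(name, tuple(sorted(inputs)))]  # steps 8-9
    downloads.append(path)  # step 10
    return path


obtain_current("Library")
print(downloads)  # ['ca:CompilerB', 'ca:Library']
```

In this model the extra `ca:CompilerB` entry in `downloads` is exactly the unneeded fetch.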

Instead I would want something like this:

  1. Want to obtain Library
  2. There is no deep realisation in the cache keyed by the unresolved derivation
  3. Wants to resolve Library derivation
  4. Wants resolution for CompilerB
  5. Finds shallow trace for CompilerB
  6. Resolves Library derivation
  7. Finds shallow trace for Library derivation
  8. Downloads Library

Now we don't bother downloading CompilerB.

The way to make the second sequence of steps a reality is to make "obtaining a realisation" a goal in and of itself, separate from obtaining a store object and from building one. When the cache doesn't have the realisation, the goal falls back to building the derivation; when the cache does have it, there is no need to download any store objects. Dependencies between these goals would allow us to resolve derivations through arbitrarily many inputDrv edges without downloading any store objects.
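A minimal sketch of that goal split, under the same toy assumptions as the scenario above. `Scheduler`, `realisation_goal`, `substitute_goal`, and `build` are hypothetical names chosen for illustration and do not correspond to existing goal classes in Nix; derivations are again keyed by name instead of by hash.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Drv:
    name: str
    input_srcs: tuple = ()  # inputs that are already content-addressed
    input_drvs: tuple = ()  # names of derivations whose outputs are inputs


class Scheduler:
    def __init__(self, drvs, shallow):
        self.drvs = drvs
        self.shallow = dict(shallow)  # (name, resolved inputs) -> content address
        self.local = set()            # store objects present locally
        self.downloads = []
        self.builds = []

    def realisation_goal(self, name):
        """Learn the output's content address, fetching nothing if the
        cache already knows the shallow realisation."""
        drv = self.drvs[name]
        inputs = list(drv.input_srcs)
        for dep in drv.input_drvs:
            inputs.append(self.realisation_goal(dep))  # goal-to-goal dependency
        key = (name, tuple(sorted(inputs)))
        if key in self.shallow:
            return self.shallow[key]
        # Fallback: the cache has no realisation, so the inputs are needed.
        for path in inputs:
            self.substitute_goal(path)
        return self.build(name, key)

    def substitute_goal(self, path):
        """Fetch a store object unless it is already present locally."""
        if path not in self.local:
            self.downloads.append(path)
            self.local.add(path)

    def build(self, name, key):
        self.builds.append(name)
        path = "ca:" + name  # stand-in for the real content address
        self.local.add(path)
        self.shallow[key] = path
        return path

    def obtain(self, name):
        path = self.realisation_goal(name)
        self.substitute_goal(path)
        return path


DRVS = {
    "CompilerB": Drv("CompilerB", input_srcs=("ca:CompilerA",)),
    "Library": Drv("Library", input_drvs=("CompilerB",)),
}
SHALLOW = {
    ("CompilerB", ("ca:CompilerA",)): "ca:CompilerB",
    ("Library", ("ca:CompilerB",)): "ca:Library",
}

s = Scheduler(DRVS, SHALLOW)
s.obtain("Library")
print(s.downloads, s.builds)  # ['ca:Library'] []
```

If the cache lacked the shallow realisation for Library, the same sketch would download `ca:CompilerB` and build Library locally, which matches the fallback case described above.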


Before doing this, we should attempt #11927 so this code is not nearly as annoying to work with.

Labels: ca-derivations (Derivations with content addressed outputs)