Description
There are three levels of caching in HLB: Cache mounts, FS caching, Build cache.
Cache mounts
A cache mount is a persistent, read-write mount stored by the BuildKit backend. It is typically used for compiler and package manager caches.
Some caveats:
- Has sharing modes, and the right one depends on the underlying compiler/package manager. Is concurrent usage safe? It's complicated to know what to pick.
- Ever-growing size: how do we prune this? How many projects can we use this with? If it's a single cache key for every project, would using FS caching be better because we have more control?
- In a cloud cluster environment, this is not very reliable because you may not hit the same `buildkitd`.
    fs default() {
        image "golang:alpine"
        run "go build xyz" with option {
            mount fs { scratch; } "/root/.cache/go-build" with option {
                cache "someCacheKey" "shared"
            }
        }
    }
- I'm not sure if the `scratch` fs defined after `mount` is ever utilized if there is a `cache` option. Need to investigate this.
- Instead of nesting it as an `option::mount cache`, can we define it as an `option::run cacheMount` for UX? Initially it was designed as a mount option because of LLB, but we can change that.
- In the Dockerfile frontend, I recall that they define cache keys for the user, so it's possible we can move the cache key to be an option of `cacheMount`. Need to investigate this.
- Does the BuildKit build cache export cache mounts as well? Need to investigate this.
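To make the sharing-mode caveat more concrete, here is a toy model in Go of how the three modes named by BuildKit (`shared`, `private`, `locked`) differ when two writers ask for the same cache key. It is only a sketch of the semantics as I understand them, not BuildKit's implementation: `Store`, `cacheDir`, and `Acquire` are hypothetical names invented for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// SharingMode mirrors the three cache mount sharing modes named by BuildKit.
type SharingMode string

const (
	Shared  SharingMode = "shared"  // concurrent writers use the same directory
	Private SharingMode = "private" // a concurrent writer gets a fresh directory
	Locked  SharingMode = "locked"  // a second writer waits for the first to release
)

// cacheDir is a hypothetical stand-in for one on-disk cache directory.
type cacheDir struct {
	id    string
	inUse int
	mu    sync.Mutex // held by the single active writer in locked mode
}

// Store is a toy model of a backend that hands out cache mounts by key.
type Store struct {
	mu   sync.Mutex
	dirs map[string][]*cacheDir
}

func NewStore() *Store { return &Store{dirs: map[string][]*cacheDir{}} }

// Acquire returns the ID of the directory backing the mount and a release function.
func (s *Store) Acquire(key string, mode SharingMode) (string, func()) {
	s.mu.Lock()
	var d *cacheDir
	for _, c := range s.dirs[key] {
		// Private mode only reuses a directory that nobody else is using.
		if mode != Private || c.inUse == 0 {
			d = c
			break
		}
	}
	if d == nil {
		d = &cacheDir{id: fmt.Sprintf("%s-%d", key, len(s.dirs[key]))}
		s.dirs[key] = append(s.dirs[key], d)
	}
	d.inUse++
	s.mu.Unlock()

	if mode == Locked {
		d.mu.Lock() // blocks until the previous writer releases
	}
	return d.id, func() {
		if mode == Locked {
			d.mu.Unlock()
		}
		s.mu.Lock()
		d.inUse--
		s.mu.Unlock()
	}
}

func main() {
	s := NewStore()

	a, relA := s.Acquire("go-build", Shared)
	b, relB := s.Acquire("go-build", Shared)
	fmt.Println(a == b) // true: both writers see the same directory
	relA()
	relB()

	c, relC := s.Acquire("npm", Private)
	d, relD := s.Acquire("npm", Private)
	fmt.Println(c == d) // false: the second writer got its own directory
	relC()
	relD()
}
```

Which mode is safe depends on whether the tool writing into the directory tolerates concurrent writers, which is exactly the part that is hard to know up front.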
FS caching
Rather than a language or backend feature, this is more of a pattern emerging from HLB usage. You can export filesystems to various destinations (pushing an image to Docker Hub, pushing to a remote git repo, publishing to an HTTP server like Artifactory), and you can also mount these remote sources as a "starting point" or "primed mount" to speed up an operation.
When running `npm install`, the current pattern is:
- Mount `option::mount cache` for `npm` cache directories. (Likely safe to be shared with multiple projects)
- Mount `fs { scratch; } "src/node_modules"` because you can also prime the node modules. (Only safe to use for a single project)
And then you can push the node modules mount as an image, then remount it for subsequent runs:
    fs default(fs src) {
        image "node:alpine"
        run "npm install" with option {
            dir "/src"
            mount src "/src"
            mount fs { scratch; } "/root/.npm" with option {
                cache "npm-config-cache" "shared"
            }
            mount fs { image "my-node-modules:latest"; } "/src/node_modules" as nodeModules
        }
    }
    fs snapshotModules() {
        nodeModules
        dockerPush "my-node-modules:latest"
    }
Perhaps users will then set up a cron CI job to run the target `snapshotModules` once in a while, or on a merge to `master`.
Build cache
Build cache import/export is a native feature to BuildKit. See: https://github.com/moby/buildkit#export-cache
This is the basic behavior you get when you run a second build through HLB on the same BuildKit backend: unchanged inputs produce the same outputs, so those steps don't need to run a second time. However, this is lost when you point to a new BuildKit backend. Build cache import/export lets you publish the build cache for a particular build to disk (local), inline (embedded in the Docker image, if one is being created), or to an OCI image (not an executable image, just a data container for the build cache).
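The difference between a warm backend, a cold backend, and a cold backend with an imported cache can be sketched with a toy model in Go. This is not how BuildKit stores or keys its cache internally; `Backend`, `Run`, `Export`, and `Import` are hypothetical names, and the cache key here is simply a hash of the step and a string standing in for its input.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// Backend is a toy stand-in for one buildkitd instance with its own local build cache.
type Backend struct {
	cache map[string]string // cache key -> step result
	Runs  int               // number of steps actually executed
}

func NewBackend() *Backend { return &Backend{cache: map[string]string{}} }

// key derives a content-addressed cache key from a step and its input.
func key(step, input string) string {
	sum := sha256.Sum256([]byte(step + "\x00" + input))
	return hex.EncodeToString(sum[:])
}

// Run executes a step unless an identical step on identical input is already cached.
func (b *Backend) Run(step, input string) string {
	k := key(step, input)
	if out, ok := b.cache[k]; ok {
		return out
	}
	b.Runs++
	out := "result-of-" + k[:12] // placeholder for the real work
	b.cache[k] = out
	return out
}

// Export copies the cache so it could be published somewhere (disk, image, registry).
func (b *Backend) Export() map[string]string {
	out := make(map[string]string, len(b.cache))
	for k, v := range b.cache {
		out[k] = v
	}
	return out
}

// Import seeds this backend with a previously exported cache.
func (b *Backend) Import(exported map[string]string) {
	for k, v := range exported {
		b.cache[k] = v
	}
}

func main() {
	first := NewBackend()
	first.Run("go build xyz", "src@v1")
	first.Run("go build xyz", "src@v1") // cache hit, nothing executed
	fmt.Println("first backend runs:", first.Runs) // 1

	cold := NewBackend()
	cold.Run("go build xyz", "src@v1") // new backend, the cache is lost
	fmt.Println("cold backend runs:", cold.Runs) // 1

	warm := NewBackend()
	warm.Import(first.Export())
	warm.Run("go build xyz", "src@v1") // cache hit via the imported cache
	fmt.Println("warm backend runs:", warm.Runs) // 0
}
```

In this model the exported map plays the role of the published build cache: wherever it is stored, a backend that imports it can skip steps it has never run itself.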
There are two main angles from which we can tackle sharing the build cache across backends:
- Backend solution: We develop and maintain infrastructure around BuildKit to distribute build caches between `buildkitd` nodes.
- Frontend solution: We expose build cache import/export in HLB.
Build cache import/export is per-solve, and we already do multiple solves when executing HLB.
If we want to implement this on the frontend side, we'll need to somehow inform the HLB compiler that this section is using this particular cache. Ideally we want to provide a `fs { ... }` to be agnostic to the source (whether it's `local` or `image`), but that will require an upstream change.
Here is an example that somewhat fulfills the requirements, though I think the UX is terrible. Perhaps it can serve as a starting point for discussion:
    fs default() {
        cacheContext fs {
            image "openllb/my-remote-build-cache:latest";
        } fs {
            image "alpine"
            run "running with cache context"
        }
    }