build: proper linuxkit dependency tracking via content-hash propagation #5678

Draft
rucoder wants to merge 8 commits into lf-edge:master from rucoder:rucoder/zfs-pkg

rucoder (Contributor) commented Mar 15, 2026

Problem: linuxkit dependency tracking gap

Linuxkit uses the git tree hash of a package's source directory as its cache key. This breaks when packages depend on other packages whose content changes:

pkg/zfs  ──────────────────────►  pkg/pillar
                                  pkg/dom0-ztools
                                  pkg/vtpm

When pkg/zfs is rebuilt (e.g. after a ZFS version bump), the consumers retain a stale cache hit because their own git trees did not change. Linuxkit has no way to know that the image it cached for pkg/pillar was built against the old pkg/zfs.

The problem is structural: consumer packages reference their dependencies via Dockerfile.in templates that are rendered by parse-pkgs.sh at build time. The rendered Dockerfile is gitignored, so it is invisible to linuxkit's tree-hash computation.
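To make the gap concrete, here is a minimal Python sketch of a cache key that covers only a package's own tracked files. It is not linuxkit's code: `tree_key` is a hypothetical stand-in for the git tree hash, and the file contents are made up for illustration.

```python
import hashlib

def tree_key(files: dict[str, str]) -> str:
    """Hypothetical stand-in for a git tree hash: covers only the files passed in."""
    digest = hashlib.sha1()
    for name in sorted(files):
        digest.update(name.encode() + b"\0" + files[name].encode() + b"\0")
    return digest.hexdigest()

# Tracked sources only; the rendered Dockerfile is gitignored and therefore absent.
pillar_tree = {"Dockerfile.in": "FROM ZFS_TAG\n", "build.yml": "image: eve-pillar\n"}
zfs_old = {"Dockerfile": "ARG ZFS_VERSION=2.3\n"}
zfs_new = {"Dockerfile": "ARG ZFS_VERSION=2.4\n"}

zfs_changed = tree_key(zfs_old) != tree_key(zfs_new)   # the dependency gets a new key
pillar_key_before = tree_key(pillar_tree)
pillar_key_after = tree_key(pillar_tree)               # pillar's own tree is untouched
stale_hit = pillar_key_before == pillar_key_after      # so its cache key stays the same
```

Both `zfs_changed` and `stale_hit` come out true: the dependency moved, but nothing in the consumer's key can reflect that.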

This is addressed in linuxkit/linuxkit#4210

Design

1. Declare dependencies in build.yml

Each package that depends on others declares it via a linuxkit build arg wildcard in build.yml:

# pkg/pillar/build.yml
buildArgs:
  - REL_HASH_%=@lkt:pkgs:../*

@lkt:pkgs:../* tells linuxkit to glob all sibling packages and resolve each to its content tag. Linuxkit filters the results against the ARG declarations in the package's Dockerfile, so only actually-used deps are included. The resolved tag values are folded into the package's own hash via SHA1 — effectively hashing the rendered dependency references.

Result: when pkg/zfs gets a new content tag, pkg/pillar's computed tag also changes, even though pkg/pillar's git tree is unchanged.
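The folding step can be sketched roughly as follows. This is an illustration under stated assumptions, not linuxkit's implementation: `declared_args` and `content_tag` are hypothetical helpers, the `REL_HASH_ZFS`-style names are a guess at how the `REL_HASH_%` wildcard expands, and the exact byte layout fed to SHA1 is invented.

```python
import hashlib
import re

def declared_args(dockerfile: str) -> set[str]:
    """Names of all ARG instructions in a Dockerfile."""
    return set(re.findall(r"^[ \t]*ARG[ \t]+([A-Za-z_]\w*)", dockerfile, flags=re.MULTILINE))

def content_tag(tree_hash: str, dockerfile: str, sibling_tags: dict[str, str]) -> str:
    """Fold the tags of siblings the Dockerfile actually uses into the package hash."""
    used = declared_args(dockerfile)
    digest = hashlib.sha1(tree_hash.encode())
    for arg in sorted(sibling_tags):
        if arg in used:  # unused siblings do not influence the tag
            digest.update(f"\n{arg}={sibling_tags[arg]}".encode())
    return digest.hexdigest()

# Made-up inputs: only REL_HASH_ZFS is declared, so only the zfs tag matters.
dockerfile = "ARG REL_HASH_ZFS\nFROM ${REL_HASH_ZFS} AS zfs\n"
siblings = {"REL_HASH_ZFS": "example/zfs:old", "REL_HASH_VTPM": "example/vtpm:old"}

base = content_tag("tree-abc", dockerfile, siblings)
zfs_bumped = content_tag("tree-abc", dockerfile, {**siblings, "REL_HASH_ZFS": "example/zfs:new"})
vtpm_bumped = content_tag("tree-abc", dockerfile, {**siblings, "REL_HASH_VTPM": "example/vtpm:new"})
```

Here `zfs_bumped` differs from `base` while `vtpm_bumped` does not, which mirrors the ARG filtering described above: a sibling that the Dockerfile never references cannot invalidate the package.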

2. Two-stage Makefile build

Stage 1 — update-hashes (runs once per make invocation):

linuxkit pkg update-hashes --hash-dir .gen-deps \
    pkg/alpine:build.yml  pkg/zfs:build-2.4.yml  pkg/pillar:build.yml  ...

update-hashes processes all packages in topological dependency order (deps before consumers) and writes a YAML manifest per package to .gen-deps/:

# .gen-deps/pillar.hash
tag: lfedge/eve-pillar:a3f9b2c1...(hash-of-sources+zfs-tag+alpine-tag)
build-yml: build.yml
deps:
  - path: pkg/zfs
    tag: lfedge/eve-zfs:8d4e1f09...-2.4
  - path: pkg/alpine
    tag: lfedge/eve-alpine:c7a2b3d5...
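The "deps before consumers" ordering is a topological sort. A small Python sketch using the standard library's `graphlib`, with a hypothetical dependency map (the real graph comes from linuxkit, and the `pkg/vtpm` edge here is only an example):

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: each package maps to the packages it depends on.
deps = {
    "pkg/alpine": set(),
    "pkg/zfs": {"pkg/alpine"},
    "pkg/pillar": {"pkg/zfs", "pkg/alpine"},
    "pkg/vtpm": {"pkg/zfs"},
}

# static_order() yields every node after all of its dependencies.
order = list(TopologicalSorter(deps).static_order())
position = {pkg: i for i, pkg in enumerate(order)}
deps_first = all(position[d] < position[p] for p, ds in deps.items() for d in ds)
```

Processing packages in `order` guarantees that a dependency's tag is already known when a consumer's tag is computed.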

Hash files are written with write-if-changed semantics: the mtime only advances when the tag actually changes, so unchanged packages do not trigger downstream rebuilds.
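Write-if-changed is a small pattern worth spelling out. The sketch below is a generic Python version, not the linuxkit code; `write_if_changed` and the tag strings are hypothetical.

```python
import os
import tempfile
from pathlib import Path

def write_if_changed(path: Path, content: str) -> bool:
    """Write content only when it differs from the file; return True if written."""
    if path.exists() and path.read_text() == content:
        return False  # leave the file, and its mtime, alone
    tmp = path.with_name(path.name + ".tmp")
    tmp.write_text(content)
    os.replace(tmp, path)  # atomic swap so readers never see a partial file
    return True

with tempfile.TemporaryDirectory() as d:
    hash_file = Path(d) / "pillar.hash"
    first = write_if_changed(hash_file, "tag: example:111\n")
    os.utime(hash_file, (1, 1))                    # pin mtime to a known value
    second = write_if_changed(hash_file, "tag: example:111\n")
    mtime_kept = hash_file.stat().st_mtime == 1    # untouched by the no-op write
    third = write_if_changed(hash_file, "tag: example:222\n")
```

Because the second call leaves the mtime alone, a make rule that lists the hash file as a prerequisite sees nothing newer and does not re-run.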

Stage 2 — eve-% (PHONY, per-package):

linuxkit pkg build --hash-dir .gen-deps pkg/pillar

Linuxkit reads the tag from .gen-deps/pillar.hash and checks whether that exact image exists in the Docker cache. If it does, no build occurs. If not, it builds. update-hashes runs exactly once per make invocation regardless of how many packages are targeted.
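The Stage 2 decision boils down to "read the precomputed tag, then look it up". A hedged Python sketch, with a set of strings standing in for the Docker image cache; `read_tag`, `needs_build` and all tag values are hypothetical:

```python
def read_tag(manifest_text: str) -> str:
    """Return the top-level `tag:` value; indented dependency tags are skipped."""
    for line in manifest_text.splitlines():
        if line.startswith("tag:"):
            return line.split(":", 1)[1].strip()
    raise ValueError("manifest has no top-level tag")

def needs_build(manifest_text: str, cached_images: set[str]) -> bool:
    """Build only when the exact precomputed tag is missing from the cache."""
    return read_tag(manifest_text) not in cached_images

# Made-up manifest in the same shape as the .gen-deps example.
manifest = """\
tag: example/eve-pillar:1111
build-yml: build.yml
deps:
  - path: pkg/zfs
    tag: example/eve-zfs:2222
"""

hit = needs_build(manifest, {"example/eve-pillar:1111"})
miss = needs_build(manifest, {"example/eve-zfs:2222"})
```

`hit` is `False` (image present, nothing to do) and `miss` is `True` (only the dependency is cached, so pillar must be built).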

3. tools/gen-hash-deps — make-level ordering

tools/gen-hash-deps reads .gen-deps/*.hash and emits hash-deps.mk (included by the root Makefile) with:

# section 1: hash-file dep rules — topological ordering for future file-target builds
.gen-deps/pillar.hash: .gen-deps/zfs.hash .gen-deps/alpine.hash

# section 2: cache-export ordering
pillar-cache-export-docker-load: zfs-cache-export-docker-load
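As a rough idea of what the section 1 generation involves, here is a Python sketch. The real tool is written in Go and reads the manifests from disk; this version takes an in-memory dependency map, and `hash_path` and `emit_hash_rules` are hypothetical names. Section 2 is left out because its selection rules are not spelled out here.

```python
def hash_path(pkg: str) -> str:
    """Map a package path such as pkg/zfs to its manifest, .gen-deps/zfs.hash."""
    return f".gen-deps/{pkg.rsplit('/', 1)[-1]}.hash"

def emit_hash_rules(deps: dict[str, list[str]]) -> str:
    """Emit one make prerequisite rule per package that has dependencies."""
    rules = []
    for pkg in sorted(deps):
        if deps[pkg]:
            prereqs = " ".join(hash_path(dep) for dep in deps[pkg])
            rules.append(f"{hash_path(pkg)}: {prereqs}")
    return "\n".join(rules) + "\n"

deps = {
    "pkg/alpine": [],
    "pkg/zfs": ["pkg/alpine"],
    "pkg/pillar": ["pkg/zfs", "pkg/alpine"],
}
makefile_fragment = emit_hash_rules(deps)
```

For the map above, the fragment contains the same `.gen-deps/pillar.hash: .gen-deps/zfs.hash .gen-deps/alpine.hash` rule as the example, and no rule at all for `pkg/alpine`, which has no dependencies.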

End-to-end: ZFS version bump

ZFS_VERSION=2.4  →  build-2.4.yml selected  →  pkg/zfs gets new tag
                 →  update-hashes recomputes pkg/pillar tag (includes new zfs tag)
                 →  pkg/pillar tag changed  →  docker cache miss  →  rebuild
                 →  pkg/dom0-ztools, pkg/vtpm same

Next make run: nothing changed → tags unchanged → update-hashes writes no new mtimes → all docker cache hits → instant.

What this PR contains

  1. pkg/zfs — the first package that exercises the new mechanism: a FROM scratch linuxkit package that builds OpenZFS userspace once, with versioned build-2.3.yml / build-2.4.yml variants. Replaces three redundant ZFS compilations that were scattered across the tree.

  2. pkg/pillar, pkg/dom0-ztools, pkg/vtpm — converted to consume pkg/zfs via @lkt:pkgs:../*, with build.yml deps: entries that make the dependency explicit.

  3. Makefile + tools/gen-hash-deps — two-stage build wiring: update-hashes target, eve-% simplified to a post-hash build, hash-deps.mk include.

How to test

# Full build
make pkgs

# ZFS version bump — pillar/dom0-ztools/vtpm must rebuild automatically
make pkg/zfs ZFS_VERSION=2.4.1
make pkg/pillar ZFS_VERSION=2.4.1   # new tag → cache miss → rebuild
make pkg/pillar ZFS_VERSION=2.4.1   # same tag → cache hit → instant

Changelog notes

None

PR Backports

  • 16.0-stable: No.
  • 14.5-stable: No.
  • 13.4-stable: No.

Checklist

  • I've provided a proper description
  • I've added the proper documentation
  • I've tested my PR on amd64 device
  • I've tested my PR on arm64 device
  • I've written the test verification instructions
  • I've set the proper labels to this PR
  • I've checked the boxes above, or I've provided a good reason why I didn't check them.

@rucoder rucoder changed the title pkg/zfs: build ZFS userspace once; two-stage linuxkit dep tracking pkg/zfs: build ZFS userspace once; universal linuxkit dependency tracking Mar 15, 2026
@rucoder rucoder changed the title pkg/zfs: build ZFS userspace once; universal linuxkit dependency tracking build: proper linuxkit dependency tracking via content-hash propagation Mar 15, 2026
@rucoder rucoder marked this pull request as draft March 15, 2026 22:24
rene (Contributor) commented Mar 16, 2026

@rucoder, TBH I think this just introduces more (unwanted) complexity to our build system for a problem that is not technically ours... this should actually be addressed by linuxkit, not by the EVE build system. We had subtle bugs because of this feature, which was introduced to help but just caused problems... when we figured out that the hashes were not being honored and packages were not being rebuilt accordingly, we just reverted all the commits that introduced the automatic tags.... unless this is very well proven to work, I'm against moving forward....

rucoder (Contributor, Author) commented Mar 16, 2026

@rene I forgot to mention the PR to LK that fixes the feature; with it I have reintroduced the feature here: linuxkit/linuxkit#4210. The PR is still in draft state because I'm trying to figure out how to reduce the complexity. As of now, the following things work reliably:

  1. Transitive dependencies: with A->B->C, if I change C (no matter whether it is code or a parameter, e.g. ZFS_VERSION), A is rebuilt and gets a UNIQUE hash. No more hash collisions.

  2. <hash>-<custom tag> is no longer required, because the hash is unique and depends on code AND parameters.

  3. We do not need get-deps anymore; I have temporarily introduced a new small and very fast tool to generate the *.mk file from LK output.
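The transitive behaviour in point 1 can be illustrated with a toy model in Python. This is a sketch of the idea only, not what linuxkit does internally: `compute_tags`, the source strings and the way parameters are mixed into the hash are all hypothetical.

```python
import hashlib
from graphlib import TopologicalSorter

def compute_tags(sources, params, deps):
    """Tag every package from its sources, its parameters, and its dependencies' tags."""
    tags = {}
    for pkg in TopologicalSorter(deps).static_order():  # dependencies come first
        parts = [sources[pkg], params.get(pkg, "")]
        parts += [tags[dep] for dep in sorted(deps.get(pkg, ()))]
        tags[pkg] = hashlib.sha1("\0".join(parts).encode()).hexdigest()
    return tags

deps = {"A": {"B"}, "B": {"C"}, "C": set()}  # A -> B -> C
sources = {"A": "a-src", "B": "b-src", "C": "c-src"}

before = compute_tags(sources, {"C": "ZFS_VERSION=2.3"}, deps)
after = compute_tags(sources, {"C": "ZFS_VERSION=2.4"}, deps)
changed = sorted(pkg for pkg in deps if before[pkg] != after[pkg])
```

Changing only a parameter of C changes the tags of C, B and A, so `changed` lists all three packages even though no source of A or B was touched.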

rucoder and others added 7 commits March 16, 2026 10:31
Introduce pkg/zfs/ — a FROM-scratch linuxkit package that builds
OpenZFS userspace once and exports only the runtime artifacts.
All consumers COPY --from this single image.

- Version selection via build-*.yml variants (build-2.3.yml,
  build-2.4.yml)
- ZFS_VERSION in kernel-version.mk drives build-yml selection
- Cross-compilation support for arm64 on amd64 hosts
- Removes redundant ZFS compilations from build-tools/src/scripts
- Use auto-hash feature to depend on Alpine
- Add pkg/zfs to PKGS_riscv

Signed-off-by: Mikhail Malyshev <mike.malyshev@gmail.com>
Update pkg/pillar, pkg/dom0-ztools and pkg/vtpm to consume the new
pkg/zfs package via the linuxkit @lkt:pkgs:../* autohash mechanism,
replacing hardcoded image references.  Add build.yml deps: entries
so linuxkit's hash propagation tracks the ZFS content tag through
to consumer packages — a ZFS source change automatically produces
new hashes for all consumers.

Signed-off-by: Mikhail Malyshev <mike.malyshev@gmail.com>
…ash-deps)

Two-stage package build system:

Stage 1 — update-hashes (runs once per make invocation):
  linuxkit pkg update-hashes --hash-dir .gen-deps computes content
  tags for all packages in topological dependency order, writing YAML
  manifests to .gen-deps/<pkgname>.hash with write-if-changed semantics.

Stage 2 — eve-% (PHONY, per-package):
  linuxkit pkg build --hash-dir .gen-deps pkg/$* reads the precomputed
  hash to determine the tag and hits the docker cache, skipping the
  actual build when the image is already present.

tools/gen-hash-deps: new Go tool that reads .gen-deps/*.hash manifests
and emits hash-deps.mk with:
  1. Hash file dependency rules (.gen-deps/pillar.hash: .gen-deps/zfs.hash)
     for topological ordering.
  2. Cache-export ordering rules.

The pkg/% static pattern rule is scoped to $(filter pkg/%,$(PKGS)) to
avoid accidental matches on deep source-file paths.

Signed-off-by: Mikhail Malyshev <mike.malyshev@gmail.com>
The --hash-dir and update-hashes features used by this branch are not
yet in a linuxkit release.  Build linuxkit from the rucoder/pkg-hash-dir
fork by default so CI works without manual intervention.

Three cases, in priority order:
  1. LINUXKIT_SRC=/path/to/tree  — build from local source (local dev)
  2. LINUXKIT_FORK_URL set (default: rucoder/linuxkit, rucoder/pkg-hash-dir
     branch) — shallow-clone the fork and go build the binary
  3. LINUXKIT_FORK_URL=""        — download the upstream release binary

Set LINUXKIT_FORK_URL="" to revert to the normal release-download path
once the features land in an upstream linuxkit release.
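The priority order can be summarised in a few lines of Python. This is only a sketch of the selection logic (the actual implementation lives in the Makefile); `linuxkit_source`, the returned labels and the placeholder default URL are hypothetical.

```python
# Placeholder only; the real default fork URL is defined in the Makefile.
DEFAULT_FORK_URL = "https://example.invalid/linuxkit-fork.git"

def linuxkit_source(env: dict[str, str]) -> str:
    """Pick how to obtain the linuxkit binary, in the priority order listed above."""
    if env.get("LINUXKIT_SRC"):
        return "local-source"       # case 1: build from a local tree
    # Unset means "use the default fork"; an explicitly empty value disables it.
    fork_url = env.get("LINUXKIT_FORK_URL", DEFAULT_FORK_URL)
    if fork_url:
        return "fork-clone"         # case 2: shallow-clone the fork and build
    return "release-download"       # case 3: upstream release binary

default_choice = linuxkit_source({})
release_choice = linuxkit_source({"LINUXKIT_FORK_URL": ""})
local_choice = linuxkit_source({"LINUXKIT_SRC": "/path/to/tree", "LINUXKIT_FORK_URL": ""})
```

The distinction that matters is between an unset `LINUXKIT_FORK_URL` (fork build) and an explicitly empty one (release download); `LINUXKIT_SRC` wins over both.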

Signed-off-by: Mikhail Malyshev <mike.malyshev@gmail.com>
The previous .gen-deps/%.hash recipe approach had two bugs:

1. gen-hash-deps emitted Section 1 rules using .gen-deps/.bootstrap/
   paths (because -d pointed at .bootstrap/) instead of .gen-deps/ paths,
   so the prereq chain .gen-deps/vtpm.hash: .gen-deps/dom0-ztools.hash
   was never set up — vtpm could build before dom0-ztools.

2. The Makefile bootstrap wrote to .gen-deps/ directly so .gen-deps/*.hash
   files already existed after bootstrap; make considered them up-to-date
   and never ran the pkg build recipes.

Fix:
- Makefile: bootstrap now writes to .gen-deps/.bootstrap/ so that
  .gen-deps/*.hash files are absent on a clean build, forcing all
  recipes to run.
- gen-hash-deps: add -b flag (buildDir) so Section 1 rules use
  .gen-deps/ paths even when reading from .gen-deps/.bootstrap/.
  gen-hash-deps -d .gen-deps/.bootstrap -b .gen-deps emits correct
  .gen-deps/vtpm.hash: .gen-deps/dom0-ztools.hash prerequisites.
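The effect of separating the read directory from the build directory can be sketched in Python. The real tool is Go and its internals are not shown here; `rule` and its `build_dir` parameter are hypothetical stand-ins for the `-d` / `-b` behaviour described above.

```python
from pathlib import PurePosixPath

def rule(manifest: str, dep_names: list[str], build_dir: str = "") -> str:
    """Emit one prerequisite rule; targets use build_dir, else the manifest's own dir."""
    read_dir = PurePosixPath(manifest).parent
    out_dir = PurePosixPath(build_dir) if build_dir else read_dir
    target = out_dir / PurePosixPath(manifest).name
    prereqs = " ".join(str(out_dir / f"{name}.hash") for name in dep_names)
    return f"{target}: {prereqs}"

# Without a build dir, the rule points into .bootstrap/ and never orders the real targets.
buggy = rule(".gen-deps/.bootstrap/vtpm.hash", ["dom0-ztools"])
# With a build dir, the rule names the files make actually builds.
fixed = rule(".gen-deps/.bootstrap/vtpm.hash", ["dom0-ztools"], build_dir=".gen-deps")
```

`buggy` yields `.gen-deps/.bootstrap/vtpm.hash: .gen-deps/.bootstrap/dom0-ztools.hash`, while `fixed` yields the intended `.gen-deps/vtpm.hash: .gen-deps/dom0-ztools.hash`.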

With these fixes, a clean make pkgs:
1. Bootstrap writes all .bootstrap/*.hash with dep-propagated tags.
2. gen-hash-deps emits Section 1 with .gen-deps/ paths.
3. Make sees no .gen-deps/*.hash files — runs all recipes in dep order.
4. Each recipe: (a) update-hashes --hash-dir .gen-deps writes the
   package's dep-propagated tag, (b) pkg build --hash-dir .gen-deps
   reads that tag and builds the Docker image correctly.

Signed-off-by: Mikhail Malyshev <mike.malyshev@gmail.com>

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add SPDX-License-Identifier headers to docs/test-hash-dir.sh,
  tools/gen-hash-deps/main.go (via previous commit), and
  tools/gen-hash-deps/Makefile (Apache-2.0, 2026 Zededa Inc.)
- Remove leftover #test 2 debug comment from pkg/pillar/Dockerfile
  (hadolint flagged this as stray content before FROM)
- Update docs/test-hash-dir.sh for new .gen-deps/.bootstrap/ structure:
  all update-hashes output checks now use BDIR=.gen-deps/.bootstrap
  instead of .gen-deps/; hash-deps.mk checks remain in .gen-deps/
- Remove unused PILLAR_TAG_23_STORED variable (shellcheck SC2034)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
tools/mini-yetus.sh referenced lfedge/eve-yetus:0.15.10-eve-1 which
does not exist on Docker Hub. The Makefile correctly uses 0.15.1-eve-1.
This mismatch caused 'make mini-yetus' to silently fail: the Docker
container never started, no results file was produced, yet the script
exited 0 — masking the failure entirely.

Align the tag with the Makefile so the local linter actually runs.

Signed-off-by: Mikhail Malyshev <mike.malyshev@gmail.com>
@rucoder rucoder force-pushed the rucoder/zfs-pkg branch 2 times, most recently from 0a349f2 to f507720 Compare March 17, 2026 00:32
Signed-off-by: Mikhail Malyshev <mike.malyshev@gmail.com>
codecov bot commented Mar 17, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 29.49%. Comparing base (2281599) to head (20aefeb).
⚠️ Report is 341 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #5678      +/-   ##
==========================================
+ Coverage   19.52%   29.49%   +9.96%     
==========================================
  Files          19       18       -1     
  Lines        3021     2417     -604     
==========================================
+ Hits          590      713     +123     
+ Misses       2310     1552     -758     
- Partials      121      152      +31     
