ci: gate images on the system closure, not the OCI tar #1332
Blast radius
pie showData title Rebuilt checks by category
"image" : 15
"rust" : 2
"site" : 2
"agent" : 1
"blast" : 1
"eval" : 1
"lint" : 1
flowchart LR
c0["ix-mcp"]
c1["blast-radius-test"]
c2["agent-skills"]
c3["lint"]
c4["site-test"]
c5["site-case-tests"]
c0 --> k0["agent-skills"]
c0 --> k2["eval"]
c0 --> k3["image-development-base"]
c0 --> k4["image-kernel-dev"]
c0 --> k5["image-minecraft"]
changed checks (23)
AI review found issues in this pull request.
Verdict: patch is incorrect
Confidence: 0.87
The patch removes PR CI coverage for building Nix OCI image archives and their efficiency checks, leaving only system-closure builds for discovered images.
- P1 lib/per-system.nix:1079: Nix image checks no longer build the OCI archive
Intentional trade-off, accepted: the whole point of this PR is to keep the ~150s/image OCI tar build out of PR CI (it dominates (sent by an AI agent via Claude Code)
What
The per-image CI checks (`image-*` in `.#ciChecks`) built the full OCI tar
(`streamLayeredImage` → `oci-image-builder` → `<name>-oci.tar`). That tar pack
(serialize + compress the whole NixOS closure, then the layer-efficiency pass)
is ~60-100s per image and dominated CI wall-clock.
This repoints the checks at the image's system closure
(`config.system.build.toplevel`) instead of the tar. The tar is still the
package output, built only at release where a registry push actually consumes
the bytes.
- `lib/image/oci-layer.nix`: expose `passthru.toplevel` on the image derivation.
- `lib/per-system.nix`: `imageChecks` build `.toplevel` for Nix images. Non-Nix
  example images (`mkNonNixImage`, a pulled Debian/Ubuntu base) have no Nix
  toplevel, so their check stays the assembled archive.
Why
The gate's only signal from an image check is "this image's closure builds." It
does not boot the VM or test any behavior, and the OCI assembly itself is
deterministic plumbing. So building the tar in the gate spends ~60-100s/image to
re-prove something the per-package checks already cover.
Worse, the closure includes frequently-changing packages, notably the
base-profile `mcp` (in every image). Editing any mcp file changes every
image's closure, so all ~15 tars re-packed on essentially every commit. Verified
on a Linux builder: a one-line comment in
`packages/mcp/ix_notebook_mcp/runtime.py` changed the `image-minecraft`
derivation hash, and a darwin-cask / prompt-only PR re-packed all 15 images in CI.
Building the closure is a relink over already-built store paths: ~2s vs
~100s for the tar (measured on a warm Linux builder after an mcp edit).
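The cascade can be modeled as a toy in plain Nix. This is a sketch under stated assumptions: `drvHash` and `imageHashes` are hypothetical helpers that mimic input addressing by hashing a name together with its inputs' hashes; real derivation hashing is more involved. The image names are taken from the blast-radius chart.

```nix
let
  # Toy input addressing: a hash over the name and the hashes of the inputs.
  drvHash = name: inputs:
    builtins.hashString "sha256"
      (name + ":" + builtins.concatStringsSep "," inputs);

  # Every image depends on mcp, so each image hash is a function of the
  # mcp source text.
  imageHashes = mcpSource:
    let
      mcp = drvHash "mcp" [ mcpSource ];
    in
    builtins.listToAttrs (map
      (name: { inherit name; value = drvHash "${name}-oci.tar" [ mcp ]; })
      [ "image-development-base" "image-kernel-dev" "image-minecraft" ]);

  before = imageHashes "runtime.py";
  after = imageHashes "runtime.py # one-line comment";
in
# true for an image whose hash changed after the comment-only edit
builtins.mapAttrs (name: hash: hash != after.${name}) before
```

Every entry evaluates to `true`: a comment-only edit invalidates all images that include mcp. Gating on the closure does not stop the hash from changing; it only makes the resulting rebuild a cheap relink instead of a full tar pack.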
Impact
This removes the tar-pack cascade from the gate. Image-touching PRs (and the
merge-queue run) drop from ~3.5-5 min to roughly the eval floor (~1.5-2 min),
about 2-3x faster. It does not touch eval cost (genuinely uncacheable across
commits) or the release/registry path.
Validation
On a Linux builder against this branch:
- `image-*` checks present; no check-name collisions.
- `.#ciChecks.x86_64-linux.image-minecraft.drvPath` now resolves to
  `nixos-system-...drv` (the closure), not `minecraft-oci.tar.drv`.
- `.#packages.x86_64-linux` still carries the image tar derivations (release
  path intact).
Authored with Claude (Opus). Review the eval-cost follow-up separately;
caching can't reduce the remaining eval floor.
Note
Gate image CI checks on the NixOS system closure instead of the OCI archive
- Adds `passthru.toplevel` to the OCI layer derivation in oci-layer.nix, exposing the NixOS system closure as a separate build target.
- Updates `imageChecks` in per-system.nix so Nix image checks depend on `v.toplevel` (the closure) rather than `v` (the OCI tar). Non-Nix example image checks are unchanged.

Macroscope summarized ec04fc2.