Commit 1ac27b7
ci: parallel checks via nix-fast-build + warm /nix/store cache (#293)
## Summary
Rework the `Check` workflow for the fastest correct CI, plus one
matching simplification in `lib/per-system.nix`. No image, package, or
znver5 optimization changes — only how the checks are built, aggregated,
and cached.
- **Parallel eval + build** via
[`nix-fast-build`](https://github.com/Mic92/nix-fast-build) (pinned to
1.5.0 by commit) over `.#checks.x86_64-linux`, replacing the single `nix
build .#checks.x86_64-linux.all` step. `nix-fast-build` evaluates checks
with `nix-eval-jobs` and streams each derivation into a build pool as it
resolves, instead of blocking on one linkFarm to finish evaluating.
`--skip-cached` skips paths already in a substituter, so a warm run does
almost no work.
- **Warm `/nix/store` cache** via
[`cache-nix-action`](https://github.com/nix-community/cache-nix-action)
(v7) keyed on `flake.lock`, so the znver5 base is restored from GitHub's
cache instead of re-substituted from Cachix on every run.
- **One runner, on purpose.** The znver5 base is shared by every check;
splitting checks across ephemeral runners would rebuild that base on
each one when the cache is cold (thundering herd). A single job builds
the base at most once and gets its parallelism from
`nix-fast-build` + `max-jobs = auto` / `cores = 1` (one build per core,
each single-threaded, so total core demand stays at `max-jobs *
NIX_BUILD_CORES = core count` rather than oversubscribing to
core-count²).
- **Runner: `ubuntu-latest`** (a standard 2-vCPU GitHub-hosted runner).
The cache-cold whole-closure rebuild is CPU-bound, so swapping in a
larger x86_64-linux runner label (if the org has one) would shorten it;
`nix-fast-build` + `max-jobs = auto` scale to whatever cores the label
provides.
- **Dropped the `all` linkFarm** in `lib/per-system.nix`: it existed
only so `nix build` had one aggregate target, and `nix-fast-build`
enumerates every check itself.

The required `flake-check` status, the push-only Cachix writer (`if:
github.event_name == 'push'`), and `nix flake check -L --no-build` are
all retained. The two-job gate (`check-group` + `flake-check`) collapses
into a single `flake-check` job because there is no fan-out to
aggregate; the required status name is unchanged.
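
The core-budget arithmetic behind `max-jobs = auto` / `cores = 1` can be checked with a small sketch. It is an illustration only: the function names are hypothetical, and it assumes `max-jobs = auto` resolves to the runner's core count and that `cores = 0` lets each build use every core.

```python
def peak_threads(max_jobs: int, cores_per_build: int) -> int:
    """Worst-case number of build threads running at the same time."""
    return max_jobs * cores_per_build


def summarize(core_count: int) -> dict[str, int]:
    # Assumption: max-jobs = auto resolves to the number of available cores.
    max_jobs = core_count
    return {
        # cores = 1: each build is told to use one thread (NIX_BUILD_CORES=1).
        "cores=1": peak_threads(max_jobs, 1),
        # cores = 0: each build may use all cores, so demand grows quadratically.
        "cores=0": peak_threads(max_jobs, core_count),
    }


if __name__ == "__main__":
    for n in (2, 4, 16):
        print(n, summarize(n))
```

With `cores = 1` the peak demand equals the core count on any runner size, which is why the same settings carry over unchanged to a larger runner label.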
## Why it is faster, and what CI timing should confirm
The dominant cost is the custom-compiled znver5 closure
(`nixpkgs.hostPlatform.gcc.arch = "znver5"`), which only the
`indexable-inc` Cachix serves. Three independent levers attack it:
1. **Pipelined parallel eval/build.** The old step evaluated the whole
`all` linkFarm before any build started; `nix-fast-build` overlaps
evaluation and building and parallelizes both. *Confirm:* the "Build all
flake checks" step starts producing build output well before evaluation
finishes, and total wall time drops on multi-check runs.
2. **A second, faster cache layer in front of Cachix.** Restoring
`/nix/store` from GitHub's same-datacenter cache is faster than
re-substituting the same paths from Cachix over HTTPS. *Confirm:* on a
warm PR (no `flake.lock` change) the cache-nix-action restore reports a
hit and the build step is near-noop; compare warm-run wall time against
a cold run.
3. **More cores for the cold path.** A `flake.lock` bump invalidates the
store-cache key and forces a recompile of the closure; this is CPU-bound,
so a larger runner (if one is swapped in) shortens the long pole. The
`nix-${{ runner.os }}-` restore prefix still restores the previous lock's
store as a partial warm base. *Confirm:* a lock-bump run on a larger
runner is faster than the `ubuntu-latest` baseline, and still restores a
(mostly stale) store.

The first run on a new `flake.lock` (ideally a push to `main`) is the
slow one: it builds, pushes to Cachix, and saves the store cache; every
later run on that lock can restore from both.
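
The warm-hit versus lock-bump behavior described above can be sketched as a key lookup. This is a simplified model, not the action's implementation: the hash-suffixed key format, the function names, and the "most recently created prefix match wins" rule are assumptions, the last one modeled on how GitHub's cache restore keys behave.

```python
import hashlib
from typing import Optional


def primary_key(runner_os: str, flake_lock: bytes) -> str:
    # Hypothetical key format: the nix-<os>- prefix plus a hash of flake.lock.
    return f"nix-{runner_os}-{hashlib.sha256(flake_lock).hexdigest()}"


def resolve_restore(
    key: str, prefix: str, saved: list[tuple[str, int]]
) -> tuple[Optional[str], str]:
    """saved holds (cache_key, created_at) pairs; returns (restored_key, kind)."""
    for cache_key, _ in saved:
        if cache_key == key:
            return cache_key, "exact"  # warm run: same flake.lock
    candidates = [(created, k) for k, created in saved if k.startswith(prefix)]
    if candidates:
        # Lock bump: fall back to the newest store saved under the prefix.
        return max(candidates)[1], "prefix"
    return None, "miss"  # nothing to restore; everything comes from Cachix or builds


if __name__ == "__main__":
    old = primary_key("Linux", b"lock-v1")
    new = primary_key("Linux", b"lock-v2")
    saved = [(old, 100)]
    print(resolve_restore(old, "nix-Linux-", saved)[1])
    print(resolve_restore(new, "nix-Linux-", saved)[1])
```

An `exact` result corresponds to the near-noop warm run, and a `prefix` result to the lock-bump run that starts from a mostly stale store.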
## Coverage preserved (invariant 2)
`nix eval --accept-flake-config --json .#checks.x86_64-linux --apply
builtins.attrNames`, before vs after:
- before:
`["agents-md","all","eval","lint","loader-manifests","run-records-session","rust-package-tests","site-case-tests","site-test"]`
- after:
`["agents-md","eval","lint","loader-manifests","run-records-session","rust-package-tests","site-case-tests","site-test"]`

The only difference is the removed `all`, which was a pure aggregation
linkFarm over the other eight (its sole consumer was the CI `nix build …
.all`). The eight real checks are byte-for-byte unchanged.
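
The before/after comparison can be reproduced mechanically from the two JSON arrays quoted above. The sketch below is one way to do it; `diff_checks` is a hypothetical helper, and the arrays are copied verbatim from the `nix eval` output listed in this section.

```python
import json

BEFORE = json.loads(
    '["agents-md","all","eval","lint","loader-manifests",'
    '"run-records-session","rust-package-tests","site-case-tests","site-test"]'
)
AFTER = json.loads(
    '["agents-md","eval","lint","loader-manifests",'
    '"run-records-session","rust-package-tests","site-case-tests","site-test"]'
)


def diff_checks(before: list[str], after: list[str]) -> tuple[list[str], list[str]]:
    """Return (removed, added) check names, each sorted."""
    removed = sorted(set(before) - set(after))
    added = sorted(set(after) - set(before))
    return removed, added


if __name__ == "__main__":
    removed, added = diff_checks(BEFORE, AFTER)
    print("removed:", removed)
    print("added:", added)
```

This compares attribute names only; it says nothing about whether the derivations behind the remaining checks changed.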
## Action items and tradeoffs for humans
- **Runner label.** `flake-check` runs on `ubuntu-latest`, so CI
schedules and can go green on its own. If the org provisions a larger
x86_64-linux runner, swap its label into `runs-on:` to give the
CPU-bound cache-cold rebuild more cores; no other change is needed.
- **10 GB cache ceiling.** `gc-max-store-size-linux: 8G` trims the store
before save so the compressed cache stays under GitHub's 10 GB per-repo
limit. If the real check closure is much larger than 8 GB, the cache
covers only the hottest 8 GB and the rest still substitutes from Cachix
(still correct, partial speedup). Tune the cap upward while watching the
compressed cache size in the repo's Actions cache list; a save that
exceeds 10 GB is rejected wholesale.
- **znver5 scope deliberately unchanged (invariant 3).** I did not
narrow `gcc.arch = "znver5"` to fewer packages even though that would
cut build time, because it changes what is built. If the team wants it,
it should be evaluated as a separate behavior change.
- **cache-nix-action ↔ determinate-nix-action ordering.**
cache-nix-action documents DeterminateSystems as a compatible installer
and merges the Nix DB on restore, so install-then-restore is the
intended order. This interaction is exercised only on the Linux runner
(see below).
## Validation
Run locally on an aarch64-darwin dev host:
- attrNames before/after diff above — only `all` removed.
- `nixfmt --check lib/per-system.nix` using the repo's pinned
`formatter` — clean.
- `actionlint .github/workflows/check.yml` — the only finding is the
intended `LARGER_RUNNER_LABEL_TODO` unknown-label note; YAML and
embedded shell are otherwise clean.
- `git diff --check` — clean.
- `nix-fast-build 1.5.0 --help` — confirmed `--flake`, `--skip-cached`,
`--no-nom`, `--no-link`, and `--option` exist at the pinned commit.
- Action SHAs resolved via `gh api` (both are lightweight tags resolving
directly to commits).

Deferred to this PR's CI (x86_64-linux), and **not** verified locally:
- A full `nix flake check -L --no-build` and the actual check builds.
The dev host is aarch64-darwin; forcing the checks' `drvPath` triggers
the repo's IFD (cargo-unit's generated `cargo-units.nix`), which must
build x86_64-linux derivations that cannot be realized on darwin. The
`attrNames` eval, which does not force IFD, succeeds cleanly.
- The cache-nix-action store restore/save and its DB merge under
determinate-nix-action.
## Test plan
- [x] `runs-on` set to `ubuntu-latest` (swap in a larger runner label if
one is available).
- [ ] First run builds and (on push to `main`) populates Cachix + the
store cache.
- [ ] A follow-up warm PR shows a cache-nix-action restore hit and a
near-noop build step.
- [ ] Confirm the saved cache stays under 10 GB compressed.
---------
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>

1 parent 94262c9 commit 1ac27b7
2 files changed
Lines changed: 108 additions & 70 deletions