ci(nix): use runner nix on self-hosted lanes #243

Merged
Jesssullivan merged 1 commit into main from codex/socket-ci-narrow
Apr 22, 2026

Jesssullivan (Owner) commented Apr 22, 2026

Summary

  • add a shared self-hosted Nix bootstrap script for runner-owned daemon mode
  • switch the self-hosted socket, gpu, cjk, distro, ssh-proxy, and release lanes to use the runner Nix daemon instead of assuming a writable single-user store
  • make the socket and ssh-proxy scripts resolve the active Nix dynamic linker instead of guessing an arbitrary glibc path

Validation

  • workflow YAML parse
  • bash -n scripts/configure-self-hosted-nix.sh scripts/run-socket-tests.sh scripts/run-ssh-proxy-tests.sh
  • git diff --check
  • previous main failure boundary on this fork: run 24747790637 failed immediately at Build libghostty (Nix) with /nix/var/nix/db/big-lock: Permission denied
  • fresh branch proof will be re-dispatched on this narrowed head

Context

This is the narrowed restack of the larger codex/socket-ci-truthfulness branch. The goal is to land the self-hosted runner contract fix on the real fork base without pulling in unrelated socket/runtime feature work.

Jesssullivan force-pushed the codex/socket-ci-narrow branch from b953660 to 1749496, April 22, 2026 01:10
Jesssullivan changed the title from "ci(socket): restack self-hosted nix and headless fixes" to "ci(nix): use runner nix on self-hosted lanes", Apr 22, 2026
Jesssullivan marked this pull request as ready for review, April 22, 2026 02:24
Jesssullivan merged commit f64a777 into main, Apr 22, 2026
11 checks passed
Jesssullivan deleted the codex/socket-ci-narrow branch, April 22, 2026 02:24
greptile-apps (Bot) commented Apr 22, 2026

Greptile Summary

This PR introduces a shared scripts/configure-self-hosted-nix.sh bootstrap that probes for (or starts) a runner-owned Nix daemon, exports NIX_REMOTE=daemon and optional substituter config into $GITHUB_ENV, and updates six workflow files plus two test scripts to consume the daemon store via nix --store daemon. The interpreter-resolution logic in the socket and SSH-proxy runners is also improved to prefer $NIX_LD / $NIX_CC over a blind store glob.

All findings are P2. The hardcoded EOF delimiter in emit_github_env carries a small risk of silently truncating multi-line NIX_CONFIG values. resolve_nix_interpreter is duplicated between the two test scripts, and in run-ssh-proxy-tests.sh it is unreachable dead code (Linux exits before it is called, and macOS lacks patchelf).

Confidence Score: 5/5

Safe to merge; all findings are minor style/robustness suggestions with no impact on correctness.

No P0 or P1 issues found. The daemon-mode wiring, polling loop, and --store daemon propagation are all logically sound. Remaining comments are low-risk cleanup (EOF delimiter, dead code, duplication).

scripts/configure-self-hosted-nix.sh (EOF delimiter in emit_github_env); scripts/run-ssh-proxy-tests.sh (dead resolve_nix_interpreter function).

Important Files Changed

| Filename | Overview |
| --- | --- |
| scripts/configure-self-hosted-nix.sh | New bootstrap script: probes for a pre-existing Nix daemon, optionally starts determinate-nixd, and exports NIX_REMOTE=daemon + NIX_CONFIG into $GITHUB_ENV; hardcoded EOF delimiter in emit_github_env is a minor robustness gap. |
| scripts/run-socket-tests.sh | Adds resolve_nix_interpreter() with four progressively-broader fallbacks (NIX_LD → NIX_CC → cc path → store glob), replacing the fragile single-glob approach; logic is sound for Linux. |
| scripts/run-ssh-proxy-tests.sh | Adds the same resolve_nix_interpreter() function, but the function is unreachable in practice: Linux exits early (macOS-only gate) and macOS lacks patchelf; dead code worth cleaning up. |
| .github/workflows/test-socket.yml | Replaces cachix/install-nix-action with configure-self-hosted-nix.sh, switches all nix calls to --store daemon, and adds the continue-on-error + outcome-check pattern to preserve artifact upload while still failing the job on test regression. |
| .github/workflows/test-cjk.yml | Drops cachix/install-nix-action in favour of configure-self-hosted-nix.sh; all nix invocations updated to --store daemon. |
| .github/workflows/test-gpu.yml | Same pattern as test-cjk.yml: drops cachix action, adopts configure-self-hosted-nix.sh and --store daemon throughout. |
| .github/workflows/test-distro.yml | Inserts configure-self-hosted-nix.sh + verify steps before KVM validation; nix build commands updated to --store daemon. |
| .github/workflows/test-ssh-proxy.yml | Inserts configure-self-hosted-nix.sh + verify steps; nix develop/build commands updated to --store daemon. |
| .github/workflows/release-linux.yml | Adds configure-self-hosted-nix.sh + daemon verify before KVM check; nix build steps updated to --store daemon. |
| .github/workflows/release-qcow2.yml | Adds configure-self-hosted-nix.sh + daemon verify; QCOW2 build nix command updated to --store daemon. |

Sequence Diagram

sequenceDiagram
    participant GHA as GitHub Actions Runner
    participant CFG as configure-self-hosted-nix.sh
    participant NIXD as Nix Daemon (daemon store)
    participant WF as Workflow Step (nix --store daemon)

    GHA->>CFG: bash scripts/configure-self-hosted-nix.sh
    CFG->>NIXD: nix store info --store daemon (probe)
    alt Daemon already running
        NIXD-->>CFG: OK
        CFG->>GHA: export NIX_REMOTE=daemon to GITHUB_ENV
    else determinate-nixd available
        CFG->>NIXD: nohup determinate-nixd daemon &
        loop poll (100 x 0.2 s)
            CFG->>NIXD: nix store info --store daemon
            NIXD-->>CFG: OK (eventually)
        end
        CFG->>GHA: export NIX_REMOTE=daemon to GITHUB_ENV
    else no daemon
        CFG->>GHA: exit 1 + emit_diagnostics
    end
    CFG->>GHA: export NIX_CONFIG (substituters) to GITHUB_ENV
    GHA->>WF: run subsequent steps
    WF->>NIXD: nix --store daemon build / develop
    NIXD-->>WF: build outputs via shared /nix/store
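
The polling branch of the diagram can be sketched as a small retry helper. This is only an illustration under stated assumptions, not the code in configure-self-hosted-nix.sh: the name poll_until and its argument order are hypothetical, and the probe command is passed in rather than hardcoded.

```shell
# poll_until ATTEMPTS DELAY CMD [ARGS...]
# Hypothetical helper (not taken from the PR): re-run CMD until it succeeds,
# sleeping DELAY seconds after each failed attempt, and give up after
# ATTEMPTS tries. Returns 0 on the first success, 1 if every attempt fails.
poll_until() {
  local attempts="$1" delay="$2" i=0
  shift 2
  while [ "$i" -lt "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  return 1
}

# Example call mirroring the probe and budget shown in the diagram (100 x 0.2 s):
#   poll_until 100 0.2 nix store info --store daemon
```

Passing the probe as arguments keeps the retry logic independent of Nix, so the same helper could wrap any readiness check.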

Reviews (1): Last reviewed commit: "ci(nix): use runner nix on self-hosted l..."

Comment on lines +17 to +26
emit_github_env() {
  local key="$1"
  shift

  {
    printf '%s<<EOF\n' "$key"
    printf '%s\n' "$@"
    printf 'EOF\n'
  } >>"$GITHUB_ENV"
}

P2 Hardcoded EOF delimiter may truncate NIX_CONFIG

The GitHub Actions multiline env-file format requires that the chosen delimiter does not appear as a standalone line anywhere in the value. Using the fixed string EOF means any pre-existing NIX_CONFIG that contains a bare EOF line would silently truncate the exported value, causing downstream nix invocations to receive incomplete config with no error. GitHub's own documentation recommends generating a random delimiter (e.g. via $RANDOM or a hash) to eliminate the collision risk entirely.
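
One possible shape for the random-delimiter variant is sketched below. It is an assumption-laden illustration rather than the fix that landed: the ghadelim_ prefix and the use of od on /dev/urandom are choices made here, not taken from the PR.

```shell
# Sketch of emit_github_env with a per-call random delimiter.
# Assumptions (not from the PR): the "ghadelim_" prefix and reading
# 16 random bytes from /dev/urandom via od.
emit_github_env() {
  local key="$1"
  shift

  # 32 hex characters, so a collision with a line of the value is
  # practically impossible.
  local delimiter
  delimiter="ghadelim_$(od -An -N16 -tx1 /dev/urandom | tr -d ' \n')"

  {
    printf '%s<<%s\n' "$key" "$delimiter"
    printf '%s\n' "$@"
    printf '%s\n' "$delimiter"
  } >>"$GITHUB_ENV"
}
```

With this shape, a value line consisting solely of EOF is written through unchanged instead of ending the block early.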

Comment on lines +14 to +41
resolve_nix_interpreter() {
  if [ -n "${NIX_LD:-}" ] && [ -e "${NIX_LD}" ]; then
    printf '%s\n' "${NIX_LD}"
    return 0
  fi

  if [ -n "${NIX_CC:-}" ] && [ -r "${NIX_CC}/nix-support/dynamic-linker" ]; then
    head -n 1 "${NIX_CC}/nix-support/dynamic-linker"
    return 0
  fi

  if command -v cc >/dev/null 2>&1; then
    local cc_path cc_root
    cc_path="$(command -v cc)"
    cc_root="${cc_path%/bin/cc}"
    if [ -r "${cc_root}/nix-support/dynamic-linker" ]; then
      head -n 1 "${cc_root}/nix-support/dynamic-linker"
      return 0
    fi
  fi

  if compgen -G "/nix/store/*glibc*/lib/ld-linux-x86-64.so.2" >/dev/null; then
    ls /nix/store/*glibc*/lib/ld-linux-x86-64.so.2 2>/dev/null | sort -V | tail -1
    return 0
  fi

  return 1
}

P2 resolve_nix_interpreter is unreachable in this script

On Linux runners, the script exits at line 57 (uname -s != Darwin) before resolve_nix_interpreter is ever called at line 69. On macOS (where execution continues), patchelf is not available, so the call at lines 70–73 is a no-op. This makes the function dead code in every actual execution path. The parallel copy in run-socket-tests.sh is live (Linux-only script), but this one can be safely removed. If patchelf support on macOS is ever intended, that plan should be documented.

Comment on lines +33 to +60
resolve_nix_interpreter() {
  if [ -n "${NIX_LD:-}" ] && [ -e "${NIX_LD}" ]; then
    printf '%s\n' "${NIX_LD}"
    return 0
  fi

  if [ -n "${NIX_CC:-}" ] && [ -r "${NIX_CC}/nix-support/dynamic-linker" ]; then
    head -n 1 "${NIX_CC}/nix-support/dynamic-linker"
    return 0
  fi

  if command -v cc >/dev/null 2>&1; then
    local cc_path cc_root
    cc_path="$(command -v cc)"
    cc_root="${cc_path%/bin/cc}"
    if [ -r "${cc_root}/nix-support/dynamic-linker" ]; then
      head -n 1 "${cc_root}/nix-support/dynamic-linker"
      return 0
    fi
  fi

  if compgen -G "/nix/store/*glibc*/lib/ld-linux-x86-64.so.2" >/dev/null; then
    ls /nix/store/*glibc*/lib/ld-linux-x86-64.so.2 2>/dev/null | sort -V | tail -1
    return 0
  fi

  return 1
}

P2 resolve_nix_interpreter duplicated across both test scripts

The function body is identical in run-socket-tests.sh and run-ssh-proxy-tests.sh. Maintaining two copies means any future fix (e.g. supporting aarch64 interpreter paths) must be applied in both places. Consider extracting it to a shared helper such as scripts/nix-lib.sh and sourcing it from both callers.
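
As an illustration of what such a shared helper might hold, the sketch below maps a machine architecture to the glibc dynamic-linker file name, which is the piece the store-glob fallback currently hardcodes for x86-64. The function name nix_interpreter_basename is hypothetical and is not part of the PR.

```shell
# Candidate contents for a shared helper such as scripts/nix-lib.sh.
# nix_interpreter_basename [ARCH]
# Hypothetical function: print the glibc dynamic-linker file name for ARCH
# (defaulting to `uname -m`), or return 1 for an architecture not listed.
nix_interpreter_basename() {
  case "${1:-$(uname -m)}" in
    x86_64)
      printf '%s\n' 'ld-linux-x86-64.so.2'
      ;;
    aarch64 | arm64)
      printf '%s\n' 'ld-linux-aarch64.so.1'
      ;;
    *)
      return 1
      ;;
  esac
}
```

Both callers could then source the helper and build the glob from its output, so an architecture fix would only need to be made once.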
