ci: route compile-heavy jobs to self-hosted Linux runner by thehoff · Pull Request #179 · thehoff/contextcrawler

thehoff · 2026-05-25T11:52:34Z

Summary

Adds a CI workflow that targets the self-hosted Linux/X64 runner registered to this repo, taking compile-heavy Rust jobs off the Hoff's laptop and onto a DMZ-isolated LXC.


Trigger	`push` to in-repo branches (develop/main/feat/fix/harden/polish/perf/docs/ci/**) + `workflow_dispatch`
Explicitly NOT	`pull_request` — fork PRs cannot execute code on the self-hosted box
Jobs	`cargo test --bin contextcrawler` (30 min cap), `cargo clippy -- -D warnings` (15 min cap)
Caching	`Swatinem/rust-cache@v2`, shared key `self-hosted-stable`
Concurrency	Per-ref, in-flight runs cancelled on push

Security posture

Repo is public + fork. Self-hosted runners on public repos require defence-in-depth against rogue fork PRs.
This workflow's triggers cannot fire on a fork PR by design (push refs are owned by the repo).
Repo Settings -> Actions -> General -> "Require approval for all outside collaborators" enabled out-of-band as belt-and-braces.
LXC is in its own DMZ VLAN with no LAN reachback, internet egress only.

Bootstrap completed on the runner

Bare LXC, one-shot install run before this PR:

apt install -y build-essential pkg-config libssl-dev cmake \
               git curl ca-certificates jq

Rust toolchain installs in-job via dtolnay/rust-toolchain@stable. No permanent host install.

gitignore carve-out

.github/ was previously ignored wholesale with a "never publish" comment. Replaced with .github/* + !.github/workflows/ so shipped CI files can land while local-only .github/instructions/, .github/CICD.md, etc. remain ignored.

Side observation: there are five local-only files under .github/workflows/ (ci.yml, cd.yml, next-release.yml, pr-target-check.yml, CICD.md), presumably inherited from upstream rtk-ai/rtk and never tracked in this fork. They remain untracked. Whether to adopt any of those upstream-derived workflows is a separate decision and out of scope for this PR.

Test plan

Merge to develop
Confirm workflow appears in Actions tab
First push to develop triggers a run on github-runner-1
First run completes (compile + test + clippy) within 30 min budget
Second run from cache lands in <2 min
Manual workflow_dispatch smoke test succeeds

🤖 Generated with Claude Code

Adds a CI workflow targeting the `[self-hosted, Linux, X64]` runner registered to this repo. Triggered on pushes to in-repo branches and `workflow_dispatch`, deliberately NOT on `pull_request`: fork PRs must not be able to execute arbitrary code on the self-hosted box. Outside-contributor PRs continue to hit whichever cloud-hosted workflows exist on `ubuntu-latest`.

Two jobs: `cargo test --bin contextcrawler` (30 min cap) and `cargo clippy -- -D warnings` (15 min cap). Both use `Swatinem/rust-cache@v2` with a shared `self-hosted-stable` key so the second run onwards is near-instant. Concurrency group cancels in-flight runs on the same ref to avoid queueing up pushes from the same branch.

The runner LXC is a bare Linux box in a DMZ VLAN with no LAN reachback, internet egress only. One-shot host bootstrap:

apt install -y build-essential pkg-config libssl-dev cmake \
               git curl ca-certificates jq

Rust toolchain installs in-job via `dtolnay/rust-toolchain@stable`, no permanent host install.

Belt-and-braces: repo Settings -> Actions -> General -> "Require approval for all outside collaborators" enabled out-of-band so cloud workflows don't fire on unreviewed fork PRs either.

Also carves `.github/workflows/` out of the broader `.github/` gitignore rule so shipped CI files can actually land. Other `.github/*` paths (CICD.md, instructions/, etc.) remain ignored.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Defence-in-depth on the self-hosted runner workflow:

1. SHA-pin every third-party action so a compromised tag re-point cannot poison the runner (mirrors the tj-actions/changed-files incident shape from March 2025). Version comments record what the SHA resolved from at pinning time. Update via Dependabot.
2. Top-level `permissions: contents: read` locks GITHUB_TOKEN to read-only explicitly, not just by repo default. A malicious step in a transitively pulled dependency still cannot push, open issues, or mutate the repo.
3. `persist-credentials: false` on every checkout. Stops the token from being written into `.git/config` and surviving on the runner workspace between steps.

Combined with the `push`-only triggers and the host-side `--ephemeral` registration (separate operational step), the runner is now defensible for a public-fork repo.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

First runner job revealed two unrelated issues:

1. `dtolnay/rust-toolchain@stable` fetched Rust 1.95.0, way ahead of the declared `rust-version = "1.80"` MSRV. Rust 1.95's clippy added new lints (doc_lazy_continuation, type_complexity tightening) plus an `incompatible_msrv` error for the existing `std::iter::repeat_n` usage (stable since 1.82). The lints firing on a clean codebase are toolchain drift, not bugs.
2. The clippy job ran with `-- -D warnings`, escalating every new advisory to a build failure. Combined with #1 above, the workflow was effectively unbuildable.

Fix: pin the toolchain to `1.82` (newest version still aligned with the actual MSRV the code uses; `repeat_n` works) and drop `-D warnings` from clippy so warnings are visible but non-fatal. Re-tighten after a dedicated lint-cleanup pass lands.

Also collapses the duplicate `with:` block in the clippy job that slipped in during the previous edit.

The `cargo test` job exited 143 (SIGTERM) on the previous run; that was collateral from the workflow's job-failure cascade, not a real test failure. Re-run with the fixed clippy gate will tell us if the test job lands clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Previous pin to 1.82 broke on the live runner: a transitive dep `ignore-0.4.25` declares `edition = "2024"` in its Cargo.toml, which Cargo can only parse once `edition2024` is stabilized. That stabilized in Rust 1.85. Failure mode was `feature 'edition2024' is required` on `cargo fetch`, killing both test and clippy jobs in ~15s before any real work ran.

Bumping the pinned toolchain to 1.85 is the smallest version that parses the current dependency graph. Still ahead of the project's declared MSRV (1.80, also stale: `std::iter::repeat_n` needs 1.82) but acceptable for CI; MSRV cleanup is a separate concern filed against the project.

The JIT runner loop is now live on github-runner-1 (systemd unit `actions-jit-runner.service`), so this push fires immediately.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Pinning to 1.85 hit the next wall: source uses `str::floor_char_boundary`, which is still unstable on 1.85 (it was stabilized in Rust 1.91). The codebase actually needs a moderately recent stable, and progressively pinning each time a newer feature shows up is whack-a-mole.

Drop the explicit pin; `dtolnay/rust-toolchain@<sha>` defaults to the stable channel ref it was pinned at, which resolves to whatever stable is current at run time (1.95.x at present).

The original 1.95 lints that surfaced earlier are now non-fatal because the `-D warnings` escalation was already removed in a previous commit. Lints stay visible in the log without bricking the build.

If a future stable starts breaking the build on a real (non-lint) change, re-introduce the pin at that point, but track current stable rather than the declared MSRV.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The prior `replace_all` that stripped `toolchain: "1.85"` from both jobs accidentally left an orphan `components: clippy` line in the clippy job without its parent `with:` key. Result: invalid YAML, run 26401651631 failed at workflow parse time with no jobs ever started (`headBranch: null`, zero duration).

Restoring the `with:` block fixes the YAML. Adding a python YAML validation step would catch this earlier but is out of scope for this fix; the CI itself will surface malformed workflow files going forward.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

- Add cd.yml, ci.yml, next-release.yml, pr-target-check.yml, CICD.md (previously held back by the .github/ blanket-ignore, now within the workflows/ exception added earlier on this branch).
- Drop personal reference from ci-self-hosted.yml header.
- .gitignore: silence local-only peer-review patches + stray playwright-mcp package-lock.json.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…#180)

Both gates serialised the raw shell command verbatim into JSONL on disk. Last-24h scan of a single user's downgrades.jsonl found 27 40-hex tokens and 15 `Authorization: token <hex>` headers captured in cleartext at a predictable path.

Add `core::secret_redact::redact` and apply it at both write sites (`tirith_gate::log_downgrade`, `supply_chain_gate::log_event`). Covered patterns:

- URL basic-auth (`https://user:pw@host`)
- `Authorization: token|Bearer <value>` headers
- GitHub PAT prefixes (gho_/ghp_/ghs_/ghu_/github_pat_)
- Env-var assignments to credential-shaped names (matches `*_TOKEN`/`*_KEY`/`*_SECRET`/`*_PASSWORD`/`*_PAT`/`*_APIKEY`/`*_AUTH` and bare equivalents; leaves PATH/HOME/etc. alone)
- CLI flags `--token`/`--auth-token`/`--password`/`--api-key`/`--secret`, space-separated or `=`-attached

Conservative scrubber: prefer false negatives over corrupting the diagnostic value of the log. Zero-copy fast path (`Cow::Borrowed`) when the cmd has nothing to scrub. Idempotent.

15 unit tests cover each pattern + idempotency + the PATH-must-not-be-redacted invariant.

Out of scope: backfill scrub utility for existing logs (follow-up), log rotation, encryption at rest.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
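To illustrate the zero-copy fast path and the idempotency property described above, here is a minimal std-only sketch. It is not the `core::secret_redact` implementation: it covers only the GitHub token prefixes, and the `[REDACTED]` placeholder, the `is_word_char` helper and the boundary rule are assumptions made for the example.

```rust
use std::borrow::Cow;

// Assumption: only the GitHub token prefixes from the list above are handled.
static PAT_PREFIXES: [&str; 5] = ["github_pat_", "ghp_", "gho_", "ghs_", "ghu_"];
// Assumption: the placeholder text is hypothetical.
const PLACEHOLDER: &str = "[REDACTED]";

fn is_word_char(c: char) -> bool {
    c.is_ascii_alphanumeric() || c == '_'
}

/// Replaces GitHub-style tokens with a placeholder.
/// Returns `Cow::Borrowed` when no prefix occurs anywhere in `cmd`.
fn redact(cmd: &str) -> Cow<'_, str> {
    // Fast path: nothing token-shaped, so hand the input back without copying.
    if !PAT_PREFIXES.iter().any(|p| cmd.contains(*p)) {
        return Cow::Borrowed(cmd);
    }

    let mut out = String::with_capacity(cmd.len());
    let mut rest = cmd;
    let mut prev: Option<char> = None;

    while let Some(ch) = rest.chars().next() {
        // Only match a prefix at a word boundary, to stay conservative.
        let at_boundary = prev.map_or(true, |p| !is_word_char(p));
        let hit = if at_boundary {
            PAT_PREFIXES.iter().find(|p| rest.starts_with(**p))
        } else {
            None
        };

        if let Some(prefix) = hit {
            let body = &rest[prefix.len()..];
            let body_len = body.find(|c: char| !is_word_char(c)).unwrap_or(body.len());
            if body_len > 0 {
                // Drop prefix + token body, emit the placeholder instead.
                out.push_str(PLACEHOLDER);
                rest = &body[body_len..];
                prev = Some(']');
                continue;
            }
        }

        out.push(ch);
        prev = Some(ch);
        rest = &rest[ch.len_utf8()..];
    }
    Cow::Owned(out)
}
```

Because the placeholder contains none of the prefixes, running `redact` on already-redacted text takes the borrowed fast path, which is one simple way to get idempotency.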

…#180)

Two follow-ups after a real-world scrub of one user's existing logs found 30 surviving secret-shaped strings:

1) The `tirith` field in downgrades.jsonl is spliced in verbatim from the tirith subprocess output. That blob frequently echoes the original command (and any inline credentials) back inside its findings. Apply the same redactor to it before splicing.
2) git-credential-helper feeds creds over a pipe as `protocol=...\nhost=...\nusername=...\npassword=<TOKEN>` where the `\n` is a literal two-char escape. From the regex engine's POV, `password` lives mid-word and `\b` doesn't anchor. Add a targeted pattern that matches `(\\n|\\r)(password|token|secret|auth)=...` and preserves the escape prefix in the replacement.

Add a unit test for the git-credential-helper case + document the one remaining known limitation (`T=<40-hex>` one-letter aliases can't be safely caught by name-shape alone without false-positiving git SHAs).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

#180)

Closes the final acceptance item on #180. The redactor lives in core::secret_redact; this exposes it as a one-shot CLI action that deep-walks every string in both audit JSONL files and rewrites them atomically through a temp file, with a timestamped backup left alongside.

Behaviour:

- `contextcrawler security --scrub-logs`: live rewrite, prints per-file stats (lines / changed / unparseable) and backup path.
- `contextcrawler security --scrub-logs --dry-run`: same scan + report, no files touched. Useful before committing to a rewrite.
- Unparseable lines (e.g. heredoc-with-embedded-newlines records that broke JSONL framing) get a raw-line redaction fallback so noise can't smuggle secrets through.

Refactored the I/O core into `scrub_logs_in(&Path, dry_run)` so it's unit-testable against a tempdir. Public `ScrubReport` / `ScrubFileReport` structs expose per-file counts for callers that want to drive it programmatically.

Three new tests:

- credentials in both cmd AND nested tirith blob are stripped + backup written
- dry-run reports counts without mutating files
- missing files are skipped gracefully

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
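The rewrite-through-a-temp-file pattern with a dry-run switch can be sketched as follows. This is an illustration under stated assumptions rather than the `scrub_logs_in` code: `FileReport`, `scrub_file`, the caller-supplied `redact_line` closure and the fixed `.bak`/`.tmp` suffixes are all hypothetical (the commit describes a timestamped backup and JSON-aware walking, neither of which is reproduced here).

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical per-file report; field names are assumptions.
struct FileReport {
    lines: usize,
    changed: usize,
    backup: Option<PathBuf>,
}

/// Applies `redact_line` to every line of `path`.
/// With `dry_run` set, only counts are produced and nothing is written.
fn scrub_file(
    path: &Path,
    dry_run: bool,
    redact_line: impl Fn(&str) -> String,
) -> io::Result<FileReport> {
    let original = fs::read_to_string(path)?;
    let mut lines = 0;
    let mut changed = 0;
    let mut rewritten = String::with_capacity(original.len());

    for line in original.lines() {
        lines += 1;
        let clean = redact_line(line);
        if clean != line {
            changed += 1;
        }
        rewritten.push_str(&clean);
        rewritten.push('\n');
    }

    if dry_run || changed == 0 {
        return Ok(FileReport { lines, changed, backup: None });
    }

    // Keep a copy of the untouched file next to the original.
    // Assumption: fixed suffix instead of a timestamp, for brevity.
    let backup = path.with_extension("jsonl.bak");
    fs::copy(path, &backup)?;

    // Write the new content to a sibling temp file, then rename it over the
    // original so readers never observe a half-written log.
    let tmp = path.with_extension("jsonl.tmp");
    fs::write(&tmp, rewritten)?;
    fs::rename(&tmp, path)?;

    Ok(FileReport { lines, changed, backup: Some(backup) })
}
```

Keeping the temp file in the same directory as the target matters for this pattern, since a rename is only atomic within one filesystem.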

…ff bypass (#181)

The gate's own error messages explicitly tell users:

Overrides: rerun with CONTEXTCRAWLER_SUPPLY_CHAIN=off, or add the package …

But that hint is misleading. `CONTEXTCRAWLER_SUPPLY_CHAIN=off pip install …` scopes the assignment to the `pip install` subprocess; the gate has already run by then and only reads its own process env. So the user follows the documented bypass, is still blocked, and concludes the gate is buggy.

Add `cmd_has_leading_assignment(cmd, name, allowed)` and call it from `check()` after the existing `std::env::var` branch. It parses leading POSIX-style `NAME=VALUE` tokens in the cmd string, stops at the first non-assignment token (so mid-cmd `&& FOO=bar` does not bypass), and returns true if `name` appears with one of the allowed values.

Conservative on value parsing: bareword values only. The bypass values we care about are short (`off`/`0`/`false`/`no`), and supporting shell quoting here would just create a different surprise.

Tests: 8 unit tests cover the documented form, sibling assignments, value variants, the must-be-prefix invariant, defensive `=on` rejection, exact-value-match guard, invalid identifiers, and empty cmd.

The existing `std::env::var` bypass path is unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
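One plausible shape for the leading-assignment check described above is sketched below. The function name and parameters come from the commit message; the body, the `is_shell_identifier` and `is_bareword` helpers, and the exact bareword character set are assumptions, not the project's code.

```rust
/// Returns true if `cmd` starts with a run of `NAME=VALUE` tokens and one of
/// them assigns `name` to a value listed in `allowed`.
fn cmd_has_leading_assignment(cmd: &str, name: &str, allowed: &[&str]) -> bool {
    for token in cmd.split_whitespace() {
        // The first token without `=` ends the assignment prefix.
        let Some((key, value)) = token.split_once('=') else {
            return false;
        };
        // Anything that is not a plain NAME=bareword pair also ends the scan,
        // which keeps quoted or otherwise unusual values from bypassing.
        if !is_shell_identifier(key) || !is_bareword(value) {
            return false;
        }
        if key == name && allowed.iter().any(|a| *a == value) {
            return true;
        }
    }
    false
}

/// POSIX-style variable name: letter or underscore, then letters, digits, underscores.
fn is_shell_identifier(s: &str) -> bool {
    let mut chars = s.chars();
    match chars.next() {
        Some(c) if c.is_ascii_alphabetic() || c == '_' => {}
        _ => return false,
    }
    chars.all(|c| c.is_ascii_alphanumeric() || c == '_')
}

/// Assumption: a bareword is a non-empty run of letters, digits, `_`, `-`, `.`.
fn is_bareword(s: &str) -> bool {
    !s.is_empty()
        && s.chars()
            .all(|c| c.is_ascii_alphanumeric() || matches!(c, '_' | '-' | '.'))
}
```

In this sketch a call such as `cmd_has_leading_assignment("CONTEXTCRAWLER_SUPPLY_CHAIN=off pip install requests", "CONTEXTCRAWLER_SUPPLY_CHAIN", &["off", "0", "false", "no"])` returns true, while the same assignment placed after `&&` later in the command does not, because scanning stops at the first non-assignment token.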

…'t blackout the gate (#182)

Five `Verdict::Unavailable` events in one user's 24h logs traced to a single failed registry/OSV call early in the package loop. Once the first transient error fires, `transient_err.get_or_insert(e)` captures it, the loop moves on without further upstream calls succeeding into findings, and `check()` falls through to `Verdict::Unavailable` even though a retry would have cleared it.

Add a retry-with-backoff to `http_get_json` and `http_post_json`:

- 1 retry max (2 attempts total) to keep the worst-case per-call within the CHECK_WALL_BUDGET = 25s.
- Per-attempt timeout dropped from 8s to 5s. Total per-call worst case: 5s + 250ms backoff + 5s = ~10.25s. Two slow packages still fit.
- Retry only on retryable error shapes:
  * `ureq::Error::Transport(_)`: DNS hiccup, connection reset, read timeout. Exactly the class that produced the user's blackouts.
  * `ureq::Error::Status(500..600, _)`: registry unhealthy / transient overload. Worth a single retry.
- 4xx is terminal: `404` (no such package), `401/403` (auth), `422` (malformed), `429` (rate-limit) all need *something other than immediate retry*. Bouncing harder against a rate-limiter just makes it worse.

The retry-or-not policy is lifted into a `HttpErrTag`-keyed pure function (`is_retryable_http_err_tag`) so it can be unit-tested without constructing a real `ureq::Response`/`ureq::Transport`.

Six new tests: 5xx-retryable, 4xx-not-retryable, 2xx/3xx-not-retryable defensive case, transport-retryable, and a budget-arithmetic guard that ensures the retry math always fits inside CHECK_WALL_BUDGET, so a future loosening of the constants can't silently push worst-case beyond the deadline.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
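The retry policy and the budget arithmetic above can be sketched without any HTTP client. The names `HttpErrTag`, `is_retryable_http_err_tag` and `CHECK_WALL_BUDGET` and the numeric values come from the commit message; the enum variants, the other constant names, `worst_case_per_call` and the `with_retry` helper are assumptions for illustration.

```rust
use std::time::Duration;

const CHECK_WALL_BUDGET: Duration = Duration::from_secs(25);
// Assumed constant names; the values are the ones quoted above.
const PER_ATTEMPT_TIMEOUT: Duration = Duration::from_secs(5);
const RETRY_BACKOFF: Duration = Duration::from_millis(250);
const MAX_ATTEMPTS: u32 = 2;

/// Assumed shape: a client-independent tag for the error classes that matter.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum HttpErrTag {
    Transport,
    Status(u16),
}

/// Transport errors and 5xx are retried; everything else is terminal.
fn is_retryable_http_err_tag(tag: HttpErrTag) -> bool {
    match tag {
        HttpErrTag::Transport => true,
        HttpErrTag::Status(code) => (500..600).contains(&code),
    }
}

/// Worst case for one call: every attempt times out, with a backoff between attempts.
fn worst_case_per_call() -> Duration {
    PER_ATTEMPT_TIMEOUT * MAX_ATTEMPTS + RETRY_BACKOFF * (MAX_ATTEMPTS - 1)
}

/// Runs `attempt` up to MAX_ATTEMPTS times, sleeping between retryable failures.
fn with_retry<T>(mut attempt: impl FnMut() -> Result<T, HttpErrTag>) -> Result<T, HttpErrTag> {
    let mut tries = 0;
    loop {
        tries += 1;
        match attempt() {
            Ok(value) => return Ok(value),
            Err(tag) if tries < MAX_ATTEMPTS && is_retryable_http_err_tag(tag) => {
                std::thread::sleep(RETRY_BACKOFF);
            }
            Err(tag) => return Err(tag),
        }
    }
}
```

With these constants `worst_case_per_call()` is 10.25 s, so two worst-case calls take 20.5 s and stay inside the 25 s budget. An assertion on that inequality is the kind of guard the commit describes for catching a later loosening of the constants.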

Removes docs/audits/HANDOVER-2026-05-22.md. The doc captured useful session state but contained working-style detail that doesn't belong in the public repo. Session state lives in local context, not here.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

thehoff and others added 14 commits May 25, 2026 21:52

chore(gitignore): silence playwright-mcp captures + python bytecode

e54d10f

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

noogalabs pushed a commit to noogalabs/contextcrawler that referenced this pull request Jun 4, 2026

chore(master): release 0.21.1 (thehoff#179)

c29644b

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

thehoff closed this Jun 10, 2026

thehoff deleted the ci/self-hosted-runner branch June 10, 2026 13:11

thehoff wants to merge 14 commits into develop from ci/self-hosted-runner
