Run full test suite in CI and migrate platform skips to describe.if#197
Merged
dylan-conway merged 6 commits intomainfrom Mar 31, 2026
Merged
Run full test suite in CI and migrate platform skips to describe.if#197dylan-conway merged 6 commits intomainfrom
dylan-conway merged 6 commits intomainfrom
Conversation
CI was running test:unit + test:integration, a curated subset of 5 files.
Most test files were never run in CI. Switch to `npm test` which runs
everything. Drop the test:unit/test:integration scripts.
Migrate the inline `if (skipIfNotLinux()) return` pattern to bun's native
`describe.if()`/`it.if()`. The old pattern made wrong-platform tests show
as pass (zero assertions, green checkmark) instead of skip — CI's test
count looked the same regardless of what actually ran. New
test/helpers/platform.ts exports isLinux/isMacOS/isSupportedPlatform.
Delete ~310 lines of unreachable tests from seccomp-filter.test.ts:
- skipIfNotAnt() gate checked USER_TYPE env var that nothing sets
- Two tests called wrapCommandWithSandboxLinux() with no restrictions,
which returns the command unwrapped at the early-return check —
expect("echo test").not.toContain("apply-seccomp") was vacuously true
Pin allow-read root-deny tests to /bin/bash — EXEC_DEPS doesn't list
/opt/homebrew, so execvp failed on Macs with homebrew bash as SHELL.
Add docker-tests CI job: unprivileged container on both arches,
exercises enableWeakerNestedSandbox end-to-end.
Drop push trigger from '**' to 'main' — PRs were running the full
matrix twice (once for branch push, once for the PR event).
mock.module patches bun's module cache globally and never unmocks. With npm test running all files in one process (instead of the old test:unit + test:integration split), the mock leaked: every file that imported getApplySeccompBinaryPath after this one got () => null, so pid-namespace-isolation.test.ts and integration.test.ts failed in beforeAll. spyOn swaps one export binding; mockRestore in afterEach puts it back. The callee's own import binding routes through the same slot in bun, so checkLinuxDependencies sees the spy without any module-level surgery. Also spies on whichSync directly rather than overwriting Bun.which on globalThis — same fix, closer to what's actually being tested. Drop stale README reference to the deleted test:integration script.
The full suite assumes bwrap --proc /proc works; an unprivileged container doesn't have CAP_SYS_ADMIN for that. Only tests that set enableWeakerNestedSandbox can pass there. Instead of filtering which unit tests to run, test the thing the job is for: build srt, run it with enableWeakerNestedSandbox, check that allowed writes land, denied writes don't, and the seccomp filter blocks AF_UNIX. Gated on SRT_E2E_DOCKER so host jobs skip it.
SandboxRuntimeConfigSchema requires network (no .optional()). Without it loadConfig returns null, srt falls through to getDefaultConfig, and the sandbox enforces a different allowWrite than the test expects.
The three it.if(isLinux) tests each run two spawnSync calls with curl --max-time 3 then --max-time 5. When example.com responds slowly both curls run to their limits and the body takes ~8s, but bun's default test timeout is 5000ms. bun aborts mid-body; afterEach runs reset() against an in-flight spawn and the next test sees stale state. These were never in test:integration so they never ran on CI before this branch. On fast responses they complete in under 200ms.
km-anthropic
approved these changes
Mar 31, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
CI was running a hand-picked subset of 5 test files via
test:unit+test:integration.Most tests never ran in CI at all. This switches to
npm test(everything) and fixeswhat that surfaces.
Changes
describe.if()migration — 13 test files usedif (skipIfNotLinux()) returnatthe top of each test. On the wrong platform the test runs zero assertions and reports
pass. Now they report skip: before-skip-count 0, after ~149 on macOS / ~55 on Linux.
New
test/helpers/platform.tsexports the conditions.Dead tests in seccomp-filter.test.ts — deleted:
skipIfNotAnt()checkedUSER_TYPE=ant, which nothing sets. Tests behind it never ran.wrapCommandWithSandboxLinux()with norestrictions — hits the early return at
linux-sandbox-utils.ts:1072and givesback the unwrapped command.
expect("echo test").not.toContain("apply-seccomp")was always true for the wrong reason.
expect(true).toBe(true)placeholders.linux-dependency-error.test.ts — replaced
mock.modulewithspyOn.mock.modulepatches bun's module cache and never un-patches; running the full suite in one process
meant every file that imported
generate-seccomp-filterafter this one got the mock.spyOnswaps one export binding andmockRestoreputs it back.allow-read.test.ts — the root-deny tests fail on Macs with homebrew bash as
default shell;
EXEC_DEPSdoesn't cover/opt/homebrew. PinnedbinShell: '/bin/bash'.update-config.test.ts — three sandboxed-curl tests had
spawnSync({ timeout: 10000 })but no matching timeout on the
it(). Bun's default is 5000ms; when example.com is slowthe body exceeds it. Never ran on CI before so nobody hit it.
Docker CI job — builds srt and runs it with
enableWeakerNestedSandboxin anunprivileged container, checking that allowed writes land, denied writes don't, and
the seccomp filter blocks AF_UNIX. Gated on
SRT_E2E_DOCKERso host jobs skip it.Push trigger —
branches: ['**']→['main']. PRs were running the matrix twice.Diff size
mandatory-deny-paths.test.tsandwrap-with-sandbox.test.tslook enormous in thediffstat — prettier reindents the entire
describebody whendescribe.if(cond)(wraps to two lines. Compare with
git diff -wfor the actual changes.