Skip to content

Run full test suite in CI and migrate platform skips to describe.if#197

Merged
dylan-conway merged 6 commits intomainfrom
dylan/ci-full-suite
Mar 31, 2026
Merged

Run full test suite in CI and migrate platform skips to describe.if#197
dylan-conway merged 6 commits intomainfrom
dylan/ci-full-suite

Conversation

@dylan-conway
Copy link
Copy Markdown
Collaborator

@dylan-conway dylan-conway commented Mar 31, 2026

CI was running a hand-picked subset of 5 test files via test:unit + test:integration.
Most tests never ran in CI at all. This switches to npm test (everything) and fixes
what that surfaces.

Changes

describe.if() migration — 13 test files used if (skipIfNotLinux()) return at
the top of each test. On the wrong platform the test runs zero assertions and reports
pass. Now they report skip: before-skip-count 0, after ~149 on macOS / ~55 on Linux.
New test/helpers/platform.ts exports the conditions.

Dead tests in seccomp-filter.test.ts — deleted:

  • skipIfNotAnt() checked USER_TYPE=ant, which nothing sets. Tests behind it never ran.
  • Two "USER_TYPE gating" tests called wrapCommandWithSandboxLinux() with no
    restrictions — hits the early return at linux-sandbox-utils.ts:1072 and gives
    back the unwrapped command. expect("echo test").not.toContain("apply-seccomp")
    was always true for the wrong reason.
  • Two expect(true).toBe(true) placeholders.

linux-dependency-error.test.ts — replaced mock.module with spyOn. mock.module
patches bun's module cache and never un-patches; running the full suite in one process
meant every file that imported generate-seccomp-filter after this one got the mock.
spyOn swaps one export binding and mockRestore puts it back.

allow-read.test.ts — the root-deny tests fail on Macs with homebrew bash as
default shell; EXEC_DEPS doesn't cover /opt/homebrew. Pinned binShell: '/bin/bash'.

update-config.test.ts — three sandboxed-curl tests had spawnSync({ timeout: 10000 })
but no matching timeout on the it(). Bun's default is 5000ms; when example.com is slow
the body exceeds it. Never ran on CI before so nobody hit it.

Docker CI job — builds srt and runs it with enableWeakerNestedSandbox in an
unprivileged container, checking that allowed writes land, denied writes don't, and
the seccomp filter blocks AF_UNIX. Gated on SRT_E2E_DOCKER so host jobs skip it.

Push triggerbranches: ['**']['main']. PRs were running the matrix twice.

Diff size

mandatory-deny-paths.test.ts and wrap-with-sandbox.test.ts look enormous in the
diffstat — prettier reindents the entire describe body when describe.if(cond)(
wraps to two lines. Compare with git diff -w for the actual changes.

CI was running test:unit + test:integration, a curated subset of 5 files.
Most test files were never run in CI. Switch to `npm test` which runs
everything. Drop the test:unit/test:integration scripts.

Migrate the inline `if (skipIfNotLinux()) return` pattern to bun's native
`describe.if()`/`it.if()`. The old pattern made wrong-platform tests show
as pass (zero assertions, green checkmark) instead of skip — CI's test
count looked the same regardless of what actually ran. New
test/helpers/platform.ts exports isLinux/isMacOS/isSupportedPlatform.

Delete ~310 lines of unreachable tests from seccomp-filter.test.ts:
- skipIfNotAnt() gate checked USER_TYPE env var that nothing sets
- Two tests called wrapCommandWithSandboxLinux() with no restrictions,
  which returns the command unwrapped at the early-return check —
  expect("echo test").not.toContain("apply-seccomp") was vacuously true

Pin allow-read root-deny tests to /bin/bash — EXEC_DEPS doesn't list
/opt/homebrew, so execvp failed on Macs with homebrew bash as SHELL.

Add docker-tests CI job: unprivileged container on both arches,
exercises enableWeakerNestedSandbox end-to-end.

Drop push trigger from '**' to 'main' — PRs were running the full
matrix twice (once for branch push, once for the PR event).
mock.module patches bun's module cache globally and never unmocks.
With npm test running all files in one process (instead of the old
test:unit + test:integration split), the mock leaked: every file that
imported getApplySeccompBinaryPath after this one got () => null, so
pid-namespace-isolation.test.ts and integration.test.ts failed in
beforeAll.

spyOn swaps one export binding; mockRestore in afterEach puts it back.
The callee's own import binding routes through the same slot in bun, so
checkLinuxDependencies sees the spy without any module-level surgery.

Also spies on whichSync directly rather than overwriting Bun.which on
globalThis — same fix, closer to what's actually being tested.

Drop stale README reference to the deleted test:integration script.
The full suite assumes bwrap --proc /proc works; an unprivileged
container doesn't have CAP_SYS_ADMIN for that. Only tests that set
enableWeakerNestedSandbox can pass there.

Instead of filtering which unit tests to run, test the thing the job
is for: build srt, run it with enableWeakerNestedSandbox, check that
allowed writes land, denied writes don't, and the seccomp filter blocks
AF_UNIX. Gated on SRT_E2E_DOCKER so host jobs skip it.
SandboxRuntimeConfigSchema requires network (no .optional()). Without it
loadConfig returns null, srt falls through to getDefaultConfig, and the
sandbox enforces a different allowWrite than the test expects.
The three it.if(isLinux) tests each run two spawnSync calls with curl
--max-time 3 then --max-time 5. When example.com responds slowly both
curls run to their limits and the body takes ~8s, but bun's default
test timeout is 5000ms. bun aborts mid-body; afterEach runs reset()
against an in-flight spawn and the next test sees stale state.

These were never in test:integration so they never ran on CI before
this branch. On fast responses they complete in under 200ms.
@dylan-conway dylan-conway merged commit e94c5fd into main Mar 31, 2026
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants