Follow-up to #831 (test workflow speedup).
Context
PR #831 split the test workflow into parallel jobs and got e2e down to ~6m. The e2e suite has 608 tests in a single binary (tests/e2e_all/). The next big win is sharding that binary across N parallel CI jobs using nextest's --partition hash:i/N.
Proposal
Add a matrix to the e2e-test job:
strategy:
  fail-fast: false
  matrix:
    shard: [1, 2, 3]
steps:
  - run: cargo nextest run --test e2e_all --partition hash:${{ matrix.shard }}/3
Safety
The shared-DB isolation (UUID-scoped orgs/workspaces + advisory-locked bootstrap in crates/api/tests/common/db_setup.rs) is already safe across parallel processes. --partition hash is deterministic (same test → same shard), so each test runs in exactly one shard and nothing is duplicated or collides across shards. Each shard gets its own PostgreSQL service container.
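To illustrate why deterministic hash partitioning gives disjoint shards that together cover the whole suite, here is a minimal sketch. It is not nextest's implementation: the SHA-256 hash, the `shard_for` and `partition` helpers, and the test names are all hypothetical stand-ins chosen for illustration.

```python
import hashlib


def shard_for(test_name: str, total_shards: int) -> int:
    """Return a 1-based shard index for a test name.

    Assumption: SHA-256 is used here only because it is stable across
    processes and machines; nextest uses its own hash internally.
    """
    digest = hashlib.sha256(test_name.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % total_shards + 1


def partition(test_names, total_shards):
    """Group test names by shard; every name lands in exactly one shard."""
    shards = {i: [] for i in range(1, total_shards + 1)}
    for name in test_names:
        shards[shard_for(name, total_shards)].append(name)
    return shards


# Hypothetical names standing in for the 608 tests in e2e_all.
tests = [f"e2e_all::case_{i:03d}" for i in range(608)]
shards = partition(tests, 3)

for index, names in shards.items():
    print(f"shard {index}/3: {len(names)} tests")
```

Because the shard index depends only on the test name and the shard count, every CI job computes the same assignment independently, with no coordination between jobs. Note that this balances by test count only in expectation, not by test duration.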
Consideration
This runs 3 e2e jobs in parallel on the same self-hosted runner (gpu11), which is shared with prod. If CPU contention with model-serving is visible, this should land together with a dedicated CI VM (see nearai/infra issue). Alternatively, start with 2-way sharding.
Expected impact
~3× e2e wall-clock reduction (5m58s → ~2-2.5m), bounded by the slowest shard.
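The arithmetic behind that range can be sketched as follows. The 5m58s baseline and the 3-way split come from this issue; the `imbalance` parameter and its 25% value are assumptions picked only to show how an uneven split moves the estimate, and fixed per-job overhead (checkout, build, container startup) is ignored.

```python
def estimate_shard_wall_clock(total_seconds: float, shards: int,
                              imbalance: float = 0.0) -> float:
    """Estimate the wall-clock time of the slowest shard.

    `imbalance` is the fraction by which the slowest shard exceeds a
    perfectly even split (0.0 means all shards take the same time).
    """
    if shards < 1:
        raise ValueError("shards must be >= 1")
    return total_seconds / shards * (1.0 + imbalance)


def fmt(seconds: float) -> str:
    """Format seconds as e.g. '1m59s'."""
    minutes, secs = divmod(round(seconds), 60)
    return f"{minutes}m{secs:02d}s"


current = 5 * 60 + 58  # 5m58s baseline from #831, in seconds

ideal = estimate_shard_wall_clock(current, 3)
skewed = estimate_shard_wall_clock(current, 3, imbalance=0.25)  # assumed skew

print(fmt(ideal))   # 1m59s
print(fmt(skewed))  # 2m29s
```

A perfectly even split gives roughly 2m, and a slowest shard running 25% over the average gives roughly 2.5m, which brackets the ~2-2.5m estimate above.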