perf(metrics): Skip the eager metrics warm-up when this repo's cache is already populated#1580
Conversation
⚡ Performance Benchmark
Details
History9c96671 fix(tests): Address CodeRabbit feedback on prewarm-count assertions
4fac887 fix(metrics): Address codex review on cache-aware prewarm
92287a9 fix(metrics): Per-repo seen marker for prewarm heuristic
d770b10 perf(metrics): Cache-aware prewarm count for the metrics worker pool
|
|
Important Review skippedAuto incremental reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
📝 WalkthroughWalkthroughThis PR optimizes metrics worker initialization by conditionally pre-warming a smaller thread pool when the token-count cache file is detected on disk, instead of always warming all available worker threads. ChangesConditional Metrics Worker Warm-Up
Estimated code review effort🎯 2 (Simple) | ⏱️ ~12 minutes Possibly related PRs
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Code Review
This pull request optimizes the metrics worker pool initialization by introducing a "warm-likely" heuristic. It checks for the existence of an on-disk token-count cache file; if present, it limits the number of pre-warmed workers to one, avoiding unnecessary and expensive BPE parsing when cache hits are expected. If the cache is missing, it falls back to the original behavior of warming all worker threads. I have no feedback to provide as no review comments were present.
Deploying repomix with
|
| Latest commit: |
2fdf07e
|
| Status: | ✅ Deploy successful! |
| Preview URL: | https://297a6007.repomix.pages.dev |
| Branch Preview URL: | https://perf-metrics-cache-aware-pre.repomix.pages.dev |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #1580 +/- ##
==========================================
- Coverage 90.88% 90.86% -0.03%
==========================================
Files 121 121
Lines 4650 4683 +33
Branches 1080 1088 +8
==========================================
+ Hits 4226 4255 +29
- Misses 424 428 +4 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Three findings from iteration-1 codex review on PR #1580: 1. `seenMarkerKey` did not normalize relative paths before hashing. The CLI normalizes to absolute paths in `defaultAction.ts:117` so today there is no caller hitting this, but the public `pack()` API makes no such guarantee — a library consumer passing `['.']` from two different cwds would collide on the same marker. Run each root through `path.resolve()` before sorting and joining. 2. `saveTokenCountCache` only touched the per-repo marker after a successful cache WRITE. A fully-warm pack short-circuits via `!state.dirty` and never reaches the touch, so two corners stayed stuck on cold-likely forever: - Upgrade from a pre-marker release: the shared cache exists but no marker has ever been written. The first post-upgrade pack on a fully warm cache would never produce one. - Crash recovery: a previous pack landed the cache via `fs.rename` but exited before `markRepoSeen`. Cache present, marker missing. Add an early resync that touches the marker whenever the shared cache file already exists on disk and rootDirs is non-empty, regardless of whether the save itself will be a no-op. The touch is idempotent (0-byte writeFile) so the duplicate call when a real write follows is harmless. 3. Tests covered marker existence behavior but did not pin the *prewarm dispatch count*. A future refactor could keep the marker logic intact but accidentally warm `maxThreads` on warm-likely or only `1` on cold-likely without any test failing. Add three pinning tests in `tests/core/metrics/calculateMetrics.test.ts` that probe the actual `taskRunner.run` call count under each of the three warm/cold-likely combinations. `npm run lint` clean. `npm run test` 1309 → 1312 passing. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@tests/core/metrics/calculateMetrics.test.ts`:
- Around line 336-348: The test should compute the expected maxThreads instead
of assuming >1: import os (or require('os')) and compute expectedMaxThreads =
Math.min(os.cpus().length, Math.ceil(1000 / 100)) (or derive the
divisor/numOfTasks from the createMetricsTaskRunner call) and replace the
hardcoded expect((result.taskRunner.run as
Mock).mock.calls.length).toBeGreaterThan(1) with
expect(...).toBe(expectedMaxThreads); apply the same change to the similar
assertion in the other test (lines 350-356) so both use the same computed
maxThreads logic used by createMetricsTaskRunner.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 8c628619-df52-4e80-8614-a4243744bb26
📒 Files selected for processing (5)
src/core/metrics/calculateMetrics.tssrc/core/metrics/tokenCountCache.tssrc/core/packager.tstests/core/metrics/calculateMetrics.test.tstests/core/metrics/tokenCountCache.test.ts
Replace the unconditional eager warm-up of `min(cpu, ceil(N/100))`
workers with a cache-aware prewarm count derived from whether the
on-disk token-count cache file already exists.
Background — what the existing prewarm does
`createMetricsTaskRunner` opens the metrics Tinypool and immediately
fires `maxThreads` dummy `taskRunner.run({ content: '', encoding })`
tasks. Their only purpose is to force each worker to load
`gpt-tokenizer` and parse the o200k_base BPE table (~225ms CPU per
worker) BEFORE the real metrics phase needs them. The cost is paid
in parallel and overlaps `collect` / `security` / `process`, so on
cold runs it is a clean win — by the time `calculateFileMetrics`
dispatches real work, the pool is hot.
Why this regresses on warm runs
With the per-file token-count cache from PR #1565, the typical repeat
`repomix` invocation hits the cache for every per-file lookup and
dispatches ZERO tokenization tasks via `calculateFileMetrics`. The
only remaining metrics dispatches are 2-3 small git diff / git log
tokenizations. Yet the pool still eagerly warms all `maxThreads`
workers, so on an 8-vCPU host 7 of the 8 BPE parses end up being
~225ms of wasted CPU per pack.
What changes
A `existsSync` probe on the cache file at pool-creation time picks
the prewarm count from the actual workload prediction:
- cache file present → warm-likely → prewarm 1 (covers git
diff/log; any genuine miss spawns extra workers lazily)
- cache file missing → cold-likely → prewarm `maxThreads`
(identical to current `main` behavior, no regression)
`fs.existsSync` is sub-millisecond and the probe is a best-effort
warm/cold predictor only, not a correctness signal. A stale or
partial cache simply falls back to lazy worker spawn for the misses.
Measured (Apple Silicon, 8 vCPU, repomix self-pack, hyperfine n=15):
WARM cache (typical repeat run):
BEFORE 757.8 ± 7.2 ms / User 2185 ms
AFTER 710.0 ± 9.1 ms / User 978 ms (-6.3% wall, -55% User CPU)
COLD cache (REPOMIX_TOKEN_CACHE=0):
BEFORE 846.0 ± 7.2 ms / User 3092 ms
AFTER 842.5 ± 8.5 ms / User 3087 ms (-0.4% wall, no regression)
Companion to PR #1579, which capped the entire pool at 3 and traded a
~+4% cold-cache regression for the warm-cache win. This cache-aware
revision eliminates that trade — warm runs save even more BPE parses
(prewarm 1 instead of 3) AND cold runs are byte-identical to current
`main`.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The previous revision of this PR used `tokenCountCacheFileExistsSync()`
alone to predict "warm-likely". That probe is positive as soon as ANY
repo on this machine has packed once and written the shared cache, so
the very second pack of a brand-new repo would incorrectly land on the
prewarm-1 fast path. The metrics phase would then dispatch N real
tokenization tasks, Tinypool would spawn the rest of the workers
LAZILY on the critical path (each paying ~225ms BPE init), and we
would eat a wall regression vs `main` for the entire first-pack-of-a-
new-repo workload.
Fix by pairing the global cache-file probe with a per-repo "seen
marker" file. The marker is a zero-byte file under
`$TMPDIR/repomix/cache/seen/{md5(sorted(rootDirs))}` that is touched
ONLY after a successful `saveTokenCountCache(rootDirs)` write — so its
existence implies this machine has at some point persisted cache
entries for exactly this `rootDirs` set.
The prewarm decision then becomes:
```ts
cacheWarmLikely =
tokenCountCacheFileExistsSync() &&
tokenCountCacheSeenMarkerExistsSync(rootDirs);
```
Both true → warm-likely → prewarm 1 worker.
Either false → cold-likely → prewarm `maxThreads`, identical to `main`.
Design notes
- The shared cache itself is untouched. Cross-repo content
deduplication (e.g. shared `package-lock.json` boilerplate) still
works via the content-addressed key.
- `--remote` clones to `fs.mkdtemp(...)` which produces a unique
random path per invocation. The marker correctly classifies this as
cold-likely on every fresh clone (no false warm signal), while the
shared cache still serves content hits if the URL has been packed
before.
- The MCP server's single module-level `state` is unchanged. No
per-repo cache loading, no per-repo state map.
- The marker path is derived from `getCacheFilePath()`'s directory so
`REPOMIX_TOKEN_CACHE_PATH` overrides keep tests hermetic.
- Stale marker failure mode: a previously-cached repo whose entries
were later FIFO-evicted will still predict warm. The cost ceiling
is one BPE init per lazily-spawned worker on the metrics critical
path — bounded, and narrower than the original global false-positive.
Verification
`npm run lint` clean. `npm run test` 1304 → 1309 passing (5 new
focused tests pinning the heuristic corners: unrelated repo stays
cold, same repo becomes warm, fresh `--remote` clone path does not
see stale markers, no-op save does not touch the marker, marker key
is order-insensitive over `rootDirs`).
Apple Silicon local bench (hyperfine n=15, repomix self-pack):
WARM cache same repo:
BEFORE 767.2 ± 6.0 ms / User 2200 ms
AFTER 712.5 ± 8.3 ms / User 982 ms (-7.1% wall, -55% User CPU)
COLD cache (REPOMIX_TOKEN_CACHE=0):
BEFORE 867.5 ± 34.1 ms / User 3158 ms
AFTER 859.2 ± 21.3 ms / User 3140 ms (-1.0% wall, no regression)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three findings from iteration-1 codex review on PR #1580: 1. `seenMarkerKey` did not normalize relative paths before hashing. The CLI normalizes to absolute paths in `defaultAction.ts:117` so today there is no caller hitting this, but the public `pack()` API makes no such guarantee — a library consumer passing `['.']` from two different cwds would collide on the same marker. Run each root through `path.resolve()` before sorting and joining. 2. `saveTokenCountCache` only touched the per-repo marker after a successful cache WRITE. A fully-warm pack short-circuits via `!state.dirty` and never reaches the touch, so two corners stayed stuck on cold-likely forever: - Upgrade from a pre-marker release: the shared cache exists but no marker has ever been written. The first post-upgrade pack on a fully warm cache would never produce one. - Crash recovery: a previous pack landed the cache via `fs.rename` but exited before `markRepoSeen`. Cache present, marker missing. Add an early resync that touches the marker whenever the shared cache file already exists on disk and rootDirs is non-empty, regardless of whether the save itself will be a no-op. The touch is idempotent (0-byte writeFile) so the duplicate call when a real write follows is harmless. 3. Tests covered marker existence behavior but did not pin the *prewarm dispatch count*. A future refactor could keep the marker logic intact but accidentally warm `maxThreads` on warm-likely or only `1` on cold-likely without any test failing. Add three pinning tests in `tests/core/metrics/calculateMetrics.test.ts` that probe the actual `taskRunner.run` call count under each of the three warm/cold-likely combinations. `npm run lint` clean. `npm run test` 1309 → 1312 passing. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two cold-path tests asserted `(call count) > 1`, which depends on the host having at least 2 vCPUs. On a 1-vCPU CI runner `getWorkerThreadCount(1000).maxThreads` collapses to 1 and the cold path correctly fires a single warm-up task — but the assertion would fail despite the behavior being correct. Switch both tests to assert against the actual computed `maxThreads` from the same heuristic the production path uses, so the cold-path contract is pinned independent of runner vCPU count. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
9c96671 to
2fdf07e
Compare
Summary
The metrics worker pool warms eagerly at pack startup by firing
maxThreadsdummy tasks so each worker pre-parsesgpt-tokenizer's o200k_base BPE table (~225ms CPU per worker). On cold runs that overhead overlapscollect/security/processand is essentially free. On warm runs — when the per-file token-count cache from PR #1565 hits for every file —calculateFileMetricsdispatches zero worker tasks, and 7 of those 8 BPE parses on an 8-vCPU host become pure waste (~1.5s of unnecessary CPU per pack).This PR predicts the warm path with a two-part sub-millisecond probe and drops the eager warm-up to a single worker when both probes agree:
The cold path (either probe false) is byte-identical to current
main.What's added
getRepoSeenMarkerPath(rootDirs)/tokenCountCacheSeenMarkerExistsSync(rootDirs)— a 0-byte file under$TMPDIR/repomix/cache/seen/{md5(sorted(rootDirs))}whose existence means this machine has previously persisted cache entries for exactly this root-dir set.saveTokenCountCache(rootDirs)— touches the marker on the same successful write path that persists the cache. No marker if the save was a no-op.createMetricsTaskRunner(rootDirs, numOfTasks, encoding)— newrootDirsparameter; the warm/cold prediction reads the marker.packager.tspassesrootDirsthrough both call sites.Marker path is derived from
getCacheFilePath()'s directory soREPOMIX_TOKEN_CACHE_PATHoverrides keep the marker alongside the cache (test hermeticity).Why a marker instead of per-repo cache files
A per-repo cache file (e.g.
token-counts-{md5(path)}.json) was considered. It would give a precise "this repo cached" signal with a singleexistsSync, but trades away:${encoding}:${byteLength}:${md5_16(content)}— purely content-addressed — so shared boilerplate (package-lock.json, vendored libraries, license headers) currently dedupes across all packed repos. Per-repo files break this.--remote.fs.mkdtemp(...)produces a unique random path per invocation, so the cache filename would change every time and second-and-laterrepomix --remote https://X/Yinvocations would always be cold. With the marker,--remoteis correctly classified as cold-likely on the prewarm decision, AND the shared content-addressed cache still serves hits.state.entries: Mappopulated by oneloadTokenCountCache()per process. Per-repo files would force per-repo state management or per-pack reloads.The marker keeps all three properties.
Stale-marker failure mode
A previously-cached repo whose entries are later FIFO-evicted by 100k-entries-worth of intervening packs of other repos would still predict warm. The cost ceiling is one BPE init per missing worker (paid on the metrics critical path when Tinypool lazily spawns workers for the misses) — narrower and rarer than the original
existsSync(cacheFile)-only heuristic, which fired warm on the very second pack of the machine regardless of repo identity.Benchmarks
Apple Silicon (8 vCPU), repomix self-pack on this repo, hyperfine n=15:
Warm cache (typical repeat run on the same repo)
Cold cache (
REPOMIX_TOKEN_CACHE=0)CI bench across all three OSes (on the simpler earlier revision before the marker landed) showed −21–30%. The marker keeps that warm-cache win and adds the correctness fix for the first-pack-of-a-new-repo case.
Test coverage
tests/core/metrics/tokenCountCache.test.tsadds 5 focused tests pinning the heuristic corners:--remote-stylemkdtemppath does not see a stale markersaveTokenCountCache(nothing dirty) does NOT create the markerrootDirsbut set-sensitiveChecklist
npm run test— 1304 → 1309 passingnpm run lint— clean