perf(metrics): Skip the eager metrics warm-up when this repo's cache is already populated #1580

Merged: yamadashy merged 4 commits into main from perf/metrics-cache-aware-prewarm on May 21, 2026
@yamadashy yamadashy commented May 17, 2026

Owner

Summary

The metrics worker pool warms eagerly at pack startup by firing maxThreads dummy tasks so each worker pre-parses gpt-tokenizer's o200k_base BPE table (~225ms CPU per worker). On cold runs that overhead overlaps collect/security/process and is essentially free. On warm runs — when the per-file token-count cache from PR #1565 hits for every file — calculateFileMetrics dispatches zero worker tasks, and 7 of those 8 BPE parses on an 8-vCPU host become pure waste (~1.5s of unnecessary CPU per pack).

This PR predicts the warm path with a two-part sub-millisecond probe and drops the eager warm-up to a single worker when both probes agree:

cacheWarmLikely =
  tokenCountCacheFileExistsSync() &&             // shared cache exists at all
  tokenCountCacheSeenMarkerExistsSync(rootDirs); // this repo wrote to it before

prewarmCount = cacheWarmLikely ? 1 : maxThreads;

The cold path (either probe false) is byte-identical to current main.
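
The decision above can be sketched as a small pure function. This is a minimal sketch rather than the PR's actual code: `PrewarmProbes` and `choosePrewarmCount` are hypothetical names, and the two probes are injected as plain callbacks so the logic can be exercised without touching the filesystem.

```ts
// Hypothetical names for illustration; the PR inlines this decision in
// createMetricsTaskRunner and calls the real probe functions directly.
interface PrewarmProbes {
  // Stands in for tokenCountCacheFileExistsSync().
  cacheFileExists: () => boolean;
  // Stands in for tokenCountCacheSeenMarkerExistsSync(rootDirs).
  seenMarkerExists: (rootDirs: readonly string[]) => boolean;
}

function choosePrewarmCount(
  probes: PrewarmProbes,
  rootDirs: readonly string[],
  maxThreads: number,
): number {
  // Both probes must agree before the run is treated as warm-likely.
  const cacheWarmLikely =
    probes.cacheFileExists() && probes.seenMarkerExists(rootDirs);
  // Warm-likely: warm a single worker. Otherwise keep the full eager warm-up.
  return cacheWarmLikely ? 1 : maxThreads;
}
```

Because `&&` short-circuits, the per-repo marker probe is only consulted when the shared cache file exists.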

What's added

  • getRepoSeenMarkerPath(rootDirs) / tokenCountCacheSeenMarkerExistsSync(rootDirs) — a 0-byte file under $TMPDIR/repomix/cache/seen/{md5(sorted(rootDirs))} whose existence means this machine has previously persisted cache entries for exactly this root-dir set.
  • saveTokenCountCache(rootDirs) — touches the marker on the same successful write path that persists the cache. No marker if the save was a no-op.
  • createMetricsTaskRunner(rootDirs, numOfTasks, encoding) — new rootDirs parameter; the warm/cold prediction reads the marker.
  • packager.ts passes rootDirs through both call sites.

Marker path is derived from getCacheFilePath()'s directory so REPOMIX_TOKEN_CACHE_PATH overrides keep the marker alongside the cache (test hermeticity).
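
As a rough illustration of how such a marker path could be derived, the sketch below hashes the sorted root directories and places the result in a `seen` directory next to the cache file. `sketchSeenMarkerPath` is a hypothetical helper, the `"\0"` join separator is an assumption, and the `path.resolve()` step follows the later review fix described further down this thread; the real `getRepoSeenMarkerPath` may differ in detail.

```ts
import { createHash } from "node:crypto";
import * as path from "node:path";

// Hypothetical helper for illustration only.
function sketchSeenMarkerPath(
  cacheFilePath: string,
  rootDirs: readonly string[],
): string {
  // Resolve to absolute paths, then sort so the key ignores argument order.
  const normalized = rootDirs.map((dir) => path.resolve(dir)).sort();
  // Assumption: join with NUL, which cannot occur inside a path segment.
  const key = createHash("md5").update(normalized.join("\0")).digest("hex");
  // Keep the marker alongside the cache file so path overrides move both.
  return path.join(path.dirname(cacheFilePath), "seen", key);
}
```

Deriving the directory from the cache file path is what keeps an overridden cache location and its markers together.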

Why a marker instead of per-repo cache files

A per-repo cache file (e.g. token-counts-{md5(path)}.json) was considered. It would give a precise "this repo cached" signal with a single existsSync, but trades away:

  • Cross-repo content deduplication. The cache key is ${encoding}:${byteLength}:${md5_16(content)} — purely content-addressed — so shared boilerplate (package-lock.json, vendored libraries, license headers) currently dedupes across all packed repos. Per-repo files break this.
  • --remote. fs.mkdtemp(...) produces a unique random path per invocation, so the cache filename would change every time and second-and-later repomix --remote https://X/Y invocations would always be cold. With the marker, --remote is correctly classified as cold-likely on the prewarm decision, AND the shared content-addressed cache still serves hits.
  • MCP server's single module-level state.entries: Map populated by one loadTokenCountCache() per process. Per-repo files would force per-repo state management or per-pack reloads.

The marker keeps all three properties.
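
To make the cross-repo deduplication point concrete, here is a minimal sketch of a content-addressed key in the format quoted above. `sketchCacheKey` is a hypothetical name, and reading `md5_16` as "the first 16 hex characters of the MD5 digest" is an assumption, not something this PR confirms.

```ts
import { createHash } from "node:crypto";

// Hypothetical reconstruction of the `${encoding}:${byteLength}:${md5_16(content)}` key.
function sketchCacheKey(encoding: string, content: string): string {
  const byteLength = Buffer.byteLength(content, "utf8");
  // Assumption: md5_16 means the first 16 hex characters of the MD5 digest.
  const md5_16 = createHash("md5").update(content).digest("hex").slice(0, 16);
  return `${encoding}:${byteLength}:${md5_16}`;
}

// The key never mentions a path, so identical files in two different
// repositories map to the same entry.
const licenseText = "MIT License\n";
const keyFromRepoA = sketchCacheKey("o200k_base", licenseText);
const keyFromRepoB = sketchCacheKey("o200k_base", licenseText);
```

Since nothing repo-specific enters the key, splitting the cache into per-repo files would store such shared entries once per repository.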

Stale-marker failure mode

A previously cached repo whose entries were later FIFO-evicted (after roughly 100k entries' worth of intervening packs of other repos) would still predict warm. The cost ceiling is one BPE init per missing worker, paid on the metrics critical path when Tinypool lazily spawns workers for the misses. That is narrower and rarer than the original existsSync(cacheFile)-only heuristic, which predicted warm from the second pack on a machine onward, regardless of repo identity.
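
The stale-marker case can be illustrated with a toy bounded FIFO store. This is only a sketch of the failure mode, not the cache's actual eviction code: `FifoEntryCache` is a hypothetical class, and the limit is shrunk from the 100k entries mentioned above to keep the example small.

```ts
// Hypothetical bounded store; relies on Map preserving insertion order.
class FifoEntryCache {
  private readonly entries = new Map<string, number>();
  private readonly maxEntries: number;

  constructor(maxEntries: number) {
    this.maxEntries = maxEntries;
  }

  set(key: string, tokenCount: number): void {
    if (!this.entries.has(key) && this.entries.size >= this.maxEntries) {
      // The first key in iteration order is the oldest insertion.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
    this.entries.set(key, tokenCount);
  }

  has(key: string): boolean {
    return this.entries.has(key);
  }
}

// Repo A packs first, then other repos push its entry out of the store.
const toyCache = new FifoEntryCache(2);
toyCache.set("repoA-file", 10);
toyCache.set("repoB-file", 20);
toyCache.set("repoC-file", 30);
// A marker for repo A would still exist on disk, yet its entry is gone,
// so the next pack of repo A predicts warm and then misses.
const repoAStillCached = toyCache.has("repoA-file");
```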

Benchmarks

Apple Silicon (8 vCPU), repomix self-pack on this repo, hyperfine n=15:

Warm cache (typical repeat run on the same repo)

| | Wall | User CPU |
|---|---|---|
| BEFORE | 767.2 ± 6.0 ms | 2200 ms |
| AFTER | 712.5 ± 8.3 ms | 982 ms |
| Δ | −54.7 ms (−7.1%) | −1218 ms (−55%) |

Cold cache (REPOMIX_TOKEN_CACHE=0)

| | Wall | User CPU |
|---|---|---|
| BEFORE | 867.5 ± 34.1 ms | 3158 ms |
| AFTER | 859.2 ± 21.3 ms | 3140 ms |
| Δ | −8.3 ms (−1.0%, within noise) | ≈0 |

CI bench across all three OSes (on the simpler earlier revision, before the marker landed) showed a 21–30% reduction in run time. The marker keeps that warm-cache win and adds the correctness fix for the first-pack-of-a-new-repo case.

Test coverage

tests/core/metrics/tokenCountCache.test.ts adds 5 focused tests pinning the heuristic corners:

  • Unrelated repo stays cold-likely after another repo populates the shared cache
  • Same repo becomes warm-likely after a successful save
  • Fresh --remote-style mkdtemp path does not see a stale marker
  • A no-op saveTokenCountCache (nothing dirty) does NOT create the marker
  • Marker key is order-insensitive over rootDirs but set-sensitive

Checklist

  • Run npm run test — 1304 → 1309 passing
  • Run npm run lint — clean

@github-actions

github-actions Bot commented May 17, 2026

Contributor

⚡ Performance Benchmark

Latest commit: 2fdf07e fix(tests): Address CodeRabbit feedback on prewarm-count assertions
Status: ✅ Benchmark complete!
Ubuntu: 0.87s (±0.01s) → 0.66s (±0.01s) · -0.21s (-24.4%)
macOS: 0.70s (±0.17s) → 0.62s (±0.16s) · -0.08s (-10.9%)
Windows: 1.15s (±0.04s) → 0.96s (±0.01s) · -0.19s (-16.2%)
Details
  • Packing the repomix repository with node bin/repomix.cjs
  • Warmup: 2 runs (discarded), interleaved execution
  • Measurement: 20 runs / 30 on macOS (median ± IQR)
History

9c96671 fix(tests): Address CodeRabbit feedback on prewarm-count assertions

Ubuntu: 0.90s (±0.02s) → 0.71s (±0.03s) · -0.19s (-20.8%)
macOS: 0.90s (±0.19s) → 0.69s (±0.14s) · -0.20s (-22.7%)
Windows: 1.14s (±0.13s) → 0.91s (±0.08s) · -0.23s (-19.8%)

4fac887 fix(metrics): Address codex review on cache-aware prewarm

Ubuntu: 0.90s (±0.03s) → 0.70s (±0.03s) · -0.20s (-21.9%)
macOS: 0.56s (±0.10s) → 0.44s (±0.05s) · -0.12s (-21.4%)
Windows: 1.10s (±0.04s) → 0.86s (±0.02s) · -0.23s (-21.3%)

92287a9 fix(metrics): Per-repo seen marker for prewarm heuristic

Ubuntu: 0.86s (±0.02s) → 0.66s (±0.01s) · -0.20s (-23.3%)
macOS: 0.51s (±0.02s) → 0.39s (±0.03s) · -0.11s (-22.5%)
Windows: 1.13s (±0.02s) → 0.94s (±0.01s) · -0.19s (-16.5%)

d770b10 perf(metrics): Cache-aware prewarm count for the metrics worker pool

Ubuntu: 0.88s (±0.02s) → 0.68s (±0.02s) · -0.20s (-23.1%)
macOS: 0.66s (±0.39s) → 0.46s (±0.23s) · -0.20s (-29.7%)
Windows: 0.91s (±0.07s) → 0.70s (±0.06s) · -0.21s (-22.9%)

@coderabbitai

coderabbitai Bot commented May 17, 2026

Contributor

Review Change Stack

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 4be1dd98-2aef-4b0d-9b55-a6cb0ca7bc9d

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

📝 Walkthrough

This PR optimizes metrics worker initialization by conditionally pre-warming a smaller thread pool when the token-count cache file is detected on disk, instead of always warming all available worker threads.

Changes

Conditional Metrics Worker Warm-Up

| Layer / File(s) | Summary |
|---|---|
| Token cache existence probe (src/core/metrics/tokenCountCache.ts) | New tokenCountCacheFileExistsSync() function checks on-disk cache presence synchronously, respecting the REPOMIX_TOKEN_CACHE disable flag and returning false on filesystem errors. |
| Metrics worker warm-up integration (src/core/metrics/calculateMetrics.ts) | Imports the cache probe, uses it to determine a warm-likely state and compute a capped pre-warm worker count, then pre-warms only that subset instead of all workers while preserving per-task error handling. |
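
A probe with the behavior summarized above might look like the following. This is a hedged sketch, not the function from the PR: `sketchCacheFileExistsSync` is a hypothetical name, the cache path and environment are passed in explicitly for testability, and treating `REPOMIX_TOKEN_CACHE=0` as the disabling value is an assumption based on how the cold benchmark in this thread is configured.

```ts
import * as fs from "node:fs";

// Hypothetical stand-in for tokenCountCacheFileExistsSync().
function sketchCacheFileExistsSync(
  cacheFilePath: string,
  env: Record<string, string | undefined> = process.env,
): boolean {
  // Assumption: "0" disables the cache, so a disabled cache never looks warm.
  if (env.REPOMIX_TOKEN_CACHE === "0") return false;
  try {
    return fs.existsSync(cacheFilePath);
  } catch {
    // Best-effort predictor: any filesystem error counts as "not present".
    return false;
  }
}
```

Returning false on every failure path keeps the probe biased toward the cold path, which matches the behavior of main.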

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Possibly related PRs

  • yamadashy/repomix#1374: Both PRs modify src/core/metrics/calculateMetrics.ts's metrics-worker warmup behavior with different strategies for pre-warming thread pools.
  • yamadashy/repomix#1409: Both PRs adjust worker thread initialization and concurrency control in src/core/metrics/calculateMetrics.ts.
  • yamadashy/repomix#1562: Both PRs modify token-count disk cache plumbing in src/core/metrics/tokenCountCache.ts, with this PR's warm-up gating relying on the cache-file concept.
🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Title check | ✅ Passed | The title accurately describes the primary optimization: skipping eager metrics warm-up when the token-count cache is already populated. |
| Description check | ✅ Passed | The description is comprehensive, covering objectives, implementation details, trade-offs, benchmarks, and test coverage. The required checklist is completed with both npm run test and npm run lint confirmed. |


@gemini-code-assist gemini-code-assist Bot left a comment

Contributor

Code Review

This pull request optimizes the metrics worker pool initialization by introducing a "warm-likely" heuristic. It checks for the existence of an on-disk token-count cache file; if present, it limits the number of pre-warmed workers to one, avoiding unnecessary and expensive BPE parsing when cache hits are expected. If the cache is missing, it falls back to the original behavior of warming all worker threads. I have no feedback to provide as no review comments were present.

@cloudflare-workers-and-pages

cloudflare-workers-and-pages Bot commented May 17, 2026


Deploying repomix with Cloudflare Pages

Latest commit: 2fdf07e
Status: ✅  Deploy successful!
Preview URL: https://297a6007.repomix.pages.dev
Branch Preview URL: https://perf-metrics-cache-aware-pre.repomix.pages.dev

@codecov

codecov Bot commented May 17, 2026


Codecov Report

❌ Patch coverage is 89.18919% with 4 lines in your changes missing coverage. Please review.
✅ Project coverage is 90.86%. Comparing base (e15556a) to head (2fdf07e).
⚠️ Report is 11 commits behind head on main.
✅ All tests successful. No failed tests found.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/core/metrics/tokenCountCache.ts | 86.66% | 4 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1580      +/-   ##
==========================================
- Coverage   90.88%   90.86%   -0.03%     
==========================================
  Files         121      121              
  Lines        4650     4683      +33     
  Branches     1080     1088       +8     
==========================================
+ Hits         4226     4255      +29     
- Misses        424      428       +4     

@yamadashy yamadashy changed the title from "perf(metrics): Cache-aware prewarm count for metrics worker pool" to "perf(metrics): Skip the eager metrics warm-up when this repo's cache is already populated" May 17, 2026
yamadashy added a commit that referenced this pull request May 17, 2026
Three findings from iteration-1 codex review on PR #1580:

1. `seenMarkerKey` did not normalize relative paths before hashing.
   The CLI normalizes to absolute paths in `defaultAction.ts:117` so
   today there is no caller hitting this, but the public `pack()` API
   makes no such guarantee — a library consumer passing `['.']` from
   two different cwds would collide on the same marker. Run each root
   through `path.resolve()` before sorting and joining.

2. `saveTokenCountCache` only touched the per-repo marker after a
   successful cache WRITE. A fully-warm pack short-circuits via
   `!state.dirty` and never reaches the touch, so two corners stayed
   stuck on cold-likely forever:

     - Upgrade from a pre-marker release: the shared cache exists but
       no marker has ever been written. The first post-upgrade pack
       on a fully warm cache would never produce one.

     - Crash recovery: a previous pack landed the cache via
       `fs.rename` but exited before `markRepoSeen`. Cache present,
       marker missing.

   Add an early resync that touches the marker whenever the shared
   cache file already exists on disk and rootDirs is non-empty,
   regardless of whether the save itself will be a no-op. The touch is
   idempotent (0-byte writeFile) so the duplicate call when a real
   write follows is harmless.

3. Tests covered marker existence behavior but did not pin the
   *prewarm dispatch count*. A future refactor could keep the marker
   logic intact but accidentally warm `maxThreads` on warm-likely or
   only `1` on cold-likely without any test failing. Add three pinning
   tests in `tests/core/metrics/calculateMetrics.test.ts` that probe
   the actual `taskRunner.run` call count under each of the three
   warm/cold-likely combinations.

`npm run lint` clean. `npm run test` 1309 → 1312 passing.
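
The early resync from finding 2 could be sketched roughly as below. `sketchResyncMarker` is a hypothetical helper, the synchronous `fs` calls and the `token-counts.json` / `example-key` names are assumptions made to keep the example short, and the real save path may well be asynchronous.

```ts
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical helper: touch the marker whenever the shared cache file is
// already on disk, whether or not the save itself turns out to be a no-op.
function sketchResyncMarker(
  cacheFilePath: string,
  markerPath: string,
  rootDirs: readonly string[],
): boolean {
  if (rootDirs.length === 0 || !fs.existsSync(cacheFilePath)) return false;
  try {
    fs.mkdirSync(path.dirname(markerPath), { recursive: true });
    // Writing zero bytes is idempotent, so a repeated touch is harmless.
    fs.writeFileSync(markerPath, "");
    return true;
  } catch {
    // The marker is only a predictor, so failures are swallowed.
    return false;
  }
}

// Usage against a throwaway temp directory.
const sketchDir = fs.mkdtempSync(path.join(os.tmpdir(), "repomix-marker-sketch-"));
const sketchCacheFile = path.join(sketchDir, "token-counts.json");
const sketchMarker = path.join(sketchDir, "seen", "example-key");

const touchedWithoutCache = sketchResyncMarker(sketchCacheFile, sketchMarker, ["/repo"]);
fs.writeFileSync(sketchCacheFile, "{}"); // simulate an existing shared cache
const touchedWithCache = sketchResyncMarker(sketchCacheFile, sketchMarker, ["/repo"]);
const touchedAgain = sketchResyncMarker(sketchCacheFile, sketchMarker, ["/repo"]);
```

This covers both corners listed above: an upgrade from a pre-marker release and a crash between the cache rename and the marker touch each leave a cache file without a marker, and the next pack repairs that.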

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

@coderabbitai coderabbitai Bot left a comment

Contributor

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tests/core/metrics/calculateMetrics.test.ts`:
- Around line 336-348: The test should compute the expected maxThreads instead
of assuming >1: import os (or require('os')) and compute expectedMaxThreads =
Math.min(os.cpus().length, Math.ceil(1000 / 100)) (or derive the
divisor/numOfTasks from the createMetricsTaskRunner call) and replace the
hardcoded expect((result.taskRunner.run as
Mock).mock.calls.length).toBeGreaterThan(1) with
expect(...).toBe(expectedMaxThreads); apply the same change to the similar
assertion in the other test (lines 350-356) so both use the same computed
maxThreads logic used by createMetricsTaskRunner.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 8c628619-df52-4e80-8614-a4243744bb26

📥 Commits

Reviewing files that changed from the base of the PR and between d770b10 and 4fac887.

📒 Files selected for processing (5)
  • src/core/metrics/calculateMetrics.ts
  • src/core/metrics/tokenCountCache.ts
  • src/core/packager.ts
  • tests/core/metrics/calculateMetrics.test.ts
  • tests/core/metrics/tokenCountCache.test.ts

Comment thread tests/core/metrics/calculateMetrics.test.ts Outdated
yamadashy and others added 4 commits May 20, 2026 00:32

perf(metrics): Cache-aware prewarm count for the metrics worker pool

Replace the unconditional eager warm-up of `min(cpu, ceil(N/100))`
workers with a cache-aware prewarm count derived from whether the
on-disk token-count cache file already exists.

Background — what the existing prewarm does

`createMetricsTaskRunner` opens the metrics Tinypool and immediately
fires `maxThreads` dummy `taskRunner.run({ content: '', encoding })`
tasks. Their only purpose is to force each worker to load
`gpt-tokenizer` and parse the o200k_base BPE table (~225ms CPU per
worker) BEFORE the real metrics phase needs them. The cost is paid
in parallel and overlaps `collect` / `security` / `process`, so on
cold runs it is a clean win — by the time `calculateFileMetrics`
dispatches real work, the pool is hot.

Why this regresses on warm runs

With the per-file token-count cache from PR #1565, the typical repeat
`repomix` invocation hits the cache for every per-file lookup and
dispatches ZERO tokenization tasks via `calculateFileMetrics`. The
only remaining metrics dispatches are 2-3 small git diff / git log
tokenizations. Yet the pool still eagerly warms all `maxThreads`
workers, so on an 8-vCPU host 7 of the 8 BPE parses are wasted, at
~225ms of CPU each, on every pack.

What changes

An `existsSync` probe on the cache file at pool-creation time picks
the prewarm count from the actual workload prediction:

  - cache file present → warm-likely → prewarm 1 (covers git
    diff/log; any genuine miss spawns extra workers lazily)
  - cache file missing → cold-likely → prewarm `maxThreads`
    (identical to current `main` behavior, no regression)

`fs.existsSync` is sub-millisecond and the probe is a best-effort
warm/cold predictor only, not a correctness signal. A stale or
partial cache simply falls back to lazy worker spawn for the misses.

Measured (Apple Silicon, 8 vCPU, repomix self-pack, hyperfine n=15):

  WARM cache (typical repeat run):
    BEFORE 757.8 ± 7.2 ms  / User 2185 ms
    AFTER  710.0 ± 9.1 ms  / User  978 ms  (-6.3% wall, -55% User CPU)

  COLD cache (REPOMIX_TOKEN_CACHE=0):
    BEFORE 846.0 ± 7.2 ms  / User 3092 ms
    AFTER  842.5 ± 8.5 ms  / User 3087 ms  (-0.4% wall, no regression)

Companion to PR #1579, which capped the entire pool at 3 and traded a
~+4% cold-cache regression for the warm-cache win. This cache-aware
revision eliminates that trade — warm runs save even more BPE parses
(prewarm 1 instead of 3) AND cold runs are byte-identical to current
`main`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

fix(metrics): Per-repo seen marker for prewarm heuristic

The previous revision of this PR used `tokenCountCacheFileExistsSync()`
alone to predict "warm-likely". That probe is positive as soon as ANY
repo on this machine has packed once and written the shared cache, so
the very second pack of a brand-new repo would incorrectly land on the
prewarm-1 fast path. The metrics phase would then dispatch N real
tokenization tasks, Tinypool would spawn the rest of the workers
LAZILY on the critical path (each paying ~225ms BPE init), and we
would eat a wall regression vs `main` for the entire first-pack-of-a-
new-repo workload.

Fix by pairing the global cache-file probe with a per-repo "seen
marker" file. The marker is a zero-byte file under
`$TMPDIR/repomix/cache/seen/{md5(sorted(rootDirs))}` that is touched
ONLY after a successful `saveTokenCountCache(rootDirs)` write — so its
existence implies this machine has at some point persisted cache
entries for exactly this `rootDirs` set.

The prewarm decision then becomes:

```ts
cacheWarmLikely =
  tokenCountCacheFileExistsSync() &&
  tokenCountCacheSeenMarkerExistsSync(rootDirs);
```

Both true → warm-likely → prewarm 1 worker.
Either false → cold-likely → prewarm `maxThreads`, identical to `main`.

Design notes

- The shared cache itself is untouched. Cross-repo content
  deduplication (e.g. shared `package-lock.json` boilerplate) still
  works via the content-addressed key.
- `--remote` clones to `fs.mkdtemp(...)` which produces a unique
  random path per invocation. The marker correctly classifies this as
  cold-likely on every fresh clone (no false warm signal), while the
  shared cache still serves content hits if the URL has been packed
  before.
- The MCP server's single module-level `state` is unchanged. No
  per-repo cache loading, no per-repo state map.
- The marker path is derived from `getCacheFilePath()`'s directory so
  `REPOMIX_TOKEN_CACHE_PATH` overrides keep tests hermetic.
- Stale marker failure mode: a previously-cached repo whose entries
  were later FIFO-evicted will still predict warm. The cost ceiling
  is one BPE init per lazily-spawned worker on the metrics critical
  path — bounded, and narrower than the original global false-positive.

Verification

`npm run lint` clean. `npm run test` 1304 → 1309 passing (5 new
focused tests pinning the heuristic corners: unrelated repo stays
cold, same repo becomes warm, fresh `--remote` clone path does not
see stale markers, no-op save does not touch the marker, marker key
is order-insensitive over `rootDirs`).

Apple Silicon local bench (hyperfine n=15, repomix self-pack):

  WARM cache same repo:
    BEFORE 767.2 ± 6.0 ms  / User 2200 ms
    AFTER  712.5 ± 8.3 ms  / User  982 ms  (-7.1% wall, -55% User CPU)

  COLD cache (REPOMIX_TOKEN_CACHE=0):
    BEFORE 867.5 ± 34.1 ms / User 3158 ms
    AFTER  859.2 ± 21.3 ms / User 3140 ms  (-1.0% wall, no regression)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

fix(metrics): Address codex review on cache-aware prewarm

Three findings from iteration-1 codex review on PR #1580:

1. `seenMarkerKey` did not normalize relative paths before hashing.
   The CLI normalizes to absolute paths in `defaultAction.ts:117` so
   today there is no caller hitting this, but the public `pack()` API
   makes no such guarantee — a library consumer passing `['.']` from
   two different cwds would collide on the same marker. Run each root
   through `path.resolve()` before sorting and joining.

2. `saveTokenCountCache` only touched the per-repo marker after a
   successful cache WRITE. A fully-warm pack short-circuits via
   `!state.dirty` and never reaches the touch, so two corners stayed
   stuck on cold-likely forever:

     - Upgrade from a pre-marker release: the shared cache exists but
       no marker has ever been written. The first post-upgrade pack
       on a fully warm cache would never produce one.

     - Crash recovery: a previous pack landed the cache via
       `fs.rename` but exited before `markRepoSeen`. Cache present,
       marker missing.

   Add an early resync that touches the marker whenever the shared
   cache file already exists on disk and rootDirs is non-empty,
   regardless of whether the save itself will be a no-op. The touch is
   idempotent (0-byte writeFile) so the duplicate call when a real
   write follows is harmless.

3. Tests covered marker existence behavior but did not pin the
   *prewarm dispatch count*. A future refactor could keep the marker
   logic intact but accidentally warm `maxThreads` on warm-likely or
   only `1` on cold-likely without any test failing. Add three pinning
   tests in `tests/core/metrics/calculateMetrics.test.ts` that probe
   the actual `taskRunner.run` call count under each of the three
   warm/cold-likely combinations.

`npm run lint` clean. `npm run test` 1309 → 1312 passing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

fix(tests): Address CodeRabbit feedback on prewarm-count assertions

Two cold-path tests asserted `(call count) > 1`, which depends on the
host having at least 2 vCPUs. On a 1-vCPU CI runner
`getWorkerThreadCount(1000).maxThreads` collapses to 1 and the cold
path correctly fires a single warm-up task — but the assertion would
fail despite the behavior being correct.

Switch both tests to assert against the actual computed `maxThreads`
from the same heuristic the production path uses, so the cold-path
contract is pinned independent of runner vCPU count.
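
A test can derive the expected cold-path count from the same formula instead of hardcoding a lower bound. The sketch below is illustrative only: `sketchMaxThreads` is a hypothetical stand-in for `getWorkerThreadCount(...).maxThreads`, built from the `min(cpu, ceil(N/100))` description earlier in this thread, and the floor of 1 is an added assumption.

```ts
import * as os from "node:os";

// Hypothetical stand-in for the production heuristic: min(cpu, ceil(N / 100)).
function sketchMaxThreads(
  numOfTasks: number,
  cpuCount: number = os.cpus().length,
): number {
  // Assumption: never report fewer than one worker.
  return Math.max(1, Math.min(cpuCount, Math.ceil(numOfTasks / 100)));
}

// On a 1-vCPU runner the cold path fires one warm-up task, so an assertion
// of "greater than 1" fails there while an equality on this value holds.
const expectedOnOneVcpu = sketchMaxThreads(1000, 1);
const expectedOnEightVcpu = sketchMaxThreads(1000, 8);
```

Asserting equality against the computed value pins the cold-path contract on any runner size.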

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@yamadashy yamadashy force-pushed the perf/metrics-cache-aware-prewarm branch from 9c96671 to 2fdf07e Compare May 19, 2026 15:33
@yamadashy yamadashy merged commit dd0c7bf into main May 21, 2026
55 checks passed
@yamadashy yamadashy deleted the perf/metrics-cache-aware-prewarm branch May 21, 2026 11:59