Commit 8caff2a

perf(core): Warm up all metrics worker threads to eliminate lazy init delays
Revert the half-thread warmup optimization and warm up all worker threads during pool initialization.

While half-warmup reduced CPU contention during the security check phase, it left workers cold for the metrics phase. Cold workers need ~150ms to lazy-load gpt-tokenizer, during which they cannot process batches, effectively serializing early metrics work onto fewer threads.

Full warmup slightly increases contention during the pipeline overlap phase, but the I/O-bound file collection and git subprocess stages provide natural CPU headroom that absorbs the extra warmup load.

Benchmark results (repomix on itself, 996 files, 10 runs each):
  Before (half warmup): median 1.599s
  After (full warmup): median 1.540s
  Improvement: ~59ms (~3.7%)

vs main branch: median 1.764s → 1.540s (~12.7% total improvement)

https://claude.ai/code/session_018NjNHi6fb1AiQHbWdarYcW
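The percentages quoted in the benchmark summary can be re-derived from the three medians. The snippet below is only an arithmetic check; the helper names `improvementPct` and `savedMs` are made up for illustration and are not part of the repository.

```typescript
// Medians reported in the commit message, in seconds.
const medians = { main: 1.764, halfWarmup: 1.599, fullWarmup: 1.54 };

// Relative improvement of `after` over `before`, as a percentage rounded to one decimal place.
const improvementPct = (before: number, after: number): number =>
  Math.round(((before - after) / before) * 1000) / 10;

// Absolute saving, rounded to whole milliseconds.
const savedMs = (before: number, after: number): number => Math.round((before - after) * 1000);

console.log(savedMs(medians.halfWarmup, medians.fullWarmup)); // 59
console.log(improvementPct(medians.halfWarmup, medians.fullWarmup)); // 3.7
console.log(improvementPct(medians.main, medians.fullWarmup)); // 12.7
```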
1 parent ce4a7e5 commit 8caff2a

2 files changed

Lines changed: 14 additions & 12 deletions

src/core/metrics/calculateMetrics.ts

Lines changed: 7 additions & 6 deletions
@@ -52,14 +52,15 @@ export const createMetricsTaskRunner = (numOfTasks: number, encoding: TokenEncod
     runtime: 'worker_threads',
   });
 
-  // Warm up only half the worker threads to further reduce CPU contention during the
-  // overlapping file collection + security check pipeline stages. The remaining
-  // workers initialize lazily during metrics calculation, when security workers
-  // have already been cleaned up and CPU cores are free.
+  // Warm up all worker threads to eliminate lazy initialization delays during the
+  // metrics phase. While warmup overlaps with security check workers (causing some
+  // CPU contention), having all workers ready when metrics calculation starts
+  // outweighs the contention cost: lazy initialization on cold workers adds ~150ms
+  // per worker during the metrics phase, which is worse than the brief contention
+  // during warmup when I/O-bound pipeline stages provide natural CPU headroom.
   const { maxThreads } = getWorkerThreadCount(cappedNumOfTasks);
-  const warmupCount = Math.max(1, Math.ceil(maxThreads / 2));
   const warmupPromise = Promise.all(
-    Array.from({ length: warmupCount }, () => taskRunner.run({ content: '', encoding }).catch(() => 0)),
+    Array.from({ length: maxThreads }, () => taskRunner.run({ content: '', encoding }).catch(() => 0)),
   );
 
   return { taskRunner, warmupPromise };
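
The warmup pattern in the hunk above can be tried in isolation. This is a minimal sketch, not the project's code: the `TaskRunner` and `WarmupTask` interfaces, `warmUpAll`, and `fakeRunner` are hypothetical stand-ins. It also assumes the pool hands concurrently submitted tasks to distinct workers, which the diff does not show.

```typescript
// Hypothetical minimal interface; the task runner in the diff exposes a similar `run`.
interface TaskRunner<T, R> {
  run(task: T): Promise<R>;
}

interface WarmupTask {
  content: string;
  encoding: string;
}

// Submit one empty task per thread so that (under the assumption above) every worker
// loads its tokenizer up front. Failures are mapped to 0 so a broken warmup never
// rejects the combined promise.
const warmUpAll = (
  runner: TaskRunner<WarmupTask, number>,
  maxThreads: number,
  encoding: string,
): Promise<number[]> =>
  Promise.all(Array.from({ length: maxThreads }, () => runner.run({ content: '', encoding }).catch(() => 0)));

// Fake runner for illustration only: counts calls and rejects every second task.
let calls = 0;
const fakeRunner: TaskRunner<WarmupTask, number> = {
  run: () => (++calls % 2 === 0 ? Promise.reject(new Error('cold start failed')) : Promise.resolve(1)),
};

const warmupPromise = warmUpAll(fakeRunner, 4, 'o200k_base');
```

Because `Array.from` invokes the callback synchronously, all `maxThreads` tasks are queued before any of them settles, and the per-task `.catch` keeps a single failed warmup from rejecting `warmupPromise`.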

tests/core/metrics/calculateMetrics.test.ts

Lines changed: 7 additions & 6 deletions
@@ -121,19 +121,20 @@ describe('createMetricsTaskRunner', () => {
     expect(result.taskRunner.run).toHaveBeenCalledWith({ content: '', encoding: 'cl100k_base' });
   });
 
-  it('should warm up ceil(maxThreads/2) workers to reduce CPU contention', async () => {
+  it('should warm up all worker threads', async () => {
     // With 1000 tasks on a system with N cores, maxThreads = min(N, ceil(1000/100)) = min(N, 10)
-    // warmupCount = max(1, ceil(maxThreads / 2))
+    // All threads should be warmed up to avoid lazy init delays during metrics
     const result = createMetricsTaskRunner(1000, 'o200k_base');
 
     await result.warmupPromise;
 
-    // The number of warmup calls should be ceil(maxThreads / 2), not maxThreads
     const callCount = (result.taskRunner.run as Mock).mock.calls.length;
     const { getWorkerThreadCount } = await import('../../../src/shared/processConcurrency.js');
-    const { maxThreads } = getWorkerThreadCount(1000);
-    const expectedWarmupCount = Math.max(1, Math.ceil(maxThreads / 2));
-    expect(callCount).toBe(expectedWarmupCount);
+    // maxMetricsWorkers caps at processConcurrency - 1, so cappedNumOfTasks is used
+    const maxMetricsWorkers = Math.max(1, (await import('node:os')).default.availableParallelism() - 1);
+    const cappedNumOfTasks = Math.min(1000, maxMetricsWorkers * 100);
+    const { maxThreads } = getWorkerThreadCount(cappedNumOfTasks);
+    expect(callCount).toBe(maxThreads);
   });
 
   it('should swallow warmup task errors', async () => {
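
The expected call count in the updated test depends on the machine's core count. The sketch below works through that arithmetic for a few core counts. It relies on the formula stated in the test comment, `min(N, ceil(tasks/100))`, as a stand-in for `getWorkerThreadCount`, whose real implementation is not shown in this commit. `sketchMaxThreads`, `expectedWarmupCalls`, and `TASKS_PER_THREAD` are hypothetical names.

```typescript
// Assumed constant: the test comment implies roughly 100 tasks per thread.
const TASKS_PER_THREAD = 100;

// Hypothetical stand-in for getWorkerThreadCount: min(cores, ceil(tasks / 100)), at least 1.
const sketchMaxThreads = (numOfTasks: number, cores: number): number =>
  Math.max(1, Math.min(cores, Math.ceil(numOfTasks / TASKS_PER_THREAD)));

// Expected number of warmup calls after the metrics-worker cap described in the test.
const expectedWarmupCalls = (numOfTasks: number, cores: number): number => {
  const maxMetricsWorkers = Math.max(1, cores - 1);
  const cappedNumOfTasks = Math.min(numOfTasks, maxMetricsWorkers * TASKS_PER_THREAD);
  return sketchMaxThreads(cappedNumOfTasks, cores);
};

console.log(expectedWarmupCalls(1000, 8)); // 7
console.log(expectedWarmupCalls(1000, 16)); // 10
console.log(expectedWarmupCalls(1000, 1)); // 1
```

Under these assumptions, an 8-core machine caps the task count at 700 and warms up 7 threads, whereas passing the uncapped 1000 (as the old test did) would have yielded 8.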