Commit ba4b0b0

yamadashy and claude committed
fix(metrics): Use 'characters' instead of 'KB' in chunk size comments
JS strings use UTF-16 encoding where character count != byte count. Use 'K characters' for technical accuracy.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
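To illustrate the distinction the commit message draws, here is a minimal sketch comparing what `String.prototype.length` reports (UTF-16 code units) with the UTF-8 encoded size of the same text. The helper name `describeSize` is hypothetical and not part of this repository; `TextEncoder` is assumed to be available as a global, as it is in current Node.js and browsers.

```typescript
// Hypothetical helper (not from the repository): report a string's UTF-16
// code unit count alongside its size in bytes when encoded as UTF-8.
function describeSize(text: string): { codeUnits: number; utf8Bytes: number } {
  return {
    // .length counts UTF-16 code units, not bytes.
    codeUnits: text.length,
    // TextEncoder always encodes to UTF-8.
    utf8Bytes: new TextEncoder().encode(text).length,
  };
}

const ascii = describeSize('abc'); // 3 code units, 3 bytes
const japanese = describeSize('日本語'); // 3 code units, 9 bytes
const emoji = describeSize('😀'); // 2 code units (surrogate pair), 4 bytes
```

Because a chunk size compared against `.length` is measured in code units, describing it as "200K characters" rather than "200KB" avoids implying a byte size that only holds for ASCII-only content.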
1 parent dd94bfb commit ba4b0b0

2 files changed

Lines changed: 3 additions & 3 deletions

src/core/metrics/calculateOutputMetrics.ts

Lines changed: 2 additions & 2 deletions
@@ -2,9 +2,9 @@ import { logger } from '../../shared/logger.js';
 import { type MetricsTaskRunner, runTokenCount } from './metricsWorkerRunner.js';
 import type { TokenEncoding } from './TokenCounter.js';

-// Target ~200KB per chunk to balance tokenization throughput and worker round-trip overhead.
+// Target ~200K characters per chunk to balance tokenization throughput and worker round-trip overhead.
 // Benchmarks show 200K is the sweet spot: fewer round-trips than 100K with enough chunks
-// for good parallelism across available threads (e.g., 20 chunks for 4MB output on 4 cores).
+// for good parallelism across available threads (e.g., 20 chunks for a 4M character output).
 const TARGET_CHARS_PER_CHUNK = 200_000;
 const MIN_CONTENT_LENGTH_FOR_PARALLEL = 1_000_000; // 1MB

tests/core/metrics/calculateOutputMetrics.test.ts

Lines changed: 1 addition & 1 deletion
@@ -173,7 +173,7 @@ describe('calculateOutputMetrics', () => {
 }),
 });

-// With TARGET_CHARS_PER_CHUNK=200_000, 1.1MB content should produce 6 chunks
+// With TARGET_CHARS_PER_CHUNK=200_000, 1.1M character content should produce 6 chunks
 const chunkSizes = processedChunks.map((chunk) => chunk.length);

 expect(processedChunks.length).toBe(6);
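The chunk counts quoted in these comments follow from ceiling division of the content length by the target chunk size. The sketch below assumes fixed-size slicing by `.length`; the splitting logic in `calculateOutputMetrics.ts` is not shown in this diff, so `splitIntoChunks` is a hypothetical stand-in rather than the repository's implementation.

```typescript
const TARGET_CHARS_PER_CHUNK = 200_000;

// Hypothetical helper (assumption, not the repository's code): split content
// into slices of at most chunkSize UTF-16 code units.
// Note: naive slicing like this can cut a surrogate pair in half at a boundary.
function splitIntoChunks(content: string, chunkSize: number = TARGET_CHARS_PER_CHUNK): string[] {
  const chunks: string[] = [];
  for (let start = 0; start < content.length; start += chunkSize) {
    chunks.push(content.slice(start, start + chunkSize));
  }
  return chunks;
}

// 1.1M characters: five full 200,000-character chunks plus one of 100,000.
const chunks = splitIntoChunks('a'.repeat(1_100_000));
const chunkSizes = chunks.map((chunk) => chunk.length);

// 4M characters divide evenly into 20 chunks of 200,000.
const largeChunks = splitIntoChunks('a'.repeat(4_000_000));
```

Under this assumption the number of chunks is `Math.ceil(content.length / TARGET_CHARS_PER_CHUNK)`, which gives 6 for the test's 1.1M character input and 20 for the 4M character example in the source comment.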
