perf(core): Skip redundant full-output tokenization via wrapper-extraction fast path (-13.2%) #1457
When `tokenCountTree` is enabled, `calculateSelectiveFileMetrics` already
tokenizes every file individually on the primary worker pool. The original
`calculateOutputMetrics` then re-tokenized the full output a second time, split
into 200 KB chunks, to compute `totalTokens`. On large repos with the tree
display enabled, this second pass was the single longest task in the
`calculateMetrics` `Promise.all`, consuming roughly 1 second of worker time
that duplicated work already done for the per-file counts.
This change introduces a fast path for the common case (xml / markdown / plain
output, non-parsable, single-part): walk the generated output with
`indexOf(file.content, cursor)` once per file to splice file contents out of
the output, tokenize only the remaining "wrapper" (template boilerplate +
directory tree + git diff/log + per-file headers), and compute
`totalTokens = Σ per-file tokens + wrapper tokens`.
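The forward walk described above can be sketched as follows. This is a minimal illustration rather than the code from this PR: the `ProcessedFile` shape and the sample strings are assumptions, and only the function name and the `indexOf(file.content, cursor)` walk come from the description.

```typescript
// Hypothetical, trimmed-down file shape; the real type lives in the repository.
interface ProcessedFile {
  path: string;
  content: string;
}

// Walk the output once with a forward-only cursor, cutting each file's content out.
// Returns null when a file's content is not found at or after the cursor.
function extractOutputWrapper(output: string, filesInOutputOrder: ProcessedFile[]): string | null {
  const wrapperParts: string[] = [];
  let cursor = 0;
  for (const file of filesInOutputOrder) {
    const start = output.indexOf(file.content, cursor);
    if (start === -1) return null; // caller falls back to full-output tokenization
    wrapperParts.push(output.slice(cursor, start));
    cursor = start + file.content.length;
  }
  wrapperParts.push(output.slice(cursor)); // trailing boilerplate after the last file
  return wrapperParts.join('');
}

const fileA: ProcessedFile = { path: 'a.ts', content: 'const a = 1;' };
const fileB: ProcessedFile = { path: 'b.ts', content: 'const b = 2;' };
const sampleOutput =
  '<file path="a.ts">\nconst a = 1;\n</file>\n<file path="b.ts">\nconst b = 2;\n</file>\n';

// Files in output order: only the per-file headers and footers remain.
const sampleWrapper = extractOutputWrapper(sampleOutput, [fileA, fileB]);

// Files in the wrong order: the cursor has already passed a.ts, so the walk gives up.
const wrongOrder = extractOutputWrapper(sampleOutput, [fileB, fileA]); // null
```

Because the cursor only moves forward, the file list must be in the same order as the output, which is why an order mismatch is one of the fallback triggers.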
The accuracy delta versus the old 200 KB-chunk approach is bounded by BPE
merges across file↔wrapper boundaries; on the repomix repository itself the
measured error was 309 / 1,284,067 tokens ≈ 0.024 %, comparable to the chunk
boundary error the existing approach already accepts.
## Implementation
- `src/core/metrics/calculateMetrics.ts`
  - Add `extractOutputWrapper(output, processedFilesInOutputOrder)` which
    walks the output with a single forward cursor. Returns `null` and
    triggers a fallback to `calculateOutputMetrics` if any file content is
    not found (e.g., template escaped it, output was split, order mismatch).
  - Add `canUseFastOutputTokenPath(config)` gate: only enabled when
    `tokenCountTree` is truthy, `splitOutput` is undefined, `parsableStyle`
    is false, and the style is `xml` / `markdown` / `plain`. JSON output
    and parsable XML go through `JSON.stringify` / `fast-xml-builder`, which
    escape file contents, so `indexOf(content)` would miss them.
  - In `calculateMetrics`, when the fast path is available and wrapper
    extraction succeeds, replace `outputMetricsPromise` with a promise that
    awaits the already-running `selectiveFileMetricsPromise`, sums the
    per-file token counts, and dispatches a single `runTokenCount` on the
    extracted wrapper string. The rest of the `Promise.all` is unchanged.
- `src/core/packager.ts`
  - Call `sortOutputFiles(filteredProcessedFiles, config)` once in `pack`
    immediately after suspicious-file filtering and use its result as
    `processedFiles` downstream (for `produceOutput`, `calculateMetrics`,
    and the final result object). `generateOutput` internally calls
    `sortOutputFiles` as well, which is stable and memoized via
    `fileChangeCountsCache`, so the two now share the single git-log
    subprocess result and consumers see files in the exact order they
    appear in the output. This is a precondition for the fast path's
    forward-walk extraction.
  - Expose `sortOutputFiles` on `defaultDeps` so existing packager unit
    tests can inject their own implementation.
- `tests/core/packager/diffsFunctionality.test.ts`
  - Extend the `gitRepositoryHandle.js` `vi.mock` to also stub
    `isGitInstalled` and `getFileChangeCount`, since `sortOutputFiles`
    resolves its default dependencies from that module at module load time.
All 1102 existing tests pass unchanged; lint is clean.
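The gate conditions listed for `canUseFastOutputTokenPath` can be sketched as a small predicate. The flat `OutputOptions` shape below is an assumption made for illustration (including the `number` type chosen for `splitOutput`); the real config type in the repository has more fields and may nest them differently.

```typescript
// Hypothetical, flattened option shape; field names follow the options named above.
interface OutputOptions {
  style: 'xml' | 'markdown' | 'plain' | 'json';
  parsableStyle: boolean;
  splitOutput?: number; // assumed type; any defined value disables the fast path
  tokenCountTree?: boolean | number;
}

function canUseFastOutputTokenPath(options: OutputOptions): boolean {
  if (!options.tokenCountTree) return false; // per-file counts must already exist
  if (options.splitOutput !== undefined) return false; // multi-part output
  if (options.parsableStyle) return false; // parsable XML escapes file contents
  // JSON output escapes contents too, so only these three styles qualify.
  return options.style === 'xml' || options.style === 'markdown' || options.style === 'plain';
}

const benchmarkLikeConfig: OutputOptions = { style: 'xml', parsableStyle: false, tokenCountTree: 50000 };
const jsonConfig: OutputOptions = { style: 'json', parsableStyle: false, tokenCountTree: true };
```

Keeping the gate separate from the extraction gives two independent safety nets: configurations known to escape content never attempt the walk, and any unexpected mismatch still falls back through the `null` return.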
## Benchmark
Interleaved 30-run benchmark against the repomix repo itself (1018 files,
~4 MB xml output, `tokenCountTree: 50000`, `sortByChanges: true`, `includeDiffs`
and `includeLogs` enabled via the repo's own `repomix.config.json`):
    base median: 2735.2 ms  [2389 - 3528]  IQR=367 ms
    opt median:  2373.6 ms  [2125 - 2653]  IQR=293 ms
    delta:       -361.6 ms  (-13.22%)
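As a quick check of the arithmetic, the delta and percentage follow directly from the two medians reported above. The variable names are illustrative only.

```typescript
// Medians taken from the benchmark figures above.
const baseMedianMs = 2735.2;
const optMedianMs = 2373.6;

const deltaMs = optMedianMs - baseMedianMs;
const deltaPercent = (deltaMs / baseMedianMs) * 100;

const summary = `${deltaMs.toFixed(1)} ms (${deltaPercent.toFixed(2)}%)`;
console.log(summary); // prints "-361.6 ms (-13.22%)"
```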
Verbose trace before/after (single run, representative):
    before:
      Selective metrics calculation completed in 639 ms
      Output token count completed in 1046 ms
      Calculate Metrics wall: 1296 ms
    after:
      Selective metrics calculation completed in 579 ms
      Fast-path output tokens: files=1017293, wrapper=33678 (126996 chars)
      Calculate Metrics wall: ~580 ms
The savings are concentrated in the `calculateMetrics` phase, which was the
dominant critical path in the final `Promise.all` for tokenCountTree runs on
large repos.
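The reduction in this phase comes from replacing the full-output task with a sum of counts that already exist plus one small tokenization. A simplified, synchronous sketch of that combination step is below. The whitespace word counter is a stand-in assumption for the real tokenizer, and `totalOutputTokens` is a hypothetical name; the PR does this asynchronously with `runTokenCount` on the worker pool.

```typescript
// Stand-in tokenizer for illustration only: counts whitespace-separated words.
const countWords = (text: string): number =>
  text.split(/\s+/).filter((part) => part.length > 0).length;

function totalOutputTokens(
  perFileTokenCounts: number[],
  wrapper: string | null,
  fullOutput: string,
  count: (text: string) => number,
): number {
  if (wrapper === null) {
    // Extraction failed: tokenize the whole output, as the original path does.
    return count(fullOutput);
  }
  // Fast path: reuse the per-file counts and tokenize only the wrapper.
  const fileTokens = perFileTokenCounts.reduce((sum, n) => sum + n, 0);
  return fileTokens + count(wrapper);
}

const demoOutput = 'HEADER one two FOOTER';
const fastTotal = totalOutputTokens([2], 'HEADER  FOOTER', demoOutput, countWords);
const fallbackTotal = totalOutputTokens([2], null, demoOutput, countWords);
```

With a word counter the two totals agree exactly. With a real BPE tokenizer they can differ slightly where merges would have spanned a file and wrapper boundary, which is the small error discussed in the description.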
📝 Walkthrough
This change introduces a performance optimization for token counting in metrics calculation and integrates output file sorting into the packager workflow. The metrics module now conditionally applies a fast token-counting path for eligible configurations, computing output token counts by summing per-file counts with a tokenized wrapper, while falling back to the original behavior otherwise. The packager now sorts output files before passing them to downstream operations.
Sequence Diagram
sequenceDiagram
    participant Packager as Packager
    participant SortFn as sortOutputFiles
    participant CalcMetrics as calculateMetrics
    participant FastPath as Fast Path<br/>(runTokenCount)
    participant FallbackPath as Fallback Path<br/>(calculateOutputMetrics)
    Packager->>SortFn: sort filtered processed files
    SortFn-->>Packager: ordered files
    Packager->>CalcMetrics: pass sorted files & config
    alt Config supports fast path
        CalcMetrics->>FastPath: extract wrapper output
        FastPath->>FastPath: sum file tokenCounts
        FastPath->>FastPath: tokenize wrapper only
        FastPath->>FastPath: aggregate counts
        FastPath-->>CalcMetrics: output token count
    else Fallback to standard path
        CalcMetrics->>FallbackPath: calculate for each output part
        FallbackPath-->>CalcMetrics: aggregated counts
    end
    CalcMetrics-->>Packager: metrics result
Estimated Code Review Effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
⚡ Performance Benchmark
History
7ed5943 refactor(metrics): Rename fastWrapper to outputWrapper
f1dc925 fix(metrics): Rename misleading fastOutputToken to singleOutput
0b6c775 fix(metrics): Address PR review feedback
ea174db test(metrics): Add tests for extractOutputWrapper and canUseFastOutputTokenPath
48c7d41 fix(metrics): Address PR review feedback
8321c7a refactor(metrics): Rename calculateSelectiveFileMetrics to calculateFileMetrics
6dea6cb refactor(metrics): Always calculate per-file token counts for all files
68a47b9 perf(core): Skip redundant full-output tokenization via wrapper-extraction fast path (-13.2%)
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1457      +/-   ##
==========================================
+ Coverage   87.13%   87.18%   +0.04%
==========================================
  Files         117      117
  Lines        4440     4465      +25
  Branches     1023     1032       +9
==========================================
+ Hits         3869     3893      +24
- Misses        571      572       +1
Deploying repomix with
| Latest commit: | 701e8d4 |
| Status: | ✅ Deploy successful! |
| Preview URL: | https://2402a134.repomix.pages.dev |
| Branch Preview URL: | https://perf-fast-output-tokenizatio.repomix.pages.dev |
Code Review
This pull request introduces a performance optimization for token counting by implementing a "fast path" that avoids re-tokenizing the entire output string. Instead, it calculates the total tokens by summing individual file token counts and tokenizing only the "wrapper" (the template boilerplate). To support this, files are now pre-sorted to match the output order. Feedback was provided regarding the robustness of the extractOutputWrapper function, specifically noting that using indexOf on file content might lead to incorrect matches if the content appears earlier in the output, such as in the directory tree.
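The robustness concern about `indexOf` can be made concrete with a contrived case. The sketch below assumes a hypothetical file whose entire content is the string `src`, which also appears in the directory tree earlier in the output; the section strings are invented for illustration.

```typescript
// Invented output sections: a directory tree followed by one file body.
const treeSection = 'Directory:\nsrc/index.ts\n\n';
const fileSection = 'File: note.txt\nsrc\n';
const reviewOutput = treeSection + fileSection;
const shortContent = 'src'; // hypothetical file whose whole content is "src"

// The forward walk starts at cursor 0, so it matches inside the directory tree.
const matchedAt = reviewOutput.indexOf(shortContent, 0);
// The occurrence that was actually meant lies in the file section.
const intendedAt = reviewOutput.indexOf(shortContent, treeSection.length);

// Remove the content at a given position to obtain the wrapper.
const cutAt = (at: number): string =>
  reviewOutput.slice(0, at) + reviewOutput.slice(at + shortContent.length);

const actualWrapper = cutAt(matchedAt); // tree entry loses "src", file body keeps it
const intendedWrapper = cutAt(intendedAt); // tree intact, file body removed
```

Both wrappers have the same length because the same number of characters is removed, but their text differs, so the wrapper token count can shift slightly. Later files are still found because the cursor only ends up earlier than intended, not later.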
Previously, per-file tokenization was only done for all files when `tokenCountTree` was enabled; otherwise only the top N files (by char count) were tokenized for the "Top Files" display. But output tokenization always processes all file contents anyway, so the "selective" path was not saving any work; it just prevented the fast output-token path from being used.
Now we always tokenize every file individually, which:
- Enables the wrapper-extraction fast path regardless of tokenCountTree
- Simplifies the metrics pipeline by removing the conditional branching
- Provides complete per-file token data for all downstream consumers
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ileMetrics
The function now always calculates metrics for all files, so the "Selective" prefix no longer reflects its behavior.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Follow-up Review: Incremental Findings
Building on my previous review, a few additional items surfaced:
1. CI Benchmark vs PR Claim (High)
The CI benchmark bot shows -0.8% (Ubuntu), -0.7% (macOS), +0.0% (Windows), dramatically less than the -13.2% claimed in the PR description. The PR benchmark used
Suggestion: Run the CI benchmark explicitly with
2. Stale Comment (Low)
Line 157 in
// Reuse per-file token counts from the primary selective metrics run.
Should be updated: the function is no longer "selective" after the rename to
3. Missing Fallback Logging (Low)
When
Previous feedback items (test coverage, ordering contract, sequential dependency) still apply. The approach is sound; the main open question is whether the net effect is positive for the default configuration.
🤖 Generated with Claude Code
- Remove stale "selective metrics run" comment
- Add trace log when fast-path falls back to full output tokenization
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Addressed in
Regarding other items:
🤖 |
Code Review: Incremental Findings (Post-48c7d41)
Thanks for addressing the stale comment and adding the fallback trace log. A few new items surfaced on closer inspection:
1. Existing
…tTokenPath
Export the two helper functions and add 13 unit tests covering:
- extractOutputWrapper: normal extraction, missing content (null), empty files, identical content, wrong order, no files, no wrapper
- canUseFastOutputTokenPath: each style variant, splitOutput, parsableStyle
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Fix calculateMetrics test to use parsableStyle: true so it exercises the fallback path (calculateOutputMetrics mock) instead of accidentally hitting the fast path
- Correct packager comment to clarify that git-log is cached but the array sort itself runs twice (negligible cost)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Addressed in
Regarding item 2 ( 🤖 |
Code Review: Claude
Overall: Clean, well-structured performance optimization. The approach is sound: reusing per-file token counts and tokenizing only the wrapper is a smart way to avoid redundant work. The fallback is correctly implemented. A few items worth addressing:
Findings
1. No integration test for the fast path through
Consider adding a
2. Minor: misleading variable name
Other observations (non-blocking)
What looks good
🤖 Generated with Claude Code
The variable holds the full output string, not a token. Renamed for clarity.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
🔍 Code Review: Claude
Overall verdict: This is a well-executed performance optimization. The approach is sound, the benchmarks are solid (-13.2% overall, ~55% reduction in
Key Findings
1. False-positive
In practice this is extremely unlikely for real codebases (file contents are typically much longer than header/tree snippets), and the error would be tiny. But it's worth documenting this as a known limitation in the JSDoc for
2. In
const selective = await fileMetricsPromise;
This is a remnant of the old
3. Wrapper tokenization unnecessarily serialized (Not needed, minor)
The fast path awaits
4. After the refactor,
Test Coverage
The 13 new unit tests for
Summary
LGTM with the minor
Detailed reviewer notes
Code Quality:
Performance:
Conventions:
🤖 Generated with Claude Code
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Code Review: Claude (Incremental)
Previous Claude reviews raised items around test coverage, stale naming, fallback logging, ordering coupling, and benchmark discrepancy. The maintainer has addressed most of these across commits
Overall: LGTM. Clean, well-iterated optimization. Ready to merge.
Remaining Item (Non-blocking)
The only call site (
const metricsTargetPaths = processedFiles.map((file) => file.path);
The
What's Been Well Addressed
Summary Table
Nice work on the thorough benchmarking, iterative addressing of review feedback, and clean fallback design.
🤖 Generated with Claude Code
## Summary
When `tokenCountTree` is enabled, `calculateSelectiveFileMetrics` already tokenizes every file individually. The original `calculateOutputMetrics` then re-tokenized the full output (~4 MB) a second time in 200 KB chunks to compute `totalTokens`. This second pass was the single longest task in the `calculateMetrics` `Promise.all`, consuming ~1 second of worker time that duplicated work already done for per-file counts.
This PR introduces a wrapper-extraction fast path: splice file contents out of the generated output via a single `indexOf` forward-walk, tokenize only the remaining "wrapper" (template boilerplate + directory tree + git diff/log + per-file headers), and compute `totalTokens = Σ per-file tokens + wrapper tokens`.
## Key changes
- `src/core/metrics/calculateMetrics.ts`
  - `extractOutputWrapper()`: single-pass forward walk to splice file contents out of the output string
  - `canUseFastOutputTokenPath()`: gate, enabled only for `xml`/`markdown`/`plain` non-parsable single-part output with `tokenCountTree`
  - Reuse `selectiveFileMetrics` + one cheap `runTokenCount` on the ~127 KB wrapper instead of re-tokenizing ~4 MB
- `src/core/packager.ts`
  - Sort `processedFiles` via `sortOutputFiles()` before passing to `produceOutput` and `calculateMetrics`, ensuring file order matches output order (precondition for the `indexOf` forward-walk)
- `tests/core/packager/diffsFunctionality.test.ts`
  - Stub `isGitInstalled` and `getFileChangeCount` in the existing `vi.mock` for `gitRepositoryHandle.js`
## Accuracy
The delta vs the old 200 KB-chunk approach is bounded by BPE merges at file↔wrapper boundaries: 309 / 1,284,067 tokens ≈ 0.024% on the repomix repo itself, comparable to the chunk boundary error the existing approach already accepts.
## Fallback
If any file's content isn't found in the output (template escaping, split output, JSON/parsable-XML, order mismatch), the fast path returns `null` and falls back to the original `calculateOutputMetrics` path. Zero behavior change in those cases.
## Benchmark
Interleaved 30-run benchmark (repomix repo, 1018 files, ~4 MB xml output):
`calculateMetrics` phase: ~1296 ms → ~580 ms.
## Test plan
`tokenCountTree` enabled
🤖 Generated with Claude Code