Skip to content

perf(core): Skip redundant full-output tokenization via wrapper-extraction fast path (-13.2%)#1457

Merged
yamadashy merged 9 commits into
mainfrom
perf/fast-output-tokenization
Apr 12, 2026
Merged

perf(core): Skip redundant full-output tokenization via wrapper-extraction fast path (-13.2%)#1457
yamadashy merged 9 commits into
mainfrom
perf/fast-output-tokenization

Conversation

@yamadashy

@yamadashy yamadashy commented Apr 12, 2026

Copy link
Copy Markdown
Owner

Summary

When tokenCountTree is enabled, calculateSelectiveFileMetrics already tokenizes every file individually. The original calculateOutputMetrics then re-tokenized the full output (~4 MB) a second time in 200 KB chunks to compute totalTokens. This second pass was the single longest task in the calculateMetrics Promise.all, consuming ~1 second of worker time that duplicated work already done for per-file counts.

This PR introduces a wrapper-extraction fast path: splice file contents out of the generated output via a single indexOf forward-walk, tokenize only the remaining "wrapper" (template boilerplate + directory tree + git diff/log + per-file headers), and compute:

totalTokens ≈ Σ per-file tokens + tokens(wrapper)

Key changes

  • src/core/metrics/calculateMetrics.ts

    • extractOutputWrapper() — single-pass forward walk to splice file contents out of the output string
    • canUseFastOutputTokenPath() — gate: enabled only for xml/markdown/plain non-parsable single-part output with tokenCountTree
    • When fast path applies, reuse per-file token counts from selectiveFileMetrics + one cheap runTokenCount on the ~127 KB wrapper instead of re-tokenizing ~4 MB
  • src/core/packager.ts

    • Pre-sort processedFiles via sortOutputFiles() before passing to produceOutput and calculateMetrics, ensuring file order matches output order (precondition for indexOf forward-walk)
  • tests/core/packager/diffsFunctionality.test.ts

    • Stub isGitInstalled and getFileChangeCount in the existing vi.mock for gitRepositoryHandle.js

Accuracy

The delta vs the old 200 KB-chunk approach is bounded by BPE merges at file↔wrapper boundaries: 309 / 1,284,067 tokens ≈ 0.024% on the repomix repo itself — comparable to the chunk boundary error the existing approach already accepts.

Fallback

If any file's content isn't found in the output (template escaping, split output, JSON/parsable-XML, order mismatch), the fast path returns null and falls back to the original calculateOutputMetrics path. Zero behavior change in those cases.

Benchmark

Interleaved 30-run benchmark (repomix repo, 1018 files, ~4 MB xml output):

base median: 2735 ms  [2389 - 3528]  IQR=367 ms
opt  median: 2374 ms  [2125 - 2653]  IQR=293 ms
delta:       -362 ms  (-13.2%)

calculateMetrics phase: ~1296 ms → ~580 ms.

Test plan

  • All 1102 existing tests pass
  • Lint clean
  • Verify fast path activates on default xml output with tokenCountTree enabled
  • Verify fallback works correctly for JSON/parsable-XML/split output

🤖 Generated with Claude Code


Open with Devin

…ction fast path (-13.2%)

When `tokenCountTree` is enabled `calculateSelectiveFileMetrics` already
tokenizes every file individually on the primary worker pool. The original
`calculateOutputMetrics` then re-tokenized the full output a second time, split
into 200 KB chunks, to compute `totalTokens`. On large repos with the tree
display enabled, this second pass was the single longest task in the
`calculateMetrics` `Promise.all`, consuming roughly 1 second of worker time
that duplicated work already done for the per-file counts.

This change introduces a fast path for the common case (xml / markdown / plain
output, non-parsable, single-part): walk the generated output with
`indexOf(file.content, cursor)` once per file to splice file contents out of
the output, tokenize only the remaining "wrapper" (template boilerplate +
directory tree + git diff/log + per-file headers), and compute
`totalTokens = Σ per-file tokens + wrapper tokens`.

The accuracy delta versus the old 200 KB-chunk approach is bounded by BPE
merges across file↔wrapper boundaries; on the repomix repository itself the
measured error was 309 / 1,284,067 tokens ≈ 0.024 %, comparable to the chunk
boundary error the existing approach already accepts.

## Implementation

- `src/core/metrics/calculateMetrics.ts`
  - Add `extractOutputWrapper(output, processedFilesInOutputOrder)` which
    walks the output with a single forward cursor. Returns `null` and
    triggers a fall back to `calculateOutputMetrics` if any file content is
    not found (e.g., template escaped it, output was split, order mismatch).
  - Add `canUseFastOutputTokenPath(config)` gate: only enabled when
    `tokenCountTree` is truthy, `splitOutput` is undefined, `parsableStyle`
    is false, and the style is `xml` / `markdown` / `plain`. JSON output
    and parsable XML go through `JSON.stringify` / `fast-xml-builder` which
    escape file contents, so `indexOf(content)` would miss them.
  - In `calculateMetrics`, when the fast path is available and wrapper
    extraction succeeds, replace `outputMetricsPromise` with a promise that
    awaits the already-running `selectiveFileMetricsPromise`, sums the
    per-file token counts, and dispatches a single `runTokenCount` on the
    extracted wrapper string. The rest of the `Promise.all` is unchanged.

- `src/core/packager.ts`
  - Call `sortOutputFiles(filteredProcessedFiles, config)` once in `pack`
    immediately after suspicious-file filtering and use its result as
    `processedFiles` downstream (for `produceOutput`, `calculateMetrics`,
    and the final result object). `generateOutput` internally calls
    `sortOutputFiles` as well, which is stable and memoized via
    `fileChangeCountsCache`, so the two now share the single git-log
    subprocess result and consumers see files in the exact order they
    appear in the output. This is a precondition for the fast path's
    forward-walk extraction.
  - Expose `sortOutputFiles` on `defaultDeps` so existing packager unit
    tests can inject their own implementation.

- `tests/core/packager/diffsFunctionality.test.ts`
  - Extend the `gitRepositoryHandle.js` `vi.mock` to also stub
    `isGitInstalled` and `getFileChangeCount`, since `sortOutputFiles`
    resolves its default dependencies from that module at module load time.

All 1102 existing tests pass unchanged; lint is clean.

## Benchmark

Interleaved 30-run benchmark against the repomix repo itself (1018 files,
~4 MB xml output, `tokenCountTree: 50000`, `sortByChanges: true`, `includeDiffs`
and `includeLogs` enabled via the repo's own `repomix.config.json`):

    base median: 2735.2 ms  [2389 - 3528]  IQR=367 ms
    opt  median: 2373.6 ms  [2125 - 2653]  IQR=293 ms
    delta:       -361.6 ms  (-13.22%)

Verbose trace before/after (single run, representative):

    before:
      Selective metrics calculation completed in 639 ms
      Output token count completed in      1046 ms
      Calculate Metrics wall:               1296 ms

    after:
      Selective metrics calculation completed in 579 ms
      Fast-path output tokens: files=1017293, wrapper=33678 (126996 chars)
      Calculate Metrics wall:                ~580 ms

The savings are concentrated in the `calculateMetrics` phase, which was the
dominant critical path in the final `Promise.all` for tokenCountTree runs on
large repos.
@coderabbitai

coderabbitai Bot commented Apr 12, 2026

Copy link
Copy Markdown
Contributor

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 09ed0669-60f8-48b4-987c-094a494b1e1c

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

This change introduces a performance optimization for token counting in metrics calculation and integrates output file sorting into the packager workflow. The metrics module now conditionally applies a fast token-counting path for eligible configurations, computing output token counts by summing per-file counts with a tokenized wrapper, while falling back to the original behavior otherwise. The packager now sorts output files before passing them to downstream operations.

Changes

Cohort / File(s) Summary
Metrics Optimization
src/core/metrics/calculateMetrics.ts
Added logger import and runTokenCount from worker runner. Introduced extractOutputWrapper() helper to isolate wrapper content and canUseFastOutputTokenPath() to determine fast-path eligibility. Replaced unconditional calculateOutputMetrics with conditional logic: fast path sums per-file token counts plus wrapper-only output tokens (via runTokenCount); fallback path preserves original behavior via calculateOutputMetrics.
Packager Output Sorting
src/core/packager.ts
Added sortOutputFiles dependency to defaultDeps. After filtering suspicious files, now awaits deps.sortOutputFiles() to establish output ordering before skill generation and metrics calculation, replacing the previous unsorted file list usage.
Test Mocks
tests/core/packager/diffsFunctionality.test.ts
Extended gitRepositoryHandle mock with isGitInstalled (resolves false) and getFileChangeCount (resolves empty object) asynchronous functions.

Sequence Diagram

sequenceDiagram
    participant Packager as Packager
    participant SortFn as sortOutputFiles
    participant CalcMetrics as calculateMetrics
    participant FastPath as Fast Path<br/>(runTokenCount)
    participant FallbackPath as Fallback Path<br/>(calculateOutputMetrics)

    Packager->>SortFn: sort filtered processed files
    SortFn-->>Packager: ordered files
    
    Packager->>CalcMetrics: pass sorted files & config
    
    alt Config supports fast path
        CalcMetrics->>FastPath: extract wrapper output
        FastPath->>FastPath: sum file tokenCounts
        FastPath->>FastPath: tokenize wrapper only
        FastPath->>FastPath: aggregate counts
        FastPath-->>CalcMetrics: output token count
    else Fallback to standard path
        CalcMetrics->>FallbackPath: calculate for each output part
        FallbackPath-->>CalcMetrics: aggregated counts
    end
    
    CalcMetrics-->>Packager: metrics result
Loading

Estimated Code Review Effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly Related PRs

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately summarizes the main performance optimization: skipping redundant full-output tokenization via a wrapper-extraction fast path with measured 13.2% improvement.
Description check ✅ Passed The description comprehensively covers the problem, solution, key changes, accuracy analysis, fallback behavior, benchmark results, and test plan, exceeding the minimal template requirements.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch perf/fast-output-tokenization

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions

github-actions Bot commented Apr 12, 2026

Copy link
Copy Markdown
Contributor

⚡ Performance Benchmark

Latest commit:701e8d4 refactor(metrics): Rename stale selective variable to allFileMetrics
Status:✅ Benchmark complete!
Ubuntu:1.34s (±0.02s) → 1.29s (±0.02s) · -0.05s (-3.9%)
macOS:1.17s (±0.38s) → 1.09s (±0.41s) · -0.08s (-6.7%)
Windows:1.79s (±0.04s) → 1.69s (±0.03s) · -0.10s (-5.8%)
Details
  • Packing the repomix repository with node bin/repomix.cjs
  • Warmup: 2 runs (discarded), interleaved execution
  • Measurement: 20 runs / 30 on macOS (median ± IQR)
  • Workflow run
History

7ed5943 refactor(metrics): Rename fastWrapper to outputWrapper

Ubuntu:1.35s (±0.03s) → 1.29s (±0.02s) · -0.06s (-4.5%)
macOS:0.84s (±0.03s) → 0.79s (±0.05s) · -0.05s (-6.3%)
Windows:2.07s (±0.42s) → 1.94s (±0.40s) · -0.13s (-6.4%)

f1dc925 fix(metrics): Rename misleading fastOutputToken to singleOutput

Ubuntu:1.41s (±0.02s) → 1.33s (±0.02s) · -0.07s (-5.3%)
macOS:0.94s (±0.14s) → 0.87s (±0.13s) · -0.07s (-7.2%)
Windows:1.73s (±0.04s) → 1.64s (±0.04s) · -0.09s (-5.3%)

0b6c775 fix(metrics): Address PR review feedback

Ubuntu:1.42s (±0.02s) → 1.35s (±0.02s) · -0.07s (-4.9%)
macOS:1.02s (±0.17s) → 0.93s (±0.17s) · -0.09s (-8.6%)
Windows:1.77s (±0.09s) → 1.69s (±0.15s) · -0.08s (-4.7%)

ea174db test(metrics): Add tests for extractOutputWrapper and canUseFastOutputTokenPath

Ubuntu:1.39s (±0.04s) → 1.32s (±0.02s) · -0.07s (-5.2%)
macOS:0.82s (±0.04s) → 0.76s (±0.04s) · -0.06s (-7.2%)
Windows:1.66s (±0.04s) → 1.57s (±0.05s) · -0.09s (-5.3%)

48c7d41 fix(metrics): Address PR review feedback

Ubuntu:1.38s (±0.02s) → 1.29s (±0.03s) · -0.08s (-5.8%)
macOS:0.95s (±0.12s) → 0.93s (±0.15s) · -0.02s (-2.4%)
Windows:1.79s (±0.03s) → 1.68s (±0.02s) · -0.10s (-5.9%)

8321c7a refactor(metrics): Rename calculateSelectiveFileMetrics to calculateFileMetrics

Ubuntu:1.40s (±0.02s) → 1.34s (±0.02s) · -0.06s (-4.6%)
macOS:1.12s (±0.13s) → 1.04s (±0.11s) · -0.07s (-6.5%)
Windows:1.89s (±0.05s) → 1.78s (±0.09s) · -0.11s (-5.7%)

6dea6cb refactor(metrics): Always calculate per-file token counts for all files

Ubuntu:1.44s (±0.03s) → 1.38s (±0.03s) · -0.06s (-4.4%)
macOS:1.26s (±0.20s) → 1.08s (±0.09s) · -0.18s (-14.2%)
Windows:1.91s (±0.43s) → 1.81s (±0.41s) · -0.10s (-5.1%)

68a47b9 perf(core): Skip redundant full-output tokenization via wrapper-extraction fast path (-13.2%)

Ubuntu:1.33s (±0.02s) → 1.32s (±0.03s) · -0.01s (-0.8%)
macOS:0.91s (±0.14s) → 0.90s (±0.16s) · -0.01s (-0.7%)
Windows:1.76s (±0.04s) → 1.76s (±0.05s) · +0.00s (+0.0%)

@codecov

codecov Bot commented Apr 12, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 84.09091% with 7 lines in your changes missing coverage. Please review.
✅ Project coverage is 87.18%. Comparing base (9b8a46f) to head (701e8d4).
⚠️ Report is 10 commits behind head on main.
✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
src/core/metrics/calculateMetrics.ts 84.61% 6 Missing ⚠️
src/core/metrics/calculateFileMetrics.ts 75.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1457      +/-   ##
==========================================
+ Coverage   87.13%   87.18%   +0.04%     
==========================================
  Files         117      117              
  Lines        4440     4465      +25     
  Branches     1023     1032       +9     
==========================================
+ Hits         3869     3893      +24     
- Misses        571      572       +1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@cloudflare-workers-and-pages

cloudflare-workers-and-pages Bot commented Apr 12, 2026

Copy link
Copy Markdown

Deploying repomix with  Cloudflare Pages  Cloudflare Pages

Latest commit: 701e8d4
Status: ✅  Deploy successful!
Preview URL: https://2402a134.repomix.pages.dev
Branch Preview URL: https://perf-fast-output-tokenizatio.repomix.pages.dev

View logs

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a performance optimization for token counting by implementing a "fast path" that avoids re-tokenizing the entire output string. Instead, it calculates the total tokens by summing individual file token counts and tokenizing only the "wrapper" (the template boilerplate). To support this, files are now pre-sorted to match the output order. Feedback was provided regarding the robustness of the extractOutputWrapper function, specifically noting that using indexOf on file content might lead to incorrect matches if the content appears earlier in the output, such as in the directory tree.

Comment thread src/core/metrics/calculateMetrics.ts

@devin-ai-integration devin-ai-integration Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 5 additional findings.

Open in Devin Review

Previously, per-file tokenization was only done for all files when
`tokenCountTree` was enabled; otherwise only the top N files (by char
count) were tokenized for the "Top Files" display. But output
tokenization always processes all file contents anyway, so the
"selective" path was not saving any work — it just prevented the fast
output-token path from being used.

Now we always tokenize every file individually, which:
- Enables the wrapper-extraction fast path regardless of tokenCountTree
- Simplifies the metrics pipeline by removing the conditional branching
- Provides complete per-file token data for all downstream consumers

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@claude

claude Bot commented Apr 12, 2026

Copy link
Copy Markdown
Contributor

Code Review — perf(core): Skip redundant full-output tokenization via wrapper-extraction fast path

Nice optimization concept — reusing per-file token counts and only tokenizing the lightweight wrapper is a clean approach. A few items worth discussing before merge:

1. Always-tokenize-all-files is a behavioral change beyond the fast path (Medium)

The PR removes the selective tokenization optimization that previously only tokenized the top N files when tokenCountTree is disabled (the default):

// Before: only top files tokenized when tokenCountTree=false
const metricsTargetPaths = shouldCalculateAllFiles
  ? processedFiles.map((file) => file.path)
  : [...processedFiles].sort(...).slice(0, topFilesLength * 10).map(...)

// After: always all files
const metricsTargetPaths = processedFiles.map((file) => file.path);

For users running with default settings (tokenCountTree: false), this means all 1000+ files are now individually tokenized — work that wasn't done before. The fast path offsets this by avoiding full-output re-tokenization, but the net effect on the default config isn't benchmarked (the benchmark used tokenCountTree: true). A benchmark on the default config would confirm whether this is a net win or regression for the majority of users.

2. Test coverage gap for new logic (Medium)

Codecov reports 35.29% patch coverage with 22 lines missing. The PR's own test plan has unchecked items. Key untested code:

  • extractOutputWrapper — no unit tests for successful extraction, null return, empty file handling, or forward-walk correctness
  • canUseFastOutputTokenPath — no branch coverage tests
  • No integration test verifying the fast path produces token counts within acceptable bounds of the slow path
Suggested test cases
  • extractOutputWrapper with known output string and files → returns correct wrapper
  • extractOutputWrapper with missing file content → returns null
  • extractOutputWrapper with empty file content → skips correctly
  • canUseFastOutputTokenPath for each branch: splitOutput set, parsableStyle true, each style value
  • Fast vs slow path produce equivalent results (within ~0.1%)

3. Implicit ordering contract between packager and output generator (Low)

The fast path relies on processedFiles being sorted identically to how generateOutput orders them in the output. Both call sortOutputFiles, relying on a shared module-level cache to guarantee identical results. If generateOutput ever changes its sorting logic, the fast path silently produces wrong token counts (without triggering the null fallback).

Consider passing the pre-sorted array into generateOutput as an already-sorted input, making the contract explicit and eliminating the redundant sort.

4. Sequential dependency in fast path (Low)

The fast path awaits selectiveFileMetricsPromise before running runTokenCount on the wrapper. This serializes work that previously ran in parallel (output metrics vs file metrics). Since the wrapper tokenization is cheap (~127KB), the impact is small, but worth noting.


Positives:

  • Fallback to calculateOutputMetrics on null is solid defensive design
  • Gating on parsableStyle and JSON correctly excludes formats where content is escaped
  • Benchmark methodology (interleaved 30 runs, IQR reported) is credible
  • Pre-sorting files in packager and adding sortOutputFiles to deps follows the project's DI convention

🤖 Generated with Claude Code

…ileMetrics

The function now always calculates metrics for all files, so the
"Selective" prefix no longer reflects its behavior.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@claude

claude Bot commented Apr 12, 2026

Copy link
Copy Markdown
Contributor

Follow-up Review — Incremental Findings

Building on my previous review, a few additional items surfaced:

1. CI Benchmark vs PR Claim (High)

The CI benchmark bot shows -0.8% (Ubuntu), -0.7% (macOS), +0.0% (Windows) — dramatically less than the -13.2% claimed in the PR description. The PR benchmark used tokenCountTree: true (non-default), while CI likely tests default config. Since the PR also removes the selective top-N tokenization optimization, the default config now tokenizes all files instead of just ~50. This means:

  • With tokenCountTree: true: net win (avoids redundant full-output tokenization)
  • With tokenCountTree: false (default): potential regression (now tokenizes all files, gains nothing from fast path since fileTokenCounts weren't previously computed for all files)

Suggestion: Run the CI benchmark explicitly with tokenCountTree: true to validate the claimed improvement, and confirm default config isn't regressing.

2. Stale Comment (Low)

Line 157 in calculateMetrics.ts:

// Reuse per-file token counts from the primary selective metrics run.

Should be updated — the function is no longer "selective" after the rename to calculateFileMetrics and the removal of top-N filtering.

3. Missing Fallback Logging (Low)

When extractOutputWrapper returns null, the code silently falls back to calculateOutputMetrics with no log. Adding a logger.trace('Fast-path unavailable, falling back to full output tokenization') would help with debugging and monitoring adoption of the fast path in practice.


Previous feedback items (test coverage, ordering contract, sequential dependency) still apply. The approach is sound — the main open question is whether the net effect is positive for the default configuration.

🤖 Generated with Claude Code

- Remove stale "selective metrics run" comment
- Add trace log when fast-path falls back to full output tokenization

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@yamadashy

Copy link
Copy Markdown
Owner Author

Addressed in 48c7d413:

  • Removed stale "selective metrics run" comment (item 2 from follow-up review)
  • Added logger.trace when fast-path falls back to full output tokenization (item 3 from follow-up review)

Regarding other items:

  • Always-tokenize-all-files: Intentional design decision — output tokenization processes all file contents anyway, so per-file tokenization is equivalent work with better pipeline overlap. CI benchmarks confirm net improvement across all platforms.
  • CI benchmark vs PR claim: The -13.2% was measured with tokenCountTree: true on the author's machine. CI runs default config and shows -4.6% to -6.5% improvement, which is consistent with the analysis.
  • Test coverage: Will discuss with maintainer.
  • Implicit ordering contract: Will discuss with maintainer.
  • Sequential dependency: Acknowledged; wrapper tokenization is ~127KB so the serialization cost is negligible.

🤖

@claude

claude Bot commented Apr 12, 2026

Copy link
Copy Markdown
Contributor

Code Review — Incremental Findings (Post-48c7d41)

Thanks for addressing the stale comment and adding the fallback trace log. A few new items surfaced on closer inspection:

1. Existing calculateMetrics test accidentally bypasses calculateOutputMetrics mock (Medium)

The test in calculateMetrics.test.ts uses default config (style: 'xml', parsableStyle: false, splitOutput: undefined) and output 'a'.repeat(300) with file content 'a'.repeat(100). Since canUseFastOutputTokenPath returns true for this config, and extractOutputWrapper will find the substring, the fast path activates — meaning the calculateOutputMetrics: async () => 30 mock is never called. The test passes by coincidence, not by exercising the intended code path. This should be fixed by either:

  • Setting parsableStyle: true in the test config to force the fallback path
  • Adding a separate test that explicitly verifies the fast path behavior

2. targetFilePaths parameter in calculateFileMetrics is now vestigial (Low)

The only call site always passes processedFiles.map(f => f.path) — i.e., all file paths. The filtering logic inside calculateFileMetrics (targetFileSet.has(file.path)) is dead code. Consider removing the parameter (or adding a comment explaining if it's retained for future use).

3. Comment overstates caching of sort (Low)

The comment at packager.ts:160-166 says "sorting once here…costs only a single git-log subprocess (cached)" — this is correct for the git subprocess, but sortFilesByChangeCounts (the array sort) still executes twice: once in packager.ts and once inside generateOutput. The practical cost is negligible (~1ms for 1000 files), but the comment implies the sort itself is memoized when it isn't.

Previous feedback still applicable
  • CI benchmark discrepancy (-0.8% vs -13.2%) — the PR benchmark used tokenCountTree: true while CI uses default config. Worth caveating in the description.
  • Test coverage for extractOutputWrapper and canUseFastOutputTokenPath — still no direct unit tests. Key untested scenarios: content not found (null return), duplicate file contents, all-empty files, each canUseFastOutputTokenPath branch.
  • Implicit ordering contractcalculateMetrics accepts any ProcessedFile[] but the fast path assumes output-order sorting. A misorder where content exists in the wrong segment could yield incorrect counts without triggering the null fallback.

Overall: the optimization is sound for its target use case (tokenCountTree: true, large repos, xml/markdown/plain). The main concern is ensuring the test suite exercises the fast path intentionally rather than accidentally.

🤖 Generated with Claude Code

yamadashy and others added 2 commits April 12, 2026 23:41
…tTokenPath

Export the two helper functions and add 13 unit tests covering:
- extractOutputWrapper: normal extraction, missing content (null),
  empty files, identical content, wrong order, no files, no wrapper
- canUseFastOutputTokenPath: each style variant, splitOutput,
  parsableStyle

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Fix calculateMetrics test to use parsableStyle: true so it exercises
  the fallback path (calculateOutputMetrics mock) instead of accidentally
  hitting the fast path
- Correct packager comment to clarify that git-log is cached but the
  array sort itself runs twice (negligible cost)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@yamadashy

Copy link
Copy Markdown
Owner Author

Addressed in 0b6c775d:

  • Test bypassing mock (item 1): Fixed by setting parsableStyle: true in the test config so it forces the fallback path and exercises the calculateOutputMetrics mock as intended.
  • Comment overstates caching (item 3): Corrected the packager comment to clarify that git-log is cached but the array sort runs twice (negligible).

Regarding item 2 (targetFilePaths vestigial parameter): Will discuss with maintainer — removing it is a valid cleanup but a separate concern.

🤖

@claude

claude Bot commented Apr 12, 2026

Copy link
Copy Markdown
Contributor

Code Review — Claude

Overall: Clean, well-structured performance optimization. The approach is sound — reusing per-file token counts and tokenizing only the wrapper is a smart way to avoid redundant work. The fallback is correctly implemented. A few items worth addressing:

Findings

1. No integration test for the fast path through calculateMetrics ⚠️

calculateMetrics.test.ts now uses parsableStyle: true, which forces the fallback path. The fast-path IIFE (lines 159–171 of calculateMetrics.ts) is never exercised end-to-end by any test — fastOutputTokenPath.test.ts only covers the two pure helpers in isolation.

Consider adding a calculateMetrics test with a config where canUseFastOutputTokenPath returns true and a single-part output where extractOutputWrapper succeeds, to verify the fast path produces correct total token counts.

2. Minor: misleading variable name fastOutputToken 💡

calculateMetrics.ts:151fastOutputToken holds the full output string, not a token. Something like singleOutput or fastPathOutput would be clearer.

Other observations (non-blocking)
  • Sequential await in fast path: The fast path awaits fileMetricsPromise before dispatching runTokenCount(wrapper), losing some parallelism versus the slow path. In practice this is fine since fileMetricsPromise dominates the time, but if per-file tokenization gets faster in the future, kicking off runTokenCount(wrapper) in parallel would help.

  • Double-sort acknowledged: sortOutputFiles() runs in both packager.ts and inside generateOutput. The comment correctly notes this is negligible (~1ms), and the git-log subprocess is cached. The coupling between the two sort calls is a maintenance consideration but not a current issue.

  • Fallback log level: The trace-level log when the fast path falls back (logger.trace('Fast-path unavailable...')) could be promoted to debug for better observability in production without full trace logging.

  • canUseFastOutputTokenPathsplitOutput !== undefined guard: This is technically redundant with the outputParts.length === 1 check downstream, but serves as a clear early-exit for readability. Fine as-is.

What looks good

  • extractOutputWrapper is well-designed: single forward pass, handles empty files, correctly returns null on any mismatch
  • Comprehensive unit tests for both new helpers (7 + 6 cases)
  • Clean rename of calculateSelectiveFileMetricscalculateFileMetrics
  • Proper dependency injection for sortOutputFiles in packager
  • All existing tests updated correctly
  • Benchmark data is thorough and reproducible

🤖 Generated with Claude Code

yamadashy and others added 2 commits April 13, 2026 00:02
The variable holds the full output string, not a token. Renamed for clarity.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@claude

claude Bot commented Apr 12, 2026

Copy link
Copy Markdown
Contributor

🔍 Code Review — Claude

Overall verdict: This is a well-executed performance optimization. The approach is sound, the benchmarks are solid (-13.2% overall, ~55% reduction in calculateMetrics), and the fallback behavior is safe. A few items worth considering before merge:

Key Findings

1. False-positive indexOf match risk (Recommended)

extractOutputWrapper finds file content via indexOf(file.content, cursor). If a file's content appears verbatim earlier in the output than its actual file section (e.g., inside the directory tree, a git diff block, or a summary), indexOf will match the wrong occurrence. Unlike the "not found" case (which returns null and falls back), a wrong-position match produces a silently incorrect wrapper and therefore a slightly wrong totalTokens — without triggering the fallback.

In practice this is extremely unlikely for real codebases (file contents are typically much longer than header/tree snippets), and the error would be tiny. But it's worth documenting this as a known limitation in the JSDoc for extractOutputWrapper.

2. selective variable name is stale (Recommended)

In calculateMetrics.ts:160:

const selective = await fileMetricsPromise;

This is a remnant of the old calculateSelectiveFileMetrics naming. Since it now always contains all files, renaming to fileMetrics (or allFileMetrics) would be clearer.

3. Wrapper tokenization unnecessarily serialized (Not needed — minor)

The fast path awaits fileMetricsPromise before dispatching runTokenCount for the wrapper. Since the wrapper string is already extracted synchronously (line 152), the wrapper tokenization could theoretically be dispatched in parallel with per-file metrics. However, the wrapper is only ~127KB and the tokenization is fast, so the real-world gain would be negligible.

4. calculateFileMetrics retains targetFilePaths parameter (Not needed for this PR)

After the refactor, calculateFileMetrics is always called with all file paths. The targetFilePaths parameter + internal Set filtering is now dead logic. This is a cleanup that could happen in a follow-up PR.

Test Coverage

The 13 new unit tests for extractOutputWrapper and canUseFastOutputTokenPath cover the key scenarios well. Codecov reports 84% patch coverage with 6 lines uncovered in calculateMetrics.ts — these are the fast-path integration lines (159–170) which are only exercised when the fast path fires end-to-end. The existing calculateMetrics.test.ts forces the fallback via parsableStyle: true, which is a good defensive choice. A future follow-up test exercising the fast path through calculateMetrics would close this gap.

Summary

Area Status
Approach ✅ Sound — wrapper extraction + sum is a clean optimization
Accuracy ✅ 0.024% delta, well within BPE boundary error already accepted
Fallback ✅ Returns null → original path, zero behavior change
Security ✅ No concerns — indexOf is linear, no untrusted patterns
Conventions ✅ DI pattern, file sizes, commit messages all compliant
Tests ⚠️ Good unit coverage, fast-path integration lines uncovered

LGTM with the minor selectivefileMetrics rename suggestion. Nice work on the benchmarking and documentation! 🎉


Detailed reviewer notes

Code Quality:

  • extractOutputWrapper and canUseFastOutputTokenPath are clean, well-documented functions
  • The inline IIFE for the fast-path promise (lines 159–171) is slightly harder to follow than a named helper, but acceptable given the tight scope
  • calculateMetrics.ts is at 221 lines — close to the 250-line project limit but within bounds

Performance:

  • String.prototype.indexOf with forward cursor is O(n) amortized for the common case
  • wrapperSegments.join('') for ~127KB is negligible
  • Double sortOutputFiles call relies on fileChangeCountsCache — works correctly but is a hidden coupling worth noting

Conventions:

  • runTokenCount is called directly (not via deps) in the fast path — minor DI gap, acceptable for now since it's a thin wrapper
  • sortOutputFiles properly added to defaultDeps in packager

🤖 Generated with Claude Code

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@claude

claude Bot commented Apr 12, 2026

Copy link
Copy Markdown
Contributor

Code Review — Claude (Incremental)

Previous Claude reviews raised items around test coverage, stale naming, fallback logging, ordering coupling, and benchmark discrepancy. The maintainer has addressed most of these across commits 48c7d41701e8d4. This review covers only new or remaining findings on the final state of the PR.

Overall: LGTM — clean, well-iterated optimization. Ready to merge.

Remaining Item (Non-blocking)

targetFilePaths parameter in calculateFileMetrics is vestigial

The only call site (calculateMetrics.ts:124) always passes all file paths:

const metricsTargetPaths = processedFiles.map((file) => file.path);

The targetFileSet filtering inside calculateFileMetrics (lines 23–24) is dead code. This was noted in a prior review and deferred. Consider removing it in a follow-up to keep the interface honest — or adding a comment explaining it's retained for API flexibility.

What's Been Well Addressed

Previous Feedback Resolution
Stale "selective" naming Renamed throughout (allFileMetrics, calculateFileMetrics)
Missing fallback log Added logger.trace on line 154
Test accidentally bypassing mock Fixed with parsableStyle: true
Comment overstating cache Corrected to note array sort runs twice
Variable naming (fastOutputToken, fastWrapper, selective) All renamed to clearer alternatives
Unit tests for helpers 13 tests added in fastOutputTokenPath.test.ts

Summary Table

Area Status
Approach ✅ Sound — wrapper extraction + sum avoids redundant tokenization
Accuracy ✅ 0.024% delta, within existing BPE chunk-boundary error
Fallback null → original path, logged, zero behavior change
Security ✅ No concerns — string search only, no eval/parse
Performance ✅ CI confirms -4.5% to -6.4% improvement across platforms
Conventions ✅ 221 lines (under 250 limit), DI pattern followed, commits follow conventions
Tests ✅ Good unit coverage for helpers; fast-path integration lines (159–170) uncovered but acceptable given the fallback safety

Nice work on the thorough benchmarking, iterative addressing of review feedback, and clean fallback design.

🤖 Generated with Claude Code

@yamadashy yamadashy merged commit 4c356f7 into main Apr 12, 2026
59 checks passed
@yamadashy yamadashy deleted the perf/fast-output-tokenization branch April 12, 2026 15:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants