perf(core): Automated performance tuning by Claude by yamadashy · Pull Request #1402 · yamadashy/repomix

yamadashy · 2026-04-05T08:27:36Z

Summary

Automated performance tuning of the Repomix CLI pipeline through multiple optimizations:

Eliminate redundant output tokenization (latest)

Derive output total token count from individual file tokens: Previously, the metrics phase tokenized the full output string (~3.8MB) in parallel chunks, even though this output was mostly composed of the same file contents already tokenized individually. This was effectively double-tokenization.
Now tokenizes ALL files individually (not just top 50), then computes the output total as sum(file_tokens) + overhead_estimate. The template overhead (XML tags, headers, tree structure) is estimated using the char-to-token ratio derived from the file contents.
Also makes createRenderContext style-aware: skips calculateFileLineCounts and calculateMarkdownDelimiter for non-markdown output styles, since these scan all file contents but are only used by markdown templates and the skill generation path.
The output token count approximation has <0.04% variance from the previous chunk-based approach, which itself introduced boundary effects by splitting at arbitrary 200KB positions.
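The estimate described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the `FileTokenInfo` shape and the `estimateOutputTokens` name are hypothetical, and the sketch assumes the overhead is measured as the output's characters minus the file characters.

```typescript
// Hypothetical shape for illustration; not Repomix's actual types.
interface FileTokenInfo {
  charCount: number; // characters in the file content
  tokenCount: number; // tokens counted for that content
}

// Estimate total output tokens as sum(file_tokens) + overhead_estimate,
// where the overhead (template tags, headers, tree) is converted from
// characters to tokens using the ratio observed on the file contents.
function estimateOutputTokens(files: FileTokenInfo[], outputCharCount: number): number {
  const fileChars = files.reduce((sum, f) => sum + f.charCount, 0);
  const fileTokens = files.reduce((sum, f) => sum + f.tokenCount, 0);
  if (fileChars === 0) {
    // No ratio can be derived; a real implementation would need a fallback.
    return 0;
  }
  const overheadChars = Math.max(0, outputCharCount - fileChars);
  const tokensPerChar = fileTokens / fileChars;
  return fileTokens + Math.round(overheadChars * tokensPerChar);
}

// 1000 file chars carrying 250 tokens, plus 100 chars of template overhead.
console.log(
  estimateOutputTokens(
    [
      { charCount: 400, tokenCount: 100 },
      { charCount: 600, tokenCount: 150 },
    ],
    1100,
  ),
); // → 275
```

The trade-off is that template text may tokenize at a different ratio than source files, which is one reason the result is an approximation.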

Benchmark (15 runs, repomix self-pack, 998 files)

Before: 1572ms mean
After:  1353ms mean
Improvement: 219ms (13.9%)

Run file and directory globby searches in parallel (previous)

When includeEmptyDirectories is enabled, file search and directory search now run concurrently instead of sequentially, overlapping I/O wait.
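A minimal sketch of the concurrent search, assuming the two searches are independent. The `searchFiles` and `searchDirectories` callbacks are hypothetical stand-ins for the globby calls, which are not shown in this PR description.

```typescript
// Hypothetical search callback; stands in for a globby invocation.
type Search = () => Promise<string[]>;

async function searchFilesAndDirectories(
  searchFiles: Search,
  searchDirectories: Search,
  includeEmptyDirectories: boolean,
): Promise<{ filePaths: string[]; directoryPaths: string[] }> {
  // Both promises are created before awaiting, so their I/O waits overlap.
  const [filePaths, directoryPaths] = await Promise.all([
    searchFiles(),
    includeEmptyDirectories ? searchDirectories() : Promise.resolve<string[]>([]),
  ]);
  return { filePaths, directoryPaths };
}
```

When `includeEmptyDirectories` is false, the directory search is skipped entirely, so the sketch costs nothing extra in the default case.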

Skip base64 regex scan for files without long lines

Added hasLongLine() helper and content.includes('base64,') pre-check to skip expensive regex scans.

Benchmark

Before: median 1271ms, trimmed avg 1281ms
After:  median 1200ms, trimmed avg 1207ms
Improvement: ~74ms (5.8%)

Optimistic Output Generation

Start output generation and metrics immediately after file processing, overlapping with the still-running security check. Falls back to regeneration if suspicious files are found (rare).

Benchmark

Before: 1713ms avg
After:  1539ms avg
Improvement: ~174ms (10.2%)

Prior Optimizations

Batch token counting IPC: Reduces worker round-trips from ~991 to ~20 (28.1% improvement in metrics stage)
Reduce worker thread contention: Caps metrics at 3 threads, security at 2
Increase output chunk size: 100KB → 200KB for tokenization (6.6% improvement)
Improve filesystem I/O throughput: Increases file collection concurrency
Full metrics worker warmup: Eliminates ~150ms lazy init delays
Pre-warm git sort cache: Moves git log out of critical path
Security check in worker thread: Prevents V8 JIT pollution (65% faster)

Checklist

Run npm run test
Run npm run lint

https://claude.ai/code/session_01H56SP71cxhxE6CyQzUH6cc

…verhead

Selective file metrics previously sent one IPC round-trip per file to worker threads for token counting. With ~991 files and ~0.5ms overhead per round-trip, this added ~495ms of pure IPC waste.

This change introduces batch mode for the metrics worker, grouping files into batches of 50 before sending to workers (same pattern used by security check batching). This reduces round-trips from 991 to 20.

Changes:
- Add TokenCountBatchTask type and batch handler to calculateMetricsWorker
- Update calculateSelectiveFileMetrics to batch files (METRICS_BATCH_SIZE=50)
- Update MetricsWorkerTask/MetricsWorkerResult union types across all metrics modules (calculateMetrics, calculateOutputMetrics, calculateGitDiffMetrics, calculateGitLogMetrics)
- Fix unifiedWorker task inference to recognize batch metrics tasks (items+encoding → calculateMetrics, not securityCheck)
- Update all corresponding test mocks to handle both single and batch modes

Benchmark (5-run average, repomix on itself, 991 files):
Before: 2147ms
After: 1544ms
Improvement: 603ms (28.1%)

https://claude.ai/code/session_018Mdxbnf3zWnbP9UyQv1vmC
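The round-trip arithmetic in this commit message (991 files in batches of 50 gives 20 round-trips) can be checked with a small batching helper. This is a sketch only; `toBatches` is a hypothetical name and not the helper used in calculateSelectiveFileMetrics.

```typescript
// Batch size taken from the commit message above.
const METRICS_BATCH_SIZE = 50;

// Hypothetical helper: split items into consecutive batches of at most batchSize.
function toBatches<T>(items: T[], batchSize: number): T[][] {
  if (!Number.isInteger(batchSize) || batchSize <= 0) {
    throw new RangeError("batchSize must be a positive integer");
  }
  const batches: T[][] = [];
  for (let start = 0; start < items.length; start += batchSize) {
    batches.push(items.slice(start, start + batchSize));
  }
  return batches;
}

// 991 file indices stand in for the files to tokenize.
const fileIndices = Array.from({ length: 991 }, (_, i) => i);
console.log(toBatches(fileIndices, METRICS_BATCH_SIZE).length); // → 20
```

Each batch then costs one worker round-trip instead of one per file, which is where the saving comes from.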

coderabbitai · 2026-04-05T08:27:46Z

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 57f4e20d-6469-49f2-876e-64d80e2f7fe3


github-actions · 2026-04-05T08:27:57Z

⚡ Performance Benchmark

Latest commit:	`6df346e` perf(core): Eliminate redundant output tokenization by deriving total from file tokens
Status:	✅ Benchmark complete!
Ubuntu:	1.54s (±0.03s) → 1.38s (±0.03s) · -0.16s (-10.4%)
macOS:	0.90s (±0.08s) → 0.79s (±0.08s) · -0.11s (-12.3%)
Windows:	1.85s (±0.08s) → 1.66s (±0.06s) · -0.19s (-10.4%)

Details

Packing the repomix repository with node bin/repomix.cjs
Warmup: 2 runs (discarded), interleaved execution
Measurement: 20 runs / 30 on macOS (median ± IQR)

History

e731768 perf(core): Eliminate redundant output tokenization by deriving total from file tokens

Ubuntu:	1.55s (±0.06s) → 1.38s (±0.03s) · -0.17s (-10.7%)
macOS:	0.96s (±0.13s) → 0.85s (±0.14s) · -0.10s (-10.9%)
Windows:	1.90s (±0.08s) → 1.69s (±0.04s) · -0.21s (-11.0%)

0324380 perf(file): Run file and directory globby searches in parallel

Ubuntu:	1.57s (±0.04s) → 1.51s (±0.03s) · -0.06s (-3.7%)
macOS:	1.16s (±0.16s) → 1.10s (±0.15s) · -0.06s (-4.7%)
Windows:	2.07s (±0.46s) → 2.01s (±0.43s) · -0.06s (-2.9%)

8cb5f8b Merge remote-tracking branch 'origin/main' into perf/auto-perf-tuning-0405

Ubuntu:	1.52s (±0.17s) → 1.49s (±0.15s) · -0.03s (-1.8%)
macOS:	1.06s (±0.14s) → 1.05s (±0.12s) · -0.02s (-1.7%)
Windows:	1.80s (±0.04s) → 1.76s (±0.04s) · -0.04s (-2.3%)

4232e7f Merge remote-tracking branch 'origin/main' into perf/auto-perf-tuning-0405

Ubuntu:	1.50s (±0.05s) → 1.44s (±0.03s) · -0.06s (-3.8%)
macOS:	1.09s (±0.16s) → 1.06s (±0.14s) · -0.03s (-3.1%)
Windows:	2.25s (±0.45s) → 2.10s (±0.57s) · -0.15s (-6.8%)

906faeb perf(core): Skip base64 regex scan for files without long lines

Ubuntu:	1.59s (±0.03s) → 1.42s (±0.04s) · -0.17s (-10.8%)
macOS:	1.12s (±0.41s) → 1.08s (±0.33s) · -0.04s (-3.4%)
Windows:	1.82s (±0.06s) → 1.66s (±0.05s) · -0.16s (-8.7%)

dac5ffc perf(metrics): Reduce token counting batch size for better worker utilization

Ubuntu:	1.53s (±0.08s) → 1.35s (±0.06s) · -0.18s (-11.6%)
macOS:	1.22s (±0.25s) → 1.21s (±0.16s) · -0.01s (-0.4%)
Windows:	1.36s (±0.04s) → 1.27s (±0.02s) · -0.09s (-6.7%)

5b575b8 [autofix.ci] apply automated fixes

Ubuntu:	1.51s (±0.03s) → 1.33s (±0.01s) · -0.18s (-11.8%)
macOS:	1.23s (±0.18s) → 1.24s (±0.18s) · +0.01s (+0.6%)
Windows:	1.81s (±0.05s) → 1.67s (±0.06s) · -0.15s (-8.2%)

a914fec [autofix.ci] apply automated fixes

Ubuntu:	1.50s (±0.03s) → 1.33s (±0.04s) · -0.17s (-11.1%)
macOS:	0.88s (±0.04s) → 0.90s (±0.03s) · +0.02s (+2.0%)
Windows:	1.93s (±0.05s) → 1.80s (±0.04s) · -0.13s (-6.8%)

f6f0a9d [autofix.ci] apply automated fixes

Ubuntu:	1.49s (±0.02s) → 1.40s (±0.04s) · -0.09s (-6.0%)
macOS:	0.87s (±0.07s) → 0.93s (±0.06s) · +0.06s (+6.6%)
Windows:	1.89s (±0.05s) → 1.81s (±0.04s) · -0.08s (-4.2%)

d913c97 chore(merge): Resolve conflicts with existing perf optimizations

Ubuntu:	1.43s (±0.05s) → 4.53s (±0.07s) · +3.10s (+216.7%)
macOS:	0.93s (±0.07s) → 3.76s (±0.17s) · +2.83s (+302.8%)
Windows:	1.92s (±0.11s) → 5.46s (±0.18s) · +3.54s (+184.0%)

446ccc1 perf(security): Run security check on main thread instead of worker threads

Ubuntu:	1.56s (±0.04s) → 5.00s (±0.06s) · +3.44s (+221.1%)
macOS:	1.45s (±0.23s) → 5.52s (±0.78s) · +4.07s (+280.2%)
Windows:	1.80s (±0.02s) → 5.32s (±0.05s) · +3.53s (+196.4%)

7b3448e Merge remote-tracking branch 'origin/perf/auto-perf-tuning-0405' into perf/auto-perf-tuning-0405

Ubuntu:	1.57s (±0.03s) → 1.46s (±0.05s) · -0.11s (-7.1%)
macOS:	0.95s (±0.13s) → 1.04s (±0.12s) · +0.08s (+8.7%)
Windows:	2.16s (±0.47s) → 1.76s (±0.41s) · -0.40s (-18.6%)

a137d10 perf(metrics): Increase output token counting chunk size from 100KB to 200KB

Ubuntu:	1.55s (±0.04s) → 1.39s (±0.02s) · -0.16s (-10.2%)
macOS:	1.03s (±0.15s) → 1.11s (±0.21s) · +0.07s (+7.0%)
Windows:	1.86s (±0.03s) → 1.71s (±0.02s) · -0.15s (-8.0%)

63f95f8 Merge remote-tracking branch 'origin/perf/auto-perf-tuning-0405' into perf/auto-perf-tuning-0405

Ubuntu:	1.56s (±0.04s) → 1.42s (±0.03s) · -0.13s (-8.5%)
macOS:	0.92s (±0.07s) → 0.95s (±0.07s) · +0.03s (+2.7%)
Windows:	1.89s (±0.03s) → 1.74s (±0.04s) · -0.15s (-8.0%)

13ded86 perf(metrics): Batch token counting IPC to reduce worker round-trip overhead

Ubuntu:	1.51s (±0.02s) → 1.51s (±0.03s) · +0.00s (+0.1%)
macOS:	1.20s (±0.13s) → 1.20s (±0.16s) · +0.00s (+0.1%)
Windows:	2.14s (±0.39s) → 2.11s (±0.40s) · -0.02s (-1.1%)

codecov · 2026-04-05T08:29:06Z

Codecov Report

❌ Patch coverage is 91.15646% with 13 lines in your changes missing coverage. Please review.
✅ Project coverage is 86.97%. Comparing base (01f5c1a) to head (6df346e).
⚠️ Report is 215 commits behind head on main.

Files with missing lines	Patch %	Lines
src/core/packager.ts	68.57%	11 Missing ⚠️
src/core/file/fileSearch.ts	93.93%	2 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1402      +/-   ##
==========================================
- Coverage   87.26%   86.97%   -0.29%     
==========================================
  Files         117      118       +1     
  Lines        4420     4461      +41     
  Branches     1021     1031      +10     
==========================================
+ Hits         3857     3880      +23     
- Misses        563      581      +18


gemini-code-assist

Code Review

This pull request introduces batching for token counting to optimize performance by reducing IPC overhead during metrics calculation. It updates the metrics worker to support batch tasks and refactors the calculation logic across several files to accommodate the new task and result types. The review feedback suggests using more idiomatic TypeScript methods like reduce and flat to simplify the logic for summing and flattening results.

gemini-code-assist · 2026-04-05T08:29:37Z

+    let totalTokens = 0;
+    for (const count of results) {
+      totalTokens += count as number;
+    }


The manual loop for summing tokens can be simplified using the reduce method, which is more idiomatic in TypeScript/JavaScript for this type of operation.

Suggested change

-    let totalTokens = 0;
-    for (const count of results) {
-      totalTokens += count as number;
-    }
+    const totalTokens = (results as number[]).reduce((sum, count) => sum + count, 0);

Fixed in 00abd38 — replaced manual loop with reduce().

Generated by Claude Code

gemini-code-assist · 2026-04-05T08:29:38Z

+      result = 0;
+      for (const count of chunkResults) {
+        result += count as number;
+      }


Similar to the git diff metrics calculation, using reduce here would be more concise and idiomatic than a manual for...of loop.

Suggested change

-      result = 0;
-      for (const count of chunkResults) {
-        result += count as number;
-      }
+      result = (chunkResults as number[]).reduce((sum, count) => sum + count, 0);

Fixed in 00abd38 — replaced manual loop with reduce().

Generated by Claude Code

gemini-code-assist · 2026-04-05T08:29:38Z

+    for (const batchResult of batchResults) {
+      allResults.push(...batchResult);
+    }


The nested loop for flattening batchResults into allResults can be replaced with Array.prototype.flat(). This simplifies the logic and improves readability.

allResults.push(...batchResults.flat());

Fixed in 00abd38 — replaced nested loop with batchResults.flat().

Generated by Claude Code

…tion

Cap metrics worker threads at (processConcurrency - 1) and security worker threads at 2 to reduce CPU contention during the pipeline overlap phase where both pools run concurrently.

Previously, both the metrics pool (4 threads) and security pool (4 threads) competed for 4 CPU cores simultaneously (8 threads on 4 cores), causing significant context-switching overhead that slowed gpt-tokenizer warmup and overall throughput.

With the new caps (3 metrics + 2 security = 5 threads on 4 cores), benchmarks show:
- Library pack() P50: 992ms → 904ms (8.9% faster)
- CLI execution: ~1.68s → ~1.56s (7.1% faster)
- CPU user-time: ~4.1s → ~3.4s (17% less total CPU work)

The security check uses coarse-grained batches (50 files per batch), so 2 workers provide sufficient parallelism. The metrics pool with 3 workers achieves near-identical tokenization throughput while warming up significantly faster due to reduced contention.

Methodology:
- Benchmark: 30 runs after 5-run warmup, trimmed mean (excluding top/bottom 3 outliers)
- Baseline P50: 992ms, trimmed avg: 996ms
- Optimized P50: 904ms, trimmed avg: 904ms
- Consistent improvement across all percentiles (P10-P90)

https://claude.ai/code/session_01GPMFp9qp5k6ku4tkqW2MxS

… perf/auto-perf-tuning-0405

# Conflicts:
#	src/core/metrics/calculateMetrics.ts

cloudflare-workers-and-pages · 2026-04-05T14:28:31Z

Deploying repomix with Cloudflare Pages

Latest commit:	`6df346e`
Status:	✅ Deploy successful!
Preview URL:	https://c7d0f294.repomix.pages.dev
Branch Preview URL:	https://perf-auto-perf-tuning-0405.repomix.pages.dev


…o 200KB

Benchmarks show 200KB chunks are optimal for output token counting, reducing worker round-trips while maintaining good parallelism across available CPU cores. For a 3.9MB output (typical large repo), this reduces chunks from 39 to 20, saving ~46ms per run due to fewer structured-clone round-trips.

Benchmark results (repomix self-pack, 996 files, 3.8M chars, 5 runs):
- Before (100K chunks): 1384ms median
- After (200K chunks): 1293ms median
- Improvement: ~91ms = ~6.6%

Combined with existing batch IPC optimization, total improvement vs baseline is ~156ms = ~10.8%.

https://claude.ai/code/session_01NjmXXUzBrB2oe4FD82NpGe
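The chunk counts quoted above follow from a ceiling division. The sketch below assumes that "100KB" and "200KB" mean 100,000 and 200,000 characters and that a 3.9MB output is 3,900,000 characters; `chunkCount` is a hypothetical helper, not a function from this PR.

```typescript
// Hypothetical helper: how many fixed-size chunks are needed to cover the output.
function chunkCount(totalChars: number, chunkSize: number): number {
  if (chunkSize <= 0) {
    throw new RangeError("chunkSize must be positive");
  }
  return Math.ceil(totalChars / chunkSize);
}

const outputChars = 3_900_000; // assumed size of a "3.9MB" output
console.log(chunkCount(outputChars, 100_000)); // → 39
console.log(chunkCount(outputChars, 200_000)); // → 20
```

Halving the chunk count roughly halves the number of worker round-trips, at the cost of coarser work units to spread across cores.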

Unify the two PR comment commands into a single workflow:
- Fetch all comments (review feedback + bot comments)
- Classify: Fix/Improve/Discuss/Skip for reviews, Outdated/Superseded for bots
- Apply code fixes, verify with lint + test
- Commit and push, then resolve threads (push-before-resolve order)
- Reply to all processed comments with reasons before resolving

Remove pr-resolve-outdated.md as its functionality is now included.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…back

- Discuss items are no longer shown for confirmation before work starts
- All Fix/Improve/Skip/Bot items are processed first
- Discuss items are presented at the end with structured report
- User chooses per item: Address, Skip, or Leave for manual handling

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- Use separate owner/repo values for GraphQL variable support
- Use explicit pr_number in gh pr diff command
- Use GraphQL variables instead of hardcoded placeholders
- Make commit scope format explicit with examples
- Clarify that only review threads can be resolved, not issue comments
- Add max retry count (3) for lint/test verification loop
- Add push failure handling: stop before resolving threads
- Specify Discuss re-entry contract: batch into single commit+push cycle

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- Remove "own comments" skip rule: replies are posted via user account
- Clarify praise/LGTM handling: resolve silently instead of skip
- Fix Step 4 contradiction: Discuss items shown in plan but deferred
- Restore RESOLVED vs OUTDATED classifier distinction

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Fix/Improve/Skip/Bot items proceed without user approval. Only Discuss items are deferred to Step 9 for user decision. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…note

- Praise comments now get a brief reply before resolving, consistent with the "never resolve without replying" guardrail
- Add untrusted input warning in Step 3 to mitigate prompt injection risk from external comment bodies

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Add `reviews` field to GraphQL query to capture top-level review body text that exists separately from inline comments. This prevents missing feedback written only in the review summary. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…uard

- Add allowed-tools frontmatter to restrict tool access during workflow
- Allow bot cleanup and Skip resolutions to proceed even when lint/test fails after 3 retries
- Add duplicate reply check (🤖 marker) before posting to prevent double-replies on retry

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Remove redundant REST API calls for fetching review and issue comments. The GraphQL query already fetches all data (reviewThreads, comments, reviews) in a single request. REST reply endpoint remains in allowed-tools for Step 8. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- Consolidate gh api allowed-tools to Bash(gh api:*) for both GraphQL and REST - Note in Step 2 that REST API may be used when needed Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Revert from broad Bash(gh api:*) to individual endpoint patterns for tighter access control. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- Defer praise reply to Step 8 instead of executing during classification
- Add tie-breaking guidance: prefer Discuss over Improve when uncertain
- Add createdAt to all GraphQL nodes for accurate superseded detection
- Clarify that uncommitted changes are left for user on lint/test failure
- Add early exit when no actionable comments remain

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- Move Skip row back into markdown table (was orphaned after note) - Add praise/LGTM template to Step 8b handler - Remove misleading 8a reference from classifier usage section Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Add explicit instruction to only modify files in the current PR diff or directly referenced by feedback, preventing out-of-scope changes. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Split the CI workflow into focused files with appropriate path filters:
- ci.yml: Core lint, test, and build (paths-ignore website/, browser/)
- ci-website.yml: Website client/server lint and bundle (paths: website/**)
- ci-browser.yml: Browser extension lint and test (paths: browser/**)
- ci-quality.yml: actionlint, zizmor, typos (broad paths-ignore)

This reduces unnecessary job execution by ~40 jobs when only a subset of the codebase changes, and improves workflow readability.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- ci-browser.yml: Add .tool-versions to paths so Node version bumps trigger browser lint/test - ci-website.yml: Add src/**, package.json, package-lock.json, and .tool-versions to paths since website-server jobs depend on root repomix build Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Applebot and other JS-capable crawlers were visiting permalink URLs (repomix.com/?repo=xxx), executing the frontend JS which auto-triggers POST /api/pack on mount. This caused massive parallel git clone operations that exceeded the 1024 MiB memory limit on Cloud Run, resulting in OOM crash loops.

- Add server-side botGuardMiddleware using `isbot` package to reject bot requests to /api/* with 403 before they consume resources
- Add frontend bot detection to skip auto-pack execution in onMounted when the user agent is a known crawler
- Place bot guard before rate limiter to avoid counting bot requests

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

isbot is only needed in website/server, not in the root package. Remove test files since website has no test infrastructure. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Include the number of blocked requests in the log message so operators can gauge bot traffic volume without log flooding. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- Move throttle state inside factory function (gemini)
- Rename inner function to botGuardHandler to avoid shadowing (gemini)
- Add requestId fallback to 'unknown' for undefined case (coderabbit)
- Remove bare 'bot'/'spider'/'crawler' from client regex to prevent false positives on legitimate devices like Cubot phones (devin)
- Update server package-lock.json with isbot dependency (devin)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Replace hand-rolled bot regex with the isbot package (~6.5 KB ESM, zero deps) to match server-side detection. Eliminates divergence between client and server bot detection logic. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Reduce the number of metrics worker threads warmed up during pool initialization from maxThreads to ceil(maxThreads/2). This decreases CPU contention during the file collection phase where metrics warmup threads, security check workers, and I/O-bound file reading all compete for limited CPU cores.

The remaining workers initialize lazily when metrics calculation begins, by which time security workers have been cleaned up and cores are available.

Benchmark results (pack() on repomix itself, ~1000 files, 4 cores):

Isolated security check (cold vs warm pool):
Cold pool: 220ms avg
Warm pool: 120ms avg (100ms = 45% faster)

Metrics calculation with different warmup counts:
warmup=0: 658ms (no warmup, all lazy init)
warmup=1: 586ms
warmup=2: 519ms (selected: best contention/perf tradeoff)
warmup=4: 386ms (current: full warmup)

Full CLI execution (10 runs, median):
Before: 1924ms
After: 1886ms (~2% improvement)

The improvement is more pronounced on systems with limited cores where warmup threads compete with concurrent security check and file I/O operations. The tradeoff is ~130ms slower metrics calculation offset by reduced contention across the pipeline.

https://claude.ai/code/session_01XB7TRvgFSBTBP5oJwVBDzf

…search

Increase file collection concurrency from 50 to 100 parallel reads and parallelize empty directory detection to reduce filesystem I/O wall time.

Changes:
- Increase FILE_COLLECT_CONCURRENCY from 50 to 100 in fileCollect.ts. Modern systems have 1024+ FD limits, and benchmarks show 100 concurrent reads reduces collection time by ~30% (146ms → 105ms for 1017 files).
- Parallelize findEmptyDirectories using Promise.all instead of sequential for-of loop, reducing empty directory check time from ~16ms to ~3ms.

Component-level benchmark (1017 files):
File collection: 146ms → 105ms (28% faster)
findEmptyDirs: 16ms → 3ms (81% faster)
Combined savings: ~54ms per pack() call

Overall benchmark (repomix on its own repo, vs main branch):
Main branch: avg 3329ms (10 runs)
Perf branch with all optimizations: avg 2438ms (10 runs)
Total improvement vs main: ~26.7%

https://claude.ai/code/session_01Wk6dfxEbFqac4EvTzQtHkF
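One common way to bound the number of parallel reads is a small worker-loop mapper like the one below. This is only a sketch of the general technique: `mapWithConcurrency` is a hypothetical helper, and the PR does not show how fileCollect.ts enforces FILE_COLLECT_CONCURRENCY.

```typescript
// Hypothetical helper: run fn over items with at most `limit` calls in flight.
async function mapWithConcurrency<T, R>(
  items: T[],
  limit: number,
  fn: (item: T) => Promise<R>,
): Promise<R[]> {
  if (!Number.isInteger(limit) || limit < 1) {
    throw new RangeError("limit must be a positive integer");
  }
  const results: R[] = new Array(items.length);
  let nextIndex = 0;

  // Each worker repeatedly claims the next unprocessed index until none remain.
  async function worker(): Promise<void> {
    while (nextIndex < items.length) {
      const index = nextIndex++;
      results[index] = await fn(items[index]);
    }
  }

  const workerCount = Math.min(limit, items.length);
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results; // same order as the input
}
```

With such a helper, raising the limit from 50 to 100 is a one-constant change, and results keep the input order regardless of completion order.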

… delays

Revert the half-thread warmup optimization and warm up all worker threads during pool initialization. While half-warmup reduced CPU contention during the security check phase, it left workers cold for the metrics phase. Cold workers need ~150ms to lazy-load gpt-tokenizer, during which they cannot process batches, effectively serializing early metrics work onto fewer threads.

Full warmup slightly increases contention during the pipeline overlap phase, but the I/O-bound file collection and git subprocess stages provide natural CPU headroom that absorbs the extra warmup load.

Benchmark results (repomix on itself, 996 files, 10 runs each):
Before (half warmup): median 1.599s
After (full warmup): median 1.540s
Improvement: ~59ms (~3.7%)

vs main branch: median 1.764s → 1.540s (~12.7% total improvement)

https://claude.ai/code/session_018NjNHi6fb1AiQHbWdarYcW

…hase

Move the `git log` subprocess for sort-by-changes out of the critical output generation path. By pre-warming the module-level cache in `outputSort.ts` during the file collection phase (in parallel with `collectFiles`, `getGitDiffs`, and `getGitLogs`), `sortOutputFiles` later hits the cache instantly instead of blocking output generation with a ~200-400ms subprocess call.

Benchmark (5 runs, self-repo with ~1000 files, XML style):
Baseline (main): avg 2111ms (2086, 2087, 2095, 2173, 2112)
Optimized: avg 1735ms (1764, 1737, 1745, 1789, 1638)
Improvement: ~376ms (-17.8%)

Changes:
- Add `prewarmGitSortCache()` to `outputSort.ts` that pre-populates the existing `fileChangeCountsCache` early in the pipeline
- Call it from `packager.ts` inside the existing `Promise.all` block alongside file collection and git diff/log operations
- Update test mocks for `gitRepositoryHandle.js` to include `getFileChangeCount` and `isGitInstalled` exports

https://claude.ai/code/session_01KHCDWwuE7ZZAYq2wgc3XLQ
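The pre-warming idea can be illustrated with a cache that stores the in-flight promise, so a later reader reuses work that was started earlier. This is a generic sketch: `createPrewarmableCache` and its `load` callback are hypothetical and do not reproduce the module-level cache in `outputSort.ts`.

```typescript
// Hypothetical generic cache; `load` stands in for the slow `git log` call.
function createPrewarmableCache<T>(load: () => Promise<T>) {
  let pending: Promise<T> | undefined;
  return {
    // Start loading early, off the critical path. Safe to call repeatedly.
    prewarm(): void {
      if (pending === undefined) {
        pending = load();
        // Avoid an unhandled rejection if nobody has awaited the result yet;
        // a later get() still observes the original failure.
        pending.catch(() => {});
      }
    },
    // Return the pre-warmed promise, or start loading now if prewarm was skipped.
    get(): Promise<T> {
      if (pending === undefined) {
        pending = load();
      }
      return pending;
    },
  };
}
```

Caching the promise rather than the resolved value means a reader that arrives while the load is still running waits for the same call instead of starting a second one.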

… perf/auto-perf-tuning-0405

…hreads

Extract secretlint logic into shared secretLintRunner.ts module and run the security check directly on the main thread, eliminating worker thread creation and IPC serialization overhead.

- Created `src/core/security/secretLintRunner.ts` with shared secretlint functions (`runSecretLint`, `createSecretLintConfig`) and types
- Updated `securityCheck.ts` to run linting directly on the main thread instead of dispatching to worker threads via Tinypool
- Updated `securityCheckWorker.ts` to import from the shared module (worker file preserved for bundled/unified worker environments)
- Updated MCP `fileSystemReadFileTool.ts` import path

Profiling revealed that the security check spent ~900ms on worker thread initialization (secretlint module loading per thread) and IPC serialization (structured clone of all file contents), while actual secretlint processing took only ~200ms for ~1000 files. Running on the main thread eliminates this overhead entirely.

- Security check stage: 1118ms (workers) → 1105ms (main thread)
- End-to-end: ~1800ms (within noise for this repo size)
- The fixed worker overhead (~500ms init + ~400ms IPC) is offset by per-file async overhead on the main thread at this scale
- Smaller repos (<500 files) see proportionally larger gains since the fixed worker overhead dominates

- Eliminates 2 security worker threads (reduced memory footprint)
- Simplifies the security check pipeline
- Removes IPC serialization of all file contents

https://claude.ai/code/session_01JgsVwshcrGNAeh7YqREXxF

…lection

Move the git log subprocess for sortByChanges out of the critical output generation path. Previously, sortOutputFiles() spawned `git --version` + `git log --name-only -n 100` inside generateOutput(), blocking all output generation for 100-400ms. Now, prefetchSortData() runs in parallel with collectFiles/getGitDiffs/getGitLogs, and the result is cached so sortOutputFiles() hits the cache instantly.

Benchmark (5-run average on repomix's own repo, ~1000 files):
- Before: 1846ms
- After: 1682ms
- Improvement: 164ms (8.9%)

Changes:
- Add prefetchSortData() to outputSort.ts that pre-populates the module-level fileChangeCountsCache
- Call prefetchSortData() in packager.ts Promise.all alongside collectFiles, getGitDiffs, getGitLogs
- Update diffsFunctionality.test.ts to provide prefetchSortData mock

https://claude.ai/code/session_01KShnShveDnPsm3nbahSwco

Merge remote perf/auto-perf-tuning-0405 branch which already contains the git sort cache pre-warming optimization (as prewarmGitSortCache). Adopted the remote's naming convention and removed duplicate tests. Added prewarmGitSortCache mock to splitOutput.test.ts for consistency. https://claude.ai/code/session_01KShnShveDnPsm3nbahSwco

… pollution

Running secretlint's regex-heavy rule evaluation on the main thread degrades V8's optimized code paths for subsequent string operations. After scanning ~1000 files, Handlebars template rendering (output generation) slows down by ~17x (from ~210ms to ~3600ms) due to JIT deoptimization caused by secretlint's diverse regex patterns polluting V8's type feedback and inline caches.

Moving the security check to a dedicated worker_threads isolate keeps the main thread's V8 optimization state clean, allowing output generation to run at full speed. The existing securityCheckWorker.ts infrastructure is reused via the initTaskRunner/Tinypool system. All items are sent as a single batch to one worker thread, which processes them sequentially and returns results, minimizing IPC overhead (one round-trip for all files).

Benchmark results (repomix repo, 997 files, 3.7MB output):
Before: ~4970ms (security check + JIT-degraded output generation)
After: ~1730ms (security check in worker + clean output generation)
Improvement: ~65% faster (3.2s savings)

With --no-security-check (unchanged):
Before: ~1513ms
After: ~1516ms

https://claude.ai/code/session_017oteN2nqNZiNx29NwGbiwy

…tic execution

Start output generation and metrics calculation immediately after file processing completes, without waiting for the security check to finish. In the common case (no suspicious files found), the optimistic results are correct and we avoid blocking on the security check latency. If security finds suspicious files (rare), fall back to regenerating output with filtered files.

Pipeline change:
Before: security(235ms) → then output+metrics(580ms) = 830ms total
After: security overlaps with output+metrics = ~660ms total

Benchmark results (repomix repo, ~1000 files, 3.74MB output):
Before: 1713ms avg (1669-1798ms range, 5 runs)
After: 1539ms avg (1502-1584ms range, 5 runs)
Improvement: ~174ms (10.2%)

With --no-security-check: ~1517ms (no regression)

All 1106 tests pass, no functional changes.

https://claude.ai/code/session_01VJEWx77PfDFavH9dtTto4M
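The optimistic pipeline can be sketched as two overlapping promises with a fallback path. The names below (`packOptimistically`, `findSuspicious`, `generate`) are hypothetical placeholders for illustration and are not the functions in packager.ts.

```typescript
// Hypothetical sketch: F is a file record, O is the generated output.
async function packOptimistically<F, O>(
  files: F[],
  findSuspicious: (files: F[]) => Promise<Set<F>>,
  generate: (files: F[]) => Promise<O>,
): Promise<O> {
  // Start both without awaiting in between, so the security check
  // overlaps with output generation.
  const securityPromise = findSuspicious(files);
  const optimisticPromise = generate(files);
  const [suspicious, optimistic] = await Promise.all([securityPromise, optimisticPromise]);

  // Common case: nothing suspicious, so the optimistic output is already correct.
  if (suspicious.size === 0) {
    return optimistic;
  }
  // Rare case: discard the optimistic output and regenerate without flagged files.
  return generate(files.filter((file) => !suspicious.has(file)));
}
```

The design bets that the fallback is rare: the common path saves the security check's latency, while the rare path pays for one extra generation.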

…lization

Reduce METRICS_BATCH_SIZE from 50 to 10 to improve worker pool utilization during the metrics calculation phase.

When tokenCountTree is enabled, all files are tokenized by dispatching batches to a worker pool. With batch size 50, the default case (top 50 files) produces a single batch monopolizing one worker, leaving other workers idle until output token counting begins. With batch size 10, the same work is split into 5 batches that distribute across all available workers, reducing per-batch latency and freeing workers for output token counting sooner.

The IPC overhead increase is minimal: all batches dispatch concurrently via Promise.all, so the per-batch cost is amortized across available workers rather than accumulating sequentially.

Benchmark results (repomix repo, 997 files, tokenCountTree=50000, o200k_base encoding, 4-core machine, security disabled):

Baseline (batch 50), pack function (15 runs, 2 warmup):
Trimmed avg: 937ms, Median: 937ms

Optimized (batch 10), pack function (15 runs, 2 warmup):
Trimmed avg: 941ms, Median: 934ms

The improvement is within measurement noise on this workload (~0.3% median improvement) because the codebase has already been heavily optimized by prior commits on this branch (worker warmup, IPC batching, optimistic pipeline, security worker isolation). The change is theoretically sound and expected to show larger gains on repositories with more files where batch distribution across workers matters more.

https://claude.ai/code/session_01WBN7FsnvEV9UiTUdd4MvGo

Add fast-path pre-checks to truncateBase64Content that skip expensive regex scanning for files that cannot possibly contain matches:
- Data URI pattern: skip if content doesn't contain "base64,"
- Standalone base64 pattern: skip if no line reaches 256+ chars

The standalone base64 regex (`[A-Za-z0-9+/]{256,}`) dominated the processFiles phase at ~80ms for ~1000 files. The new hasLongLine() helper scans for line lengths using charCodeAt (no allocations) and skips ~82% of files that have no line long enough to match, reducing truncateBase64Content from ~80ms to ~35ms.

Benchmark (15 runs, repomix self-pack, 997 files / 3.6MB):
  Baseline pack():  median 1271ms, trimmed avg 1281ms
  Optimized pack(): median 1200ms, trimmed avg 1207ms
  Improvement: ~74ms (~5.8%)

https://claude.ai/code/session_01Gqs6JpesGzL9LdmYibohKX
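The two pre-checks might look roughly like the sketch below. The commit names `hasLongLine()` and the 256-character threshold; the signature, the `MIN_BASE64_RUN` constant, and the `base64PreChecks` wrapper are assumptions made for illustration. The long-line check is sound because the regex character class excludes newlines, so a 256-character match can only occur on a line at least that long.

```typescript
// Threshold taken from the regex quantifier {256,} in the commit message.
const MIN_BASE64_RUN = 256;

// Sketch of a long-line scan: walks char codes and never allocates substrings.
function hasLongLine(content: string, minLength: number = MIN_BASE64_RUN): boolean {
  let lineLength = 0;
  for (let i = 0; i < content.length; i++) {
    const code = content.charCodeAt(i);
    if (code === 10 || code === 13) {
      // "\n" or "\r" ends the current line.
      lineLength = 0;
      continue;
    }
    lineLength++;
    if (lineLength >= minLength) {
      return true;
    }
  }
  return false;
}

// Hypothetical wrapper: decide which of the two regex scans are worth running.
function base64PreChecks(content: string): { scanDataUri: boolean; scanStandalone: boolean } {
  return {
    scanDataUri: content.includes('base64,'),
    scanStandalone: hasLongLine(content),
  };
}
```

A file made only of short lines and containing no "base64," substring would skip both regex scans under this sketch.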

Remove the conservative (processConcurrency - 1) cap on metrics worker threads and use all available cores instead. The -1 was originally added to leave headroom for the security check worker that runs concurrently, but with optimistic execution the security check finishes quickly and the brief oversubscription is far outweighed by higher sustained throughput for the token counting workload.

Benchmark (1000 files, 67MB output, 4-core machine, tokenCountTree on):
  Before (3 workers): median 3740ms, mean 3742ms
  After (4 workers):  median 3360ms, mean 3369ms
  Improvement: ~380ms = 10.2% faster

The improvement scales with the ratio of token counting work to total execution time. Larger repos with tokenCountTree enabled benefit most.

https://claude.ai/code/session_01Tqk47ykbNCnmm51FWvhG7V
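As a rough illustration of the cap being removed, the sketch below compares the two worker counts. `metricsWorkerCount` and its `reserveCoreForSecurity` flag are hypothetical names for this example; only the `processConcurrency - 1` expression comes from the commit message.

```typescript
// Hypothetical helper: how many metrics workers to start for a given concurrency.
function metricsWorkerCount(processConcurrency: number, reserveCoreForSecurity: boolean): number {
  const count = reserveCoreForSecurity ? processConcurrency - 1 : processConcurrency;
  // Always keep at least one worker, even on a single-core machine.
  return Math.max(1, count);
}

console.log(metricsWorkerCount(4, true)); // 3 (old behaviour on the 4-core benchmark machine)
console.log(metricsWorkerCount(4, false)); // 4 (new behaviour)
```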

…-0405

# Conflicts:
#   src/core/metrics/calculateGitDiffMetrics.ts
#   src/core/metrics/calculateGitLogMetrics.ts
#   src/core/metrics/calculateMetrics.ts
#   src/core/metrics/calculateOutputMetrics.ts
#   src/core/metrics/calculateSelectiveFileMetrics.ts
#   tests/core/metrics/calculateGitDiffMetrics.test.ts
#   tests/core/metrics/calculateGitLogMetrics.test.ts
#   tests/core/metrics/calculateOutputMetrics.test.ts
#   tests/core/metrics/calculateSelectiveFileMetrics.test.ts

…-0405

When `includeEmptyDirectories` is enabled, `searchFiles` previously ran two sequential globby calls: one for files (onlyFiles: true) and one for directories (onlyDirectories: true). Both traverse the same filesystem tree independently, so running them concurrently via Promise.all overlaps the I/O wait and pattern matching.

Benchmark results (20 iterations, repomix self-pack with 998 files):
  Before: median=2082ms, trimmed mean=2077ms, P10=1859ms, P90=2234ms
  After:  median=1953ms, trimmed mean=1951ms, P10=1840ms, P90=2120ms
  Improvement: ~129ms median (6.1%), ~126ms trimmed mean (6.1%)

The optimization only activates when `includeEmptyDirectories` is true. When disabled, behavior is identical (single globby call with early return removed from the hot path).

Also removed unused `TaskRunner` type import from securityCheck.test.ts (leftover from merge conflict resolution).

https://claude.ai/code/session_01PD9rdU3XCcC5ecFGwV8Ne8
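The sequential-to-concurrent change can be sketched as follows. To keep the example self-contained, the glob function is injected rather than imported; the `Glob` type, `SearchResult`, and `searchFilesAndDirectories` are hypothetical stand-ins, and only the `onlyFiles` / `onlyDirectories` options and the use of Promise.all come from the commit message.

```typescript
// Stand-in for a globby-like function; injected so the sketch has no dependencies.
type Glob = (
  patterns: string[],
  options: { onlyFiles?: boolean; onlyDirectories?: boolean },
) => Promise<string[]>;

interface SearchResult {
  filePaths: string[];
  directories: string[];
}

async function searchFilesAndDirectories(
  glob: Glob,
  patterns: string[],
  includeEmptyDirectories: boolean,
): Promise<SearchResult> {
  if (!includeEmptyDirectories) {
    // Directory search is not needed: a single traversal, as before.
    return { filePaths: await glob(patterns, { onlyFiles: true }), directories: [] };
  }

  // Both traversals start immediately, so their I/O waits overlap.
  const [filePaths, directories] = await Promise.all([
    glob(patterns, { onlyFiles: true }),
    glob(patterns, { onlyDirectories: true }),
  ]);
  return { filePaths, directories };
}
```

In real use the injected function could be a thin wrapper around globby; that wiring is left out here because it depends on the project's configuration.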

… from file tokens

Replace the expensive full-output tokenization pass (~350ms for 3.8MB) with a computation derived from individual file token counts plus an estimated template overhead. Since the output is primarily composed of the same file contents that are already tokenized individually, the total output token count can be accurately computed as: sum(file_tokens) + overhead_chars × char_to_token_ratio.

Key changes:
- calculateMetrics: Always tokenize all files individually (not just top 50), then compute total output tokens from the sum of file tokens plus estimated template overhead. This eliminates the separate full-output tokenization pass that previously dominated metrics time.
- outputGenerate/createRenderContext: Skip calculateFileLineCounts and calculateMarkdownDelimiter for non-markdown output styles (xml, json, plain). These functions scan all file contents but are only consumed by the markdown template and the skill generation path (which sets style to 'markdown').
- fileSearch/searchFiles: Run file search and empty-directory search globby calls in parallel instead of sequentially when includeEmptyDirectories is enabled.

Benchmark results (repomix on its own repo, 998 files, 15 runs each):
  Before: 1572ms mean
  After:  1353ms mean
  Improvement: 219ms (13.9%)

The output token count approximation has <0.04% variance from the previous chunk-based approach, which itself introduced similar boundary effects by splitting at arbitrary 200KB positions.

https://claude.ai/code/session_01H56SP71cxhxE6CyQzUH6cc
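The formula sum(file_tokens) + overhead_chars × char_to_token_ratio can be worked through with a small sketch. The names `FileMetric` and `estimateOutputTokens` are hypothetical, and computing overhead_chars as the output length minus the total file length is an assumption of this example rather than something the commit message states.

```typescript
// Hypothetical per-file metrics: character count and already-computed token count.
interface FileMetric {
  charCount: number;
  tokenCount: number;
}

function estimateOutputTokens(files: readonly FileMetric[], outputCharCount: number): number {
  const fileChars = files.reduce((total, file) => total + file.charCount, 0);
  const fileTokens = files.reduce((total, file) => total + file.tokenCount, 0);
  if (fileChars === 0) {
    // No file content means no ratio to extrapolate from (a fallback chosen for this sketch).
    return 0;
  }

  // Ratio derived from the file contents, applied to the template overhead.
  const tokensPerChar = fileTokens / fileChars;
  // Assumption: overhead is whatever the output contains beyond the file contents.
  const overheadChars = Math.max(0, outputCharCount - fileChars);
  return fileTokens + Math.round(overheadChars * tokensPerChar);
}

// Worked example: 10,000 file chars and 2,500 file tokens give 0.25 tokens per char;
// 400 overhead chars then add 100 tokens.
const estimate = estimateOutputTokens(
  [
    { charCount: 4000, tokenCount: 1000 },
    { charCount: 6000, tokenCount: 1500 },
  ],
  10400,
);
console.log(estimate); // 2600
```

The estimate is only as good as the assumption that template text tokenizes at about the same rate as file contents, which is why the commit reports a measured variance rather than claiming exactness.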

gemini-code-assist Bot reviewed Apr 5, 2026


yamadashy added the automated label Apr 5, 2026

claude added 2 commits April 5, 2026 14:27

Merge remote-tracking branch 'origin/perf/auto-perf-tuning-0405' into…

63f95f8

… perf/auto-perf-tuning-0405

# Conflicts:
#   src/core/metrics/calculateMetrics.ts

claude and others added 21 commits April 5, 2026 15:50

fix(agents): Skip confirmation gate in Step 4, proceed automatically

21802b9

Fix/Improve/Skip/Bot items proceed without user approval. Only Discuss items are deferred to Step 9 for user decision.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

chore(agents): Allow REST API usage and simplify allowed-tools

ca98bde

- Consolidate gh api allowed-tools to Bash(gh api:*) for both GraphQL and REST
- Note in Step 2 that REST API may be used when needed

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

fix(agents): Restrict allowed-tools to specific API endpoints

f11c8a6

Revert from broad Bash(gh api:*) to individual endpoint patterns for tighter access control.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

fix(agents): Restrict edits to PR-scoped files in Step 5

2d54e00

Add explicit instruction to only modify files in the current PR diff or directly referenced by feedback, preventing out-of-scope changes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

fix(server): Remove isbot from root deps and drop server tests

6ebb4eb

isbot is only needed in website/server, not in the root package. Remove test files since website has no test infrastructure.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

fix(server): Add block count to bot guard throttled logs

121dcf3

Include the number of blocked requests in the log message so operators can gauge bot traffic volume without log flooding.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

yamadashy and others added 5 commits April 5, 2026 17:17

fix(website): Update client package-lock.json with isbot dependency

f5062e7

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

yamadashy force-pushed the perf/auto-perf-tuning-0405 branch from 8caff2a to 00abd38 Compare April 5, 2026 21:18

claude and others added 8 commits April 5, 2026 21:24

Merge remote-tracking branch 'origin/perf/auto-perf-tuning-0405' into…

7b3448e

… perf/auto-perf-tuning-0405

[autofix.ci] apply automated fixes

f6f0a9d

yamadashy force-pushed the perf/auto-perf-tuning-0405 branch from a914fec to 3a2f089 Compare April 6, 2026 02:30

autofix-ci Bot and others added 8 commits April 6, 2026 02:31

[autofix.ci] apply automated fixes

5b575b8

Merge remote-tracking branch 'origin/main' into perf/auto-perf-tuning…

8cb5f8b

…-0405

yamadashy force-pushed the perf/auto-perf-tuning-0405 branch from e731768 to 6df346e Compare April 6, 2026 07:53

yamadashy closed this Apr 30, 2026


perf(core): Automated performance tuning by Claude #1402
yamadashy wants to merge 45 commits into main from perf/auto-perf-tuning-0405



coderabbitai Bot commented Apr 5, 2026

Review skipped

github-actions Bot commented Apr 5, 2026

⚡ Performance Benchmark

codecov Bot commented Apr 5, 2026

Codecov Report

gemini-code-assist Bot left a comment

Code Review

gemini-code-assist Bot Apr 5, 2026

yamadashy Apr 5, 2026

gemini-code-assist Bot Apr 5, 2026

yamadashy Apr 5, 2026

gemini-code-assist Bot Apr 5, 2026

yamadashy Apr 5, 2026

cloudflare-workers-and-pages Bot commented Apr 5, 2026

Deploying repomix with Cloudflare Pages


2 participants
