Commit 5ff9b1f
perf(core): Add newline pre-filter to base64 run detection

Skip the per-character `hasLongBase64Run` scan for files whose lines are
all shorter than the 256-char standalone-base64 threshold.
Why:
- `truncateBase64Content` runs on the main thread for every collected
file (no worker pool on the default pack path), so its CPU cost is
fully on the serial critical path. With `truncateBase64: true` (set in
this repo's own repomix.config.json, the benchmark target) it is the
dominant cost of the file-processing phase.
- `hasLongBase64Run` previously charCodeAt-scanned every byte of every
file (~5.5 MB across ~1.1k files) just to gate the standalone-base64
regex.
What:
- A 256-char base64 run cannot contain a newline (`\n` is not a base64
character and resets the run), so it must fit inside a single line.
Before the byte scan, walk newline offsets with the native
`String.prototype.indexOf`; if no line reaches the threshold, no run is
possible and we return early. Files with a long line fall through to
the unchanged full scan.
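
The pre-filter described above can be sketched roughly as follows. This is a minimal illustration, not the committed code: `hasLineAtLeast`, `fullScanForRun`, `isBase64Char`, and `hasLongBase64RunSketch` are hypothetical names, and the base64 character set in the stand-in full scan (`A-Z`, `a-z`, `0-9`, `+`, `/`, `=`) is an assumption. Only the 256 threshold and the use of `String.prototype.indexOf` come from the message.

```typescript
// Threshold stated in the commit message; the constant name is hypothetical.
const MIN_STANDALONE_BASE64_RUN = 256;

// True when some line (text between '\n' separators) is at least `threshold`
// UTF-16 code units long. Hops between newlines with native indexOf instead
// of inspecting every character.
function hasLineAtLeast(content: string, threshold: number): boolean {
  let lineStart = 0;
  while (lineStart <= content.length) {
    const newlineAt = content.indexOf("\n", lineStart);
    const lineEnd = newlineAt === -1 ? content.length : newlineAt;
    if (lineEnd - lineStart >= threshold) return true;
    if (newlineAt === -1) return false; // last line checked, none long enough
    lineStart = newlineAt + 1;
  }
  return false;
}

// Assumed base64 alphabet for this sketch: A-Z, a-z, 0-9, '+', '/', '='.
function isBase64Char(code: number): boolean {
  return (
    (code >= 65 && code <= 90) ||
    (code >= 97 && code <= 122) ||
    (code >= 48 && code <= 57) ||
    code === 43 ||
    code === 47 ||
    code === 61
  );
}

// Stand-in for the unchanged full scan: any non-base64 character resets the run.
function fullScanForRun(content: string, threshold: number): boolean {
  let run = 0;
  for (let i = 0; i < content.length; i++) {
    if (isBase64Char(content.charCodeAt(i))) {
      run++;
      if (run >= threshold) return true;
    } else {
      run = 0;
    }
  }
  return false;
}

// Pre-filter first; fall through to the full scan only when a long line exists.
function hasLongBase64RunSketch(
  content: string,
  threshold: number = MIN_STANDALONE_BASE64_RUN,
): boolean {
  if (!hasLineAtLeast(content, threshold)) return false;
  return fullScanForRun(content, threshold);
}
```

In this sketch a trailing `\r` counts toward the line length, so CRLF files can only make the pre-filter more conservative: a line may fall through to the full scan unnecessarily, but a long run is never skipped.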
Behavior-preserving:
- The pre-filter can only short-circuit when a long run is provably
absent; any file with a >=256-char line still runs the authoritative
byte scan, so results are identical. CLI output verified byte-identical
across xml/markdown/json/plain. Isolated run over all 1127 repo files:
0 mismatches vs the previous implementation.
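
A differential check of the kind described above might look like the sketch below. `findMismatches`, `referenceHasRun`, and `candidateHasRun` are hypothetical names. The two predicates are simplified stand-ins, not the repository's implementations: a regex for the previous behavior and a `split`-based line filter in front of it for the new one.

```typescript
type Predicate = (content: string) => boolean;

// Returns the indices of inputs on which the two predicates disagree.
function findMismatches(
  inputs: readonly string[],
  reference: Predicate,
  candidate: Predicate,
): number[] {
  const mismatches: number[] = [];
  inputs.forEach((content, index) => {
    if (reference(content) !== candidate(content)) mismatches.push(index);
  });
  return mismatches;
}

// Stand-in for the previous implementation: scan the whole content.
// The character class is an assumption made for this sketch.
const referenceHasRun: Predicate = (content) =>
  /[A-Za-z0-9+/=]{256,}/.test(content);

// Stand-in for the new implementation: bail out when no line is long enough,
// otherwise defer to the same authoritative check.
const candidateHasRun: Predicate = (content) =>
  content.split("\n").some((line) => line.length >= 256) &&
  referenceHasRun(content);

// A few inputs mirroring the cases named in the Tests section.
const sampleInputs: string[] = [
  "A".repeat(300), // single long run, no newline
  "A".repeat(128) + "\n" + "A".repeat(128), // run split by a newline
  ("short line\n").repeat(1000), // many short lines
  "A".repeat(255) + "\r\n" + "A".repeat(255), // CRLF
  " ".repeat(400), // long line, no base64 run
];

const mismatchIndices = findMismatches(
  sampleInputs,
  referenceHasRun,
  candidateHasRun,
);
```

An empty `mismatchIndices` array corresponds to the "0 mismatches" result reported above, here only for the handful of sample inputs rather than the full repo file set.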
Benchmark (this container, `node bin/repomix.cjs`, warm cache):
- Isolated `truncateBase64Content` over the full repo file set
(interleaved, JIT-warmed median): 42.5ms -> 17.3ms, -25.2ms.
- Whole-process wall clock (interleaved, noise floor):
min -32.5ms (-3.73%)
p25 -24.3ms (-2.53%)
Comfortably above the 2%-of-total improvement bar.
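
For readers unfamiliar with the "interleaved, JIT-warmed median" methodology, here is a minimal sketch of one way to structure such a measurement. It is not the harness used for the numbers above. `median` and `interleavedMedians` are hypothetical names, and the clock is passed in as a parameter (for example `() => performance.now()` under Node) so the sketch makes no assumption about the runtime.

```typescript
// Median of a non-empty list of samples.
function median(samples: readonly number[]): number {
  if (samples.length === 0) throw new Error("median of empty sample set");
  const sorted = [...samples].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Alternates baseline and candidate within each round so that drift in
// machine state (thermal, cache, GC pressure) affects both sides similarly.
// Warm-up rounds run first and are discarded, giving the JIT a chance to
// optimize both functions before timing starts.
function interleavedMedians(
  baseline: () => void,
  candidate: () => void,
  rounds: number,
  now: () => number,
  warmupRounds: number = 5,
): { baselineMs: number; candidateMs: number; deltaMs: number } {
  for (let i = 0; i < warmupRounds; i++) {
    baseline();
    candidate();
  }

  const baselineSamples: number[] = [];
  const candidateSamples: number[] = [];
  for (let i = 0; i < rounds; i++) {
    let start = now();
    baseline();
    baselineSamples.push(now() - start);

    start = now();
    candidate();
    candidateSamples.push(now() - start);
  }

  const baselineMs = median(baselineSamples);
  const candidateMs = median(candidateSamples);
  return { baselineMs, candidateMs, deltaMs: candidateMs - baselineMs };
}
```

A negative `deltaMs` means the candidate was faster, matching the sign convention of the figures above.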
Tests:
- Added newline-split / many-short-lines / CRLF / no-newline cases to
tests/core/file/truncateBase64.test.ts. Full suite: 1345 passing.
1 parent 84b2603 commit 5ff9b1f
2 files changed
Lines changed: 56 additions & 0 deletions