Skip to content

Commit 7c4f082

Browse files
BYKclaude
andauthored
perf(scan): rewrite walker hot path with sync I/O + manual string ops (#805)
## Summary Walker was 411ms p50 for 10k files on the synthetic/large fixture — roughly 10× slower than a naive sync walk (50ms). Follow-up to PR #791 and #804. Five targeted optimizations in the hot loop, all micro-benchmarked before landing: | Change | Measured win | End-to-end win | |---|---|---| | `readdirSync` instead of `readdir` | 105→45ms (−60ms) | dominant | | `statSync` instead of `Bun.file().size` | −10ms per 10k files | | | String concat for paths | path.join 7ms → concat 0.7ms (10×) | | | `abs.slice(cwdPrefixLen)` for relPath | path.relative 9ms → slice 0.8ms (11×) | | | Manual `lastIndexOf`-based extname | 9ms → 7ms (25%) | | ### Perf results (synthetic/large, 10k files, p50) | Op | Before | After | Δ | |---|---:|---:|---:| | `scan.walk` | 411ms | **231ms** | **−44%** | | `scan.walk.noExt` | 572ms | 448ms | **−22%** | | `scan.walk.dsnParity` | 228ms | **138ms** | **−39%** | | `scanCodeForDsns` | 323ms | 304ms | −6% | | `detectAllDsns.cold` | 327ms | 308ms | −6% | | `detectAllDsns.warm` | 27.9ms | 27.0ms | — | | `scan.grepFiles` | 322ms | 316ms | noise | | `scanCodeForFirstDsn` | 2.23ms | 2.05ms | — | Walker ops are 22-44% faster. Downstream ops (grep, DSN scanner) benefit less because their time is dominated by content scanning, not walking — but still show consistent ~6% improvements. ## Why `readdirSync` is safe here The sync vs async tradeoff usually favors async because blocking the event loop is bad in general. But measured per-call cost matters: | Metric | `readdirSync` on 3635 dirs | |---|---| | p50 | 11µs | | p95 | 24µs | | p99 | 35µs | | max | 65µs | 65µs max block is trivial — `setTimeout(0)` latency in Node is ~4ms. Blocking for 65µs never causes noticeable event-loop pauses. And we pay ~60µs of microtask overhead for each async readdir, which wipes out any theoretical fairness benefit. Net: 2-3× faster per-dir on walks with many small directories, which is every realistic CLI workload. If this ever matters for a weird embedded use case, the optimization is trivially reversible — the walker's public API is unchanged. ## mtime parity fix Initial implementation regressed `detectAllDsns.warm` from 28ms → 304ms because `statSync().mtimeMs` is a float (e.g. `1776790602458.1033`) while `Bun.file().lastModified` is already an integer. The DSN cache validator compares floored `sourceMtimes`, so un-floored floats caused cache misses on every warm call. Fixed by flooring explicitly in `tryYieldFile` — matches the same treatment already applied to `onDirectoryVisit`'s dirMtimes. ## Walker v2 design notes - **Sync readdir + async yield.** Readdir blocks briefly (~65µs max), but the surrounding async-generator interface is preserved. Library consumers that insert async work between iterations still get cooperative scheduling. - **Path ops inlined.** `path.join` / `path.relative` / `path.extname` + `toLowerCase` are all replaced with manual string ops in the hot loop. On Windows, `normalizePath` is still applied via `replaceAll(NATIVE_SEP, "/")` — the POSIX fast path uses a cached `POSIX_NATIVE` module constant. - **`WalkContext.cwdPrefixLen`** precomputed once per walk, used to slice relative paths from absolute paths. - **Cached module constants** `NATIVE_SEP` and `POSIX_NATIVE` avoid per-call `path.sep` property lookups in the hot loop. ## What this PR does NOT change - `WalkEntry` shape: preserved identically (absolutePath, relativePath, size, mtime, isBinary, depth). - `WalkOptions` contract: no new options, no semantics changes. - Symlink handling, `followSymlinks`, cycle detection via `visitedInodes`: unchanged. - IgnoreStack + gitignore semantics: unchanged. - `onDirectoryVisit`, `recordMtimes`, `abortSignal`: unchanged. ## Test plan - [x] `bunx tsc --noEmit` — clean - [x] `bun run lint` — clean (1 pre-existing warning in `src/lib/formatters/markdown.ts` unrelated) - [x] `bun test --timeout 15000 test/lib test/commands test/types` — **5640 pass, 0 fail** - [x] `bun test test/isolated` — 138 pass - [x] `bun test test/lib/scan/walker.test.ts` — 34 pass (incl. property tests covering hidden files, symlinks, maxDepth, gitignore interaction) - [x] `bun test test/lib/dsn/code-scanner.test.ts` — 52 pass (incl. dirMtimes / sourceMtimes cache validation) - [x] DSN count correctness verified end-to-end: 4 DSNs found on fixture (matches pre-change count) - [x] `bun run bench --size large --runs 5 --warmup 2` — results in table above 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 78f196a commit 7c4f082

2 files changed

Lines changed: 96 additions & 49 deletions

File tree

0 commit comments

Comments
 (0)