Commit 7c4f082
perf(scan): rewrite walker hot path with sync I/O + manual string ops (#805)
## Summary
Walker was 411ms p50 for 10k files on the synthetic/large fixture —
roughly 10× slower than a naive sync walk (50ms). Follow-up to PR #791
and #804.
Five targeted optimizations in the hot loop, all micro-benchmarked
before landing:
| Change | Measured win | End-to-end win |
|---|---|---|
| `readdirSync` instead of `readdir` | 105→45ms (−60ms) | dominant |
| `statSync` instead of `Bun.file().size` | −10ms per 10k files | |
| String concat for paths | path.join 7ms → concat 0.7ms (10×) | |
| `abs.slice(cwdPrefixLen)` for relPath | path.relative 9ms → slice
0.8ms (11×) | |
| Manual `lastIndexOf`-based extname | 9ms → 7ms (25%) | |
### Perf results (synthetic/large, 10k files, p50)
| Op | Before | After | Δ |
|---|---:|---:|---:|
| `scan.walk` | 411ms | **231ms** | **−44%** |
| `scan.walk.noExt` | 572ms | 448ms | **−22%** |
| `scan.walk.dsnParity` | 228ms | **138ms** | **−39%** |
| `scanCodeForDsns` | 323ms | 304ms | −6% |
| `detectAllDsns.cold` | 327ms | 308ms | −6% |
| `detectAllDsns.warm` | 27.9ms | 27.0ms | — |
| `scan.grepFiles` | 322ms | 316ms | noise |
| `scanCodeForFirstDsn` | 2.23ms | 2.05ms | — |
Walker ops are 22-44% faster. Downstream ops (grep, DSN scanner) benefit
less because their time is dominated by content scanning, not walking —
but still show consistent ~6% improvements.
## Why `readdirSync` is safe here
The sync vs async tradeoff usually favors async because blocking the
event loop is bad in general. But measured per-call cost matters:
| Metric | `readdirSync` on 3635 dirs |
|---|---|
| p50 | 11µs |
| p95 | 24µs |
| p99 | 35µs |
| max | 65µs |
65µs max block is trivial — `setTimeout(0)` latency in Node is ~4ms.
Blocking for 65µs never causes noticeable event-loop pauses. And we pay
~60µs of microtask overhead for each async readdir, which wipes out any
theoretical fairness benefit. Net: 2-3× faster per-dir on walks with
many small directories, which is every realistic CLI workload.
If this ever matters for a weird embedded use case, the optimization is
trivially reversible — the walker's public API is unchanged.
## mtime parity fix
Initial implementation regressed `detectAllDsns.warm` from 28ms → 304ms
because `statSync().mtimeMs` is a float (e.g. `1776790602458.1033`)
while `Bun.file().lastModified` is already an integer. The DSN cache
validator compares floored `sourceMtimes`, so un-floored floats caused
cache misses on every warm call. Fixed by flooring explicitly in
`tryYieldFile` — matches the same treatment already applied to
`onDirectoryVisit`'s dirMtimes.
## Walker v2 design notes
- **Sync readdir + async yield.** Readdir blocks briefly (~65µs max),
but the surrounding async-generator interface is preserved. Library
consumers that insert async work between iterations still get
cooperative scheduling.
- **Path ops inlined.** `path.join` / `path.relative` / `path.extname` +
`toLowerCase` are all replaced with manual string ops in the hot loop.
On Windows, `normalizePath` is still applied via `replaceAll(NATIVE_SEP,
"/")` — the POSIX fast path uses a cached `POSIX_NATIVE` module
constant.
- **`WalkContext.cwdPrefixLen`** precomputed once per walk, used to
slice relative paths from absolute paths.
- **Cached module constants** `NATIVE_SEP` and `POSIX_NATIVE` avoid
per-call `path.sep` property lookups in the hot loop.
## What this PR does NOT change
- `WalkEntry` shape: preserved identically (absolutePath, relativePath,
size, mtime, isBinary, depth).
- `WalkOptions` contract: no new options, no semantics changes.
- Symlink handling, `followSymlinks`, cycle detection via
`visitedInodes`: unchanged.
- IgnoreStack + gitignore semantics: unchanged.
- `onDirectoryVisit`, `recordMtimes`, `abortSignal`: unchanged.
## Test plan
- [x] `bunx tsc --noEmit` — clean
- [x] `bun run lint` — clean (1 pre-existing warning in
`src/lib/formatters/markdown.ts` unrelated)
- [x] `bun test --timeout 15000 test/lib test/commands test/types` —
**5640 pass, 0 fail**
- [x] `bun test test/isolated` — 138 pass
- [x] `bun test test/lib/scan/walker.test.ts` — 34 pass (incl. property
tests covering hidden files, symlinks, maxDepth, gitignore interaction)
- [x] `bun test test/lib/dsn/code-scanner.test.ts` — 52 pass (incl.
dirMtimes / sourceMtimes cache validation)
- [x] DSN count correctness verified end-to-end: 4 DSNs found on fixture
(matches pre-change count)
- [x] `bun run bench --size large --runs 5 --warmup 2` — results in
table above
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>1 parent 78f196a commit 7c4f082
2 files changed
Lines changed: 96 additions & 49 deletions
0 commit comments