Commit b053bbe

perf(file): Fast-path the ignore-file predicate without per-path resolution
The isIgnored() predicate returned by buildIgnoreFileFilter routed every tested path through createIgnoreMatcher, which performs ~5 path operations per call (resolve, normalize, relative, isInsidePath check, slash conversion) before reaching the `ignore` matcher. Applied to every file and directory the scans emit (~1,400 paths per search on this repo, plus the empty-directory scans), this dominated the post-scan filter cost.

fast-glob only ever emits clean, slash-separated paths relative to the scan root. For those, the baseDir-relative form the `ignore` package expects is a constant prefix (the scan root's path below the git root; empty when they coincide) plus the input string, so the fast path is now one string concatenation + ig.ignores(). A single regex, together with an empty-string check and nodePath.isAbsolute, routes every other input shape ('', '.'/'..' segments, backslashes, doubled or trailing slashes, absolute paths) to the unchanged legacy matcher, so edge-case semantics stay exactly as before.

decision(gitignore-filter): keep createIgnoreMatcher as the fallback for non-fast-glob input shapes instead of replicating its normalization inline. This gives equivalence by construction, and the fallback never runs on the hot path.

rejected(double-sort): skipping the second sortOutputFiles inside generateOutput. Re-measured at ~0.12ms isolated; the change-7 round's 22ms figure was the git-log subprocess cost already eliminated by prefetchSortData.

rejected(xml-direct-builder): direct string builder replacing the Handlebars xml render. Noise-level on the current tip (t=-0.14 over 20 interleaved pairs); earlier ~19ms estimates came from builds without the landed lazy render-context getters.

rejected(md5-precompute): computing contentCacheKey during processFiles to clear it from the tail. +7.6ms (noise, t=0.86) over 20 interleaved pairs on the current tip.

learned(bench): this container runs ~1.6-1.7x slower than the previous rounds' quiet host (e2e baseline ~1615ms vs ~800-950ms); relative deltas from interleaved pairs are the comparable metric.

Benchmark (32 interleaved ABBA pairs, warm, default pack of this repo, 4-core Linux): e2e median 1615ms -> 1540ms, paired mean delta -82.0ms (-5.1%), median delta -74ms (-4.6%), t = -7.68, 29/32 pairs improved. Search phase ([globby] trace, --verbose): 749-778ms -> 643-684ms with identical results (1099 files, 255 directories).

Output byte-identical (cmp) vs the previous build for: default pack, subdirectory pack (website/client, which exercises the git-root prefix branch), multi-root (src website), and --no-gitignore. 1416/1416 tests pass; lint clean (3 pre-existing warnings in unrelated files).

https://claude.ai/code/session_01N3uqykUShsrDKkyvjuKi13
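The "constant prefix plus the input string" claim can be checked in isolation. The sketch below is not code from this commit: `legacyBaseRelative` is a hypothetical stand-in for the per-path resolve/normalize/relative work the message attributes to createIgnoreMatcher, and `makeFastBaseRelative` is a hypothetical helper that computes the prefix once. It assumes the scan root lies at or below the base directory, and the directory names are made up.

```typescript
import * as nodePath from 'node:path';

// Hypothetical stand-in for the legacy per-path work: resolve the input
// against the scan root, then express it relative to the base directory.
const legacyBaseRelative = (baseDir: string, cwd: string, relativePath: string): string => {
  const absolutePath = nodePath.resolve(cwd, nodePath.normalize(relativePath));
  return nodePath.relative(nodePath.resolve(baseDir), absolutePath).replace(/\\/g, '/');
};

// Hypothetical fast-path helper: the prefix is computed once per filter,
// so each tested path costs a single string concatenation.
const makeFastBaseRelative = (baseDir: string, cwd: string): ((relativePath: string) => string) => {
  const cwdFromBase = nodePath
    .relative(nodePath.resolve(baseDir), nodePath.resolve(cwd))
    .replace(/\\/g, '/');
  const prefix = cwdFromBase ? `${cwdFromBase}/` : '';
  return (relativePath) => prefix + relativePath;
};

// Assumed layout: git root at /repo, scan root at /repo/website/client.
const fast = makeFastBaseRelative('/repo', '/repo/website/client');
const sample = 'src/a.ts';
console.log(fast(sample) === legacyBaseRelative('/repo', '/repo/website/client', sample));
```

For clean, slash-separated relative inputs the two forms agree, which is why the per-path calls can be dropped on that branch. Inputs with dot segments or doubled slashes would diverge, which is the reason the commit keeps the legacy matcher for them.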
1 parent 632bf8f commit b053bbe

2 files changed

Lines changed: 63 additions & 3 deletions

src/core/file/gitignoreFilter.ts

Lines changed: 31 additions & 3 deletions
@@ -21,6 +21,7 @@ import fs from 'node:fs/promises';
 import nodePath from 'node:path';
 import fastGlob from 'fast-glob';
 import type { Options as GlobbyOptions } from 'globby';
+import ignore from 'ignore';
 import {
   convertPatternsForFastGlob,
   createIgnoreMatcher,
@@ -202,13 +203,40 @@ export const buildIgnoreFileFilter = async (
   const resolvedCwd = nodePath.resolve(cwd);
   const patternsForFastGlob = convertPatternsForFastGlob(patterns, usingGitRoot);
 
+  // Fast path for the strings fast-glob actually emits: clean, slash-separated
+  // paths relative to the scan root. For those, the baseDir-relative form the
+  // `ignore` package expects is a constant prefix (scan root's path below the
+  // git root, empty when they coincide) plus the input — no per-path
+  // resolve/normalize/relative calls. Routing every tested path through
+  // `createIgnoreMatcher` instead cost ~5 path operations per call, which
+  // dominated the post-scan filter (~15ms per search on a ~1,100-file repo).
+  const ig = ignore().add(patterns);
+  const cwdFromBase = nodePath.relative(nodePath.resolve(baseDir), resolvedCwd).replace(/\\/g, '/');
+  const baseRelativePrefix = cwdFromBase ? `${cwdFromBase}/` : '';
+
+  // Inputs the fast path cannot take: empty, `.`/`..` segments, backslashes,
+  // doubled or trailing slashes (all of which path normalization would have
+  // rewritten), plus absolute paths. fast-glob never produces these, but the
+  // legacy matcher resolves them exactly as globby did, so fall back for them.
+  const needsLegacyResolution = /(?:^|\/)\.\.?(?:\/|$)|\\|\/\/|\/$/;
+
   return {
     isIgnored: (relativePath: string, isDirectory: boolean): boolean => {
-      const absolutePath = nodePath.resolve(resolvedCwd, nodePath.normalize(relativePath));
-      if (matcher(absolutePath).ignored) {
+      if (relativePath === '' || needsLegacyResolution.test(relativePath) || nodePath.isAbsolute(relativePath)) {
+        const absolutePath = nodePath.resolve(resolvedCwd, nodePath.normalize(relativePath));
+        if (matcher(absolutePath).ignored) {
+          return true;
+        }
+        return isDirectory && matcher(absolutePath + nodePath.sep).ignored;
+      }
+
+      const baseRelativePath = baseRelativePrefix + relativePath;
+      if (ig.ignores(baseRelativePath)) {
         return true;
       }
-      return isDirectory && matcher(absolutePath + nodePath.sep).ignored;
+      // Directory-only rules (`dir/`) only match when the tested path carries a
+      // trailing separator, mirroring the legacy matcher's second test.
+      return isDirectory && ig.ignores(`${baseRelativePath}/`);
     },
     patternsForFastGlob,
   };
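To make the routing decision in the hunk above easier to inspect, here is a minimal standalone sketch. The regex is copied from the diff; `takesFastPath` is a hypothetical helper name introduced only for illustration and does not exist in the commit. It reproduces only the branch condition, not the matching itself.

```typescript
import * as nodePath from 'node:path';

// Pattern copied from the diff: `.`/`..` segments, backslashes,
// doubled slashes, or a trailing slash.
const needsLegacyResolution = /(?:^|\/)\.\.?(?:\/|$)|\\|\/\/|\/$/;

// Hypothetical helper: true when an input would take the fast path,
// i.e. the negation of the guard at the top of isIgnored in the diff.
const takesFastPath = (relativePath: string): boolean =>
  relativePath !== '' &&
  !needsLegacyResolution.test(relativePath) &&
  !nodePath.isAbsolute(relativePath);

// Clean, slash-separated relative paths (the shape the commit says
// fast-glob emits) stay on the fast path, dotfiles included.
const cleanInputs = ['src/index.ts', '.gitignore', 'a/.hidden/b.txt'];

// Every shape listed in the diff's comment falls back to the legacy matcher.
const legacyInputs = ['', '.', '..', './a.log', 'sub/../a.log', 'sub//deep/a.log', 'build/', 'a\\b', '/abs/a.log'];

for (const input of [...cleanInputs, ...legacyInputs]) {
  console.log(JSON.stringify(input), takesFastPath(input) ? 'fast' : 'legacy');
}
```

Note that a leading dot alone does not trigger the fallback: the pattern only matches `.` or `..` when it forms a whole path segment, so names such as `.gitignore` are still handled by the concatenation branch.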

tests/core/file/gitignoreFilter.test.ts

Lines changed: 32 additions & 0 deletions
@@ -96,6 +96,38 @@ describe('buildIgnoreFileFilter', () => {
     expect(filter.isIgnored('excluded/hidden.txt', false)).toBe(false);
   });
 
+  describe('inputs needing legacy path resolution (never produced by fast-glob)', () => {
+    test('normalizes dot segments, doubled and trailing slashes before matching', async () => {
+      await write('.gitignore', '*.log\nbuild/\n');
+
+      const filter = await buildIgnoreFileFilter(tempDir, true, IGNORE_FILE_PATTERNS, [], undefined);
+      // Equivalent to the clean forms after path normalization.
+      expect(filter.isIgnored('./a.log', false)).toBe(true);
+      expect(filter.isIgnored('sub/../a.log', false)).toBe(true);
+      expect(filter.isIgnored('sub//deep/a.log', false)).toBe(true);
+      expect(filter.isIgnored('build/', true)).toBe(true);
+      // Negative control through the same fallback branch.
+      expect(filter.isIgnored('./a.txt', false)).toBe(false);
+    });
+
+    test('never ignores the scan root itself or paths outside the base directory', async () => {
+      await write('.gitignore', '*\n');
+
+      const filter = await buildIgnoreFileFilter(tempDir, true, IGNORE_FILE_PATTERNS, [], undefined);
+      expect(filter.isIgnored('', false)).toBe(false);
+      expect(filter.isIgnored('.', true)).toBe(false);
+      expect(filter.isIgnored('../outside.txt', false)).toBe(false);
+    });
+
+    test('resolves absolute inputs against the base directory', async () => {
+      await write('.gitignore', '*.log\n');
+
+      const filter = await buildIgnoreFileFilter(tempDir, true, IGNORE_FILE_PATTERNS, [], undefined);
+      expect(filter.isIgnored(path.join(tempDir, 'a.log'), false)).toBe(true);
+      expect(filter.isIgnored(path.join(tempDir, 'a.txt'), false)).toBe(false);
+    });
+  });
+
   describe('with a git root above the scan root', () => {
     test('collects parent .gitignore files and anchors them at the git root', async () => {
       await fs.mkdir(path.join(tempDir, '.git'), { recursive: true });
