perf(watcher): precompile ignored globs into one subtree-matching regex#14329
perf(watcher): precompile ignored globs into one subtree-matching regex#14329stormslowly wants to merge 5 commits into
Conversation
Rsdoctor Bundle Diff AnalysisFound 5 projects in monorepo, 5 projects with changes. 📊 Quick Summary
📋 Detailed Reports (Click to expand)📁 popular-libsPath:
📁 react-10kPath:
📁 react-5kPath:
📁 ui-componentsPath:
📁 react-1kPath:
Generated by Rsdoctor GitHub Action |
📦 Binary Size-limit
❌ Size increased by 4.00KB from 62.60MB to 62.60MB (⬆️0.01%) |
Merging this PR will improve performance by 11.18%
|
| Mode | Benchmark | BASE |
HEAD |
Efficiency | |
|---|---|---|---|---|---|
| ❌ | Simulation | js@collect imported identifiers |
222.8 µs | 229.1 µs | -2.74% |
| ⚡ | WallTime | bundle@threejs-10x-development |
308.7 ms | 242.9 ms | +27.09% |
Tip
Investigate this regression by commenting @codspeedbot fix this regression on this PR, or directly use the CodSpeed MCP with your agent.
Comparing perf/watcher-ignored-precompiled-regex (6edaf63) with main (1fd4fca)
Footnotes
-
40 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports. ↩
On the macOS/Windows recursive-root watch the OS delivers an event for every path under the project — mostly deep node_modules / build-output that nobody registered — so the per-event ignored filter must be cheap. Translate each ignored glob to a regex ending in `(?:$|/)` (watchpack's trick) so a directory match also covers its subtree, fold all patterns into one precompiled `regex::Regex`, and classify each event with a single `is_match` — O(1) per event instead of O(depth x patterns) ancestor walking. Applied at the event ingress (`Trigger::on_event`) and at registration, which absolutizes first so it filters the same absolute paths events report. Benched in `benches/ignored_filter.rs`: ~20-46x faster per event, ~30x on a 500-event burst.
7646162 to
1706471
Compare
There was a problem hiding this comment.
Pull request overview
This PR improves rspack_watcher’s ignore filtering for recursive-root watch mode by precompiling ignored glob patterns into a single subtree-matching regex and applying it both at path registration and at event ingress, preventing spurious rebuild triggers from ignored directories (e.g. deep node_modules / build output).
Changes:
- Add
IgnoredMatcherthat rewrites ignored globs to subtree-matching regex alternatives and precompiles them into oneregex::Regex. - Filter ignored paths at event ingress (
Trigger::on_event) and during registration (PathUpdater::update) using the same absolute path basis. - Add a Criterion benchmark (
benches/ignored_filter.rs) and wire it up inCargo.toml.
Reviewed changes
Copilot reviewed 6 out of 7 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| crates/rspack_watcher/src/trigger.rs | Drops ignored-subtree events before association bubbling to prevent spurious rebuilds. |
| crates/rspack_watcher/src/paths.rs | Uses IgnoredMatcher during registration; adds PathManager::is_ignored_path. |
| crates/rspack_watcher/src/lib.rs | Re-exports IgnoredMatcher as doc-hidden for benchmarking. |
| crates/rspack_watcher/src/ignored.rs | Implements glob→subtree-regex translation and the new precompiled matcher + tests. |
| crates/rspack_watcher/Cargo.toml | Adds regex dep, Criterion dev-dep, and registers the benchmark. |
| crates/rspack_watcher/benches/ignored_filter.rs | New benchmark comparing ancestor-walk glob matching vs precompiled regex. |
| Cargo.lock | Updates lockfile for new dependencies. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
…hen no patterns Address review on #14329: - `IgnoredMatcher::new` no longer swallows a regex compile error via `.ok()`. Glob escaping guarantees valid syntax so the only realistic failure is the regex size limit on a pathological config; degrade to "no glob filtering" (events still flow, no missed changes) but log it instead of disabling ignores silently. - `is_ignored` short-circuits before normalizing the path when there are no patterns, removing per-event overhead in the common no-ignores config.
Removes the bench, its criterion dev-dependency and `[[bench]]` entry, and the `doc(hidden)` `IgnoredMatcher` re-export that existed only so the bench could reach it. Not needed for now; the production `regex` dependency stays.
Why
On the macOS/Windows recursive-root watch the OS delivers an event for every path under the project — mostly deep
node_modules/ build-output that nobody registered. An event inside an ignored directory isn't a registered file, sofind_associated_eventbubbles it up to a registered parent and emits a[dir](Change)that looks like a real source change — a spurious rebuild that flakes watch-mode tests. The per-event filter must be cheap.ignoredwas also only checked at registration, so these events reachedTrigger::on_eventunfiltered.What
Do what watchpack does: translate each ignored glob to a regex ending in
(?:$|/)so a directory match also covers its subtree, fold all patterns into one precompiledregex::Regex, and classify each event with a singleis_match—O(1)per event, no ancestor walk. Applied at the event ingress and at registration (which absolutizes first so it filters the same absolute paths events report). Compiled once per watcher, reused across the rebuild loop.Perf
benches/ignored_filter.rs: ~20–46× faster per event, ~30× on a 500-event burst. Matching normalizes\→/, so it is separator-agnostic on Windows (portable test included).