
perf: replace scope parent-chain lookup with scoped symbol table#14335

Merged
hardfist merged 1 commit into main from perf/ds-algo-20260610 on Jun 10, 2026
hardfist commented Jun 9, 2026

Summary

What changed

ScopeInfoDB in the JavaScript parser previously kept one FxHashMap<Atom, VariableInfoId> per scope and resolved a variable by probing the map of every scope along the parent chain, inserting tombstones into the starting scope as a negative cache on misses.
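
For illustration, the previous lookup strategy can be sketched roughly as follows. This is a simplified model, not the actual rspack code: `ChainedScopes`, `Entry`, the `probes` counter, and the use of `String`/`u32` in place of `Atom`/`VariableInfoId` (and std `HashMap` in place of `FxHashMap`) are all assumptions made to keep the example self-contained.

```rust
use std::collections::HashMap;

// Hypothetical stand-ins for the real scope and variable ids.
type ScopeId = usize;
type VarId = u32;

#[derive(Clone, Copy)]
enum Entry {
    Bound(VarId),
    Tombstone, // negative-cache marker written after a failed lookup
}

struct Scope {
    parent: Option<ScopeId>,
    vars: HashMap<String, Entry>,
}

#[derive(Default)]
struct ChainedScopes {
    scopes: Vec<Scope>,
    probes: usize, // counts hash probes, for illustration only
}

impl ChainedScopes {
    fn new_scope(&mut self, parent: Option<ScopeId>) -> ScopeId {
        self.scopes.push(Scope { parent, vars: HashMap::new() });
        self.scopes.len() - 1
    }

    fn define(&mut self, scope: ScopeId, name: &str, var: VarId) {
        self.scopes[scope].vars.insert(name.to_string(), Entry::Bound(var));
    }

    // One hash probe per scope on the parent chain until a hit or a tombstone.
    fn lookup(&mut self, start: ScopeId, name: &str) -> Option<VarId> {
        let mut current = Some(start);
        while let Some(id) = current {
            self.probes += 1;
            match self.scopes[id].vars.get(name) {
                Some(Entry::Bound(var)) => return Some(*var),
                Some(Entry::Tombstone) => return None,
                None => current = self.scopes[id].parent,
            }
        }
        // Miss on the whole chain: cache the negative result in the starting scope.
        self.scopes[start].vars.insert(name.to_string(), Entry::Tombstone);
        None
    }
}
```

In this model, resolving a name bound in the root scope from a scope nested two levels deep costs three probes, and a first-time miss costs three probes plus one insert.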

Since the parser enters and exits scopes in strict stack order and always reads/writes through the innermost active scope, this PR replaces the per-scope maps with a classic scoped symbol table:

  • a single global FxHashMap<Atom, SmallVec<[Binding; 2]>> mapping each name to a stack of bindings, innermost last
  • lookup is one hash probe (stack.last()) instead of one probe per enclosing scope
  • each scope records the names it binds; exit_scope (called at the three scope-restore sites in walk.rs) pops them
  • negative-cache tombstone insertion on lookup misses is removed entirely, as misses are already O(1)
  • debug_asserts enforce the stack-discipline invariant
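
A minimal sketch of the binding-stack idea described in the list above, under stated assumptions: `ScopedSymbolTable`, `enter_scope`, and `define` are hypothetical names, scope ids are simply nesting depths, and std `HashMap`/`Vec` with `String` keys stand in for `FxHashMap`, `SmallVec<[Binding; 2]>`, and `Atom`. The real `ScopeInfoDB` may differ in every one of these details.

```rust
use std::collections::HashMap;

// Hypothetical stand-ins for the real scope and variable ids.
type ScopeId = usize;
type VarId = u32;

struct Binding {
    scope: ScopeId,
    var: VarId,
}

#[derive(Default)]
struct ScopedSymbolTable {
    // name -> stack of bindings, innermost last
    bindings: HashMap<String, Vec<Binding>>,
    // names bound by each active scope, innermost scope last
    defined: Vec<Vec<String>>,
}

impl ScopedSymbolTable {
    fn enter_scope(&mut self) -> ScopeId {
        self.defined.push(Vec::new());
        self.defined.len() - 1
    }

    fn define(&mut self, name: &str, var: VarId) {
        let scope = self.defined.len().checked_sub(1).expect("no active scope");
        let stack = self.bindings.entry(name.to_string()).or_default();
        let needs_push = stack.last().map_or(true, |top| top.scope != scope);
        if needs_push {
            // First binding of this name in the current scope: shadow outer ones.
            stack.push(Binding { scope, var });
            self.defined[scope].push(name.to_string());
        } else {
            // Rebinding within the same scope overwrites the top entry.
            stack.last_mut().unwrap().var = var;
        }
    }

    // A single hash probe, independent of nesting depth.
    fn lookup(&self, name: &str) -> Option<VarId> {
        self.bindings.get(name)?.last().map(|b| b.var)
    }

    fn exit_scope(&mut self) {
        let scope = self.defined.len().checked_sub(1).expect("no active scope");
        for name in self.defined.pop().unwrap() {
            let stack = self.bindings.get_mut(&name).expect("recorded name has a stack");
            // Stack discipline: the top binding must belong to the exiting scope.
            debug_assert_eq!(stack.last().map(|b| b.scope), Some(scope));
            stack.pop();
        }
    }
}
```

Defining `x` in an outer scope, shadowing it in an inner scope, and then exiting the inner scope makes `lookup("x")` return the outer binding again. A miss is a single probe that finds either no entry or an empty stack, which is why no negative cache is needed in this model.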

Why it should improve performance

Variable resolution is on the parse hot path and scales with identifier occurrences (dependency-level cardinality). On the cases/all benchmark, callgrind showed ~2.9M lookups performing 8.2M per-scope hash probes plus tombstone-insert churn, ~2.1% of all executed instructions. With the binding-stack table each lookup costs a single probe regardless of scope depth.
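
As a quick sanity check on the figures above, the average probe count and the number of probes a one-probe lookup would avoid can be computed directly. The helper names are hypothetical, and the inputs are the rounded values quoted in this description, so the results are approximate.

```rust
/// Average number of per-scope hash probes behind each variable lookup.
fn probes_per_lookup(probes: f64, lookups: f64) -> f64 {
    probes / lookups
}

/// Probes avoided if every lookup costs exactly one probe.
fn probes_saved(probes: f64, lookups: f64) -> f64 {
    probes - lookups
}
```

With 8.2M probes over 2.9M lookups this gives about 2.83 probes per lookup, in line with the ~2.8 average quoted in the commit message, and roughly 5.3M probes avoided.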

Measured effect (callgrind, cases/all)

  • total executed instructions: 39.15B -> 38.46B (-1.75%)
  • scope lookup cost: ~824M Ir (2.1%) -> ~500M Ir (1.3%); the former top hotspot HashMap<Atom, VariableInfoId>::get no longer appears in the profile

Related links

Checklist

  • Tests updated (or not required).
  • Documentation updated (or not required).

Made with Cursor

…l table

The parser resolves variables by probing one hash map per scope while
walking the scope parent chain, averaging ~2.8 probes per lookup plus
tombstone-insert churn for negative caching. Since scopes are entered
and exited in strict stack order and all reads/writes go through the
innermost scope, a single name -> binding-stack table resolves each
lookup with one hash probe.

Co-authored-by: Cursor <cursoragent@cursor.com>
github-actions Bot commented Jun 9, 2026

📝 Benchmark detail: Open

| Name | Base (b04d9f4) | Current | Change |
| --- | --- | --- | --- |
| 10000_big_production-mode_disable-minimize + exec | 11.9 s ± 95 ms | 11.9 s ± 221 ms | +0.11 % |
| 10000_development-mode + exec | 855 ms ± 25 ms | 840 ms ± 10 ms | -1.80 % |
| 10000_development-mode_hmr + stats | 150 ms ± 1.8 ms | 151 ms ± 11 ms | +0.64 % |
| 10000_development-mode_noop-loader + exec | 1.91 s ± 13 ms | 1.89 s ± 46 ms | -0.84 % |
| 10000_production-mode + exec | 969 ms ± 47 ms | 934 ms ± 19 ms | -3.69 % |
| 10000_production-mode_persistent-cold + exec | 1.11 s ± 20 ms | 1.1 s ± 27 ms | -0.74 % |
| 10000_production-mode_persistent-hot + exec | 605 ms ± 21 ms | 594 ms ± 12 ms | -1.82 % |
| 10000_production-mode_source-map + exec | 1.12 s ± 38 ms | 1.09 s ± 17 ms | -1.97 % |
| arco-pro_development-mode + exec | 1.27 s ± 100 ms | 1.26 s ± 110 ms | -0.49 % |
| arco-pro_development-mode_hmr + stats | 31 ms ± 1.6 ms | 31 ms ± 0.32 ms | -1.04 % |
| arco-pro_production-mode + exec | 2.29 s ± 114 ms | 2.27 s ± 74 ms | -1.06 % |
| arco-pro_production-mode_generate-package-json-webpack-plugin + exec | 2.36 s ± 74 ms | 2.33 s ± 126 ms | -1.38 % |
| arco-pro_production-mode_persistent-cold + exec | 2.41 s ± 39 ms | 2.33 s ± 66 ms | -3.46 % |
| arco-pro_production-mode_persistent-hot + exec | 342 ms ± 3 ms | 342 ms ± 2.5 ms | +0.23 % |
| arco-pro_production-mode_source-map + exec | 2.83 s ± 228 ms | 2.73 s ± 108 ms | -3.57 % |
| arco-pro_production-mode_traverse-chunk-modules + exec | 2.35 s ± 59 ms | 2.3 s ± 157 ms | -2.50 % |
| bundled-threejs_development-mode + exec | 174 ms ± 9.3 ms | 168 ms ± 2.4 ms | -3.45 % |
| bundled-threejs_production-mode + exec | 200 ms ± 4.7 ms | 195 ms ± 1.9 ms | -2.60 % |
| large-dyn-imports_development-mode + exec | 1.12 s ± 29 ms | 1.12 s ± 28 ms | -0.48 % |
| large-dyn-imports_production-mode + exec | 1.19 s ± 18 ms | 1.18 s ± 35 ms | -1.08 % |
| threejs_development-mode_10x + exec | 815 ms ± 51 ms | 804 ms ± 29 ms | -1.30 % |
| threejs_development-mode_10x_hmr + stats | 106 ms ± 5.3 ms | 105 ms ± 2.2 ms | -1.07 % |
| threejs_production-mode_10x + exec | 2.82 s ± 120 ms | 2.78 s ± 105 ms | -1.54 % |
| threejs_production-mode_10x_persistent-cold + exec | 2.96 s ± 122 ms | 2.91 s ± 183 ms | -1.49 % |
| threejs_production-mode_10x_persistent-hot + exec | 370 ms ± 12 ms | 366 ms ± 12 ms | -1.19 % |
| threejs_production-mode_10x_source-map + exec | 3.56 s ± 30 ms | 3.52 s ± 37 ms | -1.29 % |
| 10000_big_production-mode_disable-minimize + rss memory | 2156 MiB ± 34 MiB | 2156 MiB ± 25.4 MiB | +0.04 % |
| 10000_development-mode + rss memory | 548 MiB ± 7.14 MiB | 540 MiB ± 3.29 MiB | -1.46 % |
| 10000_development-mode_hmr + rss memory | 790 MiB ± 9.54 MiB | 776 MiB ± 26.8 MiB | -1.72 % |
| 10000_development-mode_noop-loader + rss memory | 841 MiB ± 10.8 MiB | 841 MiB ± 22.7 MiB | +0.01 % |
| 10000_production-mode + rss memory | 488 MiB ± 11.5 MiB | 482 MiB ± 5.41 MiB | -1.13 % |
| 10000_production-mode_persistent-cold + rss memory | 682 MiB ± 8.04 MiB | 680 MiB ± 13.9 MiB | -0.33 % |
| 10000_production-mode_persistent-hot + rss memory | 682 MiB ± 33.9 MiB | 676 MiB ± 30.3 MiB | -0.86 % |
| 10000_production-mode_source-map + rss memory | 517 MiB ± 59 MiB | 521 MiB ± 9.98 MiB | +0.85 % |
| arco-pro_development-mode + rss memory | 468 MiB ± 4.16 MiB | 460 MiB ± 16.5 MiB | -1.71 % |
| arco-pro_development-mode_hmr + rss memory | 484 MiB ± 14.7 MiB | 477 MiB ± 17.1 MiB | -1.49 % |
| arco-pro_production-mode + rss memory | 644 MiB ± 13.5 MiB | 615 MiB ± 62.4 MiB | -4.42 % |
| arco-pro_production-mode_generate-package-json-webpack-plugin + rss memory | 654 MiB ± 14.4 MiB | 646 MiB ± 14.3 MiB | -1.32 % |
| arco-pro_production-mode_persistent-cold + rss memory | 728 MiB ± 44.5 MiB | 720 MiB ± 20.1 MiB | -1.05 % |
| arco-pro_production-mode_persistent-hot + rss memory | 360 MiB ± 7.27 MiB | 348 MiB ± 14.2 MiB | -3.25 % |
| arco-pro_production-mode_source-map + rss memory | 730 MiB ± 25.8 MiB | 732 MiB ± 13.5 MiB | +0.24 % |
| arco-pro_production-mode_traverse-chunk-modules + rss memory | 636 MiB ± 58.3 MiB | 630 MiB ± 59.5 MiB | -0.88 % |
| bundled-threejs_development-mode + rss memory | 147 MiB ± 20 MiB | 154 MiB ± 3.72 MiB | +4.97 % |
| bundled-threejs_production-mode + rss memory | 172 MiB ± 3.75 MiB | 173 MiB ± 2.05 MiB | +0.27 % |
| large-dyn-imports_development-mode + rss memory | 565 MiB ± 7.96 MiB | 563 MiB ± 5.01 MiB | -0.46 % |
| large-dyn-imports_production-mode + rss memory | 433 MiB ± 7.56 MiB | 430 MiB ± 11.5 MiB | -0.83 % |
| threejs_development-mode_10x + rss memory | 514 MiB ± 14.9 MiB | 493 MiB ± 9.62 MiB | -4.14 % |
| threejs_development-mode_10x_hmr + rss memory | 655 MiB ± 21.6 MiB | 635 MiB ± 19.5 MiB | -2.94 % |
| threejs_production-mode_10x + rss memory | 673 MiB ± 18.4 MiB | 655 MiB ± 15.8 MiB | -2.71 % |
| threejs_production-mode_10x_persistent-cold + rss memory | 623 MiB ± 9.27 MiB | 619 MiB ± 7.85 MiB | -0.68 % |
| threejs_production-mode_10x_persistent-hot + rss memory | 427 MiB ± 15.6 MiB | 412 MiB ± 7.95 MiB | -3.48 % |
| threejs_production-mode_10x_source-map + rss memory | 783 MiB ± 25.3 MiB | 760 MiB ± 47.1 MiB | -2.88 % |

Base persistent cache hit rate: 👍

Current persistent cache hit rate: 👍

github-actions Bot commented Jun 9, 2026

📦 Binary Size-limit

Comparing e650c3c to feat: support rspack magic comment prefix (#14323) by AsyncIter

❌ Size increased by 4.00KB from 62.60MB to 62.60MB (⬆️0.01%)

github-actions Bot commented Jun 9, 2026

Rsdoctor Bundle Diff Analysis

Found 5 projects in monorepo, 0 projects with changes.

📊 Quick Summary
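| Project | Total Size | Gzip Size | Change | Gzip Change |
| --- | --- | --- | --- | --- |
| popular-libs | 1.7 MB | 551.3 KB | 0 | 0 |
| react-10k | 5.6 MB | 1.3 MB | 0 | 0 |
| react-1k | 822.8 KB | 218.3 KB | 0 | 0 |
| react-5k | 2.7 MB | 669.1 KB | 0 | 0 |
| ui-components | 4.8 MB | 1.4 MB | 0 | 0 |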

Generated by Rsdoctor GitHub Action

codspeed-hq Bot commented Jun 9, 2026

Merging this PR will improve performance by 6.77%

⚠️ Unknown Walltime execution environment detected

Using the Walltime instrument on standard Hosted Runners will lead to inconsistent data.

For the most accurate results, we recommend using CodSpeed Macro Runners: bare-metal machines fine-tuned for performance measurement consistency.

⚠️ Different runtime environments detected

Some benchmarks with significant performance changes were compared across different runtime environments,
which may affect the accuracy of the results.

Open the report in CodSpeed to investigate

⚡ 1 improved benchmark
✅ 51 untouched benchmarks
⏩ 40 skipped benchmarks [1]

Performance Changes

| Mode | Benchmark | BASE | HEAD | Efficiency |
| --- | --- | --- | --- | --- |
| Simulation | rust@scan_dependencies@three_module | 23.8 ms | 22.3 ms | +6.77% |



Comparing perf/ds-algo-20260610 (e650c3c) with main (b04d9f4)


Footnotes

  1. 40 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

hardfist commented Jun 9, 2026

@codspeedbot explain why this is faster

codspeed-hq Bot commented Jun 9, 2026

@codspeedbot explain why this is faster

Why scan_dependencies@three_module got faster (+6.77%, 23.8 ms → 22.3 ms)

The speedup comes from changing the data structure used for variable resolution in the JavaScript parser, not from doing less work overall — the parser still visits every identifier, it just resolves each one far more cheaply.

What was slow before

The old ScopeInfoDB kept one HashMap<Atom, VariableInfoId> per scope. Resolving a name meant walking the scope parent chain and doing a hash probe in each enclosing scope until a hit — averaging ~2.8 probes per lookup — and on a miss it wrote a tombstone back into the starting scope as a negative cache. On cases/all, callgrind attributed ~824M instructions (~2.1% of the whole run) to this, with HashMap<Atom, VariableInfoId>::get as the single top hotspot. Variable resolution is on the parse hot path and scales with identifier count, so this cost grew with module size and scope nesting.

What the PR changed

It replaces the per-scope maps with a classic scoped symbol table: one global FxHashMap<Atom, SmallVec<[Binding; 2]>> mapping each name to a stack of bindings (innermost last). Because the parser enters/exits scopes in strict stack order and always resolves through the innermost scope:

  • Lookup is one hash probe (stack.last()) regardless of how deeply nested the scope is, instead of one probe per enclosing scope.
  • Misses are O(1), so the tombstone negative-cache writes are removed entirely.
  • Each scope records the names it binds, and exit_scope simply pops them off the stacks.

What the profile confirms

In the HEAD flamegraph for this benchmark, the former top hotspot HashMap<Atom, VariableInfoId>::get no longer appears at all. Instead you see the new, much cheaper ScopeInfoDB::exit_scope doing the binding-stack pops, and the remaining hot self-time is now in the unavoidable parser machinery (walk_statement, pre_walk_statement) and memchr work inside dependency-location advancing — not scope lookup. On the broader cases/all measurement this shows up as total executed instructions dropping 39.15B → 38.46B (-1.75%) and scope-lookup cost falling from ~2.1% to ~1.3% of instructions.

One caveat on the number itself

The reported 6.77% is a walltime measurement on a standard (non-macro) hosted runner, and CodSpeed flagged both an unknown walltime environment and a runtime-environment difference for this comparison — so treat the exact percentage as noisier than usual. The deterministic instruction-count reduction from the callgrind/simulation data is the more reliable signal that this is a genuine improvement, and the direction matches: fewer hash probes per lookup plus no tombstone churn.


@hardfist hardfist marked this pull request as ready for review June 10, 2026 02:40
Copilot AI review requested due to automatic review settings June 10, 2026 02:40
@hardfist hardfist merged commit a954cf2 into main Jun 10, 2026
39 checks passed
@hardfist hardfist deleted the perf/ds-algo-20260610 branch June 10, 2026 02:40

Copilot AI left a comment

Pull request overview

This PR optimizes variable resolution in the JavaScript parser by replacing per-scope hash maps + parent-chain probing with a single scoped symbol table (name → stack of bindings). This reduces variable lookup to one hash probe regardless of scope depth and removes the previous negative-cache tombstone insertion on misses.

Changes:

  • Reworked ScopeInfoDB to maintain a global bindings map from identifier → binding stack, plus per-scope defined tracking and a new exit_scope unwinding API.
  • Updated parser scope entry/exit sites to call exit_scope when restoring previous scopes.
  • Switched “current scope variables” enumeration to a new scope_variables iterator and added unit tests covering shadowing, deletion masking, and enumeration behavior.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| crates/rspack_plugin_javascript/src/visitors/scope_info.rs | Implements the binding-stack scoped symbol table, adds scope unwind (exit_scope), and adds unit tests. |
| crates/rspack_plugin_javascript/src/visitors/dependency/parser/walk.rs | Hooks scope unwinding into the three scope-restore paths to keep the binding stacks consistent. |
| crates/rspack_plugin_javascript/src/visitors/dependency/parser/mod.rs | Updates current-scope variable enumeration to use ScopeInfoDB::scope_variables. |


Comment on lines +134 to +141
    for key in &defined {
        if let Some(stack) = self.bindings.get_mut(key)
            && let Some(top) = stack.last()
            && top.scope == id
        {
            stack.pop();
        }
    }
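
To make the guard in the excerpt above easier to follow, here is a standalone sketch of the same unwinding loop. It assumes `bindings` maps each name to a stack of `Binding { scope, var }` values and `defined` lists the names the exiting scope bound; the free function `unwind`, the `u32` ids, and the `String` keys are hypothetical simplifications. Nested `if` is used instead of a let-chain so the sketch also compiles on toolchains without let-chain support.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for the real binding record.
struct Binding {
    scope: u32,
    var: u32,
}

// Pops the top binding of each name in `defined`, but only when that
// binding was created by the scope `id` that is being exited.
fn unwind(bindings: &mut HashMap<String, Vec<Binding>>, defined: &[String], id: u32) {
    for key in defined {
        if let Some(stack) = bindings.get_mut(key) {
            if stack.last().is_some_and(|top| top.scope == id) {
                stack.pop();
            }
        }
    }
}
```

In this sketch a name whose top binding belongs to a different scope is left untouched, so a stale or duplicated entry in `defined` cannot pop an outer scope's binding.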