
Commit 125cd12

Spec 034 — Element allocation reduction (LayoutModifiers / VisualModifiers shim, UseMemoCells, REACTOR_HOOKS_007) (#126)
* spec 034 §A - bucketed ElementModifiers (Component A)

Move 27 layout/visual fields off the parent ElementModifiers record into two lazy sub-records (LayoutModifiers, VisualModifiers). Public surface stays identical via get/init shim properties; Merge merges buckets at the sub-record level so a Padding-only update no longer clones every unrelated field.

- src/Reactor/Core/Element.cs: 17-field LayoutModifiers + 10-field VisualModifiers, parent shims, bucket-aware Merge
- tests/Reactor.Tests: LayoutModifiersTests, VisualModifiersTests, ElementModifiersBucketTests covering Merge semantics, shim ⇄ bucket round-trip, and bucket-reference preservation under partial Merge

All 6,748 unit tests pass (6,727 pre-existing + 21 new).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* spec 034 §B - StressPerf.ReactorOptimized + advanced.md "Hot loops"

Add a sibling stress-perf project that demonstrates the direct-record-initializer cell-construction idiom from spec 034 §B. The naive StressPerf.Reactor stays untouched and remains the framework-level baseline; the new optimized sibling will accumulate Components B + C through Phases 2–4.

- tests/stress_perf/StressPerf.ReactorOptimized: cloned from StressPerf.Reactor, inner-loop replaced with direct new TextBlockElement { Modifiers = new ElementModifiers { Layout = …, Visual = … } } construction. UseMemoCellsByIndex arrives in Phase 4.
- Reactor.sln: project entry, Debug/Release × x64/ARM64 config rows, parent mapping under the StressPerf solution folder.
- run_stocks_grid_baseline.ps1, run_bench_aot_publish.sh, run_benchmark.sh, run_sweep_arm64.ps1: ReactorOptimized variant added alongside the naive Reactor row.
- docs/_pipeline/templates/advanced.md.dt: new "Hot loops" section with side-by-side fluent/direct example, workload-shape guidance, trade-offs, and a forward reference to spec 008's builder pattern. Compiled output in docs/guide/advanced.md regenerated via mur.
- tests/stress_perf/README.md and SPEC.md: document the naive vs optimized split.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* spec 034 §C - UseMemoCells hooks + REACTOR_HOOKS_007 analyzer (Component C)

Add cell-level memoization for high-frequency list / grid bodies plus the companion analyzer that catches missing closure-capture deps at compile time. The hook intentionally takes params-deps to match UseMemo/UseEffect/UseCallback; the analyzer is what makes the params shape safe.

- src/Reactor/Hooks/UseMemoCells.cs: three RenderContext extension methods (UseMemoCells / UseMemoCellsByKey / UseMemoCellsByIndex) plus matching Component-class shims. ArgumentNullException at validation boundaries; ArgumentOutOfRangeException for bad index in ByIndex.
- src/Reactor.Analyzers/UseMemoCellsAnalyzer.cs: REACTOR_HOOKS_007. Walks the builder lambda's data flow, enumerates captures (filtering the lambda's own params, the implicit-this parameter, method symbols, const, and static-readonly fields), reports any capture not present in the trailing deps slot.
- src/Reactor.Analyzers/UseMemoCellsCodeFix.cs: appends the missing capture to the deps slot. Handles both params-tail and explicit array-literal forms.
- AnalyzerReleases.Unshipped.md: REACTOR_HOOKS_007 row.
- tests/Reactor.Tests/UseMemoCellsTests.cs: 22 hook tests covering first render, full reuse, partial reuse, deps invalidation, count growth/shrink, ArgumentNullException at boundaries, ByKey identity / mutation / reorder / duplicate-key last-write-wins, ByIndex empty / single / out-of-range / count-change fallback.
- tests/Reactor.Tests/AnalyzerTests/UseMemoCellsAnalyzerTests.cs: 11 analyzer + codefix tests covering happy path, missing-deps warning, zero-deps + capture, this-field capture, indirect-capture blind spot, ByKey / ByIndex variants, codefix transforms input → input with capture appended.
- docs/_pipeline/templates/advanced.md.dt: "Memoizing list cells" section with the three-overload table, gen2 trade-off, and analyzer pointer. Compiled output regenerated via mur.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* spec 034 - wire UseMemoCellsByIndex into ReactorOptimized + skill + close

Final phases of spec 034:

- tests/stress_perf/StressPerf.ReactorOptimized/Program.cs: replace the inner cell `for` loop with `UseMemoCellsByIndex(data, changedIndices, builder, GreenBrush, RedBrush)`. The data source's existing `Update()` return value already produced the changed-index list; thread it via `UseRef<IReadOnlyList<int>>` between the timer Tick and Render. ReactorOptimized is now the spec's canonical "all three components combined" reference.
- skills/perf-tips.md: agent-facing playbook covering UseMemo / UseCallback, UseMemoCells (3 overloads), direct-record-initializer, COM-resource caching, the gen2 trade-off, profiling entry points, and "when NOT to", written so an unfamiliar agent can pick the right tool from this skill alone.
- docs/specs/034-element-allocation-reduction.md: status flipped from Drafted → Implemented (2026-05-02). Production re-bench is logged as a follow-up; the prototype's table from reactor-vs-direct-10pct.md remains the operating reference until that re-bench lands.
- CHANGELOG.md: top-level "Spec 034 - Element allocation reduction" rollup bullet under Changed.

Sample audit (Phase 5): no samples have a workload that crosses the threshold (~50+ items × frequent mutation × pure-of-T builder) where UseMemoCells would beat the fluent chain. The reference implementation is StressPerf.ReactorOptimized; samples stay fluent.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* spec 034 close-out - verified perf table + Phase 5/7/8 wrap-up

- Same-day three-variant bench (Reactor / ReactorOptimized / Direct) at 20/50/100 % mutation, 10s ARM64 Release. Captures reconcile-time delta as the cleanest in-app proxy for the alloc story (no ETW: PresentTracer needs admin and was not run for this close-out).
- Verified close-out section appended to spec 034 with the matrix and reads. Headline: ReactorOptimized cuts reconcile time -60 % at 20 % mutation, -33 % at 50 %, -8 % at 100 %, clearing Direct outright at every rate sampled.
- CHANGELOG rollup updated with verified numbers (replaced the prototype-only placeholder).
- Phase 5 sample audit complete: zero migrations. Audit findings documented in commit body: TodoApp lists are too small, PerfStressDemo bars rebuild every render, ChatTimeline already uses keys, gallery samples are demos.
- Phase 7 spec status flipped to "Implemented - 2026-05-02" with the re-bench locked in.
- Phase 8 AOT smoke logged as follow-up (vswhere/MSVC linker not on this shell's PATH); spec 034 surface is reflection-free by inspection.
- Implementation task file updated with phase-level completion markers and outstanding-followups annotations.

Adds:
    tests/stress_perf/run_spec034_bench.ps1
    tests/stress_perf/baselines/spec-034-final.{csv,log}

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* spec 034 - capture pre/post Reactor baseline (Component A in isolation)

Re-bench'd the unmodified naive StressPerf.Reactor source against the pre-spec-034 commit (247a525, parent of Component A merge) via git worktree. Same source, same fluent-chain usage on both sides; only the framework changed (transparent LayoutModifiers/VisualModifiers shim).

Result at 20/50/100 % mutation (10 s, ARM64 Release):

    Mutation  Pre-A renders  Post-A renders  ΔRenders  ΔReconcile
    20 %      81             77              -4.9 %    +2.3 %
    50 %      48             50              +4.2 %    -16.7 %
    100 %     34             34              0 %       -3.4 %

Honest read: Component A's transparent shim does NOT deliver renders/sec uplift outside run-to-run noise at these mutation rates. The cleanest signal is -16.7 % reconcile at 50 %; 20 % and 100 % are within noise. This is consistent with Component A being an allocation-side improvement (~-11 % bytes/tick per the prototype) on a workload that is GC-bound at high mutation. The prototype's predicted +6 % renders was at 10 % mutation, a point not sampled here.

PR framing for Component A: the win is real on the allocation axis, invisible on renders/sec at the high mutation rates sampled, and worth re-measuring at 10 % before quoting a renders-side number. Components B + C carry the user-visible perf story.

Spec close-out section + CHANGELOG rollup updated to reflect the isolated-A measurement instead of folding it into the ReactorOptimized headline.

Adds:
    tests/stress_perf/run_spec034_reactor_before.ps1
    tests/stress_perf/baselines/spec-034-reactor-before.{csv,log}

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* stress_perf: full-matrix bench script with ETW + auto-build

run_full_matrix.ps1 is a single-entry script that builds and benches every C# StressPerf variant (Direct, Bound, Wpf, DirectX, Reactor, ReactorOptimized, ReactorGrid, VirtualListReactor) at 10/20/50/100 % mutation, 10 s each, ARM64 Release, with PresentTracer ETW capture for ground-truth Present/sec.

- Hard admin gate up front: PresentTracer needs an elevated ETW kernel session (DxgKrnl), so the script aborts with a clear message if not run from an admin shell. -SkipETW lets you run in-app-metrics-only mode without admin (no Present/sec, no DxgKrnl rates).
- Auto-builds each variant csproj + PresentTracer at the configured Configuration / Platform (default: Release / ARM64). -SkipBuild reuses existing binaries.
- -VariantFilter @('Reactor','ReactorOptimized') restricts the run.
- -Repeats N for noise control; output CSV has per-run rows + a per-(variant, percent) median/min/max summary.
- Output goes to baselines/full-matrix-<timestamp>/ with run.log, run.csv, run.summary.csv. Each run gets its own dir so historical matrices accumulate.

Replaces the half-broken run_stocks_grid_baseline.ps1 (hardcoded reactor3 paths, no admin gate, no build phase, fixed-three-percent). That script is left in place for now in case anything else references its CSV path; new runs should target run_full_matrix.ps1.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* stress_perf: add RN-Fabric to full-matrix bench, drop VirtualListReactor

- New RN-Fabric variant entry: IsRN=$true, Exe path under tests/stress_perf_rn/StocksGrid/windows/ARM64/Release/StocksGrid.exe. In-app report scraped via UIA from the testID="HeadlessReport" Text node before process teardown (RN doesn't write report files to disk).
- Build phase auto-builds RN: npm install (only if node_modules missing) followed by npx @react-native-community/cli run-windows --release --arch arm64 --no-launch --no-deploy in tests/stress_perf_rn/StocksGrid. ~10-15 min on first build, incremental after.
- Variant table loses VirtualListReactor, which is not needed for the spec 034 comparisons.
- Build phase skips the npm/npx step entirely if no RN variant is in the requested set; missing-exe error message points the user at the right build command for either kind of variant.
- Run loop branches on IsRN:
  • C# variants: Read-VariantReport from .report.txt as before.
  • RN: Scrape-RnReport via UIA before killing the process.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* stress_perf: clearer error message when RN-Fabric build can't find VS

react-native-windows 0.82 specifically wants VS 17.11+ and does not accept VS 18 (the same VS-detection issue that blocks the AOT smoke elsewhere). When the build fails, point the user at the three most likely causes (VS 17 missing, admin shell, node_modules drift) plus the -VariantFilter escape hatch.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* stress_perf: auto-set MinimumVisualStudioVersion=18 for RN-Fabric on VS 18-only machines

react-native-windows 0.82's MSBuild detection (findLatestVsInstall in @react-native-windows/cli/lib-commonjs/utils/vsInstalls.js) clamps the acceptable VS version range to [minVersion, floor(minVersion)+1). The default minVersion is 17.11.0, so the range becomes [17.11.0, 18.0), which excludes VS 18 entirely. The CLI does honor the MinimumVisualStudioVersion env var to override the floor.

Detect Visual Studio installations under both Program Files locations before invoking npx; if VS 17 is missing but VS 18 is present, set MinimumVisualStudioVersion=18.0 in the build's environment so the range becomes [18.0, 19.0) and the installed VS 18.x matches. Restored to whatever it was before (typically unset) when the build finishes. No-op when VS 17 is on the box.

Verified: with this fix, the RN-Fabric solution builds clean against VS 18.5 Enterprise on this machine in ~2 min.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* stress_perf: auto-build single missing target under -SkipBuild

-SkipBuild used to fail the whole run if any single target's exe was missing, which bites when you've already built the variants but forgot PresentTracer (or vice versa). Now each missing C# target auto-builds on demand right before the run starts, so -SkipBuild skips the bulk-rebuild but still recovers from "I just added one new variant" or "PresentTracer was never built on this branch."

RN-Fabric still hard-errors with the npm + npx command to run manually; auto-running react-native run-windows from inside this script would block the matrix on a 5-min build with confusing output interleaving.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* spec 034 - verified close-out with ETW Present-tracking

Replaces the no-ETW close-out section with the 2026-05-03 full-matrix bench: 8 variants × 4 mutation rates (10/20/50/100 %) × 10 s, ARM64 Release, ETW Present-tracking via PresentTracer (admin shell). The prototype's 10 % headline is now validated on production code.

Headline: ReactorOptimized at 10 % mutation reaches 17.1 Effective Refresh/s
  - within run-to-run noise of DirectX (17.2) and Wpf (17.9)
  - +66 % over naive Reactor (10.3) on the same hardware

Reconcile-time win on the same A/B (Reactor → ReactorOptimized):

    10 %:  32.5 ms → 7.9 ms  (-76 %)
    20 %:  36.8 ms → 14.4 ms (-61 %)
    50 %:  43.9 ms → 30.4 ms (-31 %)
    100 %: 53.7 ms → 47.3 ms (-12 %)

Memo's win tracks the partial-reuse opportunity exactly as predicted in §C: large at low mutation, narrows toward saturation. DirectX runs away above 50 % mutation - no allocating framework keeps up once GC pressure dominates, which is a known shape, not a spec 034 regression.

The "ReactorOptimized > Direct at every rate" result deserves an asterisk - StressPerf.Direct/MainWindow.cs has dev instrumentation (File.AppendAllText to C:\temp on the UI thread, plus per-tick header TextBlock SetValues regardless of value change) that biases the comparison. The headline (matches DirectX/Wpf within noise) is robust; the "beats Direct" framing should be read as "ties or beats after fixing those scaffolding warts." Filed as a bench follow-up.

Two scrape anomalies noted honestly in the doc - Bound @ 10 % and RN-Fabric @ 50 % exited before the script's 500 ms post-run UIA scrape window. ETW Present rates were captured cleanly so the variants did run; only in-app Total Renders is zero in those rows.

Adds:
    tests/stress_perf/baselines/full-matrix-2026-05-03-070935/{run.csv,run.summary.csv,run.log}

CHANGELOG rollup updated to reference the verified numbers.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* spec 034 - address Copilot CR feedback on PR #126

Three review comments from copilot-pull-request-reviewer.

1. UseMemoCellsCodeFix re-parsed the capture name from the diagnostic message text. Brittle to message edits / localization. Now the analyzer attaches the capture name to Diagnostic.Properties under UseMemoCellsAnalyzer.CaptureNameProperty; the codefix reads it from there and falls back to message parsing only for diagnostics from stale analyzer builds. Existing 11 analyzer tests still pass; the round-trip is internal plumbing, behavior is unchanged.

2. UseMemoCellsByIndex XML doc claimed item-count change was "not supported" and callers "must fall back" to UseMemoCells, but the implementation already detects prev.Children.Length != count and does a full rebuild (Hooks/UseMemoCells.cs:211). Doc updated to describe the actual behavior (count change → full rebuild, changedIndices treated as "rebuild everything") and points callers whose lists grow/shrink frequently at the value- or key-equality overloads for better incremental reuse.

3. tests/stress_perf/SPEC.md still claimed net8.0-windows10.0.22621.0 but every csproj targets net9.0 (WinUI: net9.0-windows10.0.22621.0, WPF: net9.0-windows, PresentTracer: net9.0). Updated to list the actual TFMs per variant family.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
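
Reviewer note: the version-range clamp described in the MinimumVisualStudioVersion commit above is easy to misread, so here is a minimal C# sketch of the rule as the message states it, namely that accepted versions fall in [minVersion, floor(minVersion)+1). `IsAcceptedVs` is a hypothetical helper written for illustration only; the real check lives in the react-native-windows CLI's JavaScript and may differ in detail.

```csharp
using System;

// Hypothetical helper (not part of any CLI): models the range rule quoted in
// the commit message, [minVersion, floor(minVersion) + 1).
static bool IsAcceptedVs(Version installed, Version minVersion)
{
    // floor(minVersion) + 1 is the next major version, used as an exclusive bound.
    var upperExclusive = new Version(minVersion.Major + 1, 0);
    return installed >= minVersion && installed < upperExclusive;
}

var vs18 = new Version(18, 5);

// Default floor 17.11.0 gives [17.11.0, 18.0), so VS 18.5 is rejected.
Console.WriteLine(IsAcceptedVs(vs18, new Version(17, 11, 0))); // False

// Overriding the floor to 18.0 gives [18.0, 19.0), so VS 18.5 is accepted.
Console.WriteLine(IsAcceptedVs(vs18, new Version(18, 0)));     // True
```

This is why setting the floor to 18.0 is enough on a VS 18-only machine: the upper bound is derived from the floor rather than configured separately.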
1 parent 25e1bb6 commit 125cd12

35 files changed

Lines changed: 7955 additions & 119 deletions

CHANGELOG.md

Lines changed: 54 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -29,6 +29,33 @@ to land under these conventions; subsequent specs follow this shape.
2929

3030
### Added
3131

32+
- `Microsoft.UI.Reactor.Hooks.UseMemoCells` /
33+
`UseMemoCellsByKey` / `UseMemoCellsByIndex` — cell-level memoization
34+
hooks (extension methods on `RenderContext`, plus matching `Component`
35+
shims) for high-frequency list/grid bodies. Cells whose item value
36+
(and declared deps) haven't changed since the previous render are
37+
reused by reference; the reconciler short-circuits on
38+
`ReferenceEquals` and skips diffing entirely. (spec 034 §C)
39+
- `REACTOR_HOOKS_007` analyzer + codefix — warns when a `UseMemoCells`
40+
builder lambda closes over a value that isn't declared in the
41+
`params deps` list, which would silently render stale. The codefix
42+
appends the missing capture to the deps slot. Indirect captures
43+
through helper methods are a documented blind spot. (spec 034 §C)
44+
- "Memoizing list cells" section in `docs/guide/advanced.md` covering
45+
the three overloads, when each is the right hammer, the gen2
46+
trade-off, and the analyzer-as-safety-net story. (spec 034 §C)
47+
- `tests/stress_perf/StressPerf.ReactorOptimized` — sibling bench
48+
variant that demonstrates the spec-034 §B direct-record-initializer
49+
idiom for inner-loop cell construction. The naive `StressPerf.Reactor`
50+
variant stays unchanged and remains the framework-level baseline; the
51+
new optimized sibling is the reference implementation of the perf-tips
52+
skill. Wired into `run_stocks_grid_baseline.ps1`,
53+
`run_bench_aot_publish.sh`, `run_benchmark.sh`, and
54+
`run_sweep_arm64.ps1`. (spec 034 §B)
55+
- "Hot loops" section in `docs/guide/advanced.md` documenting when to
56+
reach for direct record initializers, the trade-offs vs the fluent
57+
chain, and a side-by-side worked example. Source template at
58+
`docs/_pipeline/templates/advanced.md.dt`. (spec 034 §B)
3259
- `Expr(Func<Element?>)` factory in `Microsoft.UI.Reactor.Factories` for inline
3360
block-expression bodies inside a DSL tree, removing the
3461
`((Func<Element?>)(() => …))()` cast ceremony. Pure composition — no hooks,
@@ -70,6 +97,33 @@ to land under these conventions; subsequent specs follow this shape.
7097

7198
### Changed
7299

100+
- **Spec 034 — Element allocation reduction.** Three independent
101+
allocation cuts in one PR: bucketed `ElementModifiers` (transparent
102+
storage shim, ~−11% bytes/tick on the 4,900-cell stress grid),
103+
direct-record-initializer idiom for inner cell loops (~−60% bytes
104+
per cell), and `UseMemoCells` cell-level memoization. Verified at
105+
PR-close on ARM64 Release with full ETW Present-tracking across
106+
10/20/50/100% mutation, all eight stress_perf variants:
107+
**ReactorOptimized at 10% mutation reaches 17.1 Effective Refresh/s
108+
— within noise of DirectX (17.2) and Wpf (17.9), and +66% over
109+
naive Reactor (10.3).** Reconcile-time win on the same A/B: −76% at
110+
10% (32.5 ms → 7.9 ms), −61% at 20%, −31% at 50%, −12% at 100% —
111+
memo's win tracks the partial-reuse opportunity exactly as
112+
predicted. DirectX runs away at saturation (50%+) — no allocating
113+
framework can keep up there. Component A in isolation (naive
114+
Reactor pre-shim vs post-shim, same source, no app-code changes)
115+
shows renders/sec within run-to-run noise at 20/50/100% — its win
116+
is allocation-side, not renders-side, on this hardware. See
117+
`docs/specs/034-element-allocation-reduction.md` § "Verified
118+
close-out — 2026-05-03" for the full eight-variant matrix and
119+
reads. (spec 034)
120+
- `ElementModifiers` now stores layout and visual fields in
121+
`LayoutModifiers` / `VisualModifiers` sub-records. Existing call sites are
122+
unaffected — public properties (`Padding`, `Margin`, `Foreground`,
123+
`Background`, …) shim through to the appropriate bucket on read and write.
124+
Perf-critical inner loops may construct buckets directly via the new
125+
`Layout = …` / `Visual = …` initializer slots to avoid a fat
126+
`ElementModifiers` clone per fluent step. (spec 034 §A)
73127
- `PersistedStateCache` rewritten over an LRU cache with eviction-on-full
74128
semantics. The previous "refuse new keys when 4096 entries are present"
75129
policy is replaced — later, hotter keys are no longer starved by the

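Reviewer note: the bucketed-storage entry in the CHANGELOG hunk above (shim properties plus bucket-level Merge) can be pictured with a minimal C# sketch. Everything here is an assumption made for illustration: `LayoutBucket`, `VisualBucket`, and `Modifiers` are hypothetical two-field stand-ins, not the real 17-field `LayoutModifiers`, 10-field `VisualModifiers`, or `ElementModifiers` types, and the real Merge rules may differ.

```csharp
#nullable enable
using System;

// A Padding-only update merges just the layout bucket.
var baseMods = new Modifiers { Padding = 2, Foreground = "Green" };
var merged = baseMods.Merge(new Modifiers { Padding = 4 });

Console.WriteLine(merged.Padding);                                   // 4
Console.WriteLine(merged.Foreground);                                // Green
Console.WriteLine(ReferenceEquals(merged.Visual, baseMods.Visual));  // True

// Hypothetical stand-in for a layout bucket.
public sealed record LayoutBucket
{
    public double? Padding { get; init; }
    public double? Margin { get; init; }

    // Field-wise merge: values set on `other` win, unset ones fall back to this.
    public LayoutBucket Merge(LayoutBucket other) => new()
    {
        Padding = other.Padding ?? Padding,
        Margin = other.Margin ?? Margin,
    };
}

// Hypothetical stand-in for a visual bucket.
public sealed record VisualBucket
{
    public string? Foreground { get; init; }
    public string? Background { get; init; }

    public VisualBucket Merge(VisualBucket other) => new()
    {
        Foreground = other.Foreground ?? Foreground,
        Background = other.Background ?? Background,
    };
}

// Hypothetical parent record: buckets are allocated lazily, on first write.
public sealed record Modifiers
{
    public LayoutBucket? Layout { get; init; }
    public VisualBucket? Visual { get; init; }

    // Shim properties: the public surface reads and writes through the bucket.
    public double? Padding
    {
        get => Layout?.Padding;
        init => Layout = (Layout ?? new LayoutBucket()) with { Padding = value };
    }

    public string? Foreground
    {
        get => Visual?.Foreground;
        init => Visual = (Visual ?? new VisualBucket()) with { Foreground = value };
    }

    // Bucket-aware merge: a bucket that only one side has is passed through
    // by reference instead of being cloned.
    public Modifiers Merge(Modifiers other) => new()
    {
        Layout = MergeBucket(Layout, other.Layout, static (a, b) => a.Merge(b)),
        Visual = MergeBucket(Visual, other.Visual, static (a, b) => a.Merge(b)),
    };

    private static T? MergeBucket<T>(T? mine, T? theirs, Func<T, T, T> merge)
        where T : class
        => mine is null ? theirs : theirs is null ? mine : merge(mine, theirs);
}
```

The last printed line is the point of the design as the changelog describes it: the update touches only the layout side, so the visual bucket survives by reference rather than being copied.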
Reactor.sln

Lines changed: 19 additions & 0 deletions
@@ -39,6 +39,8 @@ Project("{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}") = "StressPerf.Bound", "tests\s
3939
EndProject
4040
Project("{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}") = "StressPerf.Reactor", "tests\stress_perf\StressPerf.Reactor\StressPerf.Reactor.csproj", "{1CBE61F7-04BC-44FE-B2C9-85A6EF22A653}"
4141
EndProject
42+
Project("{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}") = "StressPerf.ReactorOptimized", "tests\stress_perf\StressPerf.ReactorOptimized\StressPerf.ReactorOptimized.csproj", "{5A1B2C3D-4E5F-6789-ABCD-1234567890AB}"
43+
EndProject
4244
Project("{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}") = "StressPerf.Wpf", "tests\stress_perf\StressPerf.Wpf\StressPerf.Wpf.csproj", "{92A235CE-7700-4A4C-83A2-D1A1FD9BB593}"
4345
EndProject
4446
Project("{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}") = "StressPerf.DirectX", "tests\stress_perf\StressPerf.DirectX\StressPerf.DirectX.csproj", "{CC8B6B85-4256-403C-B6B6-68DE09C80B54}"
@@ -383,6 +385,22 @@ Global
383385
{1CBE61F7-04BC-44FE-B2C9-85A6EF22A653}.Release|Any CPU.Build.0 = Release|x64
384386
{1CBE61F7-04BC-44FE-B2C9-85A6EF22A653}.Release|x86.ActiveCfg = Release|x64
385387
{1CBE61F7-04BC-44FE-B2C9-85A6EF22A653}.Release|x86.Build.0 = Release|x64
388+
{5A1B2C3D-4E5F-6789-ABCD-1234567890AB}.Debug|ARM64.ActiveCfg = Debug|ARM64
389+
{5A1B2C3D-4E5F-6789-ABCD-1234567890AB}.Debug|ARM64.Build.0 = Debug|ARM64
390+
{5A1B2C3D-4E5F-6789-ABCD-1234567890AB}.Debug|x64.ActiveCfg = Debug|x64
391+
{5A1B2C3D-4E5F-6789-ABCD-1234567890AB}.Debug|x64.Build.0 = Debug|x64
392+
{5A1B2C3D-4E5F-6789-ABCD-1234567890AB}.Debug|Any CPU.ActiveCfg = Debug|x64
393+
{5A1B2C3D-4E5F-6789-ABCD-1234567890AB}.Debug|Any CPU.Build.0 = Debug|x64
394+
{5A1B2C3D-4E5F-6789-ABCD-1234567890AB}.Debug|x86.ActiveCfg = Debug|x64
395+
{5A1B2C3D-4E5F-6789-ABCD-1234567890AB}.Debug|x86.Build.0 = Debug|x64
396+
{5A1B2C3D-4E5F-6789-ABCD-1234567890AB}.Release|ARM64.ActiveCfg = Release|ARM64
397+
{5A1B2C3D-4E5F-6789-ABCD-1234567890AB}.Release|ARM64.Build.0 = Release|ARM64
398+
{5A1B2C3D-4E5F-6789-ABCD-1234567890AB}.Release|x64.ActiveCfg = Release|x64
399+
{5A1B2C3D-4E5F-6789-ABCD-1234567890AB}.Release|x64.Build.0 = Release|x64
400+
{5A1B2C3D-4E5F-6789-ABCD-1234567890AB}.Release|Any CPU.ActiveCfg = Release|x64
401+
{5A1B2C3D-4E5F-6789-ABCD-1234567890AB}.Release|Any CPU.Build.0 = Release|x64
402+
{5A1B2C3D-4E5F-6789-ABCD-1234567890AB}.Release|x86.ActiveCfg = Release|x64
403+
{5A1B2C3D-4E5F-6789-ABCD-1234567890AB}.Release|x86.Build.0 = Release|x64
386404
{92A235CE-7700-4A4C-83A2-D1A1FD9BB593}.Debug|ARM64.ActiveCfg = Debug|ARM64
387405
{92A235CE-7700-4A4C-83A2-D1A1FD9BB593}.Debug|ARM64.Build.0 = Debug|ARM64
388406
{92A235CE-7700-4A4C-83A2-D1A1FD9BB593}.Debug|x64.ActiveCfg = Debug|x64
@@ -1347,6 +1365,7 @@ Global
13471365
{E21C1E62-3135-4577-8130-9023A2D8E0AE} = {B899CF64-C19C-0C08-9E9A-B3F6D048BA53}
13481366
{FE122C30-D3DF-4AA9-9EAD-B63C804D42C7} = {B899CF64-C19C-0C08-9E9A-B3F6D048BA53}
13491367
{1CBE61F7-04BC-44FE-B2C9-85A6EF22A653} = {B899CF64-C19C-0C08-9E9A-B3F6D048BA53}
1368+
{5A1B2C3D-4E5F-6789-ABCD-1234567890AB} = {B899CF64-C19C-0C08-9E9A-B3F6D048BA53}
13501369
{92A235CE-7700-4A4C-83A2-D1A1FD9BB593} = {B899CF64-C19C-0C08-9E9A-B3F6D048BA53}
13511370
{CC8B6B85-4256-403C-B6B6-68DE09C80B54} = {B899CF64-C19C-0C08-9E9A-B3F6D048BA53}
13521371
{E2F3A4B5-C6D7-8901-2345-6789ABCDEF12} = {5D20AA90-6969-D8BD-9DCD-8634F4692FDA}

docs/_pipeline/templates/advanced.md.dt

Lines changed: 107 additions & 0 deletions
@@ -110,6 +110,113 @@ mutate the original `ObservableCollection` in event handlers. Reactor
110110
subscribes to `CollectionChanged` and triggers a re-render on any
111111
modification.
112112

113+
## Hot loops
114+
115+
The fluent modifier chain is ergonomic but allocates an `ElementModifiers`
116+
clone per `with`-step. For ordinary UI that is invisible — a button has
117+
five modifiers, the cost is two extra small records on a click handler.
118+
For inner-loop cell construction in a 4,900-cell grid that re-renders
119+
30× per second, those clones dominate the allocation profile.
120+
121+
The escape hatch is to construct the `Element` and its `ElementModifiers`
122+
record directly, skipping the fluent chain entirely. The
123+
`LayoutModifiers` and `VisualModifiers` sub-records are public types
124+
specifically so perf-critical code can build them once instead of having
125+
the fluent chain rebuild them step-by-step.
126+
127+
```csharp
128+
// Fluent — five clones per cell. Right tool for ordinary UI.
129+
var cell = TextBlock(label)
130+
.FontSize(8)
131+
.Foreground(item.IsUp ? GreenBrush : RedBrush)
132+
.Padding(2, 1, 2, 1)
133+
.Grid(row: r, column: c);
134+
135+
// Direct record initializer — one TextBlockElement, one ElementModifiers,
136+
// two bucket sub-records, one Attached dictionary. Use only when the
137+
// allocation cost shows up in profiles.
138+
var cell = new TextBlockElement(label)
139+
{
140+
FontSize = 8,
141+
Modifiers = new ElementModifiers
142+
{
143+
Layout = new LayoutModifiers { Padding = new Thickness(2, 1, 2, 1) },
144+
Visual = new VisualModifiers { Foreground = item.IsUp ? GreenBrush : RedBrush },
145+
},
146+
Attached = new Dictionary<Type, object>(1)
147+
{
148+
[typeof(GridAttached)] = new GridAttached(r, c, 1, 1),
149+
},
150+
};
151+
```
152+
153+
**Workload shape.** Use this idiom in lists or grids with hundreds-plus
154+
elements per render — tickers, log tables, observability dashboards. Don't
155+
adopt it for ordinary screens. The fluent chain remains the right tool for
156+
everything except the inner cell loop.
157+
158+
**Trade-offs.** Roughly halves the allocation cost of cell construction
159+
on the 4,900-cell stress grid, but loses fluent ergonomics. The direct
160+
form is more brittle to refactor — changing one field touches an explicit
161+
initializer block instead of a chain step. Restrict it to the
162+
identifiable hot loop and keep the rest of the file fluent.
163+
164+
**Reference implementation.** The canonical before/after pair lives in
165+
`tests/stress_perf/StressPerf.Reactor` (naive — fluent chain, the shape
166+
unaware users write) and `tests/stress_perf/StressPerf.ReactorOptimized`
167+
(idiomatic perf-tuned variant). Same workload, side-by-side diffable.
168+
169+
**Forward reference.** Spec 008's builder-pattern element factories
170+
would let the fluent chain match this allocation profile, eliminating
171+
the dichotomy. Until then, treat direct-initializer as a targeted
172+
optimization.
173+
174+
## Memoizing list cells
175+
176+
`UseMemoCells` skips the cell-build for indices whose item value (and
177+
declared dependencies) haven't changed since the previous render. The
178+
reconciler's `ReferenceEquals` shortcut means a reused cell allocates
179+
nothing and skips diffing entirely.
180+
181+
```csharp
182+
var theme = ctx.UseTheme();
183+
var children = ctx.UseMemoCells(
184+
stocks,
185+
(item, i) => Cell(item, theme),
186+
theme); // ← deps; framework invalidates on change
187+
```
188+
189+
**When it's the right hammer.** Tickers, log tables, file lists, large
190+
read-only grids — anywhere the cell content is a pure function of `T`
191+
plus a small set of declared deps.
192+
193+
**When it's the wrong hammer.** Rows whose chrome depends on focus,
194+
drag, selection, or hover state that you aren't capturing in deps.
195+
Memo silently renders stale when an external state change isn't
196+
declared as a dep — the analyzer below catches the obvious cases, but
197+
indirect captures through helper methods aren't visible to it.
198+
199+
**Three overloads:**
200+
201+
| Overload | Use when |
202+
|----------|----------|
203+
| `UseMemoCells<T>` | Per-item value equality. Default choice. |
204+
| `UseMemoCellsByKey<T, TKey>` | Items have stable identity but mutable interior (`record Person(int Id, string Name)`). Hashes by key, value-compares for content. Reordered keys reuse cells via the reconciler's keyed-children path. |
205+
| `UseMemoCellsByIndex<T>` | Data source already knows which indices changed. Skips the per-cell equality scan; only the named indices run the builder. |
206+
207+
**gen2 caveat.** Memo trades short-lived gen0 churn for longer-lived
208+
gen1/gen2 retention. Many memoized lists across an app can compound
209+
gen2 pressure even when bytes-per-tick drops. Profile before adopting
210+
across the board.
211+
212+
**Compile-time safety net.** The companion Roslyn analyzer
213+
`REACTOR_HOOKS_007` warns when a builder closure captures a value that
214+
isn't declared in the deps list. Codefix is "add the missing capture to
215+
deps". Indirect captures through intermediate methods are a documented
216+
blind spot — the analyzer can't see through a method call without
217+
whole-program analysis (same blind spot as React's
218+
`react-hooks/exhaustive-deps`).
219+
113220
## Tips
114221

115222
**Wrap third-party components in `ErrorBoundary`.** If a plugin or external

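Reviewer note: the reuse rule in the "Memoizing list cells" section of the hunk above (a cell is reused by reference when its item value and the declared deps are unchanged) can be sketched in a few lines of C#. This is a minimal model under stated assumptions, not the hook's implementation: `CellMemo` and `Cell` are hypothetical names, and the real `UseMemoCells` is an extension method on `RenderContext` whose internals are not shown in this diff.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

var memo = new CellMemo<int, Cell>();
int builds = 0;
Func<int, int, Cell> builder = (item, i) => { builds++; return new Cell($"{i}: {item}"); };

var first = memo.Render(new[] { 1, 2, 3 }, builder, "light");
var second = memo.Render(new[] { 1, 9, 3 }, builder, "light");

Console.WriteLine(builds);                                // 4 (three first, then index 1 only)
Console.WriteLine(ReferenceEquals(first[0], second[0]));  // True (unchanged cell reused)

memo.Render(new[] { 1, 9, 3 }, builder, "dark");
Console.WriteLine(builds);                                // 7 (deps changed, all rebuilt)

// Hypothetical cell type standing in for an Element.
public sealed record Cell(string Text);

// Hypothetical memo store: remembers the previous items, cells, and deps.
public sealed class CellMemo<T, TCell> where TCell : class
{
    private T[] _prevItems = Array.Empty<T>();
    private TCell[] _prevCells = Array.Empty<TCell>();
    private object?[] _prevDeps = Array.Empty<object?>();

    public IReadOnlyList<TCell> Render(
        IReadOnlyList<T> items, Func<T, int, TCell> builder, params object?[] deps)
    {
        ArgumentNullException.ThrowIfNull(items);
        ArgumentNullException.ThrowIfNull(builder);

        // Any change in the declared deps invalidates every cell.
        bool depsChanged = !deps.SequenceEqual(_prevDeps);
        var comparer = EqualityComparer<T>.Default;
        var cells = new TCell[items.Count];

        for (int i = 0; i < items.Count; i++)
        {
            bool reusable = !depsChanged
                && i < _prevItems.Length
                && comparer.Equals(items[i], _prevItems[i]);

            // Reuse hands back the same reference, which is what lets a
            // reconciler short-circuit on ReferenceEquals.
            cells[i] = reusable ? _prevCells[i] : builder(items[i], i);
        }

        _prevItems = items.ToArray();
        _prevCells = cells;
        _prevDeps = (object?[])deps.Clone();
        return cells;
    }
}
```

The sketch also shows why an undeclared capture goes stale: the builder only reruns when the item or a listed dep changes, so anything the lambda reads that is missing from `deps` is invisible to the comparison.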
docs/guide/advanced.md

Lines changed: 107 additions & 0 deletions
@@ -253,6 +253,113 @@ mutate the original `ObservableCollection` in event handlers. Reactor
253253
subscribes to `CollectionChanged` and triggers a re-render on any
254254
modification.
255255

256+
## Hot loops
257+
258+
The fluent modifier chain is ergonomic but allocates an `ElementModifiers`
259+
clone per `with`-step. For ordinary UI that is invisible — a button has
260+
five modifiers, the cost is two extra small records on a click handler.
261+
For inner-loop cell construction in a 4,900-cell grid that re-renders
262+
30× per second, those clones dominate the allocation profile.
263+
264+
The escape hatch is to construct the `Element` and its `ElementModifiers`
265+
record directly, skipping the fluent chain entirely. The
266+
`LayoutModifiers` and `VisualModifiers` sub-records are public types
267+
specifically so perf-critical code can build them once instead of having
268+
the fluent chain rebuild them step-by-step.
269+
270+
```csharp
// Fluent: five clones per cell. Right tool for ordinary UI.
var fluentCell = TextBlock(label)
    .FontSize(8)
    .Foreground(item.IsUp ? GreenBrush : RedBrush)
    .Padding(2, 1, 2, 1)
    .Grid(row: r, column: c);

// Direct record initializer: one TextBlockElement, one ElementModifiers,
// two bucket sub-records, one Attached dictionary. Use only when the
// allocation cost shows up in profiles.
// (Dictionary<,> needs System.Collections.Generic in scope.)
var directCell = new TextBlockElement(label)
{
    FontSize = 8,
    Modifiers = new ElementModifiers
    {
        Layout = new LayoutModifiers { Padding = new Thickness(2, 1, 2, 1) },
        Visual = new VisualModifiers { Foreground = item.IsUp ? GreenBrush : RedBrush },
    },
    Attached = new Dictionary<Type, object>(1)
    {
        [typeof(GridAttached)] = new GridAttached(r, c, 1, 1),
    },
};
```

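The per-step cloning described above is ordinary C# record behavior and can be seen without any Reactor types. The sketch below is a minimal illustration, assuming a hypothetical stand-in record `Mods` (it is not `ElementModifiers`, and its fields are invented for the example). Each `with` expression produces a new instance, while a single object initializer produces one.

```csharp
using System;

// Hypothetical stand-in for a modifiers record; NOT Reactor's ElementModifiers.
public sealed record Mods
{
    public double FontSize { get; init; }
    public string Foreground { get; init; } = "Default";
    public double Padding { get; init; }
}

public static class WithChainDemo
{
    // Fluent-style construction: every step copies the previous record.
    // Returns all intermediate instances so the copies can be inspected.
    public static Mods[] BuildStepwise()
    {
        var a = new Mods();
        var b = a with { FontSize = 8 };          // clone 1
        var c = b with { Foreground = "Green" };  // clone 2
        var d = c with { Padding = 2 };           // clone 3
        return new[] { a, b, c, d };
    }

    // Direct construction: one allocation, same final value.
    public static Mods BuildDirect() =>
        new Mods { FontSize = 8, Foreground = "Green", Padding = 2 };
}
```

Calling `WithChainDemo.BuildStepwise()` yields four distinct object references, and the last one compares equal (by record value equality) to `WithChainDemo.BuildDirect()`. The intermediate three are garbage as soon as the chain finishes, which is the cost the direct initializer avoids.
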
**Workload shape.** Use this idiom in lists or grids with hundreds or more
elements per render: tickers, log tables, observability dashboards. Don't
adopt it for ordinary screens. The fluent chain remains the right tool for
everything except the inner cell loop.

**Trade-offs.** The direct form roughly halves the allocation cost of cell
construction on the 4,900-cell stress grid, but loses fluent ergonomics. It
is also more brittle to refactor, because changing one field touches an
explicit initializer block instead of a chain step. Restrict it to the
identifiable hot loop and keep the rest of the file fluent.

**Reference implementation.** The canonical before/after pair lives in
`tests/stress_perf/StressPerf.Reactor` (naive: the fluent chain, the shape
users write when unaware of the cost) and
`tests/stress_perf/StressPerf.ReactorOptimized` (idiomatic perf-tuned
variant). Same workload, side-by-side diffable.

**Forward reference.** Spec 008's builder-pattern element factories
would let the fluent chain match this allocation profile, eliminating
the dichotomy. Until then, treat the direct initializer as a targeted
optimization.

## Memoizing list cells

`UseMemoCells` skips the cell-build for indices whose item value (and
declared dependencies) haven't changed since the previous render. The
reconciler's `ReferenceEquals` shortcut means a reused cell allocates
nothing and skips diffing entirely.

```csharp
var theme = ctx.UseTheme();
var children = ctx.UseMemoCells(
    stocks,
    (item, i) => Cell(item, theme),
    theme); // ← deps; framework invalidates on change
```

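To make the reuse rule concrete, here is a minimal, framework-free sketch of per-index memoization. It is an illustration of the idea only, not Reactor's implementation: `CellMemo`, `Cell`, and `BuilderCalls` are hypothetical names, and a single `deps` object stands in for the declared dependency list. A cell is reused by reference when the item at that index equals the previous item and the deps are unchanged.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical cell type for the example.
public sealed record Cell(string Text);

// Hypothetical per-index memo; NOT Reactor's UseMemoCells.
public sealed class CellMemo<T, TCell> where TCell : class
{
    private readonly List<T> _prevItems = new();
    private readonly List<TCell> _prevCells = new();
    private object? _prevDeps;

    // Counts how often the builder actually ran.
    public int BuilderCalls { get; private set; }

    public IReadOnlyList<TCell> Render(
        IReadOnlyList<T> items, Func<T, int, TCell> build, object deps)
    {
        // Any deps change invalidates every cached cell.
        bool depsChanged = !Equals(_prevDeps, deps);
        var comparer = EqualityComparer<T>.Default;
        var cells = new List<TCell>(items.Count);

        for (int i = 0; i < items.Count; i++)
        {
            bool reusable = !depsChanged
                && i < _prevItems.Count
                && comparer.Equals(_prevItems[i], items[i]);

            if (reusable)
            {
                cells.Add(_prevCells[i]); // same reference as last render
            }
            else
            {
                BuilderCalls++;
                cells.Add(build(items[i], i));
            }
        }

        // Remember this render for the next comparison.
        _prevItems.Clear();
        _prevItems.AddRange(items);
        _prevCells.Clear();
        _prevCells.AddRange(cells);
        _prevDeps = deps;
        return cells;
    }
}
```

Rendering `{1, 2, 3}` and then `{1, 9, 3}` with the same deps runs the builder three times and then once; the cells at indices 0 and 2 are the same object references on both renders, which is what lets a reconciler short-circuit on `ReferenceEquals`. Changing the deps value forces every cell to rebuild.
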
**When it's the right hammer.** Tickers, log tables, file lists, large
read-only grids: anywhere the cell content is a pure function of `T`
plus a small set of declared deps.

**When it's the wrong hammer.** Rows whose chrome depends on focus,
drag, selection, or hover state that you aren't capturing in deps.
Memo silently renders stale content when an external state change isn't
declared as a dep. The analyzer below catches the obvious cases, but
indirect captures through helper methods aren't visible to it.

**Three overloads:**

| Overload | Use when |
|----------|----------|
| `UseMemoCells<T>` | Per-item value equality. Default choice. |
| `UseMemoCellsByKey<T, TKey>` | Items have stable identity but mutable interior (`record Person(int Id, string Name)`). Hashes by key, value-compares for content. Reordered keys reuse cells via the reconciler's keyed-children path. |
| `UseMemoCellsByIndex<T>` | Data source already knows which indices changed. Skips the per-cell equality scan; only the named indices run the builder. |

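The keyed row in the table can be sketched the same way. The following is a hedged illustration of "hash by key, value-compare for content", assuming a hypothetical `KeyedCellMemo` class; it is not the `UseMemoCellsByKey` implementation, it omits deps handling for brevity, and `PersonCell` is an invented cell type. The `Person` record is the one named in the table.

```csharp
using System;
using System.Collections.Generic;

public sealed record Person(int Id, string Name);

// Hypothetical cell type for the example.
public sealed record PersonCell(string Text);

// Hypothetical keyed memo; NOT Reactor's UseMemoCellsByKey.
public sealed class KeyedCellMemo<T, TKey, TCell>
    where TKey : notnull
    where TCell : class
{
    private Dictionary<TKey, (T Item, TCell Cell)> _prev = new();

    // Counts how often the builder actually ran.
    public int BuilderCalls { get; private set; }

    public List<TCell> Render(
        IReadOnlyList<T> items, Func<T, TKey> keyOf, Func<T, TCell> build)
    {
        var next = new Dictionary<TKey, (T Item, TCell Cell)>(items.Count);
        var cells = new List<TCell>(items.Count);
        var comparer = EqualityComparer<T>.Default;

        foreach (var item in items)
        {
            var key = keyOf(item);
            TCell cell;

            // Look up by key, then confirm the content is unchanged.
            if (_prev.TryGetValue(key, out var hit) && comparer.Equals(hit.Item, item))
            {
                cell = hit.Cell; // reused even if the item moved position
            }
            else
            {
                BuilderCalls++;
                cell = build(item);
            }

            next[key] = (item, cell);
            cells.Add(cell);
        }

        _prev = next; // keys absent from this render are dropped
        return cells;
    }
}
```

With this sketch, rendering Ann (Id 1) and Bob (Id 2), then rendering them in reverse order with Bob renamed, rebuilds only Bob's cell. Ann's cell is the same reference at its new position, because the lookup follows the key and not the index.
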
**gen2 caveat.** Memo trades short-lived gen0 churn for longer-lived
gen1/gen2 retention. Many memoized lists across an app can compound
gen2 pressure even when bytes-per-tick drops. Profile before adopting
across the board.

**Compile-time safety net.** The companion Roslyn analyzer
`REACTOR_HOOKS_007` warns when a builder closure captures a value that
isn't declared in the deps list. The code fix is "add the missing capture
to deps". Indirect captures through intermediate methods are a documented
blind spot: the analyzer can't see through a method call without
whole-program analysis (the same blind spot as React's
`react-hooks/exhaustive-deps`).

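The indirect-capture blind spot is easiest to see in a small, framework-free reproduction. The sketch below assumes a hypothetical one-slot memo (`OneSlotMemo`) and a helper `Decorate` that reads a static field; none of these names come from Reactor. The lambda never mentions `_suffix` directly, so a capture-based check on the lambda body has nothing to flag, yet the memoized result goes stale when `_suffix` changes.

```csharp
#nullable enable
using System;

// Hypothetical one-slot memo: rebuilds only when `deps` changes.
public sealed class OneSlotMemo
{
    private object? _deps;
    private string? _value;

    public string Get(Func<string> build, object deps)
    {
        if (_value is null || !Equals(_deps, deps))
        {
            _value = build();
            _deps = deps;
        }
        return _value;
    }
}

public static class CaptureDemo
{
    private static string _suffix = "!";

    // Reads _suffix; callers that only pass `text` hide that dependency.
    private static string Decorate(string text) => text + _suffix;

    // deps lists only `symbol`, so the change to _suffix is missed.
    public static (string Before, string After) IndirectCapture()
    {
        _suffix = "!";
        var memo = new OneSlotMemo();
        var symbol = "AAPL";
        string before = memo.Get(() => Decorate(symbol), deps: symbol);
        _suffix = "?";
        string after = memo.Get(() => Decorate(symbol), deps: symbol);
        return (before, after); // ("AAPL!", "AAPL!"): stale
    }

    // Declaring the hidden input as a dep restores correctness.
    public static (string Before, string After) DeclaredDependency()
    {
        _suffix = "!";
        var memo = new OneSlotMemo();
        var symbol = "AAPL";
        string before = memo.Get(() => Decorate(symbol), deps: (symbol, _suffix));
        _suffix = "?";
        string after = memo.Get(() => Decorate(symbol), deps: (symbol, _suffix));
        return (before, after); // ("AAPL!", "AAPL?")
    }
}
```

The practical takeaway is the same as for the React rule: when a builder calls a helper, treat every piece of state the helper reads as a dependency of the builder and list it by hand.
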
## Tips

**Wrap third-party components in `ErrorBoundary`.** If a plugin or external
