Skip to content

fix(virtualization): proper ItemsRepeater recycling — stop leaking every realized container#324

Merged
codemonkeychris merged 3 commits into
mainfrom
fix/itemsrepeater-recycling-leak
May 18, 2026
Merged

fix(virtualization): proper ItemsRepeater recycling — stop leaking every realized container#324
codemonkeychris merged 3 commits into
mainfrom
fix/itemsrepeater-recycling-leak

Conversation

@codemonkeychris
Copy link
Copy Markdown
Collaborator

@codemonkeychris codemonkeychris commented May 18, 2026

Summary

  • ElementFactory<T> now reuses args.Element from RecycleElement via a recycle stack + Reconciler.Reconcile instead of always Mounting fresh, matching the WinUI IElementFactory contract.
  • FlexPanel no longer subscribes to XamlRoot.Changed per instance — DPI scale is read lazily inside MeasureOverride, which removes the leak path that Loaded/Unloaded couldn't close because those events don't fire for ItemsRepeater-recycled containers.
  • Three new SelfTest fixtures pin the IElementFactory recycling contract so this regression can't sneak back in.

Root cause

Reading microsoft-ui-xaml-lift:

  • ViewManager.cpp:865-869 — at realize, the framework does children.Append(element) only when GetParent(element) != *repeater. Recycled elements stay parented to the ItemsRepeater.
  • ViewManager.cpp:153-194ClearElementToElementFactory calls our RecycleElement but does NOT remove from repeater.Children. The remove branch runs only when there is no ItemTemplate.
  • ItemTemplateWrapper.cpp:120-136 — the built-in RecyclingElementFactory cycles the same UIElements via RecyclePool; the framework's same-children-set design depends on the factory re-yielding its elements.
  • ItemsRepeater.cpp:653-655 comment: "ownership of the items is with Repeater today because the wrapper is doing the recycling as opposed to the DataTemplate."

Our factory's "build fresh on every realize, drop reference on recycle" pattern produced one orphan in Repeater.Children per realize. At 2,750 scroll cycles the fixed-height demo had ~2,750 UIElements parented forever; managed heap climbed 20→70 MB and working set climbed 247→448 MB.

The variable-height demo additionally crashed with stowed exception 0xC000027B because the orphan pile-up interacted with RefreshRealizedItems_listState.ByKey[key].Index resolved a stale Element to a now-foreign realized container, and a mismatched Reconcile walked a structurally-divergent tree (rows where i % 7 == 0 had an extra TextBlock vs. neighbors) on the XAML thread.

Fix details

ElementFactory<T> (commit f719976)

  • New Stack<UIElement> _recyclePool and Dictionary<UIElement, Element> _lastElementByControl.
  • RecycleElement(args): drop _mountedElements / _keyByControl tracking for the key, don't call UnmountChild, push to _recyclePool.
  • GetElement(args): if pool non-empty, pop a control, look up its prior Element, call Reconciler.Reconcile(oldElement, newElement, control, ...), and reuse the control. Only Mount fresh when the pool is empty.
  • Removed PoolInteractiveLeaves / IsPoolableInteractive — the per-leaf pooling path conflicted with whole-row reuse.

Reconciler reverse-lookup map (_keyByControl)

Supports O(1) cleanup in RecycleElement and prevents RefreshRealizedItems from running Reconcile against a stale _mountedElements entry whose key→index resolved to a different realized row. This was the proximate cause of the variable-height crash.

FlexPanel

Replaced OnLoaded + XamlRoot.Changed += OnXamlRootChanged with SyncPointScaleLazy() called at the top of MeasureOverride. Same DPI-tracking behavior, zero subscriptions. WinUI does not reliably fire Loaded/Unloaded for ItemsRepeater-recycled containers (observed: 12 unloads for 2,774 constructions), so the prior subscribe path leaked every FlexPanel through XamlRoot's multicast delegate list.

Tests (commit 5c33ddc)

Three SelfTest fixtures in ElementFactoryRecyclingFixtures.cs exercise the IElementFactory contract on a standalone ElementFactory<T> (no live LazyVStack, so framework-driven realizes don't collide with the test's cycles):

Fixture What it pins Pre-fix value
EFR_Factory_BoundedDistinctControls_AcrossManyRealizeCycles 100 realize+recycle cycles return ≤ 5 distinct UIElement instances 100
EFR_Factory_RecycledControlIsReusedOnNextRealize RecycleElement(X) followed by GetElement(...) returns the same X by reference not equal
EFR_Factory_BookkeepingBoundedAcrossCycles _mountedElements / _keyByControl / _recyclePool / _lastElementByControl counts stay bounded at ≤1 after 50 cycles 50 / 50 / 0 / 0

The fixtures use internal Debug*Count accessors on ElementFactory<T> (gated by the existing InternalsVisibleTo on Reactor.AppTests.Host).

Holistic testing-gap analysis

This bug class — "WinRT/UIElement retained by some framework or static path that our code can't easily inspect" — is hard to catch with traditional assertions because:

  1. GC ordering is non-deterministic for CCWs. Naïve WeakReference checks misfire.
  2. Loaded/Unloaded are not a reliable lifecycle signal inside ItemsRepeater (we now know).
  3. Process-level memory metrics are noisy without forced Gen2 + finalizers, and even then native CCW retention skews working-set numbers more than managed heap.

Other Reactor code paths with similar exposure (followups, not in this PR):

  • Reconciler._interactionTrackersstatic Dictionary<UIElement, ...>, leaks if Unmount doesn't fire (same trap we just hit).
  • Reconciler._keyframeTriggerValues — same shape.
  • Any _pending* collections that capture a UIElement in a deferred lambda (e.g., _pendingConnectedAnimationStarts, ApplyMoveAnimations dispatcher-enqueued closure).
  • Any Window-rooted event subscription (CompositionTarget.Rendering, XamlRoot.Changed).

A reusable test scaffold for future leak coverage would look like the EFR fixtures here: build a small isolated factory or component tree, drive the suspect path N times directly, then assert the internal collection counts are bounded. Two reasons that pattern works where naïve managed-heap profiling fails: (a) you assert against deterministic internal state, not GC outcomes; (b) you isolate from the framework's own activity so deltas mean what you think they mean. I'd recommend adding similar coverage when we touch the other suspect paths above.

Empirical verification (from the original repro)

Scrolling DataSystemDemo → VirtualList (Fixed Height, 100k items, 14,506 realize/recycle cycles), with instrumented Gen2+finalizer GC at each readout:

Metric Before After
FlexPanel constructions 2,774 (1:1 w/ realizes) 50 (flat)
Managed heap 20 → 70 MB 17 MB (flat)
Working set 247 → 470 MB 238 MB (flat)

Variable-height crash (stowed exception 0xC000027B) also no longer reproduces.

Test plan

  • dotnet test tests/Reactor.Tests — 7,648 passed / 0 failed
  • dotnet test tests/Reactor.SelfTests — 717 passed / 0 failed (+3 new EFR fixtures)
  • Manual: DataSystemDemo → VirtualList (Fixed Height) → scroll 14k+ cycles, memory flat
  • Manual: DataSystemDemo → VirtualList (Variable Height) → scroll heavily, no crash

🤖 Generated with Claude Code

…ery realized container

ElementFactory<T> never reused args.Element from RecycleElement — every
GetElement built a fresh Reactor Element tree and Mounted a fresh WinUI
control. Per microsoft-ui-xaml-lift `ViewManager.cpp:865-869` and
`ItemTemplateWrapper.cpp:120-136`, ItemsRepeater keeps every realized
UIElement parented to itself forever and expects the IElementFactory to
cycle them through GetElement/RecycleElement. The "fresh build every
time" pattern produced one orphan in Repeater.Children per realize.

Empirical: scrolling DataSystemDemo > VirtualList (Fixed Height, 100k
items) leaked ~75 KB working set per realize and unbounded managed
heap. Variable-height demo crashed with stowed exception 0xC000027B
once the orphan count interacted with structural divergence between
expanded and collapsed rows in RefreshRealizedItems.

Fix:
- ElementFactory.RecycleElement pushes args.Element onto a recycle
  stack and drops _mountedElements/_keyByControl tracking. It does NOT
  call UnmountChild — the WinUI tree stays alive for reuse.
- ElementFactory.GetElement pops from the recycle stack first and calls
  Reconciler.Reconcile(oldElement, newElement, control, ...) to diff
  the existing tree against the new content. Allocates fresh only when
  the pool is empty.
- _lastElementByControl tracks the last-bound Element per control so
  reuse has an oldElement to diff against.
- _keyByControl reverse-lookup map (also new) lets RecycleElement drop
  the _mountedElements key entry in O(1) — without it, stale entries
  drove Reconcile against mismatched realized children when
  RefreshRealizedItems ran during a subsequent update, which is what
  was triggering the variable-height demo's 0xC000027B crash.

FlexPanel: stop subscribing to XamlRoot.Changed per instance.
Loaded/Unloaded do not fire reliably for ItemsRepeater-recycled
containers (in this repro Unloaded fired for 12 of 2,774 containers),
so the subscribe in OnLoaded leaked every FlexPanel through XamlRoot's
multicast delegate list. Replaced with SyncPointScaleLazy() called
from MeasureOverride — reads XamlRoot.RasterizationScale directly and
updates YogaConfig.PointScaleFactor when it changes. No subscription,
same DPI-tracking behavior.

Result: at 14,506 realize/recycle cycles, FlexPanel constructor count
stays at 50 (the working set), managedMB flat at 17, workingSetMB flat
at 238. Previously these climbed 1:1 with realize count.

Tests: Reactor.Tests 7648 passed / Reactor.SelfTests 714 passed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes virtualized list recycling and removes a FlexPanel DPI event-subscription leak path.

Changes:

  • Adds root-container recycling in ElementFactory<T> using a recycle stack and Reconciler.Reconcile.
  • Adds reverse UIElement-to-key tracking for cleanup of realized item state.
  • Changes FlexPanel DPI scale sync to lazy reads during measure instead of XamlRoot.Changed subscriptions.
Show a summary per file
File Description
src/Reactor/Core/ElementFactory.cs Implements ItemsRepeater recycle-pool reuse and reverse tracking maps.
src/Reactor/Yoga/FlexPanel.cs Removes XamlRoot event subscription and syncs rasterization scale during measure.

Copilot's findings

Tip

Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comments suppressed due to low confidence (1)

src/Reactor/Core/ElementFactory.cs:200

  • If the new item renders EmptyElement, Reconcile returns null after unmounting the recycled control. The replacement ?? reused fallback then returns the old recycled UIElement anyway, so an empty row can display stale content and gets tracked as if it represented the empty element.
                var replacement = _reconciler.Reconcile(oldElement, element, reused, _requestRerender);
                control = replacement ?? reused;
            }
  • Files reviewed: 2/2 changed files
  • Comments generated: 4

Comment thread src/Reactor/Core/ElementFactory.cs Outdated
Comment thread src/Reactor/Core/ElementFactory.cs
Comment thread src/Reactor/Core/ElementFactory.cs Outdated
Comment thread src/Reactor/Core/ElementFactory.cs
codemonkeychris and others added 2 commits May 17, 2026 18:22
Adds three SelfTest fixtures that exercise ElementFactory<T> directly
through its IElementFactory interface, capturing the invariants that
the prior leak violated:

- EFR_Factory_BoundedDistinctControls_AcrossManyRealizeCycles:
  100 realize-then-recycle cycles return ≤5 distinct UIElement instances
  (pre-fix: 100, one per cycle). The headline regression gate.

- EFR_Factory_RecycledControlIsReusedOnNextRealize:
  The control handed back via RecycleElement IS the control returned by
  the next GetElement (reference equality). When the recycle pool is
  empty, the next GetElement Mints fresh — different control.

- EFR_Factory_BookkeepingBoundedAcrossCycles:
  After 50 realize+recycle pairs, internal dicts (_mountedElements,
  _keyByControl, _recyclePool, _lastElementByControl) stay bounded at
  ≤1 each. Pre-fix _mountedElements / _keyByControl would have grown
  to 50. Catches a second class of bug where the control gets reused
  but bookkeeping entries accumulate — the exact corruption that drove
  the variable-height demo's 0xC000027B crash via stale entries in
  RefreshRealizedItems.

Fixtures drive a standalone ElementFactory<T> + Reconciler (no real
LazyVStack / ItemsRepeater) so the IElementFactory contract is the only
thing exercised. Standalone isolation matters: when an earlier draft
shared the same data items with a live LazyVStack, the framework's
pre-realized window + our cycles collided on string keys and corrupted
state — informative but not what the test intends to measure.

ElementFactory<T> exposes four `Debug*Count` accessors (gated by the
existing InternalsVisibleTo on Reactor.AppTests.Host) so the
bookkeeping fixture can read dict counts without reflection.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three edge cases Copilot's review caught in the recycle path, plus
SelfTest fixtures that pin each one as a regression gate. The original
fix is solid for homogeneous-row demos like DataSystem; these are real
defects only the broader fleet (heterogeneous rows, re-renders during
scroll, navigation away from a LazyStack) would exercise.

1. ElementFactory.GetElement, heterogeneous-row replacement
   When Reconcile returns a non-null replacement (root element type
   changed mid-cycle), the popped `reused` control is still parented
   to the ItemsRepeater — re-introducing the very orphan-in-Children
   leak the recycle pool was meant to fix. Now we DetachFromParent
   the orphan and drop its _lastElementByControl tracking before
   returning the replacement. New fixture:
   EFR_Factory_ReplacementOnRootTypeChange_DropsOldControlTracking.

2. ElementFactory.RefreshRealizedItems, stale _lastElementByControl
   The refresh path updated _mountedElements[key] with the new
   Element but left _lastElementByControl[control] pointing at the
   pre-refresh Element. A later recycle-then-reuse of the same
   control then fed Reconcile a stale oldElement and diffed against
   the wrong tree shape (potentially skipping updates or walking the
   wrong children). Refresh now writes both dicts together. New
   fixture: EFR_Factory_RefreshRealizedItems_SyncsLastElementByControl.

3. Reconciler.UnmountRecursive, ItemsRepeater children
   Verified via microsoft-ui-xaml-lift/.../ItemsRepeater.idl:154 —
   ItemsRepeater projects to C# as a FrameworkElement, not Panel,
   even though its C++ implementation uses DeriveFromPanelHelper.
   So the existing Panel branch in UnmountRecursive did NOT descend
   into repeater children, leaving every row Component's UseEffect
   cleanups unrun when the LazyStack itself unmounted (navigation,
   conditional render). Long-lived timers / subscriptions / async
   loops inside row Components would leak forever post-navigation.
   Added an ItemsRepeater branch that walks via VisualTreeHelper
   (the public surface doesn't expose Children directly). New
   fixture: EFR_LazyStack_Unmount_CleansUpAllRecycledRowComponents.

Not addressed in this PR (design-scope, documented for follow-up):
   Per-row Component identity. LazyStack's user keySelector is not
   propagated to row Element.Key, so reusing a control across two
   logical items preserves any inner Component's UseState/UseEffect
   state. This is the same behavior as WinUI RecyclingElementFactory
   and is the documented expectation, but Reactor should give users
   a clean way to opt into per-item state reset. Tracking separately.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@codemonkeychris
Copy link
Copy Markdown
Collaborator Author

Addressed Copilot's review (commit `cd8d244`). Three of the four findings were real defects; the fourth is a design-scope question I've tracked for a follow-up.

# Finding Status
1 Reconcile returns a non-null replacement → popped `reused` becomes an orphan in `Repeater.Children` (heterogeneous rows) Fixed: `DetachFromParent` + drop `_lastElementByControl` for the orphan
2 `RefreshRealizedItems` updates `_mountedElements` but leaves `_lastElementByControl` stale → next reuse diffs the wrong tree Fixed: refresh now updates both dicts
3 Recycle preserves Component `UseState`/`UseEffect` across logical items because keySelector isn't propagated to row Element.Key Not in this PR. Matches WinUI `RecyclingElementFactory`'s documented behavior. Reactor should give users a clean opt-in to per-item state reset — tracking as separate work since it's a design question, not a regression fix
4 `RecycleElement` skips `UnmountChild` and `Reconciler.Unmount` doesn't descend into ItemsRepeater children → row Components' cleanups never run on LazyStack unmount, leaking timers/subscriptions Fixed: added ItemsRepeater branch to `UnmountRecursive` (walks via VisualTreeHelper since the C# projection of ItemsRepeater is FrameworkElement, not Panel — confirmed in `microsoft-ui-xaml-lift/.../ItemsRepeater.idl:154`)

Three new SelfTest fixtures (one per fix) pin each as a regression gate:

  • `EFR_Factory_ReplacementOnRootTypeChange_DropsOldControlTracking`
  • `EFR_Factory_RefreshRealizedItems_SyncsLastElementByControl`
  • `EFR_LazyStack_Unmount_CleansUpAllRecycledRowComponents` — drives a counter via `UseEffect(() => () => count++)` on Components inside rows, then unmounts the LazyStack and asserts cleanups fired

Tests: 7,648 unit / 720 self (was 717, +3 new) all pass.

@codemonkeychris
Copy link
Copy Markdown
Collaborator Author

Filed #326 to track the per-row Component state design question (finding #3 from the Copilot review).

@codemonkeychris codemonkeychris merged commit 599360b into main May 18, 2026
8 checks passed
@codemonkeychris codemonkeychris deleted the fix/itemsrepeater-recycling-leak branch May 18, 2026 02:59
codemonkeychris added a commit that referenced this pull request May 29, 2026
…450)

- Point the two docs/guide links at the published GitHub Pages site
  (getting-started deep link + the Guide table row) instead of in-repo
  Markdown paths, so readers land on the rendered docset.
- Expand the Fundamentals "Performance" bullet with three more concrete,
  shipped optimizations: keyed-list longest-increasing-subsequence
  reconciliation (#322), virtualized-list container recycling (#324),
  and the per-render allocation-reduction pass (#126 / spec 034).

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants