Commit 3807774
spec(047): Phase 0 deliverables — audits, perf scaffolding, ARM64 baseline (#411)
* spec(047): Phase 0 deliverables — audits, perf scaffolding, ARM64 baseline
Lands the seven Phase 0 deliverables from spec §14 so 047 can clear its
exit gate. Audit results, perf-suite infrastructure, ARM64-native
baseline measurements, decision criteria, and the factoring
recommendation are all under docs/specs/047/.
Audits (0.1, 0.2, 0.5):
- BeginSuppress audit — 24 call sites classified (14 eliminable, 8
tolerance-shaped, 1 ColorPicker §8.1 edge case, 1 redundant). CSV +
one-page summary; spec §8 footnotes the audit.
- EventHandlerState field audit — 51 fields classified (42 routed-input,
9 control-intrinsic across 7 controls). Per-control struct sketches
for §9.2.
- Existing-API surface inventory — promote-vs-stay-internal table for
Phase 1's first task. 8 in-tree RegisterType call sites enumerated.
Perf suite (0.3):
- StressPerf.ReactorV2 + BlankReactorV2: verbatim copies of the
Reactor variants at Phase-0 freeze, built and shipping ARM64-native
retail. V2 numbers ≈ Today numbers at the freeze, so Phase 1+ work
shows up as the delta between them.
- PerfBench.ControlModel — new WinUI host project implementing M1–M13
across Direct/ReactorToday/ReactorV2 variants. CLI for selective
runs + a --demo mode that captures one screenshot per (bench,
variant) via win32 PrintWindow.
- tools/spec047-aggregator — JSON-Lines → §15.6 (a)/(b)/(c) markdown
tables + trend.csv. Rejects rows missing required environment
metadata.
- perf-suite-runbook.md — operator-side environment-isolation rules
(foreground, AC, DRR off, no session switches, High Performance
power plan, warm-up policy). Cross-references the prior stress_perf
memory entries.
Baseline (0.4):
- ARM64-native retail M1–M13 captured on LAPTOP-4MEP83VI (Snapdragon X).
195 result rows, 0 excluded. Aggregator output committed under
baseline-results/LAPTOP-4MEP83VI/2026-05-25-arm64/aggregator-out/.
- 39 PrintWindow screenshots — one per (bench, variant) — confirm each
scenario actually exercises WinUI rendering (M13 visually shows the
§8.2 bug: ToggleSwitch ends up On after Set, callback fires).
- Reference x64-emulated capture preserved under .../2026-05-25/ as a
negative control; ARM-on-ARM is ~8–17× faster on the same silicon
for mount/dispatch-dominated tests.
- Spec §11.6 target table rewritten against measured M1–M3 with
Target = min(Direct + 100, ReactorToday × 0.4). Spec §12 opening
footnotes the Phase-0 anchor.
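A minimal sketch of the §11.6 target rule quoted above, written in Python purely for illustration. The function name and the sample inputs are hypothetical; the commit message does not state the unit of the "+ 100" term, so the sketch assumes Direct and ReactorToday are expressed in the same per-op unit as that constant.

```python
def phase0_target(direct: float, reactor_today: float) -> float:
    """Target = min(Direct + 100, ReactorToday x 0.4), per the commit message.

    Both inputs are assumed to share one per-op unit; the unit itself is
    not stated in the source.
    """
    return min(direct + 100, reactor_today * 0.4)


if __name__ == "__main__":
    # Illustrative inputs only; these are NOT measured baseline values.
    for direct, today in [(1_000.0, 5_000.0), (1_000.0, 2_000.0)]:
        print(direct, today, phase0_target(direct, today))
```

When ReactorToday is already close to Direct, the second term wins and the target can land below Direct + 100, which is the stricter of the two bounds by construction.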
Decision criteria + factoring (0.6, 0.7):
- decision-criteria.md ratifies Q1/Q3/Q6/Q7/Q11/Q17/Q18/Q19 with
thresholds keyed to the audit data and the M-bench gates.
- factoring-recommendation.md: keep 047 unified; the only carve-out
is a standalone §8.2 setter-suppression fix (small, ahead of Phase 1).
Macros (0.3.4):
- L1 ships three-way (BlankWinUI3 + BlankReactor + BlankReactorV2);
run_startup_bench.ps1 enumerates the V2 variant. L2/L3 scenario
contracts frozen in macro-suite-status.md; binary implementations and
L4/L5/L7–L9/L11 deferred to Phase 1 with explicit rationale.
This commit clears the Phase 0 exit gate (spec §14): all seven
deliverables complete, baseline numbers committed, §11/§12 updated with
measured numbers, factoring reviewed and ratified.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* test(perf_bench): modernize AttachConsole interop, drop BenchCliHolder
Use LibraryImport instead of DllImport for source-generated marshalling, and
read args from Environment.GetCommandLineArgs() in OnLaunched instead of
stashing them in a mutable static. Addresses two of the github-code-quality
comments on PR #411; the rest are rejected with rationale on the PR.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* spec(047): address PR #411 review — bench correctness, aggregator env-mismatch, doc accuracy
- M1/M2/M3/M4/M6/M10 RunOne now calls Reconciler.UnmountChild(ui) after
removing the control from Parent.Children. Without unmount, 10k iterations
per repetition accumulated tags / subscriptions / event trampolines —
particularly distorting M10 (EventHandlerState_Alloc). Baseline re-capture
to follow this commit.
- spec047-aggregator groups rows by (BenchId, Variant, Architecture). Spec
§15.5 requires non-comparable runs to be flagged; the previous (Bench,
Variant) key silently merged ARM64-native + x64-emulated rows that share
a MachineSku. Each output table now emits one row per (bench, arch).
- summary.md regen command no longer uses ** (which ExpandGlobs treats as
a literal directory); single * with AllDirectories recurses correctly.
- perf-suite-runbook downgrades unimplemented "Harness assertion" claims to
🟡 planned-for-Phase-1; clarifies that PerfBench.ControlModel uses a
custom BenchRunner rather than BenchmarkDotNet at Phase 0. Run-metadata
schema table reflects which fields are stamped today.
- WindowCapture.cs drops unused GetClientRect P/Invoke.
- Implementation-tasks checkboxes for 0.3.1/0.3.2/0.3.3/0.3.5 reconciled
with what's shipped; Phase 0 exit checklist rolled up.
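The aggregator behaviour described in this commit (reject rows missing environment metadata, then group by (BenchId, Variant, Architecture) rather than (Bench, Variant)) can be sketched roughly as follows. This is a Python illustration, not the actual spec047-aggregator code: the required-field list, the NsPerOp field name, the sample rows, and the choice of median as the summary statistic are all assumptions made for the example.

```python
import json
from collections import defaultdict
from statistics import median

# Assumed required fields; the real schema lives in the runbook, not here.
REQUIRED = ("BenchId", "Variant", "Architecture", "MachineSku", "NsPerOp")


def aggregate(jsonl_text: str):
    """Return ({(bench, variant, arch): median ns/op}, rejected_row_count)."""
    groups = defaultdict(list)
    rejected = 0
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue  # tolerate blank lines in the JSON-Lines input
        row = json.loads(line)
        # Reject rows that lack any required field (missing, null, or empty).
        if any(row.get(k) in (None, "") for k in REQUIRED):
            rejected += 1
            continue
        # Architecture is part of the key, so ARM64-native and x64 rows that
        # share a MachineSku are never merged into one table row.
        key = (row["BenchId"], row["Variant"], row["Architecture"])
        groups[key].append(row["NsPerOp"])
    return {k: median(v) for k, v in groups.items()}, rejected


SAMPLE = "\n".join([
    '{"BenchId":"M1","Variant":"Direct","Architecture":"Arm64","MachineSku":"sku-a","NsPerOp":100}',
    '{"BenchId":"M1","Variant":"Direct","Architecture":"X64","MachineSku":"sku-a","NsPerOp":900}',
    '{"BenchId":"M1","Variant":"Direct","Architecture":"Arm64","MachineSku":"sku-a","NsPerOp":120}',
    '{"BenchId":"M1","Variant":"Direct","NsPerOp":50}',
])

if __name__ == "__main__":
    table, rejected = aggregate(SAMPLE)
    for key, value in sorted(table.items()):
        print(key, value)
    print("rejected:", rejected)
```

With a (Bench, Variant) key the three valid sample rows would collapse into one group; keying on Architecture as well keeps them in two.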
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* spec(047): re-capture M1-M13 ARM64 baseline after bench correctness fix
Numbers shift moderately on the benches that the unmount fix touches:
- M3 (Button + 3 callbacks) per-op timing drops from 351k→210k ns on
ReactorToday — without unmount we were measuring "mount into a tree
already polluted by 5000 prior leaked controls", which inflated the
per-iteration cost.
- M10 per-op alloc unchanged within rep-to-rep variance; the timing
now pays the unmount cost (a debit) and no longer pays the leaked-tree
cost (a credit).
- M7 / M9 timings drift on the Direct path (which doesn't touch
Reactor); attributed to run-to-run variance, since alloc bytes are
bit-identical across the two captures, confirming the Direct workload
itself didn't change.
Three batches re-run on LAPTOP-4MEP83VI ARM64-native retail Release:
M1-M8 @ 5000 iters × 5 reps, M9 @ 2000 × 5, M10-M13 @ 1000 × 5.
195 rows total, 0 excluded.
Summary takeaways table refreshed; spec §11.1 / §11.6 refresh deferred to
Phase 1 alongside the workstation x64 capture.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* spec(047): workstation x64 baseline — M1-M13 capture on CPC-ander-YTZ3O
Second machine for Phase 0 §14 deliverable 4. The ARM64 baseline already
landed on LAPTOP-4MEP83VI; this closes the gate's two-machine
requirement.
195 rows ingested, 0 excluded, all rows stamped Architecture=X64.
M13 OnIsOnChangedFireCount = 1 on both ReactorToday and ReactorV2
— §8.2 bug reproduces independent of architecture. M7 Reactor diff
short-circuit holds at ~50x faster than naive direct on x64
(75x on ARM64). Per-op alloc bytes match ARM64 within rounding on
every Mn, as expected (alloc is the architecture-independent axis).
Absolute ns figures on this Windows 365 Cloud PC (shared EPYC vCPU
slice) run 1.6-3.4x slower than the Snapdragon X laptop —
counterintuitive for "workstation x64" but expected for a Cloud PC
host. Both rows are valid datapoints; the §15.6 emitter keeps them
in separate (BenchId, Variant, Architecture) rows per spec §15.5,
so the comparison stays apples-to-apples.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* spec(047): ratify Phase 0 decision criteria — Q9/Q12/Q14 added, Q17 revised
User review of decision-criteria.md drove three additions and one revision:
- Q17 (registry precedence) revised — no override mechanism in v1. Throw
on any duplicate registration, including downstream-registering-for-
built-in. The original spec §13 Q17 "downstream wins with diagnostic"
recommendation is rejected; the RegisterOverride verb is deferred to a
future release where it can be added purely additively.
- Q9 (override semantics) added as a direct consequence of the revised
Q17 — no override verb in v1; testing satisfied by composing the
Reconciler from scratch in setup or test-only subclass wrappers.
- Q12 (Update return type) ratified as void Update(...) forbidding
substitution. Type changes flow through the existing unmount-and-
remount path. Substitution-mid-update is a bug farm whose primary
consumers (parent collection, modifier reapply, ItemsControl
realized-container caches) all benefit from the strict shape.
- Q14 (concurrency model) ratified as UI-thread-only. The MountContext
surface documents this explicitly; off-thread mount with
ThreadAffinity = Any is deferred until a real consumer surfaces.
All three additions can be relaxed later without breaking callers: a future
RegisterOverride verb is purely additive, void Update can widen to
UIElement? Update without breaking existing callers, and a future
ThreadAffinity option can default to UIThread.
Q11, Q18, Q19 ratified as already-recommended; no text changes.
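The ratified Q17/Q9 shape (throw on any duplicate registration, no override verb, tests compose their own registry from scratch) can be illustrated with a small Python sketch. The class and method names below are hypothetical stand-ins, not the project's API; only the throw-on-duplicate rule comes from the text above.

```python
class DuplicateRegistrationError(Exception):
    """Raised when a type is registered twice (hypothetical name)."""


class ControlRegistry:
    """Toy registry: one handler per element type, no override mechanism."""

    def __init__(self):
        self._handlers = {}

    def register_type(self, element_type: str, handler) -> None:
        # Any duplicate throws, including a downstream registration
        # for a type that a built-in handler already covers.
        if element_type in self._handlers:
            raise DuplicateRegistrationError(
                f"{element_type!r} is already registered"
            )
        self._handlers[element_type] = handler

    def resolve(self, element_type: str):
        return self._handlers[element_type]


def make_default_registry() -> ControlRegistry:
    registry = ControlRegistry()
    registry.register_type("Button", lambda: "built-in button")
    return registry


if __name__ == "__main__":
    registry = make_default_registry()
    try:
        registry.register_type("Button", lambda: "downstream button")
    except DuplicateRegistrationError as exc:
        print("rejected:", exc)

    # Test setup composes a fresh registry instead of overriding an entry.
    test_registry = ControlRegistry()
    test_registry.register_type("Button", lambda: "test double")
    print(test_registry.resolve("Button")())
```

Adding an explicit override method later would not change the behaviour of any existing register_type call, which is the sense in which the deferral is additive.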
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 0f739c3 commit 3807774
92 files changed
Lines changed: 5591 additions & 7 deletions
File tree
- docs/specs
  - 047
    - audits
    - baseline-results
      - CPC-ander-YTZ3O/2026-05-25-x64
        - aggregator-out
      - LAPTOP-4MEP83VI
        - 2026-05-25-arm64
          - aggregator-out
          - screenshots
        - 2026-05-25
          - aggregator-out
    - tasks
- tests
  - perf_bench/PerfBench.ControlModel
    - Benches
    - Variants
  - startup_perf
    - BlankReactorV2
  - stress_perf/StressPerf.ReactorV2
- tools/spec047-aggregator