Commit 9780aca

spec(047): Phase 3 prereqs — HandCoded* builders + TextBox descriptor proof
Ships the §14 Phase 3 prerequisites that unblock bulk control migration:

3.0.1: HandCodedControlled / HandCodedEvent escape-hatch builders on ControlDescriptor. Method-level TPayload generic threads a user-supplied per-control event payload (e.g. TextBoxEventPayload) through subscription without disturbing the existing single-event fast-path DescriptorControlledPayload. Native TDelegate generic lets authors pass the control's natural trampoline type (TextChangedEventHandler, RoutedEventHandler) directly, with no EventHandler<TArgs> bridge closure.

3.0.2: TextBoxDescriptor as the 2-event proof point. Text via HandCodedControlled (TextChanged), SelectionChanged via HandCodedEvent, both sharing TextBoxEventPayload's existing slots. Trampolines stay static (zero per-mount closures); subscription is gated on the live element's callback presence and fires once per control lifetime. Documented gap: the descriptor does not request rerender from TextChanged, so controlled-mode snap-back differs from the hand-coded handler on filtered-input-returns-same-state scenarios. Acceptable for the proof point; matches the §14 thesis that the hand-coded escape hatch retains nuances the declarative path can re-engage if needed.

3.0.3: x64 advisory perf capture under docs/specs/047/phase3-results/CPC-ander-YTZ3O-x64-advisory/. ALL Q1 gating benches land in the ≤5% band on this advisory capture (M2 -2.2%, M10 +1.1% vs ReactorV2, within the §9.2.1 ±3% thesis). The README is explicit that this is NOT authoritative: a stable-AC ARM64 re-capture on LAPTOP-4MEP83VI mirroring the Phase 2 methodology should land before §14 Phase 3 is closed. The Phase 2 ARM64 verdict (judgment-call band, descriptors as primary) stands until then.

Test infrastructure:
- 3 new Desc_TextBox_* self-test fixtures (16 TAP checks) covering mount/update, 2-event subscription wiring, and the callback-gate.
- DescriptorVariantFactory now registers TextBoxDescriptor for the ReactorDescriptors bench variant alongside the Q1 trio.

Validation on this branch (x64 Cloud PC):
- src/Reactor + Reactor.AppTests.Host + PerfBench.ControlModel: 0 errors.
- All 60 V1_* + Desc_* checks pass identically under V1 ON and V1 OFF (REACTOR_USE_V1_PROTOCOL env var flipped both ways on the same exe).
- Reactor.Tests xUnit: 9086 passed / 0 failed / 62 skipped.
- Vulnerable-package gate: 0 vulnerable packages.
- Solution-wide build could not complete locally due to a Cloud PC disk space constraint (415 MB free, copy step failed on native runtime DLLs for StressPerf.{Direct,DirectX}). Reactor-scope projects all built clean; the failures are environmental, not code, and CI will exercise the full solution.

Deferred to a later session:
- 3.0.4: author onboarding doc (docs/guide/descriptor-authoring.md), per resume guidance.
- ARM64 stable-AC re-capture on LAPTOP-4MEP83VI to ratify the §9.2.1 thesis (see Phase 3 results README for capture protocol).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 64d5588 commit 9780aca

12 files changed

Lines changed: 1059 additions & 9 deletions

Lines changed: 131 additions & 0 deletions
@@ -0,0 +1,131 @@
# Spec 047 §14 Phase 3 (3.0.3): TextBox descriptor proof, x64 advisory

**This is an advisory x64 capture, NOT authoritative.** The Phase 2 Q1
verdict was ratified on `LAPTOP-4MEP83VI` (ARM64, stable AC, dedicated
hardware). This capture was run on a Cloud PC (`CPC-ander-YTZ3O`, AMD
EPYC 7763, x64) and inherits Cloud PC noise characteristics: co-tenant
load, virtualized scheduling, no AC/foreground control. **Do not cite
these numbers in §13 or §14 spec text.** Use them as a directional read
on whether the Phase 3 prereq 3.0.2 (TextBox descriptor with the new
`HandCodedControlled` + `HandCodedEvent` builders) regresses the bench
matrix. A real ARM64 stable-AC re-capture on `LAPTOP-4MEP83VI` should
land before §14 Phase 3 is closed.

## Why this advisory capture exists

Phase 3 prereq 3.0.2 ships the first descriptor port that uses
`HandCodedControlled` / `HandCodedEvent`, the escape-hatch builders
needed for multi-event controls. TextBox is the proof point (2 events:
`TextChanged` round-tripping `Text`, plus fire-only `SelectionChanged`).
The `DescriptorVariantFactory` now registers TextBoxDescriptor alongside
the Q1 head-to-head trio (ToggleSwitch / Slider / Border).

Phase 2 §13 Q1 pre-committed to the bench matrix as the validation gate
for "does the descriptor model add bounded tax." With a new descriptor
shape entering the matrix, the gate must re-run before bulk Phase 3 port
work begins. The §9.2.1 thesis is that the hand-coded-shape descriptor
(i.e. `HandCodedControlled` with a user-supplied native trampoline)
matches the hand-coded handler within ±3% on M2 / M10.

## Capture environment

`CPC-ander-YTZ3O`, x64 (AMD EPYC 7763 64-Core Processor), Release, .NET
10.0.x, Windows 11 26200. **Cloud PC, not on AC/dedicated hardware**.
3 process launches × 5 reps = 15 measurements per (bench, variant) cell.
225 rows total across `launch-1.jsonl` + `launch-2.jsonl` +
`launch-3.jsonl`.
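
The row total follows from the capture shape: 5 benches × 3 variants × 15 measurements per cell = 225, provided every row reports `status == "ok"`. The sketch below is one way to sanity-check a capture before aggregating. It is illustrative only: `expected_rows` and `check_capture_shape` are hypothetical helpers that are not part of this commit, and the rows are synthetic; the `benchId`, `variant`, and `status` field names are the ones `aggregate.py` reads.

```python
from collections import Counter

BENCHES = ["M1", "M2", "M5", "M7", "M10"]
VARIANTS = ["ReactorToday", "ReactorV2", "ReactorDescriptors"]
LAUNCHES = 3  # process launches
REPS = 5      # reps per launch


def expected_rows():
    # One row per measurement: benches x variants x launches x reps.
    return len(BENCHES) * len(VARIANTS) * LAUNCHES * REPS


def check_capture_shape(rows, per_cell=LAUNCHES * REPS):
    """Return {(benchId, variant): count} for cells whose ok-row count is off."""
    counts = Counter(
        (r["benchId"], r["variant"]) for r in rows if r.get("status") == "ok"
    )
    return {
        (b, v): counts.get((b, v), 0)
        for b in BENCHES
        for v in VARIANTS
        if counts.get((b, v), 0) != per_cell
    }


# Synthetic stand-in for the parsed launch-*.jsonl rows (hypothetical data).
rows = [
    {"benchId": b, "variant": v, "status": "ok"}
    for b in BENCHES
    for v in VARIANTS
    for _ in range(LAUNCHES * REPS)
]

print(expected_rows())            # 225
print(check_capture_shape(rows))  # {} when every cell has 15 ok rows
```

An empty result means every (bench, variant) cell has the full 15 measurements; any entry names a cell that lost rows to a failed or skipped rep.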

## Headline result

| Bench | vs ReactorV2 ns | vs ReactorToday ns | Phase 2 ARM64 (vs V2) | Band on this capture |
|---|---:|---:|---:|---|
| M1 Mount_Leaf_NoCallback | -0.9% | -2.6% | -1.0% | ≤5% |
| M2 Mount_Leaf_OneCallback | -2.2% | +2.9% | **+9.6%** | ≤5% |
| M5 Dispatch_Switch_Warm | +1.0% | +4.2% | -2.3% | ≤5% |
| M7 Update_NoChange | -1.4% | -0.5% | +8.1% | ≤5% |
| M10 EventHandlerState_Alloc | +1.1% | +9.5% | +19.3% | ≤5% |

**No bench exceeds ±5% vs ReactorV2 on this capture.** Adding the
`HandCodedControlled` + `HandCodedEvent` TextBox descriptor to the
registered set does not regress the matrix at this measurement
resolution.

The +9.6% M2 and +19.3% M10 numbers from the Phase 2 ARM64 stable-AC
capture do not reproduce on this x64 Cloud PC capture. Possible
explanations (none ratified):

1. **Architecture-dependent codegen.** Virtual `PropEntry.Mount`
   dispatch + delegate invocation cost may differ between RyuJIT-x64
   and RyuJIT-arm64 codegen paths. The Phase 2 README attributes the
   residual cost to "virtual `PropEntry.Mount` dispatch + getter/setter
   delegate invocations vs the hand-coded handler's inlined property
   writes"; that cost is JIT-implementation-sensitive.
2. **Cache hierarchy / memory bandwidth.** AMD EPYC 7763 has different
   L1/L2/L3 sizes and inclusivity vs Snapdragon X. The descriptor model
   touches more memory per mount (entry list + per-entry delegates) and
   may sit better in this cache hierarchy.
3. **Cloud PC virtualization noise.** Cloud PC runs in a virtualized
   environment with co-tenants competing for resources. The 95% CI
   half-widths in `summary.md` are wide (M2 vs V2 CI is ±15-16k ns on
   a ~190k ns mean ⇒ ±8%), so the -2.2% delta may be within noise.
4. **TextBox descriptor offsetting something.** Adding TextBoxDescriptor
   could change the registration table or method-dispatch table shape
   in a way that incidentally benefits the descriptor variant on these
   benches. (Unlikely, since benches don't mount TextBox in
   M1/M2/M5/M7/M10, but worth flagging for the ARM64 re-capture.)
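
The noise argument in item 3 can be worked through numerically. The sketch below converts the quoted CI half-widths into a percentage of the mean and also rescales them from the z = 1.96 multiplier that `aggregate.py` uses to the t ≈ 2.145 multiplier its own comment cites for n = 15. The helper names are hypothetical, the inputs are the rounded figures quoted above rather than values read from `summary.md`, and comparing a delta against a single variant's half-width is only a rough screen, not a test on the difference of means.

```python
Z_95 = 1.96         # multiplier used by aggregate.py
T_95_DOF14 = 2.145  # t critical value for n=15 (dof=14), per the aggregate.py comment


def relative_half_width(ci_half_ns, mean_ns):
    """CI half-width expressed as a percentage of the mean."""
    return ci_half_ns / mean_ns * 100.0


def rescale_to_t(ci_half_ns, z=Z_95, t=T_95_DOF14):
    """Widen a z-based half-width to the equivalent t-based half-width."""
    return ci_half_ns * t / z


mean_ns = 190_000.0  # approximate M2 mean quoted above
for ci_half in (15_000.0, 16_000.0):
    z_pct = relative_half_width(ci_half, mean_ns)
    t_pct = relative_half_width(rescale_to_t(ci_half), mean_ns)
    print(f"+/-{ci_half:,.0f} ns -> +/-{z_pct:.1f}% (z), +/-{t_pct:.1f}% (t)")

# Rough screen: is the M2 delta smaller than the narrower half-width?
delta_pct = -2.2
within_noise = abs(delta_pct) < relative_half_width(15_000.0, mean_ns)
print(within_noise)
```

With either multiplier the half-width is several times larger than the 2.2% delta, which is why this capture cannot distinguish a small regression from a small improvement on M2.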

## §9.2.1 thesis check (TextBox HandCoded shape)

The §9.2.1 thesis: a hand-coded-shape descriptor (using
`HandCodedControlled` + native trampoline) matches the hand-coded
handler within ±3% on M2 / M10. On this advisory x64 capture:

| Bench | ReactorDescriptors vs ReactorV2 | Within ±3%? |
|---|---:|---|
| M2 | **-2.2%** | ✅ yes |
| M10 | **+1.1%** | ✅ yes |

**On this capture the thesis holds.** ARM64 re-capture should confirm.

## Q1 matrix application (for completeness)

Per §13 Q1's pre-committed decision matrix, on the Q1 head-to-head
gating benches (M1 / M2 / M5 / M7):

| Band | Verdict | Triggered on this capture? |
|---|---|---|
| ≤5% on all M1/M2/M5/M7 | Ship descriptors as primary | **Yes** |
| 5-15% on any gating bench | Judgment call | No |
| >15% on any gating bench | Ship hand-coded as primary | No |

On this advisory capture the matrix lands in the "ship descriptors as
primary" band on all four gating benches. This is **more favorable**
than the Phase 2 ARM64 capture (which landed in the judgment-call band
on M2 / M7), consistent with the noise / arch explanations above.

The Phase 2 ARM64 verdict (judgment-call band, recommendation =
descriptors as primary at Phase 3 scope) stands as the authoritative
verdict. This capture does not move it.
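
The matrix can be applied mechanically by taking the worst gating-bench delta and mapping it to a band. The sketch below is a hypothetical helper, not code from this commit: `aggregate.py` assigns a band per bench, whereas this version folds the four gating benches into a single verdict. It assumes, as `aggregate.py` does, that bands are keyed off the absolute ns delta vs ReactorV2. The deltas are the values from the headline table.

```python
GATING = ("M1", "M2", "M5", "M7")  # Q1 head-to-head gating benches


def q1_verdict(deltas_pct, gating=GATING):
    """Map the worst absolute gating-bench delta (percent) to a Q1 verdict."""
    worst = max(abs(deltas_pct[b]) for b in gating)
    if worst <= 5:
        return "ship descriptors as primary"
    if worst <= 15:
        return "judgment call"
    return "ship hand-coded as primary"


# ReactorDescriptors vs ReactorV2 ns deltas, in percent.
x64_advisory = {"M1": -0.9, "M2": -2.2, "M5": 1.0, "M7": -1.4, "M10": 1.1}
phase2_arm64 = {"M1": -1.0, "M2": 9.6, "M5": -2.3, "M7": 8.1, "M10": 19.3}

print(q1_verdict(x64_advisory))  # ship descriptors as primary
print(q1_verdict(phase2_arm64))  # judgment call
```

M10 is deliberately excluded from the gating set here, matching the matrix rows above; that is why the +19.3% ARM64 M10 figure does not push the Phase 2 verdict into the hand-coded band.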

## Files

- `launch-1.jsonl` / `launch-2.jsonl` / `launch-3.jsonl`: raw bench
  output. 225 rows total.
- `aggregate.py`: copy of the Phase 2 aggregator. Run with no args
  from this directory.
- `summary.md`: aggregator output.

## Next step

A stable-AC ARM64 re-capture on `LAPTOP-4MEP83VI` mirroring the Phase 2
methodology should land before §14 Phase 3 is closed:

- Same 3×5 capture pattern (`--reps 5`, 3 process launches).
- Same benches (M1 / M2 / M5 / M7 / M10).
- Same variant set (ReactorToday / ReactorV2 / ReactorDescriptors).
- AC power, foreground window, other apps closed.
- Output to `docs/specs/047/phase3-results/LAPTOP-4MEP83VI/<date>-textbox-proof-3x5/`.

The ARM64 capture is what ratifies the §9.2.1 thesis for the multi-event
descriptor shape. This advisory capture only confirms the new code
compiles, runs, and doesn't blow up the matrix at coarse resolution.
Lines changed: 122 additions & 0 deletions
@@ -0,0 +1,122 @@
"""Spec 047 §14 Phase 2 (Q1 spike) — aggregate launch-N.jsonl into a means
+ 95% CI table per (bench, variant), and emit the Q1 decision-matrix deltas
(ReactorDescriptors vs ReactorV2, ReactorDescriptors vs ReactorToday).

Usage: python aggregate.py # reads launch-*.jsonl in CWD
"""
import glob
import json
import math
import statistics
from collections import defaultdict


def main():
    rows = []
    for path in sorted(glob.glob("launch-*.jsonl")):
        with open(path, "r", encoding="utf-8") as f:
            for line in f:
                line = line.strip()
                if not line:
                    continue
                row = json.loads(line)
                if row.get("status") != "ok":
                    continue
                rows.append(row)

    # Group by (benchId, variant).
    buckets = defaultdict(list)
    for r in rows:
        buckets[(r["benchId"], r["variant"])].append(r)

    benches = sorted({b for (b, _) in buckets}, key=_bench_key)
    variants = ["ReactorToday", "ReactorV2", "ReactorDescriptors"]

    def summarize(rs, key):
        vals = [r[key] for r in rs]
        if not vals:
            return (math.nan, math.nan, 0)
        mean = statistics.mean(vals)
        if len(vals) > 1:
            stdev = statistics.stdev(vals)
            # 95% CI half-width for a t-distribution. For n=15 dof=14, t ≈ 2.145.
            # Approximate with 1.96 for simplicity — close enough at n≥10.
            ci_half = 1.96 * stdev / math.sqrt(len(vals))
        else:
            ci_half = math.nan
        return mean, ci_half, len(vals)

    # ── Per-(bench, variant) summary table. ──
    print("# Per-(bench, variant) means")
    print()
    print("| Bench | Variant | n | Mean ns | 95% CI ±ns | Mean alloc B | 95% CI ±B |")
    print("|---|---|---:|---:|---:|---:|---:|")
    for b in benches:
        for v in variants:
            rs = buckets.get((b, v), [])
            mean_ns, ci_ns, n = summarize(rs, "meanNs")
            mean_b, ci_b, _ = summarize(rs, "allocBytes")
            if n == 0:
                print(f"| {b} | {v} | 0 | — | — | — | — |")
            else:
                print(
                    f"| {b} | {v} | {n} | {mean_ns:,.0f} | {ci_ns:,.0f} "
                    f"| {mean_b:,.0f} | {ci_b:,.0f} |"
                )
        print("| | | | | | | |")

    # ── Q1 decision-matrix deltas. ──
    print()
    print("# Q1 head-to-head — ReactorDescriptors deltas")
    print()
    print(
        "| Bench | vs ReactorV2 ns | vs ReactorV2 alloc | vs ReactorToday ns | vs ReactorToday alloc | Q1 band |"
    )
    print("|---|---:|---:|---:|---:|---|")
    for b in benches:
        ds = buckets.get((b, "ReactorDescriptors"), [])
        v2 = buckets.get((b, "ReactorV2"), [])
        today = buckets.get((b, "ReactorToday"), [])
        d_ns, _, _ = summarize(ds, "meanNs")
        d_b, _, _ = summarize(ds, "allocBytes")
        v_ns, _, _ = summarize(v2, "meanNs")
        v_b, _, _ = summarize(v2, "allocBytes")
        t_ns, _, _ = summarize(today, "meanNs")
        t_b, _, _ = summarize(today, "allocBytes")

        def pct(a, base):
            if base and not math.isnan(base) and not math.isnan(a):
                return (a - base) / base * 100.0
            return math.nan

        vs_v2_ns = pct(d_ns, v_ns)
        vs_v2_b = pct(d_b, v_b)
        vs_t_ns = pct(d_ns, t_ns)
        vs_t_b = pct(d_b, t_b)

        # §13 Q1 matrix bands keyed off the worst of ns vs V2.
        worst = vs_v2_ns
        if math.isnan(worst):
            band = "-"
        elif abs(worst) <= 5:
            band = "<=5%: ship descriptors"
        elif abs(worst) <= 15:
            band = "5-15%: judgment call"
        else:
            band = ">15%: ship hand-coded"

        print(
            f"| {b} | {vs_v2_ns:+.1f}% | {vs_v2_b:+.1f}% | {vs_t_ns:+.1f}% | {vs_t_b:+.1f}% | {band} |"
        )


def _bench_key(s):
    # M1, M2, ..., M13 — sort numerically.
    try:
        return int(s.lstrip("M"))
    except ValueError:
        return 999


if __name__ == "__main__":
    main()