1.19 — Final perf validation — deferral note

Spec 047 §14 Phase 1 exit gate item 1: ReactorV2 ≤ +10% on M1, M2, M5, M7, L1, L4 vs Phase 0 baseline; no worse than ReactorToday on any §15.4 macro that ships in Phase 1.

Status

Deferred for execution on the baseline machines. This is the headline numerical gate for Phase 1; it cannot be evaluated until 1.17 and 1.18 run on hardware.

What this PR establishes

All the code that the gate measures ships:

  • Five ported controls in src/Reactor/Core/V1Protocol/Handlers/.
  • V1 protocol surface in src/Reactor/Core/V1Protocol/.
  • Feature flag + dispatch wiring in Reconciler.{Mount,Update}.cs.
  • The external assembly proof (Reactor.External.TestControl).
  • The §8.2 setter-suppression scope — already shipped pre-Phase 1 (Reconciler.ApplySetters enters EchoSuppressScopeDepth++ for the duration of the setter chain). M13 OnIsOnChangedFireCount = 0 is satisfied by construction.
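The "satisfied by construction" claim in the last bullet can be illustrated with a minimal sketch of a depth-counted suppression scope. This is not the Reactor implementation: `ToggleSketch`, `suppress_echo`, and `apply_setters` are hypothetical names, and the only idea taken from the note is that a depth counter is raised for the duration of the setter chain so change callbacks do not echo back.

```python
from contextlib import contextmanager


class ToggleSketch:
    """Hypothetical stand-in for a control with an IsOn-style property."""

    def __init__(self):
        self.is_on = False
        self.echo_suppress_depth = 0          # plays the role of EchoSuppressScopeDepth
        self.on_is_on_changed_fire_count = 0  # what M13 would count

    @contextmanager
    def suppress_echo(self):
        # Raise the depth on entry and always lower it on exit, even on error.
        self.echo_suppress_depth += 1
        try:
            yield
        finally:
            self.echo_suppress_depth -= 1

    def set_is_on(self, value):
        if value == self.is_on:
            return
        self.is_on = value
        # The change callback only fires outside a suppression scope.
        if self.echo_suppress_depth == 0:
            self.on_is_on_changed_fire_count += 1


def apply_setters(ctrl, setters):
    """Run the whole setter chain inside one suppression scope."""
    with ctrl.suppress_echo():
        for setter in setters:
            setter(ctrl)


ctrl = ToggleSketch()
apply_setters(ctrl, [lambda c: c.set_is_on(True), lambda c: c.set_is_on(False)])
count_after_setters = ctrl.on_is_on_changed_fire_count

ctrl.set_is_on(True)  # a change made outside the scope, e.g. user input
count_after_user_change = ctrl.on_is_on_changed_fire_count
```

In this sketch the fire count stays at zero for any change made through `apply_setters`, while a change made outside the scope still fires, which is the behavior the M13 condition checks.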

Gate evaluation steps (run on each baseline machine)

# 1. Microbenches (M1, M2, M5, M7, M10, M12, M13) on the V1 path.
dotnet run -c Release --project tests/perf_bench/PerfBench.ControlModel -- \
    --variant ReactorV2 \
    --controls ToggleSwitch,Slider,TextBox,Border,ListView \
    --out docs/specs/047/phase1-results/<machine>/<date>/micro/

# 2. Macros L1, L4 (per 1.17 + 1.18: L13, L14, L2, L3, L6 V2 also).
dotnet run -c Release --project tests/stress_perf/StressPerf.ReactorV2

# 3. Aggregator emits the three required tables.
dotnet run --project tools/spec047-aggregator -- \
    --in 'docs/specs/047/phase1-results/<machine>/<date>/**/*.jsonl' \
    --baseline 'docs/specs/047/baseline-results/<machine>/' \
    --out docs/specs/047/phase1-results/<machine>/<date>/aggregate/

Pass conditions (from spec §14 Phase 1 exit gate)

  1. V2 ≤ +10% on M1, M2, M5, M7, L1, L4 vs Phase 0 baseline — required.
  2. No regressions on M13 — required (OnIsOnChangedFireCount = 0).
  3. No worse than ReactorToday on any macro that ships in Phase 1.
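The three conditions reduce to simple comparisons, sketched below. This is an illustration, not the aggregator's logic: `evaluate_gate` and its inputs are hypothetical, the sketch assumes every metric is lower-is-better, and the numbers are made up for the example rather than measured.

```python
BUDGET = 0.10  # the "+10%" allowance from condition 1
GATED = ("M1", "M2", "M5", "M7", "L1", "L4")


def evaluate_gate(baseline, v2, today, m13_fire_count, shipped_macros):
    """Return a list of failure descriptions; an empty list means the gate passes.

    Assumes lower is better for every metric (hypothetical convention).
    """
    failures = []

    # Condition 1: V2 within +10% of the Phase 0 baseline.
    for name in GATED:
        limit = baseline[name] * (1 + BUDGET)
        if v2[name] > limit:
            failures.append(f"{name}: over baseline +10% budget")

    # Condition 2: M13 requires OnIsOnChangedFireCount = 0.
    if m13_fire_count != 0:
        failures.append(f"M13: fire count is {m13_fire_count}, expected 0")

    # Condition 3: no worse than ReactorToday on any shipped macro.
    for name in shipped_macros:
        if v2[name] > today[name]:
            failures.append(f"{name}: worse than ReactorToday")

    return failures


# Illustrative numbers only, not results from any machine.
baseline = {"M1": 100.0, "M2": 2000.0, "M5": 50.0, "M7": 10.0, "L1": 300.0, "L4": 400.0}
v2 = {"M1": 108.0, "M2": 2300.0, "M5": 50.0, "M7": 10.5, "L1": 310.0, "L4": 395.0}
today = {"L1": 305.0, "L4": 400.0}

failures = evaluate_gate(baseline, v2, today, m13_fire_count=0, shipped_macros=("L1", "L4"))
for line in failures:
    print(line)
```

With these made-up inputs, M2 exceeds its budget and L1 is inside the baseline budget but slower than ReactorToday, so the two conditions can fail independently of each other.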

What happens if any metric fails

Per the task file: "enumerate the regressions and decide remediation (fix in Phase 1 vs accept and document)."

Two remediation playbooks the orchestrator has already tightened up during this PR:

  • Pool reuse path — every ported handler uses ctx.RentControl<T>(), matching the legacy _pool.TryRent(typeof(T)) shape. Pool participation for Border / TextBox / ToggleSwitch is unchanged.
  • Setter scope: ApplySetters runs identically through both code paths (V1=ON wraps it via ctx.ApplySetters → Reconciler.ApplySetters).

If a regression surfaces on M2 (memory-per-mount), the most likely culprit is closure allocation in OnCustomEvent. The V1 surface allows the same wire-once trampoline shape as the legacy code: capture ctrl, not element, and read the current element back via GetElementTag. The ported handlers follow this pattern, so M2 should not regress.
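A rough sketch of the wire-once trampoline shape, under the assumption that a pooled control outlives the elements mounted onto it. All names here (`PooledControl`, `Element`, `mount`, `element_tag`) are hypothetical and only mimic the idea of capturing the control and reading the element back from a tag; they are not the V1 surface.

```python
class PooledControl:
    """Hypothetical pooled control: survives across mounts."""

    def __init__(self):
        self.element_tag = None  # stands in for what GetElementTag would return
        self.handlers = []

    def raise_event(self):
        for handler in self.handlers:
            handler()


class Element:
    """Hypothetical element: a new one is supplied on each mount."""

    def __init__(self, name, log):
        self.name = name
        self.log = log

    def on_custom_event(self):
        self.log.append(self.name)


def mount(ctrl, element):
    ctrl.element_tag = element  # refreshed on every mount
    if not ctrl.handlers:       # wire the handler once per control instance
        def trampoline():
            # Captures ctrl only; the current element is read at event time.
            ctrl.element_tag.on_custom_event()
        ctrl.handlers.append(trampoline)


log = []
ctrl = PooledControl()
mount(ctrl, Element("first", log))
mount(ctrl, Element("second", log))  # same control reused, new element
ctrl.raise_event()
handler_count = len(ctrl.handlers)
```

Because the closure is created only on the first mount, later mounts onto the same control allocate no new handler, and the event still reaches the most recently mounted element.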

If a regression surfaces on M7 (modifier composition cost), investigate whether the V1 path's per-handler Children strategy is being re-allocated per mount. The shipped handlers all use static strategy instances ({ get; } with initializer), so no per-mount allocation is expected here.
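The difference between a shared strategy instance and a per-mount allocation can be sketched as follows. The class names are hypothetical, and a Python class attribute is used only as an analogue of a static property with an initializer; the counter exists purely to make the allocations visible.

```python
class ChildrenStrategySketch:
    """Hypothetical strategy object; counts how many instances were built."""

    created = 0

    def __init__(self):
        ChildrenStrategySketch.created += 1


class SharedStrategyHandler:
    # Built once when the class is defined, then shared by every mount.
    CHILDREN = ChildrenStrategySketch()

    def mount(self):
        return self.CHILDREN


class PerMountStrategyHandler:
    def mount(self):
        # The shape to look for when investigating M7: a new object per mount.
        return ChildrenStrategySketch()


created_at_start = ChildrenStrategySketch.created

shared = SharedStrategyHandler()
shared_results = [shared.mount() for _ in range(3)]
created_after_shared = ChildrenStrategySketch.created

per_mount = PerMountStrategyHandler()
per_mount_results = [per_mount.mount() for _ in range(3)]
created_after_per_mount = ChildrenStrategySketch.created
```

Three mounts through the shared handler leave the instance count unchanged, while three mounts through the per-mount handler add three instances; a profile showing the second pattern would point at the handler rather than the protocol surface.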

Exit-gate decision record

The exit-gate decision lives in phase1-results/<machine>/<date>/exit-gate.md on each baseline machine after the run. The orchestrator's job in this PR is to ensure the code is ready to be measured — the measurement itself happens on hardware.