Generalize lr_adapt_proxy Into a Framework-Level Adaptation Module (PyCMA Demo Client, Exact Parity)
Build a reusable adaptation framework inside this repo, then prove it with one client integration (pycma) that reproduces current lr_adapt_proxy behavior exactly under deterministic conditions (workers=1).
This remains a decision-complete implementation spec for external peer review.
Round-2 precision updates:
- Removed redundant `initial_value` from `AdaptationContext`.
- Converted multi-worker soft-gate parity from qualitative wording to measurable criteria.
- Split test phases so deterministic parity tests land before runner rewiring.
- Explicitly defined `direction="maximize"` behavior and hard-gate trace keys.
- Clarified `was_clamped` surfacing as internal-only in v1.
Round-3 strict polish updates:
- Pinned soft-gate baseline to a specific run ID and manifest commit.
- Pinned soft-gate metric provenance to specific artifact file and row filters.
- Added explicit types for `fitness`, `current_value`, and `AdaptationAction` fields in plan prose.
- Explicitly documented that pipeline-level parity is intentionally `workers>=1` (not `workers=1`).
Locked decisions for v1:
- Use repo-relative references only (no absolute local filesystem links).
- Use a tiered parity contract: deterministic exact parity for `workers=1`; measurable consistency checks for multi-worker runs.
- No rollback/feature-flag path in v1.
- In scope: internal architecture refactor from hardcoded proxy logic to a generic policy interface with one production client.
- In scope: preserve current CLI/config behavior and output schemas.
- In scope: preserve current method names (`vanilla_cma`, `lr_adapt_proxy`, `pop4x`).
- Out of scope: adding a second optimizer client, changing benchmark matrix, changing inferential methodology, retuning hyperparameters, or revising scientific claims.
- Out of scope: fallback toggles or dual-path runtime switches for the legacy implementation.
- Current hook is method-specific and runs post-`tell` only for `lr_adapt_proxy` in experiments/methods.py.
- Current rule directly mutates `es.sigma` in experiments/lr_adapt_proxy.py.
- Current diagnostics contract includes `proxy_signal`, `proxy_noise`, `proxy_snr`, `proxy_ema_snr`, `proxy_sigma_factor`, `proxy_sigma` in experiments/lr_adapt_proxy.py.
- Current run-row contract includes `proxy_sigma_factor_last` and `proxy_ema_snr_last` in experiments/methods.py.
- Soft-gate baseline run is pinned to `artifacts/runs/high-rigor/20260305T060114Z-cac939ce/results`.
- Baseline manifest for that run pins source commit `675b7bd` in `manifest.json`.
- Baseline config is `experiments/config/high_rigor.yaml`.
- Add adaptation core package at experiments/adaptation.
- Add typed context and action models in experiments/adaptation/types.py.
- Add policy protocol in experiments/adaptation/protocols.py.
- Add `LRProxyPolicy` in experiments/adaptation/policies/lr_proxy.py.
- Add pycma client adapter in experiments/adaptation/clients/pycma_sigma.py.
- Refactor runner wiring in experiments/methods.py to use the policy interface instead of method-specific mutation branches.
- Keep experiments/lr_adapt_proxy.py as a backward-compatible shim that delegates to the new policy core while preserving existing function signature and diagnostics keys.
- Step-level parity (`workers=1`, hard gate): generation-by-generation exact equality using Python float `==` on the following keys:
  - `proxy_signal`
  - `proxy_noise`
  - `proxy_snr`
  - `proxy_ema_snr`
  - `proxy_sigma_factor`
  - `proxy_sigma`
- Run-level parity (`workers=1`, hard gate): exact equality for key run outputs:
  - `final_best`
  - `proxy_sigma_factor_last`
  - `proxy_ema_snr_last`
- Pipeline-level parity (`workers>=1`, soft gate): measurable consistency checks (byte-identical output is not required). This is intentionally `workers>=1` and not `workers=1`.
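The step-level hard gate can be illustrated with a small comparator. This is a minimal sketch, assuming each trace is a list of per-generation dicts keyed by the six diagnostics names; the function name `first_trace_mismatch` and the trace layout are hypothetical and not part of the existing code.

```python
from typing import Dict, List, Optional, Tuple

# The six hard-gate trace keys named in the parity contract.
TRACE_KEYS = (
    "proxy_signal",
    "proxy_noise",
    "proxy_snr",
    "proxy_ema_snr",
    "proxy_sigma_factor",
    "proxy_sigma",
)

Trace = List[Dict[str, float]]  # assumed layout: one dict per generation


def first_trace_mismatch(legacy: Trace, refactored: Trace) -> Optional[Tuple[int, str]]:
    """Return (generation, key) of the first difference, or None if traces match."""
    for generation, (old, new) in enumerate(zip(legacy, refactored)):
        for key in TRACE_KEYS:
            # Plain float equality, as the contract requires; no tolerance.
            if old[key] != new[key]:
                return generation, key
    if len(legacy) != len(refactored):
        # The common prefix matched but one trace has extra generations.
        return min(len(legacy), len(refactored)), "length"
    return None
```

Because the comparison is plain `==`, a NaN in either trace fails the gate even when both sides are NaN. Whether that is the desired behavior is left to the implementers.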
CI/review gates:
- Hard gate: deterministic `workers=1` step-level and run-level parity must pass.
- Soft gate: multi-worker runs must satisfy all of the following against the pinned baseline:
  - schema is identical for `runs_long.csv`, `cell_stats.csv`, `method_aggregate.csv`, and `findings.json`,
  - sign of `median_of_cell_median_delta` for `lr_adapt_proxy` matches baseline, where value is read from `results/method_aggregate.csv` row `method=lr_adapt_proxy`,
  - `abs(cells_q_lt_0_05_current - cells_q_lt_0_05_baseline) <= 2`, where values are read from `results/method_aggregate.csv` row `method=lr_adapt_proxy`.
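The two metric conditions of the soft gate could be checked along the following lines. This is a sketch under stated assumptions: `method_aggregate.csv` is assumed to have `method`, `median_of_cell_median_delta`, and `cells_q_lt_0_05` columns with exactly one `lr_adapt_proxy` row, and the helper names are hypothetical. A delta of exactly zero is treated as its own sign, which the plan does not specify.

```python
import csv
import io
from typing import Dict


def read_lr_row(csv_text: str) -> Dict[str, str]:
    """Return the single row with method=lr_adapt_proxy (assumed to be unique)."""
    rows = [r for r in csv.DictReader(io.StringIO(csv_text)) if r["method"] == "lr_adapt_proxy"]
    if len(rows) != 1:
        raise ValueError(f"expected exactly one lr_adapt_proxy row, found {len(rows)}")
    return rows[0]


def sign(value: float) -> int:
    """Return -1, 0, or 1."""
    return (value > 0) - (value < 0)


def soft_gate_metrics_pass(baseline_csv: str, current_csv: str, tolerance: int = 2) -> bool:
    """Check sign consistency of the median delta and the +/- tolerance cell count."""
    base = read_lr_row(baseline_csv)
    cur = read_lr_row(current_csv)
    same_sign = sign(float(base["median_of_cell_median_delta"])) == sign(
        float(cur["median_of_cell_median_delta"])
    )
    # int(float(...)) tolerates counts serialized as "12" or "12.0".
    count_diff = abs(int(float(cur["cells_q_lt_0_05"])) - int(float(base["cells_q_lt_0_05"])))
    return same_sign and count_diff <= tolerance
```

In practice the two CSV texts would be read from the pinned baseline `results/method_aggregate.csv` and from the current run's copy of the same file.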
- `LRProxyParams` dataclass in `policies/lr_proxy.py` with existing parameters only: `ema_alpha`, `snr_up_threshold`, `snr_down_threshold`, `sigma_up_factor`, `sigma_down_factor`, `sigma_min_ratio`, `sigma_max_ratio`.
- `AdaptationContext` dataclass in `types.py` with fields:
  - `fitness: np.ndarray`
  - `generation_index: int` (zero-based, job-scoped)
  - `current_value: float`
  - `direction: Literal["minimize", "maximize"]`
- `AdaptationAction` dataclass in `types.py` with fields:
  - `next_value: float`
  - `factor: float`
  - `was_clamped: bool`
- `AdaptationStep` dataclass in `types.py` with fields: `action`, `diagnostics`.
- `AdaptationPolicy` protocol in `protocols.py` with `step(context: AdaptationContext) -> AdaptationStep`.
- `LRProxyPolicy` class state fields: `ema_snr`, `best_so_far`, `initial_sigma`.
- `direction="maximize"` behavior in v1: raise `NotImplementedError` (explicit fail-fast to prevent silent misuse).
- `initial_value` is intentionally absent from `AdaptationContext`; policy-owned `initial_sigma` is the single source of truth.
- Diagnostics names returned by `LRProxyPolicy.step` remain exactly: `proxy_signal`, `proxy_noise`, `proxy_snr`, `proxy_ema_snr`, `proxy_sigma_factor`, `proxy_sigma`.
- Runner output column names remain unchanged, including `proxy_sigma_factor_last` and `proxy_ema_snr_last`.
- `was_clamped` is internal diagnostic data in v1 and is not persisted to run-level CSV/manifest schema.
- Strict purity boundary: policy code is pure decision logic and never mutates optimizer internals.
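The typed models and protocol listed above might look roughly like this. It is a sketch, not the final `types.py` or `protocols.py`: the field names and types come from the plan, while `frozen=True` and the `Dict[str, float]` type for `diagnostics` are assumptions made here for illustration.

```python
from dataclasses import dataclass
from typing import TYPE_CHECKING, Dict, Literal, Protocol

if TYPE_CHECKING:  # numpy is referenced only in an annotation
    import numpy as np


@dataclass(frozen=True)  # frozen is an assumption; it keeps the context read-only
class AdaptationContext:
    fitness: "np.ndarray"      # fitness values for the current generation
    generation_index: int      # zero-based, job-scoped
    current_value: float       # the controlled value (sigma for the pycma client)
    direction: Literal["minimize", "maximize"]


@dataclass(frozen=True)
class AdaptationAction:
    next_value: float
    factor: float
    was_clamped: bool          # internal-only in v1, not persisted


@dataclass(frozen=True)
class AdaptationStep:
    action: AdaptationAction
    diagnostics: Dict[str, float]  # value type is an assumption


class AdaptationPolicy(Protocol):
    """Structural interface: any object with a matching step() conforms."""

    def step(self, context: AdaptationContext) -> AdaptationStep:
        ...
```

Note that `AdaptationContext` carries no optimizer reference and no `initial_value`, which matches the purity boundary and the policy-owned `initial_sigma` decision.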
- Runner builds method descriptor for each `method_name`.
- For `lr_adapt_proxy`, runner instantiates `LRProxyPolicy(params, initial_sigma)` once per job.
- Each generation: runner computes fitness via existing objective flow.
- Runner builds `AdaptationContext` and calls `policy.step(context)` after `es.tell`.
- Client adapter applies returned action to optimizer state (`es.sigma <- action.next_value`).
- Runner records last-step diagnostics into existing output fields.
- Non-adaptive methods (`vanilla_cma`, `pop4x`) bypass policy path and keep current behavior.
- Boundary rule: policy never receives an optimizer object; only the adapter mutates optimizer state.
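The flow above can be sketched end to end with stand-ins. Everything below is illustrative: `ShrinkPolicy` is a hypothetical toy rule and NOT the `lr_adapt_proxy` rule, `FakeOptimizer` is a hypothetical substitute for the pycma strategy object, and `fitness` is a plain list rather than the `np.ndarray` the plan specifies, to keep the sketch dependency-free.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class AdaptationContext:
    fitness: List[float]  # the plan types this as np.ndarray
    generation_index: int
    current_value: float
    direction: str


@dataclass(frozen=True)
class AdaptationAction:
    next_value: float
    factor: float
    was_clamped: bool


@dataclass(frozen=True)
class AdaptationStep:
    action: AdaptationAction
    diagnostics: Dict[str, float]


class ShrinkPolicy:
    """Hypothetical toy policy: scale by a fixed factor, clamp at a floor."""

    def __init__(self, factor: float, floor: float) -> None:
        self.factor = factor
        self.floor = floor

    def step(self, context: AdaptationContext) -> AdaptationStep:
        if context.direction == "maximize":
            raise NotImplementedError("maximize is not implemented in v1")
        proposed = context.current_value * self.factor
        next_value = max(proposed, self.floor)
        action = AdaptationAction(next_value, self.factor, next_value != proposed)
        return AdaptationStep(action, {"proxy_sigma": next_value})


class FakeOptimizer:
    """Hypothetical optimizer stand-in; only sigma and tell() are modeled."""

    def __init__(self, sigma: float) -> None:
        self.sigma = sigma

    def tell(self, solutions: List[List[float]], fitness: List[float]) -> None:
        pass  # a real optimizer would update its internal state here


def apply_sigma_action(es: FakeOptimizer, action: AdaptationAction) -> None:
    """Client adapter: the only place where optimizer state is mutated."""
    es.sigma = action.next_value


def run_job(es: FakeOptimizer, policy: ShrinkPolicy, generations: int) -> Dict[str, float]:
    last_diagnostics: Dict[str, float] = {}
    for generation_index in range(generations):
        solutions = [[0.0], [1.0]]  # placeholder candidates
        fitness = [float(generation_index), float(generation_index) + 1.0]
        es.tell(solutions, fitness)
        # The policy receives plain values only, never the optimizer object.
        context = AdaptationContext(fitness, generation_index, es.sigma, "minimize")
        step = policy.step(context)
        apply_sigma_action(es, step.action)
        last_diagnostics = step.diagnostics
    return last_diagnostics
```

The point of the sketch is the boundary: `policy.step` returns a decision, and `apply_sigma_action` is the single mutation site, so a policy can be unit-tested without constructing an optimizer.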
- Phase 1: Add adaptation core types/protocols/policy/client skeletons.
- Phase 2: Add compatibility shim behavior in experiments/lr_adapt_proxy.py that forwards to `LRProxyPolicy`.
- Phase 3a: Add and pass deterministic parity tests (unit + golden trace) against shim path before runner rewiring.
- Phase 3b: Refactor experiments/methods.py to generic policy wiring and remove hardcoded `use_lr` mutation branch.
- Phase 4: Run post-refactor integration/regression checks (eval-only pipeline checks, verifier checks, schema checks, multi-worker soft-gate checks).
- Phase 5: Update docs in docs/analysis/lr_adapt_proxy_technical_spec.md, docs/analysis/lr_adapt_proxy_mechanism.md, and README.md to reflect finalized architecture and unchanged empirical claims.
- Doc-gate check: no absolute local filesystem paths in this plan.
- Plan lint check: pipeline-level parity line contains `workers>=1` (and not `workers=1`).
- Baseline pin check: plan contains run ID `20260305T060114Z-cac939ce`, commit `675b7bd`, and config `experiments/config/high_rigor.yaml`.
- Metric provenance check: soft-gate section names `results/method_aggregate.csv` with row filter `method=lr_adapt_proxy`.
- Type clarity check: plan explicitly types `fitness`, `current_value`, and `AdaptationAction` fields.
- Unit: `robust_spread` parity for representative arrays, including near-constant values.
- Unit: deterministic sequence parity for SNR/EMA/factor/clamp logic across many generations.
- Unit: first-generation edge case parity (`best_so_far is None` path).
- Unit: no-improvement path parity (`signal=0`) and clamp-bound saturation behavior.
- Unit: policy purity guard (no optimizer object passed to policy).
- Unit: `direction="maximize"` raises `NotImplementedError`.
- Integration: golden trace parity test (`workers=1`, fixed seed) asserting exact equality for all six trace keys.
- Integration: single-job run with fixed seed and `workers=1` asserting exact equality for `final_best`, `proxy_sigma_factor_last`, `proxy_ema_snr_last`.
- Integration: eval-only pipeline on small config to confirm no schema drift in `runs_long.csv`, `cell_stats.csv`, `method_aggregate.csv`, `findings.json`.
- Integration: verifier pass using existing script scripts/verify_rerun_artifacts.py.
- Contract: `was_clamped` available in step diagnostics but absent from persisted run schemas.
- Regression: ensure `vanilla_cma` and `pop4x` rows are unchanged by adaptation refactor path.
- Contract: ensure pairwise artifact naming and manifest links remain unchanged.
- Multi-worker soft gate checks:
  - schema invariance,
  - `median_of_cell_median_delta` sign consistency,
  - `cells_q_lt_0_05` count within ±2 of baseline.
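The schema-drift checks in this list could be implemented in several ways. The sketch below assumes "schema" means the ordered CSV header row for the CSV artifacts and the set of top-level keys for `findings.json`; both interpretations and the helper names are assumptions, not definitions taken from the existing verifier script.

```python
import csv
import io
import json
from typing import List, Set


def csv_header(csv_text: str) -> List[str]:
    """Return the header row of a CSV document, or [] if the document is empty."""
    return next(csv.reader(io.StringIO(csv_text)), [])


def csv_schema_matches(baseline_text: str, current_text: str) -> bool:
    """Headers must match in both names and order."""
    return csv_header(baseline_text) == csv_header(current_text)


def json_top_level_keys(json_text: str) -> Set[str]:
    """Return the top-level keys of a JSON object document."""
    document = json.loads(json_text)
    if not isinstance(document, dict):
        raise ValueError("expected a JSON object at the top level")
    return set(document)


def json_schema_matches(baseline_text: str, current_text: str) -> bool:
    """Top-level key sets must be equal; nested structure is not inspected."""
    return json_top_level_keys(baseline_text) == json_top_level_keys(current_text)
```

A stricter check could also compare column dtypes or nested JSON structure, which this sketch deliberately leaves out.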
- `PLAN.md` has no unresolved ambiguities identified in round-3 feedback.
- `PLAN.md` explicitly resolves all round-2 findings:
  - `initial_value` redundancy removed,
  - measurable soft gate added,
  - phase-order contradiction resolved,
  - maximize behavior defined,
  - explicit hard-gate trace keys and `was_clamped` treatment documented.
- Baseline and metric provenance are explicit enough that implementers cannot choose different references.
- Type-level expectations for context/action fields are explicit in the plan.
- Parity section is enforceable and testable (contains numeric/measurable conditions).
- Existing configs run unchanged, including experiments/config/high_rigor.yaml and experiments/config/eval_only_lr_vs_vanilla.yaml.
- Output schemas and key names remain identical to baseline.
- `lr_adapt_proxy` numeric behavior passes deterministic exact parity tests at `workers=1`.
- Existing wrapper scripts still execute without interface changes.
- No new scope creep is introduced (still one client, no rollback path).
- Risk: accidental behavior drift while extracting logic. Mitigation: Phase 3a parity tests are locked and passing before Phase 3b runner refactor; merge is gated on hard parity checks.
- Risk: over-generalization adds unused abstraction noise. Mitigation: one concrete protocol, one policy, one client in v1 only.
- Risk: hidden downstream schema dependency. Mitigation: run existing verifier and compare key artifact columns/keys as explicit checks in Phase 4.
- Risk: reviewers assume implied rollback support. Mitigation: explicit no-fallback stance in v1; failures are fixed in-path rather than via feature flags.
- Risk: `proxy_*` diagnostic key coupling can be misread as full generality. Mitigation: treat key naming as intentional v1 compatibility debt; document decoupling as a future generalization step in the architecture note.
- This plan as review artifact.
- A short architecture note in repo docs describing policy/context/action model, why pycma is first client, and why `proxy_*` naming remains in v1.
- A parity matrix table template for reviewers listing each invariant and its test.
- This revision edits only `PLAN.md` (no code/config changes).
- Baseline pin uses tracked high-rigor run `20260305T060114Z-cac939ce`, manifest commit `675b7bd`, and config `experiments/config/high_rigor.yaml`.
- `cells_q_lt_0_05` tolerance default is fixed at ±2 cells for multi-worker soft gate.
- Soft-gate thresholds are calibrated for the current 36-cell matrix; if matrix size changes, revisit tolerance policy (for example, percentage-based bounds).
- Default target is exact pycma parity for deterministic `workers=1` runs.
- Multi-worker parity target is measurable consistency plus schema invariance, not byte-identical outputs.
- Default API style is pure policy API; mutation remains in client adapter only.
- Default first client is pycma sigma control only.
- Default compatibility target is strict for config/CLI/artifact schemas.
- No fallback/feature-flag path is introduced in v1.
- v1 intentionally does not implement maximize semantics; fail-fast behavior is preferred over silent behavior.