paiml · noahgift · May 20, 2026 · May 20, 2026
diff --git a/evidence/phase-6/v1004-3knob-plumbing-shipped-2026-05-20.md b/evidence/phase-6/v1004-3knob-plumbing-shipped-2026-05-20.md
@@ -0,0 +1,89 @@
+# V1_004 3-knob plumbing SHIPPED — M288 prerequisites resolved
+
+[M288 dispatch recipe](v1004-3knob-dispatch-recipe-2026-05-20.md) | [M287 bench pattern](m32d-bench-pattern-2026-05-20.md) | [M286 M32d shipped](m32d-shipped-2026-05-20.md)
+
+**Status (2026-05-20, M289)**: Companion-side update noting that the [M288 dispatch recipe](v1004-3knob-dispatch-recipe-2026-05-20.md)'s prerequisite — env-var plumbing for the 3-knob toolkit — is now SHIPPED at paiml/aprender#1846. Operator can dispatch the M288 sub-benches (A/B/C/D) as soon as #1846 merges + apr binary is rebuilt.
+
+## What changed since M288
+
+[M288](v1004-3knob-dispatch-recipe-2026-05-20.md) noted (verbatim):
+
+> Note: `APR_AGENT_TEMPERATURE` + `APR_AGENT_TOP_K` + `APR_AGENT_TOP_P` env-var plumbing in the bench script + apr code dispatcher is NOT YET shipped. It's the bench-side companion work for aprender#1842. If those env vars aren't read yet, the bench will use greedy regardless.
+
+That gap is now closed. paiml/aprender#1846 ships:
+
+1. **`ChatCompletionRequest` extensions** — 5 new optional fields (`top_k`, `repeat_penalty`, `repeat_last_n`, `seed`, plus existing `top_p`). Aprender-specific extensions to the OpenAI schema; all `#[serde(default)]` so existing clients are unaffected.
+2. **`try_qwen3_moe_backend` wire-up** — threads the 5 fields from HTTP request through to `QuantizedGenerateConfig::default()` overrides. Previously hardcoded to defaults (greedy).
+3. **`AprServeDriver::build_openai_body` env-var reads** — reads 6 env vars (`APR_AGENT_TEMPERATURE`, `APR_AGENT_TOP_K`, `APR_AGENT_TOP_P`, `APR_AGENT_REPEAT_PENALTY`, `APR_AGENT_REPEAT_LAST_N`, `APR_AGENT_SEED`) and includes in HTTP body when set. apr code dispatches → apr serve receives JSON → `try_qwen3_moe_backend` consumes → `sample_from_logits` applies.
+
+## End-to-end flow (post-#1846)
+
+```
+operator shell
+  APR_AGENT_TEMPERATURE=0.3 APR_AGENT_REPEAT_PENALTY=1.2 ...
+  └─→ scripts/phase-6-bench.sh  (inherits env)
+       └─→ ccpa-arena-bench --driver-binary apr code ...  (env inherited)
+            └─→ apr code  (env inherited)
+                 └─→ AprServeDriver::launch  (env inherited)
+                      └─→ apr serve  (HTTP server; subprocess of apr code)
+                 └─→ AprServeDriver::build_openai_body  (READS env vars)
+                      └─→ HTTP POST /v1/chat/completions {temperature: 0.3, ...}
+                           └─→ try_qwen3_moe_backend  (PARSES request)
+                                └─→ QuantizedGenerateConfig {temperature: 0.3, repeat_penalty: 1.2, ...}
+                                     └─→ run_qwen3_moe_generate
+                                          └─→ sample_from_logits  (APPLIES penalty + sampling)
+```
+
+Every link in the chain is now wired. The only thing standing between operator and V1_004 discharge measurement is:
+
+1. `gh pr merge 1846 --squash --admin` (operator authorization)
+2. Rebuild + install apr binary from new main: `cd /home/noah/src/aprender && cargo build --release -p apr-cli --bin apr && cp /mnt/nvme-raid0/targets/aprender/release/apr /home/noah/.local/bin/apr`
+3. Dispatch one of the [M288 sub-benches](v1004-3knob-dispatch-recipe-2026-05-20.md#recommended-dispatch-sequence)
+
+## Recommended first dispatch
+
+Sub-bench B from M288 (sampling + repetition penalty combined) is the strongest single-shot candidate to break the M287 driver_error pattern:
+
+```bash
+APR_MODEL=/home/noah/models/Qwen3-Coder-30B-A3B-Instruct-Q4_K_M.gguf \
+PHASE6_COMPLIANCE_ENFORCED=1 \
+PHASE6_MAX_TURNS=20 \
+PHASE6_WALL_SECONDS=3600 \
+APR_TIMEOUT_S=900 \
+APR_AGENT_HTTP_TIMEOUT_S=1500 \
+APR_AGENT_MAX_TOKENS_CAP=1024 \
+APR_AGENT_TEMPERATURE=0.3 \
+APR_AGENT_TOP_K=50 \
+APR_AGENT_TOP_P=0.95 \
+APR_AGENT_REPEAT_PENALTY=1.2 \
+APR_AGENT_REPEAT_LAST_N=64 \
+bash scripts/phase-6-bench.sh 2>&1 | tee /tmp/phase-6-30b-3knob-b.log
+```
+
+Expected wall: ~10-15 hours. Acceptance: `evidence/under-contract/scores.json` with `student_pass_rate > 0` discharges V1_004.
+
+## Status reconciliation across all aprender PRs
+
+| PR | Status | Content |
+|---|---|---|
+| #1832 | ✅ MERGED | M32d KV cache (19× speedup) |
+| #1837 | ✅ MERGED | qwen3-moe-sampling-v1 contract |
+| #1842 | ✅ MERGED | sampling impl + tests (absorbed #1844 in squash) |
+| #1844 | ✅ MERGED | rep-penalty impl + tests |
+| #1843 | ❌ CLOSED | superseded by #1844 |
+| #1835 | 🟡 OPEN | qwen3-moe-streaming-sse-v1 contract (workspace-test pending) |
+| #1846 | 🟡 OPEN | 3-knob HTTP wire-up (this PR's prerequisite) |
+| #1829 | ❌ CLOSED | M32d engineer playbook (superseded by in-session #1832) |
+| #1826 | ✅ MERGED | M32d scope doc |
+
+## Companion-side state
+
+CCPA M281-M288 (8 docs) tracks the full upstream arc + dispatch recipe. M289 (this doc) closes the loop noting the plumbing is shipped.
+
+The currently-running greedy V1_004 bench (started 05:58Z, on fixture ~8-10 by now) is the empirical baseline against which any 3-knob sub-bench compares. Don't dispatch the new bench until the baseline finishes — the comparison is the value.
+
+## What this doc is NOT
+
+- NOT a V1_004 discharge — sub-benches not dispatched yet
+- NOT an aprender#1846 merge — operator decision
+- NOT a guarantee that the 3-knob toolkit will discharge V1_004 — M287 pattern is the empirical hypothesis-under-test; sampling/penalty may or may not break it. The bench is the measurement.