
feat(M293): wire PHASE6_MAX_CONSECUTIVE_TEXT_TURNS env var into phase-6-bench.sh#261

Merged
noahgift merged 1 commit into main from m293-wire-text-loop-env-var
May 21, 2026

@noahgift

Contributor

Summary

  • Adds PHASE6_MAX_CONSECUTIVE_TEXT_TURNS env var (default 0 = disabled) that surfaces M292's --max-consecutive-text-turns CLI flag at the operator script.
  • Conditional flag-threading preserves baseline behavior when env var unset/0; only emits --max-consecutive-text-turns=N when N>0.
  • Header comment + inline docstring document the new env var and reference M292 evidence.
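
A minimal sketch of the conditional flag-threading described above. This is not the actual code in scripts/phase-6-bench.sh: the helper name `build_text_loop_flag` and the `extra_args` array are hypothetical, and only the env var name, the flag name, and the "emit only when N>0" rule come from this PR.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of the env-var guard; not the real phase-6-bench.sh code.
set -euo pipefail

# build_text_loop_flag is an illustrative helper name (an assumption).
# It prints the flag when the value is a positive integer, else prints nothing.
build_text_loop_flag() {
  local n="${1:-0}"
  # Anything that is not a plain non-negative integer is treated as disabled.
  if [[ "$n" =~ ^[0-9]+$ ]] && (( 10#$n > 0 )); then
    printf -- '--max-consecutive-text-turns=%s\n' "$((10#$n))"
  fi
}

# Collect extra arguments in an array so a disabled detector adds no argument.
extra_args=()
flag="$(build_text_loop_flag "${PHASE6_MAX_CONSECUTIVE_TEXT_TURNS:-0}")"
if [[ -n "$flag" ]]; then
  extra_args+=("$flag")
fi

# A real script would append the array to the bench command line;
# this sketch only echoes it.
echo "extra args: ${extra_args[*]:-<none>}"
```

Building the arguments in an array rather than a string means the unset/0 case passes no extra argument at all, which is what keeps the baseline invocation unchanged.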

Why

M292 added ArenaOutcome::AgentTextLoop plus an opt-in detector at the ArenaSession level. This PR is the operator-facing surface: running PHASE6_MAX_CONSECUTIVE_TEXT_TURNS=5 bash scripts/phase-6-bench.sh short-circuits text-only-loop fixtures at turn 5 instead of paying ~24min/fixture × 20 fixtures ≈ 8hr of wall-clock time.
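
The wall-clock figures can be reproduced with a quick calculation. The sketch below uses the numbers quoted in this PR and its commit message (20 turns, ~72s/turn, 20 fixtures). It assumes the worst case where every fixture is text-only-locked and the detector fires exactly at turn 5, so treat the result as an upper bound on savings, not a measurement.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Figures quoted in this PR; all are approximations.
turns_full=20      # full turn budget per fixture
secs_per_turn=72   # approximate seconds per turn
fixtures=20        # corpus size
turns_cutoff=5     # PHASE6_MAX_CONSECUTIVE_TEXT_TURNS=5

# Total minutes for the whole corpus, with and without the cutoff.
full_min=$(( turns_full * secs_per_turn * fixtures / 60 ))
cut_min=$(( turns_cutoff * secs_per_turn * fixtures / 60 ))
saved_min=$(( full_min - cut_min ))

echo "full run:    ${full_min} min"   # 480 min, about 8hr
echo "with cutoff: ${cut_min} min"    # 120 min, about 2hr
echo "saved:       ${saved_min} min"  # 360 min, about 6hr
```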

What this does NOT do

  • NOT change the default (stays 0 = disabled) — preserves M270/M280/M287/M291 evidence comparability.
  • NOT bump M-counter — Phase 6 in active bench-run state.

Test plan

  • bash -n scripts/phase-6-bench.sh — syntax clean
  • Env var guard correctness verified (default=0 → no flag threaded; >0 → flag threaded)
  • cargo test -p ccpa-arena --lib — 146 lib tests still pass (no code touched, just bench script)
  • bash scripts/check-doc-drift.sh — 17/17 drift classes
  • CI green

Cross-references

  • M292 evidence: evidence/phase-6/v1004-agent-text-loop-detector-2026-05-21.md
  • M291 evidence (motivation): evidence/phase-6/v1004-sub-bench-b-pattern-shift-2026-05-21.md
  • aprender#1853 (M291 Gap 1 fix; in flight)

🤖 Generated with Claude Code

…-6-bench.sh

Exposes M292's `--max-consecutive-text-turns` CLI flag at the operator
script surface. Default 0 (disabled) preserves all historical evidence
comparisons (M270/M280/M287/M291): turning on the detector changes
outcome distributions, so it must stay opt-in to keep existing
apples-to-apples baselines valid.

Future bench operators can enable with:

  PHASE6_MAX_CONSECUTIVE_TEXT_TURNS=5 bash scripts/phase-6-bench.sh

to short-circuit text-only-loop fixtures at turn 5 instead of paying
the full 20-turn × ~72s/turn ≈ 24min/fixture cost. On a 20-fixture
corpus, that's ~6hr of bench wall-clock time saved per future V1_004
follow-up dispatch where the model is text-only-locked.

Companion to aprender's V1_004 chain:
- aprender#1849 few-shot prompt
- aprender#1852 EOS stop_token + clean_chat_output wire-up
- aprender#1853 (in flight) clean_chat_output start-of-string strip
- M292 (just merged) ArenaOutcome::AgentTextLoop variant + opt-in detector

No M-counter bump — Phase 6 in active bench-run state; surface bumps wait
for V1_004 discharge or final pattern conclusion.

Refs:
- evidence/phase-6/v1004-agent-text-loop-detector-2026-05-21.md (M292)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@noahgift noahgift merged commit 69f215b into main May 21, 2026
1 check failed
@noahgift noahgift deleted the m293-wire-text-loop-env-var branch May 21, 2026 01:27
noahgift added a commit that referenced this pull request May 22, 2026
Adds:
- book/ — mdBook source for paiml.github.io/claude-code-parity-apr
- .github/workflows/book.yml — CI build + GitHub Pages auto-deploy
- README.md restructured for professional landing (badges row, book
  callout, empirical highlight section, deep-links to book chapters)
- .gitignore — book/book/ (generated artifact)

Book structure (28 chapters):
- Introduction
- Overview: what is CCPA, methodology, two paths, architecture
- Static path: trace schema, differ, fixtures, bidirectional sensitivity
- Arena: overview, phase 5, phase 6, outcome variants
- Falsification gates: 20 gates, source-of-truth, behavioral parity, status flow
- Empirical findings: V1_004 chain (M286, M287, M291, M292, M294)
- Reference: CLI, trace schema, contract YAML, gate IDs
- Appendix: academic basis, milestone history, glossary

Build locally: mdbook build book/ -> book/book/index.html
Deploy: GitHub Pages auto-deploys on push to main when book/ changes.

Doc-drift detector: 17/17 drift classes pass.

Refs:
- evidence/phase-6/v1004-*.md (all sourced into book chapters)
- CCPA#259 M291, #260 M292, #261 M293, #262 M294 scope

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>