Skip to content

docs(M290): V1_004 follow-up snapshot — 5 PRs shipped + root-cause diagnosis#258

Merged
noahgift merged 1 commit into
mainfrom
m290-v1004-followup-snapshot
May 20, 2026
Merged

docs(M290): V1_004 follow-up snapshot — 5 PRs shipped + root-cause diagnosis#258
noahgift merged 1 commit into
mainfrom
m290-v1004-followup-snapshot

Conversation

@noahgift

Copy link
Copy Markdown
Contributor

Summary

Consolidated view of the V1_004 follow-up chain after fixture 10's history inspection revealed 3 independent dense-vs-MoE gaps causing the M287 verbosity pattern.

Root cause diagnosis

Fixture 10 (oo__05-builder-pattern, turns=7) student output turn 2+ was repeated "Human: I need to see..." text. Three independent gaps:

  1. No EOS stop_token in try_qwen3_moe_backend
  2. No clean_chat_output post-decode cleanup
  3. No few-shot examples in CODE_SYSTEM_PROMPT (model emits Markdown rust instead of <tool_call> JSON)

5 PRs shipped this session

PR Status What
#1832 MERGED M32d KV cache, 19× speedup
#1837 MERGED sampling contract
#1842 MERGED sampling impl (+ #1844 squashed)
#1846 MERGED 3-knob HTTP wire-up
#1849 OPEN few-shot prompt
#1852 OPEN EOS stop_token + clean_chat_output

4-layer defense post-merge

  • Layer 1 (prompt): #1849 shows examples + bans Markdown blocks
  • Layer 2 (sampling): #1842 lets operator tune temperature/penalty
  • Layer 3 (real-time stop): #1852's stop_tokens halts at EOS
  • Layer 4 (post-process): #1852's clean_chat_output strips runaway

Revised dispatch sequence (sub-benches A/B/C)

Each ~10-15hr wall. Operator-coordinated. Acceptance: student_pass_rate > 0 in any sub-bench discharges V1_004.

Mechanical snapshot doc. M-counter NOT bumped.

🤖 Generated with Claude Code

…agnosis

Consolidated view of the V1_004 follow-up chain after fixture 10's
history inspection revealed 3 independent dense-vs-MoE gaps causing
the M287 verbosity pattern.

## Root cause diagnosis

Fixture 10 (oo__05-builder-pattern, turns=7) student output turn 2+
was repeated "Human: I need to see..." text. Three gaps:

1. No EOS stop_token in try_qwen3_moe_backend
2. No clean_chat_output post-decode cleanup
3. No few-shot examples in CODE_SYSTEM_PROMPT

## 5 PRs shipped this session

| PR | Status | What |
|---|---|---|
| #1832 | MERGED | M32d KV cache, 19× speedup (prerequisite) |
| #1837 | MERGED | sampling contract |
| #1842 | MERGED | sampling impl (+ #1844 squashed) |
| #1846 | MERGED | 3-knob HTTP wire-up |
| #1849 | OPEN | few-shot prompt |
| #1852 | OPEN | EOS stop_token + clean_chat_output |

## 4-layer defense post-merge

- Layer 1 (prompt): #1849 shows examples + bans Markdown blocks
- Layer 2 (sampling): #1842 lets operator tune temperature/penalty
- Layer 3 (real-time stop): #1852's stop_tokens halts at EOS
- Layer 4 (post-process): #1852's clean_chat_output strips runaway

## Revised dispatch sequence

Sub-bench A (prompt only), Sub-bench B (3-knob + prompt), C (control).
Each ~10-15hr wall. Operator-coordinated.

Mechanical snapshot doc. M-counter NOT bumped.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@noahgift noahgift merged commit 34260b0 into main May 20, 2026
@noahgift noahgift deleted the m290-v1004-followup-snapshot branch May 20, 2026 20:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant