feat(mag): docs/swe-textbook.md as Mag-side write memory for filter-rejected retros by lukstafi · Pull Request #499 · lukstafi/ludics

lukstafi · 2026-05-05T09:15:18Z

Summary

Implements task-c4e0e80a. Introduces docs/swe-textbook.md as a write-only journal for competent-SWE-filter-rejected retro learnings, plus a capture-textbook disposition in /ludics-process-suggestions and the /ludics-feedback-digest worker. The textbook is consulted only by Mag and the feedback-digest worker; coder/reviewer agent prompts gain no pull-side pointer (AC7 negative control).

New doc: docs/swe-textbook.md — preamble enumerates five directionality statements, four labelled entry-shape fields, single canonical ## Capture Idempotency section with the bash snippet, seed entry citing gh-ocannl-270.
Three skill files gain capture-textbook: ludics-process-suggestions.md (three-way classify, new  step, sibling textbookCaptures array, third Judgment Criteria bucket); ludics-feedback-digest-worker.md (new ### 3a step, 7th Response Contract field); ludics-feedback-digest.md (Status routing + Result fields + output template).
worker-conventions.md: one new row in ## Field Contract Reference (feedback-digest | textbookCaptures). The row contains no swe-textbook substring → AC7 stays clean.

AC verification

AC1, AC2, AC6 — docs/swe-textbook.md preamble has the five directionality statements + four labelled fields; seed entry instantiates fields a–d citing gh-ocannl-270. Pinned by docs/swe-textbook.shape.test.ts (AC1/AC2/AC6 tests).
AC3 —  slice has the three-way split with capture-textbook;  JSON example has textbookCaptures: [{suggestion, entryHeadline, precipitatingRetro}]; ## Judgment Criteria has a third bucket. Pinned by AC3a/AC3b/AC3c shape-test slices.
AC4 — Worker ### 3a step + 7th Response Contract field; orchestrator ## Status routing row + ## Result fields JSON + output template extension; worker-conventions.md field-contract row. Pinned by AC4a–AC4d shape-test assertions; cross-checked by scripts/lint-contracts.ts (worker ### Response Contract ↔ orchestrator ## Status routing field-pair coherence).
AC5 — Idempotency guard lives only in docs/swe-textbook.md#capture-idempotency. Both skills cite the anchor and describe inputs/outputs in prose; neither duplicates the bash snippet. Enforced by AC5-positive (cardinality + canonical-snippet presence + corrected ENTRY_HEADLINE input contract per PR #499 review) AND AC5-negative (no skill body contains grep -Fq "### ${ENTRY_HEADLINE}", grep -Fq "${PRECIPITATING_RETRO}", echo "skip-duplicate", or echo "append") shape-test assertions.
AC7 — Recursive walker in docs/swe-textbook.shape.test.ts asserts no non-allowlisted skills/**.md file contains swe-textbook; mutation-tested by appending the string to worker-conventions.md (assertion fired) and reverting. The proposal's literal grep has a known mechanical defect (its -v only strips diff header lines, not in-scope-file content lines); the corrected per-file grep returns zero hits, and the shape-test walker is the canonical enforcement.
AC8 — git diff --name-only main...HEAD | grep -E '^src/(coder|reviewer|orchestration)/' returns zero lines.

Test plan

bun test → 2228 pass / 0 fail (was 2205 pre-edit; +23 new tests).
bun test docs/swe-textbook.shape.test.ts → 23/23 pass / 68 expect() calls.
bun test scripts/lint-contracts.test.ts → 31/31 pass.
bun run lint:contracts → clean (worker/orchestrator field contracts in sync).
bun run lint:skill-cli-refs → clean (all 92 refs across 51 files resolve).
bun run typecheck → clean.
AC7 grep (corrected per-file form) → zero hits across the four in-scope skills/ files.
AC8 grep on src/(coder|reviewer|orchestration)/ → zero hits.

Scope expansions (declared)

docs/swe-textbook.shape.test.ts (new) — regression-test infrastructure for the new doc and skill markdown. Mirrors docs/task-frontmatter-reference.shape.test.ts shape.
skills/worker-conventions.md row addition — canonical field-contract reference must include textbookCaptures to avoid silent drift the runtime lint cannot catch (scripts/lint-contracts.ts REVERSE_EXCLUDE skips this file). The new row contains no swe-textbook substring, keeping AC7 clean.

Commits

37c8893 — feature implementation (six files, +222/−11).
9f911db — fix(test): remove a literal NUL byte from the new shape test that turned it into a git-binary blob; refactored to a sliceToEnd(body, opener) helper.
1f1815e — fix: clarify ENTRY_HEADLINE input contract excludes the leading ### prefix (codex review on PR #499 — guard would have false-appended on already-captured headlines under the original contract phrasing); regression test added (mutation-confirmed).

🤖 Generated with Claude Code

…er-rejected retro learnings

Implements task-c4e0e80a. Introduces docs/swe-textbook.md as a write-only journal of competent-SWE-filter-rejected retro learnings, plus a `capture-textbook` disposition in /ludics-process-suggestions and the /ludics-feedback-digest worker. The textbook is consulted only by Mag and the feedback-digest worker; coder/reviewer agent prompts gain no pull-side pointer, preserving always-loaded prompt leanness (AC7 negative control).

AC coverage:
- AC1/AC2/AC6: docs/swe-textbook.md preamble enumerates five directionality statements + four labelled entry-shape fields; seed entry instantiates the gh-ocannl-270 lesson.
- AC3: ludics-process-suggestions.md classification step rewritten three-way; new  step routes to the canonical idempotency check; result-JSON example gains a sibling textbookCaptures array; Judgment Criteria gains a third bucket.
- AC4: feedback-digest worker gains ### 3a filter step and a 7th Response Contract field (textbookCaptures); orchestrator preserves the field via Status routing + Result fields + output template. worker-conventions.md ## Field Contract Reference gains a row.
- AC5: idempotency guard lives in exactly one location (docs/swe-textbook.md#capture-idempotency); both skills cite the anchor and describe inputs/outputs in prose only — neither duplicates the bash snippet, enforced by AC5-negative shape-test assertions on each skill body.
- AC7/AC8: zero src/coder|reviewer|orchestration paths in diff; zero non-allowlisted skills/* files mention swe-textbook.

Regression coverage: docs/swe-textbook.shape.test.ts (22 tests, 66 expect calls). Existing scripts/lint-contracts.ts cross-checks worker/orchestrator field-pair coherence on textbookCaptures.

scope-expansion: docs/swe-textbook.shape.test.ts — regression-test infrastructure for the new doc and skill markdown.
scope-expansion: skills/worker-conventions.md row — canonical field-contract reference must include textbookCaptures (the lint-contracts.ts REVERSE_EXCLUDE skips this file, so silent drift is otherwise unchecked); the new row contains no `swe-textbook` substring, keeping AC7 clean.

Test plan:
- bun test → 2227 tests / 0 fail (was 2205 pre-edit; +22 new tests).
- bun test docs/swe-textbook.shape.test.ts → 22/22 pass.
- bun test scripts/lint-contracts.test.ts → 31/31 pass.
- bun run lint:contracts → clean.
- bun run lint:skill-cli-refs → clean.
- bun run typecheck → clean.
- AC7 grep (verbatim from proposal) → zero lines.
- AC8 grep on src/coder|reviewer|orchestration → zero lines.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The previous commit's seed-entry slice used `/^\0NEVER_MATCH/` as a never-match closer regex, which embedded a literal NUL byte in the test source. Git treated the file as binary (`git diff --numstat main...HEAD` reported `- -`), making the regression test opaque in normal review diffs.

Replace the NUL-bearing impossible-regex pattern with a small `sliceToEnd(body, opener)` helper that walks from the opener line to end of file. Same semantics for the seed-entry slice, no embedded control bytes — git now sees a normal text blob.

Test plan:
- bun test docs/swe-textbook.shape.test.ts → 22/22 pass.
- python3 NUL-count probe on the file → 0.
- file(1) reports the file as UTF-8 text, no longer binary.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 9f911dbc3e


chatgpt-codex-connector · 2026-05-05T09:17:57Z

+
+Inputs from the calling skill:
+
+- `ENTRY_HEADLINE` — the proposed `### <headline>` text.


Normalize ENTRY_HEADLINE before idempotency grep

The input contract says ENTRY_HEADLINE is the full ### <headline> text, but the canonical guard searches with grep -Fq "### ${ENTRY_HEADLINE}"; if a caller follows the documented contract literally, the pattern becomes ### ### ... and existing entries are not detected. This causes the duplicate guard to return append for already-captured headlines, so repeated retros can create duplicate textbook entries instead of updating Second occurrence:.


… `### ` prefix

PR #499 review (codex): the canonical idempotency guard's input contract said "ENTRY_HEADLINE — the proposed `### <headline>` text" but the bash snippet does `grep -Fq "### ${ENTRY_HEADLINE}"`. A caller following the contract literally produces `### ### <headline>` and the guard never matches existing entries — false-`append` on already-captured headlines, so repeated retros create duplicate textbook entries instead of amending `Second occurrence:`. AC5's falsifier is reachable through this contract bug.

Fix: clarify the input contract — `ENTRY_HEADLINE` is the bare headline phrase WITHOUT the leading `### ` markdown prefix; the guard prepends `### ` itself. Add an explicit example. The bash snippet is unchanged (it was already correct); only the contract description needed to match it. Both skills' prose ("derive `ENTRY_HEADLINE` (a short pattern-naming phrase)") was already consistent with the corrected contract — only the doc needed tightening.

Add a regression test in `docs/swe-textbook.shape.test.ts` (AC5-positive — input contract …) that pins the corrected phrasing and the "guard prepends `### `" claim. Mutation-tested by reverting to the buggy phrasing → assertion fails (1/23); restored → 23/23.

Test plan:
- bun test docs/swe-textbook.shape.test.ts → 23/23 pass.
- bun run typecheck → clean.
- bun run lint:contracts → clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

lukstafi · 2026-05-05T09:20:41Z

note: branch state has drifted since this body was written (baseline: 2 commits at 2026-05-05T09:15:28Z, current: 3 commits). consider gh pr edit https://github.com/lukstafi/ludics/pull/499 to refresh.

lukstafi · 2026-05-05T09:20:42Z

note: branch state has drifted since this body was written (baseline: 2 commits at 2026-05-05T09:15:32Z, current: 3 commits). consider gh pr edit https://github.com/lukstafi/ludics/pull/499 to refresh.

lukstafi · 2026-05-05T09:24:34Z

What I'd do differently next time — task-c4e0e80a (PR #499)

Five durable lessons from this round, ranked by cost paid:

1. Dry-run bash snippets against an example before publishing the contract

Codex caught the ENTRY_HEADLINE contract bug (PR #499 inline review).
The doc said ENTRY_HEADLINE — the proposed ### <headline> text, but
the snippet does grep -Fq "### ${ENTRY_HEADLINE}". A 30-second
dry-run with a concrete example (ENTRY_HEADLINE="### My pattern" →
grep -Fq "### ### My pattern" → never matches) would have caught
this in round 1. Cost paid: an extra commit (1f1815e) + a review
round on a contract bug whose falsifier matched AC5's literal text.

Refactor: when a doc-contract describes a snippet's inputs, paste
the inputs into the snippet mentally (or in a scratch buffer) and
trace what the resulting command actually runs. The "publication
seed" framing of swe-textbook.md made this worse — entries are
written for unfamiliar future readers, so the contract has to be
unambiguous to someone with zero context. Self-contradicting prose +
runnable bash is a high-value bug class to scan for.
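
The dry-run described above can be sketched in a few lines of bash. This is a minimal illustration and not the canonical guard from docs/swe-textbook.md: the temp file, the headline "My pattern", and the helper name headline_exists are all hypothetical stand-ins, and only the grep quoted in the review is reproduced.

```bash
# Hypothetical textbook file containing one already-captured entry.
textbook="$(mktemp)"
printf '%s\n' '# SWE textbook' '' '### My pattern' 'Body of the entry.' > "$textbook"

# headline_exists TEXTBOOK ENTRY_HEADLINE (hypothetical helper)
# Mirrors only the quoted grep: the check prepends "### " itself.
headline_exists() {
  grep -Fq "### $2" "$1"
}

# Literal reading of the original contract: the caller passes the "### "
# prefix, so the search string becomes "### ### My pattern".
if headline_exists "$textbook" "### My pattern"; then
  echo "prefixed: found"
else
  echo "prefixed: missed"   # this branch runs: the existing entry is not detected
fi

# Corrected contract: the caller passes the bare headline phrase.
if headline_exists "$textbook" "My pattern"; then
  echo "bare: found"        # this branch runs
else
  echo "bare: missed"
fi
```

Tracing both readings of the contract side by side makes the mismatch visible before a reviewer has to find it.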

2. Pick ASCII-printable sentinels for "impossible regex" closers

Used /^\0NEVER_MATCH/ as a never-match closer for a slice that
should run to EOF. The literal NUL byte flipped git's text/binary
detection and turned the test file into a binary blob — opaque in PR
review (git diff --numstat showed - -). Cost: an extra review
round + commit 9f911db.

Refactor: sentinels stay ASCII-printable
(/^__NEVER_MATCH_SENTINEL__$/), OR refactor to a closer-less helper
(sliceToEnd(body, opener)). One-liner pre-commit self-checks:

file <test-file> should report a text type such as "UTF-8 text".
python3 -c "import sys; print(open(sys.argv[1],'rb').read().count(b'\x00'))" <test-file> should print 0.
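
A shell-only variant of the NUL probe, for cases where python3 is not at hand. This is a sketch under assumptions: the helper name count_nul and both sample files are hypothetical, and the sample contents only imitate the two sentinel styles discussed above.

```bash
# count_nul FILE (hypothetical helper): print the number of NUL bytes in FILE.
count_nul() {
  # tr -cd keeps only NUL bytes; wc -c counts them; the last tr strips padding.
  tr -cd '\000' < "$1" | wc -c | tr -d ' '
}

with_nul="$(mktemp)"
without_nul="$(mktemp)"

# \000 in the printf format writes one literal NUL byte into the file.
printf 'const closer = /^\000NEVER_MATCH/;\n' > "$with_nul"
printf 'const closer = /^__NEVER_MATCH_SENTINEL__$/;\n' > "$without_nul"

echo "NUL bytes with control-byte sentinel: $(count_nul "$with_nul")"      # 1
echo "NUL bytes with printable sentinel:    $(count_nul "$without_nul")"   # 0
```

A non-zero count is the signal that git may classify the file as binary.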

3. `bun test | tail -40` buffers; pipe to a file instead

Round-1 plan was written under the false belief that bun test had
hung — the harness's output file stayed empty for ~15 minutes
because tail -40 only emits at pipeline end. The merge plan had to
record a "baseline pending" placeholder that the reviewer (correctly)
rejected, forcing a second merge round.

Refactor: when the suite is long, pipe to a file
(bun test > /tmp/x.out 2>&1) or use line-buffered grep
(bun test 2>&1 | grep --line-buffered -E "fail|Ran"). Don't proceed
with a "narrow baseline" placeholder when ~50s of patience produces
the real one.
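
When redirecting a suite's output to a file, the order of the redirections decides whether stderr lands in the file at all. The sketch below uses a hypothetical fake_suite function as a stand-in for a test runner; which stream a real runner such as bun test uses for its report is an assumption to verify locally, which is why capturing both streams is the safer default.

```bash
# fake_suite (hypothetical): progress goes to stderr, the summary to stdout.
fake_suite() {
  echo "pass: example test" >&2
  echo "Ran 1 test"
}

out_wrong="$(mktemp)"
out_right="$(mktemp)"

# "2>&1" first: stderr is pointed at the current stdout (the terminal),
# and only afterwards is stdout moved to the file. stderr is NOT captured.
fake_suite 2>&1 > "$out_wrong"

# File first, then "2>&1": both streams end up in the file.
fake_suite > "$out_right" 2>&1

echo "pass lines captured (2>&1 first): $(grep -c 'pass:' "$out_wrong")"   # 0
echo "pass lines captured (file first): $(grep -c 'pass:' "$out_right")"   # 1
```

With the file-first form, the output file grows while the suite runs, so progress can be followed with tail -f from another terminal instead of waiting for the pipeline to end.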

4. Run mutations BEFORE writing the verification line

Wrote "mutation-tested by appending the string to
worker-conventions.md (assertion fired)" in the AC7 verification
before actually running the mutation. The claim happened to be
true, but the ordering was sloppy and would have been embarrassing
if the assertion hadn't actually fired.

Refactor: per feedback_ac_self_check_invariant_plus_harness.md,
the right order is: (a) write the assertion, (b) instrument the
mutation, (c) observe the failure, (d) revert, (e) write the
verification line citing the observed failure. No claim before the
observation.
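
The (a) to (e) ordering can be sketched as a small script that refuses to print the verification line until the failure has been observed. Everything here is illustrative: the temp directory stands in for skills/, and assert_no_pointer is a hypothetical substitute for the shape-test walker, not its actual implementation.

```bash
skills="$(mktemp -d)"                 # stand-in for the skills/ directory
target="$skills/worker-conventions.md"
backup="$(mktemp)"
printf '%s\n' '| feedback-digest | textbookCaptures |' > "$target"

# (a) The assertion: no file under the directory mentions swe-textbook.
assert_no_pointer() {
  if grep -rqF 'swe-textbook' "$1"; then
    return 1
  fi
  return 0
}

run_check() {
  if assert_no_pointer "$skills"; then echo pass; else echo fail; fi
}

baseline="$(run_check)"

# (b) Instrument the mutation, keeping a backup for the revert.
cp "$target" "$backup"
echo 'see docs/swe-textbook.md' >> "$target"

# (c) Observe the failure.
mutated="$(run_check)"

# (d) Revert and confirm the assertion clears.
cp "$backup" "$target"
reverted="$(run_check)"

# (e) Only now is the verification line justified.
if [ "$baseline" = pass ] && [ "$mutated" = fail ] && [ "$reverted" = pass ]; then
  echo "verified: assertion fired under mutation and cleared on revert"
else
  echo "do not write the verification line yet"
fi
```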

5. Grep wider than the plan's occurrence list before declaring scope

The reviewer caught worker-conventions.md ## Field Contract Reference as an in-scope file I had originally classified SKIP.
The grep I ran in round 1 was for swe-textbook|capture-textbook|textbookCaptures — but the file's role as "canonical cross-skill
reference" was visible in its own preamble at line 186, and the
existing feedback-digest rows at 224–229 made it obvious that a
new field needed a row. The runtime lint (scripts/lint-contracts.ts)
puts the file in REVERSE_EXCLUDE, so silent drift was the failure
mode if I shipped without it.

Refactor: when adding a new structured field, grep not just for
the field name but for the type of artifact the field belongs to —
"Field Contract Reference", "schema", "row", "table" — to surface
canonical-reference docs that the test/lint layer doesn't gate.
Borrow the orchestration pattern's "data-shape consumer sweep"
discipline: every consumer gets a per-row disposition, including the
ones humans read.
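
One way to run the wider sweep is to compare two greps: one for the new field's names, one for artifact-type terms. The sketch below builds a throwaway directory to show the idea; the file contents and the field name exampleField are hypothetical, and the artifact-term list is only a starting point.

```bash
repo="$(mktemp -d)"                   # throwaway stand-in for the repository
mkdir -p "$repo/skills"
printf '%s\n' '### Response Contract' '- textbookCaptures' \
  > "$repo/skills/ludics-feedback-digest-worker.md"
printf '%s\n' '## Field Contract Reference' '| feedback-digest | exampleField |' \
  > "$repo/skills/worker-conventions.md"

narrow_list="$(mktemp)"               # kept outside $repo so grep -r skips them
wide_list="$(mktemp)"

# Narrow sweep: files that already mention the new names.
grep -rlE 'swe-textbook|capture-textbook|textbookCaptures' "$repo" \
  | LC_ALL=C sort > "$narrow_list"

# Wide sweep: files that look like canonical references for structured fields.
grep -rliE 'field contract reference|schema' "$repo" \
  | LC_ALL=C sort > "$wide_list"

# Files found only by the wide sweep still need a per-row disposition.
missed="$(LC_ALL=C comm -13 "$narrow_list" "$wide_list")"
echo "needs a disposition: ${missed#"$repo"/}"
```

Here the wide sweep surfaces skills/worker-conventions.md, which the narrow sweep cannot find because the file does not yet contain the new field name.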

These five tracked the round's actual cost. The first two cost a
review round each; the others cost reviewer-pushback rounds at the
plan-merge phase. None of the five are about Ludics-internal
mechanics — all five cleanly migrate to the broader "competent-SWE
write-memory" the merged textbook now houses, once entries accrue
beyond the seed.

lukstafi and others added 2 commits May 5, 2026 11:04

chatgpt-codex-connector Bot reviewed May 5, 2026


lukstafi merged commit a478612 into main May 5, 2026
1 check passed

lukstafi deleted the ludics/task-c4e0e80a-s2/root branch May 5, 2026 09:23

lukstafi merged 3 commits into main from ludics/task-c4e0e80a-s2/root
