Skip to content

feat(lint): adapter-call-site lint for mag.orchestration YAML keys (gh-ludics-496)#500

Merged
lukstafi merged 3 commits intomainfrom
ludics/gh-ludics-496-s1/root
May 5, 2026
Merged

feat(lint): adapter-call-site lint for mag.orchestration YAML keys (gh-ludics-496)#500
lukstafi merged 3 commits intomainfrom
ludics/gh-ludics-496-s1/root

Conversation

@lukstafi
Copy link
Copy Markdown
Owner

@lukstafi lukstafi commented May 5, 2026

Summary

  • Extends lint:config-reference with a fourth direction that asserts every documented mag.orchestration.<leaf> key in templates/config.reference.yaml has at least one literal read site in src/adapters/t3code.ts or src/adapters/tmux-adapter.ts — closing the silent-inert-key class flagged by PR Split settled-no-signal vs hung detection (task-a670cdbf) #493 round-2.
  • New extractAdapterReadKeysFromSource helper in scripts/lint-config-helpers.ts strips comments before regex so // orchCfg.future_key mentions cannot satisfy coverage. Floor-count + zero-match self-test guards mirror the gh-ludics-406 SILENT-DRIFT doctrine.
  • Captures two competent-SWE-filter learnings to docs/swe-textbook.md (write-side memory; AC11/AC12 negative controls keep the file out of always-loaded coder/reviewer prompts).

Closes #496.

Test plan

  • bun test scripts/lint-config-helpers.test.ts — 32 pass (6 new under extractAdapterReadKeysFromSource)
  • bun test scripts/lint-config-reference.test.ts — 13 pass (5 new under the adapter-call-site direction)
  • bun test docs/swe-textbook.shape.test.ts — 26 pass (3 new under the gh-ludics-496-entries describe block)
  • bun run lint:config-reference against the live tree — exit 0, no inert keys, no self-test errors
  • bun run typecheck — clean
  • bun test — 2243 pass / 0 fail (baseline 2228 + 15 new)
  • AC11/AC12 scope checks: forbidden-glob diff and git grep -F "swe-textbook" in agent-loaded skills both empty

🤖 Generated with Claude Code

@lukstafi
Copy link
Copy Markdown
Owner Author

lukstafi commented May 5, 2026

@codex review Focus on bugs, correctness issues, and edge cases. Do not check adherence to a spec or plan.

@chatgpt-codex-connector
Copy link
Copy Markdown

Codex Review: Didn't find any major issues. Already looking forward to the next diff.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

lukstafi and others added 3 commits May 5, 2026 13:45
Adds a regex-over-source extractor that returns the set of literal
first-segment keys read from `orchCfg` in a single adapter source
file (src/adapters/t3code.ts or src/adapters/tmux-adapter.ts). Strips
JSDoc / block / line comments before matching so a `// orchCfg.foo`
mention does not silently satisfy coverage for an unimplemented key.

Mirrors the SILENT-DRIFT WARNING shape from extractMagPathsFromSource
(gh-ludics-406) — the wrap-it-behind-a-helper failure mode is the
same; the floor-count and zero-match self-test are the same
mechanical guard.

Floor-count tests today: t3code 11 distinct keys (floor 9 with
slack), tmux-adapter 8 distinct keys (floor 6 with slack). Plus a
motivating-keys test that pins both phase_timeouts and
substantive_stall — if either drops out of the live extractor, the
lint that's about to be wired up in scripts/lint-config-reference.ts
stops covering its own precipitating failure class
(task-a670cdbf round-2 review of PR #493).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…cs-496)

Wires `extractAdapterReadKeysFromSource` through `runLint`. The new
direction enumerates the first-segment keys documented under
`mag.orchestration:` in templates/config.reference.yaml and asserts
that each is read by at least one of src/adapters/t3code.ts or
src/adapters/tmux-adapter.ts. Disjunctive coverage matches reality
— `default_mode`/`default_coder`/`default_reviewer` are t3code-only
concepts, and tmux's coverage is otherwise a strict subset.

`RunLintResult` gains two arrays. `inertYamlKeys` lists documented
leaves with no adapter read; `inertSelfTestErrors` fires when either
adapter file's extractor returns the empty set, so a future DRY
refactor that hides every `orchCfg?.<key>` literal behind a helper
fails loudly rather than silently passing the lint at zero coverage.
The two arms are kept distinct so a complete extractor miss is not
misdiagnosed as every YAML key being inert.

`IGNORE_ADAPTER_READ` bootstraps as the empty Set: every current
mag.orchestration: leaf is covered (verified by the live
`bun run lint:config-reference` pass).

Test scaffolding: `makeFixture` now backfills minimal
`const _ = orchCfg?.foo;` stubs into `src/adapters/{t3code,tmux-
adapter}.ts` for every fixture, so existing TS↔YAML / harness-subset
tests don't trip the new self-test arm. The five new
adapter-direction E2E tests override the stubs with explicit content
to pin AC3 (disjunctive), AC4 (first-segment over substantive_stall.*),
AC5 (zero-match self-test), and the AC8 happy/inert pair.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…496)

Both entries cite task-a670cdbf round-2 review of PR #493 as the
precipitating retro:

1. "New OrchestrationConfig fields require parse+merge in adapter
   init" — names the four surfaces (interface, default,
   migrateState backfill, adapter parse+merge) and the
   shared-parser pattern; closed mechanically by the lint shipped
   in the sibling commits.

2. "'Adapter init reads YAML' is a separate AC for
   OrchestrationConfig field additions" — flags the AC-bundling
   anti-pattern that hides the YAML-read step under "config field
   exists".

Both are competent-SWE-filter "obvious-to-experienced-engineer"
items that survive deadline pressure across distant surfaces.
Captured to the textbook's write-side memory rather than promoted
to always-loaded coder/reviewer prompts (per AC11/AC12 negative
control).

AC9 idempotency: pre-append snapshot evaluation showed all three
guard checks (both headlines, shared retro) miss against the
existing seed entry; post-append self-check shows all three hit.

Shape regression test in docs/swe-textbook.shape.test.ts pins each
new headline + retro citation under the existing slice helpers.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@lukstafi lukstafi force-pushed the ludics/gh-ludics-496-s1/root branch from 641344a to 78cb251 Compare May 5, 2026 11:45
@lukstafi lukstafi merged commit c89abd0 into main May 5, 2026
1 check passed
@lukstafi
Copy link
Copy Markdown
Owner Author

lukstafi commented May 5, 2026

Suggest Refactor (coder retrospective for #500 / gh-ludics-496)

What I'd do differently next time, in order of bite:

  1. Read AC guard semantics literally before reaching for
    relaxations.
    AC9 said "Both must miss for the append to
    proceed". My rev-0 merge reached for a "headline-only arm for
    Entry B" because the shared task-a670cdbf retro would trip the
    guard's OR after Entry A's write. The reviewer's correction —
    evaluate the guard once against the pre-append snapshot for
    both entries — is the natural reading once you ask "against
    what file state?". Lesson for next time: when a guard names two
    literal greps, the snapshot the greps run against is part of
    the guard's contract; don't relax the predicate, fix the
    timing.

  2. When a global change widens a tested function's read surface,
    bake the default into the fixture helper, not into per-call-site
    stubs.
    My rev-1 plan said "update the
    describe('runLint', …) happy-path tests with adapter stubs"
    and missed the separate describe('CLI integration', …)
    happy-path fixture. The reviewer caught it with
    sed -n '1,230p'. The shipped implementation went where the
    rev-2 plan eventually pointed: merge defaults inside
    makeFixture itself. The right impulse from rev 0 would have
    been "any test that calls runLint now trips the new arm —
    helper-level backfill is one place to update, per-call-site is
    N places to remember". I should have applied this 1.5 rounds
    earlier.

  3. Disambiguate comment-vs-code mentions during the pre-plan
    audit, not at implementation time.
    My alignment-check grep
    found 2 comment-only mentions of orchCfg.default_mode in
    t3code.ts (lines 721, 744). They are false-friends today (the
    key is also genuinely read), but a future // TODO: handle orchCfg?.future_key mention is a real false-positive risk. I
    noticed this only at implementation time and wired in
    comment-stripping with a (b') test as part of Edit 1. Better:
    the plan should have called out "pre-existing comment mentions
    exist; choose strip-or-not as a design decision in plan, not a
    surprise at impl". Test (b') is the right shape; it should have
    been in the plan from rev 0.

  4. Don't over-budget the fallback path of an idempotency check
    you already know will be clean.
    AC9's fallback (⚠️ ASSUMPTION GAP + escalate if any pre-append guard hits) was real
    engineering insurance, but the plan spent ~30 lines on it. By
    inspection: the seed entry cites gh-ocannl-270, this retro is
    task-a670cdbf, zero overlap; the snapshot would always be
    clean. The escalation path mattered for future tasks where
    that property is not given by inspection, but it didn't need
    that much plan real-estate for this PR. Faster: state the
    fallback in one line, link it to the existing
    Second occurrence: field-shape, move on.

  5. The five-iteration plan ping-pong (Coder → merge-0 →
    review → merge-1 → review → merge-2 → review-pass) was longer
    than necessary.
    Both reviewer REQUEST_CHANGES rounds turned
    on points that a more rigorous "what existing tests does this
    break?" audit at round-1-coder would have surfaced (item 2
    above) or that a literal AC reading would have settled (item 1).
    Worth a personal heuristic: before accepting a planned
    relaxation of a contract clause, write the un-relaxed version
    and ask "is this actually impossible?" — usually it isn't, and
    the relaxation was just convenience.

What worked well and shouldn't change:

  • The four-test-shapes-per-extractor pattern (positive / negative /
    comment-strip / two floor counts / motivating-keys) is the right
    granularity for source-grep extractors. Mirror it next time.
  • The pair (per-key inert array + zero-match self-test array)
    catches both "single literal removed by accident" and "every
    literal collapsed behind a helper". Each guard alone is
    insufficient; together they're the right shape for any
    source-grep lint.
  • Bun-test's mkdtempSync(join(tmpdir(), '<prefix>-')) per-test
    pattern keeps the new helper-test reuse hygienic; matched the
    existing in-file precedent without inventing a new harness.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

New OrchestrationConfig field can be silently inert: needs adapter-call-site lint

1 participant