Skip to content

Commit 8c0087b

Browse files
hyhmrrightclaude
andcommitted
improve: deepen all four skill guides — coverage, output quality & trigger accuracy
- decay-risks.md: promote Hyrum's Law and Orthogonality to named symptoms in R2 - pr-review-guide: add PR size calibration table, split Step 6 into 6a/6b, reframe severity block as tiebreaker referencing per-risk guides - architecture-guide: add Step 5 Testability Seam Assessment (Feathers seam model), simplify Mermaid Phase A/B into linear instruction with explicit Phase B reminder, add Conway's Law calibration examples; renumber Conway's Law to Step 6 - debt-guide: add concrete Pain×Spread calibration examples, add Step 2b for intentional vs accidental debt classification (Cunningham definition), replace unmeasurable date-based criterion with observable payback-plan check - test-guide: split Step 2 into 2a (Test Brittleness) + 2b (Mock Abuse) with merged single-pass sampling, add Characterization Test template (Feathers Ch.8), structure test performance guidance into three severity tiers - all SKILL.md: rewrite trigger descriptions in natural language, add explicit DO NOT trigger guards (fixes brooks-debt false-triggering on HTTP health checks) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
1 parent 1be19b0 commit 8c0087b

9 files changed

Lines changed: 195 additions & 50 deletions

File tree

skills/_shared/decay-risks.md

Lines changed: 7 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -78,8 +78,13 @@ and multiplies regression risk on every change.
7878
- A method uses more data from another class than from its own class
7979
- Two classes know each other's internal state directly
8080
- Changing one module requires recompiling or retesting many unrelated modules
81-
- Any observable behavior (including internal implementation details) is depended upon
82-
by callers, creating a de facto interface beyond the declared API
81+
- **Hyrum's Law**: with sufficient callers, every observable behavior — including
82+
implementation details, error message text, coincidental call ordering, and undocumented
83+
side effects — becomes an implicit contract that callers depend on, even though it was
84+
never guaranteed by the declared API
85+
- **Orthogonality violation**: changing one dimension of a feature forces edits in
86+
unrelated dimensions — adding a new payment type should not require touching logging,
87+
caching, or notification code, but in a non-orthogonal design it does
8388
- Information Leakage: a design decision (e.g., a file format, protocol detail, or data
8489
shape) is encoded in more than one module, so changing it requires coordinated edits
8590
in multiple places even though only one module "owns" the concept

skills/brooks-audit/SKILL.md

Lines changed: 13 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -6,10 +6,15 @@ description: >
66
Domain-Driven Design, A Philosophy of Software Design, Software Engineering at Google,
77
xUnit Test Patterns, The Art of Unit Testing, Working Effectively with Legacy Code,
88
and How Google Tests Software.
9-
Triggers when: user asks to audit architecture, review module structure,
10-
check system design, or assess project organization.
9+
Triggers when: user asks to audit architecture, review module or folder structure,
10+
check system design, understand how the codebase is organized, assess project layout,
11+
or asks "is this a good design?", "where should I put X?", or "why does everything
12+
depend on everything?".
1113
Also triggers when user mentions: clean architecture / dependency inversion /
12-
hexagonal architecture / bounded contexts / module coupling / package structure.
14+
hexagonal architecture / bounded contexts / module coupling / package structure /
15+
tangled dependencies / circular imports / spaghetti code / directory layout.
16+
Do NOT trigger for: PR-level code review (use brooks-review) or line-level refactoring
17+
questions — this skill analyzes structural/module-level concerns, not individual functions.
1318
Use this skill proactively when project structure or module dependencies are discussed.
1419
---
1520

@@ -25,9 +30,10 @@ description: >
2530
## Process
2631

2732
1. Draw the module dependency graph as a Mermaid diagram (Step 1 of the guide)
28-
2. Scan for each decay risk in the order specified in the guide
29-
3. Assign node colors in the Mermaid diagram based on findings (red/yellow/green)
30-
4. Run the Conway's Law check
31-
5. Output using the Report Template from common.md — Mermaid graph FIRST, then Findings
33+
2. Scan for each decay risk in the order specified in the guide (Steps 2–4)
34+
3. Assign node colors in the Mermaid diagram based on findings (red/yellow/green) — do this after Step 4
35+
4. Run the Testability Seam Assessment (Step 5)
36+
5. Run the Conway's Law check (Step 6)
37+
6. Output using the Report Template from common.md — Mermaid graph FIRST, then Findings
3238

3339
**Mode line in report:** `Architecture Audit`

skills/brooks-audit/architecture-guide.md

Lines changed: 36 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -87,11 +87,11 @@ graph TD
8787
class Database,MessageQueue,AuthService,WebApp,MobileApp clean
8888
````
8989

90-
**Phase A (during Step 1):** Generate the graph structure only — nodes, subgraphs, and edges.
91-
Do NOT add `classDef` or `class` lines yet. You need findings from Steps 2-4 before coloring.
90+
Draw the graph structure first — nodes, subgraphs, and edges — without any `classDef` or
91+
`class` lines. You cannot assign colors until you have completed the risk scan in Steps 2–4.
9292

93-
**Phase B (after Step 4):** Add `classDef` definitions and `class` assignments based on findings.
94-
The example above shows the final output after both phases.
93+
**After completing Step 4**, return to this graph and add the `classDef` and `class` lines
94+
based on findings. The example above shows the final colored output.
9595

9696
Rules:
9797
1. **Nodes** — Use top-level directories or services as nodes, not individual files
@@ -142,7 +142,30 @@ Check each in turn:
142142
- Can the module responsibility of each module be stated in one sentence from its name alone?
143143
- Would a new developer know which module to add a new feature to?
144144

145-
### Step 5: Conway's Law Check
145+
### Step 5: Testability Seam Assessment
146+
147+
A *seam* is a place in the architecture where behavior can be altered without editing source
148+
code — typically an interface, a configuration point, or a dependency injection boundary.
149+
Seam density is a proxy for testability and evolvability.
150+
151+
Scan for:
152+
- **No seam at the infrastructure boundary**: can you replace a real database, file system,
153+
or HTTP client with a test double without editing the module under test? If not, the
154+
architecture forces integration tests where unit tests would suffice.
155+
- **Seam collapse**: a module that was once testable in isolation has had its seams removed
156+
(e.g., direct constructor instantiation replaced a dependency injection point, or a global
157+
singleton replaced an injected collaborator).
158+
- **Missing seam in legacy areas**: modules without an obvious injection point or interface
159+
boundary — any change requires touching the entire call stack to substitute behavior.
160+
161+
If all modules have clear seams at their infrastructure boundaries → no finding.
162+
163+
If seams are absent or collapsed: flag as 🟡 Warning with a Remedy pointing to the specific
164+
module and the injection point that needs to be restored or introduced.
165+
166+
Source: Feathers — Working Effectively with Legacy Code, Ch. 4: The Seam Model
167+
168+
### Step 6: Conway's Law Check
146169

147170
After the six-risk scan, assess the relationship between architecture and team structure:
148171

@@ -153,6 +176,14 @@ After the six-risk scan, assess the relationship between architecture and team s
153176
- A mismatch that is theoretical but not yet causing pain is 🟡 Warning.
154177
- If team structure is unknown, note this as context missing and skip the check.
155178

179+
**Calibration examples:**
180+
- 🔴 Critical: the Payments module is owned by Team A but contains auth logic owned by Team B —
181+
every Payments change requires a sync meeting with Team B
182+
- 🟡 Warning: two separate teams own the `utils/` and `helpers/` directories which do the same
183+
things — theoretically painful but not yet causing release coordination issues
184+
- Not a finding: a single team owns a monorepo with multiple logical modules — Conway's Law
185+
misalignment requires *separate teams* to be meaningful
186+
156187
---
157188

158189
## Applying the Iron Law

skills/brooks-debt/SKILL.md

Lines changed: 8 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -6,10 +6,15 @@ description: >
66
Domain-Driven Design, A Philosophy of Software Design, Software Engineering at Google,
77
xUnit Test Patterns, The Art of Unit Testing, Working Effectively with Legacy Code,
88
and How Google Tests Software.
9-
Triggers when: user asks about tech debt, where to refactor, health check,
9+
Triggers when: user asks about tech debt, where to refactor, what to clean up first,
10+
codebase health (in the software quality sense — not server/HTTP health endpoints),
1011
or systemic maintainability questions.
11-
Also triggers when user asks why the codebase is hard to maintain,
12-
why adding developers isn't helping, or why complexity keeps growing.
12+
Also triggers when user asks: why the codebase is hard to maintain, why it's a mess,
13+
why adding developers isn't helping, why complexity keeps growing, what the worst part
14+
of the codebase is, or where to start paying back debt.
15+
Do NOT trigger for: server health checks, HTTP /health endpoints, infrastructure monitoring,
16+
or questions about application uptime — "health check" in those contexts means something
17+
different and this skill is not relevant.
1318
Use this skill proactively when maintainability or refactoring priorities are discussed.
1419
---
1520

skills/brooks-debt/debt-guide.md

Lines changed: 31 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -56,13 +56,19 @@ After listing all findings, score each one:
5656

5757
**Pain score (1–3):** How much does this slow down development today?
5858
- 3: Developers actively avoid touching this area; it causes bugs on most changes
59+
*(e.g., "nobody wants to touch the billing module because it always breaks something")*
5960
- 2: This area is noticeably slower to work in than the rest of the codebase
61+
*(e.g., "adding a field takes 2–3x longer here than elsewhere")*
6062
- 1: This is a quality issue but not currently causing active pain
63+
*(e.g., "inconsistent naming, but we always know what we mean")*
6164

6265
**Spread score (1–3):** How many files, modules, or developers does this affect?
6366
- 3: Affects 5+ modules or all developers on the team
67+
*(e.g., "every new feature touches the God class in core/")*
6468
- 2: Affects 2–4 modules or a subset of the team
69+
*(e.g., "the auth and notification modules are tightly coupled")*
6570
- 1: Isolated to one module or one developer's area
71+
*(e.g., "legacy parser that only one person maintains")*
6672

6773
**Priority = Pain × Spread** (max 9)
6874

@@ -72,6 +78,23 @@ After listing all findings, score each one:
7278
| 4–6 | Scheduled debt | Plan within quarter |
7379
| 1–3 | Monitored debt | Log and watch |
7480

81+
### Step 2b: Classify Debt Intent
82+
83+
After scoring, classify each finding as intentional or accidental:
84+
85+
**Intentional debt** — a conscious shortcut taken to meet a deadline, with the expectation
86+
of paying it back. The team knows about it. It may be legitimate (a strategic prototype,
87+
a known temporary workaround during a migration).
88+
89+
**Accidental debt** — degradation that accumulated without a deliberate decision: the team
90+
did not choose it and may not even know it exists. This is the kind Ward Cunningham's
91+
original definition warned against — not a tactical trade-off, but structural erosion.
92+
93+
Mark each finding with `[intentional]` or `[accidental]` in the Debt Summary Table.
94+
Intentional debt with no visible payback plan — no linked ticket, no code comment, no
95+
documented decision — should be treated as accidental for prioritization purposes.
96+
Focus remediation energy on accidental debt first; intentional debt at least has an owner.
97+
7598
### Step 3: Group by Decay Risk
7699

77100
Report findings grouped by risk type, not by file or module.
@@ -91,14 +114,14 @@ After the Findings section, append a Debt Summary Table:
91114
```
92115
## Debt Summary
93116
94-
| Risk | Findings | Avg Priority | Dominant Classification |
95-
|------|----------|-------------|------------------------|
96-
| Cognitive Overload | N | X.X | Monitored / Scheduled / Critical |
97-
| Change Propagation | N | X.X | ... |
98-
| Knowledge Duplication | N | X.X | ... |
99-
| Accidental Complexity | N | X.X | ... |
100-
| Dependency Disorder | N | X.X | ... |
101-
| Domain Model Distortion | N | X.X | ... |
117+
| Risk | Findings | Avg Priority | Dominant Classification | Intent |
118+
|------|----------|-------------|------------------------|--------|
119+
| Cognitive Overload | N | X.X | Monitored / Scheduled / Critical | intentional / accidental |
120+
| Change Propagation | N | X.X | ... | ... |
121+
| Knowledge Duplication | N | X.X | ... | ... |
122+
| Accidental Complexity | N | X.X | ... | ... |
123+
| Dependency Disorder | N | X.X | ... | ... |
124+
| Domain Model Distortion | N | X.X | ... | ... |
102125
103126
**Recommended focus:** [The one or two risks with the highest average priority — these are
104127
where investment will have the most impact.]

skills/brooks-review/SKILL.md

Lines changed: 9 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -7,11 +7,15 @@ description: >
77
xUnit Test Patterns, The Art of Unit Testing, Working Effectively with Legacy Code,
88
and How Google Tests Software.
99
Triggers when: user asks to review code, check a PR, review a pull request,
10-
or shares a diff for feedback.
11-
Also triggers when user mentions: Brooks's Law / Mythical Man-Month / conceptual integrity /
12-
second system effect / code smells / refactoring / clean architecture / DDD /
13-
domain-driven design / SOLID principles / Hyrum's Law / deep modules / tactical programming.
14-
Use this skill proactively whenever code, a diff, or a PR is shared for review.
10+
shares a diff or pastes code inline asking "does this look right?" or "is this okay?",
11+
or asks for feedback on a specific function, class, or file.
12+
Also triggers when user mentions: code smells / refactoring / clean architecture /
13+
DDD / domain-driven design / SOLID principles / Hyrum's Law / deep modules /
14+
tactical programming / conceptual integrity / Brooks's Law / Mythical Man-Month /
15+
second system effect.
16+
Do NOT trigger for: questions about how to write code from scratch, language syntax
17+
questions, or questions about tools and frameworks where no existing code is being reviewed.
18+
Use this skill proactively whenever existing code, a diff, or a PR is shared for review.
1519
---
1620

1721
# Brooks-Lint — PR Review

skills/brooks-review/pr-review-guide.md

Lines changed: 31 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -11,6 +11,16 @@ in the changed code. Every finding must follow the Iron Law: Symptom → Source
1111
ORM migrations, lock files, minified bundles), skip those files entirely. Generated code reflects
1212
tool choices, not developer decisions. Note in the report which files were skipped and why.
1313

14+
**Scope calibration:** Adjust analysis depth based on PR size before starting.
15+
16+
| PR Size | Approach |
17+
|---------|----------|
18+
| < 50 lines | Focus on Steps 1–3 only; run Step 6a only if imports changed; run Step 6b if any class, method, or variable was renamed or introduced |
19+
| 50–300 lines | Full process, all steps |
20+
| > 300 lines | Full process; note in the Scope line that review is sampled — cover the highest-risk areas rather than every file |
21+
22+
For PRs > 500 lines: flag in the Summary that a PR this size is itself a Change Propagation signal. A change that cannot be reviewed in one pass suggests tangled responsibilities.
23+
1424
---
1525

1626
## Analysis Process
@@ -61,15 +71,20 @@ Look for:
6171
- Does this change add a class that only wraps another class or delegates everything?
6272
- Does this change add configuration options or extension points that serve no current requirement?
6373

64-
### Step 6: Scan for Dependency Disorder and Domain Model Distortion
74+
### Step 6a: Scan for Dependency Disorder
6575

66-
Look for Dependency Disorder:
6776
- Do any new imports create a dependency from a high-level module to a low-level one?
68-
- Do any new imports introduce a cycle?
77+
(e.g., domain service now imports a database driver or HTTP client)
78+
- Do any new imports introduce a cycle between modules?
79+
- Does any new interface force callers to depend on methods they do not use?
80+
81+
If no new imports and no structural changes → skip, no finding.
6982

70-
Look for Domain Model Distortion:
71-
- Do new class or variable names match the language the business uses?
72-
- Does any new class hold only data with no behavior, where behavior was expected?
83+
### Step 6b: Scan for Domain Model Distortion
84+
85+
- Do new class or variable names match the language the business uses for the same concept?
86+
- Does any new class hold only data with no behavior (pure data bag), where behavior was expected?
87+
- Does any new method put logic that belongs to the domain in a service or utility layer?
7388

7489
---
7590

@@ -89,6 +104,16 @@ Do not write a finding that you cannot complete fully. If you can identify a sym
89104
cannot state a consequence, you have not understood the risk well enough — re-read
90105
`../_shared/decay-risks.md` for that risk before writing the finding.
91106

107+
**Severity calibration:** Each risk in `../_shared/decay-risks.md` has its own Severity
108+
Guide with numeric thresholds — use those as the primary reference. When a finding sits
109+
on the boundary between two tiers, use this as a tiebreaker:
110+
- 🔴 Critical — actively breaking velocity or creating production risk *today*
111+
- 🟡 Warning — will if left unaddressed through the next few features
112+
- 🟢 Suggestion — worth fixing when nearby, not urgent
113+
114+
When multiple findings exist, list Critical items first. If there are more than 5 findings,
115+
add a one-line "Recommended fix order" at the end of the Findings section.
116+
92117
---
93118

94119
## Output

skills/brooks-test/SKILL.md

Lines changed: 8 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -4,10 +4,15 @@ description: >
44
Test quality review drawing on twelve classic engineering books, with primary focus
55
on xUnit Test Patterns, The Art of Unit Testing, How Google Tests Software,
66
and Working Effectively with Legacy Code.
7-
Triggers when: user asks about test quality, flaky tests, mock abuse,
8-
test debt, legacy code testability, or shares test files for review.
7+
Triggers when: user asks about test quality, shares test files for review,
8+
or complains that tests keep breaking for no reason, tests are slow, tests are hard
9+
to understand, test setup is complicated, or they can't tell what a test is testing.
910
Also triggers when user mentions: test smells / characterization tests /
10-
test pyramid / test doubles / over-mocking / brittle tests.
11+
test pyramid / test doubles / over-mocking / brittle tests / flaky tests /
12+
too many mocks / tests break on refactoring / slow test suite.
13+
Do NOT trigger for: writing new tests from scratch (use the regular test-writing workflow)
14+
or questions about testing frameworks and syntax — this skill reviews an existing test
15+
suite for structural quality problems, not individual test authoring questions.
1116
Use this skill proactively whenever test files are shared for review.
1217
---
1318

0 commit comments

Comments
 (0)