Commit 7477f08

apankov1 and claude committed
docs: replace testing philosophy with structured testing rules
Evolve the 35-line testing philosophy into a full testing contract with a pre-test gate, QA technique matching, MUST/SHOULD quality standards, and an explicit anti-patterns table. Stripped of project-specific references to serve as a reusable ruleset.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent 831606f commit 7477f08

2 files changed

Lines changed: 91 additions & 53 deletions

README.md

Lines changed: 7 additions & 33 deletions
@@ -61,43 +61,17 @@ Before building a QA/testing skill, try the same prompt without any skill. If th
6161

6262
If the skill just teaches techniques the model already knows, it adds context tokens without adding value.
6363

64-
## Author's testing.md
64+
## Author's testing rules
6565

66-
After retiring the skills, the testing guidance that remains is a 35-line philosophy file used as a Claude Code rule (`.claude/rules/testing.md`). It encodes the principles that survived the evaluation — the stuff that actually matters for test quality:
66+
After retiring the skills, the testing guidance that remains is a structured ruleset used as a Claude Code rule. It encodes the pre-test gate, quality standards, and anti-patterns that survived the evaluation — evolved from a 35-line philosophy into a full testing contract:
6767

6868
See [testing-rules/testing.md](testing-rules/testing.md) for the full file.
6969

70-
```markdown
71-
# Testing
72-
73-
## Test real systems, not simulations
74-
75-
No mocks. Integration tests with real bindings (Miniflare for D1/KV/R2).
76-
vi.fn() only for platform APIs unavailable in test (WebSocket, ExecutionContext).
77-
78-
Why: mocked tests pass while prod breaks. The mock diverges from the real
79-
system silently. If you can test against the real thing, do it.
80-
81-
## Test the boundary, not the internals
82-
83-
Call the exported function. If deleting the call site doesn't break the test,
84-
you're testing the wrong layer. Never write inline "simulators" that copy
85-
production logic — import and call the actual code.
86-
87-
## Bugs get tests first
88-
89-
Write the failing test. Verify it fails for the right reason. Then fix.
90-
This order is non-negotiable — it proves the test actually catches the bug.
91-
92-
## What to test
93-
94-
- **Defect-first**: find fault-prone patterns, target those
95-
- **State machines**: all N×N transitions, not just happy path
96-
- **Combinatorial inputs**: pairwise coverage for multi-factor scenarios
97-
- **Boundaries**: Zod parse at every trust boundary
98-
```
99-
100-
This is the behavioral instruction that 14 skills couldn't improve on.
70+
Key elements:
71+
- **Pre-test gate** — a 6-step mandatory process before writing any test (read the requirement, read the implementation, select QA technique, enumerate cases, write AAA tests, self-verify)
72+
- **QA technique matching** — equivalence partitioning, boundary value analysis, decision tables, state transition testing matched per function
73+
- **Quality standards** with MUST/SHOULD priority markers
74+
- **Anti-patterns table** — 8 explicitly forbidden patterns (tautological assertions, mock-the-SUT, truthiness-only, etc.)
10175

10276
## License
10377

testing-rules/testing.md

Lines changed: 84 additions & 20 deletions
@@ -1,35 +1,99 @@
1-
# Testing
1+
# Testing Rules
22

3-
> Author's `.claude/rules/testing.md` — the testing philosophy that survived evaluating 14 QA skills.
4-
> Used as a Claude Code rule file (path-scoped to `**/*.spec.ts`).
3+
Quality standards and pre-test gate for all test code.
54

6-
## Test real systems, not simulations
5+
Priority markers: **MUST** = correctness risk if violated. **SHOULD** = quality risk. **MAY** = advisory.
76

8-
No mocks. Integration tests with real bindings (Miniflare for D1/KV/R2). `vi.fn()` only for platform APIs unavailable in test (WebSocket, ExecutionContext, DOLogger stubs).
7+
## Pre-Test Gate
98

10-
Why: mocked tests pass while prod breaks. The mock diverges from the real system silently. If you can test against the real thing, do it.
9+
**MUST** complete before writing ANY test file. This is non-negotiable.
1110

12-
## Test the boundary, not the internals
11+
### Step 1: Identify the requirement
1312

14-
Call the exported function. If deleting the call site doesn't break the test, you're testing the wrong layer. Never write inline "simulators" that copy production logic — import and call the actual code.
13+
Which issue or acceptance criteria does this test address? Read them.
1514

16-
If the function is private, extract the pure logic into its own module. Test that module. Production code delegates to it.
15+
### Step 2: Read the implementation
1716

18-
## Bugs get tests first
17+
Read the source file(s) being tested. Identify:
18+
- Exported functions and their signatures
19+
- Decision branches (if/else, switch, early returns)
20+
- Error paths (throws, catch blocks, error returns)
21+
- External boundaries (API calls, DB queries, external service bindings)
22+
- Edge cases visible in the code (null checks, empty arrays, boundary comparisons)
1923

20-
Write the failing test. Verify it fails for the right reason. Then fix. Then full suite. This order is non-negotiable — it proves the test actually catches the bug.
24+
### Step 3: Select QA technique per function
2125

22-
## What to test
26+
Match each function to the appropriate technique:
27+
- Multi-branch logic → Equivalence partitioning (one test per class)
28+
- Threshold / limit → Boundary value analysis (at, below, above)
29+
- Multiple conditions to outcome → Decision table (one test per row)
30+
- Entity lifecycle → State transition testing (valid + invalid transitions)
31+
- Data transformation → Equivalence partitioning + boundaries
32+
- Error handling → Equivalence partitioning (per error category)
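
As an illustration of the threshold row above, here is a minimal sketch of boundary value analysis. The function `qualifiesForFreeShipping` and its limit of 100 are hypothetical stand-ins, not part of the ruleset. The plain filter keeps the snippet runnable without a test runner; in Vitest or Jest each row would likely become its own `it(...)`.

```typescript
// Hypothetical production function: orders of 100 or more qualify.
function qualifiesForFreeShipping(orderTotal: number): boolean {
  return orderTotal >= 100;
}

// Boundary value analysis: one case just below, at, and just above the limit.
const boundaryCases: Array<{ label: string; input: number; expected: boolean }> = [
  { label: "just below the limit", input: 99.99, expected: false },
  { label: "at the limit", input: 100, expected: true },
  { label: "just above the limit", input: 100.01, expected: true },
];

// Collect the labels of failing cases; an empty array means every boundary holds.
const boundaryFailures: string[] = boundaryCases
  .filter((c) => qualifiesForFreeShipping(c.input) !== c.expected)
  .map((c) => c.label);
```

If the comparison were accidentally written as `>` instead of `>=`, only the "at the limit" row would fail, which is the kind of off-by-one defect this technique is meant to expose.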
2333

24-
- **Defect-first**: look at the production code, find the fault-prone patterns, write tests that target those — not tests that exercise the API shape
25-
- **State machines**: test all N×N transitions, not just happy path. Invalid transitions must throw.
26-
- **Combinatorial inputs**: pairwise coverage for multi-factor scenarios. Cover all factor pairs in near-minimal cases.
27-
- **Boundaries**: Zod parse at every trust boundary. Valid input, invalid input, edge values.
34+
### Step 4: Enumerate test cases
2835

29-
## Naming
36+
For each function under test, list cases derived from the technique:
37+
- Name the specific partition, boundary, state transition, or decision row
38+
- Describe the expected output
39+
- Name the production defect it would catch
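
One way to make this enumeration concrete is to write the cases down as data before any test code exists. The sketch below is an assumption-laden example: `normalizeOriginSketch` is a hypothetical stand-in for the production function, and only the `"House"` to `"house"` behavior is taken from the rules file. The trimming and empty-input behavior are invented for illustration.

```typescript
// Hypothetical stand-in for the production function. Only the
// "House" -> "house" behavior comes from the rules file; the rest is assumed.
function normalizeOriginSketch(raw: string): string {
  const trimmed = raw.trim().toLowerCase();
  return trimmed === "" ? "unknown" : trimmed;
}

interface EnumeratedCase {
  partition: string; // which partition, boundary, or decision row
  input: string;
  expected: string; // the expected output
  defectCaught: string; // the production defect this case would expose
}

const originCases: EnumeratedCase[] = [
  { partition: "mixed-case value", input: "House", expected: "house", defectCaught: "missing lowercasing" },
  { partition: "padded value", input: "  house ", expected: "house", defectCaught: "missing trim" },
  { partition: "empty value", input: "", expected: "unknown", defectCaught: "empty string stored as an origin" },
];

// Cases whose actual output differs from the enumerated expectation.
const originFailures: EnumeratedCase[] = originCases.filter(
  (c) => normalizeOriginSketch(c.input) !== c.expected,
);
```

A case that cannot name a defect in the `defectCaught` column is a candidate for removal before it is ever written as a test.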
3040

31-
`module.spec.ts` (unit), `module.workers.spec.ts` (Miniflare), `module.contract.spec.ts` (schema), `module.pairwise.spec.ts` (combinatorial).
41+
### Step 5: Write test code (AAA structure)
3242

33-
Test names describe behavior: `'returns X when Y is Z'`. Never: `'works correctly'`, `'should work'`.
35-
Assertions use specific values: `expect(result.code).toBe('game_not_found')` not `expect(result).toBeDefined()`.
43+
```typescript
it("returns 'house' for 'House' origin value", () => {
  // Arrange
  const input = "House";

  // Act
  const result = normalizeOrigin(input);

  // Assert
  expect(result).toBe("house");
});
```
55+
56+
### Step 6: Self-verify
57+
58+
Could this test fail if the production code had a real bug? If the test would still pass with a broken implementation, delete it.
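
The self-verify question can be checked mechanically by running the same check against a deliberately broken copy of the code. The following is a rough sketch of that idea, not a mutation-testing tool; `clamp`, `clampBroken`, and both checks are hypothetical names introduced for illustration.

```typescript
type Clamp = (value: number, min: number, max: number) => number;

// Correct implementation: keeps the value inside [min, max].
const clamp: Clamp = (value, min, max) => Math.min(Math.max(value, min), max);

// Deliberately broken variant: ignores the upper bound.
const clampBroken: Clamp = (value, min, _max) => Math.max(value, min);

// Weak check: only asks whether a number came back.
const weakCheck = (fn: Clamp): boolean => typeof fn(15, 0, 10) === "number";

// Strong check: asserts the specific expected value.
const strongCheck = (fn: Clamp): boolean => fn(15, 0, 10) === 10;

// A check detects the bug if it passes on the correct code and fails on the broken code.
const weakDetectsBug: boolean = weakCheck(clamp) && !weakCheck(clampBroken);
const strongDetectsBug: boolean = strongCheck(clamp) && !strongCheck(clampBroken);
```

Here the weak check passes for both implementations, so by the rule above it would be deleted; the strong check separates them.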
59+
60+
## Quality Standards
61+
62+
### MUST
63+
64+
- Complete the pre-test gate before writing any test file
65+
- Select a QA technique from the guide for each function under test
66+
- Enumerate test cases before writing test code
67+
- Use AAA structure (Arrange, Act, Assert) in every test
68+
- Assert on computed values with specific expected values — never truthiness-only
69+
- Import and call at least one production function per test file
70+
- No tautological assertions (`expect(true).toBe(true)`)
71+
- No self-referential assertions (`expect(x).toBe(x)`)
72+
- Never mock the system under test — mock only at external boundaries (fetch, timers, external services)
73+
- Include negative test cases (error paths, invalid inputs, throws)
74+
- Bugs get tests first: write the failing test, verify it fails for the right reason, then fix
75+
76+
### SHOULD
77+
78+
- Use test data builders instead of inline object literals
79+
- Name test files after the function or behavior, not the source file
80+
- One focused concern per test file
81+
- Test the boundary, not the internals — if deleting the call site doesn't break the test, you're testing the wrong layer
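
For the test data builder item, a minimal sketch might look like the following. The `User` shape, `buildUser`, and `canDeleteProject` are all hypothetical; the point is only that each test states the fields it cares about and inherits valid defaults for the rest.

```typescript
interface User {
  id: string;
  displayName: string;
  role: "admin" | "member";
  active: boolean;
}

// Builder with valid defaults; tests override only the fields that matter.
function buildUser(overrides: Partial<User> = {}): User {
  return {
    id: "user-1",
    displayName: "Test User",
    role: "member",
    active: true,
    ...overrides,
  };
}

// Hypothetical function under test: only active admins may delete a project.
function canDeleteProject(user: User): boolean {
  return user.active && user.role === "admin";
}

const activeAdmin: User = buildUser({ role: "admin" });
const inactiveAdmin: User = buildUser({ role: "admin", active: false });
const defaultMember: User = buildUser();
```

Compared with inline object literals, adding a new required field to `User` then means updating one builder rather than every test.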
82+
83+
## Test Tiers
84+
85+
- **Unit** (`unit/`): Pure logic, no I/O, no DB. Use fake timers for clock-dependent helpers.
86+
- **Integration** (`integration/`): Real database bindings. Real time only (DB time functions are not controlled by fake timers). Prefer seeded timestamps over elapsed-time waits.
87+
88+
## Anti-Patterns (Explicitly Forbidden)
89+
90+
| Anti-pattern | Example | Why |
91+
|---|---|---|
92+
| Tautological assertion | `expect(true).toBe(true)` | Cannot fail |
93+
| Self-referential | `expect(x).toBe(x)` | Always passes |
94+
| Literal roundtrip | Build `{name: "foo"}`, assert `obj.name === "foo"` | Tests construction |
95+
| Truthiness-only | `expect(result).toBeTruthy()` | Passes for any truthy value |
96+
| Mock the SUT | Mock `doThing` to test `doThing` | Tests the mock |
97+
| Empty test body | `it("works", () => {})` | Proves nothing |
98+
| No production call | `it("adds", () => expect(1+1).toBe(2))` | Tests JavaScript |
99+
| Schema-success-only | `expect(result.success).toBe(true)` | Doesn't verify parsed data |
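
To illustrate the schema-success-only row, the sketch below uses a hand-written parser in place of a schema library, so the result type and `parseConfig` are assumptions made for this example. It contrasts checking the success flag alone with checking the parsed data and a rejected input.

```typescript
type ParseResult =
  | { success: true; data: { port: number } }
  | { success: false; error: string };

// Hypothetical parser: accepts a port string in the range 1 to 65535.
function parseConfig(raw: { port: string }): ParseResult {
  const port = Number.parseInt(raw.port, 10);
  if (Number.isNaN(port) || port < 1 || port > 65535) {
    return { success: false, error: "port must be between 1 and 65535" };
  }
  return { success: true, data: { port } };
}

const accepted: ParseResult = parseConfig({ port: "8080" });

// Schema-success-only: would still hold if the parser returned the wrong port.
const successFlag: boolean = accepted.success;

// Stronger: read the parsed value so a test can compare it to 8080.
const parsedPort: number | undefined = accepted.success ? accepted.data.port : undefined;

// Negative case: an out-of-range port must be rejected.
const rejected: ParseResult = parseConfig({ port: "70000" });
```

A test that asserts only `successFlag` would keep passing if `parseConfig` returned a hard-coded port, whereas asserting on `parsedPort` and on `rejected.success` would not.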
