Commit f612316

apankov1 and claude authored
feat: add defect-first-testing skill + harden slop-detector (#19)
## Summary

- **New skill: `defect-first-testing`**: reverses the test-writing workflow. Instead of starting from "what does this function do?" (which produces slop tests), agents start from "what bugs could exist?" by analyzing the code's fault surface
  - 16 fault detectors across 8 categories (boundary, null-safety, error-handling, math, type-safety, mutation, async, branching)
  - `analyzeFaultSurface()` → `suggestTests()` → write tests → `validateCoverage()` pipeline
  - 15 defect classes with test suggestion templates
  - Fault catalog reference with before/after examples for all 16 patterns
  - 58 tests, zero dependencies, biome-clean
- **Slop-detector hardening**: fixes false positives and improves detection accuracy:
  - `extractCodeOnly()` helper for string/comment-aware matching
  - `checkVacuousProperty`: code-only line analysis avoids string false positives
  - `checkSchemaSuccessOnly`: recognizes `if (!result.success) assert.fail(...)` as legitimate safeParse testing
  - `checkNoProductionCall`: uses `builtinModules` from `node:module` instead of a hardcoded list

Closes #19

## Test plan

- [x] `npx tsx --test skills/defect-first-testing/defect-first.spec.ts`: 58 pass
- [x] `npx tsx --test skills/*/*.spec.ts`: 432 pass (full suite)
- [x] `npx biome check .`: clean

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
1 parent 423e22a commit f612316

6 files changed

Lines changed: 1896 additions & 42 deletions

Lines changed: 142 additions & 0 deletions
@@ -0,0 +1,142 @@
---
name: defect-first-testing
description: "Reverses the test-writing workflow: analyze production code for its fault surface (what bugs could exist), then write tests targeting those specific defects. Use this skill whenever writing tests, generating test files, adding test coverage, or when asked to 'test this function'. It prevents slop tests that compile and pass but catch zero bugs."
---

# Defect-First Testing

Agents tend to write tests that exercise APIs but catch zero bugs. They start from "what does this function do?" and produce tests that mirror the implementation. This skill reverses the workflow: start from "what bugs could exist in this code?" and write tests that would detect those bugs.

**When to use**: Writing tests for any function or module. Generating test files. Adding test coverage. Reviewing whether existing tests actually catch bugs.

**When not to use**: Writing implementation code. Measuring coverage metrics. Testing trivial getters/setters with no logic.

## Rationalizations (Do Not Skip)

| Rationalization | Why It's Wrong | Required Action |
|---|---|---|
| "I'll test the happy path first" | Happy-path tests rarely catch bugs because the happy path usually already works | Start from the fault surface and test defect scenarios first |
| "100% coverage means thorough testing" | Coverage counts lines executed, not bugs caught | Check that each test targets a specific defect class |
| "The function signature tells me what to test" | Signatures describe contracts, not failure modes | Analyze the implementation for fault-prone patterns |
| "I'll add edge cases later" | "Later" rarely comes, and agents don't revisit finished tests | Identify edge cases up front via fault surface analysis |

---

## What To Protect (Start Here)

Before writing tests, analyze the production code for fault-prone patterns:

| Property to Protect | Question to Answer | If Yes → Check Fault Class |
|---|---|---|
| Boundaries are correct | Does the code compare values or use indices? | `off-by-one`, `boundary-zero`, `empty-collection` |
| Null/undefined is handled | Does the code access properties that could be null? | `null-undefined` |
| Errors propagate correctly | Does the code catch or throw errors? | `missing-error-path`, `swallowed-error`, `wrong-error-type` |
| Types are validated | Does the code convert or coerce types? | `type-coercion`, `nan-propagation` |
| Math is safe | Does the code divide, modulo, or use domain-restricted functions? | `division-by-zero`, `nan-propagation` |
| Mutations are intentional | Does the code modify arrays/objects in place? | `shared-mutation` |
| Async failures surface | Does the code use Promise.all or .catch? | `unhandled-rejection` |
| All branches execute | Does the code have switch/if-else chains? | `missing-branch` |
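
To make the table concrete, here is a minimal sketch of how a question such as "does the code divide?" might be turned into a pattern check that yields fault classes. It is not the skill's actual detector code: the `Detector` type, the regular expressions, and `faultClassesFor` are all hypothetical, and the patterns are deliberately naive (they do not skip string literals, for example).

```typescript
// Hypothetical sketch: these detectors are invented for illustration and are
// not the 16 detectors shipped with the skill.
type DefectClass =
  | 'division-by-zero'
  | 'nan-propagation'
  | 'shared-mutation'
  | 'unhandled-rejection';

interface Detector {
  question: string;
  pattern: RegExp;
  defectClasses: DefectClass[];
}

const detectors: Detector[] = [
  {
    question: 'Does the code divide or take a modulo?',
    // An identifier or number on both sides of `/` or `%`; `//` comments do not match.
    pattern: /\w\s*[\/%]\s*\w/,
    defectClasses: ['division-by-zero', 'nan-propagation'],
  },
  {
    question: 'Does the code modify arrays in place?',
    pattern: /\.(push|pop|splice|sort|reverse)\(/,
    defectClasses: ['shared-mutation'],
  },
  {
    question: 'Does the code use Promise.all or .catch?',
    pattern: /Promise\.all\(|\.catch\(/,
    defectClasses: ['unhandled-rejection'],
  },
];

// Returns the unique defect classes whose detector pattern matches the line.
function faultClassesFor(line: string): DefectClass[] {
  const found = new Set<DefectClass>();
  for (const detector of detectors) {
    if (detector.pattern.test(line)) {
      detector.defectClasses.forEach((c) => found.add(c));
    }
  }
  return Array.from(found);
}

console.log(faultClassesFor('const avg = total / count;'));
console.log(faultClassesFor('items.sort();'));
```

A line that matches no detector yields an empty list, which corresponds to answering "no" to every question in the table.
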

---

## The Defect-First Workflow

### Step 1: Analyze the Fault Surface

Read the production code and call `analyzeFaultSurface(source)`. This scans for patterns that historically produce bugs and returns a structured fault surface.

```typescript
const surface = analyzeFaultSurface(productionCode);
// surface.entries: each fault with line, defect class, and test strategy
// surface.summary: fault counts per category
// surface.coverage: unique defect classes found
```
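
The three fields described in the comments above could be modeled roughly as follows. This is a sketch under assumptions: the field names `entries`, `summary`, and `coverage` come from the snippet, but the interfaces `FaultEntry` and `FaultSurface`, the property names inside an entry, and the helper `buildSurface` are hypothetical and may not match the real module.

```typescript
// Hypothetical shapes inferred from the comments in Step 1; the real types
// in defect-first.ts may differ.
interface FaultEntry {
  line: number;
  category: string;     // e.g. 'math', 'boundary'
  defectClass: string;  // e.g. 'division-by-zero'
  testStrategy: string;
}

interface FaultSurface {
  entries: FaultEntry[];
  summary: Record<string, number>; // fault counts per category
  coverage: string[];              // unique defect classes found
}

// Derives the summary and coverage fields from a list of entries.
function buildSurface(entries: FaultEntry[]): FaultSurface {
  const summary: Record<string, number> = {};
  const coverage: string[] = [];
  for (const entry of entries) {
    summary[entry.category] = (summary[entry.category] ?? 0) + 1;
    if (coverage.indexOf(entry.defectClass) === -1) {
      coverage.push(entry.defectClass);
    }
  }
  return { entries, summary, coverage };
}

const exampleSurface = buildSurface([
  { line: 3, category: 'math', defectClass: 'division-by-zero', testStrategy: 'pass 0 as the divisor' },
  { line: 3, category: 'math', defectClass: 'nan-propagation', testStrategy: 'pass NaN as an operand' },
  { line: 9, category: 'boundary', defectClass: 'off-by-one', testStrategy: 'check the last index' },
]);
console.log(exampleSurface.summary, exampleSurface.coverage);
```
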

### Step 2: Generate Test Suggestions

Call `suggestTests(surface)` to get concrete test case suggestions for each defect class.

```typescript
const suggestions = suggestTests(surface);
// Each suggestion: name, defectComment, inputs, expectedBehavior
```

### Step 3: Write Tests

For each suggestion, write a test that:
1. Starts with a `// Defect:` comment explaining what production bug this catches
2. Constructs inputs that trigger the specific fault
3. Asserts on the specific behavior that would break if the defect existed

```typescript
// Defect: off-by-one in loop termination. Iterating up to index `arr.length`
// instead of `arr.length - 1` reads past the last valid element.
it('handles boundary at last element', () => {
  const result = processItems([1, 2, 3]);
  assert.equal(result.lastProcessed, 3);
});
```
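
For a version that runs on its own, the sketch below pairs a production function with defect-targeting tests. The `processItems` body is hypothetical, since the original does not show it, and the runner is assumed to be `node:test` with `node:assert/strict`, in line with the `npx tsx --test` commands in the test plan.

```typescript
import { it } from 'node:test';
import assert from 'node:assert/strict';

// Hypothetical production function, invented for this example.
function processItems(items: number[]): { lastProcessed: number | undefined; count: number } {
  let lastProcessed: number | undefined;
  let count = 0;
  // Correct bound is `i < items.length`. A buggy `i <= items.length` would
  // read items[items.length], which is undefined.
  for (let i = 0; i < items.length; i++) {
    lastProcessed = items[i];
    count++;
  }
  return { lastProcessed, count };
}

// Defect: off-by-one in loop termination. Reading one index past the end
// would leave lastProcessed undefined and inflate count to 4.
it('stops at the last element instead of reading past it', () => {
  const result = processItems([1, 2, 3]);
  assert.equal(result.lastProcessed, 3);
  assert.equal(result.count, 3);
});

// Defect: empty-collection. A loop that assumes at least one element would
// report a processed item where none exists.
it('reports no processed element for empty input', () => {
  const result = processItems([]);
  assert.equal(result.lastProcessed, undefined);
  assert.equal(result.count, 0);
});
```

Each test asserts on a value that changes if the named defect is introduced, which is the property the three numbered requirements above are meant to guarantee.
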
79+
80+
### Step 4: Validate Coverage
81+
82+
Call `validateCoverage(testSource, surface)` to check that your tests cover the identified fault surface.
83+
84+
```typescript
85+
const validation = validateCoverage(testSource, surface);
86+
// validation.covered — number of defect classes with targeting tests
87+
// validation.gaps — defect classes with no targeting test
88+
// validation.score — 0-100 coverage score
89+
```
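
The original does not say how `validateCoverage()` decides that a test targets a defect class or how the score is derived. One plausible reading, sketched below purely as an assumption, is to look for each defect class name in the `// Defect:` comments and report the covered fraction as a percentage. The function `sketchValidateCoverage` and its matching rule are hypothetical.

```typescript
// Hypothetical sketch: the matching rule and score formula are assumptions,
// not the behavior of the real validateCoverage().
interface CoverageReport {
  covered: number;
  gaps: string[];
  score: number; // 0-100
}

function sketchValidateCoverage(testSource: string, defectClasses: string[]): CoverageReport {
  // Only text on `// Defect:` comment lines counts as targeting a class.
  const defectComments = testSource
    .split('\n')
    .filter((line) => line.trim().indexOf('// Defect:') === 0)
    .join('\n');

  const gaps = defectClasses.filter((c) => defectComments.indexOf(c) === -1);
  const covered = defectClasses.length - gaps.length;
  // An empty fault surface is treated as fully covered to avoid dividing by zero.
  const score =
    defectClasses.length === 0 ? 100 : Math.round((covered / defectClasses.length) * 100);
  return { covered, gaps, score };
}

const sampleTests = [
  '// Defect: off-by-one in loop termination',
  "it('stops at the last element', () => {});",
  "const note = 'division-by-zero appears only inside a string';",
].join('\n');

console.log(sketchValidateCoverage(sampleTests, ['off-by-one', 'division-by-zero']));
```
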
90+
91+
---
92+
93+
## Included Utilities
94+
95+
```typescript
96+
import {
97+
analyzeFaultSurface,
98+
suggestTests,
99+
validateCoverage,
100+
formatTestPlan,
101+
} from './defect-first.ts';
102+
```
103+
104+
---
105+
106+
## Key Principle: Every Test Needs a Defect Hypothesis
107+
108+
A test without a defect hypothesis is just an API exercise. Before writing `it('should return X')`, answer: **"What production bug does this test catch?"**
109+
110+
If you can't name the bug, don't write the test.
111+
112+
| Bad (API exercise) | Good (defect hypothesis) |
113+
|---|---|
114+
| `it('returns an array')` | `it('returns empty array for empty input, not undefined')` |
115+
| `it('handles valid input')` | `it('rejects NaN when numeric input expected')` |
116+
| `it('calls the callback')` | `it('calls callback exactly once, not per retry attempt')` |
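
The last row of the table can be illustrated with a runnable sketch. The `retry` helper and `flakyOperation` factory below are hypothetical and exist only to give the two tests something to exercise; the runner is assumed to be `node:test`.

```typescript
import { it } from 'node:test';
import assert from 'node:assert/strict';

// Hypothetical production helper: retries a synchronous operation and
// reports the first successful value through onSuccess.
function retry<T>(operation: () => T, maxAttempts: number, onSuccess: (value: T) => void): void {
  let lastError: unknown = new Error('maxAttempts must be at least 1');
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    let value: T;
    try {
      value = operation();
    } catch (error) {
      lastError = error;
      continue; // try again; the callback must not fire for failed attempts
    }
    onSuccess(value);
    return;
  }
  throw lastError;
}

// Builds an operation that throws `failures` times, then returns 'ok'.
function flakyOperation(failures: number): () => string {
  let calls = 0;
  return () => {
    calls++;
    if (calls <= failures) throw new Error(`attempt ${calls} failed`);
    return 'ok';
  };
}

// API exercise: passes even if the callback fires on every attempt.
it('calls the callback', () => {
  let called = false;
  retry(flakyOperation(0), 3, () => { called = true; });
  assert.ok(called);
});

// Defect: invoking the callback inside the retry loop for each attempt
// would deliver duplicate notifications after transient failures.
it('calls callback exactly once, not per retry attempt', () => {
  const received: string[] = [];
  retry(flakyOperation(2), 3, (value) => { received.push(value); });
  assert.deepEqual(received, ['ok']);
});
```

The first test cannot fail for the duplicate-callback bug because it never forces a retry; the second forces two failures and checks the exact sequence of callback arguments.
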

---

## Violation Rules

| Rule | Severity | Description |
|---|---|---|
| Tests written without fault surface analysis | must-fix | Run `analyzeFaultSurface()` on the production code before writing each test file |
| Test without `// Defect:` comment | should-fix | Every `it()` block should name its defect hypothesis |
| Defect class in surface with no targeting test | should-fix | `validateCoverage()` reports gaps |
| Test asserts on type/truthiness only | must-fix | Assertions must verify specific values, not just existence |
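
As an illustration of the last rule, the sketch below contrasts assertions that check only type or truthiness with assertions on specific values. The `parseQuantity` function is hypothetical and was written for this example.

```typescript
import assert from 'node:assert/strict';

// Hypothetical production function, invented for this example.
function parseQuantity(raw: string): number {
  // Number('') is 0, so blank input is rejected explicitly.
  const parsed = raw.trim() === '' ? NaN : Number(raw);
  if (!Number.isInteger(parsed) || parsed < 0) {
    throw new RangeError(`invalid quantity: ${raw}`);
  }
  return parsed;
}

// Violates the rule: the typeof check passes even for NaN, and the
// truthiness check passes for any non-zero number.
assert.equal(typeof parseQuantity('3'), 'number');
assert.ok(parseQuantity('3'));

// Satisfies the rule: each assertion pins a specific value or error type.
assert.equal(parseQuantity('3'), 3);
assert.equal(parseQuantity('0'), 0);
assert.throws(() => parseQuantity('abc'), RangeError);
assert.throws(() => parseQuantity(''), RangeError);
```
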

---

## Companion Skills

- **slop-test-detector**: Run after generating tests to catch remaining slop patterns. If `analyzeTestFile()` reports must-fail findings, the tests need rework.
- **fault-injection-testing**: For testing failure paths in code with external dependencies.
- **pairwise-test-coverage**: For combinatorial input coverage when multiple parameters interact.
- **model-based-testing**: For testing state machine transitions.

---

## Reference

See [references/fault-catalog.md](references/fault-catalog.md) for the complete catalog of 16 code patterns and their associated defect classes, with before/after examples.
