Skip to content

Commit df84567

Browse files
fix: test gate allowlist filtering and fail-open on timeout (dso-sfoc) (merge worktree-20260321-144840)
2 parents 7b4a53a + 13744a3 commit df84567

File tree

12 files changed

+710
-22
lines changed

12 files changed

+710
-22
lines changed

.tickets/.index.json

Lines changed: 73 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -91,6 +91,16 @@
9191
"title": "Cleanup tests",
9292
"type": "epic"
9393
},
94+
"dso-5ooy": {
95+
"deps": [
96+
"w21-ykic",
97+
"w21-ovpn"
98+
],
99+
"priority": 2,
100+
"status": "open",
101+
"title": "Conditional Security & Performance Review Overlays",
102+
"type": "epic"
103+
},
94104
"dso-6oo5": {
95105
"deps": [
96106
"dso-wqt4",
@@ -185,6 +195,42 @@
185195
"title": "Display Task titles during sprint batch execution",
186196
"type": "epic"
187197
},
198+
"dso-b538": {
199+
"deps": [
200+
"dso-bxng"
201+
],
202+
"priority": 2,
203+
"status": "open",
204+
"title": "As a DSO practitioner, the Deep Sonnet C reviewer applies deep hygiene, design, and maintainability checks",
205+
"type": "story"
206+
},
207+
"dso-b7nb": {
208+
"deps": [
209+
"dso-bxng"
210+
],
211+
"priority": 2,
212+
"status": "open",
213+
"title": "As a DSO practitioner, the Deep Sonnet A reviewer applies deep correctness checks with acceptance criteria validation",
214+
"type": "story"
215+
},
216+
"dso-boct": {
217+
"deps": [
218+
"dso-9ltc"
219+
],
220+
"priority": 2,
221+
"status": "open",
222+
"title": "As a DSO practitioner, the Light tier haiku reviewer applies a focused 6-item checklist to low-complexity changes",
223+
"type": "story"
224+
},
225+
"dso-bxng": {
226+
"deps": [
227+
"dso-9ltc"
228+
],
229+
"priority": 2,
230+
"status": "open",
231+
"title": "As a DSO practitioner, the Standard tier sonnet reviewer applies full 5-dimension checklists with researched sub-criteria",
232+
"type": "story"
233+
},
188234
"dso-cb9v": {
189235
"deps": [
190236
"dso-ocfn",
@@ -278,6 +324,15 @@
278324
"title": "Using-lockpicks brainstorm enforcement",
279325
"type": "epic"
280326
},
327+
"dso-gfry": {
328+
"deps": [
329+
"dso-bxng"
330+
],
331+
"priority": 2,
332+
"status": "open",
333+
"title": "As a DSO practitioner, the Deep Sonnet B reviewer applies deep verification checks evaluating test quality and coverage",
334+
"type": "story"
335+
},
281336
"dso-gjww": {
282337
"deps": [],
283338
"priority": 4,
@@ -313,6 +368,17 @@
313368
"title": "Don't merge to main between batches",
314369
"type": "epic"
315370
},
371+
"dso-jf62": {
372+
"deps": [
373+
"dso-b7nb",
374+
"dso-gfry",
375+
"dso-b538"
376+
],
377+
"priority": 2,
378+
"status": "open",
379+
"title": "As a DSO practitioner, the Deep Opus architectural reviewer applies cross-cutting synthesis checks across all specialist findings",
380+
"type": "story"
381+
},
316382
"dso-jny6": {
317383
"deps": [
318384
"dso-4lbf"
@@ -461,6 +527,13 @@
461527
"title": "Generic agent invocations should be given descriptive names",
462528
"type": "epic"
463529
},
530+
"dso-sfoc": {
531+
"deps": [],
532+
"priority": 1,
533+
"status": "closed",
534+
"title": "Bug: pre-commit test gate times out on large ticket-only commits (no allowlist filtering)",
535+
"type": "bug"
536+
},
464537
"dso-soyt": {
465538
"deps": [],
466539
"priority": 2,

.tickets/dso-5ooy.md

Lines changed: 85 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,85 @@
1+
---
2+
id: dso-5ooy
3+
status: open
4+
deps: [w21-ykic, w21-ovpn]
5+
links: []
6+
created: 2026-03-21T23:27:40Z
7+
type: epic
8+
priority: 2
9+
assignee: Joe Oakhart
10+
---
11+
# Conditional Security & Performance Review Overlays
12+
13+
14+
## Notes
15+
16+
<!-- note-id: k3xxzh2r -->
17+
<!-- timestamp: 2026-03-21T23:28:18Z -->
18+
<!-- origin: agent -->
19+
<!-- sync: unsynced -->
20+
21+
22+
## Context
23+
24+
Security and performance concerns cut across all review dimensions but don't belong as permanent sub-criteria in every review — they waste tokens on changes with no security/performance surface. This epic adds conditional review overlays that trigger only when the classifier detects relevant signals, orthogonal to the tier system (any tier can trigger an overlay).
25+
26+
## Brainstorm Research (to be resumed)
27+
28+
### Architecture Decision
29+
- Security and performance reviews are **conditional overlays**, not permanent dimensions
30+
- Triggered by classifier signals alongside tier routing: classifier emits trigger flags (security_review: true, performance_review: true)
31+
- Any tier level can trigger an overlay — a Light tier change touching auth still gets security review
32+
- Each overlay has its own dedicated reviewer agent, checklist, and findings that merge into reviewer-findings.json
33+
34+
### Security Review Triggers (proposed)
35+
- Code that touches external integrations
36+
- Code that touches data layer
37+
- Authentication or authorization code
38+
- Encryption-related code
39+
40+
### Security Review Criteria (from research)
41+
Source: Anthropic claude-code-security-review (OWASP-aligned)
42+
- Injection attacks: SQL, command, LDAP, XPath, NoSQL, XXE
43+
- Authentication & authorization: broken auth, privilege escalation, insecure direct object references, auth bypass, session flaws
44+
- Data exposure: hardcoded secrets, sensitive data logging, information disclosure, PII handling violations
45+
- Cryptographic issues: weak algorithms, improper key management, insecure RNG
46+
- Input validation: missing validation, improper sanitization, buffer overflows
47+
- Business logic flaws: race conditions, TOCTOU (time-of-check-time-of-use)
48+
- Configuration security: insecure defaults, missing security headers, permissive CORS
49+
- Supply chain: vulnerable dependencies, typosquatting
50+
- Code execution: RCE via deserialization, pickle injection, eval injection
51+
- XSS: reflected, stored, DOM-based
52+
- Error message information leakage (OWASP): errors that reveal internal state
53+
54+
### Performance Review Triggers (proposed)
55+
- Any operation more expensive than O(n)
56+
- Code that touches infrastructure
57+
- Code that touches data layer
58+
- Future enhancement: trigger on spike in test runtime or application latency in E2E testing (needs friction-free way to surface this data)
59+
60+
### Performance Review Criteria (from research)
61+
- N+1 query problems
62+
- Nested loops over large datasets
63+
- Inefficient algorithms or database queries
64+
- Memory usage patterns and potential leaks
65+
- Bundle size and optimization opportunities
66+
- Sequential I/O where parallel is possible (AI-specific)
67+
- Image optimization
68+
69+
### Integration Architecture (to be designed)
70+
Pipeline becomes: classifier → tier + overlay triggers → dispatch tier reviewer(s) + overlay reviewer(s) → merged findings → resolution loop
71+
- Overlay reviewers need own agent definitions and checklists
72+
- Findings merge into same reviewer-findings.json and scoring
73+
- Classifier/dispatch changes needed to trigger overlays
74+
- Performance runtime trigger (pytest --durations baseline comparison) deferred as future enhancement to avoid friction
75+
76+
### Open Questions
77+
- Exact classifier signal thresholds for triggering overlays
78+
- Whether overlays should have their own severity scale or use the existing critical/important/minor
79+
- How overlay findings interact with the autonomous resolution loop
80+
- Whether the security overlay replaces or supplements the existing dso-0wi2 sensitive-info security review
81+
82+
## Dependencies
83+
- w21-ykic (Tiered Review Architecture): requires classifier infrastructure to add overlay trigger signals
84+
- w21-ovpn (Review Intelligence & Precision): requires enriched checklist architecture (reviewer-delta files, confidence scoring, false-positive filters)
85+

.tickets/dso-b538.md

Lines changed: 37 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,37 @@
1+
---
2+
id: dso-b538
3+
status: open
4+
deps: [dso-bxng]
5+
links: []
6+
created: 2026-03-21T23:20:14Z
7+
type: story
8+
priority: 2
9+
assignee: Joe Oakhart
10+
parent: w21-ovpn
11+
---
12+
# As a DSO practitioner, the Deep Sonnet C reviewer applies deep hygiene, design, and maintainability checks
13+
14+
15+
## Notes
16+
17+
<!-- note-id: qbufm2gh -->
18+
<!-- timestamp: 2026-03-21T23:21:29Z -->
19+
<!-- origin: agent -->
20+
<!-- sync: unsynced -->
21+
22+
23+
## Description
24+
25+
**What**: Create the reviewer-delta-deep-hygiene-design-maint.md checklist for the Deep tier Sonnet C (hygiene + design + maintainability specialist) reviewer.
26+
27+
**Why**: Deep tier reviews high-complexity changes. Sonnet C owns three structural dimensions — the qualities that prevent long-term codebase decay. No ticket context needed because structural quality is ticket-independent.
28+
29+
## Acceptance Criteria
30+
31+
- When this story is complete, reviewer-delta-deep-hygiene-design-maint.md includes all Standard hygiene/design/maintainability criteria plus:
32+
- Flag functions where branching depth suggests extraction opportunities
33+
- Evaluate whether new abstractions follow single responsibility
34+
- Flag in-place mutation of shared data structures when immutable patterns are established in surrounding code
35+
- When this story is complete, the checklist includes no ticket context instructions (structural quality is ticket-independent)
36+
- When this story is complete, build-review-agents.sh regenerates the deep hygiene/design/maintainability reviewer agent successfully
37+

.tickets/dso-b7nb.md

Lines changed: 40 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,40 @@
1+
---
2+
id: dso-b7nb
3+
status: open
4+
deps: [dso-bxng]
5+
links: []
6+
created: 2026-03-21T23:20:11Z
7+
type: story
8+
priority: 2
9+
assignee: Joe Oakhart
10+
parent: w21-ovpn
11+
---
12+
# As a DSO practitioner, the Deep Sonnet A reviewer applies deep correctness checks with acceptance criteria validation
13+
14+
15+
## Notes
16+
17+
<!-- note-id: 6sefb1v1 -->
18+
<!-- timestamp: 2026-03-21T23:21:18Z -->
19+
<!-- origin: agent -->
20+
<!-- sync: unsynced -->
21+
22+
23+
## Description
24+
25+
**What**: Create the reviewer-delta-deep-correctness.md checklist for the Deep tier Sonnet A (correctness specialist) reviewer.
26+
27+
**Why**: Deep tier reviews high-complexity changes (classifier score 7+). Sonnet A owns correctness with full ticket context, enabling acceptance criteria validation that other reviewers can't perform.
28+
29+
## Acceptance Criteria
30+
31+
- When this story is complete, reviewer-delta-deep-correctness.md includes all Standard correctness criteria plus:
32+
- Acceptance criteria validation against ticket (when ticket context available)
33+
- Deeper edge-case analysis with explicit escape hatch: if code handles edge cases adequately, state so — do not manufacture findings
34+
- Inaccurate naming elevated from minor to important severity (name implies different behavior than implementation)
35+
- When this story is complete, the checklist includes ticket context instructions: use full ticket (minus verbose status update notes) when available; do not block on missing ticket context
36+
- When this story is complete, build-review-agents.sh regenerates the deep correctness reviewer agent successfully
37+
38+
## Constraints
39+
- Must reference Standard checklist criteria by inclusion, not duplication — the build process composes base + standard + deep-correctness
40+

.tickets/dso-boct.md

Lines changed: 46 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,46 @@
1+
---
2+
id: dso-boct
3+
status: open
4+
deps: []
5+
links: []
6+
created: 2026-03-21T23:20:00Z
7+
type: story
8+
priority: 2
9+
assignee: Joe Oakhart
10+
parent: w21-ovpn
11+
---
12+
# As a DSO practitioner, the Light tier haiku reviewer applies a focused 6-item checklist to low-complexity changes
13+
14+
15+
## Notes
16+
17+
<!-- note-id: t9c2dk6q -->
18+
<!-- timestamp: 2026-03-21T23:20:49Z -->
19+
<!-- origin: agent -->
20+
<!-- sync: unsynced -->
21+
22+
23+
## Description
24+
25+
**What**: Create the reviewer-delta-light.md checklist for the Light tier haiku reviewer with exactly 6 high-signal items.
26+
27+
**Why**: Light tier reviews low-complexity changes (classifier score 0-2). Haiku has limited context budget — the checklist must focus on what delivers value for small changes without codebase research.
28+
29+
## Acceptance Criteria
30+
31+
- When this story is complete, reviewer-delta-light.md contains exactly 6 checklist items:
32+
1. Silent failures: swallowed exceptions, empty catch blocks
33+
2. Tolerance/assertion weakening: changes that relax existing validation
34+
3. Test-code correspondence: production change without test change in same diff (binary check)
35+
4. Type system escape hatches: Any/any/interface{} without justifying comment
36+
5. Dead code introduced in the diff: unused imports, unreachable branches
37+
6. Non-descriptive names in the diff: single letters, generic words (data, temp, result, process, handle)
38+
- When this story is complete, the checklist includes no codebase research instructions (no Grep/Read)
39+
- When this story is complete, the checklist includes no similarity pipeline or ticket context references
40+
- When this story is complete, the checklist includes escape hatch language: if no issues found, state so explicitly rather than manufacturing findings
41+
- When this story is complete, build-review-agents.sh regenerates the light reviewer agent successfully
42+
43+
## Constraints
44+
- No codebase research tools — haiku doesn't have the context budget
45+
- Items must be evaluable from the diff alone
46+

.tickets/dso-bxng.md

Lines changed: 44 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,44 @@
1+
---
2+
id: dso-bxng
3+
status: open
4+
deps: []
5+
links: []
6+
created: 2026-03-21T23:20:06Z
7+
type: story
8+
priority: 2
9+
assignee: Joe Oakhart
10+
parent: w21-ovpn
11+
---
12+
# As a DSO practitioner, the Standard tier sonnet reviewer applies full 5-dimension checklists with researched sub-criteria
13+
14+
15+
## Notes
16+
17+
<!-- note-id: 2k8hj2b0 -->
18+
<!-- timestamp: 2026-03-21T23:21:05Z -->
19+
<!-- origin: agent -->
20+
<!-- sync: unsynced -->
21+
22+
23+
## Description
24+
25+
**What**: Create the reviewer-delta-standard.md checklist for the Standard tier sonnet reviewer with full 5-dimension coverage and researched sub-criteria.
26+
27+
**Why**: Standard tier handles ~30-40% of reviews. This is the most common reviewer and needs comprehensive criteria across all dimensions. Condensed ticket context reduces false positives without exhausting context budget.
28+
29+
## Acceptance Criteria
30+
31+
- When this story is complete, reviewer-delta-standard.md contains sub-criteria for all 5 dimensions:
32+
- Correctness: edge cases/failure states with escape hatch, race conditions in async operations, silent failures, tolerance/assertion weakening, over-engineering/YAGNI
33+
- Verification: behavior-driven not implementation-driven tests, test-code correspondence in same changeset, assertion quality (meaningful vs trivial), arrange-act-assert structure, test smells (naming, fixture bloat)
34+
- Hygiene: type system escape hatches without justification, nesting depth >2 levels (suggest early returns/extraction), dead code, suppression scrutiny (noqa/type:ignore with justifying comments), explicit exclusion of linter-catchable issues
35+
- Design: SOLID adherence, architectural pattern adherence, correct file/folder placement, Rule of Three duplication via similarity pipeline, coupling/dependency direction, reuse of existing utilities
36+
- Maintainability: codebase consistency (local patterns — error handling style, return type patterns, abstraction level — not linter rules), clear and accurate naming (non-descriptive AND inaccurate), comments explain why not what, doc correspondence for public interface changes (minor severity — only when specific existing doc artifact is stale)
37+
- When this story is complete, the checklist includes anti-shortcut distribution: noqa/type:ignore -> hygiene, skipped tests -> verification, tolerances/assertions -> correctness
38+
- When this story is complete, consolidation findings are severity=minor with orchestrator ticket creation
39+
- When this story is complete, the checklist includes ticket context instructions: use condensed summary (title + acceptance criteria) when available; do not block on missing ticket context
40+
- When this story is complete, build-review-agents.sh regenerates the standard reviewer agent successfully
41+
42+
## Research Sources
43+
Google engineering practices, OWASP, test smell literature, 5 Claude Code review plugins (Anthropic official, Claude Command Suite, claude-code-skills, claude-code-showcase, wshobson commands)
44+

0 commit comments

Comments
 (0)