Skip to content

feat(#898): add security heuristics to review sub-agent#2038

Merged
ben-alkov merged 2 commits into
mainfrom
agent/898-security-subagent-heuristics
Jun 8, 2026
Merged

feat(#898): add security heuristics to review sub-agent#2038
ben-alkov merged 2 commits into
mainfrom
agent/898-security-subagent-heuristics

Conversation

@fullsend-ai-coder

@fullsend-ai-coder fullsend-ai-coder Bot commented Jun 8, 2026

Copy link
Copy Markdown
Contributor

The security sub-agent definition was 17 lines of high-level domain listing without specific detection heuristics. This caused it to miss critical findings like fail-open auth bugs and unjustified role escalations on PR #792.

Add three concrete review sections to the security sub-agent:

  • Fail-open/fail-closed evaluation checklist: instructs the
    sub-agent to check what happens when env vars, allowlists,
    and config values are unset, empty, or malformed. Defaults
    must deny, not permit.
  • Permission manifest audit guidance: ported from the standalone
    code-review skill. Any permission expansion requires a finding;
    expansions without linked justification are high severity.
  • Workflow permission and role auditing: flags role escalations,
    permissions block changes, and secrets block modifications in
    workflow files.

These heuristics directly target the four findings a human reviewer caught on PR #792 that the bot missed across 9 review rounds.

Note: make lint could not run due to sandbox Go module cache permissions. The change is markdown-only and does not affect Go compilation.


Toward #898

Post-script verification

  • Branch is not main/master (agent/898-security-subagent-heuristics)
  • Secret scan passed (gitleaks — 4cd660565f8abfc3bd1f0f697488a2e8e408101f..HEAD)
  • Pre-commit hooks passed (authoritative run on runner)
  • Tests ran inside sandbox

@github-actions

github-actions Bot commented Jun 8, 2026

Copy link
Copy Markdown

Site preview

Preview: https://e4036467-site.fullsend-ai.workers.dev

Commit: 91378a66e1b1b178341883ba21499731270cda4b

@fullsend-ai-review

Copy link
Copy Markdown

🤖 Review · Started 6:09 PM UTC
Commit: d0ac11b · View workflow run →

@fullsend-ai-review

fullsend-ai-review Bot commented Jun 8, 2026

Copy link
Copy Markdown

Review

Findings

Low

  • [pattern-inconsistency] internal/scaffold/fullsend-repo/skills/pr-review/sub-agents/security.md — The Exploration budget section uses domain-specific tier names ("Low-risk diffs" / "Security-relevant diffs") rather than the size-based labels ("Trivial diffs" / "Non-trivial diffs") established by intent-coherence.md and style-conventions.md. The security-specific framing is a reasonable domain adaptation (categorizing by risk surface rather than line count), but creates a minor cross-sub-agent inconsistency. With this PR, 3 of 6 sub-agents now include an Exploration budget section (intent-coherence, style-conventions, security), while 3 do not (correctness, docs-currency, cross-repo-contracts). This is a pre-existing pattern gap that this PR improves (was 2/6, now 3/6).

Info

  • [scope-exceeded] internal/scaffold/fullsend-repo/skills/pr-review/sub-agents/security.md — Issue Review agent misses security-critical findings on large architectural PRs #898 documents 4 missed findings across 3 review dimensions. This PR addresses the 2 security-dimension gaps (fail-open detection, role escalation detection). The PR correctly uses "Toward Review agent misses security-critical findings on large architectural PRs #898" language, signaling partial progress. The intent-coherence gap (ADR inconsistency) and correctness gap (missing test assertion) remain open.

  • [architectural-conflict] internal/scaffold/fullsend-repo/skills/pr-review/sub-agents/security.md — The orchestrator skill notes its design "departs from ADR-0018" and that "a superseding ADR is needed." This PR expands investment in the LLM-based sub-agent pattern without addressing the ADR gap. The tension predates this PR. The PR's rationale (sub-agent missed findings due to insufficient heuristics, not LLM non-determinism) provides evidence supporting the current direction.

  • [design-direction] internal/scaffold/fullsend-repo/skills/pr-review/sub-agents/security.md — The expansion from ~35 to ~130 lines brings the security sub-agent in line with the level of detail in correctness.md and intent-coherence.md. The new sections (exploration budget, fail-open checklist, permission manifest auditing, workflow permission tracking) are architecturally consistent with the existing sub-agent pattern. The category list expansion in SKILL.md (from 14 to 20 security categories) properly aligns the orchestrator's routing table with the sub-agent's new detection capabilities.

Previous run

Review

Findings

Low

  • [pattern-inconsistency] internal/scaffold/fullsend-repo/skills/pr-review/sub-agents/security.md — With this PR, 3 of 6 sub-agents now include an "Exploration budget" section (intent-coherence, style-conventions, security), while 3 do not (correctness, docs-currency, cross-repo-contracts). The docs-currency and cross-repo-contracts sub-agents are short enough that the omission is unremarkable, but correctness.md (opus-tier, 76 lines, complex evaluation logic) would benefit from the same calibration guidance. This is a pre-existing inconsistency that this PR improves (was 2/6, now 3/6) rather than introduces.

Info

  • [stale-reference] internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md:197 — Prior medium finding resolved. The routing table now includes all six new categories (fail-open, permission-expansion, permission-reduction, role-escalation, workflow-permission, secret-exposure), matching the categories declared in the security sub-agent's new sections.

  • [edge-case] internal/scaffold/fullsend-repo/skills/pr-review/sub-agents/security.md — Prior low finding resolved. Each new section now explicitly specifies its category string via **Category:** directives, ensuring consistent category usage across invocations.

  • [scope-exceeded] internal/scaffold/fullsend-repo/skills/pr-review/sub-agents/security.md — Issue Review agent misses security-critical findings on large architectural PRs #898 documents 4 missed findings across 3 review dimensions. This PR addresses the 2 security-dimension gaps (fail-open detection, role escalation detection). The PR correctly uses "Toward Review agent misses security-critical findings on large architectural PRs #898" language, signaling partial progress. The intent-coherence gap (ADR inconsistency) and correctness gap (missing test assertion) remain open.

  • [architectural-conflict] internal/scaffold/fullsend-repo/skills/pr-review/sub-agents/security.md — The orchestrator skill notes its design "departs from ADR-0018" and that "a superseding ADR is needed." This PR expands investment in the LLM-based sub-agent pattern without addressing the ADR gap. The tension predates this PR. The PR's rationale (sub-agent missed findings due to insufficient heuristics, not LLM non-determinism) provides evidence supporting the current direction.

  • [design-direction] internal/scaffold/fullsend-repo/skills/pr-review/sub-agents/security.md — The expansion from 22 to ~117 lines brings the security sub-agent in line with the level of detail in correctness.md and intent-coherence.md. The new sections (exploration budget, fail-open checklist, permission manifest auditing, workflow permission tracking) are architecturally consistent with the existing sub-agent pattern.

Previous run (2)

Review

Findings

Medium

  • [stale-reference] internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md:197 — The orchestrator's category routing table (step 3a) for the security dimension does not include categories for the three new heuristic areas added to the security sub-agent. The new fail-open/fail-closed, permission manifest, and workflow permission sections will produce findings using categories not present in the routing table (e.g., fail-open, permission-expansion, role-escalation). On re-reviews, these findings will fall through to ambiguous keyword-matching fallback or default to correctness, causing incorrect sub-agent routing. The existing privilege-escalation category partially covers role escalation, but fail-open and permission-manifest findings have no close match.
    Remediation: Add categories for the new heuristic areas to the security row of the step 3a routing table in the scaffold's SKILL.md. Suggested additions: fail-open, permission-expansion, permission-reduction, role-escalation, workflow-permission, secret-exposure. Alternatively, specify exact category strings in each new section of security.md and ensure those categories appear in the routing table.

Low

  • [edge-case] internal/scaffold/fullsend-repo/skills/pr-review/sub-agents/security.md — The new sections describe severity assignments for various findings (fail-open is critical, wildcard is high or info, permission expansion without justification is at least high) but do not specify what category string to use in the finding JSON. Without specifying categories, different invocations may produce inconsistent category strings for the same type of finding, making deduplication (step 6b) and re-review anchoring (step 3a) unreliable. Note: this is a pre-existing pattern — other sub-agents also omit explicit categories — but fixing it here would improve routing table consistency. See also: [stale-reference] finding above.

  • [pattern-inconsistency] internal/scaffold/fullsend-repo/skills/pr-review/sub-agents/security.md — Two of five sub-agents (intent-coherence.md, style-conventions.md) include an "Exploration budget" section defining when to read additional files and when to short-circuit. The security sub-agent's new heuristics add substantial review guidance without an analogous section. Note: the correctness sub-agent (same opus tier) also lacks this section, so this is not a universally established convention.

Info

  • [scope-exceeded] internal/scaffold/fullsend-repo/skills/pr-review/sub-agents/security.md — Issue Review agent misses security-critical findings on large architectural PRs #898 documents 4 missed findings from PR feat: reusable workflows (ADR 31), centralized routing (ADR 34), layered content (ADR 35) #792: (1) fail-open security bug, (2) role escalation, (3) ADR inconsistency, (4) missing test assertion. This PR addresses findings (1) and (2), which are in the security sub-agent's domain. Finding (3) belongs to intent-coherence and finding (4) to correctness. The PR claims "Closes Review agent misses security-critical findings on large architectural PRs #898" but only delivers on the security-specific gaps. Consider reframing as a partial fix or providing evidence that the other dimensions already have sufficient heuristics.

  • [design-direction] internal/scaffold/fullsend-repo/skills/pr-review/sub-agents/security.md — The expansion from 22 to 90 lines brings the security sub-agent in line with the level of detail in correctness.md (75 lines) and intent-coherence.md (76 lines). The added content is architecturally consistent with the existing LLM-based sub-agent pattern.

  • [architectural-conflict] internal/scaffold/fullsend-repo/skills/pr-review/sub-agents/security.md — The orchestrator skill notes its design "departs from ADR-0018" and that "a superseding ADR is needed." This PR expands investment in the LLM-based sub-agent pattern without addressing the ADR gap. The tension predates this PR and is not introduced by it, but the PR's rationale (sub-agent missed findings due to insufficient heuristics, not LLM non-determinism) provides evidence supporting the current direction.

@fullsend-ai-review fullsend-ai-review Bot added the requires-manual-review Review requires human judgment label Jun 8, 2026
@fullsend-ai-review

Copy link
Copy Markdown

🤖 Finished Review · ✅ Success · Started 6:09 PM UTC · Completed 6:16 PM UTC
Commit: d0ac11b · View workflow run →

@ben-alkov

Copy link
Copy Markdown
Member

/fs-fix

@fullsend-ai-coder

Copy link
Copy Markdown
Contributor Author

🤖 Fix · Started 6:56 PM UTC
Commit: d0ac11b · View workflow run →

@fullsend-ai-coder

Copy link
Copy Markdown
Contributor Author

🔧 Fix agent — iteration 1 (human-triggered)

Addressed 3 of 6 review findings with code changes: added routing table categories to SKILL.md, added explicit category annotations to each new security.md section, and added an exploration budget section. Disagreed with 3 info-level findings (scope-exceeded premise is incorrect — PR already says 'Toward' not 'Closes'; design-direction and architectural-conflict are observational and require no changes).

Fixed (3):

  1. [stale-reference] Routing table missing categories for new security heuristics (internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md): Added six new categories (fail-open, permission-expansion, permission-reduction, role-escalation, workflow-permission, secret-exposure) to the security row of the step 3a routing table so re-review anchoring routes findings from the new heuristic sections correctly.
  2. [edge-case] New sections omit explicit category strings (internal/scaffold/fullsend-repo/skills/pr-review/sub-agents/security.md): Added Category annotations to each of the three new sections (fail-open/fail-closed, permission manifest, workflow permission) specifying the exact category strings the sub-agent should use in finding JSON.
  3. [pattern-inconsistency] Missing Exploration budget section (internal/scaffold/fullsend-repo/skills/pr-review/sub-agents/security.md): Added an Exploration budget section calibrating investigation depth to diff risk level, matching the pattern used by intent-coherence and style-conventions sub-agents.

Disagreed (3):

  1. [scope-exceeded] PR claims Closes Review agent misses security-critical findings on large architectural PRs #898 but only addresses 2 of 4 findings: The PR body already uses 'Toward Review agent misses security-critical findings on large architectural PRs #898', not 'Closes Review agent misses security-critical findings on large architectural PRs #898'. The reviewer's premise is incorrect — the PR does not claim to close the issue.
  2. [design-direction] Expansion is architecturally consistent: Positive observation requiring no code change.
  3. [architectural-conflict] Pre-existing ADR-0018 tension: The reviewer acknowledges this tension predates the PR and is not introduced by it. No action required in this PR.

Tests: passed

Updated by fullsend fix agent

fullsend-ai-coder Bot added a commit that referenced this pull request Jun 8, 2026
…stics

Add six new category strings (fail-open, permission-expansion,
permission-reduction, role-escalation, workflow-permission,
secret-exposure) to the security row of the orchestrator routing
table in SKILL.md so re-review anchoring routes findings correctly.

Add explicit Category annotations to each new section in security.md
so the sub-agent emits consistent category strings for deduplication.

Add an Exploration budget section to security.md, matching the pattern
used by intent-coherence and style-conventions sub-agents.

Addresses review feedback on #2038
@fullsend-ai-coder

Copy link
Copy Markdown
Contributor Author

🤖 Finished Fix · ✅ Success · Started 6:56 PM UTC · Completed 7:01 PM UTC
Commit: d0ac11b · View workflow run →

@fullsend-ai-review

fullsend-ai-review Bot commented Jun 8, 2026

Copy link
Copy Markdown

🤖 Finished Review · ✅ Success · Started 7:03 PM UTC · Completed 7:10 PM UTC
Commit: d0ac11b · View workflow run →

@fullsend-ai-review fullsend-ai-review Bot added ready-for-merge All reviewers approved — ready to merge and removed requires-manual-review Review requires human judgment labels Jun 8, 2026
The security sub-agent definition was 17 lines of high-level domain
listing without specific detection heuristics. This caused it to
miss critical findings like fail-open auth bugs and unjustified
role escalations on PR #792.

Add three concrete review sections to the security sub-agent:

- Fail-open/fail-closed evaluation checklist: instructs the
  sub-agent to check what happens when env vars, allowlists,
  and config values are unset, empty, or malformed. Defaults
  must deny, not permit.
- Permission manifest audit guidance: ported from the standalone
  code-review skill. Any permission expansion requires a finding;
  expansions without linked justification are high severity.
- Workflow permission and role auditing: flags role escalations,
  permissions block changes, and secrets block modifications in
  workflow files.

These heuristics directly target the four findings a human
reviewer caught on PR #792 that the bot missed across 9 review
rounds.

Note: make lint could not run due to sandbox Go module cache
permissions. The change is markdown-only and does not affect
Go compilation.

Closes #898
…stics

Add six new category strings (fail-open, permission-expansion,
permission-reduction, role-escalation, workflow-permission,
secret-exposure) to the security row of the orchestrator routing
table in SKILL.md so re-review anchoring routes findings correctly.

Add explicit Category annotations to each new section in security.md
so the sub-agent emits consistent category strings for deduplication.

Add an Exploration budget section to security.md, matching the pattern
used by intent-coherence and style-conventions sub-agents.

Addresses review feedback on #2038
@ben-alkov ben-alkov force-pushed the agent/898-security-subagent-heuristics branch from 04df0c6 to 91378a6 Compare June 8, 2026 19:19
@fullsend-ai-review

fullsend-ai-review Bot commented Jun 8, 2026

Copy link
Copy Markdown

🤖 Finished Review · ✅ Success · Started 7:20 PM UTC · Completed 7:29 PM UTC
Commit: d0ac11b · View workflow run →

@fullsend-ai-review fullsend-ai-review Bot added ready-for-merge All reviewers approved — ready to merge and removed ready-for-merge All reviewers approved — ready to merge labels Jun 8, 2026
@ben-alkov ben-alkov added this pull request to the merge queue Jun 8, 2026
Merged via the queue into main with commit 809dd64 Jun 8, 2026
10 checks passed
@ben-alkov ben-alkov deleted the agent/898-security-subagent-heuristics branch June 8, 2026 19:38
@fullsend-ai-retro

Copy link
Copy Markdown

🤖 Retro · Started 7:40 PM UTC
Commit: d0ac11b · View workflow run →

@fullsend-ai-retro

Copy link
Copy Markdown

Retro: PR #2038 — Add security heuristics to review sub-agent

Workflow quality: Good. The end-to-end cycle (code → review → fix → re-review → merge) completed in ~1.5 hours with minimal human intervention. The review agent caught a real structural issue (missing routing table categories), the fix agent resolved it, and the re-review confirmed the fix.

Key observations

  1. Self-referential review (covered by existing issues). The review bot approved changes to its own security sub-agent definition (skills/pr-review/sub-agents/security.md) without flagging this as a self-referential concern. This PR is a concrete example that validates the urgency of #1309 (mandatory finding when diff modifies review agent behavior files) and #1736 (add skills/ to protected paths).

  2. No human review recorded. The PR was merged with only bot approvals. The sole human action was ben-alkov triggering /fs-fix. For changes to security detection heuristics that propagate to all fullsend installations via the scaffold, human review seems warranted. This is covered by #1574 (gate ready-for-merge on human approval).

  3. Redundant third review round. The review agent ran 3 times: once on the initial commit, once after the fix (approved), and a third time with no intervening changes (approved again). The third round wasted tokens. This is covered by #1125 and #1271 (filter self-triggering review loops).

  4. 37-minute human latency for fix trigger. The review agent found a medium-severity finding at 18:16 UTC but didn't use CHANGES_REQUESTED, so the fix agent didn't auto-trigger. A human had to manually run /fs-fix at 18:53 UTC. This gap is partially addressed by #1453 (escalate medium findings).

Proposals

No new proposals — all identified improvement opportunities are already tracked by existing open issues. The most impactful ones to prioritize for this class of workflow are #1309 and #1736 (preventing self-referential review approval on agent definition changes).

@fullsend-ai-retro

Copy link
Copy Markdown

🤖 Finished Retro · ✅ Success · Started 7:40 PM UTC · Completed 7:45 PM UTC
Commit: d0ac11b · View workflow run →

@ralphbean ralphbean left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pushed a couple commits — 93ddb44 trims the heuristics down to policy directives (severity thresholds, "always flag" rules) and drops the domain-knowledge explanations that Claude already knows. 7dd318d deduplicates the permission file-type list between the Own block and the new section. LGTM with those in.

@ben-alkov

Copy link
Copy Markdown
Member

eek, I didn't mean to merge this one... I hadn't reviewed it yet.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ready-for-merge All reviewers approved — ready to merge

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants