Skip to content

feat(#990): require exhaustive variable verification for security claims#2039

Merged
ben-alkov merged 2 commits into
mainfrom
agent/990-exhaustive-security-verification
Jun 8, 2026
Merged

feat(#990): require exhaustive variable verification for security claims#2039
ben-alkov merged 2 commits into
mainfrom
agent/990-exhaustive-security-verification

Conversation

@fullsend-ai-coder

Copy link
Copy Markdown
Contributor

The review agent was making blanket safety claims after verifying only a subset of variables in security-sensitive contexts (e.g., confirming one variable was sanitized and asserting "sanitization is handled" while other interpolated variables remained unchecked). This partial-verification anti-pattern is more harmful than a silent miss because it reduces human scrutiny on the exact areas that need it.

Changes:

  • security.md (pr-review sub-agent): Added a "Verification
    methodology" section with a 4-step procedure: enumerate all inputs,
    verify each independently, report coverage explicitly, and flag
    gaps rather than dismissing them. Placed before the existing
    injection defense section.
  • SKILL.md (code-review skill): Added an "Exhaustive
    security-control verification" bullet to the Security dimension
    with the same core guidance, keeping the standalone review skill
    consistent with the sub-agent definition.

Note: pre-commit could not run in the sandbox due to Go module cache permission errors (infrastructure issue, not code-related). The post-script runs pre-commit authoritatively on the runner.


Closes #990

Post-script verification

  • Branch is not main/master (agent/990-exhaustive-security-verification)
  • Secret scan passed (gitleaks — 4cd660565f8abfc3bd1f0f697488a2e8e408101f..HEAD)
  • Pre-commit hooks passed (authoritative run on runner)
  • Tests ran inside sandbox

@github-actions

github-actions Bot commented Jun 8, 2026

Copy link
Copy Markdown

Site preview

Preview: https://f5afdd9d-site.fullsend-ai.workers.dev

Commit: e55d31de7f0cbcd832e123c927d04a1a45fdabf8

@fullsend-ai-review

Copy link
Copy Markdown

🤖 Review · Started 6:12 PM UTC
Commit: d0ac11b · View workflow run →

@fullsend-ai-review

fullsend-ai-review Bot commented Jun 8, 2026

Copy link
Copy Markdown

Review

Findings

Low

  • [procedural-list-format] internal/scaffold/fullsend-repo/skills/pr-review/sub-agents/security.md:35 — The new "Verification methodology" section uses a numbered list (1–4) for its procedural steps, while all other sub-agent files with procedural content (e.g., correctness.md's "Technical documentation" checklist) use bullet lists with bold headers. The numbered format is semantically reasonable for an ordered procedure, but deviates from the established sub-agent convention.

  • [design-direction] internal/scaffold/fullsend-repo/skills/pr-review/sub-agents/security.md:22 — Issue Review agent should not assert safety claims without exhaustive variable verification #990 suggested exploring a structured output field for tracking which variables were verified vs. unverified (e.g., verified_inputs: [], unverified_inputs: []). This PR appropriately focuses on the anti-pattern warning first, but a follow-up issue would help track the structured-output enhancement — it would make exhaustive verification auditable by downstream tooling.

Info

  • [scope-alignment] The change traces cleanly to issue Review agent should not assert safety claims without exhaustive variable verification #990 and implements both of the issue's primary suggestions: (1) anti-pattern warning added to security evaluation guidance, and (2) upstream code-review skill updated since the pattern is universally applicable. Scope matches authorization.

  • [design-direction] The SKILL.md summary bullet and security.md detailed procedure serve different agents in different contexts (standalone code-review vs. dedicated security sub-agent). The different detail levels are appropriate — not redundant.

Previous run

Review

Findings

Low

  • [code-organization] internal/scaffold/fullsend-repo/skills/pr-review/sub-agents/security.md:55 — The new ## Injection defense heading introduces a section containing only a single directive line ("Inspect the code diff for injection patterns."). Other ## sections in sub-agent files contain substantial procedural content. Consider either expanding this section with injection-specific guidance or removing the heading and keeping the directive as a closing line (matching the pre-PR structure).

  • [design-direction] internal/scaffold/fullsend-repo/skills/pr-review/sub-agents/security.md:22 — Issue Review agent should not assert safety claims without exhaustive variable verification #990 suggested exploring a structured output field for tracking which variables were verified vs. unverified (beyond natural-language descriptions). This PR appropriately defers that work, but a follow-up issue would help track it — adding a verified_inputs or coverage_report field to security findings would make exhaustive verification auditable by downstream tooling.

Info

  • [scope-alignment] The change traces cleanly to issue Review agent should not assert safety claims without exhaustive variable verification #990 and implements both of the issue's primary suggestions: (1) anti-pattern warning added to security evaluation guidance, and (2) upstream code-review skill updated since the pattern is universally applicable. Scope matches authorization.

  • [design-direction] The SKILL.md summary bullet and security.md detailed procedure serve different agents in different contexts (standalone code-review vs. dedicated security sub-agent). The different detail levels are appropriate — not redundant.

Comment thread internal/scaffold/fullsend-repo/skills/pr-review/sub-agents/security.md Outdated
@fullsend-ai-review fullsend-ai-review Bot added the ready-for-merge All reviewers approved — ready to merge label Jun 8, 2026
@fullsend-ai-review

Copy link
Copy Markdown

🤖 Finished Review · ✅ Success · Started 6:12 PM UTC · Completed 6:20 PM UTC
Commit: d0ac11b · View workflow run →

@ben-alkov

Copy link
Copy Markdown
Member

/fs-fix

@fullsend-ai-coder

Copy link
Copy Markdown
Contributor Author

🤖 Fix · Started 6:52 PM UTC
Commit: d0ac11b · View workflow run →

fullsend-ai-coder Bot added a commit that referenced this pull request Jun 8, 2026
Remove the `## Injection defense` heading that introduced a section
with only a single directive line. The directive is kept as a closing
line of the verification methodology section, matching the pre-PR
structure and other sub-agent file conventions.

Addresses review feedback on #2039

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@fullsend-ai-coder

Copy link
Copy Markdown
Contributor Author

🔧 Fix agent — iteration 1 (human-triggered)

Addressed 1 of 2 review findings. Removed the stub ## Injection defense heading per the code-organization finding. Disagreed with the design-direction finding as it requests filing an issue, not a code change.

Fixed (1):

  1. stub injection defense heading with single directive line (internal/scaffold/fullsend-repo/skills/pr-review/sub-agents/security.md): Removed the ## Injection defense heading that introduced a section with only a single directive line. The directive is kept as a closing line of the verification methodology section, matching the pre-PR structure and other sub-agent file conventions.

Disagreed (1):

  1. file follow-up issue for structured coverage tracking: This finding requests filing a follow-up issue, not a code change. Filing issues is outside the fix agent's scope. The suggestion to add a coverage_report field to security findings is valid but should be tracked by a human or the triage agent.

Tests: passed

Updated by fullsend fix agent

@fullsend-ai-coder

Copy link
Copy Markdown
Contributor Author

🤖 Finished Fix · ✅ Success · Started 6:52 PM UTC · Completed 6:56 PM UTC
Commit: d0ac11b · View workflow run →

@fullsend-ai-review

fullsend-ai-review Bot commented Jun 8, 2026

Copy link
Copy Markdown

🤖 Finished Review · ✅ Success · Started 6:58 PM UTC · Completed 7:06 PM UTC
Commit: d0ac11b · View workflow run →

@fullsend-ai-review fullsend-ai-review Bot added ready-for-merge All reviewers approved — ready to merge and removed ready-for-merge All reviewers approved — ready to merge labels Jun 8, 2026
@ben-alkov ben-alkov self-assigned this Jun 8, 2026
ben-alkov pushed a commit that referenced this pull request Jun 8, 2026
Remove the `## Injection defense` heading that introduced a section
with only a single directive line. The directive is kept as a closing
line of the verification methodology section, matching the pre-PR
structure and other sub-agent file conventions.

Addresses review feedback on #2039

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@ben-alkov ben-alkov force-pushed the agent/990-exhaustive-security-verification branch from 601e447 to 6393a40 Compare June 8, 2026 19:45
fullsend-ai-coder Bot and others added 2 commits June 8, 2026 15:46
The review agent was making blanket safety claims after verifying only
a subset of variables in security-sensitive contexts (e.g., confirming
one variable was sanitized and asserting "sanitization is handled"
while other interpolated variables remained unchecked). This
partial-verification anti-pattern is more harmful than a silent miss
because it reduces human scrutiny on the exact areas that need it.

Changes:
- security.md (pr-review sub-agent): Added a "Verification
  methodology" section with a 4-step procedure: enumerate all inputs,
  verify each independently, report coverage explicitly, and flag
  gaps rather than dismissing them. Placed before the existing
  injection defense section.
- SKILL.md (code-review skill): Added an "Exhaustive
  security-control verification" bullet to the Security dimension
  with the same core guidance, keeping the standalone review skill
  consistent with the sub-agent definition.

Note: pre-commit could not run in the sandbox due to Go module cache
permission errors (infrastructure issue, not code-related). The
post-script runs pre-commit authoritatively on the runner.

Closes #990
Remove the `## Injection defense` heading that introduced a section
with only a single directive line. The directive is kept as a closing
line of the verification methodology section, matching the pre-PR
structure and other sub-agent file conventions.

Addresses review feedback on #2039

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@ben-alkov ben-alkov force-pushed the agent/990-exhaustive-security-verification branch from 6393a40 to e55d31d Compare June 8, 2026 19:46
@fullsend-ai-review

Copy link
Copy Markdown

🤖 Review · Started 7:48 PM UTC
Commit: d0ac11b · View workflow run →

@ben-alkov ben-alkov added this pull request to the merge queue Jun 8, 2026
Merged via the queue into main with commit c4f2c74 Jun 8, 2026
8 checks passed
@ben-alkov ben-alkov deleted the agent/990-exhaustive-security-verification branch June 8, 2026 19:52
@fullsend-ai-retro

Copy link
Copy Markdown

🤖 Retro · Started 7:54 PM UTC
Commit: d0ac11b · View workflow run →

@fullsend-ai-review

Copy link
Copy Markdown

Review skipped — this PR is already merged.

The /fs-review command only reviews open pull requests.

Posted by fullsend post-review check

@fullsend-ai-review

Copy link
Copy Markdown

🤖 Finished Review · ✅ Success · Started 7:48 PM UTC · Completed 7:56 PM UTC
Commit: d0ac11b · View workflow run →

@fullsend-ai-retro

Copy link
Copy Markdown

Retro: PR #2039 — Exhaustive security verification

Overall: The workflow completed successfully. The code agent produced a clean, focused 43-line change across 2 files. The fix agent correctly addressed 1 of 2 low-severity review findings and appropriately declined the other as out-of-scope. The human merged within ~1.75 hours.

Key observation: 19 review runs were dispatched for this small PR (14 succeeded, 5 cancelled), driven by a cascade of pull_request_target events from the fix agent's push and subsequent activity. This is the dominant cost driver.

No new proposals filed. All identified improvement opportunities are already covered by existing open issues:

  • Excessive review dispatches: #1418, #1422, #1452, #1014, #893, #981, #1357
  • Self-contradicting findings (review flagged numbered list while acknowledging it was semantically reasonable): #1881
  • Repeated findings across re-reviews (same design-direction comment appeared in both reviews): #1013, #1285
  • Low-severity noise driving rework: #1368
  • Excessive retro dispatches (4 retro runs): #939, #1411

The review dispatch deduplication cluster (#1418, #1422, #1452, #1014, #981, #1357) represents the highest-impact improvement opportunity — resolving those issues would have eliminated ~17 of the 19 review runs on this PR.

@fullsend-ai-retro

Copy link
Copy Markdown

🤖 Finished Retro · ✅ Success · Started 7:54 PM UTC · Completed 8:09 PM UTC
Commit: d0ac11b · View workflow run →

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ready-for-merge All reviewers approved — ready to merge

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Review agent should not assert safety claims without exhaustive variable verification

1 participant