Skip to content

feat(#2096): add two-pass review strategy for large PRs#2303

Open
fullsend-ai-coder[bot] wants to merge 4 commits into
mainfrom
agent/2096-two-pass-review-strategy
Open

feat(#2096): add two-pass review strategy for large PRs#2303
fullsend-ai-coder[bot] wants to merge 4 commits into
mainfrom
agent/2096-two-pass-review-strategy

Conversation

@fullsend-ai-coder

@fullsend-ai-coder fullsend-ai-coder Bot commented Jun 15, 2026

Copy link
Copy Markdown
Contributor

For PRs with 50+ files, the review orchestrator now runs a lightweight security-triage pre-pass before dispatching dimension sub-agents. The triage pass uses a haiku-model sub-agent to classify changed files as security-critical or standard based on path patterns (e.g., /mint/, /auth/, /oidc/) and diff content heuristics (auth logic, token handling, permission changes).

Security-critical files identified by the triage pass receive prioritized context in the security and correctness sub-agent context packages — their full diffs appear first with explicit classification headers, ensuring they get dedicated reasoning budget rather than competing with boilerplate changes.

Changes:

  • New sub-agent definition: sub-agents/security-triage.md (haiku
    model, read-only classifier)
  • New orchestrator step 3c-1 in SKILL.md: security-critical file
    triage, runs synchronously before context package assembly
  • Updated step 3d in SKILL.md: security-prioritized context package
    assembly for security and correctness sub-agents when triage
    results are available
  • Updated sub-agent roster table with security-triage entry

The 50-file threshold is a starting point that may need tuning. Triage failures fall back to uniform attention (all files treated as security-critical) to preserve existing behavior as a safe default.


Closes #2096

Post-script verification

  • Branch is not main/master (agent/2096-two-pass-review-strategy)
  • Secret scan passed (gitleaks — f119a205ffb44da8687d9ff1673b6b2d2de2e927..HEAD)
  • Pre-commit hooks passed (authoritative run on runner)
  • Tests ran inside sandbox

@github-actions

github-actions Bot commented Jun 15, 2026

Copy link
Copy Markdown

E2E tests did not run

E2E tests run automatically for org/repo members and collaborators on pull requests.

For other contributors, a maintainer must add the ok-to-test label after the latest push.

See E2E testing guide for details.

@github-actions

github-actions Bot commented Jun 15, 2026

Copy link
Copy Markdown

Site preview

Preview: https://d6666021-site.fullsend-ai.workers.dev

Commit: 0867e8eda2ab4f7a499cbcde0a78981cedab800d

@codecov

codecov Bot commented Jun 15, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

@fullsend-ai-review

fullsend-ai-review Bot commented Jun 15, 2026

Copy link
Copy Markdown

🤖 Finished Review · ✅ Success · Started 8:10 PM UTC · Completed 8:24 PM UTC
Commit: 4c0bf7e · View workflow run →

@fullsend-ai-review

fullsend-ai-review Bot commented Jun 15, 2026

Copy link
Copy Markdown

Review

Findings

Medium

  • [protected-path] internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md, internal/scaffold/fullsend-repo/skills/pr-review/sub-agents/security-triage.md — Both modified files are scaffold source for skill definitions deployed to all target repos via internal/scaffold/fullsend-repo/skills/. Changes here alter the review agent's behavior across all fullsend-managed repositories. The PR links to issue Review agent: two-pass strategy for large PRs — triage security-critical files, then deep-review #2096 and explains the rationale for the changes (two-pass review strategy for large PRs with security-prioritized context allocation). Human approval is always required for protected-path changes, regardless of context.

Labels: PR modifies review agent skill definitions and adds a new sub-agent under skills/pr-review/.

Previous run

Review

Findings

Medium

  • [protected-path] internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md, internal/scaffold/fullsend-repo/skills/pr-review/sub-agents/security-triage.md — Both modified files are under the skills/ protected path. The PR links to issue Review agent: two-pass strategy for large PRs — triage security-critical files, then deep-review #2096 and explains the rationale for the changes. Human approval is always required for protected-path changes, regardless of context.

  • [logic-error] internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md — The PR body is internally contradictory about the file-count threshold: the first paragraph correctly states "For PRs with 50+ files" (matching the implementation), but the last paragraph states "The 30-file threshold is a starting point that may need tuning." The implementation consistently uses 50, which aligns with step 2's per-file diff boundary. The PR description should be updated to avoid misleading reviewers.
    Remediation: Update the PR body's last paragraph to say "The 50-file threshold" instead of "The 30-file threshold."

Low

  • [logic-error] internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md — Step 3c says "All selected sub-agents run in parallel" without explicitly excluding security-triage from the selection pool. While step 3c only enumerates the 6 dimension sub-agents by name in its selection lists, and both step 4 and the "Non-standard dispatch types" paragraph clarify the exclusion, adding an explicit note to step 3c would improve clarity for future implementors.
    Remediation: Add a brief note to step 3c: "security-triage and challenger are not dimension sub-agents and are not selected here."

  • [fail-open] internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md — The security triage step (3c-1) provides only the first ~20 lines of diff for path-matched files and ~50 lines for non-matched files to the Haiku classifier. Security-relevant changes beyond these thresholds could be missed by content heuristics, causing a file to be classified as "standard" and receive deprioritized context. Partially mitigated by path-pattern heuristics (path alone triggers classification), the "err on the side of inclusion" instruction, and the fail-closed fallback.

  • [fail-open] internal/scaffold/fullsend-repo/skills/pr-review/sub-agents/security-triage.md — The triage sub-agent's path pattern list does not include several security-sensitive paths from the orchestrator's protected-path list (e.g., .github/ beyond workflows, .claude/, root-level policies/). A file under .claude/ could be classified as "standard" by triage, receiving deprioritized context. Mitigated by step 6e's protected-path check running independently in the orchestrator regardless of triage classification.

  • [architectural-conflict] internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md — The PR adds a new "pre-pass" dispatch type that extends the existing LLM-based orchestration pattern, which the SKILL.md header (lines 13–19) already acknowledges departs from ADR-0018's prohibition on LLM-driven orchestration. No superseding ADR exists. This is pre-existing architectural debt incrementally deepened by this PR, not introduced by it.

  • [design-smell] internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md — Step 4 carves out security-triage and challenger as non-dimension sub-agents, and the roster table now mixes review dimension agents with orchestration helpers. The "Non-standard dispatch types" paragraph documents this distinction, but the overloaded roster may create confusion as the sub-agent count grows.

Previous run (2)

Review

Findings

Medium

  • [logic-error] internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md — The PR body states the threshold is "30+ files" but the SKILL.md implementation consistently uses 50 files as the threshold ("≥ 50" in step 3c-1, "< 50 files" in step 3d). The implementation is internally consistent and the 50-file threshold is defensible (aligns with step 2's per-file diff boundary), but the PR description will mislead reviewers about what was actually implemented.
    Remediation: Update the PR body to state the threshold is 50 files (not 30), or add a note explaining why the threshold was raised from the issue's suggested 30 to 50.

Low

  • [fail-open] internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md — The security triage step (3c-1) provides only the "first ~20 lines" of each file's diff summary to the Haiku classifier. Security-relevant changes beyond line 20 could cause a file to be categorized as "standard" and receive diluted review attention. Partially mitigated by path-pattern heuristics in the triage sub-agent.

  • [fail-open] internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md — An empty security_critical_files array from the triage agent is accepted at face value ("all files get standard review") rather than triggering the failure fallback. A manipulated or confused triage output returning an empty classification would degrade security review quality without triggering the safe fallback. Consider treating an empty array the same as a triage failure when the PR contains files matching known security-sensitive path patterns.

  • [api-shape] internal/scaffold/fullsend-repo/skills/pr-review/sub-agents/security-triage.md — The security-triage sub-agent's output format (JSON object with security_critical_files, standard_files, summary) differs from the standard findings array. The sub-agent definition does not explicitly state that output must be valid JSON (not wrapped in markdown code fences), which could cause parsing failures in step 3c-1.

  • [doc-style] internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md — The section header "Security-prioritized context (large PRs with triage results):" uses bold text with trailing colon rather than a heading marker (####). While bold-with-colon is used elsewhere for inline categorical labels, this header introduces a substantial sub-section with its own numbered list.

Previous run (3)

Review

Findings

Low

  • [edge-case] internal/scaffold/fullsend-repo/skills/pr-review/sub-agents/security-triage.md — Classification criteria use **/CODEOWNERS (glob matching at any depth) while SKILL.md's protected paths use CODEOWNERS (root-level prefix match). Minor inconsistency — both match root-level CODEOWNERS, and nested CODEOWNERS files are legitimately security-relevant, so the triage agent's broader glob is defensible.

  • [fail-open] internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md — The security triage step (3c-1) provides only the "first 20 lines" of each file's diff summary to the Haiku classifier. Security-relevant changes that appear beyond line 20 of a file's diff could be missed by the classifier, causing the file to be categorized as "standard" and receive diluted security review attention. The fallback only triggers on outright triage failure, not on subtle misclassification. Consider increasing the diff summary window for files that don't match any path pattern, or adding a secondary keyword-based heuristic.

  • [architectural-conflict] internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md — This PR adds another LLM-based orchestration decision to a system that already acknowledges a conflict with ADR-0018's prohibition on LLM-based orchestration. The SKILL.md header notes a superseding ADR is needed. This is pre-existing architectural debt that this PR incrementally deepens. Consider filing a follow-up issue to write the superseding ADR.

  • [api-shape] internal/scaffold/fullsend-repo/skills/pr-review/sub-agents/security-triage.md — The security-triage sub-agent's output format (JSON object with security_critical_files, standard_files, summary) differs from the standard findings array format. The "Non-standard dispatch types" paragraph provides a high-level warning, and step 3c-1 point 4 documents the structure, but an explicit "different format" parsing note mirroring step 6d-4's clarity would improve consistency.

  • [doc-style] internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md — Step 3c-1 uses a numbered sub-step scheme (3c-1) while existing steps use letter-based sub-steps (2a, 3a, 3b). The 3a-1 precedent exists for budget allocation priority, but the pattern is unusual.

  • [doc-style] internal/scaffold/fullsend-repo/skills/pr-review/sub-agents/security-triage.md — Frontmatter uses model: haiku (bare name) while some sub-agents use fully qualified names like claude-sonnet-4-6@default. Consistent with the opus bare-name pattern but inconsistent with sonnet sub-agents.

Info

  • [fail-open] internal/scaffold/fullsend-repo/skills/pr-review/sub-agents/security-triage.md — The triage agent uses Haiku (the weakest available model) for security-critical classification decisions. The "err on the side of inclusion" instruction and path-pattern matching mitigate most risk. The design accepts this tradeoff for latency/cost reasons.

  • [scope-alignment] internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md — The PR implements a 50-file threshold while issue Review agent: two-pass strategy for large PRs — triage security-critical files, then deep-review #2096 suggests 30 files as a starting point. The discrepancy is a reasonable engineering judgment — aligning with step 2's per-file diff boundary ensures per-file diff summaries are always available for the triage prompt.

  • [design-direction] internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md — The PR now explicitly acknowledges the third sub-agent pattern (security-triage as pre-pass) via the "Non-standard dispatch types" note, addressing the prior review's recommendation to document the distinction.

  • [pattern-consistency] internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md — The sub-agent roster table introduces a new "pre-pass" dispatch type distinct from "parallel" and "sequential". Semantically accurate for the new workflow.

  • [pattern-consistency] internal/scaffold/fullsend-repo/skills/pr-review/sub-agents/security-triage.md — The sub-agent follows the established structure pattern (frontmatter, H1 title, role statement, Own/Do not own sections, procedure, output format, constraints).

  • [doc-style] internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md — The section header "Security-prioritized context (large PRs with triage results):" uses bold text with trailing colon rather than heading markers. Appropriate as a variant within step 3d rather than a standalone section.

Previous run (4)

Review

Findings

Medium

  • [edge-case] internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md — Step 3c-1 triggers the security triage pre-pass when FILE_COUNT >= 30, but step 2 only produces per-file diffs when FILE_COUNT >= 50. For PRs with 30–49 files, step 2 uses gh pr diff (a single unified diff blob). Step 3c-1's procedure asks the orchestrator to compose "first ~20 lines of each file's diff" as diff summaries, and step 3d's security-prioritized context references "full per-file diffs for all security_critical_files first" — both presuppose per-file diffs are available, but they are not produced until FILE_COUNT >= 50. This creates an ambiguous gap for PRs in the 30–49 file range.
    Remediation: Either lower step 2's per-file diff threshold from 50 to 30 to align with the triage trigger, or add explicit instructions in step 3c-1 for extracting per-file diff segments from the unified gh pr diff output when FILE_COUNT is between 30 and 49.

Low

  • [logic-error] internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md — The security-triage sub-agent is listed in the roster table with dispatch type pre-pass, but step 3c and step 4 do not explicitly exclude pre-pass type agents from the selection and dispatch loop. Double-dispatch is unlikely in practice (step 3c lists only dimension agents, and the challenger precedent shows non-parallel types are handled separately), but a clarifying note would remove ambiguity.

  • [architectural-conflict] internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md — This PR adds another LLM-based orchestration decision to a system that already acknowledges a conflict with ADR-0018's prohibition on LLM-based orchestration. The SKILL.md header notes a superseding ADR is needed. This is a pre-existing concern not introduced by this PR, but each incremental addition deepens the unresolved debt.

  • [edge-case] internal/scaffold/fullsend-repo/skills/pr-review/sub-agents/security-triage.md — Classification criteria use **/CODEOWNERS (glob matching at any depth) while SKILL.md's protected paths use CODEOWNERS (root-level prefix match). Minor inconsistency — both would match root-level CODEOWNERS, and nested CODEOWNERS files are legitimately security-relevant.

  • [design-direction] internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md — The security-triage sub-agent is structurally different from the six dimension sub-agents: it produces file classifications (not review findings) and runs as a preprocessing step. This creates a third sub-agent pattern alongside dimension agents (parallel) and the challenger (sequential). Consider documenting this distinction explicitly in the roster section.

  • [api-shape] internal/scaffold/fullsend-repo/skills/pr-review/sub-agents/security-triage.md — The output format (JSON object with security_critical_files, standard_files, summary) differs from the standard findings array. Step 3c-1 point 4 documents the expected structure, but lacks the explicit "different format" parsing note that step 6d-4 provides for the challenger's non-standard output.

  • [doc-style] internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md — Step 3c-1 uses a numbered sub-step scheme (3c-1) while existing steps use letter-based sub-steps (2a, 3a, 3b, etc.). The 3a-1 precedent exists for budget allocation priority, but the pattern is unusual within the document.

Info

  • [pattern-inconsistency] internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md — The sub-agent roster table introduces a new pre-pass dispatch type distinct from parallel and sequential. Semantically accurate for the new workflow.

  • [design-pattern] internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md — The security-triage sub-agent introduces a preprocessing classification pattern to inform context allocation for downstream review agents, aligning with issue Review agent: two-pass strategy for large PRs — triage security-critical files, then deep-review #2096's requirement.


Labels: PR modifies review agent skill definitions and sub-agent roster, which is harness/skills infrastructure.

Previous run (5)

Review

Findings

Medium

  • [edge-case] internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md — Step 3c-1 triggers the security triage pre-pass when FILE_COUNT >= 30, but step 2 only produces per-file diffs for FILE_COUNT >= 50. The triage procedure references "first 20 lines of each file's diff" and step 3d's security-prioritized context references "full per-file diffs for all security_critical_files first" — but for PRs with 30–49 files, step 2 uses gh pr diff (a single unified blob), so per-file diff summaries are not directly available. This creates ambiguity for the orchestrator in the 30–49 file range.
    Remediation: Either raise the triage threshold to 50 to align with step 2's large-PR boundary, or add explicit instructions for how to produce per-file diff summaries when the PR has 30–49 files (e.g., by splitting the unified diff on file headers, or by running per-file diffs specifically for the triage step).

  • [architectural-conflict] internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md — This PR adds another LLM-based orchestration decision (security-triage pre-pass) to a system that already acknowledges a conflict with ADR-0018's prohibition on LLM-based orchestration. The SKILL.md header (lines 12–19) notes "A superseding ADR is needed to formally retire ADR-0018's prohibition." Each incremental addition to the LLM orchestrator deepens this unresolved architectural debt.
    Remediation: Write and accept a superseding ADR that formally retires ADR-0018's prohibition on LLM-based orchestration for the pr-review skill.

Low

  • [logic-error] internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md — The security-triage sub-agent is listed in the roster table with dispatch type pre-pass, but nothing in the document explicitly excludes it from step 4's "For each selected sub-agent" dispatch loop. Step 3c's selection lists only dimension agents, so double-dispatch is unlikely in practice, but a clarifying note would prevent ambiguity.

  • [fail-open] internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md — The haiku model used for security triage may not reliably detect subtle security relevance (e.g., utility functions that handle token scoping, or files whose security relevance is only apparent from content rather than path). A misclassified file still appears in the security sub-agent's context under "Standard files" so it is still reviewed, but receives less prominent positioning.

  • [design-direction] internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md — The security-triage sub-agent is structurally different from the six dimension sub-agents: it produces file classifications rather than review findings and runs as a preprocessing step. Listing it in the roster table alongside dimension sub-agents may create a misleading abstraction. The challenger sub-agent has a similar deviation (sequential dispatch, different output format), establishing precedent, but the difference could be made more explicit.

  • [naming-convention] internal/scaffold/fullsend-repo/skills/pr-review/sub-agents/security-triage.md — The H1 title # Security Triage (File Classifier) includes a parenthetical clarification not present in other sub-agent titles (Correctness, Security, Challenger). Minor deviation from established convention.

  • [api-shape] internal/scaffold/fullsend-repo/skills/pr-review/sub-agents/security-triage.md — The output format (JSON object with security_critical_files, standard_files, summary) differs from the standard findings array. Step 3c-1 point 4 documents this format, analogous to step 6d's documentation of the challenger's format.

  • [doc-style] internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md — Step 3c-1 uses a 3c-1 numbering scheme. The existing document uses 3a-1 for budget allocation priority, so this follows precedent, but the third-level numbering pattern is unusual.

  • [edge-case] internal/scaffold/fullsend-repo/skills/pr-review/sub-agents/security-triage.md — Classification criteria use **/CODEOWNERS (glob matching at any depth) while SKILL.md's protected paths use CODEOWNERS (prefix matching at repo root). Functionally different but both serve their respective purposes.

  • [scope-exceeded] internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md — The PR title says "two-pass review strategy" and the implementation does involve two passes (triage then review), but the review pass uses context reordering rather than separate file-set dispatches. This is a reasonable design choice but may warrant a clarifying note.

Info

  • [design-smell] internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md — The edge cases section specifies logging an info-level note when the triage pass fails, which provides basic observability. Consider whether deeper alerting or metrics would help track whether the feature is providing value.

  • [pattern-inconsistency] internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md — The sub-agent roster table introduces a new pre-pass dispatch type distinct from parallel and sequential. This is intentional and well-documented.

@fullsend-ai-review fullsend-ai-review Bot added the requires-manual-review Review requires human judgment label Jun 15, 2026
@ben-alkov ben-alkov added the ok-to-test Allow e2e CI to run after maintainer review (must be re-applied after each push) label Jun 16, 2026
@ben-alkov ben-alkov force-pushed the agent/2096-two-pass-review-strategy branch from 4c0bf7e to 84a141c Compare June 18, 2026 18:54
@github-actions github-actions Bot removed the ok-to-test Allow e2e CI to run after maintainer review (must be re-applied after each push) label Jun 18, 2026
@ben-alkov ben-alkov added agent/review Review agent ok-to-test Allow e2e CI to run after maintainer review (must be re-applied after each push) labels Jun 18, 2026
@fullsend-ai-review

fullsend-ai-review Bot commented Jun 18, 2026

Copy link
Copy Markdown

🤖 Review · ❌ Terminated · Started 6:57 PM UTC · Ended 7:10 PM UTC
Commit: 4e21a60 · View workflow run →

@ben-alkov

Copy link
Copy Markdown
Member

/fs-fix but ignore

  • [architectural-conflict] internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md
  • [fail-open] internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md
  • info-level findings

@fullsend-ai-coder

fullsend-ai-coder Bot commented Jun 18, 2026

Copy link
Copy Markdown
Contributor Author

🤖 Finished Fix · ✅ Success · Started 7:03 PM UTC · Completed 7:11 PM UTC
Commit: 4e21a60 · View workflow run →

@fullsend-ai-review fullsend-ai-review Bot added requires-manual-review Review requires human judgment component/harness Agent harness, config, and skills loading and removed requires-manual-review Review requires human judgment labels Jun 18, 2026
@fullsend-ai-review

Copy link
Copy Markdown

🤖 Finished Review · ✅ Success · Started 6:57 PM UTC · Completed 7:10 PM UTC
Commit: 84a141c · View workflow run →

fullsend-ai-coder Bot added a commit that referenced this pull request Jun 18, 2026
- Raise security triage threshold from 30 to 50 files to align with
  step 2's per-file diff boundary, resolving ambiguity in the 30-49
  file range where per-file diffs were not available [edge-case]
- Add clarifying note to roster table documenting that security-triage
  and challenger use non-standard dispatch and are excluded from step
  4's parallel loop [logic-error, design-direction]
- Clarify step 4 heading to explicitly scope dispatch to dimension
  sub-agents only [logic-error]
- Remove parenthetical from security-triage sub-agent title to match
  naming convention of other sub-agents [naming-convention]

Addresses review feedback on #2303
@fullsend-ai-coder

Copy link
Copy Markdown
Contributor Author

🔧 Fix agent — iteration 1 (human-triggered)

Addressed 4 of 8 non-excluded review findings (1 medium, 3 low). Fixed threshold mismatch by raising triage trigger from 30 to 50 files to align with step 2's per-file diff boundary. Added clarifying notes for non-standard dispatch types in roster table and step 4. Removed parenthetical from sub-agent title. Disagreed with 4 findings that were already documented, followed existing convention, or were out of scope. Per human instruction, ignored [architectural-conflict], [fail-open], and info-level findings.

Fixed (4):

  1. [edge-case] threshold mismatch between step 2 (>=50) and step 3c-1 (>=30) (internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md): Raised security triage threshold from 30 to 50 files to align with step 2's per-file diff boundary. Updated all references (step 3c-1 trigger, Why section, step 3d context, and fallback clause). Added explanation that the threshold aligns with step 2 so per-file diff summaries are always available.
  2. [logic-error] security-triage not explicitly excluded from step 4 dispatch loop (internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md): Added clarifying note to step 4 heading that dispatch applies to dimension sub-agents only, explicitly excluding security-triage (step 3c-1) and challenger (step 6d).
  3. [design-direction] security-triage structurally different from dimension sub-agents in roster table (internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md): Added 'Non-standard dispatch types' paragraph after the roster table explaining that security-triage and challenger are not dimension sub-agents, documenting their separate dispatch points and different output formats.
  4. [naming-convention] parenthetical in security-triage H1 title (internal/scaffold/fullsend-repo/skills/pr-review/sub-agents/security-triage.md): Removed '(File Classifier)' parenthetical from H1 title to match the naming convention of other sub-agents (Correctness, Security, Challenger).

Disagreed (4):

  1. [api-shape] output format differs from standard findings array: The reviewer notes this is already documented in step 3c-1 point 4, analogous to the challenger's format documentation in step 6d. No change needed — the difference is intentional and documented.
  2. [doc-style] 3c-1 numbering scheme: The reviewer acknowledges this follows the existing 3a-1 precedent. Consistent with established convention — no change needed.
  3. [edge-case] CODEOWNERS glob pattern difference between security-triage and SKILL.md protected paths: The reviewer notes these are 'functionally different but both serve their respective purposes.' The security-triage classifier uses glob patterns for file matching across any depth, while SKILL.md's protected paths use prefix matching at repo root. Both are correct for their contexts.
  4. [scope-exceeded] two-pass naming vs implementation details: This concerns the PR title, which is outside the scope of code fixes. The implementation does involve two passes (triage then review with prioritized context), and the reviewer notes it is 'a reasonable design choice.'

Tests: passed

Updated by fullsend fix agent

@github-actions github-actions Bot removed the ok-to-test Allow e2e CI to run after maintainer review (must be re-applied after each push) label Jun 18, 2026
@fullsend-ai-review

fullsend-ai-review Bot commented Jun 18, 2026

Copy link
Copy Markdown

🤖 Finished Review · ✅ Success · Started 7:15 PM UTC · Completed 7:28 PM UTC
Commit: 9a7370e · View workflow run →

fullsend-ai-review[bot]

This comment was marked as outdated.

@fullsend-ai-review fullsend-ai-review Bot added ready-for-merge All reviewers approved — ready to merge and removed requires-manual-review Review requires human judgment labels Jun 18, 2026
@fullsend-ai-review fullsend-ai-review Bot added requires-manual-review Review requires human judgment and removed requires-manual-review Review requires human judgment labels Jun 22, 2026
@ben-alkov ben-alkov requested a review from ralphbean June 22, 2026 21:11
@ben-alkov

Copy link
Copy Markdown
Member

/fs-fix Address

[fail-open] internal/scaffold/fullsend-repo/skills/pr-review/sub-agents/security-triage.md — The triage sub-agent's path pattern list does not include several security-sensitive paths from the orchestrator's protected-path list (e.g., .github/ beyond workflows, .claude/, root-level policies/). A file under .claude/ could be classified as "standard" by triage, receiving deprioritized context. Mitigated by step 6e's protected-path check running independently in the orchestrator regardless of triage classification.

and also make sure that all of the security-sensitive paths in the orchestrator, subagents, and related skills are guarded again recursion (i.e. paths below e.g. ".claude/" must also be protected).

@fullsend-ai-coder

fullsend-ai-coder Bot commented Jun 29, 2026

Copy link
Copy Markdown
Contributor Author

🤖 Finished Fix · ✅ Success · Started 7:08 PM UTC · Completed 7:16 PM UTC
Commit: 657bf48 · View workflow run →

@fullsend-ai-coder

Copy link
Copy Markdown
Contributor Author

🔧 Fix agent — iteration 3 (human-triggered)

Aligned security-triage sub-agent path patterns with the orchestrator's full protected-path list. Added 16 governance/infrastructure paths as a new classification category with recursive glob patterns. Updated orchestrator fallback validation examples to reference the new patterns.

Fixed (3):

  1. security-triage missing protected paths from orchestrator (internal/scaffold/fullsend-repo/skills/pr-review/sub-agents/security-triage.md): Added a new 'Governance and infrastructure paths' subsection to the security-triage sub-agent's classification criteria, covering all protected paths from the orchestrator's step 6e list: .claude/, .cursor/, .gitattributes, .github/, .pre-commit-config.yaml, AGENTS.md, agents/, api-servers/, CLAUDE.md, Containerfile, Dockerfile, harness/, images/, plugins/, scripts/, skills/. The narrower .github/workflows/*.yml pattern was replaced by .github/** to cover all GitHub configuration.
  2. ensure recursive matching for directory paths (internal/scaffold/fullsend-repo/skills/pr-review/sub-agents/security-triage.md): All directory paths use /** glob suffix (e.g., .claude/, agents/, scripts/**) to ensure files below the directory root are matched recursively. Single-file entries (.gitattributes, AGENTS.md, CLAUDE.md, Containerfile, Dockerfile, .pre-commit-config.yaml) use exact match since they are individual files, not directories.
  3. update orchestrator fallback validation examples (internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md): Updated the fallback validation example list in step 3c-1 to include governance path patterns (.claude/, .github/, agents/, scripts/) alongside the existing examples, so the orchestrator validates against the full set of security-sensitive patterns when checking for triage failures.

Tests: passed

Updated by fullsend fix agent

fullsend-ai-coder Bot added a commit that referenced this pull request Jun 29, 2026
… protected paths

Add governance and infrastructure paths (`.claude/`, `.github/`,
`agents/`, `scripts/`, `skills/`, etc.) to the security-triage
sub-agent's classification criteria. These paths were present in
the orchestrator's step 6e protected-path list but missing from
the triage classifier, which could cause files under those paths
to be classified as "standard" and receive deprioritized context
in large PRs.

All directory paths use `/**` glob suffix to ensure recursive
matching of files below the directory root. The narrower
`.github/workflows/*.yml` pattern is replaced by `.github/**` to
cover all GitHub configuration files.

The orchestrator's fallback validation examples in step 3c-1 are
updated to reference the new governance patterns.

Addresses review feedback on #2303
ben-alkov pushed a commit that referenced this pull request Jun 29, 2026
- Raise security triage threshold from 30 to 50 files to align with
  step 2's per-file diff boundary, resolving ambiguity in the 30-49
  file range where per-file diffs were not available [edge-case]
- Add clarifying note to roster table documenting that security-triage
  and challenger use non-standard dispatch and are excluded from step
  4's parallel loop [logic-error, design-direction]
- Clarify step 4 heading to explicitly scope dispatch to dimension
  sub-agents only [logic-error]
- Remove parenthetical from security-triage sub-agent title to match
  naming convention of other sub-agents [naming-convention]

Addresses review feedback on #2303
ben-alkov pushed a commit that referenced this pull request Jun 29, 2026
- [fail-open] Increase diff summary window from ~20 to ~50 lines for
  files that don't match security path patterns, giving the classifier
  more content signal for non-obvious security-relevant changes
- [fail-open] Treat empty security_critical_files as triage failure
  when changed files match known security-sensitive path patterns
- [api-shape] Add explicit constraint requiring raw JSON output (no
  markdown code fences) to security-triage sub-agent
- [doc-style] Convert bold-with-colon header to proper #### heading
  for the security-prioritized context subsection

Addresses review feedback on #2303
ben-alkov pushed a commit that referenced this pull request Jun 29, 2026
… protected paths

Add governance and infrastructure paths (`.claude/`, `.github/`,
`agents/`, `scripts/`, `skills/`, etc.) to the security-triage
sub-agent's classification criteria. These paths were present in
the orchestrator's step 6e protected-path list but missing from
the triage classifier, which could cause files under those paths
to be classified as "standard" and receive deprioritized context
in large PRs.

All directory paths use `/**` glob suffix to ensure recursive
matching of files below the directory root. The narrower
`.github/workflows/*.yml` pattern is replaced by `.github/**` to
cover all GitHub configuration files.

The orchestrator's fallback validation examples in step 3c-1 are
updated to reference the new governance patterns.

Addresses review feedback on #2303
@ben-alkov ben-alkov force-pushed the agent/2096-two-pass-review-strategy branch from 2fa8ae9 to be2998e Compare June 29, 2026 19:19
ben-alkov pushed a commit that referenced this pull request Jun 30, 2026
- Raise security triage threshold from 30 to 50 files to align with
  step 2's per-file diff boundary, resolving ambiguity in the 30-49
  file range where per-file diffs were not available [edge-case]
- Add clarifying note to roster table documenting that security-triage
  and challenger use non-standard dispatch and are excluded from step
  4's parallel loop [logic-error, design-direction]
- Clarify step 4 heading to explicitly scope dispatch to dimension
  sub-agents only [logic-error]
- Remove parenthetical from security-triage sub-agent title to match
  naming convention of other sub-agents [naming-convention]

Addresses review feedback on #2303
ben-alkov pushed a commit that referenced this pull request Jun 30, 2026
- [fail-open] Increase diff summary window from ~20 to ~50 lines for
  files that don't match security path patterns, giving the classifier
  more content signal for non-obvious security-relevant changes
- [fail-open] Treat empty security_critical_files as triage failure
  when changed files match known security-sensitive path patterns
- [api-shape] Add explicit constraint requiring raw JSON output (no
  markdown code fences) to security-triage sub-agent
- [doc-style] Convert bold-with-colon header to proper #### heading
  for the security-prioritized context subsection

Addresses review feedback on #2303
ben-alkov pushed a commit that referenced this pull request Jun 30, 2026
… protected paths

Add governance and infrastructure paths (`.claude/`, `.github/`,
`agents/`, `scripts/`, `skills/`, etc.) to the security-triage
sub-agent's classification criteria. These paths were present in
the orchestrator's step 6e protected-path list but missing from
the triage classifier, which could cause files under those paths
to be classified as "standard" and receive deprioritized context
in large PRs.

All directory paths use `/**` glob suffix to ensure recursive
matching of files below the directory root. The narrower
`.github/workflows/*.yml` pattern is replaced by `.github/**` to
cover all GitHub configuration files.

The orchestrator's fallback validation examples in step 3c-1 are
updated to reference the new governance patterns.

Addresses review feedback on #2303
@ben-alkov ben-alkov force-pushed the agent/2096-two-pass-review-strategy branch from be2998e to 9bec0e6 Compare June 30, 2026 17:36
@ben-alkov ben-alkov requested a review from a team as a code owner June 30, 2026 17:36
@fullsend-ai-review

fullsend-ai-review Bot commented Jun 30, 2026

Copy link
Copy Markdown

🤖 Finished Review · ❌ Failure · Started 5:42 PM UTC · Completed 5:49 PM UTC
Commit: 104508d · View workflow run →

For PRs with 30+ files, the review orchestrator now runs a
lightweight security-triage pre-pass before dispatching dimension
sub-agents. The triage pass uses a haiku-model sub-agent to classify
changed files as security-critical or standard based on path
patterns (e.g., **/mint/**, **/auth/**, **/oidc/**) and diff content
heuristics (auth logic, token handling, permission changes).

Security-critical files identified by the triage pass receive
prioritized context in the security and correctness sub-agent
context packages — their full diffs appear first with explicit
classification headers, ensuring they get dedicated reasoning budget
rather than competing with boilerplate changes.

Changes:
- New sub-agent definition: sub-agents/security-triage.md (haiku
  model, read-only classifier)
- New orchestrator step 3c-1 in SKILL.md: security-critical file
  triage, runs synchronously before context package assembly
- Updated step 3d in SKILL.md: security-prioritized context package
  assembly for security and correctness sub-agents when triage
  results are available
- Updated sub-agent roster table with security-triage entry

The 30-file threshold is a starting point that may need tuning.
Triage failures fall back to uniform attention (all files treated as
security-critical) to preserve existing behavior as a safe default.

Closes #2096
- Raise security triage threshold from 30 to 50 files to align with
  step 2's per-file diff boundary, resolving ambiguity in the 30-49
  file range where per-file diffs were not available [edge-case]
- Add clarifying note to roster table documenting that security-triage
  and challenger use non-standard dispatch and are excluded from step
  4's parallel loop [logic-error, design-direction]
- Clarify step 4 heading to explicitly scope dispatch to dimension
  sub-agents only [logic-error]
- Remove parenthetical from security-triage sub-agent title to match
  naming convention of other sub-agents [naming-convention]

Addresses review feedback on #2303
- [fail-open] Increase diff summary window from ~20 to ~50 lines for
  files that don't match security path patterns, giving the classifier
  more content signal for non-obvious security-relevant changes
- [fail-open] Treat empty security_critical_files as triage failure
  when changed files match known security-sensitive path patterns
- [api-shape] Add explicit constraint requiring raw JSON output (no
  markdown code fences) to security-triage sub-agent
- [doc-style] Convert bold-with-colon header to proper #### heading
  for the security-prioritized context subsection

Addresses review feedback on #2303
… protected paths

Add governance and infrastructure paths (`.claude/`, `.github/`,
`agents/`, `scripts/`, `skills/`, etc.) to the security-triage
sub-agent's classification criteria. These paths were present in
the orchestrator's step 6e protected-path list but missing from
the triage classifier, which could cause files under those paths
to be classified as "standard" and receive deprioritized context
in large PRs.

All directory paths use `/**` glob suffix to ensure recursive
matching of files below the directory root. The narrower
`.github/workflows/*.yml` pattern is replaced by `.github/**` to
cover all GitHub configuration files.

The orchestrator's fallback validation examples in step 3c-1 are
updated to reference the new governance patterns.

Addresses review feedback on #2303
@ben-alkov ben-alkov force-pushed the agent/2096-two-pass-review-strategy branch from 9bec0e6 to 0867e8e Compare June 30, 2026 19:21
@ben-alkov

Copy link
Copy Markdown
Member

/fs-review

@fullsend-ai-review

fullsend-ai-review Bot commented Jun 30, 2026

Copy link
Copy Markdown

🤖 Finished Review · ✅ Success · Started 7:25 PM UTC · Completed 7:37 PM UTC
Commit: f1117dd · View workflow run →

@fullsend-ai-review fullsend-ai-review Bot added requires-manual-review Review requires human judgment component/skills and removed requires-manual-review Review requires human judgment labels Jun 30, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

agent/review Review agent component/harness Agent harness, config, and skills loading component/skills requires-manual-review Review requires human judgment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Review agent: two-pass strategy for large PRs — triage security-critical files, then deep-review

1 participant