feat: add finding verification via lightweight sub-agent checks#95
Conversation
|
Caution Review failedPull request was closed or merged during review 📝 WalkthroughWalkthroughAdds an optional verification step for delegated findings: a new CoreSettings flag and env binding, orchestrator wiring to forward the flag and tool context, summarizer/line-resolver updates to include verification metadata and hunk-aware mapping, a new FindingVerifier module that runs concurrent read-only sub-agents to validate findings, and extensive tests. ChangesFinding Verification Feature
Sequence Diagram(s)sequenceDiagram
participant Agent as BaseAIAgent
participant Orchestrator as DelegationOrchestrator
participant Summarizer as SummarizationAgent
participant Verifier as FindingVerifier
participant SubAgent as Verification Sub-Agent
participant FileTools as File-Read Tools
Agent->>Orchestrator: execute(..., verify_findings=True)
Orchestrator->>Summarizer: construct(settings, parent_tools, mcp_manager, local_registry)
Summarizer->>Summarizer: extract findings
Orchestrator->>Verifier: _verify_findings(findings, diff, tool_context)
Verifier->>SubAgent: initialize provider & create verification engine
loop per finding (concurrent, bounded)
Verifier->>SubAgent: run_turn(prompt, allowed_tools)
SubAgent->>FileTools: read_file / read_file_lines
FileTools-->>SubAgent: file content
SubAgent-->>Verifier: verification response (status, reasoning, adjusted_severity, confidence, adjusted_line)
Verifier->>Verifier: parse & validate response -> update Finding (verified, verification_reason, line)
end
Verifier-->>Orchestrator: verified findings
Orchestrator-->>Agent: aggregated results (maybe verified)
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
src/cicaddy/delegation/verifier.py (1)
200-206: ⚡ Quick winUse Google-style docstrings for new methods
Several new method docstrings in this
src/module are short-form; please switch them to Google-style (with sections likeArgs:/Returns:) for consistency with project standards.As per coding guidelines,
src/**/*.py: "Use type hints and Google-style docstrings in Python code".Also applies to: 249-256, 292-293, 315-317, 324-329, 370-371
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/cicaddy/delegation/verifier.py` around lines 200 - 206, The docstring for the newly added method _create_verification_engine (and the other short-form docstrings referenced at the noted ranges) must be converted to Google-style docstrings: replace the one-line/short-form description with a multi-section docstring that includes at least an "Args:" section documenting provider: "BaseProvider", num_findings: int, session_id: str, and a "Returns:" section documenting the ExecutionEngine return type; keep the existing brief description as the opening line and update other new methods (lines ~249-256, 292-293, 315-317, 324-329, 370-371) to the same Google-style format to match project standards and type hints.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/cicaddy/delegation/verifier.py`:
- Around line 182-185: When asyncio.gather(..., return_exceptions=True) returns
an Exception for a task, the code in verifier.py currently logs and appends the
original finding without populating verification metadata; update the Exception
branch (the block handling "if isinstance(result, Exception)") to set fallback
fields on the finding before appending: set findings[i]['verified'] = False and
set findings[i]['verification_reason'] = str(result) (or a short prefixed string
like f"verification_error: {result}") so verified_findings is always consistent;
keep the existing logger.warning and then append the modified findings[i] to
verified_findings.
- Around line 141-167: The code creates an asyncio.Semaphore with the
caller-provided max_concurrent which can be 0 or negative and cause deadlock;
before creating the semaphore in the verification routine (the function taking
max_concurrent: int = 3 and constructing asyncio.Semaphore(max_concurrent)),
validate or clamp max_concurrent to a positive integer (e.g. raise ValueError if
<= 0 or set max_concurrent = max(1, max_concurrent)) so the semaphore always has
capacity, then proceed to create asyncio.Semaphore(max_concurrent) and continue
initializing the provider as before.
---
Nitpick comments:
In `@src/cicaddy/delegation/verifier.py`:
- Around line 200-206: The docstring for the newly added method
_create_verification_engine (and the other short-form docstrings referenced at
the noted ranges) must be converted to Google-style docstrings: replace the
one-line/short-form description with a multi-section docstring that includes at
least an "Args:" section documenting provider: "BaseProvider", num_findings:
int, session_id: str, and a "Returns:" section documenting the ExecutionEngine
return type; keep the existing brief description as the opening line and update
other new methods (lines ~249-256, 292-293, 315-317, 324-329, 370-371) to the
same Google-style format to match project standards and type hints.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: e1c2bf0f-17f3-415d-a54a-5b57f271ac1c
📒 Files selected for processing (7)
src/cicaddy/agent/base.pysrc/cicaddy/config/settings.pysrc/cicaddy/delegation/orchestrator.pysrc/cicaddy/delegation/summarizer.pysrc/cicaddy/delegation/verifier.pytests/unit/test_delegation_summarizer.pytests/unit/test_delegation_verifier.py
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (2)
src/cicaddy/delegation/verifier.py (2)
261-265: 💤 Low valueAdd explicit type annotation to satisfy static analysis.
dataclasses.replace()returns a generic type that the type checker doesn't recognize asFinding. Adding an explicit annotation addresses the SonarCloud warning about the return type.Proposed fix
- result = dataclasses.replace(finding) + result: Finding = dataclasses.replace(finding)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/cicaddy/delegation/verifier.py` around lines 261 - 265, The call to dataclasses.replace(finding) returns a generic type causing static-analysis warnings; update the assignment to include an explicit Finding type annotation (e.g., result: Finding = dataclasses.replace(finding)) and add or ensure an import for the Finding type is present in verifier.py so the type checker recognizes the result.
344-351: 💤 Low valueRemove redundant
json.JSONDecodeErrorfrom exception tuple.
json.JSONDecodeErroris a subclass ofValueError(since Python 3.5), so it's already caught byValueError. This matches the SonarCloud hint about redundant exception classes.Proposed fix
- except (json.JSONDecodeError, ValueError, TypeError): + except (ValueError, TypeError):🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/cicaddy/delegation/verifier.py` around lines 344 - 351, The except clause in the block that calls extract_json and json.loads (inside verifier.py, around the code that assigns raw = extract_json(content) and data = json.loads(raw)) redundantly lists json.JSONDecodeError which is a subclass of ValueError; update the exception tuple to remove json.JSONDecodeError so it only catches ValueError and TypeError (i.e., except (ValueError, TypeError):) and keep the existing return of VerificationResult(status="uncertain", reasoning=content.strip()[:500]).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@tests/unit/test_line_resolver.py`:
- Line 429: The test assigns the second tuple element to an unused name
`unresolved` when calling resolve_findings([finding], SAMPLE_DIFF); replace the
unused binding with `_` (e.g., change `resolved, unresolved =
resolve_findings(...)` to `resolved, _ = resolve_findings(...)`) in the
occurrences around the resolve_findings calls (including the instances at the
lines that correspond to the other two similar cases) so the unused variable
warning (RUF059) is eliminated.
---
Nitpick comments:
In `@src/cicaddy/delegation/verifier.py`:
- Around line 261-265: The call to dataclasses.replace(finding) returns a
generic type causing static-analysis warnings; update the assignment to include
an explicit Finding type annotation (e.g., result: Finding =
dataclasses.replace(finding)) and add or ensure an import for the Finding type
is present in verifier.py so the type checker recognizes the result.
- Around line 344-351: The except clause in the block that calls extract_json
and json.loads (inside verifier.py, around the code that assigns raw =
extract_json(content) and data = json.loads(raw)) redundantly lists
json.JSONDecodeError which is a subclass of ValueError; update the exception
tuple to remove json.JSONDecodeError so it only catches ValueError and TypeError
(i.e., except (ValueError, TypeError):) and keep the existing return of
VerificationResult(status="uncertain", reasoning=content.strip()[:500]).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: d4a190f2-74b4-4fa1-a7ae-411fc0d77a45
📒 Files selected for processing (7)
src/cicaddy/delegation/line_resolver.pysrc/cicaddy/delegation/orchestrator.pysrc/cicaddy/delegation/summarizer.pysrc/cicaddy/delegation/verifier.pytests/unit/test_delegation_orchestrator.pytests/unit/test_delegation_verifier.pytests/unit/test_line_resolver.py
💤 Files with no reviewable changes (1)
- src/cicaddy/delegation/summarizer.py
8698255 to
f408c63
Compare
Add a FindingVerifier that spawns parallel, budget-constrained AI sub-agents to verify code review findings against actual source code. Each finding is independently checked using read_file tools, filtering false positives before inline comment posting. Key design decisions: - Single shared AI provider across concurrent verifications - Sub-linear budget scaling (sqrt-based) per finding - Tool allowlist restricted to read_file/read_file_lines only - Boundary-sanitized prompts to prevent injection via finding fields - Snippet-based line re-resolution from verifier output - Path-separator guards on all suffix matching to prevent false matches - Graceful degradation: verification failures preserve original findings Feature is off by default, enabled via DELEGATION_VERIFY_FINDINGS=true. Signed-off-by: Wayne Sun <gsun@redhat.com>
5220206 to
a2d65a5
Compare
- Guard max_concurrent with max(1, ...) to prevent Semaphore deadlock - Remove redundant json.JSONDecodeError (subclass of ValueError) - Add explicit type annotation on dataclasses.replace result - Replace unused unresolved variables with _ in test_line_resolver Signed-off-by: Wayne Sun <nicksun98@gmail.com> Signed-off-by: Wayne Sun <gsun@redhat.com>
- Track snippet_resolved flag to prevent adjusted_line being silently discarded when snippet resolution fails (verifier.py) - Add path-separator guard to _find_target_file suffix matching, consistent with _find_hunk_ranges_for_file and is_line_in_diff_ranges (line_resolver.py) - Add path-separator guard to _filter_diff_for_files suffix matching (summarizer.py) - Guard provider.shutdown() to only run after successful initialize(), preventing misleading shutdown warnings on init failure (verifier.py) - Pre-compute diff snippets per file to avoid re-parsing full diff for every finding during parallel verification (verifier.py) - Pass orchestrator max_concurrent through to verifier to respect configured API rate limits (orchestrator.py) Signed-off-by: Wayne Sun <nicksun98@gmail.com> Signed-off-by: Wayne Sun <gsun@redhat.com>
…fety - Extract _resolve_one_finding() from resolve_findings in line_resolver.py (S3776) - Extract _validate_line_mapping() from _apply_line_mappings in summarizer.py (S3776) - Extract _apply_line_correction() and _extract_adjusted_line() in verifier.py (S3776) - Use cast(Finding, dataclasses.replace()) to satisfy type checker (S5890) - Convert async mock functions to sync side_effect pattern (S7503) - Fix unused variable in test_line_resolver.py (S1481) - Fix mock parameter signatures to match updated _verify_single Signed-off-by: Wayne Sun <gsun@redhat.com>
|



Summary
Add a FindingVerifier that spawns parallel, budget-constrained AI sub-agents to verify
code review findings against actual source code, filtering false positives before inline
comment posting.
that reads actual source files via
read_filetools to confirm or reject findingsDELEGATION_VERIFY_FINDINGS=trueChanges
src/cicaddy/delegation/verifier.py(547 lines) —FindingVerifierwith parallelverification, robust JSON parsing,
read_filetool filtering, boundary-sanitized prompts,snippet-based line re-resolution, path-separator guards on suffix matching
src/cicaddy/delegation/line_resolver.py—get_diff_line_ranges(),is_line_in_diff_ranges(),exact-match-first logic,
new_count==0guard, path-separator suffix guardssrc/cicaddy/delegation/orchestrator.py— threadsverify_findingsflag andToolContextthrough to
FindingVerifiersrc/cicaddy/delegation/summarizer.py—Findingdataclass extended withverified/verification_reasonfields; snippet-first line resolution inresolve_findings()src/cicaddy/delegation/schemas/summarizer_response.schema.json— addedverifiedandverification_reasonfields (required foradditionalProperties: false)src/cicaddy/config/settings.py— newDELEGATION_VERIFY_FINDINGSboolean settingsrc/cicaddy/agent/base.py— reads setting and passes to orchestratorKey design decisions
sqrt(num_findings) * 2fraction to avoid starving individual verificationsread_file/read_file_linesonly (frozenset)_sanitize_for_boundary'/' +prefix to preventbar.pymatchingfoobar.pyuncertainDEFAULT_AI_PROVIDERfor provider fallback (matches token budget lookup)Test plan
test_delegation_verifier.py— parsing, tool filtering, prompt construction,parallel execution, provider lifecycle, snippet line resolution, adjusted_line validation
test_line_resolver.py— diff range extraction, hunk validation, line range checkstest_delegation_orchestrator.py— verification flag threading, error handlinguv run pytest tests/unit/ -q— 1399 tests passpre-commit run --files <all changed files>— all checks passDELEGATION_VERIFY_FINDINGS=trueand run against a real PRSummary by CodeRabbit