feat: UI fixes, L030 false-positive reduction, noqa suppression, and AI validation framework #266
Conversation
…AI validation framework

## UI & Gateway fixes
- Activity tab: wrap AI proposals in ExpandableSection for consistent collapsible/minimizable behavior
- Analytics tab: add a rule description column to the AI Proposal Acceptance table with adjusted column widths
- Projects: fix violations count mismatch by adding a project_violation_count query that counts all violations instead of using the paginated list length
- Project detail: increase the violation fetch limit from 100 to 500
- AI proposals: surface declined AI-candidate violations in the review panel so users see all AI-tagged findings, not just successful ones

## L030 false-positive reduction
- Only fire when the module's short name has a known ansible.builtin equivalent; collection modules with no builtin counterpart (e.g. community.general.timezone) are no longer flagged
- Include builtin_alternative in the violation detail for actionable output

## Inline noqa suppression
- Add `# noqa: <rule_id>` support in YAML to suppress specific rules per task (single- and multi-rule forms)
- Integrate into the graph scanner evaluation loop

## AI-assisted validation framework
- Add AIValidationProvider protocol and AIValidationResult model for contextual finding validation (true/false-positive assessment)
- Add a validation prompt template and AbbenayProvider.validate_finding()
- Define the AI_REVIEWABLE_RULES set (R108, R103, R104, R101, R105)
- Add an is_ai_reviewable() partition helper

## Nginx cache fix
- Add Cache-Control: no-cache on index.html to prevent stale bundles
- Serve /assets/ with immutable caching and return 404 for missing assets instead of falling back to index.html (SPA fallback)

Made-with: Cursor
Pull request overview
This PR improves both the UI and scanning/remediation pipeline by refining rule behavior (L030), adding inline YAML suppression (# noqa:), introducing an AI-assisted validation interface for contextual findings, and addressing frontend caching and UX issues.
Changes:
- Reduce L030 false positives by only flagging non-builtin modules when a builtin equivalent exists, and expose a suggested builtin alternative in violation detail.
- Add inline `# noqa: <rule_id>` suppression parsing in the graph scanner, plus new unit tests.
- Introduce AI validation models/protocols and an Abbenay implementation, alongside several UI updates and an nginx cache-header fix.
Reviewed changes
Copilot reviewed 16 out of 16 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/test_task_local_graph_rules_batch1.py | Adds coverage for L030 builtin-equivalent logic and new # noqa suppression parsing. |
| src/apme_gateway/db/queries.py | Adds project_violation_count() to return the true total for the latest scan. |
| src/apme_gateway/api/router.py | Uses project_violation_count() when building project summaries. |
| src/apme_engine/validators/native/rules/L030_non_builtin_use.md | Updates rule documentation to reflect builtin-equivalent-only behavior with new examples. |
| src/apme_engine/validators/native/rules/L030_non_builtin_use_graph.py | Implements builtin-equivalent detection and adds builtin_alternative detail field. |
| src/apme_engine/remediation/partition.py | Adds AI-reviewable rule set and helper is_ai_reviewable(). |
| src/apme_engine/remediation/ai_provider.py | Introduces AI validation protocol + result/verdict models. |
| src/apme_engine/remediation/abbenay_provider.py | Implements validation prompt + parsing and validate_finding() for Abbenay. |
| src/apme_engine/engine/graph_scanner.py | Adds parse_noqa() and skips suppressed rules during node evaluation. |
| src/apme_engine/daemon/primary_server.py | Surfaces “declined” AI-candidate findings as proposals for UI visibility. |
| frontend/src/pages/ProjectDetailPage.tsx | Raises violation fetch limit from 100 to 500. |
| frontend/src/pages/AnalyticsPage.tsx | Adds rule description column to AI Proposal Acceptance table (and other analytics tables). |
| frontend/src/pages/ActivityDetailPage.tsx | Wraps AI proposals table in ExpandableSection with independent toggle state. |
| frontend/src/data/ruleDescriptions.ts | Updates the L030 description string to match new semantics. |
| docs/rules/RULE_CATALOG.md | Updates tested counts/entries to reflect additional rule test coverage (R103). |
| containers/ui/nginx.conf.template | Prevents caching of index.html and enables immutable caching for hashed /assets/. |
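The nginx change could be sketched roughly as follows. This is a minimal sketch of the described cache policy using standard nginx directives; the exact locations and template variables in containers/ui/nginx.conf.template may differ.

```nginx
# Never cache the entry point, so new deploys pick up fresh bundles.
location = /index.html {
    add_header Cache-Control "no-cache, no-store, must-revalidate";
}

# Hashed assets are content-addressed: cache them aggressively, and
# return 404 for missing files instead of falling back to index.html.
location /assets/ {
    add_header Cache-Control "public, max-age=31536000, immutable";
    try_files $uri =404;
}
```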
- Fix inaccurate docstring in graph_scanner.py: suppressed rules are skipped entirely, not tracked in diagnostics
- Split comma-separated rule_ids when building the proposed_rule_files set to correctly match individual rules against declined proposals
- Add defensive parsing for raw_line in _build_declined_proposals to handle list/tuple line ranges and non-numeric values without crashing
- Remove the duplicate AI_REVIEWABLE_RULES from abbenay_provider.py; the canonical definition lives in partition.py

Made-with: Cursor
Pull request overview
Copilot reviewed 16 out of 16 changed files in this pull request and generated no new comments.
Comments suppressed due to low confidence (2)
src/apme_engine/daemon/primary_server.py:1738
`session.status` is being set to `4` for AWAITING_APPROVAL, but `SessionState` (src/apme_engine/daemon/session.py) and `_session_replay_state()` in this file treat `1` as AWAITING_APPROVAL (matching the `SessionStatus` enum in the proto). Using `4` will emit an unknown enum value in `ApprovalAck` and prevents proposal state replay. Set this to the correct AWAITING_APPROVAL value (`1`) and keep status values consistent across the daemon.
session.remaining_manual = []
# 6. Enrich violations with node YAML from the graph progression.
_enrich_violations_from_graph(remaining, graph, fixed=False)
src/apme_engine/daemon/primary_server.py:1742
`ProposalsReady` is constructed without `tier` and `status`. The proto defines these fields and `_session_replay_state()` populates them; leaving them unset here defaults to `SESSION_STATUS_UNSPECIFIED` and `tier=0`, which can break clients that rely on these values. Populate `tier=session.current_tier` and `status` based on `session.status` when emitting this event.
for fv in graph_report.fixed_violations:
fv["remediation_class"] = RemediationClass.AUTO_FIXABLE
_enrich_violations_from_graph(graph_report.fixed_violations, graph, fixed=True)
Pull request overview
Copilot reviewed 17 out of 17 changed files in this pull request and generated 3 comments.
Comments suppressed due to low confidence (1)
src/apme_engine/cli/remediate.py:174
- The CLI always enqueues an `ApprovalRequest` after receiving a `proposals` event, even when `event.proposals.status` is `COMPLETE` (3) and/or when proposals are informational (`status="declined"`). With the new server behavior that may emit results immediately for declined-only proposals, this can lead to duplicate result events and noisy warnings about unknown proposal IDs. Consider skipping the approval step entirely when `event.proposals.status == COMPLETE`, and for `--auto-approve` only approving proposals whose `Proposal.status == "proposed"`.
elif oneof == "proposals":
proposals = list(event.proposals.proposals)
if not proposals:
continue
if getattr(args, "auto_approve", False):
approved = [p.id for p in proposals]
elif use_json:
approved = []
else:
approved = _interactive_review(proposals)
cmd_queue.put(
SessionCommand(
approve=ApprovalRequest(approved_ids=approved),
)
)
…nrichment (ansible#267)

* feat(engine,ui): per-violation diffs, play-level rule fixes, and UI enrichment

Replace the deprecated snippet field with full per-violation context: original_yaml, fixed_yaml, co_fixes, and node_line_start flow from the ContentGraph through gRPC, Gateway, and REST to the frontend. The UI renders Source and Diff tabs in the violation modal using client-side unified diffs.

Fix play-level rule scoping: M010 and L042 now match and report at the play node (not every child task), using affected_children for counts. Populate yaml_lines for play nodes by deriving line ranges from the playbook file content when the parser does not supply them.

Narrow L005 from "any non-builtin module" to "community.* collections only" with a recommendation to use certified or validated collections.

Reclassify remediation_class as observational: violations are "Fixable" only when actually fixed during convergence, not when a transform merely exists.

UI improvements: normalize pill shapes, add a Fix type filter, reorder filters to match pill order (Severity → Scope → Fix → Rule), and make file headers sticky in the results scroll area.

Made-with: Cursor

* fix(ui): clamp node_line_start to 1 and make Fix filter labels context-aware

- Use `|| 1` instead of `?? 1` for node_line_start so a backend value of 0 (unknown) falls back to 1-based line numbering.
- Pass isRemediate to ViolationOutputToolbar so the Fix filter menu shows "Fixed" instead of "Fixable" when viewing remediate results, matching the context-aware labels in violation rows.

Addresses Copilot review comments on PR ansible#267.

Made-with: Cursor
When --json is set, approve no proposals instead of entering _interactive_review. The interactive prompts (input()) write to stdout, corrupting JSON output in CI where stdin is /dev/null. This fixes the test_remediation_idempotent integration test, which broke because declined proposals (added for UI visibility) now trigger the proposals event even without Abbenay, causing the interactive review to run in --json mode.
…fields

- Use status=1 (AWAITING_APPROVAL) instead of 4 (undefined) when proposals require approval. The proto enum defines 1/2/3 only; using 4 broke session replay, which checks status == 1.
- Populate tier and status on the ProposalsReady event to match the contract used by _session_replay_state and CLI/UI consumers.
…k UI (ansible#270)

* fix(remediation): correct summary accounting and unify remediate/check UI

The remediation summary card showed inconsistent numbers because:
1. AI-resolved violations were never added to all_fixed in the convergence loop; only Tier 1 transforms tracked their fixes. This made GraphFixReport.fixed undercount.
2. The gateway mixed violation counts (Tier 1) with proposal counts (AI accepted) when computing "remediated", producing totals where Remediated + Manual > Total.
3. Rescan after partial Tier 1 passes dropped violations on unmodified nodes, losing them from the active list.

Changes:
- Track AI-resolved violations after the AI rescan by comparing dirty-node violations before/after and adding disappeared ones to all_fixed
- Preserve non-dirty violations across rescan sites (3 locations)
- Promote stalled Tier 1 violations to Tier 2 for AI processing
- Gateway: use report.fixed directly for remediated (no ai_accepted hack); use the remaining count for manual_review in remediate mode
- Unify the remediate view to show per-violation rows (like check), removing the combined-fixed summary row
- Increase the FixSession client timeout from 600s to 1800s

Invariant: Total = Remediated + Manual (for remediate mode)

Made-with: Cursor

* fix(ui,remediation): address Copilot review feedback

- ViolationDetailModal: add a scanType prop so the Remediation field shows "Fixed" in remediate mode and "Fixable" in check mode
- graph_engine: normalize rule_id when comparing pre/post AI rescan keys to handle legacy prefixed IDs (e.g. "native:L021")

Made-with: Cursor
- parse_noqa: strip quoted strings before matching `# noqa:` comments so occurrences inside single- or double-quoted YAML scalars are ignored
- primary_server: only emit ProposalsReady when there are actual proposals to approve (proposed_proposals non-empty); declined-only sets go straight to the result, preventing double emission of _session_build_result and duplicate sink side effects
- L030: update the rule description to "Prefer ansible.builtin when a builtin equivalent exists", matching the narrower trigger semantics
The description said "Prefer ansible.builtin modules when available" but the rule now only fires when a builtin equivalent exists. Updated the description across the rule, markdown docs, RULE_CATALOG, and generator. Made-with: Cursor
- graph_scanner: anchor _NOQA_RE so `#` must be preceded by whitespace or start-of-line, preventing false matches mid-token like `echo#noqa`
- queries: replace the two-query project_violation_count with a single query on Scan.total_violations, avoiding COUNT(*) on violations
- test_gateway_projects: extend test_list_projects to seed violations and assert the total_violations field on project list items
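The anchored matcher plus best-effort quote stripping could look roughly like this. It is an illustrative sketch, not the exact `_NOQA_RE` from graph_scanner.py, and it shares the stated limitation: YAML's `''` escape inside single-quoted scalars is not handled.

```python
import re

# '#' must be at start-of-line or preceded by whitespace, so tokens
# like echo#noqa do not match.
_NOQA_RE = re.compile(
    r"(?:^|\s)#\s*noqa:\s*(?P<rules>[A-Z]\d+(?:\s*,\s*[A-Z]\d+)*)"
)
# Best-effort quoted-string stripper; does not handle YAML's '' escape.
_QUOTED_RE = re.compile(r"'[^']*'|\"[^\"]*\"")


def parse_noqa(line: str) -> frozenset[str]:
    """Extract suppressed rule IDs from a YAML line's # noqa comment."""
    stripped = _QUOTED_RE.sub("", line)  # ignore noqa inside quoted scalars
    m = _NOQA_RE.search(stripped)
    if not m:
        return frozenset()
    return frozenset(r.strip() for r in m.group("rules").split(","))
```

The single- and multi-rule forms both parse, while quoted occurrences and mid-token `#` are ignored.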
Force-pushed from a1a57f8 to 8de4a6a.
- parse_noqa: relax the docstring to clarify that the quoted-string stripping is best-effort and does not handle YAML escaped single-quotes ('')
- test_noqa_suppresses_specific_rule: add a baseline scan without noqa to prove L026 fires, then assert the suppressed scan has zero L026 hits
- test_noqa_multiple_rules: same pattern; the baseline proves L026/L030 fire, then assert both are suppressed with `# noqa: L026, L030`
Made-with: Cursor
- Relax the parse_noqa docstring to acknowledge that the YAML '' escape inside single-quoted scalars is not handled (an accepted edge case).
- Rewrite test_noqa_suppresses_specific_rule with a baseline scan that proves L026 fires, then a suppressed scan asserting it doesn't.
- Rewrite test_noqa_multiple_rules the same way for L026+L030.

Made-with: Cursor
Force-pushed from 37fc0e8 to c431501.
Pull request overview
Copilot reviewed 20 out of 20 changed files in this pull request and generated no new comments.
Comments suppressed due to low confidence (1)
src/apme_gateway/api/router.py:2138
- In the WebSocket result payload,
manual_reviewis documented as the Tier-3 manual count (seeActivitySummary/FixReport.remaining_manual), but in remediate mode it’s currently set tolen(remaining)(which includes AI-candidate violations too). Usereport.remaining_manualconsistently (or introduce a separateremaining_totalfield) so the UI’s “Manual” metric doesn’t get inflated/mislabeled during remediation runs.
"ai_accepted": ai_accepted_count,
"manual_review": remaining_count
if is_remediate
else (report.remaining_manual if report else 0),
Remove the AIValidationProvider protocol, AIValidationVerdict enum, AIValidationResult dataclass, AI_REVIEWABLE_RULES, is_ai_reviewable(), validate_finding(), and the validation prompt template. These were scaffolding with no callers; the existing propose_node_fix() path already handles contextual triage via its skip mechanism. Made-with: Cursor
…edger status

Remove _build_declined_proposals: synthetic Proposal objects duplicated data already present in the remaining violations list. Instead, add an "ai_abstained" status to ViolationRecord so the violation ledger tracks AI attempts that produced no fix.

- Add ContentGraph.abstain_violations() for the open → ai_abstained transition
- Record ai_abstained in _apply_ai_transforms for nodes where AI returned None and for AISkipped entries within partial fixes
- Expose ai_abstained_violations in GraphFixReport
- Include ai_abstained violations in remaining_violations for downstream consumers

Made-with: Cursor
…ied pill

Wire AI_ABSTAINED through the full stack: proto enum, violation_convert mapping tables, CLI convert, gateway DB model + Pydantic schema + reporting servicer + router, and frontend pill/modal rendering. Violations where AI attempted but could not produce a fix now show an orange "AI Tried" pill distinct from the blue "AI" pill.

Made-with: Cursor
  passes=passes,
  fixed=len(fixed_violations),
- remaining_violations=remaining,
+ remaining_violations=remaining + ai_abstained,
GraphFixReport.remaining_violations now concatenates open + ai_abstained. Downstream (e.g. the session path) calls add_classification_to_violations() on the returned remaining_violations, which currently overwrites remediation_resolution to UNRESOLVED for every entry. That will erase the AI_ABSTAINED resolution set in the ledger, so the new “AI Tried”/abstain signal won’t make it to the API/UI. Consider keeping remaining_violations as just open (and rely on ai_abstained_violations separately), or ensure the post-processing classification preserves an existing remediation_resolution when present.
Suggested change:
- remaining_violations=remaining + ai_abstained,
+ remaining_violations=remaining,
# Mark violations on nodes where AI produced no fix at all.
from apme_engine.remediation.partition import normalize_rule_id  # noqa: PLC0415

for nid in node_keys:
    if nid in proposed_node_ids:
        continue
    rule_ids = frozenset(normalize_rule_id(str(v.get("rule_id", ""))) for v in by_node[nid])
    graph.abstain_violations(nid, rule_ids, pass_number=pass_num)
The fallback abstain pass marks every node without an applied proposal as ai_abstained, including cases where _propose_one raised (captured as a BaseException above) or where the node/context was skipped. That conflates “AI abstained (couldn’t fix)” with “AI failed/errored or wasn’t attempted”, and can hide transient provider failures by transitioning violations away from open. Track attempted/failed node IDs separately and only abstain when the AI actually returned an explicit no-fix/skip outcome; consider using AI_FAILED (or leaving open) for exceptions.
def abstain_violations(
    self,
    node_id: str,
    rule_ids: frozenset[str],
    *,
    pass_number: int,
) -> int:
    """Transition ``open`` violations to ``ai_abstained`` on a node.

    Called when the AI attempted violations on this node but could
    not produce a fix (returned ``None`` or ``AISkipped``).

    Args:
        node_id: Graph node whose violations to mark.
        rule_ids: Set of normalized rule IDs the AI abstained from.
        pass_number: Convergence pass in which AI abstained.

    Returns:
        Number of violations transitioned to ``ai_abstained``.
    """
    from .models import RemediationResolution  # noqa: PLC0415

    node = self.get_node(node_id)
    if node is None:
        return 0
    count = 0
    for record in node.violation_ledger.values():
        if record.status != "open":
            continue
        _, rule_id = record.key
        if rule_id in rule_ids:
            record.status = "ai_abstained"
            record.fixed_in_pass = pass_number
            record.violation["remediation_resolution"] = RemediationResolution.AI_ABSTAINED
            count += 1
    return count
New abstain_violations() behavior is central to the PR (status transition + remediation_resolution tagging), but there are no unit tests asserting the ledger transition (open -> ai_abstained) and that remediation_resolution is set/propagated as expected. Adding a focused test (e.g. in the existing graph remediation engine test suite) would prevent regressions and ensure the new resolution value survives the end-to-end report/build path.
…ion_to_violations

add_classification_to_violations() was unconditionally setting remediation_resolution=UNRESOLVED on all remaining violations, clobbering the AI_ABSTAINED value that abstain_violations() had already set on the ViolationDict. Now it only sets the default when no resolution exists.

Made-with: Cursor
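The preserve-if-present pattern can be sketched in a few lines. The enum members and function body below are simplified stand-ins for the real models.py/classification code, shown only to illustrate the fix.

```python
from enum import Enum


class RemediationResolution(str, Enum):
    """Simplified stand-in for the real enum (which has 11 members)."""
    UNRESOLVED = "unresolved"
    AI_ABSTAINED = "ai_abstained"


def add_classification_to_violations(violations: list[dict]) -> list[dict]:
    for v in violations:
        # setdefault only writes when the key is absent, so a resolution
        # already set by abstain_violations() is preserved.
        v.setdefault("remediation_resolution", RemediationResolution.UNRESOLVED)
    return violations
```

Compared to an unconditional assignment, `setdefault` lets the ledger's AI_ABSTAINED value survive the post-processing pass and reach the API/UI.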
Summary
- `# noqa: <rule_id>` suppression support for YAML content
- `ai_abstained` violation ledger status to track AI attempts that produced no fix
- `AI_ABSTAINED` resolution end-to-end: proto → gateway → frontend "AI Tried" pill

Changes
UI & Gateway
- AI proposals wrapped in `ExpandableSection` for consistent collapsible behavior
- Violations count fixed via the `project_violation_count` query

L030 Rule
- Only fires when a builtin equivalent (e.g. `copy` from `community.general.copy`) exists in `ansible.builtin`
- Collection modules with no builtin counterpart (e.g. `community.general.timezone`, `amazon.aws.ec2_instance`) are no longer flagged
- Adds a `builtin_alternative` field for actionable remediation guidance

Inline noqa Suppression
- `parse_noqa()` extracts suppressed rule IDs from `# noqa: <rule_id>` comments in YAML
- Supports single-rule (`# noqa: R108`) and multi-rule (`# noqa: R108, L030`) forms
- Integrated into `_evaluate_node()` in the graph scanner; suppressed rules are skipped during evaluation

Nginx Cache Fix
- `index.html` served with `Cache-Control: no-cache, no-store, must-revalidate`
- `/assets/` served with immutable caching and `try_files $uri =404` (no SPA fallback for hashed assets)

Violation Ledger: ai_abstained status

- New `ai_abstained` status in `ViolationRecord`; tracks violations where AI attempted but could not produce a fix
- `ContentGraph.abstain_violations()` transitions open violations to `ai_abstained`
- Recorded for nodes where AI returned None and for `AISkipped` entries
- `GraphFixReport.ai_abstained_violations` exposes the count for downstream consumers
- Replaces the `_build_declined_proposals` approach (synthetic Proposal objects that duplicated data already in the violations list)

AI_ABSTAINED resolution end-to-end
- `AI_ABSTAINED` added to the `RemediationResolution` enum (models.py) and proto enum (value 11)
- `abstain_violations()` sets `remediation_resolution=AI_ABSTAINED` on the violation dict
- Mapping tables updated in `violation_convert.py` and the CLI convert path
- `remediation_resolution` column added to the `Violation` DB model, Pydantic schema, reporting servicer, and API router
- Frontend: `remediation_resolution` field on `ViolationDetail` and `ViolationRecord` types; `tierLabel()` and `tierPillClass()` show "AI Tried" with an orange pill when resolution is `AI_ABSTAINED`

Bug Fixes
- Fixed `session.status = 4` (invalid) to `session.status = 1` (AWAITING_APPROVAL per the proto enum); proposals were silently dropped on session resume
- `ProposalsReady` now populates `tier` and `status` fields on initial emit (was only set on replay)

Test plan
- `tox -e lint` passes
- `tox -e unit`: 2242 passed, 42 skipped
- `test_remediation_resolution_is_str_enum` updated and passing (11 members)
- `test_round_trip_resolution` covers AI_ABSTAINED via iteration

Made with Cursor