feat(ui): policy alert cards, notifications, and durable receipts#952

Open
dislovelhl wants to merge 26 commits into amd:main from dislovelhl:feat/optional-governance-layer

Conversation

@dislovelhl
Contributor

@dislovelhl dislovelhl commented May 4, 2026

Closes #925

Policy BLOCK decisions were already emitted as policy_alert SSE events, but Agent UI users could not reliably see, understand, or revisit those decisions. This PR turns those backend policy events into visible Policy Shield activity cards, critical notifications/toasts, and persisted receipt details so blocked tool calls stay clear across reloads.

Threads

  • Durable persistence: policy-only BLOCK streams are saved with decision, reason, rule IDs, policy version, and receipt ID so the messages API can restore them.
  • UI surfacing: chat stream handling renders Policy Shield cards, notification center filtering, and View receipt anchoring.
  • Coverage: unit, integration, and Electron assertions cover policy alert field preservation and reload visibility.

Test plan

  • cd src/gaia/apps/webui && npm run build
  • python -m pytest tests/unit/chat/ui/test_utils_helpers.py::TestMessageToResponse::test_agent_steps_preserves_policy_alert_fields tests/integration/test_chat_ui_integration.py -k policy_alert -q
  • git diff --check
  • cd tests/electron && CI=true npm test -- --runInBand

dislovelhl and others added 24 commits April 19, 2026 23:41
Adds `gaia.governance` — an opt-in, additive governance package that
wraps tool execution with an ACGS-lite-style action kernel and seams
for constitutional-swarm workflow checkpoints / receipts / policy-
version binding.

Key properties

- Zero edits to the base `Agent` class. `GovernedAgentMixin` overrides
  `_execute_tool` via `super()`; adding it to an agent costs nothing
  when no adapter is supplied.
- Canonical tool-name resolution before governance, so unprefixed MCP
  aliases cannot bypass risk tags on their canonical names.
- Fail-closed REVIEW: only an explicit `governance_reviewer` callback
  counts. The default `AgentConsole.confirm_tool_execution` auto-
  approves, so it is intentionally not consulted.
- Envelope-bound receipt hashes cover the full evidence set
  (receipt id, workflow, decision, policy version, constitution hash,
  timestamp, evidence) with strict canonical JSON.
- Workflow-bound checkpoint resolution and atomic check-and-set in
  the in-memory bridge.
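
The envelope-bound hash described above can be sketched as follows. This is a minimal illustration, not the package's implementation: the helper names `canonical_json` and `receipt_hash`, the SHA-256 choice, and the sample field values are all assumptions. Only the list of hashed fields comes from the commit message.

```python
import hashlib
import json


def canonical_json(value):
    """Serialize deterministically; reject NaN/Infinity instead of emitting them."""
    return json.dumps(
        value,
        sort_keys=True,          # key order must not influence the digest
        separators=(",", ":"),   # no insignificant whitespace
        ensure_ascii=True,
        allow_nan=False,         # raises ValueError on NaN / Infinity
    )


def receipt_hash(envelope):
    """Hash the whole envelope so that changing any field changes the digest."""
    return hashlib.sha256(canonical_json(envelope).encode("utf-8")).hexdigest()


# Hypothetical envelope; values are made up for illustration.
envelope = {
    "receipt_id": "r-001",
    "workflow": "wf-demo",
    "decision": "BLOCK",
    "policy_version": "v1",
    "constitution_hash": "abc123",
    "timestamp": "2026-04-19T23:41:00Z",
    "evidence": {"tool": "delete_file"},
}

digest = receipt_hash(envelope)
reordered = dict(reversed(list(envelope.items())))  # same fields, other order
tampered = {**envelope, "decision": "ALLOW"}        # one field changed
```

Because the serializer sorts keys, `reordered` hashes to the same digest, while `tampered` does not.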

Ergonomics

- `GaiaGovernanceAdapter.default(audit_log=...)` for one-line wiring
  with in-repo stubs.
- `GovernanceConfig` dataclass consolidates six governance kwargs;
  per-kwarg style preserved for back-compat.
- `@govern(risk="blocked", reason=...)` decorator colocates policy
  with tool source; explicit dict merges with decorator tags.

What's here (PR 1)

- Action-level governance via `GovernedAgentMixin`
- Protocol interfaces: `PolicyEngine`, `CheckpointRuntime`,
  `ReceiptServiceProtocol`, `PolicyBindingProtocol`
- In-memory + JSONL receipt services
- Reference stub policy engine (`RuleBasedPolicyEngine`)
- 55 unit + integration tests; pylint clean against repo `.pylintrc`
- Governed weather-agent example with CLI reviewer
- `src/gaia/governance/README.md` with quickstart and extension table

What's deferred (PR 2+)

- ACGS-lite-backed policy engine
- Persistent checkpoint bridge via constitutional-swarm
- Policy control plane wiring
- Plan-step / multi-agent workflow transitions (the
  `workflow_mapper` helper is a forward-compatibility seam)

Review: iterated against Codex (architecture / correctness / security)
and Gemini (DX / API ergonomics / docs) advisors. All HIGH / MEDIUM
findings addressed; regression tests added for each.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The new governance package was missing from setup.py packages list,
causing test_all_filesystem_packages_in_setup_py to fail. Black and
isort were not run on the new files before commit.

Constraint: Black line-length follows project default (88)
Constraint: isort profile follows project default
Tested: black --check, isort --check-only both pass locally
Not-tested: full unit test suite (requires project deps)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…cation

Fix thread safety in InMemoryCheckpointBridge.create_checkpoint by holding
_lock during _records write. Guard mapping lookup in resolve_checkpoint with
.get() to raise InvalidResolutionError instead of KeyError on unknown types.
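
A rough sketch of the two fixes in this paragraph, assuming a simplified bridge: the class body, the `_STATUS` mapping, and the status strings are hypothetical. Only the names `InMemoryCheckpointBridge`, `create_checkpoint`, `resolve_checkpoint`, `_lock`, `_records`, and `InvalidResolutionError` appear in the commit message.

```python
import threading


class InvalidResolutionError(Exception):
    """Stand-in for the package's error type."""


class InMemoryCheckpointBridge:
    # Hypothetical mapping from resolution type to stored status.
    _STATUS = {"approve": "APPROVED", "reject": "REJECTED"}

    def __init__(self):
        self._lock = threading.Lock()
        self._records = {}

    def create_checkpoint(self, checkpoint_id, payload):
        # Hold the lock for the write so concurrent creators cannot interleave.
        with self._lock:
            self._records[checkpoint_id] = {"payload": payload, "status": "PENDING"}

    def resolve_checkpoint(self, checkpoint_id, resolution_type):
        # .get() turns an unknown type into a domain error instead of KeyError.
        status = self._STATUS.get(resolution_type)
        if status is None:
            raise InvalidResolutionError(f"unknown resolution type: {resolution_type!r}")
        with self._lock:
            record = self._records.get(checkpoint_id)
            if record is None or record["status"] != "PENDING":
                raise InvalidResolutionError(f"cannot resolve {checkpoint_id!r}")
            record["status"] = status  # check-and-set under one lock acquisition
        return status
```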

Add Lock to InMemoryReceiptService for concurrent access. Harden _read_all
deserialization to filter to known ReceiptRecord fields, silently skipping
malformed or schema-mismatched lines instead of crashing.

Replace assert in _handle_review_checkpoint with GaiaGovernanceError raise
(asserts are stripped with -O). Eliminate type: ignore[union-attr] by passing
the already-resolved adapter as an explicit parameter.

Make handle_transition REVIEW branch explicit with elif + raise
GaiaGovernanceError on unknown decision types instead of implicit fallthrough.
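
The explicit-branch pattern might look like the sketch below. The decision strings and return values are assumptions chosen for illustration; the point is that an unrecognized decision raises rather than falling through to a permissive default.

```python
class GaiaGovernanceError(Exception):
    """Stand-in for the package's error type."""


def handle_transition(decision):
    """Map a decision to an action; anything unrecognized fails closed."""
    if decision == "ALLOW":
        return "execute"
    elif decision == "BLOCK":
        return "deny"
    elif decision == "REVIEW":
        return "checkpoint"
    # A raise (not an assert) survives `python -O`, which strips asserts.
    raise GaiaGovernanceError(f"unknown decision type: {decision!r}")
```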

Remove duplicate GovernanceCallback / GovernanceReviewer aliases: define once
in config.py with specific types (ActionRequest, GovernanceDecision) and
import in mixin.py.

Confidence: high
Scope-risk: narrow
Tested: black+isort clean, syntax verified
Not-tested: full test suite (no test runner available locally)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
test_tool_decorator.py has an autouse fixture that calls
_TOOL_REGISTRY.clear() before and after each test.  When it runs
before test_governance_dx.py in CI, _dx_decorated_blocked is no longer
in the registry so _lookup_tool_fn returns None and the two tests that
depend purely on decorated tags (test_mixin_reads_decorated_tags_from_registry
and test_explicit_dict_overrides_decorated_tags) see an ALLOW decision
instead of BLOCK.

Fix: add an autouse fixture in test_governance_dx.py that re-registers
the test tools if the registry was cleared between collection and
execution.

Constraint: _TOOL_REGISTRY is a module-level mutable global; test
isolation must be explicit when multiple suites share it.
Tested: test_tool_decorator.py + test_governance_dx.py sequentially (16 passed)
Not-tested: concurrent xdist workers (not used in this CI)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Black reformatted 5 test files that were introduced without running the
formatter first: test_governance_dx.py (also picks up the autouse-fixture
added in the previous commit), test_governance_adapter.py,
test_governance_jsonl_receipts.py, test_governance_schemas.py,
test_governance_receipts.py.

No logic changes.

Tested: 40 governance unit tests pass locally
Scope-risk: narrow

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Remaining 6 files flagged by CI Black check:
- tests/integration/test_governed_*.py (5 governance integration tests)
- src/gaia/mcp/mcp_bridge.py

No logic changes.

Scope-risk: narrow

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Closes the merge blockers from PR review:

- Tighten exception scopes in mixin and receipt service. Replace blanket
  `except Exception: pass` with specific exception types and `logger.warning`
  for the unexpected case. Most importantly, `_resolve_canonical_tool_name`
  now logs unexpected resolver errors instead of silently falling through
  to the raw name — closing the alias-bypass risk where governance could
  check tags on the wrong key.
- Correct documentation: tag merge is additive (union, deduplicated), not
  "explicit dict wins". README, decorator docstring, and mixin comment now
  match the behavior tests have always asserted.
- Strict canonical JSON for BLOCK-receipt evidence: handle non-JSON tool
  args, complex types, and cycles without falling back to repr().
- Strict canonical JSON in JsonlReceiptService.issue_receipt: reject
  non-canonical metadata (NaN/Inf, opaque objects) at issue time instead
  of allowing tampered receipts to land in the audit log.
- Register the governance SDK in public docs: new
  `docs/sdk/sdks/governance.mdx` and an entry in `docs/docs.json`.
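
To make the "additive (union, deduplicated)" tag-merge semantics above concrete, here is a small sketch. The function name `merge_risk_tags` and the dict-of-lists shape are assumptions, not the mixin's actual code.

```python
def merge_risk_tags(decorated, explicit):
    """Union per-tool tags, keeping first-seen order and dropping duplicates."""
    merged = {}
    for source in (decorated, explicit):
        for tool, tags in source.items():
            bucket = merged.setdefault(tool, [])
            for tag in tags:
                if tag not in bucket:
                    bucket.append(tag)
    return merged


decorated = {"delete_file": ["blocked"]}
# An explicit empty list does not downgrade the decorator's tag.
explicit = {"delete_file": [], "send_email": ["review", "review"]}
merged = merge_risk_tags(decorated, explicit)
```

Under these assumptions `delete_file` stays `blocked` even though the explicit dict supplies an empty list for it.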
The API Tests job has no `timeout-minutes`, so a hung Lemonade server
or stalled model pull leaves the runner spinning indefinitely (a 4-hour
no-op happened on PR amd#921). 30 min comfortably covers the worst-case
sequential path: 60s server start + 10min model pull + 2min model load
+ 30s API server start + test run.
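
The budget above works out as follows. The dictionary keys are just labels for the steps named in the message.

```python
# Worst-case sequential setup steps, in seconds, as listed in the message.
steps_seconds = {
    "server start": 60,
    "model pull": 10 * 60,
    "model load": 2 * 60,
    "API server start": 30,
}

setup_seconds = sum(steps_seconds.values())            # 810 s = 13.5 min
timeout_seconds = 30 * 60                              # proposed timeout-minutes: 30
remaining_for_tests = timeout_seconds - setup_seconds  # 990 s = 16.5 min
```

So roughly 16.5 minutes remain for the test run itself before the job is cancelled.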
…p-copy tags

Final polish on top of the merge-blocker fixes. Reviewer feedback from a
parallel code/architecture audit converged on these items:

- Delete `workflow_mapper.py` and `StaticPolicyBindingService.bind_receipt`.
  Both are advertised as forward-compat seams but have zero callers in
  src/, tests/, examples/, or docs/. Re-introduce them in the PR that
  actually wires the new event surface, with the real signature in hand.
- Tighten `JsonlReceiptService.get_receipt`: cache reads and writes were
  unsynchronized while a concurrent `issue_receipt` was mutating the same
  dict under `_lock`. Move the cache check + install under the lock.
- Add a `logger.debug` breadcrumb for malformed-line skips in
  `_read_all` so an operator chasing a missing receipt has something to
  grep.
- Deep-copy inner risk-tag lists in `GovernedAgentMixin.__init__` so a
  caller cannot mutate the agent's tag table after construction by
  holding onto the original list reference.
- Add a comment in `_canonical_json_value` documenting why `bool` is
  checked before `int` (subclass relationship — without the ordering,
  `True` would canonicalize as `1`).
- README: drop the `workflow_mapper` mention from "What's not here yet"
  now that the seam is gone.
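
Two of the points above rest on plain Python behavior and can be checked in isolation. The names `caller_tags` and `canonical_kind` are illustrative only and are not taken from the package.

```python
import copy

# 1. A shallow copy shares the inner lists; a deep copy does not.
caller_tags = {"delete_file": ["blocked"]}
shallow = dict(caller_tags)
deep = copy.deepcopy(caller_tags)
caller_tags["delete_file"].clear()  # caller mutates the list it kept
# shallow["delete_file"] is now [] ; deep["delete_file"] is still ["blocked"]


# 2. bool is a subclass of int, so the bool check has to come first.
def canonical_kind(value):
    if isinstance(value, bool):
        return "bool"
    if isinstance(value, int):
        return "int"
    return type(value).__name__
```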
Adds tests for the previously-uncovered branches surfaced by the
test-coverage audit. Each test guards against a specific regression:

- `test_resolver_unexpected_exception_logs_and_governs_raw_name` — proves
  a buggy `_resolve_tool_name` that raises an unexpected exception still
  triggers governance on the raw name AND emits an operator-visible
  warning. Future regression where the warning is swapped for a silent
  fallback fails this test.
- `test_resolver_lookup_error_is_silent_and_governs_raw_name` — proves
  the expected "tool not in registry" case (LookupError) is absorbed
  silently with no log noise.
- `test_unknown_transition_outcome_fails_closed` — proves a custom
  `CheckpointRuntime` returning a status the mixin doesn't know is
  denied, not let through.
- `test_handle_transition_rejects_unknown_decision_type` — same idea at
  the adapter layer for an unknown `GovernanceDecision.decision`.
- `test_read_all_skips_malformed_lines` — proves a corrupt line in the
  middle of an audit log doesn't block readers from finding subsequent
  valid records.
- Existing callback-exception and reviewer-exception tests gain caplog
  assertions so a future silent-swallow regression is caught.
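
The malformed-line behavior guarded by `test_read_all_skips_malformed_lines` could be sketched like this. The two-field `ReceiptRecord` and the free function `read_all` are simplified assumptions; the real record has more fields and the real reader lives on the JSONL receipt service.

```python
import json
import logging
from dataclasses import dataclass, fields

logger = logging.getLogger("governance.sketch")


@dataclass
class ReceiptRecord:
    # Hypothetical, reduced field set.
    receipt_id: str
    decision: str


def read_all(lines):
    """Parse JSONL lines, keeping known fields and skipping bad lines."""
    known = {f.name for f in fields(ReceiptRecord)}
    records = []
    for lineno, line in enumerate(lines, start=1):
        try:
            raw = json.loads(line)
            kwargs = {k: v for k, v in raw.items() if k in known}
            records.append(ReceiptRecord(**kwargs))
        except (json.JSONDecodeError, AttributeError, TypeError) as exc:
            # Breadcrumb for operators chasing a missing receipt.
            logger.debug("skipping malformed receipt line %d: %s", lineno, exc)
    return records


log_lines = [
    '{"receipt_id": "r1", "decision": "BLOCK"}',
    "{not json",                                            # corrupt line
    '{"receipt_id": "r2", "decision": "ALLOW", "extra": 1}',  # unknown field dropped
    '{"decision": "BLOCK"}',                                # missing required field
]
records = read_all(log_lines)
```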

Plus two readability fixes:
- Rename `test_explicit_dict_overrides_decorated_tags` to
  `test_explicit_empty_dict_does_not_downgrade_decorator_tags` — the
  body asserts additive semantics; the old name said the opposite.
- Replace hardcoded `"test_governance_adapter.SlotOnlyEvidence"`
  qualname strings with `f"{Cls.__module__}.{Cls.__qualname__}"` so the
  tests survive a file rename.
… log

`_prompt_review` now returns `(approved, exception_or_None)` instead of just
`approved`. When a reviewer raises, `_handle_review_checkpoint` stamps the
exception type and message into `CheckpointResolution.reason` so the receipt
metadata records "reviewer raised RuntimeError: bad reviewer" rather than the
boilerplate "reviewer rejected" — which previously made the audit log unable
to tell a deliberate "no" from a crash.

The operator-facing `logger.warning` was already in place; this commit closes
the audit-trail gap so downstream consumers (compliance, forensics, retros)
can distinguish the two without grepping operator logs.

Adds two tests:
- `test_reviewer_exception_is_treated_as_reject` extended to assert the
  receipt's `metadata.evidence.resolution.reason` contains the exception
  type and message
- new `test_reviewer_explicit_no_keeps_plain_reason` — a reviewer that
  returns False produces a plain "reviewer rejected" reason, not an
  exception-flavored one (regression guard against false positives)
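
A minimal sketch of the `(approved, exception_or_None)` contract described above. The function bodies are assumptions, and the "reviewer approved" string is invented for symmetry. The two rejection strings are the ones quoted in the commit message.

```python
def prompt_review(reviewer, request):
    """Return (approved, exception_or_None); a crashing reviewer counts as a no."""
    try:
        return bool(reviewer(request)), None
    except Exception as exc:
        return False, exc


def resolution_reason(approved, exc):
    """Build the reason that would be stamped into the receipt metadata."""
    if exc is not None:
        return f"reviewer raised {type(exc).__name__}: {exc}"
    return "reviewer approved" if approved else "reviewer rejected"


def crashing_reviewer(_request):
    raise RuntimeError("bad reviewer")


def declining_reviewer(_request):
    return False


crash_reason = resolution_reason(*prompt_review(crashing_reviewer, {}))
plain_reason = resolution_reason(*prompt_review(declining_reviewer, {}))
```

Both reviewers end in a denial, but only the first reason carries the exception type and message.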
The Claude AI Assistant workflow runs the claude-code-action which
requires ANTHROPIC_API_KEY. That secret is only configured on the
canonical amd/gaia repo, so every fork without the secret hits a
hard failure on PR-review, issue-handler, and release-notes events.

Add `github.repository == 'amd/gaia'` to each job's `if:` so the
workflow no-ops on forks rather than failing red. Forks can still
opt-in by setting their own ANTHROPIC_API_KEY and removing the guard,
but the default is silent skip.

Tested by re-running PR #3 on dislovelhl/gaia-acgs after this commit:
all four jobs should report `Skipped` instead of failing.
Addresses architectural feedback on amd#921 (review 4197475871). Governance
REVIEW now reuses GAIA's existing blocking confirmation flow when the
active console advertises it, instead of running as a parallel
enforcement path that silently fails closed.

- OutputHandler grows a `blocking_confirmation: bool = False` capability
  flag; SSEOutputHandler sets it to True (it already blocks on the
  frontend permission modal).
- _prompt_review precedence: explicit governance_reviewer wins; else
  delegate to console.confirm_tool_execution iff the console advertises
  blocking_confirmation; else fail closed. The console is resolved per
  call, not captured at __init__.
- The default console still returns True immediately, so CLI without an
  explicit reviewer continues to fail closed (no auto-approve).
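
The precedence rules above can be sketched as a single function. The class names, the `review_decision` helper, and the `confirm_tool_execution(tool_name, tool_args)` signature are assumptions made for this illustration. Only the `blocking_confirmation` flag and the ordering come from the commit message.

```python
class DefaultConsole:
    """Mimics a console that auto-approves and does not block."""
    blocking_confirmation = False

    def confirm_tool_execution(self, tool_name, tool_args):
        return True


class BlockingConsole:
    """Mimics a console that waits for a real user answer."""
    blocking_confirmation = True

    def __init__(self, answer):
        self.answer = answer

    def confirm_tool_execution(self, tool_name, tool_args):
        return self.answer


def review_decision(tool_name, tool_args, reviewer=None, console=None):
    # 1. An explicit governance reviewer always wins.
    if reviewer is not None:
        return bool(reviewer(tool_name, tool_args))
    # 2. Otherwise delegate only to consoles that advertise blocking confirmation.
    if console is not None and getattr(console, "blocking_confirmation", False):
        return bool(console.confirm_tool_execution(tool_name, tool_args))
    # 3. Everything else fails closed.
    return False
```

With this ordering, the auto-approving default console is never consulted, so a CLI run without a reviewer is denied.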

Test coverage:
- tests/integration/test_governed_review_flow.py — @govern(risk=review)
  + SSEOutputHandler emits permission_request, deny resolves the
  checkpoint, denied tool body never executes, non-blocking consoles
  fail closed, audit receipt distinguishes REVIEW_REJECTED from BLOCK.
- tests/unit/chat/ui/test_sse_confirmation.py — handshake coverage for
  approve/deny/timeout/cancel.

Documents the relationship to confirm_tool_execution in
docs/sdk/sdks/governance.mdx and src/gaia/governance/README.md so the
"which mechanism shows a UI prompt?" answer is no longer ambiguous.

The legacy TOOLS_REQUIRING_CONFIRMATION set is intentionally untouched
in this commit; unifying the pipeline is staged for follow-up PRs.
Today GovernedAgentMixin returns a denied dict on BLOCK and the end user
can't tell a policy refusal from a generic tool failure. This adds a
structured policy_alert event so the Agent UI can surface a "blocked by
policy" notification with the audit receipt id.

- OutputHandler grows print_policy_alert() (default no-op for headless
  consoles).
- SSEOutputHandler.print_policy_alert emits a typed event onto the SSE
  queue with tool, decision, reason, rule_ids, policy_version, and the
  audit receipt_id when present. Tool args are deliberately excluded —
  receipt_id is the safe correlator for deep-linking back to the audit.
- GovernedAgentMixin._emit_policy_alert is called immediately before the
  denied result is returned. The receipt_id surfaced in the SSE event is
  the same id stored by the receipt service, so the alert and the audit
  log link 1:1. Emission failures are logged (warning, exc_info) and
  swallowed so a UI bug can never break governance.
- Frontend StreamEvent type union grows policy_alert + the new optional
  fields. Rendering (toast, inline card, "view receipt" route) ships in
  PR-2 once the receipt-viewer UX is decided.
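
As a rough picture of the event described above, the sketch below builds a `policy_alert` payload and puts it on a queue. The helper names, the `"type"` key, and the use of `queue.Queue` are assumptions. The field list, the omission of `receipt_id` when it is `None`, the exclusion of tool args, and the log-and-swallow behavior follow the commit message.

```python
import logging
import queue

logger = logging.getLogger("governance.sketch")


def build_policy_alert(tool, decision, reason, rule_ids, policy_version, receipt_id=None):
    """Build the alert payload; tool args are deliberately not included."""
    event = {
        "type": "policy_alert",
        "tool": tool,
        "decision": decision,
        "reason": reason,
        "rule_ids": list(rule_ids),
        "policy_version": policy_version,
    }
    if receipt_id is not None:  # omit the key entirely when there is no receipt
        event["receipt_id"] = receipt_id
    return event


def emit_policy_alert(event_queue, **alert_fields):
    """Emit the alert; a failure is logged and swallowed, never raised."""
    try:
        event_queue.put(build_policy_alert(**alert_fields))
    except Exception:
        logger.warning("policy_alert emission failed", exc_info=True)


events = queue.Queue()
emit_policy_alert(
    events,
    tool="delete_file",
    decision="BLOCK",
    reason="destructive tool",
    rule_ids=["R-1"],
    policy_version="v1",
    receipt_id=None,
)
alert = events.get_nowait()
```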

Tests:
- tests/unit/chat/ui/test_sse_handler.py — exact event shape, including
  the omits-receipt-id-when-None case.
- tests/integration/test_governed_review_flow.py — full BLOCK path
  through SSEOutputHandler asserts denied result, agent.calls == [] (tool
  body never executes), audit receipt persisted, and the alert event's
  receipt_id matches the denied result's receipt_id.

Docs:
- docs/sdk/sdks/governance.mdx + src/gaia/governance/README.md document
  the event shape and intended UI consumption.
- docs/sdk/sdks/agent-ui.mdx links the event into the SSE event
  reference.

Stacked on feat/governance-review-bridge (PR #3) which lands the Path A
capability bridge for governance REVIEW.
…ckend

feat(governance): emit policy_alert SSE event when BLOCK denies a tool
feat(governance): bridge REVIEW to existing console confirmation surface
Inherited via main merge from amd#919, which introduced both a regression
test asserting AttributeError must surface and a broad-except wrapper
that swallowed it. Same commit, opposite intent — removing the except
satisfies the test, the docstring ("Ratchets the Apr-20 review fix"),
and the project's no-silent-fallbacks rule.

Also drops one stray blank line in the test file to satisfy black.

Unblocks PR 921 CI (Code Quality + Unit Tests checks).
Flake8 E731 — "do not assign a lambda expression, use a def" — flagged
the inline reviewer fallback in mixin.py:348. Behaviour is identical;
just satisfies the project's lint contract.
Persist policy BLOCK steps through reload and disconnect so policy receipts remain visible in the Agent UI. Render alerts as non-actionable notifications, toast links, and inline Policy Shield activity cards.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@github-actions github-actions bot added labels May 4, 2026: documentation (Documentation changes), dependencies (Dependency updates), devops (DevOps/infrastructure changes), tests (Test changes), electron (Electron app changes), agents
Resolve PR amd#952 conflicts by keeping the SettingsPage migration from main while preserving policy alert notifications and tests. Sync Agent UI package metadata with the merged GAIA version.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@dislovelhl dislovelhl marked this pull request as ready for review May 4, 2026 19:21
@dislovelhl dislovelhl requested a review from kovtcharov-amd as a code owner May 4, 2026 19:21
Copilot AI review requested due to automatic review settings May 4, 2026 19:21
@dislovelhl
Contributor Author

Resolved the merge conflicts with amd/main and pushed the merge commit to the PR branch.

What changed in the resolution:

  • Kept main's SettingsPage migration while preserving the policy-alert notification popover in App.tsx.
  • Updated Electron static assertions from SettingsModal to SettingsPage.
  • Synced Agent UI package metadata to 0.17.6 to match src/gaia/version.py.

Validation run locally:

  • cd src/gaia/apps/webui && npm run build
  • python -m pytest tests/unit/chat/ui/test_utils_helpers.py::TestMessageToResponse::test_agent_steps_preserves_policy_alert_fields tests/integration/test_chat_ui_integration.py -k policy_alert -q
  • git diff --check
  • cd tests/electron && CI=true npm test -- --runInBand

GitHub now reports the PR as mergeable; remaining BLOCKED state appears to be from required checks/reviews rather than conflicts.


Copilot AI left a comment

Pull request overview

This PR adds a first-class “policy alert” (governance) event/step that can be streamed to the UI, persisted in chat history, and surfaced via dedicated UI affordances (agent activity cards, notifications, and a receipt-focused toast).

Changes:

  • Persist policy_alert agent steps (including decision/reason/rule IDs/policy version/receipt ID) and ensure policy-only BLOCK streams are reloadable from the messages API.
  • Update the web UI to recognize policy_alert events, render “Policy Shield” activity cards, and add a notification center with policy filtering + receipt anchoring.
  • Expand unit/integration/electron test coverage for policy alerts and update async test helpers to use asyncio.run.

Reviewed changes

Copilot reviewed 21 out of 22 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| `tests/unit/chat/ui/test_utils_helpers.py` | Adds unit test ensuring policy alert fields survive message_to_response. |
| `tests/unit/chat/ui/test_history_limits.py` | Switches _run_sync helper to asyncio.run. |
| `tests/unit/chat/ui/test_chat_helpers_model_resolution.py` | Switches _run_sync helper to asyncio.run. |
| `tests/integration/test_chat_ui_integration.py` | Adds integration coverage for policy-only blocks, multi-block streams, disconnect persistence, and dedupe behavior. |
| `tests/electron/test_electron_chat_app.js` | Adds/updates string-based UI coverage assertions for policy alert routing and UI elements. |
| `tests/electron/test_agent_process_manager.js` | Adjusts test config and timer cleanup to avoid pending-RPC timer leaks/flakes. |
| `src/gaia/ui/models.py` | Extends AgentStepResponse with policy_alert fields for persistence/API responses. |
| `src/gaia/ui/_chat_helpers.py` | Captures policy_alert events during streaming and persists policy-only BLOCK responses for reloadability. |
| `src/gaia/apps/webui/src/types/index.ts` | Extends AgentStep type with policy_alert and related metadata fields. |
| `src/gaia/apps/webui/src/types/agent.ts` | Extends notification types/model to include policy_alert metadata. |
| `src/gaia/apps/webui/src/styles/index.css` | Adds global styles for the notification center trigger/popover. |
| `src/gaia/apps/webui/src/services/api.ts` | Routes policy_alert SSE events through the agent-event callback path. |
| `src/gaia/apps/webui/src/components/NotificationCenter.tsx` | Adds policy alert rendering and receipt anchoring/filter tab. |
| `src/gaia/apps/webui/src/components/NotificationCenter.css` | Styles policy notifications and policy detail blocks. |
| `src/gaia/apps/webui/src/components/ChatView.tsx` | Handles policy_alert events (steps + notifications + toast) and adds notification-center trigger UI. |
| `src/gaia/apps/webui/src/components/ChatView.css` | Adds styling for the policy alert toast. |
| `src/gaia/apps/webui/src/components/AgentActivity.tsx` | Renders policy alert “Policy Shield” cards and updates summary behavior. |
| `src/gaia/apps/webui/src/components/AgentActivity.css` | Styles policy alert cards and policy-aware summary bar state. |
| `src/gaia/apps/webui/src/App.tsx` | Mounts the notification center popover controlled by the notification store. |
| `src/gaia/apps/webui/package.json` | Bumps UI version and adds make script alias. |
| `src/gaia/apps/webui/package-lock.json` | Updates lockfile version metadata to match package version bump. |
| `docs/guides/agent-ui.mdx` | Documents policy alerts/receipts behavior in the Agent UI guide. |
Files not reviewed (1)
  • src/gaia/apps/webui/package-lock.json: Language not supported

Comment thread docs/guides/agent-ui.mdx
Comment on lines +186 to +196
### Policy Alerts and Receipts

When a governance-enabled agent blocks a tool call, the Agent UI shows a
non-actionable policy alert instead of an approval prompt. Policy blocks appear
as inline **Policy Shield** activity cards, critical notifications, and a toast
with a **View receipt** link when a receipt ID is available.

Policy alerts are durable session history. If you reload the UI or reconnect
after a blocked request with no assistant text, the block reason, rule IDs,
policy version, and receipt ID remain attached to the assistant message.


Copilot AI left a comment

Pull request overview

Copilot reviewed 21 out of 22 changed files in this pull request and generated 1 comment.

Files not reviewed (1)
  • src/gaia/apps/webui/package-lock.json: Language not supported

Comment thread src/gaia/apps/webui/src/components/ChatView.tsx Outdated
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Collaborator

itomek commented May 5, 2026

@dislovelhl — triage notes before doing a code-level pass:

  • Title. Feat/optional governance layer doesn't fit — the governance layer is already on main at src/gaia/governance/. This PR is the Agent UI surface for the existing policy_alert event (Policy Shield cards, notifications, persistence). Conventional-commits form: feat(ui): policy alert cards, notifications, and durable receipts.
  • No linked issue. Add Closes #N so the diff has a stated success criterion and auto-closes on merge. (Copilot flagged the same — body is real, but the issue link is missing.)
  • Scope. 22 files spanning docs + UI types + React + Python persistence + Electron + integration tests is a lot for a single PR. Splitting into (1) _chat_helpers.py persistence + integration tests, (2) UI components + types, (3) Electron tests would land each piece faster. Not a blocker for this PR, but worth considering for the next.

Happy to do a code-level pass once the title and issue link are sorted. Copilot's two technical findings (Zustand selector to avoid full-store subscription in ChatView; PR-description template) are worth picking up before the rebase too.

@dislovelhl dislovelhl changed the title Feat/optional governance layer feat(ui): policy alert cards, notifications, and durable receipts May 5, 2026
@dislovelhl
Contributor Author

@itomek — prereqs you flagged are sorted (title, Closes #925, Zustand selector, template body). Author-side map to make the code-level pass faster.

Scope at a glance. The backend policy_alert SSE event landed via the #921 series. This PR is purely the UI surface + durable persistence:

  • _chat_helpers.py — new policy_alert branch in the SSE producer loop. _persist_policy_block_if_needed() writes Blocked: {tool} is restricted by policy. Multi-block joins tools and pluralizes. Disconnect path: _cleanup_stream() moved into finally:, persistence runs on each alert before close. The trickiest piece is the delete-and-replace flow: when the agent also returns a natural-language explanation, the persisted block message is deleted and replaced — that's how a single assistant message lands in DB either way. Guard: test_policy_alert_block_with_agent_explanation_does_not_duplicate_message.
  • models.py: AgentStepResponse carries decision / reason / ruleIds / policyVersion / receiptId.
  • Frontend — api.ts adds 'policy_alert' to AGENT_EVENT_TYPES. ChatView.tsx produces step + critical notification + 5.2s toast with "View receipt". AgentActivity.tsx renders FlowPolicyAlert (auto-expands when alerts are present). NotificationCenter.tsx adds the Policy filter chip and sets id="policy-receipt-${receiptId}" — that's the anchor the toast scrolls to.
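
The delete-and-replace flow in the first bullet can be pictured with a toy in-memory store. Everything here is hypothetical: `MessageStore`, `block_message`, `finalize_assistant_message`, and the plural wording are stand-ins, and the real logic lives in `_persist_policy_block_if_needed()` in `_chat_helpers.py`. Only the singular message text is quoted from the comment above.

```python
import itertools


def block_message(tools):
    """Format the persisted placeholder; plural wording is an assumption."""
    unique = list(dict.fromkeys(tools))  # dedupe, keep first-seen order
    if len(unique) == 1:
        return f"Blocked: {unique[0]} is restricted by policy."
    return f"Blocked: {', '.join(unique)} are restricted by policy."


class MessageStore:
    """Toy stand-in for the chat database."""

    def __init__(self):
        self.messages = {}
        self._ids = itertools.count(1)

    def add(self, content):
        message_id = next(self._ids)
        self.messages[message_id] = content
        return message_id

    def delete(self, message_id):
        self.messages.pop(message_id, None)


def finalize_assistant_message(store, block_id, agent_text):
    """Keep exactly one assistant message, whether or not the agent explained."""
    if not agent_text:
        return block_id            # policy-only stream: placeholder stays
    if block_id is not None:
        store.delete(block_id)     # explanation arrived: drop the placeholder
    return store.add(agent_text)


store = MessageStore()
block_id = store.add(block_message(["delete_file"]))  # persisted on the alert
final_id = finalize_assistant_message(
    store, block_id, "I can't delete that file because policy blocks it."
)
```

Persisting on each alert and replacing later means a disconnect before the explanation still leaves the block message in the store.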

Test coverage, honestly.

  • Integration: blocks / multi-block / disconnect / no-duplicate are covered (four named tests in test_chat_ui_integration.py).
  • Electron: mostly expect(content).toContain(...) against fs.readFileSync output. Proves the source strings exist; doesn't prove toast → View receipt → notification anchor scrolls correctly in a live DOM. RTL/Playwright is on my follow-up list, not in this PR.

Open question for you. ruleIds / policyVersion / receiptId are exposed in the UI. Intentional for local GAIA transparency — but if there's a tenant/enterprise posture where rule IDs should be redacted, I'd land that as a follow-up flag rather than blocking this PR. Your call.

Follow-up test I'd write next (not this PR): policy_alert emitted, then the stream errors mid-response — the persisted block message should still survive.

Labels

agents, dependencies (Dependency updates), devops (DevOps/infrastructure changes), documentation (Documentation changes), electron (Electron app changes), tests (Test changes)

Development

Successfully merging this pull request may close these issues.

Governance: surface BLOCK decisions to the user via a policy_alert SSE event

3 participants