Skip to content

fix(plugin): wire production tool-call hooks#1489

Open
Q00 wants to merge 3 commits into
mainfrom
warden/issue-1461-tool-call-hooks
Open

fix(plugin): wire production tool-call hooks#1489
Q00 wants to merge 3 commits into
mainfrom
warden/issue-1461-tool-call-hooks

Conversation

@Q00

@Q00 Q00 commented Jun 20, 2026

Copy link
Copy Markdown
Owner

Summary

  • Wire v0.4 before_tool_call dispatch into the production invoke_plugin command boundary before the mediated subprocess launches.
  • Wire v0.4 after_tool_call observation after production command completion/failure without changing the terminal result.
  • Refresh hook/schema/RFC comments that previously described production dispatch as inert.

Test plan

  • uv run ruff format src/ouroboros/plugin/firewall.py tests/integration/plugin/test_e2e.py
  • uv run pytest tests/integration/plugin/test_e2e.py -x
  • uv run pytest tests/unit/plugin/test_firewall_tool_call_dispatch.py tests/unit/plugin/test_firewall.py -x
  • uv run pytest tests/integration/plugin/test_e2e.py tests/unit/plugin/test_firewall_tool_call_dispatch.py tests/unit/plugin/test_firewall.py -q
  • uv run ruff check src/ouroboros/plugin/firewall.py src/ouroboros/plugin/hooks.py src/ouroboros/plugin/manifest.py tests/integration/plugin/test_e2e.py tests/unit/plugin/test_firewall_tool_call_dispatch.py

Closes #1461

@ouroboros-agent ouroboros-agent Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review — ouroboros-agent[bot]

Verdict: REQUEST_CHANGES

Metadata

| Field | Value |
|### Recovery Notes
First recoverable review artifact generated from codex analysis log.

---|---|
| PR | #1489 |
| HEAD checked | 48c1b8b2e6ad17e5ae943a68c22720f572f58de1 |
| Request ID | req_1781954548_126 |
| Review record | 69879873-5845-43dc-b504-88aafb709989 |

What Improved

  • Wires v0.4 before_tool_call / after_tool_call dispatcher helpers into production invoke_plugin.
  • Adds integration coverage for blocking, fail-open, observe, and unauthorized intercept hook paths.
  • Updates plugin RFC text to describe production mediation as wired.

Issue Requirements

Requirement Status
No linked issue or PR-body requirement captured N/A

Prior Findings Status

Prior bot review metadata artifact was empty; no prior concerns were maintained or repeated.

Blockers

# File:Line Severity Finding
1 src/ouroboros/plugin/firewall.py:1289 BLOCKING after_tool_call is given only stdout_hash as output_digest, but the frozen contract defines output_digest as the digest of combined stdout + stderr. The same stdout-only value is passed on the failed path at line 1329, so any command that writes meaningful diagnostics or data to stderr produces an observation payload that cannot reconcile the actual tool output. The new integration test also locks this wrong behavior by asserting observed["output_digest"] == result.stdout_sha256 at tests/integration/plugin/test_e2e.py:771. Compute and pass the contract digest over the combined streams, and cover a stdout+stderr case.
2 src/ouroboros/plugin/firewall.py:1133 BLOCKING A blocking before_tool_call returns InvocationResult(status="blocked") after plugin.invoked and plugin.permission_used have already been emitted, but it does not emit a terminal plugin.failed for the parent invocation. The firewall contract at the top of this file says opened command invocations close with plugin.completed or plugin.failed, and the tool-call RFC says a blocked intercept means the parent plugin sees a “tool blocked” failure mode. As written, ledger consumers can see an opened plugin run with no terminal parent event. Emit a plugin.failed with blocked/tool-blocked provenance before returning, or move this gate before plugin.invoked if it is meant to be pre-invocation.

Follow-up Findings

# File:Line Priority Confidence Suggestion
1 src/ouroboros/plugin/hooks.py:82 Low High Update stale comments that still say production invoke_plugin does not call the tool-call helpers.

Non-blocking Suggestions

| 1 | src/ouroboros/plugin/hooks.py:259 | Docs | Comment still describes plugin.tool.* events as not emitted by production invoke_plugin, which now conflicts with this PR’s wiring. |

Test Coverage Notes

  • Reviewed PR body, changed-files list, unified diff, comments, prior-review metadata, changed source, RFC contract, and adjacent plugin tests.
  • Ran SETUPTOOLS_SCM_PRETEND_VERSION=0.0.0 /root/.local/bin/uv run python -m pytest tests/unit/plugin/test_firewall_tool_call_dispatch.py tests/integration/plugin/test_e2e.py -q: 31 passed.
  • Ran SETUPTOOLS_SCM_PRETEND_VERSION=0.0.0 /root/.local/bin/uv run python -m pytest tests/unit/plugin/test_firewall.py tests/unit/plugin/test_manifest_schema_0_4.py tests/unit/plugin/test_manifest_hook_validation.py tests/unit/plugin/test_lifecycle_observability.py -q: 122 passed.

Design Notes

The direction is plausible: production invocation now uses the existing dispatcher helpers instead of inventing a separate path. The current wiring needs to preserve the parent invocation audit contract and exact payload semantics before it is mergeable.

Design / Roadmap Gate

Affected boundary is the plugin runtime firewall: trust, permission, hook dispatch, subprocess execution, and audit ledger semantics. The PR changes a formerly inert v0.4 hook surface into production behavior, so parent invocation terminal events and hook payload digests must be exact and test-covered. Current code fails those contracts for blocked intercepts and stderr-bearing outputs.

Directional Notes

Maintainer memory around plugin hook permission/audit boundaries shaped the review focus, but both blockers above are based on current source and RFC evidence.

Test Coverage

  • Reviewed PR body, changed-files list, unified diff, comments, prior-review metadata, changed source, RFC contract, and adjacent plugin tests.
  • Ran SETUPTOOLS_SCM_PRETEND_VERSION=0.0.0 /root/.local/bin/uv run python -m pytest tests/unit/plugin/test_firewall_tool_call_dispatch.py tests/integration/plugin/test_e2e.py -q: 31 passed.
  • Ran SETUPTOOLS_SCM_PRETEND_VERSION=0.0.0 /root/.local/bin/uv run python -m pytest tests/unit/plugin/test_firewall.py tests/unit/plugin/test_manifest_schema_0_4.py tests/unit/plugin/test_manifest_hook_validation.py tests/unit/plugin/test_lifecycle_observability.py -q: 122 passed.

Merge Recommendation

REQUEST_CHANGES. The tests pass, but production tool-call mediation currently emits an incorrect after-call digest and can leave an opened parent plugin invocation without a terminal audit event.

Review-Metadata:
verdict: REQUEST_CHANGES
head_sha: 48c1b8b
request_id: req_1781954548_126
review_profile: memory-aware-zero-trust-v2
advisory_memory_only: true


Reviewed by ouroboros-agent[bot] via Codex deep analysis

@Q00

Q00 commented Jun 20, 2026

Copy link
Copy Markdown
Owner Author

Pushed follow-up commit 6fe89ee0 to address the requested changes:

  • after_tool_call.output_digest now hashes combined stdout + stderr for success and failure paths.
  • A blocked before_tool_call now closes the parent invocation with plugin.failed and tool_call_blocked provenance.
  • Updated stale production-wiring comments and added regression coverage for blocked terminal events plus stderr-bearing failed output.

Verification:

  • uv run ruff format src/ouroboros/plugin/firewall.py src/ouroboros/plugin/hooks.py tests/integration/plugin/test_e2e.py
  • uv run pytest tests/integration/plugin/test_e2e.py -x — 17 passed
  • uv run pytest tests/unit/plugin/test_firewall_tool_call_dispatch.py tests/unit/plugin/test_firewall.py -x — 63 passed
  • uv run pytest tests/integration/plugin/test_e2e.py tests/unit/plugin/test_firewall_tool_call_dispatch.py tests/unit/plugin/test_firewall.py -q — 80 passed
  • uv run ruff check src/ouroboros/plugin/firewall.py src/ouroboros/plugin/hooks.py tests/integration/plugin/test_e2e.py tests/unit/plugin/test_firewall_tool_call_dispatch.py

Posted by agentos-roadmap-warden — bot. Reply with /warden ignore to suppress further comments on this thread.

@ouroboros-agent ouroboros-agent Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review — ouroboros-agent[bot]

Verdict: REQUEST_CHANGES

Metadata

| Field | Value |
|### Recovery Notes
First recoverable review artifact generated from codex analysis log.

---|---|
| PR | #1489 |
| HEAD checked | 6fe89ee073b31c2ac9853a07c3d0bafaeb7c25db |
| Request ID | req_1781957299_127 |
| Review record | 3f6691e6-1996-434a-acf9-7edcdb092d67 |

What Improved

  • Wires v0.4 before_tool_call / after_tool_call helper dispatch into the production invoke_plugin command subprocess boundary.
  • Adds e2e coverage for intercept blocking, fail-open intercept behavior, after-call observation, combined stdout/stderr hashing for returned subprocess failures, and unauthorized intercept suppression.

Issue Requirements

Requirement Status
Wire production invoke_plugin mediation through tool-call helpers Partially met
Hash combined stdout + stderr for after_tool_call.output_digest Partially met
Blocked before_tool_call closes parent invocation with plugin.failed and tool_call_blocked provenance Met
Add regression coverage for blocked terminal events and stderr-bearing failed output Partially met
Update stale production-wiring comments/docs Partially met

Prior Findings Status

Prior bot review metadata shows an earlier CHANGES_REQUESTED review existed. I did not copy prior findings; the blockers above are based on current source and diff evidence. The contributor’s stated fixes for blocked before_tool_call terminal events are present, but digest/payload and fallback-path wiring issues remain in current HEAD.

Blockers

# File:Line Severity Finding
1 src/ouroboros/plugin/firewall.py:1110 BLOCKING The production tool-call payload violates the frozen RFC digest/duration contract. args_digest is built as a bare hexdigest() here, output_digest is also a bare hexdigest() at line 1281, and both after-call dispatches pass duration_ms=0 at lines 1310 and 1350. docs/rfc/plugin-tool-call-hook-contract.md specifies sha256:<hex> for both digest fields and wall-clock duration for duration_ms. Hook authors consuming the documented v0.4 payload will see malformed digest fields and unusable timing data even though the docs now say production mediation is wired.

Follow-up Findings

  • src/ouroboros/plugin/firewall.py:1178 [warning] after_tool_call is skipped for controlled subprocess failure paths after a successful before_tool_call dispatch. The FileNotFound, TimeoutExpired, and OSError branches emit plugin.failed and return via lines 1195, 1235, and 1270 without calling dispatch_after_tool_call, so v0.4 after-call observers do not see failed tool outcomes for launch failures or timeouts. The contract defines after_tool_call as observing tool results including failed, and this PR claims production mediation is wired through the helpers; the fallback failure paths remain unwired and untested.
    | # | File:Line | Priority | Confidence | Suggestion |
    |---|-----------|----------|------------|------------|
    | 1 | src/ouroboros/plugin/schemas/0.4/_source.json:3 | Low | High | Update the remaining schema-source prose that still says production wiring is a follow-up or only emitted until production mediation is wired. |

Non-blocking Suggestions

| 1 | src/ouroboros/plugin/schemas/0.4/_source.json:17 | Documentation | This generated/source note still describes the old inert-production state and should be brought in line with the new invoke_plugin wiring. |

Test Coverage Notes

  • Reviewed changed source, diff, PR comments, inline review comments, and prior-review metadata.
  • Ran SETUPTOOLS_SCM_PRETEND_VERSION=0.0.0 /root/.local/bin/uv run python -m pytest tests/integration/plugin/test_e2e.py tests/unit/plugin/test_firewall_tool_call_dispatch.py -q — 32 passed.
  • Ran SETUPTOOLS_SCM_PRETEND_VERSION=0.0.0 /root/.local/bin/uv run python -m pytest tests/unit/plugin/test_firewall.py tests/unit/plugin/test_firewall_tool_call_dispatch.py -q — 63 passed.
  • Coverage gap: the new e2e tests assert the bare digest behavior, and no production test covers after-call observation for timeout or launch-failure branches.

Design Notes

The approach is directionally right: keep helper semantics centralized and call them from the firewall boundary. The remaining problems are contract fidelity at the payload boundary and incomplete failed-path dispatch.

Design / Roadmap Gate

Affected boundary: v0.4 plugin hook runtime contract, audit events, subprocess execution, and trust-scoped tool observation. Compatibility is not safe yet because documented sha256:<hex> payloads are emitted as bare hex, duration is not measured, and after-call observers miss timeout/launch failure outcomes. Non-goals like artifact/state hooks remain deferred.

Directional Notes

Review focus was shaped by plugin/runtime boundary risk: hook payloads are public contracts, and fallback subprocess paths must preserve audit and observability semantics. Maintainer memory was advisory only; blockers above use current source evidence.

Test Coverage

  • Reviewed changed source, diff, PR comments, inline review comments, and prior-review metadata.
  • Ran SETUPTOOLS_SCM_PRETEND_VERSION=0.0.0 /root/.local/bin/uv run python -m pytest tests/integration/plugin/test_e2e.py tests/unit/plugin/test_firewall_tool_call_dispatch.py -q — 32 passed.
  • Ran SETUPTOOLS_SCM_PRETEND_VERSION=0.0.0 /root/.local/bin/uv run python -m pytest tests/unit/plugin/test_firewall.py tests/unit/plugin/test_firewall_tool_call_dispatch.py -q — 63 passed.
  • Coverage gap: the new e2e tests assert the bare digest behavior, and no production test covers after-call observation for timeout or launch-failure branches.

Merge Recommendation

REQUEST_CHANGES until production hook payloads match the frozen RFC and failed subprocess fallback paths either dispatch after_tool_call with a documented failed payload or explicitly update the contract and tests to exclude those outcomes.

Review-Metadata:
verdict: REQUEST_CHANGES
head_sha: 6fe89ee
request_id: req_1781957299_127
review_profile: memory-aware-zero-trust-v2
advisory_memory_only: true


Reviewed by ouroboros-agent[bot] via Codex deep analysis

@Q00

Q00 commented Jun 20, 2026

Copy link
Copy Markdown
Owner Author

Pushed follow-up commit 0727fa8c to address the latest requested changes.

Changes:

  • Production before_tool_call.args_digest and after_tool_call.output_digest now use the documented sha256:<hex> format.
  • Production after_tool_call.duration_ms now reports elapsed wall-clock duration instead of a fixed zero.
  • after_tool_call observers now run for launch failure paths, including missing entrypoints, with regression coverage.

Verification:

  • uv run ruff check src/ouroboros/plugin/firewall.py tests/integration/plugin/test_e2e.py tests/unit/plugin/test_firewall.py
  • uv run pytest tests/integration/plugin/test_e2e.py tests/unit/plugin/test_firewall_tool_call_dispatch.py tests/unit/plugin/test_firewall.py -q — 81 passed

Posted by agentos-roadmap-warden — bot. Reply with /warden ignore to suppress further comments on this thread.

@ouroboros-agent ouroboros-agent Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review — ouroboros-agent[bot]

Verdict: APPROVE

Metadata

| Field | Value |
|### Recovery Notes
First recoverable review artifact generated from codex analysis log.

---|---|
| PR | #1489 |
| HEAD checked | 0727fa8ce9299647610495925e9390286390f2f3 |
| Request ID | req_1781959407_128 |
| Review record | 81835154-89eb-4be2-8517-463b1b2ef244 |

What Improved

  • Wires v0.4 before_tool_call / after_tool_call hooks into production invoke_plugin command execution.
  • Adds regression coverage for intercept blocking, fail-open intercepts, after-call success/failure payloads, unauthorized intercepts, and launch-failure observation.
  • Updates public docs/comments to reflect production mediation wiring.

Issue Requirements

Requirement Status
No linked issue or PR-body requirement captured N/A

Prior Findings Status

Prior bot reviews existed and contributor comments indicate several prior concerns were addressed. I did not copy prior findings; the blocker above is based on the current source snapshot and current contract text.

Blockers

No in-scope blocking findings remained after policy filtering.

Follow-up Findings

  • src/ouroboros/plugin/firewall.py:1137 [warning] A before_tool_call block skips the matching after_tool_call observation entirely. The public contract defines after_tool_call.status as including blocked and says the invocation_id pairs with the matching before_tool_call payload, but this branch emits plugin.failed and returns before calling dispatch_after_tool_call. A v0.4 plugin that declares both hooks therefore cannot observe blocked tool-call outcomes, leaving the production hook/audit contract incomplete on a security-relevant path. Add an after_tool_call dispatch with status="blocked", exit_code=None, and a deterministic digest for the blocked outcome, plus a regression test declaring both hooks.
    | # | File:Line | Priority | Confidence | Suggestion |
    |---|-----------|----------|------------|------------|
    | 1 | src/ouroboros/plugin/schemas/0.4/_source.json:17 | Low | High | The audit-event schema note still says tool-call events are emitted only by explicit standalone helper calls until production mediation is wired. Update this metadata to match the new production behavior. |

Non-blocking Suggestions

| 1 | tests/integration/plugin/test_e2e.py:281 | Documentation | The Path 1 comment still claims the test computes and plumbs expected_artifact_digest, but the PR removed that argument from this test. Either restore the digest plumbing for this production-path proof or update the comment so future reviewers do not overestimate what this test covers. |

Test Coverage Notes

  • Ran SETUPTOOLS_SCM_PRETEND_VERSION=0.0.0 /root/.local/bin/uv run python -m pytest tests/unit/plugin/test_firewall_tool_call_dispatch.py tests/unit/plugin/test_firewall.py tests/integration/plugin/test_e2e.py -q: 81 passed.
  • Reviewed the new production hook tests; coverage is missing for a blocked before_tool_call with an after_tool_call hook declared.

Design Notes

The overall design fits the existing firewall helper boundary, but the production wiring must treat blocked tool-call outcomes as first-class observable results if after_tool_call advertises blocked status.

Design / Roadmap Gate

Affected boundary: v0.4 plugin hook runtime/audit contract. Success, non-zero failure, timeout, and launch failure now have after-call coverage, but intercept-block replay/observability is incomplete. Compatibility for v0.3 manifests appears preserved.

Directional Notes

Maintainer memory made hook payloads, terminal audit events, and fallback paths the main review focus. The blocker is grounded in current firewall.py behavior and the checked-in RFC contract, not memory alone.

Test Coverage

  • Ran SETUPTOOLS_SCM_PRETEND_VERSION=0.0.0 /root/.local/bin/uv run python -m pytest tests/unit/plugin/test_firewall_tool_call_dispatch.py tests/unit/plugin/test_firewall.py tests/integration/plugin/test_e2e.py -q: 81 passed.
  • Reviewed the new production hook tests; coverage is missing for a blocked before_tool_call with an after_tool_call hook declared.

Merge Recommendation

REQUEST_CHANGES until blocked tool-call outcomes dispatch the documented after_tool_call observation or the public contract is narrowed with tests proving the intended non-observation semantics.

Policy Notes

  • No in-scope blocking findings remained after policy filtering; downgraded verdict accordingly.

Review-Metadata:
verdict: APPROVE
head_sha: 0727fa8
request_id: req_1781959407_128
review_profile: memory-aware-zero-trust-v2
advisory_memory_only: true


Reviewed by ouroboros-agent[bot] via Codex deep analysis

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

plugin: wire tool-call hooks into production tool mediation

1 participant