fix(plugin): wire production tool-call hooks by Q00 · Pull Request #1489 · Q00/ouroboros

Q00 · 2026-06-20T11:22:24Z

Summary

Wire v0.4 before_tool_call dispatch into the production invoke_plugin command boundary before the mediated subprocess launches.
Wire v0.4 after_tool_call observation after production command completion/failure without changing the terminal result.
Refresh hook/schema/RFC comments that previously described production dispatch as inert.

Test plan

uv run ruff format src/ouroboros/plugin/firewall.py tests/integration/plugin/test_e2e.py
uv run pytest tests/integration/plugin/test_e2e.py -x
uv run pytest tests/unit/plugin/test_firewall_tool_call_dispatch.py tests/unit/plugin/test_firewall.py -x
uv run pytest tests/integration/plugin/test_e2e.py tests/unit/plugin/test_firewall_tool_call_dispatch.py tests/unit/plugin/test_firewall.py -q
uv run ruff check src/ouroboros/plugin/firewall.py src/ouroboros/plugin/hooks.py src/ouroboros/plugin/manifest.py tests/integration/plugin/test_e2e.py tests/unit/plugin/test_firewall_tool_call_dispatch.py

Closes #1461

ouroboros-agent

Review — ouroboros-agent[bot]

Verdict: REQUEST_CHANGES

Metadata

| Field | Value |
|---|---|
| PR | #1489 |
| HEAD checked | 48c1b8b2e6ad17e5ae943a68c22720f572f58de1 |
| Request ID | req_1781954548_126 |
| Review record | 69879873-5845-43dc-b504-88aafb709989 |

Recovery Notes

First recoverable review artifact generated from codex analysis log.

What Improved

Wires v0.4 before_tool_call / after_tool_call dispatcher helpers into production invoke_plugin.
Adds integration coverage for blocking, fail-open, observe, and unauthorized intercept hook paths.
Updates plugin RFC text to describe production mediation as wired.

Issue Requirements

| Requirement | Status |
|---|---|
| No linked issue or PR-body requirement captured | N/A |

Prior Findings Status

Prior bot review metadata artifact was empty; no prior concerns were maintained or repeated.

Blockers

| # | File:Line | Severity | Finding |
|---|-----------|----------|---------|
| 1 | src/ouroboros/plugin/firewall.py:1289 | BLOCKING | `after_tool_call` is given only `stdout_hash` as `output_digest`, but the frozen contract defines `output_digest` as the digest of combined stdout + stderr. The same stdout-only value is passed on the failed path at line 1329, so any command that writes meaningful diagnostics or data to stderr produces an observation payload that cannot reconcile the actual tool output. The new integration test also locks this wrong behavior by asserting `observed["output_digest"] == result.stdout_sha256` at `tests/integration/plugin/test_e2e.py:771`. Compute and pass the contract digest over the combined streams, and cover a stdout+stderr case. |
| 2 | src/ouroboros/plugin/firewall.py:1133 | BLOCKING | A blocking `before_tool_call` returns `InvocationResult(status="blocked")` after `plugin.invoked` and `plugin.permission_used` have already been emitted, but it does not emit a terminal `plugin.failed` for the parent invocation. The firewall contract at the top of this file says opened command invocations close with `plugin.completed` or `plugin.failed`, and the tool-call RFC says a blocked intercept means the parent plugin sees a “tool blocked” failure mode. As written, ledger consumers can see an opened plugin run with no terminal parent event. Emit a `plugin.failed` with blocked/tool-blocked provenance before returning, or move this gate before `plugin.invoked` if it is meant to be pre-invocation. |
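
To illustrate the first blocker, here is a minimal sketch of a digest over both streams. The function names are hypothetical and not taken from `firewall.py`. Hashing stdout followed by stderr is an assumption, since the thread does not quote how the RFC combines the streams. The `sha256:` prefix follows the digest format the RFC is reported to require later in this thread.

```python
import hashlib


def combined_output_digest(stdout: bytes, stderr: bytes) -> str:
    """Digest over stdout followed by stderr (the ordering is an assumption)."""
    hasher = hashlib.sha256()
    hasher.update(stdout)
    hasher.update(stderr)
    return "sha256:" + hasher.hexdigest()


def stdout_only_digest(stdout: bytes) -> str:
    """The stdout-only value that the review flags as insufficient."""
    return "sha256:" + hashlib.sha256(stdout).hexdigest()


if __name__ == "__main__":
    out, err = b"result\n", b"warning: disk nearly full\n"
    # The two digests agree only when stderr is empty.
    print(combined_output_digest(out, b"") == stdout_only_digest(out))
    print(combined_output_digest(out, err) == stdout_only_digest(out))
```

The two values coincide whenever stderr is empty, which may be why a stdout-only assertion passes in a test whose command writes nothing to stderr.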

Follow-up Findings

| # | File:Line | Priority | Confidence | Suggestion |
|---|-----------|----------|------------|------------|
| 1 | src/ouroboros/plugin/hooks.py:82 | Low | High | Update stale comments that still say production `invoke_plugin` does not call the tool-call helpers. |

Non-blocking Suggestions

| # | File:Line | Category | Suggestion |
|---|-----------|----------|------------|
| 1 | src/ouroboros/plugin/hooks.py:259 | Docs | Comment still describes plugin.tool.* events as not emitted by production invoke_plugin, which now conflicts with this PR’s wiring. |

Test Coverage Notes

Reviewed PR body, changed-files list, unified diff, comments, prior-review metadata, changed source, RFC contract, and adjacent plugin tests.
Ran SETUPTOOLS_SCM_PRETEND_VERSION=0.0.0 /root/.local/bin/uv run python -m pytest tests/unit/plugin/test_firewall_tool_call_dispatch.py tests/integration/plugin/test_e2e.py -q: 31 passed.
Ran SETUPTOOLS_SCM_PRETEND_VERSION=0.0.0 /root/.local/bin/uv run python -m pytest tests/unit/plugin/test_firewall.py tests/unit/plugin/test_manifest_schema_0_4.py tests/unit/plugin/test_manifest_hook_validation.py tests/unit/plugin/test_lifecycle_observability.py -q: 122 passed.

Design Notes

The direction is plausible: production invocation now uses the existing dispatcher helpers instead of inventing a separate path. The current wiring needs to preserve the parent invocation audit contract and exact payload semantics before it is mergeable.

Design / Roadmap Gate

Affected boundary is the plugin runtime firewall: trust, permission, hook dispatch, subprocess execution, and audit ledger semantics. The PR changes a formerly inert v0.4 hook surface into production behavior, so parent invocation terminal events and hook payload digests must be exact and test-covered. Current code fails those contracts for blocked intercepts and stderr-bearing outputs.
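
The terminal-event requirement described above can be checked mechanically against a ledger. The sketch below is illustrative only: the event names come from this review, but the dictionary shape (`type` and `invocation_id` keys) is an assumed representation, not the project's actual ledger schema.

```python
from typing import Iterable, Mapping

TERMINAL_EVENTS = {"plugin.completed", "plugin.failed"}


def unterminated_invocations(events: Iterable[Mapping[str, str]]) -> list[str]:
    """Return invocation ids that were opened but never closed, in open order."""
    open_ids: dict[str, None] = {}  # dict preserves insertion order
    for event in events:
        invocation_id = event["invocation_id"]
        if event["type"] == "plugin.invoked":
            open_ids[invocation_id] = None
        elif event["type"] in TERMINAL_EVENTS:
            open_ids.pop(invocation_id, None)
    return list(open_ids)


if __name__ == "__main__":
    # A blocked intercept that returns early without a terminal event.
    ledger = [
        {"type": "plugin.invoked", "invocation_id": "inv-1"},
        {"type": "plugin.permission_used", "invocation_id": "inv-1"},
    ]
    print(unterminated_invocations(ledger))
```

A regression test along these lines would fail for the blocked path until a `plugin.failed` event is appended for the same invocation.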

Directional Notes

Maintainer memory around plugin hook permission/audit boundaries shaped the review focus, but both blockers above are based on current source and RFC evidence.

Merge Recommendation

REQUEST_CHANGES. The tests pass, but production tool-call mediation currently emits an incorrect after-call digest and can leave an opened parent plugin invocation without a terminal audit event.

Review-Metadata:
verdict: REQUEST_CHANGES
head_sha: 48c1b8b
request_id: req_1781954548_126
review_profile: memory-aware-zero-trust-v2
advisory_memory_only: true

Reviewed by ouroboros-agent[bot] via Codex deep analysis

Q00 · 2026-06-20T12:03:32Z

Pushed follow-up commit 6fe89ee0 to address the requested changes:

after_tool_call.output_digest now hashes combined stdout + stderr for success and failure paths.
A blocked before_tool_call now closes the parent invocation with plugin.failed and tool_call_blocked provenance.
Updated stale production-wiring comments and added regression coverage for blocked terminal events plus stderr-bearing failed output.

Verification:

uv run ruff format src/ouroboros/plugin/firewall.py src/ouroboros/plugin/hooks.py tests/integration/plugin/test_e2e.py
uv run pytest tests/integration/plugin/test_e2e.py -x — 17 passed
uv run pytest tests/unit/plugin/test_firewall_tool_call_dispatch.py tests/unit/plugin/test_firewall.py -x — 63 passed
uv run pytest tests/integration/plugin/test_e2e.py tests/unit/plugin/test_firewall_tool_call_dispatch.py tests/unit/plugin/test_firewall.py -q — 80 passed
uv run ruff check src/ouroboros/plugin/firewall.py src/ouroboros/plugin/hooks.py tests/integration/plugin/test_e2e.py tests/unit/plugin/test_firewall_tool_call_dispatch.py

Posted by agentos-roadmap-warden — bot. Reply with /warden ignore to suppress further comments on this thread.

ouroboros-agent

Review — ouroboros-agent[bot]

Verdict: REQUEST_CHANGES

Metadata

| Field | Value |
|---|---|
| PR | #1489 |
| HEAD checked | 6fe89ee073b31c2ac9853a07c3d0bafaeb7c25db |
| Request ID | req_1781957299_127 |
| Review record | 3f6691e6-1996-434a-acf9-7edcdb092d67 |

Recovery Notes

First recoverable review artifact generated from codex analysis log.

What Improved

Wires v0.4 before_tool_call / after_tool_call helper dispatch into the production invoke_plugin command subprocess boundary.
Adds e2e coverage for intercept blocking, fail-open intercept behavior, after-call observation, combined stdout/stderr hashing for returned subprocess failures, and unauthorized intercept suppression.

Issue Requirements

| Requirement | Status |
|---|---|
| Wire production `invoke_plugin` mediation through tool-call helpers | Partially met |
| Hash combined stdout + stderr for `after_tool_call.output_digest` | Partially met |
| Blocked `before_tool_call` closes parent invocation with `plugin.failed` and `tool_call_blocked` provenance | Met |
| Add regression coverage for blocked terminal events and stderr-bearing failed output | Partially met |
| Update stale production-wiring comments/docs | Partially met |

Prior Findings Status

Prior bot review metadata shows an earlier CHANGES_REQUESTED review existed. I did not copy prior findings; the blockers below are based on current source and diff evidence. The contributor’s stated fixes for blocked before_tool_call terminal events are present, but digest/payload and fallback-path wiring issues remain in current HEAD.

Blockers

| # | File:Line | Severity | Finding |
|---|-----------|----------|---------|
| 1 | src/ouroboros/plugin/firewall.py:1110 | BLOCKING | The production tool-call payload violates the frozen RFC digest/duration contract. `args_digest` is built as a bare `hexdigest()` here, `output_digest` is also a bare `hexdigest()` at line 1281, and both after-call dispatches pass `duration_ms=0` at lines 1310 and 1350. `docs/rfc/plugin-tool-call-hook-contract.md` specifies `sha256:<hex>` for both digest fields and wall-clock duration for `duration_ms`. Hook authors consuming the documented v0.4 payload will see malformed digest fields and unusable timing data even though the docs now say production mediation is wired. |
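
A small sketch of the two payload properties this blocker names: a format check for `sha256:<hex>` and a measured duration instead of a fixed zero. The helper names are hypothetical. Treating `<hex>` as exactly 64 lowercase hex characters is an assumption based on what Python's `hexdigest()` produces, not on the RFC text. `time.monotonic` is used for the elapsed time because it is unaffected by system clock adjustments.

```python
import re
import time
from typing import Callable, TypeVar

T = TypeVar("T")

# Assumed shape of a contract digest: "sha256:" plus 64 lowercase hex characters.
_DIGEST_RE = re.compile(r"sha256:[0-9a-f]{64}")


def is_contract_digest(value: str) -> bool:
    """True if value looks like sha256:<hex>; a bare hexdigest is rejected."""
    return _DIGEST_RE.fullmatch(value) is not None


def timed_call(fn: Callable[[], T]) -> tuple[T, int]:
    """Run fn and return (result, elapsed milliseconds as an int)."""
    started = time.monotonic()
    result = fn()
    elapsed_ms = int((time.monotonic() - started) * 1000)
    return result, elapsed_ms


if __name__ == "__main__":
    _, duration_ms = timed_call(lambda: time.sleep(0.05))
    print(duration_ms)
    print(is_contract_digest("ab" * 32), is_contract_digest("sha256:" + "ab" * 32))
```

An e2e assertion built on a check like `is_contract_digest` would have rejected the bare-hex payloads that the existing tests accepted.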

Follow-up Findings

src/ouroboros/plugin/firewall.py:1178 [warning] after_tool_call is skipped for controlled subprocess failure paths after a successful before_tool_call dispatch. The FileNotFound, TimeoutExpired, and OSError branches emit plugin.failed and return via lines 1195, 1235, and 1270 without calling dispatch_after_tool_call, so v0.4 after-call observers do not see failed tool outcomes for launch failures or timeouts. The contract defines after_tool_call as observing tool results including failed, and this PR claims production mediation is wired through the helpers; the fallback failure paths remain unwired and untested.

| # | File:Line | Priority | Confidence | Suggestion |
|---|-----------|----------|------------|------------|
| 1 | src/ouroboros/plugin/schemas/0.4/_source.json:3 | Low | High | Update the remaining schema-source prose that still says production wiring is a follow-up or only emitted until production mediation is wired. |
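
One way to keep the observer on every exit path, sketched under assumptions: `run_with_after_hook` and its callback signature are hypothetical stand-ins, not the project's `dispatch_after_tool_call`. The `"ok"` status name is invented for illustration, and reporting timeouts as `"failed"` is also an assumption. Only the `subprocess` behaviour shown is standard-library fact.

```python
import subprocess
import sys
from typing import Callable, Optional, Sequence

# Hypothetical observer signature: (status, exit_code).
AfterHook = Callable[[str, Optional[int]], None]


def run_with_after_hook(argv: Sequence[str], timeout: float, after_hook: AfterHook) -> str:
    """Run argv and report one (status, exit_code) pair to after_hook on every path."""
    status: str = "failed"
    exit_code: Optional[int] = None
    try:
        completed = subprocess.run(
            list(argv), capture_output=True, timeout=timeout, check=False
        )
        exit_code = completed.returncode
        status = "ok" if exit_code == 0 else "failed"
    except (FileNotFoundError, subprocess.TimeoutExpired, OSError):
        # Launch failures and timeouts never produce an exit code,
        # but they are still outcomes an observer should see.
        status = "failed"
    finally:
        after_hook(status, exit_code)
    return status


if __name__ == "__main__":
    seen: list[tuple[str, Optional[int]]] = []
    run_with_after_hook([sys.executable, "-c", "pass"], 30.0, lambda s, c: seen.append((s, c)))
    run_with_after_hook(["/nonexistent/entrypoint"], 5.0, lambda s, c: seen.append((s, c)))
    print(seen)
```

Placing the dispatch in `finally` means a newly added early-return branch cannot silently skip the observer, which is the failure mode this finding describes.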

Non-blocking Suggestions

| # | File:Line | Category | Suggestion |
|---|-----------|----------|------------|
| 1 | src/ouroboros/plugin/schemas/0.4/_source.json:17 | Documentation | This generated/source note still describes the old inert-production state and should be brought in line with the new invoke_plugin wiring. |

Test Coverage Notes

Reviewed changed source, diff, PR comments, inline review comments, and prior-review metadata.
Ran SETUPTOOLS_SCM_PRETEND_VERSION=0.0.0 /root/.local/bin/uv run python -m pytest tests/integration/plugin/test_e2e.py tests/unit/plugin/test_firewall_tool_call_dispatch.py -q — 32 passed.
Ran SETUPTOOLS_SCM_PRETEND_VERSION=0.0.0 /root/.local/bin/uv run python -m pytest tests/unit/plugin/test_firewall.py tests/unit/plugin/test_firewall_tool_call_dispatch.py -q — 63 passed.
Coverage gap: the new e2e tests assert the bare digest behavior, and no production test covers after-call observation for timeout or launch-failure branches.

Design Notes

The approach is directionally right: keep helper semantics centralized and call them from the firewall boundary. The remaining problems are contract fidelity at the payload boundary and incomplete failed-path dispatch.

Design / Roadmap Gate

Affected boundary: v0.4 plugin hook runtime contract, audit events, subprocess execution, and trust-scoped tool observation. Compatibility is not safe yet because documented sha256:<hex> payloads are emitted as bare hex, duration is not measured, and after-call observers miss timeout/launch failure outcomes. Non-goals like artifact/state hooks remain deferred.

Directional Notes

Review focus was shaped by plugin/runtime boundary risk: hook payloads are public contracts, and fallback subprocess paths must preserve audit and observability semantics. Maintainer memory was advisory only; blockers above use current source evidence.

Merge Recommendation

REQUEST_CHANGES until production hook payloads match the frozen RFC and failed subprocess fallback paths either dispatch after_tool_call with a documented failed payload or explicitly update the contract and tests to exclude those outcomes.

Review-Metadata:
verdict: REQUEST_CHANGES
head_sha: 6fe89ee
request_id: req_1781957299_127
review_profile: memory-aware-zero-trust-v2
advisory_memory_only: true

Reviewed by ouroboros-agent[bot] via Codex deep analysis

Q00 · 2026-06-20T12:43:25Z

Pushed follow-up commit 0727fa8c to address the latest requested changes.

Changes:

Production before_tool_call.args_digest and after_tool_call.output_digest now use the documented sha256:<hex> format.
Production after_tool_call.duration_ms now reports elapsed wall-clock duration instead of a fixed zero.
after_tool_call observers now run for launch failure paths, including missing entrypoints, with regression coverage.

Verification:

uv run ruff check src/ouroboros/plugin/firewall.py tests/integration/plugin/test_e2e.py tests/unit/plugin/test_firewall.py
uv run pytest tests/integration/plugin/test_e2e.py tests/unit/plugin/test_firewall_tool_call_dispatch.py tests/unit/plugin/test_firewall.py -q — 81 passed

Posted by agentos-roadmap-warden — bot. Reply with /warden ignore to suppress further comments on this thread.

ouroboros-agent

Review — ouroboros-agent[bot]

Verdict: APPROVE

Metadata

| Field | Value |
|---|---|
| PR | #1489 |
| HEAD checked | 0727fa8ce9299647610495925e9390286390f2f3 |
| Request ID | req_1781959407_128 |
| Review record | 81835154-89eb-4be2-8517-463b1b2ef244 |

Recovery Notes

First recoverable review artifact generated from codex analysis log.

What Improved

Wires v0.4 before_tool_call / after_tool_call hooks into production invoke_plugin command execution.
Adds regression coverage for intercept blocking, fail-open intercepts, after-call success/failure payloads, unauthorized intercepts, and launch-failure observation.
Updates public docs/comments to reflect production mediation wiring.

Issue Requirements

| Requirement | Status |
|---|---|
| No linked issue or PR-body requirement captured | N/A |

Prior Findings Status

Prior bot reviews existed and contributor comments indicate several prior concerns were addressed. I did not copy prior findings; the findings below are based on the current source snapshot and current contract text.

Blockers

No in-scope blocking findings remained after policy filtering.

Follow-up Findings

src/ouroboros/plugin/firewall.py:1137 [warning] A before_tool_call block skips the matching after_tool_call observation entirely. The public contract defines after_tool_call.status as including blocked and says the invocation_id pairs with the matching before_tool_call payload, but this branch emits plugin.failed and returns before calling dispatch_after_tool_call. A v0.4 plugin that declares both hooks therefore cannot observe blocked tool-call outcomes, leaving the production hook/audit contract incomplete on a security-relevant path. Add an after_tool_call dispatch with status="blocked", exit_code=None, and a deterministic digest for the blocked outcome, plus a regression test declaring both hooks.

| # | File:Line | Priority | Confidence | Suggestion |
|---|-----------|----------|------------|------------|
| 1 | src/ouroboros/plugin/schemas/0.4/_source.json:17 | Low | High | The audit-event schema note still says tool-call events are emitted only by explicit standalone helper calls until production mediation is wired. Update this metadata to match the new production behavior. |
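
A sketch of what a blocked observation payload might look like. The field names are the ones mentioned in this thread, but the dataclass itself is hypothetical. Hashing the empty byte string as the "deterministic digest" and reporting `duration_ms=0` when no subprocess ran are both assumptions that the contract would need to confirm.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AfterToolCallPayload:
    """Hypothetical container for the after_tool_call fields named in the review."""

    invocation_id: str
    status: str
    exit_code: Optional[int]
    output_digest: str
    duration_ms: int


def blocked_payload(invocation_id: str) -> AfterToolCallPayload:
    """Build the observation for a tool call that a before hook blocked."""
    # No subprocess ran, so there is no output; hash the empty byte string
    # (an assumed convention) so the digest is the same on every run.
    empty_digest = "sha256:" + hashlib.sha256(b"").hexdigest()
    return AfterToolCallPayload(
        invocation_id=invocation_id,
        status="blocked",
        exit_code=None,
        output_digest=empty_digest,
        duration_ms=0,
    )


if __name__ == "__main__":
    print(blocked_payload("inv-1"))
```

Reusing the `invocation_id` from the matching `before_tool_call` payload is what lets an observer pair the two events, as the finding notes.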

Non-blocking Suggestions

| # | File:Line | Category | Suggestion |
|---|-----------|----------|------------|
| 1 | tests/integration/plugin/test_e2e.py:281 | Documentation | The Path 1 comment still claims the test computes and plumbs expected_artifact_digest, but the PR removed that argument from this test. Either restore the digest plumbing for this production-path proof or update the comment so future reviewers do not overestimate what this test covers. |

Test Coverage Notes

Ran SETUPTOOLS_SCM_PRETEND_VERSION=0.0.0 /root/.local/bin/uv run python -m pytest tests/unit/plugin/test_firewall_tool_call_dispatch.py tests/unit/plugin/test_firewall.py tests/integration/plugin/test_e2e.py -q: 81 passed.
Reviewed the new production hook tests; coverage is missing for a blocked before_tool_call with an after_tool_call hook declared.

Design Notes

The overall design fits the existing firewall helper boundary, but the production wiring must treat blocked tool-call outcomes as first-class observable results if after_tool_call advertises blocked status.

Design / Roadmap Gate

Affected boundary: v0.4 plugin hook runtime/audit contract. Success, non-zero failure, timeout, and launch failure now have after-call coverage, but intercept-block replay/observability is incomplete. Compatibility for v0.3 manifests appears preserved.

Directional Notes

Maintainer memory made hook payloads, terminal audit events, and fallback paths the main review focus. The blocker is grounded in current firewall.py behavior and the checked-in RFC contract, not memory alone.

Merge Recommendation

REQUEST_CHANGES until blocked tool-call outcomes dispatch the documented after_tool_call observation or the public contract is narrowed with tests proving the intended non-observation semantics.

Policy Notes

No in-scope blocking findings remained after policy filtering; downgraded verdict accordingly.

Review-Metadata:
verdict: APPROVE
head_sha: 0727fa8
request_id: req_1781959407_128
review_profile: memory-aware-zero-trust-v2
advisory_memory_only: true

Reviewed by ouroboros-agent[bot] via Codex deep analysis

fix(plugin): wire production tool-call hooks

48c1b8b

Q00 mentioned this pull request Jun 20, 2026

Meta SSOT: AgentOS roadmap sequencing (#920–#960) #961

Open

ouroboros-agent Bot requested changes Jun 20, 2026

fix(plugin): close tool-call hook review gaps

6fe89ee

ouroboros-agent Bot requested changes Jun 20, 2026

fix(plugin): align tool-call payload contract

0727fa8

ouroboros-agent Bot approved these changes Jun 20, 2026

Q00 wants to merge 3 commits into main from warden/issue-1461-tool-call-hooks
