Fix gateway tool output visibility and timing by henrypark133 · Pull Request #2555 · nearai/ironclaw

henrypark133 · 2026-04-16T23:27:38Z

Summary

fix gateway tool activity cards to correlate live tool events by call_id and show actual tool output when available
use the existing persisted result field for expanded history cards without adding new history storage, and bound active-thread history results to keep refreshes fast
thread call_id and live duration_ms through the web event surface, preserve engine_v2 action call IDs in the bridge, and replace misleading 0.0s with millisecond-friendly duration formatting
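As a rough illustration of the "millisecond-friendly" formatting mentioned above: the real formatter lives in the frontend (`app.js`), so the Rust function below is only a sketch. The name `format_duration_ms` and the exact thresholds are assumptions, not the PR's code; the `<1ms` label is taken from the follow-up comment in this thread.

```rust
/// Formats a tool duration (in milliseconds) for display.
/// Hypothetical helper: name and thresholds are assumptions for illustration.
fn format_duration_ms(duration_ms: u64) -> String {
    if duration_ms == 0 {
        // Sub-millisecond runs would otherwise show up as a misleading "0.0s".
        "<1ms".to_string()
    } else if duration_ms < 1_000 {
        // Below one second, report whole milliseconds.
        format!("{duration_ms}ms")
    } else {
        // One second and above, report seconds with one decimal place.
        format!("{:.1}s", duration_ms as f64 / 1_000.0)
    }
}
```

With this shape, a 42 ms tool call renders as `42ms` rather than `0.0s`.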

Testing

cargo fmt --all
cargo clippy --workspace --all-targets --all-features -- -D warnings
cargo test gateway_send_status_preserves_tool_event_fields --lib
cargo test test_build_turns_with_persisted_tool_result_for_display --lib
cargo test accumulator_tool_flow --lib
cargo test test_ws_multiple_events_in_sequence --test ws_gateway_integration
cargo test test_tool_result_for_display_truncates_long_content --lib
cargo test forward_event_to_channel_preserves_call_id_for_action_events --lib
cargo test thread_event_to_app_events_preserves_call_id_for_action_events --lib

Copilot

Pull request overview

This PR improves how the web gateway surfaces tool activity by correlating live tool events via call_id, exposing actual tool output (when persisted), and carrying real execution timings (duration_ms) end-to-end so the UI can render accurate tool cards and durations.

Changes:

Add call_id (and duration_ms for completions) to tool-related StatusUpdate/AppEvent variants and preserve these fields through the bridge and web channel layers.
Use persisted tool-call result to populate expanded history tool cards (with display truncation), avoiding new history storage.
Refactor frontend tool activity rendering to a controller that correlates events by call_id and formats durations in a millisecond-friendly way.
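The first change can be pictured with a minimal sketch. The `ToolEvent` enum and `run_tool_with_events` helper below are hypothetical stand-ins, not the actual `StatusUpdate`/`AppEvent` definitions; they only show the idea of carrying an optional `call_id` on both events and a measured `duration_ms` on completion.

```rust
use std::time::Instant;

/// Hypothetical, simplified stand-in for the tool-related event variants.
#[derive(Debug, Clone, PartialEq)]
enum ToolEvent {
    Started {
        name: String,
        call_id: Option<String>,
    },
    Completed {
        name: String,
        call_id: Option<String>,
        success: bool,
        duration_ms: Option<u64>,
    },
}

/// Runs `tool`, measuring wall-clock time, and returns the start and
/// completion events with the same `call_id` attached to both.
fn run_tool_with_events<F>(name: &str, call_id: Option<&str>, tool: F) -> Vec<ToolEvent>
where
    F: FnOnce() -> bool,
{
    let call_id = call_id.map(str::to_owned);
    let started = ToolEvent::Started {
        name: name.to_owned(),
        call_id: call_id.clone(),
    };

    let clock = Instant::now();
    let success = tool();
    // `as_millis` returns u128; saturate instead of panicking on overflow.
    let duration_ms = u64::try_from(clock.elapsed().as_millis()).unwrap_or(u64::MAX);

    let completed = ToolEvent::Completed {
        name: name.to_owned(),
        call_id,
        success,
        duration_ms: Some(duration_ms),
    };
    vec![started, completed]
}
```

Keeping both fields optional means older producers that do not supply them can still emit valid events.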

Reviewed changes

Copilot reviewed 16 out of 16 changed files in this pull request and generated 2 comments.

Show a summary per file

| File | Description |
| --- | --- |
| tests/ws_gateway_integration.rs | Asserts WS payloads include `call_id` and `duration_ms` for tool events. |
| tests/support_unit_tests.rs | Updates test fixtures for new `duration_ms` field on `ToolCompleted`. |
| src/channels/web/util.rs | Adds `tool_result_for_display` and threads persisted `result` + `call_id` into `ToolCallInfo`. |
| src/channels/web/types.rs | Extends `ToolCallInfo` DTO with `call_id` and `result`. |
| src/channels/web/tests/tool_event_passthrough.rs | Regression test ensuring gateway preserves tool identity/timing fields to SSE. |
| src/channels/web/tests/mod.rs | Registers new tool passthrough test module. |
| src/channels/web/server.rs | Includes `call_id` and display-ready `result` for in-memory turn tool calls. |
| src/channels/web/responses_api.rs | Correlates tool events by `call_id` when building response output items. |
| src/channels/web/mod.rs | Ensures `call_id`/`duration_ms` are passed through as `AppEvent`s. |
| src/channels/wasm/wrapper.rs | Updates WASM channel tests for `duration_ms`. |
| src/channels/channel.rs | Adds `duration_ms` to `StatusUpdate::ToolCompleted` and propagates it in constructor helpers/tests. |
| src/bridge/router.rs | Preserves engine action `call_id` and forwards `duration_ms` via tool-status events. |
| src/agent/thread_ops.rs | Measures tool execution durations and passes `duration_ms` into status updates. |
| src/agent/dispatcher.rs | Measures tool execution durations and passes `duration_ms` into status updates. |
| crates/ironclaw_gateway/static/app.js | Refactors tool activity cards to correlate by `call_id`, show persisted results, and format ms durations. |
| crates/ironclaw_common/src/event.rs | Adds optional `call_id` and `duration_ms` fields to tool-related `AppEvent`s. |


gemini-code-assist

Code Review

This pull request introduces call_id and duration_ms fields across the system to improve the tracking and display of tool activity. Key changes include refactoring the frontend tool activity state into a reusable controller, adding execution timing in the agent dispatcher, and updating event propagation to preserve tool identity. A logic error was identified in the frontend's tool correlation function where name-based fallback could lead to incorrect state updates during parallel tool calls.

henrypark133 · 2026-04-17T00:07:19Z

Addressed the open review comments on tool-visibility:

tool_result_for_display() now returns None for JSON null and empty strings, so the UI no longer renders synthetic "null" output.
createToolActivitySummary() now builds the duration node with textContent instead of appending via innerHTML, so <1ms renders safely.
findRendered() now treats call_id as authoritative and no longer falls back to name-based matching when a call_id is present but does not satisfy the predicate.
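The `findRendered()` change is in JavaScript; the Rust sketch below only mirrors the described rule so the behaviour is easy to check. `ToolCard`, its fields, and the "still running" predicate are assumptions made for illustration.

```rust
/// Hypothetical model of a rendered tool activity card.
#[derive(Debug)]
struct ToolCard {
    name: String,
    call_id: Option<String>,
    running: bool,
}

/// Returns the index of the card an incoming event should update.
///
/// When the event carries a `call_id`, that id is authoritative: if no
/// running card has it, the lookup fails rather than falling back to a
/// name match, which could hit a different parallel call of the same tool.
fn find_rendered(cards: &[ToolCard], name: &str, call_id: Option<&str>) -> Option<usize> {
    match call_id {
        Some(id) => cards
            .iter()
            .position(|card| card.call_id.as_deref() == Some(id) && card.running),
        // Events without a call_id: most recent running card with that name.
        None => cards
            .iter()
            .rposition(|card| card.name == name && card.running),
    }
}
```

With two `search` cards where `call_a` has finished and `call_b` is still running, a late event for `call_a` matches nothing instead of silently updating `call_b`.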

Validation after the follow-up patch:

cargo fmt --all
node --check crates/ironclaw_gateway/static/app.js
cargo test test_tool_result_for_display_skips_null --lib

Follow-up commit pushed: 733f5b95.

henrypark133 · 2026-04-17T00:23:09Z

@codex review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 733f5b95ee


Copilot

Pull request overview

Copilot reviewed 16 out of 16 changed files in this pull request and generated 1 comment.


Copilot · 2026-04-17T04:22:54Z

 const MAX_HISTORY_IMAGE_DATA_URL_BYTES_PER_IMAGE: usize = 512 * 1024;
 const MAX_HISTORY_IMAGE_DATA_URL_BYTES_PER_RESPONSE: usize = 1024 * 1024;
+const MAX_TOOL_RESULT_DISPLAY_CHARS: usize = 1000;



MAX_TOOL_RESULT_DISPLAY_CHARS is passed to truncate_preview, but truncate_preview truncates by bytes (and can return max_bytes + 3 due to the appended "..."). To avoid confusion/misconfiguration, rename this constant (and any docs/tests) to reflect bytes (e.g., MAX_TOOL_RESULT_DISPLAY_BYTES) or switch to a char-count truncation helper if the intent is truly characters.
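If the intent really is a character budget, a char-count helper could look roughly like the sketch below. `truncate_chars` is a hypothetical name, not an existing function in the repository, and like the behaviour described for `truncate_preview` it appends `...`, so the result can be three characters longer than `max_chars`.

```rust
/// Truncates `text` to at most `max_chars` Unicode scalar values and
/// appends "..." when anything was cut. Hypothetical helper for illustration.
fn truncate_chars(text: &str, max_chars: usize) -> String {
    // `char_indices().nth(max_chars)` yields the byte offset of the first
    // character past the budget, which is always a valid slice boundary.
    match text.char_indices().nth(max_chars) {
        Some((byte_idx, _)) => format!("{}...", &text[..byte_idx]),
        None => text.to_owned(),
    }
}
```

Counting by `char` avoids splitting multi-byte UTF-8 sequences, though it still counts scalar values rather than user-perceived grapheme clusters.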

henrypark133 · 2026-04-17T04:23:52Z

-                let call_id = format!("call_{}", Uuid::new_v4().simple());
+                let call_id = call_id
+                    .clone()
+                    .unwrap_or_else(|| format!("call_{}", Uuid::new_v4().simple()));


This starts threading call_id through the streaming worker, but the completion side still uses a single global current_tool_index later in the same match arm. With overlapping tool calls, a later ToolStarted overwrites that slot, so the first completion can emit response.output_item.done for the wrong function-call item. Since the PR already has the call_id here, the in-flight output index should be tracked by call_id as well instead of by one mutable slot.

henrypark133

Review: keep Responses API tool completions correlated per call

Most of the earlier tool-visibility feedback looks addressed, and the new call_id / duration_ms plumbing is much cleaner. One correctness issue still remains in the streaming Responses API path.

Critical: `response.output_item.done` still uses one global tool slot

File: src/channels/web/responses_api.rs:947
The streaming worker now threads call_id into the FunctionCall and FunctionCallOutput items, but it still tracks completion with a single current_tool_index. If tool A starts, tool B starts, and tool A completes first, the later start overwrites that slot and the code emits response.output_item.done for B instead of A. The final output list keeps the right call_id, but streamed clients can observe the wrong item transition to done and never get a done event for the actual completed call.

Suggested fix: track in-flight output indexes by call_id (or look them up by call_id on completion) instead of using a single mutable index.
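A minimal sketch of that suggestion, assuming each tool start is assigned the next output index: `InFlightTools` and its methods are hypothetical names, and the real worker in `responses_api.rs` may well structure this differently.

```rust
use std::collections::HashMap;

/// Tracks the output index of each in-flight tool call by `call_id`,
/// replacing a single shared "current tool index" slot.
#[derive(Default)]
struct InFlightTools {
    next_output_index: usize,
    by_call_id: HashMap<String, usize>,
}

impl InFlightTools {
    /// Records a started tool call and returns the output index assigned to it.
    fn on_started(&mut self, call_id: &str) -> usize {
        let index = self.next_output_index;
        self.next_output_index += 1;
        self.by_call_id.insert(call_id.to_owned(), index);
        index
    }

    /// Returns the output index for a completed call, if it was in flight.
    /// Removing the entry ensures a call cannot be marked done twice.
    fn on_completed(&mut self, call_id: &str) -> Option<usize> {
        self.by_call_id.remove(call_id)
    }
}
```

In the A-starts, B-starts, A-completes scenario, `on_completed("call_a")` yields A's index even though B started later, so the `done` event targets the right item.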

Recommended verdict: Request changes.

Residual risk: I also kicked off a couple of targeted tests from the worktree, but they were still compiling when I posted this review.

Fix gateway tool output visibility

c57a95d


Address PR review follow-ups

733f5b9


fix(web): truncate live tool activity previews

a10f575

henrypark133 wants to merge 3 commits into staging from tool-visibility
