fix(#1361): preserve reasoning, tool calls, and partial output on Stop/Cancel by bergeouss · Pull Request #1375 · nesquena/hermes-webui

bergeouss · 2026-04-30T23:19:30Z

Problem

Hitting Stop during a streaming response loses both the AI's already-streamed reasoning and tool output — paid tokens gone with no way to recover.

Three distinct data-loss paths:

§A — Reasoning text lost on cancel
_reasoning_text is accumulated in a thread-local variable inside _run_agent_streaming. On cancel, cancel_stream() never sees it — it goes out of scope when the thread is interrupted.

§B — Tool calls lost on cancel
_live_tool_calls is also thread-local. cancel_stream() only captures STREAM_PARTIAL_TEXT — tool calls the user saw on screen disappear after refresh.

§C — Reasoning-only streams produce no partial message
When the entire output is reasoning (no visible tokens), the regex strip of thinking blocks returns empty string. The if _stripped: guard skips the append entirely — only *Task cancelled.* survives.

Fix

Mirrors the same pattern as the existing STREAM_PARTIAL_TEXT shared dict:

STREAM_REASONING_TEXT — new shared dict in api/config.py, populated in on_reasoning() and the reasoning branch of on_tool(), read in cancel_stream()
STREAM_LIVE_TOOL_CALLS — new shared dict, populated on tool.started and tool.completed, read in cancel_stream()
§C fix — cancel_stream() now appends a partial assistant message when _stripped is empty but reasoning or tool calls exist, with the reasoning as a reasoning field and tool_calls as a tool_calls field

Tests

8 new regression tests in tests/test_issue1361_cancel_data_loss.py:

✅ Cancel with reasoning-only → reasoning preserved
✅ Cancel with reasoning + partial tokens → both preserved
✅ Cancel without reasoning dict → works as before (no regression)
✅ Cancel with tool calls → tool_calls preserved
✅ Cancel with tools + text → both preserved
✅ Reasoning-only creates partial message (§C)
✅ Tools-only creates partial message (§C)
✅ No reasoning, no tools, no text → only cancel marker (no change)

Existing tests unaffected: #1298 (10/10) and #1195 (5/5) all pass.

Closes #1361

…t on cancel Stop/Cancel was discarding AI's already-streamed reasoning and tool output — paid tokens lost with no way to recover. Three sub-bugs fixed: §A: Reasoning trace (on_reasoning callback) was accumulated in a thread-local variable invisible to cancel_stream(). Added STREAM_REASONING_TEXT shared dict (same pattern as STREAM_PARTIAL_TEXT) to make it accessible. §B: Live tool calls (_live_tool_calls) were also thread-local. Added STREAM_LIVE_TOOL_CALLS shared dict to persist tool call info on cancel. §C: When the entire streamed output was reasoning-only (no visible tokens), _stripped was empty after regex cleanup, so no partial assistant message was appended — only the *Task cancelled.* marker survived. Now appends a partial message with reasoning/tool_calls fields even when content is empty. Tests: 8 new regression tests covering all three sub-bugs plus no-regression cases. Existing nesquena#1298 and nesquena#1195 tests unaffected (15/15 pass).

…p/Cancel (#1375) Three distinct data-loss paths fixed: §A — Reasoning text was accumulated in a thread-local _reasoning_text inside _run_agent_streaming. cancel_stream() never saw it because it went out of scope when the thread was interrupted. Now mirrored to a new shared dict STREAM_REASONING_TEXT keyed by stream_id, populated in on_reasoning() and the reasoning branch of on_tool(), read in cancel_stream(). §B — Live tool calls in thread-local _live_tool_calls were similarly invisible to cancel_stream(). Now mirrored to STREAM_LIVE_TOOL_CALLS on tool.started + tool.completed. §C — Reasoning-only streams produced no partial message because the thinking-block regex strip returned empty string and the `if _stripped:` guard skipped the append. Now appends the partial message when EITHER content text, reasoning trace, OR tool calls exist. Mirrors the existing STREAM_PARTIAL_TEXT pattern from #893 exactly: same dict creation in _run_agent_streaming, same _live_config fallback in cancel_stream, same cleanup in _periodic_checkpoint. 8 regression tests in tests/test_issue1361_cancel_data_loss.py covering all three sections plus tools+text combinations. Co-authored-by: bergeouss <bergeouss@users.noreply.github.com>

@bergeouss

…r change Adds two more contributor PRs to the v0.50.251 batch per user directive (per-PR review + Opus review for #1373; #1375 was clean ship-on-sight). #1375 (@bergeouss, +382 LOC, all CI green) — fixes #1361 paid-token data loss on Stop/Cancel. Mirrors the existing STREAM_PARTIAL_TEXT pattern from #893: adds STREAM_REASONING_TEXT and STREAM_LIVE_TOOL_CALLS shared dicts populated during streaming and read by cancel_stream(). Also fixes the §C reasoning-only-creates-no-message gap where the strip-thinking-blocks regex returned empty string and the if-guard skipped the partial append. 8 regression tests covering all 3 sections plus tools+text combinations. #1373 (@bergeouss, +105 LOC, had CI failures pre-fix) — fixes #1195 new-profile-routes-to-default. The is_dir() guard in get_hermes_home_for_profile() caused new profiles (no session yet) to silently route every session back to the default profile until the directory existed on disk. Removed the guard; profile path is now returned unconditionally. Pre-release fix for #1373's CI failures: the change flipped two behaviors pinned by tests in #798: - R19c (test_get_hermes_home_for_profile_falls_back_for_missing_profile) asserted nonexistent → base. Renamed and updated to assert the new always-return-profile-path behavior. - R19j (test_get_hermes_home_for_profile_rejects_path_traversal) asserted that valid-but-nonexistent profile names → base. Updated to assert profile-scoped path. Also updated docstring: the _PROFILE_ID_RE regex is now the SOLE defense against path traversal (previously is_dir() was a defense-in-depth layer); verified each known-bad shape still returns base. Tests: 3484 passing (3471 → 3484, +13).

…ST-FIX) Opus pass-2 review of v0.50.251 caught a critical regression in PR #1375: The cancel-partial message stored captured tool calls under the 'tool_calls' key. That key is whitelisted by _API_SAFE_MSG_KEYS so _sanitize_messages_for_api forwarded the entries to the next-turn LLM call. But the captured entries use the WebUI internal shape ({name, args, done, duration, is_error}) — they don't have the OpenAI/Anthropic id + function: {name, arguments} envelope. Strict providers (OpenAI, Anthropic, Z.AI/GLM) would 400 on the malformed entries. Net effect: the very cancel-then-continue scenario PR #1375 aimed to improve becomes a hard fail. Fix: - Rename the persisted key to '_partial_tool_calls' (underscore- prefixed private key NOT in _API_SAFE_MSG_KEYS, so sanitize correctly strips it). - Update static/messages.js hasMessageToolMetadata check to also recognize _partial_tool_calls for UI rendering. - Update test_issue1361_cancel_data_loss.py assertion to check _partial_tool_calls (and tool_calls as legacy fallback). Plus 2 NIT fixes from the same Opus review: NIT 1 (api/profiles.py:153): re.match → re.fullmatch for consistency with other _PROFILE_ID_RE callers in the codebase. The trailing- newline footgun ($ matches before final \n in re.match) is now closed. Without #1373's is_dir() guard, a name like 'valid\n' would have created a directory named 'valid\n' on Linux. Doesn't escape <HERMES_HOME>/profiles/ via Path joining, but unintended. NIT 2 (test_issue798.py): R19j coverage gaps — added trailing- newline tests, length-boundary tests (64-char valid, 65-char rejected), single-char minimum, and non-ASCII / Unicode-trick tests. New regression test (tests/test_pr1375_partial_tool_calls_sanitize.py): - test_partial_tool_calls_field_not_forwarded_to_llm: pins that sanitize-for-API strips _partial_tool_calls + reasoning + does NOT have tool_calls on a partial message - test_legitimate_tool_calls_are_preserved_for_completed_turns: pins that real OpenAI-shape tool_calls on completed turns survive sanitize unchanged Tests: 3486 passing (3484 → 3486, +2 sanitize tests).

nesquena-hermes · 2026-04-30T23:53:24Z

Shipped in v0.50.251 (merge 031feda3, tag https://github.com/nesquena/hermes-webui/releases/tag/v0.50.251). Thanks @bergeouss — solid foundation work on a real correctness gap.

Live at https://github.com/nesquena/hermes-webui/releases/tag/v0.50.251.

What landed

Your fix for #1361 ships clean — three distinct data-loss paths on Stop/Cancel are now closed:

§A Reasoning text accumulated in thread-local _reasoning_text is now mirrored to STREAM_REASONING_TEXT shared dict, visible to cancel_stream()
§B Live tool calls similarly mirrored to STREAM_LIVE_TOOL_CALLS
§C Reasoning-only streams (no visible tokens) now produce a partial assistant message, not just a *Task cancelled.* marker

The pattern conformance to STREAM_PARTIAL_TEXT from #893 was excellent — same dict-creation, same _live_config fallback, same cleanup in _periodic_checkpoint.

Pre-release fix (Opus pass-2 MUST-FIX)

One critical correctness bug Opus caught:

Your fix stored captured tool calls under the message key tool_calls. That key is whitelisted by _API_SAFE_MSG_KEYS in api/streaming.py so the sanitize-for-API path forwards them to the next-turn LLM call. But the captured entries use the WebUI internal shape — they don't have OpenAI/Anthropic's id + function: arguments envelope. Strict providers would 400 on the malformed entries.

Concrete failure scenario: user starts a turn, agent does some tool calls, user hits Stop, then types a follow-up. The next API call would 400 because the partial message's tool_calls don't match the API contract. So the very cancel-then-continue scenario your PR aimed to improve becomes a hard fail.

Fix at f53556b: renamed the key to _partial_tool_calls (underscore-prefixed private key NOT in _API_SAFE_MSG_KEYS, so sanitize correctly strips it). Updated static/messages.js to also recognize _partial_tool_calls for UI rendering. Added tests/test_pr1375_partial_tool_calls_sanitize.py with 2 sanitize-roundtrip tests pinning the invariant.

This is a "make the structural shape itself enforce the invariant" fix — better than a comment because the field name being non-whitelisted means any future refactor that moves the partial-message logic still gets the right behavior automatically.

Tests

8 of your contributor tests + 2 sanitize-roundtrip tests = 10 total pinning the cancel-stream behavior end-to-end.

…t on Stop/Cancel (nesquena#1375) Three distinct data-loss paths fixed: §A — Reasoning text was accumulated in a thread-local _reasoning_text inside _run_agent_streaming. cancel_stream() never saw it because it went out of scope when the thread was interrupted. Now mirrored to a new shared dict STREAM_REASONING_TEXT keyed by stream_id, populated in on_reasoning() and the reasoning branch of on_tool(), read in cancel_stream(). §B — Live tool calls in thread-local _live_tool_calls were similarly invisible to cancel_stream(). Now mirrored to STREAM_LIVE_TOOL_CALLS on tool.started + tool.completed. §C — Reasoning-only streams produced no partial message because the thinking-block regex strip returned empty string and the `if _stripped:` guard skipped the append. Now appends the partial message when EITHER content text, reasoning trace, OR tool calls exist. Mirrors the existing STREAM_PARTIAL_TEXT pattern from nesquena#893 exactly: same dict creation in _run_agent_streaming, same _live_config fallback in cancel_stream, same cleanup in _periodic_checkpoint. 8 regression tests in tests/test_issue1361_cancel_data_loss.py covering all three sections plus tools+text combinations. Co-authored-by: bergeouss <bergeouss@users.noreply.github.com>

@bergeouss

nesquena#1373 behavior change Adds two more contributor PRs to the v0.50.251 batch per user directive (per-PR review + Opus review for nesquena#1373; nesquena#1375 was clean ship-on-sight). nesquena#1375 (@bergeouss, +382 LOC, all CI green) — fixes nesquena#1361 paid-token data loss on Stop/Cancel. Mirrors the existing STREAM_PARTIAL_TEXT pattern from nesquena#893: adds STREAM_REASONING_TEXT and STREAM_LIVE_TOOL_CALLS shared dicts populated during streaming and read by cancel_stream(). Also fixes the §C reasoning-only-creates-no-message gap where the strip-thinking-blocks regex returned empty string and the if-guard skipped the partial append. 8 regression tests covering all 3 sections plus tools+text combinations. nesquena#1373 (@bergeouss, +105 LOC, had CI failures pre-fix) — fixes nesquena#1195 new-profile-routes-to-default. The is_dir() guard in get_hermes_home_for_profile() caused new profiles (no session yet) to silently route every session back to the default profile until the directory existed on disk. Removed the guard; profile path is now returned unconditionally. Pre-release fix for nesquena#1373's CI failures: the change flipped two behaviors pinned by tests in nesquena#798: - R19c (test_get_hermes_home_for_profile_falls_back_for_missing_profile) asserted nonexistent → base. Renamed and updated to assert the new always-return-profile-path behavior. - R19j (test_get_hermes_home_for_profile_rejects_path_traversal) asserted that valid-but-nonexistent profile names → base. Updated to assert profile-scoped path. Also updated docstring: the _PROFILE_ID_RE regex is now the SOLE defense against path traversal (previously is_dir() was a defense-in-depth layer); verified each known-bad shape still returns base. Tests: 3484 passing (3471 → 3484, +13).

…ST-FIX) Opus pass-2 review of v0.50.251 caught a critical regression in PR nesquena#1375: The cancel-partial message stored captured tool calls under the 'tool_calls' key. That key is whitelisted by _API_SAFE_MSG_KEYS so _sanitize_messages_for_api forwarded the entries to the next-turn LLM call. But the captured entries use the WebUI internal shape ({name, args, done, duration, is_error}) — they don't have the OpenAI/Anthropic id + function: {name, arguments} envelope. Strict providers (OpenAI, Anthropic, Z.AI/GLM) would 400 on the malformed entries. Net effect: the very cancel-then-continue scenario PR nesquena#1375 aimed to improve becomes a hard fail. Fix: - Rename the persisted key to '_partial_tool_calls' (underscore- prefixed private key NOT in _API_SAFE_MSG_KEYS, so sanitize correctly strips it). - Update static/messages.js hasMessageToolMetadata check to also recognize _partial_tool_calls for UI rendering. - Update test_issue1361_cancel_data_loss.py assertion to check _partial_tool_calls (and tool_calls as legacy fallback). Plus 2 NIT fixes from the same Opus review: NIT 1 (api/profiles.py:153): re.match → re.fullmatch for consistency with other _PROFILE_ID_RE callers in the codebase. The trailing- newline footgun ($ matches before final \n in re.match) is now closed. Without nesquena#1373's is_dir() guard, a name like 'valid\n' would have created a directory named 'valid\n' on Linux. Doesn't escape <HERMES_HOME>/profiles/ via Path joining, but unintended. NIT 2 (test_issue798.py): R19j coverage gaps — added trailing- newline tests, length-boundary tests (64-char valid, 65-char rejected), single-char minimum, and non-ASCII / Unicode-trick tests. New regression test (tests/test_pr1375_partial_tool_calls_sanitize.py): - test_partial_tool_calls_field_not_forwarded_to_llm: pins that sanitize-for-API strips _partial_tool_calls + reasoning + does NOT have tool_calls on a partial message - test_legitimate_tool_calls_are_preserved_for_completed_turns: pins that real OpenAI-shape tool_calls on completed turns survive sanitize unchanged Tests: 3486 passing (3484 → 3486, +2 sanitize tests).

@franksong2702

## Release v0.51.254 — Release HV (stage-r2) Phase-2 medium wave 1 — 4 PRs (UI/mobile/cancel fixes + an un-held model dedup). ### Fixed | Issue/PR | Author | Fix | |----------|--------|-----| | #3528 | @franksong2702 | Render partial tool calls after cancel — interrupted turns keep their `_partial_tool_calls` rows in the transcript + fallback tool-cards. (Codex confirmed it stays render-only, not forwarded to the provider API.) | | #3550 | @lurebat | Android offline recovery soft-reattaches the live stream instead of hard-reloading the page on a transient background/disconnect. | | #3479 | @mvanhorn | iOS Safari no longer snaps the conversation to the top when a handoff/compression card is inserted mid-stream or on `refreshSession()`. | | #3478 | @JayC-L | **Un-held:** named custom providers (`@custom:name:model`) dedup against bare model IDs without regressing Ollama multi-colon tags (`qwen2.5:7b-instruct-q4`). Only `@custom:` IDs strip the two-segment prefix. | ### Hold-sweep note #3478/#3489 was held yesterday for an Ollama multi-colon-tag regression risk (a blanket `lastIndexOf` would lose the model). The author pushed a scoped fix (only `@custom:` IDs use `lastIndexOf`); I verified `_normId` in node against the regression cases — Ollama bare tags are preserved. Un-held + shipped. ### Gate - Full pytest suite: **7588 passed, 0 failed** - ESLint: CLEAN · ruff: CLEAN · browser-smoke: CLEAN - Codex (regression): **SAFE TO SHIP** — #3552 partial-tool-calls verified render-only (no `_API_SAFE_MSG_KEYS` leak / no 400-on-strict-provider, the v0.50.251 #1375 trap); #3551 no EventSource double-subscribe; #3541 no regression vs the #3525 scroll-follow shipped in v0.51.253; #3489 no over-dedup. Co-authored-by: franksong2702 <franksong2702@users.noreply.github.com> Co-authored-by: lurebat <lurebat@users.noreply.github.com> Co-authored-by: mvanhorn <mvanhorn@users.noreply.github.com> Co-authored-by: JayC-L <JayC-L@users.noreply.github.com>

nesquena mentioned this pull request Apr 30, 2026

release: v0.50.251 #1371

Merged

5 tasks

nesquena-hermes closed this Apr 30, 2026

nesquena-hermes mentioned this pull request Jun 4, 2026

Release v0.51.254 — Release HV (stage-r2): mobile/cancel/scroll fixes + custom-provider dedup #3593

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix(#1361): preserve reasoning, tool calls, and partial output on Stop/Cancel#1375

fix(#1361): preserve reasoning, tool calls, and partial output on Stop/Cancel#1375
bergeouss wants to merge 1 commit into
nesquena:masterfrom
bergeouss:fix/issue1361-cancel-data-loss

bergeouss commented Apr 30, 2026

Uh oh!

nesquena-hermes commented Apr 30, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

bergeouss commented Apr 30, 2026

Problem

Fix

Tests

Uh oh!

nesquena-hermes commented Apr 30, 2026

What landed

Pre-release fix (Opus pass-2 MUST-FIX)

Tests

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants