Fix pre-stream gap UX: thinking dots + stop button on send by GeoffBao · Pull Request #2 · GeoffBao/hermes-webui

GeoffBao · 2026-05-17T08:48:25Z

Summary

Remove !S.activeStreamId guard from appendThinking() so thinking dots appear immediately when the user sends a message (previously blocked until the SSE stream opened, causing a blank gap)
The stop-button fix (getComposerPrimaryAction returning 'stop' when S.busy) was already merged in PR Backport upstream stage-345 + fix Cmd+K, send button, thinking dots #1; this adds the thinking animation to complete the feedback loop

Root cause

Three screenshots showed the pre-stream gap:

User sends → blank (no dots, send icon unchanged)
Tool activity bar appears
Response streams + stop button appears

Both the stop button and the thinking dots were gated on S.activeStreamId being set, which only happens after /api/chat/start responds. This could be 1–3 seconds on slow models.

Fix

appendThinking() now only requires S.session to exist. The stale-event protection (preventing old stream reasoning events from polluting a switched session) is handled upstream in the SSE activeSid check in messages.js.

Test plan

Send a message → thinking dots appear immediately
Send button switches to stop icon immediately on send
Cancelling during pre-stream gap works (stop → aborts /api/chat/start)
CI passes

🤖 Generated with Claude Code

Per-request profile switches (process_wide=False, introduced in nesquena#1700) update os.environ['HERMES_HOME'] but skip _set_hermes_home(), which is responsible for monkeypatching module-level caches. Both tools/skills_tool.py and tools/skill_manager_tool.py set HERMES_HOME and SKILLS_DIR once at import time. When a non-default profile is active in the WebUI, os.environ['HERMES_HOME'] is correctly updated per-turn in the _ENV_LOCK block, but the module-level constants still point at the root profile. All agent-side skill operations — skills_list(), skill_view(), skill_manage() — read and write to the wrong directory. Add the same monkeypatching that _set_hermes_home() already performs (profiles.py line ~620) to the per-turn env setup block in streaming.py, covering both skills_tool and skill_manager_tool. The WebUI display half was already fixed in nesquena#1917 via _active_skills_dir() in routes.py. This patch fixes the agent-side half so the running agent resolves skills from the correct profile.

Update Chinese language translation

Add xiaomi to _PROVIDER_DISPLAY, _PROVIDER_MODELS, and _PROVIDER_ALIASES so the WebUI recognizes Xiaomi as a first-class provider. Models included: - mimo-v2.5-pro (MiMo V2.5 Pro) - mimo-v2.5 (MiMo V2.5) - mimo-v2-pro (MiMo V2 Pro) - mimo-v2-omni (MiMo V2 Omni) - mimo-v2-flash (MiMo V2 Flash) Aliases: mimo, xiaomi-mimo -> xiaomi The hermes-agent CLI already registers xiaomi as a provider (hermes_cli/models.py, hermes_cli/auth.py) but the WebUI was missing the corresponding entries, causing the model dropdown to fall back to OpenRouter and the provider list to show 'Unsupported'.

When context compression fires, the agent rotates to a new session_id. The compression migration block correctly migrates the session lock, SESSION_AGENT_CACHE, SESSIONS dict, and the session file rename, but does not ensure s.profile is set on the continuation session. On the next request, _run_agent_streaming resolves the profile via: get_hermes_home_for_profile(getattr(s, 'profile', None)) With s.profile == None this falls back to the default profile's HERMES_HOME. Memory tool calls then read and write the wrong profile's MEMORY.md — confirmed by investigation: session 0dfefb (continuation after compression from a troubleshooting profile session) read memory at 16% / 1,184 chars with 4 entries, while the troubleshooting profile's actual state was 72-77% / 5,000+ chars. That reading could only come from the default profile's bank. Subsequent replace operations failed because the target entries existed only in the troubleshooting profile. There are two failure paths: 1. In-memory: if s.profile was None from the start (legacy session or one created before this fix), the continuation session object carries null through the current request. 2. Persistence: s.save() persists "profile": null to the continuation session's JSON file (profile is in METADATA_FIELDS, models.py ~408). On the next request, Session.load(new_sid) reads it back as null and get_hermes_home_for_profile(None) falls back to the default profile. Fix: capture _resolved_profile_name at request entry (~line 2019), immediately after profile home resolution. This is the only point where profile context is reliable: s.profile if already set, otherwise get_active_profile_name() — which at that point reads thread-local storage (_tls.profile) correctly set by the HTTP handler thread via set_request_profile(). Calling get_active_profile_name() at compression time instead would be unsafe: the streaming thread is a separate threading.Thread, does not inherit TLS, and the call would fall back to the process-global _active_profile which may belong to a different concurrent tab. Stamp s.profile in the compression migration block immediately after s.session_id = new_sid. Guarded by `if not s.profile` so sessions that already have a profile set are unaffected. A logger.info line records when the stamp fires, making future investigation straightforward. Fixes: memory writes bleeding into default profile after compression Reproduces: reliably on any long non-default profile session that hits the compression threshold (default: 0.80 context fill)

@franksong2702

…hile in-flight by @franksong2702

@franksong2702

…_count pluralization by @franksong2702

@franksong2702

…or assignee select by @franksong2702

@franksong2702

…ls + status hint by @franksong2702

@franksong2702

…ssion by @franksong2702

… + extend locale parity test (Opus advisor SHIP-WITH-CAVEATS follow-up)

…uple entries + _LOGIN_LOCALE block nesquena#2142 (legeantbleu) added the fr locale to static/i18n.js but didn't update: 1. tests/test_issue1488_composer_voice_buttons.py: two TestComposerVoiceButtonI18n + TestVoiceModePreferenceGate LOCALES tuples needed 'fr' 2. api/routes.py: _LOGIN_LOCALE needed an 'fr' block so the login page localizes for French users (issue nesquena#1442 parity contract) 3. tests/test_login_locale_parity.py: the test asserting 'fr' falls-back-to-'en' is inverted — fr now resolves to fr, with sibling assertions for fr-FR and fr-CA Mirrors the stage-340 fix for the it locale (PR nesquena#2067 → maintainer adds tuple entries). 46/46 i18n tests pass after fix.

… contributor batch

… + stale-done re-emit (1) compress/status no longer pops the job entry on first read of `done` payload. Second open tab no longer sees `idle` and a stale-job toast. (2) compress/start no longer short-circuits to a stale `done` payload when re-invoked within the 10-minute TTL. Re-running /compress always starts fresh, so closing-and-reopening a tab mid-compress works correctly. Third SHOULD-FIX (nesquena#2135 cfg["model"] fallback tightening when no custom_providers entry matches) deferred to follow-up — strictly no-worse-than-master behavior. tests/test_sprint46.py 10/10 still passes.

stage-344: 16-PR contributor batch — i18n + insights + manual-compress async + workspace recovery + iOS PWA + Cloudflare login health + bash 3.2 (fr locale)

fix: guard stale stream writebacks (LumenYoung) Prevents stale WebUI stream workers from writing old results into a session after that session has already moved on to another stream. Adds new helper _stream_writeback_is_current() (a token equality check against the session's active_stream_id) and short-circuits the two finalize/cancel paths when the worker no longer owns the session writeback.

feat: add manual provider usage refresh (Jordan-SkyLF) Adds a 'Refresh usage' button on the Provider quota card in Settings → Providers, with cache: 'no-store' fetch + browser cache-bust query string. Pure browser-side cache-busting; the server-side /api/provider/quota endpoint has no cache layer yet (refresh=1 query param is currently a no-op server-side; the win is bypassing browser/proxy/SW caches).

…ena#2136 + nesquena#2150

stage-345: 2-PR low-risk batch — stream-ownership guard against stale writebacks + Refresh-usage button on provider quota card

…tore avatar - Hard-reset to upstream/master (stage-345, v0.51.51) to fix all broken functionality - Migrated Claude skin (full palette + typography + component affordances) - Migrated Nebula skin (accent-only cyan-blue-violet palette) - Skipped Sienna-specific affordances (already canonical in upstream stage-345) - Restored hermes-agent-avatar.png exactly (MD5: 6b4e80f8cd848bd4ef640e48030006e5) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

- Match both e.key==='k' and e.key==='K' so Cmd+K works regardless of Caps Lock state (upstream B handler already does this for 'b'/'B') - Wrap the newSession() call in try/catch in both the Cmd+K keydown handler and btnNewChat.onclick so any server-side failure shows a toast instead of silently disappearing into an unhandled promise rejection Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Two bugs caused by the window between setBusy(true) and S.activeStreamId being set (the /api/chat/start round-trip, which can take seconds on slow providers): 1. Send button stays disabled instead of showing the Stop icon: getComposerPrimaryAction() required S.activeStreamId to return 'stop', but S.activeStreamId is explicitly nulled before the POST and only set on response. Fix: check S.busy||S.activeStreamId so the button flips to Stop immediately. 2. Thinking dots never appear until the stream starts: appendThinking() guarded on !S.activeStreamId and returned early. Fix: relax guard to !S.busy&&!S.activeStreamId (allow when busy, even pre-stream). Also reorder messages.js: setBusy(true) now runs before appendThinking() so S.busy=true is set when the check runs. 3. Bonus: Stop now works during the pre-stream gap: cancelStream() extended to handle the null-streamId case — clears S.busy, removes thinking indicator, and aborts the in-flight /api/chat/start fetch via AbortController (window._abortPendingChatStart). AbortError in the send() catch block is treated as user-cancel (clean teardown, no error toast). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

- messages.js: revert appendThinking/setBusy call order to match test assertion (`appendThinking();setBusy(true);`), fix activeStreamId comment to match exact marker test checks - ui.js: revert appendThinking guard back to `!S.activeStreamId` only (removes the S.busy relaxation that broke test ordering contract) - boot.js: simplify Cmd+K key check back to `e.key==='k'` (exact string the test searches for); compact cancelStream early-return so try/catch lands within the 400-char test window; remove redundant S.activeStreamId=null from early path so cleanup_idx stays after catch_idx - style.css: add space in skin-scoped `.send-btn {` rule so the global `.send-btn{` rule is the first match for the CSS tests Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

The test asserts updateSendBtn() is called within 200 chars of the S.activeStreamId null-reset marker. The AbortController comment was pushing it past that limit. Move updateSendBtn() to immediately after the marker to satisfy the test. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Remove the `!S.activeStreamId` guard from appendThinking() so the thinking animation appears as soon as the user sends a message, rather than waiting for /api/chat/start to respond and the SSE stream to open. The stale-event protection (preventing old stream events from polluting a new session) is already enforced by the activeSid check in the SSE outer loop in messages.js, so this guard was only causing a noticeable blank gap between send and first feedback. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Keep our version that removes the !S.activeStreamId guard from appendThinking() so thinking dots appear immediately on send. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

chatgpt-codex-connector · 2026-05-17T08:51:23Z

💡 Codex Review

hermes-webui/static/boot.js

Line 3 in 58b6357

    
           if(!streamId){if(typeof _abortPendingChatStart==='function')_abortPendingChatStart();if(S.busy)setBusy(false);return;}

Clear pre-stream thinking UI when stop aborts chat start

Handle the no-streamId branch as a full UI cancel path: it aborts /api/chat/start and calls setBusy(false), but it never removes the optimistic thinking row that send() already injected via appendThinking(). On slow starts, clicking Stop during this pre-stream window leaves a stale "thinking" indicator in the transcript even though the turn was canceled, so users see a phantom in-progress assistant state. This branch should also clear transient inflight/thinking UI (e.g. removeThinking()/equivalent cleanup) like the other cancel/error paths.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

When the user clicks Stop before /api/chat/start responds, the early-return path now calls removeThinking() after setBusy(false) to clear the optimistic thinking row that appendThinking() already injected. Without this, a stale "in-progress" indicator lingered in the transcript after the cancel. Also switches to optional-chaining for the abort call (window._abortPendingChatStart?.()) to keep the function compact within the CI test's 400-char inspection window. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

GeoffBao · 2026-05-17T08:53:06Z

Fixed in 8e764d1: added removeThinking() call after setBusy(false) in the pre-stream early-return path; also switched to window._abortPendingChatStart?.() optional chaining to keep the function compact within the CI test's 400-char window.

…h.json fingerprint auth.json is rewritten by credential-pool/OAuth token refresh roughly every 14 minutes. _models_cache_source_fingerprint() hashed it via mtime/size (nesquena#1699 _models_cache_file_fingerprint), so every token refresh churned the fingerprint and the 24h /api/models cache was effectively dead -- the hot GET /api/session?resolve_model=1 path paid a cold ~11.5s rebuild every few minutes (RCA t_d127953d residual #2, t_16551f61). Add _auth_store_semantic_fingerprint(): content-hash auth.json with a DENY-list of known credential-rotation-only keys (access/refresh token, expiry, per-credential status/telemetry, request_count, save updated_at) stripped. Deny-list (not allow-list) is deliberate -- any unknown field, or a real provider/endpoint/model-set change (active_provider, a new credential_pool entry, base_url, source, label, auth_type, the providers{} block, ...) stays in the fingerprint and still correctly busts the cache. Conservative fallbacks: missing file -> marked; unreadable/corrupt -> stat-based fallback (never less safe than pre-fix). config.yaml keeps the cheap stat fingerprint (deliberate edits, no timer churn). Bidirectional invariant regression test (non-tautological -- the end-to-end churn test flips RED when the auth_json axis is reverted to stat-based): token-only churn keeps fingerprint byte-identical AND keeps a valid disk cache loadable; active_provider change / new credential_pool entry / changed base_url each flip the fingerprint AND reject the stale disk cache. Measured: 5/5 cold rebuilds per 5 refresh cycles -> 0/5. Tests: 9 new pass; 28 adjacent (nesquena#1699/nesquena#1633/display-resolver) pass; 54 models_cache/fingerprint suite pass.

…y-list (Codex review #2) Under a named profile, process HERMES_HOME is ~/.hermes/profiles/<name> but the allowlist still grants base ~/.hermes — so the prior deny (anchored only on the active-profile root + STATE_DIR) left ~/.hermes/state.db and sibling-profile secrets (~/.hermes/profiles/other/auth.json) reachable. Build deny roots from every Hermes state root the allowlist accepts: active HERMES_HOME, base ~/.hermes, api.profiles._DEFAULT_HERMES_HOME, and STATE_DIR; apply the state-subdir dir-denies under each. Widen the CSP-slice structural test window to match.

…ode) — toggle-mid-record safety (Codex review #2) Codex round-2: _stopMic and mediaRecorder.onstop read the CURRENT _rawAudioMode to choose backend/dispatch, but the recording was started on the OLD mode — so toggling Settings→Sound mid-recording could stop the wrong backend (orphaning the other) or dispatch raw-vs-transcribe wrongly. Pin _activeCaptureMode (speech | media-raw | media-transcribe) at start; _stopMic + onstop use it. Adds front-end source-invariant regression tests. Co-authored-by: lucasrc <lrclucas@gmail.com>

…dir dedup (Opus SHOULD-FIX) Opus review SHOULD-FIX on the upload surface: - Add _MAX_ARCHIVE_MEMBERS=10000 cap in extract_archive (both zip + tar loops): a tiny archive with millions of members slips under the byte cap but can exhaust inodes/fds. Trips before extraction, cleaned up via the existing rmtree-on-exception. Regression test added. - Bound the extraction-dir collision-suffix loop (was while-True) to 1000 tries. Other Opus SHOULD-FIX items (member-count #1 done; #2 done) filed as follow-up or N/A: same-field multi-file collapse doesn't apply (frontend sends one request per file); .tar.gz stem cosmetic.

…n (Codex round-2) Codex found two more once the config-guard bug was fixed (Opus concurred on #2): 1. panels.js _buildPluginCard built the Open button + enable toggle with inline onclick/onchange that interpolated tab.path / plugin.key into a JS-string-in- attribute context — HTML-escaping is insufficient there (quote breakout). Now rendered inert + bound via addEventListener with RAW closure values. 2. tab.path was unvalidated. Added _VALID_PLUGIN_TAB_PATH (^/[A-Za-z0-9._~/-]{0,255}$) in load_plugins() — absolute, no quotes/query/fragment/control chars. Also Opus nit: deep-merge now coerces dashboard_plugins values to bool + str keys. Regression tests added for both.

@gsurenull

…oning nesquena#3455 + LLM Wiki last-writer nesquena#1257) (nesquena#3466) * Release v0.51.230 (stage-p14): extract <think> to m.reasoning nesquena#3455 + LLM Wiki last-writer (nesquena#1257) Salvage of nesquena#3455 (@gsurenull): dropped the stale api/config.py bits (MiniMax-M3 + SCHEMA_VERSION 3->4 — both already on master via nesquena#3374). Kept the two genuine fixes: (1) _splitThinkFromContent persist-path extraction of inline <think> blocks into m.reasoning (fixes 30-50% session bloat for reasoning-only providers like MiniMax-M3); (2) LLM Wiki status Last-writer 3-tier fallback (was always 'Not available' since nesquena#1257). Added 9 Node-driven think-split regression tests (data-loss guards: content-before/after preserved, unclosed blocks intact, lookalike tags not extracted). * fix(nesquena#3455): renderer-matching think extraction + wiki symlink/bounded-read guards (Codex review) Codex review of stage-p14 found 3 SILENT bugs, all fixed: (1) DATA-LOSS: _splitThinkFromContent's Pass-2 whole-body scan extracted a CLOSED literal <think>...</think> from visible prose/code (e.g. inside a fenced code block) into m.reasoning, emptying it — more aggressive than the renderer (which only strips LEADING blocks). Removed Pass 2; extraction now matches _streamDisplay semantics (leading-only, loop captures consecutive leading blocks). +fenced-code regression test. (2) PRIVACY: _llm_wiki_last_writer followed symlinked .md pages resolving OUTSIDE the wiki (is_file follows symlinks), leaking external frontmatter. Now requires resolved path under wiki_root. +symlink-containment regression test. (3) CONTRACT/PERF: replaced full read_text() with bounded line-by-line reads (frontmatter block only / capped log-heading scan), never page bodies. * fix(nesquena#3455): think-split is leading-single (renderer-matching) + fix 2 stale source-match tests Codex re-review finding #2: looping consecutive leading blocks diverged from the renderer (_streamDisplay/_parseStreamState strip ONE leading block). Now extracts exactly one leading block. Also updated 2 tests that asserted pre-split implementation strings: test_live_stream_tokens_persist (content:assistantText -> content:split.content, invariant preserved) and the consecutive-blocks test. NOTE: Codex finding #1 (client-only split doesn't persist server-side) is a separate architectural decision pending Nathan. * feat(nesquena#3455): split inline <think> server-side before s.save() so persisted file is compacted (Codex #1) Codex finding #1: the think-split was client-only, so the SAVED session file still carried inline <think> blocks (bloat) — the fix only compacted the browser copy. Added _split_thinking_from_content (api/streaming.py), a server-side twin of the JS helper with identical leading-only/single-block semantics, applied to the final assistant message before s.save() (extended the existing reasoning-persist block). Merges with on_reasoning-stream reasoning. +8 backend-parity regression tests covering the mid-body-code-block data-loss guard, unclosed-intact, single-leading, none-content. * test: update 3 save-path source-assertion tests for nesquena#3455 server-side think-split The backend think-split (api/streaming.py reasoning-persist block) changed the literal code shape + grew the pre-save block, breaking 8 source-assertion tests that anchor on it: - test_sprint42: assert _rm['reasoning']=_reasoning_text -> now _merged_reasoning/_existing_reasoning + _split_thinking_from_content present (intent preserved: reasoning persisted before save). - test_pr1318 (6) + test_pr1341: re-anchored the locator from the changed 'if _reasoning_text and s.messages:' line to the stable 'Persist reasoning trace in the session' comment marker; bumped the 1341 byte-distance limit 15000->16000 (the test self-documents bumping on legit pre-save growth). All behavioral invariants (reasoning persisted + context fields before save) unchanged. --------- Co-authored-by: nesquena-hermes <[email protected]>

…nders twice nesquena#3709) (nesquena#3715) * fix(nesquena#3709): thinking card no longer renders twice (in Activity + below answer) The nesquena#3592 inline-render branch (v0.51.258) emitted a thinking card for a thinking-only message even when a sibling tool-message in the same turn already built an Activity group carrying that turn's thinking — so the card showed twice, the second one stranded below the answer + 'Done in …' footer (insertAdjacentHTML 'beforeend' on a segment that already had body+footer). Fix (keeps nesquena#3592, does NOT revert it): - A1: precompute turnsWithActivityGroup (turns whose segments have tool cards); the inline branch only renders when the anchor turn is NOT in that set. - A2: when it does render inline, insert 'beforebegin' the .msg-body/.msg-foot so the card sits above the answer, not orphaned below the footer. - B: strip thinking against the TURN's combined visible answer (_turnVisibleTextByRawIdx), so a trailing thinking-only message that echoes the answer gets de-duped even though its own body is empty. Live-verified in browser: nesquena#3709 repro (tool+trailing-thinking) → exactly 1 card in Activity, above footer; nesquena#3592 repro (thinking-only) → exactly 1 inline card, not buried in a collapsed group. + regression test tests/test_issue3709_*. Supersedes nesquena#3708 (which deleted the inline branch outright, re-breaking nesquena#3592). * fix(nesquena#3709): merge suppressed sibling thinking into the Activity group (Codex re-gate) Codex caught a content-loss edge in the first cut: when A1 suppresses a thinking-only sibling's inline card (its turn has an Activity group), the group only rendered assistantThinking.get(aIdx) for the TOOL message — so a sibling with DISTINCT reasoning was neither inline nor in the group → dropped. Fix: aggregate all of a turn's thinking (turnThinkingParts, de-duped, index order) and render that merged text once per turn in the Activity group (_renderedTurnThinking guard). Live-verified: tool-thinking A + distinct sibling-thinking B → 1 merged node carrying both, no loss. + regression test. * fix(nesquena#3709): shared anchor resolver so inline-suppression & group placement agree (Codex re-gate #2) Codex caught a fallback-anchor mismatch: turnsWithActivityGroup was populated only from assistantSegments.get(tcIdx) (direct segment), but the group-render path falls back to a nearby earlier segment when a tool's assistant_msg_idx has no directly rendered segment (legacy/rebased). So a fallback-anchored group's turn wasn't in turnsWithActivityGroup → the sibling rendered inline AND the group rendered → dup again. Fix: one shared _anchorRowForActivityIdx(aIdx) helper (direct-or-fallback) used by the precompute, the inline branch, and the group render — they now agree. Live-verified all three repros still pass. * test(nesquena#3709): update test_compact_activity assertion to mergedThinking var The brittle source-scan asserted _thinkingActivityNode(thinkingText, false) — the nesquena#3709 fix renders the turn's MERGED thinking via _thinkingActivityNode(mergedThinking, false) into the same Activity body. Intent (settled thinking renders inside the Activity disclosure alongside tools) unchanged; only the source variable. Updated to assert the new variable, kept all intent assertions. --------- Co-authored-by: nesquena-hermes <[email protected]>

@rodboev

…esquena#3633) (nesquena#3853) * fix(streaming): normalize inline thinking extraction across live and persisted turns (nesquena#3599) # Conflicts: # api/streaming.py # static/messages.js # static/ui.js * fix(streaming): code-aware inline-thinking extraction + position-aware unclosed handling Codex deep-review caught two regressions in the leading-only -> full-scan rewrite (both silent data-mangling on the persist/reload path): 1. Code-span unawareness: the scanner only protected triple fences, so a literal <think> in an inline single-backtick code span or an indented (>=4-space/tab) code block got silently extracted into reasoning. Added _inline_thinking_indented_code_at + inline-backtick tracking (Python + the JS twin _thinkingIndentedCodeAt), so all three code contexts now keep thinking tags visible. 2. Unclosed-tag truncation: any unmatched open tag moved the trailing prose into reasoning. Now position-aware — a LEADING unclosed block (cut off mid-thought) is still reasoning (nesquena#3455 intent), but an unclosed tag AFTER visible content stays visible so literal typed tags don't truncate prose. Gated partial handling on the previously-unused options.streaming param (live streaming keeps 'still thinking' behavior; persist/reload does not). Updated 2 tests that pinned the buggy behavior + added 4 regression tests (inline-backtick, indented-code, mid-body-unclosed-visible, leading-unclosed- extracted). Updated the node driver harness to include the new helper. Co-authored-by: rodboev <rodboev@users.noreply.github.com> * fix(streaming): recognize fenced code blocks indented 1-3 spaces Codex round-3: a fence indented 1-3 spaces is valid Markdown but the fence detector only matched at column 0, so a literal think tag inside such a fence (not 4+-space indented code either) was still extracted. Both detectors (_inline_thinking_fence_marker_at / _thinkingFenceMarkerAt) now walk back over up to 3 leading spaces to a line start. Added backtick + tilde indented-fence regression tests. Co-authored-by: rodboev <rodboev@users.noreply.github.com> * fix(streaming): O(n) inline-thinking scan + merge separate reasoning on reload Round-4 Codex deep-review caught two real issues in my own fixes: 1. PERF (O(n^2)): the indented-code check (_inline_thinking_indented_code_at / _thinkingIndentedCodeAt) scanned to line boundaries at EVERY character index, plus the leading check sliced+stripped the whole prefix per unclosed tag. On long no-newline content this was quadratic (~8.4s @ 200k, called repeatedly on the streaming path). Replaced with incremental O(1)-per-iteration line state (_line_is_indented_code / _lineIsIndentedCode evaluated only at line starts) + a seen_nonspace flag. 200k now extracts in ~55-140ms. 2. RELOAD reasoning-drop: renderMessages() seeded the shared extractor with '' so a message with BOTH an inline <think> block AND a separate m.reasoning payload showed only the inline part — the separate payload was dropped because the !thinkingText worklog resolution was then skipped. Now seeds with the message's direct reasoning (m.reasoning_content||m.reasoning||...) so the two MERGE (deduped); separate-only reasoning is preserved without promoting it into visible prose. Python + JS twins kept line-for-line parity. Added merge + perf + reload regression tests; updated the reload structure test and the node driver harness for the renamed helper. Co-authored-by: rodboev <rodboev@users.noreply.github.com> * fix(streaming): revert reload reasoning-seed; keep O(n) perf fix Codex round-4 finding #2 (seed renderMessages' inline extractor with m.reasoning so a separate payload merges) turned out to VIOLATE a deliberate architectural invariant pinned by test_issue2565 + test_sprint42: the reload content-extraction path must NOT touch m.reasoning/m.reasoning_content — reasoning metadata is owned exclusively by the Worklog Thinking Card path (_worklogReasoningTextFromMessage / _assistantReasoningPayloadText), never conflated with inline-content extraction (which would risk promoting provider reasoning into final-answer prose). Reverted the ui.js seed to the PR's original `thinkingText` arg. The inline+separate merge is still a genuine extractor capability (exercised by the live streaming path via liveReasoningText) and is covered by a unit test, just not invoked from the reload render path by design. The O(n) perf fix (finding #1) and the code-awareness + position-aware unclosed handling (rounds 1-3) are all retained. Co-authored-by: rodboev <rodboev@users.noreply.github.com> * fix(streaming): only lstrip extracted content when a leading block was removed Codex round-5 catch: the extractor unconditionally lstripped the final content (.lstrip() / .replace(/^\s+/,'')) even when NO thinking block was extracted, so an assistant reply that legitimately starts with an indented code block or blank lines lost its leading whitespace on live display, reload, and persistence. This was a real regression vs master (master returned non-thinking content unchanged). Now track leading_removed (set only when a LEADING thinking block/prefix is actually extracted) and lstrip only in that case. Mid-body / no-thinking content keeps its exact leading whitespace. Python + JS twins kept in parity; added backend regression tests (indented-first preserved, leading-blank preserved, leading-think still strips). Co-authored-by: rodboev <rodboev@users.noreply.github.com> * fix(streaming): reconnect restore prefers raw inflight accumulator Codex round-6 CORE catch: on reconnect, the single-live-message restore used (_liveInflightAssistant.content || ''). Because the PR now splits a leading unclosed <think> into empty content, restoring from the split content dropped the open tag — so a later </think> token leaked into the visible reply and corrupted the live accumulator. Restore from (_fullInflightAssistant || _liveInflightAssistant.content || '') so the raw open tag survives reconnect and the accumulator stays correct. Added a reconnect-restore regression test. Co-authored-by: rodboev <rodboev@users.noreply.github.com> * Release v0.51.335 — Release KY (normalize inline thinking extraction, nesquena#3633) Unify inline-thinking (<think>/<|channel>/<|turn|>) extraction across live, reload, and persisted turns (nesquena#3599/nesquena#3633, @rodboev). Deep-reviewed: Opus + 6 Codex rounds; maintainer fixes resolved every Codex finding — code-awareness (inline-backtick/indented/1-3-space fences keep literal tags visible), position-aware unclosed handling, O(n) line scanning (was O(n^2) on long content), conditional lstrip (preserve leading whitespace when no leading block removed), and a reconnect-restore CORE fix (raw accumulator preferred so an open <think> tag survives reconnect). Python + JS twins in parity. Full suite 8330, Opus SHIP-SAFE, Codex SAFE-TO-SHIP, ESLint/scope-undef/ruff clean. Co-authored-by: rodboev <rodboev@users.noreply.github.com> --------- Co-authored-by: Rod Boev <rod.boev@gmail.com> Co-authored-by: Hermes Agent <hermes-agent@nesquena-hermes.local> Co-authored-by: rodboev <rodboev@users.noreply.github.com>

Frank Song and others added 30 commits May 10, 2026 14:57

Fix 1974: trap focus in kanban modals

42a2381

test: add kanban modal locale parity regression

4c95d92

fix(i18n): localize /goal runtime status strings

1e1a948

test(kanban): harden locale-block parsing for quoted locales

2427f1e

test(kanban): assert profile-cache invalidation on profile delete

ba51efe

fix(clarify): honor clarify.timeout config in webui prompts

2e6b360

Add files via upload

9c37104

Update Chinese language translation

fix(1833): persist compression anchor summary for reload UI

1bec807

fix: wrap markdown code blocks on mobile

f793837

Fix CLI session patch diff rendering

e64e024

feat: live context window status tracking during streaming

1cf0ff0

Drop configured provider model badges

a300d9a

fix: keep live context metering session-scoped

56d68b7

fix: prefer latest compressed session segment

2a34a12

feat: add read-only session lineage report

c3cf8b1

fix: avoid sidebar jumps when active session is visible

8226328

fix: keep explicit fork sessions out of compression lineage

017a631

Stitch continued session transcripts in WebUI

8aed650

fix: reanchor live context usage updates

fecfc5f

Stage 329: PR nesquena#1990 — fix(kanban): guard dispatcher actions w…

c6e4306

…hile in-flight by @franksong2702

Stage 329: PR nesquena#1991 — fix(i18n): correct German profile_skill…

bf98ffe

…_count pluralization by @franksong2702

Stage 329: PR nesquena#1993 — fix(kanban): invalidate profile cache f…

7ce48de

…or assignee select by @franksong2702

Stage 329: PR nesquena#1995 — feat(kanban): trap focus in kanban moda…

b01df72

…ls + status hint by @franksong2702

Stage 329: PR nesquena#1996 — test(kanban): modal locale parity regre…

f44ead1

…ssion by @franksong2702

chore: CHANGELOG for v0.51.35 — Release K (kanban polish + i18n DE)

52c1053

fix(stage-329): zh-Hant locale parity for kanban_status_original_hint…

9242305

… + extend locale parity test (Opus advisor SHIP-WITH-CAVEATS follow-up)

chore: CHANGELOG note for stage augmentation 9242305

941c805

Hermes Agent and others added 18 commits May 12, 2026 16:14

docs: CHANGELOG stage-344 — close v0.51.50, open Unreleased for 16-PR…

4ab6cd6

… contributor batch

Merge pull request nesquena#2148 from nesquena/stage-344

6297443

stage-344: 16-PR contributor batch — i18n + insights + manual-compress async + workspace recovery + iOS PWA + Cloudflare login health + bash 3.2 (fr locale)

feat: add provider quota refresh control

b1f752a

fix: guard stale stream writebacks

4b57b20

fix: guard provider quota refresh fallback button state

062ef74

docs: CHANGELOG stage-345 — close v0.51.51, open Unreleased for nesqu…

9336161

…ena#2136 + nesquena#2150

Merge pull request nesquena#2153 from nesquena/stage-345

9268f41

stage-345: 2-PR low-risk batch — stream-ownership guard against stale writebacks + Refresh-usage button on provider quota card

Resolve merge conflict: keep pre-stream thinking dots fix

038e2b1

Keep our version that removes the !S.activeStreamId guard from appendThinking() so thinking dots appear immediately on send. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

GeoffBao merged commit 8607de8 into master May 17, 2026
3 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix pre-stream gap UX: thinking dots + stop button on send#2

Fix pre-stream gap UX: thinking dots + stop button on send#2
GeoffBao merged 498 commits into
masterfrom
claude/quizzical-morse-a58fab

GeoffBao commented May 17, 2026

Uh oh!

chatgpt-codex-connector Bot commented May 17, 2026

Uh oh!

GeoffBao commented May 17, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

20 participants

Conversation

GeoffBao commented May 17, 2026

Summary

Root cause

Fix

Test plan

Uh oh!

chatgpt-codex-connector Bot commented May 17, 2026

💡 Codex Review

Uh oh!

GeoffBao commented May 17, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

20 participants