perf: skip lineage sidecar hydration for session tails#4137
Conversation
ef58fdd to
b8cf6de
Compare
|
@ai-ag2026 Can you shoot me an email? I’d like to invite you to our dev chat discord as a contributor! |
|
| Filename | Overview |
|---|---|
| api/routes.py | Adds fastpath eligibility guard and rewires the session-load flow; one dead _slice variable introduced in the new offset-adjustment block. |
| api/models.py | Adds tool_call_count metadata field; parsing is defensive and mirrors the existing message_count pattern; correctly defaults to None so old sidecars without the field fall back to the full-load path. |
| tests/test_session_tail_state_db_fastpath.py | New test file; covers fastpath eligibility guard and asserts full-sidecar hydration is never called; offset/count assertions are consistent with the implementation. |
| tests/test_stale_stream_cleanup.py | Updates source-position test to track the new metadata_only=True first load and adds a new assertion that the sidecar-hydration fallback also clears stale stream state. |
| CHANGELOG.md | Adds Unreleased entry describing the fastpath; entry text is accurate to the implementation. |
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[GET /api/session?msg_limit=N] --> B[get_session metadata_only=True]
B --> C{is_messaging_session?}
C -- Yes --> D[get_session metadata_only=False\n+ get_cli_session_messages]
C -- No --> E{load_messages?}
E -- No --> F[metadata-only path\n_metadata_only_message_summary]
E -- Yes --> G[get_state_db_session_messages]
G --> H{_state_db_tail_fastpath_eligible?\nmsg_limit set, msg_before None,\nno parent, tool_call_count=0,\nstate.db >= sidecar_count}
H -- Yes --> I[FASTPATH\n_all_msgs = state_db_messages\nadjust _messages_offset\nwith historical delta]
H -- No --> J[get_session metadata_only=False\nFull sidecar hydration]
J --> K[_limited_webui_messages_for_display\nor full lineage merge]
I --> L[_message_window_for_display\nslice tail window]
K --> L
D --> L
L --> M[Build payload\nmessage_count = state_db_tail_total_count\n_messages_offset adjusted]
Reviews (8): Last reviewed commit: "perf: skip lineage sidecar hydration for..." | Re-trigger Greptile
b8cf6de to
8fdee6c
Compare
f38e68a to
191b5f3
Compare
Re-review — fast path is unsound for lineage children (changes requested)Thanks @ai-ag2026 — the no-parent fast path is sound and a real win: CORE 1 — lineage child drops parent rows + reports wrong global countWith an isolated parent + child continuation, the fast path returned child-only rows with CORE 2 — "Load earlier" never reaches parent history for fast-pathed lineage childrenThe UI's normal Load-earlier sends a larger SILENT — legacy session-level tool cards vanishBecause the fast-path Fix (cheapest closes the holes)
The perf goal is great and the no-parent path is ready — it's specifically the lineage-child |
191b5f3 to
f18dc71
Compare
|
Thanks for reworking the eligibility gate — the narrow The fast path serves raw In To make this safe you'd need to prove the state.db rows are display-equivalent to the sidecar, not merely as numerous — e.g. a persisted equivalence fingerprint/marker written when the sidecar and state.db are known in sync, checked here before taking the fast path; or run the same merge/projection over the state.db rows before slicing (which reclaims much of the cost the fast path is trying to avoid, so the fingerprint route is likely the real design). A behavioral test that asserts the fast-path window == the full-load window for the same session would lock it in. This is genuinely good direction (sidecar hydration on tail loads IS a real cost) — but the equivalence proof is the crux and needs a design pass, so I'm returning it rather than inline-patching. Note this overlaps #4070 and #4138, which touch the same |
Re-review — all three blocking concerns are now closed; re-gating greenThanks @ai-ag2026. The rewrite addresses every issue from the last round head-on, and the fast path is now narrow in exactly the right places. Reading the new CORE 1 + CORE 2 (lineage children) — fixedThe blanket if str(getattr(session, "parent_session_id", "") or "").strip():
return FalseSo both the wrong-global-count case and the "Load earlier never reaches parent history" case can't trigger — lineage children stay on the full-hydration / SILENT (legacy session-level tool cards) — fixedThis is the part I want to call out as well-designed. meta['tool_call_count'] = len(self.tool_calls or [])and the gate refuses the fast path unless metadata proves there are zero legacy tool calls ( tool_call_count = int(getattr(session, "_metadata_tool_call_count", -1))
...
if tool_call_count != 0:
return FalseThe default-to- Offset / count mathThe metadata-vs-state.db count mixing I flagged is now bounded to the no-parent, tool-call-free case where One small thing worth a glance (non-blocking)
Re-gating now (Codex + Opus + full suite). No blockers from me on this revision — the perf goal is intact and the unsound branches are all excluded. Note this still pairs with #4138 on the merge-loop win, so worth a combined large-history benchmark before/after. |
86ba25f to
3dbbe8e
Compare
3dbbe8e to
f4e07f1
Compare
Re-gate after rebase — two blocking items before this can shipThanks @ai-ag2026 — picking this back up. I rebased the PR onto current master (it was 65 behind; resolved a BLOCKER 1 (security) — re-validate the profile boundary after the full re-fetchThe fast path correctly inherits the active-profile visibility check from the initial Fix (mechanical — mirror the existing 404 shape) after each s = get_session(sid, metadata_only=False)
original_stream_id = getattr(s, "active_stream_id", None)
_clear_stale_stream_state(s)
_session_profile = getattr(s, 'profile', None) or None
if not _session_visible_to_active_profile(_session_profile, handler):
return bad(handler, "Session not found", 404)(Applies to both the messaging branch BLOCKER 2 (display correctness) — prove zero attachment / display-only sidecar fields before taking the fast path
Fix: extend the same metadata-persist-and-prove-zero pattern you used for
Non-blocking (worth doing while you're in here)
Why this is closeThe no-parent fast path itself is sound (Opus re-confirmed: with |
Thinking Path
Long compression-lineage sessions can make an ordinary
/api/session?msg_limit=...load hydrate every parent sidecar before slicing the visible tail. On a real large local lineage session, that path spent ~20s in_webui_sidecar_lineage_messages_for_display(), while the current segment'sstate.dbmessages were available in milliseconds. This PR keeps the full-history path intact but avoids parent sidecar hydration for the initial tail-window load.What Changed
_state_db_tail_fastpath_eligible(...)guard for/api/sessiontail-window requests.state.dbtranscript and preserves the global message coordinate space via the sidecar metadata count.msg_beforeolder-history loads, active/pending sessions, truncation-watermark sessions, and ordinary non-lineage sessions with incomplete state.db coverage on the existing full reconciliation path.message_count/_messages_offset.Why It Matters
Switching into a long compressed/continued conversation should not block the UI while the server parses and merges every historical parent sidecar just to return the newest visible tail. The explicit older-history/full-transcript paths remain available when the user asks for them.
Local real-corpus timing on a long lineage session:
/api/session?messages=1&msg_limit=80&expand_renderable=1spent ~24.7s cumulatively in_webui_sidecar_lineage_messages_for_display().83current-segment messages,message_count=4388,_messages_offset=4305).msg_limitstill uses the existing path and remains intentionally expensive (~20s on that corpus), because it returns the full merged history.Contract Routing
Task type: runtime/session metadata performance fix.
Touched areas:
api/routes.py/api/sessionload pathRelevant public docs:
AGENTS.mdCONTRIBUTING.mddocs/CONTRACTS.mddocs/rfcs/webui-run-state-consistency-contract.mdScope boundaries:
msg_before, messaging/CLI, truncation, active/pending recovery, or sidecar persistence semantics.Evidence needed before claiming done:
Verification
python3 -m pytest tests/test_session_tail_state_db_fastpath.py tests/test_session_tail_payload.py tests/test_session_message_window_renderable_tail.py tests/test_webui_state_db_reconciliation.py tests/test_session_lineage_full_transcript.py -q→39 passed in 2.55spython3 -m py_compile api/routes.py tests/test_session_tail_state_db_fastpath.pygit diff --checkstatic scan findings: 013.1ms,msgs=83,count=4388,offset=430520327.5ms,msgs=5914,offset=0Risks / Follow-ups
Model Used
OpenAI GPT-5.5 via Hermes Agent WebUI, with terminal/file tooling and local pytest/timing probes.