perf: reuse prefix duplicate match during message merge#4138
Conversation
|
| Filename | Overview |
|---|---|
| api/models.py | Core optimization: splits exact/fuzzy prefix detection into two branches, adds a per-call prefix_visible_lookup_cache, and correctly attributes skipped_state_visible_counts to expected_visible_key. Non-prefix dedup path is unchanged. |
| tests/test_merge_prefix_duplicate_fastpath.py | Two new tests covering the no-broad-scan guarantee and duplicate-budget attribution for fuzzy prefix replay rows. |
| CHANGELOG.md | Adds a clear Unreleased entry under Fixed describing the O(n²) to linear improvement. |
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[state row: visible_key] --> B{state_replay_idx < len sidecar_visible_sequence?}
B -- No --> G[Non-prefix dedup path: full sidecar scan]
B -- Yes --> C{visible_key == expected_visible_key?}
C -- Exact match --> D[replays_sidecar_prefix = True, matched_prefix_visible_key = expected_visible_key, state_replay_idx++]
C -- Fuzzy check --> E{prefix_visible_lookup_cache hit?}
E -- Miss --> F[_build_visible_duplicate_lookup, cache result]
E -- Hit --> H[_matching_visible_duplicate against single-key set]
F --> H
H -- matched --> I[replays_sidecar_prefix = True, state_replay_idx++]
H -- no match --> G
D --> J[skipped_state_visible_counts expected_visible_key += 1, continue]
I --> J
G --> K{matched in full sidecar set?}
K -- Yes, budget left --> L[skipped_state_visible_counts += 1, continue]
K -- Exhausted or no match --> M[Append to merged_messages]
Reviews (8): Last reviewed commit: "perf: reuse prefix duplicate match durin..." | Re-trigger Greptile
ed79091 to
829c68d
Compare
Review — the prefix-match reuse is sound, and arguably a touch more correct than beforeReading What the diff actually doesThe ordered-prefix branch now records the key it already matched and reuses it for the counter, instead of rescanning: if visible_key == expected_visible_key:
replays_sidecar_prefix = True
matched_prefix_visible_key = expected_visible_key # ← exact case, no lookup
state_replay_idx += 1
else:
expected_visible_keys = {expected_visible_key}
... # fuzzy case: lookup over a 1-element set
matched_prefix_visible_key = _matching_visible_duplicate(visible_key, expected_visible_keys, expected_visible_lookup)and the counter debits that key directly ( if replays_sidecar_prefix:
if matched_prefix_visible_key is not None:
skipped_state_visible_counts[matched_prefix_visible_key] = (
skipped_state_visible_counts.get(matched_prefix_visible_key, 0) + 1Two things I checked specifically
Tests
VerdictLGTM on correctness and scope. One small note for the maintainer: per the PR's own follow-up, the headline win here is partly gated behind #4137 (which removes the worst initial-tail path before this merge loop is reached), so the two are worth benchmarking together on a real large-history load rather than only the synthetic 3k/3k probe. |
ec602a4 to
c33d347
Compare
|
Thanks — the intent (avoiding the O(n²) duplicate rescan on long replay-prefix merges) is a real cost worth fixing. But a deep re-review (rebased onto current master, full regression gate) shows this is not a pure-perf change — it alters the merge output in a fuzzy-duplicate edge case and can append a duplicate transcript row. Root cause. On matched_visible_key = _matching_visible_duplicate(visible_key, sidecar_visible_keys, sidecar_visible_lookup)This PR debits against Reproduction (Codex-verified,
Fix. For prefix-replay skips, keep debiting the full-sidecar matched key, i.e. restore: matched_visible_key = _matching_visible_duplicate(visible_key, sidecar_visible_keys, sidecar_visible_lookup)
# ...debit skipped_state_visible_counts[matched_visible_key]rather than Returning for that — happy to re-gate once the dedup bookkeeping is equivalence-proven. |
1928a22 to
50bf776
Compare
50bf776 to
3bdbc1e
Compare
Thinking Path
merge_session_messages_append_only()already checks whether a state.db row replays the next expected sidecar-visible message. When that ordered prefix check matched fuzzily, the code immediately performed a second duplicate lookup against the entire sidecar visible-key set just to update skipped counts. On long replay prefixes this becomes an O(n²) hot path.What Changed
Why It Matters
Long sidecar/state.db reconciliation can happen during full session loads, recovery, and fallback paths. Avoiding the redundant global lookup keeps the merge linear for ordered replay prefixes while leaving the existing dedupe/truncation/tool-call semantics intact.
Local synthetic timing probe:
Contract Routing
Task type: runtime/session reconciliation performance fix.
Touched areas:
api/models.pymerge_session_messages_append_only()Relevant public docs:
AGENTS.mdCONTRIBUTING.mddocs/CONTRACTS.mdScope boundaries:
Verification
python3 -m pytest tests/test_merge_prefix_duplicate_fastpath.py tests/test_merge_key_tool_calls.py tests/test_webui_state_db_reconciliation.py tests/test_session_lineage_full_transcript.py -q→38 passed in 2.53spython3 -m py_compile api/models.py tests/test_merge_prefix_duplicate_fastpath.pygit diff --checkstatic scan findings: 04594.6msbefore vs105.7msafter for 3k/3k fuzzy ordered replay rows.Risks / Follow-ups
Model Used
OpenAI GPT-5.5 via Hermes Agent WebUI, with terminal/file tooling and local pytest/timing probes.