perf: reuse prefix duplicate match during message merge by ai-ag2026 · Pull Request #4138 · nesquena/hermes-webui

ai-ag2026 · 2026-06-13T19:47:26Z

Thinking Path

merge_session_messages_append_only() already checks whether a state.db row replays the next expected sidecar-visible message. When that ordered prefix check matched fuzzily, the code immediately performed a second duplicate lookup against the entire sidecar visible-key set just to update skipped counts. On long replay prefixes this becomes an O(n²) hot path.

What Changed

Reuse the visible key found by the ordered prefix check instead of rescanning all sidecar visible keys for the same state row.
Preserve the existing global duplicate lookup for non-prefix rows.
Add a regression test that makes fuzzy ordered-prefix replay rows fail if they invoke broad sidecar duplicate lookup.
Add an Unreleased changelog note.
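
To make the cost difference concrete, here is a minimal toy model of the change, not the code in api/models.py. The prefix-based `fuzzy_equal` rule, the linear `find_duplicate` scan, and the `replay_prefix` helper are all assumptions made for illustration; the real lookup structure is not shown in this thread.

```python
from typing import Dict, List, Optional

comparisons = 0  # counts fuzzy comparisons, for illustration only


def fuzzy_equal(a: str, b: str) -> bool:
    """Toy fuzzy rule (an assumption): one key is a prefix of the other."""
    global comparisons
    comparisons += 1
    return a.startswith(b) or b.startswith(a)


def find_duplicate(key: str, candidates: List[str]) -> Optional[str]:
    """Return the first candidate that fuzzily matches key, else None."""
    for candidate in candidates:
        if fuzzy_equal(key, candidate):
            return candidate
    return None


def replay_prefix(sidecar: List[str], state: List[str], reuse_match: bool) -> int:
    """Walk an ordered replay prefix and return the number of comparisons made."""
    global comparisons
    comparisons = 0
    replay_idx = 0
    skipped: Dict[str, int] = {}
    for key in state:
        if replay_idx >= len(sidecar):
            break
        matched = find_duplicate(key, [sidecar[replay_idx]])  # ordered prefix check
        if matched is None:
            break
        replay_idx += 1
        if not reuse_match:
            # Old shape: look the same row up again across every sidecar key.
            matched = find_duplicate(key, sidecar)
        skipped[matched] = skipped.get(matched, 0) + 1
    return comparisons


n = 200
sidecar_rows = [f"message {i:04d} full text" for i in range(n)]
state_rows = [f"message {i:04d}" for i in range(n)]
print(replay_prefix(sidecar_rows, state_rows, reuse_match=False))  # grows roughly with n * n
print(replay_prefix(sidecar_rows, state_rows, reuse_match=True))   # grows with n
```

In this toy, reusing the prefix match costs one comparison per state row, while the rescan adds a scan whose length grows with the row's position in the sidecar.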

Why It Matters

Long sidecar/state.db reconciliation can happen during full session loads, recovery, and fallback paths. Avoiding the redundant global lookup keeps the merge linear for ordered replay prefixes while leaving the existing dedupe/truncation/tool-call semantics intact.

Local synthetic timing probe:

Before: 3,000 sidecar rows + 3,000 fuzzy ordered state replay rows → ~4,595ms, ~22.9M function calls.
After: same probe → ~106ms, ~321k function calls.

Contract Routing

Task type: runtime/session reconciliation performance fix.
Touched areas:

api/models.py merge_session_messages_append_only()
session sidecar/state.db reconciliation
Relevant public docs:
AGENTS.md
CONTRIBUTING.md
docs/CONTRACTS.md
Scope boundaries:
Does not change full transcript inputs, truncation watermark rules, message-id dedupe, tool-call keying, or non-prefix duplicate detection.
Only avoids repeating an already-known prefix duplicate lookup.

Verification

python3 -m pytest tests/test_merge_prefix_duplicate_fastpath.py tests/test_merge_key_tool_calls.py tests/test_webui_state_db_reconciliation.py tests/test_session_lineage_full_transcript.py -q → 38 passed in 2.53s
python3 -m py_compile api/models.py tests/test_merge_prefix_duplicate_fastpath.py
git diff --check
Added-line static risk scan → static scan findings: 0
Synthetic timing probe: 4594.6ms before vs 105.7ms after for 3k/3k fuzzy ordered replay rows.

Risks / Follow-ups

The patch intentionally keeps broad duplicate lookup for non-prefix rows; only the already matched ordered-prefix row reuses its known expected key.
Follow-up work could benchmark real large fallback/full-history loads after #4137 (perf: skip lineage sidecar hydration for session tails) lands, because #4137 removes the worst initial-tail path before this merge path is reached.

Model Used

OpenAI GPT-5.5 via Hermes Agent WebUI, with terminal/file tooling and local pytest/timing probes.

greptile-apps · 2026-06-13T19:58:50Z

Greptile Summary

This PR eliminates an O(n²) hot path in merge_session_messages_append_only() by reusing the visible key already found during the ordered prefix check, rather than rescanning the entire sidecar visible-key set for each prefix-matched state row. A prefix_visible_lookup_cache is also introduced to avoid rebuilding single-element duplicate lookups when the same expected key recurs.

api/models.py: Splits the old combined exact+fuzzy prefix guard into two explicit branches; the fuzzy branch builds and caches a single-key lookup, captures matched_prefix_visible_key from the result, and skips the full-sidecar rescan entirely.
tests/test_merge_prefix_duplicate_fastpath.py: Two new tests — one monkeypatch-based check confirming no broad sidecar lookup fires on prefix rows, and one exercising the duplicate-budget counter.
CHANGELOG.md: Adds an Unreleased entry describing the O(n²) → linear improvement.
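
The per-call cache described above can be pictured with a small sketch. `build_lookup` is a hypothetical stand-in for the real lookup builder (its actual output shape is not shown in this thread), and `lookup_for_expected` is a name invented here; the only point is that a single-key lookup is built at most once per distinct expected key.

```python
from typing import Dict, FrozenSet

build_calls = 0  # counts lookup builds, for illustration only


def build_lookup(keys: FrozenSet[str]) -> Dict[str, str]:
    """Hypothetical stand-in for a duplicate-lookup builder: normalized text -> key."""
    global build_calls
    build_calls += 1
    return {key.strip().lower(): key for key in keys}


def lookup_for_expected(expected_key: str, cache: Dict[str, Dict[str, str]]) -> Dict[str, str]:
    """Return the single-key lookup for expected_key, building it at most once."""
    lookup = cache.get(expected_key)
    if lookup is None:
        lookup = build_lookup(frozenset({expected_key}))
        cache[expected_key] = lookup
    return lookup


cache: Dict[str, Dict[str, str]] = {}  # created per merge call, so it cannot outlive the call
for expected in ["Hello", "Hello", "World", "Hello"]:
    lookup_for_expected(expected, cache)
print(build_calls)  # 2: one build per distinct expected key
```

Because the dictionary is local to one call and holds at most one entry per distinct expected key, its size is bounded by the sidecar length.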

Confidence Score: 5/5

Safe to merge — the change only affects how the prefix-replay skip-count is attributed and avoids a redundant full-sidecar rescan; all non-prefix dedup logic is untouched.

The optimization is logically correct: the ordered prefix check already pins the only sidecar position the state row can match, so reusing expected_visible_key is both faster and more accurate than letting the full-sidecar fuzzy scan pick an arbitrary matching key. The old code had a latent mis-attribution bug that the new code inadvertently fixes. The cache is function-local and bounded.

No files require special attention. The test for duplicate-budget attribution is slightly under-specified, but this does not affect correctness of the production path.

Important Files Changed

Filename	Overview
api/models.py	Core optimization: splits exact/fuzzy prefix detection into two branches, adds a per-call prefix_visible_lookup_cache, and correctly attributes skipped_state_visible_counts to expected_visible_key. Non-prefix dedup path is unchanged.
tests/test_merge_prefix_duplicate_fastpath.py	Two new tests covering the no-broad-scan guarantee and duplicate-budget attribution for fuzzy prefix replay rows.
CHANGELOG.md	Adds a clear Unreleased entry under Fixed describing the O(n²) to linear improvement.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[state row: visible_key] --> B{state_replay_idx < len sidecar_visible_sequence?}
    B -- No --> G[Non-prefix dedup path: full sidecar scan]
    B -- Yes --> C{visible_key == expected_visible_key?}
    C -- Exact match --> D[replays_sidecar_prefix = True, matched_prefix_visible_key = expected_visible_key, state_replay_idx++]
    C -- Fuzzy check --> E{prefix_visible_lookup_cache hit?}
    E -- Miss --> F[_build_visible_duplicate_lookup, cache result]
    E -- Hit --> H[_matching_visible_duplicate against single-key set]
    F --> H
    H -- matched --> I[replays_sidecar_prefix = True, state_replay_idx++]
    H -- no match --> G
    D --> J[skipped_state_visible_counts expected_visible_key += 1, continue]
    I --> J
    G --> K{matched in full sidecar set?}
    K -- Yes, budget left --> L[skipped_state_visible_counts += 1, continue]
    K -- Exhausted or no match --> M[Append to merged_messages]


nesquena-hermes · 2026-06-13T21:12:24Z

Review — the prefix-match reuse is sound, and arguably a touch more correct than before

Reading api/models.py:4469-4502 on the PR head against origin/master, the optimization holds up. The hot path it removes is real: on the old code, every ordered-prefix match still did a second _matching_visible_duplicate(visible_key, sidecar_visible_keys, …) against the entire sidecar key set purely to debit the skip counter, which is the O(n²) behavior on long replay prefixes.

What the diff actually does

The ordered-prefix branch now records the key it already matched and reuses it for the counter, instead of rescanning:

if visible_key == expected_visible_key:
    replays_sidecar_prefix = True
    matched_prefix_visible_key = expected_visible_key   # ← exact case, no lookup
    state_replay_idx += 1
else:
    expected_visible_keys = {expected_visible_key}
    ...                                                  # fuzzy case: lookup over a 1-element set
    matched_prefix_visible_key = _matching_visible_duplicate(visible_key, expected_visible_keys, expected_visible_lookup)

and the counter debits that key directly (models.py:4500-4502):

if replays_sidecar_prefix:
    if matched_prefix_visible_key is not None:
        skipped_state_visible_counts[matched_prefix_visible_key] = (
            skipped_state_visible_counts.get(matched_prefix_visible_key, 0) + 1
        )

Two things I checked specifically

Not a dead-variable refactor. sidecar_visible_lookup / sidecar_visible_keys are still consumed by the non-prefix duplicate path (models.py:4570-4573), and the skipped_count < sidecar_count budget there is untouched — so the broad lookup is preserved exactly where the PR claims (non-prefix rows only).
The counter target is at least as correct as before. The old global _matching_visible_duplicate could debit some matching sidecar key, not necessarily the one at the current replay index; the new code debits expected_visible_key — the specific message the prefix cursor is consuming. For sessions with repeated visible payloads (your test_prefix_replay_counts_matched_sidecar_key_for_later_duplicate_skips covers exactly this), debiting the indexed key is the more defensible choice, and the duplicate-budget skip downstream still works because both paths feed the same skipped_state_visible_counts.

Tests

test_prefix_replay_reuses_expected_visible_key_without_global_lookup monkeypatches _matching_visible_duplicate and asserts no call ever arrives with len(visible_keys) > 1 during a 64-row fuzzy ordered prefix — a clean behavioral guard that the broad scan is gone rather than just an output check. The second test pins the duplicate-budget semantics. Both are the right shape. CI is green across 3.11/3.12/3.13.

Verdict

LGTM on correctness and scope. One small note for the maintainer: per the PR's own follow-up, the headline win here is partly gated behind #4137 (which removes the worst initial-tail path before this merge loop is reached), so the two are worth benchmarking together on a real large-history load rather than only the synthetic 3k/3k probe.

nesquena-hermes · 2026-06-14T18:27:11Z

Thanks — the intent (avoiding the O(n²) duplicate rescan on long replay-prefix merges) is a real cost worth fixing. But a deep re-review (rebased onto current master, full regression gate) shows this is not a pure-perf change — it alters the merge output in a fuzzy-duplicate edge case and can append a duplicate transcript row.

Root cause. On origin/master, when a state row replays the sidecar prefix, the skipped-count is debited against the key found by matching the row against the entire sidecar visible-key set:

matched_visible_key = _matching_visible_duplicate(visible_key, sidecar_visible_keys, sidecar_visible_lookup)

This PR debits against matched_prefix_visible_key, which is matched only against the singleton {expected_visible_key} at the current state_replay_idx. When a state row fuzzy-matches the expected prefix position but the actual sidecar duplicate lives at a different position, the wrong key gets debited from skipped_state_visible_counts, so the later dedup guard (#3346) no longer suppresses the real duplicate.

Reproduction (Codex-verified, PYTHONHASHSEED=4):

sidecar visible = ["alpha beta", "alpha gamma"]
state = ["alpha", "alpha beta"]
origin/master: output stays the two sidecar rows.
this PR: output appends a second "alpha beta" row (duplicate).
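
The bookkeeping difference behind this reproduction can be sketched with a toy model. This is not the code in api/models.py: the prefix-based `fuzzy_equal` rule, the explicit `scan_order` list standing in for set iteration order, and the `toy_merge` helper are all assumptions, chosen only to show how debiting a different key changes the later budget check.

```python
from typing import Dict, List, Optional


def fuzzy_equal(a: str, b: str) -> bool:
    """Toy fuzzy rule (an assumption): one key is a prefix of the other."""
    return a.startswith(b) or b.startswith(a)


def first_match(key: str, candidates: List[str]) -> Optional[str]:
    """Return the first candidate, in the given order, that fuzzily matches key."""
    return next((c for c in candidates if fuzzy_equal(key, c)), None)


def toy_merge(sidecar: List[str], state: List[str], scan_order: List[str],
              debit_expected: bool) -> List[str]:
    """Append-only merge sketch. debit_expected=True mimics debiting the prefix key."""
    merged = list(sidecar)
    sidecar_counts: Dict[str, int] = {}
    for key in sidecar:
        sidecar_counts[key] = sidecar_counts.get(key, 0) + 1
    skipped: Dict[str, int] = {}
    replay_idx = 0
    for key in state:
        if replay_idx < len(sidecar) and fuzzy_equal(key, sidecar[replay_idx]):
            # Ordered-prefix replay: skip the row and debit one key's budget.
            debit = sidecar[replay_idx] if debit_expected else first_match(key, scan_order)
            replay_idx += 1
            skipped[debit] = skipped.get(debit, 0) + 1
            continue
        # Non-prefix row: skip only while the matched key still has budget left.
        match = first_match(key, scan_order)
        if match is not None and skipped.get(match, 0) < sidecar_counts[match]:
            skipped[match] = skipped.get(match, 0) + 1
            continue
        merged.append(key)
    return merged


sidecar_rows = ["alpha beta", "alpha gamma"]
state_rows = ["alpha", "alpha beta"]
order = ["alpha gamma", "alpha beta"]  # assumed iteration order of the full key set

print(toy_merge(sidecar_rows, state_rows, order, debit_expected=False))
# ['alpha beta', 'alpha gamma']
print(toy_merge(sidecar_rows, state_rows, order, debit_expected=True))
# ['alpha beta', 'alpha gamma', 'alpha beta']
```

In this toy the full-set debit only keeps the output at two rows when the scan reaches "alpha gamma" first, so the result depends on the order in which the key set is visited.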

Fix. For prefix-replay skips, keep debiting the full-sidecar matched key, i.e. restore:

matched_visible_key = _matching_visible_duplicate(visible_key, sidecar_visible_keys, sidecar_visible_lookup)
# ...debit skipped_state_visible_counts[matched_visible_key]

rather than matched_prefix_visible_key. Note this is in tension with the perf goal for the fuzzy branch (it reintroduces the full-set lookup there), so the honest options are: (a) keep the fast path only for the exact-== prefix case and fall back to the original full-set debit for the fuzzy branch, or (b) drop the "pure perf / no behavior change" framing and add explicit tests for ambiguous fuzzy duplicates that lock in the intended output. A behavioral test asserting the merge output for the repro above (and a few fuzzy-duplicate permutations) would make whichever choice safe to land.
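
Option (a) could look roughly like the sketch below. `choose_debit_key` and `recording_match` are hypothetical names, and the sketch assumes that an exact match in the full set would resolve to the same key as `expected_key`, which would need to be confirmed against the real lookup.

```python
from typing import Callable, List, Optional


def choose_debit_key(
    visible_key: str,
    expected_key: str,
    full_set_match: Callable[[str], Optional[str]],
) -> Optional[str]:
    """Pick the key whose skip budget is debited for a prefix-replay row."""
    if visible_key == expected_key:
        # Exact prefix match: reuse the expected key, no lookup needed.
        return expected_key
    # Fuzzy prefix match: fall back to the original full-set lookup.
    return full_set_match(visible_key)


calls: List[str] = []


def recording_match(key: str) -> Optional[str]:
    """Hypothetical full-set matcher that records each invocation."""
    calls.append(key)
    return "alpha gamma"


print(choose_debit_key("alpha beta", "alpha beta", recording_match))  # alpha beta
print(choose_debit_key("alpha", "alpha beta", recording_match))       # alpha gamma
print(calls)  # ['alpha']
```

This keeps the fast path for exact replays, which is likely the common case for unmodified transcripts, and accepts the full-set cost only on fuzzy rows.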

Returning for that — happy to re-gate once the dedup bookkeeping is equivalence-proven.

greptile-apps Bot reviewed Jun 13, 2026


ai-ag2026 force-pushed the perf/merge-visible-duplicate-fastpath branch from ed79091 to 829c68d Compare June 13, 2026 20:12

ai-ag2026 force-pushed the perf/merge-visible-duplicate-fastpath branch 3 times, most recently from ec602a4 to c33d347 Compare June 14, 2026 15:51

nesquena-hermes mentioned this pull request Jun 14, 2026

perf: skip lineage sidecar hydration for session tails #4137

Open

nesquena-hermes added the changes-requested label (Maintainer left detailed feedback requesting changes; PR is waiting on author to address) Jun 14, 2026

ai-ag2026 force-pushed the perf/merge-visible-duplicate-fastpath branch 2 times, most recently from 1928a22 to 50bf776 Compare June 14, 2026 19:10

nesquena-hermes mentioned this pull request Jun 15, 2026

fix(sessions): paginate tool-heavy session tails by visible messages (#4069) #4215

Merged

perf: reuse prefix duplicate match during message merge

3bdbc1e

ai-ag2026 force-pushed the perf/merge-visible-duplicate-fastpath branch from 50bf776 to 3bdbc1e Compare June 15, 2026 13:05

ai-ag2026 wants to merge 1 commit into nesquena:master from ai-ag2026:perf/merge-visible-duplicate-fastpath


2 participants
