Commit dc27b2f
perf(indexer): incremental byte-offset JSONL parse — 604x on appends (v1.1.90)
The session indexer re-read and re-parsed the WHOLE transcript on every
append (a ~4 KB append to a 211 MB live session = a ~525 ms re-parse plus
rewriting every archive + FTS row). Now it reads only what changed.
How (history.rs::index_adapter_session):
- cc_sessions gains last_indexed_byte_offset + last_indexed_line_count
  (additive migration, NOT NULL DEFAULT 0; legacy sessions full-parse once
  to set the cursor, then go incremental).
- when an indexed file only grew, seek to the saved offset, parse the tail
  up to the last newline (half-flushed events are never indexed), and merge
  deltas: append archive rows with continued message_index/source_line, bump
  day buckets, sum token totals, recompute cost from the new totals.
- shrunk/rotated file (file_size < offset) falls back to a clean full reparse.
- parse_claude_session/parse_codex_session are now thin wrappers over one
  generic adapter path.
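The cursor logic above can be sketched roughly as follows. This is a minimal illustration, not the code in history.rs: the `Cursor` struct and `read_new_lines` function are hypothetical names, and only the seek, last-newline cut-off, and shrink fallback described in the bullets are modelled (no JSON parsing, no database writes).

```rust
use std::fs::File;
use std::io::{self, Read, Seek, SeekFrom};
use std::path::Path;

/// Hypothetical stand-in for the two new cc_sessions columns.
#[derive(Debug, Clone, Copy, Default, PartialEq, Eq)]
pub struct Cursor {
    pub byte_offset: u64,
    pub line_count: u64,
}

/// Returns the complete lines added since `cursor`, plus the advanced cursor.
/// A trailing fragment with no newline yet is left for the next call.
pub fn read_new_lines(path: &Path, cursor: Cursor) -> io::Result<(Vec<String>, Cursor)> {
    let mut file = File::open(path)?;
    let file_size = file.metadata()?.len();

    // Shrunk or rotated file: drop the cursor and start again from byte 0.
    let start = if file_size < cursor.byte_offset {
        Cursor::default()
    } else {
        cursor
    };

    // Read only the bytes after the saved offset.
    file.seek(SeekFrom::Start(start.byte_offset))?;
    let mut tail = Vec::new();
    file.read_to_end(&mut tail)?;

    // Consume up to and including the last newline; ignore anything after it.
    let complete_len = match tail.iter().rposition(|&b| b == b'\n') {
        Some(pos) => pos + 1,
        None => 0,
    };

    let lines: Vec<String> = if complete_len == 0 {
        Vec::new()
    } else {
        // Drop the final newline so split() yields no trailing empty piece.
        tail[..complete_len - 1]
            .split(|&b| b == b'\n')
            .map(|l| String::from_utf8_lossy(l).into_owned())
            .collect()
    };

    let next = Cursor {
        byte_offset: start.byte_offset + complete_len as u64,
        line_count: start.line_count + lines.len() as u64,
    };
    Ok((lines, next))
}
```

Because the cursor only ever advances past a newline, a half-written event is picked up whole on a later call instead of being indexed as a fragment.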
Proven:
- incremental_index_matches_full_reindex_byte_for_byte: an incremental
  index is byte-identical to a one-shot full re-index (totals, cursor, cost,
  every archive row, day buckets).
- file_shrink_falls_back_to_full_reparse: rotation rebuilds cleanly.
- bench_incremental_reindex_vs_full (23.5 MB): full reparse 1275.9 ms →
  incremental append 2.1 ms = 604x; gap widens with size. (docs/PERFORMANCE.md)
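One plausible reason the incremental and full paths can agree exactly on cost is that cost is recomputed from the merged integer totals rather than accumulated per delta. The sketch below illustrates that idea only; `TokenTotals`, `Rates`, and `cost` are hypothetical names and the rates are made-up numbers, not the project's pricing.

```rust
/// Hypothetical running totals for one session.
#[derive(Debug, Clone, Copy, Default, PartialEq, Eq)]
pub struct TokenTotals {
    pub input: u64,
    pub output: u64,
}

impl TokenTotals {
    /// Integer sums are exact, so merge order and batching do not matter.
    pub fn merge(self, delta: TokenTotals) -> TokenTotals {
        TokenTotals {
            input: self.input + delta.input,
            output: self.output + delta.output,
        }
    }
}

/// Hypothetical price per million tokens.
pub struct Rates {
    pub input_per_mtok: f64,
    pub output_per_mtok: f64,
}

/// Cost is always derived from the totals, never summed from per-delta costs,
/// so equal totals give a bit-identical cost.
pub fn cost(t: TokenTotals, r: &Rates) -> f64 {
    (t.input as f64 * r.input_per_mtok + t.output as f64 * r.output_per_mtok) / 1_000_000.0
}
```

Summing floating-point costs delta by delta could drift from the one-shot result by rounding; deriving cost from the exact totals avoids that.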
215 tests pass. No language rewrite — this was algorithmic, exactly as the
harness predicted.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
1 parent dd54495 commit dc27b2f
6 files changed
Lines changed: 613 additions & 45 deletions
File tree
- apps/desktop
- src-tauri
- src
- commands
- db
- docs