Summary
Every --rebuild and --rebuild-from-meili run permanently discards all accumulated first_seen/last_seen history. This is a regression introduced in the v0.2606.0 refactor of index_kvrocks.py.
Root cause
tools/index_kvrocks.py:843–847:
seen_snapshot = (
    rebuild_kvrocks(indexer, include_tags=args.retag)  # always returns None
    if args.rebuild or args.rebuild_from_meili
    else None
)
rebuild_kvrocks() ends with an implicit return None. The return value is assigned to seen_snapshot, which is forwarded to apply_seen_snapshot(). That function bails immediately when seen_snapshot is falsy (line 307), so no timestamps are ever restored after a rebuild.
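The failure mode can be reproduced in isolation. The sketch below is a toy model, not the code in tools/index_kvrocks.py: the function bodies, the reduced signatures, and the dict used in place of the Kvrocks index are all assumptions made for illustration. It only shows how assigning the result of a function that returns None leaves the restore step with nothing to apply.

```python
# Toy stand-ins: the real functions live in tools/index_kvrocks.py and
# their bodies and full signatures are not shown in this report.
def rebuild_kvrocks(index):
    index.clear()  # wipes keys, then falls off the end: implicit return None


def apply_seen_snapshot(index, seen_snapshot):
    if not seen_snapshot:  # falsy snapshot: bail out, nothing to restore
        return 0
    for uid, stamps in seen_snapshot.items():
        index[uid] = dict(stamps)
    return len(seen_snapshot)


# A dict stands in for the live index (assumption for illustration).
index = {"uid-1": {"first_seen": 100, "last_seen": 200}}

seen_snapshot = rebuild_kvrocks(index)  # the buggy assignment: always None
restored = apply_seen_snapshot(index, seen_snapshot)
```

After this runs, seen_snapshot is None, restored is 0, and the index no longer holds the timestamps for uid-1.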
The function snapshot_seen_values() still exists at line 414 and correctly captures timestamps from the live Kvrocks index before keys are cleared. The pre-v0.2606.0 flow called it, but that call was silently removed during the refactor.
first_seen is documented in CLAUDE.md as accumulated Kvrocks state that cannot be reconstructed from Meilisearch alone. This regression means every operator-triggered rebuild permanently destroys historical first-seen timestamps with no warning.
Note: reimport_port_dump.py (line 666) explicitly deletes doc:* keys before calling the same pipeline, so the in-place-merge fallback documented in the rebuild_kvrocks() comment is also unavailable for the port-dump path.
Fix
seen_snapshot = None
if args.rebuild or args.rebuild_from_meili:
    seen_snapshot = snapshot_seen_values(indexer)  # capture before clearing
    rebuild_kvrocks(indexer, include_tags=args.retag)
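A regression test could pin the ordering so the snapshot call cannot be dropped again unnoticed. The following is a minimal sketch using unittest.mock. The run_rebuild wrapper is hypothetical (the real logic sits inline in the script's argument handling), and the mocks replace the real snapshot_seen_values() and rebuild_kvrocks().

```python
from unittest import mock


def run_rebuild(indexer, snapshot_seen_values, rebuild_kvrocks, retag=False):
    """Hypothetical wrapper mirroring the fixed flow: snapshot, then rebuild."""
    seen_snapshot = snapshot_seen_values(indexer)  # capture before clearing
    rebuild_kvrocks(indexer, include_tags=retag)
    return seen_snapshot


# A parent mock records calls to its children in order of invocation.
parent = mock.Mock()
parent.snapshot_seen_values.return_value = {"uid-1": {"first_seen": 100}}

indexer = object()  # placeholder; the real indexer type is not shown here
result = run_rebuild(indexer, parent.snapshot_seen_values, parent.rebuild_kvrocks)

# Entries in mock_calls are (name, args, kwargs) tuples; keep only the names.
call_order = [c[0] for c in parent.mock_calls]
```

The test would then assert that call_order lists snapshot_seen_values before rebuild_kvrocks and that result is the non-empty snapshot.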
File
tools/index_kvrocks.py
Verification
Run --rebuild-from-meili on a populated index. Record first_seen for a sample of UIDs before the run, then confirm the same UIDs carry identical first_seen values afterwards.
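The before/after comparison can be scripted. The sketch below assumes the sampled values have already been exported into two plain dicts keyed by UID. How they are read from Kvrocks is left open, and the helper name changed_first_seen is hypothetical.

```python
def changed_first_seen(before, after):
    """Return UIDs whose first_seen is missing or different after the rebuild."""
    return sorted(
        uid
        for uid, stamps in before.items()
        if after.get(uid, {}).get("first_seen") != stamps.get("first_seen")
    )


# Illustrative sample data, not taken from a real index.
before = {"uid-1": {"first_seen": 100}, "uid-2": {"first_seen": 150}}
after = {"uid-1": {"first_seen": 100}, "uid-2": {"first_seen": 999}}

regressed = changed_first_seen(before, after)
```

An empty list means every sampled UID kept its first_seen value. With the sample data above, regressed contains only uid-2.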