Summary
create_long_term_memory(..., deduplicate=True) (the default) performs semantic deduplication that can merge distinct-but-similar records, and supplying distinct ids does not prevent the merge. The id acts only as an upsert key (rewriting an identical record under the same id creates no duplicate); it is not a merge guard. The merge decision is driven by content.
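To make the distinction between "upsert key" and "merge guard" concrete, here is a toy in-memory model of the behavior described above. It is a sketch of what we observed, not the server's implementation: ToyStore, the Jaccard token-overlap similarity (standing in for embedding similarity), and the 0.6 threshold are all assumptions made up for illustration.

```python
from dataclasses import dataclass, field


def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity; a crude stand-in for embedding similarity."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 1.0


@dataclass
class ToyStore:
    """Hypothetical store that mimics the observed dedup behavior."""

    threshold: float = 0.6  # assumed merge cutoff, not a real server setting
    records: dict[str, str] = field(default_factory=dict)

    def create(self, record_id: str, text: str, deduplicate: bool = True) -> str:
        """Store a record and return the id it ends up under."""
        # Same id: plain upsert, no duplicate is created.
        if record_id in self.records:
            self.records[record_id] = text
            return record_id
        # New id: the merge decision looks only at content, never at the id.
        if deduplicate:
            for existing_id, existing_text in self.records.items():
                if jaccard(existing_text, text) >= self.threshold:
                    self.records[existing_id] = f"{existing_text} | {text}"
                    return existing_id  # the caller's new id is dropped
        self.records[record_id] = text
        return record_id


store = ToyStore()
store.create("finding-1", "login test flaky on ci runner")
landed = store.create("finding-2", "login test flaky on ci runner again")
print(landed, len(store.records))
```

In this model the second write lands under finding-1 and the store holds one record, even though the caller supplied a distinct id. Passing deduplicate=False on the second call keeps both.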
How it bit us
An accumulate-style store (session findings, several per day, often similarly phrased) lost records to merges despite unique ids. The only reliable guard was deduplicate=False combined with caller-supplied stable ids (Smart-AI-Memory/attune-ai#666).
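For anyone adopting the same guard, one way to produce caller-supplied stable ids is to derive them deterministically from the record itself, so an identical re-write maps to the same id while any change in wording yields a new one. This is a minimal sketch, not necessarily what the linked fix does; FINDINGS_NAMESPACE, stable_finding_id, and the example.invalid URL are hypothetical names chosen for illustration. Only the standard-library uuid module is used.

```python
import uuid

# Hypothetical application namespace; any fixed UUID would do.
FINDINGS_NAMESPACE = uuid.uuid5(
    uuid.NAMESPACE_URL, "https://example.invalid/session-findings"
)


def stable_finding_id(session_id: str, text: str) -> str:
    """Derive a deterministic id from the session and the finding text."""
    # The unit separator keeps ("ab", "c") and ("a", "bc") from colliding.
    key = f"{session_id}\x1f{text.strip()}"
    return str(uuid.uuid5(FINDINGS_NAMESPACE, key))


first = stable_finding_id("session-42", "login test flaky on ci runner")
again = stable_finding_id("session-42", "login test flaky on ci runner")
other = stable_finding_id("session-42", "login test flaky on ci runner again")
print(first == again, first == other)
```

The resulting string would be passed as the record id alongside deduplicate=False, so that the id alone decides whether a write is an upsert or a new record.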
A diagnostic caveat worth passing on: eventually consistent indexing and the relevance cutoff in semantic search make this easy to misread when probing. The only reliable verification method we found was counting via an empty-text search, repeated until the count stabilized.
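The "count until stable" probe can be sketched as a small polling helper. The helper takes a caller-supplied count_fn rather than calling the client directly; in our setup that function wrapped an empty-text search and returned the number of hits, but that wiring is left out here because it depends on the client version. The name wait_for_stable_count and its parameters are assumptions for illustration.

```python
import time
from typing import Callable


def wait_for_stable_count(
    count_fn: Callable[[], int],
    stable_reads: int = 3,
    interval_s: float = 0.5,
    timeout_s: float = 30.0,
) -> int:
    """Poll count_fn until it returns the same value stable_reads times in a row."""
    deadline = time.monotonic() + timeout_s
    last = count_fn()
    streak = 1
    while streak < stable_reads:
        if time.monotonic() >= deadline:
            raise TimeoutError(f"count did not stabilize; last value was {last}")
        time.sleep(interval_s)
        current = count_fn()
        # Any change resets the streak, since indexing may still be catching up.
        streak = streak + 1 if current == last else 1
        last = current
    return last


# Simulated readings while an index catches up (made-up numbers).
readings = iter([1, 2, 3, 3, 3])
print(wait_for_stable_count(lambda: next(readings), interval_s=0.0))
```

Comparing the stabilized count against the number of records written is what exposed the merges; a single read right after writing was not trustworthy in either direction.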
Ask
Document the dedup semantics explicitly: (a) deduplication is semantic and content-driven; (b) ids don't opt a record out; (c) accumulate-style consumers should pass deduplicate=False. A per-record opt-out (or an id-respecting mode) would be a nice enhancement, but the doc note alone would prevent the data-loss surprise.
Server + client 0.14.0, redis-stack, Ollama embeddings. Happy to PR the docs.