docs: deduplicate=True merges distinct-but-similar records — ids are upsert keys, not merge guards #309

@silversurfer562

Summary

create_long_term_memory(..., deduplicate=True) (the default) performs semantic deduplication that can merge distinct-but-similar records, and supplying distinct ids does not prevent the merge. The id acts as an upsert key (an identical re-write reuses the same id, so no duplicate is created), not as a merge guard; the merge decision is driven by content.
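To make the distinction concrete, here is a toy model of the behavior as we observed it. It is not the server's implementation: ToyStore, its threshold, and the token-overlap (Jaccard) similarity are all stand-ins I made up for illustration, with Jaccard swapped in for real embedding similarity so the sketch runs without a model.

```python
import re


def _tokens(text: str) -> set[str]:
    """Lowercased word tokens; a crude stand-in for an embedding."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity in [0, 1]."""
    ta, tb = _tokens(a), _tokens(b)
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


class ToyStore:
    """Hypothetical in-memory store mimicking the observed semantics."""

    def __init__(self, threshold: float = 0.6) -> None:
        self.threshold = threshold  # assumed cutoff, not a real server setting
        self.records: dict[str, str] = {}

    def write(self, record_id: str, text: str, deduplicate: bool = True) -> str:
        # The id is an upsert key: re-writing an existing id overwrites it.
        if record_id in self.records:
            self.records[record_id] = text
            return record_id
        # The merge decision looks only at content; the new id is never consulted.
        if deduplicate:
            for existing_id, existing_text in self.records.items():
                if jaccard(existing_text, text) >= self.threshold:
                    self.records[existing_id] = existing_text + " | " + text
                    return existing_id  # the caller's new id is silently dropped
        self.records[record_id] = text
        return record_id
```

Writing two similarly phrased findings under ids "a" and "b" leaves a single record under "a" in this model; passing deduplicate=False keeps both.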

How it bit us

An accumulate-style store (session findings, several per day, often similar phrasing) lost records to merges despite unique ids. The only reliable guard was deduplicate=False + caller-supplied stable ids (Smart-AI-Memory/attune-ai#666).
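For anyone wanting the same guard, a minimal sketch of deriving caller-supplied stable ids follows. The helper names and the "id"/"text" dict keys are assumptions for illustration, not the client's record type; the actual write would still go through create_long_term_memory(..., deduplicate=False) with whatever record objects your client version expects.

```python
import hashlib


def stable_id(store: str, session_id: str, finding: str) -> str:
    """Derive a deterministic id so an identical re-write maps to the same id."""
    # The unit separator keeps ("ab", "c") and ("a", "bc") from colliding.
    payload = "\x1f".join((store, session_id, finding)).encode("utf-8")
    digest = hashlib.sha256(payload).hexdigest()
    return f"{store}-{digest[:32]}"


def build_records(store: str, session_id: str, findings: list[str]) -> list[dict[str, str]]:
    """Pair each finding with its stable id (dict shape is hypothetical)."""
    return [
        {"id": stable_id(store, session_id, text), "text": text}
        for text in findings
    ]


records = build_records(
    "session-findings",
    "2024-06-01-a",
    [
        "auth tests fail on token refresh in staging",
        "auth tests fail on token refresh in prod",
    ],
)
# These records would then be written with deduplicate=False, so that only the
# id-based upsert (and not the content-driven merge) decides what is kept.
```

Because the id is a hash of the content plus its scope, retries of the same write stay idempotent while similar-but-distinct findings keep separate ids.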

A diagnosis caveat worth passing on: eventual indexing and the relevance cutoff in semantic search make this easy to misread when probing. Counting via an empty-text search until the count stabilized was the only reliable verification method we found.
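The count-until-stable check can be sketched roughly as below. count_fn is a hypothetical callable that you would wire to an empty-text search returning the current record count; the parameter names and defaults are assumptions, not values we tuned.

```python
import time
from typing import Callable, Optional


def count_until_stable(
    count_fn: Callable[[], int],
    required_stable: int = 3,
    interval_s: float = 1.0,
    max_polls: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll count_fn until it returns the same value required_stable times in a row."""
    last: Optional[int] = None
    streak = 0
    for _ in range(max_polls):
        current = count_fn()
        if current == last:
            streak += 1
        else:
            # The count moved, so indexing is still catching up; restart the streak.
            last, streak = current, 1
        if streak >= required_stable:
            return current
        sleep(interval_s)
    raise TimeoutError(f"count did not stabilize within {max_polls} polls")
```

Comparing the stabilized count against the number of records written is what exposes a merge; a single early probe can undercount simply because indexing has not finished.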

Ask

Document the dedup semantics explicitly: (a) deduplication is semantic and content-driven; (b) ids don't opt a record out; (c) accumulate-style consumers should pass deduplicate=False. A per-record opt-out (or an id-respecting mode) would be a nice enhancement, but the doc note alone would prevent the data-loss surprise.

Server + client 0.14.0, redis-stack, Ollama embeddings. Happy to PR the docs.
