- Drop the forced two-category split. memsearch is both an engine and
a set of native plugins, and several other projects (notably qmd)
straddle the same line, so the grouping was misleading.
- Fix qmd's characterization. It is tobi/qmd — a local CLI engine
plus MCP server (stdio/HTTP) plus a Claude Code skill wrapper, with
BM25 + dense + a local GGUF LLM reranker. The earlier page called
it an MCP-only or coding-CLI plugin project, which was wrong.
- Include qmd in all three detailed tables (integration surface,
write semantics, search & retrieval) so the matrix is complete.
- Adjust the differentiator section to stop overclaiming: qmd also
does hybrid search and treats markdown as the source of truth, so
those rows now name qmd as a peer instead of implying uniqueness.
- Drop the 'benchmark numbers are not comparable' disclaimer — it
read defensively and the comparison is already about architectural
shape, not accuracy.
Signed-off-by: Cheney Zhang <chen.zhang@zilliz.com>
-This page compares **memsearch** with other open-source memory solutions for LLMs and AI agents. Each project solves a real problem, and the right choice depends on *what kind of agent you are building* and *how you want memory to be stored*.
-
-We group the projects below into two categories:
-
-- **Coding-CLI memory plugins** — attach to an existing agent CLI (Claude Code, Codex, OpenCode, OpenClaw, Cursor, …) and give it persistent memory: memsearch, [claude-mem](https://github.com/thedotmack/claude-mem), [qmd](https://github.com/nomenclator-ninja/qmd), [MemPalace](https://github.com/milla-jovovich/mempalace).
-- **General-purpose agent memory systems** — memory libraries or agent runtimes you build applications on top of: [mem0](https://github.com/mem0ai/mem0), [Letta / MemGPT](https://github.com/letta-ai/letta).
+This page compares **memsearch** with other open-source memory solutions for LLMs and AI agents. memsearch sits on both sides of the stack at once — it is a Python library / CLI engine *and* a set of native plugins for four coding CLIs — so below we compare it against projects that sit anywhere along that spectrum: [claude-mem](https://github.com/thedotmack/claude-mem), [qmd](https://github.com/tobi/qmd), [MemPalace](https://github.com/milla-jovovich/mempalace), [mem0](https://github.com/mem0ai/mem0), and [Letta / MemGPT](https://github.com/letta-ai/letta).
-|**Local-first default**| ONNX bge-m3 (no API key) | WASM MiniLM | Local GGUF | Local ChromaDB + Llama | Requires LLM API for every write | Depends on configured backend |
+|**Local-first default**| ONNX bge-m3 (no API key) | WASM MiniLM | Local GGUF (embedding + reranker) | Local ChromaDB + Llama | Requires LLM API for every write | Depends on configured backend |
 |**Scale path**| Milvus Lite → Server → Zilliz Cloud (same API) | Single machine | Single machine | Single machine | Depends on chosen vector DB | Postgres / pgvector |

-> None of the benchmark numbers published by individual projects (LOCOMO, LongMemEval, etc.) are directly comparable because the evaluation setups differ. We do not claim a benchmark win here — the comparison is about *architectural shape*, not accuracy numbers.
-
 ---

 ## Quick orientation

 ### memsearch

-A cross-platform semantic memory plugin for coding-CLI agents. Ships **native plugins** for Claude Code, OpenClaw, OpenCode, and Codex CLI (not MCP adapters — actual per-platform hooks and skills). Stores memory as plain markdown daily logs; Milvus is a derived hybrid-search index rebuildable from the markdown at any time.
+Both a search engine and a set of native plugins. The core is a Python library + `memsearch` CLI that indexes markdown into Milvus with hybrid search. On top of that, memsearch ships first-class plugins for Claude Code, OpenClaw, OpenCode, and Codex CLI — each plugin hooks into the host CLI's lifecycle (session start / prompt submit / stop / session end) to auto-capture memory and inject cold-start context. Markdown daily logs are the canonical store; Milvus is a derived index that can be rebuilt any time.

 ### claude-mem

 Memory for Claude Code only. Hooks compress session transcripts using an LLM and store the result in ChromaDB + SQLite. Storage is opaque (binary DB), Claude Code–specific.

 ### qmd

-Local-first MCP search engine for markdown notes. Read-only — it searches existing markdown; capture is left to the user or external tools. Share the same markdown-as-source-of-truth philosophy as memsearch.
+A local-first search engine for markdown notes. Ships as a CLI + MCP server (stdio or HTTP) and a Claude Code skill wrapper. Uses BM25 + dense + an LLM reranker, all running locally via `node-llama-cpp` with GGUF models. It is a **search engine**, not a memory writer — you (or another tool) are responsible for producing the markdown it indexes.

 ### MemPalace

 A memory server organized around the *method of loci* ("wings → halls → rooms"). Stores conversations raw in ChromaDB without LLM extraction, then exposes them to chat clients (Claude Code, ChatGPT, Cursor) via MCP. Runs fully offline with local Llama + ChromaDB.

 ### mem0

-A general-purpose memory layer for LLM applications (not tied to any specific coding CLI). Every write goes through an LLM that extracts entities and relationships, decides whether to add / update / delete existing memories, and stores the results in a configurable vector DB — optionally mirrored to a graph DB (Neo4j, Memgraph, Neptune, Kuzu, AGE). Published as a Python/JS SDK, REST API, hosted platform, and (via OpenMemory) an MCP server.
+A general-purpose memory layer for LLM applications, not tied to any specific coding CLI. Every write goes through an LLM that extracts entities and relationships, decides whether to add / update / delete existing memories, and stores the results in a configurable vector DB — optionally mirrored to a graph DB (Neo4j, Memgraph, Neptune, Kuzu, AGE). Published as a Python / JS SDK, REST API, hosted platform, and (via OpenMemory) an MCP server.

 ### Letta (formerly MemGPT)

@@ -57,37 +50,37 @@ A full **agent framework and server** built around the "LLM as an operating syst
-"Native plugin" means memsearch participates in the CLI's own lifecycle events (SessionStart, UserPromptSubmit, Stop, SessionEnd, …) with collection naming, per-project isolation, and skill registration. Generic MCP integrations only expose tools to the LLM — they cannot write daily memory notes at the end of a session, or inject cold-start context at session start.
+"Native plugin" here means participating in a CLI's own lifecycle events (SessionStart, UserPromptSubmit, Stop, SessionEnd, …) with collection naming, per-project isolation, and skill registration. Generic MCP integrations only expose tools to the LLM — they cannot write daily memory notes at the end of a session, or inject cold-start context at session start.

 ### Memory write semantics

-|| memsearch |mem0| MemPalace | Letta |
-|---|---|---|---|---|
-| Who decides what to store? | Session-end hook summarizes the last turn as third-person notes |An LLM extracts "salient facts" on every write | Nobody — raw transcript is stored as-is | The agent itself, via tool calls during the reasoning loop |
-| Updates to prior memories? | Append-only (never mutates history) | LLM may update or delete prior memories during the update phase| Append-only| Agent can rewrite core memory blocks at any time |
-| LLM cost per write | One small Haiku call per turn (async, non-blocking) | LLM extraction call(s) per write| None (no LLM on the write path)| Depends on the agent loop — each self-edit is an LLM tool call |
-| Auditability |`git log` on `memory/YYYY-MM-DD.md`| Inspect rows in the vector/graph DB| Inspect ChromaDB| Inspect Postgres tables |
+| Who decides what to store? | Session-end hook summarizes the last turn as third-person notes | LLM-compressed session transcript written by hooks | Nobody — qmd is a read-only search engine over markdown you already have | Nobody — raw transcript stored as-is| LLM extracts "salient facts" on every write| The agent itself, via tool calls during the reasoning loop |
+| Updates to prior memories? | Append-only (never mutates history) |Each session appends a new compressed record | N/A (no write path) | Append-only |LLM may update or delete prior memories during the update phase | Agent can rewrite core memory blocks at any time |
+| LLM cost per write | One small Haiku call per turn (async, non-blocking) | LLM compression at session end| None | None | LLM extraction call(s) per write| Depends on the agent loop — each self-edit is an LLM tool call |
+| Auditability |`git log` on `memory/YYYY-MM-DD.md`| Inspect SQLite + ChromaDB |`git log` on your markdown (whatever produced it) | Inspect ChromaDB | Inspect rows in the vector/graph DB | Inspect Postgres tables |

 **Append-only vs. self-editing** is the key philosophical split. memsearch treats memory like a commit log: once written, always auditable. mem0 and Letta treat memory like a mutable KV store that the LLM maintains — which can converge on cleaner facts, but also means prior writes can be silently rewritten or deleted by a later LLM call.
+| Progressive disclosure | L1 search → L2 expand section → L3 drill into original transcript JSONL | Single top-K retrieval |Single top-K retrieval |Four-layer context loading (L0–L3)| Single top-K retrieval| Core memory always in context; archival pulled on-demand |

 ---

@@ -97,21 +90,21 @@ We try to keep this list honest — only things that are real consequences of th
 ### 1. Native plugins for four coding CLIs, not just an MCP adapter

-memsearch ships first-class plugins for Claude Code, OpenClaw, OpenCode, and Codex CLI. Each plugin hooks into that CLI's lifecycle (session start / prompt submit / stop / session end) to capture memory automatically and inject cold-start context. None of mem0, MemPalace, or Letta ship native integrations for these coding CLIs — they expose memory tools over MCP or a REST API, which is a thinner integration.
+memsearch ships first-class plugins for Claude Code, OpenClaw, OpenCode, and Codex CLI. Each plugin hooks into that CLI's lifecycle (session start / prompt submit / stop / session end) to capture memory automatically and inject cold-start context. claude-mem integrates at this level but only for Claude Code; qmd has a Claude Code skill wrapper but no plugin for the other three CLIs. mem0, MemPalace, and Letta expose memory only over MCP or a REST / framework API — no CLI-lifecycle integration at all.

 ### 2. Plain markdown is the canonical store; the vector DB is derived

-Your memory lives in `memory/YYYY-MM-DD.md` and `MEMORY.md`. You can `cat`, `grep`, `git diff`, and `git blame` it. If you lose the Milvus index, you rebuild it from the markdown. mem0 and Letta both store memory inside a database (vector DB / Postgres) — their storage is opaque by design. MemPalace stores in ChromaDB only.
+Your memory lives in `memory/YYYY-MM-DD.md` and `MEMORY.md`. You can `cat`, `grep`, `git diff`, and `git blame` it. If you lose the Milvus index, you rebuild it from the markdown. qmd shares this markdown-as-source-of-truth property (it indexes whatever markdown you give it). claude-mem, MemPalace, mem0, and Letta all keep their canonical state in a database (SQLite + ChromaDB / ChromaDB / vector DB / Postgres).
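To illustrate why a markdown-first store makes the index disposable, here is a minimal sketch of the rebuild direction: walk the daily logs, split them on headings, and emit deterministic records that any vector index could ingest. The function names, the record fields, and the heading-based chunking are assumptions made for this example; they are not memsearch's actual chunker or schema, and the sketch ignores `#` characters inside fenced code.

```python
import hashlib
from pathlib import Path
from typing import Iterator


def _record(source: str, heading: str, lines: list[str]) -> dict:
    """Build one index record with a content-derived, reproducible id."""
    text = "\n".join(lines).strip()
    digest = hashlib.sha256(f"{source}\n{heading}\n{text}".encode("utf-8")).hexdigest()
    return {"id": digest[:16], "source": source, "heading": heading, "text": text}


def iter_chunks(memory_dir: Path) -> Iterator[dict]:
    """Yield one record per heading-delimited section of every .md file.

    Hypothetical helper: the markdown files are the only input, so running
    this again after losing the index yields exactly the same records.
    """
    for md_path in sorted(memory_dir.glob("*.md")):
        heading, body = "", []
        for line in md_path.read_text(encoding="utf-8").splitlines():
            if line.startswith("#"):
                # A new heading closes the previous section, if it had content.
                if any(part.strip() for part in body):
                    yield _record(md_path.name, heading, body)
                heading, body = line.lstrip("#").strip(), []
            else:
                body.append(line)
        if any(part.strip() for part in body):
            yield _record(md_path.name, heading, body)
```

Because the ids are derived from the content, re-running `iter_chunks` over the same files produces the same records, which is the property that lets the vector index be treated as a cache rather than as the source of truth.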
 ### 3. Writes are cheap and append-only

 A memsearch write is: extract the last turn → one Haiku summarization call → append a bullet to today's `.md`. No LLM "decides what to forget." No self-editing. No entity extraction pipeline. This makes writes cheap, predictable, and fully auditable — at the cost of not auto-compressing redundant memories (you can run `memsearch compact` on demand if you want that).

-mem0 and Letta are on the other end of the spectrum: they rely on LLMs to curate memory on the write path, which is more powerful but introduces cost, latency, and the possibility of silent data loss.
+mem0 and Letta are on the other end of the spectrum: they rely on LLMs to curate memory on the write path, which is more powerful but introduces cost, latency, and the possibility of silent data loss. qmd sidesteps the question entirely by not writing at all.
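The append-only write path described in this section can be sketched roughly as follows. `summarize_turn` is a hypothetical stand-in for the single summarization call (here it only truncates the first line, so the example runs offline), and `append_memory` is an illustrative name; only the `memory/YYYY-MM-DD.md` layout comes from this page. Treat it as a picture of the shape of the write, not as memsearch's implementation.

```python
from datetime import date
from pathlib import Path


def summarize_turn(turn_text: str) -> str:
    """Hypothetical placeholder for the one summarization call per turn."""
    lines = turn_text.strip().splitlines()
    return lines[0][:120] if lines else ""


def append_memory(memory_dir: Path, turn_text: str, today: date) -> Path:
    """Append one bullet to today's daily log; earlier content is never rewritten."""
    memory_dir.mkdir(parents=True, exist_ok=True)
    log_path = memory_dir / f"{today.isoformat()}.md"
    bullet = f"- {summarize_turn(turn_text)}\n"
    # Mode "a" only ever adds bytes at the end of the file.
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(bullet)
    return log_path
```

Since the file is only opened in append mode, every write shows up as a pure addition in `git diff`, which is what makes the log auditable.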
 ### 4. Hybrid search with BM25 fused via RRF, out of the box

-memsearch indexes every chunk with both a dense vector and a BM25 sparse vector, and fuses them at query time with Reciprocal Rank Fusion. Exact keyword hits (function names, file paths, error strings) and semantic matches both surface. mem0, MemPalace, and Letta archival are dense-only by default.
+memsearch indexes every chunk with both a dense vector and a BM25 sparse vector, and fuses them at query time with Reciprocal Rank Fusion. Exact keyword hits (function names, file paths, error strings) and semantic matches both surface. qmd also does hybrid search (BM25 + dense) and adds a local LLM reranker; claude-mem uses FTS5 only; mem0, MemPalace, and Letta archival are dense-only by default.
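For readers unfamiliar with Reciprocal Rank Fusion, the following standalone sketch shows the general technique: each result earns `1 / (k + rank)` from every ranked list it appears in, and the sums decide the final order. The function name, the sample chunk ids, and `k = 60` (a commonly used constant) are assumptions for illustration; memsearch itself delegates fusion to Milvus, and its exact parameters are not stated on this page.

```python
from collections import defaultdict


def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists of ids with Reciprocal Rank Fusion."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id so the output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Illustrative inputs: a keyword (BM25) ranking and a dense-vector ranking.
bm25_hits = ["chunk_err", "chunk_path", "chunk_a"]
dense_hits = ["chunk_a", "chunk_err", "chunk_b"]
fused = rrf_fuse([bm25_hits, dense_hits])
```

In this toy example a chunk that ranks well in both lists (`chunk_err`, `chunk_a`) ends up ahead of a chunk that appears in only one, which is the behaviour that lets exact keyword hits and semantic matches surface together.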

 ### 5. A clear scale path on one API

@@ -140,4 +133,4 @@ A few cases where memsearch is *not* what you want: