docs: refine Comparison with Alternatives (#339)

zc277584121 · web-flow · commit 8988b6eea4c6 · 2026-04-14T11:48:04.000+08:00
- Drop the forced two-category split. memsearch is both an engine and
  a set of native plugins, and several other projects (notably qmd)
  straddle the same line, so the grouping was misleading.
- Fix qmd's characterization. It is tobi/qmd — a local CLI engine
  plus MCP server (stdio/HTTP) plus a Claude Code skill wrapper, with
  BM25 + dense + a local GGUF LLM reranker. The earlier page called
  it an MCP-only or coding-CLI plugin project, which was wrong.
- Include qmd in all three detailed tables (integration surface,
  write semantics, search & retrieval) so the matrix is complete.
- Adjust the differentiator section to stop overclaiming: qmd also
  does hybrid search and treats markdown as the source of truth, so
  those rows now name qmd as a peer instead of implying uniqueness.
- Drop the 'benchmark numbers are not comparable' disclaimer — it
  read defensively and the comparison is already about architectural
  shape, not accuracy.

Signed-off-by: Cheney Zhang <chen.zhang@zilliz.com>
diff --git a/docs/home/comparison.md b/docs/home/comparison.md
@@ -1,51 +1,44 @@
 # Comparison with Alternatives
 
-This page compares **memsearch** with other open-source memory solutions for LLMs and AI agents. Each project solves a real problem, and the right choice depends on *what kind of agent you are building* and *how you want memory to be stored*.
-
-We group the projects below into two categories:
-
-- **Coding-CLI memory plugins** — attach to an existing agent CLI (Claude Code, Codex, OpenCode, OpenClaw, Cursor, …) and give it persistent memory: memsearch, [claude-mem](https://github.com/thedotmack/claude-mem), [qmd](https://github.com/nomenclator-ninja/qmd), [MemPalace](https://github.com/milla-jovovich/mempalace).
-- **General-purpose agent memory systems** — memory libraries or agent runtimes you build applications on top of: [mem0](https://github.com/mem0ai/mem0), [Letta / MemGPT](https://github.com/letta-ai/letta).
+This page compares **memsearch** with other open-source memory solutions for LLMs and AI agents. memsearch sits on both sides of the stack at once — it is a Python library / CLI engine *and* a set of native plugins for four coding CLIs — so below we compare it against projects that sit anywhere along that spectrum: [claude-mem](https://github.com/thedotmack/claude-mem), [qmd](https://github.com/tobi/qmd), [MemPalace](https://github.com/milla-jovovich/mempalace), [mem0](https://github.com/mem0ai/mem0), and [Letta / MemGPT](https://github.com/letta-ai/letta).
 
 ---
 
 ## TL;DR
 
 | | memsearch | claude-mem | qmd | MemPalace | mem0 | Letta (MemGPT) |
 |---|:---:|:---:|:---:|:---:|:---:|:---:|
-| **Category** | Coding-CLI plugin | Coding-CLI plugin | Coding-CLI plugin | Coding-CLI plugin | General memory library | Agent framework / runtime |
-| **Integration** | Native hooks + skills (4 CLIs) | Native hooks (Claude Code) | MCP | MCP | Python / JS SDK, REST, MCP | Rewrite agent on Letta runtime |
-| **Source of truth** | Plain `.md` files | SQLite + ChromaDB | `.md` files | ChromaDB | Vector DB (+ optional graph DB) | Postgres (memory blocks + archival) |
+| **Shape** | CLI engine + native plugins for 4 coding CLIs | Claude Code–only plugin | CLI engine + MCP server + Claude Code skill | MCP server | Memory library / SDK / hosted service | Stateful-agent framework and runtime |
+| **Integration** | Native hooks + skills (Claude Code, OpenClaw, OpenCode, Codex CLI) | Native hooks (Claude Code) | MCP (stdio/HTTP) + Claude Code skill | MCP | Python / JS SDK, REST, OpenMemory MCP | Build agent on Letta runtime |
+| **Source of truth** | Plain `.md` files | SQLite + ChromaDB | Plain `.md` files | ChromaDB | Vector DB (+ optional graph DB) | Postgres (memory blocks + archival) |
 | **Write strategy** | Append-only daily logs | LLM-summarized transcripts | External (read-only search engine) | Store-everything raw | LLM extracts facts, LLM decides add/update/delete | LLM self-edits memory (`memory_replace`, `memory_insert`, …) |
-| **Search** | Hybrid: dense + BM25 + RRF | Dense + FTS5 | Hybrid: dense + BM25 + RRF + query expansion | Dense (ChromaDB) | Dense (+ optional graph traversal) | Dense archival + conversation search |
-| **Local-first default** | ONNX bge-m3 (no API key) | WASM MiniLM | Local GGUF | Local ChromaDB + Llama | Requires LLM API for every write | Depends on configured backend |
+| **Search** | Hybrid: dense + BM25 + RRF | Dense + FTS5 | Hybrid: BM25 + dense + local LLM rerank (GGUF) | Dense (ChromaDB) | Dense (+ optional graph traversal) | Dense archival + conversation search |
+| **Local-first default** | ONNX bge-m3 (no API key) | WASM MiniLM | Local GGUF (embedding + reranker) | Local ChromaDB + Llama | Requires LLM API for every write | Depends on configured backend |
 | **Scale path** | Milvus Lite → Server → Zilliz Cloud (same API) | Single machine | Single machine | Single machine | Depends on chosen vector DB | Postgres / pgvector |
 
-> None of the benchmark numbers published by individual projects (LOCOMO, LongMemEval, etc.) are directly comparable because the evaluation setups differ. We do not claim a benchmark win here — the comparison is about *architectural shape*, not accuracy numbers.
-
 ---
 
 ## Quick orientation
 
 ### memsearch
 
-A cross-platform semantic memory plugin for coding-CLI agents. Ships **native plugins** for Claude Code, OpenClaw, OpenCode, and Codex CLI (not MCP adapters — actual per-platform hooks and skills). Stores memory as plain markdown daily logs; Milvus is a derived hybrid-search index rebuildable from the markdown at any time.
+Both a search engine and a set of native plugins. The core is a Python library + `memsearch` CLI that indexes markdown into Milvus with hybrid search. On top of that, memsearch ships first-class plugins for Claude Code, OpenClaw, OpenCode, and Codex CLI — each plugin hooks into the host CLI's lifecycle (session start / prompt submit / stop / session end) to auto-capture memory and inject cold-start context. Markdown daily logs are the canonical store; Milvus is a derived index that can be rebuilt any time.
 
 ### claude-mem
 
 Memory for Claude Code only. Hooks compress session transcripts using an LLM and store the result in ChromaDB + SQLite. Storage is opaque (binary DB), Claude Code–specific.
 
 ### qmd
 
-Local-first MCP search engine for markdown notes. Read-only — it searches existing markdown; capture is left to the user or external tools. Share the same markdown-as-source-of-truth philosophy as memsearch.
+A local-first search engine for markdown notes. Ships as a CLI + MCP server (stdio or HTTP) and a Claude Code skill wrapper. Uses BM25 + dense + an LLM reranker, all running locally via `node-llama-cpp` with GGUF models. It is a **search engine**, not a memory writer — you (or another tool) are responsible for producing the markdown it indexes.
 
 ### MemPalace
 
 A memory server organized around the *method of loci* ("wings → halls → rooms"). Stores conversations raw in ChromaDB without LLM extraction, then exposes them to chat clients (Claude Code, ChatGPT, Cursor) via MCP. Runs fully offline with local Llama + ChromaDB.
 
 ### mem0
 
-A general-purpose memory layer for LLM applications (not tied to any specific coding CLI). Every write goes through an LLM that extracts entities and relationships, decides whether to add / update / delete existing memories, and stores the results in a configurable vector DB — optionally mirrored to a graph DB (Neo4j, Memgraph, Neptune, Kuzu, AGE). Published as a Python/JS SDK, REST API, hosted platform, and (via OpenMemory) an MCP server.
+A general-purpose memory layer for LLM applications, not tied to any specific coding CLI. Every write goes through an LLM that extracts entities and relationships, decides whether to add / update / delete existing memories, and stores the results in a configurable vector DB — optionally mirrored to a graph DB (Neo4j, Memgraph, Neptune, Kuzu, AGE). Published as a Python / JS SDK, REST API, hosted platform, and (via OpenMemory) an MCP server.
 
 ### Letta (formerly MemGPT)
 
@@ -57,37 +50,37 @@ A full **agent framework and server** built around the "LLM as an operating syst
 
 ### Integration surface
 
-| | memsearch | mem0 | MemPalace | Letta |
-|---|:---:|:---:|:---:|:---:|
-| Claude Code native plugin | ✅ (hooks + skills) | ❌ (MCP only) | ❌ (MCP only) | ❌ (runtime, not plugin) |
-| OpenClaw native plugin | ✅ | ❌ | ❌ | ❌ |
-| OpenCode native plugin | ✅ | ❌ | ❌ | ❌ |
-| Codex CLI native plugin | ✅ | ❌ | ❌ | ❌ |
-| Generic MCP | Not shipped | ✅ (OpenMemory) | ✅ | ❌ |
-| Library / SDK | Python | Python, JS | Python | Python |
+| | memsearch | claude-mem | qmd | MemPalace | mem0 | Letta |
+|---|:---:|:---:|:---:|:---:|:---:|:---:|
+| Claude Code native plugin | ✅ (hooks + skills) | ✅ (hooks) | ✅ (skill wrapper over MCP) | ❌ (MCP only) | ❌ (MCP only) | ❌ (runtime, not plugin) |
+| OpenClaw native plugin | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ |
+| OpenCode native plugin | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ |
+| Codex CLI native plugin | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ |
+| Generic MCP | Not shipped | ❌ | ✅ (stdio / HTTP) | ✅ | ✅ (OpenMemory) | ❌ |
+| Library / SDK | Python | — | Node.js CLI | Python | Python, JS | Python |
 
-"Native plugin" means memsearch participates in the CLI's own lifecycle events (SessionStart, UserPromptSubmit, Stop, SessionEnd, …) with collection naming, per-project isolation, and skill registration. Generic MCP integrations only expose tools to the LLM — they cannot write daily memory notes at the end of a session, or inject cold-start context at session start.
+"Native plugin" here means participating in a CLI's own lifecycle events (SessionStart, UserPromptSubmit, Stop, SessionEnd, …) with collection naming, per-project isolation, and skill registration. Generic MCP integrations only expose tools to the LLM — they cannot write daily memory notes at the end of a session, or inject cold-start context at session start.
 
 ### Memory write semantics
 
-| | memsearch | mem0 | MemPalace | Letta |
-|---|---|---|---|---|
-| Who decides what to store? | Session-end hook summarizes the last turn as third-person notes | An LLM extracts "salient facts" on every write | Nobody — raw transcript is stored as-is | The agent itself, via tool calls during the reasoning loop |
-| Updates to prior memories? | Append-only (never mutates history) | LLM may update or delete prior memories during the update phase | Append-only | Agent can rewrite core memory blocks at any time |
-| LLM cost per write | One small Haiku call per turn (async, non-blocking) | LLM extraction call(s) per write | None (no LLM on the write path) | Depends on the agent loop — each self-edit is an LLM tool call |
-| Auditability | `git log` on `memory/YYYY-MM-DD.md` | Inspect rows in the vector/graph DB | Inspect ChromaDB | Inspect Postgres tables |
+| | memsearch | claude-mem | qmd | MemPalace | mem0 | Letta |
+|---|---|---|---|---|---|---|
+| Who decides what to store? | Session-end hook summarizes the last turn as third-person notes | LLM-compressed session transcript written by hooks | Nobody — qmd is a read-only search engine over markdown you already have | Nobody — raw transcript stored as-is | LLM extracts "salient facts" on every write | The agent itself, via tool calls during the reasoning loop |
+| Updates to prior memories? | Append-only (never mutates history) | Each session appends a new compressed record | N/A (no write path) | Append-only | LLM may update or delete prior memories during the update phase | Agent can rewrite core memory blocks at any time |
+| LLM cost per write | One small Haiku call per turn (async, non-blocking) | LLM compression at session end | None | None | LLM extraction call(s) per write | Depends on the agent loop — each self-edit is an LLM tool call |
+| Auditability | `git log` on `memory/YYYY-MM-DD.md` | Inspect SQLite + ChromaDB | `git log` on your markdown (whatever produced it) | Inspect ChromaDB | Inspect rows in the vector/graph DB | Inspect Postgres tables |
 
 **Append-only vs. self-editing** is the key philosophical split. memsearch treats memory like a commit log: once written, always auditable. mem0 and Letta treat memory like a mutable KV store that the LLM maintains — which can converge on cleaner facts, but also means prior writes can be silently rewritten or deleted by a later LLM call.
 
 ### Search & retrieval
 
-| | memsearch | mem0 | MemPalace | Letta |
-|---|---|---|---|---|
-| Dense vectors | ✅ | ✅ | ✅ | ✅ (archival memory) |
-| BM25 / sparse | ✅ (RRF fused with dense) | ❌ by default | ❌ | ❌ |
-| Reranking | Optional cross-encoder (ONNX) | ❌ | ❌ | ❌ |
-| Graph traversal | ❌ | ✅ (optional graph backend) | ❌ | ❌ |
-| Progressive disclosure | L1 search → L2 expand section → L3 drill into original transcript JSONL | Single top-K retrieval | Four-layer context loading (L0–L3) | Core memory always in context; archival pulled on-demand |
+| | memsearch | claude-mem | qmd | MemPalace | mem0 | Letta |
+|---|---|---|---|---|---|---|
+| Dense vectors | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ (archival) |
+| BM25 / sparse | ✅ (RRF fused with dense) | ❌ (FTS5 only) | ✅ (BM25 + dense) | ❌ | ❌ by default | ❌ |
+| Reranking | Optional cross-encoder (ONNX) | ❌ | ✅ (local LLM reranker, GGUF) | ❌ | ❌ | ❌ |
+| Graph traversal | ❌ | ❌ | ❌ | ❌ | ✅ (optional graph backend) | ❌ |
+| Progressive disclosure | L1 search → L2 expand section → L3 drill into original transcript JSONL | Single top-K retrieval | Single top-K retrieval | Four-layer context loading (L0–L3) | Single top-K retrieval | Core memory always in context; archival pulled on-demand |
 
 ---
 
@@ -97,21 +90,21 @@ We try to keep this list honest — only things that are real consequences of th
 
 ### 1. Native plugins for four coding CLIs, not just an MCP adapter
 
-memsearch ships first-class plugins for Claude Code, OpenClaw, OpenCode, and Codex CLI. Each plugin hooks into that CLI's lifecycle (session start / prompt submit / stop / session end) to capture memory automatically and inject cold-start context. None of mem0, MemPalace, or Letta ship native integrations for these coding CLIs — they expose memory tools over MCP or a REST API, which is a thinner integration.
+memsearch ships first-class plugins for Claude Code, OpenClaw, OpenCode, and Codex CLI. Each plugin hooks into that CLI's lifecycle (session start / prompt submit / stop / session end) to capture memory automatically and inject cold-start context. claude-mem integrates at this level but only for Claude Code; qmd has a Claude Code skill wrapper but no plugin for the other three CLIs. mem0, MemPalace, and Letta expose memory only over MCP or a REST / framework API — no CLI-lifecycle integration at all.
 
 ### 2. Plain markdown is the canonical store; the vector DB is derived
 
-Your memory lives in `memory/YYYY-MM-DD.md` and `MEMORY.md`. You can `cat`, `grep`, `git diff`, and `git blame` it. If you lose the Milvus index, you rebuild it from the markdown. mem0 and Letta both store memory inside a database (vector DB / Postgres) — their storage is opaque by design. MemPalace stores in ChromaDB only.
+Your memory lives in `memory/YYYY-MM-DD.md` and `MEMORY.md`. You can `cat`, `grep`, `git diff`, and `git blame` it. If you lose the Milvus index, you rebuild it from the markdown. qmd shares this markdown-as-source-of-truth property (it indexes whatever markdown you give it). claude-mem, MemPalace, mem0, and Letta all keep their canonical state in a database (SQLite + ChromaDB / ChromaDB / vector DB / Postgres).
 
 ### 3. Writes are cheap and append-only
 
 A memsearch write is: extract the last turn → one Haiku summarization call → append a bullet to today's `.md`. No LLM "decides what to forget." No self-editing. No entity extraction pipeline. This makes writes cheap, predictable, and fully auditable — at the cost of not auto-compressing redundant memories (you can run `memsearch compact` on demand if you want that).
 
-mem0 and Letta are on the other end of the spectrum: they rely on LLMs to curate memory on the write path, which is more powerful but introduces cost, latency, and the possibility of silent data loss.
+mem0 and Letta are on the other end of the spectrum: they rely on LLMs to curate memory on the write path, which is more powerful but introduces cost, latency, and the possibility of silent data loss. qmd sidesteps the question entirely by not writing at all.
 
 ### 4. Hybrid search with BM25 fused via RRF, out of the box
 
-memsearch indexes every chunk with both a dense vector and a BM25 sparse vector, and fuses them at query time with Reciprocal Rank Fusion. Exact keyword hits (function names, file paths, error strings) and semantic matches both surface. mem0, MemPalace, and Letta archival are dense-only by default.
+memsearch indexes every chunk with both a dense vector and a BM25 sparse vector, and fuses them at query time with Reciprocal Rank Fusion. Exact keyword hits (function names, file paths, error strings) and semantic matches both surface. qmd also does hybrid search (BM25 + dense) and adds a local LLM reranker; claude-mem uses FTS5 only; mem0, MemPalace, and Letta archival are dense-only by default.
 
 ### 5. A clear scale path on one API
 
@@ -140,4 +133,4 @@ A few cases where memsearch is *not* what you want:
 - Letta (MemGPT) — [github.com/letta-ai/letta](https://github.com/letta-ai/letta), [docs.letta.com/concepts/memgpt](https://docs.letta.com/concepts/memgpt/)
 - MemPalace — [github.com/milla-jovovich/mempalace](https://github.com/milla-jovovich/mempalace)
 - claude-mem — [github.com/thedotmack/claude-mem](https://github.com/thedotmack/claude-mem)
-- qmd — [github.com/nomenclator-ninja/qmd](https://github.com/nomenclator-ninja/qmd)
+- qmd — [github.com/tobi/qmd](https://github.com/tobi/qmd)
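
Note on the hybrid-search rows touched by this diff: the page describes memsearch as fusing dense and BM25 results with Reciprocal Rank Fusion (RRF). A minimal sketch of that fusion step follows. It is illustrative only and is not memsearch's or Milvus's actual code; the function name `rrf_fuse`, the chunk IDs, and the smoothing constant `k=60` (a commonly used default) are assumptions made for the example.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse several ranked lists of IDs with Reciprocal Rank Fusion.

    Each ID scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. `k` is a hypothetical default here.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from a dense retriever and a BM25 retriever.
dense_hits = ["chunk-a", "chunk-b", "chunk-c"]
bm25_hits = ["chunk-c", "chunk-a", "chunk-d"]

fused = rrf_fuse([dense_hits, bm25_hits])
print(fused)
```

Because RRF uses only rank positions, the dense similarity scores and BM25 scores never need to be normalized against each other. A chunk that ranks well in both lists (here `chunk-a`) rises above one that ranks first in only a single list.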