This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
# Install in development mode
uv sync --all-extras
# Run all tests (use python -m pytest to avoid system pytest conflicts)
uv run python -m pytest
# Run a single test file
uv run python -m pytest tests/test_chunker.py
# Run a specific test
uv run python -m pytest tests/test_store.py::test_upsert_and_search -v
# Serve docs locally
uv run mkdocs serve
# Run the CLI
uv run memsearch --helpmemsearch is a semantic memory search engine for markdown knowledge bases, built on Milvus.
Markdown files → Scanner → Chunker → Embedder → MilvusStore
↓
User query → Embedder → Hybrid Search (dense + BM25 + RRF) → Results
core.py—MemSearchclass: the public Python API that orchestrates everything. Entry point forindex(),search(),compact(),watch().store.py—MilvusStore: Milvus wrapper handling collection creation, upsert, hybrid search (dense cosine + BM25 sparse + RRF reranking), and cleanup. Thechunk_hash(composite ID of source+lines+content+model) is the VARCHAR primary key.chunker.py— Splits markdown by headings intoChunkdataclasses. SHA-256 content hash enables dedup.compute_chunk_id()generates composite IDs matching OpenClaw's format.embeddings/__init__.py—EmbeddingProviderprotocol + lazy-loading factory (get_provider()). Providers: openai (default), google, voyage, jina, mistral, ollama, local, onnx.scanner.py— Walks directories to find.md/.markdownfiles, returnsScannedFilelist.config.py— Layered TOML config: dataclass defaults →~/.memsearch/config.toml→.memsearch.toml→ CLI flags.cli.py— Click CLI wrapping the Python API. All commands resolve config viaresolve_config()then instantiateMemSearch.watcher.py—watchdog-based file watcher with debounce, used bymemsearch watchand the Claude Code plugin.compact.py— LLM-powered chunk summarization (OpenAI/Anthropic/Gemini).reranker.py— Optional cross-encoder reranking (ONNX or PyTorch backend). Disabled by default; enable viareranker.modelconfig.
The plugin is a first-class component of memsearch — it's the primary real-world application that demonstrates the library in action. It gives Claude Code automatic persistent memory across sessions with zero user intervention.
Architecture: 4 shell hooks + 1 skill + 1 background watcher
plugins/claude-code/
├── hooks/
│ ├── common.sh # Shared setup: PATH, memsearch detection, collection name, watch PID
│ ├── session-start.sh # SessionStart: start watch, write session heading, inject recent memories
│ ├── user-prompt-submit.sh # UserPromptSubmit: lightweight hint reminding Claude about memory skill
│ ├── stop.sh # Stop: extract last turn → haiku summarize (third-person) → append to daily .md (async)
│ ├── session-end.sh # SessionEnd: stop watch process
│ └── parse-transcript.sh # Last-turn extractor: finds last user question → EOF, formats with role labels for LLM (Python 3, no jq)
├── scripts/
│ └── derive-collection.sh # Derive per-project collection name from project path
├── transcript.py # JSONL transcript parser for Claude Code conversation files (L3 deep drill)
└── skills/
└── memory-recall/
└── SKILL.md # Skill (context: fork): search → expand → transcript in subagent
Key design: skill-based memory recall. Memory retrieval is handled by a memory-recall skill that runs in a forked subagent context (context: fork). Claude automatically invokes the skill when it judges the user's question could benefit from historical context. The subagent autonomously performs search, evaluates relevance, expands promising results, and returns a curated summary — all without polluting the main conversation context.
Three-layer progressive disclosure (all in subagent):
- L1 (search): Subagent runs
memsearch searchto find relevant chunks - L2 (expand): Subagent runs
memsearch expand <chunk_hash>to get full markdown sections - L3 (transcript): Subagent runs
python3 ${CLAUDE_PLUGIN_ROOT}/transcript.py <jsonl>to drill into original conversations
Supporting hooks:
SessionStartinjects cold-start context (recent daily logs) so Claude knows history existsUserPromptSubmitreturns a lightweightsystemMessagehint ("[memsearch] Memory available") to increase skill trigger awarenessStophook is async and non-blocking — extracts last turn only, callsclaude -p --model haiku(withCLAUDECODE=to bypass nested session detection) to summarize as third-person notes, appends to daily.md
When modifying hooks/skills, keep in mind:
- All hooks output JSON to stdout (
additionalContextfor context injection,systemMessagefor visible hints, or empty{}) common.shis sourced by every hook — changes there affect all hooks. It derives a per-projectCOLLECTION_NAMEviaderive-collection.shand passes--collectionautomatically throughrun_memsearch()andstart_watch()- The watch process uses a PID file (
.memsearch/.watch.pid) for singleton behavior. Milvus Lite falls back to one-timeindex()at session start stop.shhas a recursion guard (stop_hook_active) since it callsclaude -pinternally, and setsMEMSEARCH_NO_WATCH=1to prevent the child process from interfering with the main session's watch- The
memory-recallskill usescontext: fork— the subagent has its own context window and does not see main conversation history transcript.pylives in the plugin directory (not in core library) since it is entirely Claude Code JSONL-specific
- Markdown is the source of truth. Milvus is a derived index, rebuildable anytime from
.mdfiles. - Composite chunk ID as PK.
hash(source:startLine:endLine:contentHash:model)— enables natural dedup without a separate cache. - ONNX bge-m3 as plugin default. The Claude Code plugin hooks default to
onnxprovider (bge-m3, CPU, no API key). The Python API still defaults toopenai. - Hybrid search by default. Every collection has both dense vector and BM25 sparse fields. Search uses RRF to combine them. RRF scores are normalized to
[0, 1](theoretical max =num_retrievers / (k + 1)). [llm]+[prompts]config. New config sections for LLM provider selection and custom prompt templates.[compact]is deprecated but still works (fallback:[llm]>[compact]> defaults). Plugins readprompts.summarizefor custom session summarization prompts. Migration plan:[compact]will be removed in the next major version (1.0). During the transition,resolve_config()emits aDeprecationWarningwhen user config files contain[compact]. The compact CLI command resolves LLM settings ascfg.llm.* or cfg.compact.*.- Shared prompt template. All four plugins share a single
summarize.txttemplate (maintained inplugins/_shared/prompts/, synced viascripts/sync-prompts.sh). Template uses{{AGENT_NAME}}placeholder. - Remote Milvus
query()requires a filter. Usechunk_hash != ""as a "match all" filter when no filter is provided (Milvus Lite doesn't enforce this, but Milvus Server does).
Five independent version numbers — bump only the ones that changed:
| Component | Version file | Publish channel |
|---|---|---|
| memsearch (PyPI) | pyproject.toml |
PyPI (automated via GitHub Actions on tag push) |
| Claude Code plugin | plugins/claude-code/.claude-plugin/plugin.json |
Marketplace (.claude-plugin/marketplace.json) |
| OpenClaw plugin | plugins/openclaw/package.json |
ClawHub (clawhub package publish) |
| OpenCode plugin | plugins/opencode/package.json |
npm (@zilliz/memsearch-opencode) |
| Codex CLI plugin | (none) | install.sh (no version management) |
See CLAUDE.local.md for detailed release procedures, current versions, and operational details.
- Uses
uv+pyproject.tomlfor dependency management (not pip). - Optional deps via extras:
[google],[voyage],[ollama],[local],[onnx],[all]. The Claude Code plugin usesmemsearch[onnx]for zero-config ONNX embedding. - Docs at
docs/use mkdocs-material. Thesite/directory is build output — do not commit. - Always use
uv run python -m pytestinstead ofuv run pytestto avoid system Python pytest conflicts.