CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Build & Test Commands

# Install in development mode
uv sync --all-extras

# Run all tests (use python -m pytest to avoid system pytest conflicts)
uv run python -m pytest

# Run a single test file
uv run python -m pytest tests/test_chunker.py

# Run a specific test
uv run python -m pytest tests/test_store.py::test_upsert_and_search -v

# Serve docs locally
uv run mkdocs serve

# Run the CLI
uv run memsearch --help

Architecture

memsearch is a semantic memory search engine for markdown knowledge bases, built on Milvus.

Data Flow

Markdown files → Scanner → Chunker → Embedder → MilvusStore
                                                      ↓
                               User query → Embedder → Hybrid Search (dense + BM25 + RRF) → Results
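The fusion step at the end of the search path can be sketched as follows. This is a minimal illustration of Reciprocal Rank Fusion, not the code in store.py: the function name `rrf_fuse` and the default `k = 60` are assumptions, and ranks are taken to start at 1 so that the normalization bound matches the `num_retrievers / (k + 1)` maximum stated under Key Design Decisions.

```python
from collections import defaultdict


def rrf_fuse(rankings: list[list[str]], k: int = 60) -> dict[str, float]:
    """Fuse ranked ID lists with Reciprocal Rank Fusion, normalized to [0, 1].

    `rankings` holds one best-first list of chunk IDs per retriever
    (for example, one from dense search and one from BM25).
    The name and the default k are illustrative assumptions.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            # Each retriever contributes 1 / (k + rank) for every ID it returns.
            scores[chunk_id] += 1.0 / (k + rank)

    # An ID ranked first by every retriever reaches num_retrievers / (k + 1).
    max_score = len(rankings) / (k + 1)
    ordered = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return {chunk_id: score / max_score for chunk_id, score in ordered}
```

A chunk that both retrievers rank first scores exactly 1.0 after normalization; a chunk returned by only one retriever scores below 0.5.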

Core Library (src/memsearch/)

  • core.py — MemSearch class: the public Python API that orchestrates everything. Entry point for index(), search(), compact(), watch().
  • store.py — MilvusStore: Milvus wrapper handling collection creation, upsert, hybrid search (dense cosine + BM25 sparse + RRF reranking), and cleanup. The chunk_hash (composite ID of source+lines+content+model) is the VARCHAR primary key.
  • chunker.py — Splits markdown by headings into Chunk dataclasses. SHA-256 content hash enables dedup. compute_chunk_id() generates composite IDs matching OpenClaw's format.
  • embeddings/__init__.py — EmbeddingProvider protocol + lazy-loading factory (get_provider()). Providers: openai (default), google, voyage, jina, mistral, ollama, local, onnx.
  • scanner.py — Walks directories to find .md/.markdown files, returns ScannedFile list.
  • config.py — Layered TOML config: dataclass defaults → ~/.memsearch/config.toml → .memsearch.toml → CLI flags.
  • cli.py — Click CLI wrapping the Python API. All commands resolve config via resolve_config() then instantiate MemSearch.
  • watcher.py — watchdog-based file watcher with debounce, used by memsearch watch and the Claude Code plugin.
  • compact.py — LLM-powered chunk summarization (OpenAI/Anthropic/Gemini).
  • reranker.py — Optional cross-encoder reranking (ONNX or PyTorch backend). Disabled by default; enable via reranker.model config.
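The chunking and ID scheme described for chunker.py and store.py can be illustrated with a short sketch. Everything here is an assumption made for illustration: the `Chunk` field names, the helper names `chunk_markdown` and `composite_chunk_id`, and the use of SHA-256 for the outer hash are not taken from the library, and the sketch does not reproduce OpenClaw's exact ID format. It also ignores edge cases such as `#` lines inside fenced code.

```python
import hashlib
import re
from dataclasses import dataclass

HEADING = re.compile(r"^#{1,6}\s")  # ATX headings only; a simplification


@dataclass
class Chunk:
    # Field names are hypothetical, chosen to mirror the composite ID parts.
    source: str
    start_line: int  # 1-based, inclusive
    end_line: int    # 1-based, inclusive
    content: str
    content_hash: str


def chunk_markdown(text: str, source: str) -> list[Chunk]:
    """Split markdown into one chunk per heading-delimited section."""
    lines = text.splitlines()
    starts = [i for i, line in enumerate(lines) if HEADING.match(line)]
    if not starts or starts[0] != 0:
        starts.insert(0, 0)  # keep any preamble before the first heading

    chunks = []
    for begin, end in zip(starts, starts[1:] + [len(lines)]):
        content = "\n".join(lines[begin:end]).strip()
        if not content:
            continue
        digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
        chunks.append(Chunk(source, begin + 1, end, content, digest))
    return chunks


def composite_chunk_id(chunk: Chunk, model: str) -> str:
    """Hash source:startLine:endLine:contentHash:model into one key."""
    key = f"{chunk.source}:{chunk.start_line}:{chunk.end_line}:{chunk.content_hash}:{model}"
    return hashlib.sha256(key.encode("utf-8")).hexdigest()
```

Because the embedding model is part of the key, re-indexing the same file with a different model yields new IDs, while re-indexing unchanged content with the same model yields the same IDs and can be skipped.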

Claude Code Plugin (plugins/claude-code/)

The plugin is a first-class component of memsearch — it's the primary real-world application that demonstrates the library in action. It gives Claude Code automatic persistent memory across sessions with zero user intervention.

Architecture: 4 shell hooks + 1 skill + 1 background watcher

plugins/claude-code/
├── hooks/
│   ├── common.sh                # Shared setup: PATH, memsearch detection, collection name, watch PID
│   ├── session-start.sh         # SessionStart: start watch, write session heading, inject recent memories
│   ├── user-prompt-submit.sh    # UserPromptSubmit: lightweight hint reminding Claude about memory skill
│   ├── stop.sh                  # Stop: extract last turn → haiku summarize (third-person) → append to daily .md (async)
│   ├── session-end.sh           # SessionEnd: stop watch process
│   └── parse-transcript.sh      # Last-turn extractor: finds last user question → EOF, formats with role labels for LLM (Python 3, no jq)
├── scripts/
│   └── derive-collection.sh     # Derive per-project collection name from project path
├── transcript.py                # JSONL transcript parser for Claude Code conversation files (L3 deep drill)
└── skills/
    └── memory-recall/
        └── SKILL.md             # Skill (context: fork): search → expand → transcript in subagent

Key design: skill-based memory recall. Memory retrieval is handled by a memory-recall skill that runs in a forked subagent context (context: fork). Claude automatically invokes the skill when it judges the user's question could benefit from historical context. The subagent autonomously performs search, evaluates relevance, expands promising results, and returns a curated summary — all without polluting the main conversation context.

Three-layer progressive disclosure (all in subagent):

  1. L1 (search): Subagent runs memsearch search to find relevant chunks
  2. L2 (expand): Subagent runs memsearch expand <chunk_hash> to get full markdown sections
  3. L3 (transcript): Subagent runs python3 ${CLAUDE_PLUGIN_ROOT}/transcript.py <jsonl> to drill into original conversations

Supporting hooks:

  • SessionStart injects cold-start context (recent daily logs) so Claude knows history exists
  • UserPromptSubmit returns a lightweight systemMessage hint ("[memsearch] Memory available") to increase skill trigger awareness
  • Stop hook is async and non-blocking — extracts last turn only, calls claude -p --model haiku (with CLAUDECODE= to bypass nested session detection) to summarize as third-person notes, appends to daily .md

When modifying hooks/skills, keep in mind:

  • All hooks output JSON to stdout (additionalContext for context injection, systemMessage for visible hints, or empty {})
  • common.sh is sourced by every hook — changes there affect all hooks. It derives a per-project COLLECTION_NAME via derive-collection.sh and passes --collection automatically through run_memsearch() and start_watch()
  • The watch process uses a PID file (.memsearch/.watch.pid) for singleton behavior. With Milvus Lite, the plugin instead falls back to a one-time index() at session start
  • stop.sh has a recursion guard (stop_hook_active) since it calls claude -p internally, and sets MEMSEARCH_NO_WATCH=1 to prevent the child process from interfering with the main session's watch
  • The memory-recall skill uses context: fork — the subagent has its own context window and does not see main conversation history
  • transcript.py lives in the plugin directory (not in core library) since it is entirely Claude Code JSONL-specific
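The PID-file singleton mentioned above is implemented in shell inside common.sh; the same idea can be sketched in Python for readers who want to reason about it. The function names below are hypothetical, and the liveness check relies on POSIX semantics for signal 0, so this sketch should not be run on Windows.

```python
import os
from pathlib import Path


def watcher_is_running(pid_file: Path) -> bool:
    """Return True if the PID recorded in pid_file refers to a live process."""
    try:
        pid = int(pid_file.read_text().strip())
    except (FileNotFoundError, ValueError):
        return False  # no file, or the file holds something other than a PID
    if pid <= 0:
        return False  # PID 0 and negative PIDs address process groups
    try:
        os.kill(pid, 0)  # POSIX: signal 0 only checks that the process exists
    except ProcessLookupError:
        return False  # stale PID file
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def claim_watcher_slot(pid_file: Path, pid: int) -> bool:
    """Record pid as the single watcher unless a live one is already recorded."""
    if watcher_is_running(pid_file):
        return False
    pid_file.parent.mkdir(parents=True, exist_ok=True)
    pid_file.write_text(str(pid))
    return True
```

A stale PID file left behind by a crashed watcher does not block a new one, because the liveness check fails and the slot is reclaimed. The check-then-write sequence is not atomic, so two sessions starting at the same instant could still race.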

Key Design Decisions

  • Markdown is the source of truth. Milvus is a derived index, rebuildable anytime from .md files.
  • Composite chunk ID as PK. hash(source:startLine:endLine:contentHash:model) — enables natural dedup without a separate cache.
  • ONNX bge-m3 as plugin default. The Claude Code plugin hooks default to onnx provider (bge-m3, CPU, no API key). The Python API still defaults to openai.
  • Hybrid search by default. Every collection has both dense vector and BM25 sparse fields. Search uses RRF to combine them. RRF scores are normalized to [0, 1] (theoretical max = num_retrievers / (k + 1)).
  • [llm] + [prompts] config. Two new config sections: [llm] selects the LLM provider and [prompts] holds custom prompt templates. [compact] is deprecated but still works, with fallback order [llm] > [compact] > defaults; the compact CLI command resolves LLM settings as cfg.llm.* or cfg.compact.*. Plugins read prompts.summarize for custom session summarization prompts. Migration plan: [compact] will be removed in the next major version (1.0). During the transition, resolve_config() emits a DeprecationWarning when user config files contain [compact].
  • Shared prompt template. All four plugins share a single summarize.txt template (maintained in plugins/_shared/prompts/, synced via scripts/sync-prompts.sh). Template uses {{AGENT_NAME}} placeholder.
  • Remote Milvus query() requires a filter. Use chunk_hash != "" as a "match all" filter when no filter is provided (Milvus Lite doesn't enforce this, but Milvus Server does).
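The [llm] > [compact] > defaults fallback described in the list above can be sketched as a per-key lookup. This is an assumption-laden illustration rather than the logic in resolve_config(): the function name `resolve_llm_settings`, the setting keys, and the placeholder default values are all hypothetical.

```python
import warnings

# Placeholder defaults for illustration only; not the library's real defaults.
DEFAULTS = {"provider": "default-provider", "model": "default-model"}


def resolve_llm_settings(user_config: dict) -> dict:
    """Resolve each LLM setting from [llm], then [compact], then defaults."""
    llm = user_config.get("llm", {})
    compact = user_config.get("compact", {})

    if "compact" in user_config:
        # Mirrors the transition behavior: warn whenever [compact] is present.
        warnings.warn(
            "[compact] is deprecated; move these settings to [llm]",
            DeprecationWarning,
            stacklevel=2,
        )

    return {
        key: llm.get(key) or compact.get(key) or default
        for key, default in DEFAULTS.items()
    }
```

Resolving key by key means a config that sets only `provider` under [llm] still picks up `model` from a legacy [compact] section, which keeps partially migrated configs working during the transition.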

Versioning & Release

Five independently released components (the Codex CLI plugin has no version file); bump only the versions that changed:

| Component | Version file | Publish channel |
|---|---|---|
| memsearch (PyPI) | pyproject.toml | PyPI (automated via GitHub Actions on tag push) |
| Claude Code plugin | plugins/claude-code/.claude-plugin/plugin.json | Marketplace (.claude-plugin/marketplace.json) |
| OpenClaw plugin | plugins/openclaw/package.json | ClawHub (clawhub package publish) |
| OpenCode plugin | plugins/opencode/package.json | npm (@zilliz/memsearch-opencode) |
| Codex CLI plugin | (none) | install.sh (no version management) |

See CLAUDE.local.md for detailed release procedures, current versions, and operational details.

Project Conventions

  • Uses uv + pyproject.toml for dependency management (not pip).
  • Optional deps via extras: [google], [voyage], [ollama], [local], [onnx], [all]. The Claude Code plugin uses memsearch[onnx] for zero-config ONNX embedding.
  • Docs at docs/ use mkdocs-material. The site/ directory is build output — do not commit.
  • Always use uv run python -m pytest instead of uv run pytest to avoid system Python pytest conflicts.