Technical deep-dive into how ourmem stores, retrieves, and evolves memories. For API reference see API.md. For deployment see DEPLOY.md.
ourmem provides three ingestion paths, each optimized for different use cases:
┌─────────────────────────────────────────────────────────┐
│ Storage Paths │
├──────────────────┬──────────────────┬───────────────────┤
│ Conversation │ File Import │ Direct Memory │
│ Ingest │ │ Creation │
│ POST /memories │ POST /imports │ POST /memories │
│ {messages} │ multipart/form │ {content} │
└────────┬─────────┴────────┬─────────┴─────────┬─────────┘
│ │ │
┌────────▼─────────┐ ┌──────▼──────────┐ ┌─────▼──────────┐
│ Dual-Stream │ │ Intelligence │ │ Direct Store │
│ (sync + async) │ │ Task (async) │ │ (sync) │
│ │ │ │ │ │
│ Fast: session │ │ Strategy detect │ │ Memory::new() │
│ Slow: LLM path │ │ → extract │ │ → embed │
│ │ │ → reconcile │ │ → LanceDB │
└────────┬─────────┘ └──────┬──────────┘ └─────┬──────────┘
│ │ │
└──────────────────┼───────────────────┘
│
┌────────▼────────┐
│ LanceDB │
│ (per-space) │
└─────────────────┘
The primary path for plugin-driven memory capture. Uses a dual-stream architecture: a synchronous fast path stores raw messages immediately, while an asynchronous slow path extracts and reconciles facts via LLM.
Messages ──▶ Session Store (sync, <50ms)
│
└──▶ Background Task (async)
│
├── 1. select_messages() ── Budget: 20 messages / 200KB
├── 2. PrivacyFilter ── Strip <private> tags → [REDACTED]
├── 3. FactExtractor ── LLM extracts atomic facts (max 50)
├── 4. NoiseFilter ── Regex + vector prototype matching
├── 5. AdmissionControl ── 5-dimension scoring gate
└── 6. Reconciler ── 7-decision reconciliation
│
▼
LanceDB Store
Stage Details:
| Stage | Component | What It Does |
|---|---|---|
| Message Selection | select_messages() |
Takes the last N messages within budget (20 messages, 200KB). Selects from the end of the conversation to capture the most recent context. |
| Privacy Filter | strip_private_content() |
Replaces <private>...</private> blocks with [REDACTED]. Messages that are fully private (nothing left after stripping) are dropped entirely. |
| Fact Extraction | FactExtractor |
Sends sanitized conversation to LLM with structured prompt. Extracts atomic facts with 3-layer detail (l0/l1/l2), category, and tags. Max 50 facts per extraction. Strips platform envelope metadata (channel info, sender metadata) before sending. |
| Noise Filter | NoiseFilter |
Three-layer filtering: (1) Regex patterns catch greetings, thanks, meta-questions, agent refusals in EN/CN; (2) Vector prototype matching (cosine similarity ≥ 0.82) catches semantically similar noise; (3) Feedback learning — confirmed noise vectors are remembered (up to 200) for future filtering. |
| Admission Control | AdmissionControl |
5-dimension weighted scoring: composite = 0.1·utility + 0.1·confidence + 0.1·novelty + 0.1·recency + 0.6·type_prior. Balanced preset: reject < 0.45, admit ≥ 0.60. Category priors: Profile=0.95, Preferences=0.90, Patterns=0.85, Cases=0.80, Entities=0.75, Events=0.45. |
| Reconciliation | Reconciler |
Compares extracted facts against existing memories using dual search (vector + FTS). Makes one of 7 decisions per fact. See Section 1.4. |
Ingest Modes:
smart(default) — Full pipeline: session store + async LLM extractionraw— Session store only, no LLM processing
Graceful Degradation: If the LLM fails, raw messages are still preserved in the session store. The slow path logs the error and exits without crashing.
Handles bulk document import with intelligent content-type detection and strategy routing.
┌──────────┐ ┌─────────────────┐ ┌───────────────┐ ┌───────────┐
│ File │───▶│ Intelligence │───▶│ Reconciler │───▶│ LanceDB │
│ Upload │ │ Task │ │ (7 decisions) │ │ Store │
└──────────┘ └──────┬──────────┘ └───────────────┘ └───────────┘
│
┌────────▼────────┐
│ Strategy Router │
│ auto / atomic / │
│ section / doc │
└────────┬────────┘
│
┌────────────┼────────────┐
▼ ▼ ▼
┌──────────┐ ┌──────────┐ ┌──────────┐
│ Atomic │ │ Section │ │ Document │
│ Extract │ │ Extract │ │ Extract │
└──────────┘ └──────────┘ └──────────┘
Import Flow:
- Upload — Multipart form with file, strategy, file_type, space_id
- Dedup Check — SHA-256 content hash prevents duplicate imports
- Session Storage — Raw content stored in session table with
import-{task_id}session ID - Background Processing — Acquires
import_semaphore(capacity=3), then:
Strategy Detection (auto mode):
| Content Hint | Detection Rule | Extraction Method |
|---|---|---|
Conversation |
≥ 3 role-pattern lines (\nuser:, \nassistant:, etc.) |
Atomic — chunk & extract per chunk |
LargeDoc |
Content > 80,000 characters | Atomic — smart_split with 2,000 char overlap |
StructuredDoc |
Has markdown headings (# or ## ) AND ≥ 500 words |
Section — split by headings, one memory per section |
ShortNote |
< 500 words, no headings | Document — single comprehensive memory |
Extraction Paths:
- Atomic (
extract_atomic): Splits text into chunks (max 80K chars, 2K overlap) usingsmart_split(). Boundary detection prefers: heading (##) > paragraph (\n\n) > newline (\n) > hard cut. Each chunk is sent to LLM for fact extraction. - Section (
extract_sections): Splits at#and##headings. Each section gets a dedicated LLM prompt producing exactly one memory per section. Retries up to 2 times with exponential backoff (1s, 2s). - Document (
extract_document): Entire text sent as one prompt, producing a single comprehensive memory. Also retries up to 2 times.
Concurrency Control:
import_semaphore(capacity=3) — Limits concurrent extraction tasksreconcile_semaphore(capacity=1) — Serializes reconciliation to prevent race conditions
Source Text Preservation: Each extracted fact retains source_text — the original chunk/section/document text. This becomes the content field in the stored memory, ensuring the original text is searchable via both vector and BM25.
Batch Self-Dedup: When the database is empty (no existing memories to reconcile against), facts within the same import batch are deduplicated via LLM. The LLM identifies duplicate/overlapping facts and returns indices to keep.
The simplest path — creates a single pinned memory with no LLM processing.
API Body ──▶ Memory::new() ──▶ Embed content ──▶ LanceDB create
- Memory type:
Pinned(protected from MERGE/SUPERSEDE by reconciler) - Category:
Preferences(default) - Embedding: Generated immediately from
contentfield - No noise filter, no admission control, no reconciliation
The reconciler is the intelligence layer that prevents duplicate memories and maintains knowledge consistency. For each extracted fact, it:
- Gathers existing memories via dual search (vector + FTS) — up to 60 existing memories, 5 per fact
- Preference slot guard — Detects same-brand-different-item preferences (e.g., "likes Starbucks latte" vs "likes Starbucks americano") and auto-creates without LLM
- LLM decision — Sends facts + existing memories to LLM, receives one decision per fact
Extracted Facts ──▶ gather_existing() ──▶ LLM Reconciliation ──▶ Execute Decisions
│ │
├── vector_search (per fact, top 5, min 0.3) ├── CREATE
└── fts_search (per fact, top 5) ├── MERGE
├── SKIP
├── SUPERSEDE
├── SUPPORT
├── CONTEXTUALIZE
└── CONTRADICT
Decision Types:
| Decision | Effect | When Used |
|---|---|---|
| CREATE | New memory created with embedding | Genuinely new information |
| MERGE | Existing memory updated with combined content + re-embedded | Fact adds detail to existing memory. Profile category always merges. |
| SKIP | No action | Duplicate or less informative than existing |
| SUPERSEDE | New memory created, old memory archived (invalidated_at set, superseded_by linked) |
Fact updates/replaces outdated information |
| SUPPORT | Existing memory's confidence boosted by +0.1 (max 1.0), Supports relation added |
Fact reinforces existing memory |
| CONTEXTUALIZE | New memory created with Contextualizes relation to existing |
Fact adds situational nuance (e.g., "prefers tea in the evening") |
| CONTRADICT | For temporal categories → routes to SUPERSEDE. Otherwise: new memory created, bidirectional Contradicts relations added |
Fact directly contradicts existing memory |
Category-Aware Rules:
profile— Always MERGE when match exists (never SUPERSEDE/CONTRADICT)events,cases— Only CREATE or SKIP (append-only, never modify)preferences,entities— All 7 operations supportedpatterns— Supports MERGE
Pinned Memory Protection: Memories with type Pinned cannot be MERGED or SUPERSEDED. These decisions are automatically downgraded to CREATE.
ID Mapping: The reconciler maps internal UUIDs to sequential integer IDs ([0], [1], ...) in the LLM prompt to prevent UUID leakage and reduce token usage.
Each memory is stored with a multi-layer content structure:
| Field | Source | Purpose |
|---|---|---|
content |
Original source text (chunk/section/document) | BM25 FTS index + vector embedding. The ground truth. |
l0_abstract |
LLM-generated | One-line index entry. Used for scan/browse, FTS indexed. |
l1_overview |
LLM-generated | Structured markdown summary (2-5 lines). Key attributes at a glance. |
l2_content |
LLM-generated | Full narrative with all details, context, and nuance. |
Why dual content? The content field preserves the original text for faithful keyword search and embedding. The l0/l1/l2 layers provide progressively detailed LLM-generated summaries optimized for agent consumption.
Full Schema (29 columns in LanceDB):
| Field | Type | Description |
|---|---|---|
id |
UUID | Unique identifier |
content |
String | Original source text |
l0_abstract |
String | One-line summary |
l1_overview |
String | Structured overview |
l2_content |
String | Detailed narrative |
vector |
Float32[1024] | Embedding vector (nullable) |
category |
Enum | profile, preferences, entities, events, cases, patterns |
memory_type |
Enum | Insight, Session, Pinned |
state |
Enum | Active, Archived, Deleted |
tier |
Enum | Core, Working, Peripheral |
importance |
Float32 | 0.0–1.0, affects retrieval scoring |
confidence |
Float32 | 0.0–1.0, boosted by SUPPORT decisions |
access_count |
Int32 | Retrieval frequency counter |
tags |
JSON | User-defined labels |
relations |
JSON | Array of {relation_type, target_id, context_label} |
superseded_by |
UUID | Link to replacement memory |
invalidated_at |
Timestamp | When memory was superseded |
tenant_id |
String | Tenant isolation |
space_id |
String | Space-based isolation |
visibility |
String | global, private, or shared:<space-id> |
provenance |
JSON | Sharing lineage tracking |
The retrieval pipeline processes search queries through 12 stages, combining vector similarity and keyword matching with progressive refinement:
SearchRequest
│
▼
┌─────────────────────────────────────────────────────────────────────┐
│ Stage 1: parallel_search Vector + BM25 in parallel │
│ Stage 2: rrf_fusion Reciprocal Rank Fusion (K=60) │
│ Stage 3: rrf_normalize Min-max normalize to [0, 1] │
│ Stage 4: min_score_filter Drop if score < 0.3 │
│ Stage 5: topk_cap Truncate to limit × 2 │
│ Stage 6: cross_encoder Rerank blend: 0.6·rerank + 0.4·orig │
│ Stage 7: bm25_floor Protect exact keyword matches │
│ Stage 8: decay_boost Weibull time decay │
│ Stage 9: importance_weight Score × (0.7 + 0.3·importance) │
│ Stage 10: length_norm Penalize overly long content │
│ Stage 11: hard_cutoff Drop if score < 0.005 │
│ Stage 12: mmr_diversity Jaccard dedup + truncate to limit │
└─────────────────────────────────────────────────────────────────────┘
│
▼
Vec<SearchResult> { memory, score }
Executes vector search and BM25 full-text search concurrently via tokio::join!.
- Vector search: Embeds query → ANN search on
vectorcolumn (cosine similarity). Fetcheslimit × 3candidates with min_score=0.0 (no pre-filtering). - BM25 search: Full-text search on
contentandl0_abstractcolumns. Same fetch limit. - Fault tolerance: If either search fails, the pipeline continues with results from the other. Both failing → empty result.
Combines results from both search legs using Reciprocal Rank Fusion:
vector_rrf = vector_weight / (rrf_k + rank) → 0.7 / (60 + rank)
bm25_rrf = bm25_weight / (rrf_k + rank) → 0.3 / (60 + rank)
- Memories appearing in both legs have their RRF scores summed
- Pinned boost: Pinned memories get
score × 1.5
| Parameter | Default | Description |
|---|---|---|
vector_weight |
0.7 | Weight for vector search leg |
bm25_weight |
0.3 | Weight for BM25 search leg |
rrf_k |
60.0 | RRF smoothing constant |
pinned_boost |
1.5 | Multiplier for pinned memories |
Normalizes raw RRF scores (typically ~0.01–0.03) to the [0, 1] range:
- Multiple results: Min-max normalization → highest=1.0, lowest=0.0
- Single result:
score = min(score × 40.0, 1.0) - All equal scores: All set to 1.0
Drops candidates below the minimum score threshold.
| Parameter | Default | Description |
|---|---|---|
min_score |
0.3 | Minimum normalized score to keep |
Sorts by score descending and truncates to limit × 2 candidates. This provides enough candidates for reranking without excessive computation.
If a reranker is configured (Jina, Voyage, or Pinecone), blends reranker scores with original scores:
final_score = rerank_score × 0.6 + original_score × 0.4
| Provider | Default Endpoint |
|---|---|
jina |
https://api.jina.ai/v1/rerank |
voyage |
https://api.voyageai.com/v1/rerank |
pinecone |
https://api.pinecone.io/rerank |
Configure via OMEM_RERANK_PROVIDER and OMEM_RERANK_API_KEY environment variables. Timeout: 5 seconds. If reranker fails, original scores are preserved.
Protects high-quality keyword matches from being over-penalized by the reranker:
if bm25_score ≥ 0.75:
floor = pre_rerank_score × 0.95
score = max(score, floor)
This ensures that exact keyword matches retain at least 95% of their pre-rerank score.
Applies Weibull time-decay to favor recent, frequently-accessed, and important memories:
composite = 0.4·recency + 0.3·frequency + 0.3·intrinsic
recency = exp(-λ · t^β)
λ = ln(2) / (half_life × exp(1.5 × importance))
β = 0.8 (Core) | 1.0 (Working) | 1.3 (Peripheral)
frequency = (1 - exp(-count/5)) × gap_factor
intrinsic = importance × confidence
boosted_score = score × (0.3 + 0.7 × composite)
| Parameter | Default | Description |
|---|---|---|
half_life_days |
30.0 | Base half-life for recency decay |
beta_core |
0.8 | Sub-exponential — Core memories decay slowly |
beta_working |
1.0 | Exponential — standard decay |
beta_peripheral |
1.3 | Super-exponential — Peripheral memories decay fast |
search_boost_min |
0.3 | Minimum boost factor (floor) |
floor_core |
0.9 | Minimum composite for Core tier |
floor_working |
0.7 | Minimum composite for Working tier |
floor_peripheral |
0.5 | Minimum composite for Peripheral tier |
Applies a mild importance-based multiplier:
score × = 0.7 + 0.3 × importance
importance=0→ score × 0.7importance=1→ score × 1.0 (unchanged)
Penalizes excessively long content to prevent verbose memories from dominating:
len_ratio = content.len() / 500.0
denominator = max(1.0, 1.0 + log₂(len_ratio))
score /= denominator
| Content Length | Penalty |
|---|---|
| ≤ 500 chars | None (÷1.0) |
| 1,000 chars | ÷2.0 |
| 2,000 chars | ÷3.0 |
| 4,000 chars | ÷4.0 |
Final safety net — drops any candidate with score below the hard cutoff threshold.
| Parameter | Default | Description |
|---|---|---|
hard_cutoff |
0.005 | Absolute minimum score after all adjustments |
Maximal Marginal Relevance removes near-duplicate results:
- Sort by score descending
- For each candidate, compute word-level Jaccard similarity against all higher-ranked results
- If
jaccard > 0.85with any prior result →score × 0.5(50% penalty) - Re-sort and truncate to final
limit
The retrieval pipeline combines two complementary search strategies:
┌──────────────────────┐
│ Search Query │
└──────────┬───────────┘
│
┌──────────▼───────────┐
│ Embed Query │
│ (1024-dim vector) │
└──────────┬───────────┘
│
┌────────────────┼────────────────┐
▼ ▼
┌─────────────────┐ ┌─────────────────┐
│ Vector Search │ │ BM25 Search │
│ │ │ │
│ ANN on vector │ │ FTS on content │
│ column │ │ + l0_abstract │
│ │ │ │
│ Cosine distance │ │ Keyword match │
│ → similarity │ │ with ranking │
│ │ │ │
│ Weight: 0.7 │ │ Weight: 0.3 │
└────────┬────────┘ └────────┬────────┘
│ │
└──────────────┬────────────────────┘
▼
┌─────────────────┐
│ RRF Fusion │
│ K=60 │
└─────────────────┘
Vector Search:
- Index type: IVF-HNSW-SQ (IVF + HNSW + Scalar Quantization)
- Distance metric: Cosine
- Score conversion:
similarity = 1.0 - cosine_distance - Dimension: 1024 (configurable via embedding provider)
- Filter:
state != 'deleted'+ optional scope/visibility filters
BM25 Full-Text Search:
- Two FTS indexes:
contentcolumn andl0_abstractcolumn - Post-filtering (search first, then apply scope/visibility filters)
- Auto-created on first write (LanceDB requires data before index creation)
Cross-Space Search: When a user has access to multiple spaces, the pipeline runs independently on each space's store, normalizes scores per-space, applies space weights, then merges and re-ranks globally.
ourmem integrates with AI coding platforms through four plugins, each adapted to the platform's extension model:
┌─────────────────────────────────────────────────────────────────────┐
│ AI Agent Platforms │
├──────────────┬──────────────┬──────────────┬────────────────────────┤
│ Claude Code │ OpenCode │ OpenClaw │ MCP (Cursor/VS Code) │
│ Hooks+MCP │ Plugin │ Plugin │ Server │
│ │ │ │ │
│ 3 bash hooks│ 3 hooks │ 3 hooks │ 9 tools │
│ 9 MCP tools │ 5 tools │ 5 tools │ 1 resource │
│ 2 skills │ │ ContextEngine│ │
│ │ │ MemoryBackend│ │
└──────┬───────┴──────┬───────┴──────┬───────┴────────────┬──────────┘
│ │ │ │
└──────────────┴──────────────┴────────────────────┘
│
┌─────────▼──────────┐
│ ourmem Server │
│ REST API │
│ X-API-Key auth │
└────────────────────┘
Installation: /plugin marketplace add ourmem/omem or /plugin install ourmem@ourmem
Architecture: Bash scripts registered via hooks.json, plus a bundled @ourmem/mcp server (via .mcp.json) for on-demand tools, and two skills for slash-command access.
Configuration: Credentials via ~/.claude/settings.json env field (recommended) or environment variables (fallback).
| Env Variable | Default | Purpose |
|---|---|---|
OMEM_API_URL |
https://api.ourmem.ai |
Server URL |
OMEM_API_KEY |
"" |
API authentication (graceful skip if empty) |
Hooks (3):
| Hook | Timeout | Behavior |
|---|---|---|
SessionStart |
15s | GET /v1/memories?limit=20. Formats as markdown list with relative timestamps ("3d ago"), showing l0_abstract (fallback: content, truncated to 200 chars). Injects via hookSpecificOutput.hookEventName.additionalContext. If no API key, shows setup instructions instead. |
Stop |
30s | Reads transcript_path JSONL file (fallback: inline transcript/messages array). Extracts last 10 user+assistant messages (each truncated to 2000 chars). Skips if fewer than 2 messages. POST /v1/memories with mode: "smart", tags: ["auto-captured", "claude-code"]. |
PreCompact |
30s | Same as Stop but extracts last 15 messages. POST /v1/memories with mode: "smart", tags: ["pre-compact", "claude-code"]. |
All hooks use curl with 8-second HTTP timeout. Errors are silently swallowed to never block the session.
MCP Tools (9, bundled via .mcp.json): The plugin bundles @ourmem/mcp as a child MCP server, giving Claude access to: memory_store, memory_search, memory_list, memory_ingest, memory_get, memory_update, memory_forget, memory_stats, memory_profile.
Skills (2): /ourmem:memory-recall (search), /ourmem:memory-store (save).
Installation: Add "@ourmem/opencode" to the plugin array in opencode.json.
Architecture: TypeScript plugin implementing @opencode-ai/plugin interface. Registers 3 hooks and 5 tools. Default export with {id: "ourmem", server} format.
Configuration: Via plugin_config in opencode.json (highest priority), ~/.config/ourmem/config.json (global), or environment variables (fallback).
| Env Variable | Default | Purpose |
|---|---|---|
OMEM_API_URL |
http://localhost:8080 |
Server URL |
OMEM_API_KEY |
"" |
API authentication |
Container Tags: Each session generates two SHA-256 hash-based tags for isolation:
omem_user_{hash(email)[0:16]}— derived fromGIT_AUTHOR_EMAILorUSERomem_project_{hash(cwd)[0:16]}— derived from working directory
These tags are attached to all store and search operations.
Hooks (3):
| Hook | Trigger | Behavior |
|---|---|---|
experimental.chat.system.transform |
Before each LLM call | First message only per session (tracked via injectedSessions Set). Uses the first user message text for semantic search (fallback "*" if no message stored yet). Also fetches user profile via GET /v1/profile. Injects <omem-context> block (memories grouped by category with relative age) + <omem-profile> block into system prompt. If keyword was detected, appends a nudge prompt. |
chat.message |
User sends a message | Stores the first message text in a firstMessages Map (keyed by session ID) for use as the semantic search query in the next system.transform call. Also scans for memory keywords ("remember", "save this", "don't forget", "keep in mind", "note that", "store this", "memorize", "记住", "记一下", "保存", "记下来", "别忘了"). If detected, flags the session so the next system prompt includes a nudge to use memory_store. |
experimental.session.compacting |
Before context compaction | Searches "*" for 20 recent memories with container tags. Injects <omem-context> block into the compaction context so memories survive compaction. |
Tools (5): All return structured JSON {ok, data}.
| Tool | API Call |
|---|---|
memory_store |
POST /v1/memories with content + container tags |
memory_search |
GET /v1/memories/search with container tags |
memory_get |
GET /v1/memories/{id} |
memory_update |
PUT /v1/memories/{id} |
memory_delete |
DELETE /v1/memories/{id} |
Note: OpenCode has no session-end hook. Memory storage relies on the agent proactively using the memory_store tool, or keyword detection nudging the agent to do so.
Installation: openclaw plugins install @ourmem/ourmem
Architecture: Object export {id, name, register()}. The register() method registers hooks via api.on() and tools via api.registerTool(). Also provides a ContextEngine class and MemoryBackend class for framework-level integration.
Configuration:
| Source | Priority | Keys |
|---|---|---|
pluginConfig (openclaw.json) |
1st | apiUrl, apiKey |
| Environment variables | 2nd | OMEM_API_URL, OMEM_API_KEY |
| Defaults | 3rd | https://api.ourmem.ai, "" |
Hooks (3):
| Hook | Trigger | Behavior |
|---|---|---|
before_prompt_build |
Before each LLM call (priority: 50) | Semantic search using event.prompt text (truncated to 500 chars, fallback "*"). Formats memories by category with relative age. Returns { prependContext } with <omem-context> block. |
agent_end |
Agent completes (success only) | Extracts last 20 messages (200KB byte budget). Handles Claude content block arrays (extracts type: "text" blocks). Strips previously injected <omem-context> tags to prevent re-ingestion. Sends to POST /v1/memories with mode: "smart", session_id, agent_id. |
before_reset |
Before /reset or daily reset |
Saves last 3 user messages (each truncated to 300 chars, minimum 10 chars) as a session summary via smart ingest. Prevents memory loss during OpenClaw's daily 4AM reset. |
ContextEngine (7 lifecycle methods): Available as OmemContextEngine class for framework-level integration:
bootstrap() ──▶ Health check (GET /health)
ingest(message) ──▶ Smart ingest single message
assemble(budget) ──▶ Parallel: GET /v1/profile + search memories
Format within token budget (text.length / 4)
Inject <user-profile> + <memories> blocks
afterTurn(turn) ──▶ Smart ingest user + assistant messages
prepareSubagentSpawn() ──▶ Search memories relevant to sub-task (limit=5)
onSubagentEnded(result) ──▶ Smart ingest sub-agent summary
compact() ──▶ No-op (server-side not implemented)
MemoryBackend: OmemMemoryBackend class proxies store(), search(), get(), update(), delete(), list() directly to the ourmem API.
Tools (5): Same 5 tools as OpenCode (without container tags). All return structured JSON {ok, data}.
Installation: npx -y @ourmem/mcp in MCP config (Cursor, VS Code, Claude Desktop, Windsurf).
Architecture: Standalone MCP server process communicating via stdio transport. Pure on-demand mode with no automatic hooks.
Configuration:
| Env Variable | Default | Purpose |
|---|---|---|
OMEM_API_URL |
http://localhost:8080 |
Server URL |
OMEM_API_KEY |
(required) | API key |
Tools (9):
| Tool | Parameters | Description |
|---|---|---|
memory_store |
content, tags?, source? |
Store a memory (source defaults to "mcp") |
memory_search |
query, limit? (1-50), scope?, tags? |
Semantic search with optional tag filtering |
memory_list |
limit? (1-100) |
Browse recent memories without a search query |
memory_ingest |
messages[], mode?, tags? |
Ingest conversation for smart extraction |
memory_get |
id |
Retrieve a memory by ID |
memory_update |
id, content, tags? |
Update memory content or tags |
memory_forget |
id |
Delete a memory (named "forget" not "delete") |
memory_stats |
(none) | Memory statistics by category, type, tier |
memory_profile |
(none) | Synthesized user profile from stored memories |
Resource (1):
| Resource | URI | Description |
|---|---|---|
| User Profile | omem://profile |
Returns synthesized user profile as JSON |
Key Differences from other plugins:
- No automatic hooks. Fully agent-driven.
- 9 tools (vs 5 in OpenCode/OpenClaw). Extra:
memory_list,memory_ingest,memory_stats,memory_profile. - Uses
memory_forgetinstead ofmemory_delete. - Errors surface to the agent (not silently swallowed).
| Feature | Claude Code | OpenCode | OpenClaw | MCP |
|---|---|---|---|---|
| Auto-recall on session start | ✅ SessionStart hook (lists recent 20) | ✅ system.transform (first msg only, semantic search) | ✅ before_prompt_build (semantic search on prompt text) | ❌ |
| Auto-save on session end | ✅ Stop hook (last 10 msgs, smart ingest) | ❌ (no session-end hook) | ✅ agent_end (last 20 msgs, smart ingest) | ❌ |
| Save before reset | ❌ | ❌ | ✅ before_reset (last 3 user msgs) | ❌ |
| Save before compaction | ✅ PreCompact hook (last 15 msgs, smart ingest) | ✅ session.compacting (injects memories into compaction context) | ❌ | ❌ |
| Semantic search recall | ❌ (lists recent) | ✅ (first user message as query) | ✅ (prompt text as query) | ❌ |
| Profile injection | ❌ | ✅ (<omem-profile> block) |
❌ (hooks don't inject profile; ContextEngine.assemble() does) | ✅ (memory_profile tool) |
| Manual tools | ✅ (9 via bundled MCP) | ✅ (5 native) | ✅ (5 native) | ✅ (9) |
| Keyword detection | ❌ | ✅ ("remember", "记住", etc.) | ❌ | ❌ |
| Container tag isolation | ❌ | ✅ (user + project hash tags) | ❌ | ❌ |
| Constant | Value | Location |
|---|---|---|
SMART_SPLIT_MAX_CHARS |
80,000 | intelligence.rs |
SMART_SPLIT_OVERLAP |
2,000 | intelligence.rs |
DEFAULT_MAX_FACTS |
50 | extractor.rs |
DEFAULT_MAX_INPUT_CHARS |
8,000 | extractor.rs |
BYTE_BUDGET |
200,000 | pipeline.rs (ingest) |
MESSAGE_BUDGET |
20 | pipeline.rs (ingest) |
NOISE_THRESHOLD |
0.82 | noise.rs |
MAX_LEARNED_NOISE |
200 | noise.rs |
W_TYPE_PRIOR |
0.6 | admission.rs |
MAX_EXISTING |
60 | reconciler.rs |
MAX_PER_FACT |
5 | reconciler.rs |
MIN_SIMILARITY |
0.3 | reconciler.rs |
VECTOR_DIM |
1024 | lancedb.rs |
RRF_K |
60.0 | pipeline.rs (retrieve) |
VECTOR_WEIGHT |
0.7 | pipeline.rs (retrieve) |
BM25_WEIGHT |
0.3 | pipeline.rs (retrieve) |
MIN_SCORE |
0.3 | pipeline.rs (retrieve) |
HARD_CUTOFF |
0.005 | pipeline.rs (retrieve) |
PINNED_BOOST |
1.5 | pipeline.rs (retrieve) |
RRF_SCALE |
40.0 | pipeline.rs (retrieve) |
HALF_LIFE_DAYS |
30.0 | decay.rs |
BETA_CORE |
0.8 | decay.rs |
BETA_WORKING |
1.0 | decay.rs |
BETA_PERIPHERAL |
1.3 | decay.rs |
SEARCH_BOOST_MIN |
0.3 | decay.rs |