kimi-mneme consists of 4 layers that work together to capture, compress, store, and retrieve coding session context.
┌─────────────────────────────────────────────────────────────────────┐
│                            Kimi Code CLI                            │
│  ┌───────────────┐  ┌──────────────┐  ┌──────────────────────────┐  │
│  │ Lifecycle     │  │ Plugin       │  │ User Prompts             │  │
│  │ Hooks         │  │ Tools        │  │                          │  │
│  │ (7 hooks)     │  │ (3 commands) │  │                          │  │
│  └───────┬───────┘  └──────┬───────┘  └──────────────────────────┘  │
└──────────┼─────────────────┼────────────────────────────────────────┘
           │                 │
           ▼                 ▼
┌─────────────────────────────────────────────────────────────────────┐
│                             Core Engine                             │
│  ┌───────────────┐  ┌──────────────┐  ┌──────────────────────────┐  │
│  │ Extractor     │  │ Compressor   │  │ Injector                 │  │
│  │               │  │              │  │                          │  │
│  │ - Parse tool  │  │ - Semantic   │  │ - Query relevant         │  │
│  │   calls       │  │   summary    │  │   context                │  │
│  │ - Extract     │  │ - Keyword    │  │ - Format for LLM         │  │
│  │   file paths  │  │   extraction │  │ - Inject at session      │  │
│  │ - Capture     │  │ - Token      │  │   start                  │  │
│  │   errors      │  │   counting   │  │ - Cross-session patterns │  │
│  │ - Detect      │  │              │  │                          │  │
│  │   truncation  │  │              │  │                          │  │
│  │ - Checkpoint  │  │              │  │                          │  │
│  │   on compact  │  │              │  │                          │  │
│  └───────┬───────┘  └──────┬───────┘  └──────────────────────────┘  │
└──────────┼─────────────────┼────────────────────────────────────────┘
           │                 │
           ▼                 ▼
┌─────────────────────────────────────────────────────────────────────┐
│                            Storage Layer                            │
│  ┌───────────────┐  ┌──────────────┐  ┌──────────────────────────┐  │
│  │ SQLite        │  │ sqlite-vec   │  │ Web Server               │  │
│  │               │  │ (vectors)    │  │                          │  │
│  │ - Sessions    │  │              │  │ - FastAPI app            │  │
│  │ - Observations│  │ - Embeddings │  │ - REST API               │  │
│  │ - Summaries   │  │ - Similarity │  │ - WebSocket for          │  │
│  │ - Checkpoints │  │   search     │  │   real-time updates      │  │
│  │ - Patterns    │  │              │  │                          │  │
│  │ - Compaction  │  │              │  │                          │  │
│  │ - Truncated   │  │              │  │                          │  │
│  └───────────────┘  └──────────────┘  └──────────────────────────┘  │
└─────────────────────────────────────────────────────────────────────┘
User runs kimi → SessionStart hook fires → Create session record
↓
User sends prompt → UserPromptSubmit hook → Store prompt
↓
AI uses tool → PostToolUse hook → Extract and store observation
↓
Context compaction → PostCompact hook → Create checkpoint
↓
Session ends → SessionEnd hook → Mark session complete + detect patterns
Raw observations → Sanitization pipeline (3-layer privacy filter)
↓
Extractor (clean & structure)
↓
Compressor (configurable LLM: Kimi/Ollama/OpenAI-compatible) → Semantic summary
↓
Store summary + keywords in SQLite
↓
Generate embedding → Store in sqlite-vec
Sanitization pipeline (applied before any AI processing):
- Strip system content: remove `<system>`, `<system_instruction>`, and `<system-reminder>` tags
- Redact sensitive patterns: API keys, tokens, passwords, and private keys, matched via regex
- Sanitize privacy tags: replace `<private>...</private>` with `[PRIVATE]`
- Truncate: limit content length before the API call
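The four sanitization steps can be sketched roughly as follows. This is a minimal illustration, not the project's actual filter: the regular expressions, the `[REDACTED]` placeholder, the `max_chars` default, the choice to drop system blocks together with their content, and the name `sanitize_pipeline` are all assumptions.

```python
import re

# Illustrative patterns only; the real rules are not shown in this document.
SYSTEM_BLOCK = re.compile(
    r"<(system|system_instruction|system-reminder)>.*?</\1>", re.DOTALL
)
SECRET_PATTERNS = [
    re.compile(r"sk-[A-Za-z0-9]{20,}"),                # API-key-like strings (assumed shape)
    re.compile(r"(?i)(password|token)\s*[=:]\s*\S+"),  # key=value style credentials
    re.compile(
        r"-----BEGIN [A-Z ]*PRIVATE KEY-----.*?-----END [A-Z ]*PRIVATE KEY-----",
        re.DOTALL,
    ),
]
PRIVATE_BLOCK = re.compile(r"<private>.*?</private>", re.DOTALL)


def sanitize_pipeline(text: str, max_chars: int = 4000) -> str:
    """Apply the four steps in order; max_chars is a hypothetical limit."""
    text = SYSTEM_BLOCK.sub("", text)            # 1. strip system content
    for pattern in SECRET_PATTERNS:              # 2. redact sensitive patterns
        text = pattern.sub("[REDACTED]", text)
    text = PRIVATE_BLOCK.sub("[PRIVATE]", text)  # 3. sanitize privacy tags
    return text[:max_chars]                      # 4. truncate


raw = "<system>be nice</system>run with token=abc123 <private>my notes</private> done"
print(sanitize_pipeline(raw))
```

Running the redaction before the privacy-tag replacement means a secret inside a `<private>` block is removed either way.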
New session starts → SessionStart hook
↓
Check for previous checkpoints → Inject resume context
↓
Query cross-session patterns → Inject recurring patterns
↓
Injector queries sqlite-vec (vector search)
↓
Rank by relevance + recency
↓
Format as context block
↓
Inject into system prompt
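The "rank by relevance + recency" and "format as context block" steps might look like the sketch below. The scoring formula (similarity multiplied by an exponential half-life decay), the `Candidate` type, the 14-day half-life, and the block heading are assumptions chosen for illustration; the document does not specify how the two signals are combined.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    summary: str
    similarity: float  # e.g. similarity score from the vector search, 0..1
    age_days: float    # time since the summary was stored


def rank(candidates: list[Candidate], half_life_days: float = 14.0,
         limit: int = 5) -> list[Candidate]:
    """Order candidates by similarity weighted with a recency decay."""
    def score(c: Candidate) -> float:
        recency = 0.5 ** (c.age_days / half_life_days)  # halves every half-life
        return c.similarity * recency

    return sorted(candidates, key=score, reverse=True)[:limit]


def format_context(candidates: list[Candidate]) -> str:
    """Render ranked summaries as a block suitable for a system prompt."""
    lines = ["## Relevant context from previous sessions"]
    lines += [f"- {c.summary}" for c in candidates]
    return "\n".join(lines)


old = Candidate("Refactored auth middleware", similarity=0.9, age_days=56)
new = Candidate("Fixed token refresh bug", similarity=0.6, age_days=0)
print(format_context(rank([old, new])))
```

With these numbers the recent, less similar summary outranks the older one, which is the trade-off a recency weight is meant to express.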
User asks "find that auth bug" → AI calls mneme_search
↓
Plugin tool → Hybrid search (SQLite FTS + sqlite-vec vectors)
↓
Progressive disclosure:
Layer 1: Compact index (~50-100 tokens/result)
Layer 2: Timeline around results
Layer 3: Full details for selected IDs
↓
Return to AI
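The document does not say how the FTS and vector result lists are merged in the hybrid search step. One common option is reciprocal rank fusion, sketched here as an assumption rather than as mneme's actual method; the function name and the constant `k = 60` are illustrative.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked ID lists; items ranked high in any list score higher."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for position, obs_id in enumerate(ranking, start=1):
            scores[obs_id] = scores.get(obs_id, 0.0) + 1.0 / (k + position)
    return sorted(scores, key=lambda obs_id: scores[obs_id], reverse=True)


fts_hits = ["obs_1", "obs_2", "obs_3"]  # keyword matches, best first
vec_hits = ["obs_3", "obs_1", "obs_4"]  # semantic matches, best first
print(reciprocal_rank_fusion([fts_hits, vec_hits]))
```

Fusion by rank avoids having to normalize FTS scores and vector distances onto a common scale.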
Note: The `sqlite3` CLI is required for database inspection and internal operations. Install it via your system package manager (`apt install sqlite3`, `brew install sqlite3`, `winget install SQLite.SQLite`, etc.).
-- Sessions table
CREATE TABLE sessions (
id TEXT PRIMARY KEY,
cwd TEXT NOT NULL,
started_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
ended_at TIMESTAMP,
summary TEXT,
token_count INTEGER,
project TEXT
);
-- Observations table (raw events)
CREATE TABLE observations (
id INTEGER PRIMARY KEY AUTOINCREMENT,
session_id TEXT NOT NULL,
event_type TEXT NOT NULL, -- PostToolUse, UserPromptSubmit, etc.
tool_name TEXT,
tool_input TEXT,
tool_output TEXT,
error TEXT,
file_path TEXT,
prompt TEXT,
agent_name TEXT,
content_hash TEXT, -- SHA-256 for deduplication
discovery_tokens INTEGER DEFAULT 0,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
FOREIGN KEY (session_id) REFERENCES sessions(id)
);
-- Summaries table (AI-compressed)
CREATE TABLE summaries (
id INTEGER PRIMARY KEY AUTOINCREMENT,
session_id TEXT NOT NULL,
observation_ids TEXT, -- JSON array of source observation IDs
content TEXT NOT NULL,
keywords TEXT, -- JSON array
embedding_id TEXT, -- vec table reference
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
FOREIGN KEY (session_id) REFERENCES sessions(id)
);
-- Session checkpoints (for resume after compaction/crash)
CREATE TABLE session_checkpoints (
id INTEGER PRIMARY KEY AUTOINCREMENT,
session_id TEXT NOT NULL,
checkpoint_number INTEGER NOT NULL DEFAULT 1,
checkpoint_type TEXT NOT NULL DEFAULT 'auto'
CHECK(checkpoint_type IN ('auto', 'manual', 'compaction', 'crash')),
summary TEXT NOT NULL,
key_decisions TEXT, -- JSON array
open_tasks TEXT, -- JSON array
token_count INTEGER,
observation_count INTEGER,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
FOREIGN KEY (session_id) REFERENCES sessions(id) ON DELETE CASCADE
);
-- Compaction events (context compaction tracking)
CREATE TABLE compaction_events (
id INTEGER PRIMARY KEY AUTOINCREMENT,
session_id TEXT NOT NULL,
compacted_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
tokens_before INTEGER,
tokens_after INTEGER,
observations_dropped INTEGER,
summary_generated TEXT,
FOREIGN KEY (session_id) REFERENCES sessions(id) ON DELETE CASCADE
);
-- Cross-session patterns (errors, fixes, decisions)
CREATE TABLE patterns (
id INTEGER PRIMARY KEY AUTOINCREMENT,
pattern_type TEXT NOT NULL
CHECK(pattern_type IN ('error', 'fix', 'decision', 'preference', 'architecture')),
pattern_hash TEXT UNIQUE NOT NULL,
title TEXT NOT NULL,
description TEXT NOT NULL,
first_seen_session_id TEXT,
last_seen_session_id TEXT,
occurrence_count INTEGER NOT NULL DEFAULT 1,
related_files TEXT, -- JSON array
related_observation_ids TEXT, -- JSON array
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
-- Truncated tool outputs (>100K chars)
CREATE TABLE truncated_outputs (
id INTEGER PRIMARY KEY AUTOINCREMENT,
observation_id INTEGER NOT NULL,
original_size INTEGER NOT NULL,
truncated_size INTEGER NOT NULL,
summary TEXT,
head_preview TEXT,
tail_preview TEXT,
line_count INTEGER,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
FOREIGN KEY (observation_id) REFERENCES observations(id) ON DELETE CASCADE
);
-- Structured observations (AI/heuristic structured metadata)
CREATE TABLE structured_observations (
id INTEGER PRIMARY KEY AUTOINCREMENT,
session_id TEXT NOT NULL,
project TEXT NOT NULL,
type TEXT NOT NULL
CHECK(type IN ('bugfix', 'feature', 'refactor', 'change', 'discovery', 'decision')),
title TEXT NOT NULL,
subtitle TEXT,
facts TEXT, -- JSON array of strings
narrative TEXT, -- Brief description (1-2 sentences)
concepts TEXT, -- JSON array: how-it-works, why-it-exists, etc.
files_read TEXT, -- JSON array of file paths
files_modified TEXT, -- JSON array of file paths
content_hash TEXT NOT NULL,
discovery_tokens INTEGER DEFAULT 0,
raw_observation_id INTEGER,
source TEXT DEFAULT 'ai' -- 'ai', 'heuristic', 'manual'
CHECK(source IN ('ai', 'heuristic', 'manual')),
model TEXT, -- 'kimi-k2.5', 'heuristic', etc.
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
FOREIGN KEY (session_id) REFERENCES sessions(id) ON DELETE CASCADE,
FOREIGN KEY (raw_observation_id) REFERENCES observations(id) ON DELETE SET NULL,
UNIQUE(session_id, content_hash)
);
-- Soft dedup links between structured observations and raw observations
CREATE TABLE structured_observation_links (
id INTEGER PRIMARY KEY AUTOINCREMENT,
existing_structured_id INTEGER NOT NULL,
linked_raw_observation_id INTEGER,
content_hash TEXT NOT NULL,
link_type TEXT DEFAULT 'soft' CHECK(link_type IN ('soft', 'hard')),
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
FOREIGN KEY (existing_structured_id) REFERENCES structured_observations(id) ON DELETE CASCADE,
FOREIGN KEY (linked_raw_observation_id) REFERENCES observations(id) ON DELETE CASCADE
);
-- Vector sync state (watermark-based)
CREATE TABLE vec_sync_state (
id INTEGER PRIMARY KEY CHECK(id = 1),
last_synced_id INTEGER DEFAULT 0,
synced_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
-- Knowledge collections
CREATE TABLE observation_collections (
id INTEGER PRIMARY KEY AUTOINCREMENT,
name TEXT UNIQUE NOT NULL,
description TEXT,
project TEXT,
query TEXT,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
-- Collection items
CREATE TABLE collection_items (
id INTEGER PRIMARY KEY AUTOINCREMENT,
collection_id INTEGER NOT NULL,
observation_id INTEGER NOT NULL,
item_type TEXT DEFAULT 'structured' CHECK(item_type IN ('structured', 'raw')),
added_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
FOREIGN KEY (collection_id) REFERENCES observation_collections(id) ON DELETE CASCADE,
FOREIGN KEY (observation_id) REFERENCES structured_observations(id) ON DELETE CASCADE,
UNIQUE(collection_id, observation_id)
);
-- Full-text search index (raw observations)
CREATE VIRTUAL TABLE observations_fts USING fts5(
content='observations',
content_rowid='id',
tool_name, tool_input, tool_output
);
-- Full-text search index (structured observations)
CREATE VIRTUAL TABLE structured_observations_fts USING fts5(
title,
subtitle,
narrative,
facts,
concepts,
content='structured_observations',
content_rowid='id'
);

# Collection: mneme_observations
{
"ids": ["obs_123", "obs_124", ...],
"embeddings": [[0.1, 0.2, ...], ...],
"metadatas": [
{
"session_id": "sess_abc",
"event_type": "PostToolUse",
"tool_name": "WriteFile",
"file_path": "/project/src/auth.ts",
"timestamp": "2026-05-02T10:00:00Z"
},
...
],
"documents": ["Created auth middleware...", ...]
}

| Event | Trigger | Data Captured |
|---|---|---|
| `SessionStart` | New/resumed session | session_id, cwd, source |
| `UserPromptSubmit` | User sends message | prompt |
| `PreToolUse` | Before tool execution | tool_name, tool_input |
| `PostToolUse` | After successful tool | tool_name, tool_input, tool_output |
| `PostToolUseFailure` | After failed tool | tool_name, tool_input, error |
| `SubagentStart` | Subagent launched | agent_name, prompt |
| `SubagentStop` | Subagent completed | agent_name, response |
| `PreCompact` | Before context compaction | trigger, token_count |
| `PostCompact` | After compaction | trigger, estimated_token_count |
| `Stop` | Agent turn ends | stop_hook_active |
| `StopFailure` | Turn ended with error | error_type, error_message |
| `SessionEnd` | Session closed | reason |
| `Notification` | Notification delivered | sink, type, title, body |
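As a rough illustration of how a hook handler could turn an event into an observation row, the sketch below keeps only the fields listed in the table and computes the SHA-256 `content_hash` that the `observations` schema uses for deduplication. The payload shape, the `CAPTURED_FIELDS` mapping (a subset of the table), and the function name are hypothetical; the actual hook payload format is not described here.

```python
import hashlib
import json

# Fields captured per event, copied from the table (subset shown).
CAPTURED_FIELDS = {
    "SessionStart": ["session_id", "cwd", "source"],
    "UserPromptSubmit": ["prompt"],
    "PostToolUse": ["tool_name", "tool_input", "tool_output"],
    "PostToolUseFailure": ["tool_name", "tool_input", "error"],
    "SessionEnd": ["reason"],
}


def extract_observation(event: str, payload: dict) -> dict:
    """Keep only the captured fields and attach a stable content hash."""
    data = {name: payload.get(name) for name in CAPTURED_FIELDS.get(event, [])}
    # Sorted keys make the hash independent of dict ordering.
    canonical = json.dumps({"event_type": event, **data}, sort_keys=True, default=str)
    return {
        "event_type": event,
        **data,
        "content_hash": hashlib.sha256(canonical.encode("utf-8")).hexdigest(),
    }


payload = {"tool_name": "WriteFile", "tool_input": {"path": "a.py"},
           "tool_output": "ok", "extra": 1}
print(extract_observation("PostToolUse", payload)["content_hash"][:12])
```

Because the hash covers only the captured fields, two events that differ in uncaptured keys produce the same `content_hash` and can be treated as duplicates.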
| Tool | Purpose | Parameters |
|---|---|---|
| `mneme_search` | Search memory index | query, type, limit, date_from, date_to |
| `mneme_timeline` | Get chronological context | observation_id, radius |
| `mneme_get` | Fetch full observation details | ids (array) |
| Tool | Purpose |
|---|---|
| `memory_search` | FTS5 search over structured observations |
| `memory_semantic_search` | sqlite-vec semantic search (with days recency filter) |
| `memory_recall` | Get full observation by ID |
| `memory_timeline` | Chronological observations for a session |
| `memory_stats` | Memory statistics |
| `memory_by_concept` | Search by concept tag |
| `memory_by_file` | Find observations related to a file |
| `memory_workflow` | How to use memory effectively (3-step guide) |
| `smart_search` | Tree-sitter AST symbol search |
| `smart_outline` | File structural outline |
| `smart_unfold` | Symbol body extraction |
| `memory_build_collection` | Create knowledge collection |
| `memory_list_collections` | List collections |
| `memory_export_collection` | Export collection (md/json/plain) |
| `memory_query_collection` | Semantic Q&A over collection |
To minimize token usage, search follows a 3-layer pattern:
Layer 1: mneme_search
→ Returns compact index: id, timestamp, type, snippet
→ ~50-100 tokens per result
Layer 2: mneme_timeline
→ Returns context around specific observation
→ ~200-500 tokens per result
Layer 3: mneme_get
→ Returns full observation details
→ ~500-1000 tokens per result
Total savings: ~10x vs fetching everything upfront
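The savings figure depends on how many results are actually expanded. The arithmetic can be checked with a small sketch; the per-layer defaults below are assumed values taken from within the quoted ranges, and both function names are illustrative.

```python
def progressive_cost(results: int, timelines: int, full_fetches: int,
                     index_tokens: int = 50, timeline_tokens: int = 350,
                     full_tokens: int = 750) -> int:
    """Tokens spent when only selected results are expanded."""
    return (results * index_tokens
            + timelines * timeline_tokens
            + full_fetches * full_tokens)


def upfront_cost(results: int, full_tokens: int = 750) -> int:
    """Tokens spent when every result is fetched in full."""
    return results * full_tokens


progressive = progressive_cost(results=40, timelines=1, full_fetches=1)
upfront = upfront_cost(results=40)
print(progressive, upfront, round(upfront / progressive, 1))
```

With 40 index hits, one timeline lookup, and one full fetch, this comes to 3,100 tokens against 30,000, roughly 9.7x. Expanding more results narrows the gap.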
When Kimi CLI compacts context mid-session, mneme creates a checkpoint:
PostCompact hook fires
↓
Record compaction event (tokens_before, tokens_after)
↓
Extract key decisions from recent observations
↓
Extract open tasks from user prompts
↓
Create checkpoint with summary + decisions + tasks
↓
On next SessionStart: inject checkpoint context
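The checkpoint steps above could be sketched as follows. The keyword lists and the function name are hypothetical, and simple substring matching stands in for whatever extraction mneme actually performs; the returned keys mirror columns of the `session_checkpoints` table.

```python
DECISION_KEYWORDS = ("decided", "let's use", "we will use", "go with")  # assumed
TASK_KEYWORDS = ("todo", "need to", "still have to")                    # assumed


def build_checkpoint(number: int, tokens_before: int, tokens_after: int,
                     observations: list[str], prompts: list[str]) -> dict:
    """Assemble a compaction checkpoint from recent observations and prompts."""
    decisions = [o for o in observations
                 if any(k in o.lower() for k in DECISION_KEYWORDS)]
    tasks = [p for p in prompts
             if any(k in p.lower() for k in TASK_KEYWORDS)]
    return {
        "checkpoint_number": number,
        "checkpoint_type": "compaction",
        "summary": ("Session checkpoint after context compaction. "
                    f"Tokens reduced from {tokens_before} to {tokens_after}."),
        "key_decisions": decisions,  # stored as a JSON array in the table
        "open_tasks": tasks,         # stored as a JSON array in the table
        "token_count": tokens_after,
    }


checkpoint = build_checkpoint(
    3, 5000, 2000,
    observations=["Decided to use FastAPI for API layer", "Read src/app.py"],
    prompts=["We still need to add comprehensive tests", "Show me the config"],
)
print(checkpoint["summary"])
```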
Resume context format:
## 📌 Session Resume Context
**Checkpoint #3** (compaction)
### Summary
Session checkpoint after context compaction. Tokens reduced from 5000 to 2000.
### Key Decisions
- Use FastAPI for API layer
- SQLite for local storage
### Open Tasks
- [ ] Add comprehensive tests
- [ ] Deploy to staging

mneme automatically detects recurring patterns across sessions:

| Pattern Type | Detection Method | Example |
|---|---|---|
| `error` | Same tool fails ≥2 times | "Recurring error in Shell: npm test" |
| `fix` | Error followed by success on same file | "Fixed error in src/auth.ts" |
| `decision` | User prompt contains decision keywords | "Decided to use PostgreSQL" |
| `preference` | Repeated coding style choices | "Prefer TypeScript over JavaScript" |
| `architecture` | Structural decisions | "Use repository pattern for DB access" |
Patterns are injected into new sessions as a "Recurring Patterns" section.
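As an illustration of the `error` rule (same tool fails at least twice), the sketch below counts failed tool calls and emits one pattern per repeated failure. Grouping by tool name plus tool input, and hashing the title for `pattern_hash`, are assumptions made for the example; the observation dicts use column names from the `observations` table.

```python
import hashlib
from collections import Counter


def detect_error_patterns(observations: list[dict], min_count: int = 2) -> list[dict]:
    """Return one 'error' pattern for each tool call that failed repeatedly."""
    failures = Counter(
        (o["tool_name"], o["tool_input"])
        for o in observations
        if o["event_type"] == "PostToolUseFailure"
    )
    patterns = []
    for (tool, tool_input), count in failures.items():
        if count >= min_count:
            title = f"Recurring error in {tool}: {tool_input}"
            patterns.append({
                "pattern_type": "error",
                "pattern_hash": hashlib.sha256(title.encode("utf-8")).hexdigest(),
                "title": title,
                "occurrence_count": count,
            })
    return patterns


events = [
    {"event_type": "PostToolUseFailure", "tool_name": "Shell", "tool_input": "npm test"},
    {"event_type": "PostToolUse", "tool_name": "Shell", "tool_input": "npm install"},
    {"event_type": "PostToolUseFailure", "tool_name": "Shell", "tool_input": "npm test"},
]
print(detect_error_patterns(events))
```

A deterministic `pattern_hash` lets a later session update `occurrence_count` on the existing row instead of inserting a duplicate, which fits the `UNIQUE` constraint in the `patterns` table.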
Content wrapped in <private>...</private> tags is excluded from storage:
import re

def sanitize_content(text: str) -> str:
    """Replace private blocks with a [PRIVATE] placeholder before storage."""
    return re.sub(r'<private>.*?</private>', '[PRIVATE]', text, flags=re.DOTALL)

Additional exclusion via patterns in config:
{
"privacy": {
"exclude_patterns": ["*.env*", "*secret*", "*password*", "*token*"]
}
}

Raw observations are compressed when:
- Session ends
- Token count exceeds threshold
- Explicitly requested
Compression prompt:
Summarize the following coding session observations into a concise,
semantic summary. Include:
- What was accomplished
- Key files modified
- Important decisions made
- Any errors or blockers
Observations:
{observations}
Summary:
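A minimal sketch of how the compression triggers and the prompt template might be wired together. The function names, the threshold value in the usage lines, and the bullet formatting of observations are assumptions; only the template text comes from the prompt above.

```python
PROMPT_TEMPLATE = """Summarize the following coding session observations into a concise,
semantic summary. Include:
- What was accomplished
- Key files modified
- Important decisions made
- Any errors or blockers

Observations:
{observations}

Summary:"""


def should_compress(session_ended: bool, token_count: int, threshold: int,
                    requested: bool = False) -> bool:
    """True if any of the three compression triggers applies."""
    return session_ended or token_count > threshold or requested


def build_prompt(observations: list[str]) -> str:
    """Fill the template with one bullet per sanitized observation."""
    bullets = "\n".join(f"- {o}" for o in observations)
    return PROMPT_TEMPLATE.format(observations=bullets)


if should_compress(session_ended=False, token_count=9000, threshold=8000):
    print(build_prompt(["Edited src/auth.ts", "npm test failed {exit 1}"]))
```

Passing the observations as a format argument keeps braces inside them from being interpreted as placeholders.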
| Component | Windows | macOS | Linux | iOS/BSD |
|---|---|---|---|---|
| Filesystem Watcher | `WindowsApiObserver` | `FSEventsObserver` | `InotifyObserver` | `PollingObserver` |
| Event Loop | `winloop` (optional) | `asyncio` | `asyncio` | `asyncio` |
| Process Launch | `CREATE_NEW_PROCESS_GROUP` | `start_new_session` | `start_new_session` | `start_new_session` |
| Server Logs | `~/.kimi/mneme/server.log` | `~/.kimi/mneme/server.log` | `~/.kimi/mneme/server.log` | `~/.kimi/mneme/server.log` |
- Process survival: Server uses the `CREATE_NEW_PROCESS_GROUP` flag to survive parent console closure
- No console window: `CREATE_NO_WINDOW` prevents a popup console window
- Event loop: The optional `winloop` package provides better async performance than the default `asyncio`
- Background scan: Disabled by default on large databases to prevent startup overload
- Session management: `start_new_session=True` detaches from the parent session
- Logs: Server stdout/stderr are redirected to `~/.kimi/mneme/server.log`
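The platform-specific launch options can be expressed with the standard `subprocess` module roughly as below. The helper names are hypothetical and the server command is left to the caller; only the flags, `start_new_session`, and the log redirection come from the notes above.

```python
import subprocess
import sys


def detached_popen_kwargs() -> dict:
    """Keyword arguments that detach a child process from its parent."""
    if sys.platform == "win32":
        # These constants exist only in Windows builds of Python.
        flags = subprocess.CREATE_NEW_PROCESS_GROUP | subprocess.CREATE_NO_WINDOW
        return {"creationflags": flags}
    return {"start_new_session": True}


def launch_server(command: list[str], log_path: str) -> subprocess.Popen:
    """Start the command detached, appending stdout/stderr to log_path."""
    with open(log_path, "ab") as log:
        # The child gets its own copy of the log handle, so closing ours is safe.
        return subprocess.Popen(
            command,
            stdout=log,
            stderr=subprocess.STDOUT,
            stdin=subprocess.DEVNULL,
            **detached_popen_kwargs(),
        )


print(detached_popen_kwargs())
```

Resolving the Windows-only constants inside the `win32` branch keeps the module usable on macOS and Linux, where those attributes are not defined.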
| Metric | Target |
|---|---|
| Hook execution | < 100ms (fire-and-forget) |
| Search query | < 500ms |
| Vector search | < 200ms |
| Collection query | < 500ms |
| Web UI load | < 1s |
| DB size growth | ~1MB per 100 sessions |
| Checkpoint creation | < 50ms |
| Pattern detection | < 100ms (async) |
- All hook commands run in an isolated subprocess
- Database is local-only (no network access)
- API key stored in config or env var, not in code
- Privacy tags prevent accidental data leakage
- Truncated outputs preserve head/tail previews without full content
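The last point, keeping head and tail previews instead of the full output, can be illustrated with a short sketch. The 100,000-character threshold follows the `truncated_outputs` table comment; the preview size and the function name are assumptions, and the returned keys mirror that table's columns.

```python
from typing import Optional

MAX_OUTPUT_CHARS = 100_000  # threshold from the truncated_outputs table comment
PREVIEW_CHARS = 500         # hypothetical preview size


def truncate_output(output: str) -> Optional[dict]:
    """Return a truncation record for oversized output, or None if it fits."""
    if len(output) <= MAX_OUTPUT_CHARS:
        return None
    return {
        "original_size": len(output),
        "truncated_size": 2 * PREVIEW_CHARS,
        "head_preview": output[:PREVIEW_CHARS],
        "tail_preview": output[-PREVIEW_CHARS:],
        "line_count": output.count("\n") + 1,
    }


big = "x" * 60_000 + "\n" + "y" * 60_000
record = truncate_output(big)
print(record["original_size"], record["line_count"])
```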