Long-term Memory gives CoPAW persistent memory across conversations: key information is written to Markdown files for long-term storage and can be recalled at any time via semantic search.
The long-term memory mechanism is inspired by OpenClaw and implemented by ReMe.
```mermaid
graph TB
    User[User / Agent] --> MM[MemoryManager]
    MM --> MemoryMgmt[Long-term Memory Management]
    MemoryMgmt --> FileTools[Memory Update]
    MemoryMgmt --> Watcher[Memory Index Update]
    MemoryMgmt --> SearchLayer[Hybrid Memory Search]
    FileTools --> LTM[MEMORY.md]
    FileTools --> DailyLog[memory/YYYY-MM-DD.md]
    Watcher --> Index[Async DB Update]
    SearchLayer --> VectorSearch[Vector Semantic Search]
    SearchLayer --> BM25[BM25 Full-text Search]
```
Long-term memory management includes the following capabilities:
| Capability | Description |
|---|---|
| Memory Persistence | Writes key information to Markdown files via file tools (read / write / edit); files are the source of truth |
| File Watching | Monitors file changes via watchfile, asynchronously updating the local database (semantic index & vector index) |
| Semantic Search | Recalls relevant memories by semantics using vector embeddings + BM25 hybrid search |
| File Reading | Reads the corresponding Memory Markdown files directly via file tools, loading on demand to keep the context lean |
Memories are stored as plain Markdown files, operated directly by the Agent via file tools. The default workspace uses a two-level structure:
```mermaid
graph LR
    Workspace[Workspace working_dir] --> MEMORY[MEMORY.md Long-term Memory]
    Workspace --> MemDir[memory/*]
    MemDir --> Day1[2025-02-12.md]
    MemDir --> Day2[2025-02-13.md]
    MemDir --> DayN[...]
```
`MEMORY.md` stores long-lasting, rarely changing key information.

- Location: `{working_dir}/MEMORY.md`
- Purpose: stores decisions, preferences, and persistent facts
- Updates: written by the Agent via the `write` / `edit` file tools
`memory/YYYY-MM-DD.md` holds one page per day, appended with that day's work and interactions.

- Location: `{working_dir}/memory/YYYY-MM-DD.md`
- Purpose: records daily notes and runtime context
- Updates: appended by the Agent via the `write` / `edit` file tools; triggered automatically when a conversation grows too long and needs summarization
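For illustration, a daily log page might look like the following sketch; all entries below are hypothetical and only echo the kinds of examples used elsewhere on this page:

```markdown
# 2025-02-13

- Fixed login bug today; root cause was an expired session token
- Deployed v2.1 to staging
- User prefers the pytest framework for new test suites
```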
| Information Type | Write Target | Method | Example |
|---|---|---|---|
| Decisions, preferences, persistent facts | `MEMORY.md` | `write` / `edit` tools | "Project uses Python 3.12", "Prefers pytest framework" |
| Daily notes, runtime context | `memory/YYYY-MM-DD.md` | `write` / `edit` tools | "Fixed login bug today", "Deployed v2.1" |
| User says "remember this" | Write to file immediately | `write` tool | Do not only save in memory! |
Embedding configuration is used for vector semantic search. Configuration priority: config file > env var > default.
Configure in `agent.json` under `running.embedding_config`:
| Config Field | Description | Default |
|---|---|---|
| `backend` | Embedding backend type | `openai` |
| `api_key` | API Key for the Embedding service | (empty) |
| `base_url` | URL of the Embedding service | (empty) |
| `model_name` | Embedding model name | (empty) |
| `dimensions` | Vector dimensions used to initialize the vector database | `1024` |
| `enable_cache` | Whether to enable the Embedding cache | `true` |
| `use_dimensions` | Whether to pass the `dimensions` parameter in API requests | `false` |
| `max_cache_size` | Maximum number of Embedding cache entries | `2000` |
| `max_input_length` | Maximum input length per Embedding request | `8192` |
| `max_batch_size` | Maximum batch size for Embedding requests | `10` |
`use_dimensions` exists because some models served via vLLM do not support the `dimensions` parameter. Set it to `false` to skip sending it.
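Putting the fields above together, a sketch of the relevant `agent.json` fragment might look like this; the `api_key`, `base_url`, and `model_name` values are placeholders, not recommendations:

```json
{
  "running": {
    "embedding_config": {
      "backend": "openai",
      "api_key": "sk-placeholder",
      "base_url": "https://embeddings.example.com/v1",
      "model_name": "my-embedding-model",
      "dimensions": 1024,
      "enable_cache": true,
      "use_dimensions": false
    }
  }
}
```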
When not set in config file, these environment variables serve as fallback:
| Environment Variable | Description | Default |
|---|---|---|
| `EMBEDDING_API_KEY` | API Key for the Embedding service | (empty) |
| `EMBEDDING_BASE_URL` | URL of the Embedding service | (empty) |
| `EMBEDDING_MODEL_NAME` | Embedding model name | (empty) |
`api_key`, `model_name`, and `base_url` must all be non-empty to enable vector search in hybrid retrieval.
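For example, the fallback variables could be exported before starting CoPAW; the values below are placeholders:

```shell
# Used only when agent.json leaves the corresponding embedding_config fields empty
export EMBEDDING_API_KEY="sk-placeholder"
export EMBEDDING_BASE_URL="https://embeddings.example.com/v1"
export EMBEDDING_MODEL_NAME="my-embedding-model"
```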
Control BM25 full-text search via the `FTS_ENABLED` environment variable:

| Environment Variable | Description | Default |
|---|---|---|
| `FTS_ENABLED` | Whether to enable full-text search | `true` |
Even without Embedding configured, enabling full-text search allows keyword search via BM25.
Configure the memory storage backend via the `MEMORY_STORE_BACKEND` environment variable:

| Environment Variable | Description | Default |
|---|---|---|
| `MEMORY_STORE_BACKEND` | Memory storage backend: `auto`, `local`, `chroma`, or `sqlite` | `auto` |
Storage backend options:
| Backend | Description |
|---|---|
| `auto` | Auto-select: uses `local` on Windows, `chroma` on other systems |
| `local` | Local file storage, no extra dependencies, best compatibility |
| `chroma` | Chroma vector database, supports efficient vector retrieval; may core dump in some Windows environments |
| `sqlite` | SQLite database + vector extension; may freeze or crash on macOS 14 and below |
Recommended: use the default `auto` mode, which automatically selects the most stable backend for your platform.
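If you do need to override auto-detection, the variable can be set explicitly before launch; this is an illustrative example, not a required step:

```shell
# Example: force the pure-local backend (e.g. when Chroma is unstable on your platform)
export MEMORY_STORE_BACKEND=local
```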
The Agent has two ways to retrieve past memories:
| Method | Tool | Use Case | Example |
|---|---|---|---|
| Semantic search | `memory_search` | Unsure which file contains the info; fuzzy recall by intent | "Previous discussion about deployment process" |
| Direct read | `read_file` | Known specific date or file path; precise lookup | Read `memory/2025-02-13.md` |
Memory search uses Vector + BM25 hybrid search by default. The two search methods complement each other's strengths.
Maps text into a high-dimensional vector space and measures semantic distance via cosine similarity, capturing content with similar meaning but different wording:
| Query | Recalled Memory | Why It Matches |
|---|---|---|
| "Database choice for the project" | "Finally decided to replace MySQL with PostgreSQL" | Semantically related: both discuss database technology choices |
| "How to reduce unnecessary rebuilds" | "Configured incremental compilation to avoid full builds" | Semantic equivalence: reduce rebuilds ≈ incremental compilation |
| "Performance issue discussed last time" | "Optimized P99 latency from 800ms to 200ms" | Semantic association: performance issue ≈ latency optimization |
However, vector search is weaker on precise, high-signal tokens, as embedding models tend to capture overall semantics rather than exact matches of individual tokens.
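The cosine-similarity measure mentioned above can be sketched in a few lines. This is an illustrative toy with hand-made 3-dimensional vectors, not the project's actual embedding pipeline:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors: 1.0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings": semantically similar texts map to nearby directions
query_vec = [0.9, 0.1, 0.0]
memory_vec = [0.8, 0.2, 0.0]      # similar meaning -> close to 1.0
unrelated_vec = [0.0, 0.1, 0.9]   # unrelated meaning -> close to 0.0

print(cosine_similarity(query_vec, memory_vec))
print(cosine_similarity(query_vec, unrelated_vec))
```

Real embedding models produce vectors with hundreds or thousands of dimensions (see `dimensions` above), but the distance computation is the same idea.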
BM25 matches on term-frequency statistics and substring hits, making it excellent for precise token matches but weaker at semantic understanding (synonyms, paraphrasing).
| Query | BM25 Hits | BM25 Misses |
|---|---|---|
| `handleWebSocketReconnect` | Memory fragments containing that function name | "WebSocket disconnection reconnection handling logic" |
| `ECONNREFUSED` | Log entries containing that error code | "Database connection refused" |
Scoring logic: Splits the query into terms, counts the hit ratio of each term in the target text, and awards a bonus for complete phrase matches:
```
base_score   = hit_terms / total_query_terms        # range [0, 1]
phrase_bonus = 0.2                                  # only when a multi-word query matches as a complete phrase
score        = min(1.0, base_score + phrase_bonus)  # capped at 1.0
```
Example: the query "database connection timeout" hits a passage containing only "database" and "timeout" → `base_score = 2/3 ≈ 0.67`, no complete phrase match → `score = 0.67`.
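The scoring rule above can be sketched as a small function; `bm25_lite_score` is a hypothetical name for illustration, not the project's actual implementation:

```python
def bm25_lite_score(query: str, text: str) -> float:
    """Term-hit ratio plus a phrase bonus, capped at 1.0 (illustrative sketch)."""
    terms = query.lower().split()
    target = text.lower()
    if not terms:
        return 0.0
    hit_terms = sum(1 for t in terms if t in target)
    base_score = hit_terms / len(terms)  # range [0, 1]
    # Bonus only when a multi-word query appears as a complete phrase
    phrase_bonus = 0.2 if len(terms) > 1 and query.lower() in target else 0.0
    return min(1.0, base_score + phrase_bonus)  # capped at 1.0

# The worked example from the text: 2 of 3 terms hit, no complete phrase
print(bm25_lite_score("database connection timeout",
                      "database query hit a timeout yesterday"))  # ≈ 0.67
```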
To handle ChromaDB's case-sensitive `$contains` behavior, the search automatically generates multiple case variants of each term (original, lowercase, capitalized, uppercase) to improve recall.
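That variant generation can be sketched as follows; the function name and return shape are hypothetical, only the four variants come from the text:

```python
def case_variants(term: str) -> list[str]:
    """Original, lowercase, capitalized, and uppercase forms, deduplicated in order."""
    seen: dict[str, None] = {}
    for variant in (term, term.lower(), term.capitalize(), term.upper()):
        seen.setdefault(variant, None)  # dict preserves insertion order
    return list(seen)

print(case_variants("webSocket"))  # ['webSocket', 'websocket', 'Websocket', 'WEBSOCKET']
```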
Uses both vector and BM25 recall signals simultaneously, performing weighted fusion on the results (default vector weight 0.7, BM25 weight 0.3):
- Expand candidate pool: multiply the desired result count by `candidate_multiplier` (default 3×, capped at 200); each path retrieves its candidates independently
- Independent scoring: vector and BM25 each return a scored result list
- Weighted merging: deduplicate and fuse by each chunk's unique identifier (`path + start_line + end_line`):
  - Recalled by vector only → `final_score = vector_score × 0.7`
  - Recalled by BM25 only → `final_score = bm25_score × 0.3`
  - Recalled by both → `final_score = vector_score × 0.7 + bm25_score × 0.3`
- Sort and truncate: sort by `final_score` descending, return the top-N results
Example: Query "handleWebSocketReconnect disconnection reconnect"
| Memory Fragment | Vector Score | BM25 Score | Fused Score | Rank |
|---|---|---|---|---|
| "handleWebSocketReconnect function handles WebSocket disconnection reconnect" | 0.85 | 1.0 | 0.85×0.7 + 1.0×0.3 = 0.895 | 1 |
| "Logic for automatic retry after network disconnection" | 0.78 | 0.0 | 0.78×0.7 = 0.546 | 2 |
| "Fixed null pointer exception in handleWebSocketReconnect" | 0.40 | 0.5 | 0.40×0.7 + 0.5×0.3 = 0.430 | 3 |
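The fusion rule can be sketched as a small function that reproduces the table above; the `fuse` helper and the `frag_*` chunk ids are illustrative stand-ins, not real identifiers:

```python
def fuse(vector_hits: dict[str, float], bm25_hits: dict[str, float],
         vector_weight: float = 0.7, bm25_weight: float = 0.3) -> list[tuple[str, float]]:
    """Weighted fusion keyed by chunk id; a missing score counts as 0.0."""
    chunk_ids = set(vector_hits) | set(bm25_hits)  # union = dedup across both paths
    fused = {
        cid: vector_hits.get(cid, 0.0) * vector_weight
             + bm25_hits.get(cid, 0.0) * bm25_weight
        for cid in chunk_ids
    }
    # Sort by fused score, descending
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)

# The three fragments from the example table above
vector_hits = {"frag_a": 0.85, "frag_b": 0.78, "frag_c": 0.40}
bm25_hits = {"frag_a": 1.0, "frag_c": 0.5}
print(fuse(vector_hits, bm25_hits))  # frag_a ≈ 0.895, frag_b ≈ 0.546, frag_c ≈ 0.430
```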
```mermaid
graph LR
    Query[Search Query] --> Vector[Vector Semantic Search × 0.7]
    Query --> BM25[BM25 Full-text Search × 0.3]
    Vector --> Merge[Deduplicate by chunk + Weighted sum]
    BM25 --> Merge
    Merge --> Sort[Sort by fused score descending]
    Sort --> Results[Return top-N results]
```
Summary: Using any single search method alone has blind spots. Hybrid search lets the two signals complement each other, delivering reliable recall whether you're asking in natural language or searching for exact terms.
- Introduction — What this project can do
- Console — Manage memory and configuration in the console
- Skills — Built-in and custom capabilities
- Configuration & Working Directory — Working directory and config