Author: Alex 3 | Date: 2026-03-18
Status: Implemented and in active use in packages/memory/
Inspired by: Mem0 (fact extraction + dedup), OpenViking (L0/L1/L2 layered loading)
Package: @bit-office/memory, implemented in packages/memory/src/ with dedicated unit test coverage in src/__tests__/
| Layer | Scope | What it stores |
|---|---|---|
| L0 -- Ephemeral | Current conversation | Sliding window (in-memory) |
| L1 -- Session | Per-task | Structured summary: what/why/files/decisions/commits |
| L2 -- Agent | Per-agent, long-term | Learned facts & preferences (up to 50, auto-deduped) |
| L3 -- Shared | Cross-agent | Project-wide knowledge promoted from L2 |
After a session crash, agents recover with structured context (task summary, changed files, commits, decisions) instead of raw chat fragments. Facts are extracted by rule-based parsing (zero extra LLM cost) and deduplicated via Jaccard similarity. Agent facts that have been reinforced at least 3 times auto-promote to shared project knowledge.
Review patterns, tech preferences, and project ratings also persist (via the orchestrator's memory.ts). Agents learn what you value and adapt -- low visual scores lead to richer designs, recurring review failures get avoided automatically. The orchestrator injects these into worker prompts as ===== LEARNED FROM PREVIOUS PROJECTS =====.
Open Office currently has two disconnected memory mechanisms:
// What we save on task success:
interface RecoveryContext {
originalTask?: string; // truncated to 300 chars
phase?: string;
lastResult?: string; // truncated to 200 chars
recentMessages?: Array<{ // last 6 messages, each 400 chars max
role: "user" | "assistant";
text: string;
}>;
}

Problems:
- `recentMessages` is a raw sliding window (last 6 turns), not semantically meaningful
- 400-char truncation cuts mid-sentence, loses key information
- No structured facts — just raw conversation fragments
- After session loss, agent gets fragments like "Let me verify the build compiles" — useless without context
- Real example: Alex 2 recovered with 2 messages, couldn't remember what was built or why
interface MemoryStore {
reviewPatterns: ReviewPattern[]; // from reviewer FAIL verdicts
techPreferences: string[]; // from approved plan TECH lines
projectHistory: ProjectRecord[]; // summary + ratings
}

Problems:
- Only captures structured data from team workflow events (review, plan approval, completion)
- No agent-level learning (e.g. "user prefers dashed borders" or "this codebase uses TERM_HOVER theme tokens")
- No cross-agent knowledge sharing
- No session-level work summaries (what was done, what decisions were made)
# Memory Index
- [feedback_edit_visibility.md](feedback_edit_visibility.md): After code edits, explicitly state what changed

Problems:
- Only accessible to Claude Code itself, not to our orchestrator
- Can't be structured, queried, or shared across agents
- No deduplication — entries accumulate without cleanup
What we HAVE: What we NEED:
┌──────────────┐ ┌──────────────────────────┐
│ Raw messages │ │ Structured work summary │
│ (6 x 400ch) │ │ (what/why/files/decisions)│
├──────────────┤ ├──────────────────────────┤
│ Project-level│ │ Session-level learning │
│ history only │ │ (per-task facts) │
├──────────────┤ ├──────────────────────────┤
│ Single agent │ │ Cross-agent context │
│ context only │ │ (L0 summaries) │
└──────────────┘ └──────────────────────────┘
| # | Goal | Metric |
|---|---|---|
| G1 | Agent recovers with actionable context after session loss | Recovery prompt contains structured facts, not raw fragments |
| G2 | Cross-agent awareness without token explosion | Other agents get 1-2 line L0 summary (~50 tokens), not full chat history |
| G3 | Accumulate agent-level learning (not just project-level) | Facts like "user prefers solid borders" persist across sessions |
| G4 | Zero external dependencies | No vector DB, no external LLM calls, no new services |
| G5 | Backward compatible | Existing memory.ts and RecoveryContext continue to work |
| G6 | Minimal token cost | Fact extraction is rule-based at session end, with no extra model call |
| G7 | Crash-safe recovery | Persist in-progress work state so interrupted sessions recover with actionable context |
Inspired by Mem0's tiered memory + OpenViking's L0/L1/L2 loading:
┌─────────────────────────────────────────────────────────────┐
│ Open Office Memory │
├─────────────┬──────────────┬──────────────┬─────────────────┤
│ Layer 0 │ Layer 1 │ Layer 2 │ Layer 3 │
│ EPHEMERAL │ SESSION │ AGENT │ SHARED │
├─────────────┼──────────────┼──────────────┼─────────────────┤
│ Current │ Per-task │ Per-agent │ Cross-agent │
│ conversation│ structured │ long-term │ project-wide │
│ sliding │ summary │ facts & │ knowledge │
│ window │ + facts │ preferences │ │
├─────────────┼──────────────┼──────────────┼─────────────────┤
│ In-memory │ JSON file │ JSON file │ JSON file │
│ (volatile) │ (persisted) │ (persisted) │ (persisted) │
├─────────────┼──────────────┼──────────────┼─────────────────┤
│ ~6 turns │ Last 30 │ Up to 50 │ Up to 20 │
│ (~2400 tok) │ sessions │ facts/agent │ facts total │
│ │ (~200 tok ea)│ (~1000 tok) │ (~500 tok) │
├─────────────┼──────────────┼──────────────┼─────────────────┤
│ NOW │ IMPROVE │ NEW │ NEW │
│ (keep as-is)│ (upgrade) │ │ │
└─────────────┴──────────────┴──────────────┴─────────────────┘
Agent running task...
│
▼
┌─────────────────────┐
│ L0: conversationLog │ ← sliding window (existing)
│ (raw messages, 6) │
└──────────┬──────────┘
│ periodic progress snapshots
▼
┌───────────────────────┐
│ Work State Snapshot │ ← persisted while task runs
│ (summary/next/files) │
└──────────┬───────────┘
│ task completes / interrupts
▼
┌───────────────────────┐
│ Fact Extraction │ ← extract structured data
│ (parse agent output) │ from stdoutBuffer, no LLM needed
└──────────┬───────────┘
│
┌───────────┼───────────────┐
▼ ▼ ▼
┌──────────────┐ ┌──────────┐ ┌──────────────┐
│ L1: Session │ │ L2: Agent│ │ L3: Shared │
│ Summary │ │ Facts │ │ Knowledge │
│ (what/why/ │ │ (prefs, │ │ (project │
│ files/next) │ │ patterns│ │ conventions)│
└──────────────┘ │ learned)│ └──────────────┘
└──────────┘
~/.bit-office/memory/
├── memory.json # Legacy project-level memory
├── sessions/
│ └── {agentId}.json # L1: latest + 30-item history ring buffer
├── agents/
│ └── {agentId}.json # L2: per-agent learned facts
├── work-state/
│ └── {agentId}.json # Crash-safe in-progress snapshot
└── shared.json # L3: cross-agent project knowledge
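
The `sessions/{agentId}.json` entry above is described as the latest summary plus a 30-item history ring buffer. A minimal sketch of that update step follows; the `SessionFile` shape and the `pushSession` helper are hypothetical names used for illustration, and the real file layout may differ.

```typescript
// Hypothetical shape for sessions/{agentId}.json; the shipped layout may differ.
interface SessionEntry {
  timestamp: string;
  what: string;
}

interface SessionFile {
  latest: SessionEntry | null;
  history: SessionEntry[]; // oldest first, capped at MAX_HISTORY
}

const MAX_HISTORY = 30; // cap taken from the directory layout above

// Returns a new file object with `entry` as the latest session
// and the history trimmed to the most recent MAX_HISTORY entries.
function pushSession(file: SessionFile, entry: SessionEntry): SessionFile {
  const history = file.history.concat([entry]);
  const trimmed = history.slice(Math.max(0, history.length - MAX_HISTORY));
  return { latest: entry, history: trimmed };
}
```

Returning a new object rather than mutating in place keeps the write path simple: the caller serializes the result and replaces the file in one step.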
interface SessionSummary {
/** ISO timestamp */
timestamp: string;
/** One-line description: "Optimized MultiPaneView pagination UI" */
what: string;
/** Key decisions made during this session */
decisions: string[];
/** Files created or modified */
filesChanged: string[];
/** Unfinished work or known issues */
unfinished: string[];
/** Git commits created */
commits: string[];
/** Token usage for this session */
tokens: { input: number; output: number };
}

No LLM call needed. We already have all the data in `AgentSession`:
| Field | Source |
|---|---|
| `what` | `extractResult().summary` or `SUMMARY:` line from stdout |
| `filesChanged` | `taskChangedFiles` Set, already tracked from `tool_use` events |
| `commits` | Parse from `stdoutBuffer`: lines matching `Committed [a-f0-9]+` |
| `unfinished` | Parse from output: lines after "TODO" / "unfinished" / "remaining" |
| `decisions` | Parse from output: lines with "changed from X to Y" / "chose X over Y" / "because" |
| `tokens` | `taskInputTokens` / `taskOutputTokens`, already tracked |
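
The parsing rules in the table can be sketched as plain line scans. This is an illustration only: the function names and the exact regexes are assumptions, and "lines after TODO" is interpreted here as the text following the marker on the same line. The shipped `extract.ts` may behave differently.

```typescript
// Illustrative only: these regexes approximate the table above.

// Collects unique commit hashes from lines such as "Committed ad8ed51".
function extractCommits(stdout: string): string[] {
  const commits: string[] = [];
  const re = /Committed\s+([a-f0-9]+)/g;
  let m: RegExpExecArray | null;
  while ((m = re.exec(stdout)) !== null) {
    if (commits.indexOf(m[1]) === -1) commits.push(m[1]);
  }
  return commits;
}

// Keeps lines that look like a decision ("changed from X to Y", "chose X over Y", "because").
function extractDecisions(stdout: string): string[] {
  const re = /changed from .+ to |chose .+ over |\bbecause\b/i;
  return stdout
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0 && re.test(line));
}

// Captures the text that follows a TODO / unfinished / remaining marker on the same line.
function extractUnfinished(stdout: string): string[] {
  const re = /\b(?:TODO|unfinished|remaining)\b\s*:?\s*(.+)/i;
  const items: string[] = [];
  stdout.split("\n").forEach((line) => {
    const m = re.exec(line);
    if (m !== null) items.push(m[1].trim());
  });
  return items;
}
```

Because every rule is a regex over text the agent has already produced, extraction adds no model call, which is what goal G6 requires.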
In the shipped implementation, file paths are shortened to the last 3 path segments to keep recovery prompts compact.
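
A sketch of that path shortening, assuming a hypothetical helper named `shortenPath` (the name and the handling of both slash styles are not taken from the package):

```typescript
// Keeps only the last `keep` segments of a path. Hypothetical helper for illustration.
function shortenPath(filePath: string, keep: number = 3): string {
  const segments = filePath.split(/[\\\/]+/).filter((s) => s.length > 0);
  return segments.slice(Math.max(0, segments.length - keep)).join("/");
}
```

For example, `apps/web/src/components/MultiPaneView.tsx` would be reported as `src/components/MultiPaneView.tsx`, while paths with three or fewer segments pass through unchanged.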
The final implementation adds a persisted WorkState snapshot alongside session summaries so recovery still works when a task is interrupted before commitSession() runs.
interface WorkState {
startedAt: string;
updatedAt: string;
status: "running" | "interrupted" | "failed" | "cancelled";
taskId?: string;
taskPrompt?: string;
cwd?: string;
summary: string;
nextSteps: string[];
unfinished: string[];
filesTouched: string[];
lastActivity?: string;
}

Current behavior:
- `updateWorkState()` persists snapshots during execution
- `buildRecoveryContext()` prefers `workState` over the last completed `sessionSummary`
- `clearAgentWorkState()` clears the snapshot after clean completion
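
The precedence described above (work state first, last completed session second) might look like the sketch below. The trimmed-down types and the `formatRecovery` function are illustrative assumptions, not the actual signature or output of `buildRecoveryContext()`.

```typescript
// Reduced versions of WorkState and SessionSummary, for illustration only.
interface WorkStateLite {
  status: "running" | "interrupted" | "failed" | "cancelled";
  summary: string;
  nextSteps: string[];
  filesTouched: string[];
}

interface SessionSummaryLite {
  what: string;
  filesChanged: string[];
  unfinished: string[];
}

// Prefers the in-progress snapshot; falls back to the last completed session.
function formatRecovery(
  workState: WorkStateLite | null,
  lastSession: SessionSummaryLite | null
): string {
  const lines: string[] = ["[Session recovered] Your previous session was lost."];
  if (workState !== null) {
    lines.push("- In progress (" + workState.status + "): " + workState.summary);
    if (workState.filesTouched.length > 0) {
      lines.push("- Files touched: " + workState.filesTouched.join(", "));
    }
    if (workState.nextSteps.length > 0) {
      lines.push("- Next steps: " + workState.nextSteps.join("; "));
    }
  } else if (lastSession !== null) {
    lines.push("- What you did: " + lastSession.what);
    if (lastSession.filesChanged.length > 0) {
      lines.push("- Files changed: " + lastSession.filesChanged.join(", "));
    }
    if (lastSession.unfinished.length > 0) {
      lines.push("- Unfinished: " + lastSession.unfinished.join("; "));
    }
  } else {
    lines.push("- No saved context. Ask the user what to work on.");
  }
  return lines.join("\n");
}
```

Checking the snapshot first matters because an interrupted task never reaches `commitSession()`, so the last completed summary would describe older work.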
Before (current — raw fragments):
[Session recovered] Your previous session was lost. Here's what you were doing:
- Last result: There are changes in two files, but the `agent-session.ts` changes are from a different task
- Recent conversation:
[You]: Let me verify the build compiles:
[You]: Those errors are pre-existing `@types/node` version conflicts, not from my changes.
After (new — structured summary):
[Session recovered] Your previous session was lost. Here's what you were doing:
- Task: Optimize MultiPaneView pagination UI
- What you did: Redesigned pagination arrows as styled buttons, changed borders from dashed to solid, added hover scale animation
- Files changed: apps/web/src/components/MultiPaneView.tsx
- Commits: ad8ed51
- Decisions: Used TERM_HOVER/TERM_BORDER theme tokens instead of hardcoded rgba values
- Unfinished: agent-session.ts changes remain unstaged (separate task)
Note: You don't have full conversation history. Ask the user if unsure about details.
Token comparison: ~400 tokens (before) → ~150 tokens (after), roughly 2.7x fewer tokens while stating the task, files, commits, and decisions explicitly
Long-lived facts about a specific agent's working context. Think of these as things an agent would "remember" about its user and codebase.
interface AgentFact {
/** Unique ID for dedup */
id: string;
/** Category for grouping */
category: "user_preference" | "codebase_pattern" | "workflow_habit" | "lesson_learned";
/** The fact itself: "User prefers solid borders over dashed" */
fact: string;
/** How many sessions this fact has been relevant */
reinforceCount: number;
/** When first observed */
createdAt: string;
/** When last reinforced */
lastSeen: string;
}

Implemented now (v1: rule-based, no LLM):
Parse agent output for patterns:
const FACT_PATTERNS: Array<{ regex: RegExp; category: AgentFact["category"] }> = [
// User preferences
{ regex: /(?:user|you)\s+(?:prefers?|likes?|wants?|asked for)\s+(.{10,80})/i, category: "user_preference" },
// Codebase patterns
{ regex: /(?:this|the)\s+(?:codebase|project|repo)\s+(?:uses?|has)\s+(.{10,80})/i, category: "codebase_pattern" },
// Lessons learned
{ regex: /(?:note|important|remember|caution):\s*(.{10,80})/i, category: "lesson_learned" },
];

The shipped regex set also covers:
- theme/token conventions such as `TERM_*`
- workflow rules like "always/never/make sure to ... before committing"
- pre-existing errors and known issues
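
To show how a pattern table like this can be applied, here is a minimal sketch that scans agent output line by line. `extractFactCandidates` and `FactCandidate` are hypothetical names, only three of the categories are covered, and the first pattern accepts third-person verb forms ("prefers", "likes", "wants") as an assumption about typical agent phrasing.

```typescript
type FactCategory = "user_preference" | "codebase_pattern" | "lesson_learned";

interface FactCandidate {
  category: FactCategory;
  fact: string;
}

// A reduced pattern table for illustration; the shipped set is larger.
const PATTERNS: Array<{ regex: RegExp; category: FactCategory }> = [
  { regex: /(?:user|you)\s+(?:prefers?|likes?|wants?|asked for)\s+(.{10,80})/i, category: "user_preference" },
  { regex: /(?:this|the)\s+(?:codebase|project|repo)\s+(?:uses?|has)\s+(.{10,80})/i, category: "codebase_pattern" },
  { regex: /(?:note|important|remember|caution):\s*(.{10,80})/i, category: "lesson_learned" },
];

// Scans each line and records the captured text of the first pattern that matches.
function extractFactCandidates(output: string): FactCandidate[] {
  const found: FactCandidate[] = [];
  output.split("\n").forEach((line) => {
    for (let i = 0; i < PATTERNS.length; i++) {
      const m = PATTERNS[i].regex.exec(line);
      if (m !== null) {
        found.push({ category: PATTERNS[i].category, fact: m[1].trim() });
        break; // first matching pattern wins for this line
      }
    }
  });
  return found;
}
```

The `{10,80}` bounds keep candidates short enough to inject into a prompt and filter out matches too brief to be meaningful.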
Before adding a fact, check existing facts:
function isDuplicate(newFact: string, existing: AgentFact[]): AgentFact | null {
const normalized = normalize(newFact);
for (const fact of existing) {
// Simple similarity: Jaccard on word sets
const similarity = jaccardSimilarity(normalized, normalize(fact.fact));
if (similarity > 0.6) return fact; // duplicate — reinforce instead of add
}
return null;
}
function normalize(s: string): Set<string> {
return new Set(s.toLowerCase().replace(/[^a-z0-9\s]/g, "").split(/\s+/).filter(w => w.length > 2));
}

Decision on match:
- similarity > 0.6: Reinforce existing fact (`reinforceCount++`, update `lastSeen`)
- similarity ≤ 0.6: Add as new fact
- Max 50 facts per agent: Evict least-reinforced when full
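
The three rules above can be combined into one upsert step. The sketch below is self-contained and makes several assumptions that are not confirmed by the source: the helper names (`wordSet`, `upsertFact`), a starting `reinforceCount` of 1 for new facts, and breaking eviction ties by array position.

```typescript
interface StoredFact {
  fact: string;
  reinforceCount: number;
  lastSeen: string;
}

const SIMILARITY_THRESHOLD = 0.6;
const MAX_FACTS_PER_AGENT = 50;

// Lowercases, strips punctuation, and drops words of 2 characters or fewer.
function wordSet(s: string): Set<string> {
  const words = s.toLowerCase().replace(/[^a-z0-9\s]/g, "").split(/\s+/).filter((w) => w.length > 2);
  return new Set(words);
}

// Jaccard similarity: |A ∩ B| / |A ∪ B|. Two empty sets are treated as dissimilar.
function jaccardSimilarity(a: Set<string>, b: Set<string>): number {
  if (a.size === 0 && b.size === 0) return 0;
  let intersection = 0;
  a.forEach((w) => {
    if (b.has(w)) intersection++;
  });
  return intersection / (a.size + b.size - intersection);
}

// Reinforces a near-duplicate if one exists; otherwise adds the fact,
// evicting the least-reinforced entry first when the cap is reached.
function upsertFact(facts: StoredFact[], text: string, now: string): StoredFact[] {
  const incoming = wordSet(text);
  for (let i = 0; i < facts.length; i++) {
    if (jaccardSimilarity(incoming, wordSet(facts[i].fact)) > SIMILARITY_THRESHOLD) {
      facts[i].reinforceCount += 1;
      facts[i].lastSeen = now;
      return facts;
    }
  }
  if (facts.length >= MAX_FACTS_PER_AGENT) {
    let weakest = 0;
    for (let i = 1; i < facts.length; i++) {
      if (facts[i].reinforceCount < facts[weakest].reinforceCount) weakest = i;
    }
    facts.splice(weakest, 1);
  }
  facts.push({ fact: text, reinforceCount: 1, lastSeen: now });
  return facts;
}
```

With at most 50 facts per agent, the linear scan over word sets is cheap, which is why no embedding model or index is needed at this scale.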
===== AGENT KNOWLEDGE =====
Codebase: Uses TERM_HOVER/TERM_BORDER/TERM_SURFACE theme tokens for all interactive elements
Codebase: MultiPaneView uses PixiJS v8 with custom pagination component
Preference: User prefers solid borders with low opacity over dashed borders
Lesson: @types/node version conflicts are pre-existing; don't try to fix them
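
A block like the one above could be rendered from stored facts as follows. This is a sketch: `formatAgentKnowledge` is a hypothetical name, the top-10 default follows the token budget table, and the "Workflow" label is an assumption because the example only shows the Codebase, Preference, and Lesson prefixes.

```typescript
type KnowledgeCategory = "user_preference" | "codebase_pattern" | "workflow_habit" | "lesson_learned";

interface RankedFact {
  category: KnowledgeCategory;
  fact: string;
  reinforceCount: number;
}

const CATEGORY_LABELS: Record<KnowledgeCategory, string> = {
  user_preference: "Preference",
  codebase_pattern: "Codebase",
  workflow_habit: "Workflow", // assumed label; not shown in the example above
  lesson_learned: "Lesson",
};

// Renders the most-reinforced facts as a prompt section; returns "" when there are none.
function formatAgentKnowledge(facts: RankedFact[], topN: number = 10): string {
  const top = facts
    .slice()
    .sort((a, b) => b.reinforceCount - a.reinforceCount)
    .slice(0, topN);
  if (top.length === 0) return "";
  const lines = top.map((f) => CATEGORY_LABELS[f.category] + ": " + f.fact);
  return ["===== AGENT KNOWLEDGE ====="].concat(lines).join("\n");
}
```

Sorting by `reinforceCount` before truncating means the prompt budget goes to the facts that have been confirmed most often.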
Cross-agent knowledge that any agent can benefit from. Extracted from L2 facts when they have high reinforceCount.
interface SharedKnowledge {
/** The knowledge item */
fact: string;
/** Which agent first discovered this */
source: string;
/** How many agents have confirmed this */
confirmedBy: string[];
/** Timestamp */
createdAt: string;
}

When an L2 fact reaches `reinforceCount >= 3`, it's promoted to L3. Independent confirmation by another agent is tracked through `crossConfirmShared()`, which appends that agent to `confirmedBy`.
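
A minimal sketch of the promotion and confirmation rules. The function names `promoteIfReady` and `confirmShared` are hypothetical, duplicates are detected by exact case-insensitive match rather than the Jaccard check, and two behaviors are assumptions: `confirmedBy` starts empty, and promotion is refused once the 20-item cap is reached.

```typescript
interface SharedItem {
  fact: string;
  source: string;
  confirmedBy: string[];
  createdAt: string;
}

const PROMOTE_AT = 3; // reinforceCount threshold from the text above
const MAX_SHARED = 20; // cap on shared facts

// Simplified lookup: exact match after trimming and lowercasing.
function findShared(shared: SharedItem[], fact: string): SharedItem | null {
  const key = fact.trim().toLowerCase();
  for (let i = 0; i < shared.length; i++) {
    if (shared[i].fact.trim().toLowerCase() === key) return shared[i];
  }
  return null;
}

// Adds the fact to shared knowledge when it is reinforced enough and not already present.
function promoteIfReady(
  shared: SharedItem[],
  agentId: string,
  fact: string,
  reinforceCount: number,
  now: string
): boolean {
  if (reinforceCount < PROMOTE_AT) return false;
  if (findShared(shared, fact) !== null) return false;
  if (shared.length >= MAX_SHARED) return false; // assumption: refuse rather than evict
  shared.push({ fact: fact, source: agentId, confirmedBy: [], createdAt: now });
  return true;
}

// Records a confirmation from an agent other than the original source, at most once.
function confirmShared(shared: SharedItem[], agentId: string, fact: string): boolean {
  const item = findShared(shared, fact);
  if (item === null || item.source === agentId) return false;
  if (item.confirmedBy.indexOf(agentId) !== -1) return false;
  item.confirmedBy.push(agentId);
  return true;
}
```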
===== PROJECT KNOWLEDGE =====
- This monorepo uses pnpm workspaces with apps/web (Next.js 15) and apps/gateway (Node.js daemon)
- Theme tokens are defined in packages/shared; always use TERM_* constants, never hardcode colors
- User rates projects on 5 axes: creativity, visual, interaction, completeness, engagement
When Agent A needs to know what Agent B is doing, instead of sharing Agent B's full history, inject a one-line L0 summary:
function getAgentL0(agentId: string): string {
const session = loadLatestSession(agentId);
if (!session) return "idle (no recent activity)";
return session.what; // e.g. "Optimized MultiPaneView pagination UI (commit ad8ed51)"
}
// Inject into team roster:
// [Alex 2] Last: "Optimized MultiPaneView pagination UI" (commit ad8ed51), 10 min ago
// [Alex 3] Last: "Analyzed OpenViking and Mem0 for memory redesign", just now

Token cost: ~30 tokens per agent (vs. ~2400 tokens if we shared raw messages)
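
One way the roster lines could be assembled from each agent's latest session is sketched below. `buildRoster`, `relativeAge`, and the exact line format are assumptions for illustration; only the idea of one short line per agent comes from the design.

```typescript
interface LatestSession {
  what: string;
  timestamp: string; // ISO timestamp
}

// Coarse age label; the wording is an assumption.
function relativeAge(timestamp: string, nowMs: number): string {
  const minutes = Math.floor((nowMs - Date.parse(timestamp)) / 60000);
  if (minutes < 1) return "just now";
  if (minutes < 60) return minutes + " min ago";
  return Math.floor(minutes / 60) + " h ago";
}

// Builds one line per agent from its latest session, or an idle marker if there is none.
function buildRoster(sessions: Record<string, LatestSession | null>, nowMs: number): string {
  return Object.keys(sessions)
    .map((name) => {
      const s = sessions[name];
      if (!s) return "[" + name + "] idle (no recent activity)";
      return "[" + name + '] Last: "' + s.what + '", ' + relativeAge(s.timestamp, nowMs);
    })
    .join("\n");
}
```

Passing `nowMs` in explicitly keeps the function deterministic, which makes the roster easy to unit test.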
Implemented in packages/memory/src/:
- `index.ts`: public exports for commit/recovery/context/storage helpers
- `memory.ts`: L1/L2/L3 orchestration, work-state APIs, legacy wrappers, manual fact injection
- `extract.ts`: rule-based extraction for summaries, decisions, unfinished work, next steps, fact candidates
- `storage.ts`: JSON persistence with atomic temp-file writes and configurable root
- `format.ts`: prompt formatting for recovery, session history, agent knowledge, shared knowledge, legacy context
- `dedup.ts`: Jaccard-based dedup and shared-promotion logic
- `types.ts`: complete type surface including `WorkState` and `TaskCompletionData`

Exported APIs now include:
commitSession()
buildRecoveryContext()
getMemoryContext()
getRecoveryString()
getAgentL0()
getWorkState()
updateWorkState()
clearAgentWorkState()
crossConfirmShared()
addManualFact()

Still intentionally out of scope:
- LLM-based fact extraction
- fact decay / TTL
- UI dashboard for inspecting memory
| Layer | When injected | Tokens | Frequency |
|---|---|---|---|
| L0 (cross-agent) | Team roster | ~30/agent | Every task in team mode |
| L1 (session summary) | Recovery only | ~150 | Only after session loss |
| Work state | Recovery only | ~100-180 | During interrupted/crashed sessions |
| L2 (agent facts) | Every task | ~200 (top 10 facts) | Every task |
| L3 (shared knowledge) | Every task | ~100 (top 5 items) | Every task |
| Total new overhead | | ~330 tokens steady-state | Per task |
Compare to current:
- Current `recentMessages`: ~400 tokens (only on recovery, low value)
- Current `getMemoryContext()`: ~200 tokens (project history only)
- New total: ~530 tokens per task (~200 existing + ~330 new), in exchange for agent facts, shared knowledge, and teammate summaries in every prompt
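
The totals above can be reproduced with simple arithmetic. The sketch below uses the approximate per-layer figures from the table; the ~330 figure appears to assume a single roster entry, which is an inference rather than something the table states, and the function names are hypothetical.

```typescript
// Per-task token estimates copied from the table above; all values are approximate.
const L0_TOKENS_PER_AGENT = 30;
const L2_TOKENS = 200; // top 10 agent facts
const L3_TOKENS = 100; // top 5 shared items
const LEGACY_CONTEXT_TOKENS = 200; // existing getMemoryContext() output

// New steady-state overhead for one task, given how many agents appear in the roster.
function newOverheadTokens(rosterAgents: number): number {
  return rosterAgents * L0_TOKENS_PER_AGENT + L2_TOKENS + L3_TOKENS;
}

// New overhead plus the legacy project-level context.
function totalMemoryTokens(rosterAgents: number): number {
  return newOverheadTokens(rosterAgents) + LEGACY_CONTEXT_TOKENS;
}
```

Only the L0 term grows with team size, at about 30 tokens per additional agent; recovery-only layers (L1 and work state) are excluded because they are injected only after a session loss.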
v1 (legacy) v2 (implemented) v3 (possible future)
┌──────────┐ ┌──────────────┐ ┌──────────────────┐
│recentMsg │ ──────► │SessionSummary│ ──────► │SessionSummary │
│(raw 6x │ │(structured) │ │+ richer extraction│
│ 400ch) │ │ │ │+ memory tooling │
├──────────┤ ├──────────────┤ │+ L0 Cross-Agent │
│memory.ts │ │memory.ts │ ├──────────────────┤
│(project │ │+ work-state │ │memory.ts │
│ level) │ │+ L2/L3 │ │(further extended)│
└──────────┘ └──────────────┘ └──────────────────┘
100% 100% 100%
compatible compatible compatible
The middle column reflects the code currently in the repository.
{
"recovery": {
"originalTask": "Optimize the MultiPaneView pagination buttons",
"lastResult": "There are changes in two files, but the agent-session.ts changes are from a different task",
"recentMessages": [
{ "role": "assistant", "text": "Let me verify the build compiles:" },
{ "role": "assistant", "text": "Those errors are pre-existing @types/node version conflicts, not from my changes. Let me verify my file specifically has no issu" }
]
}
}

Alex 2 after recovery: "I vaguely remember compiling something? Let me check git log..."
{
"sessionSummary": {
"timestamp": "2026-03-17T14:30:00Z",
"what": "Redesigned MultiPaneView pagination bar with styled arrow buttons and hover animations",
"decisions": [
"Changed borders from dashed to solid with lower opacity",
"Used TERM_HOVER and TERM_BORDER theme tokens instead of hardcoded rgba",
"Added scale(1.02) hover animation for interactive feel"
],
"filesChanged": ["apps/web/src/components/MultiPaneView.tsx"],
"commits": ["ad8ed51"],
"unfinished": ["agent-session.ts recovery context improvements remain unstaged"],
"tokens": { "input": 45000, "output": 12000 }
},
"agentFacts": [
{ "fact": "MultiPaneView uses PixiJS-rendered pagination with TERM_* theme tokens", "category": "codebase_pattern" },
{ "fact": "User prefers solid borders over dashed borders", "category": "user_preference" }
]
}

Alex 2 after recovery: "I was redesigning the pagination bar. I committed ad8ed51 with styled buttons and hover animations. The agent-session.ts changes are still unstaged. Should I continue with those?"
| Decision | Chosen | Alternatives Considered | Rationale |
|---|---|---|---|
| No vector DB | File-based JSON | Qdrant, Chroma, FAISS | G4: Zero external deps. Our fact count (<50/agent) doesn't need ANN search |
| No LLM for extraction | Rule-based parsing | GPT-4o-mini, Ollama | G6: Zero token cost. Agent output is already structured enough to parse |
| Jaccard dedup over embeddings | Word-set overlap | Cosine similarity, LLM comparison | Sufficient for <50 facts. No embedding model dependency |
| Ring buffer (30 sessions) | Fixed size | Unlimited, LRU, TTL | Predictable storage cost with more recovery context |
| Promote at reinforceCount=3 | Threshold-based | Manual, voting | Simple, self-correcting. Facts that are never reinforced stay below the threshold and are evicted first once the 50-fact cap is reached |
| Persist live work state | JSON snapshot | In-memory only | Required for crash-safe recovery before a successful session commit |
| Risk | Impact | Mitigation |
|---|---|---|
| Rule-based extraction misses important facts | Medium | addManualFact() exists now; optional LLM extraction can be added later |
| Jaccard dedup produces false positives | Low | Threshold 0.6 is conservative, so errors skew toward missed duplicates; worst case = mild duplication |
| Fact accumulation slows prompt | Low | Hard cap at 50 facts/agent + 20 shared; top-N by reinforceCount |
| Session summary too brief | Medium | Include recentMessages as L1.5 fallback alongside structured summary |
| Breaking existing recovery flow | High | Keep RecoveryContext.recentMessages as fallback; new system is additive |
- Optional LLM extraction: Use a small local model (Ollama) to extract richer facts at session end
- Fact decay: Auto-reduce `reinforceCount` over time for stale facts
- Semantic search: If fact count grows large (>200), add simple TF-IDF for retrieval
- Memory dashboard: UI in Open Office web app to inspect/edit/delete agent memories
- Export/import: Portable memory between machines (JSON export)
| Feature | Mem0 | OpenViking | Our Design |
|---|---|---|---|
| Storage | Vector DB (19 backends) | VikingFS (custom) | JSON files (zero deps) |
| Extraction | LLM on every `add()` | LLM on session commit | Rule-based (v1) / optional LLM (v2) |
| Dedup | LLM conflict resolution | Vector pre-filter + LLM | Jaccard word-set similarity |
| Layering | 4 tiers (conv/session/user/org) | 3 tiers (L0/L1/L2 by detail) | 4 layers (L0-L3 by scope + lifetime) |
| Cross-agent | Per user_id scoping | Directory hierarchy | L0 summaries + L3 shared facts |
| Token cost | High (LLM per operation) | Medium (LLM on commit) | Minimal (rule-based v1, LLM optional v2) |
| Dependencies | pip install + vector DB + LLM API | Go compiler + C++ + AGFS | None (pure TypeScript + JSON) |
| Best for | General AI apps, SaaS | Enterprise context management | Lightweight multi-agent orchestrator |