Commit cfd4eaf

feat: move session memory to Redis hot tier
1 parent ada7c0b commit cfd4eaf

44 files changed

Lines changed: 10553 additions & 4542 deletions

.gitignore

Lines changed: 1 addition & 0 deletions
@@ -1,4 +1,5 @@
 .opencode/
+.swarm/
 
 dist/
 

README.md

Lines changed: 182 additions & 90 deletions
@@ -1,55 +1,86 @@
 # opencode-graphiti
 
-OpenCode plugin that provides persistent memory via a
-[Graphiti](https://github.com/getzep/graphiti) knowledge graph.
+OpenCode plugin that provides persistent memory via
+[FalkorDB](https://www.falkordb.com/)/Redis and asynchronous
+[Graphiti](https://github.com/getzep/graphiti) knowledge-graph consolidation.
 
 ## Motivation
 
 Long-running AI coding sessions depend on persistent memory to stay on track.
-Graphiti's MCP server is the intended backbone for this, but in practice it is
-unreliable — connections drop, queries time out, and ingestion silently fails.
-When the context window fills up and OpenCode triggers compaction, the
-summarizer discards details that were never persisted. The result is **context
-rot**: the agent loses track of recent decisions, re-explores solved problems,
-and drifts away from the original goal.
-
-This plugin exists to close that gap. It captures chat histories and project
-facts into Graphiti when the server is healthy, then **re-injects them at the
-start of every session and before every compaction** so the agent is always
-reminded of recent project context — regardless of what survived the summary.
+Graphiti's MCP server is a powerful knowledge-graph backend, but synchronous
+calls to it on every message add latency and introduce a single point of failure
+— connections drop, queries time out, and ingestion silently fails. When the
+context window fills up and OpenCode triggers compaction, the summarizer
+discards details that were never persisted. The result is **context rot**: the
+agent loses track of recent decisions, re-explores solved problems, and drifts
+away from the original goal.
+
+This plugin exists to close that gap. It uses **FalkorDB/Redis as the hot-path
+store** for structured session events, priority-tiered snapshots, and cached
+memory — all readable in sub-millisecond time. Graphiti remains the long-term
+knowledge graph but is accessed **only asynchronously**, off the critical path.
+The plugin re-injects session context before every LLM call and before every
+compaction so the agent is always reminded of recent project context —
+regardless of what survived the summary and regardless of Graphiti availability.
 
 ## Overview
 
-This plugin connects to a Graphiti MCP server and:
+This plugin uses a two-tier architecture:
 
-- Searches Graphiti for relevant facts and entities on each user message
-- Injects memories into the last user message as a `<memory>` block via
+**Hot path (FalkorDB/Redis — synchronous, sub-ms):**
+
+- Stores structured session events, priority-tiered snapshots, and cached
+Graphiti results in Redis
+- Reads cached memory on each user message and injects it into the last user
+message as a `<session_memory>` block via
 `experimental.chat.messages.transform`, keeping the system prompt static for
 prefix caching
-- Detects context drift using Jaccard similarity and re-injects when the
-conversation topic shifts
-- Buffers user and assistant messages, flushing them to Graphiti on idle or
-before compaction
-- Preserves key facts during context compaction
+- Composes the same `<session_memory>` envelope for compaction context via
+`experimental.session.compacting`
+- Detects context drift using Jaccard similarity on cached fact UUIDs and
+schedules an async cache refresh when the topic shifts
+
+**Async tier (Graphiti MCP — fire-and-forget, non-blocking):**
+
+- Drains buffered session events to Graphiti as episodes on idle or before
+compaction
+- Refreshes the Redis memory cache from Graphiti search results in the
+background
+- Provides cross-session recall via vector/graph search, cached in Redis for
+chat-time injection
 - Saves compaction summaries as episodes so knowledge survives across boundaries
-- Annotates stale facts and filters expired ones automatically
-- Scopes memories per project (and per user) using directory-based group IDs
+
+No Graphiti call ever blocks a hook return.
 
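A note on the closing claim of the Overview, "No Graphiti call ever blocks a hook return": one common way to get that property is to start the async work and return without awaiting it. The sketch below is illustrative only. `fireAndForget`, its `onError` callback, and `onChatMessage` are hypothetical names and are not taken from the plugin source.

```typescript
// Hypothetical helper, not the plugin's API.
// Schedules `task` on the microtask queue and returns immediately.
// Synchronous throws and promise rejections both end up in `onError`.
function fireAndForget(
  task: () => unknown,
  onError: (error: unknown) => void = () => {},
): void {
  void Promise.resolve().then(task).catch(onError);
}

// Usage: a hook body that returns before the slow call has started.
let slowCallStarted = false;

function onChatMessage(): string {
  fireAndForget(async () => {
    slowCallStarted = true; // stands in for a Graphiti MCP request
  });
  return "cached context"; // stands in for the Redis-sourced envelope
}

const hookResult = onChatMessage();
// Captured synchronously, before any microtask has had a chance to run.
const startedAtReturn = slowCallStarted;
```

The trade-off of this pattern is that failures are only visible through `onError`, so a real implementation would presumably log or retry there.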
 ## Prerequisites
 
+### FalkorDB / Redis
+
+A running [FalkorDB](https://www.falkordb.com/) instance accessible via the
+Redis protocol. The easiest way to start one:
+
+```bash
+docker run -p 6379:6379 falkordb/falkordb:latest
+```
+
+### Graphiti MCP Server
+
 A running
 [Graphiti MCP server](https://github.com/getzep/graphiti/tree/main/mcp_server)
-accessible over HTTP. The easiest way to set one up:
+accessible over HTTP:
 
 ```bash
-# Clone and start with Docker Compose
 git clone https://github.com/getzep/graphiti.git
 cd graphiti/mcp_server
 docker compose up -d
 ```
 
-This starts the MCP server at `http://localhost:8000/mcp` with a FalkorDB
-backend.
+This starts the MCP server at `http://localhost:8000/mcp`.
+
+> **Note:** Graphiti is optional for basic operation. If Graphiti is
+> unavailable, the plugin continues to function with FalkorDB/Redis-sourced
+> session memory; only the `<persistent_memory>` section (long-term
+> cross-session facts) will be empty until Graphiti comes online.
 
 ## Installation
 
@@ -101,95 +132,151 @@ automatically.
 
 Supported config locations, in lookup order:
 
-1. The provided project directory: `package.json#graphiti`, `.graphitirc`, and other standard `cosmiconfig` `graphiti` filenames
-2. Standard global/home `graphiti` config locations discovered by `cosmiconfig` (for example `~/.graphitirc`)
+1. The provided project directory: `package.json#graphiti`, `.graphitirc`, and
+other standard `cosmiconfig` `graphiti` filenames
+2. Standard global/home `graphiti` config locations discovered by `cosmiconfig`
+(for example `~/.graphitirc`)
 3. Legacy fallback: `~/.config/opencode/.graphitirc`
 
-Example `.graphitirc`:
+### Nested Config Shape (recommended)
 
 ```jsonc
 {
-// Graphiti MCP server endpoint
-"endpoint": "http://localhost:8000/mcp",
-
-// Prefix for project group IDs (e.g. "opencode-my-project")
-"groupIdPrefix": "opencode",
-
-// Jaccard similarity threshold (0–1) below which memory is re-injected
-// Lower values mean the topic must drift further before re-injection
-"driftThreshold": 0.5,
-
-// Number of days after which facts are annotated as stale
-"factStaleDays": 30
+"falkordb": {
+// FalkorDB Redis URL
+"redisEndpoint": "redis://localhost:6379",
+// Max events per drain batch
+"batchSize": 20,
+// Max combined body bytes per drain batch
+"batchMaxBytes": 51200,
+// Session event TTL in seconds (default: 24 h)
+"sessionTtlSeconds": 86400,
+// Memory cache TTL in seconds (default: 10 min)
+"cacheTtlSeconds": 600,
+// Max drain retry attempts before dead-lettering
+"drainRetryMax": 3
+},
+"graphiti": {
+// Graphiti MCP server endpoint
+"endpoint": "http://localhost:8000/mcp",
+// Prefix for project group IDs (e.g. "opencode-my-project")
+"groupIdPrefix": "opencode",
+// Jaccard similarity threshold (0–1) below which cache is refreshed
+"driftThreshold": 0.5,
+// Number of days after which facts are annotated as stale
+"factStaleDays": 30
+}
 }
 ```
 
 All fields are optional — defaults (shown above) are used for any missing
-values.
+values. Nested values take precedence when both forms are supplied.
+
+### Legacy Top-Level Keys
+
+For backward compatibility, the following top-level keys are still accepted and
+map to their nested equivalents:
+
+| Legacy key          | Nested equivalent            |
+| ------------------- | ---------------------------- |
+| `endpoint`          | `graphiti.endpoint`          |
+| `groupIdPrefix`     | `graphiti.groupIdPrefix`     |
+| `driftThreshold`    | `graphiti.driftThreshold`    |
+| `factStaleDays`     | `graphiti.factStaleDays`     |
+| `redisEndpoint`     | `falkordb.redisEndpoint`     |
+| `batchSize`         | `falkordb.batchSize`         |
+| `batchMaxBytes`     | `falkordb.batchMaxBytes`     |
+| `sessionTtlSeconds` | `falkordb.sessionTtlSeconds` |
+| `cacheTtlSeconds`   | `falkordb.cacheTtlSeconds`   |
+| `drainRetryMax`     | `falkordb.drainRetryMax`     |
 
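The precedence rule stated in the README text ("Nested values take precedence when both forms are supplied") together with the legacy key table can be sketched as a small resolver. Field names and default values are copied from the README; the types and the `resolveConfig` function are hypothetical and may not match how the plugin actually normalizes its config.

```typescript
// Hypothetical types; only the key names and defaults come from the README.
interface FalkorConfig {
  redisEndpoint: string;
  batchSize: number;
  batchMaxBytes: number;
  sessionTtlSeconds: number;
  cacheTtlSeconds: number;
  drainRetryMax: number;
}
interface GraphitiConfig {
  endpoint: string;
  groupIdPrefix: string;
  driftThreshold: number;
  factStaleDays: number;
}
interface ResolvedConfig {
  falkordb: FalkorConfig;
  graphiti: GraphitiConfig;
}
// Raw input: nested sections plus legacy top-level keys, all optional.
type RawConfig = Partial<FalkorConfig> & Partial<GraphitiConfig> & {
  falkordb?: Partial<FalkorConfig>;
  graphiti?: Partial<GraphitiConfig>;
};

const DEFAULTS: ResolvedConfig = {
  falkordb: {
    redisEndpoint: "redis://localhost:6379",
    batchSize: 20,
    batchMaxBytes: 51200,
    sessionTtlSeconds: 86400,
    cacheTtlSeconds: 600,
    drainRetryMax: 3,
  },
  graphiti: {
    endpoint: "http://localhost:8000/mcp",
    groupIdPrefix: "opencode",
    driftThreshold: 0.5,
    factStaleDays: 30,
  },
};

// Per field: nested value, then legacy top-level value, then default.
function resolveConfig(raw: RawConfig = {}): ResolvedConfig {
  const f: Partial<FalkorConfig> = raw.falkordb ?? {};
  const g: Partial<GraphitiConfig> = raw.graphiti ?? {};
  const d = DEFAULTS;
  return {
    falkordb: {
      redisEndpoint: f.redisEndpoint ?? raw.redisEndpoint ?? d.falkordb.redisEndpoint,
      batchSize: f.batchSize ?? raw.batchSize ?? d.falkordb.batchSize,
      batchMaxBytes: f.batchMaxBytes ?? raw.batchMaxBytes ?? d.falkordb.batchMaxBytes,
      sessionTtlSeconds: f.sessionTtlSeconds ?? raw.sessionTtlSeconds ?? d.falkordb.sessionTtlSeconds,
      cacheTtlSeconds: f.cacheTtlSeconds ?? raw.cacheTtlSeconds ?? d.falkordb.cacheTtlSeconds,
      drainRetryMax: f.drainRetryMax ?? raw.drainRetryMax ?? d.falkordb.drainRetryMax,
    },
    graphiti: {
      endpoint: g.endpoint ?? raw.endpoint ?? d.graphiti.endpoint,
      groupIdPrefix: g.groupIdPrefix ?? raw.groupIdPrefix ?? d.graphiti.groupIdPrefix,
      driftThreshold: g.driftThreshold ?? raw.driftThreshold ?? d.graphiti.driftThreshold,
      factStaleDays: g.factStaleDays ?? raw.factStaleDays ?? d.graphiti.factStaleDays,
    },
  };
}

// Usage: a legacy `batchSize` is overridden by the nested one.
const resolved = resolveConfig({
  endpoint: "http://legacy:8000/mcp",
  batchSize: 5,
  falkordb: { batchSize: 10 },
});
```

Spelling out each field with `??` is verbose, but it keeps the precedence order visible and avoids treating `0` or an empty string as "missing".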
 ## How It Works
 
-### Memory Search and Caching (`chat.message`)
+### Injection Format
+
+The plugin injects a single canonical `<session_memory>` XML envelope into the
+last user message. This envelope is assembled from Redis hot-tier state and
+contains structured sections such as `<last_request>`, `<active_tasks>`,
+`<key_decisions>`, `<files_in_play>`, `<project_rules>`, and an optional
+`<session_snapshot>`.
+
+When cached Graphiti results are available, a nested `<persistent_memory>`
+section is included with `fact_uuids` and `node_refs` attributes. On a cold
+first turn or when Graphiti is unreachable, `<persistent_memory>` is simply
+absent — the rest of the session memory is always available from FalkorDB/Redis.
+
+```xml
+<session_memory source="falkordb+graphiti-cache" version="1">
+<last_request>Continue the current task.</last_request>
+<active_tasks><task>Implement the new feature.</task></active_tasks>
+<key_decisions><decision>Use Redis for the hot path.</decision></key_decisions>
+<files_in_play><file>src/index.ts</file></files_in_play>
+<project_rules><rule>No synchronous Graphiti calls.</rule></project_rules>
+<session_snapshot><!-- priority-tiered snapshot --></session_snapshot>
+<persistent_memory fact_uuids="uuid1,uuid2" node_refs="nodeA">
+<!-- cached Graphiti facts/nodes, optional -->
+</persistent_memory>
+</session_memory>
+```
+
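As a rough illustration of the envelope described in the Injection Format section, the sketch below assembles the same sections and omits `<persistent_memory>` when no cached Graphiti data is present. The `SessionMemory` shape, `escapeXml`, and `composeSessionMemory` are hypothetical; the element and attribute names follow the README sample, and whether the real `source` attribute changes on a cold turn is not stated there.

```typescript
// Hypothetical input shape; element names follow the README sample.
interface SessionMemory {
  lastRequest: string;
  activeTasks: string[];
  keyDecisions: string[];
  filesInPlay: string[];
  projectRules: string[];
  snapshot?: string;
  persistent?: { factUuids: string[]; nodeRefs: string[]; body: string };
}

// Escape text so user content cannot break out of the envelope.
function escapeXml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Render <wrapper><item>..</item>..</wrapper> for a list of strings.
function section(wrapper: string, item: string, values: string[]): string {
  const inner = values.map((v) => `<${item}>${escapeXml(v)}</${item}>`).join("");
  return `<${wrapper}>${inner}</${wrapper}>`;
}

function composeSessionMemory(m: SessionMemory): string {
  const lines = [
    // Assumption: the source attribute is kept constant, as in the sample.
    `<session_memory source="falkordb+graphiti-cache" version="1">`,
    `<last_request>${escapeXml(m.lastRequest)}</last_request>`,
    section("active_tasks", "task", m.activeTasks),
    section("key_decisions", "decision", m.keyDecisions),
    section("files_in_play", "file", m.filesInPlay),
    section("project_rules", "rule", m.projectRules),
  ];
  if (m.snapshot !== undefined) {
    lines.push(`<session_snapshot>${escapeXml(m.snapshot)}</session_snapshot>`);
  }
  // Cold first turn or Graphiti unreachable: the section is simply left out.
  if (m.persistent) {
    const uuids = escapeXml(m.persistent.factUuids.join(","));
    const nodes = escapeXml(m.persistent.nodeRefs.join(","));
    lines.push(
      `<persistent_memory fact_uuids="${uuids}" node_refs="${nodes}">`,
      escapeXml(m.persistent.body),
      `</persistent_memory>`,
    );
  }
  lines.push(`</session_memory>`);
  return lines.join("\n");
}
```

Escaping matters here because the envelope is prepended to a user message; unescaped angle brackets in a task description could otherwise be read as structure.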
+### Hot-Path Memory Preparation (`chat.message`)
 
-On each user message the plugin searches Graphiti for facts and entities
-relevant to the message content. Results are split into project and user scopes
-(70% / 30% budget), deduplicated, filtered for validity, annotated with
-staleness if older than `factStaleDays`, and formatted as Markdown. The
-formatted context is cached on the session state for the messages transform hook
-to pick up.
+On each user message the plugin reads session state from Redis:
 
-On the very first message of a session, the plugin also loads the most recent
-session snapshot episode to prime the conversation with prior context.
+- Recent structured session events (`session:{id}:events`)
+- The priority-tiered snapshot (`session:{id}:snapshot`)
+- The cached Graphiti memory (`memory-cache:{groupId}`)
 
-The injection budget is calculated dynamically: 5% of the model's context limit
-(resolved from the provider list) multiplied by 4 characters per token.
+These are composed into a `<session_memory>` envelope and staged for the
+transform hook. All reads are from Redis (sub-ms); no Graphiti call is made on
+this path.
 
 ### User Message Injection (`experimental.chat.messages.transform`)
 
-A separate hook reads the cached memory context and prepends it to the last user
-message as a `<memory data-uuids="...">` block. The `data-uuids` attribute lists
-the fact UUIDs included in the injection, which are tracked in
-`visibleFactUuids` so subsequent searches can filter out already-visible facts.
-This approach keeps the system prompt static, enabling provider-side prefix
-caching, and avoids influencing session titles. The cache is cleared after
-injection so stale context is not re-injected on subsequent LLM calls within the
-same turn.
+The transform hook reads the prepared `<session_memory>` envelope and prepends
+it to the last user message. Fact UUIDs from the `<persistent_memory>` section
+are tracked in `visibleFactUuids` so subsequent cache refreshes can filter out
+already-visible facts. This approach keeps the system prompt static, enabling
+provider-side prefix caching, and avoids influencing session titles. The
+prepared injection is cleared after use so stale context is not re-injected on
+subsequent LLM calls within the same turn.
 
-### Drift-Based Re-injection (`chat.message`)
+### Drift Detection and Async Cache Refresh
 
-After the first injection, the plugin monitors for context drift on every user
-message. It searches Graphiti for the current message and compares the returned
-fact UUIDs against the previously injected set using Jaccard similarity. When
-similarity drops below `driftThreshold` (default 0.5), the memory cache is
-refreshed with project-scoped results only (no user scope).
+On each user message, the plugin compares the current query against the query
+that produced the cached memory. When Jaccard similarity on cached fact UUIDs
+drops below `driftThreshold` (default 0.5), an **async** cache refresh is
+scheduled via Graphiti MCP. The current cached context is still injected
+immediately; the refreshed cache becomes available on the next message. This
+trades one message of staleness for eliminating synchronous Graphiti latency
+entirely.
 
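For readers unfamiliar with the measure used in the drift section: Jaccard similarity of two sets is the size of their intersection divided by the size of their union. A minimal sketch of the threshold check follows, assuming the two inputs are sets of fact UUIDs. The README does not spell out exactly which sets the plugin compares, so `jaccard`, `hasDrifted`, and their parameters are hypothetical.

```typescript
// Hypothetical helpers; only the measure and the 0.5 default come from the README.
function jaccard(a: ReadonlySet<string>, b: ReadonlySet<string>): number {
  // Two empty sets are treated as identical to avoid dividing by zero.
  if (a.size === 0 && b.size === 0) return 1;
  let shared = 0;
  a.forEach((item) => {
    if (b.has(item)) shared++;
  });
  return shared / (a.size + b.size - shared);
}

// True when similarity falls below the threshold, i.e. a refresh is due.
function hasDrifted(
  cachedUuids: string[],
  currentUuids: string[],
  driftThreshold = 0.5,
): boolean {
  return jaccard(new Set(cachedUuids), new Set(currentUuids)) < driftThreshold;
}

// Two of four distinct UUIDs are shared: similarity is 2 / 4 = 0.5.
const similarity = jaccard(new Set(["f1", "f2", "f3"]), new Set(["f2", "f3", "f4"]));
// No overlap at all: similarity is 0, which is below the default threshold.
const refreshNeeded = hasDrifted(["f1", "f2"], ["f3"]);
```

Because the comparison is strict, a similarity exactly equal to `driftThreshold` does not count as drift in this sketch; the plugin may choose differently.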
-### Message Buffering (`event`)
+### Event Extraction and Buffering (`event`)
 
-User and assistant messages are buffered in memory as they arrive. The plugin
-listens on `message.part.updated` to capture assistant text as it streams, and
-on `message.updated` to finalize completed assistant replies. Buffered messages
-are flushed to Graphiti as episodes:
+User and assistant messages are captured as structured `SessionEvent` objects
+and stored in Redis (`session:{id}:events`). The plugin listens on
+`message.part.updated` to buffer assistant text as it streams, and on
+`message.updated` to finalize completed assistant replies.
 
-- **On idle** (`session.idle`): when the session becomes idle with at least 50
-bytes of buffered content.
-- **Before compaction** (`session.compacted`): all buffered messages are flushed
-immediately (no minimum size) so nothing is lost.
+Events are also enqueued for async drain to Graphiti:
 
-If the last buffered message is from the user (i.e. no assistant reply was
-captured), the plugin fetches the latest assistant message from the session API
-as a fallback before flushing.
+- **On idle** (`session.idle`): buffered events are drained and the
+priority-tiered snapshot is rebuilt.
+- **Before compaction** (`session.compacted`): all pending events are drained
+immediately so nothing is lost.
 
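The drain described in this section is bounded by the `batchSize` and `batchMaxBytes` settings from the config hunk. One plausible way to cut a batch from the pending queue is sketched below. The `SessionEvent` fields shown here are hypothetical (the README names the type but not its shape), as is `takeBatch`, and the rule of always taking at least one event is an assumption made so that a single oversized event cannot stall the queue.

```typescript
// Hypothetical event shape; the README does not list SessionEvent's fields.
interface SessionEvent {
  id: string;
  body: string;
}

// Take events from the front of the queue until either limit is reached.
function takeBatch(
  queue: readonly SessionEvent[],
  batchSize = 20, // README default: max events per drain batch
  batchMaxBytes = 51200, // README default: max combined body bytes
): SessionEvent[] {
  const encoder = new TextEncoder();
  const batch: SessionEvent[] = [];
  let bytes = 0;
  for (const event of queue) {
    if (batch.length >= batchSize) break;
    const size = encoder.encode(event.body).length; // UTF-8 byte length
    // Assumption: the first event is always accepted, even if oversized.
    if (batch.length > 0 && bytes + size > batchMaxBytes) break;
    batch.push(event);
    bytes += size;
  }
  return batch;
}

// Usage: 25 pending events of 10 bytes each yield a first batch of 20.
const pending: SessionEvent[] = Array.from({ length: 25 }, (_, i) => ({
  id: String(i),
  body: "0123456789",
}));
const firstBatch = takeBatch(pending);
```

Measuring encoded bytes rather than string length keeps the limit meaningful for non-ASCII message bodies.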
-### Compaction Preservation (`session.compacted` + `experimental.session.compacting`)
+### Compaction Preservation
 
 Compaction is handled entirely by OpenCode's native compaction mechanism. The
 plugin participates in two ways:
 
-1. **Before compaction** (`experimental.session.compacting`): The plugin injects
-known facts and entities into the compaction context using the same 70% / 30%
-project/user budget split, so the summarizer preserves important knowledge.
-2. **After compaction** (`session.compacted`): The compaction summary is saved
-as an episode to Graphiti, ensuring knowledge survives across compaction
-boundaries.
+1. **Before compaction** (`experimental.session.compacting`): The plugin reads
+the snapshot and cached memory from Redis and composes the same canonical
+`<session_memory>` envelope used for chat injection, so the summarizer
+preserves important knowledge. No Graphiti call is made.
+2. **After compaction** (`session.compacted`): The snapshot is rebuilt from
+Redis events and the compaction summary is enqueued for async drain to
+Graphiti, ensuring knowledge survives across compaction boundaries.
 
 ### Project Scoping
 
@@ -207,7 +294,12 @@ process.
 
 MIT
 
-## Acknowledgement
+## Acknowledgements
+
+The structured event extraction, priority-tiered snapshots, and session
+continuity design in this plugin are inspired by
+[context-mode](https://github.com/mksglu/context-mode) by
+[Mert Köseoğlu](https://github.com/mksglu).
 
-This project is inspired by
-[opencode-openmemory](https://github.com/happycastle114/opencode-openmemory)
+The original plugin concept is inspired by
+[opencode-openmemory](https://github.com/happycastle114/opencode-openmemory).

deno.json

Lines changed: 1 addition & 0 deletions
@@ -33,6 +33,7 @@
 "@opencode-ai/plugin": "npm:@opencode-ai/plugin@^1.1.53",
 "@opencode-ai/sdk": "npm:@opencode-ai/sdk@^1.1.53",
 "cosmiconfig": "npm:cosmiconfig@9.0.0",
+"ioredis": "npm:ioredis@^5.7.0",
 "zod": "npm:zod@4.3.6"
 },
 "exports": {
