docs: reframe README around memory reliability, add differentiator (#86)
- Lead with 'Your agent remembers what matters' instead of token reduction
- Add one-sentence differentiator near the top
- Reframe the problem statement around agent forgetfulness and trust
- Add 'Remember' stage to the How It Works table
- Document new memory endpoints (expire, supersede) and MCP tools
- Move token efficiency to a supporting detail, not the headline
Closes #80
Closes #81
Co-authored-by: Ona <no-reply@ona.com>
README.md: 12 additions & 9 deletions (+12 -9)
@@ -8,11 +8,11 @@
 
 [](https://app.ona.com/#https://github.com/siddhant-k-code/distill)
 
-**Open-source context preprocessing for LLM applications.**
+**Your agent remembers what matters.**
 
-Distill sits between your application and any LLM. It cleans up context before it's sent: deduplicating semantically redundant chunks, compressing conversation history as it ages, and placing cache markers on stable content so Anthropic's prompt cache actually fires.
+Distill gives LLM agents persistent, deduplicated memory that survives across sessions. It prevents repeated re-learning, surfaces conflicting information before it causes mistakes, and compresses aging context so the signal stays high.
 
-The result: fewer tokens sent, lower cost per request, and context windows that don't fill up with noise.
+> Other tools compress what goes into your agent. Distill controls what your agent *remembers* — across sessions, without conflicts, ranked by what matters now.
 
 **[Learn more →](https://distill.siddhantkhare.com)**
 
@@ -22,29 +22,30 @@ The result: fewer tokens sent, lower cost per request, and context windows that
 RAG / tools / memory / docs
 ↓
 Distill
-(dedupe · compress · cache)
+(remember · dedupe · compress · cache)
 ↓
 LLM
 ```
 
 ## The Problem
 
-30-40% of context assembled from multiple sources is semantically redundant. The same information arrives from docs, code, memory, and tool outputs, all competing for attention in the same prompt.
+Agents forget. Every new session starts from zero — the same constraints, preferences, and facts have to be re-established. When context does persist, 30-40% of it is semantically redundant, and contradictory information sits side by side with no signal about which version is current.
 
-This causes non-deterministic outputs, confused reasoning, and failures that only show up at scale. Better prompts don't fix it. The context going in needs to be clean.
+This causes non-deterministic outputs, confused reasoning, and failures that only show up at scale. Better prompts don't fix it. The agent needs memory it can trust.
 
 ## How It Works
 
 No LLM calls. Fully deterministic. ~12ms overhead.
 
 | Stage | What it does |
 |-------|-------------|
+|**Remember**| Persistent memory across sessions with write-time dedup, expiry, and sensitivity tagging |
 |**Deduplicate**| Cluster semantically similar chunks, keep one representative per cluster |
 |**Compress**| Extractive compression to remove noise and preserve signal |
 |**Summarize**| Progressively condense conversation history as turns age |
 |**Cache**| Annotate stable prefixes with `cache_control`, track TTL per prefix |
 
-All four stages chain together via `POST /v1/pipeline` or `distill pipeline` CLI.
+All stages chain together via `POST /v1/pipeline` or `distill pipeline` CLI. Memory is available via `--memory` flag.
 
 ### Dedup pipeline
 
@@ -273,7 +274,7 @@ Memory tools are available in Claude Desktop, Cursor, and other MCP clients when
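Reading aid for the **Deduplicate** row in the How It Works table ("cluster semantically similar chunks, keep one representative per cluster"): the sketch below shows one simple way such a stage could behave. It is not taken from Distill's source. The function names (`cosine_similarity`, `dedupe_chunks`), the greedy first-seen clustering strategy, the `0.9` threshold, and the toy embedding vectors are all assumptions made for illustration. It uses no LLM calls and is deterministic for a fixed input order, in line with the README's description.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def dedupe_chunks(chunks, threshold=0.9):
    """Greedy dedup: keep a chunk only if it is not too similar to a kept one.

    `chunks` is a list of (text, embedding) pairs. The first chunk seen in
    each group of near-duplicates acts as that group's representative.
    The threshold value is a hypothetical default, not Distill's.
    """
    representatives = []
    for text, embedding in chunks:
        is_duplicate = any(
            cosine_similarity(embedding, rep_embedding) >= threshold
            for _, rep_embedding in representatives
        )
        if not is_duplicate:
            representatives.append((text, embedding))
    return [text for text, _ in representatives]


# Toy data: the first two chunks point in almost the same direction,
# the third is orthogonal to the first.
chunks = [
    ("chunk A, from docs", [1.0, 0.0, 0.1]),
    ("chunk A again, from memory", [0.98, 0.05, 0.12]),
    ("chunk B, from tool output", [0.0, 1.0, 0.0]),
]
kept = dedupe_chunks(chunks)
print(kept)
```

With these toy vectors the second chunk is treated as a near-duplicate of the first, so only the first and third chunks are kept. A real pipeline would obtain embeddings from a model and would likely choose the representative more carefully than "first seen".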