AEndrix edited this page May 15, 2026 · 4 revisions

graft is a single-binary semantic cache that survives across sessions. Agents save what they learn; queries return verified hits in milliseconds; misses degrade to graph exploration. No SaaS, no API keys, one SQLite file.

LLM/agent ──> graft query "…"
                 │
                 ▼
            STRONG  ──> reuse answer (no LLM call, no token cost)
            WEAK    ──> reuse with low-confidence banner
            MISS    ──> graft retrieve / explore fallback
                 │
                 ▼
            graft insert  ──> next time it's a STRONG
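
The flow above can be sketched as a toy in-memory model. This is only an illustration of the STRONG / WEAK / MISS control flow, not graft's implementation: the thresholds `STRONG_MIN` and `WEAK_MIN`, the `ToyCache` class, and the word-overlap score (standing in for embedding similarity and whatever verification graft performs) are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

# Hypothetical thresholds; graft's real tiering rules are not described here.
STRONG_MIN = 0.90
WEAK_MIN = 0.75


@dataclass
class Hit:
    tier: str               # "STRONG", "WEAK", or "MISS"
    answer: Optional[str]   # None on a MISS


def classify(score: float) -> str:
    """Map a similarity score in [0, 1] to a tier."""
    if score >= STRONG_MIN:
        return "STRONG"
    if score >= WEAK_MIN:
        return "WEAK"
    return "MISS"


def word_overlap(a: str, b: str) -> float:
    """Jaccard similarity over lowercase words (stand-in for embeddings)."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    union = wa | wb
    return len(wa & wb) / len(union) if union else 0.0


class ToyCache:
    """In-memory stand-in for the cache: a list of (question, answer) pairs."""

    def __init__(self, similarity: Callable[[str, str], float]):
        self.similarity = similarity
        self.entries: List[Tuple[str, str]] = []

    def insert(self, question: str, answer: str) -> None:
        self.entries.append((question, answer))

    def query(self, question: str) -> Hit:
        best_score, best_answer = 0.0, None
        for stored_question, stored_answer in self.entries:
            score = self.similarity(question, stored_question)
            if score > best_score:
                best_score, best_answer = score, stored_answer
        tier = classify(best_score)
        return Hit(tier, best_answer if tier != "MISS" else None)


def answer_with_cache(cache: ToyCache, question: str,
                      fallback: Callable[[str], str]) -> str:
    """Reuse on STRONG, reuse with a banner on WEAK, fall back and save on MISS."""
    hit = cache.query(question)
    if hit.tier == "STRONG":
        return hit.answer
    if hit.tier == "WEAK":
        return "[low confidence] " + hit.answer
    answer = fallback(question)      # the expensive path (LLM call, exploration)
    cache.insert(question, answer)   # so a repeat of this question is a hit
    return answer
```

In this sketch the fallback runs only on a MISS, and its result is inserted immediately, so asking the same question a second time is answered from the cache without calling the fallback again.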

What graft is

A C11 daemon and CLI. The daemon speaks MessagePack over an AF_UNIX socket, stores data in SQLite with sqlite-vec and FTS5, and embeds text with BGE-M3 (1024-dim) via llama.cpp. It exposes a subprocess CLI plus an optional REST interface and 3D viewer. It is designed to be the smallest useful thing that makes hard-won agent knowledge survive a session.
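
To make the transport concrete, here is a minimal sketch of a MessagePack request and reply exchanged over an AF_UNIX stream socket. It does not talk to graft: the message fields (`op`, `text`, `tier`) and the 4-byte length prefix are assumptions made for illustration, since the daemon's actual schema and framing are not documented here. The tiny encoder covers only the `fixmap` and `fixstr` types from the MessagePack format, and `socket.socketpair` stands in for a real client and daemon.

```python
import socket
import struct


def pack_str(s: str) -> bytes:
    """Encode a short string as a MessagePack fixstr (at most 31 UTF-8 bytes)."""
    raw = s.encode("utf-8")
    if len(raw) > 31:
        raise ValueError("this sketch only handles fixstr (<= 31 bytes)")
    return bytes([0xA0 | len(raw)]) + raw


def pack_map(d: dict) -> bytes:
    """Encode a small str -> str dict as a MessagePack fixmap (at most 15 pairs)."""
    if len(d) > 15:
        raise ValueError("this sketch only handles fixmap (<= 15 pairs)")
    out = bytes([0x80 | len(d)])
    for key, value in d.items():
        out += pack_str(key) + pack_str(value)
    return out


def unpack_map(buf: bytes) -> dict:
    """Decode a fixmap of fixstr keys and values produced by pack_map."""
    if buf[0] & 0xF0 != 0x80:
        raise ValueError("expected a fixmap")
    pairs, pos, result = buf[0] & 0x0F, 1, {}
    for _ in range(pairs):
        fields = []
        for _ in range(2):  # key, then value
            if buf[pos] & 0xE0 != 0xA0:
                raise ValueError("expected a fixstr")
            length = buf[pos] & 0x1F
            fields.append(buf[pos + 1:pos + 1 + length].decode("utf-8"))
            pos += 1 + length
        result[fields[0]] = fields[1]
    return result


def send_frame(sock: socket.socket, payload: bytes) -> None:
    """Send one message with a big-endian 4-byte length prefix (assumed framing)."""
    sock.sendall(struct.pack(">I", len(payload)) + payload)


def recv_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes, since recv may return fewer than requested."""
    data = b""
    while len(data) < n:
        chunk = sock.recv(n - len(data))
        if not chunk:
            raise ConnectionError("peer closed the socket mid-frame")
        data += chunk
    return data


def recv_frame(sock: socket.socket) -> bytes:
    (length,) = struct.unpack(">I", recv_exact(sock, 4))
    return recv_exact(sock, length)


def demo():
    """Round-trip a hypothetical query and reply over a connected AF_UNIX pair."""
    client, server = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    with client, server:
        send_frame(client, pack_map({"op": "query", "text": "how to rebase"}))
        request = unpack_map(recv_frame(server))
        send_frame(server, pack_map({"tier": "MISS"}))
        reply = unpack_map(recv_frame(client))
    return request, reply
```

A real client would connect to the daemon's socket path and would more likely use a full MessagePack library than a hand-rolled encoder; the point here is only how compact the wire format is for small maps of short strings.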

Why not a vector DB / RAG / chatbot platform

Vector DB RAG framework graft
Local-first, no SaaS partial
One binary, no dependencies
Graph topology (not just lists)
Verified hits (STRONG/WEAK/MISS)
3D viewer included
Multi-tenant by profile partial
MCP + Skills/Hooks integrations partial
