Design Rationale
Decisions that have already been argued. This page captures the why so that future contributors (human or LLM) do not relitigate.
Decision: vault.* operations never touch the Obsidian runtime. They read and write .md files on the filesystem. Obsidian can be closed, not installed, or on a different machine; the MCP server still works.
Why:
- Agents run on servers, in CI, in cron jobs, inside Docker. Most of those environments have no display. An MCP server that needs Obsidian open is an MCP server that needs a human present.
- A runtime bridge through a plugin such as Obsidian Local REST API is valid, but it couples agent workflow to human workflow. If the user closes Obsidian, the agent loses its tools. That is not robust.
- The filesystem is already the source of truth. Obsidian reads .md files. Dataview reads .md files. Templater reads .md files. We read .md files. Consistency is free.
Consequence: features that genuinely need Obsidian runtime (Dataview query execution, Templater rendering, plugin-specific commands) do not live here. They live behind the obsidian adapter (opt-in) or in the separate obsidian-vault-bridge project (not shipped — see FAQ).
Decision: where the upstream ecosystem already solves a problem, we adapt; we do not rebuild.
Cases:
- Bulk import of foreign formats (Evernote, Notion export, Roam JSON). We do not write converters. Obsidian ships a native CLI and the Obsidian Importer plugin handles all the big formats. Our job is to be reachable from the CLI, not to duplicate it. See Obsidian-Native-CLI-Comparison.
- Daily note / template rendering. Obsidian does this natively. We don't.
- Vector search. When the user already runs Postgres + pgvector (memU pattern) or wants a local-only option (pglite + bge-m3), we adapt. We do not ship our own vector index.
- Graph viz. We emit .compile/graph.json; the static HTML viewer (viewer/) renders it. No custom D3 framework.
Exception: where the semantic is already part of the user's muscle memory (Obsidian's key:: value Dataview inline field syntax, wikilinks, tags), we are native, not adaptive. Those are the user's language; we speak it.
Decision: notes authored by LLM agents land in 00-Inbox/AI-Output/{persona}/YYYY-MM-DD-{slug}.md with an 8-field provenance frontmatter and a body-tag gated review system. This was the flagship change in v2.
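The landing path convention can be sketched in a few lines. This is a minimal illustration, not the server's actual code: the function names, the slug rule (lowercase, non-alphanumerics collapsed to hyphens), and the assumption that {persona} is the bare persona name are all hypothetical.

```python
import re
from datetime import date
from pathlib import PurePosixPath


def slugify(title: str) -> str:
    """Lowercase the title and collapse runs of non-alphanumerics into single hyphens.

    The slug rule is an assumption; the project may normalise titles differently.
    """
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def ai_output_path(persona: str, title: str, day: date) -> PurePosixPath:
    """Build the vault-relative path 00-Inbox/AI-Output/{persona}/YYYY-MM-DD-{slug}.md."""
    filename = f"{day.isoformat()}-{slugify(title)}.md"
    return PurePosixPath("00-Inbox", "AI-Output", persona, filename)
```

Because the date leads the filename, notes within one persona folder sort chronologically in any file listing.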
The 8 fields:
generated-by: vault-<persona>
generated-at: 2026-04-21T10:23:00Z
agent: claude-opus-4-7
parent-query: "user's original ask, truncated to 200 chars"
source-nodes: ["[[Kolmogorov complexity]]", "[[MDL]]"]
status: draft | stale | superseded
scope: project | global | cross-project | host-local
quarantine-state: new | reviewed | promoted | discarded
history: [{ at, op, axis, delta }, ...]

Why each field:
- generated-by / agent: blame. If an analysis is wrong, we can audit which persona + which model produced it.
- parent-query: provenance. A note without a query attached is a note that cannot be re-generated.
- source-nodes: citation graph. AI-Output notes that cite no source nodes are either bullshit or insights; the sweep distinguishes by comparing backlinks.
- status (draft / stale / superseded): lifecycle. Drafts expire on a persona-specific clock if no non-AI-Output notes backlink them. Supersede is a same-persona Jaccard ≥ 0.6 match on source-nodes.
- scope: governance. project-scoped insights stay in-vault; cross-project is promoted to the memU graph layer; host-local never leaves the current machine.
- quarantine-state: trust gate. New AI output is new; human review flips to reviewed; promotion to canonical knowledge flips to promoted; rejection flips to discarded.
- history[]: audit trail. Every sweep that touches the note appends an entry. Non-destructive.
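The supersede rule on status (same persona, Jaccard ≥ 0.6 on source-nodes) might look like the sketch below. It assumes frontmatter has already been parsed into plain dicts; the function names are hypothetical, and scoring two empty source-node lists as 0.0 is an assumption the source does not specify.

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard similarity |A ∩ B| / |A ∪ B|; two empty sets score 0.0 (an assumption)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def supersedes(newer: dict, older: dict, threshold: float = 0.6) -> bool:
    """True when both notes share a persona and their source-nodes overlap enough."""
    if newer["generated-by"] != older["generated-by"]:
        return False
    overlap = jaccard(set(newer["source-nodes"]), set(older["source-nodes"]))
    return overlap >= threshold
```

For example, a note citing three of another note's four source nodes scores 3/4 = 0.75 and qualifies, while two notes sharing one node out of three total score 1/3 and do not.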
The human-signature cache — review-status is not a frontmatter field. It rides on an Obsidian body tag #user-confirmed. Reasons: frontmatter edits trigger Obsidian's re-render + cache invalidation on every keystroke; body tags are cheap, searchable via Obsidian's tag panel, and survive merge conflicts. The write op's reviewStatus: "user-confirmed" parameter simply appends the tag.
Why not just store everything in Canonical/ directly? Because agents are wrong sometimes. Sediment gives a trust gradient (new → reviewed → promoted) instead of a binary trust/reject. A note can sit in new for a week before promotion.
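One way to picture the trust gradient is as a small state machine over quarantine-state that appends to history on every move. The sketch below is illustrative only: the allowed-transition table (in particular, which states may be discarded) and the values written into each history entry are assumptions; only the state names and the { at, op, axis, delta } entry shape come from the field list above.

```python
# Hypothetical transition table: which quarantine-state values may follow which.
ALLOWED = {
    "new": {"reviewed", "discarded"},
    "reviewed": {"promoted", "discarded"},
    "promoted": set(),
    "discarded": set(),
}


def transition(frontmatter: dict, target: str, at: str) -> dict:
    """Return a copy of the frontmatter moved to `target`, with a history entry appended.

    The input dict is left untouched, mirroring the non-destructive audit trail.
    """
    current = frontmatter["quarantine-state"]
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition {current} -> {target}")
    updated = dict(frontmatter)
    updated["quarantine-state"] = target
    entry = {"at": at, "op": "transition", "axis": "quarantine-state", "delta": f"{current}->{target}"}
    updated["history"] = list(frontmatter.get("history", [])) + [entry]
    return updated
```

Under this table a note cannot jump from new straight to promoted; it has to pass through reviewed first.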
Decision: the project's thesis is Karpathy's blog post (2026). A wiki that LLMs curate compounds across sessions; session-scoped memory does not.
What we take from Karpathy:
- Markdown is the substrate (not a DB, not a graph DB, not JSON).
- Wikilinks ([[...]]) are the edge primitive.
- log.md chronicles, _index.md catalogues; these are the "topic-level" structure.
- Humans and LLMs read the same files. No two-world problem.
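Treating wikilinks as the edge primitive means a note's outgoing edges can be recovered from its text alone. The sketch below is a hypothetical extractor, not the compile pipeline's actual code: it handles plain links, heading anchors ([[Note#Heading]]) and aliases ([[Note|label]]), and ignores embeds and links inside code blocks for brevity.

```python
import re

# Captures the target note name; an optional "#heading" and "|alias" are skipped.
WIKILINK = re.compile(r"\[\[([^\]|#]+)(?:#[^\]|]*)?(?:\|[^\]]*)?\]\]")


def extract_edges(note_name: str, text: str) -> list:
    """Return (source, target) pairs for each distinct wikilink target, in first-seen order."""
    targets = []
    for match in WIKILINK.finditer(text):
        target = match.group(1).strip()
        if target and target not in targets:
            targets.append(target)
    return [(note_name, target) for target in targets]
```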
What we add beyond Karpathy's post:
- The AI-Output sediment convention (Karpathy's post leaves "how do you trust LLM-authored entries" open).
- The compile pipeline (concept graph, link discovery, evaluate) — Karpathy manually curates; we provide automation for the parts that are automatable.
- The adapter registry — Karpathy uses one vault; we handle the multi-source case (vault + memU graph + code graph + Obsidian metadata).
Decision: every mutating op defaults to dryRun: true. The agent must explicitly pass dryRun: false to actually write.
Why:
- LLMs hallucinate paths. An op that writes on the first call is an op that writes to a hallucinated path on the first call.
- vault.* operates inside the user's irreplaceable data. Versioning via git is the last line, not the first.
- Three layers, each independent:
  - Dry-run default (op-level)
  - Protected dirs (.obsidian, .trash, .git, node_modules; hard-blocked regardless of dry-run)
  - Realpath traversal guard (symlink targets must resolve inside vault root)
No single layer is enough. Layered defense.
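The three layers can be sketched together as follows. This is an illustration under assumptions, not the server's implementation (which is TypeScript): the function names and the shape of the returned dict are hypothetical, and only the protected-directory list and the dry-run default come from the text above.

```python
import os
from pathlib import Path

PROTECTED_DIRS = {".obsidian", ".trash", ".git", "node_modules"}


def check_write_target(vault_root: str, relative_path: str) -> Path:
    """Resolve symlinks, then reject targets outside the vault or inside a protected dir."""
    root = Path(os.path.realpath(vault_root))
    target = Path(os.path.realpath(root / relative_path))
    if target != root and root not in target.parents:
        raise PermissionError(f"{relative_path} resolves outside the vault root")
    if PROTECTED_DIRS & set(target.relative_to(root).parts):
        raise PermissionError(f"{relative_path} is inside a protected directory")
    return target


def write_note(vault_root: str, relative_path: str, content: str, dry_run: bool = True) -> dict:
    """Guards run on every call; the file is only written when dry_run is False."""
    target = check_write_target(vault_root, relative_path)
    size = len(content.encode("utf-8"))
    if dry_run:
        return {"written": False, "path": str(target), "bytes": size}
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(content, encoding="utf-8")
    return {"written": True, "path": str(target), "bytes": size}
```

The guards run before the dry-run check, so a hallucinated or protected path fails the same way in preview as it would in a real write.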
Decision: VAULT_MIND_VAULT_PATH env var takes precedence over any on-disk yaml. Fixed in v2.0.0.
Why: pre-v2, yaml probes ran first. An abandoned dev-workspace yaml in a parent directory could silently redirect the server away from the user's intended vault. An explicit env var is a declaration of intent; a stale yaml is accidental state. Intent wins.
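The precedence rule reduces to a few lines. In this sketch the function name is hypothetical, and yaml_vault_path stands in for whatever the on-disk yaml probe found (the probe itself is not shown); only the VAULT_MIND_VAULT_PATH variable name comes from the decision above.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_vault_path(env: Mapping[str, str], yaml_vault_path: Optional[str]) -> Optional[Path]:
    """Prefer the explicit env var; fall back to the yaml-probed path only if it is unset."""
    explicit = env.get("VAULT_MIND_VAULT_PATH", "").strip()
    if explicit:
        return Path(explicit)
    if yaml_vault_path:
        return Path(yaml_vault_path)
    return None
```

In real use the first argument would be os.environ; taking the mapping as a parameter keeps the rule testable without mutating the process environment.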
Decision: the 7 vault-* personas (librarian, architect, curator, gardener, historian, janitor, teacher) are Claude Code skills installed under ~/.claude/skills/. They are not MCP operations.
Why:
- A persona is a system prompt + workflow + tool-selection policy. MCP operations are atomic transport-level verbs. Different layer.
- Making them skills means they work with any MCP client, not just Claude Code — the skill format is portable.
- Adding a new persona requires zero server changes. The MCP tool surface (40 ops) stays stable; persona evolution happens in the skill files alone.
Decision: the compile pipeline (compiler/*.py) is a Python CLI, not an in-process module of the MCP server.
Why:
- Compile is batch work. Taking 30 seconds to re-index 500 notes should not block an interactive vault.search.
- Python has the mature scientific stack (Jinja2, spaCy candidates, Jupyter for debugging). TypeScript does not, for this workload.
- The filesystem is the API between pipeline and server. Both read the same .md files; both write to the same .compile/ folder. No RPC, no shared memory, no lock files. Filesystem atomicity primitives are enough.
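A common filesystem atomicity primitive is to write to a temporary file and then rename it over the destination. The sketch below assumes that pattern; the source does not say which primitive the pipeline actually uses, and the function name and payload shape are hypothetical.

```python
import json
import os
import tempfile


def write_json_atomically(path: str, payload: dict) -> None:
    """Write payload to a temp file in the same directory, then rename it over `path`.

    os.replace is atomic when source and destination are on the same filesystem,
    so a concurrent reader sees either the old file or the new one, never a partial write.
    """
    directory = os.path.dirname(os.path.abspath(path))
    os.makedirs(directory, exist_ok=True)
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(payload, handle)
            handle.flush()
            os.fsync(handle.fileno())
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

Creating the temp file in the destination directory matters: a rename across filesystems is not atomic.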
Consequence: you can install the MCP server without Python if you don't want compile. The filesystem adapter alone is a complete install.
Decision: mcp-server/ uses @modelcontextprotocol/sdk + Node stdlib. No Express, no Fastify, no Koa. No ORM. No DI framework. No state manager.
Why:
- The MCP server is ~2000 lines of TypeScript. A framework tax would be 50%+ overhead for the abstractions it provides.
- MCP transport is already defined by the SDK; adding Express means we have two stacks.
- For stdio JSON-RPC, plain async functions are enough. State is the vault filesystem; there's nothing to manage.
Honest list of assumptions that could break the thesis:
- Filesystem-is-source-of-truth assumption. If a user's vault lives in a non-filesystem backend (iCloud Drive sync conflicts, OneDrive reparse points, network shares with funky locking), the safety guarantees wobble. See recipes/ for the multi-source landing strategy.
- Karpathy model assumes a single curator. We inherit that assumption. Multi-human-author vaults are out of scope.
- Sediment only works if the human reviews. If quarantine-state: new notes pile up forever, the vault is an agent monologue, not a knowledge base. Sweep gives a staleness timer but cannot force review.
- Adapter weights are hand-tuned. No learning, no auto-calibration. If the user's workload shifts, weights drift from optimal. Documented but not yet addressed.
See FAQ for questions people have actually asked.