This file defines the rules and workflows for your LLM-powered knowledge wiki. Your LLM coding tool reads this file and follows these instructions to manage your wiki automatically. It is tool-agnostic — it works with Claude Code, Codex CLI, Gemini CLI, or any LLM coding tool that reads markdown instructions.
schema_version: "1.0"
# Wiki Configuration
wiki_language: en # Default language for meta files (index.md, log.md)
# Options: en, ko, ja, zh, etc.
# Page content follows source language rules below

Seojae is a pattern-based personal knowledge wiki with a 3-tier architecture:
- Raw sources (`raw/`): Immutable originals. Only the user adds files; the LLM reads only.
- Wiki (`wiki/`): LLM-generated/maintained markdown pages: summaries, entities, concepts, synthesis.
- Schema (`WIKI_SCHEMA.md`): This file. Defines wiki rules and workflows.
- Python 3.9+ with pip
- Internet access on first run (downloads ~470MB embedding model)
- An LLM coding tool: Claude Code, Codex CLI, Gemini CLI, or similar
If you cannot download the model immediately, you can skip the search index setup and build it later with the Reindex workflow.
When a user asks to initialize this wiki, perform these steps:
The repo ships starter stubs. After initialization, update the file for your tool with environment-specific settings.
Claude Code (CLAUDE.md):
@WIKI_SCHEMA.md
@extensions/search-chromadb.md
@extensions/obsidian.md
# Environment
- Python: use `venv/bin/python` prefix or `source venv/bin/activate`
- All tools/ scripts require the venv to be active
Codex CLI (AGENTS.md):
(Full WIKI_SCHEMA.md content is already inlined)
(Append active extension contents below)
# Environment
- Python: always use `venv/bin/python` prefix (venv activation
does not persist between commands in Codex)
Gemini CLI (GEMINI.md):
@./WIKI_SCHEMA.md
@./extensions/search-chromadb.md
@./extensions/obsidian.md
# Environment
- Python: use `venv/bin/python` prefix or `source venv/bin/activate`
- Create venv: `python3 -m venv venv`
- Install: `venv/bin/pip install -r requirements.txt`
- Run the search command defined by the active search extension.
  - Default (search-chromadb): `venv/bin/python tools/search.py --reindex`
  - Note: first run downloads the embedding model (~470MB).
- Run a test query: `venv/bin/python tools/search.py --query "test"`
  - If results return, setup is complete.
| Path | Write | Read | Conflict Risk |
|---|---|---|---|
| `raw/` | User only | LLM + User | None |
| `wiki/` | LLM only | LLM + User | None |
| `index.md`, `log.md` | LLM only | LLM + User | None |
| `WIKI_SCHEMA.md` | User + LLM | LLM | Possible |
| `README.md` | User + LLM | User | Possible |
Absolute rules:
- Never modify files in `raw/`.
- Wiki page creation/modification must follow the rules in this schema.
Conflict prevention for shared-write files: Both the user and the LLM may edit WIKI_SCHEMA.md and README.md. Do not edit these files while the LLM is working. If both sides have uncommitted changes, the LLM will run git pull --rebase and ask the user to resolve any conflicts manually.
Wiki directories:
- `wiki/entities/`: Things with a proper name (people, tools, companies, models). Examples: "GPT-4", "OpenAI", "Yoshua Bengio"
- `wiki/concepts/`: Abstract concepts without a proper name (attention mechanism, fine-tuning). Examples: "Transformer Architecture", "Reinforcement Learning"
- `wiki/sources/`: One source = one summary page
- `wiki/synthesis/`: Analysis combining 2+ sources or concepts

Raw source directories:
- `raw/myself/`: Your own content (blog posts, resume, etc.)
- `raw/articles/`: Web articles
- `raw/papers/`: Academic papers, PDFs
- `raw/videos/`: YouTube/podcast transcripts
- `raw/books/`: Book chapters
- `raw/misc/`: Miscellaneous
- `raw/assets/`: Images, attachments (not subject to ingest)
Boundary rule: If it has a proper name, it is an entity; if it is a general concept or method, it is a concept. In ambiguous cases (e.g., "Transformer" — both a paper title and an architecture), prefer concept unless the page is specifically about a particular paper or product.
Parser limitation: The parse_frontmatter function in the search tool is regex-based. A line starting with --- inside a YAML block scalar may be misidentified as the closing delimiter. This rarely occurs in practice with wiki pages.
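The limitation can be reproduced with a minimal sketch. The pattern below is an assumption made for illustration, not the actual `parse_frontmatter` code in the search tool; `split_frontmatter` and the sample page are hypothetical.

```python
import re

# Assumed pattern for illustration only; the real parse_frontmatter may differ.
FRONTMATTER_RE = re.compile(r"\A---[ \t]*\n(.*?)\n---", re.DOTALL)


def split_frontmatter(text):
    """Return the text between the opening '---' and the next line starting with '---'."""
    match = FRONTMATTER_RE.match(text)
    return match.group(1) if match else None


# Hypothetical page whose block scalar contains a line starting with '---'.
page = (
    "---\n"
    'title: "Demo"\n'
    "summary: |\n"
    "  first line\n"
    "---- a divider line inside the block scalar\n"
    "tags: [demo]\n"
    "---\n"
    "Body text\n"
)

block = split_frontmatter(page)
# The non-greedy match stops at the divider line, so the tags field is cut off.
print("tags" in block)
```

Under this assumed pattern the `tags` field is lost, because the divider line is treated as the closing delimiter.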
---
title: "Page Title"
type: concept # entity | concept | source | synthesis
tags: [tag1, tag2]
sources: ["raw/papers/example.md"]
aliases: [] # Optional — alternative names, e.g., ["attention mechanism"]
created: YYYY-MM-DD
updated: YYYY-MM-DD
---

- Use Obsidian wikilinks: `[[Page Name]]`
- Filenames: Always use English kebab-case (e.g., `attention-mechanism.md`), regardless of body language. Non-English concept names go in the `aliases` frontmatter field.
- Filenames must be unique across all `wiki/` subdirectories (compatible with Obsidian "shortest path" wikilinks).
- Source summary pages (`wiki/sources/`): Written in the source's original language.
- Entity/concept pages (`wiki/entities/`, `wiki/concepts/`): Written in the language of the source that first created the page. Later sources in different languages add information in the existing page's language.
- Synthesis pages (`wiki/synthesis/`): Written in the language the user requests, or the dominant language of the combined sources.
- Meta files (`index.md`, `log.md`, `WIKI_SCHEMA.md`, `README.md`): Written in the language specified by `wiki_language` in the configuration block above. Source titles in log entries are kept in their original language.
- Wikilink names: Use one canonical name per concept (always `[[Attention Mechanism]]`, never localized variants like `[[어텐션 메커니즘]]`). Add localized names to the frontmatter `aliases` field if needed.
Raw source files are freeform markdown. There is no required frontmatter — this schema does not enforce a format on raw sources. However, sources may include optional metadata at the top for context:
---
title: "Source Title"
author: "Author Name"
source: "https://original-url"
date: YYYY-MM-DD
---

The body is the original content or a transcript/summary of it. The LLM reads raw sources as input and generates structured wiki pages from them.
When a workflow references {search.query}, {search.add}, or {search.reindex}, resolve it by reading the commands field from whichever extension is active with provides: search-backend. If no search extension is active, use the default shown in parentheses after each token. If no default is applicable, skip the step and warn the user.
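The resolution rule can be sketched as a small lookup. This is a sketch under assumptions: the schema does not specify how the `commands` field is structured, so the dictionary keyed by token name, the `resolve_token` helper, and the `tools/my_search.py` path are all hypothetical. Falling back to the default when an active extension omits a command is also an assumption.

```python
# Defaults copied from the workflows in this schema.
DEFAULT_COMMANDS = {
    "search.query": "venv/bin/python tools/search.py --query",
    "search.add": "venv/bin/python tools/search.py --add",
    "search.reindex": "venv/bin/python tools/search.py --reindex",
}


def resolve_token(token, extension_commands=None):
    """Resolve a token such as '{search.add}' to a command string.

    Returns None when nothing applies, meaning: skip the step and warn the user.
    """
    key = token.strip("{}")
    if extension_commands and key in extension_commands:
        return extension_commands[key]
    return DEFAULT_COMMANDS.get(key)


# Hypothetical commands mapping read from an active search extension.
custom = {"search.query": "venv/bin/python tools/my_search.py --query"}

print(resolve_token("{search.query}", custom))
print(resolve_token("{search.add}"))
print(resolve_token("{search.export}"))
```

The last call returns `None` because no extension command and no default exists for that token.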
Trigger: User specifies a file, e.g., "ingest raw/articles/some-article.md"
- Read the entire source (for sources with images: read text first, then examine referenced images separately).
- Discuss key takeaways with the user (what to emphasize, perspective).
- Create a source summary page at `wiki/sources/<source-name>.md`.
- Update related entity/concept/synthesis pages (or create new ones).
- Add cross-reference wikilinks between new and existing pages.
- Add new page entries to `index.md`.
- Update the search index for each new/modified wiki page: run `{search.add} <wiki page path>` (Default: `venv/bin/python tools/search.py --add`). Pages without frontmatter are skipped with a warning; missing files cause exit code 1.
- Append to `log.md`: `## [YYYY-MM-DD] ingest | <source title>`
- Git commit: `ingest: <source title>`
Trigger: User asks a question about wiki content.
- Run `{search.query} "<question>" --top 5` (Default: `venv/bin/python tools/search.py --query`)
  - Output format: `<path> [score: X.XX]` (score: cosine similarity, -1.0 to 1.0)
  - If output is empty or the highest score is below 0.5, also scan `index.md` as a fallback and merge results.
  - If the index path (default: `search-index/`) does not exist (exit code 2), fall back to scanning `index.md` and advise the user to run the Reindex workflow.
  - An empty query string causes exit code 1; ensure the query is non-empty.
- Read the wiki pages at the returned paths.
- Synthesize an answer with source citations. The answer format may vary depending on the question: markdown pages, comparison tables, slide decks (Marp), charts (matplotlib), canvas files.
- Save valuable answers back to the wiki. Comparisons, analyses, and discovered connections should not vanish in chat history. Save as `wiki/synthesis/<topic>.md`. If unsure whether to save, ask the user.
- (If saved) Update `index.md`, append to `log.md`: `## [YYYY-MM-DD] query | <question summary>`, Git commit: `query: <question summary>`
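The output format and the 0.5 fallback threshold from the first step can be handled as in the sketch below. The helper names `parse_results` and `needs_index_fallback` and the sample output are hypothetical; only the line format and the threshold come from this schema.

```python
import re

# Matches lines such as: wiki/concepts/attention-mechanism.md [score: 0.82]
LINE_RE = re.compile(r"^(?P<path>.+?) \[score: (?P<score>-?\d+(?:\.\d+)?)\]$")


def parse_results(output):
    """Parse search output into a list of (path, score) tuples."""
    results = []
    for line in output.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            results.append((match.group("path"), float(match.group("score"))))
    return results


def needs_index_fallback(results, threshold=0.5):
    """True when output is empty or the highest score is below the threshold."""
    return not results or max(score for _, score in results) < threshold


# Hypothetical sample output for illustration.
sample = (
    "wiki/concepts/attention-mechanism.md [score: 0.82]\n"
    "wiki/sources/example.md [score: 0.41]\n"
)
results = parse_results(sample)
print(results)
print(needs_index_fallback(results))
```

When `needs_index_fallback` returns `True`, the workflow also scans `index.md` and merges the results.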
Trigger: User asks to check the wiki, or periodically.
Health checks:
1. Orphan pages: Pages with no inbound links
2. Broken links: References to `[[Non-existent Page]]`
3. Stale information: Content contradicting recent sources
4. Missing pages: Frequently mentioned entities/concepts without their own page
5. Insufficient cross-references: Highly related pages with no links between them

Growth suggestions (proactively, beyond just fixing problems):

6. Data gaps: Information that could be filled by web searches or new sources
7. New questions to investigate: Questions that would deepen wiki coverage
8. New sources to find: Source recommendations to fill identified gaps
- Report findings, fix health issues with user approval, present growth suggestions.
- Update the search index for each modified page: run `{search.add} <modified wiki page path>` (Default: `venv/bin/python tools/search.py --add`)
- Append to `log.md`: `## [YYYY-MM-DD] lint | <summary>`
- Git commit: `lint: <fix summary>`
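The orphan and broken-link checks can be sketched over a link graph. This is an illustration under assumptions: how a wikilink name maps to a page file is not spelled out here, so the sketch works on page names only, and `extract_wikilinks`, `find_link_issues`, and the sample pages are hypothetical.

```python
import re

# Captures the target of [[Target]], [[Target|label]], or [[Target#Heading]].
WIKILINK_RE = re.compile(r"\[\[([^\]|#]+)")


def extract_wikilinks(text):
    """Return the set of page names referenced by wikilinks in the text."""
    return {name.strip() for name in WIKILINK_RE.findall(text)}


def find_link_issues(links):
    """links maps each page name to the set of page names it links to.

    Returns (orphans, broken): pages with no inbound links from other pages,
    and a mapping of page name to the link targets that do not exist.
    """
    pages = set(links)
    linked = set()
    broken = {}
    for page, targets in links.items():
        for target in targets:
            if target not in pages:
                broken.setdefault(page, set()).add(target)
            elif target != page:  # a self-link is not counted as inbound
                linked.add(target)
    return sorted(pages - linked), broken


# Hypothetical link graph for illustration.
links = {
    "Attention Mechanism": {"Transformer Architecture"},
    "Transformer Architecture": {"Attention Mechanism", "Missing Page"},
    "Fine-Tuning": set(),
}
orphans, broken = find_link_issues(links)
print(orphans)
print(broken)
```

Stale information, missing pages, and insufficient cross-references need judgment about content, so they are left to the LLM rather than a script.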
Trigger: User asks to check for new sources and ingest them.
- Read `log.md` to build a list of already-processed sources (find `## [YYYY-MM-DD] ingest | <title>` headers, extract source file paths from `^- Source: <path>` patterns in entry bodies).
- Scan all files in `raw/` subdirectories (excluding `raw/assets/`).
- Difference = unprocessed sources.
- Report the list of new sources, then proceed to ingest all of them without waiting for approval.
- Run the full Ingest workflow for each source, with individual `ingest:` commits per source.
- After all processing, append a summary to `log.md`: `## [YYYY-MM-DD] check-new | N new sources processed`
- Git commit: `check-new: N sources processed`
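The first three steps amount to a set difference, sketched below. This is a simplification under assumptions: it collects every `- Source:` line in the log without checking that it sits under an `ingest` header, and it assumes logged paths look like `raw/articles/some-article.md`. The helper names and the sample log are hypothetical.

```python
import re
from pathlib import Path

SOURCE_RE = re.compile(r"^- Source: (?P<path>.+)$", re.MULTILINE)


def processed_sources(log_text):
    """Collect source paths recorded with the '- Source: <path>' prefix."""
    return {match.group("path").strip() for match in SOURCE_RE.finditer(log_text)}


def unprocessed_sources(raw_dir, log_text):
    """List files in raw/ subdirectories (except assets/) that the log does not mention."""
    raw_dir = Path(raw_dir)
    done = processed_sources(log_text)
    pending = []
    for path in sorted(raw_dir.rglob("*")):
        if not path.is_file():
            continue
        relative = path.relative_to(raw_dir)
        # Only files inside a subdirectory count; raw/assets/ is not ingested.
        if len(relative.parts) < 2 or relative.parts[0] == "assets":
            continue
        name = (Path(raw_dir.name) / relative).as_posix()
        if name not in done:
            pending.append(name)
    return pending


# Hypothetical log excerpt for illustration.
sample_log = (
    "## [2024-01-01] ingest | Some Article\n"
    "- Source: raw/articles/some-article.md\n"
)
print(processed_sources(sample_log))
```

Calling `unprocessed_sources("raw", log_text)` from the repository root would return the paths to pass to the Ingest workflow.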
Trigger: User asks to rebuild the index, or during environment setup.
- Run `{search.reindex}` (Default: `venv/bin/python tools/search.py --reindex`). For non-standard paths, add `--index-path <path>` and/or `--wiki-path <path>`.
- Confirm the completion message and report (output: `Reindex complete: N pages indexed, M skipped`).
- `search-index/` is a generated artifact included in `.gitignore`; no commit needed.
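Confirming the completion message can be done with a small parser, sketched here. The helper name `parse_reindex_summary` and the sample output are hypothetical; only the message format comes from this schema.

```python
import re

REINDEX_RE = re.compile(r"Reindex complete: (\d+) pages indexed, (\d+) skipped")


def parse_reindex_summary(output):
    """Return (indexed, skipped) from the completion message, or None if it is absent."""
    match = REINDEX_RE.search(output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


# Hypothetical tool output for illustration.
print(parse_reindex_summary("Reindex complete: 42 pages indexed, 3 skipped"))
```

A `None` result means the completion message was not found and the reindex should be reported as unconfirmed.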
A categorized wiki catalog. Updated after every Ingest, Query save, and Lint.
- One entry per line: `- [[Page Name]] — one-line summary`
- Alphabetical order within each category
- Category headers (in the language specified by `wiki_language`): Entities, Concepts, Sources, Synthesis
Chronological, append-only record. Parseable with `grep "^## \[" log.md | tail -5`.
- Header format: `## [YYYY-MM-DD] <action> | <title>`
- Actions: `init`, `ingest`, `query`, `lint`, `check-new`
- Source file paths always use the `- Source: <path>` prefix (the Check-New workflow parses already-processed sources using the `^- Source:` pattern).
- Entry bodies also record pages created/modified.
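A log entry that follows these rules can be assembled as in the sketch below. The header format, action names, and `- Source:` prefix come from this schema; the `format_log_entry` helper and the `- Page:` prefix for created/modified pages are assumptions, since the schema does not fix how pages are listed.

```python
from datetime import date

ACTIONS = {"init", "ingest", "query", "lint", "check-new"}


def format_log_entry(action, title, source=None, pages=(), day=None):
    """Build one append-only log.md entry as a string."""
    if action not in ACTIONS:
        raise ValueError(f"unknown action: {action}")
    day = day or date.today()
    lines = [f"## [{day.isoformat()}] {action} | {title}"]
    if source:
        # Check-New relies on this exact prefix to find processed sources.
        lines.append(f"- Source: {source}")
    for page in pages:
        # Assumed prefix; the schema only says pages are recorded.
        lines.append(f"- Page: {page}")
    return "\n".join(lines) + "\n"


# Hypothetical entry for illustration.
entry = format_log_entry(
    "ingest",
    "Example Paper",
    source="raw/papers/example.md",
    pages=["wiki/sources/example.md"],
    day=date(2024, 1, 2),
)
print(entry)
```

Appending the returned string to `log.md` keeps the headers greppable with the pattern shown above.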
- `init: project bootstrapped`: Initial bootstrap
- `ingest: <source title>`: Source processing
- `query: <question summary>`: Query result saved to wiki
- `lint: <fix summary>`: Wiki maintenance
- `check-new: <N sources processed>`: Batch new source processing summary (after individual ingest commits)
- `schema: <change description>`: WIKI_SCHEMA.md or README.md changes
- At the start of any workflow: `git pull`
- After editing files: `git add` -> `git commit` -> `git pull --rebase` -> `git push`
- On rebase conflict (extremely rare): `git rebase --abort` and ask the user to resolve manually.
Before starting any workflow, scan the extensions/ directory.
Each .md file is an extension module. Read all active extensions
and follow their instructions alongside this core schema.
- Read all `.md` files in `extensions/` (excluding `README.md`).
  - If an extension declares `min_schema_version` higher than this schema's `schema_version`, skip it and warn the user to update.
- Check `provides:` fields for conflicts:
  - `provides:` signals exclusive ownership of a capability (e.g., only one `search-backend` can be active).
  - If two extensions declare the same `provides:` value, only use the one with an `overrides:` field targeting the other.
  - If neither overrides the other, warn the user and use the first one alphabetically.
  - Extensions that augment (not replace) a workflow do NOT need a `provides:` value; they are always active.
- Check `requires.provides:` fields: if an extension declares a capability dependency (e.g., `requires.provides: [search-backend]`), verify that a provider is active. Warn the user if not.
- Verify scripts: check each entry in `requires.scripts:` exists in the repo. Warn the user if any are missing.
- Install dependencies: run `venv/bin/pip install <package>` for each entry in `requires.packages:` (let pip handle version resolution).
- Follow each active extension's instructions.
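The version check and the `provides:` conflict rule can be sketched as follows. This is an illustration under assumptions: extensions are represented as plain dictionaries, `overrides` is taken to hold a single extension name, and `search-example` is a hypothetical extension. Cases the rules do not cover, such as two extensions overriding each other, fall back to the alphabetical choice with a warning.

```python
def parse_version(text):
    """Turn '1.0' into (1, 0) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def is_compatible(ext, schema_version="1.0"):
    """True unless the extension needs a newer schema than this one."""
    minimum = ext.get("min_schema_version")
    return minimum is None or parse_version(minimum) <= parse_version(schema_version)


def pick_providers(extensions, schema_version="1.0"):
    """Return ({capability: extension name}, warnings) for active providers."""
    warnings = []
    by_capability = {}
    for ext in extensions:
        if not is_compatible(ext, schema_version):
            warnings.append(f"{ext['name']}: requires a newer schema, skipped")
            continue
        capability = ext.get("provides")
        if capability:
            by_capability.setdefault(capability, []).append(ext)

    chosen = {}
    for capability, candidates in by_capability.items():
        names = {c["name"] for c in candidates}
        overriders = [
            c for c in candidates
            if c.get("overrides") in names and c.get("overrides") != c["name"]
        ]
        if len(candidates) == 1:
            winner = candidates[0]
        elif len(overriders) == 1:
            winner = overriders[0]
        else:
            # No clear override: warn and use the first one alphabetically.
            winner = min(candidates, key=lambda c: c["name"])
            warnings.append(f"{capability}: unresolved conflict, using {winner['name']}")
        chosen[capability] = winner["name"]
    return chosen, warnings


extensions = [
    {"name": "obsidian"},  # augments workflows, no provides value
    {"name": "search-chromadb", "provides": "search-backend"},
    {"name": "search-example", "provides": "search-backend",
     "overrides": "search-chromadb"},  # hypothetical extension
]
chosen, warnings = pick_providers(extensions)
print(chosen, warnings)
```

Extensions without a `provides` value never enter the conflict check, which matches the rule that augmenting extensions are always active.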
- Add new workflows
- Append steps to existing core workflows (reference the step they follow, e.g., "After Ingest step 3, also do X")
- Replace a capability by declaring `provides:` + `overrides:`
- Add integrations (Obsidian, Notion, etc.)
- Define new page types or categories