This file tracks the active execution queue for this repository. Keep it current when starting, finishing, or reprioritizing work.
- Move root research source notes into `docs/research/source-notes/`.
- Add a documentation map in `docs/README.md`.
- Split cross-project memory rules into `docs/memory-policy.md`.
- Split repository-specific strategy into `docs/project-development-policy.md`.
- Move detailed MCP API reference out of `README.md`.
- Add `AGENTS.md` as a short agent entrypoint.
- Add deterministic query normalization for domain terms.
- Make `待办`, `待办项`, `todo`, `task`, and `任务` retrieve `work_item` records.
- Make decision/preference/procedure/evidence terms retrieve the matching object types or knowledge kinds.
- Return `normalized_terms`, `applied_filters`, and useful no-match retry hints where practical.
- Add tests for search and context behavior.
- Update `docs/mcp-api-reference.md` and `docs/agent-memory-mcp-usage.md` after behavior lands.
- Add soft duplicate detection for title/summary-only knowledge.
- Return `possible_duplicates` with reasons and scores.
- Do not hard-reject unstructured semantic duplicates.
- Preserve current hard duplicate/conflict behavior for structured facts.
- Add tests for similar unstructured knowledge that should be flagged but still writable.
- Extend `memory_maintain report` to surface soft duplicate candidates.
- Keep `merge_duplicates` limited to deterministic structured duplicates until review semantics are explicit.
- Design a safe resolve/review path for soft duplicates.
- Update agent usage docs so callers retry with expanded terms before concluding there is no memory.
- Add response guidance fields to MCP docs after implementation.
- Add MCP resources/prompts for policy and examples.
- Re-evaluate embedding/vector/hybrid retrieval after deterministic query normalization is in place.
- Continue Cognee and LlamaIndex spikes only if they fit behind Memory Substrate governance.
- Keep hosted LLMs, local LLMs, Graphiti, and reasoner adapters optional.
- Treat Neo4j as an optional production backend after local contracts and migrations are stable.
- Spike LanceDB + BGE-M3 semantic retrieval against Chinese/English memory queries.
- Confirm LanceDB remains a derived index, not canonical storage.
- Add optional semantic dependencies for LanceDB and FlagEmbedding.
- Project canonical memory objects into semantic chunks.
- Rebuild the semantic index from `memory_maintain reindex`.
- Merge lexical and semantic results in `memory_query search`.
- Keep semantic search active when a graph backend is also configured.
- Add regression coverage for the `Codex dogfood MCP` query miss.
- Adopt a single-primary parser stance: require `tree_sitter_language_pack`, then use local fallback parsing only when parser loading fails.
- Index Markdown repository docs as source evidence with headings, excerpts, and line locators.
- Make repo query summaries include documentation sections so theory-to-code questions can find design docs.
- Add a locked parser dependency for `tree-sitter-language-pack==1.6.0` and run a live parser smoke test.
- Make `memory_query page` compact by default with explicit `options.detail: "full"` for complete stored objects.
- Bound and truncate source segment excerpts returned by `memory_query expand`.
- Shorten repo source summaries and MCP server instructions.
Status: completed
Goal: make memory_query search robust when users and agents do not phrase queries with exact stored keywords.
Boundary: memory-core retrieval only. Do not adopt llm_wiki desktop/wiki UI, web clipper, or knowledge-collection workflows.
Deliverables:
- Replace lexical/semantic score max-merge with rank-based fusion such as Reciprocal Rank Fusion.
- Improve lexical query planning with phrase, title, filename/id, and CJK bigram signals.
- Return matched semantic chunks with source locators, excerpts, and chunk scores in query results.
- Document retrieval scoring behavior in MCP docs.
Verification:
- Add focused unit tests for lexical phrase/title/id/CJK matching.
- Add semantic merge tests using a fake semantic index.
- Run `uv run --group dev python -m pytest tests/test_query_normalization.py tests/test_semantic_index_service.py tests/test_mcp_server.py`.
- Run non-semantic main path: `uv run --group dev python -m pytest -k 'not lance and not semantic'`.
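The rank-based fusion named in the deliverables can be sketched as follows. This is a minimal illustration of Reciprocal Rank Fusion, not the repository's implementation: the function name, the `know:*` ids, and the constant `k=60` (a commonly used default) are assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked id lists into one list ordered by RRF score.

    Each id earns 1 / (k + rank) for every list it appears in (rank starts at 1),
    so only rank positions matter, not the raw lexical or semantic scores.
    """
    scores = defaultdict(float)
    for ranked_ids in rankings:
        for rank, object_id in enumerate(ranked_ids, start=1):
            scores[object_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by id so the output is deterministic.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical result ids from a lexical stream and a semantic stream.
lexical = ["know:a", "know:b", "know:c"]
semantic = ["know:b", "know:d", "know:a"]
fused = reciprocal_rank_fusion([lexical, semantic])
print([object_id for object_id, _ in fused])
```

Unlike a score max-merge, an item ranked well in both streams (`know:b` here) outranks an item that is first in only one stream.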
Status: completed
Goal: make source segments and semantic chunks preserve document structure and citeable locations.
Boundary: deterministic source/evidence preparation. Do not introduce mandatory LLM extraction or hosted services.
Deliverables:
- Add a Markdown-aware document chunker for ingest and semantic indexing.
- Preserve heading breadcrumbs, code fences, tables, frontmatter boundaries, overlap, and source offsets in chunks.
- Reuse one chunking contract across `memory_ingest`, source segments, and semantic rebuild.
- Include source locators and heading breadcrumbs in semantic chunks.
Verification:
- Add tests for CJK text, code blocks, markdown tables, YAML frontmatter, and oversized sections.
- Add ingest tests proving source segments carry stable locators and hashes.
- Run `uv run --group dev python -m pytest tests/test_phase1_acceptance.py tests/test_semantic_index_service.py`.
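A heading-aware chunker of the kind described above might look roughly like this. It is a simplified sketch under stated assumptions: `chunk_markdown`, `_emit`, and the chunk dictionary keys are hypothetical names, and it covers only heading breadcrumbs, code fences, and line locators (no tables, frontmatter, overlap, or oversized-section splitting).

```python
import re

FENCE = "`" * 3  # built at runtime so this sample can sit inside a fenced block
HEADING = re.compile(r"^(#{1,6})\s+(.*\S)\s*$")


def _emit(chunks, breadcrumbs, body, start_line, end_line):
    """Append a chunk if the collected body has any non-blank content."""
    content = "\n".join(body).strip()
    if content:
        chunks.append({
            "breadcrumbs": [title for _, title in breadcrumbs],
            "start_line": start_line,  # heading line (or 1 for any preamble)
            "end_line": end_line,
            "text": content,
        })


def chunk_markdown(text):
    """Split Markdown into heading-scoped chunks with breadcrumbs and line locators."""
    chunks, breadcrumbs, body = [], [], []
    start_line, in_fence, number = 1, False, 0
    for number, line in enumerate(text.splitlines(), start=1):
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence
        # Lines inside a code fence are never treated as headings.
        match = None if in_fence else HEADING.match(line)
        if match is None:
            body.append(line)
            continue
        _emit(chunks, breadcrumbs, body, start_line, number - 1)
        level = len(match.group(1))
        # Drop breadcrumbs at the same or deeper level before pushing the new one.
        while breadcrumbs and breadcrumbs[-1][0] >= level:
            breadcrumbs.pop()
        breadcrumbs.append((level, match.group(2)))
        body, start_line = [], number
    _emit(chunks, breadcrumbs, body, start_line, number)
    return chunks


sample = "\n".join([
    "# Guide", "Intro text.", "## Setup",
    FENCE + "sh", "# not a heading", FENCE,
    "## Usage", "Run it.",
])
chunks = chunk_markdown(sample)
for chunk in chunks:
    print(chunk["breadcrumbs"], chunk["start_line"], chunk["end_line"])
```

Keeping the fence check ahead of the heading check is what prevents a shell comment inside a code block from starting a new chunk.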
Status: completed
Goal: absorb upstream llm_wiki source-hardening lessons without shifting Memory Substrate into a desktop knowledge collector.
Boundary: adapters and projections only. Canonical memory objects remain independent of document extraction libraries.
Deliverables:
- Add robust frontmatter parsing/sanitizing for LLM-generated or imported markdown projections.
- Evaluate PDF/DOCX/XLSX extraction dependencies for source capture; keep them behind ingest adapters and not core storage.
- Treat multimodal image extraction/captioning as optional evidence capture for document-heavy knowledge work, not a memory-core prerequisite.
- Add source deletion/cascade cleanup semantics only after source manifests and provenance policies are explicit.
Verification:
- Add projection tests for fenced YAML, misplaced `frontmatter:`, wikilink lists, and malformed frontmatter fallback.
- Add dependency decision notes before adding document extraction packages.
- Run `uv run --group dev python -m pytest tests/test_obsidian_projection.py tests/test_structure_validation.py`.
Status: completed
Goal: make memory_maintain report surface graph-health issues that agents can act on.
Boundary: maintain/report output first. Defer visualization and UI concerns to product layers.
Deliverables:
- Add deterministic graph health insights to `memory_maintain report`: isolated nodes, sparse clusters, bridge nodes, and weakly connected scopes.
- Evaluate a local Python graph analysis library, such as `networkx`, before considering UI-oriented graphology/sigma dependencies.
- Keep graph insights as maintain/report output for agents first; defer visualization to a separate product layer.
Verification:
- Add graph-health report tests with a small synthetic memory graph.
- Run `uv run --group dev python -m pytest tests/test_maintain_service.py tests/test_graph_health_report.py`.
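Two of the graph-health signals above, isolated nodes and disconnected groups, can be illustrated with the standard library alone. This is a hedged sketch rather than the project's report code: `graph_health`, its return keys, and the sample node ids are hypothetical, and it assumes every edge endpoint is listed in `nodes`.

```python
from collections import defaultdict


def graph_health(nodes, edges):
    """Return isolated nodes and connected components of an undirected view."""
    neighbors = defaultdict(set)
    for source, target in edges:
        neighbors[source].add(target)
        neighbors[target].add(source)

    # A node with no neighbors at all is isolated.
    isolated = sorted(node for node in nodes if not neighbors[node])

    seen = set()
    components = []
    for node in sorted(nodes):
        if node in seen:
            continue
        # Iterative depth-first search collects one component.
        stack, component = [node], []
        seen.add(node)
        while stack:
            current = stack.pop()
            component.append(current)
            for neighbor in neighbors[current]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        components.append(sorted(component))
    return {"isolated_nodes": isolated, "components": components}


report = graph_health(["a", "b", "c", "d"], [("a", "b"), ("b", "c")])
print(report)
```

Bridge nodes and sparse clusters need more machinery, which is one reason a library evaluation is listed as a deliverable.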
Status: completed
Goal: prevent long agent prompts, system instructions, and scratchpads from polluting memory_query retrieval.
Boundary: query-service hardening only. Do not add LLM query rewriting or a new retrieval library for this slice.
Deliverables:
- Review MemPalace design lessons and capture them in `docs/research/2026-04-30-mempalace-design-review.md`.
- Sanitize long `memory_query search` text before query planning.
- Sanitize long `memory_query context` task text before context building.
- Return `query_sanitizer` diagnostics and warnings when sanitization occurs.
- Update MCP usage and API docs.
Verification:
- Add focused tests for labeled long-prompt sanitization in search and context.
- Run `uv run --group dev python -m pytest tests/test_query_normalization.py::QueryNormalizationTest::test_search_sanitizes_long_agent_prompt_before_planning_terms tests/test_query_normalization.py::QueryNormalizationTest::test_context_sanitizes_long_agent_prompt_and_reports_diagnostics`.
Status: completed
Goal: make memory_ingest outputs self-describing across repos, markdown, conversations, and future source adapters.
Boundary: adapter metadata and source payloads only. Do not add heavy document extraction dependencies in this slice.
Deliverables:
- Define adapter metadata fields: adapter name, adapter version, supported mode, declared transformations, privacy class, and origin classification.
- Attach adapter metadata to repo and markdown ingested sources.
- Add deterministic freshness/currentness hints where available.
- Update `docs/mcp-api-reference.md` and `docs/agent-memory-mcp-usage.md`.
Verification:
- Add source ingest tests for repo and markdown adapter metadata.
- Run focused source metadata tests in `tests/test_phase1_acceptance.py`.
Status: completed
Goal: evolve memory_query context into budgeted work-ready context instead of a flat item list.
Boundary: context pack contract and query output shape. Do not add UI or visualization.
Deliverables:
- Define context tiers for policy, active task, decisions, procedures, evidence, open work, and deep-search hints.
- Keep compact defaults and bounded excerpts.
- Preserve existing fields during the transition where practical.
- Update MCP resources so an agent with no repo context can still use the tiers correctly.
Verification:
- Add context budget and tier-order tests.
- Run focused context pack contract tests.
Status: completed
Goal: make semantic and graph indexes auditable, rebuildable, and measurable.
Boundary: local diagnostics and small deterministic benchmark data. Do not introduce hosted services.
Deliverables:
- Add derived-index repair checks that compare index counts against canonical objects before destructive rebuilds.
- Add planted-needle retrieval benchmark cases for lexical, semantic, and hybrid retrieval.
- Report recall and latency separately per retrieval stream.
- Document when to run benchmarks and how to interpret regressions.
Verification:
- Add repair-safety tests for missing or stale semantic index entries.
- Add a small benchmark smoke test that runs without network access.
Status: completed
Goal: surface entity confusion, stale facts, and relationship mismatches without automatic mutation.
Boundary: advisory memory_maintain report output only. Do not auto-contest, supersede, or merge facts.
Deliverables:
- Report similar entity names that may cause incorrect recall.
- Report stale active facts using `valid_until`, `last_verified_at`, status, and evidence age where available.
- Report relationship mismatches for structured claims with clear subject/predicate/object conflicts.
- Add next-action guidance for promote, contest, supersede, or keep-both review.
Verification:
- Add maintain report tests with synthetic entity-confusion and stale-fact fixtures.
- Run focused maintain report fact-check test.
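The stale-fact check described in the deliverables could be approximated as below. This is an advisory-only sketch: the function name, the reason strings, and the 180-day threshold are assumptions, and only `valid_until`, `last_verified_at`, and `status` come from the plan above.

```python
from datetime import datetime, timedelta, timezone


def stale_fact_reasons(fact, now, max_unverified_age=timedelta(days=180)):
    """Return advisory reasons why an active fact may be stale; never mutates it."""
    reasons = []
    if fact.get("status") != "active":
        return reasons  # only active facts are reported
    valid_until = fact.get("valid_until")
    if valid_until is not None and valid_until < now:
        reasons.append("valid_until_passed")
    last_verified_at = fact.get("last_verified_at")
    if last_verified_at is None:
        reasons.append("never_verified")
    elif now - last_verified_at > max_unverified_age:
        reasons.append("verification_too_old")
    return reasons


now = datetime(2026, 5, 1, tzinfo=timezone.utc)
old_fact = {
    "status": "active",
    "valid_until": datetime(2026, 4, 1, tzinfo=timezone.utc),
    "last_verified_at": datetime(2025, 1, 1, tzinfo=timezone.utc),
}
print(stale_fact_reasons(old_fact, now))
```

Returning reasons instead of changing `status` keeps the check inside the advisory boundary: a reviewer still decides whether to promote, contest, supersede, or keep both.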
Status: completed
Goal: reduce memory_query context response size so MCP callers spend less context on duplicated section data.
Boundary: response shape and documentation only. Do not remove compact item details or require an LLM summarizer.
Deliverables:
- Measure context response field sizes and identify duplicated section payloads.
- Convert `context_tiers` from copied section lists into compact directory metadata.
- Convert top-level `decisions`, `procedures`, and `open_work` into id directories that point back into `items`.
- Clip context item summaries to keep default context compact.
- Update MCP docs and agent resources.
Verification:
- Add regression coverage that context tiers do not duplicate section summaries.
- Add payload budget coverage for large context responses.
- Measure sample context payload reduction from about 16.2 KB to about 7.2 KB.
Status: completed
Goal: reconnect the LLM Wiki crystallization loop by surfacing repeated source concepts without automatically mutating durable memory.
Boundary: deterministic advisory discovery only. Do not add a required LLM API key and do not auto-promote candidates into canonical memory.
Deliverables:
- Add reusable concept candidate discovery over source segments, headings, and existing memory text.
- Surface global `concept_candidates` from `memory_maintain report`.
- Surface current-source `memory_suggestions.concept_candidates` from `memory_ingest`.
- Suppress candidates already represented by concept knowledge or concept nodes.
- Document that candidates require agent/human review before `memory_remember`.
Verification:
- Add maintain report tests for repeated uncrystallized concepts and existing concept suppression.
- Add repo ingest test proving source-local advisory concept candidates are returned.
- Run focused red-green tests for the new behavior.
Status: completed
Goal: make advisory candidates actionable for agents without letting candidates become automatic canonical memory.
Boundary: response guidance and agent workflow only. Do not add an automatic write path, mandatory LLM key, or background agent.
Deliverables:
- Add `review_guidance` outcomes for concept, procedure, decision, merge, and skip.
- Add `suggested_memory.input_data` with reason, memory source, scope refs, evidence refs, status, confidence, and editable fields.
- Infer candidate scope refs from repo/document nodes when available and fall back to source ids.
- Document the candidate review flow in MCP docs, agent resources, and memory policy.
- Dogfood candidate discovery on `wiki-memory`, `llm_wiki`, and `mempalace` using a temporary memory root.
Verification:
- Add regression coverage for executable candidate review payloads.
- Run focused red-green tests for candidate review payloads.
Status: completed
Goal: make candidate discovery more stable and useful by classifying, ranking, and diagnosing candidate quality.
Boundary: deterministic candidate quality only. Do not add a required LLM classifier or automatic durable writes.
Deliverables:
- Add `candidate_type` hints for concept, procedure, decision, tool/library, and implementation detail candidates.
- Add `ranking_signals` with score bonuses and penalties.
- Rank stable concepts/procedures/decisions ahead of tool/library and version/package details.
- Add `candidate_diagnostics.skipped` so filtered phrases are explainable.
- Update MCP docs and agent resources.
Verification:
- Add tests for classification, ranking, diagnostics, and ingest response shape.
- Dogfood against `wiki-memory`, `llm_wiki`, and `mempalace`.
Status: completed
Goal: turn advisory soft duplicate candidates into an explicit reviewed maintenance workflow.
Boundary: explicit review outcomes only. Do not let merge_duplicates auto-merge unstructured soft duplicates.
Deliverables:
- Add `memory_maintain resolve_duplicates` for reviewed soft duplicate candidates.
- Support `supersede`, `keep_both`, and `contest` outcomes.
- Require non-empty review reasons and current soft duplicate candidate ids.
- Keep curated replacement as an explicit `memory_remember knowledge` write followed by `memory_remember supersede`.
- Update MCP docs, agent usage docs, and built-in resources.
Verification:
- Add lifecycle tests for supersede, keep_both, and rejecting non-candidate pairs.
- Add MCP dispatch/schema/apply guard tests.
Status: completed
Goal: make soft duplicate report entries self-guiding so agents can safely review and resolve them.
Boundary: response guidance only. Do not auto-resolve duplicates and do not add an LLM reviewer.
Deliverables:
- Add `review_guidance` to each soft duplicate candidate.
- Add editable `suggested_resolution` payloads for `memory_maintain resolve_duplicates`.
- Add `next_actions` for review, outcome selection, and explicit resolution.
- Update MCP docs, agent usage docs, and built-in resources.
Verification:
- Add maintain report coverage for guidance and suggested resolution payloads.
Status: completed
Goal: retire bad or untrusted sources without deleting canonical history, while making affected knowledge explicit.
Boundary: safe archive semantics only. Do not physically delete sources or automatically downgrade mixed-evidence knowledge.
Deliverables:
- Add `memory_maintain archive_source` with required `source_id`, `reason`, and `options.apply=true`.
- Mark the source `archived` with an audit reason.
- Mark knowledge `stale` only when all evidence refs depend on the archived source.
- Return `partially_affected_knowledge_ids` for mixed-evidence knowledge requiring review.
- Update MCP docs, agent usage docs, policy, and built-in resources.
Verification:
- Add lifecycle coverage for archive and cascade behavior.
- Add MCP dispatch/schema/apply guard coverage.
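The cascade rule above (mark knowledge stale only when every evidence ref depends on the archived source) can be sketched as a pure planning step. The function name, the `evidence_source_ids` field, and the `stale_knowledge_ids` key are hypothetical; only `partially_affected_knowledge_ids` is named in the plan.

```python
def plan_archive_cascade(source_id, knowledge_items):
    """Classify knowledge affected by archiving one source, without mutating anything."""
    fully_dependent, partially_affected = [], []
    for item in knowledge_items:
        evidence_sources = set(item["evidence_source_ids"])
        if source_id not in evidence_sources:
            continue  # untouched by this archive
        if evidence_sources == {source_id}:
            # Every evidence ref points at the archived source: safe to mark stale.
            fully_dependent.append(item["id"])
        else:
            # Mixed evidence: leave status alone and flag for review.
            partially_affected.append(item["id"])
    return {
        "stale_knowledge_ids": fully_dependent,
        "partially_affected_knowledge_ids": partially_affected,
    }


# Hypothetical knowledge records with their evidence source ids.
items = [
    {"id": "know:1", "evidence_source_ids": ["src:bad"]},
    {"id": "know:2", "evidence_source_ids": ["src:bad", "src:good"]},
    {"id": "know:3", "evidence_source_ids": ["src:good"]},
]
plan = plan_archive_cascade("src:bad", items)
print(plan)
```

Separating the plan from the write makes it straightforward to return the same payload as a dry run when `options.apply` is not set.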
Status: completed
Goal: make graph relation edges explainable and backend-independent without promoting derived graph data to canonical storage.
Boundary: relation provenance contract only. Do not add graph-table migration complexity or make Kuzu the canonical store.
Deliverables:
- Add `payload.relation_schema.version` to synced graph relations.
- Record derivation kind: canonical relation, field reference, evidence ref, or structured payload.
- Record origin object type/id, origin field, and endpoint canonical object types.
- Preserve existing relation payload fields such as `knowledge_id`.
- Document relation provenance for agents and MCP callers.
Verification:
- Add graph sync coverage for relation provenance schema.
- Run graph sync regression tests.
Status: completed
Goal: make ingest return an explicit handoff from captured evidence to agent-reviewed durable memory writes.
Boundary: protocol metadata only. Do not add a mandatory LLM dependency or let ingest decide what should be remembered.
Deliverables:
- Add `memory_suggestions.agent_extraction` with protocol version and source id.
- Document the boundary between ingest, agent analysis, and governed remember writes.
- Include required steps for source inspection, existing-memory query, candidate preparation, and reviewed remember.
- Include a `remember_write_contract` with required and recommended fields.
- Update API docs, agent usage docs, policy, and MCP resources.
Verification:
- Add acceptance coverage for the extraction protocol shape.
- Run candidate suggestion regression coverage.
Status: completed
Goal: provide a deterministic local benchmark for long-term maintenance signals.
Boundary: local read-only report benchmark only. Do not require network, optional embedding models, or real user memory roots.
Deliverables:
- Add `run_maintenance_dogfood_benchmark` under packaged experiment helpers.
- Seed synthetic cases for promotable candidates, low-evidence candidates, stale candidates, structured duplicate groups, and soft duplicate candidates.
- Return expected counts, observed counts, per-case checks, reference time, and mutation flag.
- Document the benchmark entrypoint in `experiments/README.md`.
Verification:
- Add benchmark test coverage.
- Run retrieval and maintenance benchmark tests.
Status: completed
Goal: reduce accidental context consumption from common MCP calls while keeping explicit expansion paths available.
Boundary: response-size controls only. Do not remove compact repo indexes, evidence locators, or explicit caller overrides for bounded non-repo objects.
Deliverables:
- Return explicit `page_unavailable/unsupported` for repo source `memory_query page` `detail=full`.
- Keep full detail available for bounded non-repo objects.
- Compress `memory_suggestions.agent_extraction` into a compact protocol with a resource pointer.
- Compress ingest `memory_suggestions.concept_candidates` into compact triage records and keep full write skeletons in `memory_maintain report`.
- Lower MCP default `search`, `recent`, and `graph` `max_items` from 20 to 10 while preserving explicit overrides.
- Update API docs, agent usage docs, policy, and MCP resources.
Verification:
- Add regression coverage for repo full-page unsupported semantics.
- Add regression coverage for compact ingest concept candidates.
- Add regression coverage for compact extraction protocol.
- Add MCP dispatch coverage for compact defaults and explicit `max_items`.
Status: completed
Goal: provide a deterministic local acceptance signal for the core memory loop across ingest, query, remember, maintain, reindex, and context retrieval.
Boundary: MCP dispatch workflow only. Do not require network access, optional embedding models, hosted services, or a real user memory root.
Deliverables:
- Add `run_end_to_end_dogfood_acceptance` under packaged experiment helpers.
- Seed a small repo that produces a compact concept candidate.
- Exercise `memory_ingest`, `memory_query search`, `memory_query page`, `memory_remember knowledge`, `memory_maintain report`, `memory_maintain reindex`, and `memory_query context`.
- Return per-step checks, object ids, observed ids, and compact payload sizes.
- Document the benchmark entrypoint in `experiments/README.md`.
Verification:
- Add benchmark coverage for the end-to-end MCP memory loop.
- Run retrieval benchmark tests and full test suite.
Status: completed
Goal: make the end-to-end dogfood helper actionable when it fails and enforce compact response budgets as part of acceptance.
Boundary: dogfood helper diagnostics only. Do not add new production MCP modes or optional backends.
Deliverables:
- Add `failed_checks` and diagnostic `next_actions` to the dogfood acceptance result.
- Add explicit `payload_budgets` for compact candidates and context payloads.
- Promote compact candidate and context payload budgets into acceptance checks.
- Document the diagnostic fields in `experiments/README.md`.
Verification:
- Add benchmark assertions for failed check summaries, next actions, and payload budgets.
- Run retrieval benchmark tests and full test suite.
Status: completed
Goal: make the end-to-end dogfood helper safe to run repeatedly under the same local parent directory.
Boundary: experiment helper isolation only. Do not change production MCP storage roots or canonical object identity semantics.
Deliverables:
- Create an isolated `dogfood-runs/run-NNNN` directory for each helper invocation.
- Return `run_root` in the dogfood acceptance result for diagnostics.
- Document repeatable run behavior in `experiments/README.md`.
Verification:
- Add coverage that the helper can run twice in the same parent root.
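One way to allocate the per-run directory is sketched below. It is illustrative only: `create_run_root` is a hypothetical helper, and starting at `run-0001` with four-digit zero padding is an assumption about what `run-NNNN` means.

```python
import re
import tempfile
from pathlib import Path

RUN_NAME = re.compile(r"^run-(\d{4})$")


def create_run_root(parent):
    """Create and return the next unused dogfood-runs/run-NNNN directory under parent."""
    runs_dir = Path(parent) / "dogfood-runs"
    runs_dir.mkdir(parents=True, exist_ok=True)
    used = [
        int(match.group(1))
        for path in runs_dir.iterdir()
        if (match := RUN_NAME.match(path.name))
    ]
    next_number = max(used, default=0) + 1
    run_root = runs_dir / f"run-{next_number:04d}"
    # mkdir without exist_ok raises if another process claimed the same number.
    run_root.mkdir()
    return run_root


with tempfile.TemporaryDirectory() as parent:
    first = create_run_root(parent)
    second = create_run_root(parent)
    print(first.name, second.name)
```

Because each invocation gets a fresh directory, object ids and derived indexes from an earlier run cannot leak into the next one.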
Status: completed
Goal: verify the package works from a real MCP stdio host shape and keep the release path testable without adopting Neo4j, Graphiti, or UI work.
Boundary: host smoke, release docs, and local verification only. Do not add new production backends, UI surfaces, hosted LLM providers, or mandatory optional dependencies.
Deliverables:
- Add a packaged MCP host smoke helper that starts the server over stdio, initializes an MCP client session, lists tools/resources, and calls representative tools without passing a root in tool args.
- Verify the smoke helper binds `MEMORY_SUBSTRATE_ROOT` at process startup and mutates only the supplied temporary/local root.
- Document when to run the host smoke and release checks.
- Keep README navigational and avoid duplicating the full MCP API reference.
Verification:
- Add automated coverage for the host smoke helper.
- Run focused MCP host smoke tests.
- Run the full test suite.
- Run `uv build`.
Status: pending
Goal: add a low-frequency maintenance flow for using one configured external wiki, such as an Obsidian vault folder, without making wiki files canonical memory.
Boundary: projection maintenance only. Do not add top-level MCP tools, do not restore wiki-first storage, do not make an internal wiki projection a required path, and do not automatically write wiki edits back into canonical memory.
Design decisions:
- Use existing `memory_maintain` modes rather than new top-level tools.
- Store configuration in the memory root at `memory/config.json`.
- Support one configured wiki projection target with `path` and `format`.
- Treat the external wiki as a projection target, not canonical storage.
- Protect user notes with a projection manifest so render only manages files previously generated by Memory Substrate.
- Default reconciliation to report-only output with candidates and conflicts; canonical writes still go through reviewed `memory_remember` or explicit apply modes.
Planned modes:
- Extend `memory_maintain configure` with `wiki_projection.path` and `wiki_projection.format`.
- Add `memory_maintain render_projection` for canonical memory -> configured external wiki.
- Add `memory_maintain reconcile_projection` for configured external wiki -> diff report, conflicts, and remember candidates.
Verification:
- Add config repository tests for wiki projection settings.
- Add render tests proving generated files are manifest-bound and canonical objects are unchanged.
- Add reconcile tests proving report-only behavior does not mutate canonical memory.
- Update MCP API docs and agent usage docs.
Status: completed
Goal: make deferred MCP hosts more likely to discover Memory Substrate as the persistent agent memory server before agents fall back to shell diagnostics.
Boundary: server/tool descriptions, MCP resources, and docs only. Do not change tool count, mode schemas, storage behavior, or Codex host configuration.
Deliverables:
- Add discovery keywords to server instructions and tool descriptions.
- Add a tool-discovery rule to the built-in agent playbook.
- Document that hosts may defer MCP tools and agents should search for `memory-substrate` before using shell fallbacks.
Verification:
- Add MCP server/resource tests for discovery text.
- Run focused MCP server tests.
- Run full tests.