CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Commands

# Run all tests
python -m pytest tests/ -q

# Run a single test file
python -m pytest tests/test_cso_store.py -v

# Run a single test by name
python -m pytest tests/test_cso_store.py::test_write_and_query -v

# Start MCP server (stdio transport, used by Cursor/Claude/Windsurf)
python -m hcr.product.integrations.mcp_server_stdio

# CLI
hcr init              # initialize .hcr/ in current project
hcr status            # show cognitive state summary
hcr resume            # show resume context

Architecture

HCR is a persistent developer memory layer exposed as an MCP server. The key insight: AI tools are stateless, so HCR provides the stateful substrate they lack.

Storage: CSOs (Cognitive State Objects)

hcr/engine/cso/ is the core data layer. A CSO (cso_model.py) is a typed, causally linked record; types include DECISION, OBSERVATION, CONSTRAINT, RISK, TASK, OUTCOME, CLAIM, INTENT, ROLLBACK, and TRIGGER. Each CSO has explicit causal_in/causal_out edge lists, which together form a directed causal graph. CSOStore (cso_store.py) persists CSOs in SQLite (WAL mode) at .hcr/cso.db, with indexes on type, created_at, and scope.
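
To make the storage model concrete, here is a minimal sketch of a CSO-like record persisted to SQLite in WAL mode with the three indexes named above. It is not the code in cso_model.py or cso_store.py: the `content` field, the table schema, and the helper names `open_store`, `write_cso`, and `query_by_type` are assumptions made for illustration.

```python
import json
import sqlite3
import time
import uuid
from dataclasses import dataclass, field


@dataclass
class CSO:
    """Hypothetical stand-in for the record defined in cso_model.py."""
    type: str                                             # e.g. "DECISION", "RISK"
    content: str                                          # assumed free-text payload
    scope: str = "project"                                # assumed scope label
    causal_in: list[str] = field(default_factory=list)    # IDs of causes
    causal_out: list[str] = field(default_factory=list)   # IDs of effects
    id: str = field(default_factory=lambda: uuid.uuid4().hex)
    created_at: float = field(default_factory=time.time)


def open_store(path: str) -> sqlite3.Connection:
    """Open a SQLite database in WAL mode and create the table and indexes."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS cso ("
        "id TEXT PRIMARY KEY, type TEXT, content TEXT, scope TEXT, "
        "causal_in TEXT, causal_out TEXT, created_at REAL)"
    )
    for column in ("type", "created_at", "scope"):
        conn.execute(f"CREATE INDEX IF NOT EXISTS idx_cso_{column} ON cso({column})")
    return conn


def write_cso(conn: sqlite3.Connection, cso: CSO) -> None:
    """Insert one record; edge lists are stored as JSON text."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO cso VALUES (?, ?, ?, ?, ?, ?, ?)",
            (cso.id, cso.type, cso.content, cso.scope,
             json.dumps(cso.causal_in), json.dumps(cso.causal_out), cso.created_at),
        )


def query_by_type(conn: sqlite3.Connection, cso_type: str) -> list[CSO]:
    """Return all records of one type, oldest first."""
    rows = conn.execute(
        "SELECT id, type, content, scope, causal_in, causal_out, created_at "
        "FROM cso WHERE type = ? ORDER BY created_at",
        (cso_type,),
    ).fetchall()
    return [
        CSO(id=r[0], type=r[1], content=r[2], scope=r[3],
            causal_in=json.loads(r[4]), causal_out=json.loads(r[5]), created_at=r[6])
        for r in rows
    ]
```

Storing the edge lists as JSON text keeps the sketch short; the real store may well use a separate edge table.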

Memory Fabric: `hcr/engine/memory/`

The Cognitive State Fabric (CSF) — the intelligence layer over raw CSO storage:

centrality.py — CausalCentralityScorer: BFS transitive reachability on causal edges. CSOs that caused the most downstream effects score highest (0–1).
projection.py — CognitiveProjection: replaces naive facts[-10:] with centrality-ranked, decay-filtered live state. Called on every hcr_get_state and hcr_preflight.
prefetch.py — ProjectionPrefetcher: background thread triggered on file edit events. Caches CognitiveProjection so the next tool call hits cache (zero compute latency).
embedding_store.py — EmbeddingStore: sqlite-vec ANN store at .hcr/embeddings.db. Embeds qualifying CSO tiers (commit/task/decision/constraint/risk/edited) via sentence-transformers (all-MiniLM-L6-v2, 384-dim), falling back to Ollama nomic-embed-text. Only CSOs in those tiers are embedded. Wired into CognitiveProjection as 3rd RRF retrieval channel.
implicit_graph.py — generate_soft_links: called in _handle_file_edit after embedding; auto-detects semantic soft causal edges (similarity > 0.55 threshold via 1/(1+distance)) and back-writes them to the CSO.
fusion.py — reciprocal_rank_fusion (RRF, accepts optional learned weights tuple) and mmr_select (Maximum Marginal Relevance): fuse semantic + causal ranked lists; select diverse results. RRF called by CognitiveProjection (3-way: decay + BM25 + embedding) and cso_impact. MMR called after cross-encoder reranking in projection.
prospective.py — get_triggered_csos: returns TRIGGER CSOs matching the active file pattern; injected at rank-0 by CognitiveProjection.
feedback.py — FeedbackStore: SQLite store recording preflight retrievals and session outcomes. After TRAINING_THRESHOLD=50 labelled samples, trains learned RRF weights via least-squares. HCREngine._feedback_store owns the instance.
sync.py — EmbeddingSync: fire-and-forget Supabase pgvector push via daemon threads. Only (org_id, project_id, cso_id, embedding) pushed — no raw content. Wired into EmbeddingStore._sync when SUPABASE_URL/SUPABASE_KEY configured.
cpap.py — CausalPrefixAlignmentProtocol: three-layer freeze-gated context serialization achieving ~98% prompt cache hit rate. Layers 1+2 are byte-identical between git commits; Layer 3 is the per-call delta.
reranker.py — CrossEncoderReranker: second-pass joint (query, candidate) relevance scoring via cross-encoder/ms-marco-MiniLM-L-6-v2 (22 MB). Called after RRF fusion in CognitiveProjection. Graceful identity-order fallback on import error.
episode_store.py — Removed. Episode detection (BOCPD) deprecated; functionality moved to event_store.py.
commit_extractor.py — ExtractedSignal: regex-based extraction of decisions, constraints, and risks from git commit messages. Scans trailers (Decided-to:, Constraint:) and body patterns; returns typed CSOs with confidence scores.
merger.py — StateMerger: cross-project CSO state merger with dedup (exact-ID, path+text for observations) and conflict resolution (confidence then timestamp for decisions/tasks). Emits conflict OBSERVATION CSOs.
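
Two of the techniques in the list above, BFS reachability centrality and reciprocal rank fusion, can be sketched in a few lines. This is an illustration under stated assumptions rather than the code in centrality.py or fusion.py: normalizing by the number of other nodes, the constant `k=60`, the alphabetical tie-break, and both function signatures are choices made here.

```python
from collections import deque


def reachability_centrality(causal_out: dict[str, list[str]]) -> dict[str, float]:
    """Score each node by the fraction of other nodes reachable via causal_out edges."""
    nodes = set(causal_out) | {n for outs in causal_out.values() for n in outs}
    denom = max(len(nodes) - 1, 1)  # assumed normalization into the 0-1 range
    scores = {}
    for start in nodes:
        seen = {start}
        queue = deque([start])
        while queue:  # plain BFS over outgoing causal edges
            node = queue.popleft()
            for nxt in causal_out.get(node, []):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        scores[start] = (len(seen) - 1) / denom
    return scores


def reciprocal_rank_fusion(ranked_lists, weights=None, k=60):
    """Fuse ranked ID lists: score(d) = sum_i w_i / (k + rank_i(d)), ranks 1-based."""
    if weights is None:
        weights = (1.0,) * len(ranked_lists)
    scores = {}
    for weight, ranking in zip(weights, ranked_lists):
        for rank, item in enumerate(ranking, start=1):
            scores[item] = scores.get(item, 0.0) + weight / (k + rank)
    # Highest fused score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda item: (-scores[item], item))


# Usage: a three-node causal chain and three retrieval channels.
centrality = reachability_centrality({"a": ["b"], "b": ["c"]})
fused = reciprocal_rank_fusion([["x", "y", "z"], ["y", "x"], ["y", "z"]])
```

Passing a `weights` tuple plays the role of the learned weights mentioned for fusion.py; with `weights=None` every channel counts equally.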

Causal Analysis: `hcr/engine/causal/`

Static and runtime causal analysis over the codebase dependency graph:

ast_extractor.py — DependencyExtractor: AST-based extraction of imports and function calls from Python source files to build the causal graph.
dependency_graph.py — DependencyGraph: in-memory directed graph of file/module dependencies (forward + reverse edges) with support for latent links discovered by neural inference.
impact_analyzer.py — ImpactAnalyzer: BFS traversal of reverse dependencies to predict the ripple effect of a file change up to configurable depth.
event_store.py — EventStore: append-only JSONL event log (causal_events.jsonl) for temporal reasoning and time-travel over cognitive/file state changes.
cso_impact.py — query_cso_impact: CSO-graph-based impact analysis (v2.0). Semantic attention BFS over causal_in edges with optional embedding similarity gating and RRF fusion with ANN search.
metrics.py — MetricsAnalyzer: calculates fragility (AST complexity, incoming dependencies) and centrality scores for files in the causal graph.
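
The pipeline described above (extract imports from source, build reverse edges, then walk them breadth-first to a depth limit) can be sketched as follows. The function names `extract_imports`, `build_reverse_graph`, and `impacted`, and the default `max_depth=3`, are hypothetical; the real DependencyExtractor also tracks function calls, which this sketch omits.

```python
import ast
from collections import deque


def extract_imports(source: str) -> set[str]:
    """Return the module names imported by a piece of Python source."""
    modules = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            modules.add(node.module)  # bare relative imports have no module name
    return modules


def build_reverse_graph(sources: dict[str, str]) -> dict[str, set[str]]:
    """Map each module to the set of modules that import it."""
    reverse = {name: set() for name in sources}
    for name, source in sources.items():
        for dep in extract_imports(source):
            if dep in sources:  # ignore stdlib and third-party imports
                reverse[dep].add(name)
    return reverse


def impacted(reverse: dict[str, set[str]], changed: str, max_depth: int = 3) -> dict[str, int]:
    """BFS over reverse edges; returns {module: distance} up to max_depth."""
    depths = {changed: 0}
    queue = deque([changed])
    while queue:
        node = queue.popleft()
        if depths[node] >= max_depth:
            continue  # do not expand beyond the configured depth
        for dependent in reverse.get(node, ()):
            if dependent not in depths:
                depths[dependent] = depths[node] + 1
                queue.append(dependent)
    del depths[changed]  # the changed module is not part of its own ripple
    return depths


# Usage: a four-module chain, cli -> server -> engine -> store.
sources = {
    "store": "import sqlite3\n",
    "engine": "from store import CSOStore\nimport json\n",
    "server": "import engine\n",
    "cli": "import server\n",
}
ripple = impacted(build_reverse_graph(sources), "store", max_depth=2)
```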

Core Engine: `hcr/engine/core/`

hco_engine.py was removed. Legacy operator stubs (BaseOperator, CompositeOperator, PolicySelector) are deprecated. Core engine logic now lives entirely in engine_api.py.

LLM Abstraction: `hcr/engine/llm/`

llm_provider.py — unified synchronous interface (LLMResponse) for all LLM providers (Groq, Google, Ollama). Handles markdown code-fence stripping and JSON parsing of responses.
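
As an illustration of the fence stripping and JSON parsing mentioned above, here is a minimal sketch. `ParsedResponse`, `strip_code_fence`, and `parse_response` are hypothetical names; the real LLMResponse fields and error handling may differ.

```python
import json
import re
from dataclasses import dataclass
from typing import Any, Optional

TICKS = "`" * 3  # three backticks, built indirectly so this example stays fence-safe
_FENCE = re.compile(r"^" + TICKS + r"[\w-]*[ \t]*\n(.*?)\n?" + TICKS + r"$", re.DOTALL)


@dataclass
class ParsedResponse:
    """Hypothetical result type, not the real LLMResponse."""
    text: str            # response with any surrounding code fence removed
    data: Optional[Any]  # parsed JSON, or None when the text is not valid JSON


def strip_code_fence(raw: str) -> str:
    """Remove one surrounding markdown code fence, if present."""
    text = raw.strip()
    match = _FENCE.match(text)
    return match.group(1).strip() if match else text


def parse_response(raw: str) -> ParsedResponse:
    """Strip the fence, then attempt to parse the remainder as JSON."""
    text = strip_code_fence(raw)
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        data = None  # keep the raw text available to the caller
    return ParsedResponse(text=text, data=data)
```

Returning `data=None` instead of raising lets callers fall back to the plain text when a provider ignores the requested JSON format.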

Engine: `hcr/engine/engine_api.py`

HCREngine is the central object. Key responsibilities:

Owns _cso_store, _prefetcher, _embedding_store, _feedback_store
_handle_file_edit() is the hot path: writes OBSERVATION CSO → symbolic verifier → prefetcher → embeds CSO → generates soft-links (back-written to CSO causal_in)
_extract_task() and _calculate_progress() filter edited: facts to paths that exist under project_path — prevents cross-project state bleed from stale causal_events.jsonl
Parallel legacy state in CognitiveState (symbolic facts, causal graph) is kept for backwards compatibility, but CSOs are the v2.0 source of truth
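
The path filter that guards against cross-project state bleed could look roughly like this. It is a sketch, not the body of _extract_task() or _calculate_progress(): the function name `filter_edited_facts` is hypothetical, and passing non-edit facts through unchanged is an assumption.

```python
from pathlib import Path


def filter_edited_facts(facts: list[str], project_path: str) -> list[str]:
    """Keep 'edited:' facts only when their path exists under project_path."""
    root = Path(project_path).resolve()
    kept = []
    for fact in facts:
        if not fact.startswith("edited:"):
            kept.append(fact)  # assumption: other fact tiers pass through untouched
            continue
        candidate = Path(fact[len("edited:"):].strip())
        if not candidate.is_absolute():
            candidate = root / candidate
        candidate = candidate.resolve()
        # Drop paths outside the project and paths that no longer exist.
        if candidate.is_relative_to(root) and candidate.exists():
            kept.append(fact)
    return kept
```

Resolving both paths before comparing them means symlinked project roots and `..` segments cannot smuggle a foreign path past the check. `Path.is_relative_to` requires Python 3.9 or later.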

Symbolic Reasoning: `hcr/engine/symbolic/`

SymbolicVerifier (verifier.py) evaluates DEFAULT_RULES against a newly written CSO and emits RISK CSOs when rules fire. Rules live in rules.py. Rule evaluation failures are logged at DEBUG level (not silently swallowed).
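
The evaluate-and-emit loop can be sketched as follows. Both rules here are invented for illustration and are not taken from rules.py; the dict-based CSO shape and the `verify` function name are also assumptions.

```python
import logging
from dataclasses import dataclass
from typing import Callable, Optional

logger = logging.getLogger("hcr.symbolic.sketch")  # hypothetical logger name


@dataclass
class Rule:
    name: str
    check: Callable[[dict], Optional[str]]  # returns a risk message, or None


def _source_edit_without_test(cso: dict) -> Optional[str]:
    """Invented rule: flag edits to non-test Python files."""
    path = cso.get("path", "")
    if cso["type"] == "OBSERVATION" and path.endswith(".py") and "test" not in path:
        return f"{path} edited without a matching test change"
    return None


def _broken_rule(cso: dict) -> Optional[str]:
    """Invented rule that always raises, to exercise the failure path."""
    return cso["missing_key"]


RULES = [Rule("source_edit_without_test", _source_edit_without_test),
         Rule("broken", _broken_rule)]


def verify(cso: dict, rules: list[Rule] = RULES) -> list[dict]:
    """Evaluate each rule against one CSO; emit a RISK record per rule that fires."""
    risks = []
    for rule in rules:
        try:
            message = rule.check(cso)
        except Exception:
            # A failing rule is logged at DEBUG and skipped, not silently swallowed.
            logger.debug("rule %s failed on CSO %s", rule.name, cso.get("id"), exc_info=True)
            continue
        if message:
            risks.append({"type": "RISK", "content": message,
                          "causal_in": [cso["id"]], "rule": rule.name})
    return risks
```

Linking each RISK back to the triggering CSO through `causal_in` keeps the emitted risks inside the causal graph.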

Semantic Decay: `hcr/product/storage/semantic_decay.py`

Implements tier-based fact retention. Each fact prefix (commit:, task:, edited:, error:, cmd:, mcp_tool:, pattern:, observation:) has its own half-life, ranging from 7 days down to 15 min. CognitiveProjection adjusts this base half-life by centrality: effective_hl = base / max(1 - centrality*0.8, 0.2). High-centrality CSOs therefore decay more slowly, reaching 5x the base half-life at centrality 1.
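
A worked version of the half-life formula, as a small sketch. `effective_half_life` implements the formula as stated; `decay_weight` assumes plain exponential decay (the weight halves every effective half-life), which the source does not spell out.

```python
def effective_half_life(base_hl: float, centrality: float) -> float:
    """effective_hl = base / max(1 - centrality*0.8, 0.2), centrality in [0, 1]."""
    return base_hl / max(1 - centrality * 0.8, 0.2)


def decay_weight(age: float, base_hl: float, centrality: float) -> float:
    """Assumed exponential form: 0.5 ** (age / effective half-life)."""
    return 0.5 ** (age / effective_half_life(base_hl, centrality))


# Usage: a 15-minute tier (900 s) at three centrality levels.
FIFTEEN_MIN = 900.0
half_lives = {c: effective_half_life(FIFTEEN_MIN, c) for c in (0.0, 0.5, 1.0)}
```

At centrality 0 the half-life is unchanged, at 0.5 the divisor is 0.6, and at 1.0 the divisor bottoms out at the 0.2 floor.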

MCP Server: `hcr/product/integrations/`

HCRMCPResponder in mcp_server.py dispatches all MCP tool calls to modular BaseMCPTool subclasses in tools/, with roughly one file per tool class. There are 32 tools in total. Key tools:

hcr_get_state / hcr_preflight — main context-handoff tools for agents
hcr_preflight / hcr_postflight — agent lifecycle: preflight records retrieval for learned fusion; postflight records outcome signal; preflight hints hcr_set_trigger when no triggers active
hcr_record_file_edit, hcr_remember, hcr_fail, hcr_resolve — write-path tools
hcr_analyze_impact — causal BFS + RRF semantic fusion; file_path=null returns causal graph summary
hcr_set_trigger — creates TRIGGER CSOs for prospective memory (file-pattern → reminder injection)
hcr_read_decisions — JSONL decision log; source=cso or source=all also queries CSO store
hcr_get_recommendations — current task + next action + AI recommendations (consolidated from 3 tools)

All imports into mcp_server.py must be at module level (not inside functions/conditionals).

Project State on Disk

.hcr/
  cso.db          # SQLite CSO store (WAL mode, indexes: type/created_at/scope)
  embeddings.db   # sqlite-vec embedding store (WAL, synchronous=NORMAL)
  feedback.db     # learned fusion weights (FeedbackStore)
  state.json      # legacy CognitiveState (required for init check)
  decisions/      # legacy JSONL decision log (migrated to CSOs on init)

Project Knowledge Base: `docs/`

The docs/ directory is the primary project memory. Always consult it when you need context about architecture, goals, history, or current work. Key files:

docs/architecture.md — system design, component relationships
docs/NORTHSTAR.md — project vision, long-term direction
docs/product_roadmap.md — roadmap, priorities, milestones
docs/dev_log.md — development history, past decisions
docs/tasks.md — current work items
docs/strategic_vision.md — business strategy

Rule: When uncertain about project context, read the relevant docs/ file before proceeding. After making significant changes, update the corresponding doc to keep it current.

Tests

tests/conftest.py uses collect_ignore to skip non-pytest scripts. Tests are synchronous where possible; async tests use pytest-asyncio with asyncio_mode = "auto". Mock _get_embedding on EmbeddingStore so that tests do not need a running Ollama instance. The full suite runs in ~310 seconds (452 tests: 446 passed, 6 skipped; the skipped tests require a live LLM endpoint).
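
One way to mock _get_embedding is `unittest.mock.patch.object`, sketched below. The `EmbeddingStore` class here is a stand-in so the example runs on its own, and its `embed` method is hypothetical; in a real test you would import the class from the project instead (presumably from hcr/engine/memory/embedding_store.py).

```python
from unittest.mock import patch


class EmbeddingStore:
    """Stand-in for the real class so this example runs without HCR installed."""

    def _get_embedding(self, text: str) -> list[float]:
        raise RuntimeError("would call sentence-transformers or Ollama")

    def embed(self, text: str) -> list[float]:  # hypothetical public method
        return self._get_embedding(text)


def fake_embedding(text: str) -> list[float]:
    """Deterministic 384-dim vector; no model or network needed."""
    return [float(len(text) % 7)] * 384


def test_embed_uses_mocked_backend():
    # Patch on the class so every instance created in the test is covered.
    with patch.object(EmbeddingStore, "_get_embedding", side_effect=fake_embedding) as mocked:
        vector = EmbeddingStore().embed("hello")
    assert len(vector) == 384
    mocked.assert_called_once_with("hello")
```

A deterministic fake keeps similarity-dependent assertions stable across runs, which a random vector would not.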
