Commit a71b30b

Add HCR v2 docs, integrations; repo cleanup
Add comprehensive HCR v2 documentation and developer guidance (CLAUDE.md, HCR-Northstar-v2.md) and update local Claude settings. Introduce hcr.product.integrations and hcr.product.security packages and modify integration tools (file_tools, output_synthesizer, session_tools). Apply changes to engine layer (cso_store.py, verifier.py), update README and mcp_server_wrapper, and adjust tests (tests/test_mcp_init.py). Remove legacy product/ package files (large deletion of old product modules) as part of the refactor and clean up project configuration and scripts.
1 parent 2e758e0 commit a71b30b

74 files changed

Lines changed: 4603 additions & 16232 deletions

.claude/settings.local.json

Lines changed: 33 additions & 1 deletion
@@ -59,7 +59,39 @@
 "Bash(python3 -c \"from hcr.product.cli import main; print\\('✓ hcr.product subpackage imports'\\)\")",
 "PowerShell(Head *)",
 "PowerShell(python -m hcr.product.cli.main --help 2>&1)",
-"PowerShell(python -m hcr.product.integrations.mcp_server_stdio --help 2>&1)"
+"PowerShell(python -m hcr.product.integrations.mcp_server_stdio --help 2>&1)",
+"Bash(sed -i 's|from src\\\\.|from hcr.engine.|g' README.md)",
+"Bash(sed -i 's|from product\\\\.|from hcr.product.|g' README.md)",
+"Bash(sed -i 's|`src/|`hcr/engine/|g' README.md)",
+"Bash(sed -i 's|`product/|`hcr/product/|g' README.md)",
+"Bash(sed -i 's|from src\\\\.|from hcr.engine.|g' HybridCognitiveRuntime-Northstar.md)",
+"Bash(sed -i 's|from product\\\\.|from hcr.product.|g' HybridCognitiveRuntime-Northstar.md)",
+"Bash(sed -i 's|`src/|`hcr/engine/|g' HybridCognitiveRuntime-Northstar.md)",
+"Bash(sed -i 's|`product/|`hcr/product/|g' HybridCognitiveRuntime-Northstar.md)",
+"Bash(sed -i 's|from src\\\\.|from hcr.engine.|g' hcr/product/PRODUCT_SPEC.md)",
+"Bash(pip show *)",
+"PowerShell(Get-Content *)",
+"PowerShell(cd \"C:\\\\Users\\\\rishi\\\\Documents\\\\GitHub\\\\HybridCognitiveRuntime\"; $env:PYTHONIOENCODING=\"utf-8\"; python -m hcr.product.integrations.mcp_server_stdio --help 2>&1 | Select-Object -First 5)",
+"PowerShell(cd \"C:\\\\Users\\\\rishi\\\\Documents\\\\GitHub\\\\HybridCognitiveRuntime\"; $env:PYTHONIOENCODING=\"utf-8\"; python test_tools_smoke.py 2>&1 | Where-Object { $_ -notmatch 'RequestsDependencyWarning|requests\\\\.__init__|chardet|charset|RuntimeWarning|coroutine|warnings.warn' })",
+"PowerShell(cd \"C:\\\\Users\\\\rishi\\\\Documents\\\\GitHub\\\\HybridCognitiveRuntime\"; $env:PYTHONIOENCODING=\"utf-8\"; python test_tools_smoke.py 2>&1 | Where-Object { $_ -notmatch 'RequestsDependency|requests\\\\.__init__|chardet|charset|RuntimeWarning|coroutine|warnings.warn' })",
+"Bash(cd \"C:\\\\Users\\\\rishi\\\\Documents\\\\GitHub\\\\HybridCognitiveRuntime\")",
+"Bash(git rev-parse *)",
+"Bash(echo GIT_DIR=__TRACKED_VAR__)",
+"Bash(echo GIT_COMMON=__TRACKED_VAR__)",
+"Bash(git branch *)",
+"Bash(Get-ChildItem -Path \"C:\\\\Users\\\\rishi\\\\Documents\\\\GitHub\\\\HybridCognitiveRuntime\" -Force)",
+"Bash(Select-Object -First 20)",
+"Bash(python3)",
+"Bash(python3 -m pytest tests/test_cso_tools.py -v --tb=short)",
+"Bash(Get-ChildItem *)",
+"Bash(Select-Object FullName)",
+"Bash(git *)",
+"Bash(dir C:\\\\Users\\\\rishi\\\\Documents\\\\GitHub\\\\HybridCognitiveRuntime\\\\hcr\\\\product\\\\storage\\\\ *)",
+"WebSearch",
+"Bash(grep -n \"\\\\\\\\bos\\\\\\\\.\" hcr/product/integrations/tools/decision_tools.py)",
+"Bash(python3 -c \"import sqlite3; help\\(sqlite3.Connection.__enter__\\)\")",
+"Bash(ls *.md *.toml *.cfg *.ini)",
+"Bash(ls .cursor/ .github/)"
 ]
 },
 "enableAllProjectMcpServers": true,

CLAUDE.md

Lines changed: 88 additions & 0 deletions
@@ -0,0 +1,88 @@
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Commands

```bash
# Run all tests
python -m pytest tests/ -q

# Run a single test file
python -m pytest tests/test_cso_store.py -v

# Run a single test by name
python -m pytest tests/test_cso_store.py::test_write_and_query -v

# Start MCP server (stdio transport, used by Cursor/Claude/Windsurf)
python -m hcr.product.integrations.mcp_server_stdio

# CLI
hcr init     # initialize .hcr/ in current project
hcr status   # show cognitive state summary
hcr resume   # show resume context
```

## Architecture

HCR is a persistent developer memory layer exposed as an MCP server. The key insight: AI tools are stateless, so HCR provides the stateful substrate they lack.

### Storage: CSOs (Cognitive State Objects)

`hcr/engine/cso/` is the core data layer. A `CSO` (`cso_model.py`) is a typed, causally-linked record; types include `DECISION`, `OBSERVATION`, `CONSTRAINT`, `RISK`, `TASK`, `OUTCOME`, `CLAIM`, `INTENT`, `ROLLBACK`, `TRIGGER`. Each CSO has explicit `causal_in`/`causal_out` edge lists that together form a directed causal graph. `CSOStore` (`cso_store.py`) persists them in SQLite (WAL mode) at `.hcr/cso.db`, with indexes on `type`, `created_at`, and `scope`.
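
As a rough illustration of the storage layer described above, the sketch below persists a typed record with causal edges in SQLite, using WAL mode and the three indexes named in the text. The `MiniCSO` and `MiniCSOStore` names, the table layout, and the JSON encoding of the edge lists are assumptions made for this example; they are not the actual `cso_model.py` / `cso_store.py` schema.

```python
import json
import os
import sqlite3
import tempfile
import time
import uuid
from dataclasses import dataclass, field


@dataclass
class MiniCSO:
    """Hypothetical, simplified stand-in for a CSO record."""
    type: str                                        # e.g. "DECISION", "RISK"
    content: str
    scope: str = "project"
    causal_in: list = field(default_factory=list)    # ids of causes
    causal_out: list = field(default_factory=list)   # ids of effects
    id: str = field(default_factory=lambda: uuid.uuid4().hex)
    created_at: float = field(default_factory=time.time)


class MiniCSOStore:
    """Hypothetical store: one table, WAL journal, indexes on type/created_at/scope."""

    def __init__(self, path):
        self.conn = sqlite3.connect(path)
        self.conn.execute("PRAGMA journal_mode=WAL")
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS cso ("
            "id TEXT PRIMARY KEY, type TEXT, content TEXT, scope TEXT, "
            "causal_in TEXT, causal_out TEXT, created_at REAL)"
        )
        for col in ("type", "created_at", "scope"):
            self.conn.execute(
                f"CREATE INDEX IF NOT EXISTS idx_cso_{col} ON cso({col})"
            )
        self.conn.commit()

    def write(self, cso):
        with self.conn:  # commits on success, rolls back on error
            self.conn.execute(
                "INSERT OR REPLACE INTO cso VALUES (?, ?, ?, ?, ?, ?, ?)",
                (cso.id, cso.type, cso.content, cso.scope,
                 json.dumps(cso.causal_in), json.dumps(cso.causal_out),
                 cso.created_at),
            )

    def query(self, cso_type):
        rows = self.conn.execute(
            "SELECT id, type, content, scope, causal_in, causal_out, created_at "
            "FROM cso WHERE type = ? ORDER BY created_at",
            (cso_type,),
        ).fetchall()
        return [
            MiniCSO(type=r[1], content=r[2], scope=r[3],
                    causal_in=json.loads(r[4]), causal_out=json.loads(r[5]),
                    id=r[0], created_at=r[6])
            for r in rows
        ]


db_path = os.path.join(tempfile.mkdtemp(), "cso.db")
store = MiniCSOStore(db_path)
decision = MiniCSO(type="DECISION", content="Use SQLite for persistence")
risk = MiniCSO(type="RISK", content="Single-writer contention",
               causal_in=[decision.id])
store.write(decision)
store.write(risk)
risks = store.query("RISK")
```

Here the `RISK` record points back at the `DECISION` that caused it through `causal_in`, which is the kind of edge the causal graph is built from.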

### Memory Fabric: `hcr/engine/memory/`

The Cognitive State Fabric (CSF) is the intelligence layer over raw CSO storage:

- **`centrality.py`**: `CausalCentralityScorer` computes BFS transitive reachability over causal edges. CSOs that caused the most downstream effects score highest (0–1).
- **`projection.py`**: `CognitiveProjection` replaces naive `facts[-10:]` with centrality-ranked, decay-filtered live state. Called on every `hcr_get_state` and `hcr_preflight`.
- **`prefetch.py`**: `ProjectionPrefetcher` is a background thread triggered on file edit events. It caches `CognitiveProjection` so the next tool call hits the cache instead of recomputing.
- **`embedding_store.py`**: `EmbeddingStore` is a sqlite-vec ANN store at `.hcr/embeddings.db`. It embeds only CSOs in the qualifying tiers (`commit/task/decision/constraint/risk/edited`) via Ollama `nomic-embed-text`, falling back to `sentence-transformers`.
- **`implicit_graph.py`**: `generate_soft_links` uses semantic k-NN to auto-detect soft causal edges (similarity > 0.82 threshold). It is called in `_handle_file_edit` after embedding and back-writes the detected edges to the CSO.
- **`episode_store.py`**: `BOCPDSegmenter` applies Bayesian Online Changepoint Detection to segment event streams into work episodes.
- **`fusion.py`**: `reciprocal_rank_fusion` (RRF, accepts an optional learned `weights` tuple) and `mmr_select` (Maximum Marginal Relevance) fuse semantic and causal ranked lists, then select diverse results for a fixed token budget. RRF is called by `cso_impact.query_cso_impact` when an `embedding_store` is passed.
- **`prospective.py`**: `get_triggered_csos` returns `TRIGGER` CSOs matching the active file pattern; these are injected at rank 0 by `CognitiveProjection`.
- **`feedback.py`**: `FeedbackStore` is a SQLite store recording preflight retrievals and session outcomes. After `TRAINING_THRESHOLD=50` labelled samples, it trains learned RRF weights via least squares. `HCREngine._feedback_store` owns the instance.
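
The centrality idea from the list above can be sketched as a breadth-first search per node. This is a minimal sketch, assuming the score is the number of transitively reachable effects divided by the largest such count in the graph; the text only states that scores fall in 0–1, so the normalization and the `causal_centrality` function name are assumptions rather than the behaviour of `CausalCentralityScorer`.

```python
from collections import deque


def causal_centrality(causal_out):
    """Score each node by how many nodes it can reach via causal_out edges.

    causal_out maps a node id to the list of its direct effect ids.
    Scores are normalized by the largest reach count (assumed normalization).
    """
    nodes = set(causal_out)
    for targets in causal_out.values():
        nodes.update(targets)

    reach = {}
    for start in nodes:
        seen = {start}
        queue = deque([start])
        while queue:
            current = queue.popleft()
            for nxt in causal_out.get(current, []):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        reach[start] = len(seen) - 1  # exclude the node itself

    top = max(reach.values(), default=0)
    return {n: (reach[n] / top if top else 0.0) for n in nodes}


graph = {"decision": ["task_a", "task_b"], "task_a": ["outcome"], "task_b": []}
scores = causal_centrality(graph)
```

In this toy graph the decision reaches all three other nodes and scores highest, while leaf nodes with no downstream effects score zero.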

### Engine: `hcr/engine/engine_api.py`

`HCREngine` is the central object. Key responsibilities:
- Owns `_cso_store`, `_prefetcher`, `_embedding_store`, `_feedback_store`, `_episode_segmenter`
- `_handle_file_edit()` is the hot path: writes OBSERVATION CSO → symbolic verifier → prefetcher → embeds CSO → generates soft-links (back-written to the CSO's `causal_in`)
- Parallel legacy state in `CognitiveState` (symbolic facts, causal graph) is kept for backwards compatibility, but CSOs are the v2.0 source of truth

### Symbolic Reasoning: `hcr/engine/symbolic/`

`SymbolicVerifier` (`verifier.py`) evaluates `DEFAULT_RULES` against a newly written CSO and emits RISK CSOs when rules fire. Rules live in `rules.py`. Rule evaluation failures are logged at DEBUG level (not silently swallowed).
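
The evaluate-and-emit loop described above might look roughly like the following. This is a sketch under stated assumptions: rules are modelled as plain callables that return a message or `None`, CSOs are plain dicts, and the `verify`, `rule_large_edit`, and `rule_broken` names are hypothetical. The real `SymbolicVerifier` and `DEFAULT_RULES` may be structured differently.

```python
import logging

logger = logging.getLogger("hcr.sketch.verifier")


def rule_large_edit(cso):
    """Hypothetical rule: flag observations that touch many lines."""
    if cso.get("type") == "OBSERVATION" and cso.get("lines_changed", 0) > 500:
        return "Large edit may need review"
    return None


def rule_broken(cso):
    """Hypothetical rule that fails, to show the error path."""
    return cso["missing_key"]  # raises KeyError


def verify(cso, rules):
    """Run every rule; return RISK records for the rules that fired."""
    risks = []
    for rule in rules:
        try:
            message = rule(cso)
        except Exception:
            # Logged at DEBUG rather than silently swallowed.
            logger.debug("rule %s failed", rule.__name__, exc_info=True)
            continue
        if message:
            risks.append(
                {"type": "RISK", "content": message, "causal_in": [cso["id"]]}
            )
    return risks


observation = {"id": "obs-1", "type": "OBSERVATION", "lines_changed": 800}
emitted = verify(observation, [rule_large_edit, rule_broken])
```

A failing rule does not stop the remaining rules from running, and each emitted RISK links back to the CSO that triggered it.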

### Semantic Decay: `hcr/product/storage/semantic_decay.py`

Tier-based fact retention. Each fact prefix (`commit:`, `task:`, `edited:`, `error:`, `cmd:`, `mcp_tool:`, `pattern:`, `observation:`) has a half-life, ranging from 7 days down to 15 min. `CognitiveProjection` uses this to compute an effective half-life adjusted by centrality, `effective_hl = base / max(1 - centrality*0.8, 0.2)`, so high-centrality CSOs decay more slowly.
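
A short worked example of the half-life formula above. The `effective_half_life` function follows the expression quoted in the text; the `retention` function assumes a standard exponential half-life decay, which the text does not spell out, and both function names are made up for this sketch.

```python
def effective_half_life(base_hl, centrality):
    """Half-life stretched by centrality, per the formula quoted in the text."""
    return base_hl / max(1 - centrality * 0.8, 0.2)


def retention(age, half_life):
    """Assumed exponential decay: the weight halves every `half_life` units."""
    return 0.5 ** (age / half_life)


base = 7.0  # days, the longest tier mentioned in the text
peripheral = effective_half_life(base, 0.0)   # no downstream effects
central = effective_half_life(base, 1.0)      # maximally central CSO

# Weight of a 14-day-old fact under each half-life
peripheral_weight = retention(14.0, peripheral)
central_weight = retention(14.0, central)
```

With centrality 0 the half-life is unchanged, and with centrality 1 the denominator bottoms out at 0.2, so the half-life is stretched by at most a factor of five.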

### MCP Server: `hcr/product/integrations/`

`HCRMCPResponder` in `mcp_server.py` dispatches all MCP tool calls to modular `BaseMCPTool` subclasses in `tools/`. Each tool class lives in roughly one file. Key tools:
- `hcr_get_state` / `hcr_preflight`: main context-handoff tools for agents
- `hcr_preflight` / `hcr_postflight`: agent lifecycle; preflight records retrieval for learned fusion, postflight records the outcome signal
- `hcr_record_file_edit`, `hcr_remember`, `hcr_fail`, `hcr_resolve`: write-path tools
- `hcr_analyze_impact`: causal BFS + RRF semantic fusion (`cso_impact.py` + `embedding_store`)
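
The fusion step mentioned for `hcr_analyze_impact` can be illustrated with a generic weighted reciprocal rank fusion. This is a minimal sketch of the standard RRF formula, not the signature of `fusion.reciprocal_rank_fusion`: the parameter names, the smoothing constant `k=60` (a commonly used default), and the alphabetical tie-break are assumptions.

```python
def reciprocal_rank_fusion(ranked_lists, weights=None, k=60):
    """Fuse ranked id lists: score(d) = sum_i w_i / (k + rank_i(d)).

    Ranks start at 1. Items missing from a list contribute nothing for it.
    """
    if weights is None:
        weights = (1.0,) * len(ranked_lists)
    scores = {}
    for weight, ranking in zip(weights, ranked_lists):
        for rank, item in enumerate(ranking, start=1):
            scores[item] = scores.get(item, 0.0) + weight / (k + rank)
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda item: (-scores[item], item))


semantic = ["cso_b", "cso_a", "cso_c"]   # ranked by embedding similarity
causal = ["cso_a", "cso_d", "cso_b"]     # ranked by causal BFS
fused = reciprocal_rank_fusion([semantic, causal])
semantic_heavy = reciprocal_rank_fusion([semantic, causal], weights=(1.0, 0.2))
```

With equal weights `cso_a` comes first because it ranks well in both lists; down-weighting the causal list lets the semantic favourite `cso_b` overtake it, which is the effect a learned `weights` tuple would have.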

All imports into `mcp_server.py` must be at module level (not inside functions/conditionals).

### Project State on Disk

```
.hcr/
  cso.db          # SQLite CSO store (WAL mode, indexes: type/created_at/scope)
  embeddings.db   # sqlite-vec embedding store (WAL, synchronous=NORMAL)
  feedback.db     # learned fusion weights (FeedbackStore)
  agents.db       # AgentRegistry
  state.json      # legacy CognitiveState (required for init check)
  decisions/      # legacy JSONL decision log (migrated to CSOs on init)
```

## Tests

`tests/conftest.py` has `collect_ignore` for non-pytest scripts. Tests are fully synchronous where possible; async tests use `pytest-asyncio` with `asyncio_mode = "auto"`. Mock `_get_embedding` on `EmbeddingStore` to avoid needing Ollama running. The full suite runs in ~57 seconds (179 tests, 4 skipped; the skipped tests require a live LLM endpoint).
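
One way to follow the mocking advice above is `unittest.mock.patch.object`. The `EmbeddingStore` class below is a stand-in written for this sketch so that it runs on its own; only the `_get_embedding` method name comes from the text, and the `embed` method and the vector values are hypothetical.

```python
from unittest.mock import patch


class EmbeddingStore:
    """Stand-in for the real class, used only to make the sketch runnable."""

    def _get_embedding(self, text):
        # The real method would call Ollama or sentence-transformers.
        raise ConnectionError("Ollama is not running")

    def embed(self, text):
        # Hypothetical public method that delegates to the backend call.
        return self._get_embedding(text)


def test_embed_without_ollama():
    fake_vector = [0.1, 0.2, 0.3]
    with patch.object(
        EmbeddingStore, "_get_embedding", return_value=fake_vector
    ) as mocked:
        store = EmbeddingStore()
        assert store.embed("refactor verifier") == fake_vector
        mocked.assert_called_once_with("refactor verifier")
```

Patching at the class level means every `EmbeddingStore` created inside the `with` block uses the fake, and the original method is restored when the block exits.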
