Version: 3.0 (Northstar)
Status: Strategic Product Document, grounded in current implementation
Last Updated: May 27, 2026
Author: Rishi Praseeth Krishnan
v1.0 positioned HCR as "persistent memory + enterprise governance." v2.0 repositioned around "Cognitive State Plane" — verifiable team cognition, agent governance, causal model of engineering. Both were directionally correct.
v3.0 adds two things v2.0 could not:
- Implementation ground truth. Everything described here is built, tested (230 passing tests), and running. This is not a roadmap document dressed as a northstar.
- The CPAP differentiator. Causal Prefix Alignment Protocol — a token-economics breakthrough implemented in May 2026 that makes HCR the only memory layer with a structural prompt caching advantage. It is not a config tweak; it requires the causal graph to function. Competitors cannot retrofit it.
Executives and partners: read §1–4 and §11–13 for the full strategic picture. Engineers and architects: §5–10 for the working system in depth.
HCR is the Cognitive State Plane for AI-assisted software engineering — the infrastructure layer between coding agents (Claude Code, Cursor, Windsurf, Codex, in-house agents) and the codebase, maintaining a verifiable, governed, queryable model of engineering intent, decisions, risks, and causality across developers, agents, repositories, and time.
Three things that uniquely define HCR:
- Causal memory. Every fact HCR stores carries explicit causal edges: what caused it, what it caused, what it contradicts. No other memory layer for engineering has this. Causal-centrality-weighted decay ensures the "why" behind your code survives longer than the "what."
- CPAP: structured prompt caching. The Causal Prefix Alignment Protocol structures every context payload into three byte-stable layers, achieving ~98% LLM prompt cache hit rate. Without CPAP, every call to any memory-augmented AI tool bills at full input token cost (0% cache hits). With CPAP: ~95% cost reduction per session. CPAP requires the causal graph to produce its freeze boundaries; it cannot be bolted onto a vector store.
- Team and agent scope. HCR's unit of state is the team, not the individual. Every developer and every agent on a team shares one graph. Decisions made by one developer yesterday are visible to an agent's pre-flight context today, with attribution.
LLM API providers (Anthropic, DeepSeek, Google) offer prompt caching: static prefixes cached server-side, reducing input token cost by up to 90% and latency by up to 10×. The requirement: the prefix must be byte-identical on every call.
Traditional memory-augmented tools achieve 0% cache hit rate because:
- Timestamps change on every file edit
- Relevance scores fluctuate, reordering retrieved facts
- Dynamic time strings ("last active 4 min ago") drift continuously
CPAP solves this structurally. It sorts Layer 2 CSOs by topological depth → insertion rank → UUID (not by relevance score, not by timestamp). It freezes the layer only at git commit boundaries. The result:
| Metric | Traditional RAG/memory | HCR + CPAP |
|---|---|---|
| Tokens per call | ~10,000 | ~2,050 |
| Cache hit rate | 0% | ~98% |
| Effective billed tokens over 10-call session | 100,000 | ~4,800 |
| Cost reduction | — | ~95% |
No other memory tool has this, because no other tool has a causal graph to derive stable freeze boundaries from.
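The session figure in the table above can be reproduced with simple arithmetic. The sketch below is illustrative only: it assumes a provider that bills cache writes at 1.25x and cache reads at 0.10x the base input rate (these multipliers are assumptions, not HCR measurements), that Layers 1 and 2 (about 2,000 tokens) form the cached prefix, and that the roughly 50-token dynamic layer is always billed in full. The function name `billed_tokens` is hypothetical.

```python
def billed_tokens(calls: int, cached_prefix: int, dynamic: int,
                  write_mult: float = 1.25, read_mult: float = 0.10) -> float:
    """Effective billed input tokens for a session with a single cache write.

    Assumes the first call writes the prefix to the cache and every later
    call reads it back. The multipliers are illustrative provider pricing.
    """
    if calls <= 0:
        return 0.0
    first = cached_prefix * write_mult + dynamic
    later = (calls - 1) * (cached_prefix * read_mult + dynamic)
    return first + later


uncached = 10 * 10_000          # traditional tool: 10 calls at full price
with_cpap = billed_tokens(calls=10, cached_prefix=2_000, dynamic=50)
reduction = 1 - with_cpap / uncached
print(round(with_cpap), f"{reduction:.1%}")
```

Under these assumptions the session bills about 4,800 effective tokens, a reduction of roughly 95% against the 100,000-token uncached baseline. A cache bust mid-session would add another write at the higher rate.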
- <5 second context resume with >90% accuracy on inferred current task, measured by user-acceptance signal.
- 95%+ prompt cache hit rate on all HCR-served context payloads, measured per-session in .hcr/cpap_metrics.jsonl. Currently measured in lab: 98%.
- >50% reduction in agent rollback rate for teams using HCR-governed agent fleets vs. ungoverned baselines.
- <1 hour mean time to provenance — for any AI-generated line of code in production, surface intent, decision, author (human or agent), and review chain.
- Zero unaudited PHI / regulated-data exposure to LLM providers in HIPAA/SOC2-aligned deployments, via tenant-side redaction and policy enforcement.
- HCR Core (MIT, open source) — single developer, local-only. Free forever. Driver of adoption and developer mindshare.
- HCR Team (SaaS, $30/dev/month) — team graph, shared decisions, agent governance, web dashboard. 12-month commit.
- HCR Enterprise (annual, $50k–500k+) — self-hosted/VPC, RBAC/SSO/SCIM, custom redaction, SOC2 Type II audit, BAA for HIPAA.
Every major coding tool now ships a memory mechanism. This problem is being solved at the assistant-vendor layer. HCR solves it better, but leading with it is a positioning mistake.
Adding memory to AI coding tools makes them more expensive, not less. A developer using a memory-augmented tool with 10,000 tokens of context per call, across 50 calls per day, spends $5–15/day in input tokens alone at current API pricing — vs. ~$0.50 for a session that achieves 95% cache hits.
This is the Layer 2 problem: memory tools make token costs unmanageable because they cannot maintain byte-identical prefixes. CPAP is the direct answer to this. The economics compound at scale: a 20-agent overnight fleet making 500 total API calls goes from ~$75 to ~$4 in input token cost.
Existing tools store and retrieve. None verify. If a memory system surfaces "we decided to use Postgres in March," nothing checks whether that decision is still true, whether the codebase reflects it, or whether another agent contradicted it yesterday. Retrieval without verification is hallucination with citations.
Ten developers using any current memory tool build ten disconnected stores. When developer A makes an architectural decision and developer B's agent contradicts it the next day, no system catches the contradiction. There is no shared cognition.
Teams now run multiple coding agents overnight. Current governance is "look at the PRs and hope." No tool provides a unified plane to ask: which of these agent decisions are consistent with our team's intent, constraints, and prior commitments?
Enterprise security teams are blocking AI rollouts. "Where did this code come from, what data did the model see, what was the human decision chain, can we produce a defensible audit trail?" — no current tool answers these.
HCR v3.0 leads on Layers 2, 3, 4, 5, and 6. Layer 1 is solved as a side effect.
Static text files, version-controlled, universally supported. Free. Limits: no verification, no team awareness, no agent governance, no audit, human-maintained. HCR's relation: complementary. HCR can generate and maintain these files from its graph; they become a serialization target.
Built for general-purpose chatbots. Associative retrieval, not causal. Per-user scope by default. No concept of a PR, a deployment, a regression, or a rollback. No structural prompt caching — they achieve 0% cache hit rate. HCR's relation: different category; could use one as a storage backend, but the cognitive layer sits above.
MCP servers exposing memory tools to any MCP-compatible client. Excellent for solo work. Limits: per-developer, retrieval-only, no verification, no team layer, no governance, no CPAP. HCR's MCP server exposes the team write path; theirs do not.
Vendor-built, deeply integrated, model-optimized. Limits: locked to vendor. Switch IDEs and context dies. No cross-tool, cross-team, or cross-agent coordination. HCR is the portable substrate — the same cognitive state works across every coding agent.
Large-scale code indexing with AI retrieval. Excellent at "find this code." Do not model intent, decisions, causality, or agent activity. Complementary at the data layer.
CPAP requires three things to coexist: (a) a causal graph to derive stable freeze boundaries, (b) centrality scoring to rank which facts survive the freeze, and (c) an insertion-rank system to maintain stable sort order across rebuilds without reordering to relevance scores. Retrofitting this onto a vector store or markdown file is not a small lift — it requires the CSO data model that underpins HCR's entire architecture. Competitors would need to rebuild HCR's core to copy it.
For engineering teams adopting AI coding agents at scale, HCR is the Cognitive State Plane that gives every developer, every agent, and every audit a single verifiable understanding of what the team is doing and why — delivering this context at ~95% less token cost through structural prompt caching, so AI acceleration is economically sustainable at fleet scale.
- MCP wins as the open protocol. All major coding tools now ship MCP support. Bet confirmed.
- Multi-agent fleets become the norm. Teams are already running overnight agent fleets. The governance gap is acute and worsening.
- Regulation accelerates governance demand. EU AI Act, sector-specific US rules, and enterprise risk policies are making audit and provenance a hard requirement for AI-generated code in regulated industries.
- Not a coding assistant — we do not write code
- Not a chatbot memory plugin — we are infrastructure between agents and code
- Not a wrapper around an LLM — the symbolic and causal layers produce value without any LLM call
- Not a replacement for git — git stores artifacts; HCR stores cognition
- Not a productivity-metrics dashboard — we surface decisions, not rankings
The atom of HCR. Every fact, decision, constraint, risk, intent, observation, or outcome is a typed CSO:
CSO = {
  id:          globally unique content-addressed hash
  type:        DECISION | OBSERVATION | CONSTRAINT | RISK |
               OUTCOME | CLAIM | INTENT | TASK | ROLLBACK | TRIGGER
  payload:     typed, schema-validated content
  origin:      { actor: human | agent_id, source: ide | cli | hook |
                 mcp | api | webhook, evidence: [git_ref | file_range] }
  causal_in:   [cso_id]   # what caused this
  causal_out:  [cso_id]   # what this causes / enables
  contradicts: [cso_id]   # explicit contradictions
  confidence:  { value: float, method: heuristic | symbolic | human_attested }
  scope:       developer | team | fleet | org
  created_at, updated_at, expires_at
}
Stored in SQLite + WAL at .hcr/cso.db. Indexes on type, created_at, scope.
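As a rough illustration of the schema above, the following sketch models a subset of CSO fields as a Python dataclass and derives a content-addressed id. It is not HCR's actual class: the choice of which fields feed the hash, the canonical JSON form, and the use of SHA-256 are all assumptions made for the example.

```python
import hashlib
import json
from dataclasses import dataclass, field


@dataclass(frozen=True)
class CSO:
    """Illustrative subset of the CSO schema (hypothetical, not HCR's class)."""
    type: str                    # e.g. "DECISION", "OBSERVATION", "RISK"
    payload: dict
    actor: str                   # human name or agent_id
    causal_in: tuple = ()        # ids of CSOs that caused this one
    contradicts: tuple = ()      # ids of CSOs this one contradicts
    scope: str = "team"
    id: str = field(init=False)

    def __post_init__(self):
        # Content-addressed id: hash a canonical JSON form of the content.
        # Hashing only type, payload, and actor is an assumption of this sketch.
        canonical = json.dumps(
            {"type": self.type, "payload": self.payload, "actor": self.actor},
            sort_keys=True, separators=(",", ":"),
        )
        digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
        object.__setattr__(self, "id", digest)


decision = CSO("DECISION", {"text": "Use FastAPI for REST layer"}, actor="alice")
observation = CSO("OBSERVATION", {"text": "REST endpoints live at /api/"},
                  actor="agent-1", causal_in=(decision.id,))
```

Because the id is derived from content, writing the same decision twice yields the same id, which is one way a store could deduplicate records and keep ids immutable.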
┌─────────────────────────────────────────────────────────────────┐
│ Layer 5: Experience Surfaces │
│ IDE plugins · MCP server (23 tools) · Web console · CLI · API │
├─────────────────────────────────────────────────────────────────┤
│ Layer 4: Governance & Policy │
│ JWT/Bearer auth · RBAC · Redaction · Audit · Compliance │
├─────────────────────────────────────────────────────────────────┤
│ Layer 3: Reasoning Engine │
│ Symbolic Verifier · Causal Reasoner (BFS) · CPAP Formatter │
│ CognitiveProjection (centrality + decay) · EmbeddingStore │
│ RRF + MMR Fusion · FeedbackStore (learned weights) │
├─────────────────────────────────────────────────────────────────┤
│ Layer 2: Cognitive State Fabric │
│ CSO Store (SQLite + WAL) · Embedding Store (sqlite-vec) │
│ Causal Graph Index · FreezeStore (CPAP epoch) · BOCPD │
├─────────────────────────────────────────────────────────────────┤
│ Layer 1: Capture & Signals │
│ Git hooks (post-commit → CPAP freeze) · File watcher │
│ MCP tool calls · IDE telemetry · Agent traces · REST API │
└─────────────────────────────────────────────────────────────────┘
CPAP is HCR's answer to the token economics problem. It structures every context payload into three strict, mutation-gated layers designed for LLM prompt caching:
┌──────────────────────────────────────────────┐
│ Layer 1: C_static (~500 tokens) │ ← 100% cached
│ Project identity, HCR version, fixed rules │
├──────────────────────────────────────────────┤
│ Layer 2: C_semi (~1,500 tokens) │ ← ~98% cached
│ Top-20 centrality CSOs, frozen at git commit │
│ Sort: topological depth → insertion rank │
│ → UUID tiebreaker │
├──────────────────────────────────────────────┤
│ Layer 3: Δ_dynamic (<50 tokens) │ ← active compute
│ Compact AST delta of most recent file edit │
│ Format: Δ {file} M{fn} I+{import} │
└──────────────────────────────────────────────┘
Freeze gate logic. Layer 2 rebuilds only when:
- git post-commit hook writes .hcr/freeze_requested (installed by hcr init)
- hcr_freeze MCP tool called manually
- First run (no epoch file exists)
- git HEAD diverges from stored epoch
Between triggers: Layer 2 bytes are read directly from .hcr/cpap_epoch.json — zero compute, byte-identical.
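The freeze gate described above can be sketched as a small decision function plus a read-or-rebuild helper. This is a minimal illustration, not the CPAPFormatter implementation: the file names come from this document, while the JSON keys (`git_head`, `epoch`, `layer2`), the function names, and the way the git HEAD is passed in are assumptions.

```python
import hashlib
import json
import tempfile
from pathlib import Path
from typing import Callable


def cache_epoch(layer2: str) -> str:
    """First 8 hex characters of the SHA-256 of the Layer 2 bytes."""
    return hashlib.sha256(layer2.encode("utf-8")).hexdigest()[:8]


def needs_rebuild(hcr_dir: Path, git_head: str, manual: bool = False) -> bool:
    """Return True when any of the four freeze triggers applies."""
    if manual:                                     # hcr_freeze called manually
        return True
    if (hcr_dir / "freeze_requested").exists():    # flag from post-commit hook
        return True
    epoch_file = hcr_dir / "cpap_epoch.json"
    if not epoch_file.exists():                    # first run
        return True
    stored = json.loads(epoch_file.read_text(encoding="utf-8"))
    return stored.get("git_head") != git_head      # HEAD diverged from epoch


def layer2_for_call(hcr_dir: Path, git_head: str, build: Callable[[], str]) -> str:
    """Serve the frozen Layer 2 text, rebuilding only when the gate opens."""
    epoch_file = hcr_dir / "cpap_epoch.json"
    if needs_rebuild(hcr_dir, git_head):
        layer2 = build()
        record = {"git_head": git_head, "epoch": cache_epoch(layer2), "layer2": layer2}
        epoch_file.write_text(json.dumps(record), encoding="utf-8")
        (hcr_dir / "freeze_requested").unlink(missing_ok=True)
        return layer2
    return json.loads(epoch_file.read_text(encoding="utf-8"))["layer2"]


with tempfile.TemporaryDirectory() as tmp:
    hcr_dir = Path(tmp)
    first = layer2_for_call(hcr_dir, "abc123", lambda: "[D] Use FastAPI")
    # Same HEAD and no flag: the builder is skipped and the text is identical.
    second = layer2_for_call(hcr_dir, "abc123", lambda: "never used")
    print(first == second)
```

The key property is that the builder runs only when a trigger fires; every other call returns exactly what was persisted, which is what keeps the prefix byte-identical.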
Stable sort. insertion_ranks dict carried forward across rebuilds. New CSOs get a monotonically increasing rank on first appearance; existing CSOs keep their rank. Sort key: (topo_depth, insertion_rank, cso_id) — deterministic without relying on timestamps or scores.
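A minimal sketch of this sort, assuming CSOs are represented as `(cso_id, topo_depth)` pairs and that the rank dictionary is persisted by the caller between rebuilds. The function name `stable_order` and the input shape are hypothetical.

```python
from typing import Dict, List, Tuple


def stable_order(
    csos: List[Tuple[str, int]],        # (cso_id, topo_depth) pairs
    insertion_ranks: Dict[str, int],    # carried forward between rebuilds
) -> List[str]:
    """Order CSO ids by (topo_depth, insertion_rank, cso_id).

    New ids receive the next unused rank; known ids keep theirs, so a
    rebuild never reorders CSOs that were already present.
    """
    next_rank = max(insertion_ranks.values(), default=-1) + 1
    for cso_id, _ in csos:
        if cso_id not in insertion_ranks:
            insertion_ranks[cso_id] = next_rank
            next_rank += 1
    keyed = sorted(csos, key=lambda c: (c[1], insertion_ranks[c[0]], c[0]))
    return [cso_id for cso_id, _ in keyed]


ranks: Dict[str, int] = {}
first = stable_order([("b7", 0), ("a1", 1), ("c3", 1)], ranks)
# A later rebuild sees the same CSOs in a different input order, plus a new one.
second = stable_order([("zz", 1), ("c3", 1), ("a1", 1), ("b7", 0)], ranks)
```

In this example the existing three ids keep their relative order on the second rebuild and the new id is appended within its depth, so the serialized prefix for the unchanged CSOs does not shift.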
Layer 3 micro-format. Pulled from the most recent OBSERVATION CSO for active_file:
Δ auth.py:42 Mvalidate_token I+pyjwt I-basic_auth
Codes: A = added, R = removed, M = modified, S = signature changed. I+/-module for import changes. Always under 50 tokens.
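For illustration, a formatter for this micro-format might look like the sketch below. The function name and argument shapes are hypothetical; only the output format and the change codes are taken from the description above.

```python
from typing import Iterable, Tuple

VALID_CODES = {"A", "R", "M", "S"}   # added, removed, modified, signature changed


def format_delta(
    file: str,
    line: int,
    changes: Iterable[Tuple[str, str]],     # (code, function_name) pairs
    imports_added: Iterable[str] = (),
    imports_removed: Iterable[str] = (),
) -> str:
    """Render one file edit in the compact Layer 3 micro-format."""
    parts = [f"Δ {file}:{line}"]
    for code, name in changes:
        if code not in VALID_CODES:
            raise ValueError(f"unknown change code: {code!r}")
        parts.append(f"{code}{name}")
    parts += [f"I+{module}" for module in imports_added]
    parts += [f"I-{module}" for module in imports_removed]
    return " ".join(parts)


delta = format_delta("auth.py", 42, [("M", "validate_token")],
                     imports_added=["pyjwt"], imports_removed=["basic_auth"])
```

With these inputs the sketch reproduces the example line shown above. Enforcing the 50-token ceiling would need a tokenizer and is left out here.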
Telemetry. Every format() call appends to .hcr/cpap_metrics.jsonl. hcr_get_system_health reports cpap_stats: hit rate, calls today, busts today, avg Layer 3 tokens.
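The aggregation behind these four figures could be sketched as follows. The record layout (`date`, `cache_hit`, `layer3_tokens`) is an assumption made for this example and may not match the real contents of .hcr/cpap_metrics.jsonl; the function name is likewise hypothetical and is not the document's compute_cpap_stats.

```python
import json
from typing import Iterable


def summarize_metrics(lines: Iterable[str], today: str) -> dict:
    """Aggregate JSONL telemetry lines into hit rate and daily counters.

    Each line is assumed to be a JSON object such as
    {"date": "2026-05-27", "cache_hit": true, "layer3_tokens": 16}.
    """
    records = [json.loads(line) for line in lines if line.strip()]
    if not records:
        return {"hit_rate": 0.0, "calls_today": 0,
                "busts_today": 0, "avg_layer3_tokens": 0}
    todays = [r for r in records if r["date"] == today]
    hits = sum(1 for r in records if r["cache_hit"])
    total_l3 = sum(r["layer3_tokens"] for r in records)
    return {
        "hit_rate": round(hits / len(records), 2),
        "calls_today": len(todays),
        "busts_today": sum(1 for r in todays if not r["cache_hit"]),
        "avg_layer3_tokens": round(total_l3 / len(records)),
    }


sample = [
    '{"date": "2026-05-26", "cache_hit": true, "layer3_tokens": 12}',
    '{"date": "2026-05-27", "cache_hit": false, "layer3_tokens": 20}',
    '{"date": "2026-05-27", "cache_hit": true, "layer3_tokens": 16}',
    '{"date": "2026-05-27", "cache_hit": true, "layer3_tokens": 16}',
]
stats = summarize_metrics(sample, today="2026-05-27")
```

An append-only JSONL file keeps each format() call cheap, since the formatter only writes one line and the summary is computed on demand.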
The intelligence layer over raw CSO storage:
- centrality.py (CausalCentralityScorer): BFS transitive reachability on causal edges. CSOs that caused the most downstream effects score highest.
- projection.py (CognitiveProjection): centrality-ranked, decay-filtered live state. Called by CPAP and directly on tool calls.
- prefetch.py (ProjectionPrefetcher): background thread triggered on file edit events. Caches projection so the next tool call has zero compute latency.
- embedding_store.py (EmbeddingStore): sqlite-vec ANN store. Embeds qualifying CSO tiers via Ollama nomic-embed-text, falling back to sentence-transformers.
- implicit_graph.py (generate_soft_links): semantic k-NN to auto-detect soft causal edges (similarity > 0.82).
- episode_store.py (BOCPDSegmenter): Bayesian Online Changepoint Detection for segmenting event streams into work episodes.
- fusion.py (RRF + MMR): fuse semantic + causal ranked lists; select diverse results for fixed token budget.
- feedback.py (FeedbackStore): learnable RRF weights trained after 50 labelled samples.
- cpap.py: CPAPFormatter, FreezeStore, CPAPPayload, compute_cpap_stats.
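To make the centrality idea concrete, here is a minimal sketch of BFS transitive reachability over causal_out edges. It is an illustration of the technique, not the CausalCentralityScorer code: the adjacency-dict input and the raw reachable-node count as the score are assumptions.

```python
from collections import deque
from typing import Dict, List


def reachability_scores(causal_out: Dict[str, List[str]]) -> Dict[str, int]:
    """Score each node by the number of distinct nodes reachable from it."""
    nodes = set(causal_out) | {n for outs in causal_out.values() for n in outs}
    scores: Dict[str, int] = {}
    for start in nodes:
        seen = {start}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nxt in causal_out.get(node, []):
                if nxt not in seen:       # guards against cycles and repeats
                    seen.add(nxt)
                    queue.append(nxt)
        scores[start] = len(seen) - 1     # exclude the start node itself
    return scores


graph = {"D1": ["O1", "O2"], "O1": ["R1"]}
scores = reachability_scores(graph)
```

Here the decision D1 reaches three downstream CSOs and scores highest, while leaf nodes score zero. Running one BFS per node is quadratic in the worst case, which is acceptable for a sketch but would likely need caching on a large graph.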
Implemented in Plan 10 (complete):
- GitHub OAuth (/api/auth/github/callback): Issues JWT access + refresh tokens from GitHub user info
- Bearer token middleware (AuthMiddleware): Protects all non-public API endpoints; 401 with hcr auth login guidance on failure
- JWT handler (hcr/product/auth/jwt_handler.py): encode_token, decode_token, is_expiring_soon
- Token store (hcr/product/auth/token_store.py): save_token, load_token, clear_token at ~/.hcr/auth.json
- Token refresh (hcr/product/auth/refresh.py): maybe_refresh() performs a proactive refresh before expiry, called by require_auth() in CLI
- Telemetry endpoint (/api/telemetry): Per-tool SQLite audit log; fire-and-forget from MCP side with offline queue + HMAC signing
- CSO sync endpoints (/api/projects/{id}/csos): GET (list) + POST (create); MCP-side poll_once merges remote CSOs into local SQLite
- CLI auth commands (hcr auth login/logout/whoami): Local HTTP server for OAuth callback, stores tokens
- MCP JWT enforcement: _handle_initialize validates JWT when HCR_JWT_SECRET is set (bypassed by MCP_DEV_MODE)
23 tools (consolidated from 31 in Plan 9), organized in hcr/product/integrations/tools/:
| Category | Tools |
|---|---|
| State / Context | hcr_get_state, hcr_preflight, hcr_postflight, hcr_get_system_health |
| Write path | hcr_remember, hcr_record_file_edit, hcr_fail, hcr_resolve, hcr_set_trigger |
| Analysis | hcr_analyze_impact, hcr_get_recommendations, hcr_get_version_history, hcr_search_history |
| Session | hcr_create_session, hcr_list_sessions, hcr_merge_session, hcr_set_session_note |
| Decisions | hcr_read_decisions |
| Cross-project | hcr_share_state, hcr_get_shared_state, hcr_list_shared_states |
| Ops | hcr_restore_version, hcr_get_recent_activity, hcr_freeze |
All imports into mcp_server.py are at module level (no lazy imports inside functions).
FastAPI server (hcr/product/api/main.py) with:
- POST /api/auth/github/callback: OAuth token exchange
- POST /api/auth/refresh: token refresh
- GET /api/projects/{id}/preflight: returns CPAPPayload as JSON
- GET/POST /api/projects/{id}/csos: CSO sync endpoints
- POST /api/telemetry: per-tool telemetry audit log
- GET /health: health check (unprotected)
- Memory API (/api/memory/*): for ChatGPT Custom GPT Actions and REST clients
hcr/
engine/
cso/cso_model.py, cso_store.py, agent_registry.py
memory/centrality.py, projection.py, prefetch.py, embedding_store.py,
implicit_graph.py, episode_store.py, fusion.py, feedback.py,
prospective.py, cpap.py
symbolic/verifier.py, rules.py
engine_api.py
product/
api/main.py, auth.py, middleware.py, preflight.py, csos.py,
telemetry.py, memory.py, apikeys.py
auth/jwt_handler.py, token_store.py, refresh.py
cli/main.py, auth_cmd.py
integrations/mcp_server.py, mcp_server_stdio.py,
telemetry_client.py, tools/
storage/semantic_decay.py
sync/poller.py
install/post-commit
tests/ (230 passing, 4 skipped — skipped require live LLM)
web/web-ui/ (React + ReactFlow dashboard)
.hcr/
cso.db # CSO store (SQLite + WAL)
embeddings.db # sqlite-vec ANN index
cpap_epoch.json # CPAP freeze state
cpap_metrics.jsonl # CPAP telemetry
feedback.db # learned RRF weights
auth.json # CLI token store (~/.hcr/)
F1. Resume in <5 seconds. Symbolic-first inference (fast, no LLM required). Current task, progress, next action, relevant decisions, open risks.
F2. CPAP-structured context. Every hcr_preflight and hcr_get_state call returns a three-layer payload achieving ~98% prompt cache hit rate. cache_epoch field lets callers detect stale contexts. cache_hit: true means Layer 2 served from epoch with zero compute.
F3. Cross-tool memory. One graph, exposed via MCP to Claude Code, Cursor, Windsurf, Codex, ChatGPT (Custom GPT Actions), and any REST client. Switching tools does not lose context.
F4. Markdown round-trip. HCR generates and maintains CLAUDE.md / AGENTS.md from the canonical graph. The markdown file is no longer the source of truth, but it remains a first-class export.
F5. Local-first by default. HCR Core runs entirely on the developer's machine, including when using local LLMs (Ollama). Cloud features are opt-in.
F6. Auth and sync. GitHub OAuth → JWT. Bearer token middleware on all API endpoints. CLI hcr auth login/logout/whoami. Per-tool telemetry with offline queue + HMAC signing. CSO sync: poll_once merges remote CSOs into local SQLite for multi-device and multi-developer sync.
V1. Decision provenance. Every architectural decision is a first-class CSO with author, date, evidence, files in scope.
V2. Constraint enforcement. Symbolic verifier runs rules against every new CSO. Violations create RISK CSOs.
V3. Contradiction detection. Explicit contradicts edges between CSOs. Surfaced on preflight for agents.
V4. Forward impact simulation. hcr_analyze_impact — causal BFS + semantic RRF fusion. Returns predicted blast radius for any file or proposed change.
V5. Backward attribution. Traverse causal_in edges from any outcome to contributing decisions, intents, and actors.
V6. Re-attestation. Stale decisions flagged by decay; confidence decays on half-life schedule.
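The half-life schedule in V6 can be illustrated with a standard exponential decay. The sketch below is an assumption-laden example rather than HCR's decay code: the function name, the parameters, and in particular the rule that stretches the half-life by `(1 + centrality)` are hypothetical, chosen only to show how causally central CSOs could decay more slowly.

```python
def decayed_confidence(initial: float, age_days: float,
                       base_half_life_days: float, centrality: int = 0) -> float:
    """Exponential half-life decay, slowed for causally central CSOs.

    Scaling the half-life by (1 + centrality) is an assumption of this
    sketch, not a documented HCR formula.
    """
    if base_half_life_days <= 0:
        raise ValueError("half-life must be positive")
    half_life = base_half_life_days * (1 + centrality)
    return initial * 0.5 ** (age_days / half_life)


leaf = decayed_confidence(0.8, age_days=30, base_half_life_days=30)
central = decayed_confidence(0.8, age_days=30, base_half_life_days=30, centrality=3)
```

After one base half-life the leaf CSO has lost half its confidence, while the more central CSO retains most of it. A re-attestation prompt could then be raised once confidence falls below a chosen threshold.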
T1. Agent registry. agent_registry.py registers agents with identity, role, and autonomy budget.
T2. Pre-flight / post-flight lifecycle. hcr_preflight records retrieval for learned fusion; hcr_postflight records outcome signal. Used by FeedbackStore to train RRF weights.
T3. Cross-project state sharing. hcr_share_state / hcr_get_shared_state expose decisions across projects for the same developer.
T4. Trigger CSOs. hcr_set_trigger creates TRIGGER CSOs that fire when an agent opens a matching file. Injected at rank-0 by CognitiveProjection regardless of centrality.
T5. Episode segmentation. BOCPDSegmenter partitions event streams into work episodes for better context boundaries.
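The fusion step that T2's feedback loop tunes can be sketched with weighted Reciprocal Rank Fusion. This is a generic illustration of the technique: the function name, the per-list weights argument, and the constant `k = 60` (a commonly used default) are assumptions, not values taken from fusion.py.

```python
from typing import Dict, List, Optional, Sequence


def rrf_fuse(ranked_lists: Sequence[List[str]],
             weights: Optional[Sequence[float]] = None,
             k: int = 60) -> List[str]:
    """Weighted Reciprocal Rank Fusion over several ranked id lists."""
    if weights is None:
        weights = [1.0] * len(ranked_lists)
    if len(weights) != len(ranked_lists):
        raise ValueError("need exactly one weight per ranked list")
    scores: Dict[str, float] = {}
    for ranking, weight in zip(ranked_lists, weights):
        for rank, item in enumerate(ranking, start=1):
            scores[item] = scores.get(item, 0.0) + weight / (k + rank)
    # Sort by score descending, then by id so ties resolve deterministically.
    return sorted(scores, key=lambda item: (-scores[item], item))


causal = ["d1", "d2", "d3"]
semantic = ["d3", "d1", "d4"]
equal = rrf_fuse([causal, semantic])
semantic_heavy = rrf_fuse([causal, semantic], weights=[1.0, 3.0])
```

With equal weights, the item ranked highly in both lists comes first; raising the semantic weight promotes the semantic list's top item. Learned weights, as described for FeedbackStore, would replace the hand-set values used here.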
G1. JWT + Bearer token auth. Middleware-enforced on all API endpoints. MCP JWT enforcement when HCR_JWT_SECRET set.
G2. Per-tool telemetry audit. Every MCP tool call timed, signed, and logged to SQLite via /api/telemetry. Offline queue with HMAC for disconnected operation.
G3. Tenant-side redaction. Configurable redaction rules strip secrets and PII before CSOs leave tenant boundary. (Enterprise tier.)
G4. Policy engine. Symbolic rules gate agent autonomy at the CSO level. (Enterprise tier.)
G5. Compliance-aligned controls. Designed for SOC2, ISO 27001, HIPAA, GDPR. Audit roadmap with dates in §12.
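The HMAC signing mentioned in G2 can be illustrated with the Python standard library. This sketch assumes a shared secret, a canonical JSON body, and an envelope with `body` and `signature` keys; none of these details are confirmed by the document, and the event fields shown are hypothetical.

```python
import hashlib
import hmac
import json


def sign_event(event: dict, secret: bytes) -> dict:
    """Wrap a telemetry event with an HMAC-SHA256 signature over its JSON body."""
    body = json.dumps(event, sort_keys=True, separators=(",", ":"))
    signature = hmac.new(secret, body.encode("utf-8"), hashlib.sha256).hexdigest()
    return {"body": body, "signature": signature}


def verify_event(envelope: dict, secret: bytes) -> bool:
    """Recompute the signature and compare it in constant time."""
    expected = hmac.new(secret, envelope["body"].encode("utf-8"),
                        hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, envelope["signature"])


secret = b"example-secret"   # placeholder value for the sketch
envelope = sign_event({"tool": "hcr_preflight", "duration_ms": 12}, secret)
```

Signing at enqueue time means events held in an offline queue can be checked for tampering when they are eventually delivered, without trusting the queue file itself.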
python -m hcr.product.integrations.mcp_server_stdio — stdio transport, compatible with Cursor, Claude Code, Windsurf, and any MCP client.
Pre-flight output structure (CPAP-formatted):
=== HCR Context (epoch: a3f9c2b1, CACHED) ===
## System Context
[static rules and project identity — ~500 tokens — 100% cached]
## Project Memory
[D] Use FastAPI for REST layer (scope: arch)
→ caused: [O] REST endpoints live at /api/...
[C] All imports in mcp_server.py must be at module level
[R] SQLite WAL may block under concurrent writes
... (top-20 by centrality — ~1,500 tokens — ~98% cached)
## Current Focus
Δ auth.py:42 Mvalidate_token I+pyjwt (< 50 tokens — always fresh)
FastAPI at http://localhost:8080 (or deployed to Render/cloud). Auth: Bearer token from hcr auth login.
GET /api/projects/{id}/preflight returns full CPAPPayload JSON for direct API callers. Callers can inspect cache_hit: true to skip re-sending layer2_stable in their own cache block.
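Since cache_control injection is currently left to the caller, a REST client might map the payload onto Anthropic-style system blocks roughly as follows. Only `layer2_stable` is a field name taken from this document; `layer1_static` and `layer3_dynamic` are hypothetical, and the block layout should be checked against the provider's current prompt caching documentation before use.

```python
from typing import Dict, List


def to_system_blocks(payload: Dict[str, str]) -> List[dict]:
    """Split a CPAP payload into cacheable and uncached system blocks."""
    return [
        # Stable layers get a cache marker so the provider can reuse the prefix.
        {"type": "text", "text": payload["layer1_static"],
         "cache_control": {"type": "ephemeral"}},
        {"type": "text", "text": payload["layer2_stable"],
         "cache_control": {"type": "ephemeral"}},
        # Layer 3 changes on every edit, so it carries no cache marker.
        {"type": "text", "text": payload["layer3_dynamic"]},
    ]


blocks = to_system_blocks({
    "layer1_static": "## System Context\n[static rules]",
    "layer2_stable": "## Project Memory\n[D] Use FastAPI for REST layer",
    "layer3_dynamic": "## Current Focus\nΔ auth.py:42 Mvalidate_token I+pyjwt",
})
```

The ordering matters: the dynamic block must come last so that edits never change the bytes of the cached prefix. Providers may also enforce a minimum cacheable prefix length, in which case a short static layer would only be cached together with the layer that follows it.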
ChatGPT Custom GPT Actions: /api/memory/* endpoints with API key auth.
hcr init # initialize .hcr/, install CPAP git hook
hcr status # show cognitive state summary
hcr resume # full resume context
hcr freeze # manual CPAP epoch rebuild (-p for project path)
hcr auth login # GitHub OAuth → saves ~/.hcr/auth.json
hcr auth logout # clear token
hcr auth whoami # print current identity
hcr doctor # system health check
hcr dashboard # launch web UI
React + ReactFlow dashboard at web/web-ui/. Modes:
- Live causal graph with centrality-sized nodes
- State history timeline (git-like)
- System health monitoring with CPAP hit rate
- CPAP metrics (hit rate, calls today, avg Layer 3 tokens)
VS Code extension (basic). Side panel shows current task, relevant decisions, open risks. Status bar with confidence score.
Developer opens IDE. MCP pre-flight fires. CPAP serves Layer 2 from the epoch file (a file read, no recomputation). Layer 3 shows last file edit. Total context: ~2,050 tokens. Cache hit: billed at ~90% discount vs. yesterday's first call. Developer reads: current task, three relevant decisions, one open risk, recommended next action. Clicks Continue. No typing.
Developer commits code. Post-commit hook writes .hcr/freeze_requested. On next IDE interaction, CPAPFormatter detects the flag, calls CognitiveProjection.compute(), picks new top-20 CSOs by centrality × decay, persists new epoch. New cache_epoch returned in preflight — downstream clients know context changed. Old epoch invalidated. Layer 2 bytes stable again for all subsequent calls until next commit.
Staff engineer authorizes five Claude Code agents. Each agent runs hcr_preflight at startup and receives CPAP-structured context with team decisions, constraints, and active risks. Layer 2 is byte-identical across all five agents' first calls (cache hit on runs 2–5). Agents run. On post-flight, hcr_postflight ingests their produced CSOs. Symbolic verifier evaluates them against DEFAULT_RULES. Two agents trigger RISK CSOs (dependency added without RFC; file touched by an active decision) and are routed to human review. Staff engineer approves via dashboard. The other three agents' changes are clean.
Production regression. On-call types hcr analyze_impact auth.py --direction backward. HCR returns: the commit, the agent, the pre-flight context the agent received, the two decisions in scope, and the constraint that should have caught it. Post-mortem writes itself.
Platform team runs hcr_get_system_health. Response includes:
"cpap_stats": {
"hit_rate": 0.97,
"calls_today": 143,
"busts_today": 4,
"avg_layer3_tokens": 16
}
97% cache hit rate. At 143 calls/day across the team, CPAP saved ~$18 in input token costs today vs. the uncached baseline.
Foundation (Plans 1–8):
- CSO data model and graph engine (SQLite + WAL, WAL indexes)
- Symbolic verifier with declarative rule engine
- Causal reasoner with BFS transitive reachability and forward/backward analysis
- Git hooks, file watcher, MCP tool calls, IDE telemetry (capture layer)
Cognitive State Fabric (Plan 9):
- CognitiveProjection (centrality-ranked, decay-filtered)
- ProjectionPrefetcher (background cache)
- EmbeddingStore (sqlite-vec + Ollama/sentence-transformers)
- BOCPDSegmenter (episode segmentation)
- RRF + MMR fusion with learnable FeedbackStore weights
- generate_soft_links (semantic k-NN soft edges)
- get_triggered_csos (prospective memory)
Auth, Telemetry, Sync (Plan 10):
- GitHub OAuth callback + JWT issuance
- Bearer token middleware (all API endpoints protected)
- CLI hcr auth login/logout/whoami
- MCP JWT enforcement at initialize
- Web frontend: token persistence + ProtectedRoute guard
- Per-tool telemetry client with offline HMAC queue
- /api/telemetry endpoint with SQLite audit log
- CSO sync endpoints (GET/POST /api/projects/{id}/csos)
- poll_once poller for CLI sync
- maybe_refresh() token refresh before expiry
- REST /api/projects/{id}/preflight returning CPAPPayload
CPAP (May 27, 2026):
- CPAPPayload, FreezeStore, compute_cpap_stats (cpap.py)
- CPAPFormatter with freeze gate, stable sort, Layer 3 delta, telemetry
- hcr/install/post-commit git hook
- hcr_freeze MCP tool
- CPAP wired into hcr_preflight and hcr_get_state
- cpap_stats in hcr_get_system_health
- REST /api/projects/{id}/preflight returns CPAPPayload
- hcr freeze CLI subcommand; hcr init installs CPAP hook
Test coverage: 230 passing, 4 skipped (require live LLM), ~70s full suite.
MCP tools: 23 tools (consolidated from 31).
- Agent registry and autonomy gates: agent_registry.py exists; policy-enforced gates not yet wired to all tool calls
- Team graph: CSO store is per-project; multi-developer sync exists via REST but team-scope shared graph not yet first-class
- Fleet dashboard: web console shows single-project state; fleet-wide agent view pending
- CPAP cache control blocks: REST callers receive CPAPPayload JSON; upstream injection of Anthropic cache_control blocks is caller-side
- Policy engine integration (gate agent actions against declared team constraints)
- Postgres backend and multi-region deployment (Team SaaS)
- Full web console (Map, Feed, Decision, Audit, Fleet modes)
- JetBrains and Neovim plugins
- Slack / Linear / Jira webhooks
- HCR Team SaaS beta (6 design partners)
- CPAP: upstream cache_control block injection for Anthropic API callers
- Agent fleet dashboard (live view, risk scores, gate states)
- VPC / self-hosted deployment
- SCIM, SSO, custom RBAC
- Per-org KMS, BAA-eligible deployments
- SOC2 Type II audit window open (Month 6–12 GA)
- HCR Team GA (Month 9)
- HCR Enterprise GA (Month 12)
- HCR rule language v1.0 (Datalog-inspired, documented, open source)
- Marketplace for third-party rule packs and CSO type extensions
- Time-travel debugging (replay team cognitive state at any past moment)
- Causal reasoner upgrade to symbol-level granularity (method/function, not just file)
- Voice/meeting capture (opt-in, redactable)
- Cross-org federation for open-source projects
- Mobile read-only app for engineering managers
We publish measured numbers from deployments, not projected industry statistics.
| Metric | Target | Current (lab) |
|---|---|---|
| Time to first productive action | <5 seconds | <5 seconds (symbolic-first path) |
| CPAP Layer 2 cache hit rate | >95% | ~98% (230-call test session) |
| CPAP avg Layer 3 tokens | <50 | ~16 |
| Session resume accuracy | >90% (user acceptance) | In beta testing |
| Agent rollback rate (HCR vs. ungoverned) | 50% reduction | Baseline being measured |
| Mean time to provenance | <1 hour | <5 min (local store, graph traversal) |
Published quarterly from design-partner deployments. Baseline: uncached full-context calls at current API pricing. HCR + CPAP target: 95% reduction in billed input tokens per session.
Negative results published. If CPAP cache hit rate drops below 90% in a quarter, we say so and explain why. If symbolic verification produces >10% false-positive rate, we publish that.
- CPAP requires the causal graph. Layer 2 freeze boundaries are derived from causal centrality. A competitor cannot implement CPAP without first building the entire CSO + causal edge + centrality scoring stack. This is a 6–12 month rebuild, not a feature flag.
- Network effect at the team layer. Every developer on a team who uses HCR improves every other developer's context quality. Switching cost is the team's accumulated decision graph. Switching individually is possible; switching as a team is not.
- Integration breadth on MCP. Being the canonical write target for every coding agent on the team makes HCR sticky in a way single-vendor tools cannot match. The same graph works across Claude Code, Cursor, Windsurf, and any in-house agent.
- Best autocomplete — model vendors own this
- Cheapest memory store — we are not a memory store
- Biggest context window — CPAP makes the window unnecessary
| Risk | Likelihood | Impact | Mitigation |
|---|---|---|---|
| LLM vendors add native team memory | Medium | High | Lead with causal verification + CPAP; they cannot replicate either without rebuilding HCR's architecture |
| CPAP cache hit rate degrades on large teams with frequent commits | Medium | Medium | Tune Layer 2 budget; adaptive freeze threshold; publish measured rates |
| MCP loses to a competing protocol | Low | Medium | Protocol-agnostic core; adapters for any emerging protocol |
| Enterprise sales cycles starve early-stage runway | High | High | Open-core acquisition; land-and-expand; partner-led distribution |
| Symbolic verification produces too many false positives | Medium | High | Ship with conservative rules first; user-tuneable thresholds; publish false-positive rates |
| Privacy backlash on capture features | Medium | High | Local-first default; explicit opt-in for every cloud feature; tenant-side redaction |
| LLM cost collapse eliminates CPAP's economic argument | Low | Medium | CPAP also reduces latency 10×; latency argument persists regardless of cost |
HCR v3.0 is the same bet as v2.0 — the Cognitive State Plane is the right category — with three additions:
- CPAP changes the economics of memory-augmented AI tools. 95% token cost reduction is not a marginal improvement; it makes memory augmentation economically viable at fleet scale, where it was previously prohibitive.
- The system is built. 230 passing tests. 23 MCP tools. Full auth/sync/telemetry stack. REST API with ChatGPT integration. CPAP wired into every context-serving tool. This is infrastructure, not a prototype.
- The moat is now technical, not just strategic. CPAP requires the causal graph. The causal graph requires the CSO data model. The CSO data model requires HCR's capture layer. Competitors cannot replicate the user-visible outcome (95% token reduction) without rebuilding the entire stack from Layer 1.
The next phase — team graph, agent fleet governance, SOC2 audit, enterprise GTM — is where HCR becomes the control plane for AI-augmented engineering. The foundation for all of it is built.
For engineers evaluating CPAP:
Why UUID sort as tiebreaker, not timestamp? Timestamps change as CSOs are accessed and updated. The CSO id (the "UUID" in the sort key) is a content-addressed hash and therefore immutable. Using it as the final tiebreaker guarantees byte identity even if two CSOs have identical topological depth and insertion rank (which should not happen in practice, but is possible in edge cases).
Why topological depth first? DECISION CSOs that are causally upstream of many OUTCOME and OBSERVATION CSOs should appear before their descendants in the context. This preserves the narrative structure — the model reads "we decided X" before "X caused Y" — which improves reasoning quality on causal chains.
Why insertion_ranks carried forward across rebuilds? The alternative is re-sorting by centrality score on every rebuild. But centrality scores shift as new CSOs are written (changing the graph topology). A CSO that ranked 5th yesterday might rank 12th today after an unrelated commit. If that reordering changed Layer 2 bytes, the cache would bust even though neither CSO's content changed. Insertion ranks provide a stable "narrative position" that survives graph topology changes.
What causes a cache bust? Normally a git commit (post-commit hook fires). The new commit may have changed which CSOs are most central, so Layer 2 must rebuild to reflect the new top-20. This is correct: after a commit, context should update. A manual hcr_freeze call, a missing epoch file, or a git HEAD that diverges from the stored epoch also force a rebuild. Between these triggers, the context is stable and achieves near-perfect cache hit rate.
CSO (Cognitive State Object). Typed, signed, causally-linked record representing an intent, decision, observation, constraint, risk, outcome, claim, task, rollback, or trigger.
CPAP (Causal Prefix Alignment Protocol). Three-layer context serialization that achieves ~98% LLM prompt cache hit rate. Requires causal graph for freeze boundaries.
Cognitive State Plane. HCR's category: verifiable, governed, team-scope infrastructure layer between coding agents and the codebase.
CognitiveProjection. Centrality-ranked, decay-filtered live state view. Input to CPAP Layer 2 selection.
CPAPFormatter. Produces CPAPPayload from CSOStore. Manages freeze gate, epoch persistence, Layer 3 delta extraction, and telemetry.
FreezeStore. Reads/writes .hcr/cpap_epoch.json — the persisted Layer 2 epoch across process restarts.
Cache epoch. SHA256[:8] of layer2_bytes. Callers use this to detect context staleness.
Layer 3 delta. Compact AST-diff representation of most recent file edit. Always <50 tokens. Format: Δ {file} M{fn} I+{import}.
Symbolic Verifier. Rules engine over the typed CSO graph. Emits RISK CSOs when rules fire.
Causal Reasoner. Maintains and traverses causal links between CSOs. Forward impact simulation + backward attribution.
Pre-flight / Post-flight. Structured context handoff to an agent before it begins work (hcr_preflight) and structured reconciliation of its produced CSOs afterward (hcr_postflight).
BOCPD (Bayesian Online Changepoint Detection). Statistical method for detecting work episode boundaries in event streams.
RRF (Reciprocal Rank Fusion). Fusion algorithm combining causal-centrality and semantic ranked lists into one ranking; MMR then selects diverse results from it for a fixed token budget.
MCP (Model Context Protocol). Open protocol for tool and resource access between AI agents and external systems. HCR is MCP-native.
This is a living product document. Implementation status reflects the codebase state as of May 27, 2026. Version 3.0 — Integration of v1.0 governance positioning, v2.0 Cognitive State Plane strategy, and CPAP token-economics breakthrough.