fix: prevent OOM crashes during init on large codebases #6

Open
mechtar-ru wants to merge 97 commits into NickCirv:main from mechtar-ru:main

Conversation

@mechtar-ru

Summary

Two fixes to prevent out-of-memory crashes when running engram init on large projects:

  1. MAX_DEPTH limit in extractDirectory (ast-miner.ts) — prevents stack overflow on deep directory trees
  2. MAX_FILES_PER_COMMIT limit in mineGitHistory (git-miner.ts) — prevents O(n²) memory explosion on commits with many files

Changes

ast-miner.ts

  • Added MAX_DEPTH = 100 depth limit to recursive walk() function
  • Wrapped readdirSync in try-catch to skip unreadable directories
  • Added .engramignore support for custom exclusions
  • Expanded default exclusions: target, .venv, .next, .nuxt, .output, coverage, .turbo, .cache
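
A minimal sketch of the depth-limited walk described above. It is not the code in ast-miner.ts: the `ListDir` callback is a hypothetical stand-in for readdirSync so the example runs without touching the filesystem, `collectFiles` is an illustrative name, and the skip set contains only the exclusions this PR lists.

```typescript
// Hypothetical entry shape; the real code gets entries from readdirSync.
interface Entry { name: string; isDir: boolean }
type ListDir = (dir: string) => Entry[];

const MAX_DEPTH = 100; // value from the PR description
// Only the exclusions named in this PR; the real default list may be longer.
const SKIP = new Set([
  "target", ".venv", ".next", ".nuxt", ".output", "coverage", ".turbo", ".cache",
]);

function collectFiles(root: string, listDir: ListDir, maxDepth = MAX_DEPTH): string[] {
  const files: string[] = [];
  const walk = (dir: string, depth: number): void => {
    if (depth > maxDepth) return; // stop descending past the limit
    let entries: Entry[];
    try {
      entries = listDir(dir);
    } catch {
      return; // unreadable directory: skip it instead of crashing
    }
    for (const e of entries) {
      const full = `${dir}/${e.name}`;
      if (e.isDir) {
        if (!SKIP.has(e.name)) walk(full, depth + 1);
      } else {
        files.push(full);
      }
    }
  };
  walk(root, 0);
  return files;
}
```

Passing the directory reader in as a parameter is a choice made for this sketch only; it lets the depth limit be exercised against a synthetic, endlessly nested tree.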

git-miner.ts

  • Added MAX_FILES_PER_COMMIT = 50 to prevent O(n²) co-change pair explosion
  • Skip dist/, build/ directories from co-change analysis
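
To illustrate why the cap matters: a commit touching n files yields n(n-1)/2 co-change pairs, so a single 5,000-file commit would produce about 12.5 million pairs. The sketch below is an assumption about the shape of the fix, not the code in git-miner.ts; in particular, `coChangePairs` is a hypothetical name, and skipping oversized commits entirely (rather than truncating them) is a guess.

```typescript
const MAX_FILES_PER_COMMIT = 50; // value from the PR description

// Returns unordered co-change pairs for one commit, or [] when the commit is too large.
function coChangePairs(
  files: string[],
  maxFiles = MAX_FILES_PER_COMMIT,
): Array<[string, string]> {
  // Build output directories are excluded from co-change analysis.
  const relevant = files.filter((f) => !f.startsWith("dist/") && !f.startsWith("build/"));
  if (relevant.length > maxFiles) return []; // assumption: oversized commits are skipped
  const pairs: Array<[string, string]> = [];
  relevant.forEach((a, i) => {
    for (const b of relevant.slice(i + 1)) pairs.push([a, b]);
  });
  return pairs;
}
```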

Testing

Tested on the Axolotl project (2.2 GB, 34K files): engram init now completes successfully in ~600ms.

NickCirv and others added 30 commits April 11, 2026 14:13
Day 5 exposes the Sentinel hook layer through 7 new CLI commands. The
code from Days 1-4 was inert (correct but unreachable); Day 5 makes it
the thing Claude Code actually invokes.

New commands:

  engram intercept
    Hook entry point. Reads JSON from stdin with 3s timeout, passes
    through dispatchHook, writes JSON response to stdout. ALWAYS exits
    0 — the process boundary is the last line of defense for the
    "never block Claude Code" invariant. Even malformed input, missing
    graph, or handler crash resolves to empty stdout (passthrough).

  engram install-hook [--scope <local|project|user>] [--dry-run]
    Surgically add engram's entries to a Claude Code settings.json.
    Default scope is 'local' (.claude/settings.local.json, gitignored).
    Preserves all existing non-engram hooks. Idempotent. Writes
    atomically (temp file + rename) with timestamped backup
    (settings.json.engram-backup-<ISO>.bak). --dry-run shows the diff
    without writing.

  engram uninstall-hook [--scope <s>]
    Surgical removal. Only deletes entries whose command contains
    "engram intercept". Cleans up empty event arrays and empty hooks
    object. Backup before write.

  engram hook-stats [--json]
    Read .engram/hook-log.jsonl, summarize by event / tool / decision.
    Shows estimated tokens saved based on PreToolUse Read deny count
    (1200 tok/deny average). Human-readable text by default, JSON with
    --json flag.

  engram hook-preview <file>
    Dry-run the Read handler for a specific file without installing.
    Shows deny+reason (if confidence high), allow+context (landmines),
    or passthrough with explanation. Perfect for debugging coverage.

  engram hook-disable / hook-enable
    Toggle .engram/hook-disabled kill switch. All handlers check this
    flag; when set, everything falls through to passthrough without
    uninstalling the settings.json entries.
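
A rough sketch of the hook-stats estimate described above, assuming each line of hook-log.jsonl is a JSON object with event, tool, and decision fields (the fields the dispatch logging below records). The function names are hypothetical; the real aggregation lives in src/intercept/stats.ts and may differ.

```typescript
type Decision = "deny" | "allow" | "passthrough";
interface HookLogEntry { event: string; tool?: string; decision?: Decision }

const TOKENS_PER_READ_DENY = 1200; // average quoted in the commit message

// Parse JSONL text, ignoring blank or malformed lines.
function parseHookLog(text: string): HookLogEntry[] {
  const out: HookLogEntry[] = [];
  for (const line of text.split("\n")) {
    if (!line.trim()) continue;
    try {
      out.push(JSON.parse(line) as HookLogEntry);
    } catch {
      // skip a malformed line rather than failing the whole summary
    }
  }
  return out;
}

function summarizeEntries(entries: HookLogEntry[]) {
  const byEvent: Record<string, number> = {};
  let readDenies = 0;
  for (const e of entries) {
    byEvent[e.event] = (byEvent[e.event] ?? 0) + 1;
    if (e.event === "PreToolUse" && e.tool === "Read" && e.decision === "deny") readDenies++;
  }
  return {
    total: entries.length,
    byEvent,
    readDenies,
    estimatedTokensSaved: readDenies * TOKENS_PER_READ_DENY,
  };
}
```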

New modules:

  src/intercept/installer.ts      - Pure data transforms:
                                     buildEngramHookEntries,
                                     installEngramHooks (idempotent),
                                     uninstallEngramHooks (surgical),
                                     isEngramHookEntry (detection),
                                     formatInstallDiff (dry-run view).
                                     Zero I/O — all reads/writes live
                                     in cli.ts, tested independently.

  src/intercept/stats.ts          - summarizeHookLog +
                                     formatStatsSummary. Pure
                                     aggregation over HookLogEntry[].
                                     Read-deny token savings: 1200
                                     tok/deny estimate.
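
The idempotent-install and surgical-uninstall behavior could look roughly like the following pure transforms. This is a sketch under assumptions: the `HooksConfig` shape is a simplified view of the hooks section of a Claude Code settings.json, and `install` / `uninstall` / `isEngramEntry` are illustrative names rather than the actual exports of installer.ts.

```typescript
// Simplified, assumed shape of the "hooks" section in settings.json.
interface HookCommand { type: "command"; command: string }
interface HookEntry { matcher?: string; hooks: HookCommand[] }
type HooksConfig = Record<string, HookEntry[]>;

// An entry belongs to engram if any of its commands contains "engram intercept".
const isEngramEntry = (e: HookEntry): boolean =>
  e.hooks.some((h) => h.command.includes("engram intercept"));

// Idempotent: installing twice leaves the config unchanged the second time.
function install(config: HooksConfig, event: string, entry: HookEntry): HooksConfig {
  const existing = config[event] ?? [];
  if (existing.some(isEngramEntry)) return config;
  return { ...config, [event]: [...existing, entry] };
}

// Surgical: removes only engram entries and drops event arrays left empty.
function uninstall(config: HooksConfig): HooksConfig {
  const out: HooksConfig = {};
  for (const [event, entries] of Object.entries(config)) {
    const kept = entries.filter((e) => !isEngramEntry(e));
    if (kept.length > 0) out[event] = kept;
  }
  return out;
}
```

Both functions return new objects and never mutate their input, which is what makes them testable without any file I/O.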

Modified:

  src/intercept/dispatch.ts       - Added decision logging for all
                                     PreToolUse routes. Every Read /
                                     Edit / Write / Bash invocation
                                     logs {event, tool, path,
                                     decision: deny|allow|passthrough}
                                     to hook-log.jsonl after the
                                     handler resolves. Logging errors
                                     swallowed — never affects dispatch
                                     result.

  src/cli.ts                      - 7 new commander commands + helper
                                     resolveSettingsPath(scope). All
                                     install/uninstall writes are
                                     atomic (temp + rename) with
                                     timestamped backup.

Tests: +44 new (total 439, up from 395)
  tests/intercept/installer.test.ts   - 24 tests (idempotent install,
                                         non-destructive, surgical
                                         uninstall, immutability)
  tests/intercept/stats.test.ts       - 13 tests (summary correctness,
                                         frozen results, formatting)
  tests/intercept/cli-intercept.test.ts - 7 end-to-end subprocess tests
                                         that actually spawn
                                         'node dist/cli.js intercept'
                                         and pipe JSON payloads. Auto-
                                         builds dist/ via spawnSync npm
                                         run build in beforeAll.

DOGFOOD VERIFIED in real shell:

  $ node dist/cli.js init
  $ node dist/cli.js hook-preview src/graph/query.ts
    📋 Hook preview: /Users/nicholas/engram/src/graph/query.ts
       Decision: DENY (Read would be replaced)
       Summary (would be delivered to Claude):
         [engram] Structural summary for src/graph/query.ts
         Nodes: 10 | avg extraction confidence: 1.00
         NODE queryGraph() [function] L80
         NODE shortestPath() [function] L170
         NODE renderFileStructure() [function] L412
         ... (~350 tokens instead of the ~4,000 token full file)

  $ node dist/cli.js hook-stats
    engram hook stats (3 invocations)
    By event:
      PreToolUse             3 (100.0%)
    By tool:
      Read                   3

  $ node dist/cli.js install-hook --dry-run
    📌 engram install-hook (scope: local)
       Target: /Users/nicholas/engram/.claude/settings.local.json
       Changes:
         + PreToolUse: 0 → 1 entries
             + { matcher="Read|Edit|Write|Bash" command="engram intercept"}
         + PostToolUse: 0 → 1 entries
             + { matcher=".*" command="engram intercept"}
         + SessionStart: 0 → 1 entries
             + { command="engram intercept"}
         + UserPromptSubmit: 0 → 1 entries
             + { command="engram intercept"}
       (dry-run — no changes written)

  $ node dist/cli.js hook-disable
    ✅ engram hooks disabled for /Users/nicholas/engram
  $ node dist/cli.js hook-preview src/graph/query.ts
       Decision: PASSTHROUGH (Read would execute normally)
  $ node dist/cli.js hook-enable
    ✅ engram hooks re-enabled for /Users/nicholas/engram

All 439 tests pass. tsc clean. Dogfooded on engram itself — Read
interception produces 11.1x token reduction for query.ts (4,000 →
350 tok).

Safety invariants preserved at the process boundary:
  - engram intercept ALWAYS exits 0
  - Stdin read with 3s hard timeout
  - Input size cap: 1MB
  - Any parse/dispatch error → empty stdout → passthrough
  - install-hook backs up before writing
  - uninstall-hook surgically removes only engram entries
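
The process-boundary invariants above can be sketched as a single function that never throws. This is an illustration, not the code in cli.ts: `interceptOnce` and the synchronous `Dispatch` type are hypothetical, and the 3s stdin timeout and the unconditional exit code 0 are left out because they belong to the surrounding process wrapper.

```typescript
const MAX_INPUT_SIZE = 1024 * 1024; // 1MB cap from the invariants above

// Hypothetical synchronous dispatcher; the real one is likely async.
type Dispatch = (payload: unknown) => unknown;

// Returns the text to write to stdout. An empty string means passthrough.
function interceptOnce(rawInput: string, dispatch: Dispatch): string {
  try {
    // String length counts UTF-16 units; used here as a rough stand-in for bytes.
    if (rawInput.length > MAX_INPUT_SIZE) return "";
    const payload: unknown = JSON.parse(rawInput);
    const result = dispatch(payload);
    return result === undefined || result === null ? "" : JSON.stringify(result);
  } catch {
    return ""; // parse or handler failure: fall through to passthrough
  }
}
```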

Sentinel stack is now end-to-end functional. v0.3.0 is installable and
testable in a real Claude Code session.

Next: Day 6 — README rewrite with "context as infra" hero, CHANGELOG
entry, troubleshooting docs, opportunistic landmines rename in comments.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…rsion bump

Ships the docs + metadata for v0.3.0 "Sentinel". No code behavior
changes — this is the packaging layer on top of Days 1-5.

README.md:
  - New hero: "Context as infra for your AI coding tools"
  - Replaced v0.2 quickstart with the Sentinel install flow:
    engram init → engram install-hook
  - New "The Problem" section explaining the v0.2 ceiling
    (agent has to remember) and how v0.3 flips it (hook intercepts
    at the tool-call boundary).
  - New "How the Sentinel Layer Works" table — all 7 hook handlers
    with their mechanism (deny+reason / allow+additionalContext /
    pure observer) and purpose.
  - 10 safety invariants listed verbatim.
  - Sentinel command reference (7 new commands) as a separate
    subsection, keeping the v0.1/v0.2 commands untouched for backcompat.
  - Tests badge updated 132 → 439.

CHANGELOG.md:
  - Full v0.3.0 entry with mechanism explanation, empirical
    verification note, the 7 new CLI commands, the 7 new hook
    handlers + projected savings per hook, infrastructure details,
    content safety list, 10 safety invariants, test coverage
    numbers, v0.3.1 deferrals (Grep, per-user thresholds, self-
    tuning), and explicit "no migration needed" note.

package.json:
  - version: 0.2.1 → 0.3.0
  - description: updated to mention the hook interception layer

src/cli.ts:
  - program.version("0.3.0")
  - program.description() updated to match package.json

src/graph/query.ts:
  - Opportunistic "regret buffer" → "landmines" rename in 4
    comments. Internal API unchanged: mistakes() function,
    list_mistakes MCP tool, kind: "mistake" schema all stable.
    "Landmines" is the user-facing metaphor; "mistakes" is the
    internal term. Memory feedback_engram_v0_3_sentinel_architecture.md
    documents this split.

Verification:
  $ npx tsc --noEmit           → exit 0
  $ npm run build              → ✅ dist/cli.js 49.41 KB
  $ node dist/cli.js --version → 0.3.0
  $ npx vitest run             → 439/439 passing (~1.5s)

Cumulative Sentinel stats through Day 6:
  - 7 commits on v0.3.0-sentinel branch (Days 1-6 plus the audit commit)
  - ~6,500 LOC (source + tests + docs)
  - 439 tests (+225 from v0.2.1 baseline of 214)
  - 7 hook handlers shipped (Read, Edit, Write, Bash, SessionStart,
    UserPromptSubmit, PostToolUse)
  - 7 new CLI commands (intercept, install-hook, uninstall-hook,
    hook-stats, hook-preview, hook-disable, hook-enable)
  - 10 safety invariants enforced at runtime
  - Dogfood-verified: engram intercepts its own src/graph/query.ts
    with an 11.1x token reduction

Next: Day 7 — benchmark on a real session, final pre-publish smoke
test, npm publish engramx@0.3.0, GitHub release.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Day 7 is the prep-for-ship day. Everything that can be done without
Nick's direct action (2FA, public release, account state) is done.
Everything that requires his explicit action is documented for him to
run at his own pace.

Added:
  RELEASE-NOTES-v0.3.0.md — complete release notes with:
    - Real measured Sentinel numbers on 4 engram files: 12,189 → 2,210
      tokens (82% reduction, 75% hit rate) — measured not projected
    - Real benchmark: 113,544 → 464 avg query tokens (244.7x vs full,
      11.1x vs relevant files)
    - All 7 new CLI commands documented
    - All 7 new hook handlers with mechanism
    - 10 safety invariants listed
    - Zero-migration guarantee spelled out
    - Step-by-step manual actions for Nick:
        1. git tag v0.3.0 (DONE locally)
        2. Review branch (git log --oneline main..v0.3.0-sentinel)
        3. Merge to main (3 strategies documented)
        4. Push origin + tag
        5. npm publish (requires 2FA)
        6. GitHub release
        7. Install globally and try
        8. Announce (launch posts template)

Local tag:
  v0.3.0 created at 125af22 (Day 6 commit).
  Not pushed — Nick's call.

Final verification:
  $ npx tsc --noEmit        → exit 0
  $ npx vitest run          → 439/439 passing (~1.5s)
  $ npm run build           → ✅ dist/cli.js 49.41 KB
  $ node dist/cli.js --version → 0.3.0
  $ npm pack --dry-run      → engramx@0.3.0, 42.5 KB packed, 9 files

Cumulative Sentinel (Days 1-7):
  Commits on branch: 8 (Day 1 → Day 2 → audit → Day 3 → Day 4 →
                        Day 5 → Day 6 → Day 7)
  Total LOC delta: +6,505 source + tests + docs - 65 removed
  Tests: 214 → 439 (+225 new, zero regressions)
  Test suite time: ~1.5s
  Hook handlers shipped: 7
  CLI commands shipped: 7
  Safety invariants enforced: 10

Empirically measured savings vs projection:
  Projected (Day 0): -42,500 tok/session (80% reduction)
  Measured on 4 engram files: -9,979 tok / 4 files (82% reduction)
  Hit rate: 75% (projected 60%)

v0.3.0 is ready. The Sentinel ships.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
engram v0.3.0 "Sentinel" — the release that makes engram stop being a
tool your agent queries and start being a Claude Code hook layer that
intercepts Read/Edit/Write/Bash at the tool-call boundary.

== What ships ==

Seven new hook handlers:
  - PreToolUse:Read           deny+reason replaces file with ~300-tok summary
  - PreToolUse:Edit/Write     allow+context with landmine warnings (never blocks)
  - PreToolUse:Bash           strict parser + delegation to Read handler
  - SessionStart              project brief injection (startup/clear/compact)
  - UserPromptSubmit          keyword-gated pre-query injection
  - PostToolUse               pure observer → hook-log.jsonl

Seven new CLI commands:
  - engram intercept          hook entry point (stdin → dispatch → stdout)
  - engram install-hook       atomic, idempotent, backup, --dry-run
  - engram uninstall-hook     surgical removal
  - engram hook-stats         summarize .engram/hook-log.jsonl
  - engram hook-preview       dry-run for a specific file
  - engram hook-disable/enable  kill-switch toggle

Ten safety invariants enforced at runtime:
  1. Any handler error → passthrough (never block Claude Code)
  2. 2s per-handler timeout
  3. Kill switch respected by every handler
  4. Atomic settings.json writes with timestamped backups
  5. Never intercept outside project root
  6. Never intercept binaries or secrets (.env/.pem/.key/credentials)
  7. Never log user prompt content (privacy invariant, tested)
  8. Never inject >8000 chars per hook response
  9. Stale graph detection (file mtime > graph mtime → passthrough)
  10. Partial-read bypass (offset/limit → passthrough)
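
Invariants 3, 9, and 10 are all cheap pre-checks that decide whether a Read may be intercepted at all. A minimal sketch, assuming the handler already has the relevant timestamps and flags in hand (the `ReadRequest` shape and `passthroughReason` name are hypothetical):

```typescript
interface ReadRequest {
  fileMtimeMs: number;   // last modification time of the file being read
  graphMtimeMs: number;  // last modification time of the graph
  offset?: number;       // present when the agent asked for a partial read
  limit?: number;
  hooksDisabled: boolean; // true when the kill switch is set
}

// Returns a reason to pass through, or null if interception may proceed.
function passthroughReason(req: ReadRequest): string | null {
  if (req.hooksDisabled) return "kill switch set";
  if (req.offset !== undefined || req.limit !== undefined) return "partial read";
  if (req.fileMtimeMs > req.graphMtimeMs) return "stale graph";
  return null;
}
```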

== Empirically measured savings ==

Sentinel vs raw Read on 4 real engram files:
  src/core.ts                 ~4,169 tok → DENY (13 nodes, 300 tok summary)
  src/graph/query.ts          ~4,890 tok → DENY (10 nodes, 300 tok summary)
  src/intercept/dispatch.ts   ~1,820 tok → DENY (5 nodes, 300 tok summary)
  src/intercept/handlers/read.ts  ~1,310 tok → PASSTHROUGH (1 export, correctly below threshold)
  TOTAL: 12,189 → 2,210 tokens (-82%, 75% hit rate)
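
For readers who want to check the totals, the arithmetic can be reproduced from the four rows above. The helper name and record shape are illustrative; the token figures are the ones listed in this message.

```typescript
interface Measurement { file: string; rawTokens: number; servedTokens: number; denied: boolean }

const measurements: Measurement[] = [
  { file: "src/core.ts", rawTokens: 4169, servedTokens: 300, denied: true },
  { file: "src/graph/query.ts", rawTokens: 4890, servedTokens: 300, denied: true },
  { file: "src/intercept/dispatch.ts", rawTokens: 1820, servedTokens: 300, denied: true },
  // Passthrough: the full file is served, so nothing is saved on this row.
  { file: "src/intercept/handlers/read.ts", rawTokens: 1310, servedTokens: 1310, denied: false },
];

function summarizeSavings(rows: Measurement[]) {
  const before = rows.reduce((sum, r) => sum + r.rawTokens, 0);
  const after = rows.reduce((sum, r) => sum + r.servedTokens, 0);
  return {
    before,
    after,
    reductionPct: (1 - after / before) * 100,
    hitRate: rows.filter((r) => r.denied).length / rows.length,
  };
}
```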

== Migration ==

No migration needed. v0.3.0 is purely additive. All v0.2.1 commands,
MCP tools, and schema are unchanged. Hook layer is opt-in via
engram install-hook.

== 8 commits on v0.3.0-sentinel branch ==

edce41b  Day 1: scaffold intercept layer (safety, context, formatter)
c94462c  Day 2: Read handler + getFileContext + renderFileStructure
6e91706  audit: normalizePath try/catch + kindOrder cleanup + coverage TODO
94ad1fe  Day 3: Edit/Write landmines, Bash cat delegation, dispatcher
c8ecdab  Day 4: SessionStart + UserPromptSubmit + PostToolUse + hook-log
05e8027  Day 5: CLI wiring — engram becomes installable and runnable
125af22  Day 6: README hero rewrite + CHANGELOG + landmines rename + version bump
cddb1a0  Day 7: Release notes, real benchmark numbers, local v0.3.0 tag

== Stats ==

  +7,472 / -44 across 44 files
  214 tests → 439 tests (+225)
  ~1.5s full suite time
  tsc clean
  dist/cli.js 50.6 KB, 42.5 KB packed, 9 files total

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Publishing the visual ecosystem walkthrough created for the v0.3.0
release. 11-page architecture diagram showing:

  §01  The 4 hook events in order (SessionStart, UserPromptSubmit,
       PreToolUse, PostToolUse) with fire timing, mechanism, and
       per-hook token impact.

  §02  The Read handler's 9-branch decision tree with a real deny+reason
       JSON response alongside. Shows every passthrough branch (payload
       shape, content safety, project resolution, kill switch, graph
       coverage, staleness, confidence threshold) and the money path
       where interception happens.

  §03  The 6-layer ecosystem substrate — engram graph, MemPalace,
       rule files, skills, agents, MCP servers — showing how engram
       fits alongside the other persistence layers a Claude Code
       session leans on.

  §04  Real measured savings on 4 engram source files: 12,189 → 2,210
       tokens (−82%), 75% hit rate. Reproducible with
       engram hook-preview.

  §05  Quick reference — the 8 CLI commands you'll actually use.

Design: Ink & Paper palette (editorial black-on-black with teal
accent), Fraunces variable serif for headings, Space Grotesk body,
JetBrains Mono code. Asymmetric grid layout with scroll-reveal
animations in HTML; forced-visible + page-break-hardened for PDF.

Added:
  docs/engram-sentinel-ecosystem.pdf   1.02 MB, A3 landscape, 11 pages
  docs/engram-sentinel-ecosystem.html  44 KB, single self-contained file,
                                        Google Fonts CDN only, zero JS
                                        frameworks

Changed:
  README.md — new "Architecture Diagram" section after the quickstart,
              linking to both PDF and HTML.
  .gitignore — added .claude/settings.local.json (project-local Claude
              Code hook install state, machine-specific).

Generated via Chrome headless with:
  --headless=new
  --force-prefers-reduced-motion  (override for scroll-reveals)
  --virtual-time-budget=15000      (let fonts + IO observer resolve)
  --print-to-pdf-no-header

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Six changes in one commit:

1. PALETTE REBUILD (from assets/banner.html, extracted verbatim):
   - Background:  #0a0a0c (warm near-black, not cool)
   - Primary:     rgb(217, 119, 6) — amber, the signature color
   - Heading:     Space Grotesk 700 compressed, -3px letter-spacing
   - Body:        Space Grotesk 300 (light, editorial)
   - Code:        JetBrains Mono
   - Grid:        40px white at 0.015 opacity (identical to banner)
   - Wash:        amber radial blur top-right (banner convention)
   - Highlight:   rgba(74, 222, 128, 0.8) — ONLY for ✅ success markers
   - Signature:   engr<span class="a">a</span>m — the letter "a" in amber,
                  used in topbar, hero h1, footer version string

   The previous versions (Ink & Paper / terminal-cyan) were
   editorial dev-generic. This version is extracted verbatim from
   the engram banner.html so it actually matches the brand.

2. SVG KNOWLEDGE GRAPH in the hero — amber nodes + amber edges,
   central "Sentinel" node with gentle pulse animation. Mirrors the
   graph visualization on the right side of banner.png. The central
   node is labeled "Sentinel" with surrounding nodes for Read, Edit,
   Write, Bash, Session, Prompt.

3. SPACING AUDIT — sections were "disconnected" because vh-based
   padding clamps (clamp(4rem, 10vh, 8rem)) inflated to the max
   value on A3 print, creating 14rem dead zones between sections.
   Fixed by replacing all vh clamps with fixed rem values:
     .hero:    padding-block 3.5rem 2.5rem (was up to 14rem combined)
     .section: padding-block 3rem (was up to 16rem combined)
     .section-head: single-column stack instead of 4fr/8fr split with
                    dead left column
     Print override: drop to 1.8rem, let sections flow across pages
                     instead of forcing page-break-inside:avoid
   Result: 14 pages → 9 pages on A3 landscape.

4. ATTRIBUTION + BRAND LINEAGE:
   - Topbar status area now includes: "a cirvgreen venture" link to
     cirvgreen.com (amber "cirvgreen" wordmark)
   - Footer gained a dedicated attribution row below the description:
     "Created by Nicholas Ashkar · NickCirv — Part of the cirvgreen
     ecosystem"
   - Footer links row added cirvgreen.com alongside npm/github/release
   Both are styled as ghost text with amber brand highlights.

5. SECTION CONNECTIVITY — added a subtle 2.4rem amber line at the
   top-left of every section (via ::before pseudo) to create visual
   rhythm and signal section boundaries without aggressive dividers.

6. GENERIC §03 CONTENT — the substrate section describes engram's
   surfaces in generic terms (graph.db, rules files, git history,
   Claude Code hook config, peer MCP servers, host AI client) with
   no references to Nick's personal MemPalace/brain-os/vault setup.
   Safe for public repo.

Files updated:
  docs/engram-sentinel-ecosystem.html  (61.9 KB, full rebuild)
  docs/engram-sentinel-ecosystem.pdf   (901 KB, A3 landscape, 11 pages)

Generated with:
  chrome --headless=new --force-prefers-reduced-motion
         --virtual-time-budget=15000 --print-to-pdf
         (A3 landscape, 8mm margins)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ffold, EngramBench v0.1

The v0.3.1 release across 5 fronts:

1. TF-IDF keyword filter on UserPromptSubmit hook — kills the 76-node
   noise bug where common-term prompts poisoned mature graphs. New
   computeKeywordIDF helper in src/core.ts, IDF >= 1.386 threshold
   (25% cutoff), top-N seed selection. 3 new tests.

2. engram memory-sync command + src/intercept/memory-md.ts — writes
   structural facts into a marker-bounded block inside Anthropic's
   native MEMORY.md. Complementary to Auto-Dream: prose memory stays
   with Anthropic, structural graph stays with engram. Pure builder
   + upsert + atomic write. 16 new tests.

3. Cursor 1.7 beforeReadFile adapter scaffold
   (src/intercept/cursor-adapter.ts + engram cursor-intercept CLI).
   Wraps existing handleRead in Cursor's {permission, user_message}
   shape. Experimental — wire-up lands in v0.3.2. 8 new tests.

4. EngramBench v0.1 — 10 structural task definitions in bench/tasks/
   (find-caller, parent-class, import-graph, refactor-scope,
   cross-file-flow, etc.) with scoring rubrics and expected tokens
   per setup. bench/run.sh runner scaffold + results/TEMPLATE.csv.

5. Rebrand to "the structural code graph" — package description,
   keywords, README hero.
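
The threshold in item 1 follows from the usual definition IDF = ln(N / df): ln(4) ≈ 1.386, so a keyword passes only if it appears in at most 25% of documents. A minimal sketch of that filter, assuming unsmoothed IDF over graph nodes; `selectSeedKeywords` is a hypothetical name, and the real computeKeywordIDF helper may use a different formula or tie-breaking.

```typescript
const IDF_THRESHOLD = Math.log(4); // ≈ 1.386, i.e. keyword in at most 25% of documents

function idf(totalDocs: number, docsWithTerm: number): number {
  if (docsWithTerm <= 0) return 0; // assumption: unknown terms are treated as uninformative
  return Math.log(totalDocs / docsWithTerm);
}

// Keep only rare-enough keywords, then take the top N by IDF.
function selectSeedKeywords(
  keywords: string[],
  docFreq: Map<string, number>,
  totalDocs: number,
  topN = 3,
): string[] {
  return keywords
    .map((k) => ({ k, score: idf(totalDocs, docFreq.get(k) ?? 0) }))
    .filter((x) => x.score >= IDF_THRESHOLD)
    .sort((a, b) => b.score - a.score)
    .slice(0, topN)
    .map((x) => x.k);
}
```

With 100 nodes, a term found in 76 of them scores about 0.27 and is dropped, which is the kind of common-term noise the filter targets.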

466 tests passing (up from 442 in v0.3.0). Zero new runtime deps.
Schema unchanged. No breaking changes.
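
The marker-bounded upsert from item 2 can be sketched as a pure string transform. The marker strings and the `upsertBlock` name are hypothetical, and the atomic write is omitted; the point is only that text outside the markers is never touched.

```typescript
const BEGIN = "<!-- engram:begin -->"; // hypothetical marker text
const END = "<!-- engram:end -->";     // hypothetical marker text

// Replace the marked block if present, otherwise append it to the document.
function upsertBlock(doc: string, body: string): string {
  const block = `${BEGIN}\n${body}\n${END}`;
  const start = doc.indexOf(BEGIN);
  const end = doc.indexOf(END);
  if (start !== -1 && end > start) {
    return doc.slice(0, start) + block + doc.slice(end + END.length);
  }
  const sep = doc.length === 0 || doc.endsWith("\n") ? "" : "\n";
  return `${doc}${sep}${block}\n`;
}
```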

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
3 source bugs fixed (POSIX path normalization, CRLF YAML parsing,
libuv assertion on Node 25 Windows), 5 test bugs fixed. New
toPosixPath() single source of truth for graph path storage.
isHardSystemPath now platform-aware (UNC, Windows, ProgramData).
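
The idea behind a single normalization point is that every path is converted to forward slashes before it is stored or looked up, so a Windows-style path and a POSIX-style path map to the same graph key. A minimal sketch; the body is an assumption, and the real toPosixPath() may handle more cases such as drive letters.

```typescript
// Assumed behavior: convert backslash separators to forward slashes.
function toPosixPath(p: string): string {
  return p.replace(/\\/g, "/");
}

// Both spellings resolve to the same key once normalized.
const graphKeys = new Set<string>([toPosixPath("src\\graph\\query.ts")]);
const found = graphKeys.has(toPosixPath("src/graph/query.ts"));
```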

Post-init nudge suggests install-hook when Sentinel not yet wired.
Experience Tiers section in README. Windows + fail-fast CI matrix.
User manual (HTML, engram brand identity). 467 tests passing.

Credit: ultrathink (shahe-dev) for root-cause analysis + patch.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Four new capabilities that shift engram from tool to infrastructure:

1. PreCompact hook — re-injects god nodes + landmines before context
   compaction. First tool in the ecosystem whose context survives
   Claude Code's conversation compression.

2. CwdChanged hook — auto-switches project graph when user navigates
   to a different directory mid-session.

3. File watcher (engram watch) — incremental re-indexing via fs.watch.
   300ms debounce, extension whitelist, ignored directories. Zero deps.

4. Mempalace bundle — SessionStart queries mcp-mempalace in parallel
   with graph queries and appends semantic findings to the brief.
   Graceful degradation if mempalace not installed.

Also: edges.source_file index, transactional deleteBySourceFile,
async execFile (not sync), per-instance debounce state. 486 tests.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The CLI version was hardcoded as "0.3.0" in commander's .version() call,
causing `engram --version` to report the wrong version after bumps.
Now reads from package.json via createRequire at runtime.

Also bumps to 0.4.1 to publish the fix (0.4.0 tarball has stale version).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
AAA-designed HTML guide showing how memory tools, compression plugins,
code review tools, and workflow managers integrate with engram. Includes
real token savings numbers, 4 integration patterns (CLI, programmatic
API, hook chain, SessionStart bundle), hook coexistence table, and
per-tool-type guidance.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Real-time terminal HUD showing hook activity and token savings.
Refreshes every second from hook-log.jsonl. Shows:
- Total tokens saved (cumulative)
- Hit rate with visual bar
- Decision breakdown (intercepted/allowed/passthrough)
- Top intercepted files with bar chart
- Recent activity feed
- Landmine warnings count

Also aliased as `engram hud`. No external TUI dependencies.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New CLI command `engram hud-label` outputs JSON for Claude HUD's
--extra-cmd protocol. Shows ⚡engram with token savings + visual
hit rate bar (▰/▱). Runs in <20ms via hook-log parsing.

Users can add to their Claude HUD:
  --extra-cmd="engram hud-label"

States: ready → listening → savings + bar + percentage.
Bar fills as hit rate climbs. Savings number grows over session.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
hud-label now searches parent directories for the nearest .engram/
instead of only checking $PWD. This fixes empty labels when Claude
Code starts from a parent directory (e.g., /opt instead of
/opt/crypto-bot). Uses the same walk-up pattern as the Sentinel
hooks' findProjectRoot.
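
A walk-up search of this kind can be sketched as follows. This is an illustration, not the project's findProjectRoot: the function names are hypothetical, the `exists` callback stands in for a filesystem check, and only POSIX-style absolute paths are handled.

```typescript
// Parent of a POSIX-style absolute path, or null once the root is reached.
function parentOf(dir: string): string | null {
  if (dir === "/" || dir === "") return null;
  const i = dir.lastIndexOf("/");
  if (i < 0) return null;
  return i === 0 ? "/" : dir.slice(0, i);
}

// Walk from startDir toward the root and return the first directory containing .engram.
function findNearestEngramRoot(
  startDir: string,
  exists: (path: string) => boolean,
): string | null {
  let dir: string | null = startDir;
  while (dir !== null) {
    const marker = dir === "/" ? "/.engram" : `${dir}/.engram`;
    if (exists(marker)) return dir;
    dir = parentOf(dir);
  }
  return null;
}
```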

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…uides

- Hook table: 7 → 9 handlers (added PreCompact, CwdChanged)
- SessionStart now mentions mempalace bundle
- New Infrastructure commands section (watch, dashboard, hud-label)
- Claude HUD integration example with visual bar
- Docs section: links to user manual + integration guide
- Removed duplicate install block

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…nstall-hook

Users running `engram install-hook` now get the HUD visible in Claude Code
automatically. Respects existing statusLine configs — only sets it when absent.
Uninstall cleanly removes engram-owned statusLine entries.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Performance (CRITICAL):
- renderFileStructure: replaced getAllNodes()/getAllEdges() with targeted
  SQL queries (getNodesByFile, getEdgesForNodes). Eliminates full table
  scan that silently timed out on large projects.
- scoreNodes: replaced getAllNodes() with searchNodes() SQL seeding.
  O(matches) instead of O(all nodes) per query.
- Edge ordering: sort by endpoint degree before slice(0,10) so god-node
  relationships appear first.

Accuracy:
- Go import detection: track import() block state, no longer fires on
  struct field tags like json:"name".
- TS arrow function: require => in line, no longer matches
  const x = (someValue).
- Comment exclusion: lines starting with // or * skipped before pattern
  matching. No more phantom nodes from commented-out code.
- Confidence calibrated to 0.85 for regex extraction, reserving 1.0 for
  future tree-sitter.

Correctness:
- LIKE wildcards (% and _) escaped in searchNodes.
- Removed phantom graphology dependency (in package.json, zero imports).
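
Escaping LIKE wildcards matters because identifiers often contain underscores, and an unescaped `_` matches any single character. A minimal sketch of the escaping step, assuming backslash as the escape character (SQLite has no default, so the query must declare it with an ESCAPE clause); the function names are illustrative, not the ones in searchNodes.

```typescript
// Escape the escape character itself plus the two LIKE wildcards.
function escapeLike(term: string): string {
  return term.replace(/[\\%_]/g, (ch) => `\\${ch}`);
}

// Pattern for a "contains" search, bound as a parameter to a query such as:
//   SELECT ... WHERE label LIKE ? ESCAPE '\'
function likeContains(term: string): string {
  return `%${escapeLike(term)}%`;
}
```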

493 tests passing. Zero regressions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Context Spine spec: engram as central context routing layer integrating
  MemPalace, Context7, Obsidian into single rich injection packets.
  Target: 90% session-level token savings via provider cache in SQLite.
- Advisory docs: external review of engram strategy (founder brief,
  strategic spec, Phase 0 plan, design skill spec). Includes analysis
  and fact-checking notes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds the SQLite cache layer that makes the Context Spine fast:

- provider_cache table: stores resolved context from external providers
  (mempalace, context7, obsidian) with TTL-based staleness
- CRUD: getCachedContext, setCachedContext, warmCache (bulk), pruneStaleCache,
  clearProviderCache, getCacheStats
- ContextProvider interface: the contract all providers implement
  (resolve, warmup, isAvailable, tokenBudget, timeoutMs)
- Provider priority ordering and type exports

Per-Read cache lookup is <5ms (SQLite SELECT). Expensive provider
resolution happens at SessionStart (warmup) or on first cache miss
with a 200ms timeout and hint fallback.
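
TTL-based staleness can be sketched with an in-memory map standing in for the provider_cache table. Everything here is an assumption for illustration: the row fields, the class, and its method names are hypothetical and simpler than the CRUD functions listed above, and the 200ms timeout with hint fallback is not modeled.

```typescript
interface CachedContext {
  provider: string;
  key: string;
  content: string;
  storedAtMs: number;
  ttlMs: number;
}

class ProviderCacheSketch {
  private rows = new Map<string, CachedContext>();

  set(row: CachedContext): void {
    this.rows.set(`${row.provider}:${row.key}`, row);
  }

  // Returns the row only while it is younger than its TTL.
  get(provider: string, key: string, nowMs: number): CachedContext | null {
    const row = this.rows.get(`${provider}:${key}`);
    if (!row) return null;
    return nowMs - row.storedAtMs > row.ttlMs ? null : row;
  }

  // Deletes expired rows and reports how many were removed.
  pruneStale(nowMs: number): number {
    let removed = 0;
    this.rows.forEach((row, k) => {
      if (nowMs - row.storedAtMs > row.ttlMs) {
        this.rows.delete(k);
        removed++;
      }
    });
    return removed;
  }
}
```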

17 new tests. 510 total, all passing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The Context Spine is now functional:

Internal providers (Tier 1, always available, no cache):
- engram:structure — structural summary from graph (existing, refactored)
- engram:mistakes — known issues from mistake memory
- engram:git — recent changes, churn rate, last author

External providers (Tier 2, cached in SQLite):
- mempalace — decisions/learnings from ChromaDB semantic memory
- context7 — library docs for detected imports
- obsidian — project notes from vault via REST API

Resolver engine:
- resolveRichPacket() — assembles from all providers in parallel
- warmAllProviders() — bulk cache fill at SessionStart
- Per-provider timeouts with graceful degradation
- Priority ordering within 600-token total budget
- Availability caching (check once per session)
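
Priority ordering within a fixed budget can be sketched as a greedy pass over provider results. The assumptions here are labeled in the code: lower priority numbers are treated as more important, a result that does not fit is skipped rather than truncated, and the record shape and function name are hypothetical.

```typescript
interface ProviderResult { provider: string; priority: number; tokens: number; text: string }

const TOTAL_BUDGET = 600; // total token budget from the commit message

// Assumption: lower priority number = more important; oversize results are skipped whole.
function packByPriority(results: ProviderResult[], budget = TOTAL_BUDGET): ProviderResult[] {
  const picked: ProviderResult[] = [];
  let used = 0;
  for (const r of [...results].sort((a, b) => a.priority - b.priority)) {
    if (used + r.tokens > budget) continue;
    picked.push(r);
    used += r.tokens;
  }
  return picked;
}
```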

10 new tests. 520 total, all passing. Build clean.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
From code review:
- getEdgesForNodes: chunk IN clause at 400 IDs to stay under SQLite's
  999 variable limit. Deduplicates across chunks.
- rowToCachedContext: add ?? fallbacks on all fields to prevent null
  propagation from pre-migration rows.
- warmCache: call save() after transaction commit, consistent with
  bulkUpsert. Prevents cache loss if process exits before close().
- getNodesByFile: add LIMIT 500 default to prevent unbounded
  materialization on generated files.
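
The chunking fix for getEdgesForNodes can be sketched as below. The `queryChunk` callback is a hypothetical stand-in for the actual SQL call, and the `Edge` shape is assumed. One plausible reason for choosing 400 rather than 999 is that an edge query may bind the ID list twice (once for sources, once for targets), but that is a guess.

```typescript
const CHUNK_SIZE = 400; // chunk size from the commit message

function chunk<T>(items: T[], size = CHUNK_SIZE): T[][] {
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) out.push(items.slice(i, i + size));
  return out;
}

interface Edge { id: string; source: string; target: string }

// Runs one query per chunk and deduplicates edges returned by more than one chunk.
function edgesForNodes(nodeIds: string[], queryChunk: (ids: string[]) => Edge[]): Edge[] {
  const seen = new Map<string, Edge>();
  for (const ids of chunk(nodeIds)) {
    for (const e of queryChunk(ids)) seen.set(e.id, e);
  }
  return Array.from(seen.values());
}
```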

520 tests passing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The Context Spine is now LIVE:

Read handler integration:
- After graph context passes all checks, attempts rich packet resolution
  from all available providers (structure, mistakes, git, mempalace,
  context7, obsidian) in parallel
- 1.5s total timeout wraps the entire resolution — if exceeded, falls
  back to graph-only summary (existing v0.4 behavior)
- Builds NodeContext with imports, test status, churn rate from graph
- Rich packet served via existing deny+reason mechanism

SessionStart warmup:
- Calls warmAllProviders() fire-and-forget after building the brief
- Pre-fills provider_cache table so subsequent Reads hit cache (<5ms)
- Warmup failure is silent — never delays or blocks session start

Provider availability:
- Tier 1 (internal): 200ms availability check timeout
- Tier 2 (external): 500ms availability check timeout
- Results cached per-session (check once, reuse)
- _resetAvailabilityCache() exposed for tests

520 tests passing. Build clean.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Removed all debug scaffolding (writeFileSync, ENGRAM_DEBUG guards)
- Fixed: global `engram` binary pointed to old npm install, not local
  dev build. Future: version bump before npm publish.
- Enrichment header uses "[engram] Additional context" when structure
  provider is excluded (enrichment-only mode)

The Context Spine is verified end-to-end:
  echo '{"hook_event_name":"PreToolUse",...}' | engram intercept
  → Structure (8 nodes) + CHANGES (git provider) in one response

520 tests passing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The biggest engram release since v0.3 Sentinel. Read interceptions now
serve rich context assembled from 6 providers in parallel: structure,
known issues, git changes, MemPalace decisions, Context7 library docs,
and Obsidian project notes. One response replaces five tool calls.

Includes 9 launch-critical fixes (perf + accuracy + correctness),
provider cache in SQLite, parallel resolver with budget/timeout safety,
and SessionStart cache warmup. 520 tests, all passing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…keywords

From repo audit:
- package.json: added repository/homepage/bugs URLs (was missing — npm
  page had no source links)
- package.json: updated description to Context Spine positioning
- package.json: replaced "ast" keyword with "context-spine", "context-providers"
- package.json: added CHANGELOG.md to files array
- README: replaced stale v0.2-era roadmap with shipped/next sections
- README: removed "v0.3 Sentinel" from Quickstart heading
- Version bump to 0.5.1

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… test counts

From repo audit — factual accuracy sweep:
- README: import example uses "engramx" not "engram" (critical — wrong package name)
- README: comparison table "AST extraction" → "Heuristic extraction"
- README: "AST rebuild" → "graph rebuild" in git hooks section
- docs/engram-user-manual.html: all 4 tree-sitter references replaced with
  "heuristic extraction" (was claiming tree-sitter which isn't implemented)
- docs/engram-integration-guide.html: same tree-sitter fix + test count 486→520
- Removed orphaned RELEASE-NOTES-v0.3.0.md (stale, only file for one version)
- Version bump to 0.5.2

We don't mislead users. Regex heuristics at 0.85 confidence is honest.
Tree-sitter is planned, not shipped.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
sql.js exposes getRowsModified() at runtime but the @types/sql.js
type definitions don't include it, causing TS2339 on CI typecheck.
Use SQLite's changes() function via db.exec() instead.
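
A minimal sketch of the workaround, written against a narrow interface rather than the real sql.js types. `ExecOnlyDb` and `rowsModified` are hypothetical names; the result shape (`columns` plus `values` rows) follows what sql.js returns from `exec()`.

```typescript
// Only the slice of the sql.js Database surface this helper needs.
interface QueryExecResult { columns: string[]; values: unknown[][] }
interface ExecOnlyDb { exec(sql: string): QueryExecResult[] }

// Ask SQLite itself how many rows the last statement changed,
// instead of calling the untyped getRowsModified().
function rowsModified(db: ExecOnlyDb): number {
  const result = db.exec("SELECT changes() AS n");
  const value = result[0]?.values[0]?.[0];
  return typeof value === "number" ? value : 0;
}
```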

Fixes all 4 CI matrix failures (ubuntu/windows × node 20/22).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add MAX_DEPTH=100 to prevent stack overflow on deep directory trees
- Wrap readdirSync in try-catch to skip unreadable directories
- Add .engramignore support for custom exclusions
- Expand default exclusions (target, .venv, .next, .nuxt, .output, coverage, .turbo, .cache)
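
A minimal sketch of the depth-limited walk, assuming a directory reader is injected so the example runs without touching the filesystem. `ReadDir`, `DirEntry`, and the exclusion set are illustrative; in the real miner the reader would be `readdirSync`.

```typescript
const MAX_DEPTH = 100; // limit from the commit message

interface DirEntry { name: string; isDirectory: boolean }
type ReadDir = (dir: string) => DirEntry[]; // may throw, like fs.readdirSync

function walk(
  root: string,
  readDir: ReadDir,
  excluded: ReadonlySet<string>,
  maxDepth: number = MAX_DEPTH,
): string[] {
  const files: string[] = [];
  const visit = (dir: string, depth: number): void => {
    if (depth > maxDepth) return; // stop descending instead of recursing further
    let entries: DirEntry[];
    try {
      entries = readDir(dir);
    } catch {
      return; // unreadable directory: skip it and keep walking siblings
    }
    for (const entry of entries) {
      if (excluded.has(entry.name)) continue;
      const full = `${dir}/${entry.name}`;
      if (entry.isDirectory) visit(full, depth + 1);
      else files.push(full);
    }
  };
  visit(root, 0);
  return files;
}
```

Names loaded from a `.engramignore` file could simply be merged into `excluded` before the walk starts; how the real implementation parses that file is not shown here.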
…nking

Two orthogonal improvements to the resolver's assembly pipeline. Both
exported from resolver.ts so they're testable in isolation, and both
run in the main resolveRichPacket() flow before the final priority sort.

1. PER-PROVIDER BUDGET ENFORCEMENT (enforcePerProviderBudget)

Providers are SUPPOSED to self-truncate their content to 'tokenBudget',
but a bad plugin or a non-conforming MCP server shouldn't be able to
spend our entire total budget on one section. New helper truncates
each result to the provider's declared budget BEFORE assembly.

- Under-budget content passes through unchanged (zero-cost)
- Over-budget content is line-truncated (never cut mid-word)
- Edge: first line alone > budget -> hard-cap characters with marker

Default budget for unknown/missing providers is 200 tokens (matches
the MCP-config default from item NickCirv#1).
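
The truncation rules above can be sketched as follows. The 4-characters-per-token estimate and the marker text are assumptions for illustration, not the resolver's actual constants.

```typescript
const DEFAULT_BUDGET_TOKENS = 200; // default stated in the commit message
const MARKER = "[truncated]";      // hypothetical marker text

// Assumed token estimate: roughly 4 characters per token.
const estimateTokens = (text: string): number => Math.ceil(text.length / 4);

function truncateToBudget(
  content: string,
  budgetTokens: number = DEFAULT_BUDGET_TOKENS,
): string {
  if (estimateTokens(content) <= budgetTokens) return content; // under budget: unchanged

  const maxChars = budgetTokens * 4;
  const lines = content.split("\n");
  const first = lines[0] ?? "";

  // Edge case: the first line alone is over budget, so hard-cap characters.
  if (first.length > maxChars) return first.slice(0, maxChars) + MARKER;

  // Normal case: keep whole lines until the next one would exceed the budget.
  const kept: string[] = [];
  let used = 0;
  for (const line of lines) {
    const cost = line.length + (kept.length > 0 ? 1 : 0); // +1 for the joining newline
    if (used + cost > maxChars) break;
    kept.push(line);
    used += cost;
  }
  return kept.join("\n") + "\n" + MARKER;
}
```

Note that in this sketch the marker itself is appended after the cut, so the returned string can exceed the budget by the marker's length.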

2. MISTAKES-BOOST RERANKING (boostByMistakes)

If the engram:mistakes provider fires for this file, scan OTHER
providers' content for substring matches against mistake labels
(extracted from the '  ! <label> (flagged <age>)' format). Matching
results get confidence * 1.5 (capped at 1.0).

Runs BEFORE the priority sort, but the secondary sort is now
(priority asc, confidence desc) — so boost breaks ties WITHIN a
priority tier without overriding priority across tiers.

- Case-insensitive matching (labels normalized to lowercase)
- Does NOT boost the mistakes provider itself
- No-op if no mistakes are reported for this file (common case)

Examples of the intended effect:
- An engram:git commit message mentioning a known-broken function
  sorts UP within the git tier
- A mempalace decision that references a mistaken architectural
  choice bubbles ahead of unrelated decisions
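
A rough sketch of the boost and the two-key sort described above. The `ProviderResult` shape and the label regex are assumptions based on the format quoted in this message.

```typescript
// Hypothetical result shape, for illustration only.
interface ProviderResult {
  provider: string;
  priority: number;   // lower sorts first
  confidence: number; // 0..1
  content: string;
}

const MISTAKES_PROVIDER = "engram:mistakes";

// Pull labels out of lines shaped like "  ! <label> (flagged <age>)".
function extractMistakeLabels(content: string): string[] {
  const labels: string[] = [];
  for (const line of content.split("\n")) {
    const m = /^\s*!\s+(.+?)\s+\(flagged .*\)\s*$/.exec(line);
    if (m && m[1]) labels.push(m[1].toLowerCase());
  }
  return labels;
}

function boostByMistakes(results: ProviderResult[]): ProviderResult[] {
  const mistakes = results.find((r) => r.provider === MISTAKES_PROVIDER);
  if (!mistakes) return results; // common case: nothing to do
  const labels = extractMistakeLabels(mistakes.content);
  if (labels.length === 0) return results;

  return results.map((r) => {
    if (r.provider === MISTAKES_PROVIDER) return r; // never self-boost
    const haystack = r.content.toLowerCase();
    const hit = labels.some((label) => haystack.indexOf(label) !== -1);
    return hit ? { ...r, confidence: Math.min(1, r.confidence * 1.5) } : r;
  });
}

// Priority ascending first; confidence descending only breaks ties within a tier.
function sortResults(results: ProviderResult[]): ProviderResult[] {
  return results.slice().sort(
    (a, b) => a.priority - b.priority || b.confidence - a.confidence,
  );
}
```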

TESTS (+10 cases in tests/providers/resolver.test.ts)

enforcePerProviderBudget:
  - Under-budget untouched
  - Over-budget truncated by line with marker
  - Hard-cap when first line alone exceeds budget
  - Default 200 tokens when provider not found

boostByMistakes:
  - No-op when no mistakes provider in set
  - Matching substring boosts confidence 0.6 -> 0.9
  - Cap enforced (0.8 * 1.5 = 1.2 -> 1.0)
  - Non-matching results left alone
  - Mistakes provider itself is never self-boosted
  - Case-insensitive matching across upper/lower case variations

Full suite: 815 -> 825 tests (+10), all passing. TypeScript clean.

V3.0 PROGRESS: 8 of 12 scope items done.
  ✅ NickCirv#1 foundation ✅ NickCirv#2 NickCirv#3 NickCirv#6 NickCirv#7 NickCirv#9 NickCirv#10 NickCirv#11
  Remaining: NickCirv#4 Auto-Memory (blocked on MEMORY.md fixture), NickCirv#5 SSE
  streaming, NickCirv#8 pre-mortem warnings, NickCirv#12 MCP Registry submit, and
  NickCirv#1 completion (HTTP transport + real-server integration tests).
Opt-in warnings that fire BEFORE Claude Code runs an Edit/Write/Bash
tool call against code previously flagged as a mistake. Fully gated
via ENGRAM_MISTAKE_GUARD env var — zero overhead when unset.

MODES

  unset / '0' → off (default — no database read, no overhead)
  '1'         → permissive: tool proceeds, a warning is prepended
                to any additionalContext the primary handler emits
  '2'         → strict:     tool is denied with the warning as reason

Hooks Edit/Write/Bash only. Read already surfaces mistakes via the
engram:mistakes context provider — duplicating at tool-call time would
be noise.
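
The three modes can be sketched as a pure function of the env value plus a wrapper around the primary handler's result. `HandlerResult` is a hypothetical shape; the real handler types may differ.

```typescript
type GuardMode = "off" | "permissive" | "strict";

// Takes the raw env value so callers can read it at call time, not module load.
function parseGuardMode(raw: string | undefined): GuardMode {
  if (raw === "1") return "permissive";
  if (raw === "2") return "strict";
  return "off"; // unset, "0", and unrecognized values
}

// Hypothetical result shape, for illustration only.
interface HandlerResult {
  decision: "allow" | "deny";
  reason?: string;
  additionalContext?: string;
}

function applyGuard(
  result: HandlerResult,
  warning: string | null,
  mode: GuardMode,
): HandlerResult {
  if (mode === "off" || warning === null) return result; // no-op
  if (mode === "strict") return { decision: "deny", reason: warning };
  // Permissive: the tool proceeds, with the warning prepended to any existing context.
  const context = result.additionalContext
    ? `${warning}\n\n${result.additionalContext}`
    : warning;
  return { ...result, additionalContext: context };
}
```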

MATCHING

Edit/Write:
  - Normalize tool_input.file_path to relative POSIX vs projectRoot
  - Indexed lookup via store.getNodesByFile() (uses idx_nodes_source_file)
  - Dedupe by node id when both relative + raw shapes are stored

Bash:
  - Substring match on mistake.metadata.commandPattern (length >2)
  - Fallback: substring match on mistake.sourceFile (length >3 to avoid
    accidentally matching single-char paths like 'a')
  - Full-table scan of mistakes (unavoidable — no file axis to index on).
    Bounded by project size; only runs when the guard is explicitly on.

BI-TEMPORAL FILTER (item NickCirv#7 interop)

Mistakes with validUntil <= now are suppressed — they refer to code
that has since been refactored away. Prevents stale-warning fatigue.
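
The Bash matching rules and the validUntil filter can be sketched together. The `Mistake` shape is an assumption built from the field names mentioned above, and `validUntil` is treated here as an epoch-millisecond number.

```typescript
// Hypothetical mistake record, for illustration only.
interface Mistake {
  label: string;
  sourceFile: string;
  validUntil?: number; // epoch ms; absent means still valid
  metadata?: { commandPattern?: string };
}

// Bi-temporal filter: mistakes with validUntil <= now are suppressed.
function isCurrent(m: Mistake, now: number): boolean {
  return m.validUntil === undefined || m.validUntil > now;
}

function matchBashCommand(
  command: string,
  mistakes: readonly Mistake[],
  now: number,
): Mistake[] {
  const haystack = command.toLowerCase();
  return mistakes.filter((m) => {
    if (!isCurrent(m, now)) return false;
    // First choice: substring match on the recorded command pattern (length > 2).
    const pattern = m.metadata?.commandPattern?.toLowerCase();
    if (pattern !== undefined && pattern.length > 2 && haystack.indexOf(pattern) !== -1) {
      return true;
    }
    // Fallback: the mistake's source file appears in the command (length > 3).
    const file = m.sourceFile.toLowerCase();
    return file.length > 3 && haystack.indexOf(file) !== -1;
  });
}
```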

INTEGRATION

New file: src/intercept/handlers/mistake-guard.ts
  - currentGuardMode() — reads env var at call time, not module load,
    so tests can flip between cases cleanly
  - findMatchingMistakesAsync(target, projectRoot) — the matcher
  - formatWarning(matches) — human-readable warning block
  - applyMistakeGuard(rawResult, payload, kind) — wrapping fn that
    augments additionalContext (permissive) or overrides to deny (strict)

src/intercept/dispatch.ts wiring: after runHandler() returns for Edit/
Write/Bash, pass result through applyMistakeGuard() before returning.
Two-line diff. Doesn't touch the existing handlers.

SAFETY

Every code path in mistake-guard is wrapped in try/catch with a null
return. A guard failure MUST NEVER break the primary handler. If the
store open fails, the env var is wrong, the payload is malformed —
guard silently returns the raw result unchanged.

TESTS (+21 cases in tests/intercept/handlers/mistake-guard.test.ts)

  - currentGuardMode: off/permissive/strict recognition, bogus values
    coerced to off
  - formatWarning: empty-match string, single-match header, >5-match
    collapse with '… and N more'
  - findMatchingMistakesAsync (file): rel path, abs path normalization,
    no-match, validUntil filter
  - findMatchingMistakesAsync (bash): commandPattern substring match,
    sourceFile-in-command match, case-insensitive, too-short pattern
    guard, validUntil filter
  - applyMistakeGuard: mode=off no-op, permissive augments additional
    context, permissive no-match no-op, strict denies with reason,
    permissive from passthrough emits fresh allow-with-warning

Full suite: 825 -> 846 tests (+21), all passing. TypeScript clean.

V3.0 PROGRESS — 9 of 12 scope items

  ✅ NickCirv#1 foundation  ✅ NickCirv#2 NickCirv#3 NickCirv#6 NickCirv#7 NickCirv#8 NickCirv#9 NickCirv#10 NickCirv#11

Remaining:
  - NickCirv#1 completion (HTTP transport + real-server integration tests)
  - NickCirv#4 Anthropic Auto-Memory bridge (blocked: needs MEMORY.md fixture)
  - NickCirv#5 SSE streaming for rich packet assembly
  - NickCirv#12 Official MCP Registry submission (post-ship)
Reads Claude Code's auto-managed MEMORY.md index and surfaces entries
relevant to the current file. Closes the Auto-Dream existential risk:
when Anthropic flips the server flag and MEMORY.md becomes consolidated-
and-high-quality, this bridge lights up with no code change.

FIXTURE CAPTURE

Real MEMORY.md samples live at ~/.claude/projects/<encoded>/memory/ on
every Claude Code machine. Captured a representative sample into
tests/fixtures/memory-md/sample-index.md so integration tests don't
depend on the user's actual memory directory.

CANONICAL FORMAT (from real fixtures)

  - [Title](relative-file.md) — one-line description

Flat bullet list, one entry per line. Em-dash OR en-dash OR hyphen-
space all accepted. Linked .md files contain frontmatter + body —
this provider is INDEX-ONLY (doesn't dereference bodies) so it stays
under 10 ms even on large memory sets.
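
A minimal sketch of an index-only parser for the bullet format above. The regex is an assumption derived from the stated format (em dash, en dash, or hyphen-space as the separator); the real `parseMemoryIndex` may be stricter or looser.

```typescript
interface MemoryEntry { title: string; file: string; description: string }

// "- [Title](file.md) <sep> description", where <sep> is an em dash (U+2014),
// an en dash (U+2013), or a hyphen followed by a space. The description is optional.
const ENTRY = /^\s*-\s+\[([^\]]+)\]\(([^)]+)\)(?:\s*(?:\u2014|\u2013|-)\s+(.*))?\s*$/;

function parseMemoryIndex(text: string): MemoryEntry[] {
  const entries: MemoryEntry[] = [];
  for (const line of text.split(/\r?\n/)) {
    const m = ENTRY.exec(line);
    if (!m) continue; // malformed line: skip it
    entries.push({
      title: m[1] ?? "",
      file: m[2] ?? "",
      description: (m[3] ?? "").trim(),
    });
  }
  return entries;
}
```

Because only the index line is parsed and linked files are never opened, the cost grows with the number of lines in MEMORY.md rather than with the size of the memory directory.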

PATH DERIVATION

encodeProjectPath('/Users/alice/proj') -> '-Users-alice-proj'
getMemoryIndexPath(projectRoot) -> ~/.claude/projects/<encoded>/memory/MEMORY.md

Overridable via ENGRAM_ANTHROPIC_MEMORY_PATH env var for tests and
for advanced users who maintain a manual index.
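
The encoding step can be sketched as below. This covers only the behaviour stated in this message (separators become hyphens, trailing separators are trimmed); how drive letters or other characters are handled is not specified here and is left untouched.

```typescript
// Sketch only: replaces path separators with "-" after trimming trailing ones.
function encodeProjectPath(projectRoot: string): string {
  const trimmed = projectRoot.replace(/[\\/]+$/, ""); // drop trailing separators
  return trimmed.replace(/[\\/]/g, "-");
}
```

For example, `encodeProjectPath('/Users/alice/proj/')` gives the same `'-Users-alice-proj'` as the path without the trailing slash.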

RELEVANCE SCORING

  +3  title contains file basename (sans extension)
  +2  description contains file basename
  +2  any import name appears in title or description (length ≥3)
  +1  any path segment appears in title or description (length ≥3)

Top 3 matches with score >0 are returned; no matches = null.
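
The scoring table can be sketched as below. It assumes the four bonuses are additive and that each "any ..." bonus applies at most once; both points are assumptions, as is the `FileContext` shape. This sketch returns an empty array where the provider described above returns null.

```typescript
interface IndexEntry { title: string; description: string }
interface FileContext { filePath: string; imports: string[] } // hypothetical shape

function scoreEntry(entry: IndexEntry, ctx: FileContext): number {
  const title = entry.title.toLowerCase();
  const description = entry.description.toLowerCase();
  const both = `${title} ${description}`;

  // Split on either separator so Windows-style paths also work.
  const segments = ctx.filePath.toLowerCase().split(/[\\/]/).filter(Boolean);
  const fileName = segments.pop() ?? "";
  const basename = fileName.replace(/\.[^.]+$/, ""); // strip the extension

  let score = 0;
  if (basename && title.indexOf(basename) !== -1) score += 3;
  if (basename && description.indexOf(basename) !== -1) score += 2;
  if (ctx.imports.some((i) => i.length >= 3 && both.indexOf(i.toLowerCase()) !== -1)) score += 2;
  if (segments.some((s) => s.length >= 3 && both.indexOf(s) !== -1)) score += 1;
  return score;
}

// Keep the top `limit` entries with a positive score.
function topMatches<T extends IndexEntry>(entries: T[], ctx: FileContext, limit = 3): T[] {
  return entries
    .map((entry) => ({ entry, score: scoreEntry(entry, ctx) }))
    .filter((x) => x.score > 0)
    .sort((a, b) => b.score - a.score)
    .slice(0, limit)
    .map((x) => x.entry);
}
```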

INTEGRATION

  - New provider wired into BUILTIN_PROVIDERS (src/providers/resolver.ts)
  - Inserted at PROVIDER_PRIORITY index 3, between engram:mistakes
    (+2) and mempalace (+4). Rationale: own-curated memory > shared
    semantic memory when both are available.

SAFETY

  - MAX_INDEX_BYTES = 1 MB hard cap (pathological files return null)
  - Empty files return null (never a noise packet)
  - All errors caught -> null return (never throws into resolver path)

TESTS (+24 cases in tests/providers/anthropic-memory.test.ts)

  encodeProjectPath: standard path, trailing-slash trim, Windows
    separator normalize, deep path preservation
  getMemoryIndexPath: ends at the right path shape
  parseMemoryIndex: well-formed index, malformed-line skip, empty-
    content empty array, missing-description tolerated
  scoreEntry: basename match (+3), import match (+2), zero on no
    relationship, case-insensitive
  resolve: missing file null, empty file null, no-match null, basename
    match surfaces, caps at 3, over 1 MB skipped, override wins,
    imports drive matches
  isAvailable: default true (defers per-project), override exists true,
    override missing false

Also updates tests/providers/resolver.test.ts — PROVIDER_PRIORITY
order test picks up the new index 3 slot.

Full suite: 846 -> 870 tests (+24), all passing. TypeScript clean.

V3.0 PROGRESS — 10 of 12 scope items done.
Remaining: NickCirv#5 SSE streaming + NickCirv#1 completion (HTTP transport + real MCP
server fixture) + NickCirv#12 registry submit (post-ship).
Adds progressive delivery for rich packet assembly. Instead of blocking
on Promise.allSettled (which waits for the slowest provider — Serena
cold-start, mempalace ChromaDB warmup), clients can stream results
as they arrive and render each section immediately.

NEW — resolveRichPacketStreaming generator (src/providers/resolver.ts)

AsyncGenerator<StreamEvent> that yields:
  { type: 'provider', result: ProviderResult }  — as each resolves
  { type: 'done', providerCount, durationMs }  — final totals

Order = ARRIVAL order (fast providers first). Consumers who want
priority order use the non-streaming resolveRichPacket() which applies
full priority + mistakes-boost + budget logic.

Implementation: fan-out all providers, funnel outcomes into a FIFO
queue + wake-on-arrival pattern. No extra deps. Per-provider timeouts
preserved (same resolveWithTimeout path as non-streaming).
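
The fan-out plus FIFO queue plus wake-on-arrival pattern can be sketched as follows. `streamResults` and the trimmed-down `ProviderResult` are hypothetical; per-provider timeouts are omitted and are assumed to live inside each task.

```typescript
interface ProviderResult { provider: string; content: string }
type StreamEvent =
  | { type: "provider"; result: ProviderResult }
  | { type: "done"; providerCount: number; durationMs: number };

async function* streamResults(
  tasks: Array<() => Promise<ProviderResult | null>>,
): AsyncGenerator<StreamEvent> {
  const started = Date.now();
  const queue: Array<ProviderResult | null> = []; // FIFO of settled outcomes
  let wake: (() => void) | null = null;

  const push = (outcome: ProviderResult | null): void => {
    queue.push(outcome);
    if (wake) { wake(); wake = null; } // wake the consumer if it is waiting
  };

  // Fan out: start every provider now; failures become null outcomes.
  for (const task of tasks) {
    Promise.resolve().then(task).then(push, () => push(null));
  }

  let settled = 0;
  let delivered = 0;
  while (settled < tasks.length) {
    if (queue.length === 0) {
      await new Promise<void>((resolve) => { wake = resolve; });
    }
    while (queue.length > 0) {
      const outcome = queue.shift() ?? null;
      settled++;
      if (outcome) {
        delivered++;
        yield { type: "provider", result: outcome }; // arrival order
      }
    }
  }
  yield { type: "done", providerCount: delivered, durationMs: Date.now() - started };
}
```

A consumer would iterate with `for await (const event of streamResults(tasks))` and render each `provider` event as soon as it appears.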

NEW — /context/stream SSE endpoint (src/server/http.ts)

GET /context/stream?file=<relative-path> (auth required).
Emits one SSE frame per StreamEvent. Frame shape matches MCP SEP-1699
(SSE resumption):

  id: 0
  event: provider
  data: {"provider":"engram:ast", …}

  id: 1
  event: provider
  data: {"provider":"engram:mistakes", …}

  id: N
  event: done
  data: {"providerCount":N,"durationMs":347}

Supports Last-Event-ID header — clients reconnecting via
'Last-Event-ID: 3' skip events 0-3 and pick up from 4. Useful for
long-running sessions that drop WiFi mid-stream without losing context.
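
A minimal sketch of frame formatting and Last-Event-ID resumption. It assumes event IDs are sequential integers starting at 0, as in the frames shown above, and that the server can re-derive the full event list on reconnect; both helper names are hypothetical.

```typescript
// One SSE frame: id, event, and a single-line JSON data field, ended by a blank line.
function formatSseFrame(id: number, event: string, data: unknown): string {
  return `id: ${id}\nevent: ${event}\ndata: ${JSON.stringify(data)}\n\n`;
}

// Skip everything up to and including the client's Last-Event-ID.
function framesAfter(
  events: Array<{ event: string; data: unknown }>,
  lastEventId: string | undefined,
): string[] {
  const parsed = lastEventId === undefined ? NaN : Number.parseInt(lastEventId, 10);
  const start = Number.isNaN(parsed) ? 0 : parsed + 1;
  return events
    .map((e, id) => ({ id, frame: formatSseFrame(id, e.event, e.data) }))
    .filter((x) => x.id >= start)
    .map((x) => x.frame);
}
```

With six events and `Last-Event-ID: 3`, this sketch emits only the frames with ids 4 and 5. A missing or unparseable header replays from id 0.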

Client-disconnect aborts the stream cleanly (req.close handler short-
circuits the generator loop).

TESTS (+6 new)

resolver.test.ts (+2):
  - Smoke: streaming generator terminates with a 'done' event for any
    project (no hang, no runaway)
  - Arrival-order invariant: toy generator mirrors production shape,
    verifies fast results yield before slow ones

server/http.test.ts (+4):
  - Missing 'file' param returns 400
  - Valid request returns 200 + text/event-stream + ends with 'done'
  - Every frame carries an 'id:' header (SEP-1699 resumption)
  - Auth required — unauthenticated returns 401

Full suite: 870 -> 876 tests (+6), all passing. TypeScript clean.

V3.0 PROGRESS — 11 of 12 scope items done

  ✅ NickCirv#1 foundation  ✅ NickCirv#2 NickCirv#3 NickCirv#4 NickCirv#5 NickCirv#6 NickCirv#7 NickCirv#8 NickCirv#9 NickCirv#10 NickCirv#11

Only remaining in-scope work:
  - NickCirv#12 MCP Registry submission (~2h, post-ship only)

Plus item NickCirv#1 completion (HTTP transport + minimal MCP server fixture
for integration tests) — technically part of NickCirv#1 which shipped its
foundation as c719591; the HTTP transport path was explicitly deferred
until this SSE work landed. Now it can.
The existing bench/runner.ts uses YAML-estimated costs (useful for CI
regression tracking but not an end-to-end proof). This new real-world
bench runs the FULL resolver pipeline against actual files in the repo
and compares rich-packet tokens to raw-file-read tokens.

METHODOLOGY (honest arithmetic)

For each file in the repo:
  baselineTokens = ceil(file.length / 4)           — cost if the agent
                                                      just Read() it
  engramTokens = resolveRichPacket().estimatedTokens
                                                   — cost of the rich
                                                      packet that replaces
                                                      the Read
  savings% = (baseline - engram) / baseline * 100

Aggregate = (sum baseline - sum engram) / sum baseline * 100.

LATEST RUN — 2026-04-24 on 30 real engramx source files

  Baseline tokens:     67,435
  engramx tokens:       6,185
  Aggregate savings:    90.8%
  Median per-file:      85.5%
  Wins:                 29 of 30
  Best case:            98.4% (src/cli.ts: 18,820 → 306 tokens)
  Target (>= 80%):      PASS
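
The arithmetic in the methodology can be checked against the totals reported above with a few lines. The function names are illustrative and are not taken from bench/real-world.ts.

```typescript
// Baseline cost of reading a file raw: about one token per 4 characters.
const baselineTokens = (fileText: string): number => Math.ceil(fileText.length / 4);

const savingsPercent = (baseline: number, engram: number): number =>
  baseline === 0 ? 0 : ((baseline - engram) / baseline) * 100;

// Aggregate uses summed tokens, not the mean of per-file percentages.
function aggregateSavings(rows: Array<{ baseline: number; engram: number }>): number {
  const sumBaseline = rows.reduce((s, r) => s + r.baseline, 0);
  const sumEngram = rows.reduce((s, r) => s + r.engram, 0);
  return savingsPercent(sumBaseline, sumEngram);
}

console.log(savingsPercent(67435, 6185).toFixed(1)); // "90.8" (aggregate totals above)
console.log(savingsPercent(18820, 306).toFixed(1));  // "98.4" (src/cli.ts best case)
```

Summing before dividing means large files weigh more than small ones, which is why the aggregate (90.8%) and the per-file median (85.5%) can differ.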

Committed reports in bench/results/:
  real-world-2026-04-24.json — machine-readable, full per-file data
  real-world-2026-04-24.md   — human-readable summary table

README UPDATE

Replaces the stale '88.1% measured' badge with '90.8% measured' and
adds a 'Proof, not promises' section that shows the methodology + real
numbers + reproduce-on-your-code instructions.

REPRODUCIBILITY

  cd ~/engram
  npx tsx bench/real-world.ts --files 30

  cd any-other-project
  engram init
  npx tsx ~/engram/bench/real-world.ts --project . --files 50

The bench itself is ~250 lines with no external deps (just tsx). It
walks the repo with the same ignore rules as engramx's miner, skips
tests/bench/node_modules/dist, and handles missing providers cleanly
(baseline tokens still measured; engram side gets 0).

This gives the v3.0 release the ONE thing every skeptical reader asks
for: a reproducible number on a real codebase, not a cherry-picked
toy example.
package.json 2.1.0 -> 3.0.0. Description rewritten to reflect the
v3.0 feature set — extensible MCP-client aggregator + mcpConfig plugin
contract + pre-mortem mistake-guard + bi-temporal mistake memory +
Anthropic Auto-Memory bridge + SSE streaming + AGENTS.md dual emit +
90.8% measured real-world savings.

CHANGELOG.md gains a full [3.0.0] entry following Keep a Changelog
format: Added (3 pillars), Changed (breaking APIs called out),
Migration (v7 -> v8 auto-migration + autogen() return-type change),
Tests (771 -> 876).

Bench refresh: bench/results/real-world-2026-04-24.md rewritten by the
100-file run during release audit (was 30 files before). New numbers:
163,122 baseline tokens -> 17,722 engramx tokens = 89.1% aggregate
savings on 87 files (after skip rules).

AUDIT STATUS — ALL GREEN

  Phase A — build/typecheck/lint/tests            ✅  876/876
  Phase B — CLI smoke (init/doctor/gen/query/…)    ✅  dual-emit verified, broken-config survived
  Phase C — v2.1 -> v3.0 schema migration          ✅  migration 8 clean, backup created, legacy rows preserved
  Phase D — stress (100-file bench, 20x SSE, 10k mistakes) ✅  89.1%, 20/20, 2.04ms/resolve
  Phase E — security + secret scan                 ✅  no secrets in diff, auth gate verified on /context/stream
  Phase F — package sanity + version bump          ✅  3.0.0 published stats match 2.1.0 size envelope (672kB packed)

Ready for PR → main.
NickCirv and others added 10 commits April 24, 2026 19:08
INSTALL.HTML — showcased v3.0, kept aesthetic, fixed OG rendering

Hero:
  - version pill v2.0.2 -> v3.0 'Spine' · shipped 2026-04-24
  - sub-copy mentions 'any MCP server you plug in' — the extensibility pitch
  - terminal block leads with 'engram setup' (shipped v2.1) as the one-command
    flow; init + install-hook + adapter detect + doctor all in one
  - metrics strip: 88.1% -> 89.1% (real-world bench), 670 -> 876 tests,
    8+n -> 9+n providers
  - tagline: 'optional: engram plugin install serena for +LSP symbols'
    teases the plugin ecosystem

New '// v3.0 · what's new' section with 6 feature cards in a responsive
grid (extensibility / mistakes moat / opt-in safety / universal agent spec /
progressive rendering / future-proof). Hover lifts to border-accent.
Amber card symbols, inline code chips with accent color.

New '// plugins · Every plugin you add closes another token leak' section
— 6-row plugin table (Serena / GitHub MCP / Sentry / Supabase / Context7 /
Anthropic Auto-Memory) showing what each plugin closes + how to install.
Plus a 'how a plugin is built' terminal block showing the full 10-line
Serena plugin file. Drives the user's key ask: 'additional plugins will
actually drive more savings'.

Benefits section: table refreshed with real measured numbers (163,122
baseline tokens -> 17,722 engramx tokens, 89.1% saved, $0.49 -> $0.05
per session), new row for 'Stale-warning noise' (v3.0 bi-temporal) and
'Provider ecosystem' (any MCP as 10-line plugin). Section-meta links
to the committed bench report.

IDE coverage section rewritten: leads with 'One engram gen. Every agent
reads it.' — explains AGENTS.md dual-emit. Adds Codex CLI / Copilot Chat /
JetBrains Junie as v3.0 AGENTS.md rows alongside existing IDEs.

FAQ:
  - 88.1% bench entry rewritten to explain the real-world bench methodology
    + link to committed report
  - NEW 'What's new in v3.0' bullet list covering all 6 features
  - Cross-tool support rewritten for AGENTS.md universal standard
  - 'Can I add my own context provider?' rewritten to cover mcpConfig
    auto-wrap (the 10-line plugin path)

Footer: v2.0.2 -> v3.0.0 'Spine'.
Final CTA copy refreshed to cite 89.1% + plugin ecosystem.

NAV: added 'v3.0' and 'Plugins' links.

RENDERING FIX (critical for OG previews + crawlers)

The reveal animation previously started at opacity:0 and relied on
IntersectionObserver + a per-element stagger to fade in. Headless
screenshotters (GitHub OG previews, Twitter cards, the Chrome
--screenshot pipeline) capture a snapshot before JS finishes staggering,
so above-the-fold content appeared EMPTY in social previews.

Fix:
  - CSS default .reveal state is now opacity:1, transform:none (visible)
  - html.js-ready .reveal adds opacity:0 + translateY(14px)
  - Script toggles html.js-ready ONLY when JS + motion-allowed
  - Observer stagger removed (CSS transition already provides the ramp)

Net: page renders fully for crawlers / no-JS / prefers-reduced-motion;
JS adds a subtle fade+slide for users who benefit from it. Verified via
headless Chrome screenshot — all 6 v3.0 cards, hero terminal, metrics
strip, and CTA row render in the first snapshot.

README — warmer for non-devs

New '## I'm not a hardcore developer — what does this actually do?'
section (4-bullet plain-English explanation) placed immediately after
the hero, before the Proof section. Target reader: someone who pays
for Cursor or Claude Code and just wants smaller bills / better AI
results without understanding the architecture.

Hero prose rewritten to lead with outcome ('stops charging you for
the same information twice') before mechanism. Quickstart block
replaces 'engram init && engram install-hook' with 'engram setup'.

v2.0 banner -> v3.0 banner at the top of the file, with the real
89.1% number.

Benchmark section split into 'Real-world bench (new in v3.0,
preferred)' + 'Structured task bench (CI regression)' so the new
bench/real-world.ts story leads.

NEW '## Plugins multiply the savings' section between benchmark and
'What It Does' — same plugin table as install.html (Serena / GitHub /
Sentry / Supabase / Context7 / Auto-Memory). Single sentence per
plugin showing what gap it closes.

'What It Does' updated: 8 providers -> 9 providers table (adds
anthropic:memory row between mistakes and git). Closing sentence
mentions the 10-line plugin path.

Misc: 'Rich packets from all 8 providers' -> '9 built-ins + any MCP
plugin' in the How-It-Compares row.

RESULT

Both docs now tell the same v3.0 story — 89.1% measured, extensible
ecosystem, and a non-developer can read the first 200 lines of the README
and understand the value prop without jargon.
ROOT CAUSE

tests/providers/anthropic-memory.test.ts:59 used a regex assertion
built with forward-slashes:

  expect(path).toMatch(/\.claude\/projects\/-Users-a…\/MEMORY\.md$/);

The implementation uses path.join() which on Windows produces native
backslash separators (C:\Users\runneradmin\.claude\projects\…).
The test only passed on POSIX. Windows-latest × Node 20+22 = 2 failing
jobs. Ubuntu-latest × Node 20+22 = 2 green jobs. Local macOS audit
could not catch this.

This is the SAME class of failure we logged from v2.1.0 (Windows path
bug caught post-CI). The lesson was not honored when writing item NickCirv#4.

FIX — test

tests/providers/anthropic-memory.test.ts
  - Build the expected path via the SAME path.join() call the
    implementation uses. toBe equality replaces the regex.
  - Result: identical assertion works on POSIX (/) and Windows (\).

FIX — defence in depth on related call sites

src/providers/anthropic-memory.ts (scoreEntry)
  - basename = ctx.filePath.split("/").pop() → split(/[\\/]/).pop()
  - Matches the pre-existing segments split style. Removes
    inconsistency in the same function (line 119 vs 120).

src/providers/mcp-config.ts (applyArgTemplate)
  - Same treatment on the fileBasename fallback.
  - NodeContext.filePath is contract-POSIX, so both sites were safe
    in practice — but a plugin author passing a raw tool_input path
    would have silently corrupted basename extraction on Windows.
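
The difference between the two split styles is easy to see in isolation. `basenameAnySeparator` is a hypothetical name for the hardened form.

```typescript
// Old form: only splits on "/", so a backslash path comes back whole.
function basenamePosixOnly(filePath: string): string {
  return filePath.split("/").pop() ?? "";
}

// Hardened form: splits on either separator.
function basenameAnySeparator(filePath: string): string {
  return filePath.split(/[\\/]/).pop() ?? "";
}

console.log(basenamePosixOnly("src\\auth\\login.ts"));    // "src\auth\login.ts"
console.log(basenameAnySeparator("src\\auth\\login.ts")); // "login.ts"
```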

REGRESSION GATES (prevent recurrence locally)

Two new test cases exercise native Windows paths explicitly:
  - anthropic-memory.test.ts: scoreEntry("src\\auth\\login.ts") > 0
  - mcp-config.test.ts: applyArgTemplate Windows path → basename "auth.ts"

If anyone reverts the split(/[\\/]/) hardening, these tests fail on
macOS immediately. No more silent-pass-on-macOS, fail-on-Windows.

TESTS

876 → 878 passing (+2 regression cases). TypeScript clean.
Expected CI result: all 4 jobs green on next run.
…phasis

Complete GitHub-presentation refresh for v3.0 'Spine'. Keeps every
aesthetic element; sharpens the story.

BANNER (assets/banner.html + banner.png)

  - Badge: 'AI CODING MEMORY' -> 'CONTEXT SPINE · v3.0 "SPINE"'
  - Wordmark: 'engram' (a highlighted) -> 'engramX' (mX highlighted)
  - Tagline: emphasizes 'cached context spine that remembers — and gets
    richer with every plugin' (user's explicit ask)
  - Terminal block: npm install -g engramx + engram setup (the v2.1
    one-command flow), shows 89.1% measured savings headline
  - Bottom stats bar: 89.1% / 9+plugins / 0 LLM cost · 0 cloud /
    Claude Code · Cursor · Codex · any AGENTS.md agent
  - Knowledge graph visualization preserved as-is — same node shape,
    amber palette, JetBrains Mono labels
  - Re-rendered via headless Chrome into banner.png

README (user's 7 asks hit point-by-point)

  1. 'keep same banner aesthetic' — banner.html aesthetic unchanged,
     only content updated
  2. 'rename to EngramX' — README hero title now 'EngramX — the cached
     context spine for AI coding agents.' Capital-E brand in prose,
     lowercase 'engramx' preserved for npm package name
  3. 'mention all the differences we made' — v3.0 banner block
     expanded with every shipped pillar (extensible, pre-mortem,
     bi-temporal, Auto-Memory, SSE, dual-emit, 89.1%)
  4. 'easy for end users to follow and install' — one-command
     'engram setup' surfaced as THE install, with plain-language
     explanation of what it does
  5. 'ground breaking upgrade with massive saving' — 89.1% lead, plus
     the phrase 'every plugin you add elevates the savings further'
  6. 'looking in cached memory' — new sentence names the THREE layers
     of cache explicitly (knowledge graph, per-provider SQLite cache,
     in-memory LRU). Ties to the spine metaphor.
  7. 'additional tools and repos that elevate saving / emphasize on
     the spine' — lead sentence now 'EngramX is the spine.' Plugin-
     multiplier paragraph surfaced in the hero block, not just in the
     deeper section

CONTRIBUTING.md — full v3.0 rewrite

  - Brand updated to EngramX
  - 'Highest-impact contributions' reordered — worked examples first,
    reproducible bench results second, plugin submissions third
  - Development loop commands include bench/real-world.ts as a sanity
    check
  - **Windows-first discipline codified as a PR gate** — step 4 of
    'Before you open a PR' now reads: 'If you touched anything that
    builds a filesystem path, assert with path.join() / path.resolve(),
    never hand-write / separators. We shipped a Windows-CI regression
    on v3.0's first pass because of this.'
  - Code style rule added: every test exercising filesystem paths must
    include a Windows-native-path case locally
  - Plugin author section added — points to the 2 reference plugins
    and the 'how to submit a plugin' 3-step flow

GITHUB REPO METADATA (via gh repo edit)

  - Description rewritten: 'EngramX — the cached context spine for AI
    coding agents. 9 built-in providers + any MCP server as a 10-line
    plugin, pre-mortem mistake-guard, bi-temporal memory, Anthropic
    Auto-Memory bridge, SSE streaming packets, dual-emit AGENTS.md+
    CLAUDE.md. 89.1% measured real-world token savings, local SQLite,
    zero cloud.'
  - Topics: removed 'continue-dev' and 'engram-context-protocol'
    (stale), added 'agent-memory', 'engramx', 'agents-md'
  - Now at 18 / 20 topic cap — room to grow

TESTS

  878 / 878 green. TypeScript clean. No code changes, only docs +
  branding + banner asset.
Root cause: GitHub's camo.githubusercontent.com image proxy caches
README image URLs aggressively. The v3.0 banner.png was pushed in
commit fa45e49 (bytes verified on disk — engramX wordmark, CONTEXT
SPINE v3.0 badge, 89.1% savings, engram setup terminal), but GitHub
kept serving the v2 banner from its CDN cache.

Fix: rename the asset to assets/banner-v3.png and update the README
<img src> to point at the new URL. camo treats it as a fresh URL and
fetches the updated file.

Also updates docs/install.html OG image meta tag so Twitter / LinkedIn
/ Slack social previews pick up the new banner.

Old assets/banner.png kept in tree for backward compatibility with
any existing link in the wild (blog posts, tweets). Identical to
banner-v3.png byte-for-byte — both files are the correct v3.0
rendering.

User-facing: next git push -> next GitHub README render uses the
new URL, no cache to bust.
Previous v3 render had 'm' + 'X' in orange, which broke symmetry with
the original v2 wordmark (only 'a' was orange — engr[a]m).

Correct pattern: the same 'a' stays orange from v2 continuity, and the
new 'X' joins it. Net: engr[a]m[X] — only the highlighted letters shift
to the accent color. Keeps the brand continuity + names v3 distinctly.

Both assets/banner.png and assets/banner-v3.png refreshed with the
corrected render. MD5: 6793ecb9d6f109be2e714432a672bf74.
v3.0.0 "Spine" — extensible MCP aggregator, mistakes moat, 89.1% measured savings
Machine-readable server spec following the 2025-12-11 schema.

Package: engramx@3.0.0 (npm)
Namespace: io.github.NickCirv/engram (requires GitHub OAuth as NickCirv)
Transport: stdio via runtime binary 'engram-serve' (already bundled as a
  bin in package.json — users get it for free on 'npm install -g engramx')

Declared env vars (all optional):
  - ENGRAM_API_TOKEN — HTTP auth bearer (auto-generated if unset)
  - ENGRAM_MISTAKE_GUARD — pre-mortem warnings: '1' permissive, '2' strict
  - ENGRAM_ANTHROPIC_MEMORY_PATH — override for Auto-Memory MEMORY.md path
  - ENGRAM_MCP_CONFIG_PATH — override for mcp-providers.json
  - ENGRAM_NO_UPDATE_CHECK — disable passive update notice

Submit flow:
  cd ~/engram
  mcp-publisher login github     # opens browser for GitHub OAuth as NickCirv
  mcp-publisher publish          # validates + submits to registry

Publisher installed: brew install mcp-publisher (v1.6.0, Go binary).
Registry: https://registry.modelcontextprotocol.io (canonical MCP discovery).
The bug 3.0.0 shipped with:

  npm uninstall -g engramx removed the binary from PATH but left the
  hook entries in ~/.claude/settings.json pointing at 'engram intercept'
  commands that no longer existed. Claude Code fires those hooks on
  every tool call -> hooks exec with ENOENT -> user-visible behavior
  was 'Claude Code stopped executing anything.' Recovery required
  reinstalling engramx just to run engram uninstall-hook before
  uninstalling again. That is a bad experience.

Reported by @freenow82 within hours of 3.0.0 going live.

THE FIX

scripts/preuninstall.mjs — runs automatically before
'npm uninstall -g engramx'. Reads ~/.claude/settings.json, strips every
hook entry whose command contains 'engram' (case-insensitive), drops
engram's statusLine/HUD, backs up the original to a timestamped .bak,
writes the result atomically via rename.
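
The stripping step can be sketched as a pure function over the parsed settings object. The `Settings` shape below is an assumption about how hooks are laid out in settings.json (event name, then matcher groups, then command entries); the backup and atomic rename are not shown.

```typescript
// Assumed layout of the relevant parts of settings.json.
interface HookCommand { type: string; command: string }
interface HookMatcher { matcher?: string; hooks: HookCommand[] }
interface Settings {
  hooks?: Record<string, HookMatcher[]>;
  statusLine?: { type?: string; command?: string };
  [key: string]: unknown; // unrelated keys pass through untouched
}

const mentionsEngram = (command: string | undefined): boolean =>
  (command ?? "").toLowerCase().indexOf("engram") !== -1;

function stripEngramHooks(settings: Settings): { cleaned: Settings; removed: number } {
  let removed = 0;
  const cleaned: Settings = { ...settings };

  if (settings.hooks) {
    const hooks: Record<string, HookMatcher[]> = {};
    for (const event of Object.keys(settings.hooks)) {
      const keptMatchers: HookMatcher[] = [];
      for (const matcher of settings.hooks[event] ?? []) {
        const kept = matcher.hooks.filter((h) => !mentionsEngram(h.command));
        removed += matcher.hooks.length - kept.length;
        // Keep the group only if a non-engram hook survives in it.
        if (kept.length > 0) keptMatchers.push({ ...matcher, hooks: kept });
      }
      if (keptMatchers.length > 0) hooks[event] = keptMatchers;
    }
    cleaned.hooks = hooks;
  }

  if (mentionsEngram(cleaned.statusLine?.command)) delete cleaned.statusLine;
  return { cleaned, removed };
}
```

Keeping this step free of I/O makes it straightforward to test against a fixture, while the surrounding script handles reading, backing up, and writing the file.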

Hard contract: this script NEVER fails the uninstall. Every error
path logs a single-line hint and exits 0. The user's 'npm uninstall'
always succeeds, with or without hook cleanup.

scripts/postinstall.mjs — one-time info banner on 'npm install -g
engramx'. Shows the recommended next step (engram setup) and the
clean-uninstall flow. Respects $CI and $ENGRAM_NO_POSTINSTALL=1.

engram repair-hooks — new CLI alias to 'engram uninstall-hook'. Named
for the word a stranded user would actually search for. No code
duplication (commander.alias()).

package.json — both scripts added to the files allowlist so they ship
in the tarball. version bumped 3.0.0 -> 3.0.1.

README.md — new 'Want out?' subsection with the clean-uninstall
command and the recovery path for users stranded on 3.0.0.

CHANGELOG.md — full [3.0.1] entry documenting the bug, the fix, the
recovery instructions, and the thank-you to @freenow82.

VERIFIED (fixture test)

Realistic settings.json with a mix of:
  - engram hooks (PreToolUse, SessionStart)
  - custom non-engram hooks in the same event arrays
  - engram statusLine (HUD)
  - unrelated top-level keys (otherUserPrefs)

After running the preuninstall script on a fixture copy:
  3 engram hook entries removed
  custom hook 'echo ...my custom hook...' preserved in PreToolUse
  custom SessionStart hook preserved
  unrelated Stop hook preserved (engram never touched it)
  engram statusLine removed
  otherUserPrefs 'KEEP ME UNCHANGED' preserved
  backup .bak file created with timestamp
  'engram' substring no longer present anywhere in the result

TESTS

878/878 passing. No new unit tests: the preuninstall script is a .mjs
that runs in npm's lifecycle context before the dist/ build is available.
Fixture-based validation is documented in CHANGELOG.md and was run
manually before this commit.
…escriptions

Chore release. No runtime changes.

ROOT CAUSE

Official MCP Registry submission returned HTTP 400:
  'NPM package engramx is missing required mcpName field.'

The registry verifies ownership of the server name io.github.NickCirv/engram
by reading the published npm tarball's package.json and requiring a matching
mcpName field. Without it the published engramx tarball doesn't prove
authorship, and the registry refuses to publish the spec.

FIX

package.json adds top-level:
  "mcpName": "io.github.NickCirv/engram"

server.json descriptions shortened to <= 100 chars (registry limit):
  - top-level description: 343 -> 91 chars
  - 5 environmentVariables descriptions: all 102-158 -> all 53-83 chars

Both files version bumped to 3.0.2 to match the next npm publish.

WHY NOT FOLDED INTO 3.0.1

3.0.1 needed to ship ASAP to stop new users hitting the orphaned-hooks
bug reported by @freenow82. MCP Registry integration surfaced during
the submission flow after 3.0.1 was already live. Separate concern,
separate patch.

VERIFIED

878/878 tests pass. TypeScript clean. Build green. CI will re-run on push.
…ility

Add encoding: 'utf-8' to readdirSync calls to fix TypeScript errors in Node 25:
- Type 'Dirent<string>[]' not assignable to type 'Dirent<NonSharedBuffer>[]'
- Property 'startsWith' does not exist on type 'NonSharedBuffer'

Same pattern as skills-miner.ts:226-229
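
A small sketch of the pattern, assuming a hypothetical helper named listVisibleEntries (not a function from this PR). Passing an explicit string encoding together with withFileTypes selects the overload that returns string-named Dirent objects, so name.startsWith() type-checks.

```typescript
import { readdirSync, type Dirent } from "node:fs";

// Lists non-hidden entry names in a directory.
// Unreadable or missing directories yield an empty list instead of throwing.
function listVisibleEntries(dir: string): string[] {
  let entries: Dirent[];
  try {
    // The explicit encoding keeps entry names typed as string, not Buffer.
    entries = readdirSync(dir, { withFileTypes: true, encoding: "utf-8" });
  } catch {
    return [];
  }
  return entries.map((entry) => entry.name).filter((name) => !name.startsWith("."));
}
```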
@mechtar-ru

Author

added tests

NickCirv added a commit that referenced this pull request Jun 5, 2026
Follow-up to cherry-picks 6c3b4b4 (PR #6 commit 1) + 3d2eb97 (PR #6 commit 2)
from @mechtar-ru.

Deletes redundant DEFAULT_EXCLUDED_DIRS (15 entries) + loadEngramIgnore()
(16 lines) that lived in parallel with the canonical DEFAULT_SKIP_DIRS +
loadIgnorePatterns() pair (the latter shipped in v2.1.0 via PR #13).
Both pairs implemented the same .engramignore feature — keeping only
the v2.1 canonical pair keeps one source of truth.

Also tightens entries typing: 'let entries: Dirent[]' in extractDirectory
(ReturnType<typeof readdirSync> resolves to the string[] default overload,
not the Dirent[] shape actually returned with { withFileTypes: true }).

All 784 tests pass. TypeScript clean.

Closes issue #5 (via PR #6 content: MAX_DEPTH=100 + MAX_FILES_PER_COMMIT=50
+ .engramignore support + expanded default skip dirs — the OOM crash on
init for 2.2GB/34K-file projects like Axolotl is fixed).
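
To illustrate why the per-commit cap matters: a commit touching n files produces n*(n-1)/2 co-change pairs. The sketch below uses a hypothetical coChangePairs helper and assumes oversized commits are skipped entirely; whether git-miner.ts skips or truncates them is not stated here.

```typescript
const MAX_FILES_PER_COMMIT = 50; // value quoted in the commit message

// Returns the unordered file pairs for one commit,
// or an empty list when the commit touches more than maxFiles files.
function coChangePairs(
  files: string[],
  maxFiles: number = MAX_FILES_PER_COMMIT,
): Array<[string, string]> {
  const unique = Array.from(new Set(files)).sort();
  // Assumption: bulk commits are skipped rather than truncated.
  if (unique.length > maxFiles) return [];

  const pairs: Array<[string, string]> = [];
  unique.forEach((a, i) => {
    for (const b of unique.slice(i + 1)) pairs.push([a, b]);
  });
  return pairs;
}
```

At the cap of 50 files a commit yields at most 1,225 pairs, while a 5,000-file bulk commit would otherwise yield roughly 12.5 million.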
NickCirv added a commit that referenced this pull request Jun 5, 2026
First land of the MCP-client subsystem (item #1 from the v3.0 Spine
implementation plan). Any MCP server can now become an engramx Context
Spine provider via ~/.engram/mcp-providers.json — no code changes needed.

WHAT SHIPS IN THIS COMMIT

src/providers/mcp-config.ts
  - McpProviderConfig type (stdio + http transports, tools array with
    arg templates, tokenBudget, timeoutMs, cacheTtlSec, priority, enabled)
  - loadMcpConfigs(): reads ~/.engram/mcp-providers.json (path overridable
    via ENGRAM_MCP_CONFIG_PATH for tests). Per-entry validation errors
    are COLLECTED not thrown — one bad provider never stops the rest.
  - validateProviderConfig(): strict structural validator with precise
    error messages (tells you which field on which entry failed)
  - applyArgTemplate(): substitutes {filePath}/{projectRoot}/{imports}/
    {fileBasename} tokens into tool args. Unknown tokens pass through.
  - Defaults: tokenBudget=200, timeoutMs=2000, cacheTtlSec=3600, priority
    from array order. Sensible for every MCP server we've seen.
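
A minimal sketch of the token substitution described for applyArgTemplate(), written from the behaviour listed above rather than from the source file. The TemplateContext type is an assumption, and representing {imports} as a single pre-joined string is a guess made for the example.

```typescript
// Assumed context shape: one string value per supported token.
type TemplateContext = {
  filePath: string;
  projectRoot: string;
  imports: string;
  fileBasename: string;
};

// Replaces {token} placeholders inside string values and returns a new object.
// Unknown tokens and non-string values pass through unchanged.
function applyArgTemplate(
  args: Record<string, unknown>,
  ctx: TemplateContext,
): Record<string, unknown> {
  const values = new Map<string, string>(Object.entries(ctx));
  const out: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(args)) {
    out[key] =
      typeof value === "string"
        ? value.replace(/\{(\w+)\}/g, (whole: string, token: string) => values.get(token) ?? whole)
        : value;
  }
  return out;
}
```

Building a fresh output object, instead of writing into args, is what gives the input-immutability property the tests mention.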

src/providers/mcp-client.ts
  - McpClientWrapper — thin wrapper on @modelcontextprotocol/sdk v1.29
    Client + StdioClientTransport. Session-lifetime connection reuse.
    Lazy connect (no process spawned until first resolve). Error backoff
    (30s) prevents thrashing if the server crashes on startup.
  - createMcpProvider(config) — factory returning a ContextProvider that
    plugs into the existing resolver without modification. Tier 2 (matches
    context7 / obsidian semantics). Tools called in parallel per Read.
  - Budget enforcement + line-wise truncation (never mid-word).
  - Graceful shutdown on SIGTERM / SIGINT / beforeExit.
  - HTTP transport declared but deferred — throws 'not yet implemented'
    until item #5 SSE streaming lands with the Host/Origin hardening work.
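
The lazy-connect and error-backoff behaviour can be sketched independently of the MCP SDK. LazyConnection below is a hypothetical helper, not McpClientWrapper itself; the connect callback stands in for whatever spawns the server process, and the injectable clock exists only to make the backoff testable.

```typescript
// Hypothetical helper: connects on first use, reuses the connection afterwards,
// and refuses to retry for backoffMs after a failed attempt.
class LazyConnection<T> {
  private connection: T | undefined;
  private lastFailureAt: number | undefined;
  private readonly connect: () => Promise<T>;
  private readonly backoffMs: number;
  private readonly now: () => number;

  constructor(connect: () => Promise<T>, backoffMs = 30000, now: () => number = () => Date.now()) {
    this.connect = connect;
    this.backoffMs = backoffMs;
    this.now = now;
  }

  // Resolves to the connection, or null while unavailable.
  async get(): Promise<T | null> {
    if (this.connection !== undefined) return this.connection;
    if (this.lastFailureAt !== undefined && this.now() - this.lastFailureAt < this.backoffMs) {
      return null; // still backing off, so no new connection attempt is made
    }
    try {
      this.connection = await this.connect();
      return this.connection;
    } catch {
      this.lastFailureAt = this.now();
      return null;
    }
  }
}
```

Returning null instead of throwing lets a caller treat a crashed server as "provider unavailable" rather than as an error in the read path.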

src/providers/resolver.ts
  - getMcpProviders(): loads MCP configs and wraps them. Cached for session
    lifetime. Test hook _resetMcpProvidersCache() for forced reload.
  - getAllProviders(): now merges BUILTINS + plugins + MCP providers
    (all deduped against built-in names so users can't shadow core).
  - Parse failures emit a single-line stderr warning (per bad entry) —
    visible to users without crashing their session.

package.json
  - Adds @modelcontextprotocol/sdk@1.29.0 (4.3MB unpacked, pure JS,
    no native deps). Pinned behind a thin ProviderClient surface so
    migration to SDK v2 (alpha 2026-04) is a one-file swap later.

TESTS

tests/providers/mcp-config.test.ts — 24 cases covering:
  - File-doesn't-exist → empty configs
  - Valid stdio + http shapes round-trip
  - Invalid JSON reported as single failure
  - Bad entries skipped, good ones kept
  - Duplicate names: first wins
  - All validation rules (empty name/label, bad transport, confidence
    range, negative numeric fields, missing command/url, invalid URL)
  - Arg-template substitution: all tokens, unknown pass-through, non-
    strings unchanged, basename fallback, input-immutability

Full suite: 771 → 808 tests (+24), all passing. TypeScript clean.

WHAT THIS COMMIT DOES NOT DO (follow-up within item #1)

  - HTTP transport implementation — waits on item #5 SSE streaming for
    shared Host/Origin validation + resumable streams
  - Integration tests that actually spawn a real MCP server (needs
    tests/fixtures/minimal-mcp-server.mjs — next commit)
  - Tool-list caching — currently we call tools directly without
    listTools() first; the SDK may cache internally but we should
    verify + explicit-cache if not

With this in place, item #2 (plugin contract v2 — mcpConfig auto-wrap)
becomes a 2-day extension: plugin-loader.ts detects .mcpConfig on a
plugin and auto-calls createMcpProvider(). Item #6 (Serena provider)
becomes a 10-line ~/.engram/plugins/serena.mjs once the mcpConfig path
lands.
NickCirv added a commit that referenced this pull request Jun 5, 2026
…rena reference plugin

ITEM #2 — Plugin contract v2

Extends ContextProviderPlugin so plugin authors can declare an MCP server
via 'mcpConfig' and skip writing resolve()/isAvailable() by hand. The
loader auto-wraps via createMcpProvider() from item #1. Classic plugins
(custom resolve()) continue to work unchanged — if both fields are
present, the author's resolve() wins (they opted into custom logic).

Type changes (src/providers/types.ts):
  - ContextProviderPlugin stays strict (extends ContextProvider fully) —
    this is the POST-VALIDATION shape the resolver consumes
  - NEW: RawPluginShape — the pre-validation shape a plugin-file author
    writes in .mjs. tier/tokenBudget/timeoutMs/resolve/isAvailable all
    optional (loader fills from factory when mcpConfig present)

Loader changes (src/providers/plugin-loader.ts):
  - validatePlugin() branches on 'has mcpConfig vs. has resolve()'
  - name/label/version always required
  - Classic path: tier/tokenBudget/timeoutMs/isAvailable required
  - mcpConfig path: config validated via validateProviderConfig(),
    merged with plugin fields (author overrides win over factory defaults)
  - One clear error per rejection — 'invalid mcpConfig: <reason>' tells
    you exactly which sub-field on which plugin is broken
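
The branching can be pictured roughly as follows. This is a simplified sketch, not the code in plugin-loader.ts: RawPlugin is a reduced stand-in for RawPluginShape, the mcpConfig checks cover only the transport/command/url cases named in the tests, and every error string apart from the 'invalid mcpConfig:' prefix is made up for illustration.

```typescript
// Reduced, assumed version of the pre-validation plugin shape.
type RawPlugin = {
  name?: string;
  label?: string;
  version?: string;
  tier?: number;
  tokenBudget?: number;
  timeoutMs?: number;
  resolve?: (filePath: string) => Promise<string | null>;
  isAvailable?: () => Promise<boolean>;
  mcpConfig?: { transport?: string; command?: string; url?: string };
};

type Validation = { ok: true; path: "classic" | "mcp" } | { ok: false; error: string };

function validatePlugin(p: RawPlugin): Validation {
  // name/label/version are required on both paths.
  for (const field of ["name", "label", "version"] as const) {
    if (typeof p[field] !== "string" || p[field] === "") {
      return { ok: false, error: `missing required field: ${field}` };
    }
  }
  // A hand-written resolve() wins even when mcpConfig is also present.
  if (typeof p.resolve === "function") {
    for (const field of ["tier", "tokenBudget", "timeoutMs"] as const) {
      if (typeof p[field] !== "number") {
        return { ok: false, error: `classic plugin missing ${field}` };
      }
    }
    if (typeof p.isAvailable !== "function") {
      return { ok: false, error: "classic plugin missing isAvailable" };
    }
    return { ok: true, path: "classic" };
  }
  if (p.mcpConfig) {
    const { transport, command, url } = p.mcpConfig;
    if (transport !== "stdio" && transport !== "http") {
      return { ok: false, error: "invalid mcpConfig: transport must be stdio or http" };
    }
    if (transport === "stdio" && !command) {
      return { ok: false, error: "invalid mcpConfig: stdio transport requires command" };
    }
    if (transport === "http" && !url) {
      return { ok: false, error: "invalid mcpConfig: http transport requires url" };
    }
    return { ok: true, path: "mcp" };
  }
  return { ok: false, error: "plugin needs either resolve() or mcpConfig" };
}
```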

Tests (+7 cases in tests/providers/plugin-loader.test.ts):
  - mcpConfig-only plugin auto-wraps resolve + isAvailable
  - Plugin with neither resolve nor mcpConfig rejected (clear message)
  - Invalid mcpConfig rejected (bad command, bad http url)
  - Custom resolve wins over mcpConfig when both present
  - Plugin tokenBudget override wins over factory default
  - Missing version rejected even for mcpConfig plugins

ITEM #6 — Serena plugin reference

docs/plugins/examples/serena-plugin.mjs (~60 lines incl. docs) — the
full Serena (oraios/serena) wrapper as an mcpConfig-only plugin. Install
is cp + enable. Thanks to item #2, NO custom transport code needed.

docs/plugins/examples/static-context-plugin.mjs — the classic-path
reference showing a tier 1 plugin with hand-rolled resolve() for users
who just want to inject a fixed string on every Read.

docs/plugins/README.md — author-facing guide. Shape 1 (MCP-backed),
Shape 2 (classic), template tokens, safety guarantees, debugging
checklist, publishing notes.

FULL SUITE

808 -> 815 tests (+7), all passing. TypeScript clean, lint clean.

V3.0 PROGRESS

Done: #1 foundation, #2, #6, #7, #9, #10, #11 = 7 of 12 scope items.
Next: #3 budget-weighted resolver + mistakes-boost (~2-3d).
NickCirv added a commit that referenced this pull request Jun 5, 2026
Two orthogonal improvements to the resolver's assembly pipeline. Both
exported from resolver.ts so they're testable in isolation, and both
run in the main resolveRichPacket() flow before the final priority sort.

1. PER-PROVIDER BUDGET ENFORCEMENT (enforcePerProviderBudget)

Providers are SUPPOSED to self-truncate their content to 'tokenBudget',
but a bad plugin or a non-conforming MCP server shouldn't be able to
spend our entire total budget on one section. New helper truncates
each result to the provider's declared budget BEFORE assembly.

- Under-budget content passes through unchanged (zero-cost)
- Over-budget content is line-truncated (never cut mid-word)
- Edge: first line alone > budget -> hard-cap characters with marker

Default budget for unknown/missing providers is 200 tokens (matches
the MCP-config default from item #1).
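
A rough sketch of the truncation rules above. It is not enforcePerProviderBudget itself: truncateToBudget is a hypothetical name, the four-characters-per-token estimate is an assumption, and the marker text is invented. In this sketch the marker is appended on top of the budgeted characters.

```typescript
const DEFAULT_TOKEN_BUDGET = 200;          // default quoted in the commit message
const TRUNCATION_MARKER = "[truncated]";   // hypothetical marker text
const CHARS_PER_TOKEN = 4;                 // assumed estimate, not a real tokenizer

const estimateTokens = (text: string): number => Math.ceil(text.length / CHARS_PER_TOKEN);

function truncateToBudget(content: string, tokenBudget: number = DEFAULT_TOKEN_BUDGET): string {
  // Under-budget content passes through unchanged.
  if (estimateTokens(content) <= tokenBudget) return content;

  const maxChars = tokenBudget * CHARS_PER_TOKEN;
  const kept: string[] = [];
  let used = 0;
  for (const line of content.split("\n")) {
    const cost = line.length + 1; // +1 for the newline
    if (used + cost > maxChars) break;
    kept.push(line);
    used += cost;
  }

  // Edge case: the first line alone exceeds the budget, so hard-cap characters.
  if (kept.length === 0) return content.slice(0, maxChars) + TRUNCATION_MARKER;

  // Normal case: cut on a line boundary, never mid-word.
  return kept.join("\n") + "\n" + TRUNCATION_MARKER;
}
```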

2. MISTAKES-BOOST RERANKING (boostByMistakes)

If the engram:mistakes provider fires for this file, scan OTHER
providers' content for substring matches against mistake labels
(extracted from the '  ! <label> (flagged <age>)' format). Matching
results get confidence * 1.5 (capped at 1.0).

Runs BEFORE the priority sort, but the secondary sort is now
(priority asc, confidence desc) — so boost breaks ties WITHIN a
priority tier without overriding priority across tiers.

- Case-insensitive matching (labels normalized to lowercase)
- Does NOT boost the mistakes provider itself
- No-op if no mistakes are reported for this file (common case)

Examples of the intended effect:
- An engram:git commit message mentioning a known-broken function
  sorts UP within the git tier
- A mempalace decision that references a mistaken architectural
  choice bubbles ahead of unrelated decisions
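
The boost and the two-key sort might look like the sketch below. It is reconstructed from this description, not copied from resolver.ts: the ProviderResult fields are a reduced assumption, sortForAssembly is a hypothetical name, and the label regex is one possible reading of the '  ! <label> (flagged <age>)' format.

```typescript
// Reduced, assumed result shape.
type ProviderResult = { provider: string; priority: number; confidence: number; content: string };

const MISTAKES_PROVIDER = "engram:mistakes";

// Pulls lowercase labels out of lines shaped like "  ! <label> (flagged <age>)".
function extractMistakeLabels(content: string): string[] {
  const labels: string[] = [];
  for (const line of content.split("\n")) {
    const match = /^\s*!\s+(.+?)\s+\(flagged [^)]*\)\s*$/.exec(line);
    if (match && match[1]) labels.push(match[1].toLowerCase());
  }
  return labels;
}

// Multiplies confidence by 1.5 (capped at 1.0) for results mentioning a mistake label.
function boostByMistakes(results: ProviderResult[]): ProviderResult[] {
  const mistakes = results.find((r) => r.provider === MISTAKES_PROVIDER);
  const labels = mistakes ? extractMistakeLabels(mistakes.content) : [];
  if (labels.length === 0) return results; // common case: nothing to boost

  return results.map((r) => {
    if (r.provider === MISTAKES_PROVIDER) return r; // never self-boost
    const haystack = r.content.toLowerCase();
    return labels.some((label) => haystack.includes(label))
      ? { ...r, confidence: Math.min(1, r.confidence * 1.5) }
      : r;
  });
}

// Priority ascending first, confidence descending as the tie-breaker.
function sortForAssembly(results: ProviderResult[]): ProviderResult[] {
  return [...results].sort((a, b) => a.priority - b.priority || b.confidence - a.confidence);
}
```

Because confidence is only the secondary key, a boosted result can overtake neighbours in its own priority tier but never jumps ahead of a higher-priority tier.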

TESTS (+10 cases in tests/providers/resolver.test.ts)

enforcePerProviderBudget:
  - Under-budget untouched
  - Over-budget truncated by line with marker
  - Hard-cap when first line alone exceeds budget
  - Default 200 tokens when provider not found

boostByMistakes:
  - No-op when no mistakes provider in set
  - Matching substring boosts confidence 0.6 -> 0.9
  - Cap enforced (0.8 * 1.5 = 1.2 -> 1.0)
  - Non-matching results left alone
  - Mistakes provider itself is never self-boosted
  - Case-insensitive matching across upper/lower case variations

Full suite: 815 -> 825 tests (+10), all passing. TypeScript clean.

V3.0 PROGRESS: 8 of 12 scope items done.
  ✅ #1 foundation ✅ #2 ✅ #3 ✅ #6 ✅ #7 ✅ #9 ✅ #10 ✅ #11
  Remaining: #4 Auto-Memory (blocked on MEMORY.md fixture), #5 SSE
  streaming, #8 pre-mortem warnings, #12 MCP Registry submit, and
  #1 completion (HTTP transport + real-server integration tests).
NickCirv added a commit that referenced this pull request Jun 5, 2026
Opt-in warnings that fire BEFORE Claude Code runs an Edit/Write/Bash
tool call against code previously flagged as a mistake. Fully gated
via ENGRAM_MISTAKE_GUARD env var — zero overhead when unset.

MODES

  unset / '0' → off (default — no database read, no overhead)
  '1'         → permissive: tool proceeds, a warning is prepended
                to any additionalContext the primary handler emits
  '2'         → strict:     tool is denied with the warning as reason

Hooks Edit/Write/Bash only. Read already surfaces mistakes via the
engram:mistakes context provider — duplicating at tool-call time would
be noise.
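
The mode table can be read as a small parsing function. The sketch below follows the behaviour listed here (unrecognized values fall back to off); the env parameter is an addition for the example so the function can be exercised without touching the real environment.

```typescript
type GuardMode = "off" | "permissive" | "strict";

// Reads the variable on every call, so a changed environment takes effect immediately.
function currentGuardMode(
  env: Record<string, string | undefined> = process.env,
): GuardMode {
  const raw = env["ENGRAM_MISTAKE_GUARD"];
  if (raw === "1") return "permissive";
  if (raw === "2") return "strict";
  return "off"; // unset, "0", or any unrecognized value
}
```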

MATCHING

Edit/Write:
  - Normalize tool_input.file_path to relative POSIX vs projectRoot
  - Indexed lookup via store.getNodesByFile() (uses idx_nodes_source_file)
  - Dedupe by node id when both relative + raw shapes are stored

Bash:
  - Substring match on mistake.metadata.commandPattern (length >2)
  - Fallback: substring match on mistake.sourceFile (length >3 to avoid
    accidentally matching single-char paths like 'a')
  - Full-table scan of mistakes (unavoidable — no file axis to index on).
    Bounded by project size; only runs when the guard is explicitly on.

BI-TEMPORAL FILTER (item #7 interop)

Mistakes with validUntil <= now are suppressed — they refer to code
that has since been refactored away. Prevents stale-warning fatigue.
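
A sketch of the Bash-side matching combined with the validUntil filter, under stated assumptions: the Mistake type is a reduced stand-in for the stored node, validUntil is assumed to be an epoch-millisecond number, and matchBashMistakes is a hypothetical name (the real matcher is findMatchingMistakesAsync and reads from the store).

```typescript
// Reduced, assumed shape of a stored mistake.
type Mistake = {
  label: string;
  sourceFile?: string;
  validUntil?: number; // assumed: epoch milliseconds
  metadata?: { commandPattern?: string };
};

function matchBashMistakes(
  command: string,
  mistakes: Mistake[],
  now: number = Date.now(),
): Mistake[] {
  const haystack = command.toLowerCase();
  return mistakes.filter((m) => {
    // Bi-temporal filter: expired mistakes refer to code that no longer exists.
    if (m.validUntil !== undefined && m.validUntil <= now) return false;

    // Primary: substring match on the recorded command pattern (length > 2).
    const pattern = m.metadata?.commandPattern;
    if (pattern && pattern.length > 2 && haystack.includes(pattern.toLowerCase())) return true;

    // Fallback: the command mentions the mistake's source file (length > 3).
    const file = m.sourceFile;
    return Boolean(file && file.length > 3 && haystack.includes(file.toLowerCase()));
  });
}
```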

INTEGRATION

New file: src/intercept/handlers/mistake-guard.ts
  - currentGuardMode() — reads env var at call time, not module load,
    so tests can flip between cases cleanly
  - findMatchingMistakesAsync(target, projectRoot) — the matcher
  - formatWarning(matches) — human-readable warning block
  - applyMistakeGuard(rawResult, payload, kind) — wrapping fn that
    augments additionalContext (permissive) or overrides to deny (strict)

src/intercept/dispatch.ts wiring: after runHandler() returns for Edit/
Write/Bash, pass result through applyMistakeGuard() before returning.
Two-line diff. Doesn't touch the existing handlers.

SAFETY

Every code path in mistake-guard is wrapped in try/catch with a null
return. A guard failure MUST NEVER break the primary handler. If the
store open fails, the env var is wrong, the payload is malformed —
guard silently returns the raw result unchanged.

TESTS (+21 cases in tests/intercept/handlers/mistake-guard.test.ts)

  - currentGuardMode: off/permissive/strict recognition, bogus values
    coerced to off
  - formatWarning: empty-match string, single-match header, >5-match
    collapse with '… and N more'
  - findMatchingMistakesAsync (file): rel path, abs path normalization,
    no-match, validUntil filter
  - findMatchingMistakesAsync (bash): commandPattern substring match,
    sourceFile-in-command match, case-insensitive, too-short pattern
    guard, validUntil filter
  - applyMistakeGuard: mode=off no-op, permissive augments additional
    context, permissive no-match no-op, strict denies with reason,
    permissive from passthrough emits fresh allow-with-warning

Full suite: 825 -> 846 tests (+21), all passing. TypeScript clean.

V3.0 PROGRESS — 9 of 12 scope items

  ✅ #1 foundation  ✅ #2 ✅ #3 ✅ #6 ✅ #7 ✅ #8 ✅ #9 ✅ #10 ✅ #11

Remaining:
  - #1 completion (HTTP transport + real-server integration tests)
  - #4 Anthropic Auto-Memory bridge (blocked: needs MEMORY.md fixture)
  - #5 SSE streaming for rich packet assembly
  - #12 Official MCP Registry submission (post-ship)
NickCirv added a commit that referenced this pull request Jun 5, 2026
Adds progressive delivery for rich packet assembly. Instead of blocking
on Promise.allSettled (which waits for the slowest provider — Serena
cold-start, mempalace ChromaDB warmup), clients can stream results
as they arrive and render each section immediately.

NEW — resolveRichPacketStreaming generator (src/providers/resolver.ts)

AsyncGenerator<StreamEvent> that yields:
  { type: 'provider', result: ProviderResult }  — as each resolves
  { type: 'done', providerCount, durationMs }  — final totals

Order = ARRIVAL order (fast providers first). Consumers who want
priority order use the non-streaming resolveRichPacket() which applies
full priority + mistakes-boost + budget logic.

Implementation: fan-out all providers, funnel outcomes into a FIFO
queue + wake-on-arrival pattern. No extra deps. Per-provider timeouts
preserved (same resolveWithTimeout path as non-streaming).
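
The fan-out plus FIFO queue plus wake-on-arrival pattern can be sketched as below. This is an illustration of the technique, not resolveRichPacketStreaming: streamInArrivalOrder is a hypothetical name, providers are modelled as plain async thunks, the ProviderResult type is reduced, and the per-provider timeout path is left out.

```typescript
// Reduced, assumed result shape.
type ProviderResult = { provider: string; content: string };
type StreamEvent =
  | { type: "provider"; result: ProviderResult }
  | { type: "done"; providerCount: number; durationMs: number };

async function* streamInArrivalOrder(
  tasks: Array<() => Promise<ProviderResult | null>>,
): AsyncGenerator<StreamEvent> {
  const startedAt = Date.now();
  const queue: ProviderResult[] = [];
  let pending = tasks.length;
  let wake: (() => void) | null = null;

  // Called once per task, whether it resolved, returned null, or failed.
  const settle = (result: ProviderResult | null): void => {
    if (result) queue.push(result);
    pending -= 1;
    const waiter = wake;
    wake = null;
    if (waiter) waiter(); // wake the consumer loop if it is sleeping
  };

  // Fan out: start every task now; failures count as "no result".
  for (const task of tasks) {
    Promise.resolve().then(task).then(settle, () => settle(null));
  }

  let delivered = 0;
  while (pending > 0 || queue.length > 0) {
    const next = queue.shift();
    if (next === undefined) {
      // Nothing queued yet: sleep until the next task settles.
      await new Promise<void>((resolve) => { wake = resolve; });
      continue;
    }
    delivered += 1;
    yield { type: "provider", result: next };
  }
  yield { type: "done", providerCount: delivered, durationMs: Date.now() - startedAt };
}
```

A consumer can render each section as soon as it arrives with `for await (const event of streamInArrivalOrder(tasks))`, without waiting for the slowest provider.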

NEW — /context/stream SSE endpoint (src/server/http.ts)

GET /context/stream?file=<relative-path> (auth required).
Emits one SSE frame per StreamEvent. Frame shape matches MCP SEP-1699
(SSE resumption):

  id: 0
  event: provider
  data: {"provider":"engram:ast", …}

  id: 1
  event: provider
  data: {"provider":"engram:mistakes", …}

  id: N
  event: done
  data: {"providerCount":N,"durationMs":347}

Supports Last-Event-ID header — clients reconnecting via
'Last-Event-ID: 3' skip events 0-3 and pick up from 4. Useful for
long-running sessions that drop WiFi mid-stream without losing context.
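
For illustration, the frame layout and the Last-Event-ID skip can be expressed as three small functions. These are hypothetical helpers written from the description above, not the code in src/server/http.ts; event ids are assumed to be the zero-based position of each event in the stream.

```typescript
// Serializes one event as an SSE frame: id, event name, JSON data, blank line.
function formatSseFrame(id: number, event: string, data: unknown): string {
  return `id: ${id}\nevent: ${event}\ndata: ${JSON.stringify(data)}\n\n`;
}

// Parses a Last-Event-ID header value.
// Returns -1 when the header is absent or not a non-negative integer.
function parseLastEventId(header: string | undefined): number {
  if (header === undefined) return -1;
  const trimmed = header.trim();
  return /^\d+$/.test(trimmed) ? Number(trimmed) : -1;
}

// Emits only the frames whose id is greater than the last id the client saw.
function framesToSend(
  events: Array<{ event: string; data: unknown }>,
  lastEventIdHeader?: string,
): string[] {
  const lastSeen = parseLastEventId(lastEventIdHeader);
  const frames: string[] = [];
  events.forEach((e, id) => {
    if (id > lastSeen) frames.push(formatSseFrame(id, e.event, e.data));
  });
  return frames;
}
```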

Client-disconnect aborts the stream cleanly (req.close handler short-
circuits the generator loop).

TESTS (+6 new)

resolver.test.ts (+2):
  - Smoke: streaming generator terminates with a 'done' event for any
    project (no hang, no runaway)
  - Arrival-order invariant: toy generator mirrors production shape,
    verifies fast results yield before slow ones

server/http.test.ts (+4):
  - Missing 'file' param returns 400
  - Valid request returns 200 + text/event-stream + ends with 'done'
  - Every frame carries an 'id:' header (SEP-1699 resumption)
  - Auth required — unauthenticated returns 401

Full suite: 870 -> 876 tests (+6), all passing. TypeScript clean.

V3.0 PROGRESS — 11 of 12 scope items done

  ✅ #1 foundation  ✅ #2 ✅ #3 ✅ #4 ✅ #5 ✅ #6 ✅ #7 ✅ #8 ✅ #9 ✅ #10 ✅ #11

Only remaining in-scope work:
  - #12 MCP Registry submission (~2h, post-ship only)

Plus item #1 completion (HTTP transport + minimal MCP server fixture
for integration tests) — technically part of #1 which shipped its
foundation as 838527a; the HTTP transport path was explicitly deferred
until this SSE work landed. It can now proceed.

3 participants