Releases: kitfunso/hippo-memory
v0.36.0 - A1 server mode
A1 Server Mode - hippo serve, MCP-over-HTTP, thin-client CLI
Added
- `hippo serve` daemon on http://127.0.0.1:6789 (configurable via --port or HIPPO_PORT). Exposes /v1/memories, /v1/auth/keys, /v1/audit, MCP-over-HTTP at /mcp, and /health.
- CLI thin-client. When the daemon is running, CLI invocations auto-detect via .hippo/server.pid and route through HTTP. Stale pidfile self-heals on first ECONNREFUSED.
- MCP-over-HTTP/SSE transport alongside the existing stdio path. POST /mcp synchronous JSON-RPC; GET /mcp/stream SSE keepalive.
- Domain layer src/api.ts. Pure functions for remember/recall/forget/promote/supersede/archiveRaw/auth*/audit. Both server and CLI delegate through this surface.
- HTTP auth middleware. Bearer token via Authorization header; loopback (127.0.0.1, ::1, ::ffff:127.0.0.1) accepts unauthenticated as actor='localhost:cli'. Non-loopback no-token returns 401. Server refuses to bind 0.0.0.0 without auth.
- 24h soak harness at benchmarks/a1/soak.ts (manual run).
- p99 recall benchmark at benchmarks/a1/p99-recall.ts.
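The loopback rule in the HTTP auth middleware can be sketched as a pure decision function. This is an illustration only: `decideAuth`, `AuthDecision`, and the `validateKey` callback are hypothetical names, and rejecting an invalid token even on loopback is an assumption, not confirmed behaviour of `hippo serve`.

```typescript
// Hypothetical shapes; the real middleware in hippo-memory may differ.
type AuthDecision =
  | { ok: true; actor: string }
  | { ok: false; status: 401 };

// Loopback addresses listed in the release notes.
const LOOPBACK = new Set(["127.0.0.1", "::1", "::ffff:127.0.0.1"]);

function decideAuth(
  remoteAddress: string,
  authorization: string | undefined,
  validateKey: (token: string) => string | null, // returns an actor id, or null if invalid
): AuthDecision {
  const match = /^Bearer\s+(\S+)$/i.exec(authorization ?? "");
  if (match) {
    // Assumption: a presented token is always checked, even from loopback.
    const actor = validateKey(match[1] ?? "");
    return actor ? { ok: true, actor } : { ok: false, status: 401 };
  }
  if (LOOPBACK.has(remoteAddress)) {
    // Unauthenticated loopback callers are attributed to the local CLI.
    return { ok: true, actor: "localhost:cli" };
  }
  // Non-loopback request without a token.
  return { ok: false, status: 401 };
}
```

Keeping the decision free of I/O makes it straightforward to unit-test separately from the HTTP server.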
Fixed
- Audit-log tenant attribution: audit() helper now uses entry's tenant_id instead of HIPPO_TENANT env (latent bug).
- api.archiveRaw and api.forget enforce tenant scope: cross-tenant returns "memory not found".
- SIGTERM drain: server.closeAllConnections() before close so SSE streams don't block shutdown.
- MCP-over-HTTP threads hippoRoot + tenantId from auth context.
Internal
- 99 new tests (730 -> 829 + 2 skipped). Headline parity test spawns real subprocess server; concurrent recall+write under SQLite single-writer (10 readers + 1 writer) confirms zero locked errors.
- All 5 /review ship blockers closed before merge.
Known issues (tracked for v0.37.0 in TODOS.md)
- p99 latency: 58.4ms vs 50ms target on 10k store. Architecture ships; latency hardening lands in v0.37.0.
- HIPPO_API_KEY silently dropped on stale-pidfile fallback.
- Concurrent hippo serve has no winner detection.
- Recall ?mode=hybrid accepted but ignored (BM25-only over HTTP).
- MCP-over-HTTP SSE is keepalive-only; no server-pushed messages.
🤖 Generated with Claude Code
v0.35.0 - A5 stub auth
A5 Stub Auth - API keys, audit log, tenant isolation
Added
- Schema v16: `tenant_id` on memories, working_memory, consolidation_runs, task_snapshots, memory_conflicts (default 'default') + composite indexes. New tables: `api_keys` (scrypt-hashed), `audit_log` (append-only).
- API key primitives: `createApiKey` / `validateApiKey` / `revokeApiKey` / `listApiKeys`. scrypt + timingSafeEqual. Plaintext returned exactly once.
- Audit log: hooks on remember, recall, promote, supersede, forget, archive_raw, auth_revoke. Query via `hippo audit list --op X --since Y --json`.
- Tenant resolution: `resolveTenantId({db?, apiKey?})`. Order: api key > HIPPO_TENANT env > 'default'.
- Cross-tenant isolation: enforced on CLI recall/explain/context, MCP server (hippo_recall, hippo_context, hippo_status), dashboard.
- CLI: `hippo auth create [--label X] [--tenant Y]`, `hippo auth list [--all]`, `hippo auth revoke <key_id>`, `hippo audit list [--op X] [--since Y] [--limit N] [--json]`.
- SSO/SCIM stubs: throw NotImplementedError, tracked for v2.
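For readers unfamiliar with the scrypt + timingSafeEqual pairing named above, here is a minimal sketch using Node's built-in `node:crypto`. The `StoredKey` shape, the function names, and the salt and hash lengths are assumptions chosen for illustration; they are not taken from the hippo-memory source.

```typescript
import { randomBytes, scryptSync, timingSafeEqual } from "node:crypto";

// Hypothetical storage shape: hex-encoded salt and derived hash.
interface StoredKey {
  salt: string;
  hash: string;
}

// Derive a hash from the plaintext key; only the salt and hash are persisted.
function hashApiKey(plaintext: string): StoredKey {
  const salt = randomBytes(16);
  const hash = scryptSync(plaintext, salt, 32);
  return { salt: salt.toString("hex"), hash: hash.toString("hex") };
}

// Re-derive with the stored salt and compare in constant time.
function verifyApiKey(candidate: string, stored: StoredKey): boolean {
  const expected = Buffer.from(stored.hash, "hex");
  const actual = scryptSync(candidate, Buffer.from(stored.salt, "hex"), expected.length);
  // timingSafeEqual throws on length mismatch; lengths match by construction here.
  return timingSafeEqual(actual, expected);
}
```

Because only the derived hash is stored, the plaintext cannot be recovered later, which is consistent with returning it exactly once at creation time.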
Fixed
- Empty HIPPO_TENANT env trims to 'default'
- bigint-safe JSON for audit metadata
- archiveRawMemory audit event uses row's tenant_id, not env
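The resolution order (api key > HIPPO_TENANT env > 'default') combined with the empty-env fix can be sketched as below. `resolveTenant` is a simplified, hypothetical stand-in for `resolveTenantId`: it takes the key's tenant directly instead of looking it up in the database, and trimming the key's tenant is an assumption.

```typescript
// Simplified, hypothetical stand-in for resolveTenantId (no DB lookup).
function resolveTenant(
  keyTenant: string | undefined,
  env: Record<string, string | undefined>,
): string {
  // 1. A tenant bound to a validated API key wins.
  const fromKey = (keyTenant ?? "").trim();
  if (fromKey !== "") return fromKey;
  // 2. Otherwise fall back to the environment; empty or whitespace-only counts as unset.
  const fromEnv = (env.HIPPO_TENANT ?? "").trim();
  if (fromEnv !== "") return fromEnv;
  // 3. Finally the default tenant.
  return "default";
}
```

In a real process the second argument would typically be `process.env`.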
Internal
- 36 new tests (730 -> 766). Cross-tenant negative test verified at vitest layer + live dashboard API smoke test.
- All /review findings closed: 4 HIGH (tenant filter holes on MCP/explain/dashboard/context), 7 MEDIUM, 8 LOW.
Deferred to v2 (TODOS.md)
Multi-tenant per-key isolation, OAuth/SCIM, audit log retention, RBAC, background pipeline tenant scoping (consolidate, embeddings, autolearn, autoShare).
v0.34.0 — A3 provenance envelope + pineal salience v2
0.34.0 (2026-04-29)
Added
- A3 provenance envelope. Every memory now carries `kind` (raw | distilled | superseded | archived), `scope`, `owner`, and `artifact_ref` columns. `hippo recall --why` surfaces the envelope; `hippo remember` accepts `--kind`, `--scope`, `--owner`, `--artifact-ref` flags. See `MEMORY_ENVELOPE.md`.
- Append-only invariant on `kind='raw'`. SQLite trigger `trg_memories_raw_append_only` aborts direct DELETE on raw rows. The only legitimate path is `archiveRawMemory(db, id, { reason, who })`, which snapshots into the new `raw_archive` table, purges the FTS row, and removes the memory in one SAVEPOINT (sets up A4 right-to-be-forgotten).
- Schema v14 + v15. v14 adds the envelope columns, the `raw_archive` table, the append-only trigger, and INSERT/UPDATE CHECK-substitute triggers (ALTER TABLE cannot add CHECK in SQLite). v15 closes a NULL-kind bypass in those triggers and adds `UNIQUE(memory_id, archived_at)` to `raw_archive`. Backwards compatible, auto-migrates.
- Pineal salience v2. `--salience-threshold` flag for the recall pipeline (commit `50528a5`).
- Enterprise execution roadmap (`ROADMAP-RESEARCH.md`). 90-day plan re-sequenced after Codex + eng-review pass: A3 envelope first (this release), then A5 stub auth, A1 server, E1.3 Slack ingestion. Cuts 7 deferred items into days 91-180.
Fixed
- FTS leak in `archiveRawMemory`. Archived raw content stayed in `memories_fts` until next DB-open backfill; defeated GDPR right-to-be-forgotten. Archive now purges the FTS row inside the same SAVEPOINT.
- CLI `--kind raw` gated. Existing `hippo forget` / consolidation / conflict-resolution paths abort on raw rows via the trigger. Until those paths route through `archiveRawMemory`, the CLI restricts `--kind` to `{distilled, superseded}` so users cannot create unforgettable memories.
- NULL-kind trigger bypass. v14 triggers used `WHEN NEW.kind IS NOT NULL AND NEW.kind NOT IN (...)`, so a direct `kind=NULL` write silently bypassed the CHECK substitute. v15 rejects NULL.
- `archiveRawMemory` transaction safety. Now uses `SAVEPOINT` (nestable) instead of `BEGIN`.
- BigInt-safe JSON serializer for the audit payload.
- `--scope` envelope trim. Matched the pre-existing scope-tag trim behavior.
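Why a BigInt-safe serializer is needed: `JSON.stringify` throws a `TypeError` when it meets a `bigint`, which SQLite drivers can return for large integer columns. One common approach, shown here as a sketch, is a replacer that converts BigInt values to decimal strings. The function name is hypothetical, and whether hippo emits strings or numbers for these values is an assumption.

```typescript
// Hypothetical helper: serialize values that may contain BigInt without throwing.
function stringifyBigIntSafe(value: unknown): string {
  return JSON.stringify(value, (_key, v) =>
    // Plain JSON.stringify throws on bigint; emit a decimal string instead.
    typeof v === "bigint" ? v.toString() : v,
  );
}
```

Emitting strings avoids the precision loss that would occur if values above `Number.MAX_SAFE_INTEGER` were coerced to `number`.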
Internal
- 730 tests (+15 from v0.33.0). New: `tests/a3-envelope-migration.test.ts`, `tests/raw-archive.test.ts`, `tests/recall-why-envelope.test.ts`.
- Reviewed via `/codex`, `/plan-eng-review`, `/review` (Claude pass + adversarial subagent), `/self-review`, `/ship-check`. All ship-blockers resolved before release.
v0.33.0 — Fact extraction + DAG summarization + multi-hop recall
0.33.0 (2026-04-23)
Added
- Write-time fact extraction. During `hippo sleep`, episodic memories are now processed by an LLM to extract standalone facts (up to 8 per memory). Facts are stored as semantic-layer entries with `extracted_from` linking back to the source. Extracted facts get a 1.3x search boost and automatically deduplicate against their source entries in results, so users see the precise fact instead of the raw conversation.
- DAG summarization. Extracted facts are clustered by Jaccard similarity (>= 0.5) on speaker:/topic: entity tags, then summarized into dag_level=2 parent nodes. When a summary matches a query, its children are injected into results at 0.9x parent score, giving hierarchical drill-down.
- Multi-hop retrieval. `hippo recall --multihop` and `multihopSearch()` run a two-pass entity-chained search. Pass 1 retrieves top-K and extracts entity tags not in the original query. Pass 2 reformulates the query with discovered entities and retrieves again. Results merge by highest score per ID.
- `hippo remember --extract` triggers immediate fact extraction on the remembered content.
- `hippo dag --stats` shows DAG layer distribution (how many entries at each level).
- Schema v12-v13. v12 adds `extracted_from` column, v13 adds `dag_level` + `dag_parent_id` with backfill and index. Backwards compatible, auto-migrates on first open.
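The Jaccard gate used for DAG clustering can be illustrated as follows. Jaccard similarity of two tag sets is the size of their intersection divided by the size of their union. The greedy, compare-against-first-member clustering below is an assumption made for brevity; the release notes specify only the similarity measure and the 0.5 threshold, and all names here are hypothetical.

```typescript
// Jaccard similarity of two tag sets: |A ∩ B| / |A ∪ B|.
function jaccard(a: ReadonlySet<string>, b: ReadonlySet<string>): number {
  if (a.size === 0 && b.size === 0) return 0;
  let intersection = 0;
  for (const tag of a) {
    if (b.has(tag)) intersection++;
  }
  return intersection / (a.size + b.size - intersection);
}

interface Fact {
  id: string;
  tags: string[]; // e.g. "speaker:alice", "topic:db"
}

// Greedy clustering (assumed strategy): join the first cluster whose seed
// tag set is similar enough, otherwise start a new cluster.
function clusterByTags(facts: Fact[], threshold = 0.5): string[][] {
  const clusters: { seed: Set<string>; ids: string[] }[] = [];
  for (const fact of facts) {
    const tags = new Set(fact.tags);
    const home = clusters.find((c) => jaccard(c.seed, tags) >= threshold);
    if (home) {
      home.ids.push(fact.id);
    } else {
      clusters.push({ seed: tags, ids: [fact.id] });
    }
  }
  return clusters.map((c) => c.ids);
}
```

Each resulting cluster would then be a candidate for one dag_level=2 summary node.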
Fixed
- `temporalBoost` O(N^2) refactored to O(N). Previously called `Math.min(...timestamps)` per entry inside the search loop, risking stack overflow on large stores. Now precomputes range once via `computeTemporalRange()`.
- Config scoping bug in `consolidate.ts`. `config` was block-scoped inside the extraction `if` block but referenced from the DAG section outside it. Would cause ReferenceError when no extraction candidates exist but extracted facts are ready for DAG processing.
- Dead `seenIds` variables removed from both search paths (populated but never read).
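The shape of the `temporalBoost` fix can be sketched like this: spreading a large array into `Math.min(...)` passes every element as a function argument, which can exceed the engine's argument limit, and doing so once per entry is quadratic. A single loop computes the range once. The names and the linear recency normalization below are illustrative assumptions, not the actual scoring formula.

```typescript
// Hypothetical equivalent of a precomputed range: one pass, no argument spreading.
function computeRange(timestamps: readonly number[]): { min: number; max: number } | null {
  if (timestamps.length === 0) return null;
  let min = Infinity;
  let max = -Infinity;
  for (const t of timestamps) {
    if (t < min) min = t;
    if (t > max) max = t;
  }
  return { min, max };
}

// Assumed scoring: normalize each timestamp to [0, 1] using the shared range.
function recencyScores(timestamps: readonly number[]): number[] {
  const range = computeRange(timestamps);
  if (range === null) return [];
  const span = range.max - range.min;
  return timestamps.map((t) => (span === 0 ? 1 : (t - range.min) / span));
}
```

The per-entry work is now constant, so scoring N entries costs O(N) overall.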
Internal
- 674 tests (+41 from v0.32.0). 16 new test files covering extraction, DAG, multi-hop, temporal scoring, CLI commands, and integration smoke tests.
- Reviewed via `/review` + `/self-review` + `/qa` + `/ship-check` + senior code review agent.
v0.32.0 — Bi-temporal supersession + --as-of historical recall
0.32.0 (2026-04-22)
Added
- Bi-temporal memory: correction without deletion. When a belief changes, the old memory stays as historical truth instead of being overwritten. Default recall filters superseded entries so agents see current reality; historical views are explicit. Schema v11 adds `valid_from` and `superseded_by` columns, backwards compatible with v10 stores (ADD COLUMN only, no data transform).
- `hippo supersede <old-id> "<new content>"`. Creates a successor memory and links the old one via `superseded_by`. Cycle prevention: if the target is already superseded, the command errors with the successor's ID so you can supersede that one instead. Reuses `--layer`, `--tag`, `--pin` from `remember`.
- `--include-superseded` on `hippo recall` / `explain`. Returns historical memories with a `[superseded]` marker in output. Default recall hides them.
- `--as-of <ISO-date>` on `hippo recall` / `explain`. Returns the set of memories that were current at that date. Validates input at CLI entry; invalid dates exit with a clear ISO-format hint.
- Partial index for fast current-only queries. `CREATE INDEX idx_memories_current ON memories(layer, created) WHERE superseded_by IS NULL` makes the default recall path cheap even with large archives.
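One way to read "current at that date" is sketched below, under stated assumptions: a memory counts as current at date D if its `valid_from` is on or before D and it either has no successor or its successor only became valid after D. The row shape, function names, and this exact rule are illustrative and may not match hippo's query.

```typescript
// Hypothetical row shape using the two v11 columns.
interface MemoryRow {
  id: string;
  valid_from: string; // ISO-8601 timestamp
  superseded_by: string | null;
}

// Reject anything that is not an ISO-style date before querying.
function parseIsoDate(input: string): number {
  const ms = Date.parse(input);
  if (!/^\d{4}-\d{2}-\d{2}/.test(input) || Number.isNaN(ms)) {
    throw new Error(`Invalid --as-of date "${input}"; expected ISO format, e.g. 2026-04-22`);
  }
  return ms;
}

// Assumed rule: valid on or before the cutoff, and not yet replaced at the cutoff.
function currentAsOf(rows: MemoryRow[], asOf: string): MemoryRow[] {
  const cutoff = parseIsoDate(asOf);
  const byId = new Map<string, MemoryRow>();
  for (const r of rows) byId.set(r.id, r);
  return rows.filter((r) => {
    if (Date.parse(r.valid_from) > cutoff) return false;
    if (r.superseded_by === null) return true;
    const successor = byId.get(r.superseded_by);
    // A missing successor row is treated as "already superseded" (assumption).
    return successor !== undefined && Date.parse(successor.valid_from) > cutoff;
  });
}
```

Under this rule, a memory superseded in March is still returned for a February `--as-of` date, while default recall would hide it.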
Changed
- `markRetrieved` is a no-op for superseded entries. Retrieving a historical memory (via `--include-superseded`) no longer strengthens it or extends its half-life. Historical reads shouldn't revive dead beliefs.
- `detectConflicts` skips superseded pairs. No point flagging "these contradict" when one side is historically dead.
Research
- Physics search ablation: CUT verdict. Benchmarked physics-on vs physics-off over 60 stratified LongMemEval-oracle questions (paired bootstrap, 5000 iters). Physics OFF: MRR 0.8388, Recall@5 84.31%, NDCG@5 0.7888. Physics ON: MRR 0.6848, Recall@5 74.17%, NDCG@5 0.6570. All metrics statistically worse with physics; 95% CI excludes zero. Results in `benchmarks/physics-ablation/`. Physics remains in the codebase and is not removed in this release; a decision on removal is tracked as follow-up.
- LoCoMo harness built. `benchmarks/locomo/run.py` scores hippo against snap-research's long-conversation memory benchmark using Claude as judge. Sanity run (3 QAs): 2 adversarial abstentions correct, 1 open-domain miss. Full 10-conversation run requires overnight batch due to ~2 turns/sec ingestion.
Internal
- 633 tests pass (+8 from v0.31.0). 3 new test files: `bi-temporal-migration.test.ts`, `cli-supersede.test.ts`, `bi-temporal-recall.test.ts`.
- 4 commits on master: `091e6de` (schema v11), `026988b` (supersede command), `b538c0d` (recall filters), `7108187` (review fixes).
- Reviewed via `/review` + `/self-review` + `/ship-check`. Two fixes landed: `--as-of` date validation (previously silent no-op on invalid input) and `cmdSupersede --tag` parity with `cmdRemember` (previously only accepted comma-separated, dropped repeated flags).
v0.31.0 — Scope-aware corrections
Route corrections to the right context. Memories can now carry a scope tag, so a correction said during one skill surfaces strongly when that skill runs again, and stays quiet elsewhere.
Added
- `hippo remember --scope <name>` tags memories with a context scope.
- `scopeBoost` in search scoring: 1.5x when the active scope matches the memory's scope, 0.5x when it mismatches, 1.0x for unscoped. Same multiplier pattern as the existing `decisionBoost` / `pathBoost` / `outcomeBoost`.
- Auto-detect from env: `HIPPO_SCOPE` > `GSTACK_SKILL` > `OPENCLAW_SKILL`. When any is set, `remember` / `recall` / `context` / `explain` auto-apply the scope. Pure env var reads, no I/O on hot paths.
- `--scope` on `hippo recall`, `hippo context`, `hippo explain`: explicit scope overrides auto-detect.
- `hippo explain --why` surfaces the scope multiplier when it fires, so scope routing is debuggable.
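The precedence and multipliers above can be sketched as two small pure functions. The function names and signatures are hypothetical (the actual `src/scope.ts` is not reproduced here), and treating "no active scope" the same as "unscoped memory" (1.0x) is an assumption.

```typescript
type Env = Record<string, string | undefined>;

// An explicit --scope wins; otherwise HIPPO_SCOPE > GSTACK_SKILL > OPENCLAW_SKILL.
function detectScope(env: Env, explicit?: string): string | undefined {
  const candidates = [explicit, env.HIPPO_SCOPE, env.GSTACK_SKILL, env.OPENCLAW_SKILL];
  for (const candidate of candidates) {
    const trimmed = candidate?.trim();
    if (trimmed) return trimmed;
  }
  return undefined;
}

// 1.5x on match, 0.5x on mismatch, 1.0x when either side has no scope (assumption).
function scopeBoost(activeScope: string | undefined, memoryScope: string | undefined): number {
  if (!activeScope || !memoryScope) return 1.0;
  return activeScope === memoryScope ? 1.5 : 0.5;
}
```

A caller would multiply the base relevance score by `scopeBoost(detectScope(process.env), memory.scope)`, in the same way as the other boost multipliers.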
Why
Inspired by claude-reflect's "skill improvement routing" — corrections during a specific skill should update that skill's context, not global memory. Hippo's take: same outcome via tag + scoring multiplier, no separate file hierarchy.
Internal
- 625 tests pass (+21 new, 0 regressions).
- New module `src/scope.ts` (32 lines). No schema migration. No breaking changes.
- Git-branch fallback was prototyped and dropped after `/review`: it forked git on every hook call (~50-150ms per user message) and polluted the tag space with ephemeral branch names.
Upgrade
```bash
npm install -g hippo-memory@0.31.0
```
No migration. Additive only.
Full changelog
https://github.com/kitfunso/hippo-memory/blob/master/CHANGELOG.md#0310-2026-04-22
v0.30.1 — Sequence binding polish
Patch release: fixes three UX bugs found by code review + npm smoke test on v0.30.0. Sequence binding (the v0.30.0 headline feature) now works end-to-end as documented.
Fixed
- `hippo recall --layer` is now a strict filter. Previously the flag was accepted but silently dropped, so results from other layers leaked in. This broke the RSI demo's `recall --layer trace` example.
- `hippo status` now prints a `Trace:` counter. The new trace layer was tracked internally but never surfaced.
- `hippo --version` / `-v` prints the package version. Previously errored with "Unknown command".
Internal
- 604 tests pass (+5 from v0.30.0). 3 new test files cover all three fixes via `execFileSync` against `bin/hippo.js`.
- Caught by `senior-code-reviewer` + npm smoke test against the published 0.30.0 artifact before the GitHub Release went public.
Upgrade
```bash
npm install -g hippo-memory@0.30.1
```
No migration. No breaking changes. If you're on v0.30.0, just upgrade.
v0.30.0 context
v0.30.0 introduced sequence binding — a new `Layer.Trace` memory type for ordered action→outcome sequences, foundation for recursive-self-improvement agents. See the v0.30.0 release notes for the full feature set.
Full changelog
https://github.com/kitfunso/hippo-memory/blob/master/CHANGELOG.md#0301-2026-04-22
v0.30.0 — Sequence binding (RSI foundation)
Sequence binding for recursive-self-improvement agents.
Agents can now remember step-by-step execution traces tagged with outcome (success / failure / partial), then recall them by outcome to learn from past attempts.
Highlights
- `Layer.Trace`: a new memory layer for ordered action→outcome sequences. Traces inherit decay, retrieval-strengthening, conflict detection, embeddings, replay, and physics from existing infrastructure. Four inheritance smoke tests lock that claim.
- `hippo trace record --task <t> --steps <json> --outcome <...>`: explicit trace storage.
- `hippo session complete --session <id> --outcome <...>`: terminal event marking a session as finished.
- `hippo recall --outcome success`: retrieve only successful prior strategies.
- Auto-promotion during `hippo sleep`: completed sessions become bound traces automatically. Idempotent; three consecutive sleeps produce exactly one trace per session.
- `examples/rsi-demo/`: minimal RSI agent, deterministic and CI-runnable. 50-task suite. Current seed: 20% success on tasks 1-10 rising to 100% on tasks 41-50. Non-zero exit if the learning curve collapses.
- Schema v3 migration preserves existing data. 599 tests pass.
Known issues in this release (fixed in v0.30.1)
- `hippo recall --layer trace` was accepted but silently dropped, so results leaked other layers.
- `hippo status` had no `Trace:` counter.
- `hippo --version` / `-v` was missing.
Upgrade to v0.30.1 for the clean experience.
v0.24.2 - machine-level daily runner + detached OpenClaw autosleep
What's new
- Added a machine-level daily runner that sweeps registered Hippo workspaces and runs daily learn + sleep without requiring one OS task per project.
- Updated the OpenClaw plugin to queue detached session-end autosleep so consolidation no longer blocks shutdown.
- Updated the docs to separate query-time retrieval from session-end and daily refresh behavior across integrations.
Verification
- npm run build
- npm test
- npm publish
v0.24.1 - conflict detection content-gate fix
Fixed
- Conflict detection now gates on content overlap, not shared tags. `hippo sleep` no longer flags unrelated `feedback` / `policy` memories as contradictions just because they share coarse tags and opposite polarity words.
- Reworded contradictions still surface. Opposites like `API auth must be enabled in prod` / `Disable API auth in prod` stay detectable instead of being filtered out by a blunt overlap threshold.
- `must` and `always` now count as positive polarity. Contradictions like `Production deploys must require approval` / `Production deploys should not require approval` are caught consistently.
Internal
- Added regression tests for the exact false-positive pairs from the migrated-store report plus a broader contradiction matrix (`must` vs `should not`, `available` vs `missing`, `works` vs `broken`). Full suite passes: 491 tests.