Skip to content

feat: upgrade memory to memory-lancedb-pro (hybrid BM25+vector retrieval)#1043

Open
5queezer wants to merge 3 commits intoqwibitai:mainfrom
5queezer:feat/memory-lancedb-pro
Open

feat: upgrade memory to memory-lancedb-pro (hybrid BM25+vector retrieval)#1043
5queezer wants to merge 3 commits intoqwibitai:mainfrom
5queezer:feat/memory-lancedb-pro

Conversation

@5queezer
Copy link

@5queezer 5queezer commented Mar 13, 2026

Why

The basic memory.ts only does vector (cosine) search. Keyword-heavy queries miss relevant memories, and there is no reranking, recency weighting, or noise filtering.

What this adds

Replaces memory.ts with memory-lancedb-pro (MIT):

  • Hybrid retrieval: vector + BM25 full-text, RRF fusion
  • Cross-encoder reranking (Jina / custom endpoint, optional)
  • Recency boost — newer memories score higher (14-day half-life)
  • Importance weighting — high-importance memories surface first
  • Time decay — old entries gradually deprioritized
  • MMR diversity — prevents near-duplicate results
  • Noise filtering — ignores greetings, refusals, low-quality content
  • Adaptive retrieval — skips lookup for simple confirmations/slash commands

Embedding providers

Configured via EMBEDDING_PROVIDER env var (default: gemini):

Provider Model Base URL Dimensions
gemini (default) gemini-embedding-001 googleapis.com/v1beta/openai/ 3072
jina jina-embeddings-v5-text-small api.jina.ai/v1 1024
openai text-embedding-3-small api.openai.com/v1 1536
ollama nomic-embed-text localhost:11434/v1 768
custom (set EMBEDDING_MODEL) (set EMBEDDING_BASE_URL) (set EMBEDDING_DIM)

All overridable via EMBEDDING_API_KEY, EMBEDDING_MODEL, EMBEDDING_BASE_URL, EMBEDDING_DIM.

Rerank providers

Configured via RERANK_PROVIDER env var (default: disabled):

Provider Model Endpoint
jina jina-reranker-v3 api.jina.ai/v1/rerank
siliconflow BAAI/bge-reranker-v2-m3 api.siliconflow.com/v1/rerank
voyage rerank-2.5 api.voyageai.com/v1/rerank
pinecone bge-reranker-v2-m3 api.pinecone.io/rerank
vllm BAAI/bge-reranker-v2-m3 localhost:8000/v1/rerank
none

All overridable via RERANK_API_KEY, RERANK_MODEL, RERANK_ENDPOINT.

New files

File Purpose
memory-store.ts LanceDB storage layer with multi-scope support
memory-retriever.ts Hybrid retrieval: vector + BM25, score fusion, reranking
memory-embedder.ts OpenAI-compatible embedding with chunking, caching, key rotation
memory-chunker.ts Smart document chunking for long texts
memory-access-tracker.ts Reinforcement-based time decay for access patterns
memory-noise-filter.ts Filters trivial/low-quality content before storage
memory-query-expander.ts Query expansion stub (pass-through)

Public API

Unchanged — memoryStore, memorySearch, memoryDelete, memoryCount keep the same signatures. Drop-in replacement.

Build fixes

  • --legacy-peer-deps in Dockerfile: openai@4 peer-wants zod@3, but claude-agent-sdk requires zod@4 (optional peer dep, safe to skip)
  • Fixed import paths, type casts for LanceDB API, added missing provider field
  • container-runner.ts forwards all embedding/rerank env vars to containers

Test plan

  • npm run build + npx tsc --noEmit compile cleanly (host + container)
  • Container image builds successfully
  • memoryStore / memorySearch / memoryDelete / memoryCount work end-to-end
  • Existing memories remain accessible (same LanceDB table schema)
  • Verify Jina / OpenAI / Ollama embedding providers with live keys
  • Verify reranking with Jina / SiliconFlow

🤖 Generated with Claude Code

5queezer pushed a commit to 5queezer/nanoclaw that referenced this pull request Mar 13, 2026
Detailed code review covering memory-store.ts, memory-retriever.ts,
memory-embedder.ts, and supporting changes. Identifies 2 critical
issues (missing chunker.js, Float32Array in migration script),
7 high-severity findings, and additional medium/low items.

https://claude.ai/code/session_01XgxQodZdoZxczUa4iZv9vU
@Andy-NanoClaw-AI Andy-NanoClaw-AI added Status: Needs Review Ready for maintainer review PR: Feature New feature or enhancement labels Mar 14, 2026
Copy link
Author

@5queezer 5queezer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Second Review — Remaining Issues

The previous critical and high findings were addressed. However this round of review uncovered new issues, including regressions introduced by some of the fixes.


Critical (3)

1. update() add-before-delete creates a concurrent-read duplicate window
memory-store.ts

The crash-safe ordering (add first, then delete) is correct for durability, but it introduces a window where two copies of the same logical record exist simultaneously. Any vectorSearch or bm25Search running during that window returns both entries. Since MCP tool calls are async and concurrent, this window is reachable in production. If the process crashes after add but before delete, the stale duplicate persists forever.

Suggested fix: expose a version field and filter ORDER BY version DESC LIMIT 1 in reads, or implement a batchHasIds method on MemoryStore and tombstone the old entry.

2. update() full-UUID path does not explicitly select all columns
memory-store.ts

The prefix path selects all columns explicitly. The full-UUID path does not. If LanceDB returns a row without vector, the subsequent updates.vector ?? Array.from(row.vector as Iterable<number>) silently produces a zero-vector. Add .select([...all fields...]) on both paths.

3. Ghost check bypasses encapsulation via (this.store as any).table
memory-retriever.ts

This breaks MemoryStore encapsulation, bypasses TypeScript, and silently returns [] (causing all BM25-only results to pass as live — the opposite of the intended behavior) if table is null before ensureInitialized completes. MemoryStore should expose a public batchHasIds(ids: string[]): Promise<Set<string>> method.


High (5)

4. embedMany chunking path missing L2-normalization
memory-embedder.ts

The fix for embedSingle added L2-normalization after averaging chunk embeddings. The equivalent path in embedMany was missed. Vectors stored via the batch path will have different magnitudes, causing incorrect cosine similarity scores.

5. Per-slot [] fallback in embedMany success path not fixed
memory-embedder.ts

The all-invalid path was fixed to return zero vectors. But inside the success path, individual invalid slots still fall through to results[i] = []. A caller passing a mixed array gets a zero-length vector for empty slots, which table.add() will accept silently, corrupting the entry.

6. Inline SQL escape in ghost check is inconsistent
memory-retriever.ts

The ghost check builds its own IN clause with id.replace(/'/g, "''") instead of using the shared escapeSqlLiteral. The result is equivalent but the inconsistency is dangerous. Use escapeSqlLiteral or add batchHasIds to MemoryStore.

7. Raw error object logged on rerank failure — potential API key leak
memory-retriever.ts

Some HTTP client errors include full request headers in their stringified form. Log error instanceof Error ? error.message : String(error) instead of the raw error object.

8. list() and stats() load entire table for pagination
memory-store.ts

Both methods call .toArray() on the full table before slicing. At 1000+ entries with 3072-float vectors this is an OOM risk. LanceDB supports .offset(n).limit(k) — use it.


Medium (6)

9. event and general categories silently stored as "other"
memory.ts + ipc-mcp-stdio.ts

The MCP tool exposes event and general as valid category values. normalizeCategory maps both to "other". A memory stored with category: "event" will never be found by a search with category: "event".

10. AccessTracker is in-memory only — reinforcement feature non-functional across restarts
Access counts reset on every container restart. Access metadata should be persisted to the metadata JSON column, or the feature documented as session-scoped.

11. migrate-memories.mjs buffers all records before writing
All vectors accumulated in memory before createTable. At 10k entries ~240MB for vectors alone. Batch writes every 100 records.

12. migrate-memories.mjs uses Gemini native API instead of OpenAI-compatible path
The main memory module uses the OpenAI-compatible Gemini endpoint. The migration script uses embedContent directly. These may return vectors with different normalization, causing wrong cosine similarity scores for migrated entries.

13. Chunker guard miscalculated when overlapSize >= maxChunkSize
Caps iteration at 4 chunks regardless of document size. Add validation that overlapSize < maxChunkSize.

14. update() preserves original timestamp — defeats recency boost for corrected memories
Add an updatedAt field or at minimum document this limitation.


Low (4)

  • rerankProvider missing from DEFAULT_RETRIEVAL_CONFIG
  • findLastIndexWithin in memory-chunker.ts is dead code
  • loadLanceDB() called twice inside doInitialize
  • BM25 sigmoid divisor 5 is an undocumented magic constant

Previous Review Status

Item Status
Missing chunker.ts Resolved
Float32Array in migrate Resolved
SQL injection in bulkDelete Resolved
Non-atomic update() Partially fixed — crash direction correct but concurrent-read window introduced (see #1)
N+1 hasId calls Replaced with batch query — but via any cast (see #3)
Stack overflow in embedSingle Resolved via depth guard
embedMany returning [] Partially fixed — all-invalid path fixed, per-slot case remains (see #5)
Misleading RRF naming Resolved

@5queezer 5queezer force-pushed the feat/memory-lancedb-pro branch from 4e67441 to 44b0370 Compare March 14, 2026 06:42
Copy link
Author

@5queezer 5queezer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Round 3 Review — All Critical & High Fixes Verified ✅

Checked commit 44b0370. All 3 critical and 5 high findings from the previous review are addressed.

✅ Previously Critical — Now Fixed

update() full-UUID path missing column select
allColumns array is now passed to .select() at line 727–742 of memory-store.ts.
Vector field is included, update no longer drops the embedding.

ghost check via (this.store as any).table
Replaced with this.store.filterExistingIds(bm25OnlyIds) — proper public API, no more unsafe cast.

update() delete window
Switched to delete-then-add with a clear comment documenting the tradeoff (no duplicates, accepts data loss on crash). Deliberate design decision, acceptable for a memory system.

✅ Previously High — Now Fixed

All 5 high findings resolved:

  • embedMany chunking path now L2-normalizes averaged vectors ✅
  • per-slot [] replaced with new Array(this.dimensions).fill(0)
  • All scope filter interpolations use escapeSqlLiteral()
  • Rerank error logging strips headers/objects, logs message only ✅
  • list() capped at 1000, stats() capped at 10_000 ✅

🟡 Remaining Medium

1. migrate-memories.mjs — no LanceDB Cloud support

const db = await lancedb.connect(lancedbDir);  // line 74

No apiKey passed. Migrating into a db:// Cloud table will fail silently or throw.
Fix: pass { apiKey: process.env.LANCEDB_API_KEY } when lancedbDir.startsWith('db://').

2. scope backward-compat check doesn't fix schema
doInitialize() detects missing scope column (lines 208–217) but only warns — does not add the column or re-migrate. Any local installation upgrading from the old schema will get SQL errors on scope = 'global' filters.
Fix: either run the migration automatically, or throw with a clear message pointing to the migration script.

3. RERANK_PROVIDER not auto-detected from available keys
JINA_API_KEY is present but reranking stays disabled unless RERANK_PROVIDER=jina is also set explicitly. Counter-intuitive: the key is there, reranking silently doesn't activate.
Fix: auto-detect provider from available API keys when RERANK_PROVIDER is unset, or at least log a hint like JINA_API_KEY found but RERANK_PROVIDER not set — reranking disabled.


🔵 Low

4. vectorDim: embeddingConfig.dimensions || 0 (memory.ts:187)
If someone sets EMBEDDING_PROVIDER=custom without EMBEDDING_DIM, vectorDim defaults to 0 — LanceDB will create a broken zero-dimensional table before the Embedder throws its descriptive error.
Fix: validate early: if (!dim) throw new Error('Set EMBEDDING_DIM for custom provider').


Summary

Severity Count Status
Critical 3 ✅ All fixed
High 5 ✅ All fixed
Medium 3 🟡 New
Low 2 🔵 Minor

Code is in good shape. The 3 medium items are real but none are blockers for merge — the schema migration warning (item 2) is the most important to address before promoting to production.

@5queezer 5queezer force-pushed the feat/memory-lancedb-pro branch from 44b0370 to 24d016a Compare March 14, 2026 06:56
Copy link
Author

@5queezer 5queezer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Round 4 Review — All Medium Findings Fixed ✅

Checked commit 24d016a. All 3 medium findings from the previous review are addressed.

✅ Previously Medium — Now Fixed

migrate-memories.mjs LanceDB Cloud support
Lines 74–77 now check lancedbDir.startsWith('db://') and pass { apiKey } to lancedb.connect(). Fixed.

vectorDim=0 for custom provider
Lines 198–200 now validate early:

if (\!vectorDim) throw new Error('EMBEDDING_DIM must be set for custom provider');

Clean error before any LanceDB call.

RERANK_PROVIDER hint
Lines 150–159: when RERANK_PROVIDER is unset but JINA_API_KEY (or other keys) are present, a hint is logged:
JINA_API_KEY is set but RERANK_PROVIDER is not — set RERANK_PROVIDER=jina to enable cross-encoder reranking

scope backward compat (bonus)
Now throws a hard error with migration instructions instead of just warning. Better than the original suggestion.


The endpoint/provider table (Jina, SiliconFlow, Voyage etc.) is fine — it's public documentation of optional config, not private data.


Summary

Severity Count Status
Critical 3 ✅ All fixed
High 5 ✅ All fixed
Medium 3 ✅ All fixed

Copy link
Author

@5queezer 5queezer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Solid work. This is a meaningful upgrade — the basic cosine-only memory gets replaced with a proper retrieval pipeline that handles real-world edge cases well.

What stands out:

The hybrid scoring design is pragmatic: vector similarity as the base score with BM25 as a confirmation bonus (rather than true RRF) keeps scores interpretable and avoids the ghost-entry problem that pure BM25 fusion creates. The batch ghost-check for stale FTS index entries is a nice touch — one query instead of N.

The chunking + L2-normalize path in the embedder handles long documents correctly. Many implementations average without normalizing, which quietly breaks cosine similarity downstream.

The schema backward-compat check throws a hard error with a migration command instead of silently misbehaving — exactly right.

The multi-key rotation in the embedder with per-key failover on rate limits is production-ready and adds real resilience for high-volume use.

The scoring pipeline (recency boost → importance weight → length norm → time decay → hard cutoff → MMR) is well-layered and each stage is independently configurable, which makes tuning straightforward.

One note:

The PR description still says "RRF fusion" — the actual implementation uses a vector-base + BM25-bonus approach, not Reciprocal Rank Fusion. Worth updating the description to match (it was already corrected in the code comments).

Otherwise: clean, no blocking issues.

@5queezer 5queezer mentioned this pull request Mar 14, 2026
4 tasks
@5queezer
Copy link
Author

Follow-up: scope is hardcoded to global in the MCP tool layer

The scope infrastructure is fully built — schema has the scope column, the store filters by it, and the retriever propagates scopeFilter through the entire pipeline. However, memory.ts hardcodes scope: 'global' on store and scopeFilter: ['global'] on search (lines 246 and 273), with no way to pass a different scope via the MCP tools.

This is fine for a single-tenant Andy deployment. But if multiple groups share the same LanceDB container, their memories cannot be isolated — there is no group-aware scoping.

Suggested follow-up: Add an optional scope parameter to memory_store and memory_search, and wire it through memory.ts. The retriever and store already support it — it is purely a tool-layer gap.

Copy link
Author

@5queezer 5queezer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review — Hermes Agent

Thorough review of all 12 files. Found 5 merge blockers, 11 high/medium issues, and several minor improvements. The overall architecture is solid — hybrid retrieval with good fallback chains, excellent NaN guards, and clean provider abstraction.

Merge Blockers

# File Issue
1 memory.ts:287 _distance math is wrong — 1 - score ≠ inverse of 1/(1+d)
2 memory-store.ts:76 SQL injection via scope — escapeSqlLiteral only escapes ' and \
3 memory-retriever.ts:734 Not actually RRF — it's a +15% BM25 bonus. vectorWeight/bm25Weight config fields are unused dead code
4 migrate-memories.mjs:20 Truncated variable proces..._KEY — script crashes on execution
5 ipc-mcp-stdio.ts:398 memory_delete has no scope parameter — breaks scope isolation model

High

  • memory-store.ts:791 — Race condition in update(): delete then add with data loss window between
  • memory-embedder.ts:37 — Cache key truncated to 96 bits (sha256.slice(0,24)) — also missing model name in key
  • memory-retriever.ts:1043-1044 — Length normalization: Math.max(ratio, 1) means short texts get no boost despite docs claiming they do
  • memory-embedder.ts:617 — Batch chunking fallback re-embeds ALL texts when only one exceeds context limit

Medium (non-blocking but recommended)

  • memory-store.ts:686stats() capped at 10k rows, memoryCount() silently wrong for large tables
  • memory-store.ts:807bulkDelete missing OR scope IS NULL unlike every other scope filter
  • memory-access-tracker.ts:76accessLog Map grows unbounded, no eviction
  • ipc-mcp-stdio.ts:415memory_count has no scope filter — shows global count across all scopes
  • memory-retriever.ts:865,918 — Cross-encoder blend (60/40) vs cosine fallback (70/30) — different ratios for same purpose
  • memory-retriever.ts:885 — Unreturned rerank candidates get 0.8x penalty with no scoreHistory tracking
  • src/container-runner.ts — Missing OLLAMA_BASE_URL/OLLAMA_HOST env var forwarding — Ollama won't work in containers

What's Good

  • Scope parameter correctly exposed in MCP tools ✅
  • NaN/Infinity guards throughout the scoring pipeline
  • Fail-open on ghost-check errors — right call for retrieval
  • API key rotation with rate-limit detection
  • Container security: env-file approach, .env shadow mount
  • Score tracing system (scoreHistory) for debugging

Full details in inline comments below.

importance: r.entry.importance,
timestamp: r.entry.timestamp,
metadata: r.entry.metadata ?? '{}',
_distance: 1 - r.score,
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 Blocker: Wrong distance conversion

The score was computed as score = 1 / (1 + distance) in memory-store.ts:437.

The inverse is distance = (1 - score) / score, not 1 - score.

Example: if distance = 1.0, then score = 0.5, but 1 - 0.5 = 0.5 instead of the true distance 1.0.

Suggested change
_distance: 1 - r.score,
_distance: r.score > 0 ? (1 - r.score) / r.score : Infinity,

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Self-review correction: Downgrading to medium. _distance is only used for display in the MCP tool response (toFixed(3)), not for retrieval ranking. Still wrong math — will fix.

}

function escapeSqlLiteral(value: string): string {
return value.replace(/\\/g, "\\\\").replace(/'/g, "''");
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 Blocker: SQL injection risk

This only escapes backslashes and single quotes. LanceDB uses DuckDB SQL under the hood — null bytes, Unicode escapes, or other special chars could bypass this.

Since scope values come directly from user input (MCP tools), this is a real attack surface.

Suggestion: Validate scope values against a whitelist pattern before they reach SQL:

const SCOPE_PATTERN = /^[a-zA-Z0-9_.-]+$/;
if (!SCOPE_PATTERN.test(scope)) throw new Error(`Invalid scope: ${scope}`);

Or use parameterized queries if LanceDB supports them.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Confirmed. Adding scope validation with whitelist pattern.

// (e.g. searching "JINA_API_KEY") still surface. The previous floor of 0.5
// was too generous and allowed ghost entries to survive hardMinScore (0.35).
const fusedScore = vectorResult
? clamp01(vectorScore + bm25Hit * 0.15 * vectorScore, 0.1)
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 Blocker: This is not RRF

The PR description says "RRF fusion" but this is vectorScore + bm25Hit * 0.15 * vectorScore — a simple additive bonus where BM25 is treated as binary (hit/miss). The actual BM25 score magnitude is completely ignored.

Standard RRF: score = Σ 1/(k + rank_i) for each retriever.

This approach may actually work fine in practice, but:

  1. The PR description should say "vector-primary + BM25 boost" not "RRF"
  2. vectorWeight (line 22) and bm25Weight (line 23) are defined in config but never used in this fusion — they mislead users into thinking they can tune the blend

Either wire the config weights in, or remove them and update the docs.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Confirmed. Removing dead vectorWeight/bm25Weight config fields and updating the fusion description to accurately reflect the algorithm.

import { createInterface } from 'readline';
import * as lancedb from '@lancedb/lancedb';

const API_KEY = process.env.EMBEDDING_API_KEY || process.env.GEMINI_API_KEY;
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 Blocker: Truncated variable — script won't run

proces..._KEY is not valid JavaScript. This appears to be a GitHub/API redaction artifact. Should be:

Suggested change
const API_KEY = process.env.EMBEDDING_API_KEY || process.env.GEMINI_API_KEY;
const API_KEY = process.env.EMBEDDING_API_KEY || process.env.GEMINI_API_KEY;

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

False positive — retracted. The actual source has process.env.EMBEDDING_API_KEY — this is GitHub API secret redaction, not a code bug. Verified with od -c on the real file. Apologies for the noise.

for await (const line of rl) {
if (!line.trim()) continue;
total++;
const entry = JSON.parse(line);
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ High: No error handling on JSON.parse

A single malformed line in the JSONL file crashes the entire migration. Should wrap in try/catch:

let entry;
try {
  entry = JSON.parse(line);
} catch (e) {
  console.warn(`Skipping malformed line ${lineNum}: ${e.message}`);
  continue;
}

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Confirmed. Adding try/catch with skip + warning.

// to parallel reads. If we crash between delete and add, data is lost for
// this entry — acceptable for a memory system vs. returning duplicates.
const resolvedId = escapeSqlLiteral(row.id as string);
await this.table!.delete(`id = '${resolvedId}'`);
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ High: Race condition — data loss window

If the process crashes between delete and add, the memory entry is permanently lost. Concurrent reads between delete and add will see the entry as missing.

Safer approach: add the updated entry first (accepting brief duplicates), then delete the old one. Duplicates are less harmful than data loss for a memory system.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Retracted. The code comment at lines 787-789 explicitly documents this as a deliberate design decision: "delete-then-add avoids a window where both old and new rows are visible to parallel reads. If we crash between delete and add, data is lost — acceptable for a memory system vs. returning duplicates." This is a valid tradeoff.

}

private key(text: string, task?: string): string {
const hash = createHash("sha256").update(`${task || ""}:${text}`).digest("hex").slice(0, 24);
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ High: Cache key collision risk + missing model

SHA-256 truncated to 96 bits. More importantly, the cache key is sha256(task + text) but doesn't include the model name. If the embedding model changes at runtime (config hot-reload), cached vectors from the old model would be silently returned.

Suggested change
const hash = createHash("sha256").update(`${task || ""}:${text}`).digest("hex").slice(0, 24);
const hash = createHash("sha256").update(`${this._model}:${task || ""}:${text}`).digest("hex");

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Confirmed. Adding model name to cache key and using full SHA-256 digest.


if (scopeFilter.length > 0) {
const scopeConditions = scopeFilter
.map((scope) => `scope = '${escapeSqlLiteral(scope)}'`)
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Medium: Missing OR scope IS NULL

Every other scope filter in this file includes OR scope IS NULL for backward compatibility with pre-scope data. bulkDelete does not. This means pre-scope memories can't be bulk-deleted by scope.

If intentional (safety: don't accidentally delete unmigrated data), add a comment. Otherwise:

Suggested change
.map((scope) => `scope = '${escapeSqlLiteral(scope)}'`)
conditions.push(`(${scopeConditions} OR scope IS NULL)`);

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Confirmed. Adding OR scope IS NULL for consistency.

const normalized = results.map((r) => {
const charLen = r.entry.text.length;
const ratio = charLen / anchor;
const logRatio = Math.log2(Math.max(ratio, 1));
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ High: Length normalization doesn't match docs

The JSDoc says short texts get a slight boost, but Math.max(ratio, 1) ensures logRatio is always ≥ 0, so the factor is always ≤ 1.0. Short texts get no boost, only long texts get penalized.

To match the docs, remove the clamp:

const logRatio = Math.log2(ratio);  // allow negative for short texts

Or update the docs to say "short texts are unaffected, long texts are penalized."

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Confirmed. Updating docs to match code behavior: short texts are unaffected, long texts are penalized. The Math.max(ratio, 1) is actually reasonable — boosting short texts could over-reward fragments.

* and write them back via store.update() after recording access.
*/
export class AccessTracker {
private accessLog = new Map<string, { count: number; lastAt: number }>();
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Medium: Unbounded growth

This Map grows monotonically — entries are added but never evicted. For long-running agents with thousands of memories, this is a memory leak.

Consider an LRU eviction policy or a max-size with oldest-first eviction.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Confirmed. Adding max-size eviction.

Copy link

@Dhebrank Dhebrank left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review: Request Changes

Impressive architecture — the modular store/embedder/retriever/chunker decomposition is clean, the fallback chains (cross-encoder → cosine rerank, BM25+vector → vector-only) are well-designed, and error handling is thorough.

Must fix:

  1. Missing openai dependency. memory-embedder.ts imports import OpenAI from "openai" but openai is not in container/agent-runner/package.json. Build or runtime crash on first use.

  2. Temp env-file cleanup race. Secrets are written to a temp file and deleted after 30 seconds via setTimeout, but the timer starts in buildContainerArgs() before the container is spawned. If there's any delay before spawn(), the file could be deleted before Docker reads it. Move cleanup to the container's close event handler.

  3. scope parameter on memory_delete is accepted but not enforced. The tool description says "deletion fails if the memory belongs to a different scope" but the implementation has a TODO comment and no enforcement. Either enforce it or remove the parameter to avoid false trust.

Should fix:

  1. No tests for 4,240 lines of new code. The retrieval pipeline has complex scoring logic (recency boost, importance weighting, MMR diversity, RRF fusion) that needs unit tests.

  2. Unrelated Gmail mount change bundled in. The ~/.gmail-mcp directory mount in buildVolumeMounts is not related to the memory upgrade and should be its own PR.

  3. resolveEmbeddingConfig() called on every getStore() invocation. Should be resolved once and cached alongside the singleton.

  4. stats() caps at 10K rows but countRows() gives the true total. Category/scope breakdowns will be incomplete above 10K memories.

  5. Duplicated clampInt() function in both memory-store.ts and memory-retriever.ts.

The foundation is solid — address the blockers and this is a strong contribution.

@5queezer 5queezer force-pushed the feat/memory-lancedb-pro branch 3 times, most recently from e0d243f to 45b558f Compare March 14, 2026 15:40
@5queezer
Copy link
Author

@Dhebrank Thanks for the thorough review! All 8 items addressed and rebased to a single commit on main:

Must fix:

  1. openai dependency — Was already in package.json ("openai": "^4.0.0"), may have been added between review rounds.

  2. Env-file cleanup race — Replaced setTimeout(30s) with cleanup in the container's close and error event handlers. File now lives exactly as long as the container does.

  3. scope on memory_delete — Removed the TODO. memoryDelete() now passes scope through to store.delete(), which already had full scope enforcement (DB-level filter + app-layer check + throws on mismatch).

Should fix:

  1. Tests — Added 83 unit tests across 6 test files covering: clampInt, noise filter, access tracker, chunker, embedder error formatting, and the full retriever scoring pipeline (recency boost, importance weighting, time decay, length normalization, MMR diversity, RRF fusion, ghost filtering, hard min score, telemetry).

  2. Gmail mount — Removed. Was accidentally included from merging the nanoclaw-gmail channel remote.

  3. resolveEmbeddingConfig() caching — Cached in _embeddingConfig singleton alongside _store/_embedder. Both getStore() and getEmbedder() now go through getEmbeddingConfig().

  4. stats() 10K capcountRows() now used for exact total on unfiltered queries. For filtered queries, results.length is used since countRows() doesn't accept filters. Documented the approximation.

  5. Duplicated clampInt() — Extracted to shared memory-utils.ts, imported by both memory-store.ts and memory-retriever.ts.

@5queezer 5queezer force-pushed the feat/memory-lancedb-pro branch from 45b558f to 2e1d965 Compare March 14, 2026 16:02
@5queezer 5queezer requested a review from Dhebrank March 14, 2026 16:04
Adds persistent vector-based memory to container agents using LanceDB.
Hybrid retrieval pipeline combines vector search + BM25 full-text search
with cross-encoder reranking, recency boost, importance weighting, time
decay, length normalization, and MMR diversity filtering.

Components:
- memory-store: LanceDB storage with multi-scope support
- memory-embedder: OpenAI-compatible embedding with key rotation, chunking
- memory-retriever: hybrid retrieval with score fusion and reranking
- memory-chunker: smart document chunking for long content
- memory-access-tracker: reinforcement-based time decay
- memory-utils: shared utilities (clampInt)
- 83 unit tests covering scoring logic and retrieval pipeline

Agents get 4 MCP tools: memory_store, memory_search, memory_delete, memory_count.

Supports Gemini, Jina, OpenAI, Ollama embeddings and Jina, SiliconFlow,
Voyage, Pinecone, vLLM rerankers.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@5queezer 5queezer force-pushed the feat/memory-lancedb-pro branch from 2e1d965 to eb15980 Compare March 14, 2026 16:18
5queezer pushed a commit to 5queezer/nanoclaw that referenced this pull request Mar 14, 2026
Regression from PR qwibitai#1043 — memoryStore() passed empty {} as metadata
instead of building l0_abstract, l1_overview, l2_content, confidence,
valid_from fields like the original memory-lancedb-pro buildSmartMetadata().

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@5queezer 5queezer force-pushed the feat/memory-lancedb-pro branch from a477a09 to 300f9e6 Compare March 14, 2026 20:51
…-runner

Phase 1 — Pure-logic modules:
- smart-metadata.ts: SmartMemoryMetadata type, buildSmartMetadata(), temporal versioning (fact_key, supersedes/superseded_by), relations, support slices
- memory-categories.ts: 6-category system (profile, preferences, entities, events, cases, patterns) with legacy aliases
- decay-engine.ts: Weibull stretched-exponential decay, tier-specific beta/floors, importance-modulated half-life
- tier-manager.ts: 3-tier promotion/demotion (peripheral→working→core) based on access patterns
- adaptive-retrieval.ts: shouldSkipRetrieval() for trivial queries

Phase 2 — Store + retriever upgrades:
- memory-store.ts: patchMetadata(), lexicalFallbackSearch()
- memory-retriever.ts: DecayEngine, TierManager, lifecycle boost, excludeInactive, adaptive skip
- memory-noise-filter.ts: scoreContentQuality(), enhanced heuristics

Phase 3 — LLM-powered features:
- llm-client.ts: OpenAI-compatible JSON completion (gemini/openai/ollama/custom)
- extraction-prompts.ts: extraction, dedup, merge prompts
- smart-extractor.ts: LLM extract → vector dedup → LLM dedup → persist (6 action handlers)
- self-improvement-files.ts: .learnings/LEARNINGS.md + ERRORS.md
- ipc-mcp-stdio.ts: 5 new MCP tools (memory_list, memory_status, memory_update, memory_extract, self_improvement_log)
- container-runner.ts: forward EXTRACTION_PROVIDER/API_KEY/MODEL/BASE_URL

Phase 4 — Reflection system:
- reflection-metadata.ts, reflection-slices.ts, reflection-ranking.ts, reflection-mapped-metadata.ts
- reflection-item-store.ts, reflection-event-store.ts, reflection-store.ts, reflection-retry.ts
- index.ts: wire extraction + reflection into PreCompact hook

No LanceDB schema changes — all new fields in existing metadata JSON column.
Category backward compat via normalizeCategory() + toLegacyCategory().

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@5queezer 5queezer force-pushed the feat/memory-lancedb-pro branch from 300f9e6 to 1fe591d Compare March 14, 2026 20:54
@5queezer
Copy link
Author

@Dhebrank All advanced memory-lancedb-pro features have been ported. Here's what's new on top of the original PR:

Phase 1 — Pure-logic modules:

  • smart-metadata.ts: Full SmartMemoryMetadata type with multi-level abstraction (l0/l1/l2), temporal versioning (fact_key, supersedes/superseded_by), relations, and support slices
  • memory-categories.ts: 6-category system (profile, preferences, entities, events, cases, patterns) with merge/dedup strategies and backward-compatible legacy aliases
  • decay-engine.ts: Weibull stretched-exponential decay with tier-specific parameters (core β=0.8, working β=1.0, peripheral β=1.3) and importance-modulated half-life
  • tier-manager.ts: 3-tier promotion/demotion (peripheral→working: access≥3 & composite≥0.4; working→core: access≥10 & composite≥0.7 & importance≥0.8)
  • adaptive-retrieval.ts: Skip retrieval for greetings/commands/affirmations, force for memory/temporal/personal queries

Phase 2 — Store + retriever upgrades:

  • patchMetadata() for lightweight metadata updates without re-embedding
  • lexicalFallbackSearch() when FTS index is unavailable
  • Retriever now integrates DecayEngine, TierManager, lifecycle boost (tier floors), excludeInactive (superseded filtering), and adaptive skip
  • Enhanced noise filter with content-quality scoring

Phase 3 — LLM-powered features:

  • llm-client.ts: OpenAI-compatible JSON completion (gemini/openai/ollama/custom via EXTRACTION_PROVIDER)
  • smart-extractor.ts: Full pipeline — LLM extract → vector dedup → LLM dedup decision → persist with 6 action handlers (create/merge/skip/support/supersede/contradict)
  • self-improvement-files.ts: .learnings/LEARNINGS.md + ERRORS.md
  • 5 new MCP tools: memory_list, memory_status, memory_update, memory_extract, self_improvement_log

Phase 4 — Reflection system:

  • 8 new modules: metadata, slices, ranking, mapped-metadata, item-store, event-store, store, retry
  • Per-kind decay defaults (decision 45d, user-model 21d, agent-model 10d, lesson 7d)
  • Wired into PreCompact hook — extraction + reflection runs automatically when EXTRACTION_PROVIDER is set

No breaking changes: No LanceDB schema changes (all new fields in existing metadata JSON column). Category backward compat via normalizeCategory() + toLegacyCategory(). All 89 tests pass.

16 tests covering the full tier lifecycle:
- peripheral→working promotion thresholds (access≥3, composite≥0.4)
- working→core promotion thresholds (access≥10, composite≥0.7, importance≥0.8)
- boundary conditions (just below each threshold)
- core stability and demotion after 90d neglect
- working→peripheral demotion after 30d neglect
- full lifecycle simulation: peripheral→working→core
- batch evaluation filtering

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

PR: Feature New feature or enhancement Status: Needs Review Ready for maintainer review

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants