This document outlines the strategy for maximizing the performance of the MemPalace Node.js/TypeScript engine, leveraging V8-specific optimizations, asynchronous I/O, and worker threads.
Goal: Minimize IPC overhead and maximize ONNX runtime efficiency.
- Batch Worker Logic:
  - Refactor `src/storage/embedding_worker.ts` to accept `string[]` instead of a single `string`.
  - Update the internal Transformers.js pipeline call to process the batch in one go.
- Vector Storage Batching:
  - Add `getEmbeddings(texts: string[])` to `VectorStorage.ts`.
  - Update `upsertDrawer` and `mine` operations to chunk inputs into batches (size 10-20).
- Request Coalescing:
  - Implement a debouncer in `getEmbedding` to bundle near-simultaneous tool call requests into a single worker message.
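
The batching and coalescing steps above could fit together roughly as follows. This is a minimal sketch, not the project's implementation: `EmbeddingCoalescer`, `EmbedBatchFn`, the 5 ms window, and the batch cap of 16 are all hypothetical, and `embedBatch` stands in for whatever actually posts a `string[]` message to the worker. It uses a short fixed collection window rather than a resetting debounce, so the latency added to any single request stays bounded.

```typescript
// Hypothetical signature for whatever sends one batch to the embedding worker.
type EmbedBatchFn = (texts: string[]) => Promise<number[][]>;

interface Pending {
  text: string;
  resolve: (vector: number[]) => void;
  reject: (err: unknown) => void;
}

// Split an array into consecutive chunks of at most `size` items.
function chunk<T>(items: T[], size: number): T[][] {
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

class EmbeddingCoalescer {
  private queue: Pending[] = [];
  private timer: ReturnType<typeof setTimeout> | null = null;
  private readonly embedBatch: EmbedBatchFn;
  private readonly windowMs: number;
  private readonly maxBatch: number;

  // windowMs and maxBatch defaults are assumptions; 16 sits in the 10-20 range.
  constructor(embedBatch: EmbedBatchFn, windowMs = 5, maxBatch = 16) {
    this.embedBatch = embedBatch;
    this.windowMs = windowMs;
    this.maxBatch = maxBatch;
  }

  // Callers still ask for one vector; the request is queued, not sent at once.
  getEmbedding(text: string): Promise<number[]> {
    return new Promise<number[]>((resolve, reject) => {
      this.queue.push({ text, resolve, reject });
      if (this.timer === null) {
        // The first request in a window starts the timer; later ones join it.
        this.timer = setTimeout(() => void this.flush(), this.windowMs);
      }
    });
  }

  // Send everything queued so far, one worker call per chunk.
  private async flush(): Promise<void> {
    const pending = this.queue;
    this.queue = [];
    this.timer = null;
    for (const group of chunk(pending, this.maxBatch)) {
      try {
        const vectors = await this.embedBatch(group.map((p) => p.text));
        if (vectors.length !== group.length) {
          throw new Error("embedding count does not match batch size");
        }
        group.forEach((p, i) => p.resolve(vectors[i]));
      } catch (err) {
        group.forEach((p) => p.reject(err));
      }
    }
  }
}
```

Callers keep awaiting `getEmbedding(text)` one text at a time. Requests that arrive within the same window share one worker message, and larger bursts are split into chunks of at most `maxBatch`.
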
Goal: Speed up JSON-RPC communication for AI agents.
- Hybrid Schema Serialization:
  - Integrate `fast-json-stringify`.
  - Define a pre-compiled schema for core Python-parity fields (`id`, `content`, `wing`, `room`, `filedAt`).
  - Use `additionalProperties: true` to support dynamic user-defined metadata.
- Parallel Tool Execution:
  - Audit all MCP search/recall tools to ensure they use `Promise.all()` for multi-wing or multi-room lookups.
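
As an illustration of what the parallel-execution audit is looking for, the sketch below fans a query out across several wings with `Promise.all()`. The `Memory` shape, `WingSearchFn`, and `searchWings` are hypothetical names used only for this example; the real tools and storage calls may look different.

```typescript
interface Memory {
  id: string;
  content: string;
  wing: string;
  room: string;
}

// Hypothetical per-wing lookup; stands in for the real storage query.
type WingSearchFn = (wing: string, query: string) => Promise<Memory[]>;

async function searchWings(
  wings: string[],
  query: string,
  searchWing: WingSearchFn,
): Promise<Memory[]> {
  // map() starts every lookup immediately; nothing is awaited in sequence.
  const perWing = await Promise.all(wings.map((w) => searchWing(w, query)));
  // Promise.all preserves input order, so results stay grouped by wing.
  return ([] as Memory[]).concat(...perWing);
}
```

The pattern to flag during the audit is an `await` inside a `for` loop over wings or rooms, which runs the lookups one after another. Note that `Promise.all()` rejects as soon as any lookup rejects; `Promise.allSettled()` is the alternative if partial results are acceptable.
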
Goal: Move from "Buffer-and-Send" to "Stream-and-Yield".
- Generator-based Memory Stack:
  - Refactor `src/core/layers.ts` to use `AsyncGenerators`.
  - Yield context chunks as they are retrieved, reducing peak RSS (memory) during large recalls.
- Lazy Heuristics:
  - Apply regex extraction and AAAK dialect compression on the fly as data streams through the pipeline.
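
A rough sketch of the "Stream-and-Yield" shape described above, assuming a layered loader. `ContextChunk`, `LayerLoader`, `streamLayers`, and `mapChunks` are illustrative names, not the contents of `src/core/layers.ts`, and the `transform` argument is only a placeholder for the regex extraction and AAAK compression steps, whose actual logic is not shown here.

```typescript
interface ContextChunk {
  layer: string;
  text: string;
}

// Hypothetical loader: fetches the chunks for one layer of the memory stack.
type LayerLoader = (layer: string) => Promise<ContextChunk[]>;

// Yield chunks layer by layer instead of collecting the whole stack first.
async function* streamLayers(
  layers: string[],
  load: LayerLoader,
): AsyncGenerator<ContextChunk> {
  for (const layer of layers) {
    // A layer is loaded only when the consumer has drained the previous one.
    const chunks = await load(layer);
    for (const c of chunks) {
      yield c;
    }
  }
}

// Lazy transform stage: runs per chunk as it passes through the pipeline.
async function* mapChunks(
  source: AsyncIterable<ContextChunk>,
  transform: (text: string) => string,
): AsyncGenerator<ContextChunk> {
  for await (const c of source) {
    yield { layer: c.layer, text: transform(c.text) };
  }
}
```

A consumer would iterate with `for await (const chunk of mapChunks(streamLayers(layers, load), transform))`. Because both stages are pull-based, a layer that is never requested is never loaded or transformed.
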
Goal: Zero-config global installation.
- Worker Resolution: Finalize robust absolute path resolution for `embedding_worker.js`.
- Binary Mapping: Map the `mempalace` command to the CLI entry point in `package.json`.
- Asset Inclusion: Ensure `hooks/` and documentation are in the NPM `files` whitelist.
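
For the worker resolution item, one possible approach is to resolve the script against the directory of the compiled module rather than the current working directory. The helper below is a sketch under that assumption: `resolveWorkerPath` and its parameters are hypothetical, and the caller is assumed to pass its own module directory (for example `__dirname` in a CommonJS build).

```typescript
import { existsSync } from "node:fs";
import * as path from "node:path";

// Resolve the worker script relative to the calling module's directory so a
// global install works no matter where the command is run from.
// `exists` is injectable only to make the helper easy to test.
function resolveWorkerPath(
  moduleDir: string,
  workerFile: string = "embedding_worker.js",
  exists: (p: string) => boolean = existsSync,
): string {
  // path.resolve always returns an absolute path.
  const candidate = path.resolve(moduleDir, workerFile);
  if (!exists(candidate)) {
    // Fail fast with the full path instead of a vague worker startup error.
    throw new Error(`Embedding worker not found at ${candidate}`);
  }
  return candidate;
}
```

The returned path can be handed directly to `new Worker(...)` from `node:worker_threads`. Failing at resolution time makes a packaging mistake, such as the worker file missing from the published package, visible immediately.
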
Goal: Demonstrate performance gains with reproducible measurements.
- Latency Audit: Measure TTFM (Time to First Memory) and Ingestion Throughput.
- Accuracy Guardrails: Verify 96.4% R@5 accuracy is maintained post-optimization using `longmemeval_bench.ts`.
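
The two measurements above could be captured with helpers along these lines. This is a sketch only: `timeToFirstItem` and `recallAtK` are hypothetical names, and the recall definition used here (a query counts as a hit if at least one relevant id appears in the top k) is an assumption about how R@5 is scored, not a description of `longmemeval_bench.ts`.

```typescript
// Baseline taken from the plan: R@5 must not drop below 96.4%.
const BASELINE_R_AT_5 = 0.964;

// Milliseconds until an async stream produces its first item (TTFM).
// Returns NaN if the stream ends without yielding anything.
async function timeToFirstItem<T>(stream: AsyncIterable<T>): Promise<number> {
  const it = stream[Symbol.asyncIterator]();
  const start = performance.now();
  const first = await it.next();
  const elapsed = performance.now() - start;
  // Stop the stream early so it can release any resources it holds.
  await it.return?.();
  return first.done ? NaN : elapsed;
}

// Fraction of queries with at least one relevant id among the top k results.
// ranked[i] is the ordered result list for query i; relevant[i] its gold ids.
function recallAtK(ranked: string[][], relevant: Set<string>[], k = 5): number {
  if (ranked.length === 0) return 0;
  let hits = 0;
  ranked.forEach((ids, i) => {
    if (ids.slice(0, k).some((id) => relevant[i].has(id))) hits++;
  });
  return hits / ranked.length;
}

// Guardrail check: true when the measured recall still meets the baseline.
function meetsBaseline(recall: number, baseline = BASELINE_R_AT_5): boolean {
  return recall >= baseline;
}
```

Running both helpers before and after each optimization phase, on the same inputs, gives a like-for-like comparison of latency alongside a pass/fail accuracy check.
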