---
title: Memory Guide
description: Per-user condensed memory that persists across conversations using the @juspay/hippocampus SDK
---
Since: v9.12.0 | Status: Stable | Availability: SDK
NeuroLink includes a memory engine powered by the @juspay/hippocampus SDK. Unlike conversation memory (which tracks recent turns in a session), memory maintains a condensed summary of durable facts about each user across all conversations.
Key characteristics:
- **Per-user**: Each user gets an independent memory store keyed by `userId`
- **Condensed**: Memory is kept to a configurable word limit (default 50 words) via LLM-powered condensation
- **Persistent**: Stored in S3, Redis, SQLite, or a custom backend — survives server restarts
- **Non-blocking**: Memory storage happens in the background after each generate/stream call
- **Crash-safe**: Every SDK method is wrapped in try-catch — errors are logged, never thrown
```text
User prompt arrives
       │
       ▼
┌──────────────┐
│ memory.get() │ ← Retrieve condensed memory for this userId
└──────┬───────┘
       │ Prepend memory context to prompt
       ▼
┌──────────────┐
│   LLM call   │ ← generate() or stream() as normal
└──────┬───────┘
       │
       ▼
┌──────────────┐
│ memory.add() │ ← In background: condense old memory + new turn via LLM
└──────────────┘
```
On each `generate()` or `stream()` call:

- **Retrieve**: `memory.get(userId)` fetches the user's condensed memory (if any)
- **Inject**: The memory is prepended to the user's prompt as context (sketched below)
- **Generate**: The LLM processes the enhanced prompt normally
- **Store**: After the response completes, `memory.add(userId, content)` runs in the background. The SDK sends the old memory and the new conversation turn to an LLM, which produces a new condensed summary.
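In the inject step, the retrieved memory is wrapped in a short context block. For a single user it looks roughly like this (illustrative; the exact labeled format appears in the multi-user examples later in this guide):

```text
Context from previous conversations:

[User]
Alice runs a Shopify store.

Current user's request: What platform do I use?
```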
```typescript
import { NeuroLink } from "@juspay/neurolink";

const neurolink = new NeuroLink({
  conversationMemory: {
    enabled: true,
    memory: {
      enabled: true,
      storage: {
        type: "s3",
        bucket: "my-memory-bucket",
        prefix: "memory/condensed/",
      },
      neurolink: {
        provider: "google-ai",
        model: "gemini-2.5-flash",
      },
      maxWords: 50,
    },
  },
});

// Memory is automatically retrieved and stored on each call
const result = await neurolink.generate({
  input: { text: "My name is Alice and I run a Shopify store." },
  context: { userId: "user-123" },
});

// Next call — the AI already knows about Alice
const result2 = await neurolink.generate({
  input: { text: "What platform do I use?" },
  context: { userId: "user-123" },
});
// → "You use Shopify."
```

The `memory` field on `conversationMemory` accepts a `Memory` object:
```typescript
type Memory = HippocampusConfig & { enabled?: boolean };
```

| Field | Type | Description |
|---|---|---|
| `enabled` | `boolean` | Set `true` to activate memory |
| `storage.type` | `string` | Storage backend: `"s3"`, `"redis"`, `"sqlite"`, or `"custom"` |
| `neurolink.provider` | `string` | AI provider for condensation LLM calls |
| `neurolink.model` | `string` | Model for condensation LLM calls |
Additional configuration fields:

| Field | Type | Default | Description |
|---|---|---|---|
| `maxWords` | `number` | `50` | Maximum words in the condensed memory |
| `prompt` | `string` | built-in | Custom condensation prompt (supports `{{OLD_MEMORY}}`, `{{NEW_CONTENT}}`, `{{MAX_WORDS}}` placeholders) |
| `storage.bucket` | `string` | — | S3 bucket name (required for S3 storage) |
| `storage.prefix` | `string` | — | S3 key prefix for memory objects |
| `storage.url` | `string` | — | Redis connection URL (required for Redis storage) |
| `storage.path` | `string` | — | SQLite file path (required for SQLite storage) |
| `storage.onGet` | `function` | — | Callback to retrieve memory (required for custom storage) |
| `storage.onSet` | `function` | — | Callback to persist memory (required for custom storage) |
| `storage.onDelete` | `function` | — | Callback to delete memory (required for custom storage) |
| `storage.onClose` | `function` | — | Callback for cleanup on close (optional for custom storage) |
## Storage Backends

### S3

```typescript
memory: {
  enabled: true,
  storage: {
    type: "s3",
    bucket: "my-bucket",
    prefix: "memory/condensed/",
  },
  neurolink: { provider: "google-ai", model: "gemini-2.5-flash" },
}
```

Each user's memory is stored as a single S3 object at `{prefix}{userId}`; with the prefix above, user `user-123`'s memory lives at `memory/condensed/user-123`.
### Redis

```typescript
memory: {
  enabled: true,
  storage: {
    type: "redis",
    url: "redis://localhost:6379",
  },
  neurolink: { provider: "openai", model: "gpt-4o-mini" },
}
```

### SQLite

```typescript
memory: {
  enabled: true,
  storage: {
    type: "sqlite",
    path: "./memory.db",
  },
  neurolink: { provider: "google-ai", model: "gemini-2.5-flash" },
}
```

Note: SQLite requires the `better-sqlite3` optional peer dependency. Install it manually: `pnpm add better-sqlite3`
### Custom Storage

Delegates storage to your application via callbacks. Use this when you want to manage persistence yourself — call your own API, write to your own database, or integrate with any external system.
```typescript
memory: {
  enabled: true,
  storage: {
    type: "custom",
    onGet: async (ownerId) => {
      // Retrieve memory from your own storage
      return await myDB.getMemory(ownerId);
    },
    onSet: async (ownerId, memory) => {
      // Persist the condensed memory
      await myDB.saveMemory(ownerId, memory);
    },
    onDelete: async (ownerId) => {
      // Delete memory
      await myDB.deleteMemory(ownerId);
    },
  },
  neurolink: { provider: "google-ai", model: "gemini-2.5-flash" },
}
```

The three callbacks (`onGet`, `onSet`, `onDelete`) are required. An optional `onClose` callback can be provided for cleanup when the SDK shuts down.
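For instance, `onClose` is a natural place to release connections. A minimal sketch, reusing the hypothetical `myDB` client from above and assuming it exposes a `disconnect()` method:

```typescript
storage: {
  type: "custom",
  onGet: async (ownerId) => await myDB.getMemory(ownerId),
  onSet: async (ownerId, memory) => { await myDB.saveMemory(ownerId, memory); },
  onDelete: async (ownerId) => { await myDB.deleteMemory(ownerId); },
  // Optional: release the database connection when the SDK shuts down.
  onClose: async () => { await myDB.disconnect(); },
}
```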
Example — file-based storage:
```typescript
import { readFile, writeFile, unlink, mkdir } from "node:fs/promises";
import { join } from "node:path";

const memoryDir = "./data/memory";

memory: {
  enabled: true,
  storage: {
    type: "custom",
    onGet: async (ownerId) => {
      try {
        return await readFile(join(memoryDir, `${ownerId}.txt`), "utf-8");
      } catch {
        return null;
      }
    },
    onSet: async (ownerId, memory) => {
      await mkdir(memoryDir, { recursive: true });
      await writeFile(join(memoryDir, `${ownerId}.txt`), memory, "utf-8");
    },
    onDelete: async (ownerId) => {
      try {
        await unlink(join(memoryDir, `${ownerId}.txt`));
      } catch {
        // ignore missing files
      }
    },
  },
  neurolink: { provider: "google-ai", model: "gemini-2.5-flash" },
}
```

## Custom Condensation Prompt

The condensation prompt controls how the LLM merges old memory with new conversation turns. You can provide a custom prompt using the `prompt` field:
```typescript
memory: {
  enabled: true,
  storage: { type: "s3", bucket: "my-bucket" },
  neurolink: { provider: "google-ai", model: "gemini-2.5-flash" },
  prompt: `You are a memory engine. Merge the old memory with new facts into a summary of at most {{MAX_WORDS}} words.

OLD_MEMORY:
{{OLD_MEMORY}}

NEW_CONTENT:
{{NEW_CONTENT}}

Condensed memory:`,
  maxWords: 100,
}
```

| Placeholder | Replaced With |
|---|---|
| `{{OLD_MEMORY}}` | The user's existing condensed memory (may be empty) |
| `{{NEW_CONTENT}}` | The new conversation turn: `"User: ...\nAssistant: ..."` |
| `{{MAX_WORDS}}` | The configured `maxWords` value |
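Conceptually, the placeholders are filled by plain string substitution before the condensation call. A minimal sketch of that step (illustrative only, not the SDK's internal code):

```typescript
// Illustrative sketch of how the three placeholders are substituted
// into the condensation template before it is sent to the LLM.
function renderCondensationPrompt(
  template: string,
  oldMemory: string,
  newContent: string,
  maxWords: number,
): string {
  return template
    .replaceAll("{{OLD_MEMORY}}", oldMemory)
    .replaceAll("{{NEW_CONTENT}}", newContent)
    .replaceAll("{{MAX_WORDS}}", String(maxWords));
}
```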
Memory integrates automatically with both `generate()` and `stream()`:
- **Before the LLM call**: Memory is retrieved and prepended to the input text
- **After the LLM call**: The conversation turn is stored in the background via `setImmediate()`
- **Timeouts**: Retrieval has a 3-second timeout; storage has a 10-second timeout (which includes the LLM condensation call)
- **Errors are non-blocking**: If memory retrieval or storage fails, the generate/stream call continues normally
For memory to activate on a call, all three conditions must be met (see the example below):

- `memory.enabled` is `true` in the config
- `options.context.userId` is provided in the generate/stream call
- The response has non-empty content (for write)
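For example, the following call skips memory entirely, even when memory is globally enabled, because no `userId` is provided:

```typescript
// The second condition fails: there is no context.userId, so nothing
// is retrieved before the call and nothing is stored after it.
const result = await neurolink.generate({
  input: { text: "What's the capital of France?" },
});
```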
When memory is globally enabled, it is active for every `generate()` and `stream()` call by default. You can override this behavior on a per-call basis using the `memory` option, without changing the global config.
Available flags:

| Flag | Type | Default | Description |
|---|---|---|---|
| `enabled` | `boolean` | `true` | Master toggle — when `false`, both read and write are skipped |
| `read` | `boolean` | `true` | Whether to read past memory and prepend it to the prompt |
| `write` | `boolean` | `true` | Whether to write this conversation turn into memory after the call |
Note: These flags only take effect when the global memory SDK is enabled. If global memory is disabled, per-call flags have no effect.
Precedence:

- **Global config** — Is memory enabled globally? If not, per-call flags are ignored.
- **`enabled`** — Master per-call toggle. If `false`, both read and write are skipped regardless of individual flags (see the example below).
- **`read` / `write`** — Fine-grained control over individual operations.
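For instance, a per-call `enabled: false` wins over the individual flags:

```typescript
// enabled: false takes precedence, so read and write are both skipped.
const result = await neurolink.generate({
  input: { text: "One-off utility call" },
  memory: { enabled: false, read: true, write: true },
  context: { userId: "user-123" },
});
```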
Use when you want past context but don't want this call stored — e.g., code review where you'll store a curated summary later.
```typescript
const result = await neurolink.generate({
  input: { text: "Review this pull request for security issues" },
  memory: { read: true, write: false },
  context: { userId: "user-123" },
});
```

Use for onboarding or seeding memory without injecting past context into the prompt.
```typescript
const result = await neurolink.generate({
  input: {
    text: "My name is Alice. I work on the payments team and use Python.",
  },
  memory: { read: false, write: true },
  context: { userId: "user-123" },
});
```

Use for operational or utility calls where memory adds noise.
```typescript
const result = await neurolink.generate({
  input: { text: "Fetch the latest PR comments from GitHub" },
  memory: { enabled: false },
  context: { userId: "user-123" },
});
```

The same `memory` option works identically in `stream()`.
```typescript
const stream = await neurolink.stream({
  input: { text: "Summarize today's standup notes" },
  memory: { read: true, write: false },
  context: { userId: "user-123" },
});
```

Retrieve and store memory for multiple users in a single `generate()` or `stream()` call. This enables layered memory — combining a user's personal context with org-level policies, team context, or any other memory scope.
The primary user is always determined by `context.userId`. Additional users are specified via `memory.additionalUsers`. Memory for all users (primary and additional) is fetched and stored in parallel.
```typescript
const result = await neurolink.stream({
  input: { text: "How should I handle PCI data in our API?" },
  context: { userId: "user-alice" },
  memory: {
    additionalUsers: [
      {
        userId: "org-acme",
        label: "Organization Policy",
        prompt: `Extract only compliance requirements, security policies, and org-level decisions.

OLD_MEMORY:
{{OLD_MEMORY}}

NEW_CONTENT:
{{NEW_CONTENT}}

Condensed memory (max {{MAX_WORDS}} words):`,
        maxWords: 100,
      },
      {
        userId: "team-payments",
        label: "Team Context",
      },
    ],
  },
});
```

When multiple users' memories are retrieved, they are formatted with labels and injected into the prompt:
```text
Context from previous conversations:

[User]
Alice is a senior engineer on the payments team, prefers Python.

[Organization Policy]
PCI-DSS Level 1 compliance required. All cardholder data must be encrypted at rest and in transit.

[Team Context]
Payments team uses microservices architecture with Stripe integration.

Current user's request: How should I handle PCI data in our API?
```
The primary user's label is always `"User"`. Additional users use the `label` field, falling back to `userId` if not set.
Each additional user can specify a custom `prompt` and `maxWords` for its condensation strategy. This is useful when different memory scopes need different extraction rules — e.g., personal preferences vs. compliance policies.

The prompt must include the `{{OLD_MEMORY}}`, `{{NEW_CONTENT}}`, and `{{MAX_WORDS}}` placeholders. See Custom Condensation Prompt for details.
Control which additional users participate in read and write independently:

```typescript
memory: {
  additionalUsers: [
    { userId: "org-acme", label: "Org Policy", write: false }, // read-only
    { userId: "team-x", label: "Team", read: false }, // write-only
  ],
}
```

| Field | Type | Default | Description |
|---|---|---|---|
| `userId` | `string` | required | The owner ID to retrieve/store memory for |
| `label` | `string` | `userId` | Label used in the formatted memory context |
| `read` | `boolean` | `true` | Whether to read this user's memory |
| `write` | `boolean` | `true` | Whether to write the conversation into this user's memory |
| `prompt` | `string` | default | Custom condensation prompt for this user |
| `maxWords` | `number` | default | Max words for this user's condensed memory |
The @juspay/hippocampus SDK reads these environment variables:
| Variable | Default | Description |
|---|---|---|
| `HC_LOG_LEVEL` | `warn` | SDK log level: `debug`, `info`, `warn`, `error` |
| `HC_CONDENSATION_PROMPT` | built-in | Default condensation prompt (overridden by the config `prompt`) |

For example, set `HC_LOG_LEVEL=debug` to see detailed memory activity during development.
The memory SDK is designed to never crash the host application:

- Every public method (`get()`, `add()`, `delete()`, `close()`) is wrapped in try-catch
- Errors are logged via `logger.warn()` and safe defaults are returned
- `get()` returns `null` on error
- `add()` silently fails on error
- Storage initialization errors result in memory being disabled (returns `null` from `ensureMemoryReady()`)
NeuroLink re-exports the memory types for use in host applications:
```typescript
import type { Memory, CustomStorageConfig } from "@juspay/neurolink";

// Memory = HippocampusConfig & { enabled?: boolean }
// CustomStorageConfig = { type: 'custom', onGet, onSet, onDelete, onClose? }
```

See also:

- Conversation Memory - Session-based conversation history
- Memory Integration - Advanced hippocampus configuration and patterns
- Context Compaction - Automatic context window management
- Context Summarization - Conversation compression