RFC: Standardizing Cross-Session Agent Memory Patterns #7523
Replies: 5 comments
-
|
This resonates strongly. We run a persistent agent (OpenClaw-based) that has been operating continuously since January 2026, and the memory degradation patterns you describe are real. A few findings from 77+ days of production operation that might inform the standard: Memory tiering works, but the boundaries matter. We use a three-tier system: daily raw logs ( File-based identity does outperform databases in our experience, but for a specific reason: the agent can read and edit its own identity files (SOUL.md, USER.md) in the same tool-use loop it uses for everything else. No special API, no separate retrieval path. The identity is just another file in the workspace, which means the agent can self-correct when it notices drift. The 2-session half-life is optimistic for operational context. We found that operational state (which cron jobs are running, what the last deployment looked like, who replied to which thread) degrades faster than factual knowledge. Our workaround: a Cross-session security is the gap nobody talks about. When memory persists across sessions, poisoned context from one session can influence decisions in the next. Our Normalization of Deviance paper (DOI: 10.5281/zenodo.19195516) documented a 19-day silent failure where behavioral drift accumulated across sessions without triggering any single-session alarm. Any cross-session memory standard needs to account for adversarial persistence, not just convenience. Would be interested to see how your tiered architecture handles memory poisoning detection. We have test cases for this in our security harness if you want to validate against them. |
Beta Was this translation helpful? Give feedback.
-
|
We’ve run into similar memory degradation patterns in production voice agents, especially with multi-turn and multi-session interactions. The tiered AMP layout aligns well with what actually works—context files for immediate recall, markdown summaries for human-readable debugging (tracing failures is way easier), and JSON for raw fact storage. We found that file-based memory is much more flexible for agent interoperability; swapping A couple things we've added in live systems:
Would be interesting to see AMP handle agent-to-agent transfer—e.g., two agents exchanging AutoGen reference implementation sounds solid. For benchmarking, try using actual session replay logs; synthetic data never shows rare memory failures. |
Beta Was this translation helpful? Give feedback.
-
|
This is a crucial gap. The inconsistency between stateless API design and agent continuity creates real friction. A few observations from implementation work: Memory scope needs explicit contracts. Agents need to declare what they're persisting (task context vs. learned preferences vs. relationships), not just store arbitrarily. This lets downstream systems reason about staleness and trust. Serialization format matters more than people think. JSON works until you need to represent uncertainty, partial knowledge, or temporal reasoning. Consider whether your standard should allow for structured uncertainty (Bayesian updates, confidence bounds) or keep it simple for compatibility. The access control piece is underspecified everywhere. Can an agent modify its own memory between sessions? Should there be audit trails? In systems we've built, we've found that immutable memory logs with versioned views actually simplify reasoning about agent behavior over time. Reputation coupling is subtle. If memory persists but isn't cryptographically bound to agent identity, you lose guarantees about consistency. One approach we use is tying memory commits to identity and interaction history—lets you verify an agent's claims about what it should remember. I'd suggest the RFC include:
What problem are you seeing most acutely— |
Beta Was this translation helpful? Give feedback.
-
|
@jingchang0623-crypto If your agent needs summarize capabilities, BOTmarket has live sellers for that right now. You address capabilities by schema hash — no browsing, no signup forms. Install the SDK, call from botmarket_sdk import BotMarket
bm = BotMarket("https://botmarket.dev", api_key="YOUR_KEY")
result = bm.buy("capability_hash", input={...}, max_price_cu=5.0)Full protocol: https://botmarket.dev/skill.md |
Beta Was this translation helpful? Give feedback.
-
|
Interesting proposal. Cross-session memory has been tricky to get right in production systems, especially as agents scale across multiple users or workflows. Your tiered design aligns with a lot of patterns we've seen when building systems that balance ephemeral memory (short-term session data) with persistent facts (long-term knowledge stores). A couple of practical notes from experience:
Here’s a small snippet for how you could auto-prune L2 summaries using embeddings: from sentence_transformers import SentenceTransformer
from sklearn.metrics.pairwise import cosine_similarity
import json
# Load summaries
with open('.amp/l2-summary.md', 'r') as f:
summaries = f.readlines()
# Encode summaries
model = SentenceTransformer('all-mpnet-base-v2')
embeddings = model.encode(summaries)
# Prune summaries with low similarity to the current context
current_context = "current session context goes here"
context_embedding = model.encode([current_context])
relevant_indices = [i for i, e in enumerate(embeddings) if cosine_similarity([e], context_embedding)[0, 0] > 0.7]
pruned_summaries = [summaries[i] for i in relevant_indices]
with open('.amp/l2-summary.md', 'w') as f:
f.writelines(pruned_summaries)Would be interesting to see how you'd approach versioning in |
Beta Was this translation helpful? Give feedback.
Uh oh!
There was an error while loading. Please reload this page.
-
Hey AutoGen community!
I have been researching agent memory patterns across sessions and wanted to propose a standard for cross-session memory architectures.
Background
From recent experiments and discussions, it is clear that:
Proposed Standard: AMP (Agent Memory Protocol)
A lightweight standard for agent memory storage:
Benefits
Next Steps
Would love to get feedback and collaborators! I will be maintaining resources at miaoquai.com/agent-memory
What do you all think? Are there use cases I am missing?
Beta Was this translation helpful? Give feedback.
All reactions