Added git-memory-store design doc by maisieyanz · Pull Request #2895 · strands-agents/harness-sdk

maisieyanz · 2026-06-22T16:18:38Z

Description

Related Issues

Documentation PR

Type of Change

Bug fix
New feature
Breaking change
Documentation update
Other (please describe):

Testing

How have you tested the change? Verify that the changes do not break functionality or introduce new warnings.

I ran hatch run prepare

Checklist

I have read the CONTRIBUTING document
I have reviewed and understand every line of code in this PR, including any generated by AI tools, and I can explain why it works
My change is focused and reasonably small; I have split unrelated work into separate PRs
I have added any necessary tests that prove my fix is effective or my feature works
I have updated the documentation accordingly
I have added an appropriate example to the documentation to outline the feature, or no new docs are needed
My changes generate no new warnings
Any dependent changes have been merged and published

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

notowen333 · 2026-06-22T17:07:25Z

+MemoryManager already handles extraction (promoting observations into long-term memory) and retrieval (injecting stored knowledge back into context). What it lacks is a maintenance layer and a shared substrate:
+
+- **No maintenance** — there is no built-in way to deduplicate, resolve contradictions, or restructure stored knowledge after it's written. MemoryManager writes and retrieves, but never improves what's stored.
+- **No unified timeline** — when L1 and L2 use separate backends, there is no single view that shows when sessions happened, when facts were extracted, and how knowledge evolved. Debugging requires checking multiple systems independently.


Would be helpful to define L1 and L2 before referencing them

notowen333 · 2026-06-22T17:08:16Z

+
+---
+
+## Decision


doc nit: It feels weird that we are saying "Decision" right after "Context"

I think it's time to update our template; https://github.com/maisieyanz/harness-sdk/blob/017553452bb81d618f46ea221bb86cf9abf259a1/team/designs/README.md

Weighing of Pros & Cons needs to be in there too

mkmeral · 2026-06-22T17:09:18Z

+
+## Decision
+
+`GitMemoryStore` is a unified, git-backed storage layer that implements both the `Storage` interface (for `ContextManager`, L1) and the `MemoryStore` interface (for `MemoryManager`, L2) against a single git repository. It serves as a single versioned, inspectable, diffable repository containing everything an agent has learned and experienced (session history, extracted facts, and learned skills).


so this is a new memory implementation? did we even have a file based memory implementation before moving onto timeline based issues?

Yes this is a new memory implementation built on top of Thomas' memory manager

mkmeral · 2026-06-22T17:09:33Z

+MemoryManager already handles extraction (promoting observations into long-term memory) and retrieval (injecting stored knowledge back into context). What it lacks is a maintenance layer and a shared substrate:
+
+- **No maintenance** — there is no built-in way to deduplicate, resolve contradictions, or restructure stored knowledge after it's written. MemoryManager writes and retrieves, but never improves what's stored.
+- **No unified timeline** — when L1 and L2 use separate backends, there is no single view that shows when sessions happened, when facts were extracted, and how knowledge evolved. Debugging requires checking multiple systems independently.


is this for episodic memory? why do we need timeline? what do we use it for?

It helps with debugging, especially if an agent gives a wrong response.

lizradway · 2026-06-22T17:10:56Z

+  constructor(config: GitMemoryStoreConfig)
+
+  // --- Storage (ContextManager L1) ---
+  async store(key: string, content: Uint8Array, contentType?: string): Promise<string>


could consider defining just one store/write method and specifying the storage location as an argument, i.e. store for both L1 and L2 is the same function with a different arg, since storage is unified.

curious to hear arguments for or against. im curious what thomas thinks

I'm wondering if we should conflate the MemoryStore with the L1 storage. Since you already propose an agnostic write system, and L1 storage should be its own construct (do we use session manager today?), I'd expect to use that same pattern:

I create a Storage from FileStoreConfig /

I create a FileMemoryStorage that I plug into my MemoryManager

I create L1Storage that I plug into my L1Manager/SessionManager

zastrowm · 2026-06-22T17:11:18Z

+
+const agent = new Agent({
+    model,
+    contextManager: new ContextManager({


Is ContextManager something we already have?

yes this project is building off of the existing context manager and memory manager implementations

no the class doesn't exist yet, but working on it.

at minimum if we do not want to block this on an elegant context solution, what this requires is a storage param added to the contextManager that writes out to L1 on eviction which should be super simple.

from my perspective, if context work somehow gets dropped the memory manager implementation with consolidation is enough for this project. context manager L1 is nice to have session based data for, but less interesting imo than the memory consolidation aspect.

JackYPCOnline · 2026-06-22T17:11:46Z

+
+MemoryManager already handles extraction (promoting observations into long-term memory) and retrieval (injecting stored knowledge back into context). What it lacks is a maintenance layer and a shared substrate:
+
+- **No maintenance** — there is no built-in way to deduplicate, resolve contradictions, or restructure stored knowledge after it's written. MemoryManager writes and retrieves, but never improves what's stored.


Will the LLM handle the update? Any example of contradictions?

zastrowm · 2026-06-22T17:12:25Z

+
+## Consolidation
+
+Consolidation improves memory quality after facts accumulate. It is a developer-invoked Strands agent exposed as a method on `MemoryManager` (defined under `src/memory/consolidation/`). It reads stored knowledge, reasons across files, and writes changes directly to the git repo via filesystem tools (`readFile`, `writeFile`, `deleteFile`, `gitCommit`). Every change is a git commit, so bad consolidation is trivially reversible with `git revert`.


When we say " It is a developer-invoked" are we just saying that it's up to callers to control when it's enacted?

I'm guessing we do want some sort of "automatic" behavior for agents more generally

The developer can control the consolidation frequency. This is explicitly set in the GitHub action that triggers the consolidation.

mkmeral · 2026-06-22T17:13:23Z

+
+The `metadata` fields come from the `ModelExtractor` when automatic extraction is configured — its system prompt instructs it to produce a title and description for each extracted fact (see [Appendix A](#appendix-a-extraction-configuration) for the configuration example). When the agent uses the `store_memory` tool instead (explicit write), no extractor is involved — `add()` receives raw content with no metadata and falls back to deriving both from the content.
+
+**`search(query, options?)`**


I want us to focus more in here. Behind "git memory" there is a filesystem based memory implementation proposal here, that we do not spend enough time explaining. We need to dive into there
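
For the fallback path in the hunk above (explicit `store_memory` write, no extractor), here is a minimal sketch of how `add()` might derive both fields from raw content. `FactMetadata`, `deriveMetadata`, and `resolveMetadata` are hypothetical names, and the doc does not specify the derivation rule, so the first-line-as-title heuristic and the length limits are assumptions.

```typescript
// Hypothetical shape: the doc only says add() derives a title and a
// description from the content when no extractor metadata is provided.
interface FactMetadata {
  title: string;
  description: string;
}

// Assumed heuristic: first non-empty line becomes the title, and the
// flattened content (truncated) becomes the description.
function deriveMetadata(content: string, maxDescription = 120): FactMetadata {
  const cleaned = content
    .split("\n")
    .map((line) => line.trim().replace(/^#+\s*/, "")) // drop markdown heading markers
    .filter((line) => line.length > 0);

  const title = cleaned.length > 0 ? cleaned[0].slice(0, 60) : "untitled";
  const flat = cleaned.join(" ");
  const description =
    flat.length > maxDescription ? flat.slice(0, maxDescription - 3) + "..." : flat;

  return { title, description };
}

// Extractor-provided fields win; anything missing falls back to derivation.
function resolveMetadata(content: string, provided?: Partial<FactMetadata>): FactMetadata {
  const derived = deriveMetadata(content);
  return {
    title: provided?.title ?? derived.title,
    description: provided?.description ?? derived.description,
  };
}
```

Keeping the fallback purely string-based avoids a model call on the explicit write path, which matches the doc's stated reason for writing to `knowledge/facts/` unconditionally.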

mkmeral · 2026-06-22T17:14:54Z

+
+### Architecture
+
+`GitMemoryStore` is a single class that implements both the `Storage` interface (for `ContextManager`) and the `MemoryStore` interface (for `MemoryManager`). This dual implementation is what enables the unified timeline — both layers write commits to the same repo, so `git log` shows the complete history of sessions and knowledge together.


Do we need an interface that combines them? Will all customers need to implement both? cc @lizradway @opieter-aws

this is an ongoing question in which we have begged for a unified storage interface, but haven't had the time to whip up a good design.

if i remember correctly, i believe we discussed this with @JackYPCOnline, you, and jonathan the other day and decided we would prioritize a unified storage interface in the next sprint?

transcripts are a data type/definition. not necessarily storage. so if we need to fix/improve this, let's do :)

mkmeral · 2026-06-22T17:15:48Z

+
+**`search(query, options?)`**
+
+Required by the `MemoryStore` interface. Performs keyword matching (grep) against filenames, `description` frontmatter, and file content, excluding `knowledge/system/` (already loaded in full). Returns the top matches as `MemoryEntry[]`, ranked by term frequency. No model call, no embeddings.


do we know keyword matching is good enough?

progressive disclosure is the primary retrieval / search mechanism. The keyword matching search() implementation is more of a fallback

currently, what our context offloader does for search is pretty much grep/regex, and that has performed pretty well.

i actually experimented briefly with some semantic similarity/relevance based mechanisms as well, and they performed a tad worse (but barely).
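
To make the fallback concrete, here is a rough sketch of the `search()` behavior described in the hunk: keyword matching over filename, `description`, and content, skipping `knowledge/system/`, ranked by term frequency. `KnowledgeFile`, `SearchHit`, `tokenize`, and `keywordSearch` are hypothetical names, files are passed in as plain objects rather than read from disk, and the doc's `MemoryEntry` return type is not reproduced since its fields are not shown here.

```typescript
// Hypothetical in-memory view of a knowledge file (assumption for illustration).
interface KnowledgeFile {
  path: string;
  description: string;
  content: string;
}

interface SearchHit {
  path: string;
  score: number; // number of query-term occurrences
}

// Lowercase and split on anything that is not a letter or digit.
function tokenize(text: string): string[] {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter((t) => t.length > 0);
}

function keywordSearch(files: KnowledgeFile[], query: string, limit = 5): SearchHit[] {
  const terms = tokenize(query);
  const hits: SearchHit[] = [];

  for (const file of files) {
    // system/ files are already loaded in full, so they are excluded.
    if (file.path.indexOf("knowledge/system/") === 0) continue;

    const haystack = tokenize(file.path + " " + file.description + " " + file.content);
    let score = 0;
    for (const token of haystack) {
      if (terms.indexOf(token) !== -1) score += 1;
    }
    if (score > 0) hits.push({ path: file.path, score });
  }

  // Highest term frequency first; no model call, no embeddings.
  return hits.sort((a, b) => b.score - a.score).slice(0, limit);
}
```

Raw term counts favor long files; whether to normalize by length is left open here because the doc only says "ranked by term frequency".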

notowen333 · 2026-06-22T17:16:06Z

+
+`GitMemoryStore` is a unified, git-backed storage layer that implements both the `Storage` interface (for `ContextManager`, L1) and the `MemoryStore` interface (for `MemoryManager`, L2) against a single git repository. It serves as a single versioned, inspectable, diffable repository containing everything an agent has learned and experienced (session history, extracted facts, and learned skills).
+
+The existing Strands API remains unchanged. `ContextManager` still owns L0 <--> L1, `MemoryManager` still owns L1 --> L2. What changes is the physical storage: instead of separate, disconnected backends for each layer, both write to the same git repo. Every write from any layer produces an informative git commit, giving developers a complete audit trail using standard git tooling (`git log`, `git diff`, `git revert`).


So two questions:

I assume that I can use this as just context storage or just as a memory store right?

Are there changes being proposed to the MemoryManager in this doc? It seems like we're pointing out shortcomings with the current design?

I wouldn't be changing the actual MemoryManager interface this would just be an alternative memory storage for agents.

zastrowm · 2026-06-22T17:17:01Z

+});
+```
+
+Scheduling frequency is controlled externally by the GitHub Action or cron job. See [Appendix B](#appendix-b-github-action-yaml) for a GitHub Action example.


I think we're talking about two things in this doc? 1) the feature overall (git-based memory) and 2) Strands' usage of it?

I think (1) should be the focus with (2) being a dog-fooding exercise which is worth talking about, but ultimately is a specific use case that can be addressed in a couple of different ways

yonib05 · 2026-06-22T17:17:30Z

+
+### Context Loading
+
+Files in `knowledge/system/` are always loaded in full into the system prompt. This is where core context lives (persona, key preferences, critical project facts). Everything outside `system/` is visible by filename + description only, loaded when the agent reads it.


Is it up to the user to explain the file structure, or is that handled by the primitive?

The primitive would handle this

notowen333 · 2026-06-22T17:18:31Z

+
+// Option 2: Standalone script (no agent session needed — for cron, GitHub Action, CLI)
+const memoryManager = new MemoryManager({ stores: [myStore] });
+await memoryManager.consolidate({


consolidate doesn't exist on the manager now. If it's specific to this store, it might make more sense to expose it there

i would imagine this would be exposed on other stores eventually (ie. file system at minimum)

Agree with @notowen333 , this is a store functionality

yonib05 · 2026-06-22T17:18:50Z

+
+Each session writes to its own branch, merges back to `main` on close.
+
+**Why rejected:** Path-based isolation (`sessions/{id}.md`) achieves the same separation without branch management overhead or merge conflicts.


What happens when 2 agents try to modify the same file for things like consolidation?

mkmeral · 2026-06-22T17:19:01Z

+
+All extracted facts land in `knowledge/facts/` by default — `GitMemoryStore.add()` writes there unconditionally for simplicity and to avoid a classification model call on every extraction. Consolidation is therefore responsible for reorganizing files into appropriate subdirectories (`skills/`, `system/`, etc.) during offline maintenance, when it has full cross-file context to make informed categorization decisions.
+
+### How It Works


what do we actually generate with consolidation though?

mkmeral · 2026-06-22T17:19:49Z

+});
+```
+
+Scheduling frequency is controlled externally by the GitHub Action or cron job. See [Appendix B](#appendix-b-github-action-yaml) for a GitHub Action example.


Is this a feature for the SDK, or a feature for a specific agent? how scheduling done would depend on that answer

zastrowm · 2026-06-22T17:22:11Z

+- **No maintenance** — there is no built-in way to deduplicate, resolve contradictions, or restructure stored knowledge after it's written. MemoryManager writes and retrieves, but never improves what's stored.
+- **No unified timeline** — when L1 and L2 use separate backends, there is no single view that shows when sessions happened, when facts were extracted, and how knowledge evolved. Debugging requires checking multiple systems independently.
+
+A git-based approach addresses these issues: version-controlled memory provides built-in history, diffing, and rollback. A shared git repository for both layers provides a unified timeline across sessions and knowledge in a single `git log`. And a developer-invoked consolidation agent provides the missing maintenance mechanism. Since Strands is a client-side SDK with no server process, a scheduled GitHub Action is the natural trigger for consolidation.


How important are most of these properties for the memory? I think the biggest win that we get from git is the snapshot of the FS at a specific time; the other items just seem like auditability vs concrete items that we actually need. I'm curious if this generalizes to something larger like "Snapshottable environment/filesystem", which is what the abstraction should be built on

+1 ^ the way i read this proposal is "we want a filesystem based memory, and git can be a nice way to audit it"

I am not sure how git plugs into this thing functionally

i think @opieter-aws has file-based memory on his roadmap. i worry that a filesystem-based memory interface is over-engineered, though, and wonder if we should instead just implement Snapshottable or something similar that this can extend

notowen333 · 2026-06-22T17:23:23Z

+interface GitMemoryStoreConfig {
+  // Required
+  name: string;
+  repoPath: string;


So this seems like we need the git repo to be accessible locally in the same filesystem as the top level agent runtime? Is that correct?

Where does the GitHub assumption start and stop in this design?

gautamsirdeshmukh · 2026-06-22T17:25:37Z

+
+The `operations` config controls which directives go into the agent's system prompt. They are prompt instructions — the LLM decides how to apply them.
+
+| Operation | Agent behavior | Example |


Would these operations run sequentially in the order that they appear in the operations list in the config? Is there any sort of underlying priority of operations? I'd imagine the order will lead to varying qualities of consolidation

The priority would be decided by the agent

gautamsirdeshmukh · 2026-06-22T17:34:19Z

+
+
+
+### Method Behavior


You had mentioned git log earlier in the doc for getting a full session history - would be interested to see what git APIs you envision being used for each of these methods

mkmeral · 2026-06-22T17:55:37Z

+```
+---
+
+## Consolidation


how does this fit into extraction logic from memory? can't we use that one?

opieter-aws · 2026-06-25T00:43:52Z

+
+LLM-powered agents struggle with maintaining and managing long-term memory effectively. As memory accumulates over extended interactions, memory quality degrades. In the Strands SDK, there is no built-in maintenance mechanism that can combine, deduplicate, resolve contradictions, or restructure isolated facts. Over long-horizon use, memory files accumulate redundancy and lose coherent structure, making retrieval less reliable and context windows less efficient.
+
+MemoryManager already handles extraction (promoting observations into long-term memory) and retrieval (injecting stored knowledge back into context). What it lacks is a maintenance layer and a shared substrate:


You can position this slightly differently: a maintenance layer wouldn't necessarily be a functionality of the MM, but of the store implementation. There exist managed backends that serve this purpose, but these are server-based. What you introduce here is a local alternative to a managed memory store. This allows for both a maintenance layer and outside-of-the-agent-loop indexing (which enables semantic search rather than lexical search)

opieter-aws · 2026-06-25T00:45:24Z

+
+A file-based memory system addresses these issues by organizing knowledge as a structured file hierarchy that the agent can navigate directly. By abstracting the storage layer behind a `FileBackend` interface, the same file-based memory system can be backed by a local filesystem, a git repository, S3, or any other store that supports basic file operations. A developer-invoked consolidation agent provides the missing maintenance mechanism — it reads accumulated knowledge, deduplicates redundant entries, resolves contradictions, and reorganizes files, running offline so it doesn't add latency to agent sessions.
+
+The existing `BedrockKnowledgeBaseStore` addresses retrieval via managed vector search, but requires provisioned AWS infrastructure (Bedrock Knowledge Base, credentials, optional S3). This is well-suited for production and enterprise deployments where teams already have AWS infrastructure. `FileMemoryStore` targets the other end: individual developers, prototyping, and environments where standing up a managed service is unnecessary overhead. It requires zero external infrastructure, just a filesystem.


I'd move this up, as IMO this is the main point, and it avoids conflating the memory manager and memory store concepts

opieter-aws · 2026-06-25T00:49:17Z

+
+The storage backend is abstracted behind a `FileBackend` interface — any system that can read, write, list, and delete files can serve as the underlying store. This enables the same memory system to run against a local directory, a git repository, S3, or a custom implementation without changing the memory logic.
+
+The existing Strands API remains unchanged. `ContextManager` still owns L0 <--> L1, `MemoryManager` still owns L1 --> L2. What changes is the physical storage: instead of separate, disconnected backends for each layer, both write to the same file hierarchy. Every write from any layer is routed through the `FileBackend`, which determines how persistence, history, and atomicity are handled.


Can you devote a sentence on the state of L1 today? What are the current solutions?
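
For reference while reviewing the hunk above, here is a minimal sketch of what a `FileBackend` with the four stated operations (read, write, list, delete) could look like, with an in-memory implementation. The method signatures and `InMemoryFileBackend` are assumptions, not the doc's actual interface; versioning hooks such as `history()`/`rollback()` mentioned elsewhere in the doc are left out.

```typescript
// Assumed signatures: the doc only states that a backend must be able to
// read, write, list, and delete files.
interface FileBackend {
  read(path: string): Promise<string | undefined>;
  write(path: string, content: string): Promise<void>;
  list(prefix: string): Promise<string[]>;
  delete(path: string): Promise<boolean>;
}

// Hypothetical in-memory backend, useful for tests and prototyping.
class InMemoryFileBackend implements FileBackend {
  private files = new Map<string, string>();

  async read(path: string): Promise<string | undefined> {
    return this.files.get(path);
  }

  async write(path: string, content: string): Promise<void> {
    this.files.set(path, content);
  }

  // Returns all paths under the given prefix, sorted for stable output.
  async list(prefix: string): Promise<string[]> {
    return Array.from(this.files.keys())
      .filter((path) => path.indexOf(prefix) === 0)
      .sort();
  }

  // Resolves to true if the file existed and was removed.
  async delete(path: string): Promise<boolean> {
    return this.files.delete(path);
  }
}
```

With this shape, L1 writes under `sessions/` and L2 writes under `knowledge/` go through the same object, and a local-directory, git, or S3 backend would only need to supply these four methods.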

opieter-aws · 2026-06-25T00:50:17Z

+
+### File Hierarchy
+
+Both the `ContextManager` and `MemoryManager` write to the same file hierarchy but are isolated by path: L1 writes to `sessions/`, while L2 writes to `knowledge/`. Consolidation metadata lives in `consolidation/`.


Be mindful when you use MemoryManager and MemoryStore: the MemoryManager never writes anything, it just calls a MemoryStore's methods

opieter-aws · 2026-06-25T00:52:11Z

+
+## Progressive Disclosure
+
+Not everything loads into context every turn. The agent retrieves relevant knowledge on demand by navigating the file hierarchy directly. LLMs are precise and accurate at scoped filesystem calls (listing directories, grepping for keywords, reading specific files), and progressive disclosure leverages this skill as the primary retrieval mechanism.


The agent retrieves relevant knowledge on demand by navigating the file hierarchy directly

Can you make this more concrete? Currently, when the agent searches or adds is dictated by the MemoryManager, so this likely wouldn't be under the control of the store itself

opieter-aws · 2026-06-25T00:55:31Z

+The full directory listing of `knowledge/` with each file's `description` frontmatter is injected into the agent's system prompt every turn. The agent always knows what knowledge exists without loading the content:
+
+```
+knowledge/


Are the sub directories static? How does the agent get context about what's in the sub-dir? Does it need that?

opieter-aws · 2026-06-25T00:56:38Z

+
+Files in `knowledge/system/` are always loaded in full into the system prompt. This is where core context lives (persona, key preferences, critical project facts). Everything outside `system/` is visible by filename + description only, loaded when the agent reads it.
+
+**Who manages `system/`:** Developers seed `system/` at repo creation with anything the agent always needs (persona, core preferences). The consolidation agent promotes and demotes files during offline maintenance, analyzing cross-session patterns to move broadly relevant files into `system/` and overly specific ones out. The main agent never writes to `system/` during a session.


How do we avoid system prompt bloat here if all files with all content in this directory go into the system prompt? Would be interesting to do some testing to see the difference with just context injection based on query.
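
The loading rule in the hunk above can be sketched as follows, which may also help when measuring prompt size: files under `knowledge/system/` contribute their full content, everything else contributes only a path and description line. `KnowledgeEntry`, `buildMemoryPrompt`, and the section heading text are hypothetical; the doc does not specify the exact prompt layout.

```typescript
// Hypothetical view of a knowledge file with its frontmatter description.
interface KnowledgeEntry {
  path: string;
  description: string;
  content: string;
}

// Assumed layout: full system/ files first, then a filename + description index.
function buildMemoryPrompt(entries: KnowledgeEntry[]): string {
  const systemSections: string[] = [];
  const indexLines: string[] = [];

  for (const entry of entries) {
    if (entry.path.indexOf("knowledge/system/") === 0) {
      // Always loaded in full.
      systemSections.push("## " + entry.path + "\n" + entry.content);
    } else {
      // Visible by filename and description only; content is read on demand.
      indexLines.push("- " + entry.path + ": " + entry.description);
    }
  }

  return systemSections
    .concat(["## Available knowledge (read on demand)"])
    .concat(indexLines)
    .join("\n");
}
```

Since only `system/` content scales with file size, the length of this string as `system/` grows is a direct way to quantify the bloat concern.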

opieter-aws · 2026-06-25T00:58:39Z

+`FileMemoryStore` is the class that implements both the `Storage` interface (for `ContextManager`) and the `MemoryStore` interface (for `MemoryManager`). It operates on the file hierarchy through whichever `FileBackend` is provided. This dual implementation enables the unified timeline — both layers write to the same hierarchy, so the agent sees sessions and knowledge together.
+
+```typescript
+interface FileMemoryStoreConfig {


Consider if you want to add namespaces here to allow for multi-tenancy

opieter-aws · 2026-06-25T01:08:36Z

+
+**`search(query, options?)`**
+
+Required by the `MemoryStore` interface. Performs keyword matching against filenames, `description` frontmatter, and file content, excluding `knowledge/system/` (already loaded in full). Returns the top matches as `MemoryEntry[]`, ranked by term frequency. No model call, no embeddings.


May be out of scope, but think about how we can extend with an indexing layer to support semantic search

opieter-aws · 2026-06-25T01:10:15Z

+
+Consolidation improves memory quality after facts accumulate. It is a developer-invoked Strands agent exposed as a method on `MemoryManager` (defined under `src/memory/consolidation/`). It reads stored knowledge, reasons across files, and writes changes through the `FileBackend`. Every change is versioned by the backend (via `history()`/`rollback()`), so bad consolidation is trivially reversible.
+
+All extracted facts land in `knowledge/facts/` by default — `FileMemoryStore.add()` writes there unconditionally for simplicity and to avoid a classification model call on every extraction. Consolidation is therefore responsible for reorganizing files into appropriate subdirectories (`skills/`, `system/`, etc.) during offline maintenance, when it has full cross-file context to make informed categorization decisions.


Does the agent here create the files and add the descriptions?

opieter-aws · 2026-06-25T01:11:36Z

+       This serves as both an audit log and the cursor for the next "since-last" run.
+```
+
+### Operations


I don't understand how the operations integrate with the agent / system prompt injection. Can you explain the mechanism?

Added git-memory-store design doc

57df48a

github-actions Bot added size/m, documentation, area-persistence, and enhancement labels Jun 22, 2026

cleaned up design doc

0175534

Reviewed Jun 22, 2026: notowen333, mkmeral, lizradway, zastrowm, JackYPCOnline, yonib05, gautamsirdeshmukh

transitioned git-based memory store to a broader file-memory store

6dd82be

github-actions Bot added size/l and removed size/m labels Jun 24, 2026

maisieyanz added 3 commits June 24, 2026 13:51

Reordered sections to make them more readable

c9c247d

Made edits to consolidation section and other improvements

6d32bdf

Fixed versioning logic

e8cc6b7

opieter-aws reviewed Jun 25, 2026



		## Decision

		`GitMemoryStore` is a unified, git-backed storage layer that implements both the `Storage` interface (for `ContextManager`, L1) and the `MemoryStore` interface (for `MemoryManager`, L2) against a single git repository. It serves as a single versioned, inspectable, diffable repository containing everything an agent has learned and experienced (session history, extracted facts, and learned skills).


		MemoryManager already handles extraction (promoting observations into long-term memory) and retrieval (injecting stored knowledge back into context). What it lacks is a maintenance layer and a shared substrate:

		- No maintenance — there is no built-in way to deduplicate, resolve contradictions, or restructure stored knowledge after it's written. MemoryManager writes and retrieves, but never improves what's stored.


		## Consolidation

		Consolidation improves memory quality after facts accumulate. It is a developer-invoked Strands agent exposed as a method on `MemoryManager` (defined under `src/memory/consolidation/`). It reads stored knowledge, reasons across files, and writes changes directly to the git repo via filesystem tools (`readFile`, `writeFile`, `deleteFile`, `gitCommit`). Every change is a git commit, so bad consolidation is trivially reversible with `git revert`.


		The `metadata` fields come from the `ModelExtractor` when automatic extraction is configured — its system prompt instructs it to produce a title and description for each extracted fact (see [Appendix A](#appendix-a-extraction-configuration) for the configuration example). When the agent uses the `store_memory` tool instead (explicit write), no extractor is involved — `add()` receives raw content with no metadata and falls back to deriving both from the content.

		`search(query, options?)`


		### Architecture

		`GitMemoryStore` is a single class that implements both the `Storage` interface (for `ContextManager`) and the `MemoryStore` interface (for `MemoryManager`). This dual implementation is what enables the unified timeline — both layers write commits to the same repo, so `git log` shows the complete history of sessions and knowledge together.


		`search(query, options?)`

		Required by the `MemoryStore` interface. Performs keyword matching (grep) against filenames, `description` frontmatter, and file content, excluding `knowledge/system/` (already loaded in full). Returns the top matches as `MemoryEntry[]`, ranked by term frequency. No model call, no embeddings.


		`GitMemoryStore` is a unified, git-backed storage layer that implements both the `Storage` interface (for `ContextManager`, L1) and the `MemoryStore` interface (for `MemoryManager`, L2) against a single git repository. It serves as a single versioned, inspectable, diffable repository containing everything an agent has learned and experienced (session history, extracted facts, and learned skills).

		The existing Strands API remains unchanged. `ContextManager` still owns L0 <--> L1, `MemoryManager` still owns L1 --> L2. What changes is the physical storage: instead of separate, disconnected backends for each layer, both write to the same git repo. Every write from any layer produces an informative git commit, giving developers a complete audit trail using standard git tooling (`git log`, `git diff`, `git revert`).


		### Context Loading

		Files in `knowledge/system/` are always loaded in full into the system prompt. This is where core context lives (persona, key preferences, critical project facts). Everything outside `system/` is visible by filename + description only, loaded when the agent reads it.


		Each session writes to its own branch, merges back to `main` on close.

		Why rejected: Path-based isolation (`sessions/{id}.md`) achieves the same separation without branch management overhead or merge conflicts.


		All extracted facts land in `knowledge/facts/` by default — `GitMemoryStore.add()` writes there unconditionally for simplicity and to avoid a classification model call on every extraction. Consolidation is therefore responsible for reorganizing files into appropriate subdirectories (`skills/`, `system/`, etc.) during offline maintenance, when it has full cross-file context to make informed categorization decisions.

		### How It Works


		The `operations` config controls which directives go into the agent's system prompt. They are prompt instructions — the LLM decides how to apply them.

		\| Operation \| Agent behavior \| Example \|


		LLM-powered agents struggle with maintaining and managing long-term memory effectively. As memory accumulates over extended interactions, memory quality degrades. In the Strands SDK, there is no built-in maintenance mechanism that can combine, deduplicate, resolve contradictions, or restructure isolated facts. Over long-horizon use, memory files accumulate redundancy and lose coherent structure, making retrieval less reliable and context windows less efficient.

		MemoryManager already handles extraction (promoting observations into long-term memory) and retrieval (injecting stored knowledge back into context). What it lacks is a maintenance layer and a shared substrate:


		A file-based memory system addresses these issues by organizing knowledge as a structured file hierarchy that the agent can navigate directly. By abstracting the storage layer behind a `FileBackend` interface, the same file-based memory system can be backed by a local filesystem, a git repository, S3, or any other store that supports basic file operations. A developer-invoked consolidation agent provides the missing maintenance mechanism — it reads accumulated knowledge, deduplicates redundant entries, resolves contradictions, and reorganizes files, running offline so it doesn't add latency to agent sessions.

		The existing `BedrockKnowledgeBaseStore` addresses retrieval via managed vector search, but requires provisioned AWS infrastructure (Bedrock Knowledge Base, credentials, optional S3). This is well-suited for production and enterprise deployments where teams already have AWS infrastructure. `FileMemoryStore` targets the other end: individual developers, prototyping, and environments where standing up a managed service is unnecessary overhead. It requires zero external infrastructure, just a filesystem.


		The storage backend is abstracted behind a `FileBackend` interface — any system that can read, write, list, and delete files can serve as the underlying store. This enables the same memory system to run against a local directory, a git repository, S3, or a custom implementation without changing the memory logic.
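To make the abstraction concrete, here is a minimal sketch of what such an interface could look like, together with a local-directory implementation. The method names (`read`, `write`, `list_files`, `delete`), their signatures, and the `LocalFileBackend` class are assumptions for illustration; the design doc only states that a backend must be able to read, write, list, and delete files.

```python
import tempfile
from pathlib import Path
from typing import Protocol


class FileBackend(Protocol):
    """Hypothetical interface covering the four operations the doc names."""

    def read(self, path: str) -> str: ...
    def write(self, path: str, content: str) -> None: ...
    def list_files(self, prefix: str = "") -> list[str]: ...
    def delete(self, path: str) -> None: ...


class LocalFileBackend:
    """Illustrative backend that stores files under a local root directory."""

    def __init__(self, root: str) -> None:
        self.root = Path(root).resolve()

    def _resolve(self, path: str) -> Path:
        # Reject paths that would escape the root (e.g. "../secrets").
        target = (self.root / path).resolve()
        if not target.is_relative_to(self.root):
            raise ValueError(f"path escapes backend root: {path}")
        return target

    def read(self, path: str) -> str:
        return self._resolve(path).read_text(encoding="utf-8")

    def write(self, path: str, content: str) -> None:
        target = self._resolve(path)
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(content, encoding="utf-8")

    def list_files(self, prefix: str = "") -> list[str]:
        base = self._resolve(prefix)
        if not base.exists():
            return []
        return sorted(
            p.relative_to(self.root).as_posix()
            for p in base.rglob("*")
            if p.is_file()
        )

    def delete(self, path: str) -> None:
        self._resolve(path).unlink()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        backend: FileBackend = LocalFileBackend(tmp)
        backend.write("knowledge/preferences.md", "Prefers concise answers.\n")
        print(backend.list_files("knowledge"))
```

Under this sketch, a git-backed or S3-backed store would implement the same four methods, and the memory logic would depend only on the `FileBackend` type.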

		The existing Strands API remains unchanged. `ContextManager` still owns L0 <--> L1, `MemoryManager` still owns L1 --> L2. What changes is the physical storage: instead of separate, disconnected backends for each layer, both write to the same file hierarchy. Every write from any layer is routed through the `FileBackend`, which determines how persistence, history, and atomicity are handled.


		### File Hierarchy

		Both the `ContextManager` and `MemoryManager` write to the same file hierarchy but are isolated by path: L1 writes to `sessions/`, while L2 writes to `knowledge/`. Consolidation metadata lives in `consolidation/`.
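The path isolation described above can be sketched as a small helper that maps each writer to its top-level directory and rejects paths that would leave it. The helper name `scoped_path`, the layer keys, and the example filename are hypothetical; only the three directory names come from the design doc.

```python
from pathlib import PurePosixPath

# Top-level directories named in the design doc, keyed by the writer that owns them.
LAYER_PREFIXES = {
    "L1": "sessions",                  # written by ContextManager
    "L2": "knowledge",                 # written by MemoryManager
    "consolidation": "consolidation",  # consolidation metadata
}


def scoped_path(layer: str, relative: str) -> str:
    """Return the path for `relative` inside the directory owned by `layer`."""
    if layer not in LAYER_PREFIXES:
        raise KeyError(f"unknown layer: {layer}")
    rel = PurePosixPath(relative)
    # Refuse absolute paths and parent references so a writer cannot
    # reach into another layer's directory.
    if rel.is_absolute() or ".." in rel.parts:
        raise ValueError(f"path must stay inside the layer directory: {relative}")
    return str(PurePosixPath(LAYER_PREFIXES[layer]) / rel)


if __name__ == "__main__":
    print(scoped_path("L1", "2026-06-22/transcript.md"))
    print(scoped_path("L2", "projects/api.md"))
```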


		## Progressive Disclosure

		Not everything loads into context every turn. The agent retrieves relevant knowledge on demand by navigating the file hierarchy directly. LLMs are precise and accurate at scoped filesystem calls (listing directories, grepping for keywords, reading specific files), and progressive disclosure leverages this skill as the primary retrieval mechanism.
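As a rough illustration of the scoped calls mentioned above (list, grep, read), the following sketch implements them over an in-memory mapping of path to content. The function names and the sample files are assumptions made for this example, not the tools the SDK exposes.

```python
def list_dir(files: dict[str, str], prefix: str = "") -> list[str]:
    """List stored paths under a directory prefix."""
    prefix = prefix.rstrip("/") + "/" if prefix else ""
    return sorted(path for path in files if path.startswith(prefix))


def grep(files: dict[str, str], keyword: str, prefix: str = "") -> list[tuple[str, int, str]]:
    """Return (path, line number, line) for case-insensitive keyword matches."""
    hits = []
    for path in sorted(files):
        if not path.startswith(prefix):
            continue
        for lineno, line in enumerate(files[path].splitlines(), start=1):
            if keyword.lower() in line.lower():
                hits.append((path, lineno, line.strip()))
    return hits


def read_file(files: dict[str, str], path: str) -> str:
    """Load one file in full, only once the agent has decided it is relevant."""
    return files[path]


if __name__ == "__main__":
    # Hypothetical memory contents.
    store = {
        "knowledge/projects/api.md": "# API notes\nUses OAuth2 for auth.\n",
        "knowledge/preferences/style.md": "# Style\nPrefers tabs.\n",
    }
    print(list_dir(store, "knowledge/projects"))
    for path, lineno, line in grep(store, "oauth", "knowledge/"):
        print(path, lineno, line)
        print(read_file(store, path))
```

The point of the sketch is the order of operations: narrow by directory, narrow by keyword, and only then read a full file into context.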


		Files in `knowledge/system/` are always loaded in full into the system prompt. This is where core context lives (persona, key preferences, critical project facts). Everything outside `system/` is visible only as a filename and description; its contents load when the agent reads the file.

		Who manages `system/`: Developers seed `system/` at repo creation with anything the agent always needs (persona, core preferences). The consolidation agent promotes and demotes files during offline maintenance, analyzing cross-session patterns to move broadly relevant files into `system/` and overly specific ones out. The main agent never writes to `system/` during a session.
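The loading rule above can be sketched as a prompt-assembly function: files under `knowledge/system/` are included in full, and every other knowledge file contributes one index line. The function name, the section headings, and the choice to use a file's first line as its description are all assumptions for illustration; the design doc does not say where descriptions come from.

```python
def build_memory_prompt(files: dict[str, str]) -> str:
    """Assemble the memory section of a system prompt from stored files."""
    full_sections = []
    index_lines = []
    for path in sorted(files):
        if not path.startswith("knowledge/"):
            continue  # sessions/ and consolidation/ are not part of this view
        content = files[path].strip()
        if path.startswith("knowledge/system/"):
            # Always loaded in full.
            full_sections.append(f"## {path}\n{content}")
        else:
            # Assumption: the first line of the file serves as its description.
            lines = content.splitlines()
            description = lines[0].lstrip("# ").strip() if lines else ""
            index_lines.append(f"- {path}: {description}")
    return "\n\n".join(
        ["# Always-loaded memory", *full_sections,
         "# Available on demand", "\n".join(index_lines)]
    )


if __name__ == "__main__":
    # Hypothetical memory contents.
    store = {
        "knowledge/system/persona.md": "Be concise.\n",
        "knowledge/projects/api.md": "# API notes\nUses OAuth2 for auth.\n",
        "sessions/session-1.md": "raw transcript\n",
    }
    print(build_memory_prompt(store))
```

In this sketch the body of `knowledge/projects/api.md` never enters the prompt; the agent would see only its index line until it reads the file.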

lizradway Jun 22, 2026 •

edited

Loading