Skip to content

feat: add memory injection#2797

Merged
opieter-aws merged 2 commits into
strands-agents:mainfrom
opieter-aws:opieter-aws/port-memory-plugin
Jun 15, 2026
Merged

feat: add memory injection#2797
opieter-aws merged 2 commits into
strands-agents:mainfrom
opieter-aws:opieter-aws/port-memory-plugin

Conversation

@opieter-aws

@opieter-aws opieter-aws commented Jun 15, 2026

Copy link
Copy Markdown
Contributor

Description

Python port of #2631

Motivation

The TypeScript SDK ships context injection: a way to fold just-in-time text into the model's input for a single call without persisting it to the durable conversation. The MemoryManager builds on it to surface relevant memories to the model automatically, so the agent doesn't have to spend a tool call searching. Python had neither — MemoryManager (#2740) landed with extraction and search tools but no injection — so the Python MemoryManager could only recall memory reactively via the search_memory tool, never proactively.

This ports the injection primitive to Python and wires it into MemoryManager, bringing the two SDKs to parity. The design follows the TS source: a small, reusable strands.injection engine delivered as an InvokeModelStage.Input middleware, a thin ContextInjector vended plugin for arbitrary content, and a MemoryManager consumer for memory-specific retrieval. Behavior, fail-open semantics, the default <memory> format, the prepend-vs-append fold rule, and trigger policies all mirror the TS implementation.

Ephemerality is the core contract: injected text reaches the model for one call but never enters agent.messages or the session. This falls out of the middleware seam — the engine rewrites the per-call InvokeModelContext.messages (already a defensive copy of agent state), so durable history is untouched.

Memory injection is on by default in both SDKs: if you configure a MemoryManager with stores, the point is to use them, so proactive recall is the obvious-path default rather than a flag you have to discover. Because this also changes the already-shipped TS MemoryManager, it is a behavior change there — see Breaking Changes. Opt out with injection: false / injection=False.

Public API Changes

1. New strands.injection package — shared configuration types for context injection. Delivery primitives are intentionally internal; the public surface is config only:

from strands.injection import InjectionConfig, InjectionContext, InjectionTrigger

2. New ContextInjector vended plugin — folds arbitrary just-in-time text (current time, retrieved docs, request metadata, …) into the model input:

from strands import Agent
from strands.vended_plugins.context_injector import ContextInjector

agent = Agent(
    plugins=[
        ContextInjector(lambda context: f"<context>{derive(context.messages)}</context>")
    ]
)

render_content is the only required argument (sync or async; returns the text, or None/"" to skip). Optional keyword-only name (default "strands:context-injector") and trigger ("userTurn" default, "everyTurn", or a predicate over the InjectionContext). A callback that raises fails open — injection is skipped and the model call proceeds.

3. MemoryManager gains an injection option — on by default:

from strands import Agent
from strands.memory import MemoryManager, MemoryInjectionConfig

# Default: inject up to 5 entries on a user turn, rendered as a <memory> block
agent = Agent(plugins=[MemoryManager(stores=[store])])

# Opt out
agent = Agent(plugins=[MemoryManager(stores=[store], injection=False)])

# Or customize retrieval, timing, and formatting
agent = Agent(
    plugins=[
        MemoryManager(
            stores=[store],
            injection=MemoryInjectionConfig(
                trigger="everyTurn",
                max_entries=3,
                query=lambda context: derive_query(context.messages),
                format=lambda context: render(context.entries),
            ),
        )
    ]
)

MemoryInjectionConfig extends InjectionConfig. When enabled, MemoryManager derives a query from the conversation (the latest user text on a user turn, else the most recent assistant text), searches its stores, and folds the top entries into the model input. injection defaults to True (False disables it). strands.memory also re-exports InjectionConfig/InjectionContext/InjectionTrigger so injection can be configured from a single import.

The same default flip is applied to the TypeScript MemoryManager (injection?: boolean | MemoryInjectionConfig, now defaulting to enabled) to keep the two SDKs aligned.

One deliberate cross-SDK divergence worth a reviewer's eye: the callback config fields (trigger, render_content, query, format) are typed as Callable | Protocol-with-**kwargs unions rather than the bare Callable the TS side uses. This follows the STYLE_GUIDE's "avoid Callable for extensible interfaces" rule (the Protocol arm lets the calling convention grow keyword arguments later) while the Callable arm keeps the plain-lambda happy path ergonomic under mypy strict — a pure Protocol rejects bare lambdas. This mirrors the existing EdgeCondition pattern in multiagent/graph.py.

Breaking Changes

For the Python MemoryManager, memory injection is new in this PR, so default-on is a fresh default, not a change. For the TypeScript MemoryManager (shipped in #2631), this flips injection from opt-in (false) to opt-out (enabled). Existing TS agents that configure a MemoryManager will now, on each model call, derive a query from the conversation, search their stores, and prepend a <memory> block to the model input — adding tokens and a search round-trip per call, and potentially surfacing irrelevant recall. This is a behavior change under the "pay for play" tenet, called out here for explicit reviewer sign-off; it is not an API-signature break (the field and its type are unchanged).

Migration

// Restore the previous (no-injection) behavior:
new MemoryManager({ stores, injection: false })
# Python equivalent:
MemoryManager(stores=stores, injection=False)

Documentation PR

Follow-up: a docs page for ContextInjector and MemoryManager injection is not included here.

Type of Change

New feature

Testing

How have you tested the change? Verify that the changes do not break functionality or introduce new warnings.

  • I ran hatch run prepare

Unit tests mirror the TS describe blocks (fold/trigger/query/format/fail-open), plus real-Agent integration tests on both the ContextInjector and MemoryManager paths asserting the injected text reaches the model but never agent.messages.

Checklist

  • I have read the CONTRIBUTING document
  • I have reviewed and understand every line of code in this PR, including any generated by AI tools, and I can explain why it works
  • My change is focused and reasonably small; I have split unrelated work into separate PRs
  • I have added any necessary tests that prove my fix is effective or my feature works
  • I have updated the documentation accordingly
  • I have added an appropriate example to the documentation to outline the feature, or no new docs are needed
  • My changes generate no new warnings
  • Any dependent changes have been merged and published

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@github-actions github-actions Bot added size/xl chore Maintenance tasks, dependency updates, CI changes, refactoring with no user-facing impact area-community Related to community and contributor health labels Jun 15, 2026
@opieter-aws opieter-aws force-pushed the opieter-aws/port-memory-plugin branch 2 times, most recently from 0cf9b28 to 29fd13f Compare June 15, 2026 18:01
@opieter-aws opieter-aws changed the title opieter aws/port memory plugin feat: Add memory injection Jun 15, 2026
@opieter-aws opieter-aws marked this pull request as ready for review June 15, 2026 18:10
@opieter-aws opieter-aws changed the title feat: Add memory injection feat: add memory injection Jun 15, 2026
@opieter-aws opieter-aws force-pushed the opieter-aws/port-memory-plugin branch from 29fd13f to 7533bb6 Compare June 15, 2026 18:24
@github-actions

Copy link
Copy Markdown
Contributor

Assessment: Request Changes

Solid, well-documented port — the ephemerality contract is cleanly modeled on the middleware seam, fail-open semantics are consistent, XML escaping protects the default <memory> format, and the test suite mirrors the TS describe blocks with full-object equality assertions. 135 tests pass and mypy is clean locally.

Two things should be resolved before merge, both around the injection default rather than the implementation itself:

Review themes
  • Default behavior vs. description (blocking): Code defaults injection=True while the PR description claims it's False/opt-in/non-breaking. Default-on is a real behavior change for existing MemoryManager users (extra search + injected context every user turn = latency, token cost, context bloat). The description and code must agree, and the choice should be deliberate. See inline comment on memory_manager.py:102.

  • API review (blocking-ish): This introduces new public primitives — the strands.injection package, the ContextInjector vended plugin, and MemoryInjectionConfig — plus a default-behavior change to an existing public API. Per team/API_BAR_RAISING.md this is at least a Moderate, arguably Substantial, change (new abstraction customers build on). It needs the api/needs-review label and a designated API reviewer; the default-on decision in particular is exactly the kind of "are the defaults the most common/safest?" question bar-raising exists to settle. The PR description is already well-prepared for this (use cases, signatures, exports), which should make the review quick.

  • Scope: The PR is titled/scoped as a Python port, but it also flips the TypeScript MemoryManager injection default falsetrue (strands-ts/src/memory/memory-manager.ts, types.ts, tests). That's an independent behavior change for TS users and would normally be its own PR; at minimum it should be called out in the description rather than riding along with the port.

Nice work on the fold-rule handling (prepend on a user ask, append after a tool result) and the matching everyTurn tests — that's the subtle part and it's covered.

Comment thread strands-py/src/strands/memory/memory_manager.py
Comment thread strands-ts/src/memory/types.ts

@agent-of-mkmeral agent-of-mkmeral left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review — Python port of memory injection (#2631)

Assessment: Approve (no blockers). I checked this out, read every new/changed file, ran the suite, type-checked, and ran my own adversarial edge probes. This is a faithful, well-documented port and the earlier blocking concerns (description-vs-code default mismatch; TS scope) have been resolved — the description now correctly states default-on and calls out the TS behavior change under Breaking Changes, and both prior bot threads are resolved.

Verification (local, head 7533bb6)

  • 135 passed across tests/strands/injection, tests/strands/vended_plugins/context_injector, test_injection_integration.py, test_memory_manager.py
  • mypy clean on the 8 new/changed source files
  • Integration test directly proves the ephemerality contract: injected <memory> reaches model.stream(...) for one call but never appears in agent.messages.

What's good

  • Ephemerality via the middleware seam is correct. InvokeModelContext.messages is already a defensive copy of agent state; the handler does replace(context, messages=fold_into_last_user_message(...)) and fold_* returns a brand-new list without mutating inputs (test test_returns_new_list_and_does_not_mutate_input + test_does_not_mutate_original_context_messages confirm). Durable history is structurally untouched — not just by convention.
  • Fold rule is the subtle part and it's right + tested. Prepend on a plain user ask (keeps the user's ask in the recency slot); append on a tool-result turn (keeps the tool result the first block, which providers require). I verified the mixed [toolResult, text] case appends correctly and toolResult stays first.
  • Fail-open is consistent across all four callback seamstrigger predicate, render_content, memory query, and memory format each catch, log reason=<…>, and skip injection so the model call proceeds. Matches TS exactly.
  • XML escaping protects the default <memory> block from both structural breakage and stored-prompt-injection. & is escaped first; attr-escaping adds "/'. My probe on a & b < c > </entry> and he said "hi" escaped correctly.
  • TS parity is line-for-line on the engine, resolver, fold logic, _provide_memory_context, _resolve_injection_query adaptive default (user text on user turn, else most-recent assistant text), and _default_injection_format.

Python-specific touches worth calling out (all good)

  • The Callable | Protocol-with-**kwargs unions for the callback fields (vs. TS's bare function type) is the right call under the STYLE_GUIDE — keeps bare lambdas ergonomic under mypy strict while leaving room for the calling convention to grow kwargs. Mirrors the EdgeCondition precedent.
  • strands.memory re-exporting InjectionConfig/InjectionContext/InjectionTrigger is a nice single-import ergonomic.

Non-blocking observations

  1. max_entries double-cap is correct but worth a one-line note. search(...) caps each store at max_results, then [:max_results] caps the concatenation. With N stores you fetch up to N×max before slicing, and since search concatenates in store-registration order with no cross-store ranking, the slice can systematically favor earlier-registered stores. This is already documented in the MemoryInjectionConfig.max_entries docstring ("can favor entries from earlier-registered stores") — good. No change needed; just flagging it's an intentional, documented tradeoff.
  2. is_user_turn / fold assume well-formed messages (every message has a content list). A message dict missing content raises KeyError rather than failing open. In practice the agent loop only ever produces well-formed messages so this is not reachable from supported paths, and matches TS (which would also misbehave on malformed data). Not worth defensive code, but noting it since the rest of the module is so carefully fail-open.
  3. Default-on is a real behavior change for existing TS MemoryManager users (extra search + injected block + tokens per user turn). It's now correctly documented and is a deliberate "obvious-path default" decision — but it does warrant the API reviewer's explicit sign-off, which the description is well-prepared for.

Nice work — the hard parts (fold placement, ephemerality, fail-open uniformity, escaping) are all handled and covered by tests that mirror the TS describe blocks.

@opieter-aws opieter-aws force-pushed the opieter-aws/port-memory-plugin branch from 7533bb6 to a874938 Compare June 15, 2026 19:35
@github-actions

Copy link
Copy Markdown
Contributor

Assessment: Comment (re-review)

Re-reviewed at efc8b40. Both prior blocking items are resolved, and the public surface is tighter than before. 160 tests pass (up from 135) and mypy is clean locally.

Status of prior feedback
  • Default vs. description (was blocking) — ✅ Resolved. The description now states default-on explicitly, gives the rationale, documents the TS impact as a Breaking Change with a migration snippet, and code/docstrings/types.py all agree on True.

  • Scope / TS default flip (was blocking) — ✅ Resolved. Now called out under "Breaking Changes" with a migration example and a "pay for play" tenet note. Keeping both SDKs aligned in one PR is reasonable now that it's documented.

  • API review⚠️ Partially addressed. efc8b40 privatized the delivery primitives (_create_injection_middleware, _resolve_trigger, _is_user_turn, _fold_into_last_user_message, _escape_xml_*), so the public surface is now config-only (InjectionConfig/InjectionContext/InjectionTrigger, ContextInjector, MemoryInjectionConfig) — exactly the right call. The description is also API-review-ready (use cases, signatures, exports, the deliberate Callable | Protocol divergence). The api/needs-review label is still missing, though: a new customer-facing abstraction plus an opt-in→opt-out default flip on a shipped API is the kind of change bar-raising exists to sign off on. That's the one remaining item — process, not code.

Nice tightening of the public surface in the follow-up commit, and the description rewrite is thorough. Once the api/needs-review label is on and a designated reviewer signs off on the default-on decision, this looks ready from a code standpoint.

@opieter-aws opieter-aws enabled auto-merge (squash) June 15, 2026 20:51
Comment thread strands-py/src/strands/vended_plugins/context_injector/plugin.py
@agent-of-mkmeral

Copy link
Copy Markdown
Contributor

Non-blocking nit (don't hold the merge for this) — nice follow-up underscoring the internal functions in efc8b40. The one remaining loose end is the module files themselves:

  • strands/injection/message_injection.py
  • strands/injection/xml.py

The functions inside are now _-prefixed (good — from strands.injection.message_injection import fold_into_last_user_message correctly fails), but the module paths still read as public. from strands.injection import message_injection works and exposes the _-prefixed internals with no namespace-level "private" signal.

The repo's own convention is to underscore the module/package, not just its contents — e.g. _middleware/, event_loop/_retry.py, models/_defaults.py, strands/_async.py. Injection is currently the one spot that diverges (functions underscored, files not). Suggestion:

  • message_injection.py_message_injection.py
  • xml.py_xml.py
  • update the two relative imports in memory/memory_manager.py and vended_plugins/context_injector/plugin.py

Why now rather than later: once released, the module path could become load-bearing for someone, so renaming gets more expensive post-merge. But this is purely a convention/consistency nit — the exported __all__ surface is correct, named-symbol reachability is now fixed, and nothing here is a correctness issue. Totally fine to take it or merge as-is. ✅

@mkmeral mkmeral added the api/review-complete An API Bar-raiser reviewed and accepted the APIs label Jun 15, 2026
@opieter-aws opieter-aws merged commit a111c5d into strands-agents:main Jun 15, 2026
37 of 39 checks passed
@opieter-aws opieter-aws deleted the opieter-aws/port-memory-plugin branch June 15, 2026 21:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

api/review-complete An API Bar-raiser reviewed and accepted the APIs area-community Related to community and contributor health chore Maintenance tasks, dependency updates, CI changes, refactoring with no user-facing impact size/xl

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants