feat: add memory injection by opieter-aws · Pull Request #2797 · strands-agents/harness-sdk

opieter-aws · 2026-06-15T17:37:32Z

Description

Python port of #2631

Motivation

The TypeScript SDK ships context injection: a way to fold just-in-time text into the model's input for a single call without persisting it to the durable conversation. The MemoryManager builds on it to surface relevant memories to the model automatically, so the agent doesn't have to spend a tool call searching. Python had neither — MemoryManager (#2740) landed with extraction and search tools but no injection — so the Python MemoryManager could only recall memory reactively via the search_memory tool, never proactively.

This ports the injection primitive to Python and wires it into MemoryManager, bringing the two SDKs to parity. The design follows the TS source: a small, reusable strands.injection engine delivered as an InvokeModelStage.Input middleware, a thin ContextInjector vended plugin for arbitrary content, and a MemoryManager consumer for memory-specific retrieval. Behavior, fail-open semantics, the default <memory> format, the prepend-vs-append fold rule, and trigger policies all mirror the TS implementation.

Ephemerality is the core contract: injected text reaches the model for one call but never enters agent.messages or the session. This falls out of the middleware seam — the engine rewrites the per-call InvokeModelContext.messages (already a defensive copy of agent state), so durable history is untouched.

Memory injection is on by default in both SDKs: if you configure a MemoryManager with stores, the point is to use them, so proactive recall is the obvious-path default rather than a flag you have to discover. Because this also changes the already-shipped TS MemoryManager, it is a behavior change there — see Breaking Changes. Opt out with injection: false / injection=False.

Public API Changes

1. New strands.injection package — shared configuration types for context injection. Delivery primitives are intentionally internal; the public surface is config only:

from strands.injection import InjectionConfig, InjectionContext, InjectionTrigger

2. New ContextInjector vended plugin — folds arbitrary just-in-time text (current time, retrieved docs, request metadata, …) into the model input:

from strands import Agent
from strands.vended_plugins.context_injector import ContextInjector

agent = Agent(
    plugins=[
        ContextInjector(lambda context: f"<context>{derive(context.messages)}</context>")
    ]
)

render_content is the only required argument (sync or async; returns the text, or None/"" to skip). Optional keyword-only name (default "strands:context-injector") and trigger ("userTurn" default, "everyTurn", or a predicate over the InjectionContext). A callback that raises fails open — injection is skipped and the model call proceeds.

3. MemoryManager gains an injection option — on by default:

from strands import Agent
from strands.memory import MemoryManager, MemoryInjectionConfig

# Default: inject up to 5 entries on a user turn, rendered as a <memory> block
agent = Agent(plugins=[MemoryManager(stores=[store])])

# Opt out
agent = Agent(plugins=[MemoryManager(stores=[store], injection=False)])

# Or customize retrieval, timing, and formatting
agent = Agent(
    plugins=[
        MemoryManager(
            stores=[store],
            injection=MemoryInjectionConfig(
                trigger="everyTurn",
                max_entries=3,
                query=lambda context: derive_query(context.messages),
                format=lambda context: render(context.entries),
            ),
        )
    ]
)

MemoryInjectionConfig extends InjectionConfig. When enabled, MemoryManager derives a query from the conversation (the latest user text on a user turn, else the most recent assistant text), searches its stores, and folds the top entries into the model input. injection defaults to True (False disables it). strands.memory also re-exports InjectionConfig/InjectionContext/InjectionTrigger so injection can be configured from a single import.

The same default flip is applied to the TypeScript MemoryManager (injection?: boolean | MemoryInjectionConfig, now defaulting to enabled) to keep the two SDKs aligned.

One deliberate cross-SDK divergence worth a reviewer's eye: the callback config fields (trigger, render_content, query, format) are typed as Callable | Protocol-with-**kwargs unions rather than the bare Callable the TS side uses. This follows the STYLE_GUIDE's "avoid Callable for extensible interfaces" rule (the Protocol arm lets the calling convention grow keyword arguments later) while the Callable arm keeps the plain-lambda happy path ergonomic under mypy strict — a pure Protocol rejects bare lambdas. This mirrors the existing EdgeCondition pattern in multiagent/graph.py.

Breaking Changes

For the Python MemoryManager, memory injection is new in this PR, so default-on is a fresh default, not a change. For the TypeScript MemoryManager (shipped in #2631), this flips injection from opt-in (false) to opt-out (enabled). Existing TS agents that configure a MemoryManager will now, on each model call, derive a query from the conversation, search their stores, and prepend a <memory> block to the model input — adding tokens and a search round-trip per call, and potentially surfacing irrelevant recall. This is a behavior change under the "pay for play" tenet, called out here for explicit reviewer sign-off; it is not an API-signature break (the field and its type are unchanged).

Migration

// Restore the previous (no-injection) behavior:
new MemoryManager({ stores, injection: false })

# Python equivalent:
MemoryManager(stores=stores, injection=False)

Documentation PR

Follow-up: a docs page for ContextInjector and MemoryManager injection is not included here.

Type of Change

New feature

Testing

How have you tested the change? Verify that the changes do not break functionality or introduce new warnings.

I ran hatch run prepare

Unit tests mirror the TS describe blocks (fold/trigger/query/format/fail-open), plus real-Agent integration tests on both the ContextInjector and MemoryManager paths asserting the injected text reaches the model but never agent.messages.

Checklist

I have read the CONTRIBUTING document
I have reviewed and understand every line of code in this PR, including any generated by AI tools, and I can explain why it works
My change is focused and reasonably small; I have split unrelated work into separate PRs
I have added any necessary tests that prove my fix is effective or my feature works
I have updated the documentation accordingly
I have added an appropriate example to the documentation to outline the feature, or no new docs are needed
My changes generate no new warnings
Any dependent changes have been merged and published

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

github-actions · 2026-06-15T18:32:52Z

Assessment: Request Changes

Solid, well-documented port — the ephemerality contract is cleanly modeled on the middleware seam, fail-open semantics are consistent, XML escaping protects the default <memory> format, and the test suite mirrors the TS describe blocks with full-object equality assertions. 135 tests pass and mypy is clean locally.

Two things should be resolved before merge, both around the injection default rather than the implementation itself:

Review themes

Default behavior vs. description (blocking): Code defaults injection=True while the PR description claims it's False/opt-in/non-breaking. Default-on is a real behavior change for existing MemoryManager users (extra search + injected context every user turn = latency, token cost, context bloat). The description and code must agree, and the choice should be deliberate. See inline comment on memory_manager.py:102.
API review (blocking-ish): This introduces new public primitives — the strands.injection package, the ContextInjector vended plugin, and MemoryInjectionConfig — plus a default-behavior change to an existing public API. Per team/API_BAR_RAISING.md this is at least a Moderate, arguably Substantial, change (new abstraction customers build on). It needs the api/needs-review label and a designated API reviewer; the default-on decision in particular is exactly the kind of "are the defaults the most common/safest?" question bar-raising exists to settle. The PR description is already well-prepared for this (use cases, signatures, exports), which should make the review quick.
Scope: The PR is titled/scoped as a Python port, but it also flips the TypeScript MemoryManager injection default false→true (strands-ts/src/memory/memory-manager.ts, types.ts, tests). That's an independent behavior change for TS users and would normally be its own PR; at minimum it should be called out in the description rather than riding along with the port.

Nice work on the fold-rule handling (prepend on a user ask, append after a tool result) and the matching everyTurn tests — that's the subtle part and it's covered.

agent-of-mkmeral

Review — Python port of memory injection (#2631)

Assessment: Approve (no blockers). I checked this out, read every new/changed file, ran the suite, type-checked, and ran my own adversarial edge probes. This is a faithful, well-documented port and the earlier blocking concerns (description-vs-code default mismatch; TS scope) have been resolved — the description now correctly states default-on and calls out the TS behavior change under Breaking Changes, and both prior bot threads are resolved.

Verification (local, head `7533bb6`)

135 passed across tests/strands/injection, tests/strands/vended_plugins/context_injector, test_injection_integration.py, test_memory_manager.py
mypy clean on the 8 new/changed source files
Integration test directly proves the ephemerality contract: injected <memory> reaches model.stream(...) for one call but never appears in agent.messages.

What's good

Ephemerality via the middleware seam is correct. InvokeModelContext.messages is already a defensive copy of agent state; the handler does replace(context, messages=fold_into_last_user_message(...)) and fold_* returns a brand-new list without mutating inputs (test test_returns_new_list_and_does_not_mutate_input + test_does_not_mutate_original_context_messages confirm). Durable history is structurally untouched — not just by convention.
Fold rule is the subtle part and it's right + tested. Prepend on a plain user ask (keeps the user's ask in the recency slot); append on a tool-result turn (keeps the tool result the first block, which providers require). I verified the mixed [toolResult, text] case appends correctly and toolResult stays first.
Fail-open is consistent across all four callback seams — trigger predicate, render_content, memory query, and memory format each catch, log reason=<…>, and skip injection so the model call proceeds. Matches TS exactly.
XML escaping protects the default <memory> block from both structural breakage and stored-prompt-injection. & is escaped first; attr-escaping adds "/'. My probe on a & b < c > </entry> and he said "hi" escaped correctly.
TS parity is line-for-line on the engine, resolver, fold logic, _provide_memory_context, _resolve_injection_query adaptive default (user text on user turn, else most-recent assistant text), and _default_injection_format.

Python-specific touches worth calling out (all good)

The Callable | Protocol-with-**kwargs unions for the callback fields (vs. TS's bare function type) is the right call under the STYLE_GUIDE — keeps bare lambdas ergonomic under mypy strict while leaving room for the calling convention to grow kwargs. Mirrors the EdgeCondition precedent.
strands.memory re-exporting InjectionConfig/InjectionContext/InjectionTrigger is a nice single-import ergonomic.

Non-blocking observations

max_entries double-cap is correct but worth a one-line note. search(...) caps each store at max_results, then [:max_results] caps the concatenation. With N stores you fetch up to N×max before slicing, and since search concatenates in store-registration order with no cross-store ranking, the slice can systematically favor earlier-registered stores. This is already documented in the MemoryInjectionConfig.max_entries docstring ("can favor entries from earlier-registered stores") — good. No change needed; just flagging it's an intentional, documented tradeoff.
is_user_turn / fold assume well-formed messages (every message has a content list). A message dict missing content raises KeyError rather than failing open. In practice the agent loop only ever produces well-formed messages so this is not reachable from supported paths, and matches TS (which would also misbehave on malformed data). Not worth defensive code, but noting it since the rest of the module is so carefully fail-open.
Default-on is a real behavior change for existing TS MemoryManager users (extra search + injected block + tokens per user turn). It's now correctly documented and is a deliberate "obvious-path default" decision — but it does warrant the API reviewer's explicit sign-off, which the description is well-prepared for.

Nice work — the hard parts (fold placement, ephemerality, fail-open uniformity, escaping) are all handled and covered by tests that mirror the TS describe blocks.

github-actions · 2026-06-15T20:50:58Z

Assessment: Comment (re-review)

Re-reviewed at efc8b40. Both prior blocking items are resolved, and the public surface is tighter than before. 160 tests pass (up from 135) and mypy is clean locally.

Status of prior feedback

Default vs. description (was blocking) — ✅ Resolved. The description now states default-on explicitly, gives the rationale, documents the TS impact as a Breaking Change with a migration snippet, and code/docstrings/types.py all agree on True.
Scope / TS default flip (was blocking) — ✅ Resolved. Now called out under "Breaking Changes" with a migration example and a "pay for play" tenet note. Keeping both SDKs aligned in one PR is reasonable now that it's documented.
API review — ⚠️ Partially addressed. efc8b40 privatized the delivery primitives (_create_injection_middleware, _resolve_trigger, _is_user_turn, _fold_into_last_user_message, _escape_xml_*), so the public surface is now config-only (InjectionConfig/InjectionContext/InjectionTrigger, ContextInjector, MemoryInjectionConfig) — exactly the right call. The description is also API-review-ready (use cases, signatures, exports, the deliberate Callable | Protocol divergence). The api/needs-review label is still missing, though: a new customer-facing abstraction plus an opt-in→opt-out default flip on a shipped API is the kind of change bar-raising exists to sign off on. That's the one remaining item — process, not code.

Nice tightening of the public surface in the follow-up commit, and the description rewrite is thorough. Once the api/needs-review label is on and a designated reviewer signs off on the default-on decision, this looks ready from a code standpoint.

agent-of-mkmeral · 2026-06-15T21:31:30Z

Non-blocking nit (don't hold the merge for this) — nice follow-up underscoring the internal functions in efc8b40. The one remaining loose end is the module files themselves:

strands/injection/message_injection.py
strands/injection/xml.py

The functions inside are now _-prefixed (good — from strands.injection.message_injection import fold_into_last_user_message correctly fails), but the module paths still read as public. from strands.injection import message_injection works and exposes the _-prefixed internals with no namespace-level "private" signal.

The repo's own convention is to underscore the module/package, not just its contents — e.g. _middleware/, event_loop/_retry.py, models/_defaults.py, strands/_async.py. Injection is currently the one spot that diverges (functions underscored, files not). Suggestion:

message_injection.py → _message_injection.py
xml.py → _xml.py
update the two relative imports in memory/memory_manager.py and vended_plugins/context_injector/plugin.py

Why now rather than later: once released, the module path could become load-bearing for someone, so renaming gets more expensive post-merge. But this is purely a convention/consistency nit — the exported __all__ surface is correct, named-symbol reachability is now fixed, and nothing here is a correctness issue. Totally fine to take it or merge as-is. ✅

github-actions Bot added size/xl chore Maintenance tasks, dependency updates, CI changes, refactoring with no user-facing impact area-community Related to community and contributor health labels Jun 15, 2026

opieter-aws force-pushed the opieter-aws/port-memory-plugin branch 2 times, most recently from 0cf9b28 to 29fd13f Compare June 15, 2026 18:01

opieter-aws changed the title ~~opieter aws/port memory plugin~~ feat: Add memory injection Jun 15, 2026

opieter-aws marked this pull request as ready for review June 15, 2026 18:10

github-actions Bot added the strands-running label Jun 15, 2026

opieter-aws changed the title ~~feat: Add memory injection~~ feat: add memory injection Jun 15, 2026

opieter-aws force-pushed the opieter-aws/port-memory-plugin branch from 29fd13f to 7533bb6 Compare June 15, 2026 18:24

github-actions Bot reviewed Jun 15, 2026

View reviewed changes

Comment thread strands-py/src/strands/memory/memory_manager.py

github-actions Bot reviewed Jun 15, 2026

View reviewed changes

Comment thread strands-ts/src/memory/types.ts

github-actions Bot removed the strands-running label Jun 15, 2026

agent-of-mkmeral reviewed Jun 15, 2026

View reviewed changes

feat: add memory injection (python)

a874938

opieter-aws force-pushed the opieter-aws/port-memory-plugin branch from 7533bb6 to a874938 Compare June 15, 2026 19:35

github-actions Bot added strands-running and removed strands-running labels Jun 15, 2026

make private functions underscored

efc8b40

github-actions Bot added the strands-running label Jun 15, 2026

github-actions Bot removed the strands-running label Jun 15, 2026

opieter-aws enabled auto-merge (squash) June 15, 2026 20:51

JackYPCOnline reviewed Jun 15, 2026

View reviewed changes

Comment thread strands-py/src/strands/vended_plugins/context_injector/plugin.py

mkmeral approved these changes Jun 15, 2026

View reviewed changes

mkmeral added the api/review-complete An API Bar-raiser reviewed and accepted the APIs label Jun 15, 2026

opieter-aws merged commit a111c5d into strands-agents:main Jun 15, 2026
37 of 39 checks passed

opieter-aws deleted the opieter-aws/port-memory-plugin branch June 15, 2026 21:37

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: add memory injection#2797

feat: add memory injection#2797
opieter-aws merged 2 commits into
strands-agents:mainfrom
opieter-aws:opieter-aws/port-memory-plugin

opieter-aws commented Jun 15, 2026 •

edited

Loading

Uh oh!

github-actions Bot commented Jun 15, 2026

Uh oh!

Uh oh!

Uh oh!

agent-of-mkmeral left a comment

Uh oh!

github-actions Bot commented Jun 15, 2026

Uh oh!

Uh oh!

agent-of-mkmeral commented Jun 15, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Conversation

opieter-aws commented Jun 15, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Motivation

Public API Changes

Breaking Changes

Migration

Documentation PR

Type of Change

Testing

Checklist

Uh oh!

github-actions Bot commented Jun 15, 2026

Uh oh!

Uh oh!

Uh oh!

agent-of-mkmeral left a comment

Choose a reason for hiding this comment

Review — Python port of memory injection (#2631)

Verification (local, head 7533bb6)

What's good

Python-specific touches worth calling out (all good)

Non-blocking observations

Uh oh!

github-actions Bot commented Jun 15, 2026

Uh oh!

Uh oh!

agent-of-mkmeral commented Jun 15, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

opieter-aws commented Jun 15, 2026 •

edited

Loading

Verification (local, head `7533bb6`)