This repository was archived by the owner on Jun 3, 2026. It is now read-only.

feat: add middleware system for wrapping agent stages#1068

Open
zastrowm wants to merge 4 commits into strands-agents:main from zastrowm:middleware

Conversation

@zastrowm

Member

Description

Adds a middleware system that wraps agent stages (model calls, tool execution, agent streaming) using async generator handlers. New public API: agent.addMiddleware(stage, handler).

Related Issues

Resolves: #XXX

Documentation PR

Type of Change

New feature

Motivation

Hooks let you observe operations and set flags, but they don't let you wrap them. If you want to do something both before and after a model call (timing it, adding a span, catching errors), hooks force you to manage state across two separate callbacks:

// With hooks: state management across two disconnected callbacks
let startTime: number
agent.addHook(BeforeModelCallEvent, () => { startTime = Date.now() })
agent.addHook(AfterModelCallEvent, () => { metrics.record(Date.now() - startTime) })

// With middleware: one function, natural scoping
agent.addMiddleware(InvokeModelStage, async function* (context, next) {
  const start = Date.now()
  const result = yield* next(context)
  metrics.record(Date.now() - start)
  return result
})

Beyond the before/after pattern, middleware also makes several other use cases much more natural to express: caching (check cache, call model only on miss, store result), input transformation (modify context, pass to next), short-circuiting (return synthetic result without calling next), and error handling (try/catch around next). All of these are awkward or impossible with hooks alone.

Public API Changes

import { Agent, InvokeModelStage, ExecuteToolStage, AgentStreamStage, createStage } from '@strands-agents/sdk'
import type { Stage, MiddlewareHandler, MiddlewareNext } from '@strands-agents/sdk'

const agent = new Agent({ model, tools })

// Register middleware for any built-in stage
agent.addMiddleware(InvokeModelStage, async function* (context, next) {
  // pre-processing: inspect or transform context
  const modified = { ...context, messages: sanitize(context.messages) }
  // call next layer (or don't, to short-circuit)
  const result = yield* next(modified)
  // post-processing: inspect or transform result
  return result
})

Three built-in stages ship with the SDK:

| Stage | Wraps | Context fields |
| --- | --- | --- |
| InvokeModelStage | Model call (between Before/AfterModelCallEvent) | messages, systemPrompt, toolSpecs, toolChoice, modelState |
| ExecuteToolStage | Single tool execution (between Before/AfterToolCallEvent) | tool, toolUse (name, id, input) |
| AgentStreamStage | Full agent.stream() output | args, options |

Third parties can define custom stages via createStage<Ctx, Evt, Res>(name) without modifying SDK internals.

Handlers are async generators. Simple pass-through is return yield* next(context). Manual iteration of next() allows real-time event filtering or injection. Not calling next at all short-circuits the operation.
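The three handler shapes described above can be sketched without the SDK. This is a minimal illustration, not the SDK's implementation: the `Next` and `Handler` aliases, the `Ctx`/`Evt`/`Res` shapes, `baseOperation`, and `collect` are all hypothetical stand-ins, and the real `MiddlewareHandler` and `MiddlewareNext` signatures may differ.

```typescript
// Hypothetical types for illustration only; not the SDK's exported types.
type Next<C, E, R> = (context: C) => AsyncGenerator<E, R, undefined>
type Handler<C, E, R> = (context: C, next: Next<C, E, R>) => AsyncGenerator<E, R, undefined>

type Ctx = { prompt: string }
type Evt = { type: 'delta' | 'debug'; text: string }
type Res = { result: string }

// Stand-in for the wrapped operation: streams two events, then returns a result.
async function* baseOperation(context: Ctx): AsyncGenerator<Evt, Res, undefined> {
  yield { type: 'debug', text: 'starting' }
  yield { type: 'delta', text: context.prompt.toUpperCase() }
  return { result: context.prompt.toUpperCase() }
}

// 1. Pass-through: forward every event and return the inner result unchanged.
const passThrough: Handler<Ctx, Evt, Res> = async function* (context, next) {
  return yield* next(context)
}

// 2. Manual iteration: drop 'debug' events as they arrive, forward the rest.
const dropDebug: Handler<Ctx, Evt, Res> = async function* (context, next) {
  const gen = next(context)
  let step = await gen.next()
  while (!step.done) {
    if (step.value.type !== 'debug') yield step.value
    step = await gen.next()
  }
  return step.value
}

// 3. Short-circuit: never call next, return a synthetic result instead.
const shortCircuit: Handler<Ctx, Evt, Res> = async function* (context) {
  return { result: `cached:${context.prompt}` }
}

// Helper that drains a generator, collecting yielded events and the final result.
async function collect<E, R>(gen: AsyncGenerator<E, R, undefined>): Promise<{ events: E[]; result: R }> {
  const events: E[] = []
  let step = await gen.next()
  while (!step.done) {
    events.push(step.value)
    step = await gen.next()
  }
  return { events, result: step.value }
}
```

Calling `collect(dropDebug({ prompt: 'hi' }, baseOperation))` drives the filtering handler directly against the stand-in operation, which is also a convenient way to unit-test a handler in isolation.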

Hooks fire unconditionally around the middleware chain. If middleware short-circuits a model call, Before/After hooks still fire. Observability stays intact.
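One way to picture "hooks around the chain" is a runner that fires the hooks outside the composed middleware. The sketch below is an assumption about the shape, not the agent's actual code: `runStage`, `LifecycleHooks`, and `drain` are hypothetical names, and running the after-hook in a `finally` block (so it also fires on errors) is a choice made here rather than something the description states.

```typescript
// Hypothetical types for illustration only.
type Next<C, E, R> = (context: C) => AsyncGenerator<E, R, undefined>

interface LifecycleHooks {
  before(): void
  after(): void
}

// The hooks sit outside the chain, so they run whether or not any
// middleware in the chain ever reaches the real operation.
async function* runStage<C, E, R>(
  context: C,
  chain: Next<C, E, R>,
  hooks: LifecycleHooks
): AsyncGenerator<E, R, undefined> {
  hooks.before()
  try {
    return yield* chain(context)
  } finally {
    hooks.after()
  }
}

const log: string[] = []
const hooks: LifecycleHooks = {
  before: () => {
    log.push('before')
  },
  after: () => {
    log.push('after')
  },
}

// A chain whose outermost middleware short-circuits: the model is never called.
const shortCircuited: Next<string, never, string> = async function* () {
  return 'synthetic'
}

// Helper that runs a generator to completion and returns its final result.
async function drain<E, R>(gen: AsyncGenerator<E, R, undefined>): Promise<R> {
  let step = await gen.next()
  while (!step.done) step = await gen.next()
  return step.value
}
```

Draining `runStage('ctx', shortCircuited, hooks)` returns the synthetic result while still recording both hook calls in `log`.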

When no middleware is registered for a stage, the agent calls the operation directly with no chain composition (zero overhead for users who don't use middleware).

Use Cases

Caching model responses:

agent.addMiddleware(InvokeModelStage, async function* (context, next) {
  const key = hashMessages(context.messages)
  const cached = cache.get(key)
  if (cached) return { result: cached }
  const result = yield* next(context)
  cache.set(key, result.result)
  return result
})

Telemetry spans around tool calls:

agent.addMiddleware(ExecuteToolStage, async function* (context, next) {
  const span = tracer.startSpan(`tool:${context.toolUse.name}`)
  try {
    return yield* next(context)
  } catch (e) {
    span.setStatus({ code: SpanStatusCode.ERROR })
    throw e
  } finally {
    span.end()
  }
})

Mocking tools in tests:

agent.addMiddleware(ExecuteToolStage, async function* (context) {
  return {
    result: new ToolResultBlock({
      toolUseId: context.toolUse.toolUseId,
      status: 'success',
      content: [new TextBlock(JSON.stringify(mockResponses[context.toolUse.name]))],
    }),
  }
})

Filtering events from the agent stream:

agent.addMiddleware(AgentStreamStage, async function* (context, next) {
  const gen = next(context)
  let r = await gen.next()
  while (!r.done) {
    if (r.value.type !== 'modelStreamUpdateEvent') yield r.value
    r = await gen.next()
  }
  return r.value
})

Key Implementation Decisions

Async generators over plain async functions. Agent operations stream events over time. Plain async middleware would require buffering all events (killing latency) or splitting into separate request/response + event APIs. Async generators handle pre-processing, real-time event streaming, and post-processing in one function.

Composition order. Handlers compose right-to-left: first registered = outermost. Register a rate limiter then a cache: rate limiter wraps cache wraps model call.
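The ordering rule, together with the no-middleware fast path, can be sketched with a right fold. This is an illustrative composer under assumed types, not the registry's actual code: `compose`, `label`, `core`, and `drain` are hypothetical names.

```typescript
// Hypothetical types for illustration only.
type Next<C, E, R> = (context: C) => AsyncGenerator<E, R, undefined>
type Handler<C, E, R> = (context: C, next: Next<C, E, R>) => AsyncGenerator<E, R, undefined>

// Folds handlers from the right so that handlers[0] ends up as the
// outermost wrapper around the core operation.
function compose<C, E, R>(handlers: readonly Handler<C, E, R>[], core: Next<C, E, R>): Next<C, E, R> {
  // Fast path: nothing registered, so hand back the core operation untouched.
  if (handlers.length === 0) return core
  return handlers.reduceRight<Next<C, E, R>>(
    (next, handler) => (context) => handler(context, next),
    core
  )
}

const trace: string[] = []

// Builds a handler that records when it is entered and when it exits.
const label = (name: string): Handler<string, never, string> =>
  async function* (context, next) {
    trace.push(`${name}:before`)
    const result = yield* next(context)
    trace.push(`${name}:after`)
    return result
  }

// Stand-in for the model call.
const core: Next<string, never, string> = async function* (context) {
  trace.push('core')
  return `echo:${context}`
}

// Registered in this order: rate limiter first, then cache.
const chain = compose([label('rateLimiter'), label('cache')], core)

// Helper that runs a generator to completion and returns its final result.
async function drain<E, R>(gen: AsyncGenerator<E, R, undefined>): Promise<R> {
  let step = await gen.next()
  while (!step.done) step = await gen.next()
  return step.value
}
```

After `await drain(chain('hello'))`, `trace` reads `rateLimiter:before`, `cache:before`, `core`, `cache:after`, `rateLimiter:after`, and `compose([], core)` returns `core` itself.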

Stage tokens are frozen objects keyed by reference. Two stages with the same name string are distinct. No collision risk from third-party stages. Generics on the stage token flow through to addMiddleware, giving full type inference at the registration site.
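A small sketch of reference-keyed tokens, assuming a frozen object and a `Map` keyed by identity. The `Stage` interface, the phantom `__types` field, the `Registry` class, and the synchronous handler shape are all simplifications invented for this example; only the `createStage<Ctx, Evt, Res>(name)` call shape comes from the description above.

```typescript
// Hypothetical token shape: the generics exist only to carry types to the
// registration site and have no runtime representation.
interface Stage<Ctx, Evt, Res> {
  readonly name: string
  // Phantom field so the type parameters are used; never set at runtime.
  readonly __types?: { ctx: Ctx; evt: Evt; res: Res }
}

function createStage<Ctx, Evt, Res>(name: string): Stage<Ctx, Evt, Res> {
  return Object.freeze({ name })
}

class Registry {
  // Map compares object keys by identity, so equal names never collide.
  private readonly byStage = new Map<object, unknown[]>()

  // Simplified synchronous handler shape, used here only to show inference.
  add<Ctx, Evt, Res>(stage: Stage<Ctx, Evt, Res>, handler: (context: Ctx) => Res): void {
    const list = this.byStage.get(stage) ?? []
    list.push(handler)
    this.byStage.set(stage, list)
  }

  count(stage: object): number {
    return this.byStage.get(stage)?.length ?? 0
  }
}

// Two tokens with the same name string are still distinct keys.
const stageA = createStage<{ query: string }, never, number>('search')
const stageB = createStage<{ query: string }, never, number>('search')

const registry = new Registry()
// `context` is inferred as { query: string } from the stage token.
registry.add(stageA, (context) => context.query.length)
```

Here `registry.count(stageA)` is 1 and `registry.count(stageB)` is 0, even though both tokens are named `search`.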

Testing

How have you tested the change?

  • I ran npm run check

61 new tests across 3 test files (registry unit tests, agent integration tests, custom stage tests). 2579 total tests pass. 93% statement coverage, 87% branch coverage.

Checklist

  • I have read the CONTRIBUTING document
  • I have added any necessary tests that prove my fix is effective or my feature works
  • I have updated the documentation accordingly
  • I have added an appropriate example to the documentation to outline the feature, or no new docs are needed
  • My changes generate no new warnings
  • Any dependent changes have been merged and published

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@agent-of-mkmeral

Contributor

Reviewed this end-to-end (cloned head ab420c3, built, ran the middleware + agent suites, type-check, lint/format, and adversarially tested the composer directly). The core design is strong — async-generator middleware is the right call for a streaming agent, the composer is clean, and behavior matches the docs (right-to-left composition, short-circuit, zero-overhead-when-unused all verified). Full breakdown is in our internal tracking issue; sharing the two API points we'd most want resolved before this goes GA, plus the CI blockers.

1. addMiddleware should return a cleanup handle (parity with addHook)

addHook returns HookCleanup for deregistration; addMiddleware returns void:

addMiddleware<TContext, TEvent, TResult>(
  stage: Stage<TContext, TEvent, TResult>,
  handler: MiddlewareHandler<TContext, TEvent, TResult>,
): void { this._middlewareRegistry.add(stage, handler) }

There's no removal path anywhere on MiddlewareRegistry or the agent. Any plugin or test that installs scoped middleware can't cleanly uninstall it:

const cleanup = agent.addMiddleware(ExecuteToolStage, mockTool) // returns void
cleanup() // ✗ doesn't exist

Returning a disposable (mirroring HookCleanup) closes this without changing the function-based shape. This is the asymmetry most likely to get noticed once people start writing reusable middleware.
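A minimal sketch of what a cleanup-returning registration could look like. `MiddlewareList` and `Cleanup` are hypothetical names, and the exact shape of `HookCleanup` is not shown in this thread, so a plain function is assumed here.

```typescript
// Assumed cleanup shape: a function that removes the registration it came from.
type Cleanup = () => void

class MiddlewareList<H> {
  // Each registration gets its own entry object, so registering the same
  // handler twice yields two independently removable entries.
  private entries: { handler: H }[] = []

  add(handler: H): Cleanup {
    const entry = { handler }
    this.entries.push(entry)
    return () => {
      const index = this.entries.indexOf(entry)
      // Calling cleanup a second time finds nothing and is a no-op.
      if (index !== -1) this.entries.splice(index, 1)
    }
  }

  handlers(): H[] {
    return this.entries.map((entry) => entry.handler)
  }
}

const mockTool = (): string => 'mock'
const telemetry = (): string => 'telemetry'

const list = new MiddlewareList<() => string>()
const removeMock = list.add(mockTool)
list.add(telemetry)

removeMock() // uninstall the scoped middleware
removeMock() // safe to call again
```

After both calls, `list.handlers()` contains only `telemetry`, which is the behavior a test or plugin would rely on when tearing down scoped middleware.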

2. The *Result objects are a forward-compat liability

InvokeModelResult / ExecuteToolResult / AgentStreamResult each consist of a single required readonly result field, and customers construct them by hand when short-circuiting:

agent.addMiddleware(InvokeModelStage, async function* (context, next) {
  const hit = cache.get(key)
  if (hit) return { result: hit }   // customer constructs InvokeModelResult here
  ...
})

Because every short-circuiting/mocking middleware builds this object by hand, the day the SDK wants to return anything alongside result is a breaking change for all of them. Worth either documenting these as "may gain optional fields — construct only what you need" or reserving growth room now, while the surface is still draft.

CI blockers (will fail code-quality as-is)

  • Lint (2 errors): 'MiddlewareNext' is defined but never used (agent.ts:45); 'ToolContext' is defined but never used (middleware-interrupts.test.ts:9). Plus 7 unused eslint-disable require-yield warnings.
  • format:check (9 files): all touched files fail Prettier. Note: local npm run check runs format (auto-writes), but CI runs format:check (fails) — that's why it passed locally. npm run lint:fix && npm run format clears all of it.
  • Rebase: branch is CONFLICTING.

Smaller things

  • AgentStreamStage interrupt IDs are middleware:agentStream:${name} with no per-middleware disambiguation — two stream middlewares interrupting with the same name would collide. ExecuteToolStage keys on toolUseId and is safe. Worth aligning.
  • Resolves: #XXX placeholder still in the description; docs/example checkboxes unchecked.
  • No sdk-python equivalent — worth stating the parity plan so a core extensibility primitive doesn't silently diverge across SDKs.

Test results for the record: middleware 75/75 pass, agent 330/330 pass (no regressions), tsc --noEmit clean, my own composer adversarial tests (order / short-circuit / zero-overhead identity / distinct-stages-by-reference / try-finally-on-throw) all pass.

🤖 AI agent review. Strands Agents.

Resolves CI lint failures introduced after rebasing onto upstream/main:

- Remove unused MiddlewareNext import in agent.ts
- Remove unused ToolContext import in middleware-interrupts.test.ts
- Remove unused require-yield eslint-disable directives in middleware tests
- Apply prettier formatting to middleware module + agent.ts + types/agent.ts
- Stop deep-matching the agent instance in InvokeModelContext test; the new
  ToolCaller Proxy on Agent breaks toMatchObject's recursive comparison.
  Use referential equality for agent and shape-match the rest of the context.

@JackYPCOnline

Contributor

/strands review this PR carefully, also double check any behavior change after rebasing.

@github-actions github-actions Bot added the strands-running <strands-managed> Whether or not an agent is currently running label May 29, 2026
Comment thread strands-ts/src/agent/agent.ts
Comment thread strands-ts/src/agent/agent.ts
Comment thread strands-ts/src/agent/agent.ts
Comment thread strands-ts/src/middleware/__tests__/registry.test.ts
Comment thread strands-ts/src/middleware/stages.ts
Comment thread strands-ts/src/middleware/registry.ts
Comment thread strands-ts/src/middleware/stages.ts
Comment thread strands-ts/src/middleware/__tests__/middleware-interrupts.test.ts
@github-actions

Contributor

Review Summary

Assessment: Request Changes

The middleware design is well-architected — async generators for streaming middleware, right-to-left composition, and typed stage tokens are all solid choices. The test coverage is thorough (61 new tests across 4 files).

However, the PR introduces a behavior-affecting bug from rebasing and has API gaps that will be costly to fix post-GA.

Review Categories
  • Behavioral Bug: Double _interruptState.resume() call (lines 746 + 1039 in agent.ts) — idempotent today but fragile and clearly a refactoring oversight from extracting _streamWithResumeLoop.
  • API Design: addMiddleware returns void (no removal path) and result types are a forward-compat liability. Both are breaking to change after release.
  • Code Quality: Duplicate collect() helpers in tests, duplicate TSDoc comment (rebase artifact), mutable arrays behind readonly context properties.
  • Documentation: Missing AGENTS.md directory structure update (required per contributing guidelines), Resolves: #XXX placeholder still in description.
  • Safety: AgentStreamStage interrupt IDs can collide when two middlewares use the same interrupt name.

The core design is strong and test coverage is comprehensive — addressing the API design gaps before this merges will save significant backward-compat pain later.

@github-actions github-actions Bot removed the strands-running <strands-managed> Whether or not an agent is currently running label May 29, 2026
Two issues surfaced by the review agent:

- Duplicate _interruptState.resume() call: stream() already processes
  interrupt responses before middleware runs (so context.interrupt() can
  see them); _stream() was still doing the same work afterwards. Drop
  the resume() call inside _stream() but keep the extraction since the
  interrupted-state guard depends on its length.
- Stray /** in types/agent.ts at line 329, left over from the rebase
  merge between upstream's takeSnapshot/loadSnapshot block and the PR's
  addMiddleware block. Remove the duplicate doc-comment opener.

@strands-agents strands-agents deleted a comment from github-actions Bot May 29, 2026
lizradway pushed a commit to lizradway/sdk-typescript that referenced this pull request Jun 1, 2026
…ands-agents#1068)

* feat: skip model invocation when latest message contains ToolUse

- Add _has_tool_use_in_latest_message() helper function to detect ToolUse in latest message
- Modify event_loop_cycle() to skip model execution when ToolUse is detected
- Set stop_reason='tool_use' and use latest message directly for tool execution
- Add comprehensive test coverage with 10 test scenarios
- Maintain backward compatibility and existing functionality
- No performance impact, minimal overhead for detection

Resolves the requirement to skip model calls when the agent should directly
execute tools based on existing ToolUse messages in the conversation.

🤖 Assisted by the code-assist agent script

* fix: Check messages array size
@github-actions github-actions Bot added the strands-running <strands-managed> Whether or not an agent is currently running label Jun 2, 2026
Comment thread strands-ts/src/agent/agent.ts
Comment thread strands-ts/src/middleware/stages.ts
Comment thread strands-ts/src/middleware/__tests__/agent-middleware.test.ts
@github-actions

github-actions Bot commented Jun 2, 2026

Contributor

Review Summary

Assessment: Comment

The design and implementation are solid — prior reviews have already covered the key API decisions (addMiddleware return type, result type forward-compat, interrupt ID collisions) and those threads were resolved. The CI blockers from the previous review appear fixed (duplicate resume removed, lint issues resolved).

New findings from this review
  • Hook visibility gap: AgentStreamStage short-circuit skips all lifecycle hooks (BeforeInvocationEvent, AfterInvocationEvent, telemetry spans) since they live inside _stream(). This is an undocumented behavioral asymmetry with InvokeModelStage/ExecuteToolStage where outer hooks fire unconditionally. Should be either documented or tested.
  • Mutable array types: InvokeModelContext.messages and .toolSpecs use Message[]/ToolSpec[] instead of readonly Message[]/readonly ToolSpec[], allowing middleware to accidentally mutate the agent's live state in-place.
  • Test gap: No test verifies the hooks-skipped behavior when AgentStreamStage short-circuits — important for documenting the intended contract.
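The mutable-array finding can be illustrated with a reduced context type. This is a sketch only: `Message`, `InvokeModelContextSketch`, and `withNote` are simplified, hypothetical stand-ins for the SDK's types, which have more fields.

```typescript
// Simplified stand-ins; the SDK's Message and InvokeModelContext are richer.
interface Message {
  role: 'user' | 'assistant'
  text: string
}

interface InvokeModelContextSketch {
  // readonly on the array type (not just the property) blocks push/splice.
  readonly messages: readonly Message[]
}

// With readonly Message[], `context.messages.push(...)` would be rejected by
// the type checker, so middleware has to copy before changing anything.
function withNote(context: InvokeModelContextSketch, note: string): InvokeModelContextSketch {
  return { ...context, messages: [...context.messages, { role: 'user', text: note }] }
}

const original: InvokeModelContextSketch = {
  messages: [{ role: 'user', text: 'hi' }],
}
const updated = withNote(original, 'be brief')
```

The transformed context carries two messages while `original.messages` still has one, so the agent's live state is untouched and the modified copy is what gets passed to `next`.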

The refactoring of stream() into _streamWithResumeLoop() is clean, and the 61+ tests provide strong coverage of the core composition, short-circuit, and error propagation paths.

@github-actions github-actions Bot removed the strands-running <strands-managed> Whether or not an agent is currently running label Jun 2, 2026
@JackYPCOnline JackYPCOnline marked this pull request as ready for review June 2, 2026 19:38
@github-actions github-actions Bot added the strands-running <strands-managed> Whether or not an agent is currently running label Jun 2, 2026
Comment thread strands-ts/src/agent/agent.ts
@github-actions

github-actions Bot commented Jun 2, 2026

Contributor

Review Summary

Assessment: Comment

This is a well-designed middleware system. The prior reviews comprehensively covered the API-level decisions (cleanup handle, result type forward-compat, interrupt IDs); this review focused on behavioral correctness of the integration into the agent loop.

New finding
  • Telemetry asymmetry: InvokeModelStage wraps telemetry around the middleware chain (short-circuit still records a span), while ExecuteToolStage telemetry lives inside _executeToolCore (short-circuit produces no tool span/metrics). This means middleware-cached or mocked tool calls are invisible to observability.

Test coverage is strong (61+ tests with good patterns: plugin integration, interrupt round-trips, event filtering), and the async-generator composition is clean.

3 participants