feat: add middleware system for wrapping agent stages #1068
Reviewed this end-to-end (cloned head 1.
Resolves CI lint failures introduced after rebasing onto upstream/main:
- Remove unused MiddlewareNext import in agent.ts
- Remove unused ToolContext import in middleware-interrupts.test.ts
- Remove unused require-yield eslint-disable directives in middleware tests
- Apply prettier formatting to middleware module + agent.ts + types/agent.ts
- Stop deep-matching the agent instance in InvokeModelContext test; the new ToolCaller Proxy on Agent breaks toMatchObject's recursive comparison. Use referential equality for agent and shape-match the rest of the context.
/strands review this PR carefully, and also double-check any behavior changes after rebasing.
Review Summary
Assessment: Request Changes
The middleware design is well-architected: async generators for streaming middleware, right-to-left composition, and typed stage tokens are all solid choices. The test coverage is thorough (61 new tests across 4 files). However, the PR introduces a behavior-affecting bug from rebasing and has API gaps that will be costly to fix post-GA.
Review Categories
The core design is strong and test coverage is comprehensive; addressing the API design gaps before this merges will save significant backward-compat pain later.
Two issues surfaced by the review agent:
- Duplicate _interruptState.resume() call: stream() already processes interrupt responses before middleware runs (so context.interrupt() can see them); _stream() was still doing the same work afterwards. Drop the resume() call inside _stream() but keep the extraction since the interrupted-state guard depends on its length.
- Stray /** in types/agent.ts at line 329, left over from the rebase merge between upstream's takeSnapshot/loadSnapshot block and the PR's addMiddleware block. Remove the duplicate doc-comment opener.
…ands-agents#1068)
* feat: skip model invocation when latest message contains ToolUse
- Add _has_tool_use_in_latest_message() helper function to detect ToolUse in latest message
- Modify event_loop_cycle() to skip model execution when ToolUse is detected
- Set stop_reason='tool_use' and use latest message directly for tool execution
- Add comprehensive test coverage with 10 test scenarios
- Maintain backward compatibility and existing functionality
- No performance impact, minimal overhead for detection
Resolves the requirement to skip model calls when the agent should directly execute tools based on existing ToolUse messages in the conversation.
🤖 Assisted by the code-assist agent script
* fix: Check messages array size
Review Summary
Assessment: Comment
The design and implementation are solid; prior reviews have already covered the key API decisions.
New findings from this review
The refactoring of |
Review Summary
Assessment: Comment
This is a well-designed middleware system. The prior reviews comprehensively covered the API-level decisions (cleanup handle, result type forward-compat, interrupt IDs); this review focused on behavioral correctness of the integration into the agent loop.
New finding
Test coverage is strong (61+ tests with good patterns: plugin integration, interrupt round-trips, event filtering), and the async-generator composition is clean.
Description
Adds a middleware system that wraps agent stages (model calls, tool execution, agent streaming) using async generator handlers. New public API: agent.addMiddleware(stage, handler).
Related Issues
Resolves: #XXX
Documentation PR
Type of Change
New feature
Motivation
Hooks let you observe operations and set flags, but they don't let you wrap them. If you want to do something both before and after a model call (timing it, adding a span, catching errors), hooks force you to manage state across two separate callbacks.
Beyond the before/after pattern, middleware also makes several other use cases much more natural to express: caching (check cache, call model only on miss, store result), input transformation (modify context, pass to next), short-circuiting (return synthetic result without calling next), and error handling (try/catch around next). All of these are awkward or impossible with hooks alone.
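As a rough illustration of the before/after pattern described above, the sketch below wraps a stand-in model call with timing and error capture in a single async generator. The Stream and Next aliases, the ModelContext and ModelResult shapes, and fakeModel are all hypothetical names invented for this example; they are not the SDK's actual exports.

```typescript
// Hypothetical local types for illustration; the SDK's real exports may differ.
type Stream<Evt, Res> = AsyncGenerator<Evt, Res, undefined>;
type Next<Ctx, Evt, Res> = (context: Ctx) => Stream<Evt, Res>;

interface ModelContext {
  prompt: string;
}
interface ModelResult {
  stopReason: string;
}

// Durations recorded by the middleware, in milliseconds.
const durationsMs: number[] = [];
// Errors observed by the middleware before being rethrown.
const failures: unknown[] = [];

// Before, after, and error handling live in one function, so no state
// has to be shared between two separate callbacks.
async function* timingMiddleware(
  context: ModelContext,
  next: Next<ModelContext, string, ModelResult>,
): Stream<string, ModelResult> {
  const startedAt = Date.now();
  try {
    // Forward every event as it arrives and return the wrapped result.
    return yield* next(context);
  } catch (error) {
    failures.push(error);
    throw error;
  } finally {
    durationsMs.push(Date.now() - startedAt);
  }
}

// Stand-in for the wrapped model call (an assumption, not SDK code).
async function* fakeModel(context: ModelContext): Stream<string, ModelResult> {
  yield `echo: ${context.prompt}`;
  return { stopReason: 'end_turn' };
}
```

With hooks, startedAt would have to be stored somewhere both callbacks can reach; here it is an ordinary local variable.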
Public API Changes
Three built-in stages ship with the SDK:
- InvokeModelStage: model calls
- ExecuteToolStage: tool execution
- AgentStreamStage: agent.stream() output
Third parties can define custom stages via createStage<Ctx, Evt, Res>(name) without modifying SDK internals.
Handlers are async generators. Simple pass-through is return yield* next(context). Manual iteration of next() allows real-time event filtering or injection. Not calling next at all short-circuits the operation.
Hooks fire unconditionally around the middleware chain. If middleware short-circuits a model call, Before/After hooks still fire. Observability stays intact.
When no middleware is registered for a stage, the agent calls the operation directly with no chain composition (zero overhead for users who don't use middleware).
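To make the custom-stage idea above concrete, here is a minimal sketch of how a typed stage token and a registry keyed by token reference might fit together. It is a hypothetical re-implementation written for this example, not the SDK's source: the Stage interface, the phantom __types field, MiddlewareRegistry, and the retrieval stage are all assumptions.

```typescript
// Hypothetical re-implementation for illustration; not the SDK's source.
type Stream<Evt, Res> = AsyncGenerator<Evt, Res, undefined>;
type Next<Ctx, Evt, Res> = (context: Ctx) => Stream<Evt, Res>;
type Handler<Ctx, Evt, Res> = (context: Ctx, next: Next<Ctx, Evt, Res>) => Stream<Evt, Res>;

interface Stage<Ctx, Evt, Res> {
  readonly name: string;
  // Phantom field: carries the generics for inference and is never set at runtime.
  readonly __types?: { ctx: Ctx; evt: Evt; res: Res };
}

function createStage<Ctx, Evt, Res>(name: string): Stage<Ctx, Evt, Res> {
  // Frozen so the token cannot be mutated after creation.
  return Object.freeze({ name });
}

class MiddlewareRegistry {
  // Keyed by the token object itself, so equal names never collide.
  private readonly handlers = new Map<object, unknown[]>();

  add<Ctx, Evt, Res>(stage: Stage<Ctx, Evt, Res>, handler: Handler<Ctx, Evt, Res>): void {
    const list = this.handlers.get(stage) ?? [];
    list.push(handler);
    this.handlers.set(stage, list);
  }

  count(stage: object): number {
    return this.handlers.get(stage)?.length ?? 0;
  }
}

interface RetrievalContext {
  query: string;
}

// Two tokens with the same name string are still distinct stages.
const retrievalStage = createStage<RetrievalContext, string, string[]>('retrieval');
const otherRetrievalStage = createStage<RetrievalContext, string, string[]>('retrieval');

const registry = new MiddlewareRegistry();
// context and next are inferred from the stage token's generics.
registry.add(retrievalStage, async function* (context, next) {
  yield `searching for ${context.query}`;
  return yield* next(context);
});
```

Keying the Map by object identity rather than by name is what removes the collision risk: a third-party package can pick any name without checking what other packages chose.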
Use Cases
Caching model responses:
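A minimal sketch of this use case, assuming a handler that receives a context and a next function. The types, the prompt-only cache key, and fakeModel are hypothetical and chosen only to keep the example self-contained; a real cache key would likely need to cover more of the context.

```typescript
// Hypothetical local types for illustration; the SDK's real exports may differ.
type Stream<Evt, Res> = AsyncGenerator<Evt, Res, undefined>;
type Next<Ctx, Evt, Res> = (context: Ctx) => Stream<Evt, Res>;

interface ModelContext {
  prompt: string;
}
interface ModelResult {
  text: string;
}

// Assumption: the prompt alone is a sufficient cache key for this sketch.
const responseCache = new Map<string, ModelResult>();
let modelCalls = 0;

async function* cachingMiddleware(
  context: ModelContext,
  next: Next<ModelContext, string, ModelResult>,
): Stream<string, ModelResult> {
  const cached = responseCache.get(context.prompt);
  if (cached !== undefined) {
    // Cache hit: short-circuit by returning without calling next.
    return cached;
  }
  // Cache miss: stream the real call through, then store its result.
  const result = yield* next(context);
  responseCache.set(context.prompt, result);
  return result;
}

// Stand-in for the wrapped model call; counts how often it actually runs.
async function* fakeModel(context: ModelContext): Stream<string, ModelResult> {
  modelCalls += 1;
  yield 'token';
  return { text: `reply to ${context.prompt}` };
}
```

On a hit this handler yields no events at all, so a consumer that depends on streamed events would need the handler to synthesize them.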
Telemetry spans around tool calls:
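One way this could look, sketched with an in-memory span list standing in for a real tracing library. The ToolContext, ToolResult, and Span shapes and both sample tools are hypothetical; nothing here is an SDK or OpenTelemetry API.

```typescript
// Hypothetical local types for illustration; the SDK's real exports may differ.
type Stream<Evt, Res> = AsyncGenerator<Evt, Res, undefined>;
type Next<Ctx, Evt, Res> = (context: Ctx) => Stream<Evt, Res>;

interface ToolContext {
  toolName: string;
  input: unknown;
}
interface ToolResult {
  content: string;
}
interface Span {
  name: string;
  status: 'ok' | 'error';
  durationMs: number;
}

// Stand-in for a tracing backend: finished spans are appended here.
const finishedSpans: Span[] = [];

async function* spanMiddleware(
  context: ToolContext,
  next: Next<ToolContext, string, ToolResult>,
): Stream<string, ToolResult> {
  const startedAt = Date.now();
  let status: Span['status'] = 'ok';
  try {
    return yield* next(context);
  } catch (error) {
    // Mark the span as failed, then let the error propagate unchanged.
    status = 'error';
    throw error;
  } finally {
    // Runs on success and on failure, so every call produces one span.
    finishedSpans.push({
      name: `tool:${context.toolName}`,
      status,
      durationMs: Date.now() - startedAt,
    });
  }
}

// Two stand-in tools: one succeeds, one fails after its first event.
async function* okTool(context: ToolContext): Stream<string, ToolResult> {
  yield `running ${context.toolName}`;
  return { content: 'done' };
}
async function* failingTool(context: ToolContext): Stream<string, ToolResult> {
  yield `running ${context.toolName}`;
  throw new Error(`${context.toolName} failed`);
}
```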
Mocking tools in tests:
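A sketch of the short-circuit approach, assuming the tool context exposes a tool name. The ToolContext and ToolResult shapes, the get_weather tool name, and realTool are hypothetical stand-ins for this example.

```typescript
// Hypothetical local types for illustration; the SDK's real exports may differ.
type Stream<Evt, Res> = AsyncGenerator<Evt, Res, undefined>;
type Next<Ctx, Evt, Res> = (context: Ctx) => Stream<Evt, Res>;

interface ToolContext {
  toolName: string;
}
interface ToolResult {
  content: string;
}

// Canned results keyed by tool name; tools not listed run for real.
const cannedResults = new Map<string, ToolResult>([
  ['get_weather', { content: 'sunny, 21C' }],
]);

let realToolCalls = 0;

async function* mockToolsMiddleware(
  context: ToolContext,
  next: Next<ToolContext, string, ToolResult>,
): Stream<string, ToolResult> {
  const canned = cannedResults.get(context.toolName);
  if (canned !== undefined) {
    // Never calling next means the real tool is never executed.
    return canned;
  }
  return yield* next(context);
}

// Stand-in for real tool execution; counts how often it runs.
async function* realTool(context: ToolContext): Stream<string, ToolResult> {
  realToolCalls += 1;
  yield `running ${context.toolName}`;
  return { content: 'real result' };
}
```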
Filtering events from the agent stream:
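Filtering needs manual iteration of next() rather than yield*, because each event has to be inspected before it is forwarded. The sketch below assumes a hypothetical AgentEvent shape with a type field and drops events of one type; the event names and fakeAgentStream are invented for illustration.

```typescript
// Hypothetical local types for illustration; the SDK's real exports may differ.
type Stream<Evt, Res> = AsyncGenerator<Evt, Res, undefined>;
type Next<Ctx, Evt, Res> = (context: Ctx) => Stream<Evt, Res>;

interface AgentContext {
  prompt: string;
}
interface AgentEvent {
  type: string;
  text: string;
}
interface AgentResult {
  stopReason: string;
}

async function* dropThinkingEvents(
  context: AgentContext,
  next: Next<AgentContext, AgentEvent, AgentResult>,
): Stream<AgentEvent, AgentResult> {
  const inner = next(context);
  let step = await inner.next();
  while (!step.done) {
    // Forward everything except 'thinking' events, one at a time.
    if (step.value.type !== 'thinking') {
      yield step.value;
    }
    step = await inner.next();
  }
  // Once done, step.value holds the wrapped operation's return value.
  return step.value;
}

// Stand-in for the agent stream being wrapped.
async function* fakeAgentStream(context: AgentContext): Stream<AgentEvent, AgentResult> {
  yield { type: 'thinking', text: 'planning' };
  yield { type: 'text', text: `answer to ${context.prompt}` };
  yield { type: 'thinking', text: 'reflecting' };
  return { stopReason: 'end_turn' };
}
```

This sketch does not close the inner generator if the consumer stops early; a fuller version would likely handle that in a finally block.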
Key Implementation Decisions
Async generators over plain async functions. Agent operations stream events over time. Plain async middleware would require buffering all events (killing latency) or splitting into separate request/response + event APIs. Async generators handle pre-processing, real-time event streaming, and post-processing in one function.
Composition order. Handlers compose right-to-left: first registered = outermost. Register a rate limiter then a cache: rate limiter wraps cache wraps model call.
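The ordering rule above can be sketched with a small compose helper built on reduceRight. This is an illustrative reconstruction under assumed types, not the PR's actual implementation; compose, tracing, and modelCall are hypothetical names, and the "rate limiter" and "cache" here only record when they run.

```typescript
// Hypothetical local types for illustration; the SDK's real exports may differ.
type Stream<Evt, Res> = AsyncGenerator<Evt, Res, undefined>;
type Next<Ctx, Evt, Res> = (context: Ctx) => Stream<Evt, Res>;
type Handler<Ctx, Evt, Res> = (context: Ctx, next: Next<Ctx, Evt, Res>) => Stream<Evt, Res>;

// Folds from the right so the first handler in the list ends up outermost.
function compose<Ctx, Evt, Res>(
  handlers: readonly Handler<Ctx, Evt, Res>[],
  operation: Next<Ctx, Evt, Res>,
): Next<Ctx, Evt, Res> {
  return handlers.reduceRight<Next<Ctx, Evt, Res>>(
    (next, handler) => (context) => handler(context, next),
    operation,
  );
}

// Records the order in which handlers and the operation run.
const trace: string[] = [];

// Builds a pass-through handler that logs before and after the inner call.
function tracing(label: string): Handler<string, string, string> {
  return async function* (context, next) {
    trace.push(`${label}:before`);
    const result = yield* next(context);
    trace.push(`${label}:after`);
    return result;
  };
}

// Stand-in for the model call at the center of the chain.
async function* modelCall(prompt: string): Stream<string, string> {
  trace.push('model');
  yield 'token';
  return `reply to ${prompt}`;
}

// Registered first = outermost: rateLimiter wraps cache wraps modelCall.
const chain = compose([tracing('rateLimiter'), tracing('cache')], modelCall);
```

Draining chain('hi') fills trace with rateLimiter:before, cache:before, model, cache:after, rateLimiter:after. With an empty handler list this compose returns the operation itself, which is one way to get the no-middleware fast path.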
Stage tokens are frozen objects keyed by reference. Two stages with the same name string are distinct. No collision risk from third-party stages. Generics on the stage token flow through to addMiddleware, giving full type inference at the registration site.
Testing
How have you tested the change?
npm run check
61 new tests across 3 test files (registry unit tests, agent integration tests, custom stage tests). 2579 total tests pass. 93% statement coverage, 87% branch coverage.
Checklist
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.