feat(swarm): add AgentBase protocol support #2002
emaan-c wants to merge 2 commits into strands-agents:main
/strands review
injected_count = 0
for node in self.nodes.values():
    # Only Agent (not generic AgentBase) has tool_registry attribute
    if not isinstance(node.executor, Agent):
How do you determine handoffs for non-Agent AgentBase instances, then?
Good question. Looking at the implementation: non-Agent AgentBase instances currently cannot initiate handoffs, because _inject_swarm_tools (lines 534-577) skips tool injection for any node where isinstance(node.executor, Agent) is false. They can only receive handoffs from Agent nodes, which do have the handoff tool.
This is a significant capability gap that should be clearly communicated. The _build_node_input method (line 706) also tells all nodes they have "access to swarm coordination tools" regardless of whether tools were injected, which could confuse LLM-backed AgentBase implementations. I've left a separate comment on that.
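One way to keep the prompt honest is to mention coordination tools only to nodes that actually received them. The following is a minimal sketch under stated assumptions: `NodeStub`, its `has_handoff_tool` flag, and `build_node_input` are hypothetical names for illustration, and the real `_build_node_input` signature and wording may differ.

```python
from dataclasses import dataclass


@dataclass
class NodeStub:
    """Hypothetical stand-in for SwarmNode; not the real Strands class."""

    node_id: str
    has_handoff_tool: bool  # assumed flag, set when tool injection succeeds


def build_node_input(node: NodeStub, task: str) -> str:
    """Describe coordination tools only to nodes that actually received them."""
    lines = [f"User Request: {task}"]
    if node.has_handoff_tool:
        lines.append("You have access to swarm coordination tools if you need to hand off to another agent.")
    else:
        lines.append("You cannot hand off to other agents; complete the task and return your result.")
    return "\n".join(lines)
```

With this shape, an LLM-backed AgentBase node is never told about tools it cannot call.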
  # Validate entry point if specified
  if self.entry_point is not None:
-     entry_point_node_id = str(self.entry_point.name)
+     entry_point_node_id = str(getattr(self.entry_point, "name", None))
Issue: Entry point validation breaks for AgentBase without name attribute.
When entry_point is an AgentBase without a name attribute, str(getattr(self.entry_point, "name", None)) evaluates to the string "None". Meanwhile, nameless nodes are assigned IDs like "node_0", "node_1" etc (line 488). So the lookup on line 503 will never find the entry point, and the user will get a confusing error message.
Suggestion: Use the same logic used during node registration to resolve the entry point's node_id, or look up by identity (is check) instead of by name:
    if self.entry_point is not None:
        entry_node = None
        for node in self.nodes.values():
            if node.executor is self.entry_point:
                entry_node = node
                break
        if entry_node is None:
            available_agents = [
                f"{node_id} ({type(node.executor).__name__})" for node_id, node in self.nodes.items()
            ]
            raise ValueError(f"Entry point agent not found in swarm nodes. Available agents: {available_agents}")

This approach is more robust. The existing code already applies the identity check as a secondary condition (line 504), so consolidating on identity-based lookup avoids the name-based fragility entirely.
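The identity-based validation can be exercised outside the Swarm class. This is a sketch only: `NodeStub`, `NamelessAgent`, and `resolve_entry_node` are hypothetical stand-ins, not Strands classes, and the node IDs simply mirror the "node_0" convention described above.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class NodeStub:
    """Hypothetical stand-in for SwarmNode."""

    node_id: str
    executor: Any


class NamelessAgent:
    """Hypothetical AgentBase-style object with no `name` attribute."""


def resolve_entry_node(nodes: dict[str, NodeStub], entry_point: Any) -> NodeStub:
    """Find the node wrapping entry_point by identity, or raise a descriptive error."""
    for node in nodes.values():
        if node.executor is entry_point:
            return node
    available = [f"{nid} ({type(n.executor).__name__})" for nid, n in nodes.items()]
    raise ValueError(f"Entry point agent not found in swarm nodes. Available agents: {available}")


agent = NamelessAgent()
nodes = {"node_0": NodeStub("node_0", agent)}
# Name-based resolution yields the literal string "None", which is not a node ID.
name_based_id = str(getattr(agent, "name", None))
entry_node = resolve_entry_node(nodes, agent)
```

Here `name_based_id` never matches a key in `nodes`, while the identity lookup returns the `node_0` entry.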
injected_count = 0
for node in self.nodes.values():
    # Only Agent (not generic AgentBase) has tool_registry attribute
    if not isinstance(node.executor, Agent):
Issue: Non-Agent nodes silently lose key Swarm capabilities with no user feedback.
When an AgentBase (non-Agent) node is skipped for tool injection, the user isn't informed at Swarm creation time that these nodes won't be able to initiate handoffs. The debug log on line 549 is only visible at DEBUG level. A user might expect their custom AgentBase to participate in handoffs and be surprised when it doesn't.
Suggestion:
- Add a logger.warning (not just debug) when non-Agent nodes are skipped, or at minimum a clear log after setup summarizing which nodes have handoff capabilities and which don't.
- Consider adding a note in the Swarm.__init__ docstring that AgentBase nodes without tool_registry will not have handoff capabilities and can only serve as entry/exit points or handoff targets.
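The first suggestion could look roughly like the sketch below. It is not the PR's code: the executor classes are hypothetical stand-ins, the registry is modelled as a plain dict rather than the real tool registry object, the `"handoff_to_agent"` key is illustrative, and the check uses `getattr` duck typing where the PR uses `isinstance(node.executor, Agent)`.

```python
import logging

logger = logging.getLogger("swarm_sketch")


class ToolCapableAgent:
    """Hypothetical stand-in for Agent: exposes a tool_registry (a dict here)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.tool_registry: dict = {}


class ExternalAgent:
    """Hypothetical AgentBase-style executor without a tool_registry."""

    def __init__(self, name: str) -> None:
        self.name = name


def inject_handoff_tool(executors: list, handoff_tool: object) -> list:
    """Register the tool where possible; return the names of skipped executors."""
    skipped = []
    for executor in executors:
        registry = getattr(executor, "tool_registry", None)
        if registry is None:
            skipped.append(executor.name)
            continue
        registry["handoff_to_agent"] = handoff_tool
    if skipped:
        # WARNING level so the capability gap is visible without DEBUG logging.
        logger.warning(
            "Nodes %s have no tool_registry and cannot initiate handoffs; "
            "they can only act as entry points or handoff targets.",
            skipped,
        )
    return skipped
```

Emitting one summary warning after the loop, rather than one per node, keeps the log readable for large swarms.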
    "model_state": node.executor._model_state,
}
# Only Agent (not generic AgentBase) has _interrupt_state, state, and messages attributes
if isinstance(node.executor, Agent):
Issue: Silent degradation for interrupt state on non-Agent nodes.
When an interrupt occurs on a non-Agent AgentBase node, no executor context is saved (the isinstance(node.executor, Agent) guard prevents it). This means if a user tries to resume from an interrupt on a non-Agent node, the resume behavior is undefined — reset_executor_state will try to use self.swarm._interrupt_state.context[self.node_id] (line 112) but that key won't exist, causing a KeyError.
Suggestion: Either:
- Raise a clear error if an interrupt is activated on a non-Agent node explaining that interrupt/resume is not supported for AgentBase implementations, or
- Save a minimal context for non-Agent nodes (e.g., just marking the interrupt state without Agent-specific attributes), or
- Add a guard in reset_executor_state to handle the case where no context exists for non-Agent nodes during interrupt resume.
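The third option, a guard around the context lookup, might be sketched as follows. `InterruptResumeError` and `load_saved_context` are hypothetical names, and the context is modelled as a plain dict keyed by node ID; the real `reset_executor_state` would need the equivalent check around its own lookup.

```python
from typing import Any


class InterruptResumeError(RuntimeError):
    """Hypothetical error type for an unsupported interrupt resume."""


def load_saved_context(interrupt_context: dict[str, Any], node_id: str) -> dict[str, Any]:
    """Return the saved executor context, failing with a clear message instead of a bare KeyError."""
    try:
        return interrupt_context[node_id]
    except KeyError:
        raise InterruptResumeError(
            f"node_id=<{node_id}> | no saved interrupt context; "
            "interrupt/resume is only supported for Agent executors"
        ) from None
```

Raising a dedicated error keeps the failure explicit; silently skipping the restore would be the alternative if non-Agent nodes are meant to resume without state.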
Issue: Missing documentation PR. This PR adds new public API surface — Users need to understand:
Suggestion: Please add a documentation PR to https://github.com/strands-agents/docs and link it here.
Issue: This PR modifies public API and should have the
Key API questions that should be addressed:
Suggestion: Add the
Assessment: Request Changes
Good direction: extending Swarm to accept AgentBase aligns with the Graph pattern and the "Extensible by design" tenet. The core approach of using
Review Categories
The pattern is well-aligned with Graph and the test coverage is solid for the happy path. Addressing the entry point bug and the interrupt resume issue would make this ready for a follow-up round.
  def _initial_node(self) -> SwarmNode:
      if self.entry_point:
-         return self.nodes[str(self.entry_point.name)]
+         entry_point_name = getattr(self.entry_point, "name", None)
Issue: _initial_node (line 1074) has the same name-based lookup fragility as entry point validation.
If entry_point is an AgentBase without a name attribute, getattr(self.entry_point, "name", None) returns None, then str(None) is "None", and "None" in self.nodes will be False. This means it falls through to next(iter(self.nodes.values())) — silently ignoring the user's entry_point configuration.
Suggestion: Use identity-based lookup here too, consistent with the fix for _setup_swarm.
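The silent fallback is easy to reproduce in isolation. In this sketch the node mapping holds executors directly for brevity (the real mapping holds SwarmNode objects), and both helper functions are hypothetical reconstructions of the behavior described in this comment, not the PR's code.

```python
from typing import Any


class NamelessAgent:
    """Hypothetical AgentBase-style executor with no `name` attribute."""


def initial_node_by_name(nodes: dict[str, Any], entry_point: Any) -> Any:
    """Name-based lookup as described above: falls back to the first node on a miss."""
    name = getattr(entry_point, "name", None)
    if str(name) in nodes:
        return nodes[str(name)]
    return next(iter(nodes.values()))


def initial_node_by_identity(nodes: dict[str, Any], entry_point: Any) -> Any:
    """Identity-based alternative: returns the matching executor or raises."""
    for executor in nodes.values():
        if executor is entry_point:
            return executor
    raise ValueError("entry_point is not one of the swarm nodes")


first, second = NamelessAgent(), NamelessAgent()
nodes = {"node_0": first, "node_1": second}
```

Asking for `second` as the entry point, the name-based helper returns `first`, while the identity-based helper returns `second`.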
Description
Enable Swarm to accept any AgentBase implementation, not just concrete Agent instances. This allows developers to integrate agents from LangGraph, Quick Suite, or custom frameworks into Swarm orchestration alongside native Strands agents, reducing development overhead when extending multi-agent systems with existing agent implementations.
Motivation
Customers have existing agent investments in other frameworks. Currently, using these agents in Strands Swarm requires rebuilding them as native Strands agents, creating significant duplication of effort. This change enables customers to leverage Strands' swarm orchestration while reusing existing agent implementations, providing a migration path and reducing maintenance burden.
Resolves: #1720
Public API Changes
Swarm now accepts AgentBase implementations:
External agents participate in execution and return results. Full Swarm features (handoffs, state management) require native Agent instances; external agents degrade gracefully. No breaking changes.
Use Cases
Integrate existing agents from other frameworks without rebuilding, enabling quick prototyping and incremental migration from other frameworks to Strands.
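As an illustration of the integration path, an external agent can be wrapped in a thin adapter. Everything below is an assumption made for the sketch: `AgentLike` is a simplified stand-in for AgentBase (the real protocol may require more methods and a richer result type), and `ExternalFrameworkAgent` is a placeholder for an agent from another framework.

```python
import asyncio
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class AgentLike(Protocol):
    """Hypothetical, simplified stand-in for AgentBase."""

    async def invoke_async(self, prompt: Any = None, **kwargs: Any) -> Any: ...


class ExternalFrameworkAgent:
    """Placeholder for an agent built in another framework."""

    def run(self, text: str) -> str:
        return f"echo: {text}"


class ExternalAgentAdapter:
    """Wraps the external agent so it exposes the assumed invoke_async entry point."""

    def __init__(self, name: str, inner: ExternalFrameworkAgent) -> None:
        self.name = name  # an explicit name avoids relying on generated node IDs
        self._inner = inner

    async def invoke_async(self, prompt: Any = None, **kwargs: Any) -> str:
        # Run the synchronous external call off the event loop thread.
        return await asyncio.to_thread(self._inner.run, str(prompt))


adapter = ExternalAgentAdapter("legacy_researcher", ExternalFrameworkAgent())
result = asyncio.run(adapter.invoke_async("hello"))
```

Giving the adapter a `name` sidesteps the nameless-node issues raised in the review, since name-based lookups then have a real value to match.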
Related Issues
Resolves: #1720
Documentation PR
N/A
Type of Change
New feature
Testing
How have you tested the change?
hatch run prepare
Checklist
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.