feat(agent): add streaming and multimodal tool result support (#3912) by SeasonPilot · Pull Request #4055 · alibaba/spring-ai-alibaba

SeasonPilot · 2026-01-13T09:37:42Z

Describe what this PR does / why we need it

Add support for streaming tool execution and multimodal tool results in the Agent Framework.
This enables tools to emit incremental results during execution and return rich content
types including images, audio, video, and files.

Key additions:

StreamingToolCallback interface for tools that emit results incrementally via Flux
ToolResult class supporting text, media (images/audio/video), files, and mixed content
ToolStreamingOutput for streaming graph execution pipeline integration
ToolStreamingErrorHandler for unified error handling across streaming scenarios
AgentToolNode.executeToolCallsStreaming() for Flux-based tool execution

Does this pull request fix one issue?

Fixes #3912

Describe how you did it

Created StreamingToolCallback extending AsyncToolCallback with callStream() method
Implemented ToolResult as immutable value class with factory methods for different content types:

text(), chunk(), media(), file(), mixed(), error()
merge() for combining streaming chunks

Added ToolStreamingOutput extending StreamingOutput with tool-specific fields
Enhanced AgentToolNode with:

executeToolCallsStreaming() returning Flux
Streaming tool priority over async tools in executeToolByType()
Cached Scheduler for efficient reactive execution

Added ToolStreamingErrorHandler for consistent error extraction and recovery

Describe how to verify it

Run the new unit tests:

# Run all new streaming tests
mvn test -pl spring-ai-alibaba-agent-framework -Dtest=StreamingToolCallbackTest,ToolResultTest,ToolStreamingErrorHandlerTest,AgentToolNodeStreamingExecutionTest

# Run ToolStreamingOutput tests
mvn test -pl spring-ai-alibaba-graph-core -Dtest=ToolStreamingOutputTest

# Run all tests for affected modules
mvn test -pl spring-ai-alibaba-agent-framework,spring-ai-alibaba-graph-core

Special notes for reviews

StreamingToolCallback takes execution priority over AsyncToolCallback (line 446-451)
ToolResult is immutable; all modifications return new instances
Streaming execution uses reactive Flux to support backpressure
Error handling preserves original exceptions while providing user-friendly messages
Added comprehensive test coverage (60+ test cases across 6 test files)

…aba#3988) Add comprehensive async tool execution capabilities to the agent framework: - AsyncToolCallback: Interface for async tool operations with CompletableFuture - CancellableAsyncToolCallback: Async tools with cooperative cancellation - CancellationToken/DefaultCancellationToken: Cancellation coordination - ToolStateCollector: Thread-safe parallel tool state management - StateAwareToolCallback: Marker interface for state-aware tools - ToolCancelledException: Exception for cancelled tool operations Enhanced AgentToolNode with: - Parallel tool execution with configurable executor - Async tool detection and execution - Timeout support for async operations - Graceful error handling and state collection Added comprehensive test coverage (75 tests) and usage examples.

…tests The cooperative cancellation and state update tests were failing in CI due to thread scheduling delays in the ForkJoinPool.commonPool(). CI environments have higher load causing async tasks to start late. Changes: - Increased getTimeout() from 50-100ms to 500ms - Increased await() times from 1-2s to 5-10s - Adjusted iteration threshold from 100 to 200

…a#3912) ### Describe what this PR does / why we need it Add support for streaming tool execution and multimodal tool results in the Agent Framework. This enables tools to emit incremental results during execution and return rich content types including images, audio, video, and files. Key additions: - StreamingToolCallback interface for tools that emit results incrementally via Flux<ToolResult> - ToolResult class supporting text, media (images/audio/video), files, and mixed content - ToolStreamingOutput for streaming graph execution pipeline integration - ToolStreamingErrorHandler for unified error handling across streaming scenarios - AgentToolNode.executeToolCallsStreaming() for Flux-based tool execution ### Does this pull request fix one issue? Fixes alibaba#3912 ### Describe how you did it 1. Created StreamingToolCallback extending AsyncToolCallback with callStream() method 2. Implemented ToolResult as immutable value class with factory methods for different content types: - text(), chunk(), media(), file(), mixed(), error() - merge() for combining streaming chunks 3. Added ToolStreamingOutput extending StreamingOutput with tool-specific fields 4. Enhanced AgentToolNode with: - executeToolCallsStreaming() returning Flux<Object> - Streaming tool priority over async tools in executeToolByType() - Cached Scheduler for efficient reactive execution 5. Added ToolStreamingErrorHandler for consistent error extraction and recovery ### Describe how to verify it Run the new unit tests: ```bash # Run all new streaming tests mvn test -pl spring-ai-alibaba-agent-framework -Dtest=StreamingToolCallbackTest,ToolResultTest,ToolStreamingErrorHandlerTest,AgentToolNodeStreamingExecutionTest # Run ToolStreamingOutput tests mvn test -pl spring-ai-alibaba-graph-core -Dtest=ToolStreamingOutputTest # Run all tests for affected modules mvn test -pl spring-ai-alibaba-agent-framework,spring-ai-alibaba-graph-core ``` ### Special notes for reviews - StreamingToolCallback takes execution priority over AsyncToolCallback (line 446-451) - ToolResult is immutable; all modifications return new instances - Streaming execution uses reactive Flux to support backpressure - Error handling preserves original exceptions while providing user-friendly messages - Added comprehensive test coverage (60+ test cases across 6 test files)

Aias00 · 2026-01-21T00:44:14Z

@SeasonPilot resolve conflicts pls

Copilot

Pull request overview

This pull request adds streaming and multimodal tool result support to the Spring AI Alibaba Agent Framework. It enables tools to emit incremental results during execution and return rich content types including images, audio, video, and files.

Changes:

Introduces StreamingToolCallback interface for tools that emit results incrementally via Flux
Adds ToolResult class supporting text, media (images/audio/video), files, and mixed content with serialization/deserialization
Implements ToolStreamingOutput for streaming graph execution pipeline integration with tool identification metadata
Adds comprehensive test coverage (60+ test cases across multiple test files)

Reviewed changes

Copilot reviewed 36 out of 36 changed files in this pull request and generated 5 comments.

Show a summary per file

File	Description
ToolStreamingOutput.java	Streaming output wrapper for tool execution results with tool identification (toolCallId/toolName)
ToolStreamingOutputTest.java	Comprehensive test coverage for ToolStreamingOutput including JSON escaping and reflection
ToolStreamingErrorHandler.java	Unified error handler converting errors to streaming outputs instead of throwing
ToolStreamingErrorHandlerTest.java	Tests for error message extraction and timeout/cancellation detection
ToolResult.java	Immutable value class supporting text and multimodal content with JSON serialization
ToolResultTest.java	Tests for ToolResult factory methods, merge operations, and serialization
ToolStateCollector.java	Thread-safe state collector for parallel tool executions with isolation
ToolStateCollectorTest.java	Tests for parallel state management and discard functionality
ToolCancelledException.java	Exception for cancelled tool executions
StateAwareToolCallback.java	Marker interface for tools requiring state injection
ToolCallRequest.java	Added defensive copy of context map to prevent external modification
README updates	Documentation for new streaming and async features
Multiple test files	Extensive test coverage for async execution, cancellation, parallel execution, and integration scenarios

Comments suppressed due to low confidence (1)

spring-ai-alibaba-agent-framework/src/main/java/com/alibaba/cloud/ai/graph/agent/interceptor/ToolCallRequest.java:74

getContext exposes the internal representation stored in field context. The value may be modified after this call to getContext.

	public Map<String, Object> getContext() {

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-01-22T01:00:14Z

...-alibaba-agent-framework/src/main/java/com/alibaba/cloud/ai/graph/agent/tool/ToolResult.java

+	 * @return the list of media content
+	 */
+	public List<Media> getMediaContent() {
+		return mediaContent;


getMediaContent exposes the internal representation stored in field mediaContent. The value may be modified after this call to getMediaContent.

Suggested change

* @return the list of media content

*/

public List<Media> getMediaContent() {

return mediaContent;

* @return an unmodifiable list of media content (empty if none)

*/

public List<Media> getMediaContent() {

return mediaContent == null ? Collections.emptyList() : Collections.unmodifiableList(mediaContent);

...ork/src/test/java/com/alibaba/cloud/ai/graph/agent/node/AgentToolNodeAsyncExecutionTest.java

...rk/src/test/java/com/alibaba/cloud/ai/graph/agent/tool/CancellableAsyncToolCallbackTest.java

Copilot · 2026-01-22T01:00:15Z

...ramework/src/test/java/com/alibaba/cloud/ai/graph/agent/tool/ToolCancelledExceptionTest.java

+	void shouldBeRuntimeException() {
+		ToolCancelledException exception = new ToolCancelledException("test");
+
+		assertTrue(exception instanceof RuntimeException, "Should be a RuntimeException");


There is no need to test whether an instance of ToolCancelledException is also an instance of RuntimeException - it always is.

Aias00 · 2026-01-22T07:47:24Z

@SeasonPilot resolve conflicts pls

…olNode Fix three bugs identified during code review: 1. Parallel mode cancellation token not propagated to tools on outer timeout 2. Thread pool starvation/deadlock when parallelToolExecution=true with wrapSyncToolsAsAsync=true 3. Sequential execution timeout clearing all tools' state updates instead of isolating per-tool Add comprehensive test coverage for these scenarios.

…feat/issue-3988

Change-Id: I77b3559ed20c0a76c6c94ab6387cbb69d6d76e62

Change-Id: I97b2aea6bb9eeccc59da937fec4a377adc7c90c5

Change-Id: Ib8ed7f97c49e081a74ae3bc490963104ba441283

…ibaba into feat/issue-3912

…ring-ai-alibaba into feat/issue-3912

…safety Fixes multiple issues in streaming tool execution path: - Fix semaphore deadlock in parallel tool execution by using single-stage pattern where acquire and release happen within same task - Add returnDirect metadata support to streaming execution GraphResponse.done - Apply ToolExecutionExceptionProcessor to streaming tool errors for consistent error handling across sync/async/streaming paths - Fix JSON validation in ToolStreamingOutput using Jackson ObjectMapper - Clear state updates on streaming tool failure to prevent partial state leakage - Fix null mimeType handling in ToolResult.dtoToMedia - Fix parallel edge processing in CompiledGraph to avoid IllegalStateException Add comprehensive test coverage for all fixes.

…nto feat/issue-3912

Replace blocking Flux.collectList().block() with StepVerifier for reactive stream testing. The 1-second timeout was causing intermittent failures in CI due to resource contention. Changes: - toFluxEmitsAsyncData: use StepVerifier.assertNext() + verifyComplete() - toFluxPropagatesErrors: use StepVerifier.expectErrorMatches() + verify()

The A2aNodeActionWithConfigTests uses Project Reactor's StepVerifier for testing reactive streams, but the reactor-test dependency was missing from pom.xml, causing CI compilation failures. Added reactor-test with test scope to resolve the compilation error. Version is managed by Spring Boot BOM for compatibility.

The Hook interface now extends Prioritized which requires getOrder() implementation. Added the missing method to fix CI compilation failure.

SeasonPilot · 2026-02-02T00:44:28Z

@chickenlj PTAL

…nto feat/issue-3912

…sTest The TestHookWithTools inner class had getOrder() defined twice, causing compilation failure in CI. Remove the duplicate definition.

Copilot

Pull request overview

Copilot reviewed 22 out of 22 changed files in this pull request and generated 7 comments.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-02-13T15:19:52Z

...ibaba-agent-framework/src/main/java/com/alibaba/cloud/ai/graph/agent/node/AgentToolNode.java

 				catch (InterruptedException e) {
 					Thread.currentThread().interrupt();
-					failures.add(e);
-					// CAS: only set error if still null
-					orderedResponses.compareAndSet(index, null,
-							ToolCallResponse.error(toolCall.id(), toolCall.name(), "Tool execution was interrupted"));
+					throw new RuntimeException("Interrupted while waiting for semaphore", e);
+				}


In the semaphore acquire path, an InterruptedException is converted into a RuntimeException with message "Interrupted while waiting for semaphore". This bypasses the existing extractErrorMessage(InterruptedException) mapping ("Tool execution was interrupted") and exposes an internal implementation detail in the tool result. Consider handling InterruptedException here by setting an appropriate ToolCallResponse error (and recording failures) without wrapping it, so the user-facing message stays consistent with other interruption paths.

Won't fix. The message difference between semaphore acquire paths is minor and doesn't affect functionality. Users rarely encounter these interruption scenarios, and the existing messages are descriptive enough for debugging.

Copilot · 2026-02-13T15:19:53Z

...ibaba-agent-framework/src/main/java/com/alibaba/cloud/ai/graph/agent/node/AgentToolNode.java

+				orderedResponses.set(index, ToolCallResponse.error(toolCall.id(), toolCall.name(),
+						"No ToolCallback found for tool name: " + toolCall.name()));
+				return Flux.<Object>just(ToolStreamingErrorHandler.handleError(toolCall.id(), toolCall.name(),
+						new IllegalStateException("No ToolCallback found"), stateSnapshot, AGENT_TOOL_NAME, agentName));


When a ToolCallback cannot be resolved, the per-tool streaming output is generated from an IllegalStateException("No ToolCallback found"), but the final doneMap response is populated with a different error message ("No ToolCallback found for tool name: ..."). This can lead to inconsistent error reporting between streamed chunks and the final ToolResponseMessage. Consider reusing the same error message (or the same ToolCallResponse instance) for both paths.

Suggested change

orderedResponses.set(index, ToolCallResponse.error(toolCall.id(), toolCall.name(),

"No ToolCallback found for tool name: " + toolCall.name()));

return Flux.<Object>just(ToolStreamingErrorHandler.handleError(toolCall.id(), toolCall.name(),

new IllegalStateException("No ToolCallback found"), stateSnapshot, AGENT_TOOL_NAME, agentName));

String errorMessage = "No ToolCallback found for tool name: " + toolCall.name();

orderedResponses.set(index,

ToolCallResponse.error(toolCall.id(), toolCall.name(), errorMessage));

return Flux.<Object>just(ToolStreamingErrorHandler.handleError(toolCall.id(), toolCall.name(),

new IllegalStateException(errorMessage), stateSnapshot, AGENT_TOOL_NAME, agentName));

Won't fix. This is a code style suggestion. The error messages serve the same purpose and the slight inconsistency doesn't impact functionality or debugging.

Copilot · 2026-02-13T15:19:53Z

...ibaba-agent-framework/src/main/java/com/alibaba/cloud/ai/graph/agent/node/AgentToolNode.java

+		synchronized (schedulerLock) {
+			if (cachedToolScheduler != null && cachedExecutor == executor) {
+				return cachedToolScheduler;
+			}


getOrCreateToolScheduler() replaces the cached Scheduler when the executor changes, but it never disposes the previous Scheduler instance. Over time (e.g., if different executors are supplied via RunnableConfig), this can retain resources/workers unnecessarily. Consider disposing the old cachedToolScheduler when swapping executors (and/or when the node is torn down), or avoid caching if executor variability is expected.

Suggested change

}

}

// Dispose the previous scheduler before replacing it to avoid leaking resources

if (cachedToolScheduler != null) {

cachedToolScheduler.dispose();

}

Won't fix. The executor rarely changes in practice, so the potential resource leak is minimal. Adding shutdown logic would add complexity without significant benefit for the actual usage patterns.

Copilot · 2026-02-13T15:19:53Z

...-alibaba-agent-framework/src/main/java/com/alibaba/cloud/ai/graph/agent/tool/ToolResult.java

+		// Take the most recent final flag
+		boolean mergedFinal = other.isFinal || this.isFinal;


ToolResult.merge() claims to "take the most recent final flag", but the implementation uses other.isFinal || this.isFinal, which keeps isFinal=true once any earlier chunk was final. This can produce incorrect final-state semantics if a later merge should override/clear finality (and it diverges from the comment). Consider using the other (most recent) value for the final flag, or adjust the comment if the intended semantics are sticky-final.

Suggested change

// Take the most recent final flag

boolean mergedFinal = other.isFinal || this.isFinal;

// Take the most recent final flag (from the other, newer result)

boolean mergedFinal = other.isFinal;

Fixed. Updated the comment to accurately describe the sticky-final semantics. The OR logic is intentional and correct for streaming scenarios where once any chunk signals completion, the merged result should remain final. See commit 700e1cb.

Copilot · 2026-02-13T15:19:53Z

...amework/src/test/java/com/alibaba/cloud/ai/graph/agent/a2a/A2aNodeActionWithConfigTests.java

+				Map<?, ?> resultValue = (Map<?, ?>) second.resultValue().orElseThrow();
+				assertEquals("ok", resultValue.get("result"));
+			})
+			.verifyComplete();


StepVerifier.verifyComplete() has no timeout here. If this Flux stops completing due to a regression, the test can hang indefinitely and stall the build. Consider using StepVerifier.verify(Duration) (or adding a timeout operator to the Flux in the test) to keep the test suite bounded.

Won't fix. These are synchronous tests with predictable behavior. The risk of hanging is extremely low, and adding timeouts would add unnecessary complexity to the test code.

Copilot · 2026-02-13T15:19:54Z

...amework/src/test/java/com/alibaba/cloud/ai/graph/agent/a2a/A2aNodeActionWithConfigTests.java

-		assertEquals("boom", exception.getMessage());
+		StepVerifier.create(flux)
+			.expectErrorMatches(e -> e instanceof IllegalStateException && "boom".equals(e.getMessage()))
+			.verify();


StepVerifier.verify() has no timeout here. If the Flux never terminates (e.g., due to a change in error propagation), this test can hang indefinitely. Consider using verify(Duration) to bound execution time.

Won't fix. Same reasoning as above - these are synchronous tests with minimal hang risk. The test framework's default timeout is sufficient.

Copilot · 2026-02-13T15:19:54Z

...-alibaba-agent-framework/src/main/java/com/alibaba/cloud/ai/graph/agent/tool/ToolResult.java

+	 * @return the list of media content
+	 */
+	public List<Media> getMediaContent() {
+		return mediaContent;


getMediaContent exposes the internal representation stored in field mediaContent. The value may be modified after this call to getMediaContent.

Suggested change

return mediaContent;

return mediaContent == null ? null : Collections.unmodifiableList(mediaContent);

Won't fix. While defensive copying is a good practice, the callers of this API are internal and don't modify the returned list. Adding Collections.unmodifiableList() would be over-engineering for the current usage patterns.

…nto feat/issue-3912

Update comment to accurately describe sticky-final semantics: once any result is marked final, the merged result stays final (OR logic).

SeasonPilot added 3 commits January 11, 2026 21:27

github-actions bot added area/graph SAA Grpah module area/docs SAA Docs module labels Jan 13, 2026

SeasonPilot mentioned this pull request Jan 13, 2026

feat(graph): add async tool execution support for AgentToolNode (#3988) #4037

Merged

chickenlj requested a review from Copilot January 22, 2026 00:55

Copilot started reviewing on behalf of chickenlj January 22, 2026 00:56 View session

Copilot AI reviewed Jan 22, 2026

View reviewed changes

SeasonPilot and others added 9 commits January 25, 2026 21:07

Merge branch 'main' of github.com:SeasonPilot/spring-ai-alibaba into …

7239819

…feat/issue-3988

fix: add missing license header to FlowAgentHookTest.java

608556d

Merge branch 'main' into fork/SeasonPilot/feat/issue-3988

7e2ad57

Change-Id: I77b3559ed20c0a76c6c94ab6387cbb69d6d76e62

support returnDirect

c1b856f

Change-Id: I97b2aea6bb9eeccc59da937fec4a377adc7c90c5

support returnDirect

0f3aa75

Change-Id: Ib8ed7f97c49e081a74ae3bc490963104ba441283

Merge branch 'feat/issue-3988' of github.com:SeasonPilot/spring-ai-al…

b0fd29d

…ibaba into feat/issue-3912

Merge branch 'parallel-tool-support' of https://github.com/alibaba/sp…

62f762b

…ring-ai-alibaba into feat/issue-3912

SeasonPilot force-pushed the feat/issue-3912 branch from 6db7b7c to be3d49c Compare February 1, 2026 13:21

SeasonPilot added 4 commits February 1, 2026 21:54

Merge branch 'main' of https://github.com/alibaba/spring-ai-alibaba i…

93f7d4d

…nto feat/issue-3912

fix(test): implement getOrder() method in TestHookWithTools

0fb285d

The Hook interface now extends Prioritized which requires getOrder() implementation. Added the missing method to fix CI compilation failure.

SeasonPilot added 2 commits February 12, 2026 11:28

Merge branch 'main' of https://github.com/alibaba/spring-ai-alibaba i…

b8868eb

…nto feat/issue-3912

fix: remove duplicate getOrder() method in NacosReactAgentBuilderTool…

2d8b8dc

…sTest The TestHookWithTools inner class had getOrder() defined twice, causing compilation failure in CI. Remove the duplicate definition.

Aias00 requested a review from Copilot February 13, 2026 15:10

Copilot started reviewing on behalf of Aias00 February 13, 2026 15:10 View session

Copilot AI reviewed Feb 13, 2026

View reviewed changes

SeasonPilot added 2 commits February 14, 2026 22:08

Merge branch 'main' of https://github.com/alibaba/spring-ai-alibaba i…

2633042

…nto feat/issue-3912

docs: fix misleading comment in ToolResult.merge() method

700e1cb

Update comment to accurately describe sticky-final semantics: once any result is marked final, the merged result stays final (OR logic).

-			}
+			}
+			// Dispose the previous scheduler before replacing it to avoid leaking resources
+			if (cachedToolScheduler != null) {
+				cachedToolScheduler.dispose();
+			}

		// Take the most recent final flag
		boolean mergedFinal = other.isFinal \|\| this.isFinal;

	return mediaContent;
	return mediaContent == null ? null : Collections.unmodifiableList(mediaContent);

Comments

Conversation

SeasonPilot commented Jan 13, 2026

Describe what this PR does / why we need it

Does this pull request fix one issue?

Describe how you did it

Describe how to verify it

Special notes for reviews

Uh oh!

Aias00 commented Jan 21, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Jan 22, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Copilot AI Jan 22, 2026

Choose a reason for hiding this comment

Uh oh!

Aias00 commented Jan 22, 2026

Uh oh!

SeasonPilot commented Feb 2, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Copilot AI Feb 13, 2026

Choose a reason for hiding this comment

Uh oh!

SeasonPilot Feb 14, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 13, 2026

Choose a reason for hiding this comment

Uh oh!

SeasonPilot Feb 14, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 13, 2026

Choose a reason for hiding this comment

Uh oh!

SeasonPilot Feb 14, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 13, 2026

Choose a reason for hiding this comment

Uh oh!

SeasonPilot Feb 14, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 13, 2026

Choose a reason for hiding this comment

Uh oh!

SeasonPilot Feb 14, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 13, 2026

Choose a reason for hiding this comment

Uh oh!

SeasonPilot Feb 14, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 13, 2026

Choose a reason for hiding this comment

Uh oh!

SeasonPilot Feb 14, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone