feat: 为评估项新增超时时间与默认值配置 by AQing-527 · Pull Request #35 · spring-ai-alibaba/AssistantAgent

AQing-527 · 2026-03-04T05:17:28Z

为评估项新增超时时间与默认值配置
修复GraalCodeExecutor中的序列化问题

canfuu

LGTM

Copilot

Pull request overview

该 PR 为评估框架的单个评估项（criterion）增加超时与失败兜底能力，并补充 LLM 调用参数（ChatOptions）与 GraalCodeExecutor 的复杂对象序列化处理，同时下调部分 schema 查询日志级别以降低噪音。

Changes:

为 EvaluationSuite / EvaluationCriterion 增加默认超时、单项超时与默认值（timeout/defaultValue）配置，并在执行器中实现超时保护与兜底返回。
LLM 评估器（文本/多模态）新增可选 ChatOptions 透传，支持调用时自定义模型参数。
GraalCodeExecutor 将复杂对象先 JSON 序列化再转为 Python 字面量，改善直接 toString() 导致的序列化问题；并将部分 schema 查询日志降为 debug。

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 4 comments.

Show a summary per file

File	Description
assistant-agent-evaluation/src/main/java/com/alibaba/assistant/agent/evaluation/model/EvaluationSuite.java	增加 suite 级默认 criterion 超时配置与访问器
assistant-agent-evaluation/src/main/java/com/alibaba/assistant/agent/evaluation/model/EvaluationCriterion.java	增加 criterion 级 timeoutMs 与 defaultValue 字段
assistant-agent-evaluation/src/main/java/com/alibaba/assistant/agent/evaluation/executor/CriterionEvaluationAction.java	执行 criterion 时增加超时保护；timeout/error 时回填 defaultValue 并写入状态
assistant-agent-evaluation/src/main/java/com/alibaba/assistant/agent/evaluation/evaluator/LLMBasedEvaluator.java	支持可选 ChatOptions 构造与 Prompt 调用透传
assistant-agent-evaluation/src/main/java/com/alibaba/assistant/agent/evaluation/evaluator/MultimodalLLMBasedEvaluator.java	多模态评估支持可选 ChatOptions（文本与多模态各自 options）
assistant-agent-evaluation/src/main/java/com/alibaba/assistant/agent/evaluation/builder/EvaluationCriterionBuilder.java	builder 增加 timeout/defaultValue 相关便捷方法
assistant-agent-core/src/main/java/com/alibaba/assistant/agent/core/tool/schema/DefaultReturnSchemaRegistry.java	`getSchema` 查询日志由 info 降为 debug
assistant-agent-core/src/main/java/com/alibaba/assistant/agent/core/executor/GraalCodeExecutor.java	复杂对象使用 Jackson JSON 序列化后再转换为 Python 字面量，修复序列化问题
assistant-agent-autoconfigure/src/main/java/com/alibaba/assistant/agent/autoconfigure/subagent/node/CodeGeneratorNode.java	系统提示词新增“禁止生成注释”规则，并将 schema 查询日志降为 debug

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-03-04T05:24:46Z

...src/main/java/com/alibaba/assistant/agent/evaluation/executor/CriterionEvaluationAction.java

+                    result = CompletableFuture.supplyAsync(() -> {
+                        if (finalBatchEnabled) {
+                            return executeWithBatching(finalContext, finalDepResults, finalBatchConfig);
+                        } else {
+                            return executeWithoutBatching(finalContext, finalDepResults);
+                        }
+                    }, executorService).get(timeoutMs, TimeUnit.MILLISECONDS);


Timeout handling uses CompletableFuture.get(timeout…), but on TimeoutException the underlying task is left running. This can leak work and exhaust the executor (especially if many criteria time out). Keep a reference to the future and cancel it (future.cancel(true)) in the timeout path, and consider cleaning up any resources the evaluator may hold.

Suggested change

result = CompletableFuture.supplyAsync(() -> {

if (finalBatchEnabled) {

return executeWithBatching(finalContext, finalDepResults, finalBatchConfig);

} else {

return executeWithoutBatching(finalContext, finalDepResults);

}

}, executorService).get(timeoutMs, TimeUnit.MILLISECONDS);

CompletableFuture<CriterionResult> future = CompletableFuture.supplyAsync(() -> {

if (finalBatchEnabled) {

return executeWithBatching(finalContext, finalDepResults, finalBatchConfig);

} else {

return executeWithoutBatching(finalContext, finalDepResults);

}

}, executorService);

try {

result = future.get(timeoutMs, TimeUnit.MILLISECONDS);

} catch (TimeoutException te) {

future.cancel(true);

throw te;

}

Copilot · 2026-03-04T05:24:47Z

...src/main/java/com/alibaba/assistant/agent/evaluation/executor/CriterionEvaluationAction.java

        } catch (Exception e) {
            logger.error("Error executing criterion {}: {}", criterion.getName(), e.getMessage(), e);

-            // Create error result
-            CriterionResult errorResult = new CriterionResult();
-            errorResult.setCriterionName(criterion.getName());
-            errorResult.setStatus(CriterionStatus.ERROR);
-            errorResult.setErrorMessage(e.getMessage());
-            errorResult.setStartTimeMillis(System.currentTimeMillis());
-            errorResult.setEndTimeMillis(System.currentTimeMillis());
+            // Create error result with default value if available
+            CriterionResult errorResult = buildErrorResult(startTime, e.getMessage());



The new Future#get(timeout…) path can throw InterruptedException. The current generic catch(Exception) will swallow the interrupt and continue; this can break cancellation/shutdown behavior. Handle InterruptedException explicitly by restoring the interrupt flag (Thread.currentThread().interrupt()) and building an appropriate result (or rethrow).

Copilot · 2026-03-04T05:24:47Z

.../main/java/com/alibaba/assistant/agent/evaluation/evaluator/MultimodalLLMBasedEvaluator.java

+        super(textModel, evaluatorId, textChatOptions);
        this.multimodalChatModel = multimodalModel;
+        this.multimodalChatOptions = multimodalChatOptions;
        this.objectMapper = objectMapper != null ? objectMapper : new ObjectMapper();


This constructor falls back to new ObjectMapper() when the provided mapper is null, but the class relies on a mapper configured to ignore unknown properties (see createDefaultObjectMapper()). Using a plain ObjectMapper here can reintroduce the deserialization failures this class is designed to avoid; default to createDefaultObjectMapper() instead.

Suggested change

this.objectMapper = objectMapper != null ? objectMapper : new ObjectMapper();

this.objectMapper = objectMapper != null ? objectMapper : createDefaultObjectMapper();

Copilot · 2026-03-04T05:24:47Z

...nt-agent-core/src/main/java/com/alibaba/assistant/agent/core/executor/GraalCodeExecutor.java

+			logger.warn("GraalCodeExecutor#convertComplexObjectToPythonLiteral - reason=JSON序列化失败, " +
+					"valueType={}, error={}, 退化为toString()", value.getClass().getName(), e.getMessage());
+			String strValue = value.toString();
+			return "\"" + strValue.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";


In the JSON-serialization fallback, the string is wrapped with normal quotes but newlines aren’t handled. If value.toString() contains '\n', the generated Python literal becomes invalid code. Reuse the same multiline-safe escaping logic as the String branch (triple quotes when needed) for this fallback path too.

Suggested change

return "\"" + strValue.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";

// 使用通用的字符串转换逻辑，确保多行字符串也能安全转换为 Python 字面量

return toPythonLiteral(strValue);

feat: 为评估项新增超时时间与默认值配置

2f4fb06

Copilot AI review requested due to automatic review settings March 4, 2026 05:17

Copilot started reviewing on behalf of AQing-527 March 4, 2026 05:17 View session

canfuu approved these changes Mar 4, 2026

View reviewed changes

Copilot AI reviewed Mar 4, 2026

View reviewed changes

Merge branch 'main' into feature/evaluation_with_timeout

cfa8be1

AQing-527 merged commit e1b7dec into main Mar 4, 2026
1 check passed

AQing-527 deleted the feature/evaluation_with_timeout branch March 4, 2026 06:20

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: 为评估项新增超时时间与默认值配置#35

feat: 为评估项新增超时时间与默认值配置#35
AQing-527 merged 2 commits intomainfrom
feature/evaluation_with_timeout

AQing-527 commented Mar 4, 2026

Uh oh!

canfuu left a comment

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Mar 4, 2026

Uh oh!

Copilot AI Mar 4, 2026

Uh oh!

Copilot AI Mar 4, 2026

Uh oh!

Copilot AI Mar 4, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

	this.objectMapper = objectMapper != null ? objectMapper : new ObjectMapper();
	this.objectMapper = objectMapper != null ? objectMapper : createDefaultObjectMapper();

	return "\"" + strValue.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
	// 使用通用的字符串转换逻辑，确保多行字符串也能安全转换为 Python 字面量
	return toPythonLiteral(strValue);

Conversation

AQing-527 commented Mar 4, 2026

Uh oh!

canfuu left a comment

Choose a reason for hiding this comment

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Mar 4, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 4, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 4, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 4, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants