refactor(CodeGeneratorNode): Code Extraction Improvements #38

Merged
canfuu merged 5 commits into main from feature/fix_execute_code_error on Mar 4, 2026

canfuu (Collaborator) commented Mar 4, 2026

This pull request introduces robust code extraction and error logging enhancements for the agent's code generation and execution modules. The main improvements focus on reliably parsing Python code from LLM outputs (handling markdown, natural language, and edge cases), and providing better diagnostics for code execution errors by logging the context around error lines. Comprehensive unit tests have been added to ensure the extraction logic works across real-world scenarios.

Code Extraction Improvements

  • Refactored CodeGeneratorNode to include a new extractCodeFromContent method that robustly parses Python code from LLM outputs, supporting markdown code blocks, natural language prefixes, multiple code blocks (returns the last), and fallback strategies.
  • Added imports for Pattern and Matcher to support regex-based extraction logic.
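
The extraction strategy described in the bullets above can be sketched roughly as follows. This is a minimal illustration and not the merged implementation: the class name `CodeExtractionSketch`, the exact regex, and the "return the trimmed input" fallback are assumptions made for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; the real extractCodeFromContent in CodeGeneratorNode may differ.
final class CodeExtractionSketch {

    // Matches a fence of three backticks, optionally tagged python/py.
    // group(1) is the body between the opening and closing fence.
    private static final Pattern FENCED_BLOCK =
            Pattern.compile("`{3}(?:python|py)?[ \\t]*\\r?\\n(.*?)`{3}", Pattern.DOTALL);

    static String extractCodeFromContent(String content) {
        if (content == null) {
            return "";
        }
        Matcher matcher = FENCED_BLOCK.matcher(content);
        String lastBlock = null;
        while (matcher.find()) {
            lastBlock = matcher.group(1); // later blocks overwrite earlier ones
        }
        if (lastBlock != null && !lastBlock.trim().isEmpty()) {
            return lastBlock.trim();
        }
        // Assumed fallback: no usable fence found, so treat the whole response as code.
        return content.trim();
    }
}
```

Natural-language prefixes are handled implicitly here because only the text inside the fence is returned; the merged method reportedly applies further fallback strategies that this sketch does not attempt to reproduce.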

Testing

  • Added a comprehensive test suite in CodeGeneratorNodeExtractCodeTest.java covering standard markdown, natural language, multiple code blocks, no markdown, edge cases, and real-world LLM output formats.
  • Added JUnit Jupiter as a test dependency in pom.xml.

Error Logging and Diagnostics

  • Enhanced GraalCodeExecutor to log the code context around error lines when execution fails, extracting line numbers from exception messages and printing surrounding code for easier debugging.
  • Updated error handling in both execute and executeDirect methods to call the new logging function, and improved error log messages for clarity.
  • Refactored code variable assignments to support improved error context logging.
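
As an illustration of the error-context logging described above, the sketch below formats the lines surrounding a reported error line. It assumes the 1-based line number has already been extracted from the exception message; the class name `ErrorContextSketch`, the `radius` parameter, and the `>>` marker are hypothetical and not taken from `GraalCodeExecutor`.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of "print the code around the failing line".
final class ErrorContextSketch {

    // Returns the numbered lines within `radius` lines of the 1-based errorLine.
    // The failing line itself is prefixed with ">>".
    static List<String> contextAround(String code, int errorLine, int radius) {
        String[] lines = code.split("\\r?\\n", -1);
        int from = Math.max(1, errorLine - radius);
        int to = Math.min(lines.length, errorLine + radius);
        List<String> out = new ArrayList<>();
        for (int n = from; n <= to; n++) {
            String marker = (n == errorLine) ? ">>" : "  ";
            out.add(String.format("%s %4d | %s", marker, n, lines[n - 1]));
        }
        return out; // empty when errorLine lies far outside the code
    }
}
```

A caller would typically pass each returned line to its logger at error level; that wiring is omitted because the logger setup in the real class is not shown in this conversation.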

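The previously inline code extraction logic has been extracted into a standalone method, `extractCodeFromContent()`, with detailed comments describing each extraction strategy. In addition, error logging in `GraalCodeExecutor` has been enhanced so that the location of an execution error can be pinpointed more precisely. Unit tests have also been added to verify the new code extraction behavior.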
Copilot AI review requested due to automatic review settings March 4, 2026 13:22
canfuu added 4 commits March 4, 2026 21:23
This commit improves the readability and maintainability of the test file by updating comments and assertions to English while preserving their technical meaning. The changes include renaming nested classes, updating test method names and descriptions, modifying assertion messages, and standardizing variable values across different test scenarios. These updates make the tests more accessible to international developers without altering any functional logic.

The primary goal is to enhance clarity and consistency in documentation and error messaging within the testing framework. By aligning these elements with common practices in software development, we aim to reduce potential misunderstandings and facilitate easier collaboration among team members working on the project.
@canfuu canfuu enabled auto-merge March 4, 2026 13:25
Copilot AI left a comment

Pull request overview

This PR improves the agent’s Python code generation/execution robustness by (1) extracting executable Python code more reliably from LLM responses and (2) enhancing GraalPython execution failure diagnostics with surrounding code context logging, backed by new unit tests.

Changes:

  • Refactors CodeGeneratorNode to centralize robust code extraction via extractCodeFromContent.
  • Adds error-context logging in GraalCodeExecutor to print code lines around the reported error line.
  • Introduces a comprehensive JUnit 5 test suite for extraction scenarios and adds the test dependency.

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 3 comments.

File and description:

  • assistant-agent-core/src/main/java/com/alibaba/assistant/agent/core/executor/GraalCodeExecutor.java: Adds error-line extraction + surrounding-code logging on execution failures.
  • assistant-agent-autoconfigure/src/main/java/com/alibaba/assistant/agent/autoconfigure/subagent/node/CodeGeneratorNode.java: Adds extractCodeFromContent to robustly strip markdown/natural language and select code blocks.
  • assistant-agent-autoconfigure/src/test/java/com/alibaba/assistant/agent/autoconfigure/subagent/node/CodeGeneratorNodeExtractCodeTest.java: Adds extraction coverage across common/edge LLM output formats.
  • assistant-agent-autoconfigure/pom.xml: Adds JUnit Jupiter test dependency for the new test suite.

Comments suppressed due to low confidence (1)

assistant-agent-core/src/main/java/com/alibaba/assistant/agent/core/executor/GraalCodeExecutor.java:862

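  • When the line number cannot be extracted from the exception, this logs all of the code to be executed (with line numbers) at error level. If the generated code is long, this floods the logs (and may write sensitive context into production logs). Consider printing only the first/last N lines and capping the total character count (or downgrading to debug), and state explicitly in the log that the output was truncated.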
		} else {
			return value.toString();
		}
	}

	private String getStackTrace(Exception e) {
		ByteArrayOutputStream baos = new ByteArrayOutputStream();
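
The truncation proposed in the suppressed comment above could look roughly like this. It is a sketch of the reviewer's suggestion only, not code from the pull request: the class name `TruncatedCodeLogSketch`, the `keep` parameter, and the wording of the truncation notice are all assumptions, and the character-count cap mentioned in the comment is left out for brevity.

```java
// Hypothetical sketch: keep only the first and last `keep` lines of a long script.
final class TruncatedCodeLogSketch {

    static String headAndTail(String code, int keep) {
        String[] lines = code.split("\\r?\\n", -1);
        boolean truncate = lines.length > 2 * keep;
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < lines.length; i++) {
            if (truncate && i == keep) {
                int skipped = lines.length - 2 * keep;
                // Say explicitly that lines were dropped, as the review comment asks.
                sb.append("... [").append(skipped).append(" lines truncated] ...\n");
                i = lines.length - keep - 1; // the loop increment lands on the first tail line
                continue;
            }
            sb.append(i + 1).append(" | ").append(lines[i]).append('\n');
        }
        return sb.toString();
    }
}
```

Original line numbers are preserved on the tail lines, so the output can still be matched against a line number reported elsewhere in the log.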

Comment on lines +885 to +895
	if (matcher.find()) {
		// If there are multiple code blocks, take the last one (the final version usually comes last)
		String lastBlock = matcher.group(1);
		while (matcher.find()) {
			lastBlock = matcher.group(1);
		}
		String code = lastBlock.trim();
		if (!code.isEmpty()) {
			return code;
		}
	}
Copilot AI Mar 4, 2026

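When extractCodeFromContent detects multiple fenced code blocks, it always takes the last match. If the last block is empty (for example, the model emitted an empty python block at the end), the earlier non-empty blocks are discarded and control falls through to the fallback logic, which may end up returning the original text, markdown and natural language included. Consider tracking the last non-empty code block while iterating over the matcher and returning that block in preference.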

Suggested change
- if (matcher.find()) {
- 	// If there are multiple code blocks, take the last one (the final version usually comes last)
- 	String lastBlock = matcher.group(1);
- 	while (matcher.find()) {
- 		lastBlock = matcher.group(1);
- 	}
- 	String code = lastBlock.trim();
- 	if (!code.isEmpty()) {
- 		return code;
- 	}
- }
+ // Iterate over all matches and keep track of the last non-empty code block
+ String lastNonEmptyCodeBlock = null;
+ while (matcher.find()) {
+ 	String block = matcher.group(1);
+ 	if (block != null) {
+ 		String candidate = block.trim();
+ 		if (!candidate.isEmpty()) {
+ 			lastNonEmptyCodeBlock = candidate;
+ 		}
+ 	}
+ }
+ if (lastNonEmptyCodeBlock != null) {
+ 	return lastNonEmptyCodeBlock;
+ }

Comment on lines +872 to +874
* <ul>
* <li>GraalPython: {@code SyntaxError: ... (Unnamed, line 21)}</li>
* <li>GraalPython: {@code ... at line 21}</li>
Copilot AI Mar 4, 2026

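extractErrorLineNumber calls Pattern.compile(...) again on every invocation. Although this only happens on the exception path, consider extracting the pattern into a private static final Pattern for reuse, which avoids repeated compilation and makes the intent clearer.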
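
A minimal sketch of that refactor, assuming the two message shapes quoted in the Javadoc above (`(Unnamed, line 21)` and `at line 21`). The class name `ErrorLineSketch`, the single combined regex, and the `OptionalInt` return type are illustrative choices, not the signature used in `GraalCodeExecutor`.

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of extractErrorLineNumber with a precompiled pattern.
final class ErrorLineSketch {

    // Compiled once at class load instead of on every call.
    // Digits are capped at 9 so Integer.parseInt cannot overflow.
    private static final Pattern LINE_NUMBER = Pattern.compile("\\bline (\\d{1,9})");

    static OptionalInt extractErrorLineNumber(String message) {
        if (message == null) {
            return OptionalInt.empty();
        }
        Matcher m = LINE_NUMBER.matcher(message);
        return m.find()
                ? OptionalInt.of(Integer.parseInt(m.group(1)))
                : OptionalInt.empty();
    }
}
```

`Pattern` instances are immutable and safe to share across threads, so a static constant is fine here; only the `Matcher` is created per call.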

<dependency>
<groupId>org.junit.jupiter</groupId>
<artifactId>junit-jupiter</artifactId>
<version>5.11.0</version>
Copilot AI Mar 4, 2026

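This module's parent POM already imports spring-boot-dependencies (BOM) through the root pom.xml to manage test dependency versions centrally, and the other modules in the repository pull in junit-jupiter without declaring an explicit version. Hard-coding 5.11.0 here may cause version drift or conflicts (especially during dependency convergence across modules). Consider removing <version> and letting the BOM manage it.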

Suggested change
- <version>5.11.0</version>

AQing-527 (Collaborator) left a comment

LGTM

@canfuu canfuu merged commit b5d5082 into main Mar 4, 2026
1 check passed
@canfuu canfuu deleted the feature/fix_execute_code_error branch March 4, 2026 13:42

3 participants