Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
190 changes: 190 additions & 0 deletions REACT_MASTER_V2_LOOP_BUG_FIX.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,190 @@
# ReActMasterV2 循环执行工具 Bug 修复报告

## 问题概述

ReActMasterV2 Agent 在执行过程中出现循环调用同一个工具的问题,导致任务无法正常完成。

### 现象

用户询问"今天12点30是否有系统异常"时,Agent 反复执行同一个工具调用:

```
view {"path": "/Users/tuyang/GitHub/OpenDerisk/pilot/data/skill/open_rca_diagnosis/SKILL.md"}
```

这个工具被调用了 3 次以上,形成无限循环。

## 根本原因

### 问题定位

在 `packages/derisk-core/src/derisk/agent/core/base_agent.py` 的 `generate_reply` 方法中(line 820-827),存在一个条件判断错误:

```python
if self.current_retry_counter > 0:
if self.run_mode != AgentRunMode.LOOP: # ❌ 问题所在
if self.enable_function_call:
tool_messages = self.function_callning_reply_messages(
agent_llm_out, act_outs
)
all_tool_messages.extend(tool_messages)
```

### 问题分析

1. **ReActMasterV2 的运行模式**:
- ReActMasterV2 使用 `AgentRunMode.LOOP` 模式
- 这意味着它会循环执行多个迭代,直到任务完成

2. **Bug 的影响**:
- 条件 `self.run_mode != AgentRunMode.LOOP` 导致 LOOP 模式的 Agent **不会**将工具调用结果追加到 `all_tool_messages`
- 结果:LLM 在每次迭代时都看不到之前的工具调用结果
- LLM 认为还没有调用过工具,于是再次调用同一个工具
- 形成无限循环

3. **为什么 WorkLog 没起作用**:
- WorkLog 确实记录了工具调用(通过 `_record_action_to_work_log`)
- 但 WorkLog 的注入只在循环开始前执行一次(条件 `self.current_retry_counter == 0`)
- 在 LOOP 模式的后续迭代中,WorkLog 不会被重新获取
- 即使 WorkLog 记录了工具调用,它也不会被转换为 tool_messages 传给 LLM

## 修复方案

### 代码修改

移除 `self.run_mode != AgentRunMode.LOOP` 条件,让所有模式的 Agent 都能接收工具调用结果:

**修改前(BUGGY)**:
```python
if self.current_retry_counter > 0:
if self.run_mode != AgentRunMode.LOOP: # ❌ 移除这个条件
if self.enable_function_call:
tool_messages = self.function_callning_reply_messages(
agent_llm_out, act_outs
)
all_tool_messages.extend(tool_messages)
```

**修改后(FIXED)**:
```python
if self.current_retry_counter > 0:
if self.enable_function_call: # ✅ 所有模式都执行
tool_messages = self.function_callning_reply_messages(
agent_llm_out, act_outs
)
all_tool_messages.extend(tool_messages)
```

### 修复文件

- **文件路径**:`packages/derisk-core/src/derisk/agent/core/base_agent.py`
- **修改行**:Line 821
- **修改类型**:移除条件判断

## 修复效果

### 修复前的行为

```
Iteration 1:
- LLM 调用 view("/path/to/SKILL.md")
- 结果:技能文件内容
- ❌ 结果未添加到 all_tool_messages(因为是 LOOP 模式)

Iteration 2:
- LLM prompt:与 iteration 1 相同(没有工具结果可见)
- LLM 认为:"我应该加载技能文件"
- LLM 再次调用 view("/path/to/SKILL.md") ← 相同调用
- ❌ 结果未添加到 all_tool_messages

Iteration 3:
- 与 iteration 2 相同
- 无限循环!
```

### 修复后的行为

```
Iteration 1:
- LLM 调用 view("/path/to/SKILL.md")
- 结果:技能文件内容
- ✅ 结果添加到 all_tool_messages

Iteration 2:
- LLM prompt:包含 iteration 1 的工具结果
- LLM 看到:"我已经加载了技能文件,现在应该..."
- LLM 根据技能内容调用下一个工具
- 结果:分析数据
- ✅ 结果添加到 all_tool_messages

Iteration 3:
- LLM prompt:包含 iterations 1 和 2 的结果
- LLM 做出最终决策
- 调用 terminate 完成任务
```

## 验证

### 诊断脚本

创建了两个诊断脚本:

1. **`diagnose_loop_tool_messages.py`**:检测 bug 是否存在
2. **`verify_loop_fix.py`**:验证修复是否成功

### 验证结果

```bash
$ python3 verify_loop_fix.py

✅ FIX APPLIED: Buggy condition has been removed
✅ CORRECT CODE: Tool messages are now appended for all modes
```

## 影响范围

### 受影响的 Agent

- **ReActMasterV2**:主要受影响的 Agent
- **所有使用 AgentRunMode.LOOP 模式的 Agent**

### 受益的功能

- ✅ 工具调用结果现在能正确传递给 LLM
- ✅ LLM 能基于历史结果做出明智决策
- ✅ 防止因 LLM 不知道工具已调用而导致的无限循环
- ✅ WorkLog 记录现在能通过 tool_messages 对 LLM 可见

## 后续步骤

1. **重启服务器**:应用代码修改
2. **测试验证**:
- 使用之前导致循环的查询进行测试
- 验证工具结果现在在 LLM prompt 中可见
- 确认任务能正常完成

3. **监控**:
- 观察 ReActMasterV2 的执行日志
- 确认不再出现重复工具调用
- 验证任务完成效率提升

## 总结

这个 bug 是一个典型的"上下文丢失"问题:

- **症状**:Agent 循环调用同一个工具
- **根因**:LOOP 模式的 Agent 在迭代间丢失了工具调用结果
- **修复**:移除错误的条件判断,让所有模式都能接收工具结果
- **效果**:Agent 现在能基于历史结果做出正确决策,避免无限循环

修复后,ReActMasterV2 将能够:
- 正确执行多步骤任务
- 基于前序工具结果做决策
- 高效完成任务,不再陷入循环

---

**修复日期**:2026-03-09
**修复文件**:`packages/derisk-core/src/derisk/agent/core/base_agent.py`
**修复行数**:Line 821
**修复类型**:移除条件判断 `self.run_mode != AgentRunMode.LOOP`
195 changes: 195 additions & 0 deletions diagnose_loop_tool_messages.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,195 @@
#!/usr/bin/env python3
"""
Diagnostic script to verify ReActMasterV2 loop tool_messages bug.

Issue: In AgentRunMode.LOOP mode, tool call results are NOT appended to all_tool_messages,
causing LLM to repeatedly call the same tool because it doesn't see previous results.

Root Cause: Line 821 in base_agent.py has condition `self.run_mode != AgentRunMode.LOOP`
which skips appending tool_messages for LOOP mode agents.

Expected Behavior: Tool call results should be appended to all_tool_messages for ALL modes.
"""

import sys
from pathlib import Path

# Add project paths
_project_root = Path(__file__).parent
sys.path.insert(0, str(_project_root / "packages/derisk-core/src"))


def check_base_agent_code():
"""Check if the bug exists in base_agent.py"""
print("=" * 80)
print("Checking base_agent.py for LOOP mode tool_messages bug")
print("=" * 80)

base_agent_path = (
_project_root / "packages/derisk-core/src/derisk/agent/core/base_agent.py"
)

if not base_agent_path.exists():
print(f"❌ File not found: {base_agent_path}")
return False

with open(base_agent_path, "r") as f:
lines = f.readlines()

# Find the problematic code section (around line 820-827)
print("\n📍 Checking lines 820-827 for the bug condition:\n")

bug_found = False
for i in range(819, min(828, len(lines))):
line = lines[i]
line_num = i + 1
print(f" {line_num:4d}: {line.rstrip()}")

# Check for the bug condition
if "if self.run_mode != AgentRunMode.LOOP:" in line:
bug_found = True
print(
"\n ⚠️ BUG FOUND: This condition prevents LOOP mode agents from getting tool_messages!"
)

print("\n" + "-" * 80)

if bug_found:
print("❌ BUG CONFIRMED: LOOP mode agents will NOT receive tool call results")
print("\n🔧 Impact:")
print(" - ReActMasterV2 (LOOP mode) will repeatedly call the same tool")
print(" - LLM doesn't see previous tool results in next iteration")
print(" - WorkLog records tools but doesn't inject them to LLM prompt")
print("\n💡 Fix: Remove the 'self.run_mode != AgentRunMode.LOOP' condition")
print(" OR handle LOOP mode specially to inject tool messages")
else:
print("✅ No bug found in this section (may have been fixed)")

return bug_found


def explain_the_bug():
"""Explain the bug in detail"""
print("\n" + "=" * 80)
print("DETAILED BUG EXPLANATION")
print("=" * 80)

print("""
## Problem

ReActMasterV2 uses AgentRunMode.LOOP mode to execute multiple iterations.
In each iteration, it should:
1. Call a tool
2. Get result
3. Pass result to LLM in next iteration
4. LLM decides next action based on results

## What Actually Happens

In base_agent.py generate_reply() method (line 820-827):

if self.current_retry_counter > 0:
if self.run_mode != AgentRunMode.LOOP: # ⚠️ PROBLEM: This excludes LOOP mode!
if self.enable_function_call:
tool_messages = self.function_callning_reply_messages(agent_llm_out, act_outs)
all_tool_messages.extend(tool_messages) # ❌ NOT executed for LOOP mode

Result:
- For LOOP mode agents, tool_messages are NEVER appended to all_tool_messages
- LLM sees the SAME context in each iteration (no tool results)
- LLM calls the same tool again → infinite loop

## Why WorkLog Doesn't Help

WorkLog injection happens only ONCE at the start (line 798-804):

if self.enable_function_call and self.current_retry_counter == 0:
worklog_messages = await self._get_worklog_tool_messages()
all_tool_messages.extend(worklog_messages)

The condition `self.current_retry_counter == 0` means WorkLog is only fetched once.
In subsequent LOOP iterations, WorkLog is NOT re-fetched.

## Solution

Remove the `self.run_mode != AgentRunMode.LOOP` condition to allow LOOP mode agents
to receive tool call results in each iteration:

if self.current_retry_counter > 0:
if self.enable_function_call:
tool_messages = self.function_callning_reply_messages(agent_llm_out, act_outs)
all_tool_messages.extend(tool_messages)
""")


def suggest_fix():
"""Suggest the fix"""
print("\n" + "=" * 80)
print("SUGGESTED FIX")
print("=" * 80)

print("""
## File: packages/derisk-core/src/derisk/agent/core/base_agent.py

## Location: Line 820-827

## Current Code (BUGGY):
```python
if self.current_retry_counter > 0:
if self.run_mode != AgentRunMode.LOOP: # ❌ Remove this condition
if self.enable_function_call:
tool_messages = self.function_callning_reply_messages(agent_llm_out, act_outs)
all_tool_messages.extend(tool_messages)
```

## Fixed Code:
```python
if self.current_retry_counter > 0:
if self.enable_function_call:
tool_messages = self.function_callning_reply_messages(agent_llm_out, act_outs)
all_tool_messages.extend(tool_messages)
```

## Why This Works:
- Removes the LOOP mode exclusion
- All agents (including ReActMasterV2) will now receive tool call results
- LLM can see previous tool results and make informed decisions
- Prevents infinite loops caused by LLM not knowing tools were already called
""")


def main():
print("\n" + "🔍" * 40)
print("ReActMasterV2 LOOP Mode Tool Messages Bug Diagnostic")
print("🔍" * 40 + "\n")

# Check for the bug
bug_exists = check_base_agent_code()

# Explain the bug
explain_the_bug()

# Suggest fix
suggest_fix()

# Summary
print("\n" + "=" * 80)
print("SUMMARY")
print("=" * 80)

if bug_exists:
print("❌ Bug confirmed in base_agent.py line 821")
print("✅ Fix: Remove 'self.run_mode != AgentRunMode.LOOP' condition")
print(
"\nThis will resolve the issue where ReActMasterV2 repeatedly calls tools"
)
print("without seeing previous results, causing infinite loops.")
return 1
else:
print("✅ Bug may have been fixed or code has changed")
print("Please verify manually that LOOP mode agents receive tool messages")
return 0


if __name__ == "__main__":
sys.exit(main())
Loading