Commit ddcdcfb: add a readme in tuner

1 parent b8a5df6

File tree

4 files changed: +64 -2 lines

tuner/README.md (new file: 27 additions, 0 deletions)

# AgentScope Tuner

This directory contains several examples of how to use the AgentScope Tuner for tuning AgentScope applications. The table below summarizes the available examples:

| Example Name  | Description                                                                         | Example Path                     | Multi-step Interaction | LLM-as-a-Judge | Tool-use | Multi-Agent | Data Augmentation |
|---------------|-------------------------------------------------------------------------------------|----------------------------------|------------------------|----------------|----------|-------------|-------------------|
| Math Agent    | A quick-start example for tuning a math-solving agent to enhance its capabilities.  | [math_agent](./math_agent)       |                        |                |          |             |                   |
| Frozen Lake   | Train an agent to navigate the Frozen Lake environment in multi-step interactions.  | [frozen_lake](./frozen_lake)     |                        |                |          |             |                   |
| Learn to Ask  | Use an LLM as a judge to provide feedback that facilitates agent tuning.            | [learn_to_ask](./learn_to_ask)   |                        |                |          |             |                   |
| Email Search  | Enhance your agent's tool-use ability on tasks without ground truth.                | [email_search](./email_search)   |                        |                |          |             |                   |
| Werewolf Game | Enhance the agent's performance in a multi-agent game setting.                      | [werewolf_game](./werewolf_game) |                        |                |          |             |                   |
| Data Augment  | Data augmentation for better tuning results.                                        | [data_augment](./data_augment)   |                        |                |          |             |                   |

Each example contains a README file with detailed instructions on how to set up and run the tuning process for that specific scenario. Feel free to explore and modify the examples to suit your needs!

## Prerequisites

AgentScope Tuner requires:

- Python 3.10 or higher
- `agentscope>=1.0.12`
- `trinity-rft>=0.4.1`

AgentScope Tuner is built on top of [Trinity-RFT](https://github.com/modelscope/Trinity-RFT). Please refer to the [Trinity-RFT installation guide](https://modelscope.github.io/Trinity-RFT/en/main/tutorial/trinity_installation.html) for detailed instructions on how to set up the environment.
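One plausible way to satisfy these prerequisites is sketched below; the virtual-environment step is an assumption, and the version pins mirror `tuner/requirements.txt` (follow the Trinity-RFT guide for the authoritative steps):

```shell
# Create and activate an isolated environment (Python 3.10+ required)
python3 -m venv .venv
source .venv/bin/activate

# Install the versions pinned in tuner/requirements.txt
pip install "agentscope[full]>=1.0.12" "trinity-rft>=0.4.1"
```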

tuner/README_zh.md (new file: 26 additions, 0 deletions)

# AgentScope Tuner (Chinese Documentation)

This directory contains several examples of tuning AgentScope applications with the AgentScope Tuner. The table below summarizes the available examples:

| Example Name  | Description                                                                     | Example Path                     | Multi-step Interaction | LLM-as-a-Judge | Tool-use | Multi-Agent | Data Augmentation |
|---------------|---------------------------------------------------------------------------------|----------------------------------|------------------------|----------------|----------|-------------|-------------------|
| Math Agent    | A quick-start example for tuning a math agent to improve its capabilities.      | [math_agent](./math_agent)       |                        |                |          |             |                   |
| Frozen Lake   | Have an agent navigate the Frozen Lake environment in multi-step interactions.  | [frozen_lake](./frozen_lake)     |                        |                |          |             |                   |
| Learn to Ask  | Use an LLM as a judge to provide feedback for agent tuning.                     | [learn_to_ask](./learn_to_ask)   |                        |                |          |             |                   |
| Email Search  | Improve the agent's tool-use ability on tasks without ground-truth answers.     | [email_search](./email_search)   |                        |                |          |             |                   |
| Werewolf Game | Improve the agent's performance in a multi-agent game setting.                  | [werewolf_game](./werewolf_game) |                        |                |          |             |                   |
| Data Augment  | Achieve better tuning results through data augmentation.                        | [data_augment](./data_augment)   |                        |                |          |             |                   |

Each example directory contains a detailed README describing how to set up and run the tuning process for that scenario. Feel free to explore and adapt the examples to your needs!

## Prerequisites

AgentScope Tuner requires:

- Python 3.10 or higher
- `agentscope>=1.0.12`
- `trinity-rft>=0.4.1`

AgentScope Tuner is built on top of [Trinity-RFT](https://github.com/modelscope/Trinity-RFT). Please refer to the [Trinity-RFT installation guide](https://modelscope.github.io/Trinity-RFT/zh/main/tutorial/trinity_installation.html) for detailed setup instructions.

tuner/math_agent/README.md (9 additions, 2 deletions)

```diff
@@ -7,7 +7,7 @@ This guide walks you through the steps to implement and train an agent workflow
 
 To train your agent workflow using RL, you need to understand three components:
 
-1. **Workflow function**: Refactor your agent workflow into a workflow function that follows the specified input/output signature.
+1. **Workflow function**: Refactor your agent application into a workflow function that follows the specified input/output signature.
 2. **Judge function**: Implement a judge function that computes rewards based on the agent's responses.
 3. **Task dataset**: Prepare a dataset containing training samples for the agent to learn.
```
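The three components listed in the hunk above can be illustrated with a minimal, self-contained sketch. The real AgentScope Tuner signatures are not part of this diff, so every name here (`workflow_fn`, `judge_fn`, the task dict keys, the stub model) is an illustrative assumption:

```python
from typing import Callable

# 1. Workflow function: takes a chat model and one task, returns the response.
def workflow_fn(model: Callable[[str], str], task: dict) -> str:
    return model(task["query"])

# 2. Judge function: takes the same task and the response, returns a scalar reward.
def judge_fn(task: dict, response: str) -> float:
    return 1.0 if response.strip() == task["answer"] else 0.0

# 3. Task dataset: training samples for the agent to learn from.
dataset = [{"query": "What is 2 + 3?", "answer": "5"}]

# Wiring the pieces together with a stub model in place of a real chat model:
stub_model = lambda prompt: "5"
reward = judge_fn(dataset[0], workflow_fn(stub_model, dataset[0]))
```

In the actual framework the judge function is optional; a workflow function may emit the reward directly.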

````diff
@@ -29,6 +29,10 @@ flowchart TD
     class Task taskcolor;
 ```
 
+The workflow function takes a chat model and a task from the dataset as input, and produces the agent's response.
+The judge function takes the same task and the agent's response as input, and computes a scalar reward.
+The judge function is optional; if not provided, the workflow function can directly output the reward.
+
 ## How to implement
 
 Here we use a math problem solving scenario as an example to illustrate how to implement the above three components.
````
````diff
@@ -37,12 +41,15 @@ Suppose you have an agent workflow that solves math problems using the `ReActAgent`
 
 ```python
 from agentscope.agent import ReActAgent
+from agentscope.model import OpenAIChatModel
 from agentscope.formatter import OpenAIChatFormatter
 from agentscope.message import Msg
 
 
 async def run_react_agent(query: str):
-    # model = ... # Initialize your ChatModel here
+    model = OpenAIChatModel(
+        # your model config here...
+    )
 
     agent = ReActAgent(
         name="react_agent",
````
tuner/requirements.txt (new file: 2 additions, 0 deletions)

```
agentscope[full]>=1.0.12
trinity-rft>=0.4.1
```
