This codebase provides a complete pipeline covering tool graph construction, task generation, trajectory collection, and quality assessment.
trajectory_synthesis/
├── scripts/ # Execution scripts
│ ├── 1_run_graph_pipeline.sh # Graph construction pipeline: Build tool dependency graph → Random walk to extract sub-chains → Chain verification
│ ├── 2_run_task_construction_pipeline.sh # Task construction pipeline: Prompt generation → Query generation → Query augmentation → Quality scoring
│ ├── 3_run_interaction_pipeline.sh # Interaction pipeline: LLM interacts with tool environment to generate execution trajectories
│ └── 4_run_reward.sh # Reward pipeline: Multi-dimensional quality assessment and scoring of trajectories
│
├── src/ # Source code
│ ├── 1_graph_build/ # Tool graph construction and verification module
│ │ ├── build/ # Graph construction: LLM detects tool dependencies, random walk to extract sub-chains
│ │ └── verify/ # Chain verification: Voting verification, back-translation verification and other operators
│ ├── 2_task_construction/ # Task (Query) generation and scoring module
│ │ ├── gen/ # Query generation and augmentation
│ │ ├── verify/ # Query quality scoring
│ │ └── prompts/ # Prompt templates
│ ├── 3_interaction/ # LLM-environment interaction module
│ │ └── qwen_agent/ # Interaction framework based on Qwen-Agent
│ ├── 4_reward/ # Trajectory quality assessment module
│ └── utils/ # Common utility functions (API client, logging, etc.)
│
└── data/ # Input data
  ├── mcp_servers.jsonl # MCP server configuration: Contains tool list, server information (input for graph construction)
  ├── tasks.jsonl # Task data: Query, target tools, scoring information (input for interaction pipeline)
  └── trajectories.jsonl # Trajectory data: Conversation history, tool call records (input for Reward pipeline)
mcp_servers.jsonl
↓
[1. Graph Construction Pipeline] → Build tool dependency graph, extract valid tool chains
↓
[2. Task Construction Pipeline] → Generate and augment Query, quality scoring
↓
[3. Interaction Pipeline] → LLM interacts with environment to generate trajectories
↓
[4. Reward Pipeline] → Multi-dimensional quality assessment
↓
Final SFT Data
Function: Build tool dependency graph, extract tool chains and verify their validity.
Steps:
- Graph Construction: Call LLM to detect dependencies between tools
- Random Walk: Extract sub-chains of specified length from the graph
- Chain Verification: Use verification operators to filter invalid tool chains
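The random-walk step can be illustrated with a minimal Python sketch. It assumes the dependency graph is a plain adjacency dict mapping each tool to the tools that can consume its output; the function name `extract_sub_chains`, the `n_walks` and `seed` arguments, and the toy graph are hypothetical and not taken from `src/1_graph_build/`.

```python
import random


def extract_sub_chains(graph, min_length, max_length, n_walks, seed=0):
    """Sample tool chains by random walk over a dependency graph.

    `graph` maps a tool name to the list of tools that depend on it.
    This is an illustrative sketch, not the repository's implementation.
    """
    rng = random.Random(seed)
    nodes = sorted(graph)  # sorted so a fixed seed gives reproducible walks
    chains = set()
    for _ in range(n_walks):
        target_len = rng.randint(min_length, max_length)  # inclusive bounds
        current = rng.choice(nodes)
        chain = [current]
        while len(chain) < target_len:
            # Skip tools already in the chain so the walk stays a simple path.
            candidates = [t for t in graph.get(current, []) if t not in chain]
            if not candidates:
                break  # dead end: no unvisited successor
            current = rng.choice(candidates)
            chain.append(current)
        # Keep only walks that reached the minimum length.
        if len(chain) >= min_length:
            chains.add(tuple(chain))
    return sorted(chains)


# Hypothetical toy graph used only for illustration.
toy_graph = {
    "search_flights": ["book_flight"],
    "book_flight": ["send_confirmation"],
    "send_confirmation": [],
}
chains = extract_sub_chains(toy_graph, min_length=2, max_length=3, n_walks=50)
```

Chains are deduplicated as tuples, so a later verification step would see each candidate once.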
Execution:
# Please modify the internal parameters according to your own needs.
bash scripts/1_run_graph_pipeline.sh
Main Parameters:

| Parameter | Description |
|---|---|
| INPUT_FILE | Tool document file |
| MODEL_NAME | Model name |
| MIN_LENGTH / MAX_LENGTH | Sub-chain length range |
| OPERATORS | Verification operators (comma-separated) |
Function: Generate tasks and perform augmentation and quality scoring.
Steps:
- Prompt Construction: Build prompts in different modes
- Task Generation: Call LLM to generate initial tasks
- Task Augmentation: Augment tasks with diversity/complexity/user persona
- Quality Scoring: Score the generated tasks
Execution:
# Please modify the internal parameters according to your own needs.
bash scripts/2_run_task_construction_pipeline.sh
Main Parameters:

| Parameter | Description |
|---|---|
| INPUT_FILE | Input file (output from graph pipeline) |
| AUG_MODE | Augmentation mode (options: diverse/complicate/add_ug/all) |
| N_SAMPLE | Number of samples per prompt |
| PERSONA_DATASET_PATH | User persona dataset path |
Function: Let the LLM interact with the tool environment and generate complete execution trajectories.
Execution:
# Please modify the internal parameters according to your own needs.
bash scripts/3_run_interaction_pipeline.sh
Main Parameters:

| Parameter | Description |
|---|---|
| INPUT_FILE | Task input file |
| OUTPUT_FILE | Output trajectory file |
| MODEL_NAME | Model name |
| MAX_WORKERS | Maximum concurrency |
| TIMEOUT | Single interaction timeout (seconds) |
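How the script applies MAX_WORKERS and TIMEOUT internally is not shown here. The sketch below is one plausible reading, under the assumption that each task runs in a worker thread and the caller waits a bounded time for each result. `run_interactions` and `fake_interact` are hypothetical names, not part of `src/3_interaction/`.

```python
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


def run_interactions(tasks, interact, max_workers=4, timeout=60.0):
    """Run `interact(task)` for every task with bounded concurrency.

    Illustrative only: `max_workers` plays the role of MAX_WORKERS and
    `timeout` the role of TIMEOUT (seconds).
    """
    results = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [(task, pool.submit(interact, task)) for task in tasks]
        for task, future in futures:
            record = {"task": task, "trajectory": None, "error": None}
            try:
                # result(timeout=...) bounds how long we wait for this future;
                # it does not interrupt the worker thread itself.
                record["trajectory"] = future.result(timeout=timeout)
            except FutureTimeout:
                record["error"] = "timeout"
            except Exception as exc:  # keep going if one task fails
                record["error"] = repr(exc)
            results.append(record)
    return results


def fake_interact(task):
    """Hypothetical stand-in for a real LLM-environment interaction."""
    return [{"role": "user", "content": task}]


demo = run_interactions(["book a flight"], fake_interact, max_workers=2, timeout=5)
```

Recording failures per task rather than raising lets a long batch finish even when individual interactions time out.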
Function: Perform multi-dimensional quality assessment on generated trajectories.
Execution:
# Please modify the internal parameters according to your own needs.
bash scripts/4_run_reward.sh
Main Parameters:

| Parameter | Description |
|---|---|
| INPUT_FILE | Input trajectory file (JSON format) |
| OUTPUT_DIR | Output directory |
| MAX_CONCURRENT | Maximum concurrent requests |
All input data is located in the data/ directory, in JSONL format (one JSON object per line).
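Because each line is an independent JSON object, the files can be read with the standard library alone. The helper below is a minimal sketch; `read_jsonl` is a hypothetical name and not a function from `src/utils/`.

```python
import json
from pathlib import Path


def read_jsonl(path):
    """Load a JSONL file into a list of dicts, one per non-empty line."""
    records = []
    with Path(path).open(encoding="utf-8") as fh:
        for line_no, line in enumerate(fh, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            try:
                records.append(json.loads(line))
            except json.JSONDecodeError as exc:
                # Report the offending line so bad records are easy to find.
                raise ValueError(f"{path}:{line_no}: invalid JSON ({exc.msg})") from exc
    return records
```

For example, `read_jsonl("data/tasks.jsonl")` would return one dict per task.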
MCP server configuration (mcp_servers.jsonl), used as input for the Graph Construction Pipeline.
Data Structure:
{
  "base_info": {
    "group_info": {
      "server_title": "Server Title",
      "server_name": "Server Name",
      "server_description": "Server Description",
      "domain": "Domain"
    },
    "tool_list": [
      {
        "name": "Tool Name",
        "description": "Tool Description",
        "parameters": { ... }
      }
    ]
  },
  "features": { ... }
}
Task data (tasks.jsonl), used as input for the Interaction Pipeline.
Data Structure:
{
  "query_info": {
    "generated_question": "User Question",
    "target_tools": ["tool1", "tool2"],
    "augmented_query_info": { ... },
    "query_score_info": { ... }
  },
  "mcp_info": { ... },
  "graph": { ... }
}
Interaction trajectory data (trajectories.jsonl), used as input for the Reward Pipeline.
Data Structure:
{
  "tools": [...],
  "messages": [
    {"role": "user", "content": "..."},
    {"role": "assistant", "content": "...", "tool_calls": [...]},
    {"role": "tool", "content": "..."}
  ]
}
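A record with this shape can be inspected before it is passed to the Reward Pipeline. The sketch below counts tool calls and tool results in one trajectory; `summarize_trajectory` is a hypothetical helper, and the contents of the `tool_calls` entry in the sample are placeholders because the structure above elides them.

```python
def summarize_trajectory(trajectory):
    """Count messages, assistant tool calls, and tool results in a trajectory."""
    messages = trajectory.get("messages", [])
    tool_calls = sum(
        len(m.get("tool_calls") or [])  # key may be absent on plain replies
        for m in messages
        if m.get("role") == "assistant"
    )
    tool_results = sum(1 for m in messages if m.get("role") == "tool")
    return {
        "turns": len(messages),
        "tool_calls": tool_calls,
        "tool_results": tool_results,
    }


# Hypothetical sample; the inner tool call fields are illustrative only.
sample = {
    "tools": [],
    "messages": [
        {"role": "user", "content": "What is the weather in Paris?"},
        {"role": "assistant", "content": "", "tool_calls": [{"name": "get_weather"}]},
        {"role": "tool", "content": "18 degrees, cloudy"},
        {"role": "assistant", "content": "It is 18 degrees and cloudy."},
    ],
}
summary = summarize_trajectory(sample)
# summary == {"turns": 4, "tool_calls": 1, "tool_results": 1}
```

A mismatch between `tool_calls` and `tool_results` is a cheap signal that a trajectory was cut off mid-interaction.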