
Commit 0d72122

Added the README and script files for training sql_agent on NPU (#272)
Co-authored-by: Yuge Zhang <[email protected]>
1 parent e49b75b commit 0d72122

File tree

3 files changed (+62, −4 lines)


docs/how-to/train-sql-agent.md

Lines changed: 38 additions & 0 deletions

@@ -321,6 +321,44 @@ For the LLaMA profile, export an `HF_TOKEN` before running so VERL can download

```
env RAY_DEBUG=legacy HYDRA_FULL_ERROR=1 VLLM_USE_V1=1 ray start --head --dashboard-host=0.0.0.0
```

!!! note "Launching Training with NPUs"

    The example also supports running on **Huawei Ascend NPUs**. This feature was contributed by [teams from Huawei](https://github.com/microsoft/agent-lightning/pull/272). To use it, select the `config_train_npu` function in the script.

    **Supported hardware:** Atlas 200T A2 Box16, Atlas 900 A2 PODc, Atlas 800T A3. At least **one 40GB NPU** is required to run the **Qwen2.5-Coder-1.5B-Instruct** model.

    **Environment setup:** Python 3.11.13, CANN 8.2.RC1, torch 2.7.1+cpu, torch_npu 2.7.1.dev20250724. For basic environment preparation, refer to this [document](https://gitcode.com/Ascend/pytorch).

    Before installing dependencies, configure the following pip mirrors:

    ```bash
    pip config set global.index-url http://repo.huaweicloud.com/repository/pypi/simple
    pip config set global.extra-index-url "https://download.pytorch.org/whl/cpu/ https://mirrors.huaweicloud.com/ascend/repos/pypi"
    ```

    Then install vLLM, vLLM-Ascend, and VERL:

    ```bash
    pip install vllm==0.10.0 --trusted-host repo.huaweicloud.com
    pip install vllm-ascend==0.10.0rc1 --trusted-host repo.huaweicloud.com
    pip install verl==0.5.0
    ```

    To ensure the VERL framework runs correctly on NPU, add the following lines to `verl/utils/vllm_utils.py`:

    ```python
    from vllm_ascend.patch import platform
    from vllm_ascend.patch import worker
    ```

    See [vllm-ascend issue #1776](https://github.com/vllm-project/vllm-ascend/issues/1776) for more details.

    After the dependencies are installed, run the following command from [`examples/spider`]({{ src("examples/spider") }}):

    ```bash
    python train_sql_agent.py npu
    ```

### Debugging the Agent without VERL

[`sql_agent.py`]({{ src("examples/spider/sql_agent.py") }}) also provides a `debug_sql_agent()` helper to run the LangGraph workflow directly against a local or hosted OpenAI-compatible endpoint before using VERL.
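The two patch imports above make `vllm_utils.py` depend unconditionally on `vllm-ascend`. As a minimal sketch (the `try`/`except` guard is an assumption for illustration, not part of the commit), the imports could be made optional so the same file still loads on hosts without the Ascend stack:

```python
# Hypothetical guard around the NPU patch imports: on hosts where
# vllm-ascend is not installed, the patches are simply skipped.
try:
    from vllm_ascend.patch import platform  # noqa: F401
    from vllm_ascend.patch import worker  # noqa: F401

    ASCEND_PATCHED = True
except ImportError:
    ASCEND_PATCHED = False

print(ASCEND_PATCHED)
```

With this guard, the same `vllm_utils.py` works on both NPU and GPU machines; `ASCEND_PATCHED` records whether the Ascend patches were applied.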

examples/spider/README.md

Lines changed: 2 additions & 0 deletions

@@ -37,6 +37,8 @@ Train a SQL agent using the Qwen2.5-Coder-1.5B-Instruct model with the following

```
python train_sql_agent.py qwen
```

If you want to use an NPU for training, refer to the **Launching Training with NPUs** section in [How to Train a SQL Agent](../../docs/how-to/train-sql-agent.md).

### Debugging

To test and debug the SQL agent interactively:

examples/spider/train_sql_agent.py

Lines changed: 22 additions & 4 deletions

```diff
@@ -137,6 +137,20 @@ def config_train_qwen() -> Dict[str, Any]:
     return config


+def config_train_npu() -> Dict[str, Any]:
+    """A configuration for training with NPU."""
+
+    config = deepcopy(RL_TRAINING_CONFIG)
+    del config["actor_rollout_ref"]["rollout"]["engine_kwargs"]["vllm"]["enable_auto_tool_choice"]
+    del config["actor_rollout_ref"]["rollout"]["engine_kwargs"]["vllm"]["tool_call_parser"]
+    del config["trainer"]["logger"][1]
+    config["actor_rollout_ref"]["actor"]["use_torch_compile"] = False
+    config["trainer"]["val_before_train"] = False
+    config["trainer"]["save_freq"] = 256
+    config["trainer"]["device"] = "npu"
+    return config
+
+
 def config_train_llama() -> Dict[str, Any]:
     """A configuration for training with LLaMA-3.2-1B-Instruct.

@@ -171,8 +185,8 @@ def main() -> None:

     parser.add_argument(
         "config",
-        choices=["fast", "qwen", "llama"],
-        help="Training configuration: 'fast' (CI testing), 'qwen' (Qwen-2.5-Coder-1.5B), 'llama' (LLaMA-3.2-3B)",
+        choices=["fast", "qwen", "llama", "npu"],
+        help="Training configuration: 'fast' (CI testing), 'qwen' (Qwen-2.5-Coder-1.5B), 'llama' (LLaMA-3.2-3B), 'npu' (train with NPU)",
     )

     parser.add_argument(
@@ -182,8 +196,12 @@ def main() -> None:
     args = parser.parse_args()

     # Get the appropriate configuration
-    config_functions = {"fast": config_train_fast, "qwen": config_train_qwen, "llama": config_train_llama}
-
+    config_functions = {
+        "fast": config_train_fast,
+        "qwen": config_train_qwen,
+        "llama": config_train_llama,
+        "npu": config_train_npu,
+    }
     config = config_functions[args.config]()

     # Set active agent - use provided value or default based on config choice
```
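The `config_train_npu` changes amount to deleting the two vLLM tool-calling engine kwargs, dropping the second logger, and overriding a few trainer fields on a deep copy of the base config. The same edits can be sketched on a toy stand-in for `RL_TRAINING_CONFIG` (the nested key paths mirror the commit; the values here are illustrative placeholders, not the real defaults):

```python
from copy import deepcopy
from typing import Any, Dict

# Toy stand-in for RL_TRAINING_CONFIG; only the keys touched by
# config_train_npu are modeled, and the values are placeholders.
RL_TRAINING_CONFIG: Dict[str, Any] = {
    "actor_rollout_ref": {
        "actor": {"use_torch_compile": True},
        "rollout": {
            "engine_kwargs": {
                "vllm": {
                    "enable_auto_tool_choice": True,
                    "tool_call_parser": "hermes",
                }
            }
        },
    },
    "trainer": {
        "logger": ["console", "wandb"],
        "val_before_train": True,
        "save_freq": 64,
    },
}


def config_train_npu() -> Dict[str, Any]:
    """Mirror of the commit's NPU edits, applied to the toy config."""
    config = deepcopy(RL_TRAINING_CONFIG)  # leave the base config untouched
    vllm_kwargs = config["actor_rollout_ref"]["rollout"]["engine_kwargs"]["vllm"]
    del vllm_kwargs["enable_auto_tool_choice"]  # not used on NPU
    del vllm_kwargs["tool_call_parser"]
    del config["trainer"]["logger"][1]  # drop the second logger backend
    config["actor_rollout_ref"]["actor"]["use_torch_compile"] = False
    config["trainer"]["val_before_train"] = False
    config["trainer"]["save_freq"] = 256
    config["trainer"]["device"] = "npu"
    return config


cfg = config_train_npu()
print(cfg["trainer"]["device"])  # → npu
```

Because the function works on a `deepcopy`, the shared `RL_TRAINING_CONFIG` stays intact for the other profiles (`fast`, `qwen`, `llama`) selected via the same `config_functions` dispatch.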
