Commit 17c61be

momo609 and wangxiaoxin-sherie authored
[doc] feat: add rollout&train consistency doc for Ascend Platform (#4166)
### What does this PR do?

This document provides guidance on ensuring consistency between verl and vllm inference results on NPU.

### Checklist Before Starting

- [ ] Search for similar PRs. Paste at least one query link here: ...
- [ ] Format the PR title as `[{modules}] {type}: {description}` (This will be checked by the CI)
  - `{modules}` include `fsdp`, `megatron`, `sglang`, `vllm`, `rollout`, `trainer`, `ci`, `training_utils`, `recipe`, `hardware`, `deployment`, `ray`, `worker`, `single_controller`, `misc`, `perf`, `model`, `algo`, `env`, `tool`, `ckpt`, `doc`, `data`
  - If this PR involves multiple modules, separate them with `,` like `[megatron, fsdp, doc]`
  - `{type}` is in `feat`, `fix`, `refactor`, `chore`, `test`
  - If this PR breaks any API (CLI arguments, config, function signature, etc.), add `[BREAKING]` to the beginning of the title.
  - Example: `[BREAKING][fsdp, megatron] feat: dynamic batching`

### Test

> For changes that can not be tested by CI (e.g., algorithm implementation, new model support), validate by experiment(s) and show results like training curve plots, evaluation results, etc.

### API and Usage Example

> Demonstrate how the API changes if any, and provide usage example(s) if possible.

```python
# Add code snippet or script demonstrating how to use this
```

### Design & Code Changes

> Demonstrate the high-level design if this PR is complex, and list the specific changes.

### Checklist Before Submitting

> [!IMPORTANT]
> Please check all the following items before requesting a review, otherwise the reviewer might deprioritize this PR for review.

- [ ] Read the [Contribute Guide](https://github.com/volcengine/verl/blob/main/CONTRIBUTING.md).
- [ ] Apply [pre-commit checks](https://github.com/volcengine/verl/blob/main/CONTRIBUTING.md#code-linting-and-formatting): `pre-commit install && pre-commit run --all-files --show-diff-on-failure --color=always`
- [ ] Add / Update [the documentation](https://github.com/volcengine/verl/tree/main/docs).
- [ ] Add unit or end-to-end test(s) to [the CI workflow](https://github.com/volcengine/verl/tree/main/.github/workflows) to cover all the code. If not feasible, explain why: ...
- [ ] Once your PR is ready for CI, send a message in [the `ci-request` channel](https://verl-project.slack.com/archives/C091TCESWB1) in [the `verl` Slack workspace](https://join.slack.com/t/verl-project/shared_invite/zt-3855yhg8g-CTkqXu~hKojPCmo7k_yXTQ). (If not accessible, please try [the Feishu group (飞书群)](https://applink.larkoffice.com/client/chat/chatter/add_by_link?link_token=772jd4f1-cd91-441e-a820-498c6614126a).)

Co-authored-by: wangxiaoxin-sherie <[email protected]>
1 parent b96b9ab commit 17c61be

2 files changed: +51 -0 lines changed

Lines changed: 50 additions & 0 deletions
@@ -0,0 +1,50 @@
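Align the inference results of the verl and vLLM frameworks on Ascend devices
================================================================================

Align the inference results produced by the verl and vLLM frameworks on Ascend devices.

Last updated: 11/17/2025.

This tutorial describes how to align the inference results of the verl and vLLM frameworks on Ascend devices.
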

Environment variable configuration
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

With multi-card communication:

- With HCCL communication (the default scenario):

  - export CLOSE_MATMUL_K_SHIFT=1
  - export ATB_MATMUL_SHUFFLE_K_ENABLE=0
  - export HCCL_DETERMINISTIC="true"
  - export VLLM_ENABLE_V1_MULTIPROCESSING=0

- With LCCL communication (enabled by export HCCL_OP_EXPANSION_MODE="AIV"):

  - export CLOSE_MATMUL_K_SHIFT=1
  - export ATB_MATMUL_SHUFFLE_K_ENABLE=0
  - export LCCL_DETERMINISTIC=1
  - export ATB_LLM_LCOC_ENABLE=0
  - export VLLM_ENABLE_V1_MULTIPROCESSING=0


On a single card without communication:

- For both HCCL and LCCL:

  - export CLOSE_MATMUL_K_SHIFT=1
  - export ATB_MATMUL_SHUFFLE_K_ENABLE=0
  - export VLLM_ENABLE_V1_MULTIPROCESSING=0

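
The scenarios listed above can be collected into a small helper so that a launch script picks one set of variables in a single call. This is a minimal sketch, not part of verl or vLLM: the names ``consistency_env`` and ``apply_consistency_env`` are hypothetical, the variable values are copied from the list above, and it assumes the variables only take effect if they are set before the inference libraries are imported.

```python
import os

# Variables shared by every scenario in the list above.
COMMON_ENV = {
    "CLOSE_MATMUL_K_SHIFT": "1",
    "ATB_MATMUL_SHUFFLE_K_ENABLE": "0",
    "VLLM_ENABLE_V1_MULTIPROCESSING": "0",
}

# Extra variables that apply only when several cards communicate.
MULTI_CARD_ENV = {
    "hccl": {"HCCL_DETERMINISTIC": "true"},
    "lccl": {
        "HCCL_OP_EXPANSION_MODE": "AIV",  # per the tutorial, this enables LCCL
        "LCCL_DETERMINISTIC": "1",
        "ATB_LLM_LCOC_ENABLE": "0",
    },
}


def consistency_env(backend="hccl", multi_card=True):
    """Return the variables for one scenario (hypothetical helper)."""
    if backend not in MULTI_CARD_ENV:
        raise ValueError(f"unknown backend: {backend!r}")
    env = dict(COMMON_ENV)
    if multi_card:
        env.update(MULTI_CARD_ENV[backend])
    return env


def apply_consistency_env(backend="hccl", multi_card=True):
    """Export the variables into the current process environment."""
    env = consistency_env(backend, multi_card)
    os.environ.update(env)
    return env
```

Calling ``apply_consistency_env("lccl")`` at the very top of a launch script, before any ``import vllm`` or ``import verl``, would mirror the shell ``export`` lines for the multi-card LCCL case.
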

vLLM initialization parameters
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Set ``seed`` explicitly in ``SamplingParams`` so that the vLLM and verl inference results stay consistent. For example:

.. code:: python

   from vllm import SamplingParams

   sampling_params = SamplingParams(
       n=1,
       logprobs=0,  # can be set to 0 and let the actor recompute
       max_tokens=config.response_length,
       repetition_penalty=config.get("repetition_penalty", 1.0),
       seed=1234,  # explicit seed, as required for consistent results
   )

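
One way to keep the seed identical on both sides is to build the keyword arguments in a single place and reuse them. This is a minimal sketch under stated assumptions: ``build_sampling_kwargs`` is a hypothetical helper, the keyword names mirror the ``SamplingParams`` call above, and the response length of 512 is purely illustrative.

```python
def build_sampling_kwargs(response_length, repetition_penalty=1.0, seed=1234):
    """Collect SamplingParams keyword arguments with an explicit seed.

    Hypothetical helper: it only builds a dict, so it runs without vLLM installed.
    """
    return {
        "n": 1,
        "logprobs": 0,  # can be set to 0 and let the actor recompute
        "max_tokens": response_length,
        "repetition_penalty": repetition_penalty,
        "seed": seed,
    }


# Build the arguments once for the verl rollout and once for standalone vLLM.
verl_kwargs = build_sampling_kwargs(response_length=512)
vllm_kwargs = build_sampling_kwargs(response_length=512)

# With vLLM installed, the dict can be unpacked into the real class:
#   from vllm import SamplingParams
#   sampling_params = SamplingParams(**vllm_kwargs)
```

Keeping the defaults in one function means a change to the seed or the penalty reaches both frameworks at once.
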

docs/index.rst

Lines changed: 1 addition & 0 deletions
@@ -140,6 +140,7 @@ verl is fast with:
 amd_tutorial/amd_build_dockerfile_page.rst
 amd_tutorial/amd_vllm_page.rst
 ascend_tutorial/ascend_quick_start.rst
+ascend_tutorial/ascend_consistency.rst
 ascend_tutorial/ascend_profiling_zh.rst
 ascend_tutorial/ascend_profiling_en.rst
 ascend_tutorial/dockerfile_build_guidance.rst
