Commit fc51d25

[doc] move fp8 doc to qwen3-30B-a3B as qwen3-4B doesn't perform well on fp8 rollout (#952)
1 parent 2f3acd4 commit fc51d25

File tree

4 files changed, +38 −38 lines changed


docs/en/examples/qwen3-30B-A3B.md

Lines changed: 19 additions & 0 deletions
@@ -74,6 +74,25 @@ Here, we will briefly introduce the MoE-related parts in the [run-qwen3-30B-A3B.
--sglang-dp-size 8
```

### BF16 Training with FP8 Inference

slime also supports BF16 training with FP8 inference. For the Qwen3-30B-A3B model, you just need to download the following model:

```bash
huggingface-cli download Qwen/Qwen3-30B-A3B-FP8 --local-dir /root/Qwen3-30B-A3B-FP8
```

And replace `--hf-checkpoint` with:

```bash
#--hf-checkpoint /root/Qwen3-30B-A3B
--hf-checkpoint /root/Qwen3-30B-A3B-FP8
```

This will trigger FP8 inference. Currently, we directly cast the BF16 weights to FP8. In the future, we will gradually add more sophisticated quantization schemes that have less impact on precision.

⚠️ The Megatron checkpoint for training still needs to be the one that was originally converted from the BF16 Hugging Face model.
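To make the precision impact of a direct cast concrete, here is a minimal stdlib-only sketch that rounds a value to the nearest E4M3 value, the FP8 format commonly used for such checkpoints. `to_e4m3` is a hypothetical helper for illustration, not slime's actual cast code, and it ignores subnormals and NaN encodings:

```python
import math

def to_e4m3(x: float) -> float:
    """Round x to the nearest E4M3 FP8 value (simplified: no subnormals/NaN)."""
    if x == 0.0:
        return 0.0
    sign = math.copysign(1.0, x)
    m, e = math.frexp(abs(x))      # abs(x) = m * 2**e, with m in [0.5, 1)
    # E4M3 keeps 4 significant bits (1 implicit + 3 explicit mantissa bits)
    q = round(m * 16) / 16
    y = math.ldexp(q, e)
    return sign * min(y, 448.0)    # clamp to the E4M3 max magnitude

# Direct casting snaps nearby BF16 values onto a coarse grid:
print(to_e4m3(0.3))  # 0.3125, a rounding error of about 4%
```

Casting every weight onto this coarse grid is cheap but lossy, which is why more careful quantization schemes are planned.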
### Multi-Node Support

For a multi-node environment, the following modifications are necessary:

docs/en/examples/qwen3-4B.md

Lines changed: 0 additions & 19 deletions
@@ -250,25 +250,6 @@ This means that each time, the data corresponding to the first `num_samples` pro
⚠️ The `sample.metadata` of each partial rollout sample stores the rollout ID from its initial generation, which can be used for data filtering.

### BF16 Training with FP8 Inference

slime also supports BF16 training with FP8 inference. For the Qwen3-4B model, you just need to download the following model:

```bash
huggingface-cli download Qwen/Qwen3-4B-FP8 --local-dir /root/Qwen3-4B-FP8
```

And replace `--hf-checkpoint` with:

```bash
#--hf-checkpoint /root/Qwen3-4B
--hf-checkpoint /root/Qwen3-4B-FP8
```

This will trigger FP8 inference. Currently, we directly cast the BF16 weights to FP8. In the future, we will gradually add more sophisticated quantization schemes that have less impact on precision.

⚠️ The Megatron checkpoint for training still needs to be the one that was originally converted from the BF16 Hugging Face model.

### Decoupled Training and Inference

In the original script, the resource configuration is as follows:

docs/zh/examples/qwen3-30B-A3B.md

Lines changed: 19 additions & 0 deletions
@@ -73,6 +73,25 @@ bash scripts/run-qwen3-30B-A3B.sh
--sglang-dp-size 8
```

### BF16 Training with FP8 Inference

slime also supports BF16 training with FP8 inference. For the Qwen3-30B-A3B model, you just need to download the following model:

```bash
huggingface-cli download Qwen/Qwen3-30B-A3B-FP8 --local-dir /root/Qwen3-30B-A3B-FP8
```

And replace `--hf-checkpoint` with:

```bash
#--hf-checkpoint /root/Qwen3-30B-A3B
--hf-checkpoint /root/Qwen3-30B-A3B-FP8
```

This will trigger FP8 inference. Currently, we directly cast the BF16 weights to FP8; going forward, we will gradually add quantization schemes with less impact on precision.

⚠️ The Megatron checkpoint for training still needs to be the one originally converted from the BF16 Hugging Face model.

### Multi-Node Support

For a multi-node environment, the following modifications are needed:

docs/zh/examples/qwen3-4B.md

Lines changed: 0 additions & 19 deletions
@@ -250,25 +250,6 @@ def pop_first(args, rollout_id, buffer: list[list[Sample]], num_samples: int) ->
⚠️ The `sample.metadata` of each partial rollout sample stores the rollout ID from its initial generation, which can be used for data filtering.

### BF16 Training with FP8 Inference

slime also supports BF16 training with FP8 inference. For the Qwen3-4B model, you just need to download the following model:

```bash
huggingface-cli download Qwen/Qwen3-4B-FP8 --local-dir /root/Qwen3-4B-FP8
```

And replace `--hf-checkpoint` with:

```bash
#--hf-checkpoint /root/Qwen3-4B
--hf-checkpoint /root/Qwen3-4B-FP8
```

This will trigger FP8 inference. Currently, we directly cast the BF16 weights to FP8; going forward, we will gradually add quantization schemes with less impact on precision.

⚠️ The Megatron checkpoint for training still needs to be the one originally converted from the BF16 Hugging Face model.

### Decoupled Training and Inference

In the original script, the resource configuration is as follows:
