
Commit d8d6ad2: [doc] update spec decoding doc (#752)

1 parent: 62e2c47

File tree

2 files changed (+7, -1 lines)


docs/en/advanced/speculative-decoding.md

Lines changed: 4 additions & 1 deletion

````diff
@@ -25,11 +25,14 @@ For detailed parameter meanings and configuration, see SGLang’s speculative decoding
 
 As RL progresses, the sampling distributions of the draft and target models can drift apart. Fewer draft tokens pass verification, and speculative decoding can even yield negative returns.
 
-Slime currently supports online training of the MTP layers during RL, updating the draft model in sync with training to consistently improve sampling speed. See the related rationale in this [blog](https://www.notion.so/jiajunli-guapisolo/Power-Up-Speculative-Decoding-In-Reinforcement-Learning-2a92d24a293b802d9c73dbae429e581e). Use it as follows:
+slime currently supports online training of the MTP layers during RL, updating the draft model in sync with training to consistently improve sampling speed. See the related rationale in this [blog](https://www.notion.so/jiajunli-guapisolo/Power-Up-Speculative-Decoding-In-Reinforcement-Learning-2a92d24a293b802d9c73dbae429e581e). Use it as follows:
 
 ```bash
+--mtp-num-layers 1
 --enable-mtp-training
 --mtp-loss-scaling-factor 0.2
 ```
 
+Note that this requires a torch dist checkpoint that includes the MTP weights: add `--mtp-num-layers 1` when converting the checkpoint from huggingface to torch dist.
+
 Training external draft models is still a WIP.
````
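The two steps described above (converting the checkpoint with MTP weights, then training with MTP enabled) could be combined as in the following sketch. This is a hypothetical illustration: the script names (`convert_hf_to_torch_dist.py`, `train.py`) and the path arguments are assumptions, not slime's confirmed CLI; only the `--mtp-num-layers`, `--enable-mtp-training`, and `--mtp-loss-scaling-factor` flags come from the documentation change itself.

```shell
# Hypothetical sketch; script names and paths are illustrative assumptions.

# Step 1: convert the Hugging Face checkpoint to torch dist format,
# keeping the MTP weights (the doc notes --mtp-num-layers 1 is required here).
python convert_hf_to_torch_dist.py \
    --hf-checkpoint /path/to/hf_ckpt \
    --save /path/to/torch_dist_ckpt \
    --mtp-num-layers 1

# Step 2: launch RL training with online MTP training enabled
# (the three MTP flags are taken verbatim from the documentation).
python train.py \
    --load /path/to/torch_dist_ckpt \
    --mtp-num-layers 1 \
    --enable-mtp-training \
    --mtp-loss-scaling-factor 0.2
```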

docs/zh/advanced/speculative-decoding.md

Lines changed: 3 additions & 0 deletions

````diff
@@ -28,8 +28,11 @@
 Currently, slime supports online training of the MTP layers in the RL pipeline, updating the draft model in sync as training proceeds, which steadily improves sampling speed; the rationale is covered in this [blog](https://www.notion.so/jiajunli-guapisolo/Power-Up-Speculative-Decoding-In-Reinforcement-Learning-2a92d24a293b802d9c73dbae429e581e). Usage:
 
 ```bash
+--mtp-num-layers 1
 --enable-mtp-training
 --mtp-loss-scaling-factor 0.2
 ```
 
+Note that MTP training requires a checkpoint that includes the MTP weights, so `--mtp-num-layers 1` must also be added when converting the huggingface checkpoint to torch dist.
+
 Training of external draft models is still a WIP.
````
