Commit 9776970

[docs] move reporducibility to main doc (#1433)
1 parent e699b5d commit 9776970

6 files changed, +57 -5 lines changed

docs/en/advanced/low-precision.md

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 # Low Precision Training
 
-- [FP8 rollout and FP8 training](#FP8-rollout-and-FP8-training)
+- [FP8 rollout and FP8 training](#FP8-rollout-and-BF16-training)
 - [FP8 rollout and FP8 training](#FP8-rollout-and-FP8-training)
 - [INT4 QAT Training](#INT4-QAT-Training)
 
Lines changed: 52 additions & 0 deletions
@@ -0,0 +1,52 @@
+# Reproducibility
+
+Reproducibility is a bedrock of scientific progress. By combining SGLang's [deterministic inference](https://lmsys.org/blog/2025-09-22-sglang-deterministic/) with Megatron-LM's deterministic mode, slime can reproduce experiments fully deterministically (bitwise).
+
+To enable deterministic training, uninstall flash attention 3 with `pip uninstall flash_attn_3 -y` and set:
+
+```bash
+# sglang config
+--sglang-enable-deterministic-inference
+--sglang-attention-backend flashinfer
+
+# megatron config
+--deterministic-mode
+```
+
+Also set the following environment variables:
+
+```bash
+"env_vars": {
+  ...,
+  "NCCL_ALGO": "Ring",
+  "NVTE_ALLOW_NONDETERMINISTIC_ALGO": "0",
+  "CUBLAS_WORKSPACE_CONFIG": ":4096:8"
+}
+```
+
+We provide a fully deterministic script that trains Qwen2.5 0.5B on GSM8K.
+
+You can initialize the training data and checkpoint with the following script:
+
+```bash
+# download
+hf download --repo-type dataset zhuzilin/gsm8k --local-dir /root/gsm8k
+hf download Qwen/Qwen2.5-0.5B-Instruct --local-dir /root/Qwen2.5-0.5B-Instruct
+
+# convert ckpt
+cd slime/
+source scripts/models/qwen2.5-0.5B.sh
+PYTHONPATH=/root/Megatron-LM/ python \
+   tools/convert_hf_to_torch_dist.py \
+   ${MODEL_ARGS[@]} \
+   --hf-checkpoint /root/Qwen2.5-0.5B-Instruct \
+   --save /root/Qwen2.5-0.5B-Instruct_torch_dist/
+```
+
+To run training, use the following script:
+
+```bash
+bash scripts/run-qwen2.5-0.5B-reproducibility.sh
+```
+
+For wandb screenshots, see [pull#370](https://github.com/THUDM/slime/pull/370).
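The three environment variables in the added doc are easy to mistype, so a small preflight check can help. The following is a minimal sketch, not part of slime: the function name `check_deterministic_env` is hypothetical, and it assumes the variables are exported in the shell that launches training. If they are only passed through the `env_vars` JSON, the check would need to run inside the worker process instead.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of slime): verify that the environment
# variables listed in the reproducibility doc are exported with the
# documented values before launching a run.
check_deterministic_env() {
  local status=0 pair name expected actual
  for pair in \
    "NCCL_ALGO=Ring" \
    "NVTE_ALLOW_NONDETERMINISTIC_ALGO=0" \
    "CUBLAS_WORKSPACE_CONFIG=:4096:8"; do
    name="${pair%%=*}"       # text before the first '='
    expected="${pair#*=}"    # text after the first '='
    # printenv only sees exported variables, which is what child processes inherit.
    actual="$(printenv "$name" || true)"
    if [ "$actual" != "$expected" ]; then
      echo "mismatch: $name='$actual' (expected '$expected')" >&2
      status=1
    fi
  done
  return "$status"
}
```

Calling `check_deterministic_env || exit 1` near the top of a launch script would stop the run before any GPU time is spent on a configuration that cannot be bitwise reproducible.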

docs/en/index.rst

Lines changed: 1 addition & 1 deletion
@@ -40,9 +40,9 @@ slime is the RL-framework behind GLM-4.7, GLM-4.6 and GLM-4.5. Apart from models
    :maxdepth: 1
    :caption: Advanced Features
 
-   _examples_synced/reproducibility/README.md
    advanced/speculative-decoding.md
    advanced/low-precision.md
+   advanced/reproducibility.md
    advanced/fault-tolerance.md
    advanced/pd-disaggregation.md
    advanced/arch-support-beyond-megatron.md
Lines changed: 1 addition & 1 deletion
@@ -45,7 +45,7 @@ PYTHONPATH=/root/Megatron-LM/ python \
 And to run training,
 
 ```bash
-bash examples/reproducibility/run-qwen2.5-0.5B-gsm8k.sh
+bash scripts/run-qwen2.5-0.5B-reproducibility.sh
 ```
 
 For screen shots of the wandb, please refer to [pull#370](https://github.com/THUDM/slime/pull/370).
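Besides comparing wandb curves, a bitwise claim can also be spot-checked on disk by hashing what two runs wrote. The sketch below is an assumption about how one might do that, not the method used in the PR: `dir_digest` and `same_outputs` are hypothetical names, the directories are whatever two runs saved, and GNU `find`, `sort`, `xargs`, and `sha256sum` are assumed to be available. Files that embed timestamps or run IDs would differ even when training itself is deterministic, so this is best pointed at raw tensor or sample dumps.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of slime): compare two output directories
# byte for byte by hashing every regular file under each one.

# Print a single SHA-256 digest that summarizes all files under a directory.
dir_digest() {
  local dir="$1"
  (
    cd "$dir" || exit 1
    # Sort the file list so the digest does not depend on traversal order.
    find . -type f -print0 | LC_ALL=C sort -z | xargs -0 -r sha256sum | sha256sum | cut -d' ' -f1
  )
}

# Return 0 if both directories hold identical files, 1 if they differ,
# and 2 if either directory cannot be read.
same_outputs() {
  local a b
  a="$(dir_digest "$1")" || return 2
  b="$(dir_digest "$2")" || return 2
  [ "$a" = "$b" ]
}
```

A typical use would be `same_outputs /path/to/run1 /path/to/run2 && echo identical`, where both paths are placeholders for the two runs' output directories.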

docs/zh/index.rst

Lines changed: 1 addition & 1 deletion
@@ -40,9 +40,9 @@ slime is the RL training framework behind GLM-4.7, GLM-4.6, and GLM-4.5. Beyond that
    :maxdepth: 1
    :caption: Advanced Features
 
-   _examples_synced/reproducibility/README.md
    advanced/speculative-decoding.md
    advanced/low-precision.md
+   advanced/reproducibility.md
    advanced/fault-tolerance.md
    advanced/pd-disaggregation.md
    advanced/arch-support-beyond-megatron.md

examples/reproducibility/run-qwen2.5-0.5B-gsm8k.sh renamed to scripts/run-qwen2.5-0.5B-reproducibility.sh

Lines changed: 1 addition & 1 deletion
@@ -16,7 +16,7 @@ set -ex
 export PYTHONBUFFERED=16
 
 SCRIPT_DIR="$(cd -- "$(dirname -- "${BASH_SOURCE[0]}")" &>/dev/null && pwd)"
-source "${SCRIPT_DIR}/../../scripts/models/qwen2.5-0.5B.sh"
+source "${SCRIPT_DIR}/models/qwen2.5-0.5B.sh"
 
 CKPT_ARGS=(
    --hf-checkpoint /root/Qwen2.5-0.5B-Instruct/
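The `SCRIPT_DIR` idiom in this hunk resolves to the directory that contains the script itself, regardless of where it is invoked from. After the move into `scripts/`, a file under `scripts/models/` is therefore reached as `${SCRIPT_DIR}/models/...`. A minimal sketch of that behavior, using a throwaway directory tree with hypothetical file names (`run.sh`, `model.sh`) rather than the real repository:

```shell
#!/usr/bin/env bash
# Build a temporary tree that mirrors the layout after the rename:
#   <root>/scripts/run.sh
#   <root>/scripts/models/model.sh
demo_root="$(mktemp -d)"
mkdir -p "$demo_root/scripts/models"
echo 'MODEL_NAME=demo' > "$demo_root/scripts/models/model.sh"

# The launcher uses the same SCRIPT_DIR idiom as the training script.
cat > "$demo_root/scripts/run.sh" <<'EOF'
#!/usr/bin/env bash
SCRIPT_DIR="$(cd -- "$(dirname -- "${BASH_SOURCE[0]}")" &>/dev/null && pwd)"
source "${SCRIPT_DIR}/models/model.sh"
echo "${MODEL_NAME}"
EOF

# Invoke it from an unrelated working directory; the path still resolves.
resolved="$(cd / && bash "$demo_root/scripts/run.sh")"
echo "$resolved"   # prints: demo
rm -rf "$demo_root"
```

Because the lookup is anchored on the script's own location, the training script can be launched from the repository root or from any other directory without changing the `source` line.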
