Commit 8ef61f1

mod doc
1 parent 24dd6d8 commit 8ef61f1

3 files changed

Lines changed: 5 additions & 4 deletions

README.md

Lines changed: 2 additions & 2 deletions
@@ -180,6 +180,6 @@ set_optimization(
 
 - [x] SFT exps
 - [x] Reference configs: Qwen3 8B `playground/pretrain/qwen3/qwen3_8.py`, Step3.5 Flash `playground/pretrain/step3p5/step3p5_flash.py`
-- [ ] Eval
+- [x] Eval
 - [ ] RLVR implementation
-- [ ] Triton kernel implementation
+- [x] Triton kernel implementation

README_ZH.md

Lines changed: 2 additions & 2 deletions
@@ -175,6 +175,6 @@ set_optimization(
 
 - [x] SFT exps
 - [x] Reference configs: Qwen3 8B `playground/pretrain/qwen3/qwen3_8.py`, Step3.5 Flash `playground/pretrain/step3p5/step3p5_flash.py`
-- [ ] Eval
+- [x] Eval
 - [ ] RLVR implementation
-- [ ] Triton kernel implementation
+- [x] Triton kernel implementation

playground/sft/step3/step3p5_flash_sft_step3_data_muon.py

Lines changed: 1 addition & 0 deletions
@@ -219,6 +219,7 @@ def configure_optimizable(self):
 moe_weighted_gather="triton",
 TokenDispatcher="deep_ep",
 grouped_gemm="nv_grouped_gemm",
+# grouped_gemm="function_imple", # slower fallback
 AttentionCore="flash-attn-3",
 )
