more

fzyzcjy · fzyzcjy · commit ff974548952b · 2025-11-13T08:30:16.000+08:00
diff --git a/examples/true_on_policy/README.md b/examples/true_on_policy/README.md
@@ -34,6 +34,6 @@ Briefly speaking, we handled the following components to make them aligned:
 
 * Attention: We use FA3 backend for both training and inference, since it achieves bitwise equal between prefill and decode operations.
 * GEMM: We use DeepGEMM for fast matrix multiplication while preserving true-on-policy, thanks to its algorithm to pick things like tensor core instructions ([SGLang#12142](https://github.com/sgl-project/sglang/pull/12142)).
-* For other kernels, we align numeric operation details between the two systems for simplicity, such as op dtype, detailed kernels, etc. Some operations can also be compiled to speedup ([#603](https://github.com/THUDM/slime/pull/603), [SGLang#12161](https://github.com/sgl-project/sglang/pull/12161)).
+* For other kernels, we align numeric operation details between the two systems for simplicity, such as op dtype, detailed kernels, etc, besides using batch-invariant kernels as a prerequisite. Some operations can also be compiled to speedup ([#603](https://github.com/THUDM/slime/pull/603), [SGLang#12161](https://github.com/sgl-project/sglang/pull/12161)).
 
 In order to more easily align the two parts, we use SGLang's [dumper](https://github.com/sgl-project/sglang/blob/main/python/sglang/srt/debug_utils/dumper.py) tool for quick comparisons. (Need [#12622](https://github.com/sgl-project/sglang/pull/12622) and [#12623](https://github.com/sgl-project/sglang/pull/12623) for most convenience.)