1. Suppose device 0 hosts 4 experts, and the number of tokens assigned to them is 1, 2, 3, and 7, respectively.
I think that after DeepEP, the activations received by device 0 have shape (13, 7168), i.e. 1 + 2 + 3 + 7 = 13 tokens in total.
2. In this case, calling m_grouped_gemm_fp8_fp8_bf16_nt_masked requires the activations to be arranged in the shape (4, 8, 7168), with masked_m=[1,2,3,7].
So, if we want to use the output of DeepEP as the input, do we need to manually rearrange the data from form 1 to form 2?
I have only run the DeepGEMM demo, so I am not sure whether I have misunderstood the output layout of DeepEP.
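
To make the question concrete, here is a minimal sketch of the rearrangement from form 1 to form 2. It uses plain Python lists instead of real tensors and a tiny hidden size in place of 7168. It assumes that the 13 received rows are stored contiguously and grouped by expert in expert order, which is exactly the part of the DeepEP output I am unsure about. The helper name `pack_for_masked_layout` is hypothetical and is not part of DeepEP or DeepGEMM.

```python
def pack_for_masked_layout(flat_rows, tokens_per_expert, max_m, hidden):
    """Copy expert-contiguous rows into a zero-padded [experts][max_m][hidden] layout.

    Assumption: flat_rows holds expert 0's tokens first, then expert 1's, and so on.
    """
    if sum(tokens_per_expert) != len(flat_rows):
        raise ValueError("token counts do not match the number of rows")
    if max(tokens_per_expert, default=0) > max_m:
        raise ValueError("max_m is smaller than the largest expert group")

    packed = []
    offset = 0
    for count in tokens_per_expert:
        # Valid rows for this expert, copied so the input is left untouched.
        rows = [list(row) for row in flat_rows[offset:offset + count]]
        # Pad up to max_m rows; the padding rows are the ones masked_m excludes.
        rows += [[0.0] * hidden for _ in range(max_m - count)]
        packed.append(rows)
        offset += count
    return packed


HIDDEN = 4  # stand-in for 7168 to keep the example small
tokens_per_expert = [1, 2, 3, 7]

# Form 1: (13, HIDDEN); row i is filled with the value i so rows are easy to trace.
flat = [[float(i)] * HIDDEN for i in range(sum(tokens_per_expert))]

# Form 2: (4, 8, HIDDEN) plus masked_m giving the valid row count per expert.
packed = pack_for_masked_layout(flat, tokens_per_expert, max_m=8, hidden=HIDDEN)
masked_m = list(tokens_per_expert)
```

With real tensors the same idea would be one copy per expert into a preallocated padded buffer. Whether this copy is needed at all depends on the layout DeepEP actually returns.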