How to use the output of DeepEP as the input of DeepGemm #57

Open
@hpz4311

1. Suppose device 0 hosts 4 experts, and the number of tokens assigned to each expert is 1, 2, 3, and 7, respectively.
   I think that after DeepEP, the activations received by device 0 have shape (13, 7168), since 1 + 2 + 3 + 7 = 13.

2. In this case, calling m_grouped_gemm_fp8_fp8_bf16_nt_masked requires the activations to be arranged with shape (4, 8, 7168), with masked_m=[1,2,3,7].

So, if we want to use the output of DeepEP as the input, do we need to manually rearrange the data from form 1 to form 2?
I have only run the DeepGemm demo, so I am not sure whether I have misunderstood the layout of DeepEP's output.
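
To make the question concrete, here is a minimal sketch of the manual rearrangement from form 1 to form 2. It uses NumPy only to illustrate the indexing. It assumes the flat (13, 7168) activations are already sorted by expert, which is an assumption about the DeepEP output and not something confirmed here. `pack_for_masked_gemm` and `max_m` are hypothetical names, and FP8 casting and scaling factors are left out.

```python
import numpy as np

def pack_for_masked_gemm(x, tokens_per_expert, max_m):
    """Scatter a flat (total_tokens, hidden) array into a zero-padded
    (num_experts, max_m, hidden) array and return it with masked_m.

    Assumption: rows of x are grouped contiguously by expert, in expert order.
    """
    counts = np.asarray(tokens_per_expert, dtype=np.int64)
    assert counts.sum() == x.shape[0], "token counts must cover every row"
    assert counts.max() <= max_m, "max_m must fit the largest expert"
    num_experts, hidden = counts.shape[0], x.shape[1]

    # Expert index of every row, e.g. [0, 1, 1, 2, 2, 2, 3, ...].
    expert_ids = np.repeat(np.arange(num_experts), counts)
    # Position of every row inside its expert's group.
    starts = np.cumsum(counts) - counts
    slot_ids = np.arange(x.shape[0]) - np.repeat(starts, counts)

    packed = np.zeros((num_experts, max_m, hidden), dtype=x.dtype)
    packed[expert_ids, slot_ids] = x  # rows beyond masked_m[e] stay zero
    return packed, counts

hidden = 7168
x = np.arange(13 * hidden, dtype=np.float32).reshape(13, hidden)
packed, masked_m = pack_for_masked_gemm(x, [1, 2, 3, 7], max_m=8)
print(packed.shape, masked_m.tolist())
```

Here expert 3 occupies `packed[3, :7]` and the remaining padding rows are zeros that the mask would tell the kernel to skip. Whether this copy is actually needed, or DeepEP can already produce the padded layout, is exactly what I am asking.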
