-
Notifications
You must be signed in to change notification settings - Fork 61
update the moe to support arch < 90 #123
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@@ -162,10 +167,10 @@ def forward(self): | |||
|
|||
# MLP layer 1 (GEMM1 and combine) | |||
mlp_output = self.flux_rs_op.forward_gather_rs( | |||
input=self.ctx.intermediate_output, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
revert this part? @burness
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@wenlei-bao sorry, I debug some code and forget revert it. e6f6b3b
examples/moe.py
Outdated
@@ -165,7 +170,7 @@ def forward(self): | |||
input=self.ctx.intermediate_output, | |||
weight=self.ctx.weight1, | |||
splits_cpu=self.ctx.splits_cpu, | |||
routing_idx=self.ctx.scatter_index.view(-1), | |||
rounting_idx=self.ctx.scatter_index.view(-1), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess here is a typo
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, it's a typo. But when I review the code, I found the flux.GemmGroupedV2GatherRSOp
and GemmGroupedV3GatherRS
, the 4-th arg is scatter_idx
. So the first commit, I del the args_name to adopt it like the ut case here:
Maybe, the routing_idx
should be scatter_idx
?
Sorry about that I have no resource to debug it now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@houqi I think the code example in moe_usage.md
should be right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think v2 and v3 use different names for this field @houqi ?
As Describe in #121, update the examples/moe.py to support arch < 90