Skip to content

Aknezevic/rs accum#7016

Open
AleksKnezevic wants to merge 3 commits intomainfrom
aknezevic/rs_accum
Open

Aknezevic/rs accum#7016
AleksKnezevic wants to merge 3 commits intomainfrom
aknezevic/rs_accum

Conversation

@AleksKnezevic
Copy link
Contributor

Ticket

Link to Github Issue

Problem description

reduce scatter was producing bad results compared to all_gather followed by local reduce. found out it's due to the compute config. Update RS to always use fp_32 accumulation (like local reduce does by default)

What's changed

add WA for reduce scatter to use fp32 accum, pipe it through to runtime. I'll remove change in metal repo once my branch there gets merged and uplifted.

Checklist

  • New/Existing tests provide coverage for changes

Comment on lines +14 to +17
// Skip if compute config is already set
if (srcOp.getComputeConfigAttr()) {
return failure();
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't the exit condition be HiFi4?

Comment on lines +32 to +39
auto newOp = rewriter.create<ReduceScatterOp>(
srcOp.getLoc(), srcOp.getResult().getType(), srcOp.getInput(),
srcOp.getReduceTypeAttr(), srcOp.getScatterDimAttr(),
srcOp.getClusterAxisAttr(), srcOp.getSubDeviceIdAttr(),
srcOp.getMemoryConfigAttr(), srcOp.getNumLinksAttr(),
srcOp.getTopologyAttr(), computeConfig);

rewriter.replaceOp(srcOp, newOp.getResult());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: create followed by replaceOp can be replaced with replaceOpWithNewOp

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants