Z: Implement integral vreductionadd evaluator #7982
base: master
@r30shah @hzongaro please take a look at this PR at your convenience.
ceb5978 to 7a238ac
7a238ac to 8a545de
I think this looks good. Just a couple of minor suggestions regarding comments.
Add initial implementation for the integral vreductionadd opcode on the IBM Z platform. Note that vreductionadd is not enabled in this commit, as full support requires floating-point handling, which is not yet implemented. Support for floating-point reduction will be introduced in a subsequent PR, after which the opcode can be fully enabled. signed-off-by: Ehsan Kiani Far <[email protected]>
8a545de to 6402720
I think this looks good. I'll wait until @r30shah has had a chance to review before merging.
generateVRIaInstruction(cg, TR::InstOpCode::VGBM, node, scratchReg, 0, 0);
if (needPreReduction) {
    // We can not sum all lanes in one operation when the lane size is byte or halfword.
    // Calculating the sum of byte or halfword into an intermediate word so we can add all word in the next step.
@ehsankianifar - How does this code handle overflow for byte and halfword elements? VSUM would zero-extend the intermediate sum and place it in the word, but VSUMQ would not do that, right?
Does the reduction opcode double the element type? If not, I believe VSUMQ can produce a result that is larger than the element size.
I confirmed with @gita-omr that we do not need to handle overflow in this opcode.
Ok - I would like that to be in a comment here, or at least in the documentation.
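To make the overflow question above concrete, here is a minimal scalar sketch of the two-step reduction for byte lanes. It is not the evaluator code from this PR: `ByteVector`, `sumBytesIntoWords`, `sumWords`, `reduceAddBytes`, and `demo` are hypothetical names, the two steps only loosely model VSUM and VSUMQ with a zeroed third operand, and the final narrowing back to the lane type is an assumption that the thread does not state.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical scalar model of a 128-bit vector holding sixteen byte lanes.
using ByteVector = std::array<std::uint8_t, 16>;

// Step 1 (loosely modelled on VSUM with a zeroed third operand):
// sum each group of four bytes into a zero-extended 32-bit word.
std::array<std::uint32_t, 4> sumBytesIntoWords(const ByteVector &v)
{
    std::array<std::uint32_t, 4> words{};
    for (std::size_t i = 0; i < v.size(); ++i)
        words[i / 4] += v[i];
    return words;
}

// Step 2 (loosely modelled on VSUMQ): add the four words into one wide total.
std::uint64_t sumWords(const std::array<std::uint32_t, 4> &words)
{
    std::uint64_t total = 0;
    for (std::uint32_t w : words)
        total += w;
    return total;
}

// Assumption: the reduction result keeps the lane type, so the high bits
// of the wide total are simply dropped (wrap-around, no overflow handling).
std::uint8_t reduceAddBytes(const ByteVector &v)
{
    return static_cast<std::uint8_t>(sumWords(sumBytesIntoWords(v)));
}

// Small demonstration: sixteen lanes of 100 sum to 1600, which does not
// fit in a byte; the narrowed result is 1600 mod 256 = 64.
void demo()
{
    ByteVector v;
    v.fill(100);
    assert(sumWords(sumBytesIntoWords(v)) == 1600);
    assert(reduceAddBytes(v) == 64);
}
```

Under these assumptions, the wide total can exceed the element size, as the review comment points out, and the visible result simply wraps around. That is one reading of "we do not need to handle overflow"; the actual behaviour should be confirmed against the opcode's documentation.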
Add implementation for the integral vreductionadd opcode on the IBM Z platform. vreductionadd is not enabled in this PR since it also requires float support. The float reduction add implementation will be added in phase 2.