midronij
Contributor

@midronij midronij commented Aug 25, 2025

Implement PPC codegen for s2m (Short to Mask) on P8+. This operation takes two byte elements of a given boolean array (read from memory with a single halfword load) and converts them into a two-element LongVector mask with the corresponding boolean values.
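
To illustrate the intended semantics, here is a minimal scalar sketch of what the description above says s2m computes. It is a reference model only, not the codegen in this PR. The function names are hypothetical, and two points are assumptions not stated in the description: that a true lane is represented as all ones (and a false lane as all zeros), and that lane 0 corresponds to the byte at the lower address.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical scalar reference model of s2m (not the PR's implementation).
// Assumption: a "true" lane is all ones and a "false" lane is all zeros.
// Assumption: lane 0 corresponds to the boolean byte at the lower address.
std::array<uint64_t, 2> s2mModel(const uint8_t bools[2])
{
    std::array<uint64_t, 2> mask{};
    for (int i = 0; i < 2; ++i)
        mask[i] = (bools[i] != 0) ? ~uint64_t{0} : uint64_t{0};
    return mask;
}

// Small self-check covering the mixed true/false case.
inline void checkS2mModel()
{
    const uint8_t input[2] = {1, 0};
    const std::array<uint64_t, 2> mask = s2mModel(input);
    assert(mask[0] == ~uint64_t{0});
    assert(mask[1] == 0);
}
```

A model like this could serve as the expected-output oracle when testing the vectorized sequence, provided the lane order and mask representation are adjusted to match the actual definition of the operation.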

@midronij midronij force-pushed the s2m branch 2 times, most recently from 08f39f7 to b3884df Compare August 26, 2025 17:19
@midronij midronij changed the title WIP: PPC Codegen for vectorized Short to Mask operation Implement vectorized s2m on PPC Aug 26, 2025
@midronij
Contributor Author

@gita-omr @zl-wang could you please review when you have a chance?

@midronij midronij force-pushed the s2m branch 3 times, most recently from 0d7be95 to df618e2 Compare August 26, 2025 17:37
@zl-wang
Contributor

zl-wang commented Aug 27, 2025

You can describe more clearly what needs to be done: the expected input and expected output, and whether the conversion is bit-based, byte-based, or something else. Otherwise, it can only be reviewed by speculating about the semantics.

@midronij midronij force-pushed the s2m branch 2 times, most recently from a7a8726 to cc68f2b Compare September 9, 2025 19:56
@midronij midronij changed the title Implement vectorized s2m on PPC WIP: Implement vectorized s2m on PPC Sep 12, 2025
@midronij midronij force-pushed the s2m branch 2 times, most recently from 2ef9b4c to 5a30c08 Compare September 23, 2025 18:31
@midronij midronij changed the title WIP: Implement vectorized s2m on PPC Implement vectorized s2m on PPC Sep 23, 2025
@midronij midronij force-pushed the s2m branch 3 times, most recently from 73a8b65 to e61c7c1 Compare September 24, 2025 18:48
@midronij midronij force-pushed the s2m branch 6 times, most recently from 8e57923 to 96cfe5b Compare October 7, 2025 17:24
@midronij midronij force-pushed the s2m branch 3 times, most recently from 8901887 to 8a4209a Compare October 8, 2025 21:32
generateTrg1Src1Imm2Instruction(cg, TR::InstOpCode::rldimi, node, tmpGPR, srcReg, 32, 0x000000FF00000000);
} else {
generateTrg1Src1Imm2Instruction(cg, TR::InstOpCode::rldicr, node, tmpGPR, srcReg, 24, 0xFFFFFFFF00000000);
generateTrg1Src1Imm2Instruction(cg, TR::InstOpCode::rlwimi, node, tmpGPR, srcReg, 0, 0x000000FF);
Contributor

rldimi is more appropriate here, since rlwimi clears the top 32 bits automatically.
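
As a reading aid for the rotate-and-insert calls quoted above, the following is a minimal scalar sketch of the generic "rotate left, then insert under a mask" pattern on a 64-bit register. The helper names are hypothetical, and the model simply takes a shift amount and a 64-bit mask in the form they appear in the quoted `rldimi` call. It does not model instruction encodings or how the 32-bit forms treat the upper word, which is the point under review here.

```cpp
#include <cassert>
#include <cstdint>

// Rotate a 64-bit value left by n bits (n is taken modulo 64).
constexpr uint64_t rotl64(uint64_t value, unsigned n)
{
    return (n & 63) == 0 ? value
                         : (value << (n & 63)) | (value >> (64 - (n & 63)));
}

// Hypothetical model of a 64-bit rotate-and-insert:
// bits selected by mask come from the rotated source,
// all other bits of target are preserved.
constexpr uint64_t rotateInsert64(uint64_t target, uint64_t src,
                                  unsigned shift, uint64_t mask)
{
    return (rotl64(src, shift) & mask) | (target & ~mask);
}

// Self-check using the shift and mask from the quoted rldimi call.
inline void checkRotateInsert64()
{
    // The low byte of src (0x01) lands in bits 32..39 of the result.
    assert(rotateInsert64(0, 0x0101, 32, 0x000000FF00000000ull)
           == 0x0000000100000000ull);
}
```

With a shift of 32 and mask `0x000000FF00000000`, the model moves the low byte of the source into the low byte of the upper word while leaving every other bit of the target unchanged.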

@midronij midronij changed the title Implement vectorized s2m on PPC WIP: Implement vectorized s2m on PPC Oct 16, 2025