Z: Implement vector bit compress and expand evaluators#8276

Open
ehsankianifar wants to merge 1 commit into eclipse-omr:master from ehsankianifar:Z_bitCompressAndExpand

@ehsankianifar

Contributor

Implement vcompressbitsEvaluator, vmcompressbitsEvaluator, vexpandbitsEvaluator, and vmexpandbitsEvaluator on the IBM Z platform.

@ehsankianifar

Contributor Author

@r30shah Could you please review the changes in this PR? Thanks.

Comment thread compiler/z/codegen/OMRTreeEvaluator.cpp Outdated
{
return TR::TreeEvaluator::unImpOpEvaluator(node, cg);
/**
* Vector Compress Bits Evaluator

Contributor

@ehsankianifar - Can we add Doxygen-style comments for this evaluator so that the details you have here (lines 2260-2267) can go there? I may have missed recommending this in other PRs; it only just occurred to me. That would be the ideal place for them.

Contributor Author

Thanks, I moved the comments into the Doxygen documentation.

@ehsankianifar ehsankianifar force-pushed the Z_bitCompressAndExpand branch from 3f23e7a to 715f13b Compare June 8, 2026 15:11
@ehsankianifar

Contributor Author

To explore using existing GPR instructions to accelerate this operation on newer hardware (z17+), I created issue #8283.

@r30shah r30shah left a comment

Contributor

@ehsankianifar - Some nitpicks.

* \brief
* Compresses the lane-wise values of the source vector based on a bit mask.
*
* \details

Contributor

I think we can still improve the readability of this comment. The opcode properties clearly state that the operation is equivalent to Java's Integer/Long.compress, so I would simplify the comment accordingly. The use of MSB here threw me off: I took it to be something the opcode specifically requires. We could include an example of how the operation looks on a single element of a lane/mask pair.

Contributor Author

done
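
For readers following this thread, the single-lane behaviour the reviewer refers to can be sketched as a scalar reference. This is only an illustration of the Integer.compress semantics for one 32-bit lane, not the code in this PR; the function name compressBits is hypothetical.

```cpp
#include <cstdint>

// Scalar reference for one 32-bit lane (hypothetical helper, not from the PR).
// For every set bit in mask, taken from least to most significant, the
// corresponding bit of value is packed into the next free low-order bit
// of the result. All remaining high-order result bits are zero.
uint32_t compressBits(uint32_t value, uint32_t mask)
{
    uint32_t result = 0;
    uint32_t outBit = 0; // next free position in the result
    for (uint32_t i = 0; i < 32; ++i) {
        if ((mask >> i) & 1u) {
            result |= ((value >> i) & 1u) << outBit;
            ++outBit;
        }
    }
    return result;
}
```

With value 0xCAFEBABE and mask 0xFF00FFF0, the selected nibbles C, A, B, A, B are packed to the right, giving 0x000CABAB.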

Comment thread compiler/z/codegen/OMRTreeEvaluator.cpp Outdated
// set.
generateVRSaInstruction(cg, TR::InstOpCode::VERLL, node, sourceReg, sourceReg, generateS390MemoryReference(1, cg),
elementSizeMask);
generateVRReInstruction(cg, TR::InstOpCode::VSEL, node, resultReg, sourceReg, resultReg, scratchReg, 0, 0);

Contributor

Add comment -
// Step 3: Conditionally copy source MSB to result if mask bit is set
// VSEL: result = (scratchReg[bit0] == 1) ? workingSource : resultReg

Contributor Author

done

Comment thread compiler/z/codegen/OMRTreeEvaluator.cpp Outdated
TR_ASSERT_FATAL_WITH_NODE(node, node->getDataType().getVectorLength() == TR::VectorLength128,
"Only 128-bit vectors are supported %s", node->getDataType().toString());
const uint8_t elementSizeMask = static_cast<uint8_t>(getVectorElementSizeMask(node));
uint32_t elementBitNum = getVectorElementSize(node) * 8;

Contributor

We never modify this variable, so can we make it const?

Also, I would define const uint32_t msbPosition = elementBitNum - 1; and use it in the instruction, which makes the code easier to read.

Contributor Author

done

cg->evaluate(maskChild), 0, 0);
cg->decReferenceCount(maskChild);
}

Contributor

TBH, this is a complex implementation, but maybe it is the only way we can achieve this operation? The number of loop iterations is controlled by the element size in bits, so in the worst case it executes 64 iterations for Long, and each iteration performs 5 vector operations. I understand that we want this operation to work on all platforms, but I really think you should consider using the bit extract and deposit instructions where they are available.

Also

Contributor Author

I am already working on that, but I would prefer to keep it in a separate PR so I can test it properly on the new hardware.
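
The cost the reviewer describes (one loop iteration per bit of the element) can be illustrated with a scalar model of a bit-serial compress. This is a sketch under assumptions: it walks the lane from its most significant bit downward, which matches the MSB wording in the review but is not confirmed to be the exact instruction sequence in this PR. The name compressBitSerial is hypothetical; msbPosition follows the reviewer's naming suggestion.

```cpp
#include <cstdint>

// Hypothetical scalar model of a bit-serial compress over a single lane.
// elementBitNum is the lane width in bits (8, 16, 32, or 64), and it also
// fixes the iteration count: a 64-bit lane always costs 64 iterations.
uint64_t compressBitSerial(uint64_t value, uint64_t mask, uint32_t elementBitNum)
{
    const uint32_t msbPosition = elementBitNum - 1;
    uint64_t result = 0;
    for (uint32_t i = 0; i < elementBitNum; ++i) {
        // Inspect the bit currently sitting at the lane's MSB position.
        const uint64_t maskMsb = (mask >> msbPosition) & 1u;
        const uint64_t valueMsb = (value >> msbPosition) & 1u;
        if (maskMsb) {
            // Append the selected source bit at the low end of the result.
            result = (result << 1) | valueMsb;
        }
        // Move the next lower bit of both operands into the MSB position.
        mask <<= 1;
        value <<= 1;
    }
    return result;
}
```

Because the iteration count depends only on the lane width and not on the mask contents, a hardware bit extract instruction, where available, would replace the whole loop with a single operation per element.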

* Expands the lane-wise values of the source vector based on a bit mask.
*
* \details
* The bits from the source are distributed to positions where the mask has set bits.

Contributor

As with compressbits, the comment for expand should be simplified.

Contributor Author

done
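
As with compress, the single-lane behaviour of expand can be sketched as a scalar reference. This only illustrates the Integer.expand semantics for one 32-bit lane and is not the code in this PR; the function name expandBits is hypothetical.

```cpp
#include <cstdint>

// Scalar reference for one 32-bit lane (hypothetical helper, not from the PR).
// The low-order bits of value are consumed one at a time and deposited at
// the positions where mask has a set bit, from least to most significant.
// Result bits where mask is clear stay zero.
uint32_t expandBits(uint32_t value, uint32_t mask)
{
    uint32_t result = 0;
    uint32_t inBit = 0; // next low-order bit of value to consume
    for (uint32_t i = 0; i < 32; ++i) {
        if ((mask >> i) & 1u) {
            result |= ((value >> inBit) & 1u) << i;
            ++inBit;
        }
    }
    return result;
}
```

With value 0x000CABAB and mask 0xFF00FFF0, the packed nibbles are spread back out to the masked positions, giving 0xCA00BAB0.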

* \return
* TR::Register with the compressed values.
*/
TR::Register *OMR::Z::TreeEvaluator::vexpandbitsEvaluator(TR::Node *node, TR::CodeGenerator *cg)

Contributor

Same feedback as compressBits for comments.

@ehsankianifar ehsankianifar force-pushed the Z_bitCompressAndExpand branch 3 times, most recently from e86bf28 to b18646e Compare June 16, 2026 18:30
@ehsankianifar

Contributor Author

@r30shah I addressed all the comments. Could you please take another look? Thanks.

@r30shah r30shah left a comment

Contributor

Last nitpicks, but overall looks good.

Comment thread compiler/z/codegen/OMRTreeEvaluator.cpp Outdated
* Compresses the lane-wise values of the source vector based on a bit mask.
*
* \details
* Performs lanewise compression similar to integer/long.compress()

Contributor

Nitpick: Integer/Long.compress.

Contributor Author

done

Comment thread compiler/z/codegen/OMRTreeEvaluator.cpp Outdated
* Expands the lane-wise values of the source vector based on a bit mask.
*
* \details
* Performs lanewise expansion similar to scalar integer/long.expand()

Contributor

Nit - Integer/Long.expand()

Contributor Author

done

Implement vcompressbitsEvaluator, vmcompressbitsEvaluator,
vexpandbitsEvaluator, and vmexpandbitsEvaluator on IBM Z Platform.

signed-off-by: Ehsan Kiani Far <ehsan.kianifar@gmail.com>
@ehsankianifar ehsankianifar force-pushed the Z_bitCompressAndExpand branch from b18646e to cc7c116 Compare June 18, 2026 16:54
@ehsankianifar

Contributor Author

Thanks @r30shah for your review. I addressed the last change requests. Let me know if it looks good to you.

@r30shah r30shah left a comment

Contributor

LGTM
