[HW] HWVectorization Part 3: Structural Patterns#9749
mafeguimaraes wants to merge 3 commits into llvm:main
022b687 to 5adfe4a
Hi @uenoku, I’ve just finished the implementation for Part 3 (Structural Patterns). When you have a moment, could you please take a look? Thanks for the help throughout this process!
    if (!visited.insert(val).second)
      return true;

    if (auto *op = val.getDefiningOp()) {
This is always non-null because isa<BlockArgument>(val)...
Suggested change:
    - if (auto *op = val.getDefiningOp()) {
    + auto *op = val.getDefiningOp();
    /// Determines if a shared value is safe for vectorization. Safe values
    /// include constants and block arguments, which act as shared control
    /// signals.
    bool isSafeSharedValue(mlir::Value val,
Could you explain when this returns false? As far as I can see, it always returns true.
You're right! The function always returned true because every value eventually traces back to a BlockArgument or ConstantOp. The recursive traversal was unnecessary: only constants and block arguments are safe to share between bit lanes, so I simplified the check to test the value itself. Fixed in the latest commit.
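For readers following the thread, here is a minimal sketch of what a non-recursive check along those lines could look like. It is not the code from the commit: `ToyValue`, `ValueKind`, and `isSafeSharedValueSketch` are hypothetical stand-ins that model an `mlir::Value` as a plain tag, so the example compiles without MLIR. The point it illustrates is only that a direct check can return false, which the recursive version could not.

```cpp
#include <cassert>

// Hypothetical stand-in for mlir::Value: only records how the value is produced.
enum class ValueKind { BlockArgument, Constant, OtherOp };

struct ToyValue {
  ValueKind kind;
};

// Assumed shape of the simplified check: inspect the value itself instead of
// recursing through its operands. A value produced by any other operation is
// rejected, so the function no longer returns true unconditionally.
bool isSafeSharedValueSketch(const ToyValue &val) {
  return val.kind == ValueKind::BlockArgument ||
         val.kind == ValueKind::Constant;
}
```

With this shape, a value defined by, say, an AND gate is not treated as a shared leaf and has to be matched structurally instead.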
    if (!slice0Val || !slice0Val.getDefiningOp())
      return false;

    Value slice1Val = findBitSource(output, 1);
    if (!slice1Val || !slice1Val.getDefiningOp())
      return false;
Suggested change:
    - if (!slice0Val || !slice0Val.getDefiningOp())
    -   return false;
    - Value slice1Val = findBitSource(output, 1);
    - if (!slice1Val || !slice1Val.getDefiningOp())
    -   return false;
    + if (!slice0Val)
    +   return false;
    + Value slice1Val = findBitSource(output, 1);
    + if (!slice1Val)
    +   return false;
5adfe4a to 3faf348
b179be0 to a771499
Context: This PR is the third part of a series of incremental patches for the HWVectorization pass. Building upon the bit-tracking infrastructure from Parts 1 and 2 (#9704 and #9739), this patch introduces Structural Vectorization: the ability to collapse N isomorphic scalar logic cones into a single wide operation.

Key Enhancements:
- [...] AND/OR/XOR/MUX gates into a single N-bit operation.
- [...] MUX selector an AND enable) are identified by areSubgraphsEquivalent as common leaves and passed directly to the wide operation, or broadcast via comb.replicate when used as data operands.
- vectorizeSubgraph recursively rebuilds the scalar logic tree into its vectorized equivalent, supporting arbitrary depth (e.g., (a[i] & b[i]) ^ c[i] across all bits becomes comb.and followed by comb.xor).
- [...] out_xor and out_and) are fully vectorized in a single pass.
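
The equivalence these enhancements rely on can be checked with a small toy model. The sketch below is plain C++, not CIRCT code: an 8-bit bus is modeled as a `uint8_t`, and every function name (`scalarCones`, `wideCone`, `scalarMux`, `wideMux`, `replicate8`, `scalarEnable`, `wideEnable`) is hypothetical. It compares the per-bit form with the wide form for the `(a[i] & b[i]) ^ c[i]` example, for a mux with a shared selector, and for a shared bit used as a data operand (modeled here as a replicate followed by a wide AND).

```cpp
#include <cassert>
#include <cstdint>

// Scalar form: eight independent 1-bit cones, out[i] = (a[i] & b[i]) ^ c[i].
uint8_t scalarCones(uint8_t a, uint8_t b, uint8_t c) {
  uint8_t out = 0;
  for (unsigned i = 0; i < 8; ++i) {
    unsigned ai = (a >> i) & 1u, bi = (b >> i) & 1u, ci = (c >> i) & 1u;
    out |= static_cast<uint8_t>(((ai & bi) ^ ci) << i);
  }
  return out;
}

// Vectorized form: one wide AND followed by one wide XOR.
uint8_t wideCone(uint8_t a, uint8_t b, uint8_t c) {
  return static_cast<uint8_t>((a & b) ^ c);
}

// Scalar form: every lane muxes on the same 1-bit selector.
uint8_t scalarMux(bool sel, uint8_t a, uint8_t b) {
  uint8_t out = 0;
  for (unsigned i = 0; i < 8; ++i) {
    unsigned bit = sel ? (a >> i) & 1u : (b >> i) & 1u;
    out |= static_cast<uint8_t>(bit << i);
  }
  return out;
}

// Vectorized form: the shared selector is passed directly to one wide mux.
uint8_t wideMux(bool sel, uint8_t a, uint8_t b) { return sel ? a : b; }

// Toy analogue of replicating a 1-bit value across all 8 lanes.
uint8_t replicate8(bool bit) { return bit ? 0xFF : 0x00; }

// Scalar form: every lane ANDs its data bit with the same enable bit.
uint8_t scalarEnable(bool en, uint8_t a) {
  uint8_t out = 0;
  for (unsigned i = 0; i < 8; ++i) {
    unsigned bit = (en ? 1u : 0u) & ((a >> i) & 1u);
    out |= static_cast<uint8_t>(bit << i);
  }
  return out;
}

// Vectorized form: the enable is used as data, so it is broadcast first.
uint8_t wideEnable(bool en, uint8_t a) {
  return static_cast<uint8_t>(replicate8(en) & a);
}
```

For example, `scalarCones(0xCA, 0xAC, 0x0F)` and `wideCone(0xCA, 0xAC, 0x0F)` both evaluate to `0x87`. The model says nothing about how the pass proves the lanes are isomorphic; it only shows why replacing them with the wide operations preserves the result.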