[MicroBenchmarks] Add benchmark for control-flow-vectorization. #345
ElvisWang123 wants to merge 2 commits into llvm:main
0f922e8 to 800ee3d
Benchmarks comparing builds with and without auto-vectorization on for loops whose bodies contain conditional code (control flow).
800ee3d to b224f56
DEF_COND_INC_LOOP(cond_inc_stride_128, 128)

// Conditional increment by value (sparse condition).
DEF_COND_INC_VALUE_LOOP(cond_inc_by_value, 42)
What percentage of lanes is going to be active here? Is it really worth tracking?
Removed the small-stride cases to focus on larger strides.
With larger strides, most vector blocks have entirely inactive lanes, which helps test conditional vector block optimizations (control-flow vectorization) across different targets.
// Define conditional increment loop with given stride.
#define DEF_COND_INC_LOOP(name, stride) \
  template <typename T> \
  __attribute__((noinline)) static void run_##name##_autovec(T *A, \
What is the purpose of this benchmark? Is it just to track the current state of control-flow vectorization (novec vs. autovec), or to help identify a better LMUL for vectorizing the loop? If the latter, it would make sense to add similar functions with forced vectorization for the default LMUL and for specified LMULs.
IIUC, this benchmark serves as a test suite for other targets to measure the performance impact of enabling control-flow vectorization.
I've updated the PR description to make this clearer.
Included benchmarks with conditional loops to trigger control-flow vectorization.
These cases can be used for measuring the performance impact of control-flow vectorization across targets.