[MicroBenchmarks] Add benchmark for control-flow-vectorization. #345

Open
ElvisWang123 wants to merge 2 commits into llvm:main from ElvisWang123:control-flow-vectorization

ElvisWang123 commented Feb 23, 2026

This PR adds benchmarks whose loops contain conditional code, so that they trigger control-flow vectorization.
These cases can be used to measure the performance impact of control-flow vectorization across targets.
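
For readers without the diff open, here is a minimal sketch of the kind of kernel pair the description refers to. The condition `i % stride == 0`, the function names, and the use of `#pragma clang loop vectorize(disable)` for the baseline are assumptions made for illustration; they are not taken from the code in this PR.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical baseline: the loop vectorizer is switched off for this loop
// via a Clang loop hint (other compilers ignore the pragma).
template <typename T>
__attribute__((noinline)) static void cond_inc_novec(T *A, std::size_t n,
                                                     std::size_t stride) {
  assert(stride != 0); // modulo by zero would be undefined
#pragma clang loop vectorize(disable) interleave(disable)
  for (std::size_t i = 0; i < n; ++i)
    if (i % stride == 0) // assumed condition: true once every `stride` iterations
      A[i] += 1;
}

// Hypothetical autovec variant: identical body, with the vectorization
// decision left to the compiler.
template <typename T>
__attribute__((noinline)) static void cond_inc_autovec(T *A, std::size_t n,
                                                       std::size_t stride) {
  assert(stride != 0);
  for (std::size_t i = 0; i < n; ++i)
    if (i % stride == 0)
      A[i] += 1;
}
```

Both functions must produce the same array contents; only the generated code, and therefore the timing, is expected to differ.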

ElvisWang123 force-pushed the control-flow-vectorization branch from 0f922e8 to 800ee3d on February 24, 2026 05:50
Benchmarks comparing runs with and without autovec on for loops whose
bodies contain conditional code.
DEF_COND_INC_LOOP(cond_inc_stride_128, 128)

// Conditional increment by value (sparse condition).
DEF_COND_INC_VALUE_LOOP(cond_inc_by_value, 42)

What percentage of lanes will be active here? Is this case really worth tracking?

Author

I removed the small-stride cases to focus on larger strides.
Larger strides leave the lanes entirely inactive in most cases, which helps test conditional vector block optimizations (control-flow vectorization) across different targets.
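
The active-lane question can be made concrete with a small counting sketch. It assumes the loop condition is `i % stride == 0` and that the loop is split into blocks of a fixed vectorization factor `vf`; neither is confirmed by the PR, and `LaneStats` and `lane_stats` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>

struct LaneStats {
  std::size_t active_lanes; // active iterations within the full blocks
  std::size_t total_blocks; // number of full blocks of `vf` iterations
  std::size_t idle_blocks;  // full blocks in which no lane is active
};

// Counts active lanes and fully inactive blocks for the assumed condition
// `i % stride == 0`, over the first (n / vf) * vf iterations.
inline LaneStats lane_stats(std::size_t n, std::size_t stride, std::size_t vf) {
  assert(stride != 0 && vf != 0);
  LaneStats s{0, n / vf, 0};
  for (std::size_t b = 0; b < s.total_blocks; ++b) {
    std::size_t active_in_block = 0;
    for (std::size_t lane = 0; lane < vf; ++lane)
      if ((b * vf + lane) % stride == 0)
        ++active_in_block;
    s.active_lanes += active_in_block;
    if (active_in_block == 0)
      ++s.idle_blocks;
  }
  return s;
}
```

Under these assumptions, `lane_stats(1024, 128, 8)` reports 8 active lanes and 120 of 128 blocks with no active lane, whereas a stride of 2 leaves no block idle.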

// Define conditional increment loop with given stride.
#define DEF_COND_INC_LOOP(name, stride) \
template <typename T> \
__attribute__((noinline)) static void run_##name##_autovec(T *A, \

What is the purpose of this benchmark? Is it just to track the current state of control-flow vectorization (novec vs. autovec), or to help identify a better LMUL for vectorizing the loop? If the latter, it would make sense to add similar functions with forced vectorization for the default LMUL and for specified LMULs.
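
One way the forced-vectorization variants suggested here could be sketched is with Clang loop hints. `#pragma clang loop vectorize(enable)` and `vectorize_width(N, scalable)` are existing Clang pragmas; the function names and the loop condition are hypothetical, and using a requested scalable width as a stand-in for a specific LMUL is an assumption, since the pragma asks for an element count rather than an LMUL.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical variant: ask the vectorizer to vectorize this loop and let it
// pick the width (the "default" case in the comment above).
template <typename T>
__attribute__((noinline)) static void
cond_inc_forced_default(T *A, std::size_t n, std::size_t stride) {
  assert(stride != 0);
#pragma clang loop vectorize(enable)
  for (std::size_t i = 0; i < n; ++i)
    if (i % stride == 0) // assumed condition, not taken from the PR
      A[i] += 1;
}

// Hypothetical variant: request a scalable width of 4 elements per vscale.
// Which LMUL this corresponds to depends on the element type and the target.
template <typename T>
__attribute__((noinline)) static void
cond_inc_forced_scalable4(T *A, std::size_t n, std::size_t stride) {
  assert(stride != 0);
#pragma clang loop vectorize_width(4, scalable)
  for (std::size_t i = 0; i < n; ++i)
    if (i % stride == 0)
      A[i] += 1;
}
```

The hints are requests rather than guarantees, so the generated code would still need to be inspected (for example with optimization remarks) before attributing a timing to a particular width.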

Author

IIUC, this benchmark serves as a test suite for other targets to measure the performance impact of enabling control-flow vectorization.

I've updated the PR description to make this clearer.
