optimize: avoid __builtin_ctz(0) in lowest_set_bit() #1608
    unsigned long bit;
    unsigned long index;

    /*
You can remove the test for the result being zero, but please don't remove the comment, as it still applies, as far as I know.
And if the parameter is changed to u_int, and the cast is removed, as per https://github.com/the-tcpdump-group/libpcap/pull/1608/files#r2678473653, the comment no longer applies.
optimize.c (outdated)
    struct block *b;

-   for (i = 0; i < opt_state->n_blocks; ++i)
+   for ( i = 0; i < opt_state->n_blocks; ++i)
Please don't add an extra space there.
     *
     * If handed zero, the results are platform- and compiler-dependent.
     * Keep it out of the light, don't give it any water, don't feed it
     * after midnight, and don't pass zero to it.
Chris Columbus, too.
As noted in the comment before all the declarations of However, any compiler worth its salt will figure out that it's being called with a guaranteed-not-to-be-zero value and optimize the test away, so this probably will cause the same code to be generated on any such compiler, so we might as well change it in case any other code ends up using it.
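The point about the compiler dropping the test can be illustrated with a minimal sketch. The names `lowest_set_bit_guarded` and `collect_set_bits` are hypothetical, not libpcap's, and the code assumes GCC or Clang for `__builtin_ctz` and a 32-bit `unsigned int`. Because the call sits inside `while (mask != 0)`, a compiler that inlines the helper can see that the zero test is redundant at that call site; whether it actually removes it depends on the compiler and its flags.

```c
/* Hypothetical helper: a zero-guarded wrapper around __builtin_ctz (GCC/Clang). */
static inline unsigned int
lowest_set_bit_guarded(unsigned int mask)
{
	if (mask == 0)
		return 0;	/* arbitrary but defined result for zero */
	return (unsigned int)__builtin_ctz(mask);
}

/*
 * Hypothetical caller: store the index of every set bit in mask into out[]
 * (room for 32 entries assumed) and return how many were stored.
 */
static unsigned int
collect_set_bits(unsigned int mask, unsigned int out[32])
{
	unsigned int n = 0;

	while (mask != 0) {
		/* mask is known to be non-zero here, so the guard is redundant */
		out[n++] = lowest_set_bit_guarded(mask);
		mask &= mask - 1;	/* clear the lowest set bit */
	}
	return n;
}
```

For a mask of `0x29` (binary 101001) the loop visits bit indices 0, 3, and 5, and it never calls the helper with zero.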
I’ve pushed updates addressing all review feedback: the original comment is restored, the redundant zero check is removed, and the formatting issue is fixed. Please let me know if there’s anything further to adjust.
optimize.c (outdated)
    {
        if (mask == 0)
            return 0;
        return (u_int)__builtin_ctz((unsigned int)mask);
mask is a u_int. Why do you cast it to unsigned int?
Good catch; the cast is unnecessary here. mask is already an unsigned integer type, and the zero case is handled before the builtin is called. I’ll drop the cast to keep the code clearer and consistent.
optimize.c (outdated)
    #endif

    static __forceinline u_int
    lowest_set_bit(int mask)
For consistency, make the argument a u_int, just as it is for the other version. The argument passed to it in the only call we currently make to it is a bpf_u_int32, which is the exact same type on all the platforms on which we run.
That would also mean that we don't need to cast mask to unsigned long - it will get promoted to unsigned long and no sign-extension will be done.
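A minimal sketch of the sign-extension point, under some assumptions: it widens to `unsigned long long` rather than `unsigned long` so the difference is visible even on platforms where `long` is no wider than `int`, it assumes a 32-bit `int`, and both function names are hypothetical.

```c
/* Hypothetical: widen a signed mask; a negative value is sign-extended. */
static unsigned long long
widen_signed(int mask)
{
	return (unsigned long long)mask;
}

/* Hypothetical: widen an unsigned mask; the upper bits are zero-filled. */
static unsigned long long
widen_unsigned(unsigned int mask)
{
	return mask;	/* implicit conversion, no cast needed */
}
```

With all 32 bits set, `widen_signed(-1)` yields a value with every upper bit set as well, while `widen_unsigned(0xFFFFFFFFu)` keeps the upper bits clear, which is why an unsigned parameter makes the explicit cast unnecessary.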
I’ve changed lowest_set_bit() to take a u_int argument for consistency with the other implementation and the current call site, and removed the unnecessary cast.
__builtin_ctz(0) is undefined behavior under GCC/Clang and may lead to
miscompilation if the compiler assumes the argument is non-zero.
Replace the macro implementation with a static inline helper that
explicitly handles the zero case before calling the builtin. Apply the
same defined behavior to the MSVC _BitScanForward() path for consistency.
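
A rough sketch of the approach this description outlines, assuming GCC or Clang. The names `LOWEST_SET_BIT_MACRO` and `lowest_set_bit_sketch`, and the choice to return 0 for a zero mask, are assumptions made for illustration. This is not the code that was merged: the review discussion above ended with the zero check being removed.

```c
/* Hypothetical macro form: passes its argument straight to the builtin, zero included. */
#define LOWEST_SET_BIT_MACRO(mask) ((unsigned int)__builtin_ctz(mask))

/* Hypothetical replacement: test for zero before calling the builtin. */
static inline unsigned int
lowest_set_bit_sketch(unsigned int mask)
{
	if (mask == 0)
		return 0;	/* assumed result for zero; indistinguishable from "bit 0 set" */
	return (unsigned int)__builtin_ctz(mask);
}
```

One design consequence of returning 0 for a zero mask is that callers cannot tell it apart from a mask whose lowest set bit is bit 0, so a caller that cares still has to test the mask itself.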