WIP: Use avx when available. #108
Added an x86-64 intrinsics implementation. Moved the implementation into a separate header. Added macro checks for x86; non-x86 targets use 128-bit SIMD at most. Changed the intrinsics macro definition to LIBUNICODE_USE_INTRINSICS.
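As a rough illustration of the macro checks described above, the widest usable vector size could be chosen at compile time along these lines. This is only a sketch: LIBUNICODE_USE_INTRINSICS is the macro named in this PR, but the `sketch` namespace, the `max_simd_bits` constant, and the exact architecture macros tested are assumptions made for the example, not the header's actual contents.

```cpp
#include <cstddef>

namespace sketch
{

// Hypothetical constant: the widest SIMD vector (in bits) the build may use.
#if defined(LIBUNICODE_USE_INTRINSICS) \
    && (defined(__x86_64__) || defined(_M_X64) || defined(__i386__) || defined(_M_IX86))
// x86 targets: wider vectors may be compiled in and selected at runtime.
constexpr std::size_t max_simd_bits = 512;
#elif defined(LIBUNICODE_USE_INTRINSICS)
// Non-x86 targets: capped at 128-bit SIMD, as stated in the description.
constexpr std::size_t max_simd_bits = 128;
#else
// Intrinsics disabled: scalar code only.
constexpr std::size_t max_simd_bits = 0;
#endif

} // namespace sketch
```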
Some comments. The actions are failing as well.
Thanks a lot. I checked that detection works on both Linux and Windows. The last thing is to fix the following warnings:
warning: no previous declaration for ‘void cpuid(int32_t*, int32_t, int32_t)’ [-Wmissing-declarations]
29 | void cpuid(int32_t out[4], int32_t eax, int32_t ecx)
| ^~~~~
warning: no previous declaration for ‘uint64_t xgetbv(unsigned int)’ [-Wmissing-declarations]
33 | uint64_t xgetbv(unsigned int index)
| ^~~~~~
warning: no previous declaration for ‘bool detect_os_avx()’ [-Wmissing-declarations]
45 | auto detect_os_avx() -> bool
| ^~~~~~~~~~~~~
warning: no previous declaration for ‘bool detect_os_avx512()’ [-Wmissing-declarations]
64 | auto detect_os_avx512() -> bool
| ^~~~~~~~~~~~~~~~
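One common way to address -Wmissing-declarations for helpers that are only used inside a single translation unit is to give them internal linkage, for example with an anonymous namespace (marking them `static` has the same effect). The sketch below reuses the function names from the warnings, but the bodies are assumptions written for GCC/Clang only (`<cpuid.h>` and inline assembly) and are not the PR's actual code. On non-x86 targets the helpers simply report no support.

```cpp
#include <cstdint>

#if defined(__x86_64__) || defined(__i386__)
    #include <cpuid.h> // GCC/Clang only; MSVC would use <intrin.h> instead
    #define SKETCH_X86 1
#endif

namespace // internal linkage: no previous declaration is required
{

void cpuid(int32_t out[4], int32_t eax, int32_t ecx)
{
#if defined(SKETCH_X86)
    unsigned a = 0, b = 0, c = 0, d = 0;
    __cpuid_count(eax, ecx, a, b, c, d);
    out[0] = static_cast<int32_t>(a);
    out[1] = static_cast<int32_t>(b);
    out[2] = static_cast<int32_t>(c);
    out[3] = static_cast<int32_t>(d);
#else
    (void) eax;
    (void) ecx;
    out[0] = out[1] = out[2] = out[3] = 0;
#endif
}

uint64_t xgetbv(unsigned int index)
{
#if defined(SKETCH_X86)
    uint32_t lo = 0, hi = 0;
    __asm__ volatile("xgetbv" : "=a"(lo), "=d"(hi) : "c"(index));
    return (static_cast<uint64_t>(hi) << 32) | lo;
#else
    (void) index;
    return 0;
#endif
}

auto detect_os_avx() -> bool
{
    int32_t info[4] = {};
    cpuid(info, 1, 0);
    bool const osUsesXsave = (info[2] & (1 << 27)) != 0; // CPUID.1:ECX.OSXSAVE
    bool const cpuHasAvx = (info[2] & (1 << 28)) != 0;   // CPUID.1:ECX.AVX
    if (!(osUsesXsave && cpuHasAvx))
        return false; // xgetbv must not be executed without OSXSAVE
    // XCR0 bits 1 and 2: the OS saves and restores XMM and YMM state.
    return (xgetbv(0) & 0x6) == 0x6;
}

} // namespace
```

If the helpers need to be visible from other translation units, the alternative is to declare them in a header that the defining file includes.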
@Dooez many thanks for your contribution. We very much appreciate it. @Yaraslaut and I have just looked over it together, and overall it looks very well implemented. Many thanks again for that.
@Yaraslaut is going to add a commit on top of it to address some really minor things and fix the Windows platform issue. :)
Force-pushed from c8a6e56 to 1bf0902.
Added runtime detection of the instruction set, copied from Mysticial.
Added two additional scan_for_text_ascii implementations with bigger SIMD vector sizes, and compile flags to enable their compilation. -mavx512bitalg may be unnecessary for the 512-bit version.