add vpdpbusd avx512 intrinsic
#4776
Thank you for contributing to Miri! A reviewer will take a look at your PR, typically within a week or two.
Thanks for the PR!
I am slightly concerned about slowly growing a huge avx512 file that nobody has an overview of any more.^^ But as long as there's a clear motivation in the form of a core ecosystem crate, I hope that will naturally limit the scope of what we have to support.
Reminder, once the PR becomes ready for a review, use
@rustbot ready
src/shims/x86/avx512.rs
Outdated
/// Multiply groups of 4 adjacent pairs of unsigned 8-bit integers in `a` with corresponding signed
/// 8-bit integers in `b`, producing 4 intermediate signed 16-bit results. Sum these 4 results with
/// the corresponding 32-bit integer in `src`, and store the packed 32-bit results in `dst`.
Suggested change:
/// the corresponding 32-bit integer in `src`, and store the packed 32-bit results in `dst`.
/// the corresponding 32-bit integer in `src` (using wrapping arithmetic), and store the packed 32-bit results in `dst`.
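For readers following the doc comment, here is a minimal scalar sketch of the semantics it describes. It is not the code from this PR: the names `dpbusd_lane` and `dpbusd_512` are hypothetical, and the 512-bit vector is modelled as plain arrays rather than Miri's operand types.

```rust
/// Hypothetical scalar model of one 32-bit lane: four u8 * i8 products are
/// accumulated into `src` with wrapping arithmetic.
fn dpbusd_lane(src: i32, a: [u8; 4], b: [i8; 4]) -> i32 {
    let mut acc = src;
    for i in 0..4 {
        // A u8 * i8 product lies in -32640..=32385, so it fits in an i16;
        // widening both operands to i32 keeps the multiplication exact.
        let product = i32::from(a[i]) * i32::from(b[i]);
        // Only the accumulation can overflow, and it wraps rather than saturates.
        acc = acc.wrapping_add(product);
    }
    acc
}

/// Hypothetical model of the full 512-bit operation: 16 independent lanes,
/// each consuming 4 adjacent bytes of `a` and `b`.
fn dpbusd_512(src: [i32; 16], a: [u8; 64], b: [i8; 64]) -> [i32; 16] {
    let mut dst = [0i32; 16];
    for lane in 0..16 {
        let lo = lane * 4;
        let a4 = [a[lo], a[lo + 1], a[lo + 2], a[lo + 3]];
        let b4 = [b[lo], b[lo + 1], b[lo + 2], b[lo + 3]];
        dst[lane] = dpbusd_lane(src[lane], a4, b4);
    }
    dst
}
```

With this model, a `src` lane of `i32::MAX` paired with a positive product wraps around to a negative value, which is the case the suggested "(using wrapping arithmetic)" wording documents.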
0,
7,
// Using values close to the minimum/maximum here makes it very likely that the final
// addition with the element from this array overflows. The addition should wrap.
What do you mean by "very likely"? You are fixing all the inputs here, there should be no probabilities.^^
This pairs up with elements in A and B, right? So you can have comments there indicating which ones this is paired up with.
This looks great, thanks! Please squash the commits. You can squash manually if there are multiple independent commits you want to preserve, or use @rustbot author
@rustbot ready
This intrinsic is useful for the adler32 checksum algorithm.
The test attempts to hit a bunch of overflow and truncation cases, and I've validated it on real hardware.