-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add CPUID for AvxVnniInt8 and AvxVnniInt16 #113956
base: main
Are you sure you want to change the base?
Conversation
6714249
to
141d643
Compare
@tannergooding This is first of the 2 PRs needed for AVX VNNI INT* API introduction #112586 |
{ NI_Illegal, NI_Illegal }, // AVXVNNIINT8 | ||
{ NI_Illegal, NI_Illegal }, // AVXVNNIINT8_V512 | ||
{ NI_Illegal, NI_Illegal }, // AVXVNNIINT16 | ||
{ NI_Illegal, NI_Illegal }, // AVXVNNIINT16_V512 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason we're not adding the APIs at the same time? They look like they should be generally table driven, so it should be a minimal change on top...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can do that too. For all other ISAs, we generally did CPUID and API introduction as separate PRs. Also, it becomes easier to run superpmi once the CPUID PR goes in. Let me know what you'd prefer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For all other ISAs, we generally did CPUID and API introduction as separate PRs.
For some of the others, like AVX10.2, we've done it incrementally because of the number of APIs and total work required.
That is, checking in the CPUID support first allowed a reduction of conflicts and parallelization of adding a large number of intrinsic APIs across several PRs.
In this case, there's only a very small number of APIs that are likely entirely table driven, so there's little to no risk of conflicts or additional churn.
Doing it all at once lets us build confidence the CPUID checks and end to end story is correct since it is self contained like that and since it allows adding the CPUID and other tests at the same time.
Also, it becomes easier to run superpmi once the CPUID PR goes in
There's not much need to run SPMI for net new intrinsics that nothing is using yet, we're going to get zero diffs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ohkay. Thanks for the review. I will switch this PR to add everything together and then update you.
@@ -102,6 +106,8 @@ instructionset64bit,X86 ,AVX10v1_V512 | |||
instructionset64bit,X86 ,AVX10v2 | |||
instructionset64bit,X86 ,AVX10v2_V512 | |||
instructionset64bit,X86 ,GFNI | |||
instructionset64bit,X86 ,AVXVNNIINT8 | |||
instructionset64bit,X86 ,AVXVNNIINT16 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These don't need the X64 ISA defined. The X64 nested classes that will exist in the managed API are only there to hide Avx2.X64
which would otherwise be inherited. JIT only needs to handle AvxVnni*.X64.IsSupported
as intrinsic, which will happen automatically (to always return false) since there will be no ISA mapping.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to always return false
AvxVnni*.X64.IsSupported
should return false on a 32-bit machine and on a 64-bit machine it should return the same result as AvxVnni*.IsSupported
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Understood. I will be removing the AvxVnniInt*.x64
Isas.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to always return false
AvxVnni*.X64.IsSupported
should return false on a 32-bit machine and on a 64-bit machine it should return the same result asAvxVnni*.IsSupported
hmm.. so to sum up, current structure should be kept as is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, sorry, my mistake. I was thinking about previous cases where an X64 ISA has been added when there was no corresponding managed class at all. As Tanner said, it will have different behavior on x86 vs x64, so we need an ISA to map to for IsSupported
This PR adds support for CPUID for
AVX-VNNI-INT8
&AVX-VNNI-INT16
ISAsDesign
The changes are made in a way to enable the 2 ISAs when
Avx10.2
is enabled orThis is w.r.t the discussions done in API proposal #112586
Testing
Note1: Emitter unit tests not ran since they are added and verified along with AVX10.2 PR #111209
Note2: Superpmi results are not accurate since we are adding a new CPUID and it leads to a new jiteeversionguid. Even after changing the jiteeversion manually, superpmi run shows errors and failures based on the old mch files which can be ignored.
Run JIT subtree with AVXVNNIINT* enabled / disabled
AVXVNNIINT* Enabled

AVXVNNIINT* disabled
