Hello,
The following GPFs:
let inout = [0; 256];
mlkem_ntt(&mut inout, &Q_DATA);
This is because vmovdqa is used on the first pointer, which says:
When the source or destination operand is a memory operand, the operand must be aligned on a 16 (EVEX.128)/32(EVEX.256)/64(EVEX.512)-byte boundary or a general-protection exception (#GP) will be generated.
So this pointer actually has a 32-byte alignment requirement, and in fact this is true of both pointers to this function. I suspect its common to most of the mlkem functions?
Not sure of the best route, here:
-
Just document it, it is not hard to arrange as a caller and I suspect the tests/test.c has the requisite knowledge. Probably warrants an update to SOUNDNESS.md in the "D1. Global execution-environment assumptions" section.
-
Move to unaligned loads/stores. I understand that these have parity performance nowadays, but I could be wrong.
-
Some sort of extended formalisation of these instructions that feeds into (1) or (2). Something to note here is that currently x86_VMOVDQA and x86_VMOVDQU have the same definition.
Hello,
The following GPFs:
This is because
vmovdqais used on the first pointer, which says:So this pointer actually has a 32-byte alignment requirement, and in fact this is true of both pointers to this function. I suspect its common to most of the mlkem functions?
Not sure of the best route, here:
Just document it, it is not hard to arrange as a caller and I suspect the tests/test.c has the requisite knowledge. Probably warrants an update to SOUNDNESS.md in the "D1. Global execution-environment assumptions" section.
Move to unaligned loads/stores. I understand that these have parity performance nowadays, but I could be wrong.
Some sort of extended formalisation of these instructions that feeds into (1) or (2). Something to note here is that currently
x86_VMOVDQAandx86_VMOVDQUhave the same definition.