There is a type bug in barrett_reduce in reduce.c:
int16_t barrett_reduce(int16_t a) {
  int16_t t;
  const int16_t v = ((1<<26) + KYBER_Q/2)/KYBER_Q;
  t = ((int32_t)v*a + (1<<25)) >> 26;
  t *= KYBER_Q;
  return a - t;
}
The type of the literal 1 is int, so on a platform where int is narrower than 32 bits (the standard only guarantees 16), (1 << 26) and (1 << 25) shift by at least the width of the type, which is undefined behavior. Indeed, compiling this with SDCC, both expressions become 0, which I don't think was intended. The fix is simple: replace both 1 with 1L so the constants have type long, which has at least 32 bits.
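For reference, a minimal sketch of the function with the proposed change applied. The value KYBER_Q = 3329 is an assumption made here so the snippet compiles on its own; in the real code it comes from the project's headers. Only the two shift constants differ from the original.

```c
#include <stdint.h>

/* Assumed value for illustration only; the real definition lives in the project's headers. */
#ifndef KYBER_Q
#define KYBER_Q 3329
#endif

int16_t barrett_reduce(int16_t a) {
  int16_t t;
  /* 1L has type long (at least 32 bits), so the shift by 26 is well defined
     even when int is only 16 bits wide. */
  const int16_t v = ((1L << 26) + KYBER_Q / 2) / KYBER_Q;

  /* Same reasoning for the rounding constant 1L << 25. */
  t = ((int32_t)v * a + (1L << 25)) >> 26;
  t *= KYBER_Q;
  return a - t;
}
```

With the assumed KYBER_Q, v evaluates to 20159, which still fits in int16_t, so the declaration of v does not need to change.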