Fix SGEMV on POWER8 by reverting to the non-vectorized earlier code #5125
This was probably an endianness issue, but I'm OK with the fix.
We can always improve on this later, but it seems pointless to leave broken code in place for the next release. I'm curious how the exact same code works on POWER9 and POWER10 ppc64le but not on POWER8 ppc64le. Does POWER8 implement vec_mergeh/vec_mergel differently?
Finding the exact cause will require some investigation. vec_mergeh/l should be implemented the same way on P8, P9, and P10.
Compiler options are identical except for the
Looks like the failure is related to inc_x and/or inc_y not being equal to one. It seems P8 isn't handling that correctly for SBGEMV without your latest fix. I updated the SBGEMV unit test to cover inc_x/inc_y = 2 last year, and I made sure it worked on P9 and P10 when I improved SBGEMV for those architectures.
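For readers unfamiliar with the increment arguments, here is a minimal sketch of what a non-unit inc_x/inc_y means for a non-transposed GEMV. It is not the OpenBLAS kernel: `ref_sgemv_n` is a hypothetical reference routine, and it assumes column-major storage and positive increments only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical scalar reference (not OpenBLAS code): y += alpha * A * x.
 * A is m-by-n, column-major, with leading dimension lda.
 * Logical element k of x lives at x[k * inc_x], and likewise for y.
 * Assumption: increments are positive (BLAS also allows negative ones). */
static void ref_sgemv_n(size_t m, size_t n, float alpha,
                        const float *a, size_t lda,
                        const float *x, size_t inc_x,
                        float *y, size_t inc_y)
{
    assert(inc_x > 0 && inc_y > 0 && lda >= m);
    for (size_t j = 0; j < n; j++) {
        /* Scale the j-th logical element of x once per column. */
        float xj = alpha * x[j * inc_x];
        for (size_t i = 0; i < m; i++) {
            /* Strided store: elements between the strides are untouched. */
            y[i * inc_y] += a[i + j * lda] * xj;
        }
    }
}
```

With inc_x = inc_y = 2, every other slot of x and y is skipped, so a kernel that silently assumes contiguous vectors reads or writes the wrong elements. A scalar routine like this can serve as the comparison baseline for a vectorized kernel.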
I think your change will fix that failure for P8 but I see it will probably fail for P7 and earlier. Instead of
it should be
P7 and earlier do not use these kernels; they fall back to the older gemv_(n/t).S
OK, after digging into it a bit, it looks like an epsilon difference on P8. I'm still trying to figure out why, but as I said before, your PR is probably the best approach.
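To illustrate what an "epsilon difference" check can look like, here is a small sketch of a tolerance comparison between a kernel result and a reference result. It is not the check used by the OpenBLAS test suite: `vec_close` and its `tol_eps` parameter are hypothetical, and the bound (a multiple of FLT_EPSILON scaled by the reference magnitude) is only one reasonable choice.

```c
#include <float.h>
#include <stddef.h>

/* Hypothetical helper (not from the OpenBLAS tests): returns 1 if every
 * got[i] is within tol_eps * FLT_EPSILON * max(1, |ref[i]|) of ref[i],
 * and 0 otherwise. A NaN in either input makes the check fail. */
static int vec_close(const float *got, const float *ref, size_t n,
                     float tol_eps)
{
    for (size_t i = 0; i < n; i++) {
        float diff = got[i] - ref[i];
        if (diff < 0.0f) diff = -diff;

        float mag = ref[i] < 0.0f ? -ref[i] : ref[i];
        float scale = mag > 1.0f ? mag : 1.0f;

        /* Written as !(a <= b) so that NaN fails instead of passing. */
        if (!(diff <= tol_eps * FLT_EPSILON * scale))
            return 0;
    }
    return 1;
}
```

A result that differs from the reference by one FLT_EPSILON near 1.0 passes with a tolerance of a few epsilons but fails a stricter bound, which is how a change in evaluation order or instruction selection can surface as a test failure on a single target.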
Maybe gcc doing something wild when
FTR, https://gcc.gnu.org/PR119234 (not a valid GCC bug report) has root-caused the issue.
Reverts part of PR #4880, as already discussed there earlier. Fixes #5122.