SYCL: fix index order #2488
base: develop
2094231
to
7d3c32b
Compare
I updated the PR because I had missed changing
5c30c29
to
9797f22
Compare
a36bd58
to
9797f22
Compare
@AuroraPerego could you try this PR on a CPU SYCL device? On an Intel GPU all tests pass, but for unknown reasons the CI fails when we execute the code on a CPU accelerator. The error is:
The fastest-moving index in SYCL's `nd_item` is the rightmost one, which matches alpaka's index order. In our code base we implemented it to match CUDA's index order, where the leftmost index is the fastest-moving one.
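To make the two conventions concrete, here is a minimal sketch in plain C++ (no SYCL or alpaka headers). The names `Vec3`, `linearizeRightmostFastest` and `linearizeLeftmostFastest` are hypothetical helpers for illustration only, not part of either API. It only shows how the same 3D index maps to different linear positions depending on which component is treated as the fast one.

```cpp
#include <array>
#include <cstddef>

// Hypothetical helper type for illustration; not alpaka's Vec or sycl::id.
using Vec3 = std::array<std::size_t, 3>;

// Rightmost component moves fastest (the SYCL nd_item / alpaka convention).
constexpr std::size_t linearizeRightmostFastest(Vec3 const& idx, Vec3 const& extent)
{
    return (idx[0] * extent[1] + idx[1]) * extent[2] + idx[2];
}

// Leftmost component moves fastest (the CUDA-style convention, x first).
constexpr std::size_t linearizeLeftmostFastest(Vec3 const& idx, Vec3 const& extent)
{
    return (idx[2] * extent[1] + idx[1]) * extent[0] + idx[0];
}

// A step of one in the rightmost component is a step of one in memory
// only under the rightmost-fastest convention.
static_assert(linearizeRightmostFastest(Vec3{0, 0, 1}, Vec3{2, 3, 4}) == 1, "rightmost is fast");
static_assert(linearizeLeftmostFastest(Vec3{0, 0, 1}, Vec3{2, 3, 4}) == 6, "rightmost is slow");
```

With an extent of `{2, 3, 4}`, the index `{0, 0, 1}` is neighbour 1 under the first convention and element 6 under the second, which is the kind of mismatch this PR is about.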
9797f22
to
e0c8877
Compare
I think a comment from @fwyzard may be a good starting point for the current problem seen on FPGA emulation and CPU: alpaka/include/alpaka/exec/Once.hpp, lines 33 to 38 in 040feb6
In the original alpaka code we permuted the indices twice: once before the kernel start, to calculate the grid size, and once within the kernel, where we permuted all SYCL indices back. If we linearized the permuted indices, the result could differ from the expected linear index. I am currently trying to find out whether the AI is hallucinating or the following is true.
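A small sketch of the concern about linearizing permuted indices, again in plain C++ and not taken from the alpaka sources. `Idx3`, `reversed` and `linearId` are hypothetical names, and the sketch assumes the permutation is a simple reversal of the components. It shows that the linear id computed from a permuted index and permuted extent differs from the one computed after permuting back.

```cpp
#include <array>
#include <cstddef>

// Hypothetical helper type for illustration; not alpaka or SYCL API.
using Idx3 = std::array<std::size_t, 3>;

// Assumed permutation: reverse the order of the components.
constexpr Idx3 reversed(Idx3 const& v)
{
    return Idx3{v[2], v[1], v[0]};
}

// Linear id with the rightmost component moving fastest.
constexpr std::size_t linearId(Idx3 const& idx, Idx3 const& extent)
{
    return (idx[0] * extent[1] + idx[1]) * extent[2] + idx[2];
}

constexpr Idx3 extent{2, 3, 4};
constexpr Idx3 idx{0, 0, 1};

// Linearizing while the index is still permuted gives a different id ...
static_assert(linearId(reversed(idx), reversed(extent)) == 6, "permuted view");
// ... than linearizing after both permutations have cancelled out.
static_assert(linearId(reversed(reversed(idx)), reversed(reversed(extent))) == 1, "restored view");
```

So as long as every linearization happens after the indices are permuted back, the N-dimensional values and the linear id stay consistent; only a linearization done on the permuted view would disagree.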
I have set this PR to draft and added debug output in the last commit.
ed7181a
to
4475f66
Compare
It could be that we are not allowed to run the alpaka/test/unit/warp/src/Any.cpp Lines 47 to 54 in 040feb6
We have already disabled the tests for
disable FBGA SYCL tests CI_FILTER: linux_icpx
4475f66
to
3a3484d
Compare
I strongly believe the AI is hallucinating; that would be a very weird and common bug. Looking in the source code, So, if the N-dimensional values are correct, it would be extremely surprising that the linear id is wrong...
@psychocoderHPC, see #2470. We need to agree on what the behaviour of the alpaka warp functions should be and, if needed, implement #2485.
Yes and we missed
Yes, we should reach an agreement in the next meeting in mid-April.
I am currently trying to understand the output of my last debug test. I disabled FPGA and ran on CPU only. Within the
I do not currently have an explanation for why we see so many valid outputs and then fail with thread zero only.
The fastest-moving index in SYCL's `nd_item` is the rightmost one, which matches alpaka's index order. In our code base we implemented it to match CUDA's index order, where the leftmost index is the fastest-moving one.
This PR should be backported to develop version 1.3.
You can read more about it here: https://www.intel.com/content/www/us/en/docs/dpcpp-compatibility-tool/developer-guide-reference/2023-2/cuda-and-sycl-programming-model-comparison.html
Thanks to @SimeonEhrig for pointing me to this issue.