Description
Version
latest master / 4.0.0
What behaviour are you expecting?
I was reproducing the client-server setup with the HAL remote server, as described in https://github.com/codeplaysoftware/oneapi-construction-kit/tree/main/examples/hal_cpu_remote_server, and noticed that large kernels fail on both of my RISC-V devices. I am sure both devices have sufficient memory, and the allocation itself succeeds as expected.
What actual behaviour are you seeing?
On the local client I see the following (the first run behaves as expected):
$ HAL_REMOTE_PORT=5906 ./test $((1<<25))
Running on ock cpu
Allocated 128 MB
$ HAL_REMOTE_PORT=5906 ./test $((1<<26))
Running on ock cpu
Allocated 256 MB
terminate called after throwing an instance of 'sycl::_V1::runtime_error'
what(): Native API failed. Native API returns: -999 (Unknown PI error) -999 (Unknown PI error)
Aborted (core dumped)
On the RISC-V server, I get a segmentation fault shortly afterwards:
Segmentation fault (core dumped)
After that, attempting to restart the server on the same port fails with:
Unable to start server on requested port 5906, node 127.0.0.1
By contrast, an empty kernel, or no kernel submission at all, works fine.
What steps are required to reproduce the bug?
To reproduce, on the client side:
#include <iostream>
#include <string>
#include <sycl/sycl.hpp>

int main(int argc, char **argv) {
  // Number of floats to allocate; can be overridden on the command line.
  unsigned long long len = 1 << 28;
  if (argc > 1) {
    len = std::stoull(argv[1]);
  }
  sycl::queue queue(sycl::accelerator_selector_v);
  std::cout << "Running on " << queue.get_device().get_info<sycl::info::device::name>() << std::endl;
  // The device allocation succeeds even for the failing sizes.
  float *d_a = sycl::malloc_device<float>(len, queue);
  queue.wait();
  std::cout << "Allocated " << len * sizeof(float) / 1024 / 1024 << " MB" << std::endl;
  // This kernel fails for large len; an empty kernel does not.
  queue.parallel_for(sycl::range<1>(len), [=](sycl::id<1> idx) {
    d_a[idx] = idx;
  }).wait();
  return 0;
}
On the server, simply start the HAL server listening on a port as usual (5906 in the runs above).
Minimal test case
No response
Anything else we should know?
No response