HAL server failing with large kernels

### Version

latest master / 4.0.0

### What behaviour are you expecting?

I was reproducing the client-server setup with the HAL server, as described in https://github.com/codeplaysoftware/oneapi-construction-kit/tree/main/examples/hal_cpu_remote_server, and noticed that my large kernels fail on both of my RISC-V devices. I am sure both devices have sufficient memory, and in fact the allocation itself succeeds as expected.

### What actual behaviour are you seeing?

I am seeing the following from the local client (first lines as expected):
```
$ HAL_REMOTE_PORT=5906 ./test $((1<<25))
Running on ock cpu
Allocated 128 MB

$ HAL_REMOTE_PORT=5906 ./test $((1<<26))
Running on ock cpu
Allocated 256 MB
terminate called after throwing an instance of 'sycl::_V1::runtime_error'
  what():  Native API failed. Native API returns: -999 (Unknown PI error) -999 (Unknown PI error)
Aborted (core dumped)
```

On the RISC-V server, I get a segmentation fault shortly afterwards: `Segmentation fault (core dumped)`. After that, restarting the server on the same port fails with `Unable to start server on requested port 5906, node 127.0.0.1`.

By contrast, an empty kernel, or no kernel at all, works fine.

### What steps are required to reproduce the bug?

To reproduce, on the client side:
```cpp
#include <iostream>  // std::cout
#include <string>    // std::stoull
#include <sycl/sycl.hpp>

int main(int argc, char **argv) {
  unsigned long long len = 1 << 28;
  if (argc > 1) {
    len = std::stoull(argv[1]);
  }

  sycl::queue queue(sycl::accelerator_selector_v);
  std::cout << "Running on " << queue.get_device().get_info<sycl::info::device::name>() << std::endl;
  float *d_a = sycl::malloc_device<float>(len, queue);
  queue.wait();
  std::cout << "Allocated " << len * sizeof(float) / 1024 / 1024 << " MB" << std::endl;
  queue.parallel_for(sycl::range<1>(len), [=](sycl::id<1> idx) {
    d_a[idx] = idx;
  }).wait();
  return 0;
}
```

On the server, simply start the HAL server listening on a port as usual (5906 in the runs above).

### Minimal test case

_No response_

### Anything else we should know?

_No response_
