Add generic device policies #2036
| RAJA treats ``Teams(i,j,k)`` and ``Threads(i,j,k)`` as an (x,y,z) ordering. | ||
| For users who prefer SYCL's (dim0, dim1, dim2) ordering, RAJA provides | ||
| ``Teams::sycl_order(dim0, dim1, dim2)`` and | ||
| ``Threads::sycl_order(dim0, dim1, dim2)``, which map to the RAJA (x,y,z) |
Is this what is meant, or should a more explicit mapping be documented here?
| ``Threads::sycl_order(dim0, dim1, dim2)``, which map to the RAJA (x,y,z) | |
| ``Threads::sycl_order(dim0, dim1, dim2)``, which map to the RAJA (z,y,x) |
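To make the question concrete, here is a minimal sketch of the mapping under the reading the suggested change implies: SYCL's last dimension (dim2) is the fastest-varying one and lands in x, while dim0 lands in z. `DemoDims` is a hypothetical stand-in written for this illustration, not RAJA's `Teams` or `Threads` type, and the real `sycl_order` may be implemented differently.

```cpp
// Hypothetical stand-in for RAJA's Teams/Threads; not the RAJA implementation.
struct DemoDims {
  int x, y, z;

  // Native ordering: arguments are taken as (x, y, z).
  constexpr DemoDims(int x_ = 1, int y_ = 1, int z_ = 1)
      : x(x_), y(y_), z(z_) {}

  // Assumed SYCL-style ordering: dim2 goes to x, dim1 to y, dim0 to z.
  static constexpr DemoDims sycl_order(int dim0, int dim1, int dim2) {
    return DemoDims(dim2, dim1, dim0);
  }
};

// Under this assumption, sycl_order(4, 8, 16) equals DemoDims(16, 8, 4).
static_assert(DemoDims::sycl_order(4, 8, 16).x == 16, "dim2 maps to x");
static_assert(DemoDims::sycl_order(4, 8, 16).y == 8, "dim1 maps to y");
static_assert(DemoDims::sycl_order(4, 8, 16).z == 4, "dim0 maps to z");
```

If this is the intended behavior, stating the dim0/dim1/dim2 correspondence explicitly in the docs would remove the ambiguity.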
| // kernel (For) index mapping | ||
| template<int nx_threads> | ||
| using device_global_size_x_direct = RAJA::cuda_global_size_x_direct<nx_threads>; |
There are more policies than these. Check out the CUDA or HIP policy file, which defines policies for the Cartesian product of options like these:
- (|flatten) x (global|block|thread) x (|size) x (x|y|z|xy|xz|yx|yz|zx|zy|xyz|xzy|yxz|yzx|zxy|zyx) x (direct_unchecked|direct|loop) x (|unchecked)
- (global|block|thread) x (x|y|z|xy|xz|yx|yz|zx|zy|xyz|xzy|yxz|yzx|zxy|zyx) x (syncable_loop)
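For a sense of scale, the two patterns above can be multiplied out. This is only an upper bound read directly off the lists; it assumes every combination is a valid name, which the policy files may not actually define.

```cpp
// Number of alternatives in each factor of the naming patterns above.
constexpr int flatten_opts   = 2;   // (|flatten)
constexpr int scope_opts     = 3;   // (global|block|thread)
constexpr int size_opts      = 2;   // (|size)
constexpr int dim_opts       = 15;  // x, y, z, xy, ..., zyx
constexpr int mapping_opts   = 3;   // (direct_unchecked|direct|loop)
constexpr int unchecked_opts = 2;   // (|unchecked)

// First pattern: product of all six factors.
constexpr int first_pattern = flatten_opts * scope_opts * size_opts *
                              dim_opts * mapping_opts * unchecked_opts;

// Second pattern: scope x dimension x the single syncable_loop mapping.
constexpr int second_pattern = scope_opts * dim_opts * 1;

constexpr int upper_bound = first_pattern + second_pattern;

static_assert(first_pattern == 1080, "2 * 3 * 2 * 15 * 3 * 2");
static_assert(second_pattern == 45, "3 * 15");
static_assert(upper_bound == 1125, "1080 + 45");
```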
Where should we draw the line in providing device generics? I think it would be burdensome, in terms of boilerplate, to write using device_policy = ... for every possible Cartesian product of policies.
We could probably emit the device generics automatically with a .hpp.in template plus a CMake configure_file-style interface, which would cover all the cases in <backend>/policy.hpp and error out if no backend-specific implementation is available. This is probably cleaner and more maintainable than manually writing a long list of aliases.
Also, I'm not sure whether we can generate SYCL analogues to all of these Cartesian-product policies. However, we could just leave the policies undefined in the case of SYCL.
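As a rough sketch of what a generated header could contain, the aliases might be selected per backend, with a hard error when no backend is enabled and no alias at all where a backend lacks an analogue. Everything here is hypothetical: the `demo` namespace stands in for the backend policy types so the sketch compiles without RAJA, and the `DEMO_ENABLE_*` macros stand in for whatever the build configuration would define.

```cpp
#include <type_traits>

// Hypothetical stand-ins for backend policy types (not RAJA's definitions).
namespace demo {
struct cuda_thread_x_direct {};
struct hip_thread_x_direct {};
template <int nx_threads> struct cuda_global_size_x_direct {};
template <int nx_threads> struct hip_global_size_x_direct {};
}  // namespace demo

// Hypothetical backend switch; a generated header would receive this from
// the build system. CUDA is chosen here only so the sketch compiles.
#define DEMO_ENABLE_CUDA 1

#if defined(DEMO_ENABLE_CUDA)
using device_thread_x_direct = demo::cuda_thread_x_direct;
template <int nx_threads>
using device_global_size_x_direct = demo::cuda_global_size_x_direct<nx_threads>;
#elif defined(DEMO_ENABLE_HIP)
using device_thread_x_direct = demo::hip_thread_x_direct;
template <int nx_threads>
using device_global_size_x_direct = demo::hip_global_size_x_direct<nx_threads>;
#elif defined(DEMO_ENABLE_SYCL)
// Assumed: no SYCL analogue for these policies, so the aliases are left
// undefined and any use of them fails at compile time.
#else
#error "No device backend enabled for the generic device policies"
#endif

// With the CUDA switch set, the generic names resolve to the CUDA stand-ins.
static_assert(std::is_same<device_thread_x_direct,
                           demo::cuda_thread_x_direct>::value,
              "generic alias resolves to the CUDA stand-in");
```

Leaving an alias undefined for a backend defers the failure to the point of use, so code that never touches that policy still builds.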
| using device_global_thread_z = RAJA::cuda_global_thread_z; | ||
| // kernel (loop) index mapping | ||
| using device_thread_x_direct = RAJA::cuda_thread_x_direct; |
Would it make sense to rewrite some examples, like launch_flatten, using the generic names? It could simplify the examples.
@artv3 @MrBurmark I pushed a commit containing an automatic CMake based file generation for |
Summary
AI generated, but I have reviewed the code and it looks reasonable.