Description
The documentation mentions dynamic shared memory, but I can't find an example of how to do it.
At the moment we have
auto& shared_data = alpaka::declareSharedVar<T[NumberOfThreads], 0>(accelerator);
in our reduce code. In fact, the actual number of threads is a run-time value; NumberOfThreads is just a configurable compile-time constant set to a reasonably high upper bound. However, some types T are so large that sizeof(T) * NumberOfThreads exceeds 0xc000 bytes (48 KiB). On the other hand, I don't want to restrict NumberOfThreads to a very low value just because some Ts are too large. So what I actually need is something like
auto& shared_data = alpaka::allocateSharedVar<T, 0>(accelerator, numberOfElements);
where numberOfElements is the actual number of threads (a run-time value).
How do I do this?
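To make the size problem concrete, here is a minimal sketch in plain C++ (no alpaka involved). The types SmallT and LargeT, the bound of 1024, and the helper names are made up for illustration; only the 0xc000 limit comes from the description above.

```cpp
#include <cstddef>

// Shared-memory limit from the description above: 0xc000 bytes (48 KiB).
constexpr std::size_t kSharedLimitBytes = 0xc000;

// Hypothetical compile-time upper bound, standing in for NumberOfThreads.
constexpr std::size_t kNumberOfThreads = 1024;

// Hypothetical element types: one small, one large.
struct SmallT { double v[2]; };   // 16 bytes
struct LargeT { double v[16]; };  // 128 bytes

// Bytes needed when the array extent is the compile-time upper bound,
// as in T[NumberOfThreads].
template <typename T>
constexpr std::size_t staticBytes() { return sizeof(T) * kNumberOfThreads; }

// Bytes needed when the extent is the actual (run-time) number of threads.
template <typename T>
constexpr std::size_t dynamicBytes(std::size_t numberOfElements) {
    return sizeof(T) * numberOfElements;
}

constexpr bool fitsInSharedMem(std::size_t bytes) {
    return bytes <= kSharedLimitBytes;
}

// The small type fits even with the full upper bound ...
static_assert(fitsInSharedMem(staticBytes<SmallT>()), "SmallT should fit");
// ... the large type does not ...
static_assert(!fitsInSharedMem(staticBytes<LargeT>()), "LargeT should not fit");
// ... but it would fit if sized by, say, 256 actual threads.
static_assert(fitsInSharedMem(dynamicBytes<LargeT>(256)), "LargeT fits at 256");
```

With these assumed sizes, the static declaration needs 128 * 1024 bytes for LargeT, well over the limit, while sizing by the actual thread count stays within it. That is why I would like the allocation size to be a run-time value.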
PS: We never paid attention to the ID template parameter. Since we only declare one shared variable inside a kernel invocation, should this matter?