Clarification on synchronization semantics for reductions on HIP vs CUDA backends #2022

@gzagaris

Describe the bug

I encountered a situation where the synchronization behavior of loops with reductions appears to be different between HIP and CUDA backends.

Consider the following code snippet:

template < typename loop_exec, typename reduce_exec, typename T, typename SizeT >
T raja_sum_reduce(T* data, SizeT N) noexcept
{
  RAJA::ReduceSum< reduce_exec, T > sum(T(0));

  RAJA::forall< loop_exec >(
    RAJA::RangeSegment(SizeT(0), N), [=] RAJA_DEVICE(SizeT i) {
      sum += data[i];
  });

  // For HIP it seems that I have to explicitly synchronize before calling sum.get()
  T val = static_cast<T>(sum.get());
  return val;
}

For HIP, I am calling this with:

raja_sum_reduce< RAJA::hip_exec< 256 >, RAJA::hip_reduce >(data, N);

And for CUDA, I am calling this with:

raja_sum_reduce< RAJA::cuda_exec< 256 >, RAJA::cuda_reduce >(data, N);

Observed Behavior

With the HIP backend, it seems that I have to call RAJA::synchronize() before calling sum.get() in order to get the correct result. With the CUDA backend, sum.get() returns the expected value without any explicit synchronization.

I encountered this in the context of a larger application, but I haven't been able to reproduce it with a minimal standalone RAJA test.
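To make the workaround concrete, here is a minimal sketch of the HIP variant with the explicit synchronization placed before sum.get(). It assumes the synchronization call is spelled RAJA::synchronize< RAJA::hip_synchronize >() and that data points to device-accessible memory. The name raja_sum_reduce_synced and the host-only fallback branch are hypothetical and exist only so the snippet also builds without a HIP-enabled RAJA installation. It is not a confirmed reproducer.

```cpp
#include <cstddef>
#include <numeric>

// Pull in RAJA only when it is available; otherwise the host fallback is used.
#if defined(__has_include)
#  if __has_include("RAJA/RAJA.hpp")
#    include "RAJA/RAJA.hpp"
#  endif
#endif

// Hypothetical helper name, for illustration only.
template < typename T, typename SizeT >
T raja_sum_reduce_synced(T* data, SizeT N) noexcept
{
#if defined(RAJA_ENABLE_HIP)
  // Same reduction as in the snippet above, fixed to the HIP policies.
  RAJA::ReduceSum< RAJA::hip_reduce, T > sum(T(0));

  RAJA::forall< RAJA::hip_exec< 256 > >(
    RAJA::RangeSegment(SizeT(0), N), [=] RAJA_DEVICE(SizeT i) {
      sum += data[i];
  });

  // Assumption: this is the explicit synchronization that appears to be
  // needed on HIP before reading the reduced value.
  RAJA::synchronize< RAJA::hip_synchronize >();

  return static_cast<T>(sum.get());
#else
  // Host-only fallback (not RAJA), so the sketch compiles without HIP.
  return std::accumulate(data, data + N, T(0));
#endif
}
```

If sum.get() is already meant to synchronize, the extra call should be redundant, which is what Question 1 below asks.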

Questions

  1. Does the call to sum.get() guarantee synchronization or should the application explicitly synchronize even though the RAJA::forall execution policy is not asynchronous (async)?

  2. Is the behavior of reductions identical between the CUDA and HIP backends?

  3. I noticed that there are additional execution policies for loops with reductions: RAJA::cuda_exec_with_reduce< BLOCK_SIZE > and RAJA::hip_exec_with_reduce< BLOCK_SIZE > for the CUDA and HIP backends, respectively.

    • Is the current guidance to use these policies with RAJA::forall instead?
    • How are these policies different from the normal RAJA::cuda_exec< BLOCK_SIZE > and RAJA::hip_exec< BLOCK_SIZE > execution policies?

Thank you very much for all your guidance and help.
