Skip to content

Conversation

@dssgabriel
Copy link
Collaborator

@dssgabriel dssgabriel commented Nov 26, 2025

  • Make reduction operator conversion CommSpace-generic
    • Perform necessary changes for NCCL
  • Added conversion logic between KokkosComm's reduction operators and MPI_Op
    • Added missing LXor and BXor operators
  • Cleaned up primary impl header
    • updated (missed) copyright header
    • fix includes (IWYU)
    • removed unnecessary namespace prefixing

@dssgabriel dssgabriel added C-enhancement Category: an enhancement or bug fix A-mpi Area: KokkosComm MPI backend implementation labels Nov 26, 2025
@dssgabriel dssgabriel self-assigned this Nov 26, 2025
@dssgabriel dssgabriel force-pushed the refactor/reduction-ops branch from fa9a7e1 to 8d76720 Compare November 26, 2025 21:00
@dssgabriel dssgabriel force-pushed the refactor/reduction-ops branch 3 times, most recently from 845b50b to f0ca12d Compare December 2, 2025 10:42
Copy link
Member

@cedricchevalier19 cedricchevalier19 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Member

@cedricchevalier19 cedricchevalier19 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure how to test the dispatch logic, perhaps adding a unit test that almost looks like your implementation but that can save us later when changing the reduction op dispatch.

Can you use these reduction op in MPI, in this PR?

Can we make something to be able to know if a reduction operator is implemented for a given CommunicationSpace?

@dssgabriel
Copy link
Collaborator Author

dssgabriel commented Dec 8, 2025

perhaps adding a unit test that almost looks like your implementation, but that can save us later when changing the reduction op dispatch

Yes, definitely. I will add conversion tests in this PR.


Can you use these reduction op in MPI, in this PR?

Do you mean for the existing low-level MPI reduction interfaces we have? I could, but this would mean that these no longer take an MPI_Op parameter, which would not be consistent with what I did for the low-level NCCL bindings (which take an ncclRedOp_t parameter instead). As I understand our design logic, low-level bindings (the ones in the mpi:: or nccl:: sub-namespaces) use the backend's own data types (e.g. MPI_Comm/ncclComm_t, etc.).
PRs #197 and #199 are the ones that introduce the use of the generic KokkosComm::ReductionOperator concepts for their reduction op parameter, since they are part of the core API (consistent with the NCCL specialization).


Can we make something to be able to know if a reduction operator is implemented for a given CommunicationSpace?

I could write a helper function like:

template <CommunicationSpace CS, ReductionOperator RO>
auto is_reduction_op_convertible() -> bool;

Is this what you meant?

Additionally, this could become part of the constraints on the ReductionOperator concept, which may allow us to produce better diagnostics if a reduction operator isn't available for a given backend.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

A-mpi Area: KokkosComm MPI backend implementation C-enhancement Category: an enhancement or bug fix SNL-CI-APPROVAL Required to run SNL CI on non-SNL contributions

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants