@mpokorny mpokorny commented Nov 5, 2025

This branch adds support for executing Kokkos tasks on machines with multiple instances of each processor type. With this branch, Kokkos tasks can run on machines configured with more than one GPU and/or more than one OpenMP processor. The branch still supports older Kokkos versions, but the added capability is available on all processor types only with Kokkos 4.3.0 or later; the per-backend minimums are listed below.

The support for multiple OpenMP processors is less developed than the multiple GPU support, and requires further review.

An associated PR on Legion has some tests, but there is no Realm-only test code in this PR.

For OpenMP:

  • requires Kokkos v4.0.0, and falls back to the previous implementation (including its restrictions) otherwise
  • implementation should work with either the Realm OpenMP implementation or a system OpenMP
  • tested with kokkos_saxpy (in Legion)
  • one caveat when using Realm OpenMP: I had some difficulty preventing application code from linking against and using the system OpenMP. I fixed this by changing the kokkoscore target properties in the application's CMakeLists.txt file, but that modification is not part of this PR

For CUDA:

  • requires Kokkos v4.3.0, and falls back to the previous implementation (including its restrictions) otherwise
  • tested with kokkos_saxpy (in Legion)
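
The version requirements above amount to a per-backend gate: the multi-instance path is taken only when Kokkos is new enough for that backend, and the previous implementation is used otherwise. The following is a minimal sketch of that decision, not the code in this branch. Every name in it (`Backend`, `encode_version`, `multi_instance_enabled`, and so on) is hypothetical, and it assumes versions are encoded the way Kokkos encodes its `KOKKOS_VERSION` macro (major*10000 + minor*100 + patch).

```cpp
#include <cassert>

// Hypothetical sketch: none of these names come from Realm or Kokkos.
enum class Backend { OpenMP, Cuda };

// Assumption: same encoding as the KOKKOS_VERSION macro,
// i.e. major*10000 + minor*100 + patch (4.3.0 -> 40300).
constexpr int encode_version(int major, int minor, int patch) {
  return major * 10000 + minor * 100 + patch;
}

// Minimum Kokkos version for the multi-instance path, per the PR description:
// 4.0.0 for OpenMP, 4.3.0 for CUDA.
constexpr int min_version_for_multi_instance(Backend b) {
  return b == Backend::OpenMP ? encode_version(4, 0, 0)
                              : encode_version(4, 3, 0);
}

// True if the multi-instance path can be used for this backend;
// false means "fall back to the previous implementation".
constexpr bool multi_instance_enabled(Backend b, int kokkos_version) {
  return kokkos_version >= min_version_for_multi_instance(b);
}

// True only if every backend can use the multi-instance path.
constexpr bool multi_instance_on_all_backends(int kokkos_version) {
  return multi_instance_enabled(Backend::OpenMP, kokkos_version) &&
         multi_instance_enabled(Backend::Cuda, kokkos_version);
}

// Small runtime self-check of the gate for a few versions.
inline void self_check() {
  assert(multi_instance_enabled(Backend::OpenMP, encode_version(4, 0, 0)));
  assert(!multi_instance_enabled(Backend::Cuda, encode_version(4, 2, 1)));
  assert(multi_instance_on_all_backends(encode_version(4, 3, 0)));
}
```

With Kokkos 4.2.x, for example, this gate would enable the multi-instance path for OpenMP while CUDA falls back, which is why 4.3.0 is the minimum for support on all processor types.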

mpokorny commented Nov 5, 2025

The associated Legion PR is here: https://gitlab.com/StanfordLegion/legion/-/merge_requests/1956
