2026-04-23 meeting minutes

Carl Pearson edited this page Apr 23, 2026 · 1 revision

Attendees:

General Topics

  • 0.1 release
    • Milestone overdue by 1 month
    • A few remaining items, but we should do a release now
    • Some key additions:
      • MPI-backend exposed in the core API
      • Refactored Requests: shared API between backends, easier to write CommSpace-generic code (doc up-to-date)
      • Refactored Communicators: shared API between backends, new capabilities: split, duplicate (doc up-to-date)
      • DeepCopy packing strategy can now handle Views of arbitrary rank
      • Collectives don't have limitations on View ranks anymore (doc specifies semantics)
      • Cleaned up Kokkos Comm traits (doc up-to-date)
    • TODOs
      • Run tests on El Dorado to establish what works and what doesn't
      • Merge #229
      • Tag a commit (Gabriel)
        • Make a GitHub release out of it (Gabriel)

Changelog (post 0.1)

  • changelog.md, modified when we merge a PR
    • Once a PR is opened and has a number, add a commit to that PR that records it in the changelog
  • Items postponed to 0.2 (end of June, PASC):
    • static_assert on GPU buffers in MPICH / fix El Dorado MPICH detection
    • Non-GPU-aware MPI host staging (Gabriel)
      • critical to have before PASC
    • Spack package recipe (Cedric)
      • better to have 0.1 released anyway before writing the recipe
    • NCCL and MPI results from the same code (Gabriel)
      • Nicole has the Heat3D benchmark in her fork -> open a PR
    • Blog post (Carl)
    • Fixing documentation (Carl)
      • Cedric opened #201
      • Things have been partially addressed in multiple PRs that have been merged recently
    • miniFE port (Carl)
    • LULESH port (Gabriel + minions)

Carried forward from last time

  • Updates on CI dev container? (Cedric)

    • Waiting for build system rework from Gabriel
  • Exclusive scan fails with MPICH (installed via Spack) and CudaSpace views

    • SNL CI only tests Open MPI + CUDA
      • Carl will make a container
    • Not sure if it's an MPICH problem or something else we're doing wrong
    • Gabriel is splitting up the unit test

Round Table

  • Carl

    • Low-level API port of miniFE is done; think about the high-level port next
  • Evan

    • Stream-triggered MPI + Kokkos Comm at SNL this summer
      • Compiles on Tuolumne, but at least one test is not working
  • Nicole

    • No specific plans for the internship yet
  • Gabriel

    • Planning to engage with the external contribution of a generic RCCL / NCCL layer
    • Host staging for PASC
