HMM_Paper#14

Open
JBludau wants to merge 20 commits into kokkos:main from JBludau:hmm_paper

JBludau commented Nov 4, 2025

Start of the paper on HMM (Heterogeneous Memory Management) and Kokkos.
The main question of the paper is: "Under which circumstances is the approach we currently teach (two allocations, one on the host and one on the device, with explicit deep_copy calls) the fastest way to write performance-portable code?" Since Nvidia and AMD integrated HMM into their toolchains, system allocations with implicit memory movement have become an option. This may offer benefits depending on access patterns, data layouts, and the bytes-per-flop ratio of the kernels. Synchronization still has to be done manually, but depending on the algorithm and data sizes its cost may be negligible.

Includes the benchmark Vector_ping_pong, which measures the impact of using two arrays (the way we teach) versus relying on automatic page migration (memory allocated via new, malloc, or managed memory).

TODO:

  • Add Vector-Ping-Pong results
  • Add other test cases (probably something like a stream benchmark)
  • Structure of paper
