Skip to content

Persistent API #156

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 33 commits into
base: develop
Choose a base branch
from
Open

Persistent API #156

wants to merge 33 commits into from

Conversation

nicoleavans
Copy link
Collaborator

This PR introduces a minimal implementation of persistence functionality to obtain feedback, suggestions, and requests from developers and users. A version utilizing MPI windows is in progress, as are some additional unit tests and a performance test.

10/11 Test #10: test-channel .....................   Passed    2.91 sec

@nicoleavans nicoleavans requested a review from cwpearson March 10, 2025 16:27
@nicoleavans nicoleavans self-assigned this Mar 10, 2025
@nicoleavans
Copy link
Collaborator Author

Copy link
Collaborator

@dssgabriel dssgabriel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a first pass with a few comments and some newbie questions so I can get a better grasp on persistent communications. 🙂

Note: since these channels rely on MPI-only mechanisms, I think the channel.hpp header should be moved to the KokkosComm/mpi directory.

@dssgabriel
Copy link
Collaborator

I think this PR needs to be rebased on top of develop (which had the changes to header includes merged in earlier today).
Looks good to me otherwise 👍

@nicoleavans
Copy link
Collaborator Author

Here is a test script, for anyone wanting to try it.

@nicoleavans nicoleavans requested a review from cwpearson March 25, 2025 21:57
Copy link
Member

@cedricchevalier19 cedricchevalier19 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am very interested in this, but I have questions about how it will work with Kokkos' asynchronous programming model.

using value_type = typename SendView::value_type;
// Add a new request to the send_reqs_ vector
send_reqs_.emplace_back();
MPI_Send_init(KokkosComm::data_handle(view), KokkosComm::span(view), KokkosComm::Impl::mpi_type_v<value_type>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is a fence() needed there?

Kokkos::Tools::pushRegion("KokkosComm::Channel::recvinit");
using value_type = typename RecvView::value_type;
recv_reqs_.emplace_back();
MPI_Recv_init(KokkosComm::data_handle(view), KokkosComm::span(view), KokkosComm::Impl::mpi_type_v<value_type>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is a fence() needed there?

Comment on lines +28 to +31
channel.sendinit(v);
channel.recvinit(v);
channel.start();
channel.wait();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should check if the contents are ok, therefore it is probably safer to use two different views.

for (auto& req : recv_reqs_) {
mpi_reqs.push_back(req.mpi_request());
}
MPI_Startall(mpi_reqs.size(), mpi_reqs.data());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or are fences needed there?

nicoleavans and others added 2 commits April 7, 2025 12:14
Co-authored-by: Cédric Chevalier <[email protected]>
…ists

attempt to resolve check: Linux-Install / Kokkos-master-OpenMPI (pull_request)

fix: address CMake issues
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants