#11: termination: fix single rank case #12
09acf24 to 8032c0e
Pull request overview
This PR addresses issue #5 by adding special handling for the single rank (size=1) case in the termination detection algorithm. The termination detector uses a tree-based approach that does not work correctly when only one MPI rank exists, so specialized logic is needed to progress the state machine in that case.
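To illustrate why a tree-based detector can stall with one rank, here is a minimal sketch. It assumes a binary tree rooted at rank 0, which may not match the layout the PR actually uses. `childrenOf` and `needsChildResponses` are hypothetical names, and `singleRank` is written as a free function only for illustration; none of this is the PR's code.

```cpp
#include <cassert>
#include <vector>

// Hypothetical layout (an assumption, not taken from the PR):
// binary tree rooted at rank 0, children of r are 2r+1 and 2r+2.
std::vector<int> childrenOf(int rank, int size) {
  assert(size > 0 && rank >= 0 && rank < size);
  std::vector<int> children;
  for (int c = 2 * rank + 1; c <= 2 * rank + 2; ++c) {
    if (c < size) {
      children.push_back(c);
    }
  }
  return children;
}

// Illustrative stand-in for the single rank check (size == 1).
bool singleRank(int size) {
  return size == 1;
}

// True when a rank has children whose responses could drive its state
// machine forward. With size == 1 this is false for rank 0, so nothing
// arrives to advance the wave unless the case is handled explicitly.
bool needsChildResponses(int rank, int size) {
  return !childrenOf(rank, size).empty();
}
```

Under these assumptions, rank 0 of a one-rank job has neither a parent nor children, so a state machine that only advances on incoming control messages or responses would never progress without a special case.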
Key changes:
- Extracted child completion checking into a new `checkAllChildrenComplete()` method
- Added `singleRank()` helper method and special case handling in `sendControlToChildren()` and wave restart logic
- Updated printf format specifiers from `%lld` to `%" PRIu64 "` for portable uint64_t printing
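As a small sketch of the format specifier change, the example below prints `uint64_t` counters with `PRIu64` from `<cinttypes>`. `%lld` expects a `long long`, which is not guaranteed to be the type behind `uint64_t` on every platform. `formatCounts` is a hypothetical helper for illustration, not a function from the PR.

```cpp
#include <cassert>
#include <cinttypes>  // PRIu64
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper (not from the PR): formats two uint64_t counters
// using PRIu64, which expands to the correct conversion specifier for
// uint64_t on the current platform.
std::string formatCounts(int rank, uint64_t sent, uint64_t recv) {
  char buf[128];
  int n = std::snprintf(buf, sizeof(buf),
                        "Rank %d: sent=%" PRIu64 ", recv=%" PRIu64,
                        rank, sent, recv);
  // snprintf returns the number of characters it wanted to write.
  assert(n > 0 && static_cast<std::size_t>(n) < sizeof(buf));
  return std::string(buf);
}
```

Because `PRIu64` is a string literal macro, it has to sit outside the surrounding quotes so that adjacent literals are concatenated, which is why the format string is split as `"sent=%" PRIu64 ", recv=%"`.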
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| src/vt-lb/comm/MPI/termination.h | Adds new method declarations checkAllChildrenComplete() and singleRank() to support single rank termination detection |
| src/vt-lb/comm/MPI/termination.cc | Implements single rank handling logic, refactors completion checking, and fixes printf format specifiers for uint64_t |
| src/vt-lb/comm/MPI/comm_mpi.h | Adds single rank state machine progression in poll loop and initialization logic for rank 0 |
     void TerminationDetector::onResponse(uint64_t in_sent, uint64_t in_recv) {
     #if DEBUG_TERMINATION
    -  printf("Rank %d: received response: sent=%lld, recv=%lld, global_sent1=%lld, global_recv1_=%lld waiting_children=%d\n",
    +  printf("Rank %d: received response: sent=%" PRIu64 ", recv=%" PRIu64 ", global_sent1=%" PRIu64 ", global_recv1=%" PRIu64 " waiting_children=%d\n",
Copilot AI, Dec 3, 2025
Missing closing underscore in format string: global_recv1_= should be global_recv1= to match the variable name pattern in the output (the other variables don't have trailing underscores in the format string).
    -  printf("Rank %d: received response: sent=%" PRIu64 ", recv=%" PRIu64 ", global_sent1=%" PRIu64 ", global_recv1=%" PRIu64 " waiting_children=%d\n",
    +  printf("Rank %d: received response: sent=%" PRIu64 ", recv=%" PRIu64 ", global_sent1=%" PRIu64 ", global_recv1_=%" PRIu64 " waiting_children=%d\n",
nlslatt left a comment
Looks good to me and fixes my problem.
Fixes #11