@sbak5 sbak5 commented Nov 4, 2025

Dependent on #215
This PR removes the heuristic ordering of unmatched PGs. Instead, it builds windows of PGs and orders them to find the wavefront of chained hanging PGs.

  1. Pick the most common PG among the current active entries, one entry per rank.
  2. Group the ranks that share it into a PG and register that PG in the active PG list.
  3. If the ranks in the current window overlap with the corresponding PG already registered in the active PG list, create a new PG.
  4. If the new ranks belong to a previously registered active PG and are later matched with the new ranks, delete that PG from the active PG list because it has completed.
  5. Repeat until every per-rank entry index == len(per_rank_entries).

Use these windows as the global order of process groups, then return the first PG in the graph analysis, as we did previously.

This approach respects the local scheduling order of process groups as well as the partially different ordering of PGs across ranks for P2P.
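
For illustration, here is a minimal sketch of steps 1, 2, and 5 only. It is not the code in this PR: `build_pg_windows`, the shape of `per_rank_entries` (rank -> ordered list of PG names), and the PG names in the example are all assumptions made for the sketch, and the active PG list bookkeeping from steps 3 and 4 is left out.

```python
from collections import Counter


def build_pg_windows(per_rank_entries):
    """Hypothetical helper: return an ordered list of (pg, ranks) windows.

    Assumes per_rank_entries maps each rank to its ordered list of PG names.
    The active PG list handling (steps 3 and 4) is intentionally omitted.
    """
    index = {rank: 0 for rank in per_rank_entries}
    windows = []
    while True:
        # Current active entry of every rank that still has entries left.
        current = {
            rank: entries[index[rank]]
            for rank, entries in sorted(per_rank_entries.items())
            if index[rank] < len(entries)
        }
        # Step 5: stop once every per-rank index reached the end of its list.
        if not current:
            break
        # Step 1: most common PG among the current entries.
        pg, _ = Counter(current.values()).most_common(1)[0]
        # Step 2: group the ranks whose current entry is that PG.
        ranks = [rank for rank, entry in current.items() if entry == pg]
        windows.append((pg, ranks))
        # Advance only the ranks that were placed in this window.
        for rank in ranks:
            index[rank] += 1
    return windows


# Example with made-up PG names: rank 1 takes part in two P2P groups,
# so the per-rank orderings differ only partially.
per_rank_entries = {
    0: ["p2p_01", "dp"],
    1: ["p2p_01", "p2p_12", "dp"],
    2: ["p2p_12", "dp"],
}
windows = build_pg_windows(per_rank_entries)
for pg, ranks in windows:
    print(pg, ranks)
```

In this example the windows come out as `p2p_01` on ranks 0 and 1, then `p2p_12` on ranks 1 and 2, then `dp` on all three ranks, so each rank's local order is preserved while the P2P groups are still placed in one global order.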
