@VbhvGupta
Description

This PR fixes a critical bug that causes a crash in convert_partition.py when a partition contains zero nodes across all node types.

Problem

The last_id tensor was assigned only inside the else branch of a loop that iterates over node types. If a partition had no nodes of any type, that else branch was never entered and last_id was never assigned. The script then crashed when it tried to pass last_id to dist.all_gather.
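
The failure pattern can be reproduced in isolation. The sketch below is not the code from convert_partition.py: the function name `buggy_last_id` and the `node_ids_by_type` dictionary are hypothetical stand-ins, and a plain integer replaces the tensor so the example runs without torch. Because the sketch wraps the loop in a function, Python reports the problem as an `UnboundLocalError`; the exact error in the real script may differ.

```python
from typing import Dict, List


def buggy_last_id(node_ids_by_type: Dict[str, List[int]]) -> int:
    """Hypothetical reduction of the bug: last_id is bound only in the else branch."""
    for ntype, ids in node_ids_by_type.items():
        if len(ids) == 0:
            continue  # empty node type: nothing is assigned
        else:
            last_id = max(ids)
    # If every node type was empty, last_id was never bound.
    return last_id


def fails_on_empty_partition() -> bool:
    """Return True if a partition with zero nodes triggers the unbound variable."""
    try:
        buggy_last_id({"user": [], "item": []})
    except UnboundLocalError:
        return True
    return False
```

A partition with at least one non-empty node type takes the else branch and works, which is why the bug only surfaces when every type is empty.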

Solution

The fix ensures that last_id is always initialized before being used:

  1. A new integer variable, max_last_id, is initialized with the previous partition's last ID before the loop begins.
  2. Inside the loop, max_last_id is updated to track the maximum node ID encountered in the current partition.
  3. The last_id tensor is now created after the loop completes, using the final max_last_id.

This change guarantees that dist.all_gather always receives a valid tensor, preventing the crash and improving the robustness of the partitioning process, especially for small graphs.
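
The three steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patched code itself: `compute_max_last_id`, `prev_last_id`, and `node_ids_by_type` are hypothetical names, and the tensor creation is shown only as a comment so the example stays runnable without torch.

```python
from typing import Dict, List


def compute_max_last_id(prev_last_id: int,
                        node_ids_by_type: Dict[str, List[int]]) -> int:
    """Hypothetical helper mirroring the fix described above."""
    # Step 1: start from the previous partition's last ID, so a value always exists.
    max_last_id = prev_last_id
    # Step 2: track the largest node ID seen in the current partition.
    for ntype, ids in node_ids_by_type.items():
        if len(ids) == 0:
            continue  # empty node type: keep the current maximum
        max_last_id = max(max_last_id, max(ids))
    # Step 3: the caller builds the tensor after the loop, for example
    #   last_id = torch.tensor([max_last_id], dtype=torch.int64)
    # (assumed shape and dtype), and only then passes it to dist.all_gather.
    return max_last_id


# An empty partition falls back to the previous partition's last ID.
empty_result = compute_max_last_id(99, {"user": [], "item": []})
# A non-empty partition reports its own largest node ID.
mixed_result = compute_max_last_id(99, {"user": [100, 101], "item": []})
```

Seeding the running maximum with the previous partition's last ID means the empty case needs no special branch: the loop body simply never changes the value.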

Development

Successfully merging this pull request may close these issues.

[CRASH CONDITION] - Critical Bug - Uninitialized last_id Variable
