
Conversation

@luowyang

This pull request fixes issue #1721, where single-GPU training/inference may fail if the worker uses torch.dist. In summary, it ensures the default process group is always initialized as long as world_size > 0; otherwise, a ValueError is raised to indicate an illegal argument. As stated in issue #1721, existing code should not be affected.

Rationale: Always initializing the process group is preferable, because when launch is called, the user most likely wants distributed semantics. This fix makes user code consistent by allowing torch.dist calls even when there is only one GPU.
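
For illustration, here is a minimal sketch of the intended behavior. The `launch` signature, `backend`, and `dist_url` defaults shown here are hypothetical stand-ins, not YOLOX's actual API:

```python
import torch.distributed as dist


def launch(main_func, world_size, backend="nccl",
           dist_url="tcp://127.0.0.1:29500"):
    # Illegal argument: distributed semantics were requested,
    # but no worker process would run.
    if world_size < 1:
        raise ValueError(f"world_size must be >= 1, got {world_size}")

    # Always initialize the default process group, even for
    # world_size == 1, so worker code can make torch.distributed
    # calls regardless of GPU count. (Single-process sketch with
    # rank 0; a real launcher would spawn one process per GPU.)
    if not dist.is_initialized():
        dist.init_process_group(
            backend=backend,
            init_method=dist_url,
            world_size=world_size,
            rank=0,
        )
    main_func()
```

With this behavior, a single-GPU worker can safely call, e.g., `dist.get_world_size()` or `dist.barrier()` without guarding every call behind a world-size check.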

CLAassistant commented Sep 26, 2023

CLA assistant check
All committers have signed the CLA.

vossr pushed a commit to vossr/YOLOX-custom that referenced this pull request Apr 21, 2024