Fix issue 1721 by always initializing process group. #1722

Open
wants to merge 1 commit into
base: main


luowyang

This pull request fixes issue #1721, where single-GPU training or inference may fail if the worker calls torch.distributed functions. In summary, the change ensures that the default process group is always initialized as long as world_size > 0; otherwise it raises a ValueError to signal an illegal argument. Existing code should not be affected, as explained in issue #1721.

Rationale: always initializing the process group is preferable because a user who calls launch most likely expects distributed semantics. With this fix, user code behaves consistently: torch.distributed calls work even when there is only one GPU.
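For illustration, the behavior described above can be sketched roughly as follows. This is not the code in this PR's commit: the helper name ensure_default_process_group, the gloo backend, and the tcp://127.0.0.1:29500 rendezvous address are all assumptions chosen so the sketch can run on a single machine without a GPU.

```python
def ensure_default_process_group(world_size, rank=0, backend="gloo",
                                 init_method="tcp://127.0.0.1:29500"):
    """Hypothetical helper (not the PR's actual code).

    Initializes the default process group for any positive world_size,
    including world_size == 1, and rejects non-positive values.
    """
    if world_size <= 0:
        # Mirrors the PR description: an illegal world_size is an argument error.
        raise ValueError(f"world_size must be positive, got {world_size}")

    # Imported lazily so that argument validation does not depend on torch.
    import torch.distributed as dist

    if not dist.is_available():
        raise RuntimeError("torch.distributed is not available in this build")

    # Initialize even for a single process, so that later calls such as
    # dist.get_rank() or dist.barrier() do not fail in the worker.
    if not dist.is_initialized():
        dist.init_process_group(
            backend=backend,
            init_method=init_method,
            world_size=world_size,
            rank=rank,
        )
    return dist.get_world_size()
```

With a helper like this, a single-GPU worker could call ensure_default_process_group(1) once and then use torch.distributed collectives unchanged, which seems to be the consistency the rationale is after.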


CLAassistant commented Sep 26, 2023

CLA assistant check
All committers have signed the CLA.

vossr pushed a commit to vossr/YOLOX-custom that referenced this pull request Apr 21, 2024
2 participants