Thanks for the work on supporting DDP training. I noticed a likely bug in the normalization check for DDP:
```python
if self.opt.norm == "syncbatch":
    raise ValueError(f"For distributed training, opt.norm must be 'syncbatch' or 'inst', but got '{self.opt.norm}'. " "Please set --norm syncbatch for multi-GPU training.")
```

which can be found at https://github.com/junyanz/pytorch-CycleGAN-and-pix2pix/blob/2a7afba2895d52556dd5dfe07e8555ef657ced6f/models/base_model.py#L118C21-L119C193. As written, the condition raises exactly when the user passes the recommended value, `syncbatch`, so the comparison appears to be inverted (the message also says `'inst'` where it presumably means `'instance'`).
A quick fix could be:

```python
if self.opt.norm != "syncbatch" and self.opt.norm != "instance":
    raise ValueError(f"For distributed training, opt.norm must be 'syncbatch' or 'instance', but got '{self.opt.norm}'. " "Please set --norm syncbatch for multi-GPU training.")
```

The code works for me under multi-GPU DDP after this minor revision.