This repository was archived by the owner on Mar 6, 2024. It is now read-only.
Can we use Bitfusion to run Distributed Data Parallel PyTorch code? #43
Open
Description
I recently got a VM with two A100 GPUs, and I want to use it to run data-parallel training with PyTorch. However, I have run into several problems with the environment. The same code runs successfully on my lab's server without Bitfusion. Does Bitfusion not support torch.nn.parallel.DistributedDataParallel
(https://pytorch.org/docs/stable/generated/torch.nn.parallel.DistributedDataParallel.html) or NCCL (https://developer.nvidia.com/nccl)?
I am looking forward to your reply.
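For reference, here is a minimal sketch of the kind of check that could be run inside the VM to narrow down the environment problems. It only reports what PyTorch itself can see (CUDA, GPU count, the distributed package, the NCCL backend). The function name `collect_ddp_env_report` is made up for illustration, and the script assumes nothing about how Bitfusion exposes the GPUs.

```python
import importlib.util


def collect_ddp_env_report():
    """Return a dict describing whether this environment looks ready for DDP over NCCL.

    Hypothetical helper for illustration; it only queries PyTorch and
    does not interact with Bitfusion in any way.
    """
    report = {
        "torch_installed": False,
        "cuda_available": False,
        "visible_gpus": 0,
        "distributed_available": False,
        "nccl_available": False,
    }
    # Bail out early with the defaults if PyTorch is not importable.
    if importlib.util.find_spec("torch") is None:
        return report

    import torch
    import torch.distributed as dist

    report["torch_installed"] = True
    report["cuda_available"] = torch.cuda.is_available()
    report["visible_gpus"] = torch.cuda.device_count()
    report["distributed_available"] = dist.is_available()
    # The NCCL query is only meaningful when the distributed package is built in.
    if dist.is_available():
        report["nccl_available"] = dist.is_nccl_available()
    return report


if __name__ == "__main__":
    for key, value in collect_ddp_env_report().items():
        print(f"{key}: {value}")
```

Running this once on the lab server and once in the Bitfusion VM, then comparing the two reports, should show whether the difference lies in GPU visibility or in the NCCL backend.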