Thank you so much for creating and sharing this repository. The work is greatly appreciated!
I'm running the inference speed test (`eval/efficiency/attention_speed.py`) on multiple GPUs and hit a problem I haven't been able to solve:
```
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cuda:7! (when checking argument for argument mat2 in method wrapper_CUDA_bmm)
```
This suggests the speed test currently runs on a single GPU. What changes would be needed to run it on multiple GPUs within one node, or across nodes?
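For reference, the error means the two operands of `torch.bmm` live on different GPUs (`cuda:0` and `cuda:7`). A minimal workaround, independent of this repo's code, is to move one operand onto the other's device before the matmul. The helper name below is illustrative, not from the repository:

```python
import torch

def bmm_same_device(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    # torch.bmm requires both operands on the same device; move the
    # second operand over if they differ (a .to() on the same device is a no-op).
    if a.device != b.device:
        b = b.to(a.device)
    return torch.bmm(a, b)

# CPU demo (runs without a GPU); on multi-GPU the same logic applies.
x = torch.randn(2, 3, 4)
y = torch.randn(2, 4, 5)
print(bmm_same_device(x, y).shape)  # torch.Size([2, 3, 5])
```

Note this only patches the symptom; genuine multi-GPU speed testing would also need the model and inputs sharded or replicated deliberately (e.g. via `torch.nn.DataParallel` or `torch.distributed`), which is what the question above is really asking.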
xwpaul3