Inference speed test on multi-GPU #27

@xwpaul3

Description

Thank you so much for creating and sharing this repository. The work is greatly appreciated!
I'm running the inference speed test `eval/efficiency/attention_speed.py` on multiple GPUs and hit a problem I haven't been able to solve:

RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cuda:7! (when checking argument for argument mat2 in method wrapper_CUDA_bmm)

This suggests the speed test is effectively running on a single GPU. What changes would be needed to run it on multiple GPUs within one node, or across nodes?
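For context, this error typically means one operand of `torch.bmm` lives on a different GPU than the other (here `cuda:0` vs `cuda:7`), which happens when a model is sharded across devices but an intermediate tensor isn't moved along with it. Without knowing where exactly the mismatch arises in `attention_speed.py`, a minimal sketch of the usual fix is to move the second operand onto the first operand's device before the batched matmul (the helper name below is mine, not from the repo):

```python
import torch

def bmm_same_device(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    # Move b onto a's device before the batched matmul. This is the
    # standard fix for "Expected all tensors to be on the same device";
    # .to() is a no-op when the tensors already share a device.
    return torch.bmm(a, b.to(a.device))

# Hypothetical shapes; on a multi-GPU box, b might live on cuda:7
# while a sits on cuda:0. Demonstrated here on CPU.
a = torch.randn(2, 3, 4)
b = torch.randn(2, 4, 5)
out = bmm_same_device(a, b)
print(out.shape)  # torch.Size([2, 3, 5])
```

Note that this only patches the symptom at one call site; if the script pins everything to a single device by design, true multi-GPU or multi-node runs would likely need explicit sharding (e.g. `torch.distributed` or a `device_map`), which is a question for the maintainers.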
