Inference speed test on multi-GPU #27

@xwpaul3

Description

Thank you so much for creating and sharing this repository. The work is greatly appreciated!
I'm running the inference speed test `eval/efficiency/attention_speed.py` on multiple GPUs and hit a problem I haven't been able to solve:

RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cuda:7! (when checking argument for argument mat2 in method wrapper_CUDA_bmm)

This suggests the speed test is effectively running on a single GPU. What changes would be needed to run it on multiple GPUs within one node, or across nodes?
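For context, this error typically means one operand of `torch.bmm` lives on a different GPU than the other (here `cuda:0` vs `cuda:7`), which happens when a model is sharded across devices but an intermediate tensor isn't moved along with it. Without knowing where exactly the mismatch arises in `attention_speed.py`, a minimal sketch of the usual fix is to move the second operand onto the first operand's device before the batched matmul (the helper name below is mine, not from the repo):

```python
import torch

def bmm_same_device(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    # Move b onto a's device before the batched matmul. This is the
    # standard fix for "Expected all tensors to be on the same device";
    # .to() is a no-op when the tensors already share a device.
    return torch.bmm(a, b.to(a.device))

# Hypothetical shapes; on a multi-GPU box, b might live on cuda:7
# while a sits on cuda:0. Demonstrated here on CPU.
a = torch.randn(2, 3, 4)
b = torch.randn(2, 4, 5)
out = bmm_same_device(a, b)
print(out.shape)  # torch.Size([2, 3, 5])
```

Note that this only patches the symptom at one call site; if the script pins everything to a single device by design, true multi-GPU or multi-node runs would likely need explicit sharding (e.g. `torch.distributed` or a `device_map`), which is a question for the maintainers.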
