Is it possible to use a huggingface Automodel(device_map = auto) with a multi-gpu trainer? #19194
Unanswered
surya-narayanan
asked this question in DDP / multi-GPU / multi-node
The AutoModel with `device_map="auto"` allocates the model's layers across devices (model parallelism), but the multi-GPU Trainer seems to want to replicate the whole model on every device (data parallelism), which looks incompatible.
When I try using an AutoModel with `device_map` set to `"auto"` in a Trainer, the forward pass errors out, claiming that it is multiplying tensors that live on different devices.
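To illustrate the mismatch: `device_map="auto"` shards layers across GPUs so each device holds only part of the model, whereas DDP-style training assumes a full copy per device. Below is a minimal, hypothetical sketch (not Accelerate's actual `infer_auto_device_map` implementation) of how a greedy device map splits layers by size across two GPUs; the names, sizes, and capacities are made up for illustration.

```python
def naive_device_map(layer_sizes, gpu_capacities):
    """Greedily assign each layer to the first GPU with room left,
    falling back to CPU offload when no GPU fits it."""
    device_map, free = {}, list(gpu_capacities)
    for name, size in layer_sizes.items():
        for gpu, cap in enumerate(free):
            if size <= cap:
                device_map[name] = gpu
                free[gpu] -= size
                break
        else:
            device_map[name] = "cpu"  # offload: no GPU has room
    return device_map

# Toy model: four "layers" with made-up sizes, two GPUs of capacity 6 each.
layers = {"embed": 2, "block.0": 4, "block.1": 4, "lm_head": 2}
print(naive_device_map(layers, gpu_capacities=[6, 6]))
# → {'embed': 0, 'block.0': 0, 'block.1': 1, 'lm_head': 1}
```

With a map like this, a forward pass must move activations between GPU 0 and GPU 1 at the `block.0` → `block.1` boundary; if a trainer instead assumes the whole model sits on one device per process, an op ends up multiplying a GPU-0 tensor by a GPU-1 weight, which matches the error described above.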