Replies: 1 comment
Marking as stale. No activity in 60 days.
Your question
Megatron's RotaryEmbedding is an nn.Module.
If the entire model is cast to another dtype with to(torch.bfloat16), the dtype of inv_freq changes along with it.
However, keeping inv_freq in float32 for the subsequent sin/cos calculations seems like the safer choice.
Is this a potential precision issue, where the reduced-precision inv_freq introduces avoidable errors into the rotary angles?
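
A minimal sketch of the concern, assuming inv_freq is registered as a buffer (which is what makes it follow to(torch.bfloat16)). ToyRotaryEmbedding, compute_inv_freq, and the keep_fp32 flag are hypothetical names for illustration, not Megatron's actual class or API:

```python
import torch


def compute_inv_freq(dim: int, base: float) -> torch.Tensor:
    # Standard RoPE inverse frequencies, always built in float32.
    return 1.0 / (base ** (torch.arange(0, dim, 2, dtype=torch.float32) / dim))


class ToyRotaryEmbedding(torch.nn.Module):
    """Hypothetical stand-in for a rotary embedding module (not Megatron's code)."""

    def __init__(self, dim: int = 64, base: float = 10000.0, keep_fp32: bool = False):
        super().__init__()
        self.dim = dim
        self.base = base
        self.keep_fp32 = keep_fp32
        # Floating-point buffers are cast by Module.to(dtype), just like parameters.
        self.register_buffer("inv_freq", compute_inv_freq(dim, base), persistent=False)

    def forward(self, seq_len: int) -> torch.Tensor:
        if self.keep_fp32:
            # Possible workaround: ignore the (possibly down-cast) buffer and
            # rebuild the frequencies in float32 on every call.
            inv_freq = compute_inv_freq(self.dim, self.base)
        else:
            inv_freq = self.inv_freq
        # Positions follow the dtype of inv_freq, so a bfloat16 buffer also
        # rounds the position indices.
        pos = torch.arange(seq_len, dtype=torch.float32).to(inv_freq.dtype)
        angles = pos[:, None] * inv_freq[None, :]
        return angles.float().cos()


seq_len = 2048
ref = ToyRotaryEmbedding()                                     # stays float32
cast = ToyRotaryEmbedding().to(torch.bfloat16)                 # inv_freq becomes bfloat16
fixed = ToyRotaryEmbedding(keep_fp32=True).to(torch.bfloat16)  # buffer cast, but unused

max_err = (ref(seq_len) - cast(seq_len)).abs().max().item()
fixed_matches = torch.equal(ref(seq_len), fixed(seq_len))
print(cast.inv_freq.dtype, max_err, fixed_matches)
```

In this sketch the cosine values from the cast module drift far from the float32 reference at large positions, because bfloat16 keeps only 8 significant bits for both the frequencies and the position indices. Rebuilding the frequencies in float32 inside forward is one way to sidestep the cast; whether Megatron's own code path behaves like the toy module depends on how inv_freq is stored and used there.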