[QUESTION] In RotaryEmbedding, should the dtype of inv_freq and the corresponding sin/cos computations be kept as torch.float32? #744

@rchardx

Your question

Megatron's RotaryEmbedding is an nn.Module.
If the whole model is cast to another dtype with to(torch.bfloat16), the dtype of inv_freq changes along with it.
However, it seems wiser to keep inv_freq and the subsequent sin/cos computations in torch.float32.
Could this be a precision issue that introduces avoidable numerical error?
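
To make the concern concrete, here is a minimal sketch in PyTorch. It is not Megatron's actual code: the class `RotaryEmbeddingSketch`, its method names, and the sizes `dim=64` and `seq_len=4096` are hypothetical, and it assumes inv_freq is stored as a registered buffer, which is what makes it follow `to(torch.bfloat16)`. It compares angles built from the cast buffer with angles recomputed in float32.

```python
import torch
from torch import nn


class RotaryEmbeddingSketch(nn.Module):
    """Hypothetical minimal rotary embedding, not Megatron's actual class."""

    def __init__(self, dim: int, base: float = 10000.0):
        super().__init__()
        self.dim = dim
        self.base = base
        # Assumption: inv_freq is a registered buffer, so module.to(dtype) casts it.
        self.register_buffer("inv_freq", self._compute_inv_freq(), persistent=False)

    def _compute_inv_freq(self, device=None) -> torch.Tensor:
        # Always built in float32, regardless of the module's current dtype.
        exponents = torch.arange(0, self.dim, 2, dtype=torch.float32, device=device)
        return 1.0 / (self.base ** (exponents / self.dim))

    def angles_from_buffer(self, seq_len: int) -> torch.Tensor:
        # Uses the buffer's current dtype (bfloat16 after the cast),
        # so both positions and frequencies are rounded.
        pos = torch.arange(seq_len, dtype=torch.float32, device=self.inv_freq.device)
        return torch.outer(pos.to(self.inv_freq.dtype), self.inv_freq)

    def angles_fp32(self, seq_len: int) -> torch.Tensor:
        # Recomputes the frequencies in float32 and ignores the buffer's dtype.
        inv_freq = self._compute_inv_freq(device=self.inv_freq.device)
        pos = torch.arange(seq_len, dtype=torch.float32, device=inv_freq.device)
        return torch.outer(pos, inv_freq)


rope = RotaryEmbeddingSketch(dim=64).to(torch.bfloat16)
buffer_dtype = rope.inv_freq.dtype  # dtype of the buffer after the cast

seq_len = 4096
low = rope.angles_from_buffer(seq_len)
ref = rope.angles_fp32(seq_len)

# Compare cos() of both angle tables in float32 to isolate the angle error.
max_cos_err = (low.float().cos() - ref.cos()).abs().max().item()
print(buffer_dtype, ref.dtype, max_cos_err)
```

bfloat16 has 8 bits of significand precision, so integer positions above 256 are no longer all representable and neighbouring positions collapse onto the same angle. Recomputing the frequencies in float32 at call time, as in `angles_fp32`, is one possible way to sidestep the cast; whether it fits Megatron's design is a separate question.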

Labels: stale (No activity in 60 days on issue or PR)
