
Conversation


Xiaoming-AMD commented on Oct 23, 2025

Summary

This PR fixes a precision mismatch where nn.Embedding outputs remain in fp32
even when AMP (autocast) is enabled with bf16 or fp16. Embedding lookups are not
on autocast's cast list, so the output inherits the fp32 weight dtype; downstream
ops then pay for unnecessary dtype conversions and increased memory usage.
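
A minimal reproduction of the mismatch (a sketch, assuming PyTorch >= 1.10; CPU autocast is used here only so it runs without a GPU):

import torch
import torch.nn as nn

emb = nn.Embedding(10, 8)            # weight is fp32 by default
lin = nn.Linear(8, 8)
tokens = torch.randint(0, 10, (4,))

with torch.autocast(device_type="cpu", dtype=torch.bfloat16):
    e = emb(tokens)                  # embedding lookup is not an autocast op -> stays fp32
    y = lin(e)                       # linear runs in bf16, so an fp32 -> bf16 cast happens here
    print(e.dtype, y.dtype)          # torch.float32 torch.bfloat16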


✨ What’s Changed

  • Globally monkey-patches nn.Embedding.__init__ to register a forward hook on every embedding instance (see the sketch below).
  • The hook:
    • Checks whether AMP is active via torch.is_autocast_enabled().
    • If so, casts the embedding output to the current autocast dtype (e.g., bf16).
  • Controlled via a configuration flag:
 --primus_turbo.enable_embedding_autocast=false  # disables the patch
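
A minimal sketch of this approach, assuming CUDA autocast and illustrative names; the actual Primus-Turbo implementation may differ in structure and naming:

import torch
import torch.nn as nn

_original_embedding_init = nn.Embedding.__init__

def _embedding_autocast_hook(module, inputs, output):
    # Embedding lookups are not autocast ops, so the output keeps the fp32
    # weight dtype. If autocast is active, cast it to the autocast dtype.
    if torch.is_autocast_enabled():
        target_dtype = torch.get_autocast_gpu_dtype()
        if output.dtype != target_dtype:
            return output.to(target_dtype)
    return output

def _patched_embedding_init(self, *args, **kwargs):
    _original_embedding_init(self, *args, **kwargs)
    self.register_forward_hook(_embedding_autocast_hook)

# Apply the patch globally; every nn.Embedding created afterwards gets the hook.
nn.Embedding.__init__ = _patched_embedding_init

Because register_forward_hook replaces the module output when the hook returns a tensor, the cast happens transparently for any model that instantiates nn.Embedding after the patch is applied.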

Xiaoming-AMD merged commit 9ee2c51 into main on Oct 23, 2025 (3 checks passed).
Xiaoming-AMD deleted the fix/titan-amp/force-embedding-bf16 branch on Oct 27, 2025.