Fix device mismatch in random_time_with_decode masking #536

Open
pjreddie wants to merge 1 commit into main from joer/fix-randperm-device-masking

@pjreddie (Collaborator)

Summary

  • torch.randperm at masking.py:1920 was missing the device= argument, so the permutation was created on the default device (CUDA in this setup) while the tensor it indexes was on CPU
  • Every other randperm call in the file already passes device= correctly; this was the only one missed
  • The mismatch crashed any experiment using the random_time_with_decode masking strategy
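
For reviewers who want the pattern in isolation, here is a minimal sketch of the kind of change described above. It is not the code from masking.py: the helper name `sample_kept_indices`, its parameters, and the toy tensor are hypothetical and only illustrate passing `device=` to `torch.randperm` so the index tensor lands on the same device as the tensor it indexes.

```python
import torch


def sample_kept_indices(tokens: torch.Tensor, num_keep: int) -> torch.Tensor:
    """Hypothetical helper: pick `num_keep` random positions along dim 0."""
    # Build the permutation on the device of the tensor it will index,
    # instead of relying on whatever the current default device is.
    perm = torch.randperm(tokens.shape[0], device=tokens.device)
    return perm[:num_keep]


tokens = torch.arange(10).reshape(5, 2)  # a CPU tensor in this sketch
idx = sample_kept_indices(tokens, num_keep=3)
kept = tokens[idx]  # index and indexed tensor share a device
```

Deriving the device from the indexed tensor, rather than hard-coding "cpu" or "cuda", should keep the call correct regardless of where the caller places its data.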

Test plan

  • Verified fix locally: 100-step training run completed successfully with random_time_with_decode masking + static_temporal encoding

🤖 Generated with Claude Code

torch.randperm defaulted to CUDA while the indexed tensor was on CPU.
Every other randperm call in this file already passes device=.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>