local_world_size incorrectly falls back to global world size under Slurm #2667
Open
Description
Hi team, TorchRec defines local_world_size and local_rank, which are initialized from several environment variables.
However, when the job is launched via Slurm (e.g. sbatch or srun), none of those environment variables is defined, so local_world_size falls back to the global world size.
Could TorchRec add support for the Slurm environment variables, so that users don't have to export LOCAL_WORLD_SIZE manually? Thanks!
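To make the request concrete, here is a minimal sketch of the kind of fallback I have in mind. It is not TorchRec's actual code: `resolve_local_topology` is a hypothetical helper, and I am assuming that LOCAL_RANK and LOCAL_WORLD_SIZE are the torchrun-style variables to prefer. SLURM_LOCALID and SLURM_NTASKS_PER_NODE are standard Slurm variables, but note that Slurm only sets SLURM_NTASKS_PER_NODE when --ntasks-per-node is passed.

```python
import os
from typing import Mapping, Optional, Tuple


def resolve_local_topology(
    env: Optional[Mapping[str, str]] = None,
) -> Tuple[Optional[int], Optional[int]]:
    """Hypothetical helper: return (local_rank, local_world_size).

    Prefers the torchrun-style variables and falls back to Slurm's.
    A value is None when none of the known variables is set.
    """
    env = os.environ if env is None else env

    def first_int(*names: str) -> Optional[int]:
        # Return the first variable that holds a non-negative integer.
        for name in names:
            value = env.get(name)
            if value is not None and value.strip().isdigit():
                return int(value)
        return None

    local_rank = first_int("LOCAL_RANK", "SLURM_LOCALID")
    local_world_size = first_int("LOCAL_WORLD_SIZE", "SLURM_NTASKS_PER_NODE")
    return local_rank, local_world_size
```

With something like this, a job started with `srun --ntasks-per-node=8` would get its local rank and local world size from Slurm, while jobs launched through torchrun would behave as before because the existing variables take precedence.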