local_world_size wrongly falls back to the global world size under Slurm #2667

Open
@JacoCheung

Description

Hi team, TorchRec defines local_world_size and local_rank, which are initialized from several environment variables.

However, if the job is launched via Slurm (say, with sbatch or srun), none of those environment variables is defined, so local_world_size falls back to the global world size.

I'm wondering if TorchRec could add support for Slurm environment variables so that users don't have to manually export LOCAL_WORLD_SIZE. Thanks!
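
A minimal sketch of what such a fallback could look like, for illustration only. The helper name `resolve_local_rank_and_size` is hypothetical and not part of TorchRec; it assumes the launcher-style variables `LOCAL_RANK` and `LOCAL_WORLD_SIZE` should take priority, and that Slurm's `SLURM_LOCALID` and `SLURM_NTASKS_PER_NODE` are acceptable substitutes. Note that Slurm only sets `SLURM_NTASKS_PER_NODE` when `--ntasks-per-node` is passed, so this would not cover every job configuration.

```python
import os
from typing import Mapping, Optional, Tuple


def resolve_local_rank_and_size(
    env: Optional[Mapping[str, str]] = None,
) -> Tuple[Optional[int], Optional[int]]:
    """Hypothetical helper (not a TorchRec API).

    Returns (local_rank, local_world_size), preferring the launcher-style
    variables and falling back to Slurm's per-task variables. A value of
    None means no usable variable was found for that field.
    """
    env = os.environ if env is None else env

    def first_int(*names: str) -> Optional[int]:
        # Return the first variable that is set to a non-negative integer.
        for name in names:
            value = env.get(name)
            if value is not None and value.strip().isdigit():
                return int(value)
        return None

    # SLURM_LOCALID: node-local task ID of the current process.
    local_rank = first_int("LOCAL_RANK", "SLURM_LOCALID")
    # SLURM_NTASKS_PER_NODE: only set when --ntasks-per-node is requested.
    local_world_size = first_int("LOCAL_WORLD_SIZE", "SLURM_NTASKS_PER_NODE")
    return local_rank, local_world_size
```

With this ordering, an explicitly exported `LOCAL_WORLD_SIZE` still wins, so existing workarounds would keep behaving the same way.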
