[BUG] mpi based training error #6997

Closed
@cyr0930

Description

Describe the bug
Commit fc22d96 (which landed after v0.15.2) quotes every environment variable in MPI-based training.
For example, modules cannot be imported correctly because PYTHONPATH should be (some path) but is instead set to "(some path)", with the quote characters included in the value.
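A minimal sketch of why the extra quotes break imports, assuming the quote characters reach the process as part of the value. The helper `path_entries` and the path `/opt/app` are hypothetical and only approximate how a path list is split into search entries; this is not DeepSpeed code.

```python
import os


def path_entries(pythonpath):
    """Split a PYTHONPATH-style string into its individual entries."""
    return [entry for entry in pythonpath.split(os.pathsep) if entry]


# Value as intended: one usable directory entry.
good = path_entries("/opt/app")

# Value with literal quotes: the entry now starts and ends with a quote
# character, so it no longer names the real directory.
bad = path_entries('"/opt/app"')

print(good)
print(bad)
```

Nothing strips the quotes on the receiving side, so the quoted entry points at a directory that does not exist and the import fails.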

To Reproduce
Steps to reproduce the behavior:

  1. Run multi-node training with MPI (e.g., Open MPI)

Expected behavior
Environment variables should be quoted only when necessary.
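One way to express "quote only when necessary" is sketched below. It assumes the launcher builds `-x NAME=VALUE` arguments for an mpirun-style command; the function `format_env_exports` and its `use_shell` parameter are hypothetical names for illustration, not the actual DeepSpeed implementation or the fix that was applied.

```python
import shlex


def format_env_exports(env, use_shell):
    """Build `-x NAME=VALUE` arguments for an mpirun-style command line.

    Hypothetical helper: values are quoted only when the command will be
    parsed by a shell, and shlex.quote leaves safe values untouched.
    """
    args = []
    for name, value in env.items():
        if use_shell:
            # Adds quotes only if the value contains shell-unsafe characters.
            value = shlex.quote(value)
        args += ["-x", f"{name}={value}"]
    return args


# Passed as an argument list (no shell): the value stays exactly as given.
print(format_env_exports({"PYTHONPATH": "/opt/app:/opt/lib"}, use_shell=False))

# Joined into a shell string: only the value containing a space gets quoted.
print(format_env_exports({"PYTHONPATH": "/opt/app", "MSG": "a b"}, use_shell=True))
```

When the command is executed as an argument list rather than through a shell, any quotes added beforehand become part of the value, which matches the PYTHONPATH symptom described above.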

ds_report output
N/A

Screenshots
N/A

System info (please complete the following information):
N/A

Launcher context
deepspeed, MPI

Docker context
N/A

Additional context
N/A

Labels

bug (Something isn't working), training
