Why does Tensor-parallel Communication Overlap require MPI? #11849

Answered by ashors1
mjkpolo asked this question in Q&A

Hi, MPI is used by default to bootstrap the userbuffers (see the TransformerEngine documentation). However, NCCL bootstrap should also be supported now: you can try setting tp_comm_bootstrap_backend="nccl" in your config.
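A minimal sketch of what the suggested change might look like in a training config. The surrounding option names (`tp_comm_overlap`) and the dict-style config are assumptions for illustration; check your NeMo/Megatron version for the exact fields and where they live.

```python
# Hypothetical config fragment: switch userbuffer bootstrap from the
# default (MPI) to NCCL, so MPI is not required for TP comm overlap.
# Field names are assumptions based on the discussion above.
model_config = {
    "tp_comm_overlap": True,              # enable tensor-parallel communication overlap
    "tp_comm_bootstrap_backend": "nccl",  # bootstrap userbuffers via NCCL instead of MPI
}

print(model_config["tp_comm_bootstrap_backend"])  # → nccl
```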

Answer selected by ashors1