@@ -7,12 +7,13 @@ Distributed Communication
 
 MLX supports distributed communication operations that allow the computational cost
 of training or inference to be shared across many physical machines. At the
-moment we support two different communication backends:
+moment we support three different communication backends:
 
 * `MPI <https://en.wikipedia.org/wiki/Message_Passing_Interface>`_ a
   full-featured and mature distributed communications library
-* A **ring** backend of our own that uses native TCP sockets and should be
-  faster for thunderbolt connections.
+* A **ring** backend of our own that uses native TCP sockets. It should be
+  faster for thunderbolt connections, but it also works over Ethernet.
+* `nccl <https://developer.nvidia.com/nccl>`_, for use in CUDA environments.
 
 The list of all currently supported operations and their documentation can be
 seen in the :ref:`API docs<distributed>`.
@@ -84,9 +85,8 @@ Selecting Backend
 ^^^^^^^^^^^^^^^^^
 
 You can select the backend you want to use when calling :func:`init` by passing
-one of ``{'any', 'ring', 'mpi'}``. When passing ``any``, MLX will try to
-initialize the ``ring`` backend and if it fails the ``mpi`` backend. If they
-both fail then a singleton group is created.
+one of ``{'any', 'ring', 'mpi', 'nccl'}``. When passing ``any``, MLX will try all
+available backends. If they all fail then a singleton group is created.
 
 .. note::
    After a distributed backend is successfully initialized :func:`init` will
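For readers of this hunk, the backend selection it documents can be sketched as follows. This is a minimal sketch, assuming MLX is installed and that ``init`` is reachable as ``mlx.core.distributed.init`` with a ``backend`` keyword. The names ``VALID_BACKENDS``, ``check_backend`` and ``init_group`` are hypothetical helpers introduced only for illustration and are not part of MLX.

```python
# Backend names copied from the documentation text in the hunk above.
VALID_BACKENDS = ("any", "ring", "mpi", "nccl")


def check_backend(name: str) -> str:
    """Return `name` unchanged if it is a documented backend, else raise."""
    if name not in VALID_BACKENDS:
        raise ValueError(
            f"unknown backend {name!r}; expected one of {VALID_BACKENDS}"
        )
    return name


def init_group(backend: str = "any"):
    """Initialize distributed communication and return the resulting group.

    Assumes MLX is installed. The import is deferred so that the
    validation helper above can be used on machines without MLX.
    """
    import mlx.core as mx

    return mx.distributed.init(backend=check_backend(backend))
```

Calling ``world = init_group("any")`` should then return a group whose ``world.size()`` is 1 when every backend fails to initialize, which corresponds to the singleton group described in the changed text.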
@@ -220,22 +220,24 @@ print 4 etc.
 Installing MPI
 ^^^^^^^^^^^^^^
 
-MPI can be installed with Homebrew, using the Anaconda package manager or
+MPI can be installed with Homebrew, pip, using the Anaconda package manager, or
 compiled from source. Most of our testing is done using ``openmpi`` installed
 with the Anaconda package manager as follows:
 
 .. code:: shell
 
     $ conda install conda-forge::openmpi
 
-Installing with Homebrew may require specifying the location of ``libmpi.dyld``
+Installing with Homebrew or pip requires specifying the location of ``libmpi.dyld``
 so that MLX can find it and load it at runtime. This can simply be achieved by
 passing the ``DYLD_LIBRARY_PATH`` environment variable to ``mpirun`` and it is
-done automatically by ``mlx.launch``.
+done automatically by ``mlx.launch``. Some environments use a non-standard
+library filename that can be specified using the ``MPI_LIBNAME`` environment
+variable. This is automatically taken care of by ``mlx.launch`` as well.
 
 .. code:: shell
 
-    $ mpirun -np 2 -x DYLD_LIBRARY_PATH=/opt/homebrew/lib/ python test.py
+    $ mpirun -np 2 -x DYLD_LIBRARY_PATH=/opt/homebrew/lib/ -x MPI_LIBNAME=libmpi.40.dylib python test.py
     $ # or simply
     $ mlx.launch -n 2 test.py
 
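The ``mpirun`` invocation added in this hunk can also be assembled programmatically, which may help when the library directory or filename differs between machines. The following is a minimal sketch using only the Python standard library. ``build_mpirun_command`` is a hypothetical helper, not part of MLX or ``mlx.launch``, and the paths and filename are simply the example values from the diff.

```python
import shlex


def build_mpirun_command(script, num_processes=2, lib_dir=None, lib_name=None):
    """Assemble an mpirun argument list (hypothetical helper, not part of MLX)."""
    cmd = ["mpirun", "-np", str(num_processes)]
    if lib_dir is not None:
        # Forward the library search path so MLX can locate the MPI library.
        cmd += ["-x", f"DYLD_LIBRARY_PATH={lib_dir}"]
    if lib_name is not None:
        # Forward a non-standard MPI library filename.
        cmd += ["-x", f"MPI_LIBNAME={lib_name}"]
    cmd += ["python", script]
    return cmd


cmd = build_mpirun_command(
    "test.py",
    num_processes=2,
    lib_dir="/opt/homebrew/lib/",
    lib_name="libmpi.40.dylib",
)
# Prints the same command line as the one added in the diff.
print(shlex.join(cmd))
```

The resulting list can be handed to ``subprocess.run(cmd, check=True)`` on a machine where ``mpirun`` is available. According to the changed text, ``mlx.launch`` sets both variables automatically, so a helper like this would only be relevant when calling ``mpirun`` directly.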