Commit 9d7befe

Remove UCX configuration schema (#9127)
1 parent ac04bfa commit 9d7befe

3 files changed: +4 -65 lines changed

distributed/distributed-schema.yaml

Lines changed: 0 additions & 53 deletions
@@ -959,59 +959,6 @@ properties:
                       Alternatively, the key can be appended to the cert file
                       above, and this field left blank
 
-          ucx:
-            type: object
-            description: |
-              UCX provides access to other transport methods including NVLink and InfiniBand.
-            properties:
-              cuda-copy:
-                type: [boolean, 'null']
-                description: |
-                  Set environment variables to enable CUDA support over UCX. This may be used even if
-                  InfiniBand and NVLink are not supported or disabled, then transferring data over TCP.
-              tcp:
-                type: [boolean, 'null']
-                description: |
-                  Set environment variables to enable TCP over UCX, even if InfiniBand and NVLink
-                  are not supported or disabled.
-              nvlink:
-                type: [boolean, 'null']
-                description: |
-                  Set environment variables to enable UCX over NVLink, implies ``distributed.comm.ucx.tcp=True``.
-              infiniband:
-                type: [boolean, 'null']
-                description: |
-                  Set environment variables to enable UCX over InfiniBand, implies ``distributed.comm.ucx.tcp=True``.
-              rdmacm:
-                type: [boolean, 'null']
-                description: |
-                  Set environment variables to enable UCX RDMA connection manager support,
-                  requires ``distributed.comm.ucx.infiniband=True``.
-              create-cuda-context:
-                type: [boolean, 'null']
-                description: |
-                  Creates a CUDA context before UCX is initialized. This is necessary to enable UCX to
-                  properly identify connectivity of GPUs with specialized networking hardware, such as
-                  InfiniBand. This permits UCX to choose transports automatically, without specifying
-                  additional variables for each transport, while ensuring optimal connectivity. When
-                  ``True``, a CUDA context will be created on the first device listed in
-                  ``CUDA_VISIBLE_DEVICES``.
-              environment:
-                type: object
-                description: |
-                  Mapping for setting arbitrary UCX environment variables.
-                  Names here are translated via the following rules to
-                  map to the relevant UCX environment variable:
-                  - hyphens are replaced with underscores
-                  - words are uppercased
-                  - UCX_ is prepended
-                  So, for example, setting ``some-option=value`` is
-                  equivalent to setting ``UCX_SOME_OPTION=value`` in
-                  the calling environment.
-
-                  For a full list of supported UCX environment
-                  variables, run ``ucx_info -f``.
-
           websockets:
             type: object
             properties:
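
The name translation described in the removed `environment` entry (hyphens to underscores, uppercase, `UCX_` prefix) can be sketched in a few lines of Python. This is only an illustration of the documented rules, not the code that Distributed used; the helper names `ucx_env_name` and `ucx_env_from_mapping` are hypothetical.

```python
def ucx_env_name(key: str) -> str:
    """Translate a config key to a UCX environment variable name.

    Hypothetical helper that follows the three rules in the removed schema
    text: replace hyphens with underscores, uppercase, prepend ``UCX_``.
    """
    return "UCX_" + key.replace("-", "_").upper()


def ucx_env_from_mapping(mapping: dict) -> dict:
    """Apply the translation to every key of an ``environment``-style mapping.

    Values are converted to strings because environment variables are strings.
    """
    return {ucx_env_name(key): str(value) for key, value in mapping.items()}


if __name__ == "__main__":
    # The example from the schema description: some-option=value
    print(ucx_env_from_mapping({"some-option": "value"}))
```

With the schema entry removed, anyone who relied on this mapping would presumably set the `UCX_*` variables directly in the calling environment; the commit itself does not say what the replacement mechanism is.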

distributed/distributed.yaml

Lines changed: 0 additions & 10 deletions
@@ -230,16 +230,6 @@ distributed:
     offload: 10MiB # Size after which we choose to offload serialization to another thread
     default-scheme: tcp
     socket-backlog: 2048
-    ucx:
-      cuda-copy: null # enable cuda-copy
-      tcp: null # enable tcp
-      nvlink: null # enable cuda_ipc
-      infiniband: null # enable Infiniband
-      rdmacm: null # enable RDMACM
-      create-cuda-context: null # create CUDA context before UCX initialization
-      environment: {} # Any other environment settings to
-                      # be transferred to UCX. Name
-                      # munging: key-name => UCX_KEY_NAME
     zstd:
       level: 3 # Compression level, between 1 and 22.
       threads: 0 # Threads to use. 0 for single-threaded, -1 to infer from cpu count.

distributed/worker.py

Lines changed: 4 additions & 2 deletions
@@ -1626,8 +1626,10 @@ async def close( # type: ignore
         # Give some time for a UCX scheduler to complete closing endpoints
         # before closing self.batched_stream, otherwise the local endpoint
         # may be closed too early and errors be raised on the scheduler when
-        # trying to send closing message.
-        if self._protocol == "ucx": # pragma: no cover
+        # trying to send closing message. Using startswith supports variations
+        # of the protocols, e.g., `ucx` and `ucxx` which are both valid in
+        # distributed-ucxx.
+        if self._protocol.startswith("ucx"): # pragma: no cover
             await asyncio.sleep(0.2)
 
         self.batched_send({"op": "close-stream"})
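
The effect of switching from an equality test to `startswith` can be seen in isolation. The sketch below is a standalone illustration, not code from `worker.py`; the function names are hypothetical and the protocol strings are only sample inputs.

```python
def old_check(protocol: str) -> bool:
    """The condition before this commit: exact match on "ucx"."""
    return protocol == "ucx"


def new_check(protocol: str) -> bool:
    """The condition after this commit: any protocol name beginning with "ucx"."""
    return protocol.startswith("ucx")


if __name__ == "__main__":
    for proto in ("ucx", "ucxx", "tcp", "tls"):
        print(f"{proto}: old={old_check(proto)} new={new_check(proto)}")
```

Only `ucxx` changes behaviour among these inputs: it now takes the 0.2 second delay as well. Note that a prefix match accepts any name starting with `ucx`, and that `startswith` requires a string, so a `None` protocol would raise `AttributeError` where the equality test would simply be false. Whether `_protocol` can ever be `None` at this point is not something the diff shows.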
