Commit 2ece9ff

[2.50 CHERRY-PICK][docs] Update SLURM docs with symmetric-run (#56775) (#57565)

We recently added `ray symmetric-run`, which greatly simplifies our SLURM tutorial. Cherry-pick of #56775.

1 parent: e46fe7e

File tree

3 files changed: +44 −85 lines


doc/source/cluster/doc_code/slurm-basic.sh
Lines changed: 20 additions & 40 deletions

@@ -16,50 +16,30 @@ nodes=$(scontrol show hostnames "$SLURM_JOB_NODELIST")
 nodes_array=($nodes)

 head_node=${nodes_array[0]}
-head_node_ip=$(srun --nodes=1 --ntasks=1 -w "$head_node" hostname --ip-address)

-# if we detect a space character in the head node IP, we'll
-# convert it to an ipv4 address. This step is optional.
-if [[ "$head_node_ip" == *" "* ]]; then
-    IFS=' ' read -ra ADDR <<<"$head_node_ip"
-    if [[ ${#ADDR[0]} -gt 16 ]]; then
-        head_node_ip=${ADDR[1]}
-    else
-        head_node_ip=${ADDR[0]}
-    fi
-    echo "IPV6 address detected. We split the IPV4 address as $head_node_ip"
-fi
-# __doc_head_address_end__
-
-# __doc_head_ray_start__
 port=6379
-ip_head=$head_node_ip:$port
+ip_head=$head_node:$port
 export ip_head
 echo "IP Head: $ip_head"
+# __doc_head_address_end__

-echo "Starting HEAD at $head_node"
-srun --nodes=1 --ntasks=1 -w "$head_node" \
-    ray start --head --node-ip-address="$head_node_ip" --port=$port \
-    --num-cpus "${SLURM_CPUS_PER_TASK}" --num-gpus "${SLURM_GPUS_PER_TASK}" --block &
-# __doc_head_ray_end__
-
-# __doc_worker_ray_start__
-# optional, though may be useful in certain versions of Ray < 1.0.
-sleep 10
-
-# number of nodes other than the head node
-worker_num=$((SLURM_JOB_NUM_NODES - 1))
-
-for ((i = 1; i <= worker_num; i++)); do
-    node_i=${nodes_array[$i]}
-    echo "Starting WORKER $i at $node_i"
-    srun --nodes=1 --ntasks=1 -w "$node_i" \
-        ray start --address "$ip_head" \
-        --num-cpus "${SLURM_CPUS_PER_TASK}" --num-gpus "${SLURM_GPUS_PER_TASK}" --block &
-    sleep 5
-done
-# __doc_worker_ray_end__
+# __doc_symmetric_run_start__
+# Start the Ray cluster with symmetric-run on all nodes.
+# symmetric-run automatically starts Ray on all nodes and runs the script ONLY on the head node.
+# Use the '--' separator to separate Ray arguments from the entrypoint command.
+# The --min-nodes argument ensures all nodes join before running the script.
+
+# All nodes (head and workers) execute this block.
+# The entrypoint (simple-trainer.py) runs only on the head node.
+srun --nodes="$SLURM_JOB_NUM_NODES" --ntasks="$SLURM_JOB_NUM_NODES" \
+    ray symmetric-run \
+    --address "$ip_head" \
+    --min-nodes "$SLURM_JOB_NUM_NODES" \
+    --num-cpus="${SLURM_CPUS_PER_TASK}" \
+    --num-gpus="${SLURM_GPUS_PER_TASK}" \
+    -- \
+    python -u simple-trainer.py "$SLURM_CPUS_PER_TASK"
+# __doc_symmetric_run_end__

 # __doc_script_start__
-# ray/doc/source/cluster/doc_code/simple-trainer.py
-python -u simple-trainer.py "$SLURM_CPUS_PER_TASK"
+# The entrypoint script (simple-trainer.py) is run on the head node by symmetric-run.
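For reference, the address-derivation step that the updated script keeps can be tried outside SLURM by stubbing the node list. This is a minimal sketch: the node names below are made up, and on a real cluster `scontrol show hostnames "$SLURM_JOB_NODELIST"` would supply them.

```shell
# Stand-in for: nodes=$(scontrol show hostnames "$SLURM_JOB_NODELIST")
# (hypothetical node names for illustration only)
nodes="node-001
node-002
node-003"

nodes_array=($nodes)         # word-split the node list into an array
head_node=${nodes_array[0]}  # first node in the allocation acts as the head

port=6379
ip_head=$head_node:$port     # symmetric-run accepts a hostname:port address
export ip_head
echo "IP Head: $ip_head"     # prints "IP Head: node-001:6379"
```

Note that the new script no longer needs the IPv4/IPv6 disambiguation block, because a hostname rather than a resolved IP address is passed to `--address`.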

doc/source/cluster/vms/user-guides/community/slurm.rst
Lines changed: 20 additions & 41 deletions

@@ -8,15 +8,9 @@ Slurm usage with Ray can be a little bit unintuitive.
 * SLURM requires multiple copies of the same program are submitted multiple times to the same cluster to do cluster programming. This is particularly well-suited for MPI-based workloads.
 * Ray, on the other hand, expects a head-worker architecture with a single point of entry. That is, you'll need to start a Ray head node, multiple Ray worker nodes, and run your Ray script on the head node.

-.. warning::
+To bridge this gap, Ray 2.49 and above introduces the ``ray symmetric-run`` command, which starts a Ray cluster on all nodes with the given CPU and GPU resources and runs your entrypoint script ONLY on the head node.

-    SLURM support is still a work in progress. SLURM users should be aware
-    of current limitations regarding networking.
-    See :ref:`here <slurm-network-ray>` for more explanations.
-
-    SLURM support is community-maintained. Maintainer GitHub handle: tupui.
-
-This document aims to clarify how to run Ray on SLURM.
+Below, we provide a walkthrough using ``ray symmetric-run`` to run Ray on SLURM.

 .. contents::
   :local:
@@ -107,46 +101,27 @@ Next, we'll want to obtain a hostname and a node IP address for the head node. T
    :start-after: __doc_head_address_start__
    :end-before: __doc_head_address_end__

+.. note:: In Ray 2.49 and above, you can use IPv6 addresses/hostnames.


-Starting the Ray head node
-~~~~~~~~~~~~~~~~~~~~~~~~~~
+Starting Ray and executing your script
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

-After detecting the head node hostname and head node IP, we'll want to create
-a Ray head node runtime. We'll do this by using ``srun`` as a background task
-as a single task/node (recall that ``tasks-per-node=1``).
+.. note:: ``ray symmetric-run`` is available in Ray 2.49 and above. Check older versions of the documentation if you are using an older version of Ray.
+
+Now, we'll use ``ray symmetric-run`` to start Ray on all nodes with the given CPU and GPU resources and run your entrypoint script ONLY on the head node.

 Below, you'll see that we explicitly specify the number of CPUs (``num-cpus``)
 and number of GPUs (``num-gpus``) to Ray, as this will prevent Ray from using
 more resources than allocated. We also need to explicitly
-indicate the ``node-ip-address`` for the Ray head runtime:
-
-.. literalinclude:: /cluster/doc_code/slurm-basic.sh
-   :language: bash
-   :start-after: __doc_head_ray_start__
-   :end-before: __doc_head_ray_end__
-
-By backgrounding the above srun task, we can proceed to start the Ray worker runtimes.
-
-Starting the Ray worker nodes
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-Below, we do the same thing, but for each worker. Make sure the Ray head and Ray worker processes are not started on the same node.
+indicate the ``address`` parameter for the head node to identify itself and for other nodes to connect to:

 .. literalinclude:: /cluster/doc_code/slurm-basic.sh
    :language: bash
-   :start-after: __doc_worker_ray_start__
-   :end-before: __doc_worker_ray_end__
-
-Submitting your script
-~~~~~~~~~~~~~~~~~~~~~~
-
-Finally, you can invoke your Python script:
-
-.. literalinclude:: /cluster/doc_code/slurm-basic.sh
-   :language: bash
-   :start-after: __doc_script_start__
+   :start-after: __doc_symmetric_run_start__
+   :end-before: __doc_symmetric_run_end__

+After the training job is completed, the Ray cluster will be stopped automatically.

 .. note:: The -u argument tells python to print to stdout unbuffered, which is important with how slurm deals with rerouting output. If this argument is not included, you may get strange printing behavior such as printed statements not being logged by slurm until the program has terminated.

@@ -165,6 +140,7 @@ One common use of a SLURM cluster is to have multiple users running concurrent
 jobs on the same infrastructure. This can easily conflict with Ray due to the
 way the head node communicates with its workers.

+
 Considering 2 users, if they both schedule a SLURM job using Ray
 at the same time, they are both creating a head node. In the backend, Ray will
 assign some internal ports to a few services. The issue is that as soon as the
@@ -183,13 +159,12 @@ adjusted. For an explanation on ports, see :ref:`here <ray-ports>`::
     --ray-client-server-port
     --redis-shard-ports

-For instance, again with 2 users, they would have to adapt the instructions
-seen above to:
+For instance, again with 2 users, they would run the following commands. Note that we don't use symmetric-run here
+because it does not currently work in multi-tenant environments:

 .. code-block:: bash

     # user 1
-    # same as above
     ...
     srun --nodes=1 --ntasks=1 -w "$head_node" \
         ray start --head --node-ip-address="$head_node_ip" \
@@ -202,8 +177,9 @@ seen above to:
         --max-worker-port=19999 \
         --num-cpus "${SLURM_CPUS_PER_TASK}" --num-gpus "${SLURM_GPUS_PER_TASK}" --block &

+    python -u your_script.py
+
     # user 2
-    # same as above
     ...
     srun --nodes=1 --ntasks=1 -w "$head_node" \
         ray start --head --node-ip-address="$head_node_ip" \
@@ -216,12 +192,15 @@ seen above to:
         --max-worker-port=29999 \
         --num-cpus "${SLURM_CPUS_PER_TASK}" --num-gpus "${SLURM_GPUS_PER_TASK}" --block &

+    python -u your_script.py
+
 As for the IP binding, on some cluster architecture the network interfaces
 do not allow to use external IPs between nodes. Instead, there are internal
 network interfaces (`eth0`, `eth1`, etc.). Currently, it's difficult to
 set an internal IP
 (see the open `issue <https://github.com/ray-project/ray/issues/22732>`_).

+
 Python-interface SLURM scripts
 ------------------------------

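The two users above keep out of each other's way by taking disjoint worker-port ranges (user 1 up to 19999, user 2 up to 29999). A small illustrative helper — not part of Ray; the function name and range scheme are ours — that computes such non-overlapping ranges for any number of users:

```python
# Hypothetical helper: assign each concurrent user a disjoint block of
# worker ports so that their `ray start` invocations cannot collide.
def worker_port_range(user_index: int, base: int = 10000, span: int = 10000):
    """Return (min_worker_port, max_worker_port) for the given user slot."""
    lo = base + user_index * span
    return lo, lo + span - 1

# user 1 -> pass --min-worker-port=10000 --max-worker-port=19999 to `ray start`
# user 2 -> pass --min-worker-port=20000 --max-worker-port=29999
print(worker_port_range(0))  # (10000, 19999)
print(worker_port_range(1))  # (20000, 29999)
```

The other per-user ports listed in the guide (`--port`, `--node-manager-port`, and so on) need the same treatment: each user picks values no other concurrent job uses.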

python/ray/scripts/symmetric_run.py
Lines changed: 4 additions & 4 deletions

@@ -84,7 +84,7 @@ def curate_and_validate_ray_start_args(run_and_start_args: List[str]) -> List[st

 USAGE:

-    python -m ray.scripts.symmetric_run --address ADDRESS
+    ray symmetric-run --address ADDRESS
         [--min-nodes NUM_NODES] [RAY_START_OPTIONS] -- [ENTRYPOINT_COMMAND]

 DESCRIPTION:
@@ -100,15 +100,15 @@ def curate_and_validate_ray_start_args(run_and_start_args: List[str]) -> List[st

     # Start Ray with default settings and run a Python script

-    python -m ray.scripts.symmetric_run --address 127.0.0.1:6379 -- python my_script.py
+    ray symmetric-run --address 127.0.0.1:6379 -- python my_script.py

     # Start Ray with specific head node and run a command

-    python -m ray.scripts.symmetric_run --address 127.0.0.1:6379 --min-nodes 4 -- python train_model.py --epochs=100
+    ray symmetric-run --address 127.0.0.1:6379 --min-nodes 4 -- python train_model.py --epochs=100

     # Start Ray and run a multi-word command

-    python -m ray.scripts.symmetric_run --address 127.0.0.1:6379 --min-nodes 4 --num-cpus=4 -- python -m my_module --config=prod
+    ray symmetric-run --address 127.0.0.1:6379 --min-nodes 4 --num-cpus=4 -- python -m my_module --config=prod

 RAY START OPTIONS:

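The `--` separator in the usage string splits the command line into two halves: ray start options before it, entrypoint command after it. A minimal sketch of that convention (hypothetical code, not Ray's actual implementation):

```python
# Illustrative only: split argv at the '--' separator, the convention
# `ray symmetric-run` uses to tell Ray options apart from the entrypoint.
def split_on_separator(argv):
    """Return (ray_args, entrypoint_command)."""
    if "--" in argv:
        i = argv.index("--")
        return argv[:i], argv[i + 1:]
    return argv, []  # no separator: everything is a Ray option

ray_args, entrypoint = split_on_separator(
    ["--address", "127.0.0.1:6379", "--min-nodes", "4",
     "--", "python", "train_model.py"]
)
# ray_args   -> ["--address", "127.0.0.1:6379", "--min-nodes", "4"]
# entrypoint -> ["python", "train_model.py"]
```

Everything after the first `--` is passed through untouched, which is why multi-word entrypoints like `python -m my_module --config=prod` work.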
