doc/source/placement-group.rst (+35 -2)
--- a/doc/source/placement-group.rst
+++ b/doc/source/placement-group.rst
@@ -115,6 +115,37 @@ Let's see an example of using placement group. Note that this example is done wi
 
 Let's create a placement group. Recall that each bundle is a collection of resources, and tasks or actors can be scheduled on each bundle.
 
+.. note::
+
+  When specifying bundles,
+
+  - "CPU" will correspond with `num_cpus` as used in `ray.remote`
+  - "GPU" will correspond with `num_gpus` as used in `ray.remote`
+  - "MEM" will correspond with `memory` as used in `ray.remote`
+  - Other resources will correspond with `resources` as used in `ray.remote`.
+
+  Once the placement group reserves resources, original resources are unavailable until the placement group is removed. For example:
+
+  .. code-block:: python
+
+    # Two "CPU"s are available.
+    ray.init(num_cpus=2)
+
+    # Create a placement group.
+    pg = placement_group([{"CPU": 2}])
+    ray.get(pg.ready())
+
+    # Now, 2 CPUs are not available anymore because they are pre-reserved by the placement group.
+    @ray.remote(num_cpus=2)
+    def f():
+        return True
+
+    # Won't be scheduled because there are no 2 cpus.
+    f.remote()
+
+    # Will be scheduled because 2 cpus are reserved by the placement group.
+    f.options(placement_group=pg).remote()
+
 .. code-block:: python
 
   gpu_bundle = {"GPU": 2}
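The reservation behaviour described in the added note can be sketched without Ray as plain bookkeeping. This is a toy model only: ``ToyCluster`` and its methods are hypothetical names, not part of Ray's API, and raising an error stands in for a placement group that would stay pending.

```python
class ToyCluster:
    """Toy accounting of free vs. reserved CPUs (assumption: not Ray's scheduler)."""

    def __init__(self, num_cpus):
        self.free_cpus = num_cpus
        # Placement group name -> CPUs reserved for that group.
        self.reserved = {}

    def create_placement_group(self, name, bundles):
        needed = sum(bundle.get("CPU", 0) for bundle in bundles)
        if needed > self.free_cpus:
            # Simplification: a real placement group would stay pending.
            raise RuntimeError("not enough free CPUs for this placement group")
        # Reserved CPUs leave the general pool until the group is removed.
        self.free_cpus -= needed
        self.reserved[name] = needed

    def can_schedule(self, num_cpus, placement_group=None):
        if placement_group is None:
            return num_cpus <= self.free_cpus
        return num_cpus <= self.reserved.get(placement_group, 0)

    def remove_placement_group(self, name):
        # Removing the group returns its CPUs to the general pool.
        self.free_cpus += self.reserved.pop(name)


cluster = ToyCluster(num_cpus=2)
cluster.create_placement_group("pg", [{"CPU": 2}])

outside = cluster.can_schedule(2)                       # no free CPUs remain
inside = cluster.can_schedule(2, placement_group="pg")  # uses reserved CPUs
cluster.remove_placement_group("pg")
after_removal = cluster.can_schedule(2)                 # CPUs are free again
```

The three checks mirror the example in the note: a 2-CPU task cannot run outside the group while the reservation exists, can run inside it, and can run outside again once the group is removed.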
@@ -162,7 +193,9 @@ Now, you can guarantee all gpu actors and extra_resource tasks are located on th
 because they are scheduled on a placement group with the STRICT_PACK strategy.
 
 Note that you must remove the placement group once you are finished with your application.
-Workers of actors and tasks that are scheduled on placement group will be all killed:
+Workers of actors and tasks that are scheduled on placement group will be all killed.
+
+.. warning:: Do not lose the reference to the placement group - you will not be able to remove it. This behavior will change in a later release.
 
 .. code-block:: python
 
@@ -194,7 +227,7 @@ Placement groups are pending creation if there are no nodes that can satisfy res
 
 If nodes that contain some bundles of a placement group die, bundles will be rescheduled on different nodes by GCS. This means that the initial creation of placement group is "atomic", but once it is created, there could be partial placement groups.
 
-Unlike actors and tasks, placement group is currently not fault tolerant yet. It is in progress.
+Placement groups are tolerant to worker nodes failures (bundles on dead nodes are rescheduled). However, placement groups are currently unable to tolerate head node failures (GCS failures).
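The fault-tolerance statement above can be illustrated with a small toy model. Everything here is an assumption made for illustration: ``reschedule_bundles`` is a hypothetical helper, the first-fit choice of a surviving node is not Ray's actual policy, and a bundle mapped to ``None`` stands for a bundle still waiting to be placed (a partial placement group).

```python
def reschedule_bundles(placement, capacity, dead_node, head_node="head"):
    """Return a new bundle -> (node, cpus) mapping after `dead_node` fails.

    Toy model only: first-fit over surviving nodes, not Ray's GCS scheduler.
    """
    if dead_node == head_node:
        # Head node (GCS) failures are not tolerated.
        raise RuntimeError("head node failure: placement group cannot recover")

    # Free CPUs on surviving nodes, after subtracting bundles already there.
    free = {node: cpus for node, cpus in capacity.items() if node != dead_node}
    for node, cpus in placement.values():
        if node != dead_node:
            free[node] -= cpus

    new_placement = {}
    for bundle, (node, cpus) in placement.items():
        if node != dead_node:
            new_placement[bundle] = (node, cpus)
            continue
        # First surviving node with enough free CPUs, or None if there is none.
        target = next((n for n, f in free.items() if f >= cpus), None)
        if target is not None:
            free[target] -= cpus
        new_placement[bundle] = (target, cpus)
    return new_placement


capacity = {"head": 0, "worker-1": 2, "worker-2": 4}
placement = {"bundle-0": ("worker-1", 2), "bundle-1": ("worker-2", 2)}

after_worker_failure = reschedule_bundles(placement, capacity, "worker-1")

try:
    reschedule_bundles(placement, capacity, "head")
    head_failure_tolerated = True
except RuntimeError:
    head_failure_tolerated = False
```

In this sketch the bundle from the dead worker moves to the remaining worker, while a head node failure is reported as unrecoverable.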