
Commit bfa22c6

rjpower authored and Helw150 committed
Halve Ray cluster minimums, boost Iris capacity (#4175)
- Halved `min_workers` across all Ray clusters (12 files: training, vllm, staging) to free TPU capacity
- Transferred freed capacity to Iris `min_slices` in `marin.yaml`: v5p-8 +8, v4-8 +2, v5e-4 +2, v5e-128 +1, v6e-128 +1
- Upgraded Iris controller VM from `e2-standard-4` (16GB) to `e2-highmem-4` (32GB) in prod and dev

## Ray min_workers changes

| Cluster | Node type | Before | After |
|---|---|---|---|
| us-central1 | tpu_worker (v5p-8) | 1 | 0 |
| us-central1 | tpu_slice_v5p_8 | 12 | 6 |
| us-central1 | tpu_slice_v5p_16 | 1 | 0 |
| us-central1 | tpu_slice_v5p_32 | 1 | 0 |
| us-central1 | tpu_slice_v5p_64 | 1 | 0 |
| us-central2 | tpu_worker (v4-8) | 4 | 2 |
| us-central2-staging | tpu_worker (v4-8) | 4 | 2 |
| eu-west4 | tpu_worker (v5e-4) | 4 | 2 |
| eu-west4 | tpu_slice_v5e_128 | 1 | 0 |
| eu-west4-a | tpu_slice_v6e_128 | 2 | 1 |
| us-east5-a | tpu_worker (v5p-8) | 8 | 4 |
| us-east5-a | tpu_slice_v5p_8 | 8 | 4 |
| us-east5-a-vllm | tpu_worker | 1 | 0 |
| us-east5-a-vllm | tpu_slice_v5p_8 | 2 | 1 |
| us-east5-b-vllm | tpu_worker | 2 | 1 |
| eu-west4-vllm | tpu_worker | 2 | 1 |
| us-central1-vllm | tpu_worker | 1 | 0 |
| us-central1-vllm | tpu_slice_v5p_8 | 2 | 1 |
| us-central2-vllm | tpu_worker | 2 | 1 |
| us-east1-d-vllm | tpu_worker | 2 | 1 |
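Every row in the table is consistent with floor-halving the previous value (so a minimum of 1 drops to 0). The following is a minimal sketch that checks this; the `ROWS` list is copied by hand from the table, and the helper names `halve`, `mismatches`, and `total_released` are hypothetical, not part of the repository.

```python
# Rows copied from the "Ray min_workers changes" table:
# (cluster, node type, before, after).
ROWS = [
    ("us-central1", "tpu_worker (v5p-8)", 1, 0),
    ("us-central1", "tpu_slice_v5p_8", 12, 6),
    ("us-central1", "tpu_slice_v5p_16", 1, 0),
    ("us-central1", "tpu_slice_v5p_32", 1, 0),
    ("us-central1", "tpu_slice_v5p_64", 1, 0),
    ("us-central2", "tpu_worker (v4-8)", 4, 2),
    ("us-central2-staging", "tpu_worker (v4-8)", 4, 2),
    ("eu-west4", "tpu_worker (v5e-4)", 4, 2),
    ("eu-west4", "tpu_slice_v5e_128", 1, 0),
    ("eu-west4-a", "tpu_slice_v6e_128", 2, 1),
    ("us-east5-a", "tpu_worker (v5p-8)", 8, 4),
    ("us-east5-a", "tpu_slice_v5p_8", 8, 4),
    ("us-east5-a-vllm", "tpu_worker", 1, 0),
    ("us-east5-a-vllm", "tpu_slice_v5p_8", 2, 1),
    ("us-east5-b-vllm", "tpu_worker", 2, 1),
    ("eu-west4-vllm", "tpu_worker", 2, 1),
    ("us-central1-vllm", "tpu_worker", 1, 0),
    ("us-central1-vllm", "tpu_slice_v5p_8", 2, 1),
    ("us-central2-vllm", "tpu_worker", 2, 1),
    ("us-east1-d-vllm", "tpu_worker", 2, 1),
]


def halve(min_workers: int) -> int:
    """Floor-halve a min_workers value (hypothetical helper)."""
    return min_workers // 2


def mismatches(rows):
    """Return the rows whose 'after' value is not floor(before / 2)."""
    return [row for row in rows if halve(row[2]) != row[3]]


def total_released(rows):
    """Sum of (before - after) over all rows, counted in workers, not chips."""
    return sum(before - after for _, _, before, after in rows)


if __name__ == "__main__":
    print("rows that break the halving rule:", mismatches(ROWS))
    print("min_workers released in total:", total_released(ROWS))
```

The total counts workers of different slice sizes together, so it is not directly comparable to the Iris `min_slices` increases listed above.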
1 parent c42346a commit bfa22c6

14 files changed

Lines changed: 28 additions & 28 deletions

infra/marin-eu-west4-a.yaml

Lines changed: 1 addition & 1 deletion
@@ -194,7 +194,7 @@ available_node_types:
 
   tpu_slice_v6e_128:
     max_workers: 1024
-    min_workers: 2
+    min_workers: 1
     node_config:
       acceleratorType: v6e-128
       runtimeVersion: v2-alpha-tpuv6e

infra/marin-eu-west4-vllm.yaml

Lines changed: 1 addition & 1 deletion
@@ -129,7 +129,7 @@ available_node_types:
             sourceImage: projects/ubuntu-os-cloud/global/images/family/ubuntu-2204-lts
   tpu_worker:
     max_workers: 1024
-    min_workers: 2
+    min_workers: 1
     node_config:
       acceleratorType: v5litepod-4
       runtimeVersion: v2-alpha-tpuv5-lite

infra/marin-eu-west4.yaml

Lines changed: 2 additions & 2 deletions
@@ -134,7 +134,7 @@ available_node_types:
             sourceImage: projects/ubuntu-os-cloud/global/images/family/ubuntu-2204-lts
   tpu_worker:
     max_workers: 1024
-    min_workers: 4
+    min_workers: 2
     node_config:
       acceleratorType: v5litepod-4
       runtimeVersion: v2-alpha-tpuv5-lite
@@ -194,7 +194,7 @@ available_node_types:
 
   tpu_slice_v5e_128:
     max_workers: 1024
-    min_workers: 1
+    min_workers: 0
     node_config:
       acceleratorType: v5litepod-128
       runtimeVersion: v2-alpha-tpuv5-lite

infra/marin-us-central1-vllm.yaml

Lines changed: 2 additions & 2 deletions
@@ -129,7 +129,7 @@ available_node_types:
             sourceImage: projects/ubuntu-os-cloud/global/images/family/ubuntu-2204-lts
   tpu_worker:
     max_workers: 1024
-    min_workers: 1
+    min_workers: 0
     node_config:
       acceleratorType: v5p-8
       runtimeVersion: v2-alpha-tpuv5
@@ -141,7 +141,7 @@ available_node_types:
 
   tpu_slice_v5p_8:
     max_workers: 1024
-    min_workers: 2
+    min_workers: 1
     node_config:
       acceleratorType: v5p-8
       runtimeVersion: v2-alpha-tpuv5

infra/marin-us-central1.yaml

Lines changed: 5 additions & 5 deletions
@@ -134,7 +134,7 @@ available_node_types:
             sourceImage: projects/ubuntu-os-cloud/global/images/family/ubuntu-2204-lts
   tpu_worker:
     max_workers: 1024
-    min_workers: 1
+    min_workers: 0
     node_config:
       acceleratorType: v5p-8
       runtimeVersion: v2-alpha-tpuv5
@@ -146,7 +146,7 @@ available_node_types:
 
   tpu_slice_v5p_8:
     max_workers: 1024
-    min_workers: 12
+    min_workers: 6
     node_config:
       acceleratorType: v5p-8
       runtimeVersion: v2-alpha-tpuv5
@@ -158,7 +158,7 @@ available_node_types:
 
   tpu_slice_v5p_16:
     max_workers: 1024
-    min_workers: 1
+    min_workers: 0
     node_config:
       acceleratorType: v5p-16
       runtimeVersion: v2-alpha-tpuv5
@@ -170,7 +170,7 @@ available_node_types:
 
   tpu_slice_v5p_32:
     max_workers: 1024
-    min_workers: 1
+    min_workers: 0
     node_config:
       acceleratorType: v5p-32
       runtimeVersion: v2-alpha-tpuv5
@@ -182,7 +182,7 @@ available_node_types:
 
   tpu_slice_v5p_64:
     max_workers: 1024
-    min_workers: 1
+    min_workers: 0
     node_config:
       acceleratorType: v5p-64
       runtimeVersion: v2-alpha-tpuv5

infra/marin-us-central2-staging.yaml

Lines changed: 1 addition & 1 deletion
@@ -134,7 +134,7 @@ available_node_types:
             sourceImage: projects/ubuntu-os-cloud/global/images/family/ubuntu-2204-lts
   tpu_worker:
     max_workers: 1024
-    min_workers: 4
+    min_workers: 2
     node_config:
       acceleratorType: v4-8
       runtimeVersion: tpu-ubuntu2204-base

infra/marin-us-central2-vllm.yaml

Lines changed: 1 addition & 1 deletion
@@ -129,7 +129,7 @@ available_node_types:
             sourceImage: projects/ubuntu-os-cloud/global/images/family/ubuntu-2204-lts
   tpu_worker:
     max_workers: 1024
-    min_workers: 2
+    min_workers: 1
     node_config:
       acceleratorType: v4-8
       runtimeVersion: tpu-ubuntu2204-base

infra/marin-us-central2.yaml

Lines changed: 1 addition & 1 deletion
@@ -134,7 +134,7 @@ available_node_types:
             sourceImage: projects/ubuntu-os-cloud/global/images/family/ubuntu-2204-lts
   tpu_worker:
     max_workers: 1024
-    min_workers: 4
+    min_workers: 2
     node_config:
       acceleratorType: v4-8
       runtimeVersion: tpu-ubuntu2204-base

infra/marin-us-east1-d-vllm.yaml

Lines changed: 1 addition & 1 deletion
@@ -129,7 +129,7 @@ available_node_types:
             sourceImage: projects/ubuntu-os-cloud/global/images/family/ubuntu-2204-lts
   tpu_worker:
     max_workers: 1024
-    min_workers: 2
+    min_workers: 1
     node_config:
       acceleratorType: v6e-8
       runtimeVersion: v2-alpha-tpuv6e

infra/marin-us-east5-a-vllm.yaml

Lines changed: 2 additions & 2 deletions
@@ -129,7 +129,7 @@ available_node_types:
             sourceImage: projects/ubuntu-os-cloud/global/images/family/ubuntu-2204-lts
   tpu_worker:
     max_workers: 1024
-    min_workers: 1
+    min_workers: 0
     node_config:
       acceleratorType: v5p-8
       runtimeVersion: v2-alpha-tpuv5
@@ -141,7 +141,7 @@ available_node_types:
 
   tpu_slice_v5p_8:
     max_workers: 1024
-    min_workers: 2
+    min_workers: 1
     node_config:
       acceleratorType: v5p-8
       runtimeVersion: v2-alpha-tpuv5
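
The hunks above all make the same one-line edit, so a change of this shape could plausibly be scripted rather than made by hand. Below is a minimal sketch using only the standard library; it is an illustration, not the method used for this commit. The function name `halve_min_workers` is hypothetical, and the regex rewrites every `min_workers:` line it finds, so a node type that should keep its value would need to be excluded separately.

```python
import re

# Matches a whole line of the form "<indent>min_workers: <integer>".
# [ \t] is used instead of \s so the match never crosses a line break.
MIN_WORKERS = re.compile(r"^([ \t]*min_workers:[ \t]*)(\d+)[ \t]*$", re.MULTILINE)


def halve_min_workers(yaml_text: str) -> str:
    """Return yaml_text with every min_workers value floor-halved."""
    return MIN_WORKERS.sub(
        lambda m: f"{m.group(1)}{int(m.group(2)) // 2}",
        yaml_text,
    )


if __name__ == "__main__":
    sample = (
        "  tpu_slice_v5p_8:\n"
        "    max_workers: 1024\n"
        "    min_workers: 12\n"
        "    node_config:\n"
        "      acceleratorType: v5p-8\n"
    )
    print(halve_min_workers(sample))
```

Working on the raw text instead of parsing and re-serializing the YAML keeps comments and formatting untouched, which keeps the resulting diff to the `min_workers` lines only.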
