Commit a20b0b5

[iris] Label always-on CoreWeave nodes as system-critical (#4011)
Add cks.coreweave.cloud/system-critical label to NodePools with min_nodes > 0. This pins Konnectivity agents and monitoring pods to always-on CPU nodes so GPU NodePools can safely scale to zero without losing cluster connectivity.
1 parent: 3552abc; commit: a20b0b5

1 file changed

Lines changed: 3 additions & 0 deletions

lib/iris/src/iris/cluster/platform/coreweave.py

@@ -384,6 +384,9 @@ def _ensure_one_nodepool(
             "nodeLabels": {
                 self._iris_labels.iris_managed: "true",
                 self._iris_labels.iris_scale_group: scale_group_name,
+                # Pin Konnectivity agents and monitoring pods to always-on nodes
+                # so GPU NodePools can safely scale to zero.
+                **({"cks.coreweave.cloud/system-critical": "true"} if min_nodes > 0 else {}),
             },
         },
     }
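
The added line relies on conditional dict unpacking: `**(... if cond else {})` merges the extra label only when `min_nodes > 0` and contributes nothing otherwise. A minimal standalone sketch of that pattern follows. The helper `build_node_labels` and the keys `"iris-managed"` and `"iris-scale-group"` are hypothetical stand-ins for `self._iris_labels.*`, not names taken from the repository; only the `cks.coreweave.cloud/system-critical` key comes from the diff.

```python
SYSTEM_CRITICAL_LABEL = "cks.coreweave.cloud/system-critical"  # key taken from the diff


def build_node_labels(managed_key, scale_group_key, scale_group_name, min_nodes):
    """Hypothetical helper mirroring the nodeLabels expression in the diff."""
    return {
        managed_key: "true",
        scale_group_key: scale_group_name,
        # Unpack a one-entry dict for always-on pools, an empty dict otherwise.
        **({SYSTEM_CRITICAL_LABEL: "true"} if min_nodes > 0 else {}),
    }


# Placeholder label keys and pool names, chosen for illustration only.
always_on = build_node_labels("iris-managed", "iris-scale-group", "cpu-system", min_nodes=1)
scale_to_zero = build_node_labels("iris-managed", "iris-scale-group", "gpu-pool", min_nodes=0)
```

Under these assumptions, `always_on` carries the system-critical label and `scale_to_zero` does not, so a pool that can scale to zero never advertises itself as a home for always-on system pods.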
