Description
I have a permanent Dask cluster in Kubernetes. The current operator ignores all changes to the manifest.
There was an earlier issue about supporting spec updates, but it was closed as resolved once scale field support was implemented: #636.
The only fields that cause changes to the deployment after applying an updated manifest are spec.worker.replicas and the DaskAutoscaler min/max.
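To make the observation above concrete, here is a minimal Python sketch (not part of the operator) that diffs two versions of a manifest and separates the changed fields into those that were seen to take effect and those that were ignored. The `OBSERVED_APPLIED` set is an assumption drawn only from the behavior described in this report, not from the operator's source, and the `example/dask:*` image names are hypothetical.

```python
# Sketch: list which fields differ between two versions of a DaskCluster
# manifest and mark the ones this report observed being applied.

# Assumption: based only on the behavior described in this issue,
# not on the operator's source code.
OBSERVED_APPLIED = {"spec.worker.replicas"}

_MISSING = object()  # sentinel for paths present in only one manifest


def flatten(obj, prefix=""):
    """Flatten nested dicts/lists into {path: leaf_value}."""
    items = {}
    if isinstance(obj, dict):
        for key, value in obj.items():
            items.update(flatten(value, f"{prefix}.{key}" if prefix else key))
    elif isinstance(obj, list):
        for index, value in enumerate(obj):
            items.update(flatten(value, f"{prefix}[{index}]"))
    else:
        items[prefix] = obj
    return items


def changed_fields(old, new):
    """Return sorted paths whose leaf values differ between two manifests."""
    a, b = flatten(old), flatten(new)
    return sorted(
        path
        for path in a.keys() | b.keys()
        if a.get(path, _MISSING) != b.get(path, _MISSING)
    )


def classify(old, new):
    """Split changed paths into (applied, ignored) per OBSERVED_APPLIED."""
    changed = changed_fields(old, new)
    applied = [p for p in changed if p in OBSERVED_APPLIED]
    ignored = [p for p in changed if p not in OBSERVED_APPLIED]
    return applied, ignored


# Hypothetical before/after manifests, trimmed to the relevant fields.
old = {"spec": {"worker": {"replicas": 1, "spec": {"containers": [
    {"name": "worker", "image": "example/dask:old"}]}}}}
new = {"spec": {"worker": {"replicas": 3, "spec": {"containers": [
    {"name": "worker", "image": "example/dask:new"}]}}}}

applied, ignored = classify(old, new)
print("applied:", applied)
print("ignored:", ignored)
```

With these two inputs, the replica change lands in `applied` and the image change lands in `ignored`, which mirrors what happens on the real cluster.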
Is it possible to support other fields, specifically image, args, env, and volumes/mounts?
If not, what would be the best way to gracefully shut down and update the cluster?
Cluster manifest (mostly copy-pasted from the example):
---
apiVersion: kubernetes.dask.org/v1
kind: DaskCluster
metadata:
  name: dask-primary
spec:
  worker:
    replicas: 1
    spec:
      containers:
        - name: worker
          image: "//backend/dask:image"
          imagePullPolicy: Always
          args:
            - worker
            - --name
            - $(DASK_WORKER_NAME)
            - --dashboard
            - --dashboard-address
            - "8788"
          ports:
            - name: http-dashboard
              containerPort: 8788
              protocol: TCP
          env:
            - name: ENV_1
              value: 1
            - name: ENV_2
              value: 2
          volumeMounts:
            - name: kafka-certs
              mountPath: /etc/ssl/kafka/ca.crt
              subPath: ca.crt
              readOnly: true
      volumes:
        - name: kafka-certs
          configMap:
            name: kafka-certs
  scheduler:
    spec:
      containers:
        - name: scheduler
          image: "//backend/dask:image"
          imagePullPolicy: Always
          args:
            - scheduler
          ports:
            - name: tcp-comm
              containerPort: 8786
              protocol: TCP
            - name: http-dashboard
              containerPort: 8787
              protocol: TCP
          readinessProbe:
            httpGet:
              port: http-dashboard
              path: /health
            initialDelaySeconds: 5
            periodSeconds: 10
          livenessProbe:
            httpGet:
              port: http-dashboard
              path: /health
            initialDelaySeconds: 15
            periodSeconds: 20
      imagePullSecrets:
        - name: regcred
    service:
      type: ClusterIP
      selector:
        dask.org/cluster-name: dask-primary
        dask.org/component: scheduler
      ports:
        - name: tcp-comm
          protocol: TCP
          port: 8786
          targetPort: "tcp-comm"
        - name: http-dashboard
          protocol: TCP
          port: 8787
          targetPort: "http-dashboard"
---
apiVersion: kubernetes.dask.org/v1
kind: DaskAutoscaler
metadata:
  name: dask-primary
spec:
  cluster: dask-primary
  minimum: 1
  maximum: 10
Operator version: helm install --repo https://helm.dask.org --create-namespace -n dask-operator --generate-name --version 2024.5.0 dask-kubernetes-operator
Dask version: custom built image that uses the following deps:
dask = "^2024.5.2"
bokeh = "^3.4.1"
distributed = "^2024.5.2"
The behavior is the same with the 2024.5.2-py3.11 image, though.