Description
Describe the bug
Applying a MySQL `Cluster` and an `OpsRequest` (`type: Reconfiguring` with at least one restart-required parameter) in the same apply for new clusters leads to a crashloop/broken cluster when PVC provisioning delays the first Pod start. The `OpsRequest` is queued and processed by the operator before the MySQL cluster has completed its first boot. When the volume is finally provisioned and the Pod starts, the already-processed `OpsRequest` immediately triggers the restart-required reconfigure (e.g., `innodb_buffer_pool_instances`), and the component fails to complete initial bootstrap reliably.
To Reproduce
- Apply the following at once (single `kubectl apply -f`), using a storage class that takes a few seconds to provision a PVC:

  ```yaml
  ---
  kind: Namespace
  apiVersion: v1
  metadata:
    name: kubeblocks-test
  ---
  apiVersion: apps.kubeblocks.io/v1
  kind: Cluster
  metadata:
    name: cluster1
    namespace: kubeblocks-test
  spec:
    clusterDef: mysql
    topology: semisync
    terminationPolicy: Delete
    componentSpecs:
      - name: mysql
        componentDef: "mysql-8.0"
        serviceVersion: 8.0.33
        replicas: 1
        volumeClaimTemplates:
          - name: data
            spec:
              accessModes: ["ReadWriteOnce"]
              resources:
                requests:
                  storage: 10Gi
  ---
  apiVersion: operations.kubeblocks.io/v1alpha1
  kind: OpsRequest
  metadata:
    name: mysql-reconfiguring
    namespace: kubeblocks-test
  spec:
    clusterName: cluster1
    force: false
    reconfigures:
      - componentName: mysql
        parameters:
          - key: innodb_buffer_pool_instances
            value: "5"
    preConditionDeadlineSeconds: 60
    type: Reconfiguring
  ```
- Observe: PVC provisioning keeps the Pod at `Pending`; the `OpsRequest` is processed and ready to execute before the Pod exists.
- When the Pod finally starts, the restart-required reconfigure is executed immediately (before first boot completes), and the component fails to finish initialization or enters restart loops.
Expected behavior
The `OpsRequest` should not be processed until the MySQL Pod is running and all init containers have completed; applying `Cluster` + `OpsRequest` together for new clusters should be safe for GitOps workflows even when PVC provisioning is slow.
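To make the expected precondition concrete, here is a minimal sketch of the check described above, written against the JSON shape of a Pod as returned by `kubectl get pod -o json`. The function name `pod_ready_for_ops` is hypothetical and this is not how the KubeBlocks operator is implemented; it only illustrates the condition "Pod is `Running` and every init container has exited successfully". It assumes ordinary (non-sidecar) init containers, which are expected to terminate.

```python
from typing import Any, Mapping


def pod_ready_for_ops(pod: Mapping[str, Any]) -> bool:
    """Hypothetical gate: True only if the Pod is Running and all
    init containers have terminated with exit code 0."""
    status = pod.get("status") or {}
    if status.get("phase") != "Running":
        return False
    for init in status.get("initContainerStatuses") or []:
        terminated = (init.get("state") or {}).get("terminated")
        # An init container that is still waiting/running, or that
        # failed, means first boot has not completed.
        if terminated is None or terminated.get("exitCode") != 0:
            return False
    return True


# A Pod still waiting for its PVC: the OpsRequest should not run yet.
pending_pod = {"status": {"phase": "Pending"}}

# A Pod that has finished its init containers and is running.
booted_pod = {
    "status": {
        "phase": "Running",
        "initContainerStatuses": [
            {"name": "init", "state": {"terminated": {"exitCode": 0}}},
        ],
    }
}

print(pod_ready_for_ops(pending_pod))  # False
print(pod_ready_for_ops(booted_pod))   # True
```

A gate along these lines would hold the `OpsRequest` back while the Pod is `Pending` on PVC provisioning, which is the window in which the problem occurs.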
Additional context
- Kubernetes: `1.33.5+k3s1`
- KubeBlocks: `v1.0.1`
- MySQL add-on: `1.0.3`
- Storage class / CSI: hetzner-csi
Does not happen if:
- The `OpsRequest` is applied after the Cluster successfully bootstraps (all init containers exit successfully).
- The Cluster has no `volumeClaimTemplates` (Pod starts quickly).
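Since applying the `OpsRequest` after bootstrap avoids the problem, one possible interim workaround is to split the combined manifest into two phases and apply the second only once the cluster is up. The sketch below is an assumption-laden illustration, not part of KubeBlocks: `split_manifest` is a hypothetical helper, and it uses a simple line-based split on `---` rather than a real YAML parser, so it only handles plain multi-document manifests like the one in the reproduction steps.

```python
import re
from typing import List, Tuple


def split_manifest(text: str, deferred_kind: str = "OpsRequest") -> Tuple[List[str], List[str]]:
    """Hypothetical helper: separate documents of `deferred_kind`
    from everything else in a multi-document YAML manifest."""
    # Split on lines that contain only the '---' document separator.
    docs = [d.strip() for d in re.split(r"^---[ \t]*$", text, flags=re.MULTILINE)]
    docs = [d for d in docs if d]
    # Match only a top-level (unindented) 'kind:' key.
    kind_re = re.compile(rf"^kind:\s*{re.escape(deferred_kind)}\s*$", re.MULTILINE)
    first = [d for d in docs if not kind_re.search(d)]
    deferred = [d for d in docs if kind_re.search(d)]
    return first, deferred


manifest = """\
---
kind: Namespace
apiVersion: v1
metadata:
  name: kubeblocks-test
---
apiVersion: apps.kubeblocks.io/v1
kind: Cluster
metadata:
  name: cluster1
---
apiVersion: operations.kubeblocks.io/v1alpha1
kind: OpsRequest
metadata:
  name: mysql-reconfiguring
"""

first, deferred = split_manifest(manifest)
print(len(first), len(deferred))  # 2 1
```

The `first` documents (Namespace and Cluster) would be applied immediately, and the `deferred` ones only after the Cluster has finished its first boot. This works around the ordering issue but does not address the underlying behavior reported here.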