What happened?
When we update the k8ssandra-operator or cass-management-api version, the first pod restarted by the rolling restart gets stuck in 1/2 status (DOWN) because it does not have the cassandra.datastax.com/seed-node=true label. Its address is also missing from the seed endpoints list. If I add the label to the pod manually, like this:
kubectl label pod abc-eu-central-eu-central-1b-sts-0 cassandra.datastax.com/seed-node=true --overwrite
then it comes UP on its own after a short time.
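To check which pods currently carry the seed label before applying the workaround, something like the following sketch may help. It assumes kubectl's custom-columns output prints `<none>` for a missing label; the function name `pods_missing_seed_label` is hypothetical and only filters that two-column output.

```shell
# Hypothetical helper: reads "NAME SEED" rows on stdin and prints the
# names of pods whose seed-node label column is empty ("<none>").
pods_missing_seed_label() {
  awk '$2 == "<none>" { print $1 }'
}

# Assumed usage against the cluster (namespace taken from the manifests in this issue):
# kubectl get pods -n default --no-headers \
#   -o custom-columns='NAME:.metadata.name,SEED:.metadata.labels.cassandra\.datastax\.com/seed-node' \
#   | pods_missing_seed_label
```

Not every pod is necessarily expected to be a seed, so the output is only a starting point for comparing the state before and after the rolling restart.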
Filtering the cass-operator logs with grep "seed" shows that the operator is stuck in an endless seed-handling loop. Here is a sample:
2026-03-30T12:24:10.405Z INFO calling Management API reload seeds - POST /api/v0/ops/seeds/reload {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "CassandraDatacenter": {"name":"eu-central","namespace":"default"}, "namespace": "default", "name": "eu-central", "reconcileID": "874f0571-e3bf-4342-93fb-944dbf2cc1d4", "pod": "abc-eu-central-eu-central-1a-sts-0"}
2026-03-30T12:24:10.437Z INFO calling Management API reload seeds - POST /api/v0/ops/seeds/reload {"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "CassandraDatacenter": {"name":"eu-central","namespace":"default"}, "namespace": "default", "name": "eu-central", "reconcileID": "874f0571-e3bf-4342-93fb-944dbf2cc1d4", "pod": "abc-eu-central-eu-central-1c-sts-0"}
As you can see, the pod "abc-eu-central-eu-central-1b-sts-0" never appears in these calls.
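To confirm from a longer log capture which pods receive the seed reload call, a small filter along these lines could be used. This is a sketch that assumes the log format shown above (a `"pod": "..."` field on each "reload seeds" entry); the function name `seed_reload_pods` and the deployment placeholder are hypothetical.

```shell
# Hypothetical helper: reads cass-operator log lines on stdin and prints
# the unique pod names that appear in "reload seeds" entries.
seed_reload_pods() {
  grep 'reload seeds' \
    | grep -o '"pod": *"[^"]*"' \
    | sed 's/.*"\([^"]*\)"$/\1/' \
    | sort -u
}

# Assumed usage (replace the placeholder with the actual operator deployment name):
# kubectl logs -n default deployment/<cass-operator-deployment> | seed_reload_pods
```

Any pod of the datacenter that is absent from the output was never targeted by a reload call in the captured window.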
This is the K8ssandraCluster CRD manifest:
apiVersion: k8ssandra.io/v1alpha1
kind: K8ssandraCluster
metadata:
  name: k8ss-cluster
  namespace: default
  #annotations:
  #  kustomize.toolkit.fluxcd.io/prune: disabled
spec:
  auth: false
  reaper:
    #cassandra-jaas.config is only required if remote JMX authentication is enabled, but it is no longer used by default between k8ssandra-operator 1.11 and 1.26.
    #Enabling HTTP management removes the need for cassandra-jaas.config and allows pods to run with a read-only root filesystem.
    httpManagement:
      enabled: true
    containerImage:
      name: cassandra-reaper
      registry: docker.io
      repository: thelastpickle
      tag: 4.1.1
    initContainerImage:
      name: cassandra-reaper
      registry: docker.io
      repository: thelastpickle
      tag: 4.1.1
    autoScheduling:
      enabled: true
  cassandra:
    telemetry:
      prometheus:
        enabled: true
      vector:
        enabled: true
      mcac:
        enabled: false
    clusterName: "ABC DEV"
    serverVersion: "4.1.10"
    resources:
      requests:
        memory: "6G"
        cpu: "1"
      limits:
        memory: "8G"
    datacenters:
      - metadata:
          name: eu-central
        size: 3
        racks:
          - name: eu-central-1a
            nodeAffinityLabels:
              topology.kubernetes.io/zone: eu-central-1a
          - name: eu-central-1b
            nodeAffinityLabels:
              topology.kubernetes.io/zone: eu-central-1b
          - name: eu-central-1c
            nodeAffinityLabels:
              topology.kubernetes.io/zone: eu-central-1c
        storageConfig:
          cassandraDataVolumeClaimSpec:
            storageClassName: gp3-retain
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 400Gi
        config:
          cassandraYaml:
            batch_size_fail_threshold: 5000KiB
            batch_size_warn_threshold: 1000KiB
            num_tokens: 256
            partitioner: org.apache.cassandra.dht.RandomPartitioner
          jvmOptions:
            gc: ZGC
            heapSize: 4048M
        serviceAccount: k8ssandra
K8ssandra-operator manifest:
---
apiVersion: source.toolkit.fluxcd.io/v1
kind: HelmRepository
metadata:
  name: k8ssandra
  namespace: default
spec:
  interval: 1m0s
  url: https://helm.k8ssandra.io/stable
---
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: k8ssandra-operator
  namespace: default
spec:
  chart:
    spec:
      chart: k8ssandra-operator
      reconcileStrategy: ChartVersion
      sourceRef:
        kind: HelmRepository
        name: k8ssandra
        namespace: default
      version: 1.30.2
  interval: 1m0s
What did you expect to happen?
I expect the operator to add the restarted pod to the seed endpoints list so that it comes UP. We should not have to add the label manually.
How can we reproduce it (as minimally and precisely as possible)?
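Update the k8ssandra-operator or cass-management-api version so that a rolling restart starts, as described above. The operator should then handle the seed labels itself; we should not have to run a manual command like:
kubectl label pod abc-eu-central-eu-central-1b-sts-0 cassandra.datastax.com/seed-node=true --overwrite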
cass-operator version
image: docker.io/k8ssandra/cass-operator:v1.28.1
Kubernetes version
Client Version: v1.31.1
Kustomize Version: v5.4.2
Server Version: v1.33.8-eks-3a10415
Method of installation
Kustomize, Helm by k8ssandra-operator
Anything else we need to know?
Did I miss something?