Issue originally authored by tnozicka as #1189
Describe the bug
Scaling the cluster down and back up before the scale-down finishes gets the cluster stuck on a node that is decommissioned but never removed.
To Reproduce
Steps to reproduce the behavior:
$ # Create a cluster with 2 nodes
$ yq e 'del(.metadata.generateName) | .metadata.name="example" | .spec.version = "5.2.0-rc2" | .spec.datacenter.racks[0].members = 2' ./test/e2e/fixture/scylla/basic.scyllacluster.yaml | kubectl apply --server-side --force-conflicts -f -
$ # Wait for the cluster to rollout
$ # Scale down to 1 node but don't wait
$ kubectl patch scyllacluster.scylla.scylladb.com/example --type='json' -p='[{"op": "replace", "path": "/spec/datacenter/racks/0/members", "value": 1}]'
$ # Wait for the second node to start decommissioning / go unready
$ # Scale up to 3 nodes while the second node is still decommissioning
$ kubectl patch scyllacluster.scylla.scylladb.com/example --type='json' -p='[{"op": "replace", "path": "/spec/datacenter/racks/0/members", "value": 3}]'
$ # Observe how it gets stuck on the second node
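The stuck state can be inspected directly. A minimal sketch, assuming a live cluster, the `test` namespace seen in the sidecar logs below, and the default `scylla` container name (all assumptions, adjust to your setup):

```shell
# Gossip state as seen from the surviving node; the decommissioned node
# should show STATUS LEFT while its pod still exists.
kubectl -n test exec pod/example-us-east-1-us-east-1a-0 -c scylla -- nodetool gossipinfo

# The stuck pod stays 1/2 Ready because the readiness probe keeps failing
# on the decommissioned node.
kubectl -n test get pod example-us-east-1-us-east-1a-1
kubectl -n test logs pod/example-us-east-1-us-east-1a-1 --tail=5
```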
I've been able to reproduce this every time.
I initially reproduced it with ScyllaDB 5.0.5, but to make sure this isn't scylladb/scylladb#11302, I bumped the version to 5.2.0-rc2.
Expected behavior
The cluster eventually reaches 3 nodes.
Additional context
$ kubectl get scyllacluster,sts,pods,svc,pvc
NAME                                        AGE
scyllacluster.scylla.scylladb.com/example   31m

NAME                                           READY   AGE
statefulset.apps/example-us-east-1-us-east-1a   1/3     27m

NAME                                 READY   STATUS    RESTARTS   AGE
pod/example-us-east-1-us-east-1a-0   2/2     Running   0          12m
pod/example-us-east-1-us-east-1a-1   1/2     Running   0          11m

NAME                                     TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                                                                                                          AGE
service/example-client                   ClusterIP   10.101.69.218   <none>        7000/TCP,7001/TCP,7199/TCP,10001/TCP,9180/TCP,5090/TCP,9100/TCP,9042/TCP,9142/TCP,19042/TCP,19142/TCP,9160/TCP   12m
service/example-us-east-1-us-east-1a-0   ClusterIP   10.109.78.80    <none>        7000/TCP,7001/TCP,7199/TCP,10001/TCP,9180/TCP,5090/TCP,9100/TCP,9042/TCP,9142/TCP,19042/TCP,19142/TCP,9160/TCP   12m
service/example-us-east-1-us-east-1a-1   ClusterIP   10.104.226.80   <none>        7000/TCP,7001/TCP,7199/TCP,10001/TCP,9180/TCP,5090/TCP,9100/TCP,9042/TCP,9142/TCP,19042/TCP,19142/TCP,9160/TCP   12m
service/example-us-east-1-us-east-1a-2   ClusterIP   10.109.59.242   <none>        7000/TCP,7001/TCP,7199/TCP,10001/TCP,9180/TCP,5090/TCP,9100/TCP,9042/TCP,9142/TCP,19042/TCP,19142/TCP,9160/TCP   7m54s

NAME                                                        STATUS   VOLUME              CAPACITY   ACCESS MODES   STORAGECLASS    AGE
persistentvolumeclaim/data-example-us-east-1-us-east-1a-0   Bound    local-pv-19c669cf   23Gi       RWO            local-storage   12m
persistentvolumeclaim/data-example-us-east-1-us-east-1a-1   Bound    local-pv-bb0d6ca7   23Gi       RWO            local-storage   11m
INFO 2023-03-17 16:32:35,322 [shard 0] storage_service - decommission[96dc4cae-0a54-4f95-a00b-ad5c6c96ad76]: Stopped heartbeat_updater
INFO 2023-03-17 16:32:35,323 [shard 0] storage_service - decommission[96dc4cae-0a54-4f95-a00b-ad5c6c96ad76]: leaving Raft group 0
INFO 2023-03-17 16:32:35,323 [shard 0] raft_group0 - leaving group 0 (my id = 5fce5ebb-5461-42ab-a4f3-6b52e23fa87d)...
INFO 2023-03-17 16:32:35,365 [shard 0] storage_service - decommission[96dc4cae-0a54-4f95-a00b-ad5c6c96ad76]: left Raft group 0
INFO 2023-03-17 16:32:35,365 [shard 0] storage_service - Stop transport: starts
INFO 2023-03-17 16:32:35,365 [shard 0] migration_manager - stopping migration service
INFO 2023-03-17 16:32:35,365 [shard 0] storage_service - Shutting down native transport server
INFO 2023-03-17 16:32:35,365 [shard 0] cql_server_controller - CQL server stopped
INFO 2023-03-17 16:32:35,365 [shard 0] storage_service - Shutting down native transport server was successful
INFO 2023-03-17 16:32:35,365 [shard 0] storage_service - Shutting down rpc server
INFO 2023-03-17 16:32:35,365 [shard 0] storage_service - Shutting down rpc server was successful
INFO 2023-03-17 16:32:35,365 [shard 0] storage_service - Shutting down alternator server
INFO 2023-03-17 16:32:35,365 [shard 0] storage_service - Shutting down alternator server was successful
INFO 2023-03-17 16:32:35,365 [shard 0] storage_service - Shutting down redis server
INFO 2023-03-17 16:32:35,365 [shard 0] storage_service - Shutting down redis server was successful
INFO 2023-03-17 16:32:35,365 [shard 0] storage_service - Stop transport: shutdown rpc and cql server done
INFO 2023-03-17 16:32:35,365 [shard 0] gossip - My status = LEFT
WARN 2023-03-17 16:32:35,365 [shard 0] gossip - No local state or state is in silent shutdown, not announcing shutdown
INFO 2023-03-17 16:32:35,365 [shard 0] gossip - Disable and wait for gossip loop started
INFO 2023-03-17 16:32:35,542 [shard 0] gossip - failure_detector_loop: Finished main loop
INFO 2023-03-17 16:32:35,542 [shard 0] gossip - Gossip is now stopped
INFO 2023-03-17 16:32:35,542 [shard 0] storage_service - Stop transport: stop_gossiping done
INFO 2023-03-17 16:32:35,542 [shard 0] messaging_service - Stopping nontls server
INFO 2023-03-17 16:32:35,542 [shard 0] messaging_service - Stopping tls server
INFO 2023-03-17 16:32:35,542 [shard 0] messaging_service - Stopping tls server - Done
INFO 2023-03-17 16:32:35,542 [shard 0] messaging_service - Stopping client for address: 10.109.78.80:0
INFO 2023-03-17 16:32:35,542 [shard 0] messaging_service - Stopping client for address: 10.104.226.80:0
INFO 2023-03-17 16:32:35,542 [shard 0] messaging_service - Stopping client for address: 10.109.78.80:0
INFO 2023-03-17 16:32:35,542 [shard 0] messaging_service - Stopping client for address: 10.109.78.80:0
INFO 2023-03-17 16:32:35,542 [shard 0] messaging_service - Stopping client for address: 10.109.78.80:0
INFO 2023-03-17 16:32:35,542 [shard 0] messaging_service - Stopping client for address: 10.109.78.80:0
INFO 2023-03-17 16:32:35,542 [shard 0] messaging_service - Stopping client for address: 10.104.226.80:0
INFO 2023-03-17 16:32:35,543 [shard 0] messaging_service - Stopping client for address: 10.109.78.80:0 - Done
INFO 2023-03-17 16:32:35,543 [shard 0] messaging_service - Stopping client for address: 10.109.78.80:0 - Done
INFO 2023-03-17 16:32:35,543 [shard 0] messaging_service - Stopping client for address: 10.109.78.80:0 - Done
INFO 2023-03-17 16:32:35,543 [shard 0] messaging_service - Stopping client for address: 10.104.226.80:0 - Done
INFO 2023-03-17 16:32:35,544 [shard 0] messaging_service - Stopping client for address: 10.109.78.80:0 - Done
INFO 2023-03-17 16:32:35,544 [shard 0] messaging_service - Stopping client for address: 10.104.226.80:0 - Done
INFO 2023-03-17 16:32:35,544 [shard 0] messaging_service - Stopping client for address: 10.109.78.80:0 - Done
INFO 2023-03-17 16:32:35,544 [shard 0] messaging_service - Stopping nontls server - Done
INFO 2023-03-17 16:32:35,544 [shard 0] storage_service - messaging_service stopped
INFO 2023-03-17 16:32:35,544 [shard 0] storage_service - Stop transport: shutdown messaging_service done
INFO 2023-03-17 16:32:35,544 [shard 0] storage_service - Stop transport: shutdown stream_manager done
INFO 2023-03-17 16:32:35,544 [shard 0] storage_service - Stop transport: done
INFO 2023-03-17 16:32:35,544 [shard 0] storage_service - DECOMMISSIONING: stopped transport
INFO 2023-03-17 16:32:35,544 [shard 0] batchlog_manager - Asked to drain
INFO 2023-03-17 16:32:35,544 [shard 0] batchlog_manager - Drained
INFO 2023-03-17 16:32:35,544 [shard 0] storage_service - DECOMMISSIONING: stop batchlog_manager done
INFO 2023-03-17 16:32:35,907 [shard 0] compaction - [Compact system.local 53173930-c4e1-11ed-a898-c81a46a262e5] Compacting [/var/lib/scylla/data/system/local-7ad54392bcdd35a684174e047860b377/me-11-big-Data.db:level=0:origin=memtable,/var/lib/scylla/data/system/local-7ad54392bcdd35a684174e047860b377/me-10-big-Data.db:level=0:origin=compaction]
INFO 2023-03-17 16:32:35,908 [shard 0] storage_service - DECOMMISSIONING: set_bootstrap_state done
INFO 2023-03-17 16:32:35,908 [shard 0] storage_service - entering DECOMMISSIONED mode
INFO 2023-03-17 16:32:35,908 [shard 0] storage_service - DECOMMISSIONING: done
INFO 2023-03-17 16:32:36,307 [shard 0] compaction - [Compact system.local 53173930-c4e1-11ed-a898-c81a46a262e5] Compacted 2 sstables to [/var/lib/scylla/data/system/local-7ad54392bcdd35a684174e047860b377/me-12-big-Data.db:level=0]. 86kB to 45kB (~52% of original) in 311ms = 144kB/s. ~256 total partitions merged to 1.
INFO 2023-03-17 16:32:36,353 [shard 0] raft_group_registry - marking Raft server 5fce5ebb-5461-42ab-a4f3-6b52e23fa87d as dead for raft groups
INFO 2023-03-17 16:32:36,353 [shard 0] raft_group_registry - marking Raft server afdb17f4-86ef-44ea-bdb4-01ddb1dc2902 as dead for raft groups
I0317 16:32:37.658371 1 sidecar/probes.go:122] "readyz probe: node is not ready" Service="test/example-us-east-1-us-east-1a-1"
I0317 16:32:39.310344 1 sidecar/sync.go:92] "The node is already decommissioned"
I0317 16:32:47.659056 1 sidecar/probes.go:122] "readyz probe: node is not ready" Service="test/example-us-east-1-us-east-1a-1"
I0317 16:32:57.657179 1 sidecar/probes.go:122] "readyz probe: node is not ready" Service="test/example-us-east-1-us-east-1a-1"
I0317 16:33:07.658469 1 sidecar/probes.go:122] "readyz probe: node is not ready" Service="test/example-us-east-1-us-east-1a-1"
I0317 16:33:17.660454 1 sidecar/probes.go:122] "readyz probe: node is not ready" Service="test/example-us-east-1-us-east-1a-1"
I0317 16:33:27.434626 1 sidecar/probes.go:122] "readyz probe: node is not ready" Service="test/example-us-east-1-us-east-1a-1"
I0317 16:33:27.657882 1 sidecar/probes.go:122] "readyz probe: node is not ready" Service="test/example-us-east-1-us-east-1a-1"
I0317 16:33:37.655646 1 sidecar/probes.go:122] "readyz probe: node is not ready" Service="test/example-us-east-1-us-east-1a-1"
I0317 16:33:47.657373 1 sidecar/probes.go:122] "readyz probe: node is not ready" Service="test/example-us-east-1-us-east-1a-1"
I0317 16:33:57.659529 1 sidecar/probes.go:122] "readyz probe: node is not ready" Service="test/example-us-east-1-us-east-1a-1"
I0317 16:34:07.656316 1 sidecar/probes.go:122] "readyz probe: node is not ready" Service="test/example-us-east-1-us-east-1a-1"
I0317 16:34:17.659727 1 sidecar/probes.go:122] "readyz probe: node is not ready" Service="test/example-us-east-1-us-east-1a-1"
I0317 16:34:27.658937 1 sidecar/probes.go:122] "readyz probe: node is not ready" Service="test/example-us-east-1-us-east-1a-1"
I0317 16:34:37.660100 1 sidecar/probes.go:122] "readyz probe: node is not ready" Service="test/example-us-east-1-us-east-1a-1"
I0317 16:34:47.658364 1 sidecar/probes.go:122] "readyz probe: node is not ready" Service="test/example-us-east-1-us-east-1a-1"
I0317 16:34:53.434242 1 sidecar/probes.go:122] "readyz probe: node is not ready" Service="test/example-us-east-1-us-east-1a-1"
I0317 16:34:57.657199 1 sidecar/probes.go:122] "readyz probe: node is not ready" Service="test/example-us-east-1-us-east-1a-1"
I0317 16:35:07.657590 1 sidecar/probes.go:122] "readyz probe: node is not ready" Service="test/example-us-east-1-us-east-1a-1"
I0317 16:35:17.658096 1 sidecar/probes.go:122] "readyz probe: node is not ready" Service="test/example-us-east-1-us-east-1a-1"
I0317 16:35:27.655923 1 sidecar/probes.go:122] "readyz probe: node is not ready" Service="test/example-us-east-1-us-east-1a-1"
I0317 16:35:37.658450 1 sidecar/probes.go:122] "readyz probe: node is not ready" Service="test/example-us-east-1-us-east-1a-1"
I0317 16:35:47.661271 1 sidecar/probes.go:122] "readyz probe: node is not ready" Service="test/example-us-east-1-us-east-1a-1"
I0317 16:35:57.659960 1 sidecar/probes.go:122] "readyz probe: node is not ready" Service="test/example-us-east-1-us-east-1a-1"
I0317 16:36:07.660792 1 sidecar/probes.go:122] "readyz probe: node is not ready" Service="test/example-us-east-1-us-east-1a-1"
I0317 16:36:08.437477 1 sidecar/probes.go:122] "readyz probe: node is not ready" Service="test/example-us-east-1-us-east-1a-1"
I0317 16:36:17.659351 1 sidecar/probes.go:122] "readyz probe: node is not ready" Service="test/example-us-east-1-us-east-1a-1"
I0317 16:36:27.657682 1 sidecar/probes.go:122] "readyz probe: node is not ready" Service="test/example-us-east-1-us-east-1a-1"
I0317 16:36:37.656668 1 sidecar/probes.go:122] "readyz probe: node is not ready" Service="test/example-us-east-1-us-east-1a-1"
I0317 16:36:47.658896 1 sidecar/probes.go:122] "readyz probe: node is not ready" Service="test/example-us-east-1-us-east-1a-1"
...
I0317 16:46:57.659029 1 sidecar/probes.go:122] "readyz probe: node is not ready" Service="test/example-us-east-1-us-east-1a-1"
...