scaledown - making it more deterministic #1214

gtully · 2025-09-16T16:12:07Z

gtully
Sep 16, 2025
Collaborator

The current scale-down logic takes the pod out of the StatefulSet and starts a Deployment that takes over the PVC and uses the broker's JMX shutdown(scaledown=true) feature, which relies on cluster support. If anything fails, this procedure can leave messages pending or lost. We also lose visibility of the draining broker and the availability of the StatefulSet.

I am thinking of changing the behaviour. The guiding principle would be to scale down only empty brokers and to block the StatefulSet size adjustment until the broker is empty.

This means that the broker needs to quiesce: producers need to stop, and probably consumers too, as connections can share producers and consumers.
Then existing messages on queues need to be drained. Updating the configuration to add an AMQP queue bridge to the broker group can drain any pending messages.

If there are pending messages on any cluster store-and-forward (SNF) queue, these need to be allowed to drain over an existing cluster connection. This may be a challenge if we also want to stop message production. It only applies when clustered=true and size > 1. It will still be OK to scale to zero with pending messages, as the ordinal will be 0.
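The rules above can be sketched as a single predicate. This is only a rough sketch in Go: `BrokerStats` and `SafeToRemove` are hypothetical names, not existing operator code, and the ordinal 0 case is read here as an exemption from the empty check, which is an assumption.

```go
package main

// BrokerStats is a hypothetical snapshot of what the operator might read
// from a broker before deciding whether its pod can go away.
type BrokerStats struct {
	Connections        int   // open client connections
	PendingMessages    int64 // messages on regular queues
	PendingSNFMessages int64 // messages on cluster store-and-forward queues
}

// SafeToRemove reports whether the pod with the given ordinal could be
// removed without stranding messages, under the rules described above.
func SafeToRemove(ordinal int, clustered bool, size int, s BrokerStats) bool {
	// Assumption: ordinal 0 is only removed when scaling to zero, and
	// pending messages are tolerated in that case.
	if ordinal == 0 {
		return true
	}
	// The broker must be quiesced and its regular queues drained.
	if s.Connections > 0 || s.PendingMessages > 0 {
		return false
	}
	// SNF queues only matter when clustered with more than one broker.
	if clustered && size > 1 && s.PendingSNFMessages > 0 {
		return false
	}
	return true
}
```

Keeping the SNF check separate from the regular queue check matches the idea that SNF messages may need a different remedy than ordinary pending messages.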

I am thinking of using conditions and reasons.

ScaleDownPending
ReasonQuiescingBroker connectionCount=?
ReasonPendingMessages count=?
ReasonPendingSNFMessages count=?

If a broker is in state ScaleDownPending=true with reason ReasonPendingSNFMessages and count > 0, some management intervention may be needed to purge the relevant SNF queue or move the messages elsewhere.
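To make the condition and reasons a bit more concrete, here is a rough sketch of how the status could be derived. The `Condition` struct is a simplified stand-in for a Kubernetes status condition, the reason strings are shortened forms of the names listed above, and `BrokerEmpty` plus the order of the checks are my own assumptions.

```go
package main

import "fmt"

// Condition is a simplified stand-in for a Kubernetes status condition.
type Condition struct {
	Type    string
	Status  string // "True" or "False"
	Reason  string
	Message string
}

// Hypothetical reason strings, shortened from the names proposed above.
const (
	ReasonQuiescingBroker    = "QuiescingBroker"
	ReasonPendingMessages    = "PendingMessages"
	ReasonPendingSNFMessages = "PendingSNFMessages"
	ReasonBrokerEmpty        = "BrokerEmpty" // not in the proposal, added for the empty case
)

// ScaleDownPending builds the condition from the counts observed on the broker.
// The first blocking reason wins: connections, then queues, then SNF queues.
func ScaleDownPending(connections int, pending, pendingSNF int64) Condition {
	c := Condition{Type: "ScaleDownPending", Status: "True"}
	switch {
	case connections > 0:
		c.Reason = ReasonQuiescingBroker
		c.Message = fmt.Sprintf("connectionCount=%d", connections)
	case pending > 0:
		c.Reason = ReasonPendingMessages
		c.Message = fmt.Sprintf("count=%d", pending)
	case pendingSNF > 0:
		c.Reason = ReasonPendingSNFMessages
		c.Message = fmt.Sprintf("count=%d", pendingSNF)
	default:
		c.Status = "False"
		c.Reason = ReasonBrokerEmpty
		c.Message = "broker is quiesced and empty"
	}
	return c
}
```

Reporting one reason at a time keeps the status readable and tells whoever is watching what the scale-down is currently waiting on.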

Eventually, the operator will see an empty, quiesced broker, scale down the StatefulSet, and release the PVC in the normal way.
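The blocking behaviour could look roughly like this on each reconcile pass. `NextReplicas` is a hypothetical helper, and stepping down one pod at a time is an assumption on my part, not something decided yet.

```go
package main

// NextReplicas returns the StatefulSet replica count to apply on this
// reconcile pass, given the current size, the requested size, and whether
// the broker with the highest ordinal is quiesced and empty.
func NextReplicas(current, desired int, highestOrdinalEmpty bool) int {
	if desired >= current {
		// Scale up or no change: nothing needs draining.
		return desired
	}
	if !highestOrdinalEmpty {
		// Block the size adjustment until the broker is empty.
		return current
	}
	// Assumption: remove one pod per pass, highest ordinal first.
	return current - 1
}
```

For example, with `current=3` and `desired=1`, the count stays at 3 until the broker at ordinal 2 is empty, then moves to 2, and the same check repeats for ordinal 1.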

I need to check the detail on quiescing to see what is currently available through config reload and JMX. Ideally, config reload will suffice to effect the necessary restrictions. That is the plan.

gtully · 2025-12-05T16:31:05Z

gtully
Dec 5, 2025
Collaborator Author

Here is an answer: #1248. It is a little simpler, making repeated use of the JMX broker control scaledown operation. The configuration for scale-down of the cluster (discovery and target selection) is now in brokerProperties, and any errors are surfaced. The node being scaled down remains part of the StatefulSet until it is empty.
