Commit 433b3bb

david-yu and claude authored
[v/25.2] manage/k8s: document decommission timing (--decommission-wait-interval) (#1765)
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
1 parent 2afba70 commit 433b3bb

1 file changed

Lines changed: 59 additions & 2 deletions

modules/manage/pages/kubernetes/k-decommission-brokers.adoc

@@ -466,12 +466,14 @@ helm upgrade --install redpanda-controller redpanda/operator \
   --namespace <namespace> \
   --set image.tag={latest-operator-version} \
   --create-namespace \
-  --set additionalCmdFlags={--additional-controllers="decommission"} \
+  --set "additionalCmdFlags={--additional-controllers=decommission}" \
   --set rbac.createAdditionalControllerCRs=true
 ----
 +
-- `--additional-controllers="decommission"`: Enables the Decommission controller.
+- `--additional-controllers=decommission`: Enables the Decommission controller.
 - `rbac.createAdditionalControllerCRs=true`: Creates the required RBAC rules for the Redpanda Operator to monitor the StatefulSet and update PVCs and PVs.
++
+TIP: To change how often the Decommission controller re-checks the cluster for brokers that need decommissioning, pass the `--decommission-wait-interval` flag through `additionalCmdFlags`. See <<decommission-timing>>.

 .. Configure a Redpanda resource with seven Redpanda brokers:
 +
@@ -644,6 +646,61 @@ kubectl logs <pod-name> --namespace <namespace> -c sidecars

 You can repeat this procedure to continue to scale down.

+[[decommission-timing]]
+== Tune automatic decommission timing
+
+The <<Automated,automatic decommissioner>> re-checks the cluster on a regular interval for brokers that need to be decommissioned. The setting that controls this interval, and any debounce window before the decommissioner acts, depends on how the controller is deployed: as the Decommission controller inside the Redpanda Operator, or as the broker decommissioner sidecar in a Helm-only deployment.
+
+[cols="2,1,4"]
+|===
+| Setting | Default | Description
+
+| `--decommission-wait-interval` (Operator; set through `additionalCmdFlags`)
+| `8s`
+| Requeue interval (`RequeueAfter`) for the Operator's Decommission controller: how often the controller re-checks the cluster for brokers that need decommissioning when a reconcile did not already schedule a sooner re-check.
+
+| `decommissionRequeueTimeout` (Helm sidecar; under `statefulset.sideCars.brokerDecommissioner`)
+| `10s`
+| How often the sidecar re-checks a cluster that already has a broker flagged for decommissioning.
+
+| `decommissionAfter` (Helm sidecar; under `statefulset.sideCars.brokerDecommissioner`)
+| `60s`
+| How long a broker must continuously meet the decommission conditions before the sidecar acts. This debounce window prevents acting on transient conditions, such as a broker that is briefly unreachable during a restart.
+|===
+
+=== Set the interval for the Operator
+
+The Operator's Decommission controller does not expose its interval as a dedicated Helm value. Instead, pass the `--decommission-wait-interval` flag through `additionalCmdFlags` when you install or upgrade the Operator:
+
+[,bash,subs="attributes+"]
+----
+helm upgrade --install redpanda-controller redpanda/operator \
+  --namespace <namespace> \
+  --create-namespace \
+  --set image.tag={latest-operator-version} \
+  --set "additionalCmdFlags={--additional-controllers=decommission,--decommission-wait-interval=30s}" \
+  --set rbac.createAdditionalControllerCRs=true
+----
+
+The flag accepts any Go duration string, such as `8s`, `30s`, or `2m`. The default is `8s`. After each reconcile, the controller logs the next scheduled run, and the `next run in` value reflects the configured interval:
+
+[.no-copy]
+----
+{"level":"info","logger":"DecommissionReconciler.Reconcile","msg":"successful reconciliation finished in 1m0s, next run in 30s","controller":"statefulset", ...}
+----
+
+=== Set the intervals for Helm
+
+For a Helm-only deployment, set the sidecar values directly under `statefulset.sideCars.brokerDecommissioner`. For a full example, see <<Automated,Use the BrokerDecommissioner>>.
+
+=== Guidance for adjusting the intervals
+
+* These settings control only how often the decommissioner *re-checks* for work and how long it waits before acting. They do not change how fast partition data is reallocated once a decommission begins. Reallocation throughput is governed by xref:reference:cluster-properties.adoc#raft_learner_recovery_rate[`raft_learner_recovery_rate`] and xref:reference:tunable-properties.adoc#partition_autobalancing_concurrent_moves[`partition_autobalancing_concurrent_moves`].
+* This interval is the *periodic* re-check cadence. A scale-in that you initiate by reducing `statefulset.replicas` is detected from a StatefulSet watch event and acted on promptly, so raising the interval does not delay a routine scale-in. The interval primarily determines how quickly the controller notices conditions that arise without a triggering event, such as a broker that becomes unreachable.
+* Increase the re-check interval to reduce reconcile frequency, and the associated log and Admin API traffic, on large or stable clusters. Decrease it for faster detection of brokers that need decommissioning.
+* For Helm (sidecar) deployments, keep `decommissionRequeueTimeout` smaller than `decommissionAfter` -- ideally well below it -- so the sidecar re-evaluates the cluster at least once within the debounce window. If the re-check interval is close to or larger than `decommissionAfter`, the decommissioner may wait up to one additional interval before acting. The Kubernetes controller-runtime work queue also adds a small amount of jitter.
+* A single Operator reconcile can take up to about a minute because the Decommission controller verifies that cluster health is stable before it commits to a decommission. This is expected, and is independent of the `--decommission-wait-interval` value.
+
 == Troubleshooting

 If the decommissioning process is not making progress, investigate the following potential issues:
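
The guidance added in this diff (keep `decommissionRequeueTimeout` below `decommissionAfter`, or the sidecar "may wait up to one additional interval") can be illustrated with a little arithmetic. The sketch below is not part of the commit. The helper names `to_seconds`, `first_action_delay`, and `sidecar_set_flags` are hypothetical, and the timing model is an assumption: the condition is first seen at one re-check, and the sidecar acts at the first later re-check that falls at least `decommissionAfter` after it. Only whole-second durations such as `60s` are handled, and work-queue jitter is ignored.

```shell
#!/bin/sh
# Sketch under stated assumptions; not taken from the sidecar's source.

# Convert a whole-second duration such as "60s" to a number of seconds.
# Other Go duration forms ("2m", "1m30s") are rejected by this sketch.
to_seconds() {
  case "$1" in
    *[!0-9]*s|s|"") echo "unsupported duration: $1" >&2; return 1 ;;
    *s) echo "${1%s}" ;;
    *) echo "unsupported duration: $1" >&2; return 1 ;;
  esac
}

# Assumed model: delay between the first observation and the first re-check
# that is at least decommissionAfter later, which is decommissionAfter
# rounded up to a multiple of decommissionRequeueTimeout.
# Usage: first_action_delay <decommissionAfter> <decommissionRequeueTimeout>
first_action_delay() {
  after_s=$(to_seconds "$1") || return 1
  requeue_s=$(to_seconds "$2") || return 1
  [ "$requeue_s" -gt 0 ] || { echo "requeue interval must be > 0" >&2; return 1; }
  echo $(( (after_s + requeue_s - 1) / requeue_s * requeue_s ))
}

# Print --set flags for the two sidecar values named in the table above.
# Usage: sidecar_set_flags <decommissionAfter> <decommissionRequeueTimeout>
sidecar_set_flags() {
  prefix="statefulset.sideCars.brokerDecommissioner"
  echo "--set ${prefix}.decommissionAfter=$1 --set ${prefix}.decommissionRequeueTimeout=$2"
}

first_action_delay 60s 10s   # prints 60 (documented defaults)
first_action_delay 60s 45s   # prints 90 (interval close to the debounce window)
sidecar_set_flags 60s 10s
```

With the documented defaults the modelled delay equals the debounce window. Once the re-check interval approaches `decommissionAfter`, the delay grows by most of an extra interval, which matches the behavior the guidance describes. The flags printed by `sidecar_set_flags` could be appended to a `helm upgrade` of the Redpanda chart; the release and chart names are left to the reader because this diff does not show them.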
