docs/versioned_docs/version-4.4.x/user-guides/replicated-storage-user-guide/replicated-pv-mayastor/configuration/rs-topology-parameters.md
## "stsAffinityGroup"

`stsAffinityGroup` represents a collection of volumes that belong to instances of a Kubernetes StatefulSet. When a StatefulSet is deployed, each instance within the StatefulSet creates its own individual volume, which collectively forms the `stsAffinityGroup`. Each volume within the `stsAffinityGroup` corresponds to a pod of the StatefulSet.

This feature enforces the following rules to ensure proper placement and distribution of replicas and targets, so that no single point of failure affects multiple instances of the StatefulSet.

1. Anti-affinity among single-replica volumes:

   This is a hard rule. Single-replica volumes in the same affinity group must not be placed on the same node. This prevents a single node failure from impacting multiple StatefulSet pods.

2. Anti-affinity among multi-replica volumes:

   This is a soft rule. While placement is optimized to spread replicas across nodes, the scheduler may relax this rule when necessary.

3. Anti-affinity among targets:

   Targets are distributed to avoid a failure domain impacting multiple volumes in the affinity group.

By default, the `stsAffinityGroup` feature is disabled. To enable it, modify the storage class YAML by setting the `parameters.stsAffinityGroup` parameter to `true`, as shown in the sketch below.
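
As a reference, a minimal storage class sketch with the parameter enabled might look like the following. The class name, replica count, and protocol are illustrative placeholders rather than values prescribed by this guide, and the provisioner shown assumes the standard Mayastor CSI provisioner name; adjust these to match your cluster.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: mayastor-sts-affinity        # placeholder name
parameters:
  protocol: nvmf
  repl: "1"                          # single-replica volumes, so the hard anti-affinity rule applies
  stsAffinityGroup: "true"           # enables the affinity group behaviour described above
provisioner: io.openebs.csi-mayastor
```

StatefulSets whose `volumeClaimTemplates` reference this class will then have their volumes grouped into one `stsAffinityGroup`.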
### Volume Affinity Group Scale-Down Restrictions

When using `stsAffinityGroup`, replicas of volumes belonging to the same StatefulSet are distributed across different nodes to avoid a single point of failure. Because of these anti-affinity rules, scaling a volume down to 1 replica may be restricted if doing so would place the last remaining replica on a node that already hosts another single-replica volume from the same affinity group.

A scale-down to 1 replica is allowed only when the current replicas are already placed on different nodes. If the replicas end up on the same node (for example, after scaling from 3 replicas to 2), the system may block the scale-down until the placement is improved.

If a scale-down is blocked, you can resolve it by temporarily scaling the volume up to add a replica while the volume is published, and then scaling down again. This reshuffles the replicas to meet the affinity group's placement rules. A rough example of this workaround is sketched below.
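
For illustration only, assuming the `kubectl mayastor` plugin is installed and that its `scale volume` subcommand is available in your release (verify with `kubectl mayastor --help`), the workaround might look roughly like this; the volume UUID and replica counts are placeholders.

```bash
# List volumes and note the UUID of the volume whose scale-down is blocked
# (the UUID below is a placeholder).
kubectl mayastor get volumes

# Temporarily add a replica so the scheduler can place it on a different node.
kubectl mayastor scale volume ec4e66fd-3b33-4439-b504-d49aba53da26 3

# Once the new replica is healthy, scale back down to the desired count.
kubectl mayastor scale volume ec4e66fd-3b33-4439-b504-d49aba53da26 1
```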
These restrictions ensure that a single node failure does not impact multiple StatefulSet instances, preserving fault isolation and reliability for applications using affinity-grouped volumes.