docs/versioned_docs/version-4.4.x/releases.md (3 changes: 2 additions & 1 deletion)
@@ -20,7 +20,8 @@ The status of the various components as of v4.4 are as follows:
| Local Storage | Local PV Hostpath | 4.4.0 | Stable |
| Local Storage | Local PV LVM | 1.8.0 | Stable |
| Local Storage | Local PV ZFS | 2.9.0 | Stable |
-| External Provisioners | Local PV Hostpath | 4.4.0 | Stable |
+| Local Storage | Local PV Rawfile | 0.12.0 | Experimental |
+| Out-of-tree (External Storage) Provisioners | Local PV Hostpath | 4.4.0 | Stable |
| Other Components | CLI | 4.4.0 | — |

## What’s New
@@ -266,31 +266,28 @@ as the storage class has `zone` as the value for `poolHasTopologyKey` that matches

## "stsAffinityGroup"

`stsAffinityGroup` represents a collection of volumes that belong to instances of a Kubernetes StatefulSet. When a StatefulSet is deployed, each of its instances creates its own volume, and these volumes collectively form the `stsAffinityGroup`. Each volume within the `stsAffinityGroup` corresponds to a pod of the StatefulSet.

This feature enforces the following rules to ensure the proper placement and distribution of replicas and targets, so that no single point of failure affects multiple instances of the StatefulSet.

1. Anti-Affinity among single-replica volumes:
-This rule ensures that replicas of different volumes are distributed in such a way that there is no single point of failure. By avoiding the colocation of replicas from different volumes on the same node.
+This is a hard rule. Single-replica volumes in the same affinity group must not be placed on the same node. This prevents a single node failure from impacting multiple StatefulSet pods.

-2. Anti-Affinity among multi-replica volumes:

-If the affinity group volumes have multiple replicas, they already have some level of redundancy. This feature ensures that in such cases, the replicas are distributed optimally for the stsAffinityGroup volumes.
+2. Anti-Affinity among multi-replica volumes:
+This is a soft rule. While placement is optimized to spread replicas across nodes, the scheduler may relax this rule when necessary.

3. Anti-affinity among targets:

-The [High Availability](../replicated-pv-mayastor/advanced-operations/HA.md) feature ensures that there is no single point of failure for the targets.
-The `stsAffinityGroup` ensures that in such cases, the targets are distributed optimally for the stsAffinityGroup volumes.
+Targets are distributed to avoid a failure domain impacting multiple volumes in the affinity group.

By default, the `stsAffinityGroup` feature is disabled. To enable it, modify the storage class YAML by setting the `parameters.stsAffinityGroup` parameter to true.
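
For reference, the sketch below shows a minimal storage class with the flag enabled. It is illustrative only: the class name and the `repl` and `protocol` values are placeholders, and the provisioner name assumes the Replicated PV Mayastor CSI driver.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: mayastor-sts-affinity        # placeholder name
parameters:
  protocol: nvmf                     # placeholder transport value
  repl: "1"                          # placeholder replica count (parameters are strings)
  stsAffinityGroup: "true"           # enables the affinity group behavior described above
provisioner: io.openebs.csi-mayastor # assumed Replicated PV Mayastor provisioner
```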

### Volume Affinity Group Scale-Down Restrictions

-When using stsAffinityGroup, replicas of volumes belonging to the same StatefulSet are distributed across different nodes to avoid a single point of failure. Due to these anti-affinity rules, scaling a volume down to 1 replica may be restricted, if doing so would cause the last remaining replica to reside on a node that already hosts another single-replica volume from the same affinity group.
+When using `stsAffinityGroup`, replicas of volumes belonging to the same StatefulSet are distributed across different nodes to avoid a single point of failure. Because of these anti-affinity rules, scaling a volume down to 1 replica may be restricted if doing so would place the last remaining replica on a node that already hosts another single-replica volume from the same affinity group.

-Scale-down to 1 replica is allowed only when the current replicas are already placed on different nodes. If the replicas end up on the same node. For example, after scaling from 3 replicas to 2, the system may block the scale-down until the placement is improved.
+A scale-down to 1 replica is allowed only when the current replicas are already placed on different nodes. If the replicas end up on the same node, for example, after scaling from 3 replicas to 2, the system may block the scale-down until the placement is improved.

-If a scale-down is blocked, you can resolve it by temporarily scaling the volume up to add a replica (allowing the system to place it on a different node) and then scaling down again. This reshuffles the replicas to meet the affinity group’s placement rules.
+If a scale-down is blocked, you can resolve it by temporarily scaling the volume up to add a replica whilst the volume is published and then scaling down again. This reshuffles the replicas to meet the affinity group’s placement rules.

These restrictions ensure that a single node failure does not impact multiple StatefulSet instances, preserving fault isolation and reliability for applications using affinity-grouped volumes.
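
As a sketch of the workaround described above, the commands below assume the Mayastor kubectl plugin and its `scale volume` subcommand; substitute a real volume UUID and verify the syntax against your installed plugin version.

```bash
# Inspect current volumes, replica counts, and placement (assumed plugin command).
kubectl mayastor get volumes

# Temporarily add a replica so a copy can be placed on a different node.
kubectl mayastor scale volume <volume-uuid> 3

# Once the new replica is healthy and placement satisfies the affinity rules,
# scale back down to a single replica.
kubectl mayastor scale volume <volume-uuid> 1
```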
