---
title: vsphere-configurable-maximum-allowed-number-of-block-volumes-per-node
authors:
  - "@rbednar"
reviewers:
  - "@jsafrane"
  - "@gnufied"
  - "@deads2k"
approvers:
  - "@jsafrane"
  - "@gnufied"
  - "@deads2k"
api-approvers:
  - "@deads2k"
creation-date: 2025-01-31
last-updated: 2025-01-31
tracking-link:
  - https://issues.redhat.com/browse/OCPSTRAT-1829
see-also:
  - "None"
replaces:
  - "None"
superseded-by:
  - "None"
---

# vSphere configurable maximum allowed number of block volumes per node

This document proposes an enhancement to the vSphere CSI driver to allow administrators to configure the maximum number
of block volumes that can be attached to a single vSphere node. This enhancement addresses the limitations of the current driver,
which relies on a static limit based on the number of SCSI controllers available on the vSphere node.

## Summary

The vSphere CSI driver for vSphere version 7 uses a constant to determine the maximum number of block volumes that can
be attached to a single node. This limit is derived from the number of SCSI controllers available on the node.
By default, a node can have up to four SCSI controllers, each supporting up to 15 devices, allowing for a maximum of 60
volumes per node (59 + root volume).

However, vSphere version 8 increased the maximum number of volumes per node to 256 (255 + root volume). This enhancement
aims to leverage this increased limit and provide administrators with finer-grained control over volume allocation,
allowing them to configure the maximum number of block volumes that can be attached to a single node.

- Details about configuration maximums: https://configmax.broadcom.com/guest?vmwareproduct=vSphere&release=vSphere%208.0&categories=3-0
- Volume limit configuration for the vSphere storage plug-in: https://techdocs.broadcom.com/us/en/vmware-cis/vsphere/container-storage-plugin/3-0/getting-started-with-vmware-vsphere-container-storage-plug-in-3-0/vsphere-container-storage-plug-in-concepts/configuration-maximums-for-vsphere-container-storage-plug-in.html

## Motivation

### User Stories

- As a vSphere administrator, I want to configure the maximum number of volumes that can be attached to a node, so that
  I can optimize resource utilization and prevent oversubscription.
- As a cluster administrator, I want to ensure that the vSphere CSI driver operates within the limits imposed by the
  underlying vSphere infrastructure.

### Goals

- Provide administrators with granular control over volume allocation on vSphere nodes.
- Improve resource utilization and prevent oversubscription.
- Ensure compatibility with existing vSphere infrastructure limitations.
- Maintain backward compatibility with existing deployments.

### Non-Goals

- Dynamically adjust the limit based on real-time resource usage.
- Implement per-namespace or per-workload volume limits.
- Modify the underlying vSphere VM configuration.

## Proposal

1. Enable Feature State Switch (FSS):

   - Use the FSS of the vSphere driver to control activation of the maximum volume limit functionality.
   - The operator will check for vSphere version 8 (`VCenterChecker`) and conditionally set the higher volume limit if version 8 or later is detected.

2. API for Maximum Volume Limit:

   - Introduce a new field, `spec.driverConfig.vSphere.maxAllowedBlockVolumesPerNode`, in the ClusterCSIDriver API to allow administrators to configure the desired maximum number of volumes per node.
   - The vSphere CSI operator will read the configured value from the API.

3. Update CSI Node Pods:

   - If the new `maxAllowedBlockVolumesPerNode` API field is set in ClusterCSIDriver, the operator will inject the `MAX_VOLUMES_PER_NODE` environment variable into node pods using a DaemonSet hook.

4. Driver Behavior:

   - The vSphere CSI driver will continue to perform basic validation on the user-defined limit, allowing the new limit of 255 volumes per node only on vSphere version 8 or later.
   - The driver will respect the configured limit when provisioning volumes.
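   For illustration, the administrator-facing configuration could look like the sketch below. The field path comes from this proposal; the object name matches the existing vSphere ClusterCSIDriver (`csi.vsphere.vmware.com`), and the value 128 is only an example:

   ```yaml
   apiVersion: operator.openshift.io/v1
   kind: ClusterCSIDriver
   metadata:
     name: csi.vsphere.vmware.com
   spec:
     managementState: Managed
     driverConfig:
       driverType: vSphere
       vSphere:
         # Illustrative value; must not exceed 255, and values
         # above 59 require vSphere 8 or later.
         maxAllowedBlockVolumesPerNode: 128
   ```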

### Workflow Description

1. Administrator Configures Limit:
   - The administrator creates or updates a ClusterCSIDriver object to specify the desired maximum number of volumes per node.
2. Operator Reads Configuration:
   - The vSphere CSI operator monitors the configuration object for changes.
   - Upon detecting a change, the operator reads the configured limit value.
3. Operator Sets the New Volume Limit on the DaemonSet:
   - The operator updates the DaemonSet for the vSphere CSI driver, injecting the `MAX_VOLUMES_PER_NODE` environment variable with the configured limit value into the driver node pods.
4. Driver Enforces Limit:
   - The vSphere CSI driver reads the `MAX_VOLUMES_PER_NODE` environment variable and uses the configured limit during volume provisioning requests.
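The result of step 3 on the node DaemonSet would look roughly like the following sketch; the DaemonSet and container names are assumptions for illustration, not the operator's exact output:

```yaml
# Sketch of the driver node DaemonSet after the operator applies the hook.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: vmware-vsphere-csi-driver-node   # illustrative name
  namespace: openshift-cluster-csi-drivers
spec:
  template:
    spec:
      containers:
        - name: csi-driver               # illustrative name
          env:
            - name: MAX_VOLUMES_PER_NODE
              value: "128"  # mirrors spec.driverConfig.vSphere.maxAllowedBlockVolumesPerNode
```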

### API Extensions

- New field in the ClusterCSIDriver CRD:
  - A new field will be introduced to represent the maximum volume limit configuration.
  - This single new field (`spec.driverConfig.vSphere.maxAllowedBlockVolumesPerNode`) defines the desired limit.
  - The field should carry appropriate validation rules to ensure only valid values are accepted.
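  A plausible validation shape for the new field, expressed as an OpenAPI v3 schema fragment. The exact bounds are assumptions drawn from this proposal: a default of 59 (the vSphere 7 limit) and a ceiling of 255 (the vSphere 8 limit):

  ```yaml
  # Hypothetical schema fragment; not the merged API definition.
  maxAllowedBlockVolumesPerNode:
    type: integer
    format: int32
    minimum: 1
    maximum: 255
    default: 59
    description: >-
      Maximum number of block volumes attachable to a single node.
      Values above 59 require vSphere 8 or later.
  ```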

### Topology Considerations

#### Hypershift / Hosted Control Planes

No unique considerations for Hypershift. The configuration and behavior of the vSphere CSI driver with respect to the
maximum volume limit will remain consistent across standalone and managed clusters.

#### Standalone Clusters

This enhancement is fully applicable to standalone OpenShift clusters.

#### Single-node Deployments or MicroShift

No unique considerations for MicroShift. The configuration and behavior of the vSphere CSI driver with respect to the
maximum volume limit will remain consistent across standalone and SNO/MicroShift clusters.

### Implementation Details/Notes/Constraints

A possible future constraint is that newer vSphere versions may raise the maximum again. However, we expect the
limit to increase rather than decrease, and relaxing the API validation later is straightforward.

### Risks and Mitigations

- Possible risk of disabling the CSI controller's volume publish capability: the new ClusterCSIDriver field for setting
  the limit should default to a value higher than 0 (59 is reasonable, matching the vSphere version 7 limit).
- Impact on existing deployments: the default limit remains unchanged, minimizing disruption for existing deployments.

### Drawbacks

- Increased complexity: introducing a new API field and operator logic adds complexity to the vSphere CSI driver ecosystem.
- Potential for configuration errors: incorrectly configuring the maximum volume limit can lead to unexpected behavior or resource limitations.
- Limited granularity: the current proposal provides a node-level limit. More fine-grained control (e.g., per-namespace or per-workload limits) would require further investigation and development.

## Open Questions [optional]

None.

## Test Plan

- E2E tests will be implemented to verify the correct propagation of the configured limit to the driver pods. These tests will only run on vSphere 8.

## Graduation Criteria

- GA in 4.19.
- E2E tests are implemented and passing.
- Documentation is updated.

### Dev Preview -> Tech Preview

- Ability to utilize the enhancement end to end.

### Tech Preview -> GA

- E2E test coverage demonstrating stability.
- Available by default.
- User-facing documentation created in [openshift-docs](https://github.com/openshift/openshift-docs/).

### Removing a deprecated feature

- No.

## Upgrade / Downgrade Strategy

- **Upgrades:** During an upgrade, the operator will apply the new API field value and update the driver DaemonSet with
  the new `MAX_VOLUMES_PER_NODE` value if it is configured. If the new field is not configured, the operator will
  keep using its previous hardcoded value configured in the DaemonSet (59).
- **Downgrades:** Downgrading to a version without this feature will result in the API field being ignored, and the
  operator will revert to its previous hardcoded value configured in the DaemonSet (59). If the count of attached
  volumes is higher than the limit after downgrade, the vSphere CSI driver will not be able to attach new volumes and
  users will need to manually detach the extra volumes.

## Version Skew Strategy

There are no version skew concerns for this enhancement.

## Operational Aspects of API Extensions

- The API extension does not pose any operational challenges.

## Support Procedures

* To check the status of the vSphere CSI operator, use the following command: `oc get deployments -n openshift-cluster-csi-drivers`. Ensure that the operator is running and healthy, and inspect its logs.
* To inspect the `ClusterCSIDriver` CR, use the following command: `oc get clustercsidriver/<driver_name> -o yaml`. Examine the `spec.driverConfig.vSphere.maxAllowedBlockVolumesPerNode` field.
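The following cluster commands sketch how to verify that the limit was actually propagated; the DaemonSet name is an assumption (list DaemonSets in the namespace first if it differs), and `<node-name>` is a placeholder:

```shell
# List DaemonSets in the driver namespace to find the node DaemonSet.
oc get daemonset -n openshift-cluster-csi-drivers

# Check the injected environment variable on the node DaemonSet
# (DaemonSet name below is illustrative).
oc set env daemonset/vmware-vsphere-csi-driver-node \
  -n openshift-cluster-csi-drivers --list | grep MAX_VOLUMES_PER_NODE

# Confirm the attach limit the scheduler sees via the CSINode object.
oc get csinode <node-name> \
  -o jsonpath='{.spec.drivers[?(@.name=="csi.vsphere.vmware.com")].allocatable.count}'
```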

## Alternatives

- We could conditionally set the FSS with the operator based either on the presence of the new field or on feature gate enablement in OpenShift.
  This should not be necessary because the FSS in the driver only gates the higher volume limit (255) per node.

## Infrastructure Needed [optional]

- The infrastructure needed to test this enhancement on vSphere version 8 is already available.