STOR-2141: add support for maxAllowedBlockVolumesPerNode #287
@RomanBednar: This pull request references STOR-2141 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the epic to target the "4.19.0" version, but no target version was set. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
/retest-required |
|
/assign @gnufied For review - feel free to reassign to other candidate. |
|
Shouldn't all this code be behind a feature gate? |
It could, but do we want it? I thought we don't because:
So I'm not sure what exactly we would gain from feature-gating this code, but I might have missed something, so please share your thoughts. |
But then, why did we introduce the https://github.com/openshift/api/blob/master/features/features.go#L821 feature gate in the first place? Isn't the idea that a user must first enable that feature gate before this can be used? If the snapshot-options feature went TP without a feature gate check in the code, that was a mistake. cc @jsafrane There are other reasons why this should be feature-gated. Say I am a user who discovers this feature, which is available by default (and I didn't read the docs): if the feature is removed in the next release, my cluster will be broken. There is also a risk of a code change breaking stable features (no matter how small that chance is). |
|
That feature gate now affects which CRDs are applied, so unless users enable that feature gate they will not be able to set the field (it will appear as invalid for clustercsidriver).
This is a valid point, but if there's a GA'd feature, it must have e2e tests, so in theory we should catch any breakage. Historically this was not always the case, I think.
What scenario are we talking about exactly? I can see removal as problematic, either because of the field's presence (the operator can remove it) or because more than 59 volumes are already attached (we won't be able to deal with this anyway). But feature removal should only happen in cases like VMware removing it first, right? We don't know when or if that happens, and I believe we can't keep feature gates forever. So what's the suggestion here exactly? |
pkg/operator/vspherecontroller/vsphere_environment_checker_test.go
In order to validate the maximum volume attachment limit set by the user, NodeChecker needs access to ClusterCSIDriver, which is where users set the value.
We need to check the versions of all ESXi hosts in the cluster, and if we detect that users set an incorrect custom volume attachment limit, we degrade the cluster. An incorrect value is any value above 59 if any of the vSphere hosts in the cluster is not on ESXi version 8 or higher. If the maxAllowedBlockVolumesPerNode field is not set, it will return 0, which is not valid, and we need to use the default.
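The validation rule described above could be sketched roughly as follows. This is not the code from the PR: the function names, the version-string format ("8.0.2"), and the constants are assumptions made for illustration, and the real NodeChecker reads host versions from vCenter and the limit from ClusterCSIDriver.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Assumed constants, taken from the numbers mentioned in the discussion,
// not from the PR diff.
const (
	safeLimitOnAnyHost = 59 // values up to this are accepted on any ESXi version
	minESXiMajor       = 8  // major version needed for values above 59
)

// esxiMajor extracts the major version from a string such as "8.0.2".
func esxiMajor(version string) (int, error) {
	major, _, _ := strings.Cut(version, ".")
	return strconv.Atoi(strings.TrimSpace(major))
}

// checkVolumeLimit returns an error (which an operator would report as a
// Degraded condition) when the limit is above 59 and at least one host runs
// an ESXi version older than 8. A limit of 0 means "unset" and always passes.
func checkVolumeLimit(limit int, hostVersions []string) error {
	if limit <= safeLimitOnAnyHost {
		return nil
	}
	for _, v := range hostVersions {
		major, err := esxiMajor(v)
		if err != nil {
			return fmt.Errorf("cannot parse ESXi version %q: %w", v, err)
		}
		if major < minESXiMajor {
			return fmt.Errorf("limit %d needs ESXi %d or higher on every host, but found %q",
				limit, minESXiMajor, v)
		}
	}
	return nil
}

func main() {
	fmt.Println(checkVolumeLimit(60, []string{"8.0.1", "7.0.3"}))
	fmt.Println(checkVolumeLimit(128, []string{"8.0.1", "8.0.2"}))
}
```

Checking every host rather than stopping at the first one that qualifies matters because a single older host is enough to make the higher limit invalid.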
|
/retest-required |
|
Performed pre-merge test with cluster-bot build image 4.19.0-0.test-2025-04-09-032635-ci-ln-cdqc3dk-latest |
Since NodeChecker now checks the max attachment limit value, it is now safe to add a hook that reflects the maxAllowedBlockVolumesPerNode field of ClusterCSIDriver into the daemonset as an env variable.
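A hook of this kind might look roughly like the sketch below. The Container and EnvVar types are simplified, hypothetical stand-ins for the Kubernetes ones so the example runs without dependencies, and withMaxVolumesEnv is an illustrative name, not the function added by the PR. Only the MAX_VOLUMES_PER_NODE variable name comes from the PR description.

```go
package main

import (
	"fmt"
	"strconv"
)

// EnvVar and Container are minimal stand-ins for the corresponding
// Kubernetes core/v1 types (assumption: only these fields matter here).
type EnvVar struct {
	Name  string
	Value string
}

type Container struct {
	Name string
	Env  []EnvVar
}

const maxVolumesEnvName = "MAX_VOLUMES_PER_NODE"

// withMaxVolumesEnv returns a copy of the container whose environment carries
// the given limit, replacing an existing entry or appending a new one.
func withMaxVolumesEnv(c Container, limit int) Container {
	value := strconv.Itoa(limit)
	env := make([]EnvVar, 0, len(c.Env)+1)
	replaced := false
	for _, e := range c.Env {
		if e.Name == maxVolumesEnvName {
			e.Value = value
			replaced = true
		}
		env = append(env, e)
	}
	if !replaced {
		env = append(env, EnvVar{Name: maxVolumesEnvName, Value: value})
	}
	c.Env = env
	return c
}

func main() {
	c := withMaxVolumesEnv(Container{Name: "csi-driver"}, 128)
	fmt.Printf("%+v\n", c)
}
```

Replacing an existing entry instead of always appending keeps the hook idempotent, so repeated syncs do not grow the env list.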
|
hmm, looks like vsphere operator tests that check storage removal are failing. Is that a real failure? @RomanBednar can you check? |
|
/retest-required |
|
@RomanBednar: The following tests failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
|
@gnufied Looks like a flake, green on next try. |
|
/lgtm |
|
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: gnufied, RomanBednar. The full list of commands accepted by this bot can be found here. The pull request process is described here. Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
/label qe-approved |
|
Merged commit 5e1017e into openshift:main
|
[ART PR BUILD NOTIFIER] Distgit: ose-vmware-vsphere-csi-driver-operator |
Depends on
Manual verification
Test value limits for the maxAllowedBlockVolumesPerNode field:
Validate maxAllowedBlockVolumesPerNode value propagation to driver deployment as MAX_VOLUMES_PER_NODE:
Validate propagation to CSINode as allocatable count:
If maxAllowedBlockVolumesPerNode is unset (for example after cluster upgrade) we must use default value (never zero):
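A minimal Go sketch of that fallback, assuming the default is 59 (the threshold used elsewhere in this conversation; the actual constant in the operator may differ) and using the hypothetical helper name effectiveMaxVolumes:

```go
package main

import "fmt"

// fallbackMaxVolumes is an assumed default, not confirmed by the PR text.
const fallbackMaxVolumes int32 = 59

// effectiveMaxVolumes maps an unset field (read back as 0) to the default,
// so the value handed to the driver is never zero.
func effectiveMaxVolumes(specValue int32) int32 {
	if specValue <= 0 {
		return fallbackMaxVolumes
	}
	return specValue
}

func main() {
	fmt.Println(effectiveMaxVolumes(0))
	fmt.Println(effectiveMaxVolumes(128))
}
```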