Skip to content

rbd: use volumeattributesclass feature implement rbd volume qos#6160

Open
YiteGu wants to merge 12 commits intoceph:develfrom
YiteGu:support-vac-feature-for-rbd
Open

rbd: use volumeattributesclass feature implement rbd volume qos#6160
YiteGu wants to merge 12 commits intoceph:develfrom
YiteGu:support-vac-feature-for-rbd

Conversation

@YiteGu
Copy link
Copy Markdown
Member

@YiteGu YiteGu commented Mar 6, 2026

Describe what this PR does

  1. The solution using storageclass to implement RBD volume QoS has been cancelled because the storageclass parameters cannot be modified. The volumeattributesclass solution is more flexible because it supports mutable parameters.
  2. Add the new parameter: volumeAttributesClassName when defining the PVC. For example:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: vac-pvc-1
  namespace: rook-ceph
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 30Gi
  storageClassName: rook-ceph-block
  volumeMode: Block
  volumeAttributesClassName: silver
  1. Implemented the ControllerModifyVolume interface. Users can upgrade their QoS by modifying the volumeAttributesClassName parameter of an existing PVC.

ref: https://kubernetes.io/docs/concepts/storage/volume-attributes-classes/

Is there anything that requires special attention

Do you have any questions?

Is the change backward compatible?

Are there concerns around backward compatibility?

Provide any external context for the change, if any.

For example:

  • Kubernetes links that explain why the change is required
  • CSI spec related changes/catch-up that necessitates this patch
  • golang related practices that necessitates this change

Related issues

Mention any github issues relevant to this PR. Adding below line
will help to auto close the issue once the PR is merged.

Fixes: #issue_number

Future concerns

List items that are not part of the PR and do not impact it's
functionality, but are work items that can be taken up subsequently.

Checklist:

  • Commit Message Formatting: Commit titles and messages follow
    guidelines in the developer
    guide
    .
  • Reviewed the developer guide on Submitting a Pull
    Request
  • Pending release
    notes

    updated with breaking and/or notable changes for the next major release.
  • Documentation has been updated, if necessary.
  • Unit tests have been added, if necessary.
  • Integration tests have been added, if necessary.

Show available bot commands

These commands are normally not required, but in case of issues, leave any of
the following bot commands in an otherwise empty comment in this PR:

  • /retest ci/centos/<job-name>: retest the <job-name> after unrelated
    failure (please report the failure too!)

@mergify mergify bot added the component/rbd Issues related to RBD label Mar 6, 2026
@YiteGu YiteGu force-pushed the support-vac-feature-for-rbd branch 3 times, most recently from b78efe5 to 36c1c2f Compare March 6, 2026 11:43
@YiteGu YiteGu force-pushed the support-vac-feature-for-rbd branch 4 times, most recently from c1e4f0a to dffa21e Compare March 9, 2026 13:41
Copy link
Copy Markdown
Member

@nixpanic nixpanic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a user visible change. Put a note in PendingReleaseNotes.md for this too. Maybe add a note in the main README.md for support of this as well, update other documentation?

@mergify
Copy link
Copy Markdown
Contributor

mergify bot commented Mar 10, 2026

This pull request now has conflicts with the target branch. Could you please resolve conflicts and force push the corrected changes? 🙏

@YiteGu YiteGu force-pushed the support-vac-feature-for-rbd branch 3 times, most recently from cb885ba to a8df95a Compare March 11, 2026 05:13
@mergify
Copy link
Copy Markdown
Contributor

mergify bot commented Mar 11, 2026

This pull request now has conflicts with the target branch. Could you please resolve conflicts and force push the corrected changes? 🙏

@YiteGu YiteGu force-pushed the support-vac-feature-for-rbd branch 2 times, most recently from 43fa44e to 1a4ade8 Compare March 11, 2026 06:25
@YiteGu YiteGu requested a review from nixpanic March 11, 2026 07:36
@YiteGu YiteGu force-pushed the support-vac-feature-for-rbd branch from f7b6685 to e786b28 Compare March 11, 2026 07:44
Copy link
Copy Markdown
Member

@nixpanic nixpanic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two minor things that caught the eye of Claude Code, and look like good recommendations to me.

@YiteGu
Copy link
Copy Markdown
Member Author

YiteGu commented Mar 11, 2026

/test ci/centos/mini-e2e/k8s-1.34

@YiteGu YiteGu force-pushed the support-vac-feature-for-rbd branch from e786b28 to 650a107 Compare March 11, 2026 11:46
Copy link
Copy Markdown
Member

@nixpanic nixpanic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another round of review (now completed by me, not Claude).

These are the only changes I would like to see, everything else looks good to me. Thanks!

@YiteGu YiteGu force-pushed the support-vac-feature-for-rbd branch 2 times, most recently from b2d68cb to a79ce62 Compare March 16, 2026 02:52
@YiteGu YiteGu requested a review from nixpanic March 16, 2026 03:34
Copy link
Copy Markdown
Member

@nixpanic nixpanic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, looks good to me.

@nixpanic nixpanic requested a review from a team March 16, 2026 09:05
@nixpanic nixpanic added the enhancement New feature or request label Mar 16, 2026
YiteGu and others added 12 commits April 1, 2026 16:06
1. add new MODIFY_VOLUME to controller service capability.
2. CreateVolume use mutable_parameters to set volume qos

Signed-off-by: Yite Gu <guyite@bytedance.com>
When the PVC's volumeAttributesClassName is modified,
the csi-resizer initiates ControllerModifyVolume call.

In the first version of ControllerModifyVolume, we implemented
QoS functionality for RBD volumes.

Signed-off-by: Yite Gu <guyite@bytedance.com>
1. Create a VolumeAttributesClass with a QoS parameter.
   Create a PVC with a VolumeAttributesClassName parameter.
   Verify the QoS results of the RBD image.

2. Modify the VolumeAttributesClassName of the PVC and
   verify the change in the QoS of the RBD image.

3. Create a VolumeAttributesClass that supports capacity-based QoS
   and verify the QoS results of the RBD image.
   Create a cloned volume of the PVC and verify the QoS results of
   the cloned volume.

Signed-off-by: Yite Gu <guyite@bytedance.com>
Co-authored-by: Niels de Vos <ndevos@ibm.com>
The addition of a new op caused the tryAcquire function
to become too complex and fail CI checks. The tryAcquire function
implementation was refactored, and the conflictMatrix design
was introduced, which greatly reduced the complexity.

Signed-off-by: Yite Gu <guyite@bytedance.com>
storageclass is no longer used as the qos parameters carrier
for RBD volumes; these parameters are now defined in volumeattribetesclass.

Signed-off-by: Yite Gu <guyite@bytedance.com>
Signed-off-by: Yite Gu <guyite@bytedance.com>
Signed-off-by: Yite Gu <guyite@bytedance.com>
Signed-off-by: Yite Gu <guyite@bytedance.com>
Signed-off-by: Yite Gu <guyite@bytedance.com>
Signed-off-by: Yite Gu <guyite@bytedance.com>
Signed-off-by: Yite Gu <guyite@bytedance.com>
@ceph-csi-bot ceph-csi-bot force-pushed the support-vac-feature-for-rbd branch from bab1bd9 to ec7ba93 Compare April 1, 2026 16:06
@nixpanic nixpanic added the ok-to-test Label to trigger E2E tests label Apr 1, 2026
@mergify
Copy link
Copy Markdown
Contributor

mergify bot commented Apr 1, 2026

rebase

✅ Branch has been successfully rebased

@ceph-csi-bot
Copy link
Copy Markdown
Collaborator

/test ci/centos/k8s-e2e-external-storage/1.34

@ceph-csi-bot
Copy link
Copy Markdown
Collaborator

/test ci/centos/upgrade-tests-cephfs

@ceph-csi-bot
Copy link
Copy Markdown
Collaborator

/test ci/centos/mini-e2e-helm/k8s-1.34

@ceph-csi-bot
Copy link
Copy Markdown
Collaborator

/test ci/centos/upgrade-tests-rbd

@ceph-csi-bot
Copy link
Copy Markdown
Collaborator

/test ci/centos/k8s-e2e-external-storage/1.33

@ceph-csi-bot
Copy link
Copy Markdown
Collaborator

/test ci/centos/k8s-e2e-external-storage/1.35

@ceph-csi-bot
Copy link
Copy Markdown
Collaborator

/test ci/centos/mini-e2e/k8s-1.34

@ceph-csi-bot
Copy link
Copy Markdown
Collaborator

/test ci/centos/mini-e2e-helm/k8s-1.33

@ceph-csi-bot
Copy link
Copy Markdown
Collaborator

/test ci/centos/mini-e2e-helm/k8s-1.35

@ceph-csi-bot
Copy link
Copy Markdown
Collaborator

/test ci/centos/mini-e2e/k8s-1.33

@ceph-csi-bot
Copy link
Copy Markdown
Collaborator

/test ci/centos/mini-e2e/k8s-1.35

@ceph-csi-bot ceph-csi-bot removed the ok-to-test Label to trigger E2E tests label Apr 1, 2026
@nixpanic
Copy link
Copy Markdown
Member

nixpanic commented Apr 1, 2026

/retest ci/centos/upgrade-tests-rbd

@YiteGu
Copy link
Copy Markdown
Member Author

YiteGu commented Apr 2, 2026

  E0401 18:20:35.911732       1 reflector.go:205] "Failed to watch" err="failed to list *v1.VolumeAttributesClass: volumeattributesclasses.storage.k8s.io is forbidden: User \"system:serviceaccount:cephcsi-e2e-29354d17:rbd-csi-provisioner\" cannot list resource \"volumeattributesclasses\" in API group \"storage.k8s.io\" at the cluster scope" logger="UnhandledError" reflector="k8s.io/client-go/informers/factory.go:160" type="*v1.VolumeAttributesClass"

Hi, @nixpanic Which file was this user's permission added to?

@nixpanic
Copy link
Copy Markdown
Member

nixpanic commented Apr 2, 2026

  E0401 18:20:35.911732       1 reflector.go:205] "Failed to watch" err="failed to list *v1.VolumeAttributesClass: volumeattributesclasses.storage.k8s.io is forbidden: User \"system:serviceaccount:cephcsi-e2e-29354d17:rbd-csi-provisioner\" cannot list resource \"volumeattributesclasses\" in API group \"storage.k8s.io\" at the cluster scope" logger="UnhandledError" reflector="k8s.io/client-go/informers/factory.go:160" type="*v1.VolumeAttributesClass"

Hi, @nixpanic Which file was this user's permission added to?

I think the failure happens because in the upgrade test, the new RBAC is not used, but the RBAC of the old version that is deployed before the upgrade. Just guessing, would need to inspect the e2e code in detail.

The logs do show an other issue too:

  I0401 18:30:36.048415 77711 pvc.go:85] PVC rbd-pvc Event: ProvisioningFailed - failed to strip CSI Parameters of prefixed keys: found unknown parameter key "csi.storage.k8s.io/controller-modify-secret-name" with reserved namespace csi.storage.k8s.io/

The e2e creates a StorageClass with the new parameters for the credentials, but the old sidecars are used for the old version...

I can't immediately think of a nice approach that addresses this in a way it easily keeps working in the future. Will ponder on this for a bit and come back to it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

component/rbd Issues related to RBD enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants