util: use csi objectuuid for rados locks#6204

Merged
mergify[bot] merged 1 commit into ceph:devel from black-dragon74:fix-rados-enametoolong
Apr 2, 2026

Conversation

@black-dragon74
Member

@black-dragon74 black-dragon74 commented Mar 25, 2026

Describe what this PR does

CSI volume IDs longer than 100 bytes fail with ENAMETOOLONG when acquiring a RADOS lock, because Ceph sets the default value of osd_max_attr_name_len to 100.

This patch fixes the issue by decomposing the CSI ID and using the ObjectUUID, so the lock names are always < 100 bytes ("lock." + UUID + "-mutexlock" = 5 + 36 + 10 = 51 bytes).

"lock." (5 bytes) is a prefix that Ceph applies internally to every lock name, so we can only pass lock names that are <= 95 bytes.

A hash-based approach is not used, to keep the lock names debuggable.
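
The length arithmetic above can be checked with a minimal sketch. The helper names (`trailingUUID`, `attrNameLen`) and the sample volume ID are hypothetical and are not taken from the patch; in particular, treating the object UUID as the last 36 bytes of the volume ID is a simplifying assumption standing in for the real CSI ID decomposition.

```go
package main

import (
	"fmt"
	"strings"
)

const (
	cephLockPrefix = "lock."      // prefix Ceph adds internally (per the description)
	lockSuffix     = "-mutexlock" // suffix used for the lock name in this patch
	maxAttrNameLen = 100          // default osd_max_attr_name_len (per the description)
	uuidLen        = 36           // length of a canonical textual UUID
)

// trailingUUID is a hypothetical stand-in for the CSI ID decomposition:
// it assumes the object UUID is the last 36 bytes of the volume ID.
func trailingUUID(volID string) (string, error) {
	if len(volID) < uuidLen {
		return "", fmt.Errorf("volume ID %q is shorter than a UUID", volID)
	}
	return volID[len(volID)-uuidLen:], nil
}

// attrNameLen returns the length of "lock." + id + "-mutexlock",
// the name that has to fit within osd_max_attr_name_len.
func attrNameLen(id string) int {
	return len(cephLockPrefix) + len(id) + len(lockSuffix)
}

func main() {
	uuid := "c0ffee00-1234-5678-9abc-def012345678"
	// Made-up volume ID whose 90-byte cluster ID pushes it past the limit.
	volID := "0001-005a-" + strings.Repeat("x", 90) + "-0000000000000001-" + uuid

	got, err := trailingUUID(volID)
	if err != nil {
		panic(err)
	}
	fmt.Println(attrNameLen(volID) > maxAttrNameLen) // true: full ID exceeds the limit
	fmt.Println(attrNameLen(got))                    // 51
}
```

Because the UUID has a fixed length, the resulting name length is constant (51 bytes) regardless of how long the cluster ID or the rest of the volume ID is.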

Fixes: #5419

@black-dragon74 black-dragon74 requested a review from Madhu-1 March 25, 2026 18:28
@black-dragon74
Member Author

/test ci/centos/mini-e2e-helm/k8s-1.34

@black-dragon74 black-dragon74 requested a review from a team March 26, 2026 06:43
@black-dragon74
Member Author

/test ci/centos/mini-e2e/k8s-1.35

@black-dragon74 black-dragon74 force-pushed the fix-rados-enametoolong branch 2 times, most recently from 44862ba to d4ce75f Compare March 27, 2026 09:54
@black-dragon74 black-dragon74 requested a review from nixpanic March 27, 2026 09:54
nixpanic previously approved these changes Mar 27, 2026
@nixpanic nixpanic requested a review from a team March 27, 2026 09:56
defer ioctx.Destroy()

- lock := iolock.NewLock(ioctx, volIDStr, lockName, lockCookie, lockDesc, lockDuration)
+ lock := iolock.NewLock(ioctx, objectUUID, lockName, lockCookie, lockDesc, lockDuration)
Contributor

Is this backward compatible?

Member Author

Yes, these are transient operation locks at the rados layer. They are not persisted.

Member

This is fine as long as the same ceph-csi version is running on all nodes. I don't think we need to support mixed ceph-csi versions.

Contributor

What about rolling updates?

lockName := objectUUID + "-mutexlock"
lockDesc := "Key rotation mutex lock for " + rv.VolID
- lockCookie := rv.VolID + "-enc-key-rotate"
+ lockCookie := objectUUID + "-enc-key-rotate"
Contributor

Again, is this backward compatible?

@black-dragon74 black-dragon74 force-pushed the fix-rados-enametoolong branch from d4ce75f to 27e8a05 Compare March 31, 2026 10:22
@mergify mergify bot dismissed nixpanic’s stale review March 31, 2026 10:23

Pull request has been modified.

@nixpanic nixpanic requested review from a team and iPraveenParihar April 1, 2026 08:47
@nixpanic
Member

nixpanic commented Apr 2, 2026

@Mergifyio rebase

CSI vol ids > 100bytes fail with error `ENAMETOOLONG`
when acquiring rados lock as ceph sets default value
for `osd_max_attr_name_len` to 100.

This patch fixes the issue by decomposing the CSI ID
and using the ObjectUUID to ensure the lock names
are always < 100 bytes ("lock." + UUID + "-mutexlock" = 51 bytes).

"lock."(5 bytes) is a prefix applied by Ceph internally for every
lock name so we can only pass lock names that are <= 95bytes.

The hash based approach is not used to keep things
debuggable.

Fixes: ceph#5419

Signed-off-by: Niraj Yadav <niryadav@redhat.com>
@mergify
Contributor

mergify bot commented Apr 2, 2026

rebase

✅ Branch has been successfully rebased

@ceph-csi-bot ceph-csi-bot force-pushed the fix-rados-enametoolong branch from 27e8a05 to da69d2f Compare April 2, 2026 11:20
@nixpanic nixpanic added the ok-to-test label ("Label to trigger E2E tests") Apr 2, 2026
@ceph-csi-bot
Collaborator

/test ci/centos/upgrade-tests-cephfs

@ceph-csi-bot
Collaborator

/test ci/centos/k8s-e2e-external-storage/1.33

@ceph-csi-bot
Collaborator

/test ci/centos/k8s-e2e-external-storage/1.34

@ceph-csi-bot
Collaborator

/test ci/centos/upgrade-tests-rbd

@ceph-csi-bot
Collaborator

/test ci/centos/mini-e2e-helm/k8s-1.33

@ceph-csi-bot
Collaborator

/test ci/centos/mini-e2e-helm/k8s-1.34

@ceph-csi-bot
Collaborator

/test ci/centos/mini-e2e/k8s-1.33

@ceph-csi-bot
Collaborator

/test ci/centos/mini-e2e/k8s-1.34

@ceph-csi-bot
Collaborator

/test ci/centos/k8s-e2e-external-storage/1.35

@ceph-csi-bot
Collaborator

/test ci/centos/mini-e2e-helm/k8s-1.35

@ceph-csi-bot
Collaborator

/test ci/centos/mini-e2e/k8s-1.35

@ceph-csi-bot ceph-csi-bot removed the ok-to-test label ("Label to trigger E2E tests") Apr 2, 2026
@mergify
Contributor

mergify bot commented Apr 2, 2026

Merge Queue Status

  • Entered queue: 2026-04-02 14:26 UTC · Rule: default
  • Checks skipped · PR is already up-to-date
  • Merged: 2026-04-02 14:26 UTC · at da69d2fb5e2365857c1567ff140e60f6ebf4f5d4

This pull request spent 11 seconds in the queue, including 2 seconds running CI.

@mergify mergify bot merged commit 70b950d into ceph:devel Apr 2, 2026
38 checks passed

Development

Successfully merging this pull request may close these issues.

CephFS-CSI Volume Mount Fails with Encryption Enabled Due to Lock File Error (rados: ret=-36, File Name Too Long)

5 participants