
Conversation

@huww98 (Contributor) commented Jan 5, 2026

What type of PR is this?

/kind bug

What this PR does / why we need it:

We used to use /etc/kubernetes/volumes/disk/d-*.conf files to record the mapping between disks without a serial number and their device paths. But this approach has multiple drawbacks:

  • leak: we may fail to remove the files we created
  • inaccurate: if the disk is detached or switches drivers, the device goes away without our knowledge. In that case, the conf file may point to a non-existent device, or even to the wrong one.

Replace the conf files with xattrs, which we are already using to calculate the number of volumes available on the node. The xattrs are attached to the device inode and go away with it, so we no longer need to worry about cleanup or stale mappings.
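
To illustrate the idea, a minimal Go sketch (not the driver's actual code): the xattr name trusted.csi-managed-disk is taken from the verification logs below, and writing trusted.* xattrs requires CAP_SYS_ADMIN.

package main

import (
	"fmt"

	"golang.org/x/sys/unix"
)

const diskXattrName = "trusted.csi-managed-disk" // name seen in the verification below

// tagDevice records the disk ID on the block device inode itself, so the
// record disappears together with the inode when the disk is detached.
func tagDevice(devPath, diskID string) error {
	return unix.Setxattr(devPath, diskXattrName, []byte(diskID), 0)
}

// lookupDiskID reads the recorded disk ID back. ENODATA means the device
// carries no tag (e.g. it was re-created), so a stale value is impossible.
func lookupDiskID(devPath string) (string, error) {
	buf := make([]byte, 256)
	n, err := unix.Getxattr(devPath, diskXattrName, buf)
	if err == unix.ENODATA {
		return "", nil
	}
	if err != nil {
		return "", err
	}
	return string(buf[:n]), nil
}

func main() {
	if err := tagDevice("/dev/nvme3n1", "d-2ze4p69x0n3tx1ch0xch"); err != nil {
		fmt.Println("tag failed:", err)
		return
	}
	id, _ := lookupDiskID("/dev/nvme3n1")
	fmt.Println("managed disk:", id)
}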

Old conf files are migrated to xattrs in one go, in the init container.

As a bonus, partition support is now more decoupled and should work in more scenarios: xattrs are always attached to the root block device, never to a partition.
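
For the partition handling, a hypothetical sketch of resolving a partition to its root block device through sysfs (not necessarily how the driver implements it; the sysfs layout matches the device_manager.go log lines in the verification below):

package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// rootBlockDevice resolves a partition such as /dev/nvme3n1p1 to its
// parent disk /dev/nvme3n1; a whole disk is returned unchanged.
func rootBlockDevice(dev string) (string, error) {
	// /sys/class/block/<name> is a symlink into the device tree, e.g.
	// /sys/devices/pci0000:00/.../nvme3/nvme3n1/nvme3n1p1 for a partition.
	sysPath, err := filepath.EvalSymlinks(filepath.Join("/sys/class/block", filepath.Base(dev)))
	if err != nil {
		return "", err
	}
	// A partition's sysfs directory has a "partition" attribute, and its
	// parent directory is named after the root block device.
	if _, err := os.Stat(filepath.Join(sysPath, "partition")); err == nil {
		return "/dev/" + filepath.Base(filepath.Dir(sysPath)), nil
	}
	return dev, nil
}

func main() {
	fmt.Println(rootBlockDevice("/dev/nvme3n1p1"))
}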

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

/hold
for manual verification on disks without a serial number

Does this PR introduce a user-facing change?

NONE

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@k8s-ci-robot added the do-not-merge/hold, kind/bug, and cncf-cla: yes labels Jan 5, 2026
@k8s-ci-robot added the size/XL label Jan 5, 2026
@huww98 (Contributor, Author) commented Jan 10, 2026

Manual verification:

Before the upgrade, two disks are attached to the node:

  • d-2ze4p69x0n3tx1ch0xch: a disk without a serial number, with a partition
  • d-2zecxehhi1afgh1rl7dh: a newly created disk.
[root@iZ2zeaaxogmsvnc525axfoZ ~]# grep '' /etc/kubernetes/volumes/disk/*.conf
/etc/kubernetes/volumes/disk/d-2ze4p69x0n3tx1ch0xch.conf:/dev/nvme3n1p1
/etc/kubernetes/volumes/disk/d-2zecxehhi1afgh1rl7dh.conf:/dev/disk/by-id/nvme-Alibaba_Cloud_Elastic_Block_Storage_2zecxehhi1afgh1rl7dh

Upgrade; init container output:

migrating disk conf: /host/etc/kubernetes/volumes/disk/d-2ze4p69x0n3tx1ch0xch.conf
device /dev/nvme3n1p1 is a partition of /dev/nvme3n1
cat: can't open '/sys/devices/pci0000:00/0000:00:09.0/nvme/nvme3/nvme3n1/serial': No such file or directory
device /dev/nvme3n1 has no serial, assigning disk ID d-2ze4p69x0n3tx1ch0xch
migrating disk conf: /host/etc/kubernetes/volumes/disk/d-2zecxehhi1afgh1rl7dh.conf
device /dev/disk/by-id/nvme-Alibaba_Cloud_Elastic_Block_Storage_2zecxehhi1afgh1rl7dh is a symlink, skip
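
Reconstructed from this log, the one-shot migration behaves roughly as below. This is an illustrative Go sketch: the function names and cleanup details are assumptions, rootBlockDevice is the helper sketched earlier, and the real migration also checks the device serial in sysfs so that only devices without a serial number are tagged.

package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"

	"golang.org/x/sys/unix"
)

func migrateConf(confPath string) error {
	// The disk ID is the conf file name; its content is the device path.
	diskID := strings.TrimSuffix(filepath.Base(confPath), ".conf")
	raw, err := os.ReadFile(confPath)
	if err != nil {
		return err
	}
	dev := strings.TrimSpace(string(raw))

	// A /dev/disk/by-id/... symlink already identifies the disk: skip it.
	if fi, err := os.Lstat(dev); err == nil && fi.Mode()&os.ModeSymlink != 0 {
		return nil
	}
	// Resolve a partition to its root block device, then tag the root
	// device, since xattrs always go on the whole disk.
	root, err := rootBlockDevice(dev) // helper from the earlier sketch
	if err != nil {
		return err
	}
	return unix.Setxattr(root, "trusted.csi-managed-disk", []byte(diskID), 0)
}

func main() {
	confs, _ := filepath.Glob("/host/etc/kubernetes/volumes/disk/d-*.conf")
	for _, conf := range confs {
		fmt.Println("migrating disk conf:", conf)
		if err := migrateConf(conf); err != nil {
			fmt.Fprintln(os.Stderr, "migrate failed:", conf, err)
		}
	}
}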

Unmount the globalmount, optionally restart the CSI driver, then restart kubelet, to verify that the devMap is loaded from xattrs:

I0110 17:47:41.041626   18140 nodeserver.go:509] NodeStageVolume: Stage VolumeId: d-2ze4p69x0n3tx1ch0xch, Target Path: /var/lib/kubelet/plugins/kubernetes.io/csi/diskplugin.csi.alibabacloud.com/2d5f5866b9f40da3d6181467c60adda021b5d011f52c7c195162dd1fb370c409/globalmount, VolumeContext: map[]
I0110 17:47:41.041810   18140 cloud.go:98] GetRootBlockDevice: got disk d-2ze4p69x0n3tx1ch0xch device name /dev/nvme3n1 from devMap
I0110 17:47:41.041817   18140 bdf.go:543] NewDeviceDriver: start to get deviceNumber from device: /dev/nvme3n1
I0110 17:47:41.041841   18140 device_manager.go:404] NewDeviceDriver: get symlink dir: /sys/devices/pci0000:00/0000:00:09.0/nvme/nvme3/nvme3n1
I0110 17:47:41.041846   18140 device_manager.go:411] NewDeviceDriver: busPrefix: ^[0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}, parentDir: nvme3, matched: false
I0110 17:47:41.041850   18140 device_manager.go:404] NewDeviceDriver: get symlink dir: /sys/devices/pci0000:00/0000:00:09.0/nvme/nvme3
I0110 17:47:41.041853   18140 device_manager.go:411] NewDeviceDriver: busPrefix: ^[0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}, parentDir: nvme, matched: false
I0110 17:47:41.041855   18140 device_manager.go:404] NewDeviceDriver: get symlink dir: /sys/devices/pci0000:00/0000:00:09.0/nvme
I0110 17:47:41.041860   18140 device_manager.go:411] NewDeviceDriver: busPrefix: ^[0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}, parentDir: 0000:00:09.0, matched: true
I0110 17:47:41.041876   18140 nodeserver.go:1481] "checkMountedOfRunvAndRund: check pvmMounted" device="/dev/nvme3n1" pvmMounted=false driver="nvme"
I0110 17:47:41.041884   18140 cloud.go:195] "Starting Do AttachDisk" method="/csi.v1.Node/NodeStageVolume" volumeID="d-2ze4p69x0n3tx1ch0xch"
I0110 17:47:41.115795   18140 low_latency.go:187] "got batch" type="disk" n=1 requestID="252FCE8B-FE8F-514F-A5E8-FDAC1DD8D403" duration="69.983169ms" wait="3.90955ms"
I0110 17:47:41.126638   18140 nodeserver.go:585] NodeStageVolume: Volume Successful Attached: d-2ze4p69x0n3tx1ch0xch, to Node: i-2zeaaxogmsvnc525axfo, Device: /dev/nvme3n1p1

Simulating an inaccurate devMap:

[root@iZ2zeaaxogmsvnc525axfoZ ~]# setfattr -x trusted.csi-managed-disk /dev/nvme1n1
[root@iZ2zeaaxogmsvnc525axfoZ ~]# umount /var/lib/kubelet/plugins/kubernetes.io/csi/diskplugin.csi.alibabacloud.com/2d5f5866b9f40da3d6181467c60adda021b5d011f52c7c195162dd1fb370c409/globalmount
[root@iZ2zeaaxogmsvnc525axfoZ ~]# systemctl restart kubelet.service
I0110 18:18:42.760176   21112 nodeserver.go:509] NodeStageVolume: Stage VolumeId: d-2ze4p69x0n3tx1ch0xch, Target Path: /var/lib/kubelet/plugins/kubernetes.io/csi/diskplugin.csi.alibabacloud.com/2d5f5866b9f40da3d6181467c60adda021b5d011f52c7c195162dd1fb370c409/globalmount, VolumeContext: map[]
I0110 18:18:42.760316   21112 xattr.go:117] "disk has no xattr" dev="/dev/nvme1n1" diskID="d-2ze4p69x0n3tx1ch0xch"
W0110 18:18:42.760337   21112 nodeserver.go:1463] NodeStageVolume: GetVolumeDeviceName failed: [get by link "/dev/disk/by-id/virtio-2ze4p69x0n3tx1ch0xch" failed: no such file or directory, get by link "/dev/disk/by-id/nvme-Alibaba_Cloud_Elastic_Block_Storage_2ze4p69x0n3tx1ch0xch" failed: no such file or directory, find by serial: file does not exist]
E0110 18:18:42.760488   21112 bdf.go:593] "Failed to execute xdragon-bdf command" err="fork/exec /usr/bin/nsenter: no such file or directory" volumeId="d-2ze4p69x0n3tx1ch0xch" output=""
E0110 18:18:42.760556   21112 bdf.go:593] "Failed to execute xdragon-bdf command" err="fork/exec /usr/bin/nsenter: no such file or directory" volumeId="d-2ze4p69x0n3tx1ch0xch" output=""
E0110 18:18:42.760562   21112 nodeserver.go:1468] "NodeStageVolume:  Failed to get bdf number" err="Failed to find device number for d-2ze4p69x0n3tx1ch0xch" volumeId="d-2ze4p69x0n3tx1ch0xch"
I0110 18:18:42.760568   21112 cloud.go:195] "Starting Do AttachDisk" method="/csi.v1.Node/NodeStageVolume" volumeID="d-2ze4p69x0n3tx1ch0xch"
I0110 18:18:42.843227   21112 low_latency.go:187] "got batch" type="disk" n=1 requestID="B8426F15-0733-5B9A-8EC3-7CE0B628A529" duration="78.736296ms" wait="3.909001ms"
W0110 18:18:42.843248   21112 cloud.go:257] AttachDisk: Disk (no serial) d-2ze4p69x0n3tx1ch0xch is already attached to instance i-2zeaaxogmsvnc525axfo, but device unknown, will be detached and try again
I0110 18:18:42.843254   21112 cloud.go:275] AttachDisk: Disk d-2ze4p69x0n3tx1ch0xch is already attached to instance i-2zeaaxogmsvnc525axfo, will be detached
I0110 18:18:42.984471   21112 cloud.go:287] AttachDisk: Wait for disk d-2ze4p69x0n3tx1ch0xch to be detached
I0110 18:18:43.055697   21112 batched.go:216] "polled batch" type="disk" n=1 interval="2.269µs" duration="71.198942ms" requestID="474B8D9C-0644-593F-9EA6-1ADCA2D34703"
I0110 18:18:43.055717   21112 batched.go:120] "poll response processed" type="disk" queueDepth=1 requeue=1
I0110 18:18:45.069247   21112 batched.go:216] "polled batch" type="disk" n=1 interval="1.928777808s" duration="84.740777ms" requestID="D0DF021E-E9D0-513B-BC2C-959F0B4EB787"
I0110 18:18:45.069273   21112 batched.go:120] "poll response processed" type="disk" queueDepth=0 requeue=0
I0110 18:18:45.299942   21112 cloud.go:341] AttachDisk: Waiting for Disk d-2ze4p69x0n3tx1ch0xch is Attached to instance i-2zeaaxogmsvnc525axfo with RequestId: 5CF90C09-8642-5AA4-B26C-7FB83D523E3E
I0110 18:18:47.050801   21112 batched.go:216] "polled batch" type="disk" n=1 interval="1.6845354s" duration="66.295782ms" requestID="9728B7A3-88F0-5338-9F66-D2473C54F78F"
I0110 18:18:47.050824   21112 batched.go:120] "poll response processed" type="disk" queueDepth=0 requeue=0
I0110 18:18:47.050902   21112 cloud.go:153] "found device by diff" method="/csi.v1.Node/NodeStageVolume" volumeID="d-2ze4p69x0n3tx1ch0xch" device="/dev/nvme2n1"
I0110 18:18:47.066834   21112 nodeserver.go:585] NodeStageVolume: Volume Successful Attached: d-2ze4p69x0n3tx1ch0xch, to Node: i-2zeaaxogmsvnc525axfo, Device: /dev/nvme2n1p1
I0110 18:18:47.066852   21112 util.go:574] formatAndMount: mount options : [shared]
I0110 18:18:47.076557   21112 nodeserver.go:660] "mount successful" method="/csi.v1.Node/NodeStageVolume" volumeID="d-2ze4p69x0n3tx1ch0xch" target="/var/lib/kubelet/plugins/kubernetes.io/csi/diskplugin.csi.alibabacloud.com/2d5f5866b9f40da3d6181467c60adda021b5d011f52c7c195162dd1fb370c409/globalmount" device="/dev/nvme2n1p1" mkfsOptions=[] options=["shared"]
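
The "found device by diff" step in this log can be read as: snapshot the set of block devices before the attach call, then whatever name appears afterwards is the newly attached disk. A minimal sketch under that assumption (not the driver's implementation); once found, the device can be tagged again with the xattr as in the first sketch.

package main

import (
	"fmt"
	"os"
)

// listBlockDevices snapshots the device names under /sys/class/block.
func listBlockDevices() (map[string]bool, error) {
	entries, err := os.ReadDir("/sys/class/block")
	if err != nil {
		return nil, err
	}
	set := make(map[string]bool, len(entries))
	for _, e := range entries {
		set[e.Name()] = true
	}
	return set, nil
}

func main() {
	before, err := listBlockDevices()
	if err != nil {
		panic(err)
	}
	// ... issue AttachDisk here and wait for it to complete ...
	after, err := listBlockDevices()
	if err != nil {
		panic(err)
	}
	for name := range after {
		if !before[name] {
			fmt.Println("new device:", "/dev/"+name) // the attached disk
		}
	}
}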

Commit: Move DiskXattrName and DiskXattrVirtioBlkName variables from nodeserver.go to a new file xattr.go. This is a pure code move with no functional changes.
@huww98 (Contributor, Author) commented Jan 11, 2026

/unhold

@k8s-ci-robot removed the do-not-merge/hold label Jan 11, 2026
@mowangdk (Contributor)

/lgtm
/approve

@k8s-ci-robot added the lgtm label Jan 12, 2026
@k8s-ci-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: huww98, mowangdk

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot added the approved label Jan 12, 2026
@k8s-ci-robot merged commit edf8fd3 into kubernetes-sigs:master Jan 12, 2026
14 checks passed