Fix PrepareResourceClaim to read the rdmadev from sysfs path as well#57
Fix PrepareResourceClaim to read the rdmadev from sysfs path as well#57neaggarwMS wants to merge 1 commit intokubernetes-sigs:mainfrom
Conversation
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: neaggarwMS The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
|
Welcome @neaggarwMS! |
|
Hi @neaggarwMS. Thanks for your PR. I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
|
|
||
| //Fallback to sysfs check if rdmamap fails. This is particularly related to a known | ||
| // issue to detect RDMA devices for certain Mellanox NICs | ||
| // https://github.com/Mellanox/rdmamap/issues/15 | ||
|
|
||
| rdmaDir := filepath.Join("/sys/class/net", ifName, "device/infiniband") | ||
|
|
||
| entries, err := os.ReadDir(rdmaDir) | ||
| if err != nil { | ||
| return "", fmt.Errorf("no RDMA device for %s: %w", ifName, err) | ||
| } | ||
|
|
||
| for _, entry := range entries { | ||
| if entry.IsDir() { | ||
| return entry.Name(), nil // Return first RDMA device found (e.g., "mlx5_0") | ||
| } | ||
| } | ||
|
|
||
| return "", fmt.Errorf("no RDMA device found for %s", ifName) | ||
| } | ||
|
|
||
| return rdmaDev, nil |
There was a problem hiding this comment.
should we consider creating this as function in pkg/inventory/sysfs.go as all sysnet operations added in that file
There was a problem hiding this comment.
+1
Extending this further, can we please refactor the existing function
Line 112 in 97d40e9
IsRDmaDeviceForNetdevice and GetRdmaDeviceForNetdevice, where IsRDmaDeviceForNetdevice is a simple wrapper over GetRdmaDeviceForNetdevice (Ref. https://github.com/Mellanox/rdmamap/blob/37bd11cc4c57da931b7b117f829fb663d46ce480/rdma_map.go#L348-L368
gauravkghildiyal
left a comment
There was a problem hiding this comment.
Thanks @neaggarwMS. Mostly looks good.
(Please take a look at the presubmits to sign the CLA)
/ok-to-test
|
|
||
| //Fallback to sysfs check if rdmamap fails. This is particularly related to a known | ||
| // issue to detect RDMA devices for certain Mellanox NICs | ||
| // https://github.com/Mellanox/rdmamap/issues/15 | ||
|
|
||
| rdmaDir := filepath.Join("/sys/class/net", ifName, "device/infiniband") | ||
|
|
||
| entries, err := os.ReadDir(rdmaDir) | ||
| if err != nil { | ||
| return "", fmt.Errorf("no RDMA device for %s: %w", ifName, err) | ||
| } | ||
|
|
||
| for _, entry := range entries { | ||
| if entry.IsDir() { | ||
| return entry.Name(), nil // Return first RDMA device found (e.g., "mlx5_0") | ||
| } | ||
| } | ||
|
|
||
| return "", fmt.Errorf("no RDMA device found for %s", ifName) | ||
| } | ||
|
|
||
| return rdmaDev, nil |
There was a problem hiding this comment.
+1
Extending this further, can we please refactor the existing function
Line 112 in 97d40e9
IsRDmaDeviceForNetdevice and GetRdmaDeviceForNetdevice, where IsRDmaDeviceForNetdevice is a simple wrapper over GetRdmaDeviceForNetdevice (Ref. https://github.com/Mellanox/rdmamap/blob/37bd11cc4c57da931b7b117f829fb663d46ce480/rdma_map.go#L348-L368
|
Hey @neaggarwMS, checking in to see if you still need this change and will be able to accommodate the minor proposed comment. |
|
ping @neaggarwMS , please check last comment and also sign the CLA, this seems almost ready to go Thanks |
his PR fixes the PrepareResourceClaim function to read the RDMA device name from sysfs path as a fallback when rdmamap fails to detect the device.
Changes:
Added `getRdmaDeviceFromNetdev` function in `rdmadevice.go` that:
Added comprehensive unit tests in `rdmadevice_test.go`
Testing: