Description
I'm running into a problem. While all of the IB SR-IOV devices were in use, I deleted a container and noticed that the sriov-device-plugin restarted. Because the other IB VFs are still in use, the device plugin now seems to recognize only one device. The node's "Allocatable" and "Capacity" fields have been updated to 1, the number of remaining devices, but the "Allocated Resources" field still shows the amount that was allocated before the restart. As a result, the node appears to have insufficient resources and new containers cannot be created.
Pod error:
Warning FailedScheduling 29s xpu-engine-scheduler 0/2 nodes are available: 2 Insufficient openshift.io/ib-sriov. preemption: 0/2 nodes are available: 2 No preemption victims found for incoming pod.
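To make the mismatch concrete, here is a minimal sketch of the arithmetic that would produce this error. It is not the scheduler's actual code: it assumes the scheduler treats a node's free extended resources as allocatable minus the sum of the requests of pods already bound to the node, which is how the default Kubernetes resource-fit check behaves (the custom `xpu-engine-scheduler` may differ). The function names and the counts (8 VFs in total, 7 still held by running pods) are hypothetical; only the allocatable value of 1 after the restart comes from the report above.

```python
from typing import Sequence


def free_extended_resource(allocatable: int, bound_requests: Sequence[int]) -> int:
    """Free units as assumed here: node allocatable minus requests of bound pods."""
    return allocatable - sum(bound_requests)


def fits(pod_request: int, allocatable: int, bound_requests: Sequence[int]) -> bool:
    """True if a new pod requesting `pod_request` units would fit on the node."""
    return pod_request <= free_extended_resource(allocatable, bound_requests)


# Hypothetical numbers: 8 VFs in total, 7 pods still running with 1 VF each
# after one container was deleted.
TOTAL_VFS = 8
still_running = [1] * 7

# Expected state: the plugin advertises all 8 VFs, so one is free.
print(free_extended_resource(TOTAL_VFS, still_running))  # 1
print(fits(1, TOTAL_VFS, still_running))                 # True

# Reported state: after the restart the plugin advertises only the single
# unused VF, while the 7 running pods still count against the node.
print(free_extended_resource(1, still_running))          # -6
print(fits(1, 1, still_running))                         # False
```

Under these assumptions the node looks overcommitted as soon as allocatable drops below what running pods already hold, which matches the `Insufficient openshift.io/ib-sriov` message.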
My SriovNetworkNodePolicy configuration is as follows:
(base) [root@cp0 ~]# kubectl describe SriovNetworkNodePolicy ib-sriov -n sriov-network-operator
Name: ib-sriov
Namespace: sriov-network-operator
Labels:
Annotations:
API Version: sriovnetwork.openshift.io/v1
Kind: SriovNetworkNodePolicy
Metadata:
  Creation Timestamp: 2026-02-10T03:10:31Z
  Generation: 1
  Managed Fields:
    API Version: sriovnetwork.openshift.io/v1
    Fields Type: FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .:
          f:kubectl.kubernetes.io/last-applied-configuration:
      f:spec:
        .:
        f:deviceType:
        f:isRdma:
        f:linkType:
        f:mtu:
        f:nicSelector:
          .:
          f:deviceID:
          f:vendor:
        f:nodeSelector:
          .:
          f:kubernetes.io/os:
        f:numVfs:
        f:priority:
        f:resourceName:
    Manager: kubectl-client-side-apply
    Operation: Update
    Time: 2026-02-10T03:10:31Z
  Resource Version: 5802321
  UID: eaec8e82-3641-44f8-a7df-c03015058c99
Spec:
  Device Type: netdevice
  Is Rdma: true
  Link Type: ib
  Mtu: 2044
  Nic Selector:
    Device ID: 1021
    Vendor: 15b3
  Node Selector:
    kubernetes.io/os: linux
  Num Vfs: 1
  Priority: 99
  Resource Name: ib-sriov
Events: