sriov-device-plugin restart #1028

@nnnq-terry

I'm running into the following problem. After all of the IB SR-IOV devices on a node had been allocated, I deleted one container and noticed that the sriov-device-plugin restarted. Because the other IB VFs were still in use, the device plugin now seems to recognize only one device. The node's “Capacity” and “Allocatable” values were updated to 1, the single remaining device. However, “Allocated Resources” still shows the amount that was allocated before the restart, so the node appears to have no free resources and new containers cannot be created.

Pod error:
Warning FailedScheduling 29s xpu-engine-scheduler 0/2 nodes are available: 2 Insufficient openshift.io/ib-sriov. preemption: 0/2 nodes are available: 2 No preemption victims found for incoming pod.
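
To make the accounting above concrete, here is a minimal sketch of the fit check as I understand it: the scheduler compares the node's Allocatable value with the sum of the requests of pods already on the node. The function names and the numbers (4 VFs originally, 3 still held by running pods) are hypothetical and only for illustration; they are not taken from my cluster or from the scheduler's code.

```python
from typing import List


def free_units(allocatable: int, existing_requests: List[int]) -> int:
    """Units still considered free: allocatable minus requests already on the node."""
    return allocatable - sum(existing_requests)


def fits(allocatable: int, existing_requests: List[int], new_request: int) -> bool:
    """True if a new pod requesting `new_request` units would fit on the node."""
    return free_units(allocatable, existing_requests) >= new_request


# Hypothetical numbers: three running pods each hold one VF.
running = [1, 1, 1]

# Before the plugin restart: Allocatable is 4, so one more pod fits.
print(fits(4, running, 1))

# After the restart: Allocatable dropped to 1, but the three running pods
# are still counted, so the node looks overcommitted and nothing fits.
print(free_units(1, running))
print(fits(1, running, 1))
```

With these numbers the node goes from one free unit to a negative balance, which would match the "Insufficient openshift.io/ib-sriov" message even though one VF is actually idle.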

My SriovNetworkNodePolicy configuration is as follows:
(base) [root@cp0 ~]# kubectl describe SriovNetworkNodePolicy ib-sriov -n sriov-network-operator
Name: ib-sriov
Namespace: sriov-network-operator
Labels:
Annotations:
API Version: sriovnetwork.openshift.io/v1
Kind: SriovNetworkNodePolicy
Metadata:
  Creation Timestamp: 2026-02-10T03:10:31Z
  Generation: 1
  Managed Fields:
    API Version: sriovnetwork.openshift.io/v1
    Fields Type: FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .:
          f:kubectl.kubernetes.io/last-applied-configuration:
      f:spec:
        .:
        f:deviceType:
        f:isRdma:
        f:linkType:
        f:mtu:
        f:nicSelector:
          .:
          f:deviceID:
          f:vendor:
        f:nodeSelector:
          .:
          f:kubernetes.io/os:
        f:numVfs:
        f:priority:
        f:resourceName:
    Manager: kubectl-client-side-apply
    Operation: Update
    Time: 2026-02-10T03:10:31Z
  Resource Version: 5802321
  UID: eaec8e82-3641-44f8-a7df-c03015058c99
Spec:
  Device Type: netdevice
  Is Rdma: true
  Link Type: ib
  Mtu: 2044
  Nic Selector:
    Device ID: 1021
    Vendor: 15b3
  Node Selector:
    kubernetes.io/os: linux
  Num Vfs: 1
  Priority: 99
  Resource Name: ib-sriov
Events:
