
MIG Stuck in Pending enable state in A100 with 580 driver #1845

@PraveenKumarInjam


Hi @tariq1890,

We are seeing a MIG issue with driver 580.65.06: MIG enablement is stuck in the pending state, and MIG Manager fails to apply the configuration. We observe this only on A100 GPUs; everything works fine on H100 and H200.

MIG Manager log:

"2025-10-30T07:33:20Z" level=fatal msg="Error applying MIG configuration with hooks: unable to apply MIG config with MIG mode disabled"

nvidia-mig-manager time="2025-10-30T07:33:20Z" level=info msg="Restarting any GPU clients previously shutdown in Kubernetes by reenabling their component-specific nodeSelector labels"

nvidia-mig-manager time="2025-10-30T07:33:20Z" level=info msg="Changing the 'nvidia.com/mig.config.state' node label to 'failed'\n"

nvidia-mig-manager time="2025-10-30T07:33:20Z" level=error msg="Error: failed to apply MIG configuration: exit status 1"

nvidia-mig-manager time="2025-10-30T07:33:20Z" level=info msg="Waiting for change to 'nvidia.com/mig.config' label"

Output of nvidia-smi:
[Image: screenshot of nvidia-smi output]

Log:
[root@nvidia-driver-daemonset-m4wl8 drivers]# nvidia-smi -mig 1
Warning: MIG mode is in pending enable state for GPU 00000001:00:00.0:In use by another client
00000001:00:00.0 is currently being used by one or more other processes (e.g. CUDA application or a monitoring application such as another instance of nvidia-smi). Please first kill all processes using the device and retry the command or reboot the system to make MIG mode effective.
Warning: MIG mode is in pending enable state for GPU 00000002:00:00.0:In use by another client
00000002:00:00.0 is currently being used by one or more other processes (e.g. CUDA application or a monitoring application such as another instance of nvidia-smi). Please first kill all processes using the device and retry the command or reboot the system to make MIG mode effective.
Warning: MIG mode is in pending enable state for GPU 00000003:00:00.0:In use by another client
00000003:00:00.0 is currently being used by one or more other processes (e.g. CUDA application or a monitoring application such as another instance of nvidia-smi). Please first kill all processes using the device and retry the command or reboot the system to make MIG mode effective.
Warning: MIG mode is in pending enable state for GPU 00000004:00:00.0:In use by another client
00000004:00:00.0 is currently being used by one or more other processes (e.g. CUDA application or a monitoring application such as another instance of nvidia-smi). Please first kill all processes using the device and retry the command or reboot the system to make MIG mode effective.
All done.
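
For anyone triaging a similar node, here is a minimal sketch for pulling the affected PCI bus IDs out of the `nvidia-smi -mig 1` output above. It assumes the warning wording is exactly as shown in this log. The function name `pending_mig_gpus` and the regular expression are hypothetical helpers for illustration; they are not part of nvidia-smi or MIG Manager.

```python
import re
from typing import List

# Assumption: matches the warning text exactly as printed in the log above.
# A PCI bus ID looks like 00000001:00:00.0 (domain:bus:device.function).
PENDING_RE = re.compile(
    r"MIG mode is in pending enable state for GPU\s+"
    r"([0-9A-Fa-f]{8}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}\.[0-9A-Fa-f])"
)


def pending_mig_gpus(smi_output: str) -> List[str]:
    """Return bus IDs reported as pending MIG enable, in order, without duplicates."""
    seen: List[str] = []
    for bus_id in PENDING_RE.findall(smi_output):
        if bus_id not in seen:
            seen.append(bus_id)
    return seen


if __name__ == "__main__":
    # Shortened copy of two warnings from the log above.
    sample = (
        "Warning: MIG mode is in pending enable state for GPU "
        "00000001:00:00.0:In use by another client\n"
        "Warning: MIG mode is in pending enable state for GPU "
        "00000002:00:00.0:In use by another client\n"
        "All done.\n"
    )
    for bus_id in pending_mig_gpus(sample):
        print(bus_id)
```

The captured output of `nvidia-smi -mig 1` can be passed to this function as a string. An empty list means no GPU reported a pending enable state in that text.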
