Before you can use an accelerator in {productname-short}, you must install the relevant software components. The installation process varies based on the accelerator type.
-
You have logged in to your {openshift-platform} cluster.
-
You have the cluster-admin role in your {openshift-platform} cluster.
-
You have installed an accelerator and confirmed that it is detected in your environment.
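The first two prerequisites can be checked from the command line. The following is a minimal sketch, assuming the `oc` client is on your path; `check_cluster_admin` is a hypothetical helper name, and a "yes" answer to the wildcard permission query only suggests cluster-wide admin rights, it does not prove that the cluster-admin role itself is bound to your user.

```shell
# Sketch only: check_cluster_admin is a hypothetical helper, not part of oc.
check_cluster_admin() {
  # oc whoami fails when there is no active login session.
  if ! oc whoami >/dev/null 2>&1; then
    echo "not logged in"
    return 1
  fi
  # oc auth can-i prints "yes" or "no" for the requested verb and resource.
  # Assumption: wildcard access in all namespaces is treated here as a
  # stand-in for having the cluster-admin role.
  if [ "$(oc auth can-i '*' '*' --all-namespaces 2>/dev/null)" = "yes" ]; then
    echo "wildcard access confirmed"
  else
    echo "insufficient permissions"
    return 1
  fi
}
# Usage: check_cluster_admin
```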
-
Follow the appropriate documentation to enable your accelerator:
-
NVIDIA GPUs: See Enabling NVIDIA GPUs.
-
Intel Gaudi AI accelerators: See Enabling Intel Gaudi AI accelerators.
-
AMD GPUs: See Enabling AMD GPUs.
-
After installing your accelerator, create an accelerator profile as described in Working with accelerator profiles.
-
From the Administrator perspective, go to the Operators → Installed Operators page. Confirm that the following Operators appear:
-
The Operator for your accelerator
-
Node Feature Discovery (NFD)
-
Kernel Module Management (KMM)
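The same check can be sketched from the CLI by listing ClusterServiceVersions with `oc get csv` and searching the listing for each Operator. In this sketch, `check_operators` is a hypothetical helper, and the name fragments in the usage comment are assumptions; replace them with the names that appear in your cluster.

```shell
# Sketch only: check_operators is a hypothetical helper, not part of oc.
# It reads an Operator listing on stdin and reports, for each name fragment
# given as an argument, whether any line contains it (case-insensitive).
check_operators() {
  listing="$(cat)"
  missing=0
  for name in "$@"; do
    if printf '%s\n' "$listing" | grep -qi -- "$name"; then
      echo "found: $name"
    else
      echo "missing: $name"
      missing=1
    fi
  done
  # Non-zero exit status when at least one fragment was not found.
  return "$missing"
}
# Usage (fragments are assumptions, adjust them to your cluster):
#   oc get csv --all-namespaces | check_operators nfd kernel-module-management gpu-operator
```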
-
The accelerator is detected a few minutes after the Node Feature Discovery (NFD) Operator and the relevant accelerator Operator are fully installed. The {openshift-platform} command-line interface (CLI) then displays the appropriate output for the GPU worker node. For example, the following output confirms that an NVIDIA GPU is detected:
# Expected output when the accelerator is detected correctly
oc describe node <node name>
...
Capacity:
  cpu:                4
  ephemeral-storage:  313981932Ki
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             16076568Ki
  nvidia.com/gpu:     1
  pods:               250
Allocatable:
  cpu:                3920m
  ephemeral-storage:  288292006229
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             12828440Ki
  nvidia.com/gpu:     1
  pods:               250
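To avoid scanning the full node description by eye, the relevant value can be extracted with a small filter. This is a minimal sketch, assuming the output layout shown in the example (section headers at the start of a line, resource entries indented beneath them); `allocatable_resource` is a hypothetical helper name, and `nvidia.com/gpu` is the default only because it is the resource in the example.

```shell
# Sketch only: allocatable_resource is a hypothetical helper, not part of oc.
# It reads `oc describe node` output on stdin and prints the value of one
# resource from the Allocatable section (default: nvidia.com/gpu).
allocatable_resource() {
  resource="${1:-nvidia.com/gpu}"
  awk -v res="$resource:" '
    # Enter the Allocatable section.
    /^Allocatable:/ { in_section = 1; next }
    # Any other non-indented line ends the section.
    in_section && /^[^[:space:]]/ { in_section = 0 }
    # Print the value that follows the requested resource name.
    in_section && $1 == res { print $2; found = 1 }
    # Exit non-zero when the resource was not listed.
    END { exit (found ? 0 : 1) }
  '
}
# Usage:
#   oc describe node <node name> | allocatable_resource
#   oc describe node <node name> | allocatable_resource cpu
```

A non-zero exit status means that the resource does not appear under Allocatable, which may indicate that the accelerator has not been detected yet.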