Add NVidia Datacenter (V100/A100/H100) GPU assignment to CloudStack guests #5405

Open
@MejdiB

Description

ISSUE TYPE
  • Enhancement Request
COMPONENT NAME

Apache CloudStack UI, Agent, KVM, API, etc.

CLOUDSTACK VERSION

Apache CloudStack 4.15.1

CONFIGURATION

Small setup with 1 zone, 1 pod, and 2 clusters: one for CPU virtualization and one for GPU virtualization.
The GPU cluster consists of 2 nodes with 4 NVIDIA A100 cards per node.

OS / ENVIRONMENT

Red Hat Enterprise Linux 8.4 with KVM/QEMU/libvirt and the CloudStack Agent installed on several GPU hosts, each with several NVIDIA A100 AI cards. The rest of the CloudStack setup, including the other hosts, also runs on RHEL 8.4.

SUMMARY

Currently, CloudStack does not support the NVIDIA A100 GPU. Under Service Offerings -> New Service Offering, I can only define the "old" GRID K1 and K2 cards as GPU resources. Both the GUI and the underlying functionality lack support for A100 and V100 cards. Furthermore, no GPU resources are displayed zone-wide, e.g. on the dashboard, after adding a host with A100 cards. As far as I know, KVM virtualization on RHEL 8.4 is supported.

STEPS TO REPRODUCE

Add a host/node with A100 cards to a cluster in CloudStack. No zone-wide GPU resources are displayed on the dashboard, and the CloudStack GUI offers no way to define Compute Offerings with A100 capabilities.
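To rule out a host-level problem before reproducing, it can help to confirm that the node itself sees the cards on the PCI bus. The following is a minimal sketch and not part of CloudStack. It assumes `lspci -nn` is available on the host, and the helper names `find_nvidia_devices` and `list_host_gpus` are hypothetical. It filters the output by the NVIDIA PCI vendor ID (`10de`).

```python
import re
import subprocess

NVIDIA_VENDOR_ID = "10de"  # PCI vendor ID assigned to NVIDIA

# Matches the "[vendor:device]" pair that `lspci -nn` appends to each line.
_ID_PATTERN = re.compile(r"\[([0-9a-f]{4}):([0-9a-f]{4})\]")


def find_nvidia_devices(lspci_output):
    """Return (pci_address, description) for every NVIDIA device line."""
    devices = []
    for line in lspci_output.splitlines():
        ids = _ID_PATTERN.findall(line.lower())
        # Skip lines without an ID pair or with a different vendor.
        if not ids or ids[-1][0] != NVIDIA_VENDOR_ID:
            continue
        address, _, description = line.partition(" ")
        devices.append((address, description.strip()))
    return devices


def list_host_gpus():
    """Run `lspci -nn` on the local host and return its NVIDIA devices."""
    result = subprocess.run(
        ["lspci", "-nn"], capture_output=True, text=True, check=True
    )
    return find_nvidia_devices(result.stdout)
```

Calling `list_host_gpus()` on a GPU node should return one entry per card if the hardware is visible to the OS. An empty list would point to a host or driver issue rather than to CloudStack.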

EXPECTED RESULTS

Add the A100 card as an option under Service Offering > GPU. When it is chosen, users can define how many vGPUs to pass to the virtual machine. The CloudStack dashboard displays the available GPU resources.

