Commit 5d2791e

Report recommended CPU models for heterogeneous clusters
In heterogeneous clusters with multiple CPU vendors and generations,
choosing the best CPU model for VMs is challenging:
- Using a newer CPU model limits VM placement to a subset of nodes
- Using an older model impacts guest performance unnecessarily

This change adds automatic CPU model recommendations to the
HyperConverged status. The recommendations are sorted by a weighted
score that balances:
- PassMark performance score (50%) - for guest performance
- Available CPU cores (20%) - for workload capacity
- Available memory (15%) - for workload capacity
- Node count (15%) - for availability and redundancy

The top recommendations appear in status.nodeInfo.recommendedCpuModels,
helping cluster administrators make informed decisions when setting
spec.defaultCPUModel.

The CPU model data is gathered from node labels
(cpu-model.node.kubevirt.io/*) set by KubeVirt's virt-handler, and is
automatically refreshed when nodes change or hourly.

Assisted-by: claude-4-sonnet
Signed-off-by: Dan Kenigsberg <danken@redhat.com>
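The weighted ranking described in the commit message can be sketched roughly as follows. Only the four weights come from the message. The `candidate` type, the `rank` and `ratio` helpers, the sample numbers, and above all the normalization (each metric divided by the largest value among the candidates) are assumptions made for illustration; the commit message does not say how the operator normalizes its inputs.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// candidate is a hypothetical stand-in for the per-model data described in
// the commit message; it is not the operator's actual type.
type candidate struct {
	Name      string
	Benchmark int     // PassMark-style performance score
	Cores     float64 // CPU cores available on supporting nodes
	MemoryGiB float64 // memory available on supporting nodes
	Nodes     int     // number of supporting nodes
}

// Weights as stated in the commit message.
const (
	weightBenchmark = 0.50
	weightCores     = 0.20
	weightMemory    = 0.15
	weightNodes     = 0.15
)

// ratio returns v/top, or 0 when top is not positive.
func ratio(v, top float64) float64 {
	if top <= 0 {
		return 0
	}
	return v / top
}

// rank returns a copy of cs sorted by descending weighted score.
// ASSUMPTION: every metric is normalized by its maximum across candidates.
func rank(cs []candidate) []candidate {
	var maxB, maxC, maxM, maxN float64
	for _, c := range cs {
		maxB = math.Max(maxB, float64(c.Benchmark))
		maxC = math.Max(maxC, c.Cores)
		maxM = math.Max(maxM, c.MemoryGiB)
		maxN = math.Max(maxN, float64(c.Nodes))
	}
	score := func(c candidate) float64 {
		return weightBenchmark*ratio(float64(c.Benchmark), maxB) +
			weightCores*ratio(c.Cores, maxC) +
			weightMemory*ratio(c.MemoryGiB, maxM) +
			weightNodes*ratio(float64(c.Nodes), maxN)
	}
	out := append([]candidate(nil), cs...)
	sort.SliceStable(out, func(i, j int) bool { return score(out[i]) > score(out[j]) })
	return out
}

func main() {
	// Illustrative numbers only; they are not measured values.
	ranked := rank([]candidate{
		{Name: "older-model", Benchmark: 5000, Cores: 96, MemoryGiB: 768, Nodes: 6},
		{Name: "newer-model", Benchmark: 20000, Cores: 32, MemoryGiB: 256, Nodes: 2},
	})
	for _, c := range ranked {
		fmt.Println(c.Name)
	}
}
```

Under this normalization, a model that is supported by every node but has a low benchmark can still lose to a faster model available on fewer nodes, which is the trade-off the weights are meant to balance.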
1 parent ecae78c commit 5d2791e

17 files changed: +920 -4 lines changed

api/v1beta1/hyperconverged_types.go

Lines changed: 19 additions & 0 deletions
@@ -3,6 +3,7 @@ package v1beta1
 import (
 	openshiftconfigv1 "github.com/openshift/api/config/v1"
 	corev1 "k8s.io/api/core/v1"
+	"k8s.io/apimachinery/pkg/api/resource"
 	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"

 	v1 "kubevirt.io/api/core/v1"
@@ -856,6 +857,24 @@ type NodeInfoStatus struct {
 	WorkloadsArchitectures []string `json:"workloadsArchitectures,omitempty"`
 	// ControlPlaneArchitectures is a distinct list of the CPU architecture of the control-plane nodes.
 	ControlPlaneArchitectures []string `json:"controlPlaneArchitectures,omitempty"`
+	// RecommendedCpuModels is a list of recommended CPU models for the cluster based on available nodes
+	// +listType=map
+	// +listMapKey=name
+	RecommendedCpuModels []CpuModelInfo `json:"recommendedCpuModels,omitempty"`
+}
+
+// CpuModelInfo contains information about a CPU model and its availability in the cluster
+type CpuModelInfo struct {
+	// Name is the CPU model name
+	Name string `json:"name"`
+	// Benchmark is the CPU performance score for this model
+	Benchmark int `json:"benchmark"`
+	// Nodes is the number of nodes that support this CPU model
+	Nodes int `json:"nodes"`
+	// CPU is the total CPU cores available across all nodes supporting this model
+	CPU *resource.Quantity `json:"cpu"`
+	// Memory is the total memory available across all nodes supporting this model
+	Memory *resource.Quantity `json:"memory"`
 }

 // ApplicationAwareConfigurations holds the AAQ configurations
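
The `Nodes`, `CPU` and `Memory` fields of `CpuModelInfo` are per-model totals over the nodes that support the model. A minimal sketch of how such totals could be derived from the `cpu-model.node.kubevirt.io/*` labels named in the commit message is shown below. The `node` and `modelTotals` types and the `aggregate` function are hypothetical, plain integers stand in for `resource.Quantity`, and the sketch assumes that a label value of `"true"` marks a supported model; none of this is taken from the operator's actual implementation.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Label prefix named in the commit message.
const cpuModelLabelPrefix = "cpu-model.node.kubevirt.io/"

// node is a hypothetical, simplified stand-in for a Kubernetes Node.
type node struct {
	Labels      map[string]string
	Cores       int64
	MemoryBytes int64
}

// modelTotals mirrors the counters of CpuModelInfo using plain integers
// instead of resource.Quantity (a simplification for this sketch).
type modelTotals struct {
	Name        string
	Nodes       int
	Cores       int64
	MemoryBytes int64
}

// aggregate sums node count, cores and memory per CPU model label and
// returns the result sorted by model name.
func aggregate(nodes []node) []modelTotals {
	byName := map[string]*modelTotals{}
	for _, n := range nodes {
		for label, value := range n.Labels {
			// ASSUMPTION: a value of "true" marks a supported model.
			if !strings.HasPrefix(label, cpuModelLabelPrefix) || value != "true" {
				continue
			}
			name := strings.TrimPrefix(label, cpuModelLabelPrefix)
			t, ok := byName[name]
			if !ok {
				t = &modelTotals{Name: name}
				byName[name] = t
			}
			t.Nodes++
			t.Cores += n.Cores
			t.MemoryBytes += n.MemoryBytes
		}
	}
	out := make([]modelTotals, 0, len(byName))
	for _, t := range byName {
		out = append(out, *t)
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Name < out[j].Name })
	return out
}

func main() {
	// Illustrative nodes; sizes are made up.
	nodes := []node{
		{Labels: map[string]string{cpuModelLabelPrefix + "Haswell": "true"}, Cores: 16, MemoryBytes: 64 << 30},
		{Labels: map[string]string{cpuModelLabelPrefix + "Haswell": "true", cpuModelLabelPrefix + "Skylake-Server": "true"}, Cores: 32, MemoryBytes: 128 << 30},
	}
	for _, t := range aggregate(nodes) {
		fmt.Printf("%s: nodes=%d cores=%d memoryBytes=%d\n", t.Name, t.Nodes, t.Cores, t.MemoryBytes)
	}
}
```

Because a node usually carries labels for several models, the same node's capacity is counted once for every model it supports; the totals answer "how much capacity could run this model", not "how much capacity exists".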

api/v1beta1/zz_generated.deepcopy.go

Lines changed: 34 additions & 1 deletion

api/v1beta1/zz_generated.defaults.go

Lines changed: 1 addition & 1 deletion

api/v1beta1/zz_generated.openapi.go

Lines changed: 1 addition & 1 deletion

config/crd/bases/hco.kubevirt.io_hyperconvergeds.yaml

Lines changed: 45 additions & 0 deletions
@@ -5163,6 +5163,51 @@ spec:
                     items:
                       type: string
                     type: array
+                  recommendedCpuModels:
+                    description: RecommendedCpuModels is a list of recommended CPU
+                      models for the cluster based on available nodes
+                    items:
+                      description: CpuModelInfo contains information about a CPU model
+                        and its availability in the cluster
+                      properties:
+                        benchmark:
+                          description: Benchmark is the CPU performance score for
+                            this model
+                          type: integer
+                        cpu:
+                          anyOf:
+                          - type: integer
+                          - type: string
+                          description: CPU is the total CPU cores available across
+                            all nodes supporting this model
+                          pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
+                          x-kubernetes-int-or-string: true
+                        memory:
+                          anyOf:
+                          - type: integer
+                          - type: string
+                          description: Memory is the total memory available across
+                            all nodes supporting this model
+                          pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
+                          x-kubernetes-int-or-string: true
+                        name:
+                          description: Name is the CPU model name
+                          type: string
+                        nodes:
+                          description: Nodes is the number of nodes that support this
+                            CPU model
+                          type: integer
+                      required:
+                      - benchmark
+                      - cpu
+                      - memory
+                      - name
+                      - nodes
+                      type: object
+                    type: array
+                    x-kubernetes-list-map-keys:
+                    - name
+                    x-kubernetes-list-type: map
                   workloadsArchitectures:
                     description: WorkloadsArchitectures is a distinct list of the
                       CPU architectures of the workloads nodes in the cluster.
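
The `pattern` attached to the `cpu` and `memory` properties is the quantity regular expression from the schema above. As a quick illustration, it can be exercised with Go's `regexp` package; the pattern string is copied verbatim from the CRD, while the sample inputs and the `quantityPattern` variable name are made up for this sketch.

```go
package main

import (
	"fmt"
	"regexp"
)

// quantityPattern is copied verbatim from the CRD schema for the cpu and
// memory properties.
var quantityPattern = regexp.MustCompile(`^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$`)

func main() {
	// Illustrative inputs: plain numbers, milli-units, binary suffixes and
	// exponents are accepted; free text and unknown suffixes are not.
	samples := []string{"64", "1500m", "256Gi", "1e3", "64 cores", "256GB"}
	for _, s := range samples {
		fmt.Printf("%q matches: %v\n", s, quantityPattern.MatchString(s))
	}
}
```

The pattern only checks the textual shape of a quantity. Whether a value is a sensible core count or memory size is not something the schema validates.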

controllers/hyperconverged/hyperconverged_controller.go

Lines changed: 9 additions & 0 deletions
@@ -514,6 +514,15 @@ func updateStatus(req *common.HcoRequest) {
 		req.Instance.Status.NodeInfo.WorkloadsArchitectures = workloadsArch
 		req.StatusDirty = true
 	}
+
+	if cpuModels := nodeinfo.GetRecommendedCpuModels(); !slices.EqualFunc(req.Instance.Status.NodeInfo.RecommendedCpuModels, cpuModels, func(a, b hcov1beta1.CpuModelInfo) bool {
+		return a.Name == b.Name && a.Benchmark == b.Benchmark && a.Nodes == b.Nodes &&
+			((a.CPU == nil && b.CPU == nil) || (a.CPU != nil && b.CPU != nil && a.CPU.Equal(*b.CPU))) &&
+			((a.Memory == nil && b.Memory == nil) || (a.Memory != nil && b.Memory != nil && a.Memory.Equal(*b.Memory)))
+	}) {
+		req.Instance.Status.NodeInfo.RecommendedCpuModels = cpuModels
+		req.StatusDirty = true
+	}
 }

 // getHyperConverged gets the HyperConverged resource from the Kubernetes API.

deploy/crds/hco00.crd.yaml

Lines changed: 45 additions & 0 deletions
@@ -5163,6 +5163,51 @@ spec:
                     items:
                       type: string
                     type: array
+                  recommendedCpuModels:
+                    description: RecommendedCpuModels is a list of recommended CPU
+                      models for the cluster based on available nodes
+                    items:
+                      description: CpuModelInfo contains information about a CPU model
+                        and its availability in the cluster
+                      properties:
+                        benchmark:
+                          description: Benchmark is the CPU performance score for
+                            this model
+                          type: integer
+                        cpu:
+                          anyOf:
+                          - type: integer
+                          - type: string
+                          description: CPU is the total CPU cores available across
+                            all nodes supporting this model
+                          pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
+                          x-kubernetes-int-or-string: true
+                        memory:
+                          anyOf:
+                          - type: integer
+                          - type: string
+                          description: Memory is the total memory available across
+                            all nodes supporting this model
+                          pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
+                          x-kubernetes-int-or-string: true
+                        name:
+                          description: Name is the CPU model name
+                          type: string
+                        nodes:
+                          description: Nodes is the number of nodes that support this
+                            CPU model
+                          type: integer
+                      required:
+                      - benchmark
+                      - cpu
+                      - memory
+                      - name
+                      - nodes
+                      type: object
+                    type: array
+                    x-kubernetes-list-map-keys:
+                    - name
+                    x-kubernetes-list-type: map
                   workloadsArchitectures:
                     description: WorkloadsArchitectures is a distinct list of the
                       CPU architectures of the workloads nodes in the cluster.

deploy/index-image/community-kubevirt-hyperconverged/1.18.0/manifests/hco00.crd.yaml

Lines changed: 45 additions & 0 deletions
@@ -5163,6 +5163,51 @@ spec:
                     items:
                       type: string
                     type: array
+                  recommendedCpuModels:
+                    description: RecommendedCpuModels is a list of recommended CPU
+                      models for the cluster based on available nodes
+                    items:
+                      description: CpuModelInfo contains information about a CPU model
+                        and its availability in the cluster
+                      properties:
+                        benchmark:
+                          description: Benchmark is the CPU performance score for
+                            this model
+                          type: integer
+                        cpu:
+                          anyOf:
+                          - type: integer
+                          - type: string
+                          description: CPU is the total CPU cores available across
+                            all nodes supporting this model
+                          pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
+                          x-kubernetes-int-or-string: true
+                        memory:
+                          anyOf:
+                          - type: integer
+                          - type: string
+                          description: Memory is the total memory available across
+                            all nodes supporting this model
+                          pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
+                          x-kubernetes-int-or-string: true
+                        name:
+                          description: Name is the CPU model name
+                          type: string
+                        nodes:
+                          description: Nodes is the number of nodes that support this
+                            CPU model
+                          type: integer
+                      required:
+                      - benchmark
+                      - cpu
+                      - memory
+                      - name
+                      - nodes
+                      type: object
+                    type: array
+                    x-kubernetes-list-map-keys:
+                    - name
+                    x-kubernetes-list-type: map
                   workloadsArchitectures:
                     description: WorkloadsArchitectures is a distinct list of the
                       CPU architectures of the workloads nodes in the cluster.

deploy/olm-catalog/community-kubevirt-hyperconverged/1.18.0/manifests/hco00.crd.yaml

Lines changed: 45 additions & 0 deletions
@@ -5163,6 +5163,51 @@ spec:
                     items:
                       type: string
                     type: array
+                  recommendedCpuModels:
+                    description: RecommendedCpuModels is a list of recommended CPU
+                      models for the cluster based on available nodes
+                    items:
+                      description: CpuModelInfo contains information about a CPU model
+                        and its availability in the cluster
+                      properties:
+                        benchmark:
+                          description: Benchmark is the CPU performance score for
+                            this model
+                          type: integer
+                        cpu:
+                          anyOf:
+                          - type: integer
+                          - type: string
+                          description: CPU is the total CPU cores available across
+                            all nodes supporting this model
+                          pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
+                          x-kubernetes-int-or-string: true
+                        memory:
+                          anyOf:
+                          - type: integer
+                          - type: string
+                          description: Memory is the total memory available across
+                            all nodes supporting this model
+                          pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
+                          x-kubernetes-int-or-string: true
+                        name:
+                          description: Name is the CPU model name
+                          type: string
+                        nodes:
+                          description: Nodes is the number of nodes that support this
+                            CPU model
+                          type: integer
+                      required:
+                      - benchmark
+                      - cpu
+                      - memory
+                      - name
+                      - nodes
+                      type: object
+                    type: array
+                    x-kubernetes-list-map-keys:
+                    - name
+                    x-kubernetes-list-type: map
                   workloadsArchitectures:
                     description: WorkloadsArchitectures is a distinct list of the
                       CPU architectures of the workloads nodes in the cluster.

docs/api.md

Lines changed: 16 additions & 0 deletions
@@ -8,6 +8,7 @@ This Document documents the types introduced by the hyperconverged-cluster-opera
 * [ApplicationAwareConfigurations](#applicationawareconfigurations)
 * [CertRotateConfigCA](#certrotateconfigca)
 * [CertRotateConfigServer](#certrotateconfigserver)
+* [CpuModelInfo](#cpumodelinfo)
 * [DataImportCronStatus](#dataimportcronstatus)
 * [DataImportCronTemplate](#dataimportcrontemplate)
 * [DataImportCronTemplateStatus](#dataimportcrontemplatestatus)
@@ -71,6 +72,20 @@ CertRotateConfigServer contains the tunables for TLS certificates.

 [Back to TOC](#table-of-contents)

+## CpuModelInfo
+
+CpuModelInfo contains information about a CPU model and its availability in the cluster
+
+| Field | Description | Scheme | Default | Required |
+| ----- | ----------- | ------ | -------- |-------- |
+| name | Name is the CPU model name | string | | true |
+| benchmark | Benchmark is the CPU performance score for this model | int | | true |
+| nodes | Nodes is the number of nodes that support this CPU model | int | | true |
+| cpu | CPU is the total CPU cores available across all nodes supporting this model | *resource.Quantity | | true |
+| memory | Memory is the total memory available across all nodes supporting this model | *resource.Quantity | | true |
+
+[Back to TOC](#table-of-contents)
+
 ## DataImportCronStatus

 DataImportCronStatus is the status field of the DIC template
@@ -353,6 +368,7 @@ NodeInfoStatus holds information about the cluster nodes
 | ----- | ----------- | ------ | -------- |-------- |
 | workloadsArchitectures | WorkloadsArchitectures is a distinct list of the CPU architectures of the workloads nodes in the cluster. | []string | | false |
 | controlPlaneArchitectures | ControlPlaneArchitectures is a distinct list of the CPU architecture of the control-plane nodes. | []string | | false |
+| recommendedCpuModels | RecommendedCpuModels is a list of recommended CPU models for the cluster based on available nodes | [][CpuModelInfo](#cpumodelinfo) | | false |

 [Back to TOC](#table-of-contents)
