Report recommended CPU models for heterogeneous clusters #3944
dankenigsberg wants to merge 1 commit into kubevirt:main
a18fa9d to 4423d64
deploy/olm-catalog/community-kubevirt-hyperconverged/1.16.0/manifests/hco00.crd.yaml (resolved review thread)
4423d64 to ffb8383
nunnatsa left a comment:
Thanks for fixing the comment. I added some more comments inline.
Please also handle the linter issues (redefinition of the builtin function max).
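For context on the linter remark: since Go 1.21, `max` is a builtin, so a package-level helper with the same name shadows it. The PR's actual helper is not shown in this conversation, so the following is only a sketch of one possible fix, using a hypothetical helper `highest` that calls the builtin instead of redefining it.

```go
package main

import "fmt"

// Flagged pattern (kept as a comment): a helper named max hides the builtin
// that Go 1.21 introduced.
//
//	func max(a, b int) int { ... }

// highest is a hypothetical replacement helper. It relies on the builtin max
// and returns 0 for an empty argument list.
func highest(values ...int) int {
	best := 0
	for _, v := range values {
		best = max(best, v) // builtin max, no local redefinition needed
	}
	return best
}

func main() {
	fmt.Println(highest(3, 9, 4))
}
```

Either renaming the helper or dropping it in favor of the builtin should satisfy the linter; which is preferable depends on the code under review.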
ffb8383 to 5d2791e
nunnatsa left a comment:
Thanks, @dankenigsberg. I added some more comments.
    if cpuModels := nodeinfo.GetRecommendedCpuModels(); !slices.EqualFunc(req.Instance.Status.NodeInfo.RecommendedCpuModels, cpuModels, func(a, b hcov1beta1.CpuModelInfo) bool {
        return a.Name == b.Name && a.Benchmark == b.Benchmark && a.Nodes == b.Nodes &&
            ((a.CPU == nil && b.CPU == nil) || (a.CPU != nil && b.CPU != nil && a.CPU.Equal(*b.CPU))) &&
            ((a.Memory == nil && b.Memory == nil) || (a.Memory != nil && b.Memory != nil && a.Memory.Equal(*b.Memory)))
    }) {
        req.Instance.Status.NodeInfo.RecommendedCpuModels = cpuModels
        req.StatusDirty = true
    }
- We already have a comparison function in the nodeinfo package. We can't expose it for reuse because it lives in an internal package, but maybe we can add it as a method of the CpuModel type instead?
- This code is not tested. Can we have a few unit tests for it? We can use a simple mock, as done for the rest of the functions in the nodeinfo package, to return any desired value.
In heterogeneous clusters with multiple CPU vendors and generations, choosing the best CPU model for VMs is challenging:
- Using a newer CPU model limits VM placement to a subset of nodes
- Using an older model impacts guest performance unnecessarily

This change adds automatic CPU model recommendations to the HyperConverged status. The recommendations are sorted by a weighted score that balances:
- PassMark performance score (50%) - for guest performance
- Available CPU cores (20%) - for workload capacity
- Available memory (15%) - for workload capacity
- Node count (15%) - for availability and redundancy

The top recommendations appear in status.nodeInfo.recommendedCpuModels, helping cluster administrators make informed decisions when setting spec.defaultCPUModel.

The CPU model data is gathered from node labels (cpu-model.node.kubevirt.io/*) set by KubeVirt's virt-handler, and is automatically refreshed when nodes change or hourly.

Assisted-by: claude-4-sonnet
Signed-off-by: Dan Kenigsberg <danken@redhat.com>
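The weighted score described in the commit message can be sketched roughly as follows. Only the four weights come from the description; the `candidate` type, the model names, the sample numbers, and the choice to normalize each metric by its maximum across candidates are assumptions of this sketch and may differ from the PR's implementation.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// candidate is a hypothetical, simplified view of one CPU model's data.
type candidate struct {
	Name      string
	PassMark  float64 // approximate benchmark score
	CPUCores  float64 // cores available on nodes supporting the model
	MemoryGiB float64 // memory available on those nodes
	Nodes     float64 // number of nodes supporting the model
}

// Weights taken from the commit message.
const (
	weightPassMark = 0.50
	weightCPU      = 0.20
	weightMemory   = 0.15
	weightNodes    = 0.15
)

// ratio normalizes v against the largest observed value; 0 when top is 0.
func ratio(v, top float64) float64 {
	if top <= 0 {
		return 0
	}
	return v / top
}

// score combines the normalized metrics. Normalizing by the per-metric
// maximum is an assumption of this sketch.
func score(c, top candidate) float64 {
	return weightPassMark*ratio(c.PassMark, top.PassMark) +
		weightCPU*ratio(c.CPUCores, top.CPUCores) +
		weightMemory*ratio(c.MemoryGiB, top.MemoryGiB) +
		weightNodes*ratio(c.Nodes, top.Nodes)
}

// rank returns a copy of the candidates sorted by descending score.
func rank(in []candidate) []candidate {
	var top candidate
	for _, c := range in {
		top.PassMark = math.Max(top.PassMark, c.PassMark)
		top.CPUCores = math.Max(top.CPUCores, c.CPUCores)
		top.MemoryGiB = math.Max(top.MemoryGiB, c.MemoryGiB)
		top.Nodes = math.Max(top.Nodes, c.Nodes)
	}
	out := append([]candidate(nil), in...)
	sort.SliceStable(out, func(i, j int) bool {
		return score(out[i], top) > score(out[j], top)
	})
	return out
}

func main() {
	// Made-up numbers: a fast model on few nodes versus a slower model
	// that every node supports.
	models := []candidate{
		{Name: "model-new", PassMark: 20000, CPUCores: 32, MemoryGiB: 128, Nodes: 2},
		{Name: "model-old", PassMark: 10000, CPUCores: 96, MemoryGiB: 384, Nodes: 6},
	}
	for _, c := range rank(models) {
		fmt.Println(c.Name)
	}
}
```

With these made-up inputs the widely supported model outranks the faster one, which is the trade-off the description names: half the weight goes to performance, the other half to capacity and availability.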
5d2791e to f435588
Thank you for this important effort. Have you considered clusters with multiple vendors or architectures? Would this recommendation still be valid in those cases? |
Yes: if most of the cluster's CPU power comes from an "exotic" architecture, the code would recommend models from that architecture.
    )

    // cpuModelPassMarkScores maps libvirt CPU model names to approximate PassMark scores.
    // Keys must match exactly the cpu-model.node.kubevirt.io/* label values.
Just a heads-up that the keys in this list tend to change with different libvirt versions, so this would require continued maintenance.
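One way to make that maintenance burden visible is to report node-advertised models that have no entry in the score map. The sketch below assumes the model name is the suffix of the `cpu-model.node.kubevirt.io/` label key; the `scores` map, its values, and the `splitKnown` helper are hypothetical stand-ins, not the PR's code.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

const cpuModelLabelPrefix = "cpu-model.node.kubevirt.io/"

// scores is a hypothetical stand-in for cpuModelPassMarkScores; the values
// are placeholders, not real benchmark results.
var scores = map[string]int{"model-a": 100, "model-b": 200}

// splitKnown extracts CPU model names from a node's labels and separates
// those that have a score entry from those that do not.
func splitKnown(labels map[string]string) (known, unknown []string) {
	for key := range labels {
		if !strings.HasPrefix(key, cpuModelLabelPrefix) {
			continue // unrelated label
		}
		model := strings.TrimPrefix(key, cpuModelLabelPrefix)
		if _, ok := scores[model]; ok {
			known = append(known, model)
		} else {
			unknown = append(unknown, model)
		}
	}
	// Map iteration order is random, so sort for stable output.
	sort.Strings(known)
	sort.Strings(unknown)
	return known, unknown
}

func main() {
	labels := map[string]string{
		cpuModelLabelPrefix + "model-a": "true",
		cpuModelLabelPrefix + "model-z": "true", // not in the score map
		"kubernetes.io/arch":            "amd64",
	}
	known, unknown := splitKnown(labels)
	fmt.Println("scored:", known)
	fmt.Println("missing a score:", unknown)
}
```

Logging or counting the models in the second list would show when a libvirt update has introduced or renamed keys that the map does not yet cover.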
PR needs rebase.


