|
| 1 | +# HCOGoldenImageWithNoSupportedArchitecture |
| 2 | + |
| 3 | +## Meaning |
| 4 | + |
| 5 | +On a heterogeneous cluster (a cluster with nodes of different |
| 6 | +architectures), each DataImportCronTemplate (DICT; also known as a golden |
| 7 | +image) in the hyperconverged cluster operator (HCO) should be annotated with |
| 8 | +the `ssp.kubevirt.io/dict.architectures` annotation. The value is the |
| 9 | +list of architectures supported by the image that is defined in that DICT. |
| 10 | + |
| 11 | +For pre-defined DICTs, this annotation is already set, but for custom DICTs |
| 12 | +(user-defined DICTs), this annotation must be set by the user in the |
| 13 | +HyperConverged custom resource (CR). |
| 14 | + |
| 15 | +For each DICT, if the annotation does not include any architecture that is |
| 16 | +supported by the cluster (that is, no node in the cluster has any of |
| 17 | +the architectures listed in the DICT annotation), HCO triggers |
| 18 | +the `HCOGoldenImageWithNoSupportedArchitecture` alert for that specific DICT. |
| 19 | + |
| 20 | +> **Note:** This alert is triggered only if the `enableMultiArchBootImageImport` |
| 21 | +> feature gate is enabled in the HyperConverged CR. |
| 22 | +
|
| 23 | +## Impact |
| 24 | + |
| 25 | +When this alert is triggered, it means that the DICT is not supported by any of |
| 26 | +the nodes in the cluster. HCO will not populate the SSP CR with this DICT, and |
| 27 | +so this golden image will not be available for use in the cluster. |
| 28 | + |
| 29 | +## Diagnosis |
| 30 | + |
| 31 | +Read the HyperConverged CR: |
| 32 | + |
| 33 | +```bash |
| 34 | +# Get the namespace of the HyperConverged CR |
| 35 | +$ NAMESPACE="$(kubectl get hyperconverged -A --no-headers | awk '{print $1}')" |
| 36 | + |
| 37 | +# Read the HyperConverged CR |
| 38 | +$ kubectl get hyperconverged -n "${NAMESPACE}" -o yaml |
| 39 | +``` |
| 40 | + |
| 41 | +There are a few fields in the HyperConverged CR status that can be used to |
| 42 | +diagnose this issue: |
| 43 | + |
| 44 | +1. The `status.nodeInfo.workloadsArchitectures` field shows the list of architectures |
| 45 | + supported by the cluster. |
| 46 | +2. The `status.dataImportCronTemplates` field shows the list of DICTs that are |
| 47 | + managed by HCO. |
| 48 | + 1. Find the specific DICT object that is triggering this alert by its name, |
| 49 | + as specified in the alert message. Check the DICT's |
| 50 | + `ssp.kubevirt.io/dict.architectures` annotation. Unlike the annotation |
| 51 | + in the spec field, this annotation contains only the architectures that |
| 52 | + are supported by the image **and** by the cluster. |
| 53 | + |
| 54 | + If the annotation is empty, then there is no architecture supported by |
| 55 | + the image and by the cluster. |
| 56 | + 2. The DICT status field will include the `conditions` field, with the |
| 57 | + `Deployed` condition set to `False`, and the `reason` field set to |
| 58 | + `UnsupportedArchitectures`. |
| 59 | + > **Note:** For a DICT with supported architectures, the status |
| 60 | + field will not contain the `conditions` field. |
| 61 | + 3. The DICT's `status.originalSupportedArchitectures` field shows the list of |
| 62 | + architectures supported by the image, as was set in the |
| 63 | + `ssp.kubevirt.io/dict.architectures` annotation in the source DICT. |
| 64 | + |
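The comparison described above can be sketched as a small shell helper. This is only an illustration of the intersection logic, not how HCO implements it; the function name `arch_overlap` is hypothetical, and it assumes both inputs are comma-separated lists without spaces.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of HCO): print the architectures that appear
# in both comma-separated lists, keeping the order of the first list.
arch_overlap() {
  local image_archs="$1" cluster_archs="$2"
  local arch result=""
  local IFS=','
  for arch in $image_archs; do
    # Wrap both sides in commas so that only whole names match.
    case ",${cluster_archs}," in
      *",${arch},"*) result="${result:+${result},}${arch}" ;;
    esac
  done
  printf '%s\n' "$result"
}
```

For example, `arch_overlap "amd64,arm64" "arm64,s390x"` prints `arm64`. An empty result corresponds to the case in which the alert is triggered.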
| 65 | +### Example |
| 66 | + |
| 67 | +```yaml |
| 68 | +apiVersion: hco.kubevirt.io/v1beta1 |
| 69 | +kind: HyperConverged |
| 70 | +... |
| 71 | +status: |
| 72 | + ... |
| 73 | + dataImportCronTemplates: |
| 74 | + - metadata: |
| 75 | + annotations: |
| 76 | + ssp.kubevirt.io/dict.architectures: "" |
| 77 | + name: my-image |
| 78 | + spec: |
| 79 | + ... |
| 80 | + status: |
| 81 | + conditions: |
| 82 | + - message: DataImportCronTemplate has no supported architectures for the current |
| 83 | + cluster |
| 84 | + reason: UnsupportedArchitectures |
| 85 | + status: "False" |
| 86 | + type: Deployed |
| 87 | + originalSupportedArchitectures: someUnsupportedArch,otherUnsupportedArch |
| 88 | +``` |
| 89 | +
|
| 90 | +## Mitigation |
| 91 | +
|
| 92 | +### Pre-defined DataImportCronTemplates |
| 93 | +
|
| 94 | +The pre-defined DICTs are not defined in the `spec.dataImportCronTemplates` |
| 95 | +field in the HyperConverged CR, but they are defined internally in the HCO |
| 96 | +application. |
| 97 | + |
| 98 | +All pre-defined DICTs are annotated with the `ssp.kubevirt.io/dict.architectures` |
| 99 | +annotation, and all of them support the `amd64`, `arm64`, and `s390x` |
| 100 | +architectures. In the unlikely case that the cluster does not support any of |
| 101 | +these architectures, there is no way to use these pre-defined DICTs in the |
| 102 | +cluster. |
| 103 | + |
| 104 | +To mitigate this issue (if adding supported nodes to the cluster is not an |
| 105 | +option), you can either: |
| 106 | + |
| 107 | +1. Disable the pre-defined DICTs in the HyperConverged CR, to turn this alert |
| 108 | + off: |
| 109 | + 1. Find the DICT(s) you want to disable in the HyperConverged CR |
| 110 | + `status.dataImportCronTemplates` field, as described |
| 111 | + [above](#diagnosis). |
| 112 | + 2. Add the DICT to the `spec.dataImportCronTemplates` field in the |
| 113 | + HyperConverged CR. Add the `dataimportcrontemplate.kubevirt.io/enable` |
| 114 | + annotation with the value `false` to the DICT. Only the DICT name and |
| 115 | + the annotation are required in this case. |
| 116 | + |
| 117 | + For example, to disable the `centos-stream10-image-cron` DICT: |
| 118 | + ```yaml |
| 119 | + apiVersion: hco.kubevirt.io/v1beta1 |
| 120 | + kind: HyperConverged |
| 121 | + metadata: |
| 122 | + name: kubevirt-hyperconverged |
| 123 | + spec: |
| 124 | + dataImportCronTemplates: |
| 125 | + - metadata: |
| 126 | + name: centos-stream10-image-cron |
| 127 | + annotations: |
| 128 | + dataimportcrontemplate.kubevirt.io/enable: 'false' |
| 129 | + ``` |
| 130 | +2. If you have a self-built image that is supported by the nodes in |
| 131 | + the cluster, you can modify the pre-defined DICT to use your image: add |
| 132 | + the DICT to the `spec.dataImportCronTemplates` field in the HyperConverged |
| 133 | + CR, and modify its `spec.template.spec.source.registry` field. |
| 134 | + |
| 135 | + > Tip: You can find the pre-defined DICTs in the HyperConverged CR `status.dataImportCronTemplates` |
| 136 | + > field, as described [above](#diagnosis). Then you can copy the DICT from |
| 137 | + > there, and modify it in the HyperConverged CR |
| 138 | + > `spec.dataImportCronTemplates` field. |
| 139 | + |
| 140 | + Don't forget to set the `ssp.kubevirt.io/dict.architectures` annotation to |
| 141 | + include all the architectures supported by your image. |
| 142 | + |
| 143 | + In this case, you'll need to add all the fields of the DICT. |
| 144 | + |
| 145 | + For example: |
| 146 | + ```yaml |
| 147 | + apiVersion: hco.kubevirt.io/v1beta1 |
| 148 | + kind: HyperConverged |
| 149 | + metadata: |
| 150 | + name: kubevirt-hyperconverged |
| 151 | + spec: |
| 152 | + dataImportCronTemplates: |
| 153 | + - metadata: |
| 154 | + annotations: |
| 155 | + cdi.kubevirt.io/storage.bind.immediate.requested: "true" |
| 156 | + ssp.kubevirt.io/dict.architectures: arch1,arch2 |
| 157 | + name: centos-stream10-image-cron |
| 158 | + spec: |
| 159 | + garbageCollect: Outdated |
| 160 | + managedDataSource: centos-stream10 |
| 161 | + schedule: "0 */12 * * *" |
| 162 | + template: |
| 163 | + spec: |
| 164 | + source: |
| 165 | + registry: |
| 166 | + url: docker://your-registry/your-image:latest |
| 167 | + storage: |
| 168 | + resources: |
| 169 | + requests: |
| 170 | + storage: 10Gi |
| 171 | + ``` |
| 172 | + |
| 173 | +### User-defined DataImportCronTemplates |
| 174 | + |
| 175 | +User-defined DICTs are defined in the HyperConverged CR, in the |
| 176 | +`spec.dataImportCronTemplates` field. |
| 177 | + |
| 178 | +First, check what architectures are supported by the image. You can use the |
| 179 | +following command: |
| 180 | + |
| 181 | +```bash |
| 182 | +$ podman manifest inspect your-registry/your-image:latest |
| 183 | +``` |
| 184 | + |
| 185 | +For details, see |
| 186 | +the [podman manifest inspect documentation](https://docs.podman.io/en/latest/markdown/podman-manifest-inspect.1.html). |
| 187 | + |
| 188 | +If the image is a multi-architecture manifest (fat manifest), the output includes the |
| 189 | +`manifests` field, which lists the architectures supported by the image. If |
| 190 | +the image is not a multi-architecture manifest, you need to find out its |
| 191 | +architecture separately. |
| 192 | + |
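As a rough sketch, the architectures can be pulled out of the manifest JSON and joined into a comma-separated list. The function name `manifest_archs` is hypothetical, and the text matching below assumes each entry carries an `"architecture": "<name>"` pair, as in a typical multi-architecture manifest; a JSON-aware tool would be more robust.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read manifest JSON on stdin and print the distinct
# architecture names as a sorted, comma-separated list.
manifest_archs() {
  # Keep only the "architecture": "<name>" pairs, one per line.
  grep -o '"architecture"[[:space:]]*:[[:space:]]*"[^"]*"' \
    | sed 's/.*"\([^"]*\)"$/\1/' \
    | sort -u \
    | paste -sd, -
}
```

It could be used as `podman manifest inspect your-registry/your-image:latest | manifest_archs`. Review the result before copying it into the annotation, because it is derived by plain text matching.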
| 193 | +Then, check that the `ssp.kubevirt.io/dict.architectures` annotation is set |
| 194 | +to the correct value. If not, edit the HyperConverged CR and set the |
| 195 | +annotation to the right value. The format of the annotation is a |
| 196 | +comma-separated list of architectures; e.g., `amd64,arm64,s390x`. |
| 197 | + |
| 198 | +If the image does not support any of the architectures supported by the |
| 199 | +cluster, you will need to either rebuild the image for one or more of |
| 200 | +the architectures supported by the cluster, or remove the DICT from the |
| 201 | +HyperConverged CR. It is also possible to disable the DICT by adding |
| 202 | +the `dataimportcrontemplate.kubevirt.io/enable` annotation with the value |
| 203 | +`false`; for example: |
| 204 | + ```yaml |
| 205 | + apiVersion: hco.kubevirt.io/v1beta1 |
| 206 | + kind: HyperConverged |
| 207 | + metadata: |
| 208 | + name: kubevirt-hyperconverged |
| 209 | + spec: |
| 210 | + dataImportCronTemplates: |
| 211 | + - metadata: |
| 212 | + annotations: |
| 213 | + dataimportcrontemplate.kubevirt.io/enable: "false" |
| 214 | + ssp.kubevirt.io/dict.architectures: unsupported-arch1,unsupported-arch2 |
| 215 | + name: my-image |
| 216 | + spec: |
| 217 | + ... |
| 218 | + ``` |
| 219 | + |
| 220 | +For more information about building multi-architecture images, see the |
| 221 | +[podman documentation](https://docs.podman.io/en/latest/markdown/podman-manifest-create.1.html). |
| 222 | + |
| 223 | +<!--DS: If you cannot resolve the issue, log in to the |
| 224 | +link:https://access.redhat.com[Customer Portal] and open a support case, |
| 225 | +attaching the artifacts gathered during the diagnosis procedure.--> |
| 226 | +<!--USstart--> |
| 227 | +If you cannot resolve the issue, see the following resources: |
| 228 | + |
| 229 | +- [OKD Help](https://okd.io/docs/community/help/) |
| 230 | +- [#virtualization Slack channel](https://kubernetes.slack.com/channels/virtualization) |
| 231 | + |
| 232 | +<!--USend--> |