Description
What would you like to be added?
Currently, the NicFirmwareTemplate resource allows automatic firmware updates by pointing spec.template.nicFirmwareSourceRef to a valid firmware source and setting spec.template.updatePolicy to "Update". However, the check for whether a firmware update is needed is driven by a separate ConfigMap (nic-configuration-operator-supported-nic-firmware) that the user likely isn't aware of, and that ConfigMap isn't connected to the firmware that actually gets applied from the configured firmware source. As a result, you can end up in an endless firmware update loop if the firmware BFB installs a different version than the one defined in the ConfigMap.
Allowing the user to specify the intended firmware version directly in the NicFirmwareTemplate would ensure that the installed firmware version is checked against this value (which we can assume matches the firmware version in the source BFB) and would prevent the endless loop. For example:
apiVersion: configuration.net.nvidia.com/v1alpha1
kind: NicFirmwareTemplate
metadata:
  name: bf3-supernic-firmware
  namespace: nvidia-network-operator
spec:
  nodeSelector: {}
  nicSelector:
    nicType: "a2dc"
    pciAddresses:
      - "0000:7a:00.0"
  template:
    nicFirmwareSourceRef: bf3-firmware-310-82-25-07
    nicFirmwareVersion: 32.46.3048
    updatePolicy: Update
Deviating from the officially supported firmware versions defined in the ConfigMap should be reported merely as a warning, not treated as a reason to update.
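To make the requested behavior concrete, here is a minimal sketch of the decision logic in Python. It is not the operator's actual code: the function name `decide_update`, the `Decision` type, the representation of the ConfigMap contents as a plain set of version strings, and the version `32.45.1020` used in the usage note are all assumptions made for illustration. The fallback branch is only a guess at how the existing ConfigMap-based check behaves.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class Decision:
    needs_update: bool
    warning: Optional[str] = None


def decide_update(
    installed: str,
    template_version: Optional[str],
    supported_versions: Set[str],
) -> Decision:
    """Hypothetical update check for a single NIC.

    installed:          firmware version currently reported by the NIC
    template_version:   proposed spec.template.nicFirmwareVersion (None if unset)
    supported_versions: versions listed in the supported-firmware ConfigMap
                        (modeled here as a simple set; an assumption)
    """
    if template_version is not None:
        # Proposed behavior: the template value is the only update target.
        warning = None
        if template_version not in supported_versions:
            # Deviation from the ConfigMap is reported, but does not
            # influence whether an update is triggered.
            warning = (
                f"firmware {template_version} is not listed as officially supported"
            )
        return Decision(needs_update=(installed != template_version), warning=warning)

    # Assumed fallback when the field is unset: check against the ConfigMap.
    return Decision(needs_update=(installed not in supported_versions))
```

With this logic, a NIC that already runs 32.46.3048 under the template above would yield `needs_update=False` plus a warning if the ConfigMap only lists some other version (say, a hypothetical 32.45.1020), so the update is not retriggered on every reconcile.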
What is the use case for this feature / enhancement?
I would like to transition from the host-based firmware flashing method (currently implemented through scripting) to letting this operator handle it, especially since the operator is better at updating multiple NICs in parallel. For that to be viable, though, the update logic can't depend on a ConfigMap that most users wouldn't be aware of.