-
Notifications
You must be signed in to change notification settings - Fork 290
✨ introduce preprovisioningKernelParams to BMH #2576
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
/retest |
078696a to
88beeae
Compare
|
/ok-to-test |
88beeae to
5322972
Compare
|
@dtantsur this is the initial idea based on the proposal, I am testing it manually , so far it worked I think I will add the e2e in a separate PR, but I think I need to add some unit testing still to this PR. |
|
/cc @dtantsur |
5322972 to
4516861
Compare
This commit: - Implements a new spec field of the BMH resource and implements related reconciliation logic in the bare metal host controller - Extends Ironic node update to servicing and preparing states too - Renames getInstanceUpdateOpts to getProvisioningInstanceUpdateOptsForNode to provide a more explicit name for the function and make it consistent with other UpdateOptsForNode type functions Notes: - The new spec field is optional and not using it should provide the same UX as older BMO versions - Kernel pre provisioning arguments can't be changed when a node transitions from inspected (available) state to provisioning state if Ironic is configured to fast-track deployments. Signed-off-by: Adam Rozman <[email protected]>
4516861 to
7ae19c9
Compare
This commit: - Combines the extra kernel params from both BMH and the PreprovisioningImage CRs. Signed-off-by: Adam Rozman <[email protected]>
This commit: - Makes the deprovisioning process able to update the extra kernel parameter field of the ironic node's driver_info before the actual deprovisioning would start. This feature can come handy when the user needs different kernel options during e.g. automated node cleaning compared to inspection. Signed-off-by: Adam Rozman <[email protected]>
|
I think the essential parts are there, I will push an e2e related PR as a followup but this is long enough as it is. Commits will be squashed if folks are okay with the implementation, but I will keep them separate for the time being. |
|
/cc @kashifest |
|
If I'm not mistaken, passing kargs to ironic only works when supplying a kernel+ramdisk (not a prebuilt ISO) for the IPA image. |
It is available, this feature also integrates the kernel params passed via the PreprovisioningImage. |
|
Note that the documentation for the When booting an initrd+kernel image, ironic either has to put the kernel args into the iPXE config or (for virtualmedia) it builds a custom ISO containing the initrd+kernel+args. But when an ISO is supplied that is not built by ironic, there is no way for it to modify the kernel args. There's no standard place in an ISO where they are stored. So the only way to append to them is to pass them to the PreprovisioningImage controller to build into the ISO at the time it creates it. |
Yes that is correct and I agree with you, this feature takes into consideration what you have noted. The reason why I am concatenating the kernel params coming from BMH and PreprovImage is because that way folks who have customized ironics e.g. with custom deploy step that could in theory do whatever with the images would be supported also, so if someone would have a feature to repackage ISOs or would do some sort of magic in their own preprov image controller , they could also take advantage of the BMH preprovExtraKernelArgs param, for others it is just a noop. |
|
AIUI this implementation correctly handles the case where the PreprovisioningImage's |
Sorry for the late reply I was not working the last two weeks. I don't think I am touching any IPA format related field or at least not implementing new control logic. What you have mentioned should be operating logic with this feature too. The only thing I aim to do related to PreprovisioningImage CR is to concatenate the the kernel params provided by PreprovisioningImage CR and BMH CR. In case of the virtualmedia pre-built ISOs the new logic would mean a NO-OP from Ironic side because after BMO updates the preprov kernel params related Ironic node API endpoint with the concatenated params Ironic will disregard it. Ironic will disregard preprov kernel params coming from API or from config file or env variable in case a pre-built IPA ISO is used regardless if it was provided by a controller or "manually". I might be misunderstanding the problem, in any case my intention is to not get in the way of PreprovisioningImage controller. IMO this feature could land, I propose to put this behind a feature gate, mark it as alpha or beta, I have not used feature gates so far so I have to look and ask around, please wait with the RC until I add the gate. |
|
v0.11-rc.0 will go out without this anyhow so most likely this wont get merged to 0.11 but anyhow it would be nice to come to a consensus soon. |
I would say lets take it in even after the .rc if we have a consensus before the actual minor |
|
To make this feature work universally we need to add an extraKernelParams field to the Spec of PreprovisioningImage. When a PreprovisioningImage is used:
And when a PreprovisioningImage is not used, the BMH controller passes the preprovisioningExtraKernelParams from the BMH directly to ironic, the same as in the PR currently. I don't think this is super complicated to implement. The PreprovisioningImage controller side of things is +4 lines of code. The BMH controller side is +1/-6 lines of code. Let's go ahead and do it now rather than set ourselves up for inconsistent behaviour between image types. |
I understand your approach, I also don't think it is super complicated, I am not objecting to implementing this, if you are okay with BMH dictating extra kernel params to the PreprovisioningImage I am fine with it. On the other hand I would like to clarify the intended use of the PreprovisioningImage controller because with the changes you are proposing I feel that we would provide a parallel way of specifying extraKernel params as there was such a way for PreprovImages already AFAIK. I am fine with that but I would like to make sure we are on the same page. The reason why I have not yet implemented what you are asking for is that I have seen the extraKernelParams status field here https://doc.crds.dev/github.com/metal3-io/baremetal-operator/metal3.io/PreprovisioningImage/[email protected]#status-extraKernelParams and checked https://github.com/openshift/image-customization-controller . So my assumption was that the desired behavior from UX/developer perspective was that the "external co-controller/image provider" implementation would (and they do) facilitate a way to specify extraKernelParams used during preprov "image building" and then it is reported by the co-controller/image provider into the status field of the PreprovImage CR and then BMH would read that status field. So in the current way of using the PreprovImage CR , it is the PreprovImage CR's status field that dictates extraKernelParams to BMH and that field is populated by external means. I have considered it redundant or potentially confusing for users to implement an extraKernelParams status field and a spec field the same time and to present the same information from to different "directions" that belongs to two different approaches of supplying this data. That said, we can leave it to the external controller/image provider to reconcile the extra kernel params if it would be coming from a new spec field of the PreprovisioningImage CR if you say we should extend this "interface" between the image providers and BMH. I am not an active user of the PreprovisioningImage CR and related controllers so I was just worried that I would confuse users and devs or introduce silent incompatibilities. Many of the image providers are not even FOSS and they are used to implement custom preprov image handling soo it is hard for me to estimate the effect of the new spec field. I also find the approach of BMH dictating extraKernelParams to the PreprovImage CR the more consistent and logical way same as you have explained, but I was deterred by the involvement of the external controllers and the fact that they are reporting external kernel params already. |
zaneb
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this patch could be far, far simpler if we stopped doing the same work 2-3 times per reconcile.
| HardwareProfile profile.Profile | ||
| RootDeviceHints *metal3api.RootDeviceHints | ||
| CustomDeploy *metal3api.CustomDeploy | ||
| PreprovisioningExtraKernelParams string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even from the naming alone here it doesn't make sense to me that ProvisionData would ever need to contain any data about PreProvisioning.
| } | ||
| deployImageInfo[deployRamdiskKey] = hostImage.ImageURL | ||
| // Extra params from the kernel have been already posted to | ||
| // Ironic node API in the Register function, so extra kernel args in driver info need |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
er, this is the Register function.
| driverOpts := clients.UpdateOptsData{ | ||
| "kernel_append_params": fmtPreprovExtraKernParams(data.PreprovisioningExtraKernelParams), | ||
| } | ||
| updater.SetDriverInfoOpts(driverOpts, ironicNode) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't need to set this again when it was already set in Register(). (And we definitely shouldn't set it 4 different times for 4 different provisioning states we might be in.) We can get rid of most of the changes to the provisioner API as well (only the ManagementAccessData needs to change).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I update it in so many places because I wanted to make it changeable not used only for initial deployment but I want to allow changing the kernel parameters before the LCM operations. Before cleaning, after registration, before deployment before servicing etc.
And I know this is a limitation for pre-built IPA ISOs or LiveISO boots, but for dynamically built IPA ISO or PXE it is very useful for debugging.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Register() is already called in all of those places: https://github.com/metal3-io/baremetal-operator/blob/main/internal/controller/metal3.io/host_state_machine.go#L369
| "ilo_username": bmcCreds.Username, | ||
| "ilo_password": bmcCreds.Password, | ||
| "ilo_address": a.hostname, | ||
| "kernel_append_params": preProvExtraKernParams, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is not driver-specific (which certainly appears to be the case) then I don't feel like adding it to every driver is the right thing to do. You're already setting kernel_append_params later in the Register() call as part of setDeployParams(), there is no need to do it twice. So we can get rid of the change to the DriverInfo() API as well.
| CurrentImage: getCurrentImage(info.host), | ||
| PreprovisioningImage: preprovImg, | ||
| PreprovisioningNetworkData: preprovisioningNetworkData, | ||
| PreprovisioningExtraKernelParams: r.retrievePreprovisioningExtraKernelParamsSpec(info, prov), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this is actually the only function that we need to pass PreprovisioningExtraKernelParams to (see other comments), so once the others are removed we should pass preprovImg as a param here instead of calling getPreprovImage() a second time.
| updater := clients.UpdateOptsBuilder(p.log) | ||
|
|
||
| deployImageInfo := setDeployImage(p.config, bmcAccess, data.PreprovisioningImage) | ||
| deployImageInfo := setDeployImage(p.config, bmcAccess, data.PreprovisioningImage, data) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: either pass data.PreprovisioningImage and data.PreprovisioningExtraKernelParams or just data. Don't pass data and one subfield of it.
| comExtraKernelParams += " " + data.PreprovisioningExtraKernelParams | ||
| } | ||
| if trimPreprovImgKParams != "" { | ||
| comExtraKernelParams += " " + hostImage.ExtraKernelParams |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here you are adding the kargs supplied by the PreprovisioningImage to the ones passed in the BMH Spec.
But you already did that in the BMH controller (which is the correct place to do it - this shouldn't be provisioner-specific). So the ones from the PreprovisioningImage will be duplicated.
| CurrentImage *metal3api.Image | ||
| PreprovisioningImage *PreprovisioningImage | ||
| PreprovisioningNetworkData string | ||
| PreprovisioningExtraKernelParams string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a bit unfortunate that we can't use the ExtraKernelParams field inside PreprovisioningImage.GeneratedImage, but when no image is set then PreprovisioningImage has to be nil, and I don't see a way around that.
I wonder if we should stop inheriting from the imageprovider.GeneratedFormat struct - which we were only really using for convenience - here though, so that we don't have two copies of the field with different values.
Unfortunately the domain we are modelling is inherently confusing: for an ISO image the user-supplied params must be an input to the PreprovisioningImage so they can be built into the ISO, and for an initrd image the image-supplied params must be an output from the PreprovisioningImage so that ironic can add them to the PXE command line. We could try to give them different names to disambiguate (but unfortunately the one that really needs to change is the one that already exists and can't be changed without breaking APIs). But the only other real choice we have to make is whether we add the user-supplied params to the image-supplied params inside the preprovisioningimage controller or inside the bmh controller. You could make an argument either way. I figured the API made more sense if the user-supplied params were always flowing through the PreprovisioningImage (and getting added to along the way). But if it were done in the BMH controller then no changes would be required to existing PreprovisioningImage controllers to get this to work, so there's a case for that too. |
Okay, I see this is a solid idea for the target workflow, it makes sense for me. I was initally surprised that the extraKernelParams spec was missing from the PreprovisioningImage for the exact same reason. Now that I understand the intended way of directing the data I will modify it accordingly to your proposal (or at least according to my understanding of your proposal). I will get back with the changes, and big thanks for the very useful review! I will reply to the comments related to implementation details, reconciliation, API etc over time, I will need to check why I did some stuff the way I did as I have implemented this some time ago. |
|
PR needs rebase. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
This commit:
Notes:
I think this PR counts as ✨ because although the communication between BMO and the provisioner does change and there could be a +3 increase in the number of update operations, the new functionality is optional.
Related design PR: metal3-io/metal3-docs#552