OCPNODE-3029: WIP: handle required minimum kubelet version featuregate rollout #4929
base: main
Conversation
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: haircommander. The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files. Approvers can indicate their approval by writing /approve in a comment.
some background: openshift/enhancements#1766
@haircommander: This pull request references OCPNODE-3029 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the spike to target the "4.19.0" version, but no target version was set.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.
```go
	}
	go ctrl.writebackMinimumKubeletVersionIfAppropriate(updatedPools, renderedVersions, nodeConfig, func() ([]*mcfgv1.MachineConfigPool, error) {
```
any particular reason this is a separate goroutine?
I thought it could be async in case it takes a while for the MCPs to roll back, and I didn't know what could be blocked by having it run synchronously.
I think a better solution might be to push this onto one of the work queues by doing something like this:
```go
func (ctrl *Controller) syncKubeletConfig(key string) error {
	// Key lookup stuff above here.

	// Here, we detect that we need to do this for the current kubeletconfig, so we just kick that off.
	if ctrl.writeMinimumKubeletVersion[kubeletCfg.Name] {
		defer func() {
			delete(ctrl.writeMinimumKubeletVersion, kubeletCfg.Name)
		}()
		return ctrl.writebackMinimumKubeletVersionIfAppropriate(...)
	}

	//
	// Bulk of function above here.
	//

	if ctrl.isMinimumKubeletVersionWritebackNeeded(...) {
		// Here, we update our internal state to indicate that we need to perform this action
		// and enqueue the action.
		ctrl.writeMinimumKubeletVersion[kubeletCfg.Name] = true
		return ctrl.enqueue(kubeletCfg)
	}
	// End of function
}
```
We'll probably want to use a sync.Map, since the work queue has multiple workers that could mutate the controller struct state at any given time.
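For reference, a minimal sketch of what that sync.Map bookkeeping could look like. Only the writeMinimumKubeletVersion name comes from the suggestion above; the package name and helper methods are illustrative, not from the PR:

```go
package kubeletconfig // package name assumed for illustration

import "sync"

// Controller is a stand-in for the real controller struct; only the
// writeMinimumKubeletVersion field is taken from the suggestion above.
type Controller struct {
	// Tracks kubeletconfigs that still need the minimum-version writeback.
	// sync.Map is safe for the multiple queue workers that may touch it
	// concurrently.
	writeMinimumKubeletVersion sync.Map // effectively map[string]bool
}

// markWritebackNeeded records that the named kubeletconfig needs a
// writeback on its next sync.
func (ctrl *Controller) markWritebackNeeded(name string) {
	ctrl.writeMinimumKubeletVersion.Store(name, true)
}

// consumeWritebackNeeded reports whether a writeback was pending for name
// and clears the flag in one atomic step, so two workers cannot both
// perform the writeback for the same kubeletconfig.
func (ctrl *Controller) consumeWritebackNeeded(name string) bool {
	_, pending := ctrl.writeMinimumKubeletVersion.LoadAndDelete(name)
	return pending
}
```

LoadAndDelete gives the check-and-clear in one call, which avoids the race a plain map (or separate Load then Delete) would have between workers.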
```go
	if node.Spec.MinimumKubeletVersion == node.Status.MinimumKubeletVersion ||
		node.Status.MinimumKubeletVersion == renderedKubeletVersion ||
		node.Spec.MinimumKubeletVersion != renderedKubeletVersion {
		klog.InfoS("Skipping writeback to nodes.config.Status.MinimumKubeletVersion because situation not correct",
```
question: why are these conditions not correct? isn't this first one just saying that the spec and status match what we expect?
For the first: if the status matches the spec, there's no need to do the update (though I probably should check renderedKubeletVersion there too).
For the second: if the status already matches the rendered version, then we've already done the writeback for this rendered version.
For the third: we're rendering a different version than what is set in the spec.
I think I have to rework this condition, though.
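Restating those three skip cases as a standalone predicate, purely as a sketch (the helper name is hypothetical; nodeSpec and nodeStatus stand in for node.Spec.MinimumKubeletVersion and node.Status.MinimumKubeletVersion from the diff above):

```go
// skipWriteback mirrors the condition in the diff above and returns true
// when the writeback should be skipped.
func skipWriteback(nodeSpec, nodeStatus, renderedKubeletVersion string) bool {
	switch {
	case nodeSpec == nodeStatus:
		// Status already matches spec: nothing to update.
		return true
	case nodeStatus == renderedKubeletVersion:
		// The writeback for this rendered version already happened.
		return true
	case nodeSpec != renderedKubeletVersion:
		// A different version is rendered than what the spec requests;
		// writing back now would record a stale value.
		return true
	default:
		return false
	}
}
```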
```go
	for _, mcp := range mcps {
		if mcp.Status.UpdatedMachineCount > 0 {
			poolsUpdated[mcp.Name] = true
		} else if _, updated := poolsUpdated[mcp.Name]; !updated {
```
I'm not sure I understand the logic here. The if condition checks whether at least one node is updated, and if not, whether a previous loop iteration had marked the pool as updated? Wouldn't it be easier to check whether the MCP itself is updated directly, and then break out of the loop if any pool doesn't meet either condition?
Note that we should be careful of pools with no nodes in them, which I think would be covered if we copy the server check. A sketch of that alternative follows.
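A sketch of the direct, fully-updated per-pool check suggested above. The helper is hypothetical; the mcfgv1 status fields are from the MachineConfigPool API, but the MCO may need a different condition to stay consistent with the MCS serving logic described in this PR (which treats a pool as on the new config once one node has updated):

```go
package kubeletconfig // package name assumed for illustration

import mcfgv1 "github.com/openshift/api/machineconfiguration/v1"

// allPoolsUpdated reports whether every pool has all of its machines on
// the new rendered config.
func allPoolsUpdated(mcps []*mcfgv1.MachineConfigPool) bool {
	for _, mcp := range mcps {
		if mcp.Status.MachineCount == 0 {
			// A pool with no nodes has nothing to roll out; treat it as
			// updated rather than blocking the writeback forever.
			continue
		}
		if mcp.Status.UpdatedMachineCount != mcp.Status.MachineCount {
			return false
		}
	}
	return true
}
```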
When a cluster admin rolls out a minimum kubelet version, the cluster-config-operator will (eventually) render a new set of featuregates based on the new minimum version (enabling ones that are safe given the new minimum). However, on a rollback, we need to make sure the MCS won't return a rendered config using the old features, in the unlikely case a cluster admin creates a new node running an older version and somehow overrides the osImage.
Thus, we use a condition similar to the one the MCS uses to serve the config: once one node has updated, we treat the MCP as using the new config. Then, we write the status to the node config object so the node authorization plugin can allow older nodes that are now deemed safe.
Signed-off-by: Peter Hunt <[email protected]>
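A rough sketch of that writeback step. This is hypothetical: Status.MinimumKubeletVersion on nodes.config.openshift.io is the field this PR introduces, and the client wiring here is illustrative, not lifted from the PR:

```go
package kubeletconfig // package name assumed for illustration

import (
	"context"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"

	configclientv1 "github.com/openshift/client-go/config/clientset/versioned/typed/config/v1"
)

// writebackMinimumKubeletVersion copies the rendered minimum kubelet
// version into the cluster-scoped nodes.config.openshift.io object's
// status, so the node authorizer can admit older kubelets that are now
// deemed safe.
func writebackMinimumKubeletVersion(ctx context.Context, client configclientv1.NodeInterface, renderedVersion string) error {
	node, err := client.Get(ctx, "cluster", metav1.GetOptions{})
	if err != nil {
		return err
	}
	if node.Status.MinimumKubeletVersion == renderedVersion {
		// Already written back for this rendered version.
		return nil
	}
	node.Status.MinimumKubeletVersion = renderedVersion
	_, err = client.UpdateStatus(ctx, node, metav1.UpdateOptions{})
	return err
}
```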
@haircommander: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
- What I did
When a cluster admin rolls out a minimum kubelet version, the cluster-config-operator will (eventually) render a new set of featuregates based on the new minimum version (enabling ones that are safe given the new minimum).
- How to verify it
- Description for the changelog