-
Notifications
You must be signed in to change notification settings - Fork 1.8k
OCPBUGS-48479: Adding MHC exception to Pausing MHC cluster update … #91781
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
@obrown1205: This pull request references Jira Issue OCPBUGS-48479, which is invalid:
Comment The bug has been updated to refer to the pull request using the external bug tracker. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
🤖 Tue Apr 08 14:32:42 - Prow CI generated the docs preview: https://91781--ocpdocs-pr.netlify.app/openshift-enterprise/latest/disconnected/updating/disconnected-update.html |
|
||
[NOTE] | ||
==== | ||
As an exception, you do not have to pause the `machine-api-termination-handler` resource, as it does not deploy a new node if the node has been flagged as `notReady`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
instead of naming a particular MachineHealthCheck, maybe call out the property of the MHC that means you don't need to pause it? In this case, I expect it's using the fatal Terminating
as its unhealthyConditions
, so something like:
As an exception, you do not have to pause MachineHealthChecks where the
unhealthyConditions
is a fatal condition likeTerminating
beingTrue
. Those Machines are toast anyway, so no use waiting before reaping.
or whatever the public-docs analog of that is?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Basically LGTM, but I'd like to see how to address @wking 's question.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense :) TY!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@wking how about:
"Some MachineHealthChecks might not need to be paused. If your MachineHealthCheck (MHC) resource has a fatal condition met, new nodes cannot be deployed, and pausing that MHC is unnecessary."
ee39a44
to
26ea783
Compare
26ea783
to
ed84141
Compare
|
||
[NOTE] | ||
==== | ||
Some MachineHealthChecks might not need to be paused. If your MachineHealthCheck (MHC) resource has a fatal condition met, new nodes cannot be deployed, and pausing that MHC is unnecessary. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤖 [error] RedHat.TermsErrors: Use 'unrecoverable' rather than 'fatal'. For more information, see RedHat.TermsErrors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: maybe just drop "new nodes cannot be deployed", because it's not about new Nodes watched by the MHC, it's about the Node that the MHC is sad about being terminal. Maybe something like:
If your MachineHealthCheck (MHC) resource relies on unrecoverable conditions, pausing that MHC is unnecessary.
@obrown1205: all tests passed! Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
…step
Updating to add an exception note for one of the MachineHealthCheck resources that does not need to be paused during this step.
Version(s):
4.14+
Issue:
OCPBUGS-48479
Link to docs preview:
https://91781--ocpdocs-pr.netlify.app/openshift-enterprise/latest/disconnected/updating/disconnected-update.html
https://91781--ocpdocs-pr.netlify.app/openshift-enterprise/latest/updating/updating_a_cluster/updating-cluster-cli.html
QE review:
Additional information: