Commit 3e88121

authored
Merge pull request #90634 from xJustin/OSDOCS-12261-auto-repair-2
OSDOCS-12261 autorepair setting for machine pools
2 parents 098f495 + ac45bef commit 3e88121

5 files changed

+145
-0
lines changed

Diff for: modules/rosa-autorepair-cli.adoc

+89
@@ -0,0 +1,89 @@
1+
// Module included in the following assemblies:
2+
//
3+
// * rosa_cluster_admin/rosa_nodes/rosa-managing-worker-nodes.adoc
4+
// * nodes/rosa-managing-worker-nodes.adoc
5+
6+
7+
:_mod-docs-content-type: PROCEDURE
8+
[id="rosa-autorepair-cli_{context}"]
9+
= Configuring machine pool AutoRepair using the ROSA CLI
10+
11+
You can configure machine pool AutoRepair for your {product-title} cluster by using the ROSA CLI.
12+
13+
14+
.Prerequisites
15+
16+
17+
* You installed and configured the latest AWS (`aws`) and ROSA (`rosa`) CLIs on your workstation.
18+
* You logged in to your Red{nbsp}Hat account by using the `rosa` CLI.
19+
* You created a {hcp-title} cluster.
20+
* You have an existing machine pool.
21+
22+
.Procedure
23+
24+
. List the machine pools in the cluster by running the following command:
25+
+
26+
[source,terminal]
27+
----
28+
$ rosa list machinepools --cluster=<cluster_name>
29+
----
30+
+
31+
.Example output
32+
[source,terminal]
33+
----
34+
ID AUTOSCALING REPLICAS INSTANCE TYPE LABELS TAINTS AVAILABILITY ZONE SUBNET VERSION AUTOREPAIR
35+
workers No 2/2 m5.xlarge us-east-2a subnet-0df2ec3377847164f 4.16.6 Yes
36+
db-nodes-mp No 2/2 m5.xlarge us-east-2a subnet-0df2ec3377847164f 4.16.6 Yes
37+
----
38+
39+
. Enable or disable AutoRepair on a machine pool:
40+
41+
* To disable AutoRepair for a machine pool, run the following command. To enable AutoRepair instead, set the `--autorepair` value to `true`:
42+
+
43+
[source,terminal]
44+
----
45+
$ rosa edit machinepool --cluster=mycluster --machinepool=<machinepool_name> --autorepair=false
46+
----
47+
+
48+
.Example output
49+
[source,terminal]
50+
----
51+
I: Updated machine pool 'machinepool_name' on cluster 'mycluster'
52+
----
53+
54+
55+
.Verification
56+
57+
. Describe the details of the machine pool:
58+
+
59+
[source,terminal]
60+
----
61+
$ rosa describe machinepool --cluster=<cluster_name> --machinepool=<machinepool_name>
62+
----
63+
+
64+
.Example output
65+
[source,terminal]
66+
----
67+
ID: machinepool_name
68+
Cluster ID: <ID_of_cluster>
69+
Autoscaling: No
70+
Desired replicas: 2
71+
Current replicas: 2
72+
Instance type: m5.xlarge
73+
Labels:
74+
Tags:
75+
Taints:
76+
Availability zone: us-east-2a
77+
...
78+
Autorepair: Yes
79+
Tuning configs:
80+
Kubelet configs:
81+
Additional security group IDs:
82+
Node drain grace period:
83+
Management upgrade:
84+
- Type: Replace
85+
- Max surge: 1
86+
- Max unavailable: 0
87+
----
88+
89+
. In the output, verify that the `Autorepair` field shows the expected setting for your machine pool.
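
The check in the verification step can also be scripted. The following is a minimal sketch, not part of the committed module, that reads the text printed by `rosa list machinepools` and reports the AutoRepair setting per machine pool. It assumes the column layout shown in the example output above (`ID` first, `AUTOREPAIR` last); the function name `autorepair_by_pool` is hypothetical, and the column spacing in the sample text is illustrative.

```python
def autorepair_by_pool(listing: str) -> dict[str, bool]:
    """Map each machine pool ID to its AUTOREPAIR value.

    Assumption: the text has the layout of the example output in this
    module, with ID as the first column and AUTOREPAIR as the last.
    """
    lines = [line for line in listing.splitlines() if line.strip()]
    header = lines[0].split()
    if header[0] != "ID" or header[-1] != "AUTOREPAIR":
        raise ValueError("unexpected column layout")
    result = {}
    for row in lines[1:]:
        fields = row.split()
        # Only the first and last fields are used, so empty LABELS or
        # TAINTS columns do not affect the result.
        result[fields[0]] = fields[-1] == "Yes"
    return result


# Sample text copied from the example output; spacing is illustrative.
example = """\
ID           AUTOSCALING  REPLICAS  INSTANCE TYPE  LABELS  TAINTS  AVAILABILITY ZONE  SUBNET                    VERSION  AUTOREPAIR
workers      No           2/2       m5.xlarge                      us-east-2a         subnet-0df2ec3377847164f  4.16.6   Yes
db-nodes-mp  No           2/2       m5.xlarge                      us-east-2a         subnet-0df2ec3377847164f  4.16.6   Yes
"""

print(autorepair_by_pool(example))
```

Reading only the first and last fields keeps the sketch independent of the optional columns in between, but it would need adjusting if a CLI release changes the column order.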

Diff for: modules/rosa-autorepair-ocm.adoc

+31
@@ -0,0 +1,31 @@
1+
// Module included in the following assemblies:
2+
//
3+
// * rosa_cluster_admin/rosa_nodes/rosa-managing-worker-nodes.adoc
4+
// * nodes/rosa-managing-worker-nodes.adoc
5+
6+
7+
:_mod-docs-content-type: PROCEDURE
8+
[id="rosa-autorepair-ocm_{context}"]
9+
= Configuring AutoRepair on a machine pool using {cluster-manager}
10+
11+
You can configure machine pool AutoRepair for your {product-title} cluster by using {cluster-manager-first}.
12+
13+
.Prerequisites
14+
15+
* You created a {hcp-title} cluster.
16+
* You have an existing machine pool.
17+
18+
.Procedure
19+
20+
21+
. Navigate to {cluster-manager-url} and select your cluster.
22+
. Under the *Machine pools* tab, click the Options menu {kebab} for the machine pool that you want to configure AutoRepair for.
23+
. From the menu, select *Edit*.
24+
. From the *Edit Machine Pool* dialog box that displays, find the *AutoRepair* option.
25+
. Select or clear the *AutoRepair* checkbox to enable or disable AutoRepair.
26+
. Click *Save* to apply the change to the machine pool.
27+
28+
.Verification
29+
30+
. Under the *Machine pools* tab, select *>* next to your machine pool to expand the view.
31+
. Verify that your machine pool has the correct *AutoRepair* setting in the expanded view.

Diff for: modules/rosa-configuring-autorepair.adoc

+19
@@ -0,0 +1,19 @@
1+
// Module included in the following assemblies:
2+
//
3+
// * rosa_cluster_admin/rosa_nodes/rosa-managing-worker-nodes.adoc
4+
// * nodes/rosa-managing-worker-nodes.adoc
5+
//
6+
7+
:_mod-docs-content-type: CONCEPT
8+
[id="rosa-configuring-autorepair_{context}"]
9+
= Configuring machine pool AutoRepair
10+
11+
{hcp-title} supports an automatic repair process for machine pools, called AutoRepair. AutoRepair is useful when you want the ROSA service to detect certain unhealthy nodes, drain them, and re-create them. You can disable AutoRepair if unhealthy nodes should not be replaced, such as when the nodes must be preserved. AutoRepair is enabled by default on machine pools.
12+
13+
The AutoRepair process deems a node unhealthy when the node is in the `NotReady` state or in an unknown state for a predefined amount of time (typically 8 minutes). Whenever two or more nodes become unhealthy simultaneously, the AutoRepair process stops repairing the nodes.
14+
Similarly, when a newly created node is still unhealthy after a predefined amount of time (typically 20 minutes), the service repairs the node.
15+
16+
[NOTE]
17+
====
18+
Machine pool AutoRepair is only available for {hcp-title} clusters.
19+
====
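
The health rules described in this module can be illustrated with a small model. The following is a minimal sketch of those rules as stated in the text, not the ROSA service's implementation. The `Node` type, its field names, and the function names are hypothetical. The thresholds use the "typical" values from the text, and treating a node as unhealthy at exactly the threshold is an assumption.

```python
from dataclasses import dataclass

# "Typical" values quoted in the module text; the real values are
# predefined by the service and might differ.
UNHEALTHY_AFTER_MIN = 8
NEW_NODE_TIMEOUT_MIN = 20


@dataclass
class Node:
    name: str
    ready: bool              # False stands for NotReady or an unknown state
    minutes_in_state: float  # how long the node has been in that state
    is_new: bool = False     # True if the node has not been healthy since creation


def is_unhealthy(node: Node) -> bool:
    """Apply the time thresholds described in the text."""
    if node.ready:
        return False
    limit = NEW_NODE_TIMEOUT_MIN if node.is_new else UNHEALTHY_AFTER_MIN
    # Assumption: reaching the threshold counts as unhealthy.
    return node.minutes_in_state >= limit


def nodes_to_repair(nodes: list[Node]) -> list[str]:
    """Return the names of nodes that this model would repair."""
    unhealthy = [node.name for node in nodes if is_unhealthy(node)]
    # Per the text, repair stops when two or more nodes are unhealthy
    # at the same time.
    return unhealthy if len(unhealthy) < 2 else []


pool = [
    Node("node-a", ready=True, minutes_in_state=120),
    Node("node-b", ready=False, minutes_in_state=10),
]
print(nodes_to_repair(pool))
```

In this model, a single node that stays `NotReady` past the threshold is selected for repair, while two such nodes at once produce an empty list, which mirrors the behavior described above.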

Diff for: modules/rosa-create-objects.adoc

+3
@@ -796,6 +796,9 @@ The default value is `0`, meaning that no outdated nodes are removed before new
796796

797797
|--taints
798798
|Taints for the machine pool. This string value should be formatted as a comma-separated list of `key=value:ScheduleType`. This list will overwrite any modifications made to Node taints on an ongoing basis.
799+
800+
|--autorepair
801+
|AutoRepair setting for the machine pool, specified as the Boolean value `true` or `false`.
799802
|===
800803

801804
.Optional arguments inherited from parent commands

Diff for: rosa_cluster_admin/rosa_nodes/rosa-managing-worker-nodes.adoc

+3
@@ -50,6 +50,9 @@ include::modules/rosa-adding-tags-cli.adoc[leveloffset=+2]
5050
include::modules/rosa-adding-taints.adoc[leveloffset=+1]
5151
include::modules/rosa-adding-taints-ocm.adoc[leveloffset=+2]
5252
include::modules/rosa-adding-taints-cli.adoc[leveloffset=+2]
53+
include::modules/rosa-configuring-autorepair.adoc[leveloffset=+1]
54+
include::modules/rosa-autorepair-ocm.adoc[leveloffset=+2]
55+
include::modules/rosa-autorepair-cli.adoc[leveloffset=+2]
5356
5457
ifdef::openshift-rosa-hcp[]
5558
include::modules/rosa-adding-tuning.adoc[leveloffset=+1]
