# Clarify replica shard allocation step in rolling restart procedure (#1246)
Updates the restart documentation to specify that setting
`cluster.routing.allocation.enable` to `primaries` disables
**_replica_** shard allocation, not all shard allocation.
The phrase “disable shard allocation” seems to be a colloquial
shorthand. Those experienced with Elasticsearch may implicitly
understand it refers to replica shards during restarts, but this can be
unclear to newer users.
Historically (and TIL 💡), the restart procedures
([doc](https://www.elastic.co/guide/en/elasticsearch/reference/6.6/rolling-upgrades.html))
used the `none` setting to fully disable shard allocation, which matched
the “disable shard allocation” phrasing. In version 6.7, the
recommended setting changed to `primaries`, allowing primary shard
allocation while avoiding unnecessary replica movement. However, the
descriptive text was not updated to reflect this change, leaving behind
a vestigial phrase from the earlier behavior.
Clarifying this language aligns the documentation with the actual
behavior of the example API calls we provide and reduces the risk of
misinterpretation (e.g., mistakenly using `none` instead of
`primaries`).
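For reference, the settings calls the procedure walks through look like this (a sketch; the full request bodies in the doc's `console` snippets are truncated in the diff). First, before shutting down data nodes, restrict allocation to primaries so replicas are not needlessly relocated:

```console
PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.enable": "primaries"
  }
}
```

Then, once the restarted nodes have rejoined and recovered their primary shards, restore the setting to its default (`all`) by removing it, which re-enables replica allocation:

```console
PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.enable": null
  }
}
```

Setting a persistent cluster setting to `null` removes it, so the cluster falls back to the default rather than pinning an explicit value.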
`deploy-manage/maintenance/start-stop-services/full-cluster-restart-rolling-restart-procedures.md` (+9 −9)

````diff
@@ -17,8 +17,8 @@ Nodes exceeding the low watermark threshold will be slow to restart. Reduce the
 
 ## Full-cluster restart [restart-cluster-full]
 
-1. **Disable shard allocation.**
-   When you shut down a data node, the allocation process waits for `index.unassigned.node_left.delayed_timeout` (by default, one minute) before starting to replicate the shards on that node to other nodes in the cluster, which can involve a lot of I/O. Since the node is shortly going to be restarted, this I/O is unnecessary. You can avoid racing the clock by [disabling allocation](elasticsearch://reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#cluster-routing-allocation-enable) of replicas before shutting down [data nodes](../../distributed-architecture/clusters-nodes-shards/node-roles.md#data-node-role):
+1. **Disable replica shard allocation.**
+   When you shut down a data node, the allocation process waits for `index.unassigned.node_left.delayed_timeout` (by default, one minute) before starting to replicate the shards on that node to other nodes in the cluster, which can involve a lot of I/O. Since the node is shortly going to be restarted, this I/O is unnecessary. You can avoid racing the clock by [disabling allocation of replicas](elasticsearch://reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#cluster-routing-allocation-enable) before shutting down [data nodes](../../distributed-architecture/clusters-nodes-shards/node-roles.md#data-node-role):
 
 ```console
 PUT _cluster/settings
@@ -91,8 +91,8 @@ Nodes exceeding the low watermark threshold will be slow to restart. Reduce the
 When a node joins the cluster, it begins to recover any primary shards that are stored locally. The [`_cat/health`](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-cat-health) API initially reports a `status` of `red`, indicating that not all primary shards have been allocated.
 Once a node recovers its local shards, the cluster `status` switches to `yellow`, indicating that all primary shards have been recovered, but not all replica shards are allocated. This is to be expected because you have not yet re-enabled allocation. Delaying the allocation of replicas until all nodes are `yellow` allows the master to allocate replicas to nodes that already have local shard copies.
 
-8. **Re-enable allocation.**
-   When all nodes have joined the cluster and recovered their primary shards, re-enable allocation by restoring `cluster.routing.allocation.enable` to its default:
+8. **Re-enable replica shard allocation.**
+   When all nodes have joined the cluster and recovered their primary shards, re-enable replica allocation by restoring `cluster.routing.allocation.enable` to its default:
 
 ```console
 PUT _cluster/settings
@@ -103,7 +103,7 @@ Nodes exceeding the low watermark threshold will be slow to restart. Reduce the
 }
 ```
 
-Once allocation is re-enabled, the cluster starts allocating replica shards to the data nodes. At this point it is safe to resume indexing and searching, but your cluster will recover more quickly if you can wait until all primary and replica shards have been successfully allocated and the status of all nodes is `green`.
+Once replica allocation is re-enabled, the cluster starts allocating replica shards to the data nodes. At this point it is safe to resume indexing and searching, but your cluster will recover more quickly if you can wait until all primary and replica shards have been successfully allocated and the status of all nodes is `green`.
 
 You can monitor progress with the [`_cat/health`](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-cat-health) and [`_cat/recovery`](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-cat-recovery) APIs:
 
 ```console
@@ -123,8 +123,8 @@ Nodes exceeding the low watermark threshold will be slow to restart. Reduce the
 
 ## Rolling restart [restart-cluster-rolling]
 
-1. **Disable shard allocation.**
-   When you shut down a data node, the allocation process waits for `index.unassigned.node_left.delayed_timeout` (by default, one minute) before starting to replicate the shards on that node to other nodes in the cluster, which can involve a lot of I/O. Since the node is shortly going to be restarted, this I/O is unnecessary. You can avoid racing the clock by [disabling allocation](elasticsearch://reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#cluster-routing-allocation-enable) of replicas before shutting down [data nodes](../../distributed-architecture/clusters-nodes-shards/node-roles.md#data-node-role):
+1. **Disable replica shard allocation.**
+   When you shut down a data node, the allocation process waits for `index.unassigned.node_left.delayed_timeout` (by default, one minute) before starting to replicate the shards on that node to other nodes in the cluster, which can involve a lot of I/O. Since the node is shortly going to be restarted, this I/O is unnecessary. You can avoid racing the clock by [disabling allocation of replicas](elasticsearch://reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#cluster-routing-allocation-enable) before shutting down [data nodes](../../distributed-architecture/clusters-nodes-shards/node-roles.md#data-node-role):
 
 ```console
 PUT _cluster/settings
@@ -187,8 +187,8 @@ Nodes exceeding the low watermark threshold will be slow to restart. Reduce the
 GET _cat/nodes
 ```
 
-7. **Reenable shard allocation.**
-   For data nodes, once the node has joined the cluster, remove the `cluster.routing.allocation.enable` setting to enable shard allocation and start using the node:
+7. **Re-enable replica shard allocation.**
+   For data nodes, once the node has joined the cluster, remove the `cluster.routing.allocation.enable` setting to enable replica shard allocation and start using the node:
````