Commit a5abd1a

thekofimensah, shainaraskas, and georgewallace authored
[DOCS] Rewrite of sizing your shards-rebase (#124444)
* Edits copied over from #120346
* Update docs/reference/how-to/size-your-shards.asciidoc

  Co-authored-by: shainaraskas <[email protected]>
* Improve Scanability

  Co-authored-by: shainaraskas <[email protected]>
* Reduced what is a shard section for concision
* Adjusted title
* Add general and distribution sizing guidelines.

---------

Co-authored-by: shainaraskas <[email protected]>
Co-authored-by: George Wallace <[email protected]>
1 parent 30868d3 commit a5abd1a

1 file changed, +37 -11 lines changed

docs/reference/how-to/size-your-shards.asciidoc

+37 -11
@@ -1,17 +1,40 @@
 [[size-your-shards]]
 == Size your shards
+[discrete]
+[[what-is-a-shard]]
+=== What is a shard?
+
+A shard is a basic unit of storage in {es}. Every index is divided into one or more shards to help distribute data and workload across nodes in a cluster. This division allows {es} to handle large datasets and perform operations like searches and indexing efficiently. For more detailed information on shards, see <<nodes-shards, this page>>.
+
+[discrete]
+[[sizing-shard-guidelines]]
+=== General guidelines
+
+Balancing the number and size of your shards is important for the performance and stability of an {es} cluster:
+
+* Too many shards can degrade search performance and make the cluster unstable. This is referred to as _oversharding_.
+* Very large shards can slow down search operations and prolong recovery times after failures.
+
+To avoid either of these states, implement the following guidelines:
+
+[discrete]
+[[general-sizing-guidelines]]
+==== General sizing guidelines
+
+* Aim for shard sizes between 10GB and 50GB
+* Keep the number of documents on each shard below 200 million
+
+[discrete]
+[[shard-distribution-guidelines]]
+==== Shard distribution guidelines
 
-Each index in {es} is divided into one or more shards, each of which may be
-replicated across multiple nodes to protect against hardware failures. If you
-are using <<data-streams>> then each data stream is backed by a sequence of
-indices. There is a limit to the amount of data you can store on a single node
-so you can increase the capacity of your cluster by adding nodes and increasing
-the number of indices and shards to match. However, each index and shard has
-some overhead and if you divide your data across too many shards then the
-overhead can become overwhelming. A cluster with too many indices or shards is
-said to suffer from _oversharding_. An oversharded cluster will be less
-efficient at responding to searches and in extreme cases it may even become
-unstable.
+To ensure that each node is working optimally, distribute shards evenly across nodes. Uneven distribution can cause some nodes to work harder than others, leading to performance degradation and instability.
+
+While {es} automatically balances shards, you need to configure indices with an appropriate number of shards and replicas to allow for even distribution across nodes.
+
+If you are using <<data-streams>>, each data stream is backed by a sequence of indices, each index potentially having multiple shards.
+
+In addition to these general guidelines, you should develop a tailored <<create-a-sharding-strategy, sharding strategy>> that considers your specific infrastructure, use case, and performance expectations.
 
 [discrete]
 [[create-a-sharding-strategy]]
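
As a reading aid for the sizing guidelines added in the hunk above, here is a minimal Python sketch of how the two rules (10GB to 50GB per shard, fewer than 200 million documents per shard) might translate into a primary shard count. The helper `suggest_primary_shards` and its `target_shard_gb` parameter are hypothetical names chosen for illustration and are not part of {es}. The sketch also assumes documents and data spread evenly across primaries, which may not hold for a real index.

```python
import math

# Guideline values taken from the "General sizing guidelines" in this commit.
MIN_SHARD_GB = 10
MAX_SHARD_GB = 50
MAX_DOCS_PER_SHARD = 200_000_000


def suggest_primary_shards(index_size_gb: float, doc_count: int,
                           target_shard_gb: float = 30.0) -> int:
    """Hypothetical helper: pick a primary shard count for one index.

    Assumes data and documents are spread evenly across primary shards.
    """
    if not MIN_SHARD_GB <= target_shard_gb <= MAX_SHARD_GB:
        raise ValueError("target_shard_gb should stay within the 10GB-50GB guideline")
    if index_size_gb < 0 or doc_count < 0:
        raise ValueError("sizes and counts must be non-negative")

    # Enough shards that each one holds at most target_shard_gb of data.
    by_size = math.ceil(index_size_gb / target_shard_gb)
    # Enough shards that each one stays strictly below the document guideline.
    by_docs = doc_count // MAX_DOCS_PER_SHARD + 1

    return max(1, by_size, by_docs)
```

For example, under these assumptions a 300GB index with one million documents would get 10 primaries from the size rule, while a 20GB index with 900 million small documents would get 5 primaries from the document rule. Treat the result as a starting point for the tailored sharding strategy the text recommends, not as a replacement for it.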
@@ -208,6 +231,7 @@ index can be <<indices-delete-index,removed>>. You may then consider setting
 <<indices-add-alias,Create Alias>> against the destination index for the source
 index's name to point to it for continuity.
 
+See this https://www.youtube.com/watch?v=sHyNYnwbYro[fixing shard sizes video] for an example troubleshooting walkthrough.
 
 [discrete]
 [[shard-count-recommendation]]
@@ -571,6 +595,8 @@ PUT _cluster/settings
 }
 ----
 
+See this https://www.youtube.com/watch?v=tZKbDegt4-M[fixing "max shards open" video] for an example troubleshooting walkthrough. For more information, see <<troubleshooting-shards-capacity-issues,Troubleshooting shards capacity>>.
+
 [discrete]
 [[troubleshooting-max-docs-limit]]
 ==== Number of documents in the shard cannot exceed [2147483519]
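
The limit quoted in the heading above, 2147483519, equals 2^31 - 1 - 128, which is roughly ten times the 200 million documents-per-shard guideline added earlier in this commit. The following Python sketch shows one way a monitoring script might classify a shard's document count against both numbers. The function name and the status labels are hypothetical, and the per-shard counts are assumed to come from whatever stats source you already use.

```python
# Hard limit quoted in the error message heading of the diff.
MAX_DOCS_LIMIT = 2_147_483_519
# Soft guideline from the "General sizing guidelines" in this commit.
DOCS_GUIDELINE = 200_000_000


def classify_shard_docs(doc_count: int) -> str:
    """Hypothetical helper: label a shard by its document count."""
    if doc_count < 0:
        raise ValueError("doc_count must be non-negative")
    if doc_count >= MAX_DOCS_LIMIT:
        # The shard has reached the hard limit named in the error message.
        return "at-limit"
    if doc_count >= DOCS_GUIDELINE:
        # Still below the hard limit, but past the recommended guideline.
        return "above-guideline"
    return "ok"
```

Flagging shards at the `above-guideline` stage leaves time to adjust the shard count long before the hard limit is reached.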
