Introduction
To understand the context of this event, we first need to review the different settings that govern the allowed number of shards per node. It is worth noting that both primary and replica shards count towards the configured limit, whether or not they are assigned to a data node. Only shards of closed indices are not taken into account.
First, there is a cluster-wide configuration setting named `cluster.max_shards_per_node`. It doesn’t enforce a specific number of shards per node, but instead caps how many shards can be created in the whole cluster given the number of hot/warm/cold/content data nodes available (i.e. number_of_data_nodes * cluster.max_shards_per_node). Its default value is 1000 shards per hot/warm/cold/content data node for normal (non-frozen) indices. Similarly, there is a sibling setting for frozen data nodes named `cluster.max_shards_per_node.frozen`, whose default value is 3000 shards of frozen indices per frozen data node. For example, a cluster with 6 non-frozen data nodes and 1 frozen data node is allowed to contain 6000 shards on the nodes in the non-frozen tiers and 3000 shards on the node in the frozen tier.
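As a quick sanity check, you can compare these limits with the current shard count of your cluster. The two requests below are a sketch; the `filter_path` and `include_defaults` parameters simply trim the responses down to the relevant fields:

```
GET _cluster/settings?include_defaults=true&filter_path=*.cluster.max_shards_per_node*

GET _cluster/health?filter_path=active_shards,active_primary_shards
```

Note that `active_shards` only counts shards that are currently assigned; unassigned shards of open indices still count towards the configured limit.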
Then, in parallel to these cluster-wide settings, there is also a node-specific setting named `cluster.routing.allocation.total_shards_per_node` that specifies a hard limit on the number of shards a data node can host. This setting defaults to -1, which means that there is no limit on how many shards each specific data node is allowed to host; the only constraint is that all data nodes together are not allowed to host more than `cluster.max_shards_per_node` shards.
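You can check whether this setting has been explicitly configured in your cluster with a request along these lines (`flat_settings` just renders the setting names in dotted form):

```
GET _cluster/settings?flat_settings=true
```

If `cluster.routing.allocation.total_shards_per_node` appears in neither the `persistent` nor the `transient` section of the response, the -1 default applies.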
Finally, it is also possible to define an index-specific setting named `index.routing.allocation.total_shards_per_node`, which specifies a hard limit on the number of shards of a given index that can be hosted on a single data node. That setting also defaults to -1, meaning there is no limit on how many shards of a given index a specific data node can host.
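As an illustration, the request below caps a hypothetical index named `my-index` at 2 of its shards per data node (the index name and the value are only examples, not a recommendation):

```
PUT my-index/_settings
{
  "index.routing.allocation.total_shards_per_node": 2
}
```

This can be useful to force the shards of a hot index to spread across nodes but, like its cluster-wide sibling, it can leave shards unassigned if set too low.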
Now that we have reviewed all the settings related to the shard count limit in a cluster, we can review how they play together within the context of this event.
What does this mean?
The configuration of total shards per node in your Elasticsearch cluster is set to a value that is deemed too high. Shards are essentially parts of an index, which allow for the distribution of data, enabling faster search and analytics. However, when the number of shards per node exceeds a certain threshold, it can lead to a situation known as oversharding.
Why does this occur?
This typically occurs when the configuration of total shards per node (i.e. `cluster.routing.allocation.total_shards_per_node`) is set to a value that is considered too high, either through a misconfiguration or as an attempt to improve performance by packing more shards onto each node. However, this can lead to oversharding, which can destabilize the cluster because the nodes might not have enough resources to handle all of their shards.
Possible impact and consequences of total shards per node too high
Oversharding can destabilize a cluster because the master node struggles to keep track of a large number of shards across the data nodes. This can lead to decreased performance and potential data loss. It also increases resource usage, as each shard consumes a certain amount of CPU, memory, and disk space.
How to resolve
To resolve this issue, first verify whether you are leveraging the frozen tier. If you are and the `cluster.routing.allocation.total_shards_per_node` setting is set higher than 3000, we recommend decreasing it to 3000 to match the default value of the `cluster.max_shards_per_node.frozen` setting. Before doing so, make sure that no node hosts more than 3000 shards.
If you’re not leveraging the frozen tier and the `cluster.routing.allocation.total_shards_per_node` setting is set higher than 1000, we recommend decreasing it to 1000 to match the default value of the `cluster.max_shards_per_node` setting. Again, first make sure that no node hosts more than 1000 shards.
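To see how many shards each node currently hosts before lowering the limit, the `_cat/allocation` API provides a per-node shard count (the `h` parameter below just selects the relevant columns):

```
GET _cat/allocation?v&h=node,shards
```

If any node already exceeds the value you intend to set, bring its shard count down first, for example by deleting or shrinking indices, or by adding data nodes.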
To reduce the allowed number of shards that a node can host, you can use the following command:
PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.total_shards_per_node": <hard_limit>
  }
}
After this command is run, each node will not be allowed to host more than <hard_limit> shards. Use with extreme care as this might lead to situations where shards cannot be allocated to any node. By default, this setting is configured to -1, which means that no hard limit is imposed on data nodes.
Conclusion
In this guide, we discussed the issue of having a number of allowed shards per node that is considered too high, its possible impact, and how to resolve it. By reducing the number of allowed shards per node, you can prevent your cluster from becoming unstable and ensure optimal performance and reliability.