Elasticsearch Max Shards Per Node Exceeded: Cluster & Index Level Limit

By Opster Team

Updated: Jan 28, 2024


Overview

Elasticsearch lets you limit the number of shards that can be allocated to a single node. Once that limit is reached, additional shards are left unassigned. Unassigned replica shards mean that you have no redundant copies of that data and could lose it if the primary shard is lost or corrupted (cluster status yellow). Unassigned primary shards mean that you cannot write data to the affected part of the index at all (cluster status red). If you get this warning, take the necessary actions to fix it as soon as possible.

The shards per node limit may have been set at the index level or at the cluster level, so you first need to find out which setting is causing the warning.

How to fix it

Check to see whether the limit is at a cluster level or index level.
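
One way to check is to ask Elasticsearch why a shard is unassigned. The request below is a minimal sketch and assumes the cluster currently has at least one unassigned shard; called without a body, the allocation explain API reports on an unassigned shard of its own choosing.

GET /_cluster/allocation/explain

In the per-node decisions of the response, a decider named shards_limit usually indicates that one of the total_shards_per_node settings blocked the allocation, and its explanation text should say whether the index level or the cluster level setting was responsible.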

Cluster level shards limit

Run:

GET /_cluster/settings

Look for a setting:

cluster.routing.allocation.total_shards_per_node

If you don’t see the above setting, then ignore this section, and go to index level shards limit below.
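
If you would rather confirm that the cluster level limit is at its default before moving on, note that the plain request only lists settings that were set explicitly. As a sketch, the variant below also returns defaults and flattens the keys so the full setting name is easier to search for; an unset limit should appear in the defaults section with the value -1 (unbounded).

GET /_cluster/settings?include_defaults=true&flat_settings=true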

As a quick fix you can either delete old indices or raise the limit to the value you need. Be aware that a large number of shards per node can cause performance problems and, in extreme cases, even bring your cluster down.

PUT /_cluster/settings
{
  "transient": {
    "cluster.routing.allocation.total_shards_per_node": 1000
  }
}
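
The request above uses a transient setting, which does not survive a full cluster restart. As a hedged alternative, the sketch below stores the same example value as a persistent setting, and then shows how to remove the override again by setting it to null. The value 1000 is only the illustrative number used above, not a recommendation.

PUT /_cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.total_shards_per_node": 1000
  }
}

PUT /_cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.total_shards_per_node": null
  }
}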

A permanent fix is preferable. See Shards Too Small (Oversharding) in Elasticsearch – Explained and Elasticsearch Search Latency Due to Bursts of Traffic – A Complete Guide to learn more.

Index level shards limit

It is possible to limit the number of shards per node for a given index. Check the settings for the yellow or red index with:

GET /<index>/_settings/index.routing*

Look for the setting: index.routing.allocation.total_shards_per_node
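
If it is not obvious which index is affected, the cat APIs can help narrow it down. This is a sketch using standard cat parameters; the column selection is just one reasonable choice.

GET /_cat/indices?v&health=yellow

GET /_cat/shards?v&h=index,shard,prirep,state,unassigned.reason&s=state

The first request lists yellow indices (use health=red for red ones). The second lists every shard with its state and, for unassigned shards, the reason the shard became unassigned.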

This setting is sometimes used to force Elasticsearch to spread the shards of a particular index across the nodes of a cluster, but it may conflict with other allocation constraints (e.g. if the disk on one node is getting full, or if the number of nodes has been reduced).

Before changing the setting, consider why Elasticsearch is unable to respect the rule and try to fix the root cause (e.g. delete old indices, or recover or replace a node which is down). However, if that is not possible, if the current setting is simply wrong, or if you only need a short-term fix, you can change the index level setting using the following:
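
To see how such a conflict can arise, consider a hypothetical index. The name my-index and all numbers below are illustrative assumptions, not taken from a real cluster.

PUT /my-index
{
  "settings": {
    "index.number_of_shards": 3,
    "index.number_of_replicas": 1,
    "index.routing.allocation.total_shards_per_node": 2
  }
}

This index has 3 primaries plus 3 replicas, so 6 shard copies in total. With a limit of 2 per node it needs at least 6 / 2 = 3 data nodes. If one of three nodes leaves the cluster, the remaining two can hold only 2 x 2 = 4 copies, so 2 shards stay unassigned until the node returns or the limit is raised.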

PUT /<index>/_settings
{
  "index.routing.allocation.total_shards_per_node": -1
}

In the request above, -1 means unbounded (no limit). Alternatively, set the value to whatever number you need.
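
After changing the setting, you may want to confirm that the shards are actually assigned. A minimal sketch, where the 30s timeout is an arbitrary example value:

GET /_cluster/health/<index>?wait_for_status=green&timeout=30s

The request returns once the index reaches green or the timeout expires; in the latter case the response has timed_out set to true and reports the status reached so far.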