Briefly, this log message is written when the field data circuit breaker settings in Elasticsearch are updated. The field data circuit breaker is a mechanism to prevent out-of-memory errors by limiting the amount of heap memory that loading field data can use. If the settings were changed in response to breaker exceptions, you can either increase the limit if your nodes have enough memory or optimize your queries to use less memory. Additionally, consider using doc values instead of field data for memory-efficient handling of large amounts of data.
This guide will help you check for common problems that cause the log “Updated breaker settings field data: {}” to appear. To understand the issues related to this log, read the explanation below about the following Elasticsearch concepts: settings, breaker, indices.
Settings in Elasticsearch
In Elasticsearch, you can configure cluster-level, node-level and index-level settings. Here is a quick rundown of each level.
A. Cluster settings
These settings can either be:
- Persistent, meaning they apply across restarts, or
- Transient, meaning they won’t survive a full cluster restart.
If a transient setting is reset, the first one of these values that is defined is applied:
- The persistent setting
- The setting in the configuration file
- The default value
The order of precedence for cluster settings is:
- Transient cluster settings
- Persistent cluster settings
- Settings in the elasticsearch.yml configuration file
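The precedence order above can be modeled as a simple lookup. The following Python sketch is only an illustration of that ordering: the function name and the dictionaries are hypothetical, the values are made up for the example, and nothing here calls Elasticsearch.

```python
def effective_setting(key, transient, persistent, config_file, defaults):
    """Return the value that wins for `key` under the precedence described above."""
    # Sources are checked in order: transient, persistent,
    # elasticsearch.yml, and finally the built-in default.
    for source in (transient, persistent, config_file, defaults):
        if source.get(key) is not None:
            return source[key]
    return None


KEY = "indices.recovery.max_bytes_per_sec"
defaults = {KEY: "default-value"}   # placeholder, not the real default
persistent = {KEY: "500mb"}
transient = {KEY: "40mb"}

# While the transient value is set, it takes precedence.
current = effective_setting(KEY, transient, persistent, {}, defaults)

# After the transient value is reset, the persistent value applies again.
after_reset = effective_setting(KEY, {}, persistent, {}, defaults)
```

Here `current` resolves to the transient value and `after_reset` to the persistent one, which mirrors the fallback list given for a reset transient setting.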
Examples
An example of persistent cluster settings update:
PUT /_cluster/settings
{
  "persistent": {
    "indices.recovery.max_bytes_per_sec": "500mb"
  }
}
An example of a transient update:
PUT /_cluster/settings
{
  "transient": {
    "indices.recovery.max_bytes_per_sec": "40mb"
  }
}
B. Index settings
These are the settings that are applied to individual indices. There is an API to update index level settings.
Examples
The following API call sets the number of replica shards to 5 for the my_index index.
PUT /my_index/_settings
{
  "index": {
    "number_of_replicas": 5
  }
}
To revert a setting to the default value, use null.
PUT /my_index/_settings
{
  "index": {
    "refresh_interval": null
  }
}
C. Node settings
These settings apply to nodes. Nodes can fulfill different roles, including the master, data, and coordinating roles. Node settings are set through the elasticsearch.yml file on each node.
Examples
Setting a node to be a data node (in the elasticsearch.yml file):
node.data: true
Disabling the ingest role for the node (which is enabled by default):
node.ingest: false
For production clusters, it is generally recommended to run each type of node on dedicated machines, with two or more instances of each for high availability (and a minimum of three master-eligible nodes).
Notes and good things to know
- Learning more about cluster settings and index settings is important and can spare you a lot of trouble. For example, if you are going to ingest a huge amount of data into an index whose number of replica shards is set to, say, 5, indexing will be very slow because every document is replicated as it is indexed. To speed up indexing, set the number of replicas to 0 with the settings API, and set it back to the original number when indexing is done.
- Another useful example of using cluster-level settings is when a node has just joined the cluster and the cluster is not assigning any shards to it. Although shard allocation is enabled by default on all nodes, someone may have disabled shard allocation at some point (for example, in order to perform a rolling restart) and forgotten to re-enable it later. To enable shard allocation, use the Cluster Settings API:
PUT /_cluster/settings
{
  "transient": {
    "cluster.routing.allocation.enable": "all"
  }
}
- It’s better to set cluster-wide settings with the Cluster Settings API than with the elasticsearch.yml file, and to use the file only for node-local changes. This keeps the same setting on all nodes. If you instead configure such a setting in elasticsearch.yml and accidentally define different values on different nodes, the discrepancies are hard to notice.
- See also: Recovery
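The replica tip from the notes above can be sketched as a pair of index settings updates, one before and one after the bulk load. This Python snippet only builds the requests; the helper names are hypothetical, and sending them (with an HTTP client of your choice) is left out.

```python
import json


def replica_update(index, replicas):
    """Build the (method, path, body) of an index settings update."""
    body = json.dumps({"index": {"number_of_replicas": replicas}})
    return "PUT", f"/{index}/_settings", body


def bulk_load_plan(index, original_replicas):
    """Return the two settings updates that bracket a large ingest."""
    return [
        replica_update(index, 0),                  # before indexing: no replicas
        replica_update(index, original_replicas),  # after indexing: restore
    ]


plan = bulk_load_plan("my_index", 5)
for method, path, body in plan:
    print(method, path, body)
```

Restoring the original replica count after the load triggers replication of the finished shards, so some recovery traffic should be expected at that point.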
Overview
Elasticsearch uses circuit breakers to guard against OutOfMemory errors that cause nodes to crash. When a request reaches an Elasticsearch node, the relevant circuit breaker first estimates the amount of memory needed to load the required data. It then compares the estimate with its configured limit, which is defined as a share of the JVM heap. If the estimate exceeds the limit, the query is terminated and an exception is thrown, so that the node does not try to load more data than the heap can hold.
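The estimate-then-compare behavior described above can be illustrated with a toy model. This is a deliberately simplified Python sketch, not the Elasticsearch implementation: the class and exception names are hypothetical, and the `overhead` multiplier is an assumption modeled on the overhead value that appears in the breaker settings.

```python
class CircuitBreakingError(Exception):
    """Raised when a reservation would push usage past the breaker limit."""


class ToyBreaker:
    def __init__(self, limit_bytes, overhead=1.0):
        self.limit_bytes = limit_bytes
        self.overhead = overhead      # multiplier applied to the estimate
        self.used_bytes = 0

    def add_estimate(self, estimated_bytes):
        """Reserve memory for an operation, or fail before loading anything."""
        new_used = self.used_bytes + estimated_bytes
        if new_used * self.overhead > self.limit_bytes:
            raise CircuitBreakingError(
                f"data too large: would use {new_used} bytes, "
                f"limit is {self.limit_bytes} bytes"
            )
        self.used_bytes = new_used

    def release(self, freed_bytes):
        """Give memory back once the operation has finished."""
        self.used_bytes = max(0, self.used_bytes - freed_bytes)


breaker = ToyBreaker(limit_bytes=100)
breaker.add_estimate(60)          # fits under the limit
try:
    breaker.add_estimate(50)      # 110 > 100, so the breaker trips
except CircuitBreakingError as err:
    print("rejected:", err)
```

The point of the model is that the rejected operation never changes `used_bytes`: the request fails up front instead of exhausting the heap.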
What they are used for
Elasticsearch has several circuit breakers, including the fielddata, request, in-flight requests (network), accounting and script compilation breakers. Each breaker limits the resources a particular kind of operation can use. In addition, Elasticsearch has a parent circuit breaker, which limits the combined memory used by all the other circuit breakers.
Examples
Increasing the fielddata circuit breaker limit: the default limit for the fielddata breaker is 40% of the JVM heap. The following command can be used to increase it to 60%:
PUT /_cluster/settings
{
  "persistent": {
    "indices.breaker.fielddata.limit": "60%"
  }
}
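To see what such a percentage means in absolute terms, it can help to convert it to bytes for a given heap size. The helper below is a hypothetical Python sketch that handles only percentage strings, and the 10 GiB heap is an assumed value chosen for the example.

```python
def limit_in_bytes(limit, heap_bytes):
    """Convert a percentage limit such as '60%' into a byte count."""
    limit = limit.strip()
    if not limit.endswith("%"):
        raise ValueError(f"expected a percentage, got {limit!r}")
    return int(heap_bytes * float(limit[:-1]) / 100)


GIB = 1024 ** 3
heap = 10 * GIB                          # assumed JVM heap size

before = limit_in_bytes("40%", heap)     # 4 GiB
after = limit_in_bytes("60%", heap)      # 6 GiB
print(before // GIB, "GiB ->", after // GIB, "GiB")
```

On this assumed heap, raising the limit from 40% to 60% lets field data occupy 2 GiB more before the breaker trips, which is memory no longer available to everything else on the node.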
Notes
- Each breaker ships with a default limit, and the limits can be modified. However, this is an expert-level setting, and you should understand the pitfalls before changing the limits; otherwise the node may start throwing OOM exceptions.
- Sometimes it is better to fail a query than to hit an OOM exception, because the JVM becomes unresponsive when an OOM occurs.
- It is important to keep indices.breaker.request.limit lower than indices.breaker.total.limit so that request circuit breakers trip before the total circuit breaker.
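The ordering rule in the last note can be checked mechanically before applying a settings change. The following Python sketch assumes both limits are given as percentage strings; the function names are hypothetical and the values are illustrative, not defaults.

```python
def parse_percent(value):
    """Parse a percentage string such as '60%' into a float (60.0)."""
    value = value.strip()
    if not value.endswith("%"):
        raise ValueError(f"expected a percentage, got {value!r}")
    return float(value[:-1])


def request_below_total(settings):
    """True if the request breaker would trip before the total breaker."""
    request = parse_percent(settings["indices.breaker.request.limit"])
    total = parse_percent(settings["indices.breaker.total.limit"])
    return request < total


ok = request_below_total({
    "indices.breaker.request.limit": "60%",   # illustrative value
    "indices.breaker.total.limit": "70%",     # illustrative value
})
bad = request_below_total({
    "indices.breaker.request.limit": "80%",
    "indices.breaker.total.limit": "70%",
})
print(ok, bad)
```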
Common problems
- The most common error resulting from circuit breakers is “data too large” with a 429 status code. The application should be ready to handle such exceptions.
- If the application starts throwing exceptions because of circuit breaker limits, it is important to review the queries and their memory requirements. In most cases, the cluster needs to be scaled by adding more resources.
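One way for an application to handle the 429 responses mentioned above is to retry with exponential backoff. The sketch below is a generic Python pattern, not part of any Elasticsearch client: `send` is a hypothetical callable that performs the request and returns a `(status_code, body)` tuple, and the attempt count and delays are assumptions.

```python
import time


def call_with_backoff(send, max_attempts=4, base_delay=0.5, sleep=time.sleep):
    """Call send() until it returns a non-429 status or attempts run out."""
    status, body = None, None
    for attempt in range(max_attempts):
        status, body = send()
        if status != 429:
            return status, body
        if attempt < max_attempts - 1:
            # Wait 0.5s, 1s, 2s, ... before the next attempt.
            sleep(base_delay * 2 ** attempt)
    return status, body


# Usage with a fake sender that fails twice, then succeeds.
responses = iter([(429, "data too large"), (429, "data too large"), (200, "ok")])
delays = []
result = call_with_backoff(lambda: next(responses), sleep=delays.append)
print(result, delays)
```

Retrying only helps when the memory pressure is temporary. If the same query trips the breaker every time, the query or the cluster sizing needs to change.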
Log Context
The log “Updated breaker settings field data: {}” is written by the class HierarchyCircuitBreakerService.java.
We extracted the following from the Elasticsearch source code for those seeking in-depth context:
        HierarchyCircuitBreakerService.this.fielddataSettings.getOverhead() : newFielddataOverhead;
    BreakerSettings newFielddataSettings = new BreakerSettings(CircuitBreaker.FIELDDATA, newFielddataLimitBytes, newFielddataOverhead,
        this.fielddataSettings.getType(), this.fielddataSettings.getDurability());
    registerBreaker(newFielddataSettings);
    HierarchyCircuitBreakerService.this.fielddataSettings = newFielddataSettings;
    logger.info("Updated breaker settings field data: {}", newFielddataSettings);
}

private void setAccountingBreakerLimit(ByteSizeValue newAccountingMax, Double newAccountingOverhead) {
    BreakerSettings newAccountingSettings = new BreakerSettings(CircuitBreaker.ACCOUNTING, newAccountingMax.getBytes(), newAccountingOverhead,
        HierarchyCircuitBreakerService.this.accountingSettings.getType(),