Received response for a request that has timed out sent ms ago timed out ms ago action node id – How to solve this Elasticsearch error

Opster Team

Aug-23, Version: 2.3-6.8-2.3-6.8

Briefly, this error occurs when Elasticsearch receives a response for a request that has already timed out. This could be due to network latency, overloaded server, or slow processing of the request. To resolve this issue, you can increase the timeout limit in Elasticsearch settings. Alternatively, you can optimize your queries to reduce processing time or upgrade your server to handle more requests. Also, check your network for any issues that might be causing delays in communication.

We recommend you run Elasticsearch Error Check-Up which can resolve issues that cause many errors.

Advanced users might want to skip right to the common problems section in each concept or try running the Check-Up which analyses ES to pinpoint the cause of many errors and provides suitable actionable recommendations how to resolve them (free tool that requires no installation).

Overview: 

Elasticsearch search and index requests are executed within a timeout if specified, and if within the defined timeout threshold the request isn’t completed, then Elasticsearch sends the response calculated till that duration and in JSON response includes `timed_out: true`.

But the above time_out param in response is just an indicator that the request didn’t execute within a specified timeout, that it contained partial results and that the request is still  running in the background. When the request completes its execution, then Elasticsearch logs this as a *warning* in the log format below.

 Received response for a request that has timed out, sent [{}ms] ago, timed out [{}ms] ago 

If there are many such logs in a regular interval it should be fixed to prevent the outages and cluster resource saturations.

Some tips on how to identify the cause of such a timeout can be found in How to use Slow Logs the Complete Guide. This link is specific to search slow logs, but the same can be used to filter the slow indexing logs quickly as well. Most of the time, the cause of these timeouts is slow-running expensive search queries.

Once the cause of such timeout is identified below, fixes can be applied to prevent timeouts and to make clusters perform at an optimal level.

  1. Fix the expensive search queries by implementing them in different ways; One example is, Autocomplete queries which are expensive, and different implementation w.r.t to their performance impact as mentioned in Auto-complete guide .
  2. If there is a huge number of documents in a single shard or fewer shards, then split them into additional shards on different physical servers and make the performance comparison between them. More info can be found here: Shards and Replicas getting started guide.
  3. This log is present for both Elasticsearch version 2.3 and version 6.8. If you are using the latest Elasticsearch version(7.X) then use https://www.elastic.co/guide/en/elasticsearch/reference/current/tasks.html to identify the long-running tasks in Elaticsearch and kill them. 
  4. Try including the timeout in more queries (with different values) and track the metrics (how many times they timeout) to get some idea of how they perform during peak times.
  5. For older versions of Elasticsearch like (1.X, 2.X) where task management API is not available, make use of terminate_after param in the search query to terminate long-running queries.

Log Context

Log “Received response for a request that has timed out; sent [{}ms] ago; timed out [{}ms] ago; action [{}]; node [{}]; id [{}]” classname is TransportService.java.
We extracted the following from Elasticsearch source code for those seeking an in-depth context :

             final String action;
            assert clientHandlers.get(requestId) == null;
            TimeoutInfoHolder timeoutInfoHolder = timeoutInfoHandlers.remove(requestId);
            if (timeoutInfoHolder != null) {
                long time = System.currentTimeMillis();
                logger.warn("Received response for a request that has timed out; sent [{}ms] ago; timed out [{}ms] ago; action [{}]; node [{}]; id [{}]"; time - timeoutInfoHolder.sentTime(); time - timeoutInfoHolder.timeoutTime(); timeoutInfoHolder.action(); timeoutInfoHolder.node(); requestId);
                action = timeoutInfoHolder.action();
                sourceNode = timeoutInfoHolder.node();
            } else {
                logger.warn("Transport response handler not found of id [{}]"; requestId);
                action = null;




 

 [ratemypost]