Briefly, this error occurs when a search request waits for a shard to refresh up to a given sequence number (checkpoint) and that refresh does not happen within the request's wait-for-checkpoints timeout. Likely causes are heavy indexing load, slow disk I/O, or network latency. To resolve the issue, you can increase the timeout value, reduce the indexing load, optimize disk I/O, or improve network connectivity. Also make sure your Elasticsearch cluster is sized and configured for your workload.
This guide will help you check for common problems that cause the log "Wait for seq_no [{}] refreshed timed out [{}]" to appear. To understand the issues related to this log, read the explanation below about the following Elasticsearch concept: search.
Overview
Search refers to the searching of documents in one index or across multiple indices. The simplest search is a GET request to the _search endpoint. The query can be supplied either in the query string or in a request body.
Examples
If no search parameters are provided, every document in the index is a hit, and by default the first 10 hits are returned.
GET my_documents/_search
A JSON object is returned in response to a search query. A 200 response code means the request was completed successfully.
{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 2,
    "successful" : 2,
    "failed" : 0
  },
  "hits" : {
    "total" : 2,
    "max_score" : 1.0,
    "hits" : [ ... ]
  }
}
Notes and good things to know
- Distributed search is challenging because every shard of the index must be searched for hits, and those hits must then be combined into a single sorted list as the final result.
- There are two phases of search: the query phase and the fetch phase.
- In the query phase, the query is executed locally on each shard and each shard's top hits are returned to the coordinating node. The coordinating node merges these results into a globally sorted list.
- In the fetch phase, the coordinating node retrieves the actual documents for those hit IDs from the shards and returns them to the requesting client.
- A coordinating node needs enough memory and CPU in order to handle the fetch phase.
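The merge step of the query phase described above can be sketched in a few lines of Java. This is a simplified illustration under stated assumptions, not Elasticsearch's implementation: the `ShardHit` record, the `mergeTopHits` method, and the sample document IDs and scores are all hypothetical names and values invented for the example.

```java
import java.util.ArrayList;
import java.util.List;

class QueryPhaseMergeSketch {

    // Hypothetical stand-in for what a shard sends back in the query phase:
    // just enough to rank the hit (shard number, document ID, score).
    record ShardHit(int shard, String docId, double score) {}

    // Combine each shard's local top hits into one list sorted by score
    // (highest first) and keep only the best `size` entries.
    static List<ShardHit> mergeTopHits(List<List<ShardHit>> perShardHits, int size) {
        List<ShardHit> all = new ArrayList<>();
        for (List<ShardHit> shardHits : perShardHits) {
            all.addAll(shardHits);
        }
        // List.sort is stable, so hits with equal scores keep their shard order.
        all.sort((a, b) -> Double.compare(b.score(), a.score()));
        return new ArrayList<>(all.subList(0, Math.min(size, all.size())));
    }

    public static void main(String[] args) {
        // Made-up local results from two shards.
        List<List<ShardHit>> perShard = List.of(
            List.of(new ShardHit(0, "a", 1.2), new ShardHit(0, "b", 0.4)),
            List.of(new ShardHit(1, "c", 2.5), new ShardHit(1, "d", 0.9))
        );
        for (ShardHit hit : mergeTopHits(perShard, 3)) {
            System.out.println(hit.docId() + " " + hit.score());
        }
    }
}
```

With the sample data, the merged list should contain documents c, a, and d in that order. Only these three IDs would then need to be fetched in the fetch phase, which is why the phases are kept separate.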
Log Context
The log "Wait for seq_no [{}] refreshed timed out [{}]" is emitted from the class SearchService.java. We extracted the following from the Elasticsearch source code for those seeking in-depth context:
// index shard on timeout so that a timed-out listener does not use up any listener slots.
final TimeValue timeout = request.getWaitForCheckpointsTimeout();
final Scheduler.ScheduledCancellable timeoutTask = NO_TIMEOUT.equals(timeout) ? null : threadPool.schedule(() -> {
    if (isDone.compareAndSet(false, true)) {
        listener.onFailure(
            new ElasticsearchTimeoutException("Wait for seq_no [{}] refreshed timed out [{}]", waitForCheckpoint, timeout)
        );
    }
}, timeout, Names.SAME);

// allow waiting for not-yet-issued sequence number if shard isn't promotable to primary and the timeout is less than or equal
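The excerpt relies on a compare-and-set flag so that only one outcome, the timeout or the successful refresh, ever reaches the listener. A minimal standalone sketch of that pattern, using only the Java standard library, might look like the following. The names `OneShotWaiter`, `onRefreshed`, and `race`, and the millisecond values, are hypothetical and chosen for illustration; this is not the Elasticsearch code itself.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Consumer;

class TimeoutRaceSketch {

    // Hypothetical helper: delivers exactly one outcome to the callback,
    // either "refreshed" or "timed out", whichever flips the flag first.
    static final class OneShotWaiter {
        private final AtomicBoolean isDone = new AtomicBoolean(false);
        private final Consumer<String> callback;
        private final ScheduledFuture<?> timeoutTask;

        OneShotWaiter(ScheduledExecutorService scheduler, long timeoutMillis, Consumer<String> callback) {
            this.callback = callback;
            // The timeout task reports a failure only if nothing completed before it ran.
            this.timeoutTask = scheduler.schedule(() -> {
                if (isDone.compareAndSet(false, true)) {
                    callback.accept("timed out");
                }
            }, timeoutMillis, TimeUnit.MILLISECONDS);
        }

        // Called when the awaited event (here: the refresh) happens.
        void onRefreshed() {
            if (isDone.compareAndSet(false, true)) {
                timeoutTask.cancel(false); // the timeout is no longer needed
                callback.accept("refreshed");
            }
        }
    }

    // Simulates a refresh arriving after `refreshAfterMillis` against a given timeout
    // and returns whichever outcome was delivered.
    static String race(long timeoutMillis, long refreshAfterMillis) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        try {
            CompletableFuture<String> outcome = new CompletableFuture<>();
            OneShotWaiter waiter = new OneShotWaiter(scheduler, timeoutMillis, outcome::complete);
            scheduler.schedule(waiter::onRefreshed, refreshAfterMillis, TimeUnit.MILLISECONDS);
            return outcome.get(5, TimeUnit.SECONDS);
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            scheduler.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println(race(50, 500));  // refresh arrives too late
        System.out.println(race(500, 50));  // refresh arrives in time
    }
}
```

The first call should report a timeout and the second a successful refresh. The same reasoning explains the remedies for this log: either the refresh has to arrive sooner (less indexing load, faster disks) or the timeout has to be longer.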