Overview
When indexing data, Elasticsearch needs a “refresh” operation before newly indexed information becomes available for search. This means there is a short delay between indexing a document and that document actually appearing in search results for client applications.
How it works
Index operations are first written to an in-memory buffer. Documents accumulate in that buffer until a refresh writes its contents to a newly created Lucene segment and opens that segment for search. By default a refresh happens every second (since version 7.0, only on indices that have received a search request in the last 30 seconds), but you can change this interval for a given index, or request a refresh directly through the refresh API.
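The buffer-then-segment behaviour described above can be illustrated with a toy model. This is only a sketch of the concept, not how Elasticsearch or Lucene are implemented; the ToyShard class and its methods are hypothetical names made up for this example.

```python
class ToyShard:
    """Toy model of a shard: a write buffer plus searchable segments."""

    def __init__(self):
        self.buffer = []    # indexed, but not yet searchable
        self.segments = []  # each refresh adds one searchable segment

    def index(self, doc):
        self.buffer.append(doc)

    def refresh(self):
        # Move buffered documents into a new segment; do nothing if empty.
        if self.buffer:
            self.segments.append(list(self.buffer))
            self.buffer.clear()

    def search(self, term):
        # Only documents that have reached a segment can be found.
        return [doc for seg in self.segments for doc in seg if term in doc]


shard = ToyShard()
shard.index("error: disk full")
before = shard.search("error")  # buffer only, so nothing is found
shard.refresh()
after = shard.search("error")   # found once the refresh has run
print(before, after)
```

The document only shows up in the second search, which mirrors the delay between indexing and visibility.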
Examples
You can set the refresh interval on an index like this:
PUT /my_index/_settings
{ "index" : { "refresh_interval" : "30s" } }
You can use a value of -1 to disable automatic refreshes, but remember to set it back once you’ve finished indexing!
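One way to make sure the interval is always restored is to wrap the bulk load in a context manager. The sketch below only builds the settings requests and hands them to a send callable; refresh_paused, settings_request and send are hypothetical names for this example, and you would plug in whatever HTTP client you actually use.

```python
import json
from contextlib import contextmanager


def settings_request(index, interval):
    """Build (method, path, body) for an update-settings call."""
    body = json.dumps({"index": {"refresh_interval": interval}})
    return ("PUT", f"/{index}/_settings", body)


@contextmanager
def refresh_paused(send, index, restore_to="1s"):
    """Disable refresh for the duration of the block, then restore it."""
    send(*settings_request(index, "-1"))
    try:
        yield
    finally:
        # Runs even if the bulk load raises, so refresh is never left off.
        send(*settings_request(index, restore_to))


sent = []  # stand-in transport: records requests instead of sending them
with refresh_paused(lambda *req: sent.append(req), "my_index", restore_to="30s"):
    pass  # bulk indexing would go here

for method, path, body in sent:
    print(method, path, body)
```

The restore_to default of "1s" assumes the index was using the default interval; pass the previous value if it was customised.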
You can force a refresh on a given index like this:
POST /my_index/_refresh
You can also force a refresh at the end of an index operation by adding an extra parameter in the URL like this:
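POST /my_index/_doc?refresh=wait_for
In this case, the “wait_for” value makes the request wait until a refresh has made the change searchable before returning (useful in scripts), normally without triggering an extra refresh. Alternatively, you can use “true” to force an immediate refresh of the affected shards as part of the request; this makes the change visible sooner, but is more expensive if done on every operation.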
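Because a misspelled value such as “waitfor” is easy to type, a script can validate the refresh value before building the URL. This is a minimal sketch; index_doc_path and VALID_REFRESH are hypothetical names, and the accepted values are the three documented for the refresh parameter.

```python
from urllib.parse import urlencode

# Values accepted by the refresh query parameter on index operations.
VALID_REFRESH = {"true", "false", "wait_for"}


def index_doc_path(index, refresh="false"):
    """Return the path and query string for indexing one document."""
    if refresh not in VALID_REFRESH:
        raise ValueError(f"unsupported refresh value: {refresh!r}")
    return f"/{index}/_doc?" + urlencode({"refresh": refresh})


print("POST", index_doc_path("my_index", "wait_for"))
```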
Notes and good things to know
Refreshing is resource-intensive, so you can increase indexing speed by refreshing less often (that is, by increasing the refresh interval). You can do this temporarily if you need to reload a lot of data. For some logging applications it is perfectly acceptable to have a 30s latency, for instance, before data actually becomes available.
Beware of the refresh interval when scripting or updating. Scripts often run faster than the refresh interval, so if necessary, you might need to call a refresh before searching for or updating data your script has just written, or use the wait_for value of the refresh parameter while indexing as described above.
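For a script that writes several documents and then reads them back, one explicit refresh after the whole batch is usually cheaper than refreshing on every write. The sketch below only lists the requests such a script might issue, in order; read_after_write_plan is a hypothetical helper and nothing is sent over the network.

```python
def read_after_write_plan(index, doc_count):
    """Return the request lines a script would issue, in order."""
    plan = [f"POST /{index}/_doc" for _ in range(doc_count)]
    plan.append(f"POST /{index}/_refresh")  # one refresh for the whole batch
    plan.append(f"GET /{index}/_search")    # the search now sees every write
    return plan


plan = read_after_write_plan("my_index", 2)
for line in plan:
    print(line)
```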