OpenSearch Async Search: How to Use, With Code Snippets

By Opster Expert Team - Gustavo

Updated: Jun 28, 2023

What is async search?

Waiting for the payload to get to the client can take a very long time when you’re querying large amounts of data.

The async search API runs a search in the background, so you don’t have to wait for a single long-running request to finish.

Instead of blocking until every result has been collected, an async query makes partial results available while it is still running.

The query returns an ID and other status indicators, so you can close your Dev Tools console or terminal and come back later to check the query’s progress and the results fetched so far.

Running an async search query

An async search request accepts the same parameters as a regular search request.

Let’s index some documents and run a query.

POST test_async/_doc
{
  "text": "Doc1"
}

POST test_async/_doc
{
  "text": "Doc2"
}

POST test_async/_doc
{
  "text": "Example doc"
}

POST test_async/_async_search

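The response will look like this (the id field appears only if the query outlasts wait_for_completion_timeout or keep_on_completion is set):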

{
  "id" : "SOME_ID",
  "is_partial" : false,
  "is_running" : false,
  "start_time_in_millis" : 1636010235096,
  "expiration_time_in_millis" : 1636442235096,
  "response" : {
    "took" : 719,
    "timed_out" : false,
    "_shards" : {
      "total" : 1,
      "successful" : 1,
      "skipped" : 0,
      "failed" : 0
    },
    "hits" : {
      "total" : {
        "value" : 3,
        "relation" : "eq"
      },
      "max_score" : 1.0,
      "hits" : [
        {
          "_index" : "test_async",
          "_type" : "_doc",
          "_id" : "0JjG6XwBpL6RE1SX6qi6",
          "_score" : 1.0,
          "_source" : {
            "text" : "Example doc"
          }
        },
        {
          "_index" : "test_async",
          "_type" : "_doc",
          "_id" : "0ZjO6XwBpL6RE1SX0Kgt",
          "_score" : 1.0,
          "_source" : {
            "text" : "Doc1"
          }
        },
        {
          "_index" : "test_async",
          "_type" : "_doc",
          "_id" : "0pjO6XwBpL6RE1SX1KgU",
          "_score" : 1.0,
          "_source" : {
            "text" : "Doc2"
          }
        }
      ]
    }
  }
}
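From a script, the same request can be sent with any HTTP client. The following is a minimal sketch using only Python's standard library. It assumes a cluster reachable at http://localhost:9200 without authentication, and the helper names build_search_request and submit are illustrative, not part of any client library.

```python
import json
import urllib.request

BASE_URL = "http://localhost:9200"  # assumption: local cluster, no auth


def build_search_request(index, query=None, base_url=BASE_URL):
    """Build (but do not send) the POST request for an async search."""
    # With no query given, fall back to matching every document.
    body = json.dumps(query or {"query": {"match_all": {}}}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{index}/_async_search",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def submit(request):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(request) as resp:
        return json.load(resp)


req = build_search_request("test_async")
print(req.get_method(), req.full_url)
```

Calling submit(req) against a running cluster would return a dictionary shaped like the response above. When an id is present, it can be read with result.get("id").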

Important properties in async search queries

Field	Description
id	Identifier used to retrieve the query status and results later. It is generated when the query takes longer than wait_for_completion_timeout or when keep_on_completion is set.
is_partial	Always true while the query is running. Once the query has finished, false means it completed successfully on all shards, and true means it failed or returned only partial results.
is_running	Indicates whether the query is still running or has finished.
_shards.total	Total number of shards the query will be executed on.
_shards.successful	Number of shards on which the query has executed successfully so far.
hits.total.value	Number of documents matched by the query so far. These documents come from the shards counted in _shards.successful.
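The fields in the table can be combined into a one-line summary of where a query stands. This is only a sketch: the function name describe is hypothetical, and the sample dictionary is a trimmed copy of the response shown earlier.

```python
def describe(result):
    """Summarize an async search response dict in one line."""
    shards = result.get("response", {}).get("_shards", {})
    done = shards.get("successful", 0)
    total = shards.get("total", 0)
    if result.get("is_running"):
        state = "running"
    elif result.get("is_partial"):
        # Finished, but not successfully on every shard.
        state = "finished with partial results or failure"
    else:
        state = "complete"
    return f"{state}: {done}/{total} shards"


sample = {
    "id": "SOME_ID",
    "is_partial": False,
    "is_running": False,
    "response": {"_shards": {"total": 1, "successful": 1}},
}
print(describe(sample))  # complete: 1/1 shards
```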

How to retrieve status and hits

To retrieve the status and hits of our async query we just need to run a GET request:

GET /_async_search/SOME_ID

The current status and hits of the async query will be returned.
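In a script, this GET request is typically repeated until is_running turns false. The loop below is a rough sketch of that pattern. The fetch callable is a placeholder for whatever HTTP client you use, and the names wait_for_completion, interval and max_polls are assumptions made for this example.

```python
import time


def wait_for_completion(fetch, search_id, interval=1.0, max_polls=60):
    """Poll fetch(search_id) until is_running is false, then return the result.

    fetch is any callable that performs GET /_async_search/<id> and returns
    the parsed JSON as a dict (hypothetical; supply your own HTTP client).
    """
    for _ in range(max_polls):
        result = fetch(search_id)
        if not result.get("is_running", False):
            return result
        time.sleep(interval)
    raise TimeoutError(f"search {search_id} still running after {max_polls} polls")


# Stand-in for a real HTTP call: reports "running" twice, then finishes.
_responses = iter([
    {"is_running": True, "is_partial": True},
    {"is_running": True, "is_partial": True},
    {"is_running": False, "is_partial": False},
])
final = wait_for_completion(lambda _id: next(_responses), "SOME_ID", interval=0)
print(final["is_partial"])  # False
```

Capping the number of polls keeps the script from waiting forever if the query never finishes. Partial hits can still be read from each intermediate result if needed.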

How to retrieve status alone

If we don’t need the hits of the query and only want to check the status, we can call the status endpoint:

GET /_async_search/status/SOME_ID

The response will look like this:

{
  "id" : "FmRldE8zREVEUzA2ZVpUeGs2ejJFUFEaMkZ5QTVrSTZSaVN3WlNFVmtlWHJsdzoxMDc=",
  "is_running" : true,
  "is_partial" : true,
  "start_time_in_millis" : 1583945890986,
  "expiration_time_in_millis" : 1584377890986,
  "_shards" : {
      "total" : 562,
      "successful" : 188, 
      "skipped" : 0,
      "failed" : 0
  }
}

The “successful” property indicates the number of shards on which the query has executed successfully so far.

For an async search that has been completed, the status response has an additional completion_status field that shows the HTTP status code of the completed async search.

For example, if the query executed correctly:

"completion_status" : 200

If the query had errors:

"completion_status" : 503
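A status response can be reduced to a short progress message along these lines. The function name status_summary is hypothetical, and treating every code other than 200 as a failure is a simplification made for this sketch.

```python
def status_summary(status):
    """Turn a status response dict into a short human-readable string."""
    shards = status["_shards"]
    progress = f'{shards["successful"]}/{shards["total"]} shards'
    if status["is_running"]:
        return f"running ({progress})"
    # completion_status is only present once the search has finished.
    code = status.get("completion_status")
    outcome = "succeeded" if code == 200 else "failed"
    return f"{outcome} with status {code} ({progress})"


running = {
    "is_running": True,
    "is_partial": True,
    "_shards": {"total": 562, "successful": 188, "skipped": 0, "failed": 0},
}
print(status_summary(running))  # running (188/562 shards)
```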

How to delete a query

To cancel an async query at any point, send a DELETE request with its ID. The query will be canceled.

DELETE /_async_search/SOME_ID

If OpenSearch security features are enabled, there are two types of users that can delete queries:

1. The authenticated user who submitted the query.

2. A user who has the cancel_task cluster privilege.
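The delete call can be scripted in the same way as the other requests. The sketch below only builds the request object with Python's standard library. The base URL is an assumption, the helper name build_delete_request is illustrative, and on a secured cluster the request would also need credentials for one of the users listed above.

```python
import urllib.request


def build_delete_request(search_id, base_url="http://localhost:9200"):
    """Build (but do not send) the DELETE request for an async search."""
    return urllib.request.Request(
        url=f"{base_url}/_async_search/{search_id}",
        method="DELETE",
    )


req = build_delete_request("SOME_ID")
print(req.get_method(), req.full_url)
```

Passing the request to urllib.request.urlopen would send it to the cluster.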

Additional parameters

Field	Description
wait_for_completion_timeout	How long the request waits for the query to complete before returning, defaulting to 1 second. If the query finishes within this time, the results are not stored (no ID field).
keep_on_completion	Stores the results even if the query finishes within wait_for_completion_timeout.
keep_alive	How long the async query and its results remain available, defaulting to 5 days. After this period, any query that is still running is canceled and its stored results and status are deleted.
batched_reduce_size	Number of shard results reduced at once, defaulting to 5. It controls how often partial results become available, since they are updated each time shard results are reduced.
request_cache	Used to enable or disable caching on a per-request basis. Defaults to true.
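These options are passed as query-string parameters on the search request. As a rough illustration, the helper below (async_search_path, a made-up name) assembles the request path from keyword arguments. The values 2s and 1d are arbitrary examples.

```python
from urllib.parse import urlencode


def async_search_path(index, **params):
    """Return the request path with the given options as a query string."""
    # Booleans must be lowercase ("true"/"false") in the URL.
    encoded = {
        key: str(value).lower() if isinstance(value, bool) else value
        for key, value in params.items()
    }
    query = urlencode(encoded)
    return f"/{index}/_async_search" + (f"?{query}" if query else "")


print(async_search_path(
    "test_async",
    wait_for_completion_timeout="2s",
    keep_on_completion=True,
    keep_alive="1d",
))
# /test_async/_async_search?wait_for_completion_timeout=2s&keep_on_completion=true&keep_alive=1d
```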

The following parameters cannot be changed but are worth mentioning:

Field	Description
pre_filter_shard_size	Set to 1. Enforces a pre-filter roundtrip so that shards that cannot contain any document matching the query are skipped.
ccs_minimize_roundtrips	Set to false. Indicates whether network round-trips should be minimized when executing cross-cluster search requests.

OpenSearch by default does not limit the size of the async queries response. Storing huge responses might destabilize the cluster. To limit the maximum response size you can change the search.max_async_search_response_size cluster setting.
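Cluster settings are normally changed with a PUT request to /_cluster/settings. The sketch below builds a possible request body for that call. The 10mb value is an arbitrary example, the helper name response_size_limit_body is hypothetical, and it assumes the setting can be updated through the cluster settings API on your version.

```python
import json


def response_size_limit_body(limit, persistent=True):
    """Build the JSON body for PUT /_cluster/settings."""
    # Persistent settings survive a full cluster restart; transient ones do not.
    scope = "persistent" if persistent else "transient"
    return json.dumps({scope: {"search.max_async_search_response_size": limit}})


print(response_size_limit_body("10mb"))
# {"persistent": {"search.max_async_search_response_size": "10mb"}}
```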

Conclusion

Async search is a good fit when you need to run resource-intensive queries and want to retrieve partial results instead of waiting for the query to finish.
