Elasticsearch OpenSearch DSL Exists Query

By Opster Expert Team

Updated: Jun 28, 2023

Overview and background

OpenSearch provides a full query DSL (Domain Specific Language) based on JSON for defining queries. You can think of the query DSL as an abstract syntax tree of queries made up of two types of clauses:

  1. Leaf query clauses – search for a specific value in a specific field, for example the term and match queries. These queries can be used on their own.
  2. Compound query clauses – wrap other leaf or compound queries, either to combine multiple queries logically (as the bool query does) or to modify their behavior (as the constant_score query does).
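
To make the distinction concrete, here is a minimal sketch in Python that writes one leaf clause and one compound clause as plain dictionaries, which is a common way to build a JSON request body in client code. The field names (status, title) and their values are hypothetical and only serve the illustration.

```python
import json

# Leaf clause: a term query looks for one exact value in one field.
# "status" and "published" are hypothetical names used for illustration.
leaf_query = {"term": {"status": "published"}}

# Compound clause: a bool query wraps other clauses and combines them.
compound_query = {
    "bool": {
        "must": [{"match": {"title": "search"}}],  # full-text leaf clause
        "filter": [leaf_query],                    # term-level leaf clause
    }
}

# The body that would be sent to the _search endpoint.
request_body = {"query": compound_query}
print(json.dumps(request_body, indent=2))
```

The leaf clause could also be sent on its own as {"query": leaf_query}, whereas the bool clause only makes sense once it wraps at least one other clause.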

Term-level queries locate documents based on exact values in structured data, such as date ranges, IP addresses, prices and product IDs. Unlike full-text queries, term-level queries do not analyze the search terms; instead, they match the exact terms stored in a field.

There are several types of term-level queries, one of which is the exists query. The exists query returns documents that contain an indexed value for a field.

Before continuing, it is worth reading about OpenSearch Mapping to understand how fields are indexed into OpenSearch and which mapping settings affect that. It also helps to read about OpenSearch Search to understand GET requests to the _search endpoint and how documents are looked up in an index. Both give useful background for the exists query.

What the DSL exists query is used for

Some fields in a document may have no indexed value, for several reasons. The exists query returns only the documents that do have an indexed value for a specific field, in other words the documents in which that field exists.

An indexed value may not exist in a document’s field because:

  1. The field in the source document is null or an empty array (i.e. [ ]).
  2. In the mapping, the field has an “index”: false setting.
  3. The length of a keyword field value exceeds the mapping’s ignore_above setting.
  4. The field value was malformed and ignore_malformed was defined in the mapping.
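
The mapping-related cases in this list can be pictured with a small mapping sketch, written here as a Python dictionary. The field names (internal_note, tag, price), the field types and the ignore_above value of 256 are hypothetical; only the index, ignore_above and ignore_malformed parameters come from the list above.

```python
import json

# Hypothetical mapping body illustrating cases 2-4 from the list above.
mapping_body = {
    "mappings": {
        "properties": {
            # Case 2: the field is kept in _source but is not indexed.
            "internal_note": {"type": "text", "index": False},
            # Case 3: keyword values longer than 256 characters are not indexed.
            "tag": {"type": "keyword", "ignore_above": 256},
            # Case 4: malformed values (e.g. "abc" for a float) are skipped
            # instead of causing the whole document to be rejected.
            "price": {"type": "float", "ignore_malformed": True},
        }
    }
}

print(json.dumps(mapping_body, indent=2))
```

A body like this would be supplied when creating the index. In each of these cases a document can carry the field in its source and still not be matched by an exists query on that field.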

How to implement the exists query

The exists query has one required parameter, field. It is a string that names the field to check; the query returns the documents that have an indexed value for that field.

The following query searches the “my_index” index and returns the documents that have an indexed value for the “my_field” field.

GET my_index/_search
{
  "query": {
    "bool": {
      "filter": [
        {
          "exists": {
            "field": "my_field"
          }
        }
      ]
    }
  }
}
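
When requests are built in application code rather than typed into a console, the same body can be produced by a small helper. The sketch below is one possible way to do that in Python; the function name exists_filter_query is hypothetical, and the output mirrors the request body shown above.

```python
import json


def exists_filter_query(field: str) -> dict:
    """Build a search body that keeps only documents with an indexed value for `field`."""
    if not field:
        raise ValueError("field must be a non-empty string")
    # The exists clause sits in the filter context of a bool query,
    # so it restricts the result set without contributing to the score.
    return {"query": {"bool": {"filter": [{"exists": {"field": field}}]}}}


body = exists_filter_query("my_field")
print(json.dumps(body, indent=2))
```

Assuming an already configured opensearch-py client object named client (not shown here), the body could then be sent with client.search(index="my_index", body=body).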

A field is considered not to exist if its JSON value is null or [ ]. The following values indicate that the field does exist:

  • Empty strings, like "".
  • Arrays that include null and another value, like [null, "foo"].
  • A custom null_value defined in the field mapping.
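
These rules can be summarized in a few lines of Python. The function below is only a simplified model of the behavior described above, not OpenSearch's actual implementation; the name has_value and the null_value argument (standing in for the mapping parameter) are assumptions made for the illustration, and other mapping settings are ignored.

```python
def has_value(source_value, null_value=None) -> bool:
    """Simplified model: would this JSON value count as 'existing' for an exists query?

    `null_value` stands in for a custom null_value in the field mapping;
    None means no custom null_value is configured.
    """
    if source_value is None:
        # An explicit null only counts if the mapping replaces it with a value.
        return null_value is not None
    if isinstance(source_value, list):
        # An array exists if at least one of its elements does;
        # an empty array never does, even with a custom null_value.
        return any(has_value(item, null_value) for item in source_value)
    # Any other value, including the empty string, counts as existing.
    return True


samples = [None, [], "", [None, "foo"], [None]]
for sample in samples:
    print(repr(sample), "->", has_value(sample))
```

Calling has_value(None, null_value="NULL") models the last bullet: with a custom null_value in the mapping, an explicit null is indexed as that value and the field is treated as existing.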

Notes

Because a bool query combines multiple queries, the exists query can be used inside it. Placing the exists query in the must_not clause of a bool query locates the documents that are missing an indexed value for a field.

The following query returns the documents from the “my_index” index that have no indexed value for the “my_field” field.

GET my_index/_search
{
  "query": {
    "bool": {
      "must_not": {
        "exists": {
          "field": "my_field"
        }
      }
    }
  }
}
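
Since both forms live inside a bool query, they can also be combined. The following Python sketch builds a body that requires one field to exist and another to be missing; the function name and the field names (email, phone) are hypothetical.

```python
import json


def present_and_missing_query(present_field: str, missing_field: str) -> dict:
    """Build a body matching documents that have `present_field` but lack `missing_field`."""
    return {
        "query": {
            "bool": {
                # Keep documents with an indexed value for present_field ...
                "filter": [{"exists": {"field": present_field}}],
                # ... and drop those with an indexed value for missing_field.
                "must_not": [{"exists": {"field": missing_field}}],
            }
        }
    }


body = present_and_missing_query("email", "phone")
print(json.dumps(body, indent=2))
```

Both clauses act as filters here, so the query only narrows the result set and does not influence relevance scoring.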

KQL Exists Query 

The Kibana Query Language (KQL) provides a simple syntax for searching and filtering OpenSearch data using either free-text search or field-based search. KQL is used exclusively to filter data; it plays no part in sorting or aggregating it. KQL has an equivalent of the DSL exists query that works in the same way.

The KQL exists query matches documents that contain any value for a field. In the example below, the documents that have any value for the field user will be matched.

user:*

Existence is determined by OpenSearch and covers all values, including empty text. As the example shows, a KQL exists query is written as the field name followed by :*, which means: match the documents in which the specified field exists in any form.

To locate documents that are missing an indexed value for a field, put “not” before the field name. In the example below, the documents that have no value for the user field will be matched.

not user:*
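
The correspondence between the two KQL forms and their DSL counterparts can be sketched as a tiny translator in Python. This is purely illustrative: it recognizes only the two expressions shown in this section, it is not how Kibana parses KQL, and the function name kql_exists_to_dsl is hypothetical.

```python
import re

# Matches "field:*" with an optional leading "not"; nothing else is supported.
_EXISTS_PATTERN = re.compile(
    r"^\s*(?P<neg>not\s+)?(?P<field>[\w.]+)\s*:\s*\*\s*$", re.IGNORECASE
)


def kql_exists_to_dsl(expression: str) -> dict:
    """Translate 'field:*' or 'not field:*' into the corresponding DSL clause."""
    match = _EXISTS_PATTERN.match(expression)
    if match is None:
        raise ValueError(f"unsupported expression: {expression!r}")
    exists_clause = {"exists": {"field": match.group("field")}}
    if match.group("neg"):
        # Negation maps to the must_not clause of a bool query.
        return {"bool": {"must_not": [exists_clause]}}
    return exists_clause


print(kql_exists_to_dsl("user:*"))
print(kql_exists_to_dsl("not user:*"))
```

The first call yields a plain exists clause and the second wraps it in must_not, matching the two DSL queries shown earlier in this guide.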

Summary

  • The exists query is a DSL leaf query clause and a term-level query.
  • It returns the documents that have an indexed value for a specific field, since some fields in documents may have no indexed value.
  • It has one required parameter, field, which names the field to check for an indexed value.
  • The exists query can be used inside a bool query.
  • Placing the exists query in the must_not clause of a bool query locates the documents that are missing an indexed value for a field.
  • KQL has an equivalent of the DSL exists query that works in the same way and matches documents that contain any value for a field.