Elasticsearch Query Nested

By Opster Team

Updated: Aug 26, 2023

Introduction

Nested queries in Elasticsearch let you search arrays of objects inside complex JSON documents. They are particularly useful when each object in the array must be matched on its own, as if it were a separate document.

Understanding nested data structure

Before diving into nested queries, it’s important to understand the nested data structure in Elasticsearch. The nested data type is a specialized version of the object data type that allows the objects in an array to be indexed and queried independently of each other. This contrasts with the standard object data type, which flattens the array: the values of each field across all objects are merged into a single multi-valued field on the parent document, so the association between fields that belonged to the same object is lost.

For example, consider a document representing a book, with an array of authors. Each author has a name and an age. If the authors field is of type object, querying for books where one of the authors is named “John” and is 30 years old would return a book even if there is no single author named “John” who is 30. It is enough that some author is named “John” and some other author is 30, because the object data type stores all names and all ages of the book without recording which name belongs to which age.

On the other hand, if the authors field is of type nested, the same query would only return books where there is a single author named “John” who is 30. This is because the nested data type treats each object in the array as a separate hidden document.
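
To make the difference concrete, here is a hypothetical book document and a conceptual sketch of how its authors array is stored when the field is of type object. The title and author values are invented for illustration, and the flattened form is a simplification (for instance, it ignores text analysis such as lowercasing):

```console
# A sample book with two authors (illustrative values)
{
  "title": "The Silence",
  "authors": [
    { "name": "John",  "age": 25 },
    { "name": "Alice", "age": 30 }
  ]
}

# Conceptual view of the same document when "authors" is of type object:
# the pairing between each name and its age is gone
{
  "title": "The Silence",
  "authors.name": [ "John", "Alice" ],
  "authors.age":  [ 25, 30 ]
}
```

A search for an author named “John” who is 30 matches the flattened form, even though John is 25 and it is Alice who is 30. With the nested type, each author is kept as its own hidden document, so the same search does not match this book.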

Creating nested mappings

To use nested queries, you first need to map the field as nested in your index. This can be done when creating the index with the create index API. Here is an example:

PUT /my_index
{
  "mappings": {
    "properties": {
      "title": {
        "type": "text"
      },
      "authors": {
        "type": "nested",
        "properties": {
          "name": { "type": "text" },
          "age": { "type": "integer" }
        }
      }
    }
  }
}

In this example, there is the top-level title field of type text as well as the authors field of type nested, and each author object has a name and an age.
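
To try the mapping out, you could index a sample document. The document ID and all field values below are made up for illustration:

```console
PUT /my_index/_doc/1
{
  "title": "The Silence",
  "authors": [
    { "name": "John",  "age": 35 },
    { "name": "Alice", "age": 28 }
  ]
}
```

The request looks the same as for an object field. The difference is in how Elasticsearch stores it: each element of authors is indexed as a separate hidden document alongside the book.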

Executing nested queries

Once you have a nested mapping, you can execute nested queries using the nested query clause. Here is an example:

GET /my_index/_search
{
  "query": {
    "nested": {
      "path": "authors",
      "query": {
        "bool": {
          "must": [
            { "match": { "authors.name": "John" } },
            { "range": { "authors.age": { "gte": 30 } } }
          ]
        }
      }
    }
  }
}

In this example, the nested query looks for documents where there is an author named “John” who is at least 30 years old. The path parameter specifies the path to the nested field, and the query parameter specifies the query to execute against the nested documents.
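
Each search hit is the whole book, not the individual author that matched. If you also want to see which author objects satisfied the query, the nested query accepts an optional inner_hits parameter. A minimal sketch, reusing the same query and assuming the default inner_hits settings are sufficient:

```console
GET /my_index/_search
{
  "query": {
    "nested": {
      "path": "authors",
      "query": {
        "bool": {
          "must": [
            { "match": { "authors.name": "John" } },
            { "range": { "authors.age": { "gte": 30 } } }
          ]
        }
      },
      "inner_hits": {}
    }
  }
}
```

With this option, each hit in the response carries an additional inner_hits section listing the matching nested author objects.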

Nested queries can be combined with other query clauses to create complex queries. For example, you could use a bool query to combine a nested query with a match query, looking for books where one of the authors is named “John” and is at least 30 years old, and the book’s title contains the word “Silence”:

GET /my_index/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "title": "Silence"
          }
        },
        {
          "nested": {
            "path": "authors",
            "query": {
              "bool": {
                "must": [
                  { "match": { "authors.name": "John" } },
                  { "range": { "authors.age": { "gte": 30 } } }
                ]
              }
            }
          }
        }
      ]
    }
  }
}

Nested aggregations

In addition to nested queries, Elasticsearch also supports nested aggregations. These allow you to compute aggregations on the nested documents. For example, you could compute the average age of all authors.

Here is an example of a nested aggregation:

GET /my_index/_search
{
  "aggs": {
    "authors": {
      "nested": {
        "path": "authors"
      },
      "aggs": {
        "avg_age": {
          "avg": {
            "field": "authors.age"
          }
        }
      }
    }
  }
}

In this example, the nested aggregation computes the average age of all authors. The nested clause specifies the path to the nested field, and the avg clause specifies the aggregation to compute.
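
If you only want the average age of authors named “John”, one option is to place a filter aggregation inside the nested aggregation. The following is a sketch rather than the only way to do it. The aggregation names (authors, only_johns, avg_age) are chosen for illustration, and "size": 0 is added on the assumption that you only need the aggregation result and not the search hits:

```console
GET /my_index/_search
{
  "size": 0,
  "aggs": {
    "authors": {
      "nested": { "path": "authors" },
      "aggs": {
        "only_johns": {
          "filter": { "match": { "authors.name": "John" } },
          "aggs": {
            "avg_age": { "avg": { "field": "authors.age" } }
          }
        }
      }
    }
  }
}
```

The filter runs against the individual nested author documents, so the average is computed only over the authors that match it, not over every author of a book that happens to have a John among its authors.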