Elasticsearch JSON Array

By Opster Team

Updated: Aug 28, 2023

Introduction

Elasticsearch accepts JSON arrays in documents without any special configuration. This article covers how to index and query JSON arrays, and how to handle arrays of objects with the nested field type.

Indexing JSON arrays

Elasticsearch has no dedicated array data type. Instead, any field can hold zero or more values by default, so a JSON array is simply indexed as a multivalue field. The one constraint is that all values in an array must be of the same data type, as defined by the field’s mapping.

Here’s an example of how you might index a JSON array in Elasticsearch:

```json
PUT /my_index/_doc/1
{
  "user": "John Doe",
  "user_tags": ["elasticsearch", "json", "arrays"]
}
```

In this example, the `user_tags` field is a JSON array containing three elements.
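Because there is no array type, a mapping never mentions arrays at all: a field is declared with the type of its individual values. As a minimal sketch, assuming you prefer to define the mapping explicitly instead of relying on dynamic mapping, and assuming tags should be matched exactly (the `keyword` type is a choice made here for illustration), the index could be created like this before any document is indexed:

```json
# Assumes my_index does not exist yet.
# Nothing here marks user_tags as an array; any field may hold several values.
PUT /my_index
{
  "mappings": {
    "properties": {
      "user":      { "type": "text" },
      "user_tags": { "type": "keyword" }
    }
  }
}
```

With this mapping, both `"user_tags": "json"` and `"user_tags": ["elasticsearch", "json", "arrays"]` are valid values for the same field.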

Querying JSON arrays

The original array is still returned as-is in `_source`, but for search purposes the array structure is not maintained. Each value in the array is indexed as a separate value of the field, and a query matches the document if any one of those values matches.

For instance, if you want to find all documents where `user_tags` contains “json”, you could use a match query like this:

```json
GET /my_index/_search
{
  "query": {
    "match": {
      "user_tags": "json"
    }
  }
}
```

This query would match the document we indexed earlier, as “json” is one of the values in the `user_tags` array.
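The match query analyzes its input, which may be more permissive than you want for tag-like values. As a hedged sketch, assuming the index relied on default dynamic mapping (which indexes strings as `text` with a `keyword` subfield), exact matching could use the `term` and `terms` queries. The value “yaml” is a made-up example; if `user_tags` is itself mapped as `keyword`, query `user_tags` directly instead of `user_tags.keyword`.

```json
# Exact match on a single array value
GET /my_index/_search
{
  "query": {
    "term": { "user_tags.keyword": "json" }
  }
}

# Match documents whose array contains at least one of the listed values
GET /my_index/_search
{
  "query": {
    "terms": { "user_tags.keyword": ["json", "yaml"] }
  }
}
```

Both queries treat the array exactly like a single-valued field: one matching element is enough for the document to be returned.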

Dealing with nested JSON arrays

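Things get more complex when an array contains objects instead of simple values. By default, Elasticsearch maps such a field as an `object` and flattens the array, so the association between the fields of each individual object is lost. The `nested` field type is a specialized version of `object` that indexes each object in the array as a separate hidden document, which allows the objects to be queried independently of one another.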
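Dynamic mapping never chooses the `nested` type on its own, so the field has to be mapped explicitly before the first document is indexed. A minimal sketch, assuming `my_index` is created fresh (an index that already holds documents from the earlier examples would need to be deleted or given a different name, since the type of an existing field cannot be changed) and assuming `keyword` subfields, which are an illustrative choice:

```json
PUT /my_index
{
  "mappings": {
    "properties": {
      "user": { "type": "text" },
      "user_tags": {
        "type": "nested",
        "properties": {
          "tag":            { "type": "keyword" },
          "tag_importance": { "type": "keyword" }
        }
      }
    }
  }
}
```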

Here’s an example of a document with a nested JSON array:

```json
PUT /my_index/_doc/1
{
  "user": "John Doe",
  "user_tags": [
    {
      "tag": "elasticsearch",
      "tag_importance": "high"
    },
    {
      "tag": "json",
      "tag_importance": "medium"
    },
    {
      "tag": "arrays",
      "tag_importance": "low"
    }
  ]
}
```

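In this example, `user_tags` is an array of objects, each with two fields: `tag` and `tag_importance`. Elasticsearch treats the array as nested only if `user_tags` is mapped with the `nested` type before the document is indexed; under default dynamic mapping it becomes a regular `object` field.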
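To see why the mapping matters, here is roughly how the document above would be indexed if `user_tags` were a plain `object` field. This is a conceptual representation of the flattened field values, not an actual API response:

```json
{
  "user": "John Doe",
  "user_tags.tag":            ["elasticsearch", "json", "arrays"],
  "user_tags.tag_importance": ["high", "medium", "low"]
}
```

In this flattened form nothing records that “json” belonged with “medium”. A bool query for `user_tags.tag` equal to “json” and `user_tags.tag_importance` equal to “high” would match the document, even though no single object contains that combination.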

To query nested fields, use the nested query, which only works on fields mapped as `nested`. For instance, to find all documents that have a `user_tags` object with `tag` equal to “json” and `tag_importance` equal to “medium”, you could use a nested query like this:

```json
GET /my_index/_search
{
  "query": {
    "nested": {
      "path": "user_tags",
      "query": {
        "bool": {
          "must": [
            { "match": { "user_tags.tag": "json" } },
            { "match": { "user_tags.tag_importance": "medium" } }
          ]
        }
      }
    }
  }
}
```

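This query would match the document we indexed earlier, as one of the `user_tags` objects has `tag` equal to “json” and `tag_importance` equal to “medium”. Both conditions must be satisfied by the same object, so searching for “json” together with “high” would return no hits.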
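A nested query returns the whole parent document, not the individual object that matched. If you also need to know which array element satisfied the query, one option is the `inner_hits` parameter of the nested query. A minimal sketch, assuming the same index and a `user_tags` field mapped as `nested`:

```json
GET /my_index/_search
{
  "query": {
    "nested": {
      "path": "user_tags",
      "query": {
        "match": { "user_tags.tag": "json" }
      },
      "inner_hits": {}
    }
  }
}
```

Each hit in the response then carries an `inner_hits` section listing the matching nested objects alongside the parent document.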