Elasticsearch Return Count: Efficiently Counting Documents in Elasticsearch

By Opster Team

Updated: Jul 23, 2023

Introduction

When working with Elasticsearch, you often need to count the documents that match a specific query or to retrieve the total number of documents in an index. This article covers three ways to get counts efficiently: the Count API, the Search API with size set to 0, and the cardinality aggregation, which counts the unique values of a field rather than documents.

1. Count API

The Count API is a dedicated API for counting the number of documents that match a query. Its response contains only the count and shard statistics, with no hits or aggregation results, which makes it the most direct way to get a document count and typically at least as cheap as using the Search API with size set to 0. The Count API can be used with or without a query: with a query, it returns the number of matching documents; without one, it returns the total number of documents in the index.

To use the Count API, send a GET request to the following endpoint:

GET /<index>/_count

If you want to count documents that match a specific query, include the query in the request body:

GET /<index>/_count
{
  "query": {
    "match": {
      "field": "value"
    }
  }
}
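
As a rough illustration of calling the Count API from a client, the following Python sketch builds the request and reads the `count` field from the response. It uses only the standard library. The function names, the `base_url` argument, and the index name are hypothetical and chosen for this example. The request is sent as POST when a query body is present, which the Count API also accepts and which avoids problems with HTTP clients or proxies that drop the body of a GET request.

```python
import json
import urllib.request


def build_count_request(base_url, index, query=None):
    """Build (but do not send) a request for /<index>/_count.

    `base_url` and `index` are placeholders for your own cluster and index.
    """
    url = f"{base_url.rstrip('/')}/{index}/_count"
    if query is None:
        # No body: counts every document in the index.
        return urllib.request.Request(url, method="GET")
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


def parse_count(response):
    """Extract the document count from a decoded Count API response."""
    return response["count"]


def count_documents(base_url, index, query=None):
    """Send the request and return the count (needs a reachable cluster)."""
    request = build_count_request(base_url, index, query)
    with urllib.request.urlopen(request) as http_response:
        return parse_count(json.load(http_response))
```

Assuming an unsecured local cluster, a call such as `count_documents("http://localhost:9200", "my-index", {"match": {"field": "value"}})` would return the count as an integer. A secured cluster would additionally need authentication headers, which this sketch leaves out.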

2. Search API with size set to 0

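Another method to count documents in Elasticsearch is to use the Search API with the size parameter set to 0. The request follows the regular search path, but because no hits are requested it returns no document data; the count is reported in hits.total.value of the response. This method is generally no more efficient than the Count API, and it is mainly useful when you also need aggregations in the same request. Note that since Elasticsearch 7.0 the total is tracked accurately only up to 10,000 hits by default: when hits.total.relation is "gte", the value is a lower bound, and you need to add track_total_hits=true to the request to get an exact count.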

To use the Search API with size set to 0, send a GET request to the following endpoint:

GET /<index>/_search?size=0

If you want to count documents that match a specific query, include the query in the request body:

GET /<index>/_search?size=0
{
  "query": {
    "match": {
      "field": "value"
    }
  }
}
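
When reading the count from a search response, the client has to look at both `hits.total.value` and `hits.total.relation`. The following Python sketch shows one way to do that. The helper names are hypothetical. The request body sets `track_total_hits` to true so that totals above the default 10,000 limit are exact, and the parser also handles the plain-integer total returned by versions before 7.0 or when `rest_total_hits_as_int=true` is used.

```python
def build_search_count_body(query=None, exact=True):
    """Build a search body that returns no hits, only the total."""
    body = {"size": 0}
    if exact:
        # Without this, totals above 10,000 are reported as a lower bound.
        body["track_total_hits"] = True
    if query is not None:
        body["query"] = query
    return body


def parse_total_hits(response):
    """Return (count, is_exact) from a decoded Search API response."""
    total = response["hits"]["total"]
    if isinstance(total, int):
        # Pre-7.0 responses, or rest_total_hits_as_int=true.
        return total, True
    # relation is "eq" for an exact count and "gte" for a lower bound.
    return total["value"], total["relation"] == "eq"
```

Returning the exactness flag alongside the count lets the caller decide whether a lower bound is good enough, for example when displaying "10,000+ results".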

3. Cardinality Aggregation

The cardinality aggregation counts the number of unique values in a field, rather than the number of documents. It uses the HyperLogLog++ algorithm, so the result is an approximate count whose precision is configurable.

To use the cardinality aggregation, add it to the aggs section of a Search API request. Setting size to 0 omits the hits, so only the aggregation result is returned:

GET /<index>/_search?size=0
{
  "aggs": {
    "unique_count": {
      "cardinality": {
        "field": "field"
      }
    }
  }
}

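To control the precision of the cardinality aggregation, you can set the `precision_threshold` parameter. Counts below the threshold are expected to be close to exact, while counts above it may be less accurate. The default is 3000 and the maximum supported value is 40000; a higher value results in a more accurate count but consumes more memory: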

GET /<index>/_search?size=0
{
  "aggs": {
    "unique_count": {
      "cardinality": {
        "field": "field",
        "precision_threshold": 1000
      }
    }
  }
}
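
The two requests above can be generated and their result read with a small helper, sketched here in Python. The function names and the default aggregation name `unique_count` (taken from the examples above) are illustrative. Elasticsearch treats thresholds above 40000 the same as 40000, so the sketch clamps the value to make that explicit.

```python
MAX_PRECISION_THRESHOLD = 40000  # thresholds above this have no extra effect


def build_cardinality_body(field, precision_threshold=None, agg_name="unique_count"):
    """Build a search body with a single cardinality aggregation."""
    cardinality = {"field": field}
    if precision_threshold is not None:
        # Clamp to the maximum supported value.
        cardinality["precision_threshold"] = min(
            precision_threshold, MAX_PRECISION_THRESHOLD
        )
    return {"size": 0, "aggs": {agg_name: {"cardinality": cardinality}}}


def parse_cardinality(response, agg_name="unique_count"):
    """Return the approximate unique-value count from a search response."""
    return response["aggregations"][agg_name]["value"]
```

The returned value is an estimate, so it should not be compared for strict equality against an exact count computed elsewhere.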

Conclusion 

In conclusion, there are multiple methods to count documents in Elasticsearch, each with its own advantages and use cases. The Count API is the most direct method for counting documents that match a query or retrieving the total number of documents in an index. The Search API with size set to 0 can also be used for counting documents, which is convenient when aggregations are needed in the same request, but it is no more efficient than the Count API and reports exact totals above 10,000 only when track_total_hits is enabled. Finally, the cardinality aggregation is useful for counting the number of unique values in a specific field, as an approximation with configurable precision.