Elasticsearch Terms Stats

By Opster Team

Updated: Aug 26, 2023


Introduction 

Elasticsearch can compute statistics for each distinct term in a field by combining the `terms` and `stats` aggregations. This is useful when you need to understand how a numeric value is distributed across the categories in your data.

The `terms` aggregation groups documents into buckets, one per distinct value of a field, and the `stats` sub-aggregation then computes the count, min, max, avg (mean), and sum of a numeric field within each bucket. Because Elasticsearch computes these values on the server side, you do not need to retrieve the underlying documents, which helps when dealing with large datasets.

How to use the Terms Stats feature in Elasticsearch

Step 1: Indexing the data

First, you need to index your data. For instance, if you have a dataset of products with fields such as ‘product_name’, ‘price’, and ‘quantity_sold’, you can index this data into Elasticsearch.
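As a rough illustration of this step, the Python sketch below builds a request body for the Elasticsearch `_bulk` API, which expects newline-delimited JSON: one action line followed by one document line per document. The sample documents and the helper name `build_bulk_body` are assumptions made for this example; only the field names and the ‘products’ index come from this article.

```python
import json

# Hypothetical sample documents; only the field names come from the article.
products = [
    {"product_name": "Product A", "price": 10.0, "quantity_sold": 3},
    {"product_name": "Product A", "price": 12.0, "quantity_sold": 5},
    {"product_name": "Product B", "price": 20.0, "quantity_sold": 1},
]


def build_bulk_body(index_name, docs):
    """Return an NDJSON string: an action line, then a source line, per document."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index_name}}))
        lines.append(json.dumps(doc))
    # The _bulk API requires the body to end with a newline character.
    return "\n".join(lines) + "\n"


bulk_body = build_bulk_body("products", products)
print(bulk_body)
```

The resulting string can be sent as the body of a `POST /_bulk` request with the `Content-Type: application/x-ndjson` header, using whichever HTTP client you prefer.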

Step 2: Using the Terms Stats feature

Once your data is indexed, you can compute statistics for each term in a specific field.

For example, to get price statistics for each distinct value of the ‘product_name’ field, you can use the following request:

GET /products/_search
{
  "aggs": {
    "product_stats": {
      "terms": {
        "field": "product_name.keyword"
      },
      "aggs": {
        "price_stats": {
          "stats": {
            "field": "price"
          }
        }
      }
    }
  }
}

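In this request, ‘product_stats’ is the name you choose for the aggregation, ‘terms’ is the type of aggregation, ‘product_name.keyword’ is the field on which the aggregation is performed, and ‘price_stats’ is the sub-aggregation that generates the statistics on the ‘price’ field. The ‘.keyword’ sub-field is used because a terms aggregation cannot run on an analyzed ‘text’ field by default; Elasticsearch's default dynamic mapping creates this sub-field for string values. By default, the terms aggregation returns only the 10 terms with the highest document counts; set its ‘size’ parameter to return more.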
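If you issue this request from application code, the same body can be built programmatically. The following Python sketch is one possible way to do it; the function name `terms_stats_body` and its `bucket_count` parameter are hypothetical names chosen for this example. It also sets the top-level `"size": 0`, which is not part of the request above, so that Elasticsearch returns only the aggregation results and no search hits.

```python
import json


def terms_stats_body(terms_field, stats_field, bucket_count=10):
    """Build a terms aggregation with a nested stats sub-aggregation."""
    return {
        "size": 0,  # assumption: skip search hits, return aggregations only
        "aggs": {
            "product_stats": {
                # bucket_count maps to the terms aggregation's "size" parameter
                "terms": {"field": terms_field, "size": bucket_count},
                "aggs": {
                    "price_stats": {"stats": {"field": stats_field}}
                },
            }
        },
    }


body = terms_stats_body("product_name.keyword", "price")
print(json.dumps(body, indent=2))
```

The printed JSON can be used as the body of `GET /products/_search`.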

Step 3: Interpreting the results

The response is a JSON object whose ‘aggregations’ section contains the ‘product_stats’ result. Its ‘buckets’ array holds one bucket per term (i.e. per product). Each bucket contains the term itself in ‘key’, the number of matching documents in ‘doc_count’, and a ‘price_stats’ object with the count, min, max, avg, and sum of the ‘price’ field for that term.

For example, if the ‘product_name’ field contains the terms ‘Product A’, ‘Product B’, and ‘Product C’, the result will contain three buckets, one for each term.
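To make the response shape concrete, the Python sketch below walks the ‘buckets’ array and collects the statistics per term. The `sample_response` dictionary is a made-up, trimmed example of what such a response looks like, not real output, and the helper name `summarize` is an assumption for this illustration.

```python
# Hypothetical response, trimmed to the aggregation part; the values are made up.
sample_response = {
    "aggregations": {
        "product_stats": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
                {
                    "key": "Product A",
                    "doc_count": 2,
                    "price_stats": {"count": 2, "min": 10.0, "max": 12.0, "avg": 11.0, "sum": 22.0},
                },
                {
                    "key": "Product B",
                    "doc_count": 1,
                    "price_stats": {"count": 1, "min": 20.0, "max": 20.0, "avg": 20.0, "sum": 20.0},
                },
            ],
        }
    }
}


def summarize(response, terms_agg="product_stats", stats_agg="price_stats"):
    """Map each term (bucket key) to the stats object computed for it."""
    buckets = response["aggregations"][terms_agg]["buckets"]
    return {bucket["key"]: bucket[stats_agg] for bucket in buckets}


summary = summarize(sample_response)
for name, stats in summary.items():
    print(f"{name}: avg={stats['avg']} min={stats['min']} max={stats['max']} n={stats['count']}")
```

The aggregation names passed to `summarize` must match the names used in the request, since Elasticsearch echoes them back as keys in the response.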

Step 4: Using the results

The results of this aggregation query can be used in various ways. For instance, you can compare the average price across products, spot products with an unusually wide gap between minimum and maximum price, and use these figures to support pricing decisions. You can also use the results to create visualizations and dashboards in Kibana, the data visualization tool that integrates with Elasticsearch.