By Opster Expert Team - Flávio Knob

Updated: Oct 19, 2023

| 6 min read

Improving aggregation performance in OpenSearch

Even though OpenSearch is most known for its full text search capabilities, many use cases also take advantage of another very powerful feature OpenSearch delivers out of the box: the aggregations framework.

Aggregations are used everywhere in OpenSearch Dashboards. Every dashboard with visualization that sums up data collected from the Beats agents uses aggregations. OpenSearch APM, which is OpenSearch alternative to instrumentation and performance monitoring of applications, has an app in OpenSearch Dashboards that also heavily uses aggregations to present data to the user. Same thing with the Machine Learning app and the SIEM.

Even if your search use case might not include an “aggs” section directly in the request body, you might still be using aggregations without noticing, in OpenSearch Dashboards or any other visualization tool.

If you do use aggregations in your use case (or if you are planning to), there are some caveats to pay attention to in order to avoid harmful design decisions and adjustments you can make to ensure your aggregations run as efficiently as possible.

How to improve OpenSearch aggregation performance:

Limit the scope by filtering documents out
Experiment with different sharding settings
Evaluate high-cardinality fields and global ordinals
Increase refresh interval
Set size parameter to 0
Take advantage of node/shard caching
Aggregate only what you need

1. Limit the scope by filtering documents out

First things first: the more documents you can filter out, the better, and that’s what you can achieve with a query clause. Being a distributed system, the more data OpenSearch has to reach in order to calculate the aggregated values you requested, the more orchestration and network overhead effort will be needed and will add to the final performance and processing time.

Even though some aggregated values are not an exact value, but rather an approximation, it is still worthwhile to reduce the volume of data OpenSearch should consider. So if your use case allows you to simply filter by, for example, a range query, considering your UI doesn’t even give the user filter options that represent buckets older than the past three months, take advantage of that and reduce the scope of your query in order to improve the aggregations processing.

GET sales/_search
{
  "size": 0,
  "query": {
    "range": {
      "sold_at": {
        "gte": "now-3M/y",
        "lte": "now"
      }
    }
  },
  "aggs": {
    "by-category": {
      "terms": {
        "field": "category",
        "size": 10
      }
    }
  }
}

2. Experiment with different sharding settings

Sharding is one of the key components of OpenSearch’s architecture that makes its distributed processing and high-availability features possible. When it comes to aggregations, having the correct number of shards can drastically improve the performance in the search phase of query. See this guide for tips on how to find the correct number of shards per index.

Trying to find the right number of shards for an index is actually not an easy task. Even OpenSearch considers this a “million dollar” question, for which the answer is often “It depends”. As there is no one correct answer for all use cases, it’s worth experimenting.

Considering that given OpenSearch’s index design it is not possible to change the index setting index.number_of_shards without having to reindex it. To change the number of shards you will need to reindex your index, but you should be able to do that as you wish, either to adapt to new scenarios or just to experiment, and you should be able to do that without any downtime.

One way to achieve that is to use the alias pattern. With this pattern you basically hide the actual indices behind an alias, which allows us to:

Trigger long running reindex processes, while keeping your current index version available.
Having different versions of your indices available, so you can easily change from one to another, whether for test/experimentation purposes, or to rollback due to a problem identified when the new version reaches production.
Conduct A/B testing.

When it comes to experimenting with sharding, the alias pattern allows us to trigger a full reindex to another index with a different index.number_of_shards setting, wait for this reindex to complete, then switch the alias, test the new version for a while and finally safely delete the old version of the index.

Another way to change the number of shards of your index is to use the shrink or split API. You’d use the first one to reduce the number of primary shards of an index and the later for increasing it. Those APIs abstract the process of creating a new index, changing its settings and moving the documents to it. They try to achieve that through a process that involves hard-linking the segments from the source index to the target index, when possible. Even though those APIs are probably faster than a reindex, they impose some requirements, such as:

The index must be read-only
Primary shards should reside (or be moved) to the same node
The index must have a green status
It is recommended that you remove all replica shards (you can bring them back once the operations completes)
The new number of shards should be a factor of the current set number of shards

Another common pitfall when it comes to setting the number of shards is to set it to an amount that actually causes the cluster to cross the boundary where sharding brings benefits with the distributed parallel processing, and starts creating overhead bottlenecks: oversharding.

That’s one reason why you should be able to freely experiment with the number of shards you set to an index and also be ready to revise the number of shards whenever your cluster grows on nodes.

3. Evaluate high-cardinality fields and global ordinals

In most cases, a terms aggregation will not present performance issues. Usually when you apply a terms aggregation to a field where the number of generated buckets is generally low, the cluster will have no problem calculating your metrics aggregations for each bucket. That’s the case if you, for example, aggregate by country, product category, gender and so on.

But whenever you need to perform a terms aggregation on a field for which there are too many unique possible values, that presents a challenge to the cluster known as high-cardinality terms aggregation issue. You can think of an e-commerce related use case, where the cluster needs to run an aggregation by bucketing by individual user e-mails (possibly millions, if we’re talking about a really successful e-commerce). As an example from another field, imagine an information security solution where you need to bucket and report a metric for each source IP address.

Terms aggregations rely on an internal data structure known as global ordinals. Those structures maintain statistics for each unique value of a given field. Those statistics are calculated and kept at the shard level and further combined in the reduce phase to produce a final result.

The performance of terms aggregations on a given field can be harmed as the number of unique values for that field increases, but also because by default those stats are lazily calculated, meaning the first aggregation run after the last refresh will have to calculate them. If you have a scenario where you have a heavy indexing rate of documents that contain fields with high-cardinality and you frequently execute terms aggregations on those fields, your cluster might be struggling with this issue, since the global ordinals will be frequently being recalculated.

To learn more about global ordinals and how to evaluate the impact on your aggregation performance, read this guide.

4. Increase refresh interval

Every time a document gets created, updated or even deleted, that operation is first executed on a temporary structure known as an indexing buffer. OpenSearch flushes the content of these temporary structures to other data structures on disk known as segments either every time the segment is full or when the limit of a defined refresh interval is reached. By default this refresh interval is set to every 1 second.

The problem with merging the segments into larger segments every so often is that this adds an overhead cost, one which you cannot avoid if you really need your recently ingested data to be available for searching as soon as possible. On the other hand, if you can increase the index’s refresh interval, that could represent not only a gain of performance for the data ingestion, but also could free up some of the cluster’s resources to be used for executing queries.

Other than that, less frequent refreshes means less frequent global ordinals calculation, which could also help to improve the overall aggregations execution performance.

The refresh interval is set on an index by changing the refresh_interval setting like this:

PUT my_index/_settings
{
  "index.refresh_interval": "5s"
}

5. Set size parameter to 0

OpenSearch recommends that whenever you don’t need the search hits, you set the size parameter to 0. That helps avoid filling the shard request cache, which can improve aggregation performance.

POST my-index-000001/_search
{
  "size": 0,
  "aggs": {
    "my-agg-name": {
      "terms": {
        "field": "my-field"
      }
    }
  }
}

6. Take advantage of node/shard caching

By default each request to the cluster will be assigned to any of the available and eligible nodes and shards.

The _search API can receive a preference parameter, which can influence how the cluster will select the node/shards that will be responsible for fulfilling the request. You can use this parameter for taking advantage of node/shard caching, since the cluster will route requests with the same provided preference string to the same shards in the same order, as long as the cluster state and selected shards do not change.

POST /my-index-000001/_search?preference=my-custom-shard-string
{
  "query": {
    "match": {
      "user.id": "kimchy"
    }
  },
  "aggs": {
    "my-agg-name": {
      "terms": {
        "field": "my-field"
      }
    }
  }
}

You can define any string that does not start with the character _ as your preference string. If the cluster state and selected shards do not change, searches using the same string value are routed to the same shards in the same order. Using a value that represents the user’s session is recommended, since it’s common for a user to trigger several queries in a row, only narrowing the results. This will prevent the search request from being distributed to another node and not taking advantage of node-level caching.

One special value you can set for the preference parameter is _local. With that the node that is currently resolving the request will try only to use shards locally available, if any, in order to avoid network overhead. If to solve the request it needs data from shards that are only available in another node, then it uses the adaptive replica selection as a fallback.

POST /my-index-000001/_search?preference=_local
{
  "query": {
    "match": {
      "user.id": "kimchy"
    }
  },
  "aggs": {
    "my-agg-name": {
      "terms": {
        "field": "my-field"
      }
    }
  }
}

You can also use the preference parameter to force or prefer that the request be routed to specific nodes, which can be useful if, for instance, you want to give special users special treatment, so you route their requests to nodes with the fastest hardware. Here are some of the other special values the preference parameter could receive:

_only_local
_local
_only_nodes:<node-id>,<node-id>
_prefer_nodes:<node-id>,<node-id>
_shards:<shard>,<shard>

7. Aggregate only what you need

It seems obvious, but it doesn’t hurt to remind you: aggregations can be very demanding on your cluster’s resources and because of that you should be well aware of what aggregations you are requesting and if you really need them. In critical cases, if your request is composed by both the query and aggregations sections, one failing aggregation can make your whole request fail.

So, make sure you really need all those aggs you’re requesting the cluster to solve for you.

Also, asking for a stats aggregation on the field, but only using the average? Change it to the avg aggregation. Again, be aware of what you need and ask the cluster only for that.