Elasticsearch OpenSearch Cross Cluster Search and Cross Cluster Replication

By Opster Expert Team - Gustavo

Updated: Jun 28, 2023

Introduction

This article introduces OpenSearch Cross Cluster Search (CCS) and Cross Cluster Replication (CCR) features.

Previously, we compared the OpenSearch and Elasticsearch Cross Cluster Search features. This article focuses solely on OpenSearch CCS and CCR.

Cross Cluster Search (CCS)

Diagram explaining Cross Cluster Search (CCS) in OpenSearch.

Cross cluster search lets a coordinating cluster run searches against one or more remote clusters, so queries across multiple clusters can be issued from a single point of access.

First, create the roles required for CCS on the remote cluster.

The user has to be created in both clusters, but only the remote cluster has to have role permissions.

The coordinating cluster will validate user authentication, and then the remote one will validate that users have permissions to run the query against the index.

Creating a role.

Keep in mind that role definitions must be created on the remote cluster.

After the new user is created, open the role again to map the new user:

Mapping a new user in OpenSearch cross-cluster.
Adding the new user in OpenSearch cross cluster.

Remember that role definitions must be created on the remote cluster.
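If you prefer scripting over the Dashboards UI, the same role, user, and mapping can be created through the security plugin REST API. The following is a minimal sketch under assumptions, not the only way to do it: the role name cross_cluster, the index pattern, the host, and the admin credentials are placeholders, and the two allowed actions are simply the ones named in the permission error shown later in this article. Nothing is sent until you call apply_ccs_security yourself.

```shell
# Hypothetical values; adjust them to your environment.
REMOTE_URL="${REMOTE_URL:-https://remote-cluster-node-ip:9200}"
ADMIN_AUTH="${ADMIN_AUTH:-admin:admin}"

# Role body: search-only access to a single index pattern.
ccs_role_body() {
  local index_pattern="$1"
  printf '{"index_permissions":[{"index_patterns":["%s"],"allowed_actions":["indices:admin/shards/search_shards","indices:data/read/search"]}]}' "$index_pattern"
}

# Role mapping body: attach one user to the role.
ccs_mapping_body() {
  local user="$1"
  printf '{"users":["%s"]}' "$user"
}

# Creates the role, the user, and the mapping on the remote cluster.
apply_ccs_security() {
  local user="$1" password="$2" index_pattern="$3"
  curl -k -XPUT -u "$ADMIN_AUTH" -H 'Content-Type: application/json' \
    "$REMOTE_URL/_plugins/_security/api/roles/cross_cluster" \
    -d "$(ccs_role_body "$index_pattern")"
  curl -k -XPUT -u "$ADMIN_AUTH" -H 'Content-Type: application/json' \
    "$REMOTE_URL/_plugins/_security/api/internalusers/$user" \
    -d "$(printf '{"password":"%s"}' "$password")"
  curl -k -XPUT -u "$ADMIN_AUTH" -H 'Content-Type: application/json' \
    "$REMOTE_URL/_plugins/_security/api/rolesmapping/cross_cluster" \
    -d "$(ccs_mapping_body "$user")"
}
```

A call such as apply_ccs_security cross_cluster_user '<password>' some_index would run all three requests against the remote cluster. On the coordinating cluster only the user itself is needed, so the internalusers request alone would be repeated there.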

Next, users must configure the remote cluster on the coordinating one:

curl -k -XPUT -H 'Content-Type: application/json' -u 'admin:admin' 'https://localhost:9200/_cluster/settings' -d '
{
  "persistent": {
    "cluster.remote": {
      "opensearch-remote-cluster": {
        "seeds": ["<remote-cluster-node-ip>:9300"]
      }
    }
  }
}'

Note the port: 9300 is used instead of 9200 because the remote connection goes over the transport layer (node-to-node communication). Port 9200 is for HTTP communication (client-to-node communication).
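Before querying, it can help to confirm that the coordinating cluster actually reached the remote one. The sketch below registers a remote and then calls the _remote/info endpoint; the function names, the base URL argument, and the admin:admin credentials are illustrative assumptions.

```shell
# Build the persistent settings body for one connection alias and one seed node.
remote_settings_body() {
  local conn="$1" seed="$2"
  printf '{"persistent":{"cluster.remote":{"%s":{"seeds":["%s"]}}}}' "$conn" "$seed"
}

# Register the remote cluster, then list the remotes known to this cluster.
register_and_check_remote() {
  local base_url="$1" conn="$2" seed="$3"
  curl -k -XPUT -u 'admin:admin' -H 'Content-Type: application/json' \
    "$base_url/_cluster/settings" \
    -d "$(remote_settings_body "$conn" "$seed")"
  curl -k -XGET -u 'admin:admin' "$base_url/_remote/info?pretty"
}
```

For example, register_and_check_remote https://localhost:9200 opensearch-remote-cluster '<remote-cluster-node-ip>:9300' should end with a response that lists the alias and whether it is connected.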

Now you can run queries against the cluster you just registered, opensearch-remote-cluster.

We will use the user we just created, cross_cluster_user. Remember to create it in both clusters, with the cross_cluster role mapped in the remote cluster. No additional roles are required in the coordinating cluster, because it only authenticates the user; the remote cluster checks the permissions.

First, let’s create an index on the remote cluster with a user that has permission to do so:

curl -XPUT -k -u 'user:password' 'https://remote-cluster-node-ip:9200/some_index'

This user needs to exist only in the remote cluster and must be allowed to create indices.

And now, from the coordinating cluster, run a query against the remote one:

curl -XGET -k -u cross_cluster_user:password 'https://localhost:9200/opensearch-remote-cluster:some_index/_search?pretty'

If you see this error: 

no permissions for [indices:admin/shards/search_shards, indices:data/read/search] and User [name=cross_cluster_user, roles=[], requestedTenant=null]

It means the user hasn’t been mapped to the role you created in the remote cluster.
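The query above targets a single remote index, but the alias:index prefix also works for several indices in one request. As a small sketch, the hypothetical helper below builds that comma-separated target list; ccs_search and its arguments are illustrative names, not part of OpenSearch.

```shell
# Prefix each index with the connection alias and join the results with commas.
ccs_target() {
  local conn="$1"; shift
  local out="" idx
  for idx in "$@"; do
    out="${out:+$out,}$conn:$idx"
  done
  printf '%s' "$out"
}

# Search one or more remote indices from the coordinating cluster.
ccs_search() {
  local base_url="$1" auth="$2" conn="$3"; shift 3
  curl -k -XGET -u "$auth" "$base_url/$(ccs_target "$conn" "$@")/_search?pretty"
}
```

For instance, ccs_target opensearch-remote-cluster some_index other_index yields opensearch-remote-cluster:some_index,opensearch-remote-cluster:other_index. Local indices can be mixed into the same list by adding them without a prefix.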

Summary of CCS

Cross cluster search (CCS) enables users to execute searches across multiple clusters from a single point of access. To set up CCS, create the users in both the coordinating and remote clusters, and make sure the remote cluster has the necessary role permissions.

Once user authentication and permissions have been verified, configuring the remote cluster on the coordinating cluster is the next step, followed by running queries against the registered cluster using the newly created user.

Cross Cluster Replication (CCR) 

Diagram explaining Cross Cluster Replication (CCR) in OpenSearch.

Cross cluster replication allows you to mirror indices on different clusters. The original index acts as the “leader”, and every document operation performed against it (create, update, delete) is replicated to the follower index in the remote cluster(s).

Note that replication occurs at the index level, so one index can be followed by many clusters, and replication can also be set up bi-directionally.

Some use cases for Cross Cluster Replication are: 

  • Having a backup cluster in case of problems
  • Placing a cluster closer to end users (lower latency)
  • Decoupling search load from indexing

Prerequisites for CCR

  • Replication plugin must be installed in all the clusters.
  • If you override the node.roles setting in your opensearch.yml file, you must add remote_cluster_client back on the follower cluster:

 node.roles: [<other_roles>, remote_cluster_client]

Permissions needed for CCR

OpenSearch ships with built-in roles for both the leader and follower clusters, which let non-admin users perform all the leader/follower activities (such as starting and stopping replication). These roles are sufficient in most cases.

Role permissions needed for CCR in OpenSearch.

If you want to go granular, you can create your own roles. The following permissions can be granted:

Follower cluster

indices:admin/plugins/replication/index/setup/validate
indices:admin/plugins/replication/index/start
indices:admin/plugins/replication/index/pause
indices:admin/plugins/replication/index/resume
indices:admin/plugins/replication/index/stop
indices:admin/plugins/replication/index/update
indices:admin/plugins/replication/index/status_check
indices:data/write/plugins/replication/changes
cluster:admin/plugins/replication/autofollow/update

Leader cluster

indices:admin/plugins/replication/validate
indices:data/read/plugins/replication/file_chunk
indices:data/read/plugins/replication/changes
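As an illustration of the granular approach, the follower permissions listed above could be packed into a custom role body for the security plugin REST API. This is a sketch under assumptions: the helper names and the role name custom_replication_follower are hypothetical, and the index pattern should be narrowed to your own follower indices.

```shell
# Join the arguments into a JSON array of strings.
json_array() {
  local out="" item
  for item in "$@"; do
    out="${out:+$out,}\"$item\""
  done
  printf '[%s]' "$out"
}

# Role body carrying the follower-side permissions from the list above.
follower_role_body() {
  local index_pattern="$1" actions
  actions=$(json_array \
    indices:admin/plugins/replication/index/setup/validate \
    indices:admin/plugins/replication/index/start \
    indices:admin/plugins/replication/index/pause \
    indices:admin/plugins/replication/index/resume \
    indices:admin/plugins/replication/index/stop \
    indices:admin/plugins/replication/index/update \
    indices:admin/plugins/replication/index/status_check \
    indices:data/write/plugins/replication/changes)
  printf '{"cluster_permissions":["cluster:admin/plugins/replication/autofollow/update"],"index_permissions":[{"index_patterns":["%s"],"allowed_actions":%s}]}' \
    "$index_pattern" "$actions"
}

# Create the role on the follower cluster (hypothetical role name).
create_follower_role() {
  local base_url="$1" index_pattern="$2"
  curl -k -XPUT -u 'admin:admin' -H 'Content-Type: application/json' \
    "$base_url/_plugins/_security/api/roles/custom_replication_follower" \
    -d "$(follower_role_body "$index_pattern")"
}
```

A leader-side role can be built the same way from the three leader permissions, and either role name can then be passed in the use_roles section when starting replication.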

How to configure CCR in OpenSearch

  1. Create the follower cluster connection with the leader cluster.
  2. Create the leader index.
  3. Replicate to follower index.

Optional

  1. Confirm replication
  2. Pause replication 
  3. Resume replication 
  4. Stop replication
  5. Auto-Follow

1. Create the connection with the leader

Start creating the connection in the follower cluster:

curl -XPUT -k -H 'Content-Type: application/json' -u 'admin:admin' 'https://follower-ip-address:9200/_cluster/settings?pretty' -d '
{
  "persistent": {
    "cluster": {
      "remote": {
        "my-connection-alias": {
          "seeds": ["<leader-ip-address>:9300"]
        }
      }
    }
  }
}'

Note the port: 9300 is used instead of 9200 because this connection goes over the transport layer (node-to-node communication). Port 9200 is for HTTP communication (client-to-node communication).

2. Create leader index

The leader index is a regular OpenSearch index, and must be created on the leader cluster.

curl -XPUT -k -H 'Content-Type: application/json' -u 'admin:admin' 'https://leader-ip-address:9200/leader-01?pretty'

3. Replicate to follower index

curl -XPUT -k -H 'Content-Type: application/json' -u 'admin:admin' 'https://follower-ip-address:9200/_plugins/_replication/follower-01/_start?pretty' -d '
{
   "leader_alias": "my-connection-alias",
   "leader_index": "leader-01",
   "use_roles":{
      "leader_cluster_role": "cross_cluster_replication_leader_full_access",
      "follower_cluster_role": "cross_cluster_replication_follower_full_access"
   }
}'

If the security plugin is disabled, omit the use_roles parameter. If it’s enabled, however, you must specify the leader and follower cluster roles that OpenSearch will use to authenticate the request.

Now the leader-01 index from the leader cluster will be replicated on the follower-01 index on the follower cluster.

4. Optional Activities

To remove a follower index you must first stop the replication.

a. Confirm replication

curl -XGET -k -u 'admin:admin' 'https://follower-ip-address:9200/_plugins/_replication/follower-01/_status?pretty'
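In a script, the status response can be checked without extra tooling. The sketch below assumes the response contains a top-level "status" string field and that "SYNCING" is the value reported while replication is running normally; verify both against the output of your own cluster. The helper names are hypothetical.

```shell
# Extract a string field from a JSON document read on stdin (no jq required).
json_string_field() {
  local field="$1"
  sed -n "s/.*\"$field\"[[:space:]]*:[[:space:]]*\"\([^\"]*\)\".*/\1/p" | head -n 1
}

# Succeeds only when the status read from stdin is SYNCING (assumed value).
replication_is_syncing() {
  [ "$(json_string_field status)" = "SYNCING" ]
}

# Fetch the status of a follower index and report whether it is syncing.
check_follower() {
  local base_url="$1" index="$2"
  if curl -s -k -u 'admin:admin' "$base_url/_plugins/_replication/$index/_status" | replication_is_syncing; then
    echo "$index is syncing"
  else
    echo "$index is not syncing" >&2
    return 1
  fi
}
```

Calling check_follower https://follower-ip-address:9200 follower-01 returns a non-zero exit code when the index is paused, still bootstrapping, or not replicating, which makes it easy to use in monitoring scripts.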

b. Pause replication

curl -XPOST -k -H 'Content-Type: application/json' -u 'admin:admin' 'https://follower-ip-address:9200/_plugins/_replication/follower-01/_pause?pretty' -d '{}'

c. Resume replication

curl -XPOST -k -H 'Content-Type: application/json' -u 'admin:admin' 'https://follower-ip-address:9200/_plugins/_replication/follower-01/_resume?pretty' -d '{}'

d. Stop replication

curl -XPOST -k -H 'Content-Type: application/json' -u 'admin:admin' 'https://follower-ip-address:9200/_plugins/_replication/follower-01/_stop?pretty' -d '{}'
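Because a follower index can only be removed after replication has stopped, the two steps can be chained in a small script. This is a sketch with hypothetical helper names and placeholder admin:admin credentials; --fail makes curl return a non-zero exit code on HTTP errors so the delete is skipped if the stop request fails.

```shell
# Build the replication endpoint URL for a supported action.
replication_url() {
  local base_url="$1" index="$2" action="$3"
  case "$action" in
    pause|resume|stop|status) ;;
    *) echo "unsupported action: $action" >&2; return 1 ;;
  esac
  printf '%s/_plugins/_replication/%s/_%s' "$base_url" "$index" "$action"
}

# Stop replication first, then delete the former follower index.
remove_follower_index() {
  local base_url="$1" index="$2" url
  url=$(replication_url "$base_url" "$index" stop) || return 1
  curl --fail -k -XPOST -u 'admin:admin' -H 'Content-Type: application/json' \
    "$url" -d '{}' &&
  curl --fail -k -XDELETE -u 'admin:admin' "$base_url/$index"
}
```

For example, remove_follower_index https://follower-ip-address:9200 follower-01 stops replication and then deletes follower-01 on the follower cluster. The leader index is not touched.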

e. Auto-Follow

In addition to replicating individual indices, you can create replication rules, which are sets of patterns that define when an index should be followed.

curl -XPOST -k -H 'Content-Type: application/json' -u 'admin:admin' 'https://follower-ip-address:9200/_plugins/_replication/_autofollow?pretty' -d '
{
   "leader_alias" : "my-connection-alias",
   "name": "my-replication-rule",
   "pattern": "movies*",
   "use_roles":{
      "leader_cluster_role": "all_access",
      "follower_cluster_role": "all_access"
   }
}'

This rule replicates every existing leader index whose name starts with movies to the follower cluster, as well as any matching index created afterwards.

To delete a replication rule, you must run the following:

curl -XDELETE -k -H 'Content-Type: application/json' -u 'admin:admin' 'https://follower-ip-address:9200/_plugins/_replication/_autofollow?pretty' -d '
{
   "leader_alias" : "my-connection-alias",
   "name": "my-replication-rule"
}'

This prevents new indices from being replicated, but the existing follower indices keep replicating. To stop replicating the existing ones, use the replication stop API after removing the rule.
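Stopping the remaining followers one by one can be tedious when a rule matched many indices. The sketch below lists the follower-cluster indices that match a pattern and calls the stop API for each; the function name and the AUTH variable are assumptions, and the pattern should only match indices that are really followers.

```shell
# Placeholder credentials; override AUTH in your environment.
AUTH="${AUTH:-admin:admin}"

# Stop replication for every index on the follower cluster matching a pattern.
stop_followers_matching() {
  local base_url="$1" pattern="$2" index
  # _cat/indices with h=index prints one index name per line.
  curl -s -k -u "$AUTH" "$base_url/_cat/indices/$pattern?h=index" |
  while read -r index; do
    [ -n "$index" ] || continue
    curl -s -k -XPOST -u "$AUTH" -H 'Content-Type: application/json' \
      "$base_url/_plugins/_replication/$index/_stop" -d '{}'
  done
}
```

With the rule from this section, stop_followers_matching https://follower-ip-address:9200 'movies*' would stop every follower index whose name starts with movies. Quote the pattern so the shell does not expand it.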

Summary of CCR

In conclusion, Cross Cluster Replication allows mirroring of indices across different clusters and offers several benefits:

  • Maintaining a backup cluster
  • Reducing latency with a cluster closer to users
  • Decoupling searching volume from indexing

In this article, we walked through how to configure CCR step by step:

  • Establish a connection with the leader cluster
  • Create the leader index
  • Replicate to the follower index

And to execute maintenance tasks:

  • Confirm replication
  • Pause replication
  • Resume replication
  • Stop replication
  • Auto-Follow

By understanding and implementing Cross Cluster Replication, organizations can enhance their data management, increase redundancy, and improve user experience.