
Cross-Cluster Search in Elasticsearch

Introduction

Cross-Cluster Search (CCS) in Elasticsearch lets you run a single search request against indices in one or more remote clusters, alongside indices in the local cluster. This is useful when scaling Elasticsearch horizontally: you can distribute your data across multiple clusters and still query it from a single entry point.

Setting Up Cross-Cluster Search

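To set up Cross-Cluster Search, you register each remote cluster on the local cluster (the one that receives the search request). You can do this statically in the elasticsearch.yml configuration file or dynamically through the cluster update settings API. Each remote cluster is given an alias and a list of seed nodes, and the seed addresses use the transport port (9300 by default), not the HTTP port.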

Example Configuration

In your elasticsearch.yml file, you can configure a remote cluster as follows:

    cluster:
      remote:
        cluster_one:
          seeds: ["127.0.0.1:9300"]
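The same registration can be done through the API instead of the configuration file. The following is a minimal sketch, assuming the body is sent with `PUT _cluster/settings`; the helper name `remote_cluster_settings` and its validation rules are illustrative, not part of Elasticsearch.

```python
import json


def remote_cluster_settings(alias, seeds):
    """Build a cluster-settings body that registers one remote cluster.

    Hypothetical helper: it only builds the JSON body and sends nothing.
    """
    # The alias is later used as "<alias>:<index>", so reject a ':' in it.
    if not alias or ":" in alias:
        raise ValueError("alias must be non-empty and must not contain ':'")
    for seed in seeds:
        host, sep, port = seed.rpartition(":")
        if not sep or not host or not port.isdigit():
            raise ValueError(f"seed must look like host:port, got {seed!r}")
    return {"persistent": {"cluster": {"remote": {alias: {"seeds": list(seeds)}}}}}


body = remote_cluster_settings("cluster_one", ["127.0.0.1:9300"])
print(json.dumps(body, indent=2))
```

The printed JSON is what you would pass as the request body. Using `persistent` keeps the setting across full cluster restarts.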
                    

Querying Across Clusters

Once the remote clusters are configured, you can perform searches across these clusters using the standard Elasticsearch query DSL. The indices on the remote clusters can be referenced using the <cluster_alias>:<index_name> notation.

Example Query

Here is an example of a search request that queries both local and remote indices:

    GET /local_index,cluster_one:remote_index/_search
    {
      "query": {
        "match_all": {}
      }
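    }

An abbreviated response might look like this (the `_type` field appears in 7.x responses and is absent from 8.x):

    {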
      "took": 30,
      "timed_out": false,
      "_shards": {
        "total": 20,
        "successful": 20,
        "skipped": 0,
        "failed": 0
      },
      "hits": {
        "total": {
          "value": 1000,
          "relation": "eq"
        },
        "max_score": 1.0,
        "hits": [
          {
            "_index": "local_index",
            "_type": "_doc",
            "_id": "1",
            "_score": 1.0,
            "_source": {
              "field": "value"
            }
          },
          {
            "_index": "cluster_one:remote_index",
            "_type": "_doc",
            "_id": "2",
            "_score": 1.0,
            "_source": {
              "field": "value"
            }
          }
        ]
      }
    }
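The `<cluster_alias>:<index_name>` notation appears both in the request path and in the `_index` field of each hit. The sketch below shows one way to build the path and to split a hit's `_index` back into its parts. The function names are hypothetical, and the hits are trimmed copies of the example response above.

```python
def search_path(local_indices, remote_indices):
    """Build the /<targets>/_search path.

    remote_indices is a list of (cluster_alias, index_name) pairs.
    """
    targets = list(local_indices)
    targets += [f"{alias}:{index}" for alias, index in remote_indices]
    return "/" + ",".join(targets) + "/_search"


def split_hit_index(index_field):
    """Split a hit's _index into (cluster_alias, index_name).

    The alias is None when the hit came from a local index.
    """
    alias, sep, name = index_field.partition(":")
    return (alias, name) if sep else (None, index_field)


path = search_path(["local_index"], [("cluster_one", "remote_index")])
print(path)

hits = [
    {"_index": "local_index", "_id": "1"},
    {"_index": "cluster_one:remote_index", "_id": "2"},
]
for hit in hits:
    alias, index = split_hit_index(hit["_index"])
    print(alias or "(local)", index, hit["_id"])
```

Splitting on the first `:` is enough here because index names themselves cannot contain a colon.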
                    

Advanced Configuration

Elasticsearch allows for more advanced configuration options for Cross-Cluster Search, including setting up multiple seed nodes, configuring sniffing, and adjusting timeouts.

Multiple Seed Nodes

You can list several seed nodes for a remote cluster so that the connection can still be established when one of them is unavailable:

    cluster:
      remote:
        cluster_one:
          seeds: ["127.0.0.1:9300", "127.0.0.2:9300"]
                    

Configuring Sniffing

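Remote clusters are connected in sniff mode by default: the local cluster contacts the seed nodes, reads the remote cluster state, and selects a small number of gateway nodes to connect to. The mode can also be set explicitly with the `mode` setting:

    cluster:
      remote:
        cluster_one:
          mode: sniff
          seeds: ["127.0.0.1:9300"]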
                    

Adjusting Timeouts

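A few settings control how the local cluster connects to remote clusters and what happens when one of them does not respond. `initial_connect_timeout` applies to all remote clusters and limits how long a node waits for remote connections to be established at startup. `skip_unavailable` and `node_connections` are set per remote cluster:

    cluster:
      remote:
        initial_connect_timeout: 30s   # applies to every remote cluster
        cluster_one:
          seeds: ["127.0.0.1:9300"]
          skip_unavailable: true       # searches proceed if this cluster is unreachable
          node_connections: 3          # gateway nodes to connect to (sniff mode)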
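With `skip_unavailable: true`, a search can succeed even though an unreachable remote cluster contributed no results, so it is worth checking whether a response is complete. The sketch below assumes the response carries a `_clusters` section with `total`, `successful`, and `skipped` counts, as recent Elasticsearch versions return for cross-cluster searches. The helper name and the sample response are made up for illustration.

```python
def skipped_clusters(response):
    """Return how many clusters were skipped, or 0 if the count is absent."""
    clusters = response.get("_clusters", {})
    return clusters.get("skipped", 0)


# Made-up response: one of two clusters was unreachable and was skipped.
sample = {
    "_clusters": {"total": 2, "successful": 1, "skipped": 1},
    "hits": {"hits": []},
}

if skipped_clusters(sample):
    print("warning: results are partial, some remote clusters were skipped")
```

Treating a missing `_clusters` section as zero skipped clusters keeps the helper usable for purely local searches.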
                    

Conclusion

Cross-Cluster Search lets you distribute data across multiple Elasticsearch clusters and still query it from one place. Once the remote clusters are registered on the local cluster, you address their indices with the <cluster_alias>:<index_name> notation and use the same query DSL as for local searches. The connection settings covered in this tutorial let you adjust how the local cluster reaches each remote cluster and how it behaves when one is unavailable.