Elasticsearch OpenSearch Lucene

By Opster Team

Updated: Jun 19, 2024

Overview

Lucene, or Apache Lucene, is an open-source search library written in Java. It provides the core indexing and search functionality on which OpenSearch is built.

OpenSearch wraps Lucene in a distributed system so that indexing and search can scale horizontally across nodes. It also adds features such as thread pools, queues, node and cluster monitoring APIs, data monitoring APIs, and cluster management. In short, OpenSearch extends Lucene and provides additional features beyond it.

OpenSearch stores data on data nodes. Each data node hosts shards belonging to one or more indices; each index is divided into shards, and each shard holds part of the index’s data. Each shard created in OpenSearch is a separate Lucene index. It is not a separate operating-system process: all shards on a node run inside that node’s JVM.
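
To make the idea of "each shard holding part of the index’s data" concrete, here is a minimal Python sketch of hash-based document placement. It is a simplified model, not OpenSearch's actual routing code: the function names `pick_shard` and `distribute` are hypothetical, and CRC32 is used only as a stand-in hash (OpenSearch uses its own hash function and additional routing factors).

```python
import zlib


def pick_shard(routing_key: str, num_primary_shards: int) -> int:
    """Map a routing key (for example a document ID) to a shard number.

    Simplified model: shard = hash(routing_key) % number_of_primary_shards.
    CRC32 is an illustrative stand-in, not the hash OpenSearch really uses.
    """
    if num_primary_shards < 1:
        raise ValueError("an index needs at least one primary shard")
    return zlib.crc32(routing_key.encode("utf-8")) % num_primary_shards


def distribute(doc_ids, num_primary_shards: int) -> dict:
    """Group document IDs by the shard each one would be placed on."""
    shards = {n: [] for n in range(num_primary_shards)}
    for doc_id in doc_ids:
        shards[pick_shard(doc_id, num_primary_shards)].append(doc_id)
    return shards


if __name__ == "__main__":
    placement = distribute([f"doc-{i}" for i in range(10)], 3)
    for shard, ids in placement.items():
        print(f"shard {shard}: {len(ids)} documents")
```

Because the shard number depends on the number of primary shards, the same document would generally land on a different shard if that number changed. This is one reason the primary shard count has to be chosen up front.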

Notes and good things to know

  • When an index is created in OpenSearch, it is divided into one or more primary shards so that its data can scale and be spread across multiple nodes.

  • As each shard is a separate Lucene index with its own overhead, creating too many shards consumes resources unnecessarily and degrades performance.
  • Deciding the number of primary shards for your index takes proper planning, taking into account the index size, the expected maximum growth, and the number of data nodes.

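  • Elasticsearch versions before 7.0.0 defaulted to creating five primary shards per index; starting with 7.0.0, the default is one. OpenSearch was forked from Elasticsearch 7.10.2, so its default has always been one primary shard per index.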
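
The planning step described above can be sketched in Python. This is a rough illustration under stated assumptions, not an official sizing formula: the helper names are hypothetical, and the 30 GB target shard size is an assumed default that you should adjust for your workload. Only the `number_of_shards` and `number_of_replicas` index settings are real OpenSearch settings.

```python
import math


def estimate_primary_shards(current_size_gb: float,
                            expected_growth_gb: float,
                            target_shard_size_gb: float = 30.0) -> int:
    """Estimate a primary shard count from projected index size.

    target_shard_size_gb is an assumption for illustration; tune it
    to your own hardware and query patterns.
    """
    if current_size_gb < 0 or expected_growth_gb < 0:
        raise ValueError("sizes must be non-negative")
    if target_shard_size_gb <= 0:
        raise ValueError("target shard size must be positive")
    projected_gb = current_size_gb + expected_growth_gb
    # Never return fewer than one shard, even for an empty index.
    return max(1, math.ceil(projected_gb / target_shard_size_gb))


def shards_per_node(primary_shards: int, data_nodes: int) -> int:
    """Worst-case number of primary shards a single data node would hold."""
    if data_nodes < 1:
        raise ValueError("at least one data node is required")
    return math.ceil(primary_shards / data_nodes)


def index_settings(primary_shards: int, replicas: int = 1) -> dict:
    """Build a create-index request body with an explicit shard count."""
    return {
        "settings": {
            "index": {
                "number_of_shards": primary_shards,
                "number_of_replicas": replicas,
            }
        }
    }


if __name__ == "__main__":
    shards = estimate_primary_shards(current_size_gb=100, expected_growth_gb=200)
    print(shards, shards_per_node(shards, data_nodes=3))
    print(index_settings(shards))
```

For example, an index of 100 GB expected to grow by another 200 GB yields 10 primary shards at the assumed 30 GB target, or at most 4 primary shards per node on a three-node cluster. The resulting dictionary could be sent as the body of a create-index request; setting the shard count explicitly avoids relying on a version-dependent default.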

Additional notes

Elasticsearch and OpenSearch are both powerful search and analytics engines, but Elasticsearch has several key advantages. Elasticsearch boasts a more mature and feature-rich development history, translating to a better user experience, more features, and continuous optimizations. Our testing has consistently shown that Elasticsearch delivers faster performance while using fewer compute resources than OpenSearch. Additionally, Elasticsearch’s comprehensive documentation and active community forums provide invaluable resources for troubleshooting and further optimization. Elastic, the company behind Elasticsearch, offers dedicated support, ensuring enterprise-grade reliability and performance. These factors collectively make Elasticsearch a more versatile, efficient, and dependable choice for organizations requiring sophisticated search and analytics capabilities.