Elasticsearch Disk Underutilization on Warm Tier

By Opster Team

Updated: Mar 10, 2024


What does this mean?

If the warm nodes in an Elasticsearch cluster have more disk space allocated than they need, disk resources are not being used efficiently, and there is potential to reduce costs by optimizing disk usage.

Why does this occur?

This can occur due to various reasons, such as:

  1. Overestimation of storage requirements during the initial setup of the cluster.
  2. Decrease in data volume over time, leading to unused disk space.
  3. Inefficient data management practices, such as deleting old or unnecessary data without revising storage requirements.
  4. Removal of some replica shards that were added to support a high usage peak.

Possible impact and consequences of disk underutilization

The possible impact of this issue includes:

  1. Increased costs: Over-allocating disk space leads to higher infrastructure costs because you are paying for resources that are not being used effectively. The waste is not limited to storage: if the disk was provisioned according to a specific memory-to-disk ratio, you might also be paying for too much RAM.
  2. Suboptimal performance: Underutilized disk space can result in inefficient data storage and retrieval, which can negatively impact the performance of your Elasticsearch cluster.

How to resolve

To resolve the issue of disk underutilization on the warm tier, you can take the following steps:

1. Move to smaller disk capacity:

   By moving to smaller disks, you can reduce the amount of unused disk space and optimize resource utilization. This can be done by resizing the disk volumes or migrating data to smaller disk volumes.
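As a rough illustration of how a smaller disk size might be chosen, the sketch below computes the smallest volume that keeps current usage at or below a target utilization. The function name `recommended_disk_gb`, the 70% default target, and the 50 GB rounding step are assumptions made for this example, not Elasticsearch settings. The target is kept below Elasticsearch's default low disk watermark of 85% so that shard allocation is not blocked after the resize.

```python
import math


def recommended_disk_gb(used_gb: float, target_pct: int = 70, step_gb: int = 50) -> int:
    """Smallest disk size, rounded up to a multiple of step_gb, that keeps
    used_gb at or below target_pct percent utilization.

    target_pct and step_gb are illustrative defaults (assumptions).
    """
    if not 0 < target_pct <= 100:
        raise ValueError("target_pct must be in (0, 100]")
    if used_gb < 0 or step_gb <= 0:
        raise ValueError("used_gb must be >= 0 and step_gb must be > 0")
    raw_gb = used_gb * 100 / target_pct
    # Round up to the next step, and never return less than one step.
    return max(step_gb, math.ceil(raw_gb / step_gb) * step_gb)


# Example: a warm node holding 600 GB of data on a 2 TB volume.
print(recommended_disk_gb(600))
```

Leave headroom for expected data growth and for shards that relocate onto the node when another node fails, so treat the result as a lower bound rather than a final size.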

2. Reduce the number of warm data nodes:

   Removing warm data nodes consolidates the same data onto fewer nodes, which raises disk utilization on the nodes that remain. This can also help you save money and improve the performance of your Elasticsearch cluster.

Command example to drain shards from one warm node (here, the node with IP 10.0.0.1) to the other warm nodes so that the drained node can be deprovisioned:

PUT /_cluster/settings
{
  "transient": {
    "cluster.routing.allocation.exclude._ip": "10.0.0.1"
  }
}
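Before deprovisioning the node, it is worth confirming that no shards remain on it. The following is a minimal sketch that inspects rows shaped like the output of `GET _cat/shards?format=json`. The function name `shards_remaining_on` and the sample rows are made up for illustration, and fetching the rows from the cluster is left out.

```python
from typing import Dict, List, Optional


def shards_remaining_on(rows: List[Dict[str, Optional[str]]], excluded_ip: str) -> List[str]:
    """Return labels for shard copies that are still located on excluded_ip.

    rows are assumed to carry the "index", "shard", "prirep" and "ip"
    columns reported by the cat shards API.
    """
    remaining = []
    for row in rows:
        # Unassigned shards have no IP and are skipped by this comparison.
        if row.get("ip") == excluded_ip:
            remaining.append(f'{row["index"]}[{row["shard"]}] {row["prirep"]}')
    return remaining


# Hypothetical sample data, not real cluster output.
sample_rows = [
    {"index": "logs-2024.01", "shard": "0", "prirep": "p", "ip": "10.0.0.2"},
    {"index": "logs-2024.01", "shard": "0", "prirep": "r", "ip": "10.0.0.1"},
    {"index": "logs-2024.02", "shard": "0", "prirep": "p", "ip": None},
]
print(shards_remaining_on(sample_rows, "10.0.0.1"))
```

An empty list suggests the drain has finished. After the node has been removed, setting `cluster.routing.allocation.exclude._ip` to `null` clears the exclusion so that it does not linger in the cluster settings.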

3. Optimize data management and indexing strategies:

   Review your data management and indexing strategies to ensure that they are efficient and do not lead to unnecessary disk space usage. This can include optimizing index settings, using time-based indices, and implementing data lifecycle policies.
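As one possible shape for a data lifecycle policy, the sketch below builds a request body that could be sent to `PUT _ilm/policy/<policy_name>`. The helper name `build_warm_ilm_policy` and all thresholds (daily rollover, 7 days before the warm phase, 90 days before deletion) are illustrative assumptions, so adjust them to your own retention requirements and check that each action is supported by your Elasticsearch version.

```python
import json


def build_warm_ilm_policy(warm_after: str = "7d", delete_after: str = "90d") -> dict:
    """Build an ILM policy body with hot, warm and delete phases.

    All ages and sizes here are example values (assumptions).
    """
    return {
        "policy": {
            "phases": {
                "hot": {
                    "actions": {
                        # Roll over to a new index daily or at 50 GB per primary shard.
                        "rollover": {"max_age": "1d", "max_primary_shard_size": "50gb"}
                    }
                },
                "warm": {
                    "min_age": warm_after,
                    # Merge segments to reduce the on-disk footprint of read-only data.
                    "actions": {"forcemerge": {"max_num_segments": 1}},
                },
                "delete": {
                    "min_age": delete_after,
                    "actions": {"delete": {}},
                },
            }
        }
    }


print(json.dumps(build_warm_ilm_policy(), indent=2))
```

With a fixed retention period, the data volume on the warm tier becomes more predictable, which makes it easier to size warm disks to actual needs.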

4. Monitor disk usage regularly:

   Regularly monitor disk usage in your Elasticsearch cluster to identify and address underutilization issues proactively. You can use monitoring tools like Elasticsearch Monitoring or third-party solutions to keep track of disk usage and other cluster performance metrics.
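A simple periodic check can be built on the cat allocation API. The sketch below assumes rows shaped like the output of `GET _cat/allocation?format=json&bytes=b`, already filtered to warm nodes. The function name `underutilized_nodes`, the 30% threshold, and the sample rows are assumptions chosen for illustration.

```python
from typing import Dict, List, Optional


def underutilized_nodes(rows: List[Dict[str, Optional[str]]], threshold_pct: float = 30.0) -> Dict[str, float]:
    """Map node name to disk utilization (percent) for nodes below threshold_pct.

    rows are assumed to carry the "node", "disk.used" and "disk.total"
    columns, with byte counts given as strings.
    """
    flagged = {}
    for row in rows:
        used, total = row.get("disk.used"), row.get("disk.total")
        # The row for unassigned shards has no disk figures, so skip it.
        if used is None or total is None or int(total) == 0:
            continue
        pct = 100.0 * int(used) / int(total)
        if pct < threshold_pct:
            flagged[row["node"]] = round(pct, 1)
    return flagged


# Hypothetical sample data, not real cluster output.
sample_rows = [
    {"node": "warm-1", "disk.used": "200000000000", "disk.total": "1000000000000"},
    {"node": "warm-2", "disk.used": "600000000000", "disk.total": "1000000000000"},
    {"node": "UNASSIGNED", "disk.used": None, "disk.total": None},
]
print(underutilized_nodes(sample_rows))
```

Running a check like this on a schedule and tracking the result over time helps distinguish a sustained underutilization trend from a temporary dip after a large deletion.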

Conclusion

By following the steps outlined in this guide, you can resolve the issue of disk underutilization on the Elasticsearch warm tier and optimize your cluster’s disk usage. This will help you reduce operational costs and improve the overall performance of your Elasticsearch deployment.