Elasticsearch Node Concurrent Recoveries Setting is Too High / Low

Opster Team

Last updated: Apr 26, 2022



An overview of Node_Concurrent_Recoveries_High and Node_Concurrent_Recoveries_Low. 

What it means

The node concurrent recoveries setting determines the maximum number of shards that can be recovered at once on each node. Recovering shards requires both disk and network resources, so it is advisable to limit the number of shards that a given node recovers at any one time. If the setting is too high, recoveries compete with indexing and search for those resources.

If, on the other hand, the concurrent recoveries setting is too low, recovery may be slower than usual, or the cluster may not be able to recover shards at all. This can cause performance issues because the cluster has fewer replicas than planned, and it may even leave an index unwritable, with the cluster staying yellow or red for a long period of time.

There are a number of different settings that are similar but have subtle differences:

cluster.routing.allocation.node_concurrent_incoming_recoveries (default 2)

How many concurrent incoming shard recoveries (normally replicas) are allowed to happen on a node. 

cluster.routing.allocation.node_concurrent_outgoing_recoveries (default 2)

How many concurrent outgoing shard recoveries are allowed to happen on a node. 

cluster.routing.allocation.node_concurrent_recoveries (default 2)

This is a shortcut setting that sets both cluster.routing.allocation.node_concurrent_incoming_recoveries and cluster.routing.allocation.node_concurrent_outgoing_recoveries at once.

cluster.routing.allocation.node_initial_primaries_recoveries (default 4)

This differs from the settings above because it covers the recovery of a primary shard using data from the node's local disk, for example after a node restart. Because these recoveries don't use the network, more of them can run in parallel on the same node.
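
As a rough sketch of how the separate settings can be used, the request below raises only the incoming limit and leaves the outgoing limit at its default. The value 4 is purely illustrative and is an assumption, not a recommendation from this guide.

```
# Illustrative value only: allow up to 4 concurrent incoming recoveries per node
PUT _cluster/settings
{
  "transient": {
    "cluster.routing.allocation.node_concurrent_incoming_recoveries": 4
  }
}
```

Setting the two directions independently can be useful when, say, new nodes need to fill up faster than existing nodes should be drained.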

How to resolve it

Check the current cluster settings:

GET _cluster/settings
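
This request returns only settings that have been set explicitly, so an empty response usually means the defaults are in effect. A variant that also lists the defaults is sketched below. The filter_path pattern is just one way to narrow the output to the per-node recovery settings and may need adjusting for your version.

```
# Include default values and keep only the node_* allocation settings
GET _cluster/settings?include_defaults=true&filter_path=*.cluster.routing.allocation.node_*
```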

If necessary, change the concurrent recovery settings. In general, the defaults are good values to use.

PUT _cluster/settings
{
  "transient": {
    "cluster.routing.allocation.node_concurrent_recoveries": 2
  }
}
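
To go back to the default later, the override can be removed by setting it to null, and active recoveries can be watched while the cluster settles. This is a minimal sketch that assumes the override was applied as a transient setting, as in the request above.

```
# Remove the transient override so the default value applies again
PUT _cluster/settings
{
  "transient": {
    "cluster.routing.allocation.node_concurrent_recoveries": null
  }
}

# List only the shard recoveries that are currently in progress
GET _cat/recovery?v=true&active_only=true
```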

