Elasticsearch index shards: allocation strategy for number and size

1. Basic understanding of sharding

A shard is the data carrier of ES. Data in ES is divided into primary shards and replica shards. Each primary shard carries part of the data of a single index and is distributed across the nodes of the cluster; a replica is a copy of a primary, that is, a backup. The principle of shard allocation is to distribute shards as evenly as possible across the nodes in the cluster, so as to minimize the impact on the cluster, and on the services it backs, when some shards are lost in an incident.

Each shard is a fully functional Lucene instance.

2. Shard creation strategy

The purpose of sharding is to achieve distribution, and one of the benefits of distribution is "high availability" (along with higher performance, such as improved throughput, discussed later). The shard allocation strategies below are all, in essence, about improving availability, for example shard allocation awareness and forced awareness.

There is no "silver bullet" in internet development, and there is no single optimal number of shards. The best way to design a sharding strategy is to benchmark against production data on production hardware, using the same query and indexing loads you see in production. A shard allocation strategy is mainly measured by two indicators: the number of shards and the size of a single shard.

3. The basic strategy of shard allocation

  • ES uses data sharding (shards) to improve service availability: scattering data across different nodes reduces the impact on data integrity when a single node fails, and replicas are used to guarantee data integrity. As for the default allocation strategy, before 7.x an index defaulted to 5 primary shards, each with 1 replica; from 7.x onwards the default is 1 primary and 1 replica.
  • When ES allocates the shards of a single index, it spreads them across as many nodes as possible. The actual placement, however, depends on how many shards and indexes the cluster holds and how large they are, so the distribution may not always be perfectly even.
  • The number of primary shards can only be set when an index is created, while the number of replicas can be changed at any time. A primary supports both reads and writes, while a replica only serves client reads; replica data is managed automatically by ES and synchronized from the primary (a sketch follows this list).
  • ES does not place a primary and its replica on the same node, and a node will not hold two copies of the same replica shard.
  • A single node may hold shards of multiple indexes at the same time.
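
As an illustration of the points above, the following sketch (index name and values are only examples) creates an index with a fixed number of primary shards and later changes only the replica count:

// The number of primary shards is fixed at index creation time
PUT my_index
{
  "settings": {
    "number_of_shards": 3,
    "number_of_replicas": 1
  }
}

// The number of replicas can be changed at any time afterwards
PUT my_index/_settings
{
  "number_of_replicas": 2
}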

4. How many shards should be allocated?

  • Avoid too many shards: most searches hit more than one shard, and each shard runs the search on a single CPU thread. Although a shard can run multiple concurrent searches, searching across a very large number of shards can exhaust a node's search thread pool, resulting in low throughput and slow searches.
  • Prefer fewer, larger shards: each shard consumes memory and CPU resources. In most cases, a small set of large shards uses fewer resources than a large number of small shards (a quick way to audit the current shard layout is sketched after this list).
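
A quick way to audit the current layout, for example to spot indexes that have accumulated many small shards, is the _cat shards API (the column selection here is just one reasonable choice):

// List every shard with its index, number, type (p/r), size and node
GET _cat/shards?v=true&h=index,shard,prirep,store,node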

5. Shard size decision

  • Reasonable shard size: 10GB-50GB. While not a hard limit, shards between 10GB and 50GB tend to work well. Depending on the network and use case, larger shards may also be workable. In index lifecycle management, 50GB is commonly used as the rollover threshold for a single index.
  • Relationship between heap memory and shard count: keep fewer than 20 shards per GB of heap memory. The number of shards a node can host is proportional to its heap size; for example, a node with 30GB of heap should hold at most 600 shards. If a node exceeds 20 shards per GB of heap, consider adding another node.

Query the current heap memory size of each node:

GET _cat/nodes?v=true&h=heap.current

  • Avoid overloaded nodes: if too many shards are assigned to a single node, that node becomes a heavily loaded hot spot. A way to check the per-node shard count is sketched below.
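
To check a node against the 20-shards-per-GB-of-heap guideline, the per-node shard count (the shards column) can be compared with the heap figure queried above; a minimal sketch:

// Shards and disk usage per node
GET _cat/allocation?v=true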

6. Important configuration

6.1 Custom properties

node.attr.{attribute}
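
For example, a custom attribute can be declared in elasticsearch.yml on each node; the attribute name and value below are purely illustrative:

# elasticsearch.yml
node.attr.rack_id: rack1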

How to view node attributes?

GET _cat/nodeattrs?v

6.2 Index-level configuration

  • index.routing.allocation.include.{attribute}: the index can be allocated to nodes whose attribute contains at least one of the listed values.
  • index.routing.allocation.require.{attribute}: the index can only be allocated to nodes whose attribute contains the specified value (usually a single value is set).
  • index.routing.allocation.exclude.{attribute}: the index can only be allocated to nodes whose attribute contains none of the listed values.

// Executed when the index is created
PUT <index_name>
{
  "settings": {
    "number_of_shards": 3,
    "number_of_replicas": 1,
    "index.routing.allocation.include._name": "node1"
  }
}
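
The require and exclude variants are used the same way; the sketch below (node names are hypothetical) updates the settings of an existing index:

// Only allocate the index to the node named node1
PUT <index_name>/_settings
{
  "index.routing.allocation.require._name": "node1"
}

// Never allocate the index to node2
PUT <index_name>/_settings
{
  "index.routing.allocation.exclude._name": "node2"
}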

6.3 Cluster-level configuration

Elasticsearch provides two ways to modify cluster-wide settings:

  • persistent: a permanent change; it is saved in /path.data/cluster.name/nodes/0/_state/global-n.st, and deleting that file removes the settings.
  • transient: takes effect immediately but is lost after a full cluster restart (a transient example is sketched after the persistent one below).

PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.awareness.attributes": "rack_id"
  }
}
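
A transient change uses the same API under the transient key; the sketch below temporarily disables rebalancing and is lost after a full cluster restart:

PUT _cluster/settings
{
  "transient": {
    "cluster.routing.rebalance.enable": "none"
  }
}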

7. Index shard allocation: Index Shard Allocation

7.1 Shard balance strategy: shard rebalance

A cluster is balanced when each node holds the same number of shards and the shards of any single index are not concentrated on one node. Elasticsearch runs an automatic process called rebalancing, which moves shards between nodes to improve the balance. Rebalancing obeys all other shard allocation rules, such as allocation filtering and forced awareness, which may prevent it from fully balancing the cluster; in that case, rebalancing strives for the most balanced cluster possible within the rules you configure. If you use data tiers, Elasticsearch automatically applies allocation filtering rules to place each shard in the appropriate tier, which means the balancer works independently within each tier.

cluster.routing.rebalance.enable

(Dynamic) Enable or disable rebalancing for specific types of shards:

  • all - (default) Allow shard balancing for all types of shards.
  • primaries - Allow shard balancing for primary shards only.
  • replicas - Allow shard balancing for replica shards only.
  • none - Do not allow shard balancing of any kind for any index.

cluster.routing.allocation.allow_rebalance

(Dynamic) Specifies when shard rebalancing is allowed:

  • always - Always allow rebalancing.
  • indices_primaries_active - Only when all primary shards in the cluster are allocated.
  • indices_all_active - (default) Only when all shards in the cluster (primaries and replicas) are allocated.
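
Both settings are dynamic, so they can be adjusted together through the cluster settings API; a minimal sketch:

PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.rebalance.enable": "primaries",
    "cluster.routing.allocation.allow_rebalance": "indices_all_active"
  }
}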

7.2 Delayed allocation strategy (default 1m)

When a node leaves the cluster for any reason (intentional or not), the master node reacts as follows

  • Promote a replica shard to primary to replace any primary shards on the node.
  • Allocate replica shards to replace lost replicas (assuming there are enough nodes).
  • Rebalance the shards evenly across the remaining nodes.

These operations are designed to protect the cluster against data loss by ensuring that every shard is fully replicated as soon as possible. Even though concurrent recoveries are throttled at both the node level and the cluster level, this "shard shuffle" can still put a lot of extra load on the cluster, which may be unnecessary if the lost node is likely to return soon.
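
The length of this delay is controlled by index.unassigned.node_left.delayed_timeout (default 1m); a sketch that raises it for all indexes when a node is expected to come back soon:

PUT _all/_settings
{
  "settings": {
    "index.unassigned.node_left.delayed_timeout": "5m"
  }
}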

7.3 Shard allocation filtering (controlling which nodes a shard can be allocated to)

  • index.routing.allocation.include.{attribute}: the index can be allocated to nodes whose attribute contains at least one of the listed values.
  • index.routing.allocation.require.{attribute}: the index can only be allocated to nodes whose attribute contains the specified value (usually a single value is set).
  • index.routing.allocation.exclude.{attribute}: the index can only be allocated to nodes whose attribute contains none of the listed values (a cluster-level counterpart is sketched after this list).
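
The same include/require/exclude flavors also exist at the cluster level (cluster.routing.allocation.*); a common use, sketched here with a hypothetical IP, is draining all shards off a node before decommissioning it:

PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.exclude._ip": "10.0.0.1"
  }
}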

7.4 Shard Allocation Awareness Strategy: Shard Allocation Awareness

Shard allocation awareness is designed to improve service availability. By declaring custom node attributes as awareness attributes, Elasticsearch can take the physical hardware configuration into account when allocating shards. If Elasticsearch knows which nodes are on the same physical server, in the same rack, or in the same zone, it can keep a primary shard and its replicas apart, minimizing the risk of losing all copies of the data in the event of a failure.

Enabling shard allocation awareness

Configure the node attribute:

node.attr.rack_id: rack1

Use the following setting to tell the master node which attributes should be taken into account when allocating shards. This information is stored in the cluster state on every master-eligible node:

PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.awareness.attributes": "rack_id"
  }
}

7.5 Forced awareness strategy: Forced awareness

By default, if one zone fails, Elasticsearch reassigns all of the missing replica shards to the remaining zones. However, the remaining zones may not have enough capacity headroom to host those extra shards.

To prevent a single location from being overloaded in the event of a failure, you can set cluster.routing.allocation.awareness.force so that replicas are not allocated until nodes in another location become available.

Configuring a forced awareness policy

Set the forced awareness policy by telling the master node which attribute divides the nodes into zones, and which values that attribute can take:

cluster.routing.allocation.awareness.attributes: zone
cluster.routing.allocation.awareness.force.zone.values: zone1,zone2 
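
The two lines above are static settings in elasticsearch.yml; both are also dynamic cluster settings, so the same policy can be applied at runtime (a sketch reusing the zone attribute above):

PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.awareness.attributes": "zone",
    "cluster.routing.allocation.awareness.force.zone.values": "zone1,zone2"
  }
}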

Origin blog.csdn.net/wlei0618/article/details/127434907