Troubleshooting Yellow and Red Status in an Elasticsearch Cluster

1. Cluster health


Cluster health is derived from shard allocation and has three states: green, yellow, and red.

Red: At least one primary shard is not allocated, so the cluster is not working properly.

Yellow: A warning state. All primary shards are allocated and usable, but at least one replica shard is not allocated.

Green: The cluster is healthy. All primary shards and replica shards are allocated.

Index health: the status of the worst shard in the index

Cluster health: the status of the worst index in the cluster
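
The "worst one wins" roll-up described above can be sketched in a few lines of Python. This is only an illustration, not Elasticsearch's own code: the function name `worst_status` and the sample index names are made up for the example.

```python
# Severity order used for the roll-up: a higher number is worse.
SEVERITY = {"green": 0, "yellow": 1, "red": 2}


def worst_status(statuses):
    """Return the most severe status in an iterable ("green" if it is empty)."""
    return max(statuses, key=SEVERITY.__getitem__, default="green")


# Hypothetical shard statuses per index.
shards_by_index = {
    "logs": ["green", "green"],
    "orders": ["green", "yellow"],
}

# Index health is the status of its worst shard.
index_health = {name: worst_status(s) for name, s in shards_by_index.items()}

# Cluster health is the status of its worst index.
cluster_health = worst_status(index_health.values())
```

With these sample values a single yellow shard in `orders` is enough to make both that index and the whole cluster yellow.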

2. Health-related APIs


Status of the cluster (check the number of nodes): GET _cluster/health

Health of all indices (find the problematic index): GET _cluster/health?level=indices

Health of a single index (inspect a specific index): GET _cluster/health/my_index

Shard-level health: GET _cluster/health?level=shards

Reason why the first unassigned shard is not allocated: GET _cluster/allocation/explain
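
As a rough sketch of how these calls might be used from a script, the Python below fetches a health response and lists the indices that are not green. The helper names `fetch_json` and `problem_indices`, the sample response, and the `localhost:9200` address are assumptions for illustration; a secured cluster would also need authentication and TLS settings that are not shown here.

```python
import json
from urllib.request import urlopen


def fetch_json(base_url, path):
    """GET a cluster API path and parse the JSON body (no auth or TLS setup)."""
    with urlopen(f"{base_url.rstrip('/')}/{path.lstrip('/')}") as resp:
        return json.load(resp)


def problem_indices(health):
    """Map each non-green index in a level=indices response to its status."""
    return {
        name: info["status"]
        for name, info in health.get("indices", {}).items()
        if info["status"] != "green"
    }


# Trimmed, made-up response in the shape returned by
# GET _cluster/health?level=indices
sample = {
    "status": "yellow",
    "number_of_nodes": 1,
    "indices": {
        "my_index": {"status": "yellow", "unassigned_shards": 1},
        "other_index": {"status": "green", "unassigned_shards": 0},
    },
}

print(sample["number_of_nodes"], problem_indices(sample))

# Against a live cluster (assumed here to listen on localhost:9200):
# health = fetch_json("http://localhost:9200", "_cluster/health?level=indices")
```

Checking `number_of_nodes` first and then narrowing down to the non-green indices follows the same order as the API list above.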
 

3. Some reasons why shards are not allocated


INDEX_CREATED: The index has just been created. Until all of its shards are allocated, the cluster is briefly red, which does not necessarily mean there is a problem.

CLUSTER_RECOVERED: The shards are unassigned because the cluster is recovering after a full restart.

INDEX_REOPENED: A previously closed index has been opened.

DANGLING_INDEX_IMPORTED: An index was deleted while a node was out of the cluster. When the node returns still holding that index's data, the index is imported as a dangling index.
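
To show where such a reason code turns up, here is a small Python sketch that summarises an allocation explain response. The reason strings are spelled as Elasticsearch reports them (for example `INDEX_CREATED`). The helper `describe_unassigned`, the short notes in `REASONS`, and the sample response are illustrative assumptions; a real response contains many more fields.

```python
# Reason codes as Elasticsearch reports them, with a short reading of each.
REASONS = {
    "INDEX_CREATED": "index was just created; usually clears by itself",
    "CLUSTER_RECOVERED": "full cluster restart is still recovering shards",
    "INDEX_REOPENED": "a closed index was reopened",
    "DANGLING_INDEX_IMPORTED": "a dangling index was imported",
}


def describe_unassigned(explain):
    """Summarise an allocation explain response for one unassigned shard."""
    reason = explain.get("unassigned_info", {}).get("reason", "UNKNOWN")
    kind = "primary" if explain.get("primary") else "replica"
    note = REASONS.get(reason, "not covered in this article")
    return f"{explain.get('index')}[{explain.get('shard')}] {kind}: {reason} ({note})"


# Trimmed, made-up response in the shape of GET _cluster/allocation/explain
sample = {
    "index": "my_index",
    "shard": 0,
    "primary": False,
    "current_state": "unassigned",
    "unassigned_info": {"reason": "INDEX_CREATED"},
}

print(describe_unassigned(sample))
```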

4. Common problems and solutions

If the cluster turns red, first check whether any nodes are offline. If so, restarting the offline nodes usually solves the problem.

If the problem is caused by configuration (such as a wrong box_type or a wrong number of replicas), fix the configuration.

If shards cannot be allocated because of disk space limits or shard filtering rules (Shard Filtering), adjust the rules or add nodes.

If a node returning to the cluster brings back a dangling index and the cluster turns red, you can simply delete the dangling index.
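
The first two checks can be sketched as follows. This is a minimal Python illustration under stated assumptions: `missing_nodes` and `replica_fix_request` are hypothetical helpers, the expected node count is something you supply yourself, and lowering `number_of_replicas` through the index settings API is only one possible fix (adding nodes is the other).

```python
def missing_nodes(health, expected_nodes):
    """How many nodes are absent, given a GET _cluster/health response."""
    return max(expected_nodes - health["number_of_nodes"], 0)


def replica_fix_request(index, data_nodes, configured_replicas):
    """Return (method, path, body) lowering replicas to what the nodes can hold.

    A replica is never allocated on the node that holds its primary, so at
    most data_nodes - 1 replicas per shard can be assigned. Returns None if
    the configured value already fits.
    """
    max_replicas = max(data_nodes - 1, 0)
    if configured_replicas <= max_replicas:
        return None
    body = {"index": {"number_of_replicas": max_replicas}}
    return ("PUT", f"/{index}/_settings", body)


# Hypothetical single-node cluster whose index asks for one replica.
health = {"status": "yellow", "number_of_nodes": 1, "number_of_data_nodes": 1}
print(missing_nodes(health, expected_nodes=1))
print(replica_fix_request("my_index", health["number_of_data_nodes"], 1))
```

The function only builds the request; sending it is left to whatever HTTP client or console you normally use.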

5. Summary of Cluster Red & Yellow Problems


Red and yellow status is a common problem in cluster operation and maintenance.

Besides real failures, normal operations such as creating an index or adding replicas make the cluster temporarily red or yellow, so monitoring alerts should only fire after a certain delay.

Find the real cause by checking the number of nodes and using the relevant APIs provided by ES.

If necessary, you can move or reallocate shards manually (for example with the cluster reroute API).
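
The alert delay mentioned above could look roughly like this in Python. It is a sketch under assumptions: the class `DelayedAlert` is made up for illustration, the 300-second grace period is an arbitrary example value, and timestamps are passed in explicitly so the logic does not depend on any particular monitoring tool.

```python
class DelayedAlert:
    """Fire only when the cluster has been non-green for grace_seconds."""

    def __init__(self, grace_seconds):
        self.grace_seconds = grace_seconds
        self.unhealthy_since = None  # time of the first non-green sample

    def observe(self, status, now):
        """Record one health sample; return True if an alert should fire."""
        if status == "green":
            self.unhealthy_since = None  # recovered, reset the timer
            return False
        if self.unhealthy_since is None:
            self.unhealthy_since = now
        return now - self.unhealthy_since >= self.grace_seconds


alert = DelayedAlert(grace_seconds=300)

# (seconds since start, cluster status) samples: a short yellow blip,
# then a red status that persists.
samples = [(0, "green"), (60, "yellow"), (120, "green"), (180, "red"), (480, "red")]
fired = [alert.observe(status, now) for now, status in samples]
```

The short yellow blip at 60 seconds never triggers an alert because the cluster is green again before the grace period ends; only the red status that lasts the full grace period does.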
———————————————————
Copyright statement: This article is the original article of CSDN blogger "I am also a pedestrian in Linjiang", following CC 4.0 BY-SA Copyright agreement, please attach the original source link and this statement for reprinting.
Original link: https://blog.csdn.net/weixin_56752399/article/details/120992261
