1. Cluster health
Cluster health is derived from shard allocation and has three states: green, yellow, and red.
Red: At least one primary shard is not allocated, so the cluster is not fully functional.
Yellow: All primary shards are allocated, but at least one replica shard is not. The cluster works, but it is in a warning state.
Green: All primary shards and replica shards are allocated. The cluster is healthy.
Index health: the status of the worst shard in the index
Cluster health: the status of the worst index in the cluster
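The "worst shard, worst index" rule can be sketched in a few lines of Python. This is only an illustration of the roll-up logic, not how Elasticsearch implements it; the function names and the input shape (one `(primary_assigned, replicas_assigned)` pair per shard) are assumptions made for this example.

```python
# Severity order: a higher number means a worse state.
SEVERITY = {"green": 0, "yellow": 1, "red": 2}


def shard_status(primary_assigned, replicas_assigned):
    """Status of one shard: red without a primary, yellow if a replica is missing."""
    if not primary_assigned:
        return "red"
    if not all(replicas_assigned):
        return "yellow"
    return "green"


def worst(statuses):
    """Return the worst status in an iterable; an empty iterable counts as green."""
    return max(statuses, key=SEVERITY.__getitem__, default="green")


def index_status(shards):
    """Index health is the status of its worst shard."""
    return worst(shard_status(primary, replicas) for primary, replicas in shards)


def cluster_status(indices):
    """Cluster health is the status of its worst index."""
    return worst(index_status(shards) for shards in indices.values())
```

For example, an index with one shard whose replica is unassigned makes both that index and the cluster yellow, and a single missing primary anywhere makes the cluster red.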
2. Health-related APIs
Description: API
Status of the cluster (check the number of nodes): GET _cluster/health
Health status of all indices (find the problematic index): GET _cluster/health?level=indices
Health status of a single index (check a specific index): GET _cluster/health/my_index
Health status at the shard level: GET _cluster/health?level=shards
Reason why the first unallocated shard is unassigned: GET _cluster/allocation/explain
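As a rough sketch, the requests above can be issued from Python with only the standard library. The helper names `health_path` and `get_json` are invented for this example, and the base URL `http://localhost:9200` in the usage note is an assumption (an unsecured local node); a secured cluster would also need authentication, which is left out here.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Path of the allocation explain API listed above.
ALLOCATION_EXPLAIN_PATH = "/_cluster/allocation/explain"


def health_path(index=None, level=None):
    """Build the request path for GET _cluster/health."""
    path = "/_cluster/health"
    if index:
        # Keep commas and wildcards so multi-index expressions still work.
        path += "/" + quote(index, safe=",*")
    if level:
        if level not in ("cluster", "indices", "shards"):
            raise ValueError("level must be cluster, indices, or shards")
        path += "?level=" + level
    return path


def get_json(base_url, path, timeout=10):
    """Send a GET request and decode the JSON response body."""
    with urlopen(base_url.rstrip("/") + path, timeout=timeout) as response:
        return json.load(response)
```

A call such as `get_json("http://localhost:9200", health_path(level="indices"))` would then return the per-index health as a dictionary, provided a node is reachable at that address.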
3. Some reasons why shards are not allocated
INDEX_CREATED: The index was just created. Until all of its shards are allocated, the cluster is briefly red, which does not necessarily indicate a problem.
CLUSTER_RECOVERED: Shards are unassigned while the cluster recovers during a full restart.
INDEX_REOPENED: A previously closed index was opened.
DANGLING_INDEX_IMPORTED: An index was deleted while a node was out of the cluster. When the node rejoins still holding that index's data, the result is a dangling index.
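The reason code appears in the `unassigned_info.reason` field of the `GET _cluster/allocation/explain` response, where the API spells the first three codes `INDEX_CREATED`, `CLUSTER_RECOVERED`, and `INDEX_REOPENED`. The sketch below pulls the code out of a decoded response and adds a hint. Treating those three codes as "often transient" is an assumption based on the descriptions above, and the sample response in the usage note is hand-written for illustration, not real cluster output.

```python
# Reasons that, per the list above, usually clear up on their own.
# Grouping them this way is an assumption made for this sketch.
TRANSIENT_REASONS = {"INDEX_CREATED", "CLUSTER_RECOVERED", "INDEX_REOPENED"}


def summarize_unassigned(explain):
    """Turn a decoded allocation explain response into a one-line summary."""
    info = explain.get("unassigned_info", {})
    reason = info.get("reason", "UNKNOWN")
    kind = "primary" if explain.get("primary") else "replica"
    note = "often transient" if reason in TRANSIENT_REASONS else "investigate"
    return "{}[{}] {}: {} ({})".format(
        explain.get("index"), explain.get("shard"), kind, reason, note
    )
```

With a hypothetical response such as `{"index": "my_index", "shard": 0, "primary": True, "unassigned_info": {"reason": "INDEX_CREATED"}}`, the function returns `my_index[0] primary: INDEX_CREATED (often transient)`.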
4. Common problems and solutions
If the cluster turns red, first check whether any nodes are offline. If so, restarting the offline nodes usually solves the problem.
If the problem is caused by configuration (for example a wrong box_type or a wrong number of replicas), fix the configuration.
If the problem is caused by disk space limits or by shard filtering rules (Shard Filtering), adjust the rules or add nodes.
If a node rejoining the cluster brings back a dangling index and turns the cluster red, you can delete the dangling index directly.
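The first step of this checklist, looking for offline nodes, can be automated from the `GET _cluster/health` response, which reports `status`, `number_of_nodes`, and `unassigned_shards`. The following is a minimal sketch: the function name `first_checks` and the `expected_nodes` parameter (how many nodes the cluster should have) are assumptions for this example, and the returned hints only restate the advice above.

```python
def first_checks(health, expected_nodes):
    """Suggest what to look at first, given a decoded _cluster/health response."""
    checks = []
    if health.get("status") == "green":
        return checks  # nothing to do

    # A lower node count than expected points to offline nodes.
    missing = expected_nodes - health.get("number_of_nodes", 0)
    if missing > 0:
        checks.append(
            "{} node(s) missing: check for offline nodes and restart them".format(missing)
        )

    # Unassigned shards can be explained with the allocation explain API.
    unassigned = health.get("unassigned_shards", 0)
    if unassigned > 0:
        checks.append(
            "{} unassigned shard(s): run GET _cluster/allocation/explain".format(unassigned)
        )
    return checks
```

If all nodes are present but shards are still unassigned, the cause is more likely configuration, disk space, or filtering rules, which the allocation explain output should help narrow down.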
5. Summary of Cluster Red & Yellow Problems
Red and yellow states are common problems in cluster operation and maintenance.
Besides cluster failures, routine operations such as creating an index or adding replicas make the cluster temporarily red or yellow, so monitoring alerts should fire only after a delay.
Find the root cause by checking the number of nodes and using the relevant APIs provided by ES.
You can also move or reallocate shards manually.
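The point about delaying alerts can be sketched as a small debounce: raise an alarm only when the cluster has been non-green continuously for longer than a grace period. The class name `HealthAlarm` and the 300-second default are assumptions for illustration. The caller passes in timestamps, so the logic can be tried out without a live cluster.

```python
class HealthAlarm:
    """Fire only after the cluster has stayed non-green for delay_seconds."""

    def __init__(self, delay_seconds=300.0):
        self.delay_seconds = delay_seconds
        self._unhealthy_since = None  # time of the first non-green sample

    def observe(self, status, now):
        """Record one health sample; return True if an alert should fire."""
        if status == "green":
            # Recovery resets the timer, so short red/yellow blips never alert.
            self._unhealthy_since = None
            return False
        if self._unhealthy_since is None:
            self._unhealthy_since = now
        return now - self._unhealthy_since >= self.delay_seconds
```

In a polling loop, `status` would come from the `status` field of `GET _cluster/health` and `now` from something like `time.monotonic()`. A brief red state right after index creation then passes without an alert, while a persistent one triggers it once the delay has elapsed.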
———————————————————
Copyright statement: This is an original article by CSDN blogger "I am also a pedestrian in Linjiang", published under the CC 4.0 BY-SA license. When reprinting, please include the original source link and this statement.
Original link: https://blog.csdn.net/weixin_56752399/article/details/120992261