ES pitfall log: UNASSIGNED shards that could not be recovered

Problem background

Replacing the node

We run an ES cluster in production: three machines hosting a total of 6 nodes. It had been running for several months without any problems. Unfortunately, just yesterday the disk on node 3 failed and took the whole machine down. At first nobody thought it was a big deal. Doesn't ES have built-in disaster recovery? Swap in a new node and the shards will be reallocated automatically.

Unassigned shards

After swapping in a new node, full of confidence, we found that the cluster status stayed red and there were more than 180 unassigned shards.

curl -XGET http://localhost:9200/_cluster/health

{
    "cluster_name": "escluster",
    "status": "red",
    "timed_out": false,
    "number_of_nodes": 6,
    "number_of_data_nodes": 6,
    "active_primary_shards": 498,
    "active_shards": 767,
    "relocating_shards": 0,
    "initializing_shards": 0,
    "unassigned_shards": 185,
    "delayed_unassigned_shards": 0,
    "number_of_pending_tasks": 0,
    "number_of_in_flight_fetch": 0,
    "task_max_waiting_in_queue_millis": 0,
    "active_shards_percent_as_number": 80.5672268907563
}
curl -XGET http://localhost:9200/_cat/shards | grep UNASSIGNED

[screenshot: _cat/shards output listing the UNASSIGNED shards]
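
Each line of that output follows the default _cat/shards columns (index, shard, prirep, state); the unassigned entries looked roughly like the following (index names here are illustrative):

XXX-2022.03.15 0 p UNASSIGNED
XXX-2022.03.16 0 p UNASSIGNED
...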

Troubleshooting

Shard recovery concurrency ❌

The presence of unassigned shards means that some shards have not been allocated. At first we took it for granted that this was simply because the new node had just joined the cluster and the shards had not yet been recovered. To speed up shard allocation, we increased the number of concurrent shard recoveries per node.

curl -XPUT http://localhost:9200/_cluster/settings -H 'Content-Type: application/json' -d'
{
    "persistent": {
        "cluster.routing.allocation.node_concurrent_recoveries": 10 
    }
}
'

However, it made no difference; after waiting for a long time there was still no change.
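
While waiting, _cat/recovery can be used to check whether any shard recoveries are actually in progress, e.g.:

curl -XGET "http://localhost:9200/_cat/recovery?v&active_only=true"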

allocation explain

Then we used the allocation explain command to view the allocation status of one of the unassigned shards:

curl -XGET http://localhost:9200/_cluster/allocation/explain?pretty

[screenshot: cluster allocation explain response]
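
Trimmed to the relevant fields (the values below are reconstructed as an illustration), the response looks roughly like this:

{
    "index": "XXX-2022.03.15",
    "shard": 0,
    "primary": true,
    "current_state": "unassigned",
    "unassigned_info": {
        "reason": "NODE_LEFT",
        "last_allocation_status": "no_valid_shard_copy"
    },
    "can_allocate": "no_valid_shard_copy",
    "allocate_explanation": "cannot allocate because a previous copy of the primary shard existed but can no longer be found on the nodes in the cluster"
}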

From unassigned_info we can see the reason NODE_LEFT, which means the node that held the shard is gone. last_allocation_status is even more explicit: no_valid_shard_copy, there is no valid copy of the shard left. allocate_explanation spells it out: "cannot allocate because a previous copy of the primary shard existed but can no longer be found on the nodes in the cluster", in other words, no usable copy of the shard can be found on any node in the cluster.

This confused us. For exactly this kind of disaster recovery, an ES index has 1 replica by default, and by ES's allocation rules a replica shard is never placed on the same machine as its primary. Losing a single node should therefore never lose both the primary and the replica. Could it be... could it be that these indices have no replicas at all???
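
To see where the shards of a suspect index actually live, and whether a replica copy exists at all, _cat/shards can be scoped to a single index, for example:

curl -XGET "http://localhost:9200/_cat/shards/XXX-2022.03.15?v"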

Just to be sure, we checked the settings of one of the affected indices:

curl -XGET http://localhost:9200/XXX-2022.03.15/_settings

{
    "XXX-2022.03.15": {
        "settings": {
            "index": {
                "routing": {
                    "allocation": {
                        "require": {
                            "box_type": "hot"
                        }
                    }
                },
                "number_of_shards": "1",
                "provided_name": "XXX-2022.03.15",
                "creation_date": "1647273614797",
                "number_of_replicas": "0",
                "uuid": "Dy7G3ZaESYqLB_aFk8M3Cg",
                "version": {
                    "created": "7080099"
                }
            }
        }
    }
}

We didn't know until we looked, and what we saw gave us a shock: this index has 1 shard and no replica at all... Where did my replica go??? We quickly confirmed with the development team: because the machines have relatively small disks, the developers had configured the indices with no replicas to save storage!!!
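
To see how many indices were in the same situation, the replica count of every index can be listed via _cat/indices, for example:

curl -XGET "http://localhost:9200/_cat/indices?v&h=index,pri,rep,health"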

Good grief. And here we were counting on ES for disaster recovery; without replicas there is nothing to recover from. The mystery was solved and the cause was found, but the data could not be brought back.

Solution

The data cannot be retrieved, but the cluster cannot stay red forever, and there were still more than 180 unassigned shards to deal with.

reroute ❌

Searching online for related solutions, we learned that this kind of problem can sometimes be fixed by manually rerouting the shard with allocate_stale_primary:

curl -H 'Content-Type: application/json' \
    -XPOST http://localhost:9200/_cluster/reroute?pretty -d '{
    "commands" : [ {
        "allocate_stale_primary" :
            {
              "index" : "XXX", 
              "shard" : 0,
              "node" : "target-data-node-id",
              "accept_data_loss" : true
            }
        }
    ]
}'
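
The node parameter must reference an existing data node, by name or id; one way to list the candidates is _cat/nodes:

curl -XGET "http://localhost:9200/_cat/nodes?v&h=id,name,ip,node.role"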

But because the data node that held the only copy is gone, we received the following error:

[screenshot: reroute error response]

In short, unless the lost node rejoins the cluster, that data is gone.

allocate_empty_primary

Since the data cannot be recovered, the only remaining option is to allocate an empty primary shard, which explicitly discards whatever data the shard held.

curl -H 'Content-Type: application/json' \
    -XPOST http://localhost:9200/_cluster/reroute?pretty -d '{
    "commands" : [ {
        "allocate_empty_primary" :
            {
              "index" : "XXX", 
              "shard" : 0,
              "node" : "target-data-node-id",
              "accept_data_loss" : true
            }
        }
    ]
}'
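
With more than 180 unassigned shards, issuing this command one by one is tedious. Below is a rough shell sketch of the same idea (the target node name is a placeholder you must replace, and it assumes the unassigned shards are all primaries, as in our case):

#!/bin/bash
# Placeholder: replace with the name or id of an existing data node
NODE="target-data-node-id"

# Walk every UNASSIGNED shard and allocate an empty primary for it.
# _cat/shards default columns: index shard prirep state ...
curl -s http://localhost:9200/_cat/shards | grep UNASSIGNED | \
while read index shard prirep state rest; do
  # Only primary shards can be allocated as empty primaries
  [ "$prirep" = "p" ] || continue
  curl -s -H 'Content-Type: application/json' \
    -XPOST http://localhost:9200/_cluster/reroute -d "{
      \"commands\": [{
        \"allocate_empty_primary\": {
          \"index\": \"${index}\",
          \"shard\": ${shard},
          \"node\": \"${NODE}\",
          \"accept_data_loss\": true
        }
      }]
    }"
done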

delete index

There is also a more thorough solution: delete the broken indices and be done with it. The data is gone anyway, and an empty index is of no use to anyone. Out of sight, out of mind.
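
The delete call itself is the standard index delete API (using the index name from the example above):

curl -XDELETE http://localhost:9200/XXX-2022.03.15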



Origin blog.csdn.net/qq_32907195/article/details/132272370