[ZooKeeper] ZooKeeper split-brain problem

1 Overview

Reprinted: https://blog.csdn.net/yjp198713/article/details/79400927

1.1 Why is ZooKeeper deployed on an odd number of servers?

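ZooKeeper's fault tolerance means that after some ZooKeeper servers go down, the number still running must be greater than the number that failed. In other words, more than n/2 of the n servers must survive for ZooKeeper to remain usable. A leader can be elected whether the number of servers is odd or even.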

With 5 machines, up to 2 can go down and the cluster can still be used, because the remaining 3 are more than 5/2. An odd number is preferred because it uses the fewest servers for a given level of fault tolerance. For example, to tolerate 2 failures you need 5 servers if you choose an odd number and 6 if you choose an even number. With 6 ZooKeeper servers, still only 2 can go down, so from a resource-saving perspective there is no need to deploy 6 (an even number of) servers.

ZooKeeper has this property: as long as more than half of the machines in the cluster are working normally, the cluster as a whole is available to the outside. So with 2 servers, if 1 dies the cluster cannot be used, because 1 is not more than half; the failure tolerance of 2 servers is 0. Similarly, with 3 servers, if one dies there are 2 healthy ones left, which is more than half, so the tolerance of 3 servers is 1. Listing a few more cases (servers -> tolerance): 2 -> 0; 3 -> 1; 4 -> 1; 5 -> 2; 6 -> 2. The rule is that 2n and 2n-1 servers have the same tolerance, n-1, so there is no reason to add the extra, unnecessary server.
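The rule above can be checked with a few lines of Python. This is only a sketch of the arithmetic; the function names `quorum_size` and `fault_tolerance` are made up for illustration and are not part of ZooKeeper.

```python
def quorum_size(n: int) -> int:
    """Smallest number of servers that is strictly more than half of n."""
    return n // 2 + 1


def fault_tolerance(n: int) -> int:
    """Largest number of servers that can fail while a majority survives."""
    return n - quorum_size(n)


for n in range(2, 7):
    print(f"{n} servers: quorum={quorum_size(n)}, "
          f"tolerates {fault_tolerance(n)} failure(s)")
```

The loop prints the same tolerances as the list above: 5 and 6 servers both tolerate 2 failures.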

From the above, the conclusion is that choosing an odd number of servers is a matter of saving resources.

1.2 Zookeeper Split-Brain problem

1.2.1 What is split brain?

In plain terms: suppose your cluster has two nodes, and both know that the cluster needs to elect a master. When communication between the two is fine, they reach a consensus and one of them is chosen as the master. But if communication between them breaks, each node concludes that there is no master and elects itself. The cluster then has two masters.

For ZooKeeper, a very important question is how to decide that a node is dead. In a distributed system this decision is made by a monitor, but it is hard for the monitor to determine the state of other nodes. The only reliable way is the heartbeat, and ZooKeeper also uses heartbeats to judge whether a client is still alive.

Master HA built on ZooKeeper works in basically the same way. Each node tries to register an ephemeral (temporary) node that represents the master. The nodes that fail to register become slaves and use the watch mechanism to monitor the ephemeral node created by the master. ZooKeeper determines the master's status through its internal heartbeat mechanism. Once something happens to the master, ZooKeeper learns of it quickly and notifies the slaves, and the slaves then react. This completes a switchover. This is a fairly general pattern, and most systems implement master HA this way.
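The registration-and-watch pattern can be illustrated with a small in-memory model. This is a hedged sketch, not the ZooKeeper client API: `FakeCoordinator`, `Node`, `try_register`, `watch` and `expire_session` are all hypothetical names invented for this example, and a heartbeat timeout is simulated by calling `expire_session` directly.

```python
class FakeCoordinator:
    """In-memory stand-in for the coordination service (not the ZooKeeper API)."""

    def __init__(self):
        self.master = None    # owner of the ephemeral "master" node, if any
        self.watchers = []    # callbacks to fire when the node disappears

    def try_register(self, name):
        """Create the ephemeral node if absent; return True if `name` owns it."""
        if self.master is None:
            self.master = name
        return self.master == name

    def watch(self, callback):
        """Register a one-shot callback for the node's deletion."""
        self.watchers.append(callback)

    def expire_session(self, name):
        """Simulate a heartbeat timeout: drop the node, notify watchers once."""
        if self.master != name:
            return
        self.master = None
        pending, self.watchers = self.watchers, []
        for callback in pending:
            callback()


class Node:
    def __init__(self, name, coordinator):
        self.name = name
        self.coordinator = coordinator
        self.role = None

    def campaign(self):
        """Try to become master; otherwise become a slave and watch."""
        if self.coordinator.try_register(self.name):
            self.role = "master"
        else:
            self.role = "slave"
            self.coordinator.watch(self.campaign)


zk = FakeCoordinator()
a, b, c = (Node(n, zk) for n in "abc")
for node in (a, b, c):
    node.campaign()
print(a.role, b.role, c.role)    # a registered first, so a is the master

zk.expire_session("a")           # the service stops hearing a's heartbeat
print(zk.master, a.role, b.role, c.role)
```

After the simulated expiry, `b` takes over, yet `a.role` is still `"master"`, because nothing in this model tells `a` that its session was dropped. That gap is exactly what the next paragraph is about.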

But there is a serious problem here that, if overlooked, can leave the system in a split-brain state for a short time. A heartbeat can time out because the master is down, but it can also time out because of a network problem between the master and ZooKeeper. The second case is suspended animation: the master is not dead, but the network problem makes ZooKeeper conclude that it is, so ZooKeeper notifies the other nodes to switch and one of the slaves becomes the master while the original master is still running. Clients also receive the master-switch notification, but with some delay, because ZooKeeper has to notify them one by one. During that window the system is in a confused state: some clients have already been told to connect to the new master, while others are still connected to the old one. If two clients update the same piece of data at the same moment, one through the old master and one through the new master, serious problems will occur.

2. Summary

Suspended animation (feigned death): the master is considered dead because of a heartbeat timeout (caused by network problems), but it is actually still alive.

Split brain: because of suspended animation, a new election is started and a new master is elected, but then the old master's network connection recovers. The result is two masters, with some clients connected to the old master and some connected to the new master.

2.1. What caused it?

The main cause is that the ZooKeeper cluster and the ZooKeeper client (here, the master) cannot detect the timeout at exactly the same moment; one always notices before the other. If the cluster notices before the client does, the situation described above occurs. In addition, after the switch has been made, the notifications reach the individual clients at different times. The probability of all this is generally very small: the master must lose its network connection to the ZooKeeper cluster while its connections to the other roles in the system remain healthy. But once it does happen, the consequences are serious and the data becomes inconsistent.
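The timing gap can be made concrete with a toy calculation. This is a rough model under stated assumptions, not a measurement of real ZooKeeper behaviour: the function `dual_master_window` and its three parameters are hypothetical, and the times are arbitrary example values in seconds.

```python
def dual_master_window(cluster_expires_at, master_notices_at, failover_delay):
    """Seconds during which both the old and the new master are active.

    All three parameters are hypothetical inputs to this model:
      cluster_expires_at: when the cluster declares the master's session dead
      master_notices_at:  when the old master itself realizes it lost the session
      failover_delay:     time a slave needs to take over after the expiry
    """
    new_master_active_at = cluster_expires_at + failover_delay
    return max(0.0, master_notices_at - new_master_active_at)


# The cluster expires the session at t=10s and a slave takes over 0.5s later,
# but the old master only steps down at t=12s: both are active in between.
print(dual_master_window(10.0, 12.0, 0.5))

# If the old master steps down before the new one is active, there is no overlap.
print(dual_master_window(10.0, 10.2, 0.5))
```

In this model the overlap disappears only when the old master notices the timeout before the new master becomes active, which matches the point that the problem arises when the cluster detects the timeout first.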

2.2. How does zookeeper solve it?

To solve the Split-Brain problem, there are generally 3 ways:

  1. Quorums: for example, in a cluster of 3 nodes the quorum is 2, which means the cluster can tolerate 1 node failure; a leader can still be elected and the cluster remains available. In a cluster of 4 nodes the quorum is 3, so at least 3 nodes must be up, and the tolerance is still 1. If 2 nodes fail, the entire cluster becomes unavailable.
  2. Redundant communications: the cluster uses multiple communication channels, so that the failure of one channel does not leave the nodes unable to communicate.
  3. Fencing (shared resources): a node that can see the shared resource is in the cluster, and the node that obtains the lock on the shared resource is the leader; a node that cannot see the shared resource is not in the cluster.
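The quorum approach in item 1 prevents two leaders because a network partition can leave at most one side with a majority. A minimal sketch, assuming a partition is described simply by the sizes of its sides (the function name `sides_with_quorum` is made up for illustration):

```python
def sides_with_quorum(cluster_size, partition_sizes):
    """Return the sizes of the partition sides that can still elect a leader."""
    quorum = cluster_size // 2 + 1    # strictly more than half
    return [size for size in partition_sizes if size >= quorum]


# A 5-node cluster split 3 + 2: only the side with 3 nodes keeps a quorum.
print(sides_with_quorum(5, [3, 2]))

# A 4-node cluster split 2 + 2: neither side has a quorum, so no leader at all.
print(sides_with_quorum(4, [2, 2]))

# A 3-node cluster with every node isolated: no side has a quorum either.
print(sides_with_quorum(3, [1, 1, 1]))
```

Two disjoint sides cannot both hold more than half of the nodes, so the result never contains more than one entry.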

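ZooKeeper uses quorums by default: a leader can be elected only with votes from more than half of the nodes in the cluster. This guarantees that the leader is unique: either exactly one leader is elected, or the election fails. Quorums play two roles in ZooKeeper: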

  1. It is the minimum number of nodes needed to elect a leader, which ensures that the cluster is available.
  2. It is the minimum number of nodes that must have stored a piece of data before the client is told that the data is safely stored. Once that many nodes have saved the data, the client is notified that the write is safe and can continue with other work. The remaining nodes in the cluster will eventually save the data as well.

Suppose a leader goes into suspended animation and the remaining followers elect a new leader. The old leader then comes back and still considers itself the leader. When it sends write requests to the followers, they are rejected, because every time a new leader is elected a new epoch is generated, and the epoch only ever increases. Once followers have acknowledged the new leader and know its epoch, they reject every request whose epoch is smaller than the current leader's epoch. Could there be followers that do not know about the new leader? Yes, but they cannot be the majority, otherwise the new leader could not have been elected. ZooKeeper writes also follow the quorum mechanism, so a write that is not supported by a majority is invalid. Even if the old leader still considers itself the leader, its writes have no effect.
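The combination of epoch checks and quorum writes can be sketched with a simplified model. This is not ZooKeeper's actual ZAB implementation: the `Follower` class, the `write` function and `ENSEMBLE_SIZE` are hypothetical, and the model ignores log synchronization (in a real ensemble the stale entry accepted by `f1` would be discarded when `f1` syncs with the new leader).

```python
ENSEMBLE_SIZE = 5    # old leader + new leader + three followers (assumed setup)


class Follower:
    def __init__(self, epoch=1):
        self.accepted_epoch = epoch   # highest leader epoch this follower knows
        self.log = []

    def new_leader(self, epoch):
        """Acknowledge a newly elected leader and remember its epoch."""
        self.accepted_epoch = max(self.accepted_epoch, epoch)

    def propose(self, epoch, value):
        """Accept a write unless it comes from a leader with a stale epoch."""
        if epoch < self.accepted_epoch:
            return False
        self.accepted_epoch = epoch
        self.log.append((epoch, value))
        return True


def write(reachable_followers, epoch, value):
    """A write commits only if the leader plus its acks form a strict majority."""
    acks = 1 + sum(f.propose(epoch, value) for f in reachable_followers)
    return acks > ENSEMBLE_SIZE // 2


f1, f2, f3 = Follower(), Follower(), Follower()

# f2 and f3 acknowledge the new leader (epoch 2); f1 has not heard of it yet.
f2.new_leader(2)
f3.new_leader(2)

# The old leader wakes up and still writes with epoch 1.
old_ok = write([f1, f2, f3], epoch=1, value="from-old-leader")
# The new leader writes with epoch 2 to the followers that acknowledged it.
new_ok = write([f2, f3], epoch=2, value="from-new-leader")
print(old_ok, new_ok)
```

The old leader collects only its own vote plus `f1`, which is 2 out of 5, so its write is not committed, while the new leader reaches 3 out of 5.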

Origin blog.csdn.net/qq_21383435/article/details/108697979