Analysis of Redis Sentinel Architecture and Cluster Election Principle-04

Sentinel is a special Redis service that does not serve reads or writes; its main job is to monitor Redis instance nodes.
Under the sentinel architecture, the client asks a sentinel for the Redis master node the first time it connects and then accesses the master directly, instead of going through a sentinel proxy on every request. When the Redis master changes, the sentinels detect it promptly and notify the client of the new master (Redis clients generally implement a subscription feature and subscribe to the node-change messages published by sentinel).
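As a minimal sketch of the client side, the snippet below parses the payload of a Sentinel +switch-master notification and updates a cached master address. The MasterCache class and the sample addresses are hypothetical; a real client would receive the payload through a Pub/Sub subscription on a sentinel rather than from a hard-coded string.

```python
from dataclasses import dataclass


@dataclass
class MasterCache:
    """Hypothetical client-side cache of the current master address."""
    name: str
    host: str
    port: int

    def on_switch_master(self, payload: str) -> bool:
        """Apply a '+switch-master' payload to the cache.

        Payload layout: <master-name> <old-ip> <old-port> <new-ip> <new-port>
        Returns True if the cached address was updated.
        """
        parts = payload.split()
        if len(parts) != 5:
            return False  # malformed payload, keep the old address
        name, _old_ip, _old_port, new_ip, new_port = parts
        if name != self.name:
            return False  # notification is for a different master
        self.host, self.port = new_ip, int(new_port)
        return True


cache = MasterCache("mymaster", "10.0.0.1", 6379)
updated = cache.on_switch_master("mymaster 10.0.0.1 6379 10.0.0.2 6379")
print(updated, cache.host, cache.port)
```

After the update, the client reconnects to the new address and keeps talking to the master directly, which matches the flow described above.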

1. When a sentinel considers a master server offline, it negotiates with the other sentinels to select a sentinel leader that will carry out the failover.

2. Every sentinel that finds the master server offline can ask the other sentinels to elect it as leader. Votes are granted first come, first served. Each sentinel also increments the configuration epoch (election cycle) for every election, and at most one sentinel leader is elected in each epoch.

3. If more than half of the sentinels vote for the same sentinel, that sentinel becomes the leader. It then performs the failover and picks a new master from the surviving slaves. This election process is similar to the cluster's master election.

4. Even if the sentinel cluster has only one sentinel node, the Redis master and slaves run normally and a new master can still be elected. If the master goes down, the single sentinel is itself the sentinel leader and the new master is elected as usual.

5. However, for high availability, it is generally recommended to deploy at least three sentinel nodes.
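The voting rule in the list above can be sketched as a small simulation. This is only an illustration of "first come, first served plus more than half" within one epoch; the function name elect_leader and the sentinel ids are made up here, and the real protocol has details (timeouts, retries in a later epoch) that are left out.

```python
from collections import Counter


def elect_leader(requests, sentinels):
    """Simulate one epoch of first-come-first-served voting.

    requests:  list of (voter, candidate) pairs in arrival order; each voter
               gives its single vote to the first candidate that asks.
    sentinels: ids of all sentinels in the deployment.
    Returns the leader id, or None if nobody reaches a majority.
    """
    votes = {}
    for voter, candidate in requests:
        votes.setdefault(voter, candidate)  # later requests are ignored
    majority = len(sentinels) // 2 + 1
    for candidate, count in Counter(votes.values()).items():
        if count >= majority:
            return candidate
    return None


three = ["s1", "s2", "s3"]
# s1 and s2 vote for themselves; s3 hears from s1 first, then from s2.
arrivals = [("s1", "s1"), ("s2", "s2"), ("s3", "s1"), ("s3", "s2")]
leader = elect_leader(arrivals, three)
# Every sentinel votes for itself: no majority in this epoch.
split = elect_leader([("s1", "s1"), ("s2", "s2"), ("s3", "s3")], three)
# A single sentinel is trivially its own majority.
single = elect_leader([("s1", "s1")], ["s1"])
print(leader, split, single)
```

The split case is why at most one leader can exist per epoch: a failed round simply ends without a leader and is retried in a new epoch.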

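Next, one slave is elected as the new master.
The remaining slaves are told which node is the new master.
Clients are notified of the master-slave change.
Finally, sentinel waits for the old master to come back and then makes it a slave of the new master.
So how is a "suitable" slave node chosen?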
  1. Select the slave node with the highest priority (slave-priority, configured manually; in Redis a lower value means a higher priority, and 0 means the node is never promoted). If one node clearly wins, return it; otherwise continue.
  2. Next, select the slave node with the largest replication offset (the most complete copy of the data). If one node clearly wins, return it; otherwise continue.
  3. Finally, select the slave node with the smallest run_id. The run_id is a random identifier, so this step only serves as a deterministic tie-breaker.
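The three rules above amount to sorting by a compound key. Here is a minimal sketch, assuming the Redis convention that a lower slave-priority value means a higher priority and that 0 excludes a node from promotion; the Slave class and the sample values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Slave:
    run_id: str
    priority: int     # slave-priority: lower value is preferred, 0 = never promote
    repl_offset: int  # replication offset: larger means more data copied


def pick_slave(slaves):
    """Return the slave to promote, or None if no slave is eligible."""
    candidates = [s for s in slaves if s.priority > 0]
    if not candidates:
        return None
    # Rule 1: priority, rule 2: largest offset, rule 3: smallest run_id.
    return min(candidates, key=lambda s: (s.priority, -s.repl_offset, s.run_id))


slaves = [
    Slave("b1", priority=100, repl_offset=500),
    Slave("a9", priority=100, repl_offset=800),
    Slave("c3", priority=100, repl_offset=800),
    Slave("d0", priority=0, repl_offset=900),  # excluded despite the best offset
]
chosen = pick_slave(slaves)
print(chosen.run_id)
```

Here all eligible slaves share the same priority, two of them tie on offset, and the run_id comparison settles the tie.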

Analysis of Redis cluster election principle

When a slave finds that its master has entered the FAIL state, it attempts a failover in order to become the new master. Since the failed master may have several slaves, those slaves compete to become the master node. The process is as follows:

1. The slave finds that its master has entered the FAIL state.

2. It increments the cluster currentEpoch it has recorded by 1 and broadcasts a FAILOVER_AUTH_REQUEST message.

3. The other nodes receive the message, but only masters respond. Each master checks the legitimacy of the requester and sends FAILOVER_AUTH_ACK, and it sends at most one ack per epoch.

4. The slave attempting the failover collects the FAILOVER_AUTH_ACK replies returned by the masters.

5. The slave becomes the new master once it has received acks from more than half of the masters. (This explains why the cluster needs at least three master nodes: with only two, when one of them goes down the single remaining master is not a majority, so the election cannot succeed.)

6. The slave broadcasts a PONG message to notify the other cluster nodes.

The slave does not start an election the moment its master enters the FAIL state; it waits for a certain delay. The delay gives the FAIL state time to propagate through the cluster. If the slave asked for votes immediately, other masters might not yet be aware of the FAIL state and might refuse to vote for it.
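The two voting rules in the steps above, one ack per master per epoch and promotion only with more than half of all masters, can be sketched as follows. The Master class and run_election function are hypothetical stand-ins for illustration, not the cluster's actual data structures.

```python
class Master:
    """Hypothetical voting master: grants at most one ack per epoch."""

    def __init__(self) -> None:
        self.last_vote_epoch = 0

    def handle_auth_request(self, epoch: int) -> bool:
        if epoch <= self.last_vote_epoch:
            return False  # already voted in this (or a newer) epoch
        self.last_vote_epoch = epoch
        return True


def run_election(epoch, reachable_masters, total_masters):
    """Return True if the requesting slave collects more than half of all
    masters' acks. total_masters includes the failed master."""
    acks = sum(m.handle_auth_request(epoch) for m in reachable_masters)
    return acks > total_masters // 2


survivors = [Master(), Master()]          # 3 masters, 1 failed
first = run_election(6, survivors, 3)     # first slave to ask in epoch 6
second = run_election(6, survivors, 3)    # a competing slave, same epoch
lone = run_election(6, [Master()], 2)     # 2 masters, 1 failed
print(first, second, lone)  # True False False
```

The last case is the two-master situation from step 5: one ack out of two masters is not more than half, so no slave can be promoted.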

• Delay calculation formula:

 DELAY = 500ms + random(0 ~ 500ms) + SLAVE_RANK * 1000ms

SLAVE_RANK is the rank of this slave by the amount of data it has replicated from the master. The smaller the rank, the newer the replicated data. In this way, the slave holding the latest data initiates the election first (in theory).
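To make the formula concrete, here is a small sketch that evaluates it for a few ranks. The function name election_delay_ms is made up for this example; the constants come straight from the formula above.

```python
import random
from typing import Optional


def election_delay_ms(slave_rank: int, rng: Optional[random.Random] = None) -> int:
    """DELAY = 500ms + random(0 ~ 500ms) + SLAVE_RANK * 1000ms."""
    rng = rng or random.Random()
    fixed_ms = 500
    jitter_ms = rng.randint(0, 500)  # inclusive on both ends
    return fixed_ms + jitter_ms + slave_rank * 1000


rng = random.Random(42)  # seeded only to make the demo repeatable
delays = {rank: election_delay_ms(rank, rng) for rank in range(3)}
print(delays)
```

Rank 0 always lands between 500 ms and 1000 ms, while rank 1 starts at 1500 ms, so the random part never lets a slave with older data overtake one with newer data.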

Cluster split brain data loss problem

Writes in a Redis cluster do not have to be acknowledged by more than half of the nodes, so a split-brain problem can occur. When a network partition causes a split brain, several master nodes accept writes at the same time. Once the partition heals, one of those masters is demoted to a slave, and the writes it accepted are lost, which can be a large amount of data.

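A Redis-based lock acts on a single Redis node only. Even if Redis is made highly available through sentinel, the lock can be lost when that master node goes through a master-slave switchover for some reason:
A client acquires the lock on the Redis master node, but the key that holds the lock has not yet been synchronized to the slave node.
The master fails and a failover takes place; the slave node is promoted to master, and the lock is lost.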
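The lock-loss sequence can be reproduced with a toy model. In this sketch plain dictionaries stand in for the master and slave, and a list stands in for writes that have not been replicated yet; all names and keys are hypothetical, and no real Redis is involved.

```python
def set_nx(store: dict, key: str, value: str) -> bool:
    """Set key only if it does not exist yet (the usual lock primitive)."""
    if key in store:
        return False
    store[key] = value
    return True


master: dict = {}
slave: dict = {}
pending = []  # writes accepted by the master but not yet applied on the slave

# Client A takes the lock on the master; replication is asynchronous,
# so the write is only queued for the slave.
got_a = set_nx(master, "lock:order", "client-A")
pending.append(("lock:order", "client-A"))

# The master fails before the queued write is shipped; the slave is promoted.
pending.clear()
new_master = slave

# Client B asks the new master for the same lock and also succeeds.
got_b = set_nx(new_master, "lock:order", "client-B")
print(got_a, got_b)  # True True
```

Both clients now believe they hold the lock, which is exactly the lost-lock situation described above.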

Since split brain is caused by the network, beyond improving the network and hardware, the data loss it causes is mainly mitigated with the following configuration. The workaround is to add these parameters to the Redis configuration (this cannot prevent data loss 100%; refer to the cluster leader election mechanism):

min-replicas-to-write 1

This is the minimum number of slaves that must be keeping up with replication before the master accepts a write. The number can be chosen to imitate a majority mechanism: for example, in a cluster of three nodes it can be set to 1, which together with the leader makes 2, more than half.

Note: This setting reduces the availability of the cluster to some extent. For example, if no slave is available, the cluster cannot accept writes even when the leader itself is healthy. The trade-off has to be weighed for each scenario.

min-slaves-max-lag 10

If fewer than the required number of slaves have acknowledged replication within the last 10 s, the master stops accepting write requests. (Since Redis 5 this parameter is also available as min-replicas-max-lag; the older name remains as an alias.)

Reducing the value of min-slaves-max-lag limits how much data can be lost in the event of a failure. Once the delay exceeds this value, no more data is written to the master.
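How the two parameters work together can be sketched with a small check. This is a simplified model, not Redis source code: master_accepts_write is a hypothetical function, and the lag values stand for the seconds since each slave last acknowledged replication.

```python
def master_accepts_write(replica_lags_s, min_replicas_to_write=1, max_lag_s=10):
    """Return True if enough slaves acknowledged within max_lag_s seconds.

    replica_lags_s: seconds since each connected slave's last acknowledgement.
    """
    healthy = sum(1 for lag in replica_lags_s if lag <= max_lag_s)
    return healthy >= min_replicas_to_write


print(master_accepts_write([2, 15]))   # one slave is within 10 s
print(master_accepts_write([12, 15]))  # every slave is too far behind
print(master_accepts_write([]))        # master is isolated from all slaves
```

In a split brain the old master is cut off from its slaves, their lag keeps growing, and after max_lag_s seconds the check fails. From then on the isolated master rejects writes, which bounds the amount of data that can be lost to roughly that window.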

Origin blog.csdn.net/u011134399/article/details/131150156