Raft ABC, continued

The Raft distributed consensus protocol is used to build many distributed infrastructure services, and to some extent it plays the role of their heart. For that reason it is worth understanding some of Raft's harder points in depth.

Understanding committed correctly

A common misconception is that a log entry can be regarded as committed as soon as it has been copied to a majority of replicas. In fact that is not enough: what is missing is the step in which the corresponding operation is actually applied. My personal understanding from using Raft in practice is this: suppose a Raft log entry has been appended to several replicas and the node goes down at that moment. Because on_apply() has not yet been executed, the upper-layer user has not actually been ACKed for this entry.
Therefore this entry may be truncated, or it may become committed later, carried along when a subsequent entry is committed.
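
The distinction above can be sketched in a few lines. This is only an illustration of the author's reading, not any library's API: the Entry class, its fields, and the function names are hypothetical, and "applied" stands for "on_apply() has run for this entry".

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    """Hypothetical log entry used only for this illustration."""
    index: int
    appended_on: set = field(default_factory=set)  # replicas holding the entry
    applied: bool = False                          # True once on_apply() has run


def majority(cluster_size: int) -> int:
    """Smallest number of nodes that forms a majority."""
    return cluster_size // 2 + 1


def replicated_on_majority(entry: Entry, cluster_size: int) -> bool:
    return len(entry.appended_on) >= majority(cluster_size)


def safe_to_ack_user(entry: Entry, cluster_size: int) -> bool:
    # Replication alone is not enough under this reading:
    # the entry must also have gone through on_apply().
    return replicated_on_majority(entry, cluster_size) and entry.applied


entry = Entry(index=7, appended_on={"A", "B"})
print(replicated_on_majority(entry, 3))  # True
print(safe_to_ack_user(entry, 3))        # False
```

If the node crashes between the two states, the user never saw an ACK, so either outcome (truncation or later commit) is consistent with what the user was told.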

Understanding the comparison conditions in leader election

Why compare the time since the leader's last heartbeat with election_timeout

The comparison rule is as follows:
After a follower receives a vote request, it computes the interval between the time the request arrived and the time it last received a heartbeat from the leader, and then compares this interval with election_timeout. If the interval is smaller (meaning that some other node has gone a long time without receiving the leader's heartbeat and has started a vote, while this follower still hears from the leader), the follower rejects the vote request; otherwise it grants the vote.

Consider three interconnected IDCs A, B, and C, with B as the leader. Suppose that for some reason the network between B and C is disconnected. Without the restriction above, C would initiate a vote request carrying the highest term, A would grant the request, and the leader would change from B to C. In the same way, a little later the leader would very likely change back from C to B. The cycle repeats, and the replication group as a whole never gets a stable leader.
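
The heartbeat check can be written as a single comparison. This is a minimal sketch under stated assumptions: the function name, the parameter names, and the millisecond timestamps are all invented for illustration, and only the comparison itself comes from the rule above.

```python
def should_grant_vote(now_ms: int,
                      last_leader_heartbeat_ms: int,
                      election_timeout_ms: int) -> bool:
    """Return False while this follower has heard from the leader recently."""
    since_heartbeat = now_ms - last_leader_heartbeat_ms
    # Interval smaller than election_timeout: the leader still looks alive
    # from here, so the candidate's request is rejected.
    return since_heartbeat >= election_timeout_ms


# A heard from leader B 50 ms ago, so it rejects C's request.
print(should_grant_vote(now_ms=10_050,
                        last_leader_heartbeat_ms=10_000,
                        election_timeout_ms=1_000))  # False
```

In the A, B, C scenario this is what stops the flip-flop: A keeps receiving B's heartbeats, so C's request is rejected even though it carries a higher term.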

Why LastLogIndex and LastLogTerm are used to compare how new two logs are

The rule is as follows:
After a follower receives a vote request, and in the case where the request's term equals its current term, it compares its own LastLogIndex and LastLogTerm with the LastLogIndex and LastLogTerm carried in the request. If its own are higher, it rejects the request; if the request's are higher, it grants the request.

Why is this needed? Again consider the three interconnected IDCs A, B, and C, with B as the leader. Suppose that for some reason the network between B and C fails and C becomes the partitioned node. Because of the partition, C's current term is the highest and C is likely to initiate a vote, which makes B step down and raises the currentTerm of A and B. But C's log index is low: when C sends its vote request to A and B, they find that C's LastLogIndex is smaller than their own, so C cannot get a vote from A or B and its election fails. A or B then initiates a vote with the highest term (C's term + 1) and the latest LastLogIndex (that of A or B), wins the election, and holds the latest data.
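
The log comparison can be sketched as a tuple comparison. The function and parameter names are hypothetical. Two details are not spelled out in the text above and are taken from the Raft paper instead: the term of the last entry is compared first and the index only breaks ties, and a candidate whose log is exactly as new as the voter's is still granted the vote.

```python
def candidate_log_is_new_enough(my_last_log_term: int,
                                my_last_log_index: int,
                                cand_last_log_term: int,
                                cand_last_log_index: int) -> bool:
    """True if the candidate's log is at least as new as the voter's."""
    # Python compares tuples element by element: term first, then index.
    mine = (my_last_log_term, my_last_log_index)
    theirs = (cand_last_log_term, cand_last_log_index)
    return theirs >= mine


# C was cut off at index 10 while A went on to index 15 in the same term.
print(candidate_log_is_new_enough(3, 15, 3, 10))  # False: A rejects C
print(candidate_log_is_new_enough(3, 10, 3, 15))  # True: C accepts A
```

Note that the term being compared is the term of the last log entry, not the candidate's currentTerm, which is why C's inflated currentTerm does not help it win.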

Understanding the principle of membership changes correctly

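The single-member change strategy that Raft uses is relatively simple: only one node is added or removed at a time, so two separate majorities can never arise and decisions cannot conflict. Changes are handled according to the following rules: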
  • When the Leader receives AddPeer / RemovePeer, it processes the change right away rather than waiting until it is committed, so that the new peer set can be used immediately to replicate the AddPeer / RemovePeer request.
  • When a Leader starts, it sends an AddPeer request, to guard against an AddPeer from the previous round having been left uncommitted.
  • When the Leader removes its own node, it shuts down after the RemovePeer has been committed.
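
The claim that changing one node at a time never produces two separate majorities can be checked by brute force on small clusters. The sketch below is an illustration only; the helper names are hypothetical, and it simply enumerates every majority of the old and the new node sets and tests whether each pair shares a node.

```python
from itertools import combinations


def majorities(nodes):
    """All subsets of `nodes` that contain more than half of them."""
    nodes = sorted(nodes)
    need = len(nodes) // 2 + 1
    return [set(c)
            for k in range(need, len(nodes) + 1)
            for c in combinations(nodes, k)]


def always_overlap(old, new):
    """True if every majority of `old` intersects every majority of `new`."""
    return all(q_old & q_new
               for q_old in majorities(old)
               for q_new in majorities(new))


print(always_overlap({"A", "B", "C"}, {"A", "B", "C", "D"}))       # True
print(always_overlap({"A", "B", "C"}, {"A", "B"}))                 # True
print(always_overlap({"A", "B", "C"}, {"A", "B", "C", "D", "E"}))  # False
```

The last call adds two nodes at once: {A, B} is a majority of the old set and {C, D, E} is a majority of the new set, and the two share no node, which is exactly the conflict that the one-at-a-time rule avoids.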

With the rules above, nodes can be added and removed dynamically and safely. Because dynamic node changes and Leader election are two processes that run in parallel, nodes need to relax some checks so that leader election and AppendEntries can still gather a majority:

  • A node may accept AppendEntries requests from a Leader that is not in its own node list
  • A node may vote for a Candidate that is not in its own node list

To avoid having two node changes in progress at the same time, a new node change is not allowed while an earlier change is still in progress and has not been committed. Node changes do have one problem. Take a two-node cluster in which RemovePeer has been initiated. If one node goes down at this moment and the other has not received the RemovePeer request, the system stops working. It is strongly recommended that a cluster have >= 3 nodes.
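
Both points in this paragraph can be sketched briefly. The ConfigChanger class and the can_make_progress function are hypothetical names made up for this illustration; the only behavior taken from the text is that a second change is refused until the first is committed, and that a surviving node which still believes the cluster has two members cannot form a majority alone.

```python
class ConfigChanger:
    """Hypothetical helper that allows one uncommitted change at a time."""

    def __init__(self, peers):
        self.peers = set(peers)
        self.change_in_flight = False

    def propose(self, op: str, peer: str) -> bool:
        if self.change_in_flight:
            return False  # an earlier change has not been committed yet
        if op == "add":
            self.peers.add(peer)
        elif op == "remove":
            self.peers.discard(peer)
        else:
            raise ValueError(f"unknown operation: {op}")
        self.change_in_flight = True
        return True

    def on_committed(self) -> None:
        self.change_in_flight = False


def can_make_progress(believed_cluster_size: int, alive: int) -> bool:
    """A majority of the believed membership must be alive."""
    return alive >= believed_cluster_size // 2 + 1


changer = ConfigChanger({"A", "B", "C"})
print(changer.propose("add", "D"))  # True
print(changer.propose("add", "E"))  # False, the first change is still pending
print(can_make_progress(2, 1))      # False, the two-node case above
print(can_make_progress(3, 2))      # True
```

With three nodes the loss of one still leaves a majority, which is the reasoning behind the recommendation of at least three nodes.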

Origin blog.51cto.com/xiamachao/2466092