Log replication in the Raft algorithm

Previous article: Leader election in the Raft algorithm
  The previous article covered the Leader election process in the Raft algorithm. This article builds on it and describes log replication.


  First, look at what a log entry contains:

  1. Command: a command that the replicated state machine can execute.
  2. Term number: the term of the Leader that created the entry.
  3. Index: an integer that identifies the entry's position in the log.

A log entry is in one of two states: uncommitted or committed (for safety, a committed entry is never deleted or overwritten).

1 Normal operation

  • When the Leader receives a request from a client (the request contains a command to be executed by the replicated state machine), it appends the request to its log as a new entry. The entry's term number is the Leader's current term, and its index is the highest index in the Leader's local log plus 1.
    • Within a given term, a Leader creates at most one log entry with a given index (that is, it never creates two or more entries with the same index in the same term).
  • The Leader then sends the entry in AppendEntries RPC messages to the other servers in the network (hereafter Followers) so that they replicate it.
  • Each Follower that receives the message and successfully replicates the entry returns a success reply.
  • Once the Leader has received success replies from enough Followers that the entry is stored on a majority of the servers (the Leader included), it considers the entry committed. At this point the Leader does three things:
  1. It applies the entry to its local replicated state machine.
  2. It notifies all Followers that the entry is committed, so that each Follower that has received the entry commits it and applies it to its own local state machine.
  3. It returns the result of the execution to the client.

  An entry is considered committed once it has been successfully replicated on a majority of the servers in the network. When an entry is committed, any preceding entries in the Leader's log that have not yet been committed are committed along with it.
  Some Followers may respond slowly or crash, for example because of network conditions. In that case the Leader retries the AppendEntries RPC to that Follower indefinitely, until it succeeds.
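
  The normal-operation flow above can be sketched roughly as follows. This is a minimal single-process Python sketch rather than a real Raft implementation: the names LogEntry, Leader, client_request and on_append_success are hypothetical, networking and retries are left out, and it only handles entries that this Leader created in its current term.

```python
from dataclasses import dataclass, field


@dataclass
class LogEntry:
    term: int       # term of the Leader that created the entry
    index: int      # position of the entry in the log (1-based)
    command: str    # command for the state machine


@dataclass
class Leader:
    current_term: int
    cluster_size: int                           # total servers, Leader included
    log: list = field(default_factory=list)
    commit_index: int = 0                       # highest index known committed
    acks: dict = field(default_factory=dict)    # index -> set of Follower ids

    def client_request(self, command: str) -> LogEntry:
        # New entry: current term, highest local index + 1.
        entry = LogEntry(self.current_term, len(self.log) + 1, command)
        self.log.append(entry)
        self.acks[entry.index] = set()
        return entry  # a real Leader would now send it via AppendEntries

    def on_append_success(self, follower_id: str, index: int) -> bool:
        # Record a success reply; commit once a majority stores the entry.
        self.acks[index].add(follower_id)
        stored_on = len(self.acks[index]) + 1   # +1 for the Leader itself
        if stored_on * 2 > self.cluster_size and index > self.commit_index:
            # Committing an entry also commits every earlier entry.
            self.commit_index = index
            return True
        return False
```

  In a five-server cluster, for example, the second success reply for an entry makes three stored copies, so that reply is the one that commits the entry.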

1.1 Log consistency check

  Above, we said that a Follower returns a success reply once it receives an AppendEntries RPC message. In fact, the Follower first performs a consistency check on the message (in normal operation the Leader's and Follower's logs stay consistent, so the check never fails). The check works as follows:

  • When the Leader creates an AppendEntries RPC message, it includes the term number and index of the entry in its log that immediately precedes the new entries.
  • When a Follower receives the AppendEntries RPC message, it checks whether its own log contains an entry with that index and term number.
    • If it does, the Follower's log is consistent with the Leader's up to that entry.
    • If it does not, the Follower rejects the AppendEntries RPC message.

  The consistency check works as an inductive argument. In normal operation, the first entry in the log trivially passes the check. The message carrying the second entry includes the term number and index of the first entry, so as long as the Leader and the Follower agree on the first entry, the second entry passes the check as well, and so on for every later entry.

  This yields the Log Matching Property:

  • If two entries in different logs have the same index and term, then they store the same command.
  • If two entries in different logs have the same index and term, then all preceding entries in the two logs are identical. (This follows from the consistency check.)
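
  The Follower-side check described in this section can be sketched as below. It is an illustrative sketch, not code from any Raft library: the function name consistency_check and the representation of a log as a list of (term, command) tuples are assumptions made for the example.

```python
def consistency_check(log, prev_index, prev_term):
    """Return True if the Follower's log matches the Leader's at prev_index.

    `log` is a list of (term, command) tuples; the entry at list position i
    has log index i + 1. prev_index == 0 means the new entries start at the
    very beginning of the log.
    """
    if prev_index == 0:
        return True                      # nothing precedes the new entries
    if prev_index > len(log):
        return False                     # Follower is missing that entry
    return log[prev_index - 1][0] == prev_term
```

  For a Follower log of [(1, "a"), (1, "b"), (2, "c")], a message with prev_index 3 and prev_term 2 passes the check, while prev_term 3 or prev_index 4 is rejected.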

2 Special cases

  The network does not always operate normally. Because the Leader or a Follower may crash, the logs cannot always stay consistent. Three cases arise:

  1. The Follower is missing log entries that are present on the current Leader.
  2. The Follower has log entries that are not present on the current Leader. (For example, the old Leader crashed after sending an AppendEntries RPC message to only some of the Followers, and the server elected as the new Leader happens to be one that did not receive that message.)
  3. Both at once: the Follower is missing entries that are present on the current Leader and also has entries that the current Leader does not have.

[Figure: the log of the current Leader (top row) and the logs of Followers A to F]

  The numbers across the top of the figure are log indexes (1-12). Each box is a log entry, and the number inside the box is the term in which the entry was created. The current Leader (its log is the top row of the figure) is in term 8. The figure shows how each of the three cases above can come about:

  • Followers A and B (which crashed and did not receive the AppendEntries RPC messages sent by the Leader) fall under the first case.
  • Follower C was Leader in term 6 and Follower D was Leader in term 7, but each crashed before it finished sending its log. They hold entries that the current Leader lacks, so they fall under the second case.
  • Follower E was Leader in term 4 and Follower F was Leader in term 3, but each crashed before it finished sending its log, and neither received the AppendEntries RPC messages sent by the Leaders elected afterwards. They fall under the third case.
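
  As a rough illustration of the three cases, the sketch below compares a Follower's log with the Leader's and reports which case applies. The function name classify and the sample logs are hypothetical (they are not the logs from the figure), and each log is reduced to a list of term numbers, which is enough here because entries with the same index and term are the same entry.

```python
def classify(leader_terms, follower_terms):
    """Compare two logs given as lists of term numbers (index = position + 1)."""
    # Length of the common prefix of the two logs.
    common = 0
    for leader_term, follower_term in zip(leader_terms, follower_terms):
        if leader_term != follower_term:
            break
        common += 1
    missing = common < len(leader_terms)    # Leader has entries beyond the prefix
    extra = common < len(follower_terms)    # Follower has entries beyond the prefix
    if missing and extra:
        return "case 3: missing and extra entries"
    if missing:
        return "case 1: missing entries"
    if extra:
        return "case 2: extra entries"
    return "consistent"
```

  With a hypothetical Leader log of terms [1, 1, 4, 4, 5], a Follower holding [1, 1, 4] is in case 1, one holding [1, 1, 4, 4, 5, 6] is in case 2, and one holding [1, 1, 2, 2, 2, 2] is in case 3.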

2.1 Resolving log inconsistencies

  The Leader handles inconsistencies by forcing each Follower's log to duplicate its own. This means that conflicting entries in a Follower's log are overwritten by entries from the Leader's log. To do this, the Leader must find the position where the Follower's log starts to conflict with its own, have the Follower delete every entry from that position onward, and then send the Follower its own entries from that position.
  The Leader never deletes or overwrites entries in its own log.

  These steps build on the consistency check described earlier.

  • When its log conflicts, the Follower rejects the AppendEntries RPC message sent by the Leader and returns a response telling the Leader about the conflict.
  • The Leader maintains a nextIndex value for each Follower: the index of the next log entry to send to that Follower. (When a server is elected Leader, it resets each value to the index of the last entry in its local log plus 1.)
  • When the Leader learns of the conflict, it decrements nextIndex and resends the AppendEntries RPC to that Follower. It repeats this until the Follower accepts the message.
  • Once the Follower accepts an AppendEntries RPC message, nextIndex tells the Leader where the conflict begins. The Follower then deletes its conflicting entries and appends the Leader's entries, which resolves the conflict.
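
  The back-off procedure above can be sketched as follows. This is a simplified, in-memory sketch under stated assumptions: follower_append and repair_follower are hypothetical names, logs are lists of (term, command) tuples, the RPC is a plain function call, and each message carries every Leader entry from nextIndex onward.

```python
def follower_append(log, prev_index, prev_term, entries):
    """Follower side: accept `entries` only if the check at prev_index passes."""
    if prev_index > len(log):
        return False                       # missing the preceding entry
    if prev_index > 0 and log[prev_index - 1][0] != prev_term:
        return False                       # preceding entry has another term
    for offset, entry in enumerate(entries):
        pos = prev_index + offset          # 0-based slot for this entry
        if pos < len(log) and log[pos][0] != entry[0]:
            del log[pos:]                  # conflict: drop it and all after it
        if pos >= len(log):
            log.append(entry)
    return True


def repair_follower(leader_log, follower_log):
    """Leader side: back nextIndex off until the Follower accepts.

    Returns the number of AppendEntries calls that were needed.
    """
    next_index = len(leader_log) + 1       # initial value after election
    rpcs = 0
    while True:
        prev_index = next_index - 1
        prev_term = leader_log[prev_index - 1][0] if prev_index > 0 else 0
        rpcs += 1
        if follower_append(follower_log, prev_index, prev_term,
                           leader_log[prev_index:]):
            return rpcs
        next_index -= 1                    # rejected: try one entry earlier
```

  The loop always terminates, because once nextIndex reaches 1 there is no preceding entry to check and the Follower accepts. The Leader's own log is only read, never modified.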

[Figure: servers S1 to S5 in cases a to e]

  • Case a: as shown, S1 is Leader in term 2 and crashes after sending the entry <index:2,term:2> only to server S2.
  • Case b: S5 is elected Leader in term 3 (S5's election timer expires first and it increments its term to 3, which is higher than the terms of S3 and S4, so it can be elected), but it crashes before it has time to send its log entry.
  • Case c: S1 is re-elected Leader in term 4 (after restarting, S1 is still in term 2; it receives a heartbeat from the new Leader S5 and updates its term to 3; after S5 crashes, S1's timer is the first to expire, so it starts an election with its term updated to 4, which is higher than the term of every other server in the network, and it is elected). S1 sends <index:2,term:2> to S2 and S3, but crashes before it notifies the servers that the entry is committed.
  • Case d (a -> d): if S1 crashes while Leader in term 2 and S5 is elected Leader in term 3, then because <index:2,term:2> has not been replicated on a majority of servers and has not been committed, S5 overwrites <index:2,term:2> with its own entry <index:2,term:3>.
  • Case e (a -> e): if instead S1, as Leader in term 2, sends <index:2,term:2> to S2 and S3, the entry is replicated on a majority of servers and is committed. Then even if S1 crashes, S5 cannot be elected Leader, because S5 does not have the latest committed entry in the network. (This is a requirement on electing a Leader that the previous article on Raft leader election did not cover.)

2.2 Log requirements for electing a Leader

  • Raft uses the voting process to prevent a Candidate from winning an election unless its log contains all committed entries.
  • A Candidate must contact a majority of the cluster to be elected, which means that every committed entry must be present on at least one of those servers. If the Candidate's log is at least as up to date as the log of every server in that majority ("up to date" is defined precisely in the next point), then it holds all committed entries.
  • Raft decides which of two logs is more up to date by comparing the index and term of the last entry in each log. If the last entries have different terms, the log whose last entry has the later term is more up to date. If the last entries have the same term, the log with the larger last index is more up to date.
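
  The comparison in the last point can be written as a small function. This is a sketch with a hypothetical name and signature; a voter would grant its vote only if the function returns True for the Candidate's last entry compared with its own.

```python
def at_least_as_up_to_date(cand_last_term, cand_last_index,
                           voter_last_term, voter_last_index):
    """Return True if the Candidate's log is at least as up to date as the voter's."""
    if cand_last_term != voter_last_term:
        # Different last terms: the later term wins, whatever the lengths.
        return cand_last_term > voter_last_term
    # Same last term: the log with the larger last index wins (ties are fine).
    return cand_last_index >= voter_last_index
```

  Note that a shorter log can still be more up to date: a Candidate whose last entry is <index:2,term:3> beats a voter whose last entry is <index:5,term:2>.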

  Optimization
  When a Follower rejects an AppendEntries RPC message, it can include in the rejection the term of the conflicting entry and the first index it stores for that term, so that the Leader can locate the conflict quickly. With this information, the Leader can decrement nextIndex far enough to skip all the conflicting entries in that term. One AppendEntries RPC message is then needed for each term that has conflicting entries, rather than one for each conflicting entry.
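
  A rough sketch of the Follower side of this optimization follows. The function name reject_hint is hypothetical, a log is reduced to a list of term numbers, and the handling of a Follower whose log is too short (returning None and the index just past the end of its log) is an assumption of this sketch that the text above does not specify.

```python
def reject_hint(log_terms, prev_index):
    """Follower side: describe the conflict at prev_index.

    `log_terms` is a list of term numbers; list position i holds the term
    of the entry at log index i + 1. Returns (conflict_term, first_index),
    where first_index is the first index stored for conflict_term.
    """
    if prev_index > len(log_terms):
        # Assumption of this sketch: log too short, point just past its end.
        return None, len(log_terms) + 1
    conflict_term = log_terms[prev_index - 1]
    first_index = prev_index
    # Walk back to the first entry that belongs to conflict_term.
    while first_index > 1 and log_terms[first_index - 2] == conflict_term:
        first_index -= 1
    return conflict_term, first_index
```

  On receiving the hint, the Leader can set nextIndex to first_index in one step. For a Follower whose entries have terms [1, 1, 2, 2, 2, 2], a rejection at prev_index 5 yields (2, 3), so the Leader moves straight to index 3 without retrying at each index in between.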

3 Log replication safety

Raft guarantees that each of the following properties holds at all times:

  • Leader Append-Only: a Leader never overwrites or deletes entries in its log; it only appends new ones.
  • Log Matching: if two logs contain an entry with the same index and term, then the logs are identical in all entries up through that index.
  • Leader Completeness: if a log entry is committed in a given term, then that entry will be present in the logs of the Leaders of all higher terms.
  • State Machine Safety: if a server has applied a log entry at a given index to its state machine, no other server will ever apply a different log entry for the same index.

3.1 Proof of Leader Completeness

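  Assume that Leader Completeness does not hold, then derive a contradiction.
  Suppose the Leader for term T commits a log entry from its own term, but that entry is not stored by the Leader of some future term U, where U is higher than T. Take U to be the smallest such term, so that every Leader of a term between T and U does store the entry.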

  1. The entry committed in term T must have been absent from the log of the term-U Leader at the time of its election (because a Leader never overwrites or deletes its log entries).
  2. The term-T Leader replicated the entry to the local logs of a majority of the cluster, and the term-U Leader received votes from a majority of the cluster during its election. So at least one member of the cluster (hereafter "the voter") both accepted the entry sent by the term-T Leader and voted for the term-U Leader. The voter is the key to reaching the contradiction.
  3. The voter must have accepted the committed entry from the term-T Leader before it voted for the term-U Leader. Otherwise it would have rejected the term-T Leader's AppendEntries RPC request (because once it has handled the vote request from the term-U Leader, the voter's term is higher than T).
  4. The voter still stored the entry when it voted for the term-U Leader. By assumption, every Leader of a term between T and U contained the entry; a Leader never deletes log entries, and a Follower deletes entries only when they conflict with the Leader's.
  5. The voter granted its vote to the term-U Leader, so the term-U Leader's log must have been at least as up to date as the voter's. This leads to one of two contradictions.
  6. First, suppose the last entries in the logs of the voter and the term-U Leader have the same term. Then the term-U Leader's log must have been at least as long as the voter's, so it contained every entry in the voter's log. This is a contradiction, because the voter contained the entry committed in term T and the term-U Leader was assumed not to.
  7. Otherwise, the term of the last entry in the term-U Leader's log must have been greater than the term of the last entry in the voter's log. It was also greater than T, because the voter's last log term was at least T (the voter contains the committed entry from term T). The earlier Leader that created the term-U Leader's last log entry must have contained the committed entry in its log (by assumption). Then, by the Log Matching Property, the term-U Leader's log must also contain the committed entry, which is a contradiction.
  8. This completes the contradiction, so the Leaders of all terms greater than T must contain every entry from term T that was committed in term T.
  9. The Log Matching Property guarantees that future Leaders will also contain entries that were committed indirectly.

Next article: Membership changes in the Raft algorithm


Origin www.cnblogs.com/cbkj-xd/p/12152222.html