Illustration of the distributed consensus protocol Paxos

Paxos is an important protocol (and algorithm) in distributed systems. How important is it?

From "Transaction Processing of Distributed Systems":

Mike Burrows, the author of Google Chubby, said that there is only one consensus algorithm in the world, and that is Paxos; all other algorithms are just broken versions of Paxos.

From "Large-Scale Distributed Storage Systems":

After understanding these two distributed protocols (Paxos/2PC), learning other distributed protocols becomes quite easy.

Learning the Paxos algorithm has two parts: a) the principle and proof of the algorithm; b) understanding how the algorithm operates.

Understanding how the algorithm operates is basically enough to use it in engineering practice, and the operating process is much easier to understand than the proof.

Of the material available online, I think the best write-ups on Paxos are "Paxos Diagram" and "Paxos Algorithm Detailed Explanation". Here I elaborate further on the example from the wiki. The basics of Paxos can be picked up from those two articles and the content on the wiki.

Algorithm content

The description of the algorithm in the original author's "Paxos Made Simple" is fairly concise:

Phase 1

(a) A proposer selects a proposal number n and sends a prepare request with number n to a majority of acceptors.

(b) If an acceptor receives a prepare request with number n greater than that of any prepare request to which it has already responded, then it responds to the request with a promise not to accept any more proposals numbered less than n and with the highest-numbered proposal (if any) that it has accepted.

Phase 2

(a) If the proposer receives a response to its prepare requests (numbered n) from a majority of acceptors, then it sends an accept request to each of those acceptors for a proposal numbered n with a value v, where v is the value of the highest-numbered proposal among the responses, or is any value if the responses reported no proposals.

(b) If an acceptor receives an accept request for a proposal numbered n, it accepts the proposal unless it has already responded to a prepare request having a number greater than n.

The flow chart borrowed from the "Paxos Diagram" article summarizes this as:

Examples and detailed explanations

Paxos has three roles: Proposer, Acceptor, and Learner. The main interaction is between Proposer and Acceptor.

Four main types of messages are exchanged between Proposer and Acceptor, as shown in the following figure:

These four message types correspond to the four steps in the two phases of the Paxos algorithm:

  • phase 1
    • a) The proposer sends a prepare message to more than half of the acceptors in the network
    • b) Under normal circumstances, the acceptor replies with a promise message
  • phase 2
    • a) Once enough acceptors have replied with a promise message, the proposer sends an accept message
    • b) Under normal circumstances, the acceptor replies with an accepted message
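As a rough sketch, the four message types and the "more than half" condition could be modeled as below. The field names (`n`, `accept_n`, `accept_v`, `v`) and the `is_majority` helper are assumptions made for illustration; the original text does not define message layouts.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Prepare:            # phase 1 a): proposer -> acceptor
    n: int                # proposal number N


@dataclass(frozen=True)
class Promise:            # phase 1 b): acceptor -> proposer
    n: int                # the N being promised
    accept_n: Optional[int] = None   # number of the proposal already accepted, if any
    accept_v: Optional[str] = None   # value of the proposal already accepted, if any


@dataclass(frozen=True)
class Accept:             # phase 2 a): proposer -> acceptor
    n: int
    v: str


@dataclass(frozen=True)
class Accepted:           # phase 2 b): acceptor -> proposer
    n: int
    v: str


def is_majority(replies: int, acceptors: int) -> bool:
    """True when strictly more than half of the acceptors have replied."""
    return replies > acceptors // 2
```

With 5 acceptors, `is_majority(3, 5)` holds while `is_majority(2, 5)` does not, so the proposer in this sketch would wait for a third promise before sending accept.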

Because other proposers may be making the same kind of request for the same decision during this process, each step has special cases, and handling them is exactly what achieves consistency. If no other proposer competes during the entire process, the proposal is certain to be chosen without objection. If other proposers are competing, the situation is different.

Take the example from the Chinese Paxos wiki page. Put simply, several council members propose tax rates, and the protocol determines the tax rate of the final bill.

The following figures mostly show only the interaction between a proposer and one acceptor. The time marker T2 always comes after T1. The proposal number is written as N.

One of the situations is as follows (A1 and A5 act as proposers here, A3 as an acceptor):

A3 sends accepted to A1 at T1, then receives A5's prepare at T2, and A1 notifies A5 of the final result (tax rate 10%) at T3. When A5's prepare arrives, there are two cases:

  • If the N5 sent by A5 is smaller than the N1 sent by A1, A3 rejects A5 directly
  • If the N5 sent by A5 is greater than the N1 sent by A1, A3 replies with a promise that carries A1's (N1, 10%)

This can be matched against the Paxos flow chart for better understanding. The acceptor records (MaxN, AcceptN, AcceptV): the largest N it has responded to, and the number and value of the proposal it has accepted.
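The two cases can be sketched with a small acceptor that keeps (MaxN, AcceptN, AcceptV). This is a minimal illustration, not a full implementation: the class name `Acceptor`, the method `on_prepare`, the tuple returned as a promise, and the concrete numbers (N1 = 3, N5 = 2 or 5) are all assumptions chosen for the example.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Acceptor:
    max_n: int = 0                   # MaxN: largest N responded to so far
    accept_n: Optional[int] = None   # AcceptN: number of the accepted proposal
    accept_v: Optional[str] = None   # AcceptV: value of the accepted proposal

    def on_prepare(self, n: int) -> Optional[Tuple[int, Optional[int], Optional[str]]]:
        """Return a promise (n, AcceptN, AcceptV), or None to reject."""
        if n <= self.max_n:
            return None              # N is not large enough: reject
        self.max_n = n               # promise not to accept anything below n
        return (n, self.accept_n, self.accept_v)


# A3 has already accepted A1's proposal (N1 = 3, 10%).
a3 = Acceptor(max_n=3, accept_n=3, accept_v="10%")
print(a3.on_prepare(2))  # N5 < N1: None (rejected)
print(a3.on_prepare(5))  # N5 > N1: (5, 3, '10%')
```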

After A5 receives the promise, the rest of the process proceeds normally. However, when A5 issues accept, because the promises it received carry (AcceptN, AcceptV), it must use the AcceptV that corresponds to the largest AcceptN. In the example, A1's 10% is used as AcceptV. If none of the promises carries a recorded AcceptV, the proposer can decide the value itself.
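The value selection rule can be written as a short function. This is a sketch under assumptions: each promise is reduced to an (AcceptN, AcceptV) pair, with `(None, None)` meaning the acceptor has not accepted anything, and the name `choose_value` is hypothetical.

```python
from typing import List, Optional, Tuple

# One (AcceptN, AcceptV) pair per promise; (None, None) if nothing was accepted.
PromiseInfo = Tuple[Optional[int], Optional[str]]


def choose_value(promises: List[PromiseInfo], own_value: str) -> str:
    """Pick the value the proposer must send in its accept message."""
    accepted = [(n, v) for n, v in promises if n is not None and v is not None]
    if not accepted:
        return own_value                      # free to propose its own value
    _, value = max(accepted, key=lambda p: p[0])
    return value                              # AcceptV of the largest AcceptN


# A5 wanted 20%, but one promise carries A1's (3, "10%").
print(choose_value([(None, None), (3, "10%"), (None, None)], "20%"))  # 10%
```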

Given the conflict between A1 and A5 above, the value finally accepted for both A1 and A5 is 10%.

In fact, among the four steps, the acceptor may need special handling at two points because of interference from other proposers: when replying with promise and when replying with accepted. So once you understand what happens when requests from other proposers arrive at these two points, you basically understand the entire algorithm. For example, a prepare may be rejected instead of answered with a promise because the N sent by the proposer is not large enough:

If, before sending the accepted message, the acceptor has already issued a promise to another proposer with a larger N, the accept issued by the original proposer is also rejected, as shown in the figure:

This corresponds to Phase 2 b):

it accepts the proposal unless it has already responded to a prepare request having a number greater than n.
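A minimal sketch of this rule follows. The `AcceptorState` dataclass, the `on_accept` function, and the numbers used are illustrative assumptions, not taken from the original articles.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AcceptorState:
    max_n: int = 0                   # largest N this acceptor has promised
    accept_n: Optional[int] = None   # number of the accepted proposal
    accept_v: Optional[str] = None   # value of the accepted proposal


def on_accept(state: AcceptorState, n: int, v: str) -> bool:
    """Accept (n, v) unless a prepare with a number greater than n was answered."""
    if n < state.max_n:
        return False                 # already promised a larger N: reject
    state.max_n = n
    state.accept_n = n
    state.accept_v = v
    return True


# The acceptor promised N = 5 to another proposer before this accept arrived.
state = AcceptorState(max_n=5)
print(on_accept(state, 3, "10%"))  # False: 3 < 5, rejected
print(on_accept(state, 5, "20%"))  # True: matches the promised N
```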

Summary

In "Paxos Made Simple", Leslie Lamport does not describe Paxos mathematically, but he makes it clear in plain English. Once the contents of the two phases are clearly understood, the entire algorithm is not complicated.

Paxos relies on proposal numbers that are globally unique and increasing. One way to implement them is quoted as follows:

How to generate a unique number? What is mentioned in "Paxos Made Simple" is to let all proposers choose from disjoint sets of numbers. For example, if there are 5 proposers in the system, each proposer can be assigned an identifier j (0 to 4), and the number of each proposal made by that proposer can be 5*i + j, where i is the number of times that proposer has made a proposal.
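The 5*i + j scheme can be checked with a few lines. The function name `proposal_number` and its `num_proposers` parameter are assumptions for this sketch.

```python
def proposal_number(i: int, j: int, num_proposers: int = 5) -> int:
    """Number of the i-th proposal (i = 0, 1, 2, ...) made by proposer j."""
    if not 0 <= j < num_proposers:
        raise ValueError("proposer id j must be in 0..num_proposers-1")
    if i < 0:
        raise ValueError("round counter i must be non-negative")
    return num_proposers * i + j


# Proposer 3 on its rounds 0, 1, 2:
print([proposal_number(i, 3) for i in range(3)])  # [3, 8, 13]
```

Because each proposer's numbers share the same remainder modulo 5, two different proposers can never produce the same number, and each proposer's own numbers increase with i.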
