Raft, Zab, and Paxos

2016-10-26 21:50 by bangerlee

Introduction

"Advanced Distributed System Theory - Paxos" introduced the consensus protocol Paxos. Today we look at two other common consensus protocols, Raft and Zab. Comparing them with Paxos helps us grasp the core ideas of Raft and Zab and deepens our understanding of consensus protocols in general.

 

Raft

Paxos leans toward theory and says little about how to apply it in engineering practice. Between the difficulty of understanding it and the harshness of real-world conditions, implementing a correct distributed system on top of Paxos in a production environment is very hard [1].

Raft [2][3] was proposed in 2013. Although it is still relatively new, many systems have already been built on it. Compared with Paxos, Raft's selling point is that it is easier to understand and easier to implement.

 

To be easier to understand and implement, Raft decomposes the problem and makes each part concrete: a single leader handles all change requests; the consensus protocol's job is to keep the copies of the operation log consistent across nodes (log replication); the term serves as a logical clock that guarantees ordering; and every node runs the same state machine [4], so all nodes arrive at the same result. The Raft protocol works as follows:

  1. The Client sends a request; each request carries one operation instruction
  2. The Leader handles the request: it appends the instruction (entry) to its operation log, then sends an AppendEntries request to the Followers so that each of them stores a copy of the log entry
  3. Once a majority (quorum) of nodes, counting the Leader itself, has accepted the AppendEntries request, the Leader commits the entry and hands the instruction to the state machine
  4. When the state machine finishes processing, the result is returned to the Client

Instructions are ordered by log index (the instruction id) and term number. Under normal conditions, the Leader's and Followers' state machines execute the same instructions in the same order, so they produce the same results and end in the same state.
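The four steps above can be sketched as a toy, single-process simulation. This is only an illustration under stated assumptions: the names Node, Entry, and handle_client_request are hypothetical, the "network" is a direct method call, commands are "key=value" strings, and elections, retries, and log conflicts are left out.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    term: int      # term in which the Leader received the instruction
    command: str   # operation instruction, here a toy "key=value" write


@dataclass
class Node:
    up: bool = True                             # False simulates an unreachable node
    log: list = field(default_factory=list)     # replicated operation log
    state: dict = field(default_factory=dict)   # toy key-value state machine
    applied: int = 0                            # number of log entries applied so far

    def append_entries(self, entries):
        """Follower side of AppendEntries (no conflict handling in this sketch)."""
        if not self.up:
            return False
        self.log.extend(entries)
        return True

    def apply_committed(self, commit_index):
        """Apply committed entries to the state machine, strictly in log order."""
        while self.applied < min(commit_index, len(self.log)):
            key, value = self.log[self.applied].command.split("=", 1)
            self.state[key] = value
            self.applied += 1


def handle_client_request(leader, followers, term, command):
    """Steps 1-4: append locally, replicate, commit on quorum, apply, reply."""
    entry = Entry(term, command)
    leader.log.append(entry)                                       # step 2: local append
    acks = 1 + sum(f.append_entries([entry]) for f in followers)   # Leader counts itself
    if acks <= (1 + len(followers)) // 2:                          # step 3: no majority
        return None                                                # entry stays uncommitted
    commit_index = len(leader.log)
    leader.apply_committed(commit_index)                           # step 3: commit and apply
    for f in followers:
        f.apply_committed(commit_index)                            # Followers learn the commit
    return dict(leader.state)                                      # step 4: reply to the Client
```

With one Leader and two Followers, the request still succeeds when one Follower is unreachable, because two of three nodes form a quorum; with both Followers unreachable it returns None.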

 

Crashes, network partitions, and similar failures can trigger a leader re-election (every election starts a new term) and leave the Leader's and Followers' logs inconsistent. In Raft, the Leader maintains a nextIndex value for each Follower: the id of the next instruction it will send to that Follower, initialized to the id just after the Leader's own last instruction. When a Follower rejects an AppendEntries request because its log does not match the Leader's log at the preceding entry, the two operation logs are inconsistent. The Leader then backs nextIndex up to the first point of inconsistency in the Follower's log and overwrites the Follower's log from there with its own entries, until the Follower's log matches the Leader's.
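A minimal sketch of this repair loop follows. It is an assumption-laden simplification, not Raft's full rule set: logs are plain lists of (term, command) tuples with 1-based indexes, the function names are hypothetical, the follower always truncates after the matching point, and the Leader steps back one entry per rejection.

```python
def append_entries(follower_log, prev_index, prev_term, entries):
    """Follower side: accept only if the entry at prev_index has term prev_term."""
    if prev_index > 0:
        too_short = len(follower_log) < prev_index
        if too_short or follower_log[prev_index - 1][0] != prev_term:
            return False                   # logs diverge at or before prev_index
    del follower_log[prev_index:]          # drop the follower's conflicting suffix
    follower_log.extend(entries)           # take the Leader's entries instead
    return True


def repair_follower(leader_log, follower_log):
    """Leader side: back next_index up until the follower accepts, then overwrite."""
    next_index = len(leader_log) + 1       # optimistic start: just past the Leader's log
    while True:
        prev_index = next_index - 1
        prev_term = leader_log[prev_index - 1][0] if prev_index > 0 else 0
        entries = leader_log[prev_index:]  # everything from next_index onward
        if append_entries(follower_log, prev_index, prev_term, entries):
            return len(leader_log) + 1     # the follower's next_index after success
        next_index -= 1                    # rejected: retry one entry earlier


leader_log = [(1, "a"), (1, "b"), (2, "c"), (3, "d")]
follower_log = [(1, "a"), (1, "b"), (1, "x"), (1, "y"), (1, "z")]
repair_follower(leader_log, follower_log)
# follower_log now equals leader_log: the stale term-1 suffix was overwritten
```

The loop always terminates, because at next_index 1 there is no preceding entry to check and the follower accepts the Leader's whole log.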

 

In Paxos, a leader exists only to make reaching decisions more efficient; whether a leader exists, and how many there are, does not affect the consistency of decisions. Raft requires a unique leader and recasts the consensus problem as keeping log replicas consistent, which gives it a goal that is easier to understand and easier to implement than Paxos.

 

Zab

Zab [5][6] stands for ZooKeeper atomic broadcast protocol; it is the consensus protocol used internally by ZooKeeper. Compared with Paxos, Zab's most distinctive feature is that it guarantees strong consistency (also called linearizable consistency).

 

Like Raft, Zab requires a unique Leader to take part in decisions. Zab can be decomposed into three phases: discovery, sync, and broadcast:

  • discovery : a PL (prospective leader) is elected; the PL collects each Follower's epoch (cepoch) and generates a new epoch from the Followers' feedback (a new epoch is generated every time a new leader is elected, similar to Raft's term)
  • sync : the PL fills in whatever state it is missing relative to a majority of Followers, then each Follower fills in whatever state it is missing relative to the PL. Once the PL and the Followers have synchronized their state, the PL becomes the official Leader (established leader)
  • broadcast : the Leader handles write operations from the Client and broadcasts each state change to the Followers. After a majority of Followers accept it, the Leader triggers the state change to take effect (deliver/commit)

The Leader and the Followers monitor each other's health through heartbeats. Under normal conditions Zab stays in the broadcast phase. When something goes wrong, such as a Leader crash or a network partition, Zab returns to the discovery phase.
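The phase cycle can be sketched as a small state machine. This is a rough illustration only: Phase, propose_epoch, and next_phase are hypothetical names, the reported epochs are assumed to include the PL's own, and choosing "largest reported epoch plus one" is one simple way to pick an epoch above everything the quorum has seen.

```python
from enum import Enum


class Phase(Enum):
    DISCOVERY = 1
    SYNC = 2
    BROADCAST = 3


def propose_epoch(reported_epochs, cluster_size):
    """Discovery: pick an epoch above every cepoch reported by a quorum."""
    if len(reported_epochs) <= cluster_size // 2:
        return None                      # no quorum yet, stay in discovery
    return max(reported_epochs) + 1


def next_phase(phase, ok):
    """Advance on success; any failure (e.g. missed heartbeats) restarts discovery."""
    if not ok:
        return Phase.DISCOVERY
    if phase is Phase.DISCOVERY:
        return Phase.SYNC
    return Phase.BROADCAST               # sync -> broadcast; broadcast stays put
```

For example, in a five-node cluster where three nodes report epochs 3, 5, and 4, the PL would propose epoch 6; with only two reports it has no quorum and stays in discovery.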

 

Now that we understand the basic principles of Zab, let's look at how it ensures strong consistency. Zab does so by constraining the order of transactions: a transaction that is broadcast first is committed first, that is, FIFO. Zab calls this primary order (PO below). The key to implementing PO is the zxid.

 

Every transaction in Zab has a zxid made of two parts, <e, c>: e is the epoch produced by leader election, and c is the transaction's sequence number within that epoch, which increases monotonically. Given two transactions with zxids z and z', z is defined to occur before z' (z < z') when z.e < z'.e, or z.e = z'.e and z.c < z'.c.
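This ordering is exactly a lexicographic comparison on the pair (e, c), which a short sketch makes concrete. The Zxid class and its pack method are illustrative names; the 64-bit layout (epoch in the high 32 bits, counter in the low 32 bits) is how ZooKeeper's implementation is generally described, and is assumed here rather than taken from the text above.

```python
from typing import NamedTuple


class Zxid(NamedTuple):
    e: int  # epoch, produced by leader election
    c: int  # sequence number of the transaction within the epoch

    def pack(self) -> int:
        """Encode as one 64-bit integer: epoch in the high half, counter in the low."""
        return (self.e << 32) | self.c


# Named tuples compare field by field, so "<" matches the definition of z < z'.
older_epoch = Zxid(e=1, c=9)
newer_epoch = Zxid(e=2, c=0)
same_epoch_later = Zxid(e=2, c=1)
```

Packing preserves the order as long as the counter fits in 32 bits, so a single integer comparison is enough to decide which transaction comes first.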

 

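To implement PO, Zab places the following constraints on Followers and Leaders:

  1. Given transactions z and z', if the Leader broadcasts z first, a Follower must commit z first
  2. Given transactions z and z', where z is broadcast by Leader p and z' by Leader q, and Leader p precedes Leader q, a Follower must commit z first
  3. Given transactions z and z', where z is broadcast by Leader p and z' by Leader q, and Leader p precedes Leader q, if a Follower has already committed z, then q must have committed z before it may broadcast z'

Points 1 and 2 guarantee that transactions are FIFO; point 3 guarantees that the Leader holds every transaction that has already been committed.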
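These constraints can be written as two small checks. This is a simplified sketch with hypothetical function names: zxids are (e, c) tuples, a commit log is the list of zxids a node has committed in order, and an earlier Leader is assumed to have a smaller epoch, so constraints 1 and 2 both reduce to "commits appear in increasing zxid order".

```python
def commits_in_primary_order(commit_log):
    """Constraints 1 and 2: a Follower's commits are in increasing zxid order."""
    return all(a < b for a, b in zip(commit_log, commit_log[1:]))


def may_broadcast(new_leader_commits, follower_commit_logs):
    """Constraint 3: the new Leader already holds every committed transaction."""
    have = set(new_leader_commits)
    return all(z in have for log in follower_commit_logs for z in log)


follower = [(1, 1), (1, 2), (2, 1)]      # epoch 1 by Leader p, epoch 2 by Leader q
stale_leader = [(1, 1)]                  # is missing the committed (1, 2)
caught_up_leader = [(1, 1), (1, 2)]
```

In this example the Follower's log satisfies primary order, stale_leader may not start broadcasting because it lacks a committed transaction, and caught_up_leader may.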

 

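Compared with Paxos, Zab constrains the order of transactions, which makes it suitable for scenarios that require strong consistency.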

 

Paxos, Raft, and Zab compared again

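Besides Paxos, Raft, and Zab, Viewstamped Replication (VR) [7][8] is another widely discussed consensus protocol. These protocols have a lot in common (Leader, quorum, state machine, and so on), so we cannot help asking: where exactly do distributed consensus protocols such as Paxos, Raft, Zab, and VR differ, or are they fundamentally the same thing? [9]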

 

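Paxos, Raft, Zab, and VR are all protocols for solving the consensus problem. The original Paxos paper leans toward theory, while Raft, Zab, and VR lean toward practice; differences in the strength of the consistency guarantee, among other things, also set these protocols apart. The figure in [10] helps us understand their similarities and differences.

Compared with Raft, Zab, and VR, Paxos is purer and closer to the root of the consensus problem. Although Paxos leans toward theory, that does not mean it cannot be used in engineering. An engineering practice based on Paxos must consider the concrete requirements and scenario (for example, how strong the consistency needs to be) and then build a wrapper on top of the original Paxos semantics.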

 

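Summary

This article introduced the core ideas of the distributed consensus protocols Raft and Zab and analyzed their similarities to and differences from Paxos. When building a distributed system, start from the concrete requirements and scenario: protocols such as Raft, Zab, VR, and Paxos are not good or bad in absolute terms, only more or less suitable.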

 

[1] Paxos Made Live - An Engineering Perspective, Tushar Chandra, Robert Griesemer and Joshua Redstone, 2007

[2] In Search of an Understandable Consensus Algorithm, Diego Ongaro and John Ousterhout, 2013

[3] In Search of an Understandable Consensus Algorithm (Extended Version), Diego Ongaro and John Ousterhout, 2013

[4] Implementing Fault-Tolerant Services Using the State Machine Approach: A Tutorial, Fred B. Schneider, 1990

[5] Zab: High-performance broadcast for primary-backup systems, Flavio P. Junqueira, Benjamin C. Reed and Marco Serafini, 2011

[6] ZooKeeper's atomic broadcast protocol: Theory and practice, André Medeiros, 2012

[7] Viewstamped Replication: A New Primary Copy Method to Support Highly-Available Distributed Systems, Brian M. Oki and Barbara H. Liskov, 1988

[8] Viewstamped Replication Revisited, Barbara Liskov and James Cowling, 2012

[9] Can’t we all just agree?, The Morning Paper, 2015

[10] Vive La Difference: Paxos vs. Viewstamped Replication vs. Zab, Robbert van Renesse, Nicolas Schiper and Fred B. Schneider, 2014

 

http://www.cnblogs.com/bangerlee/p/5991417.html
