Understand the Raft algorithm in 10 minutes

Distributed systems greatly improve availability and fault tolerance, but at the same time they bring a consistency problem (the CAP theorem). The Raft algorithm solves the consistency problem in distributed environments.

etcd, which many of us know as a service registry, uses this algorithm; and the WeChat official account article you are reading right now is stored on a highly available storage service built on the Raft algorithm.

If you do not have the patience to read the text, skip straight to the fourth section.

First, what is the Raft algorithm?

In the past, Paxos was the standard protocol for distributed consensus, but Paxos is hard to understand and even harder to implement; Google's Chubby distributed lock system ran into many pitfalls when implementing Paxos. Later, Stanford University proposed the Raft algorithm.

Raft is a consensus algorithm for managing a replicated log. It produces a result equivalent to (Multi-)Paxos and is as efficient as Paxos, but its structure is different. This makes Raft easier to understand than Paxos and provides a better foundation for building practical systems.

The figure comes from "In Search of an Understandable Consensus Algorithm" by Diego Ongaro and John Ousterhout of Stanford University (the paper that introduced Raft), and shows the Raft learning curve plotted from experimental data. The subjects were undergraduate and graduate students at Stanford University and the University of California, Berkeley. Even these bright students found Paxos difficult. So it is quite normal for most people not to understand Paxos, and not surprising if the principles of Raft do not click right away either.

 

Second, what is consensus?

Consensus is a fundamental problem in fault-tolerant distributed systems. It involves multiple servers agreeing on values. Once they reach a decision on a value, that decision is final. A typical consensus algorithm makes progress as long as a majority of the servers are available. For example, a cluster of five servers can keep running even if two servers fail. If more servers fail, the cluster stops making progress (but it never returns an incorrect result).
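
The majority arithmetic behind the five-server example can be sketched in a few lines of Python. The function names below are made up for illustration; only the rule itself (progress requires a majority of servers) comes from the text above.

```python
def majority(cluster_size: int) -> int:
    """Smallest number of servers that forms a majority."""
    return cluster_size // 2 + 1


def tolerated_failures(cluster_size: int) -> int:
    """How many servers may fail while a majority is still available."""
    return cluster_size - majority(cluster_size)


def can_make_progress(cluster_size: int, failed: int) -> bool:
    """True if the servers still running form a majority."""
    return cluster_size - failed >= majority(cluster_size)


# Print cluster size, majority size, and tolerated failures for a few sizes.
for n in (3, 5, 7):
    print(n, majority(n), tolerated_failures(n))
```

With five servers the majority is three, so two failures are tolerated and a third failure stops progress. An even-sized cluster gains nothing over the next smaller odd size: four servers still tolerate only one failure.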

Third, the Raft algorithm

The paper presents the Raft algorithm in six parts. Getting the general idea is enough here; later in this article there is an interactive page you can play with, which helps in understanding the algorithm.

1, Raft basics

A Raft cluster contains several servers; five is a typical number, which allows the system to tolerate two failures. At any given time, each server is in one of three states: Leader, Follower, or Candidate. Servers can move from one state to another.

Leader: handles all client interaction and log replication; normally there is only one Leader

Follower: similar to a voter; completely passive

Candidate: similar to a proposer; may be elected as the new Leader
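
As a rough sketch, the three states and the moves between them can be written as a small lookup table. The transitions follow the state diagram in the Raft paper; the event names are invented for this example, and the "higher term" event refers to the paper's term concept, which this article does not cover.

```python
from enum import Enum


class State(Enum):
    FOLLOWER = "follower"
    CANDIDATE = "candidate"
    LEADER = "leader"


# Allowed transitions, keyed by (current state, event).
# The event names are hypothetical labels chosen for this sketch.
TRANSITIONS = {
    (State.FOLLOWER, "election_timeout"): State.CANDIDATE,
    (State.CANDIDATE, "election_timeout"): State.CANDIDATE,  # start a new election
    (State.CANDIDATE, "won_majority"): State.LEADER,
    (State.CANDIDATE, "discovered_leader"): State.FOLLOWER,
    (State.LEADER, "discovered_higher_term"): State.FOLLOWER,
}


def next_state(current: State, event: str) -> State:
    """Return the new state, or stay in the current one if the event does not apply."""
    return TRANSITIONS.get((current, event), current)
```

Note that a Follower never becomes Leader directly: it always passes through the Candidate state first.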

 

2, Leader election

Raft uses a heartbeat mechanism to trigger leader election. When servers start up, they begin as Followers. A server stays in the Follower state as long as it receives valid RPC requests from a Leader or a Candidate. The Leader sends periodic heartbeats (AppendEntries RPCs that carry no log entries) to all Followers to maintain its authority. If a Follower receives no communication for a period of time called the election timeout, it assumes there is no viable Leader and starts an election to choose a new one.
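
The timeout rule can be illustrated with a toy server that tracks when it last heard from a Leader or Candidate. This is only a sketch under simplifying assumptions: the class and method names are hypothetical, time is passed in as plain millisecond integers, and the timeout is a fixed value (the paper picks it randomly per server).

```python
from dataclasses import dataclass


@dataclass
class Server:
    election_timeout_ms: int      # hypothetical value, chosen by the caller
    last_contact_ms: int = 0      # time of the last valid RPC from a Leader or Candidate
    state: str = "follower"       # every server starts as a Follower

    def on_valid_rpc(self, now_ms: int) -> None:
        """A heartbeat (or any other valid RPC) resets the election timer."""
        self.last_contact_ms = now_ms

    def tick(self, now_ms: int) -> None:
        """Called periodically; become a Candidate once the timeout has elapsed."""
        silent_for = now_ms - self.last_contact_ms
        if self.state == "follower" and silent_for >= self.election_timeout_ms:
            self.state = "candidate"


server = Server(election_timeout_ms=300)
server.on_valid_rpc(now_ms=1000)  # a heartbeat arrives
server.tick(now_ms=1200)          # 200 ms of silence: still a Follower
print(server.state)
server.tick(now_ms=1300)          # 300 ms of silence: starts an election
print(server.state)
```

As long as heartbeats keep arriving more often than the timeout, tick never fires and the server stays a Follower.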

3, Log replication

Once a Leader has been elected, it begins serving client requests. Each client request contains a command to be executed by the replicated state machines. The Leader appends the command to its log as a new entry, then sends AppendEntries RPCs in parallel to each of the other servers to replicate the entry. When the entry has been safely replicated, the Leader applies it to its state machine and returns the result of the execution to the client. If a Follower crashes or runs slowly, or if network packets are lost, the Leader retries the AppendEntries RPC indefinitely (even after it has responded to the client) until all Followers eventually store all log entries. (The page described later has a request command in its menu, which imitates a client request.)
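
The replication flow can be sketched as a toy Leader that counts acknowledgements per log entry. This is a simplified illustration, not the paper's full protocol: it assumes "safely replicated" means stored on a majority of servers, uses a key/value dictionary as the state machine, and leaves out terms, log consistency checks, and retries. All names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Leader:
    cluster_size: int
    log: list = field(default_factory=list)    # commands, in order
    acks: dict = field(default_factory=dict)   # log index -> ids of servers storing it
    commit_index: int = -1                     # highest index safely replicated so far
    state: dict = field(default_factory=dict)  # toy state machine: a key/value store

    def client_request(self, key: str, value: str) -> int:
        """Append the command to the Leader's own log and return its index."""
        self.log.append((key, value))
        index = len(self.log) - 1
        self.acks[index] = {"leader"}          # the Leader stores the entry itself
        return index

    def on_append_ok(self, follower_id: str, index: int) -> None:
        """Record a Follower's acknowledgement, then apply what a majority stores."""
        self.acks[index].add(follower_id)
        majority = self.cluster_size // 2 + 1
        while (self.commit_index + 1 < len(self.log)
               and len(self.acks[self.commit_index + 1]) >= majority):
            self.commit_index += 1
            key, value = self.log[self.commit_index]
            self.state[key] = value            # apply the entry to the state machine


leader = Leader(cluster_size=5)
idx = leader.client_request("x", "1")
leader.on_append_ok("s2", idx)   # 2 of 5 servers store the entry: not yet applied
leader.on_append_ok("s3", idx)   # 3 of 5 servers store the entry: applied
print(leader.state)
```

The two remaining Followers can acknowledge later without delaying the client, which is why a slow server does not hold up the response.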

In addition to the three points above, the paper also describes three more topics in detail: safety, Follower and Candidate crashes, and timing and availability.

Fourth, visualizing the Raft algorithm

There is a page on GitHub that helps you understand the algorithm; the address is https://raft.github.io/raftscope/index.html

It is best opened in a desktop browser; if you open it inside WeChat on a phone, you need to choose "visit the original page".

I took a screenshot of it running: the left side shows five servers, and the right side shows the logs.

 

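Right-clicking a server icon brings up an action menu. The menu actions correspond to state changes of the server node; among them, request simulates a client asking the server cluster to perform a task, which produces log entries on the right.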

 

Play with it for a while, and you will surely understand how the Raft algorithm works!

Fifth, summary

The Raft algorithm offers strong consistency, high reliability, high availability, and other advantages. Specifically:

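Strong consistency: although the data on all nodes is not consistent in real time, Raft guarantees that the Leader's data is the most complete, and all requests are handled by the Leader, so from the client's point of view the system is strongly consistent.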

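High reliability: Raft guarantees that committed log entries are never modified, and the state machine applies only committed entries, so once a client receives a success response the data will no longer change. Committed entries are stored redundantly on a majority of nodes, so no data is lost as long as fewer than half of the disks fail.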

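High availability: as the principles of Raft show, both election and log synchronization only need a majority of nodes to be connected and working, so a small number of node failures or network problems does not affect the availability of the system. Even if the Leader fails, the cluster elects a new Leader on its own once the election timeout expires, with no manual intervention, so the period of unavailability is very short. However, a Leader failure can lead to duplicate data, which the application must handle through deduplication or idempotent operations.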

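High performance: compared with algorithms that must write data to all nodes before reporting success to the client, Raft needs only a majority of nodes to succeed, so a few slow nodes do not hold back the whole system.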
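
The duplicate-data caveat in the high availability point can be handled in the application. One common approach, sketched below, is to tag each request with a unique ID and apply it at most once. This is an illustrative assumption rather than something the article prescribes: the class, the request IDs, and the counter-style state are all hypothetical.

```python
class DedupStateMachine:
    """Toy counter that applies each client request at most once."""

    def __init__(self) -> None:
        self.total = 0
        self.results = {}  # request_id -> result already returned for that request

    def apply(self, request_id: str, amount: int) -> int:
        """Add amount to the total, unless this request was already applied."""
        if request_id in self.results:
            return self.results[request_id]  # duplicate: return the earlier result
        self.total += amount
        self.results[request_id] = self.total
        return self.total


sm = DedupStateMachine()
sm.apply("req-1", 10)
sm.apply("req-1", 10)  # the client retried after a Leader failure
print(sm.total)
```

Because the retried request carries the same ID, it is recognized and skipped, and the client gets the same result it would have received the first time.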

Origin www.cnblogs.com/ExMan/p/11945708.html