java framework - framework 1ZooKeeper Zookeeper distributed services functions and works

https://www.cnblogs.com/felixzh/p/5869212.html

Zookeeper functions and works

 

What 1.ZooKeeper that?
ZooKeeper is a distributed, open-source coordination service for distributed applications, Google's Chubby is an open source implementation, it is a cluster manager, monitors the status of each node in the cluster reasonable next step according to the feedback node submitted operating. Finally, the easy to use interface and efficient performance, function and stability of the system available to users

2.ZooKeeper provides what?

1) File System

2) notification mechanism

3.Zookeeper file system

each subdirectory entry as NameService are referred to as znode, and file systems, we are able to freely add, delete znode, increases at a znode, delete sub znode, the only znode difference is that the data is stored. 

There are four types of znode: 

. 1, PERSISTENT- directory node persistence 

after zookeeper client is disconnected, the node is still present 

2, PERSISTENT_SEQUENTIAL- persistent directory node sequence number 

after zookeeper client disconnects, the node is still exists, but Zookeeper numbered sequentially to the node name 

3, EPHEMERAL- temporary directory node 

after zookeeper client is disconnected, the node is removed 

4, EPHEMERAL_SEQUENTIAL- sequence number temporary directory node 

client is disconnected after zookeeper, the node is deleted, but Zookeeper numbered sequentially to the node name 

<ignore_js_op>  

4.Zookeeper notification mechanism

The client registers its concern to monitor directory node, when the node directory changes (change data is deleted, delete the subdirectory node increases), zookeeper will notify the client.

5.Zookeeper did what?

1. 2. Naming Service Cluster Management Configuration Management 3. 4. 5. Distributed Lock queue management

6.Zookeeper naming service

to create a directory in the zookeeper's file system, that is the only path. When we deploy the machine can not be determined using tborg upstream program can be a good agreement with the downstream path program, through path that is able to explore mutual discovery.

7.Zookeeper configuration management

program always requires configuration, if the program dispersed deployed on multiple machines, one by one to change the configuration becomes difficult. Now put all of these configurations zookeeper put up, is stored in a directory node Zookeeper, and then all relevant applications to the directory node to monitor, once the configuration information changes, each application will be notified of Zookeeper, Zookeeper then obtained from the new configuration information to the system like

<ignore_js_op>  

8.Zookeeper cluster management

so-called cluster management without care about two things: is there a machine quit and join the electoral master. 

On the first point, all the machines agreed to create a temporary directory node in the parent directory GroupMembers, then listens for the child node change message parent directory node. Once the machine hang up, the machine is connected to the zookeeper's off, it creates a temporary directory node is removed, all other machines have notified: a sibling directory is removed, so everyone knows: it on board a.

The new machine is similar to join, all machines receive notification: new sibling to join, highcount there has been, for the second point, we change a little bit, all the machines in order to create a temporary directory node number, select the smallest number each time the machine as a master just fine.

<ignore_js_op>  

9.Zookeeper distributed lock

file system consistency with the zookeeper, lock problem easily. Lock service can be divided into two categories, one is to maintain exclusivity, the other is to control the timing. 

For the first category, we will znode on a zookeeper seen as a lock, which is achieved by createznode way. All clients to create / distribute_lock nodes, that create the ultimate success of the client that is owned the lock. Run delete distribute_lock node that you create will release the lock. 

For the second category, / distribute_lock already pre-existing, all clients to create a temporary directory node sequentially numbered under it, and choose the same master, the lowest-numbered lock was exhausted deleted, and convenient.

<ignore_js_op>  

10.Zookeeper queue management

are two types of queues:

1, synchronous queue, when they gathered all the members of a queue, the queue is available, otherwise it has been waiting for all members to arrive. 

2, the queue enqueue and dequeue operation in FIFO mode. 

The first category, create a temporary directory node in the directory under the agreement, whether the number is the number of nodes monitor our requirements. 

The second type, and a distributed lock service control sequence consistent with the basic principles of the scene, the numbered columns, the columns by number.

11. distributed data replication 

Zookeeper as a cluster to provide consistent data services, naturally, it is to do data replication across all machines. Data replication benefits: 

1, fault tolerance: a node error to warrant the whole system to stop working, the other node can take over its work; 

2, increase system scalability: the load is distributed across multiple nodes, or nodes increases to increase the load capacity of the system; 

3, improve performance: the client to the nearest local access node, improve user access speed. 

Read and write access from the client point of view of transparency, data replication cluster following two points: 

1, the write master (WriteMaster): modification of the data submitted to the specified node. Read no such limitation, any node can read. The client in this case is required to read and write the difference, commonly known as separate read and write; 

2, any write (Write Any): modification of data may be submitted to any node, repeat the same. In this case, the client and the role of changes in the cluster nodes and transparent.

For zookeeper, the way it is used in writing any. By increasing the machine, its read throughput and responsiveness scalability is very good, and writing, with the increase in throughput capacity of the machine certainly decline (this is also the reason it established the observer), and the ability to respond will depend on the specific implementation, is delay final copy to maintain consistency, or copy quickly respond immediately.

12.Zookeeper Role Description

<ignore_js_op>  

13.Zookeeper client

<ignore_js_op>  

14.Zookeeper designed to

1. The final consistency: client regardless of which connect to the Server, it is presented to the same view, which is the most important zookeeper performance. 

2. Reliability: simple, robust, good performance, if the message is accepted to a server, it will be acceptable to all servers. 

3. real-time: Zookeeper ensure that clients will receive updated information server within a range of time intervals, or the information server failure. However, due to network delay and other reasons, Zookeeper not guarantee that two clients can simultaneously receive data just updated, the latest data if needed, should call sync before reading data () interface. 

4. Wait independent (wait-free): slow or failure of the client may not be fast intervention request client, so that each client can effectively wait. 

5. Atomicity: Update only success or failure, no intermediate state. 

6. order: the partial order and includes a global ordered two types: global order means that if a message before the message is posted on a server b, a message will be posted on the message before all b Server; a partial order means that if the same message b Post a message sender after, ahead of a bound b. 

15.Zookeeper works

core Zookeeper is atomic broadcast, this mechanism ensures synchronization between the various Server. Protocol Implementation of this mechanism is called Zab agreement. Zab protocol has two modes, which are recovery mode (selected from the master) and broadcast mode (synchronous). When the service starts or after leader crashes, Zab entered recovery mode, when the leader is elected, and most finished after Server synchronization and leader of the state, recovery mode is over. State synchronization ensures the leader and Server have the same system state. 

In order to ensure consistency of the order of affairs, zookeeper using increasing transaction id number (zxid) to identify the transaction. All proposals (proposal) are proposed to be added when zxid. Implementation zxid is a 64-bit number that is high 32 epoch used to identify the relationship between the leader has changed every time a leader is elected, it will have a new epoch, the current logo belonging to the reign of the leader. 32 for the low counts.

16.Zookeeper under Server operating state

of each Server has three states in the course of their work: 

LOOKING: Server does not know who the current leader is searching
LEADING: Current Server is the elected leader
FOLLOWING: leader has been elected, the current Server to synchronize

17.Zookeeper primary election process (basic paxos)

when a new leader leader leader crashes or loss of most of the follower, this time zk into recovery mode, recovery mode need to re-elect , so that all Server are restored to a proper state. Zk election algorithm, there are two: one is based on the realization of basic paxos, the other is based on fast paxos algorithm implemented. The default election algorithm is fast paxos.

1. Election of thread held by the current thread Server initiated election, its main function is to vote for statistical and elected recommended Server; 

2. Election of threads to launch a first inquiry (including myself) to all Server; 

3. Election Thread after receiving the reply, verify launched their own inquiry (verification zxid are the same), and then get the other side of the id (myid), and stored in the current inquiry object list, and finally get the other party's leader proposed the relevant information (id, zxid) and stores the information to the voting record of the table when the election; 

4. Server reply is received after all, it is calculated that the maximum zxid Server, Server and related information to set the next time you want to vote Server; 

5. Thread the current zxid largest Server to Server to the current recommended Leader, if at this time to get the winning Server Server votes n / 2 + 1, and set the current recommendation for the leader of the winning Server, based on the relevant winning Server information set their own state, otherwise, continue this process until the leader is elected. By process analysis we can conclude: To obtain the support of the majority Leader Server, the Server total number must be odd 2n + 1, and the number of surviving Server is not less than n + 1 after each start Server will repeat the above process. In recovery mode, if it is just a state of collapse or just restore the server will start restoring data and session information from the disk snapshot, zk will record the transaction log and regular snapshots facilitate state recovery in recovery. As shown in a detailed flow chart of the main selected from: 

<ignore_js_op>  

18.Zookeeper selected from the main flow (FAST Paxos)

FAST Paxos processes in the electoral process, first proposed a Server to Server all they have to be Leader, When Server receives a proposal other , conflict resolution and zxid epoch, and to accept the proposal of the other party, and then send the message to accept the proposal to complete the other side and repeat the process, will ultimately have to be elected Leader.

<ignore_js_op>  

19.Zookeeper synchronization process

is selected after completion Leader, zk enters a state synchronization process. 

1. Leader wait server connection; 

2 .Follower connection leader, the maximum sent to zxid Leader; 

. 3 .Leader synchronization point is determined according to the follower zxid; 

.. 4 after the synchronization has become a follower uptodate status notification; 

After 5 .Follower receive uptodate news, but also to re-accept the request of servicing the client.

<ignore_js_op>  

20.Zookeeper workflow -Leader

. data recovery. 1; 

2 Learner maintaining heartbeat, receives the request and determines Learner Learner request message type; 

. 3 .Learner message types are PING message, the REQUEST message, ACK message , revalidate messages, according to different message types, different processing is performed. 

PING message refers heartbeat Learner's;

the REQUEST message is an offer information Follower transmitted, including a write request and the synchronization request;

the ACK message is Follower of the proposed reply, more than half of Follower through, then commit the proposal;

revalidate message is SESSION to extend the effective time.

<ignore_js_op>  

21.Zookeeper workflow -Follower

Follower has four main functions: 

1. Leader sends a request to (the PING message, REQUEST message, ACK message, REVALIDATE message); 

2. Leader message received and processed; 

3. a Client request, if a write request is sent to the Leader vote;

4. Client returns the result. 


Follower message loop processing are several messages from the Leader: 

. 1 .PING message: Heartbeat message; 

2 .PROPOSAL message: Leader initiated the proposal requires Follower vote; 

. 3 .COMMIT message: a server information about the latest proposals; 

. 4 .UPTODATE message: indicates that synchronization is complete; 

. 5 .REVALIDATE message: The REVALIDATE result of Leader, to be closed revalidate the session or allow him to receive the message; 

6 .sync messages: SYNC return results to the client, the news was originally initiated by the client, to force to get the latest updates.

<ignore_js_op>  

Well, that's my understanding of the zookeeper, and then I will continue to update you as new technologies please look forward to it! ! !

1.ZooKeeper是什么?
ZooKeeper是一个分布式的,开放源码的分布式应用程序协调服务,是Google的Chubby一个开源的实现,它是集群的管理者,监视着集群中各个节点的状态根据节点提交的反馈进行下一步合理操作。最终,将简单易用的接口和性能高效、功能稳定的系统提供给用户

2.ZooKeeper提供了什么?

1)文件系统

2)通知机制

3.Zookeeper文件系统

每个子目录项如 NameService 都被称作为znode,和文件系统一样,我们能够自由的增加、删除znode,在一个znode下增加、删除子znode,唯一的不同在于znode是可以存储数据的。 

有四种类型的znode: 

1、PERSISTENT-持久化目录节点 

客户端与zookeeper断开连接后,该节点依旧存在 

2、PERSISTENT_SEQUENTIAL-持久化顺序编号目录节点 

客户端与zookeeper断开连接后,该节点依旧存在,只是Zookeeper给该节点名称进行顺序编号 

3、EPHEMERAL-临时目录节点 

客户端与zookeeper断开连接后,该节点被删除 

4、EPHEMERAL_SEQUENTIAL-临时顺序编号目录节点 

客户端与zookeeper断开连接后,该节点被删除,只是Zookeeper给该节点名称进行顺序编号 

<ignore_js_op> 

4.Zookeeper通知机制

客户端注册监听它关心的目录节点,当目录节点发生变化(数据改变、被删除、子目录节点增加删除)时,zookeeper会通知客户端。

5.Zookeeper做了什么?

1.命名服务   2.配置管理   3.集群管理   4.分布式锁  5.队列管理

6.Zookeeper命名服务

在zookeeper的文件系统里创建一个目录,即有唯一的path。在我们使用tborg无法确定上游程序的部署机器时即可与下游程序约定好path,通过path即能互相探索发现。

7.Zookeeper的配置管理

程序总是需要配置的,如果程序分散部署在多台机器上,要逐个改变配置就变得困难。现在把这些配置全部放到zookeeper上去,保存在 Zookeeper 的某个目录节点中,然后所有相关应用程序对这个目录节点进行监听,一旦配置信息发生变化,每个应用程序就会收到 Zookeeper 的通知,然后从 Zookeeper 获取新的配置信息应用到系统中就好

<ignore_js_op> 

8.Zookeeper集群管理

所谓集群管理无在乎两点:是否有机器退出和加入、选举master。 

对于第一点,所有机器约定在父目录GroupMembers下创建临时目录节点,然后监听父目录节点的子节点变化消息。一旦有机器挂掉,该机器与 zookeeper的连接断开,其所创建的临时目录节点被删除,所有其他机器都收到通知:某个兄弟目录被删除,于是,所有人都知道:它上船了。

新机器加入也是类似,所有机器收到通知:新兄弟目录加入,highcount又有了,对于第二点,我们稍微改变一下,所有机器创建临时顺序编号目录节点,每次选取编号最小的机器作为master就好。

<ignore_js_op> 

9.Zookeeper分布式锁

有了zookeeper的一致性文件系统,锁的问题变得容易。锁服务可以分为两类,一个是保持独占,另一个是控制时序。 

对于第一类,我们将zookeeper上的一个znode看作是一把锁,通过createznode的方式来实现。所有客户端都去创建 /distribute_lock 节点,最终成功创建的那个客户端也即拥有了这把锁。用完删除掉自己创建的distribute_lock 节点就释放出锁。 

对于第二类, /distribute_lock 已经预先存在,所有客户端在它下面创建临时顺序编号目录节点,和选master一样,编号最小的获得锁,用完删除,依次方便。

<ignore_js_op> 

10.Zookeeper队列管理

两种类型的队列:

1、同步队列,当一个队列的成员都聚齐时,这个队列才可用,否则一直等待所有成员到达。 

2、队列按照 FIFO 方式进行入队和出队操作。 

第一类,在约定目录下创建临时目录节点,监听节点数目是否是我们要求的数目。 

第二类,和分布式锁服务中的控制时序场景基本原理一致,入列有编号,出列按编号。

11.分布式与数据复制 

Zookeeper作为一个集群提供一致的数据服务,自然,它要在所有机器间做数据复制。数据复制的好处: 

1、容错:一个节点出错,不致于让整个系统停止工作,别的节点可以接管它的工作; 

2、提高系统的扩展能力 :把负载分布到多个节点上,或者增加节点来提高系统的负载能力; 

3、提高性能:让客户端本地访问就近的节点,提高用户访问速度。 

从客户端读写访问的透明度来看,数据复制集群系统分下面两种: 

1、写主(WriteMaster) :对数据的修改提交给指定的节点。读无此限制,可以读取任何一个节点。这种情况下客户端需要对读与写进行区别,俗称读写分离; 

2、写任意(Write Any):对数据的修改可提交给任意的节点,跟读一样。这种情况下,客户端对集群节点的角色与变化透明。

对zookeeper来说,它采用的方式是写任意。通过增加机器,它的读吞吐能力和响应能力扩展性非常好,而写,随着机器的增多吞吐能力肯定下降(这也是它建立observer的原因),而响应能力则取决于具体实现方式,是延迟复制保持最终一致性,还是立即复制快速响应。

12.Zookeeper角色描述

<ignore_js_op> 

13.Zookeeper与客户端

<ignore_js_op> 

14.Zookeeper设计目的

1.最终一致性:client不论连接到哪个Server,展示给它都是同一个视图,这是zookeeper最重要的性能。 

2.可靠性:具有简单、健壮、良好的性能,如果消息被到一台服务器接受,那么它将被所有的服务器接受。 

3.实时性:Zookeeper保证客户端将在一个时间间隔范围内获得服务器的更新信息,或者服务器失效的信息。但由于网络延时等原因,Zookeeper不能保证两个客户端能同时得到刚更新的数据,如果需要最新数据,应该在读数据之前调用sync()接口。 

4.等待无关(wait-free):慢的或者失效的client不得干预快速的client的请求,使得每个client都能有效的等待。 

5.原子性:更新只能成功或者失败,没有中间状态。 

6.顺序性:包括全局有序和偏序两种:全局有序是指如果在一台服务器上消息a在消息b前发布,则在所有Server上消息a都将在消息b前被发布;偏序是指如果一个消息b在消息a后被同一个发送者发布,a必将排在b前面。 

15.Zookeeper工作原理

Zookeeper 的核心是原子广播,这个机制保证了各个Server之间的同步。实现这个机制的协议叫做Zab协议。Zab协议有两种模式,它们分别是恢复模式(选主)和广播模式(同步)。当服务启动或者在领导者崩溃后,Zab就进入了恢复模式,当领导者被选举出来,且大多数Server完成了和 leader的状态同步以后,恢复模式就结束了。状态同步保证了leader和Server具有相同的系统状态。 

为了保证事务的顺序一致性,zookeeper采用了递增的事务id号(zxid)来标识事务。所有的提议(proposal)都在被提出的时候加上了zxid。实现中zxid是一个64位的数字,它高32位是epoch用来标识leader关系是否改变,每次一个leader被选出来,它都会有一个新的epoch,标识当前属于那个leader的统治时期。低32位用于递增计数。

16.Zookeeper 下 Server工作状态

每个Server在工作过程中有三种状态: 

LOOKING:当前Server不知道leader是谁,正在搜寻
LEADING:当前Server即为选举出来的leader
FOLLOWING:leader已经选举出来,当前Server与之同步

17.Zookeeper选主流程(basic paxos)

当leader崩溃或者leader失去大多数的follower,这时候zk进入恢复模式,恢复模式需要重新选举出一个新的leader,让所有的Server都恢复到一个正确的状态。Zk的选举算法有两种:一种是基于basic paxos实现的,另外一种是基于fast paxos算法实现的。系统默认的选举算法为fast paxos。

1.选举线程由当前Server发起选举的线程担任,其主要功能是对投票结果进行统计,并选出推荐的Server; 

2.选举线程首先向所有Server发起一次询问(包括自己); 

3.选举线程收到回复后,验证是否是自己发起的询问(验证zxid是否一致),然后获取对方的id(myid),并存储到当前询问对象列表中,最后获取对方提议的leader相关信息(id,zxid),并将这些信息存储到当次选举的投票记录表中; 

4.收到所有Server回复以后,就计算出zxid最大的那个Server,并将这个Server相关信息设置成下一次要投票的Server; 

5.线程将当前zxid最大的Server设置为当前Server要推荐的Leader,如果此时获胜的Server获得n/2 + 1的Server票数,设置当前推荐的leader为获胜的Server,将根据获胜的Server相关信息设置自己的状态,否则,继续这个过程,直到leader被选举出来。 通过流程分析我们可以得出:要使Leader获得多数Server的支持,则Server总数必须是奇数2n+1,且存活的Server的数目不得少于n+1. 每个Server启动后都会重复以上流程。在恢复模式下,如果是刚从崩溃状态恢复的或者刚启动的server还会从磁盘快照中恢复数据和会话信息,zk会记录事务日志并定期进行快照,方便在恢复时进行状态恢复。选主的具体流程图所示: 

<ignore_js_op> 

18.Zookeeper选主流程(fast paxos)

fast paxos流程是在选举过程中,某Server首先向所有Server提议自己要成为leader,当其它Server收到提议以后,解决epoch和 zxid的冲突,并接受对方的提议,然后向对方发送接受提议完成的消息,重复这个流程,最后一定能选举出Leader。

<ignore_js_op> 

19.Zookeeper同步流程

选完Leader以后,zk就进入状态同步过程。 

1. Leader等待server连接; 

2 .Follower连接leader,将最大的zxid发送给leader; 

3 .Leader根据follower的zxid确定同步点; 

4 .完成同步后通知follower 已经成为uptodate状态; 

5 .Follower收到uptodate消息后,又可以重新接受client的请求进行服务了。

<ignore_js_op> 

20.Zookeeper工作流程-Leader

1 .恢复数据; 

2 .维持与Learner的心跳,接收Learner请求并判断Learner的请求消息类型; 

3 .Learner的消息类型主要有PING消息、REQUEST消息、ACK消息、REVALIDATE消息,根据不同的消息类型,进行不同的处理。 

PING 消息是指Learner的心跳信息;

REQUEST消息是Follower发送的提议信息,包括写请求及同步请求;

ACK消息是 Follower的对提议的回复,超过半数的Follower通过,则commit该提议;

REVALIDATE消息是用来延长SESSION有效时间。

<ignore_js_op> 

21.Zookeeper工作流程-Follower

Follower主要有四个功能: 

1.向Leader发送请求(PING消息、REQUEST消息、ACK消息、REVALIDATE消息); 

2.接收Leader消息并进行处理; 

3.接收Client的请求,如果为写请求,发送给Leader进行投票;

4.返回Client结果。 


Follower的消息循环处理如下几种来自Leader的消息: 

1 .PING消息: 心跳消息; 

2 .PROPOSAL消息:Leader发起的提案,要求Follower投票; 

3 .COMMIT消息:服务器端最新一次提案的信息; 

4 .UPTODATE消息:表明同步完成; 

5 .REVALIDATE消息:根据Leader的REVALIDATE结果,关闭待revalidate的session还是允许其接受消息; 

6 .SYNC消息:返回SYNC结果到客户端,这个消息最初由客户端发起,用来强制得到最新的更新。

<ignore_js_op> 

好了,以上就是我对zookeeper的 理解了,以后我还会继续为大家更新新的技术请大家期待吧!!!

Guess you like

Origin www.cnblogs.com/hmit/p/11607987.html