Learning ZooKeeper principles

First, what is ZooKeeper

ZooKeeper is a distributed coordination service that can be used for service discovery, distributed locks, distributed leader election, and configuration management. All of these build on two things ZooKeeper provides: a tree-structured namespace similar to the Linux file system (think of it as a lightweight in-memory file system that is only suitable for small amounts of data and entirely unsuitable for storing many files or large files), and a watch-and-notify mechanism on every node. Since it behaves like a file system shared by many machines, we have to discuss how ZooKeeper keeps its data consistent.


 

Second, ZooKeeper cluster architecture

A ZooKeeper cluster is a high-availability cluster built on master-slave replication. In a traditional Master/Slave design, the Master serves writes, while the Slaves obtain the latest data from the Master through asynchronous replication and serve reads. ZooKeeper does not use the traditional Master/Slave concept; instead it introduces three roles, Leader, Follower and Observer, with the following responsibilities:

  • Leader  A ZooKeeper cluster has only one working Leader at any given time. It initiates and maintains heartbeats with every Follower and Observer. All writes must go through the Leader, which then broadcasts them to the other servers.
  • Follower  A ZooKeeper cluster may have multiple Followers, each of which responds to the Leader's heartbeats. A Follower handles read requests from clients directly and returns the result, forwards write requests to the Leader, and votes on the write proposals the Leader sends out (the "a write succeeds once more than half acknowledge it" strategy).
  • Observer  Plays a role similar to a Follower, but has no right to vote.

How does ZooKeeper keep data consistent between the Leader and the other servers in the cluster?

To keep writes both consistent and available, ZooKeeper uses a purpose-built consistency protocol with crash-recovery support called ZooKeeper Atomic Broadcast (ZAB). On top of this protocol, ZooKeeper implements a master-slave style architecture that keeps the replicas in the cluster consistent with each other.

Consistency when writing data: a ZooKeeper client connects to a randomly chosen node of the cluster. A read request is served directly from the node the client is connected to. A write request that arrives at a node other than the Leader is forwarded to the Leader, which broadcasts the transaction; once more than half of the nodes have written it successfully, the write is committed (a 2PC-like protocol).
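The "commit once more than half have written it" rule can be sketched as a small in-memory simulation. This is only an illustration under simplifying assumptions: the `Node` and `Leader` classes and their methods are hypothetical names, there is no networking or crash recovery, and the real ZAB protocol is considerably more involved.

```python
class Node:
    """One voting server holding an in-memory copy of the data (hypothetical)."""

    def __init__(self, name, alive=True):
        self.name = name
        self.alive = alive
        self.pending = {}   # zxid -> (key, value), proposed but not yet committed
        self.data = {}      # committed key/value pairs

    def propose(self, zxid, key, value):
        """Phase 1: record the proposal and acknowledge it if the node is up."""
        if not self.alive:
            return False
        self.pending[zxid] = (key, value)
        return True

    def commit(self, zxid):
        """Phase 2: apply a proposal this node acknowledged earlier."""
        if self.alive and zxid in self.pending:
            key, value = self.pending.pop(zxid)
            self.data[key] = value


class Leader:
    def __init__(self, nodes):
        self.nodes = nodes  # every voting member, the leader's own copy included
        self.zxid = 0       # transaction id, increases with every write attempt

    def write(self, key, value):
        self.zxid += 1
        acks = sum(node.propose(self.zxid, key, value) for node in self.nodes)
        if acks <= len(self.nodes) // 2:
            return False    # no majority, so the write is not committed
        for node in self.nodes:
            node.commit(self.zxid)
        return True


nodes = [Node("server1"), Node("server2"), Node("server3", alive=False)]
leader = Leader(nodes)
print(leader.write("/config", "v1"))   # 2 of 3 servers acknowledged
nodes[1].alive = False
print(leader.write("/config", "v2"))   # only 1 of 3 servers acknowledged
```

With one of three servers down the write still reaches a majority; with two down it does not, and the committed value stays at the earlier one.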

Leader election while the servers are running (how is a new Leader chosen when the Leader goes down)?

In cluster mode, a Leader failure does not stop ZooKeeper from continuing to provide service afterwards, but while the new Leader is being elected the cluster cannot serve external requests, so there is a brief pause (this is one difference from Eureka).

  • A server starts in a cluster that already has a Leader: this usually happens when a machine starts late and the cluster was already working before it came up. When the machine tries to elect a Leader, it is told which server is the current Leader; it only needs to connect to that Leader and synchronize its state.
  • The Leader goes down and a new one must be elected: suppose server3, the Leader, goes down. Each remaining server casts a vote. Server1 and Server2 each vote for themselves first. Every ballot carries the myid of the proposed server and its ZXID, written as (myid, ZXID), so Server1's vote is (1, 0) and Server2's vote is (2, 0). Each server then sends its vote to the other machines in the cluster. Once a new Leader has been chosen, the second step is data synchronization, so that every node holds the same data as the Leader.

Voting process: each server compares every vote it receives from the other servers with its own vote. The comparison rules are as follows:

    • Check ZXID first. The server with the larger ZXID is preferred as Leader.
    • If the ZXIDs are the same, compare myid. The server with the larger myid becomes the Leader.
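The two comparison rules can be written down as a short sketch. `Vote`, `better` and `elect` are names invented for this illustration, and the sketch assumes that every live server has seen every ballot; the real election also tracks an election epoch and exchanges ballots over the network, which is left out here.

```python
from typing import NamedTuple


class Vote(NamedTuple):
    myid: int   # id of the server proposed as Leader
    zxid: int   # latest transaction id known to that server


def better(a, b):
    """Return the winning ballot: larger ZXID first, then larger myid."""
    return a if (a.zxid, a.myid) > (b.zxid, b.myid) else b


def elect(self_votes, cluster_size):
    """self_votes holds one ballot per live server, each voting for itself.

    Returns the myid of the new Leader, or None if the live servers
    are not more than half of the cluster.
    """
    if len(self_votes) <= cluster_size // 2:
        return None
    winner = self_votes[0]
    for vote in self_votes[1:]:
        winner = better(winner, vote)
    return winner.myid


# The example from the text: server3 is down, Server1 and Server2 remain.
print(elect([Vote(1, 0), Vote(2, 0)], cluster_size=3))
```

In the example both ZXIDs are 0, so the tie is broken by myid and Server2 wins. If Server1 had the larger ZXID, it would win regardless of its smaller myid.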

Why is it better to build a ZooKeeper cluster from an odd number of servers?

ZooKeeper has this property: as long as more than half of the machines in a cluster are working properly, the cluster as a whole is available. With 2 servers, losing 1 makes the cluster unusable, because the 1 that remains is not more than half of 2, so a 2-server cluster tolerates 0 failures. By the same token, with 3 servers, if 1 dies the remaining 2 are still more than half (2 > 3/2), so a 3-server cluster tolerates 1 failure. With 4 servers, if 2 die the remaining 2 are not more than half (2 is not greater than 4/2), so the cluster is unavailable and a 4-server cluster still tolerates only 1 failure. An even number of servers can be deployed, but the extra server adds nothing to availability and is wasted.
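The arithmetic above generalizes to any cluster size. The following sketch (function names are illustrative) computes the smallest majority and the number of failures a cluster of `n` servers can survive:

```python
def quorum(n):
    """Smallest number of servers that is more than half of n."""
    return n // 2 + 1


def tolerance(n):
    """How many servers may fail while a majority is still alive."""
    return n - quorum(n)


for n in range(2, 8):
    print(f"{n} servers: quorum {quorum(n)}, tolerates {tolerance(n)} failure(s)")
```

Each even size tolerates exactly as many failures as the odd size just below it, which is the reason for preferring odd cluster sizes.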


 

Third, some important ZooKeeper concepts

ZooKeeper keeps its data in memory, which is what gives it high throughput and low latency (but memory also limits how much can be stored, which is a further reason to keep the amount of data in each znode small).

ZooKeeper is high-performance, especially in applications where reads outnumber writes, because every write has to be synchronized across all the servers. (Reads outnumbering writes is the typical workload of a coordination service.)

Session

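  A session is the conversation between a ZooKeeper server and a client. In ZooKeeper, a client connection is a long-lived TCP connection between the client and a server. When a client starts, it first establishes a TCP connection with a server, and the lifecycle of the client session begins with that first connection. Over this connection the client keeps the session alive through heartbeats, sends requests to the ZooKeeper server and receives responses, and also receives Watch event notifications from the server. The session's sessionTimeout value sets the timeout of a client session. When the client connection is broken for any reason, such as heavy server load, a network failure, or the client disconnecting on its own, the previously created session remains valid as long as the client reconnects to any server in the cluster within the time set by sessionTimeout.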

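Before creating a session for a client, the server first assigns the client a sessionID. The sessionID is an important identifier of a ZooKeeper session, and many session-related mechanisms are built on it, so whichever server assigns the sessionID, it must be globally unique.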

Watcher

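  A Watcher (event listener) is a very important feature of ZooKeeper. ZooKeeper lets users register Watchers on specified nodes, and when certain events are triggered, the ZooKeeper server sends an event notification to the interested clients. This mechanism is a key part of how ZooKeeper implements distributed coordination.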
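The register-then-notify flow can be modelled in a few lines. This is a toy in-memory model, not a ZooKeeper client: `WatchedStore` and its methods are hypothetical names. It assumes the one-time trigger behaviour of standard ZooKeeper watches, where a watch fires once and has to be registered again to receive further events.

```python
class WatchedStore:
    """Toy key/value store with one-shot watches on paths (hypothetical)."""

    def __init__(self):
        self.data = {}
        self.watches = {}   # path -> callbacks waiting for the next change

    def get(self, path, watch=None):
        """Read a path and optionally register a callback for its next change."""
        if watch is not None:
            self.watches.setdefault(path, []).append(watch)
        return self.data.get(path)

    def set(self, path, value):
        """Write a path, then notify and discard the watches registered on it."""
        self.data[path] = value
        for callback in self.watches.pop(path, []):
            callback(path)


store = WatchedStore()
events = []

store.get("/app/config", watch=events.append)   # register a watch
store.set("/app/config", "timeout=30")          # fires the watch
store.set("/app/config", "timeout=60")          # no watch left, nothing fires
print(events)
```

A client that wants to keep following a node would read the new value inside its callback and pass a watch again on that read.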

ACL

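ZooKeeper uses ACLs (Access Control Lists) for permission control, similar to the permission control of a UNIX file system. ZooKeeper defines the following 5 permissions: CREATE (create child nodes), READ (read a node's data and list its children), WRITE (set a node's data), DELETE (delete child nodes), and ADMIN (set a node's ACL).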


 

Fourth, ZooKeeper's data structure

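ZooKeeper lets distributed processes coordinate with each other through a shared hierarchical namespace, much like a standard file system. The namespace consists of data registers, called znodes in ZooKeeper, which are similar to files and directories. Unlike a typical file system, which is designed for storage, ZooKeeper keeps its data in memory, which means it can achieve high throughput and low latency.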

 

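A znode is created with one of four types:

1. PERSISTENT: persistent node. The node still exists after the client disconnects from ZooKeeper.

2. PERSISTENT_SEQUENTIAL: persistent sequential node. The node still exists after the client disconnects from ZooKeeper, and ZooKeeper appends a sequence number to the node name.

3. EPHEMERAL: ephemeral node. The node is deleted after the client disconnects from ZooKeeper.

4. EPHEMERAL_SEQUENTIAL: ephemeral sequential node. The node is deleted after the client disconnects from ZooKeeper, and ZooKeeper appends a sequence number to the node name.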
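The difference between the four types comes down to two independent flags, ephemeral and sequential. The following in-memory model is a sketch for illustration only: `MiniZk` and its methods are hypothetical, not a real client API. The 10-digit zero-padded suffix mirrors how ZooKeeper formats sequence numbers.

```python
class MiniZk:
    """Toy model of znode creation flags (hypothetical, in-memory only)."""

    def __init__(self):
        self.nodes = {}      # path -> owning session id, or None if persistent
        self.counters = {}   # parent path -> next sequence number

    def create(self, path, session=None, ephemeral=False, sequential=False):
        if ephemeral and session is None:
            raise ValueError("an ephemeral node needs an owning session")
        if sequential:
            # The counter is kept per parent and appended to the requested name.
            parent = path.rsplit("/", 1)[0]
            seq = self.counters.get(parent, 0)
            self.counters[parent] = seq + 1
            path = f"{path}{seq:010d}"
        if path in self.nodes:
            raise KeyError(f"node already exists: {path}")
        self.nodes[path] = session if ephemeral else None
        return path

    def close_session(self, session):
        """Delete every ephemeral node owned by the closed session."""
        owned = [p for p, owner in self.nodes.items()
                 if owner is not None and owner == session]
        for p in owned:
            del self.nodes[p]


zk = MiniZk()
zk.create("/app")                                        # PERSISTENT
first = zk.create("/app/job-", sequential=True)          # PERSISTENT_SEQUENTIAL
zk.create("/app/worker", session="s1", ephemeral=True)   # EPHEMERAL
second = zk.create("/app/job-", session="s1",
                   ephemeral=True, sequential=True)      # EPHEMERAL_SEQUENTIAL
zk.close_session("s1")
print(sorted(zk.nodes))
```

After session `s1` closes, only the persistent nodes remain, while the sequence counter under `/app` has advanced for both sequential creations.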


 

Fifth, what ZooKeeper is used for

1. Naming service

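Creating a directory in ZooKeeper's file system gives you a unique path. When we use tborg and cannot determine which machine the upstream program is deployed on, we can agree on a path with the downstream program, and the two can discover each other through that path.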

2. Configuration management

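Programs always need configuration, and when a program is deployed across many machines, changing the configuration machine by machine becomes difficult. So put all of this configuration into ZooKeeper, stored in one of its nodes, and have every application concerned watch that node. As soon as the configuration changes, each application is notified by ZooKeeper, fetches the new configuration from ZooKeeper, and applies it.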

3. Cluster management

Cluster management comes down to two things: detecting machines leaving and joining, and electing a master.

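For the first, all machines agree to create ephemeral nodes under the parent node GroupMembers and then watch for changes to the children of that parent. As soon as a machine dies, its connection to ZooKeeper is broken and the ephemeral node it created is deleted, so all the other machines are notified that a sibling node has been removed, and everyone knows that machine has gone offline. A new machine joining works the same way: all machines are notified that a new sibling node has been added.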

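For the second, we change things slightly: all machines create ephemeral sequential nodes, and each time the machine with the smallest sequence number is chosen as master.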
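Both points reduce to simple operations on the list of child names under GroupMembers. In the sketch below the helper names and the `member-` prefix are assumptions made for the example; the child lists stand in for what a client would read from ZooKeeper before and after a change notification.

```python
def sequence_of(name):
    """Extract the numeric suffix ZooKeeper appended to a sequential node."""
    return int(name.rsplit("-", 1)[1])


def pick_master(children):
    """The member whose node has the smallest sequence number is the master."""
    return min(children, key=sequence_of) if children else None


def diff_members(before, after):
    """Return (joined, left) between two snapshots of the child list."""
    return sorted(set(after) - set(before)), sorted(set(before) - set(after))


before = ["member-0000000002", "member-0000000000", "member-0000000001"]
print(pick_master(before))

# The current master dies, its ephemeral node disappears, a new machine joins.
after = ["member-0000000002", "member-0000000001", "member-0000000003"]
print(diff_members(before, after))
print(pick_master(after))
```

When the node with the smallest number disappears, the next smallest takes over as master without any extra coordination between the machines.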

4. Distributed locks

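With ZooKeeper's consistent file system, locking becomes easy. Lock services fall into two categories: one keeps exclusive access, the other controls ordering.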

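For the first category, we treat a znode on ZooKeeper as a lock and implement it by creating the znode. All clients try to create the /distribute_lock node, and the client that succeeds holds the lock. When it is done, it deletes the distribute_lock node it created, which releases the lock.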

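For the second category, /distribute_lock already exists, and all clients create ephemeral sequential nodes under it. As with master election, the client with the smallest sequence number gets the lock and deletes its node when it is done, so the clients take the lock one after another in order.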
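The two kinds of lock can be contrasted in a small in-memory model. `LockService` and its method names are hypothetical and exist only for this sketch; in a real deployment each step would be a create or delete call against ZooKeeper, and waiting clients would use watches rather than polling.

```python
class LockService:
    """Toy model of the two lock styles built on /distribute_lock."""

    def __init__(self):
        self.exclusive_owner = None   # who created /distribute_lock, if anyone
        self.queue = []               # (sequence number, client) children
        self.next_seq = 0

    # Category 1: exclusive lock, whoever creates the node first wins.
    def try_acquire(self, client):
        if self.exclusive_owner is None:
            self.exclusive_owner = client
            return True
        return False                  # the node already exists

    def release(self, client):
        if self.exclusive_owner == client:
            self.exclusive_owner = None   # deleting the node frees the lock

    # Category 2: ordered lock, each client creates a sequential child.
    def enqueue(self, client):
        seq = self.next_seq
        self.next_seq += 1
        self.queue.append((seq, client))
        return seq

    def holder(self):
        """The client with the smallest sequence number holds the lock."""
        return min(self.queue)[1] if self.queue else None

    def dequeue(self, seq):
        self.queue = [(s, c) for s, c in self.queue if s != seq]


locks = LockService()
print(locks.try_acquire("A"), locks.try_acquire("B"))   # A wins, B does not
locks.release("A")
print(locks.try_acquire("B"))                           # now B can take it

seq_a = locks.enqueue("A")
seq_b = locks.enqueue("B")
print(locks.holder())        # A queued first
locks.dequeue(seq_a)
print(locks.holder())        # B is next in line
```

The exclusive style gives no guarantee about who gets the lock next, while the sequential style hands it over in the order in which the clients queued.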


Origin www.cnblogs.com/jalja/p/11441174.html