[Reprint] ZooKeeper Learning Series (1): Basic Concepts of ZooKeeper

https://segmentfault.com/a/1190000018927058

 

Foreword

I recently joined a technical interest group in my department and was assigned the task of researching ZooKeeper. While studying it, I found that ZooKeeper is widely used in industry because it is open source and performs very well. It appears in many scenarios, yet the underlying principles of those scenarios are much the same: once you truly understand ZooKeeper's basic concepts and mechanisms, you can work out the rest by analogy.

So, after giving my project team a first sharing session on ZooKeeper's principles, together with a client demo and a service registry demo, I had the idea of writing the material up as a series, and of using it as a starting point to slowly pick my blog back up.

This article focuses on the following points:

  1. What is ZooKeeper
  2. ZooKeeper data model
  3. ZooKeeper service basic operations
  4. Sessions
  5. Watches
  6. Summary

1. What is ZooKeeper

My first contact with ZooKeeper came because the microservice architecture of our project uses Dubbo for service governance, and the service registry that Dubbo recommends is ZooKeeper. In essence, ZooKeeper is a distributed coordination service. Coordinating and managing services in a distributed environment is a complex process, and ZooKeeper addresses it with a simple architecture and API. This lets developers focus on core application logic without having to worry about the distributed nature of the application. ZooKeeper was first used in the Hadoop ecosystem; Apache HBase, for example, uses ZooKeeper to track the status of distributed data.

The name itself explains a lot: Zoo plus Keeper. A zoo holds many animals, and the animals can be compared to the wide range of services in a distributed environment. What ZooKeeper does is manage those services.

ZooKeeper's design goal is to encapsulate complex, error-prone distributed consistency services into an efficient and reliable set of primitives, and to offer them to users through a series of easy-to-use interfaces.

Primitive: a term from operating systems and computer networking. A primitive is a procedure made up of several instructions that performs a specific function. It is indivisible: its execution must be continuous and must not be interrupted.

ZooKeeper provides its services mainly through three things: a data structure, a set of primitives, and the watch mechanism.

Combined with its data model, ZooKeeper lets distributed applications implement features such as data publish/subscribe, load balancing, naming services, distributed coordination/notification, cluster management, master election, distributed locks, and distributed queues.

2. ZooKeeper data model

ZNode

[Figure: ZooKeeper internal data model]

As the figure shows, ZooKeeper's data model closely resembles a Unix file system directory tree, with a hierarchical namespace. Each node is called a ZNode. A node can have child nodes, and it can also store a small amount of data. (You can think of it as a file system in which a file is also allowed to be a directory.)

(1) Node references

A ZNode is referenced by its path, much like a Unix file path. Paths must be absolute, so they must begin with a forward slash (/). In addition, each path name must be unique and cannot be changed.

This is also reflected in Dubbo's service registration. The Dubbo source has a cross-cutting URL class, and Dubbo stores and passes configuration information in a bus-like fashion: the configuration is carried on the URL, so the relevant configuration is always available wherever the URL goes. When Dubbo registers with the registry, the node names it writes are composed of the URL's URI and the encoded configuration information, as shown in the figures below.

[Figure: Dubbo node structure and path names, GUI view]

[Figure: Dubbo node structure and path names, command-line view]

This is extended material; it will be described in more detail in the later chapter on the service registry.

(2) ZNode structure

As mentioned earlier, a znode has the characteristics of both a file and a directory. Like a file, it maintains data, meta-information, an ACL, timestamps, and other data structures; like a directory, it can be identified as part of a path.

A ZNode consists of the following components:

  • Stat data structure

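    • Access control list (ACL) - every node has an ACL that controls operations on it. The list defines user permissions and restricts which operations specific users may perform on the target node

      • CREATE - permission to create child nodes
      • READ - permission to get the node's data and its list of children
      • WRITE - permission to update the node's data
      • DELETE - permission to delete child nodes
      • ADMIN - permission to set the node's ACL
    • Versions - a ZNode has three version numbers

      • version - the version of the ZNode's data
      • cversion - the version of the ZNode's children
      • aversion - the version of the ZNode's ACL
    • Zxid

      • Can be understood as a form of timestamp in ZooKeeper, or as a transaction ID
      • If Zxid1 is less than Zxid2, then the event corresponding to Zxid1 happened before the event corresponding to Zxid2.
      • Each ZooKeeper node maintains three Zxid values: cZxid, mZxid, and pZxid.

        • cZxid: the Zxid at which the node was created (create)
        • mZxid: the Zxid at which the node was last modified (modify)
        • pZxid: the Zxid at which the node's list of children was last modified; changes to a child's content do not change pZxid
  • Data field
  • Child nodes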

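A few points deserve closer attention:

A. Status information / node attributes

As an example, here is the status information of a Dubbo microservice interface node, which I obtained on the server with the get command in the ZooKeeper command-line client: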

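[zk: localhost:2181(CONNECTED) 0] get /dubbo/com.***.microservice.ucs.api.UniqueControlApi
127.0.0.1 // node data (the Data field)
cZxid = 0xdd59 // Created ZXID: the transaction ID at which this ZNode was created
ctime = Thu Apr 18 15:17:11 CST 2019 // Created Time: when this ZNode was created
mZxid = 0xdd59 // Modified ZXID: the transaction ID at which this ZNode was last updated
mtime = Thu Apr 18 15:17:11 CST 2019 // Modified Time: when this node was last updated
pZxid = 0xdd62 // the transaction ID at which this node's list of children was last modified. Note that pZxid changes only when the list of children changes; changes to a child's content do not affect it.
cversion = 4 // version number of the children
dataVersion = 0 // version number of the node's data
aclVersion = 0 // version number of the ACL
ephemeralOwner = 0x0 // session ID of the session that created this node. For a persistent node this value is 0.
dataLength = 9 // length of the Data field
numChildren = 4 // number of children. The children of a Dubbo interface node are providers/configurators/routers/consumers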

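B. Data field

The data stored in each ZooKeeper node is read and written atomically. A read returns all of the data associated with the node, and a write replaces all of the node's data.

It is worth noting that although ZooKeeper can store data, it was not designed to be a database or a large data store. Instead, it is meant to manage coordination data, such as configuration file information, status information, and rendezvous locations in distributed applications. Such data is usually small, measured in kilobytes. A ZNode also limits the size of its data, to at most 1 MB. From this alone you can infer that ZooKeeper is a feasible basis for a distributed configuration center.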

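C. Zxid

In ZooKeeper, operations that change the state of the ZooKeeper server are called transactions. They generally include creating and deleting data nodes, updating node data, and creating and expiring client sessions. For every transaction request, ZooKeeper assigns a globally unique transaction ID, represented by the Zxid.

As the example above shows, a Zxid is a 64-bit number. The high 32 bits are called the epoch and identify the current leader of the ZooKeeper cluster; whenever the leader changes, a new epoch begins. The low 32 bits are an incrementing counter. From these Zxids you can indirectly determine the global order in which ZooKeeper processed the transaction requests.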
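The epoch/counter split described above can be sketched with two bit operations. This is only an illustrative helper, not part of the ZooKeeper API: the class name ZxidParts and its methods are hypothetical, and the sample value is the pZxid from the get output shown earlier.

```java
public class ZxidParts {
    // High 32 bits of the Zxid: the leader epoch.
    public static long epoch(long zxid) {
        return zxid >>> 32;
    }

    // Low 32 bits of the Zxid: the counter within that epoch.
    public static long counter(long zxid) {
        return zxid & 0xFFFFFFFFL;
    }

    public static void main(String[] args) {
        long pZxid = 0xdd62L; // sample value taken from the get output above
        System.out.println("epoch=" + epoch(pZxid) + ", counter=" + counter(pZxid));
    }
}
```

Because the epoch occupies the high bits, comparing two Zxids as plain numbers orders first by epoch and then by counter, which is why a smaller Zxid corresponds to an earlier transaction.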

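(3) Node types

Strictly speaking, there are four ZNode types: persistent, ephemeral, persistent sequential, and ephemeral sequential.

  • PERSISTENT (persistent node) - the node's lifecycle does not depend on the session. It continues to exist after the creating client disconnects, and is removed only when a client explicitly deletes it;
  • EPHEMERAL (ephemeral node) - the node's lifecycle is tied to the session. Once the client's session ends, the ephemeral node is deleted automatically. In addition, ephemeral nodes are not allowed to have children.
  • SEQUENTIAL (sequential node) - when you create a sequential node, ZooKeeper sets the znode's path by appending a 10-digit sequence number to the original name. For example, if a znode with the path /myapp is created as a sequential node, ZooKeeper changes the path to /myapp0000000001 and sets the next sequence number to 0000000002. If two sequential nodes are created at the same time, ZooKeeper never gives both the same number. Sequential nodes play an important role in locking and synchronization.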
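The naming rule for sequential nodes can be modelled in a few lines. This is a toy sketch rather than ZooKeeper's own code: SequentialNames is a hypothetical class, and the starting value of the counter is passed in by the caller (here 1, to match the /myapp example above), since the real server manages that counter itself.

```java
import java.util.Locale;
import java.util.concurrent.atomic.AtomicInteger;

public class SequentialNames {
    // Next sequence number to hand out; atomic so that concurrent
    // callers never receive the same number.
    private final AtomicInteger next;

    public SequentialNames(int first) {
        this.next = new AtomicInteger(first);
    }

    // Returns the requested path with a zero-padded 10-digit suffix appended.
    public String create(String path) {
        int seq = next.getAndIncrement();
        return path + String.format(Locale.ENGLISH, "%010d", seq);
    }

    public static void main(String[] args) {
        SequentialNames parent = new SequentialNames(1);
        System.out.println(parent.create("/myapp")); // /myapp0000000001
        System.out.println(parent.create("/myapp")); // /myapp0000000002
    }
}
```

The fixed-width padding means that sorting the names as strings gives the same order as sorting by sequence number, which is convenient for lock and queue recipes.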

[Figure: node types available when a client creates a node]

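3. ZooKeeper service basic operations

[Figure: ZooKeeper service basic operations]

The figure above lists the nine basic operations of the ZooKeeper service. Start zkCli.sh and type help to see these operations.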

[zk: localhost:2181(CONNECTED) 1] help   
ZooKeeper -server host:port cmd args
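    stat path [watch] // get the status information of the given node
    set path data [version] // the setData operation
    ls path [watch] // list all children of a node
    delquota [-n|-b] path // delete a node quota
    ls2 path [watch] // ls and stat combined
    setAcl path acl // set the ACL
    setquota -n|-b val path // set a node quota; -n limits the number of children, -b limits the data length
    history // command history
    redo cmdno // re-run a command from the history
    printwatches on|off
    delete path [version] // delete the node at the given path; any children must be deleted first
    sync path // synchronize the view
    listquota path // show node quota information
    rmr path // delete a node and its children
    get path [watch] // get the node's data
    create [-s] [-e] path data acl // create a node
    addauth scheme auth
    quit
    getAcl path // get the ACL
    close
    connect host:port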

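As the commands show, update operations in ZooKeeper are conditional. A delete or setData must specify the version number of the znode being updated, which can be found by calling exists. If the version number does not match, the update fails.

Update operations in ZooKeeper are non-blocking. If a client loses an update (because another process updated the same znode at the same time), it can choose to retry or to do something else, without blocking the execution of other processes.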
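The version check behaves like an optimistic lock. The following in-memory sketch illustrates the idea under simplifying assumptions: VersionedNode is a hypothetical class, it holds a single node, and a mismatch is reported by returning false (the real Java client signals it with KeeperException.BadVersionException instead).

```java
public class VersionedNode {
    private String data = "";
    private int version = 0;

    public synchronized int version() {
        return version;
    }

    public synchronized String data() {
        return data;
    }

    // Replaces the data only if the caller's expected version matches the
    // current one. Returns false on a mismatch and changes nothing.
    public synchronized boolean setData(String newData, int expectedVersion) {
        if (expectedVersion != version) {
            return false;
        }
        data = newData;
        version++;
        return true;
    }

    public static void main(String[] args) {
        VersionedNode node = new VersionedNode();
        int seen = node.version();                           // both clients read version 0
        System.out.println(node.setData("from-A", seen));    // first writer wins
        System.out.println(node.setData("from-B", seen));    // stale version, rejected
        // The losing client is not blocked: it re-reads the version and retries.
        System.out.println(node.setData("from-B", node.version()));
    }
}
```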

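4. Sessions

In ZooKeeper, a client connection is a long-lived TCP connection between a client and a server. When the client starts, it first establishes a TCP connection with the server, and the lifecycle of the client session begins with that first connection. Over this connection the client keeps its session alive through heartbeats, sends requests to the ZooKeeper server and receives responses, and also receives watch event notifications from the server.

The client sends heartbeats at regular intervals to keep the session valid. If the ZooKeeper server ensemble receives no heartbeat from a client for longer than the period specified when the service was started (the session timeout), it concludes that the client has died.

The session timeout is usually expressed in milliseconds. When a session ends for any reason, the ephemeral nodes created during that session are deleted as well.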
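The relationship between heartbeats, the session timeout, and ephemeral nodes can be illustrated with a small in-memory model. Everything here is hypothetical (the SessionModel class, its method names, and the explicit clock values passed in by the caller); it only sketches the rule described above and is not how the server is implemented.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

public class SessionModel {
    private final long timeoutMs;
    // sessionId -> time of the last heartbeat, in milliseconds
    private final Map<Long, Long> lastHeartbeat = new HashMap<>();
    // ephemeral node path -> sessionId of the owning session
    private final Map<String, Long> ephemeralOwner = new HashMap<>();

    public SessionModel(long timeoutMs) {
        this.timeoutMs = timeoutMs;
    }

    // Records a heartbeat; the first heartbeat also opens the session.
    public void heartbeat(long sessionId, long nowMs) {
        lastHeartbeat.put(sessionId, nowMs);
    }

    public void createEphemeral(String path, long sessionId) {
        if (!lastHeartbeat.containsKey(sessionId)) {
            throw new IllegalStateException("unknown or expired session");
        }
        ephemeralOwner.put(path, sessionId);
    }

    public boolean exists(String path) {
        return ephemeralOwner.containsKey(path);
    }

    // Expires every session that has been silent for longer than the timeout
    // and deletes the ephemeral nodes that session owned.
    public void tick(long nowMs) {
        Iterator<Map.Entry<Long, Long>> it = lastHeartbeat.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<Long, Long> entry = it.next();
            if (nowMs - entry.getValue() > timeoutMs) {
                long expired = entry.getKey();
                it.remove();
                ephemeralOwner.values().removeIf(owner -> owner == expired);
            }
        }
    }
}
```

Passing the clock in explicitly keeps the model deterministic: a session with a 3000 ms timeout whose last heartbeat was at 2000 ms is still alive at 4000 ms, but is expired, together with its ephemeral nodes, by 5001 ms.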

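5. Watches

In my view, watches (listening for events) are a very important feature of ZooKeeper and one of the core features behind most of its functionality. Put simply, ZooKeeper lets a client register watches on specified nodes. When certain events occur, the ZooKeeper server asynchronously notifies the interested clients, that is, the ones that registered watches. You can think of it as a publish/subscribe system.

A znode change is either a modification of the data associated with the znode or a change in the znode's children. A watch fires only once. If a client wants to be notified again, it must set the watch again through another read operation. When a session expires, the client is disconnected from the server and its watches are removed as well.

That was the simple part; now for the more complex part.

First, a few characteristics to understand: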

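  • One-time trigger: a watch event is triggered only once. If the node changes again, no notification is sent unless the watch has been set again in the meantime;
  • Sent to Client: when the state of a watched object changes, the event corresponding to the watch on that object is triggered. Watch events are sent to the client asynchronously, and ZooKeeper provides an ordering guarantee for the watch mechanism.
  • The data for which the watch was set: the information sent to the client is, in effect, the type of watch you set; see the description below.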
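The one-time trigger rule is the easiest of these to get wrong, so here is a minimal in-memory sketch of it. The OneShotWatches class is hypothetical and models only data watches on a single process; it delivers events synchronously, whereas the real server notifies clients asynchronously.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

public class OneShotWatches {
    private final Map<String, String> data = new HashMap<>();
    private final Map<String, List<Consumer<String>>> dataWatches = new HashMap<>();

    // Reads a node's data. If a watcher is given, it is registered as a
    // one-time data watch on that path.
    public String getData(String path, Consumer<String> watcher) {
        if (watcher != null) {
            dataWatches.computeIfAbsent(path, p -> new ArrayList<>()).add(watcher);
        }
        return data.get(path);
    }

    // Writes a node's data and fires the watches registered on the path.
    // The watches are removed before firing, so each one fires at most once.
    public void setData(String path, String value) {
        data.put(path, value);
        List<Consumer<String>> fired = dataWatches.remove(path);
        if (fired != null) {
            for (Consumer<String> watcher : fired) {
                watcher.accept(path);
            }
        }
    }

    public static void main(String[] args) {
        OneShotWatches zk = new OneShotWatches();
        List<String> events = new ArrayList<>();
        zk.getData("/config", events::add);
        zk.setData("/config", "v1"); // fires the watch
        zk.setData("/config", "v2"); // no watch left, nothing fires
        System.out.println(events.size()); // 1
    }
}
```

To keep receiving notifications, a client re-registers the watch with another read inside its callback. Changes that happen between the notification and that read are not reported individually, but the read itself returns the latest data.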

ZooKeeper watches come in two kinds: data watches and children watches. That is, you can set a watch on a node's data, and you can also set a watch on a node's children.

Here is the code that sets watches with ZkClient, a ZooKeeper Java client:

// listener: the listener to register
// path: the node path

// Children listener
private List<String> addTargetChildListener(String path, IZkChildListener listener) {
    return client.subscribeChildChanges(path, listener);
}

// Node data listener
public void addChildDataListener(String path, IZkDataListener listener) {
    try {
        // Subscribe to data changes on the node
        client.subscribeDataChanges(path, listener);
    } catch (ZkNodeExistsException e) {
        // ignored
    }
}

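As a developer, you need to know which operations on a watched node will trigger the watches you set.

  1. A successful setData triggers the data watches of the znode
  2. A successful create triggers the data watches of the znode being created and the children watches of its parent znode
  3. A successful delete triggers both the data watches and the children watches of the znode being deleted, as well as the children watches of its parent znode

Next, look at the data listener interface in ZkClient, IZkDataListener: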

public interface IZkDataListener {
    // Called when the data of the watched node is updated
    public void handleDataChange(String dataPath, Object data) throws Exception;

    // Called when the watched node is deleted
    public void handleDataDeleted(String dataPath) throws Exception;
}

And the children listener interface in ZkClient, IZkChildListener:

public interface IZkChildListener {

    /**
     * Called when the children of the given path changed.
     * (Triggered when the watched node's list of children changes.)
     *
     * @param parentPath
     *            The parent path
     * @param currentChilds
     *            The children or null if the root node (parent path) was deleted.
     * @throws Exception
     */
    public void handleChildChange(String parentPath, List<String> currentChilds) throws Exception;
}

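Having seen this, you can probably tell that ZooKeeper can be used as a distributed configuration center. You only need to add your own logic that runs after ZooKeeper asynchronously notifies you of a node data change and updates your configuration. A later chapter will include a related demo.

For a detailed introduction to watches, see the official documentation: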

ZooKeeper Watches

https://zookeeper.apache.org/...

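6. Summary

This chapter opens the ZooKeeper series. It introduced several basic ZooKeeper concepts and gave examples to help make them concrete.

Now let's look back at the guarantees ZooKeeper provides: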

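① Sequential consistency
Transaction requests issued by the same client are applied to ZooKeeper strictly in the order in which they were issued.

② Atomicity
The result of a transaction request is applied consistently across all machines in the cluster: either every machine in the cluster applies a given transaction successfully, or none of them does. It never happens that some machines apply the transaction while others do not.

③ Single system image
Whichever ZooKeeper server a client connects to, it sees the same server-side data model.

④ Reliability
Once the server has successfully applied a transaction and responded to the client, the resulting change in server state persists until another transaction changes it.

⑤ Timeliness
The usual first reaction to "timeliness" is that once a transaction has been applied successfully, a client can immediately read the latest state from the server. Note, however, that ZooKeeper only guarantees that within a certain time bound the client will eventually be able to read the latest data state from the server.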

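In terms of today's material: sequential consistency is implemented through the Zxid, which is globally unique and monotonically increasing, and requests within the same session are handled FIFO. Reliability can also be understood with what we covered today: when a transaction is applied, the change in server state is saved in the form of the Zxid, the znode data version, the data, and the node path. How the remaining guarantees are implemented should become clear after studying ZooKeeper clusters.

This article draws on several excellent articles found online, combined with some of my own thinking and practice. I hope it helps you in learning about ZooKeeper.

In the next chapter, I will cover ZooKeeper clusters, how the CAP theorem applies to ZooKeeper in practice, and how to build a ZooKeeper cluster.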

References

[1] https://zookeeper.apache.org/... Official documentation (highly recommended)

[2] https://www.cnblogs.com/sundd... The author appears to have a fairly deep understanding of the official documentation; I found that the article's structure resembles that of the official site. Very well written.

[3] https://www.jianshu.com/p/a17... The author gives a general, introductory overview of ZooKeeper.

[4] https://www.w3cschool.cn/zook... w3cschool tutorial

Read 1.7k updated on December 20


Origin www.cnblogs.com/jinanxiaolaohu/p/12116239.html