What problems does ZooKeeper ultimately help us solve?

From a design perspective, ZooKeeper is a distributed service-management framework based on the observer pattern. It stores and manages the data that everyone cares about and accepts registrations from observers; once the state of that data changes, ZooKeeper notifies the observers that have registered with it so they can react accordingly, thereby implementing a cluster-management model similar to Master/Slave. For the detailed internal architecture of ZooKeeper, you can read its source code.

The following sections describe these typical application scenarios in detail; in other words, they answer the question of what problems ZooKeeper actually solves for us.

Uniform Naming Service (Name Service)

A distributed application usually needs a complete set of naming conventions: names must be unique, yet easy for people to recognize and remember. A tree-structured name is usually the ideal choice, because a tree of names is a hierarchical directory structure that is both human-friendly and guaranteed not to repeat. You may be thinking of JNDI here, and indeed ZooKeeper's Name Service is similar in function to JNDI: both associate resources with some level of a directory structure. However, ZooKeeper's Name Service is a broader kind of association; you may not need to associate a name with a specific resource at all, but only need a name that will never repeat, just like generating a unique primary key in a database.

The Name Service is already a built-in feature of ZooKeeper; you only need to call the ZooKeeper API to use it. For example, calling the create interface easily creates a directory node.
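As a rough illustration (not taken from the original article's listings), the sketch below registers a name for a resource and also generates a purely unique name. The paths, the address string, and the class name are placeholders, and the parent nodes are assumed to exist:

import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

// Naming-service sketch: paths and data below are placeholders, and the
// parent nodes ("/services", "/names") are assumed to exist already.
public class NameServiceSketch {
    // Associate a readable, hierarchical name with a resource address.
    static void registerName(ZooKeeper zk) throws KeeperException, InterruptedException {
        zk.create("/services/orderService", "192.168.0.10:8080".getBytes(),
                ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
    }

    // Or just ask ZooKeeper for a name that is guaranteed never to repeat,
    // much like generating a primary key in a database.
    static String uniqueName(ZooKeeper zk) throws KeeperException, InterruptedException {
        return zk.create("/names/id-", new byte[0],
                ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT_SEQUENTIAL);
    }
}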

Configuration Management

Configuration management is very common in distributed application environments. For example, the same application may run on multiple PC servers, and some of the configuration items of those applications are identical. If you want to modify one of these shared items, you have to modify it on every PC server that runs the application, which is cumbersome and error-prone.

Configuration information like this can be handed over to ZooKeeper for management: the configuration is stored in a ZooKeeper directory node, and every machine whose application needs this configuration watches the status of that node. Once the configuration changes, each machine receives a notification from ZooKeeper, fetches the new configuration from ZooKeeper, and applies it to its system.
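A minimal sketch of this pattern, under the assumption that the configuration lives at a node called /configuration and that the ZooKeeper handle is already connected (neither detail comes from the article), might look like this:

import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooKeeper;

// Configuration-watching sketch: "/configuration" is an assumed path, and the
// ZooKeeper handle "zk" is assumed to be connected already.
public class ConfigWatchSketch implements Watcher {
    private final ZooKeeper zk;
    private volatile byte[] currentConfig;

    public ConfigWatchSketch(ZooKeeper zk) throws KeeperException, InterruptedException {
        this.zk = zk;
        loadConfig();
    }

    // Read the shared configuration and leave a watch behind, so the next
    // change to the node triggers process() below.
    private void loadConfig() throws KeeperException, InterruptedException {
        currentConfig = zk.getData("/configuration", this, null);
    }

    public void process(WatchedEvent event) {
        if (event.getType() == Event.EventType.NodeDataChanged) {
            try {
                loadConfig();   // fetch the new configuration and re-register the watch
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }
}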

Figure 2. Configuration management structure diagram

Cluster Management (Group Membership)

ZooKeeper makes it easy to implement cluster-management functionality. If multiple servers form a cluster that provides a service, there must be a "manager" that knows the current service status of every machine in the cluster; if a machine stops providing the service, the rest of the cluster must be informed so that the service-allocation strategy can be adjusted. Likewise, when the cluster's capacity is increased by adding one or more servers, the "manager" must be informed as well.

ZooKeeper can not only help you maintain the current service status of the machines in a cluster, it can also help you elect this "manager" and let it manage the cluster. This is another ZooKeeper feature: Leader Election.

The implementation is as follows: each server creates a directory node of type EPHEMERAL in ZooKeeper and calls getChildren(String path, boolean watch) with watch set to true on the parent of the node it created. Because the nodes are EPHEMERAL, when the server that created a node dies, the node is removed as well, so the parent's children change and the watch registered by getChildren fires; the other servers therefore learn that one server has died. The same principle applies when a new server is added.
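A rough sketch of this mechanism (the /group path, the method names, and the reliance on the handle's default Watcher are assumptions, not from the article) could look like:

import java.util.List;
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

// Group-membership sketch: "/group" is an assumed, pre-existing parent node,
// and "zk" is an already connected handle whose default Watcher receives the
// NodeChildrenChanged events.
public class GroupMembershipSketch {
    // Each server announces itself with an EPHEMERAL node; the node disappears
    // automatically when the server's session dies.
    static void joinGroup(ZooKeeper zk, String serverId)
            throws KeeperException, InterruptedException {
        zk.create("/group/" + serverId, new byte[0],
                ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL);
    }

    // The "manager" lists the members with watch=true, so it is notified the
    // next time a member joins or leaves.
    static List<String> listMembers(ZooKeeper zk)
            throws KeeperException, InterruptedException {
        return zk.getChildren("/group", true);
    }
}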

How does ZooKeeper implement Leader Election, that is, elect a Master server? As before, every server creates an EPHEMERAL directory node, but this time it is also a SEQUENTIAL node, so it is an EPHEMERAL_SEQUENTIAL directory node. The reason for using EPHEMERAL_SEQUENTIAL nodes is that every server is then assigned a number, and we can select the server with the smallest current number as the Master. If the server with the smallest number dies, its node is deleted because it is an EPHEMERAL node, so a new smallest-numbered node appears in the node list, and we choose that node's server as the new Master. This achieves dynamic Master election and avoids the single point of failure that a traditional, statically designated Master is prone to.

Figure 3. Cluster management structure diagram

Part of the sample code is shown below; see the appendix for the complete code:

Listing 3. Leader Election critical code

void findLeader() throws InterruptedException {
    byte[] leader = null;
    try {
        // read the current leader node and set a watch on it;
        // this throws if no leader node exists yet
        leader = zk.getData(root + "/leader", true, null);
    } catch (Exception e) {
        logger.error(e);
    }
    if (leader != null) {
        // a leader already exists, so become a follower
        following();
    } else {
        String newLeader = null;
        try {
            // try to become the leader by creating the ephemeral /leader node
            byte[] localhost = InetAddress.getLocalHost().getAddress();
            newLeader = zk.create(root + "/leader", localhost,
                    ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL);
        } catch (Exception e) {
            logger.error(e);
        }
        if (newLeader != null) {
            leading();
        } else {
            // another server won the race; wait for the watch to fire
            mutex.wait();
        }
    }
}
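Listing 3 above races to create a single /leader node. A sketch of the EPHEMERAL_SEQUENTIAL variant described earlier, with an assumed /election parent node and illustrative names, might look like:

import java.util.Collections;
import java.util.List;
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

// Sketch of the EPHEMERAL_SEQUENTIAL election described above; "/election" is
// an assumed, pre-existing parent node and "zk" an already connected handle.
public class SequentialElectionSketch {
    // Returns true if this server currently owns the smallest sequence number
    // and should therefore act as Master.
    static boolean runForLeader(ZooKeeper zk) throws KeeperException, InterruptedException {
        String myPath = zk.create("/election/n-", new byte[0],
                ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL_SEQUENTIAL);
        String myNode = myPath.substring("/election/".length());
        // watch=true: the default Watcher is told when the children change,
        // e.g. when the current Master's ephemeral node disappears.
        List<String> candidates = zk.getChildren("/election", true);
        Collections.sort(candidates);
        return myNode.equals(candidates.get(0));
    }
}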

 

Shared lock (Locks)

A shared lock is easy to implement within a single process, but hard to implement across processes or between different servers. With ZooKeeper, however, this feature is easy to implement: a server that wants to acquire the lock creates an EPHEMERAL_SEQUENTIAL directory node, then calls getChildren to fetch the current list of nodes in that directory and checks whether the node it created has the smallest number. If so, it has obtained the lock; if not, it calls exists(String path, boolean watch) to watch for changes to the node list on ZooKeeper, repeating until the node it created is the smallest in the list, at which point it obtains the lock. Releasing the lock is simple: the server just deletes the node it created earlier.

Figure 4. Flowchart of locks implemented with ZooKeeper

The key code of the synchronization lock is shown below; see the appendix for the complete code:

Listing 4. Synchronization lock key code

void getLock() throws KeeperException, InterruptedException {
    List<String> list = zk.getChildren(root, false);
    String[] nodes = list.toArray(new String[list.size()]);
    Arrays.sort(nodes);
    // if the node we created has the smallest number, we hold the lock
    if (myZnode.equals(root + "/" + nodes[0])) {
        doAction();
    } else {
        // otherwise wait for the current smallest node to go away
        waitForLock(nodes[0]);
    }
}

void waitForLock(String lower) throws InterruptedException, KeeperException {
    // exists() with watch=true registers a watch that fires when the node changes
    Stat stat = zk.exists(root + "/" + lower, true);
    if (stat != null) {
        mutex.wait();
    } else {
        getLock();
    }
}
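Listing 4 assumes that myZnode has already been created. A hedged sketch of how the lock node might be created before getLock() and deleted afterwards (the node prefix and method names are assumptions, not from the article) is:

import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

// Sketch of the surrounding steps assumed by Listing 4: the node name prefix
// "lock-" and the method names are illustrative, not from the article.
public class LockNodeSketch {
    // Create this client's EPHEMERAL_SEQUENTIAL node under the lock directory
    // before calling getLock(); the node vanishes if the client crashes.
    static String createLockNode(ZooKeeper zk, String root)
            throws KeeperException, InterruptedException {
        return zk.create(root + "/lock-", new byte[0],
                ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL_SEQUENTIAL);
    }

    // Releasing the lock is just deleting the node we created; -1 means any version.
    static void releaseLock(ZooKeeper zk, String myZnode)
            throws KeeperException, InterruptedException {
        zk.delete(myZnode, -1);
    }
}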

 

Queue Management

ZooKeeper can handle two types of queues:

1. A queue that becomes available only once all of its members have arrived; until then it keeps waiting for the remaining members. This is a synchronization queue.

2. A queue whose enqueue and dequeue operations follow FIFO order, for example to implement the producer/consumer model.

The idea for implementing the synchronization queue with ZooKeeper is as follows:

Create a parent directory /synchronizing, and have every member watch (Set Watch) whether the flag node /synchronizing/start exists. Each member then joins the queue by creating an ephemeral directory node /synchronizing/member_i, and afterwards fetches all directory nodes under /synchronizing, i.e. the member_i nodes. It checks whether the count of member nodes already equals the expected number of members: if it is still smaller, it waits for /synchronizing/start to appear; if they are equal, it creates /synchronizing/start.

The following flowchart makes this easier to understand:

Figure 5. Synchronization queue flowchart

The key code of the synchronization queue is shown below; see the appendix for the complete code:

Listing 5. Synchronization queue

void addQueue() throws KeeperException, InterruptedException {
    // watch for the creation of the "start" flag node
    zk.exists(root + "/start", true);
    // join the queue with an ephemeral, numbered member node
    zk.create(root + "/" + name, new byte[0], Ids.OPEN_ACL_UNSAFE,
            CreateMode.EPHEMERAL_SEQUENTIAL);
    synchronized (mutex) {
        List<String> list = zk.getChildren(root, false);
        if (list.size() < size) {
            // not all members have arrived yet
            mutex.wait();
        } else {
            // the last member to arrive creates the "start" flag
            zk.create(root + "/start", new byte[0], Ids.OPEN_ACL_UNSAFE,
                    CreateMode.PERSISTENT);
        }
    }
}

 

When the queue is not yet full, the member enters wait() and keeps waiting for the Watch notification. The Watch code is as follows:

public void process(WatchedEvent event) {
    // the "start" node has been created: all members have arrived
    if (event.getPath().equals(root + "/start") &&
            event.getType() == Event.EventType.NodeCreated) {
        System.out.println("Notification received");
        super.process(event);
        doAction();
    }
}

 

The idea for implementing a FIFO queue with ZooKeeper is as follows:

The idea is also very simple: create SEQUENTIAL child nodes /queue_i under a specific directory, which guarantees that every member joining the queue gets a number. To dequeue, call getChildren() to obtain all elements currently in the queue and consume the one with the smallest number, which guarantees FIFO order.

Below is sample code for this producer/consumer style of queue; see the appendix for the complete code:

Listing 6. Producer code

boolean produce(int i) throws KeeperException, InterruptedException {
    ByteBuffer b = ByteBuffer.allocate(4);
    byte[] value;
    b.putInt(i);
    value = b.array();
    // each element gets an increasing sequence number appended to "element"
    zk.create(root + "/element", value, ZooDefs.Ids.OPEN_ACL_UNSAFE,
            CreateMode.PERSISTENT_SEQUENTIAL);
    return true;
}

 

Listing 7. Consumer code

int consume() throws KeeperException, InterruptedException {
    int retvalue = -1;
    Stat stat = null;
    while (true) {
        synchronized (mutex) {
            List<String> list = zk.getChildren(root, true);
            if (list.size() == 0) {
                // queue is empty: wait for the watch to report a new element
                mutex.wait();
            } else {
                // find the element with the smallest sequence number;
                // keep the full child name so the node path is correct
                String minNode = list.get(0);
                int min = Integer.parseInt(minNode.substring(7));
                for (String s : list) {
                    int tempValue = Integer.parseInt(s.substring(7));
                    if (tempValue < min) {
                        min = tempValue;
                        minNode = s;
                    }
                }
                byte[] b = zk.getData(root + "/" + minNode, false, stat);
                zk.delete(root + "/" + minNode, 0);
                ByteBuffer buffer = ByteBuffer.wrap(b);
                retvalue = buffer.getInt();
                return retvalue;
            }
        }
    }
}

Summary

ZooKeeper, as a subproject of the Hadoop project, is an indispensable module for Hadoop cluster management. It is mainly used to control the data in the cluster: for example, it manages the NameNode in a Hadoop cluster, and handles Master Election and state synchronization between servers in HBase.

This article has introduced the basics of ZooKeeper and described several typical application scenarios. These are ZooKeeper's basic capabilities; most importantly, ZooKeeper provides a good mechanism for distributed cluster management. Based on its hierarchical, directory-tree-like data structure and its effective management of the nodes in that tree, you can design all kinds of distributed data-management models, not just the common scenarios mentioned above.
