[Repost] Learning ZooKeeper

First, what is ZooKeeper

The official description: ZooKeeper is a distributed service framework that started as a subproject of Apache Hadoop. It is mainly used to solve data management problems that distributed applications often run into, such as a unified naming service, state synchronization, cluster management, and the management of distributed application configuration items.

That is quite abstract, so let's take a different approach: first look at what features it provides, then look at what those features can be used for.

 

Second, what ZooKeeper provides

Simply put, ZooKeeper = file system + notification mechanism.

1. File system

ZooKeeper maintains a data structure similar to a file system:

 

Each subdirectory entry, such as NameService, is called a znode. As with a file system, we can freely add and delete znodes, and add and delete child znodes under a znode. The only difference is that a znode can also store data.

There are four types of znode:

1. PERSISTENT: persistent directory node

After the client disconnects from ZooKeeper, the node still exists.

2. PERSISTENT_SEQUENTIAL: persistent, sequentially numbered directory node

After the client disconnects from ZooKeeper, the node still exists, and ZooKeeper appends a sequential number to the node name.

3. EPHEMERAL: ephemeral (temporary) directory node

After the client disconnects from ZooKeeper (that is, once its session ends), the node is deleted.

4. EPHEMERAL_SEQUENTIAL: ephemeral, sequentially numbered directory node

After the client disconnects from ZooKeeper (that is, once its session ends), the node is deleted, and ZooKeeper appends a sequential number to the node name.
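To make the four node types concrete, here is a minimal in-memory sketch. It is a toy model, not the ZooKeeper client API: the class ToyZnodeStore, its method names, and the paths are all made up for illustration. It uses one shared counter with a zero-padded suffix for sequential nodes and does not check that parent nodes exist; both are simplifications.

```python
from itertools import count


class ToyZnodeStore:
    """In-memory stand-in for a znode tree (hypothetical, not a real client)."""

    def __init__(self):
        self.nodes = {}       # path -> (data, owning session or None)
        self._seq = count()   # single shared counter, a simplification

    def create(self, path, data=b"", session=None, ephemeral=False, sequential=False):
        if ephemeral and session is None:
            raise ValueError("an ephemeral node needs an owning session")
        if sequential:
            # Sequential modes append a zero-padded number to the requested name.
            path = f"{path}{next(self._seq):010d}"
        if path in self.nodes:
            raise KeyError(f"node already exists: {path}")
        self.nodes[path] = (data, session if ephemeral else None)
        return path

    def close_session(self, session):
        # Ephemeral nodes disappear together with the session that created them.
        for path in [p for p, (_, owner) in self.nodes.items() if owner == session]:
            del self.nodes[path]


store = ToyZnodeStore()
store.create("/app/config", b"v1")                                  # PERSISTENT
a = store.create("/app/task-", sequential=True)                     # PERSISTENT_SEQUENTIAL
store.create("/app/worker", session="s1", ephemeral=True)           # EPHEMERAL
b = store.create("/app/member-", session="s1", ephemeral=True,
                 sequential=True)                                   # EPHEMERAL_SEQUENTIAL
store.close_session("s1")   # only the two persistent nodes survive
```

After close_session("s1"), only /app/config and the numbered task node remain in store.nodes, which is the behavior the list above describes.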

 

2. Notification mechanism

A client registers a watch on the directory nodes it cares about. When such a node changes (its data is modified, it is deleted, or a child node is added or removed), ZooKeeper notifies the client.
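The register-then-notify flow can be sketched with a toy node that keeps a list of callbacks. ToyNode and its methods are hypothetical names, not ZooKeeper's API. The sketch assumes the one-shot behavior of ZooKeeper's standard watches (a watch fires once and must be registered again), and it merges data watches and child watches into one list for brevity.

```python
class ToyNode:
    """A single watched node; a toy model, not a ZooKeeper client."""

    def __init__(self, data=b""):
        self.data = data
        self.children = set()
        self._watchers = []

    def watch(self, callback):
        # The client states its interest in this node.
        self._watchers.append(callback)

    def _fire(self, event):
        # Deliver the event to each registered watcher once, then forget them.
        pending, self._watchers = self._watchers, []
        for callback in pending:
            callback(event)

    def set_data(self, data):
        self.data = data
        self._fire("data changed")

    def add_child(self, name):
        self.children.add(name)
        self._fire("child added")

    def remove_child(self, name):
        self.children.discard(name)
        self._fire("child removed")


events = []
node = ToyNode(b"v1")
node.watch(events.append)
node.set_data(b"v2")       # fires the watch
node.add_child("worker")   # no watcher is registered any more, nothing recorded
```

Because the watch is consumed by the first change, a client that wants continuous updates has to register again each time it is notified.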

 

It's that simple. Now let's see what we can do with it.

 

Third, what we can do with ZooKeeper

1. Naming Service

This seems to be the simplest: create a directory in ZooKeeper's file system and you have a unique path. When we use tborg and cannot determine which machine the upstream program will be deployed on, the upstream program can agree on a path with the downstream program, and through that path the two can discover each other. Be there or be square.

 

2. Configuration management

Programs always need configuration. If a program is deployed across multiple machines, changing the configuration machine by machine becomes difficult. Instead, put all of this configuration into ZooKeeper, stored in some directory node, and have every relevant application watch that node. Once the configuration changes, each application receives a notification from ZooKeeper, then fetches the new configuration from ZooKeeper and applies it to the system.
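Here is a rough sketch of that read, watch, and reload loop, again as an in-memory toy rather than real ZooKeeper calls. ToyConfigNode, App, the application names, and the timeout setting are all invented for the example. It assumes a watch is consumed when it fires, so each application registers again while it re-reads.

```python
class ToyConfigNode:
    """Holds one configuration value and notifies watchers on change (toy model)."""

    def __init__(self, value):
        self._value = value
        self._watchers = []

    def get(self, watcher):
        # Reading and registering a watcher happen together.
        self._watchers.append(watcher)
        return self._value

    def set(self, value):
        self._value = value
        pending, self._watchers = self._watchers, []
        for notify in pending:
            notify()


class App:
    """An application instance that keeps a local copy of the configuration."""

    def __init__(self, name, node):
        self.name = name
        self.node = node
        self.config = None
        self.reload()

    def reload(self):
        # Fetch the current value and re-register for the next change.
        self.config = self.node.get(self.reload)


node = ToyConfigNode({"timeout": 30})
apps = [App("web-1", node), App("web-2", node)]
node.set({"timeout": 60})   # every app reloads
node.set({"timeout": 90})   # and again, because each reload re-registered
```

Changing the value once in the shared node updates every application's local copy, which is the point of keeping configuration in one place.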

 

3. Cluster management

Cluster management comes down to two things: knowing when machines leave or join, and electing a master.

On the first point, all machines agree to create an ephemeral directory node under the parent directory GroupMembers, and then watch the parent node for child-change messages. Once a machine goes down, its connection to ZooKeeper is broken and the ephemeral node it created is deleted. All the other machines are notified that a sibling directory has been removed, so everyone knows that the machine is gone. A new machine joining works the same way: all machines are notified that a new sibling directory has been added, and the headcount is back up.

For the second point, we change this slightly: all machines create ephemeral, sequentially numbered directory nodes, and each time the machine with the smallest number is chosen as the master.
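Both ideas reduce to simple operations on the list of child node names. The sketch below assumes names of the form member-<number>, which is a made-up convention for this example; the helper functions are hypothetical and work on plain Python sets instead of talking to ZooKeeper.

```python
def membership_change(before, after):
    """Compare two snapshots of child names and return (joined, left)."""
    return sorted(after - before), sorted(before - after)


def current_master(member_nodes):
    """Pick the node with the smallest sequence suffix as the master."""
    return min(member_nodes, key=lambda name: int(name.rsplit("-", 1)[1]))


members = {"member-0000000003", "member-0000000001", "member-0000000002"}
master = current_master(members)

# The master's session ends, so its ephemeral node disappears.
survivors = members - {master}
joined, left = membership_change(members, survivors)
next_master = current_master(survivors)
```

When the current master's node disappears, every remaining machine computes the same new minimum, so they agree on the next master without talking to each other directly.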

 

4. Distributed lock

With the consistent file system that ZooKeeper provides, locking becomes easier. Lock services fall into two categories: one maintains exclusivity, the other controls ordering.

For the first category, we treat a znode on ZooKeeper as a lock, implemented by creating the znode. All clients try to create the /distribute_lock node, and the client that finally succeeds in creating it owns the lock. As the sign in the restroom says: flush when you come, flush when you go. When you are done, delete the distribute_lock node you created and the lock is released.

For the second category, /distribute_lock already exists in advance, and all clients create ephemeral, sequentially numbered directory nodes under it. As with master election, the client with the smallest number gets the lock and deletes its node when done, so everyone takes a turn in order.
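The two lock styles can be compared side by side in a small in-memory sketch. ToyLockService and its methods are hypothetical and only imitate the rules described above: the first successful create wins the exclusive lock, and the smallest sequence number holds the ordered lock. It leaves out sessions, watches, and failure handling.

```python
class ToyLockService:
    """Imitates both lock styles in memory; not a ZooKeeper client."""

    def __init__(self):
        self.exclusive_owner = None   # stands in for the /distribute_lock node
        self._next_seq = 0
        self.waiters = {}             # sequence number -> client name

    def try_exclusive(self, client):
        # Like creating /distribute_lock: only the first create succeeds.
        if self.exclusive_owner is None:
            self.exclusive_owner = client
            return True
        return False

    def release_exclusive(self, client):
        # Like deleting the node you created.
        if self.exclusive_owner == client:
            self.exclusive_owner = None

    def enqueue(self, client):
        # Like creating an ephemeral sequential child under /distribute_lock.
        seq = self._next_seq
        self._next_seq += 1
        self.waiters[seq] = client
        return seq

    def holder(self):
        # The client with the smallest number holds the lock.
        return self.waiters[min(self.waiters)] if self.waiters else None

    def release(self, seq):
        self.waiters.pop(seq, None)


svc = ToyLockService()
first = svc.try_exclusive("A")    # A wins the race
second = svc.try_exclusive("B")   # B fails while A holds the lock
svc.release_exclusive("A")
third = svc.try_exclusive("B")    # now B can take it

seq_a = svc.enqueue("A")
seq_b = svc.enqueue("B")
holder_before = svc.holder()
svc.release(seq_a)
holder_after = svc.holder()
```

The exclusive style makes losers retry, while the ordered style gives each client a fixed place in line, which is why the second style is described as controlling ordering.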

 

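5. Queue management

There are two types of queues:

1. Synchronous queue: the queue becomes usable only when all of its members have gathered; until then it keeps waiting for all members to arrive.

2. FIFO queue: items are enqueued and dequeued in FIFO order.

For the first type, create ephemeral directory nodes under an agreed directory and watch whether the number of nodes has reached the number we require.

For the second type, the basic principle is the same as the ordering-control scenario in the distributed lock service: items get a number when they are enqueued and are dequeued by number.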
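Both queue types can be sketched in a few lines of in-memory Python. ToyBarrier and ToyFifoQueue are hypothetical names for illustration only: the first counts members against a required number, the second hands out increasing numbers on enqueue and always removes the smallest one.

```python
class ToyBarrier:
    """Synchronous queue: usable only once enough members have arrived."""

    def __init__(self, required):
        self.required = required
        self.members = set()

    def join(self, member):
        # Like creating an ephemeral node under the agreed directory.
        self.members.add(member)
        return self.ready()

    def ready(self):
        # Like checking whether the child count has reached the target.
        return len(self.members) >= self.required


class ToyFifoQueue:
    """FIFO queue: numbered on the way in, taken out by smallest number."""

    def __init__(self):
        self._next_seq = 0
        self._items = {}   # sequence number -> item

    def put(self, item):
        self._items[self._next_seq] = item
        self._next_seq += 1

    def take(self):
        if not self._items:
            raise IndexError("queue is empty")
        return self._items.pop(min(self._items))


barrier = ToyBarrier(required=3)
states = [barrier.join(name) for name in ("w1", "w2", "w3")]

queue = ToyFifoQueue()
for job in ("job-a", "job-b", "job-c"):
    queue.put(job)
taken = [queue.take(), queue.take()]
```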

 

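Now we finally understand what we can do with ZooKeeper. But as programmers, we are always eager to know how ZooKeeper actually manages it. Maintaining a file system on a single node is not hard, but having a cluster maintain a file system while keeping the data consistent is very difficult.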

 

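Fourth, distribution and data replication

ZooKeeper provides a consistent data service as a cluster, so naturally it has to replicate data across all of its machines. The benefits of data replication are:

1. Fault tolerance
If one node fails, the whole system does not stop working; other nodes can take over its work.

2. Better scalability
Load can be spread across multiple nodes, or nodes can be added to increase the load the system can handle.

3. Better performance
Clients can access a nearby node locally, which speeds up user access.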

 

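In terms of how transparent read and write access is to the client, replicated cluster systems fall into two kinds:

1. Write Master
Modifications to data are submitted to a designated node. Reads have no such restriction and can go to any node. In this case the client has to distinguish between reads and writes, which is commonly known as read/write splitting.

2. Write Any
Modifications to data can be submitted to any node, just like reads. In this case the roles of the cluster nodes, and any changes to them, are transparent to the client.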

 

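ZooKeeper takes the write-any approach. By adding machines, its read throughput and read responsiveness scale very well. For writes, throughput is bound to drop as machines are added (which is also why it introduced observers), while responsiveness depends on the concrete implementation: either delayed replication with eventual consistency, or immediate replication with fast response.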

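Our focus remains how to guarantee that data is consistent across all machines in the cluster, and that brings us to the Paxos algorithm. (ZooKeeper's own protocol is Zab, an atomic broadcast protocol in the same family; the ideas below are the Paxos-style intuition behind it.)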

 

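Fifth, data consistency and the Paxos algorithm

It is said that the Paxos algorithm is as admired for being hard to understand as it is for being famous, so let's first look at how to keep data consistent. The principle here is:

In a distributed database system, if all nodes start from the same initial state and every node executes the same sequence of operations, then they end up in the same consistent state.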
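This principle is easy to check with a tiny example. The sketch below assumes deterministic operations of the form (key, value), which is a made-up format for illustration. Two replicas that apply the same list in the same order agree; a replica that applies the same operations in a different order can end up somewhere else.

```python
def apply_ops(initial, ops):
    """Apply a sequence of (key, value) writes to a copy of the initial state."""
    state = dict(initial)
    for key, value in ops:
        state[key] = value
    return state


ops = [("x", 1), ("y", 2), ("x", 3)]

replica_a = apply_ops({}, ops)
replica_b = apply_ops({}, ops)                   # same order, same result
replica_c = apply_ops({}, list(reversed(ops)))   # same writes, different order
```

Here replica_a and replica_b both end with x equal to 3, while replica_c ends with x equal to 1, so agreeing on the order of writes is the real problem to solve.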

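So what problem does Paxos solve? It guarantees that every node executes the same sequence of operations. Well, isn't that simple: have a master maintain a global write queue, and require every write operation to be put into this queue and numbered. Then no matter how many nodes we write to, as long as writes are applied in number order, consistency is guaranteed. That's right, but what if the master goes down?

Paxos numbers write operations globally by voting. At any moment only one write operation is approved. Concurrent writes have to compete for votes, and only a write that obtains more than half of the votes is approved (so only one write can ever be approved at a time). Writes that lose the competition have to start another round of voting. In this way, through vote after vote, day after day and year after year, all write operations are strictly numbered and ordered. The numbers strictly increase: when a node has accepted a write numbered 100 and afterwards receives a write numbered 99 (because of network delay or many other unforeseeable reasons), it immediately realizes that its data is inconsistent, automatically stops serving requests, and restarts the synchronization process. The failure of any single node does not affect the data consistency of the whole cluster (with 2n+1 machines in total, unless more than n of them go down).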
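The counting behind "more than half" and "2n+1 machines tolerate n failures" can be written down directly. The sketch below is only that arithmetic plus the stale-number check from the paragraph above. It is not an implementation of Paxos, and the names majority, can_make_progress, and ToyReplica are hypothetical.

```python
def majority(cluster_size):
    """Smallest number of votes that is more than half of the cluster."""
    return cluster_size // 2 + 1


def can_make_progress(cluster_size, failed):
    """True if the surviving machines can still form a majority."""
    return cluster_size - failed >= majority(cluster_size)


class ToyReplica:
    """Accepts strictly increasing write numbers and stops on a stale one."""

    def __init__(self):
        self.last_applied = 0
        self.serving = True

    def accept(self, number):
        if number <= self.last_applied:
            # A stale number means this replica is out of step: stop and resync.
            self.serving = False
            return False
        self.last_applied = number
        return True


n = 2
cluster = 2 * n + 1                           # five machines
quorum = majority(cluster)                    # three votes needed
ok_with_n_down = can_make_progress(cluster, n)
ok_with_more_down = can_make_progress(cluster, n + 1)

replica = ToyReplica()
accepted_100 = replica.accept(100)
accepted_99 = replica.accept(99)              # arrives late, detected as stale
```

Two majorities of 2n+1 machines always share at least one machine, since (n+1) + (n+1) is greater than 2n+1, which is why two competing writes cannot both win the same round.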

To sum up, how is data consistency guaranteed? It is voted into being, and so is happiness.

References (images and parts of the text were copied directly from these):

http://blog.csdn.net/chen77716/article/details/6166675

http://blog.sina.com.cn/s/blog_5374d6e30100sn4l.html

http://rdc.taobao.com/team/jm/archives/448

http://www.ibm.com/developerworks/cn/opensource/os-cn-zookeeper/

 

 

Reposted from: https://www.cnblogs.com/aspnethot/articles/3533956.html

Origin www.cnblogs.com/iCanhua/p/11331458.html