Repost: Setting up a ZooKeeper environment

 

  Anyone doing distributed development needs to understand ZooKeeper. From the distributed messaging service Kafka to HBase, Hadoop and other distributed big-data systems, all of them use ZooKeeper, so it is covered here as a foundation.

  ZooKeeper is a distributed service framework, mainly used to solve data-management problems that distributed applications commonly run into, such as a unified naming service, state synchronization, cluster management, and management of distributed application configuration.

  At the core of ZooKeeper is an atomic broadcast mechanism that keeps the servers synchronized. The protocol that implements it is called Zab.
  Zab has two modes: recovery mode (leader election) and broadcast mode (synchronization). Zab enters recovery mode when the service starts or after the leader crashes. Recovery mode ends once a leader has been elected and a majority of servers have finished synchronizing their state with the leader.
  State synchronization ensures that the leader and the other servers hold the same system state. To guarantee the sequential consistency of transactions, ZooKeeper identifies each transaction with an increasing transaction id (zxid).
  Every proposal carries a zxid. In the implementation the zxid is a 64-bit number. Its high 32 bits are the epoch, which identifies whether the leadership has changed: each newly elected leader gets a new epoch that marks its term. The low 32 bits are a counter that increases with each transaction.
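
  The epoch/counter layout described above can be tried out with plain shell arithmetic. This is only a sketch: the function names zxid_epoch and zxid_counter are made up for illustration and are not part of ZooKeeper, and the shell is assumed to use 64-bit integers.

```shell
# Split a 64-bit zxid into its epoch (high 32 bits) and counter (low 32 bits).
# Function names are illustrative, not part of ZooKeeper.
zxid_epoch() {
    echo $(( $1 >> 32 ))
}

zxid_counter() {
    echo $(( $1 & 4294967295 ))   # 4294967295 = 2^32 - 1
}

# Example value: the 7th transaction under epoch 5.
zxid=$(( (5 << 32) + 7 ))
echo "zxid=$zxid epoch=$(zxid_epoch "$zxid") counter=$(zxid_counter "$zxid")"
```

  Because the epoch sits in the high bits, any zxid from a newer epoch compares greater than every zxid from an older one, which is what keeps transactions ordered across leader changes.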
  While running, each server is in one of three states:
  LOOKING: the server does not know who the leader is and is searching for one.
  LEADING: the server is the elected leader.
  FOLLOWING: a leader has been elected and the server synchronizes with it.

 

  ZooKeeper has three installation modes: standalone mode, cluster mode, and pseudo-cluster mode.

Environment

  CentOS 7.0 (on Windows, use zkServer.cmd)

  Latest version of ZooKeeper

  Install as the root user (if ZooKeeper is used for HBase, change the ownership of all files to the hadoop user)

  Java environment, preferably the latest version.

  In a distributed deployment the machines must be able to reach each other: turn off the firewall or open the ports involved.

Download

  Download from the official website: http://zookeeper.apache.org/releases.html#download

  After downloading, put the archive into /usr/local/ on CentOS and extract it to /usr/local/zookeeper (for how to extract it, see the previous HAProxy installation article)

Install

Standalone mode

  Enter the conf subdirectory of the zookeeper directory and rename zoo_sample.cfg to zoo.cfg. ZooKeeper looks for zoo.cfg as its default configuration file when it starts:

mv /usr/local/zookeeper/conf/zoo_sample.cfg /usr/local/zookeeper/conf/zoo.cfg

  Configure zoo.cfg parameters

# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial 
# synchronization phase can take
initLimit = 10
# The number of ticks that can pass between 
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just 
# example sakes.
dataDir=/usr/local/zookeeper/data
dataLogDir=/usr/local/zookeeper/log
# the port at which the clients will connect
clientPort=2181
#
# Be sure to read the maintenance section of the 
# administrator guide before turning on autopurge.
#
#http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1

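Parameter description:

tickTime: in milliseconds. The interval at which heartbeats are exchanged between ZooKeeper servers, or between a client and a server; one heartbeat is sent every tickTime.
dataDir: the directory where ZooKeeper stores its data. By default, ZooKeeper also writes its transaction log files to this directory.
dataLogDir: the directory where ZooKeeper stores its transaction log files.
clientPort: the port clients use to connect to the ZooKeeper server. ZooKeeper listens on this port and accepts client requests.

  Then create the data and log folders configured above:

mkdir  /usr/local/zookeeper/data
mkdir  /usr/local/zookeeper/log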

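Start ZooKeeper

  First enter the /usr/local/zookeeper folder

cd /usr/local/zookeeper

  Then run

bin/zkServer.sh start

  To check whether it started successfully, run either of the following (on ZooKeeper 3.5 and later, stat may first have to be allowed through 4lw.commands.whitelist in zoo.cfg):

bin/zkCli.sh 
echo stat|nc localhost 2181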

 

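 Pseudo-cluster mode

  A pseudo-cluster means starting several ZooKeeper processes on a single machine and having them form a cluster. The example below starts 3 ZooKeeper processes to simulate 3 machines.
  Make two more copies of the zookeeper directory (zookeeper1 and zookeeper2 below).
  zookeeper/conf/zoo.cfg is the same as in standalone mode, except that its content is changed to the following: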

tickTime=2000 
initLimit=5 
syncLimit=2 
dataDir=/usr/local/zookeeper/data
dataLogDir=/usr/local/zookeeper/log
clientPort=2180
server.0=127.0.0.1:2888:3888
server.1=127.0.0.1:2889:3889 
server.2=127.0.0.1:2890:3890

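  Several parameters are new. Their meanings are as follows:

1 initLimit: a ZooKeeper cluster contains multiple servers, one of which is the leader while the rest are followers. initLimit sets the maximum time, in ticks, that a follower may take to connect to the leader and complete its initial synchronization. It is set to 5 here, so the limit is 5 times tickTime, i.e. 5*2000=10000ms=10s.
2 syncLimit: the maximum time allowed for a request and its reply when the leader and a follower exchange messages. It is set to 2 here, so the limit is 2 times tickTime, i.e. 4000ms.
3 server.X=A:B:C: X is a number that identifies the server. A is the IP address of that server. B is the port the server uses to exchange messages with the leader. C is the port used for leader election. Because this is a pseudo-cluster, all servers share one IP address, so B and C must be different for each server.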
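
  To double-check the arithmetic for a given file, the two limits can be converted to milliseconds with a few lines of awk. This is a minimal sketch: zk_timeouts is a hypothetical helper name, and it assumes plain key=value lines as in the configuration shown above.

```shell
# Print initLimit and syncLimit in milliseconds for a given zoo.cfg.
# zk_timeouts is an illustrative helper, not a ZooKeeper tool.
zk_timeouts() {
    awk -F= '
        {
            key = $1; val = $2
            gsub(/[ \t]/, "", key)      # tolerate "initLimit = 10"
            gsub(/[ \t]/, "", val)
        }
        key == "tickTime"  { tick = val }
        key == "initLimit" { initl = val }
        key == "syncLimit" { syncl = val }
        END {
            printf "initLimit=%dms syncLimit=%dms\n", initl * tick, syncl * tick
        }
    ' "$1"
}

# Try it on a scratch file holding the values used in this section.
cfg=$(mktemp)
printf 'tickTime=2000\ninitLimit=5\nsyncLimit=2\n' > "$cfg"
zk_timeouts "$cfg"
rm -f "$cfg"
```

  Commented-out lines are ignored because their key starts with #, so the helper can also be pointed at a real conf/zoo.cfg.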
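Using zookeeper/conf/zoo.cfg as a template, configure zookeeper1/conf/zoo.cfg and zookeeper2/conf/zoo.cfg. Only dataDir, dataLogDir and clientPort need to change, and each instance must use its own values.

In each dataDir configured above, create a file named myid and write a single number into it. The number identifies the server and must match the X of its server.X entry in zoo.cfg.
Write 0 into /usr/local/zookeeper/data/myid, 1 into /usr/local/zookeeper1/data/myid, and 2 into /usr/local/zookeeper2/data/myid.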
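
  The per-instance files can also be generated by a small script instead of being edited by hand. The sketch below makes some assumptions that are not in the original text: the helper name make_pseudo_cluster is made up, the client ports 2180, 2181 and 2182 are chosen for illustration, and the files are written to a scratch directory rather than /usr/local.

```shell
# Generate conf/zoo.cfg and data/myid for three local instances under a base dir.
# make_pseudo_cluster is an illustrative helper; client ports 2180-2182 are assumed.
make_pseudo_cluster() {
    base=$1
    id=0
    for name in zookeeper zookeeper1 zookeeper2; do
        dir="$base/$name"
        mkdir -p "$dir/conf" "$dir/data" "$dir/log"
        # Same server list everywhere; only dataDir, dataLogDir, clientPort differ.
        cat > "$dir/conf/zoo.cfg" <<EOF
tickTime=2000
initLimit=5
syncLimit=2
dataDir=$dir/data
dataLogDir=$dir/log
clientPort=$(( 2180 + id ))
server.0=127.0.0.1:2888:3888
server.1=127.0.0.1:2889:3889
server.2=127.0.0.1:2890:3890
EOF
        echo "$id" > "$dir/data/myid"   # must match X in server.X
        id=$(( id + 1 ))
    done
}

# Try it in a scratch directory and inspect the second instance.
scratch=$(mktemp -d)
make_pseudo_cluster "$scratch"
cat "$scratch/zookeeper1/data/myid"
grep clientPort "$scratch/zookeeper1/conf/zoo.cfg"
```

  Calling make_pseudo_cluster /usr/local would produce the layout used in this article, but only the configuration and myid files; the ZooKeeper binaries still have to be copied into each directory.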

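  Enter each of the three directories /usr/local/zookeeper, /usr/local/zookeeper1 and /usr/local/zookeeper2 and start the server. The start command is the same as in standalone mode.

bin/zkServer.sh start

  Check each instance in turn, using the clientPort from its own zoo.cfg (2180 for the first instance above):

bin/zkCli.sh -server localhost:2180
echo stat|nc localhost 2180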

 

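Cluster mode

  The configuration for cluster mode is essentially the same as for the pseudo-cluster.
  Because the servers run on different machines in cluster mode, the conf/zoo.cfg file can be identical on every server.
  Here is an example: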

tickTime=2000 
initLimit=5 
syncLimit=2 
dataDir=/usr/local/zookeeper/data
dataLogDir=/usr/local/zookeeper/log
clientPort=2180
server.0=192.168.80.30:2888:3888
server.1=192.168.80.31:2888:3888
server.2=192.168.80.32:2888:3888

  The example deploys 3 ZooKeeper servers, on 192.168.80.30, 192.168.80.31 and 192.168.80.32.

  Note that the number in the myid file under each server's dataDir must be different: myid is 0 on 192.168.80.30, 1 on 192.168.80.31, and 2 on 192.168.80.32.
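
  Since the shared zoo.cfg already maps every address to its number, the right myid for a machine can be looked up from it. A minimal sketch, assuming server.X=A:B:C lines as shown above; myid_for_host is a hypothetical helper, not something ZooKeeper ships.

```shell
# Print the myid that belongs to a host, based on the server.X lines of a zoo.cfg.
# myid_for_host is an illustrative helper, not a ZooKeeper tool.
myid_for_host() {
    awk -F= -v host="$2" '
        $1 ~ /^server\.[0-9]+$/ {
            split($2, parts, ":")          # parts[1] is the address A in A:B:C
            if (parts[1] == host) {
                sub(/^server\./, "", $1)   # keep only X
                print $1
            }
        }
    ' "$1"
}

# Try it on a scratch copy of the example server list.
cfg=$(mktemp)
printf '%s\n' \
    'server.0=192.168.80.30:2888:3888' \
    'server.1=192.168.80.31:2888:3888' \
    'server.2=192.168.80.32:2888:3888' > "$cfg"
myid_for_host "$cfg" 192.168.80.31
rm -f "$cfg"
```

  On a real machine the result would be redirected into the myid file, for example myid_for_host conf/zoo.cfg 192.168.80.31 > /usr/local/zookeeper/data/myid.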

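  On each machine, enter /usr/local/zookeeper and start the server. The start command is the same as in standalone mode.

bin/zkServer.sh start

  Check each server in turn (the port is the clientPort configured above):

bin/zkCli.sh -server localhost:2180
echo stat|nc localhost 2180

   You may see a lot of errors at this point. They are harmless: only 1 server of the cluster is up so far. When a ZooKeeper server starts, it sends leader-election requests to the servers listed in zoo.cfg and reports errors because it cannot reach the others. Once the second ZooKeeper instance is started, a leader is elected and the service becomes available, because with 3 machines any 2 are enough to elect a leader and serve clients (a cluster of 2n+1 machines tolerates the failure of n machines).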
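
  The majority rule in the last sentence is easy to tabulate. The helper names below are made up for illustration; the arithmetic is just "more than half".

```shell
# Majority needed for a cluster of N servers, and how many may fail.
# Helper names are illustrative.
quorum_size() {
    echo $(( $1 / 2 + 1 ))
}

tolerated_failures() {
    echo $(( $1 - ($1 / 2 + 1) ))
}

for n in 3 4 5; do
    echo "servers=$n quorum=$(quorum_size "$n") tolerated=$(tolerated_failures "$n")"
done
```

  Note that 4 servers tolerate no more failures than 3 do, which is why clusters are usually sized with an odd number of machines.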

 

ZooKeeper service commands

1. Start the ZK service: zkServer.sh start
2. Check the ZK service status: zkServer.sh status
3. Stop the ZK service: zkServer.sh stop
4. Restart the ZK service: zkServer.sh restart

 

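zk client commands:

  The ZooKeeper command-line tool resembles a Linux shell. It can be used to access ZooKeeper and to create and modify data.

  Run zkCli.sh -server 192.168.80.31:2180 to connect to the ZooKeeper service (the port is the clientPort from zoo.cfg). After a successful connection, the tool prints ZooKeeper's environment and configuration information. Some simple operations: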

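1. List the root directory: ls / (use ls to see what ZooKeeper currently contains)
2. List the root directory with details: ls2 / (also shows the node's stat data, such as the number of updates; newer releases use ls -s / instead)
3. Create a file with initial content: create /zk "test" (creates a new znode "zk" with an associated string)
4. Read a file: get /zk (confirms that the znode contains the string we created)
5. Modify a file: set /zk "zkbak" (sets the string associated with zk)
6. Delete a file: delete /zk (deletes the znode created above)
7. Quit the client: quit
8. Help: help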

 Extension

  From the commands above we can see that ZooKeeper uses a tree structure similar to a file system. Data can be attached to a node, and the node can be modified or deleted. We can also see that when a node changes, every live machine in the cluster is updated to the same data.

ZooKeeper's data model

  After this brief use of ZooKeeper, we find that its data model looks somewhat like an operating system's file structure, as shown in the figure below
[Figure: znode tree. Image: http://www.aboutyun.com/data/attachment/forum/201408/26/221909ijck0pk60je6z0j0.jpg]
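(1)     Each node in ZooKeeper is called a znode and is identified by a unique path; for example, the SERVER2 node is identified as /APP3/SERVER2
(2)     A znode can have child znodes and can store data, but EPHEMERAL znodes cannot have children
(3)     The data in a znode is versioned: every change to the data increases the znode's version number, and an update or delete can pass the expected version so that it only succeeds if the data has not changed in the meantime.
(4)     A znode can be ephemeral: once the client that created it loses contact with the server, the znode is deleted automatically. ZooKeeper clients and servers communicate over long-lived connections and keep them alive with heartbeats. This connection state is called a session; if a znode is ephemeral and its session expires, the znode is deleted
(5)     znode names can be numbered automatically: for a sequential znode, ZooKeeper appends an increasing, zero-padded counter to the requested name, so creating App twice gives two distinct names with consecutive numbers
(6)     A znode can be watched, including changes to the data stored in it and changes to its children. When a change happens, the clients that set the watch are notified. This is ZooKeeper's most important feature for applications; it makes it possible to implement centralized configuration management, cluster management, distributed locks, and so on.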
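
  The automatic numbering in point (5) can be imitated locally to see what the resulting names look like. ZooKeeper's sequential znodes use a 10-digit, zero-padded counter appended to the requested path; the function below only reproduces that format, it does not talk to a server, and its name is made up.

```shell
# Format the name a sequential znode would get: prefix + 10-digit zero-padded counter.
# This only imitates the naming scheme locally; it does not contact ZooKeeper.
sequential_name() {
    printf '%s%010d\n' "$1" "$2"
}

sequential_name /app 1
sequential_name /app 2
```

  Because the padded names sort in creation order, clients can list the children of a node and find the oldest one, which is the basis of the usual lock and queue recipes.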

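 Election process

  When the leader crashes or loses the majority of its followers, zk enters recovery mode. Recovery mode has to elect a new leader and bring all Servers back to a correct state. Zk has two election algorithms: one based on basic paxos, the other based on fast paxos. The default is fast paxos (FastLeaderElection in the code).
basic paxos process:
1. The election thread is the thread on the current Server that initiates the election. Its main job is to tally the votes and pick the recommended Server;
2. The election thread first sends a query to all Servers (including itself);
3. When a reply arrives, the election thread verifies that it answers its own query (by checking that the zxid matches), then takes the sender's id (myid) and stores it in the list of queried Servers, and finally takes the leader proposed by the sender (id, zxid) and stores it in the vote table for this round;
4. After replies from all Servers have arrived, it computes which Server has the largest zxid and sets that Server as the one to vote for in the next round;
5. The thread sets the Server with the largest zxid as the Leader recommended by the current Server. If the winning Server has received n/2 + 1 votes, the recommended leader is set to the winning Server and the current Server sets its own state according to the winner's information. Otherwise the process repeats until a leader is elected.
From this process we can conclude: for the Leader to be supported by a majority of Servers, the total number of Servers should be an odd number 2n+1, and at least n+1 Servers must be alive. Every Server repeats the process above after it starts. In recovery mode, a server that has just recovered from a crash or has just started also restores its data and session information from the snapshot on disk. zk records a transaction log and takes snapshots periodically so that state can be restored during recovery.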
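
  Step 4, picking the Server with the largest zxid, can be sketched with sort. This is a simplification for illustration, not ZooKeeper's election code: the input format of one "id zxid" pair per line and the name pick_leader are invented here, and breaking ties in favor of the larger id is an assumption that the description above does not state.

```shell
# Pick the preferred candidate from "id zxid" lines on stdin.
# Largest zxid wins; ties go to the larger id (assumed tie-break).
pick_leader() {
    sort -k2,2n -k1,1n | tail -n 1 | cut -d' ' -f1
}

# Servers 1 and 2 have seen the same latest transaction, server 0 lags behind.
printf '%s\n' '0 21474836487' '1 21474836490' '2 21474836490' | pick_leader
```

  Preferring the largest zxid means the new leader is a server that has seen the most recent transactions, so followers only ever need to catch up, never roll the leader forward.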

Application scenarios

  A naming service resolves a given name to the address of a resource or service and to information about its provider. With ZooKeeper it is easy to create a global path, and this path can serve as a name that points to a cluster, the address of a provided service, a remote object, and so on. Put simply, using ZooKeeper as a naming service means using the path as the name, while the data stored at the path is the entity the name refers to.

  Dubbo, the open-source distributed service framework from Alibaba Group, uses ZooKeeper as its naming service to maintain a global list of service addresses. In the Dubbo implementation:
When a service provider starts, it writes its own URL to the /dubbo/${serviceName}/providers directory on ZK, which completes the publication of the service.
When a service consumer starts, it subscribes to the provider URLs under /dubbo/${serviceName}/providers and writes its own URL to the /dubbo/${serviceName}/consumers directory.
  Note that all addresses registered with ZK are ephemeral nodes, so that service providers and consumers can detect changes automatically.
  In addition, Dubbo monitors at service granularity by subscribing to the information of all providers and consumers under the /dubbo/${serviceName} directory.
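
  The path convention described above is simple enough to write down as a helper. The function name dubbo_path and the service name com.example.DemoService are hypothetical and only serve to show how the two directories are derived from a service name.

```shell
# Build the registry directory for a service and a role (providers or consumers),
# following the /dubbo/${serviceName}/... layout described above.
# dubbo_path and the service name are hypothetical examples.
dubbo_path() {
    echo "/dubbo/$1/$2"
}

service=com.example.DemoService
dubbo_path "$service" providers
dubbo_path "$service" consumers
```

  A provider would create an ephemeral child under the first directory and a consumer would watch that same directory, so the name alone is enough for the two sides to find each other.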

 

 

Author: Huanzui 
public account [Daily life of a code farmer] Technical group: 319931204 No. 1 group: 437802986 No. 2 group: 340250479 
Source: http://zhangs1986.cnblogs.com/ 
The copyright of this article belongs to the author and cnblogs. Reposting is welcome, but unless the author agrees otherwise, this statement must be retained and a link to the original must be given in a prominent position on the article page; otherwise the author reserves the right to pursue legal liability.
