What? ZooKeeper | Cluster Installation

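ZooKeeper is an open-source distributed framework that provides basic services for coordinating distributed applications. It exposes a set of common services to applications, such as distributed synchronization, naming service, and group maintenance, which makes distributed applications easier to coordinate and manage while delivering high performance. ZooKeeper can be installed and run in standalone mode, but its real strength is the replicated ZooKeeper cluster (one Leader, multiple Followers), which keeps the cluster stable and available and so makes the distributed applications that depend on it reliable.

First, the test environment. It consists of the following three hosts:

Hostname   IP address      ZooKeeper server   myid
slave-01   10.73.150.4     server.1           1
slave-02   10.73.150.54    server.2           2
slave-03   10.73.150.197   server.3           3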

1. Mapping hostnames to IP addresses


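A ZooKeeper cluster has two key roles: Leader and Follower. All nodes in the cluster serve distributed applications as a single whole, and every node connects to every other node. When configuring a ZooKeeper cluster, the hostname-to-IP mapping on each node must therefore include entries for all the other nodes in the cluster.

For example, every node in my ZooKeeper cluster is configured the same way. Taking slave-01 as an example, /etc/hosts looks like this: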

[root@slave-01 ~]# cat /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6

10.73.150.4   slave-01
10.73.150.54  slave-02
10.73.150.197 slave-03

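ZooKeeper uses an algorithm called Leader election. While the cluster is running there is exactly one Leader and all other nodes are Followers; if the Leader fails, the cluster uses this algorithm to elect a new one. The nodes must therefore be able to reach one another, which is why the mapping above is required.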
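Before starting anything, it can help to confirm that each node name maps to the address you expect. The following is a minimal sketch, assuming a hosts-format file like the one above; `lookup_host` is a hypothetical helper written for this article, not something shipped with ZooKeeper.

```shell
# lookup_host FILE NAME
# Print the first IP address that a hosts-format FILE maps NAME to.
# Hypothetical helper for illustration only.
lookup_host() {
    awk -v name="$2" '
        /^[[:space:]]*#/ { next }          # skip whole-line comments
        {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^#/) break       # stop at a trailing comment
                if ($i == name) { print $1; exit }
            }
        }
    ' "$1"
}

# Example use on a node (not run here):
# for h in slave-01 slave-02 slave-03; do
#     echo "$h -> $(lookup_host /etc/hosts "$h")"
# done
```

Run on each of the three nodes, the loop in the comment should print the same three addresses as the table at the top of this article. If a name prints nothing, that node's /etc/hosts is missing the entry.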

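When the ZooKeeper cluster starts, it first elects a Leader; during Leader election, the node that meets the election algorithm's criteria becomes the Leader. For the overall architecture, see http://zookeeper.apache.org/doc/trunk/zookeeperOver.html#sc_designGoals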
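One detail worth making explicit: a Leader can only be elected while a majority of the configured servers can reach each other. The arithmetic is small enough to sketch; the function names below are illustrative and not part of ZooKeeper.

```shell
# quorum_size N: smallest number of servers that forms a majority of N.
quorum_size() {
    echo $(( $1 / 2 + 1 ))
}

# tolerated_failures N: how many of N servers can be down while a majority remains.
tolerated_failures() {
    echo $(( ($1 - 1) / 2 ))
}

echo "ensemble of 3: quorum=$(quorum_size 3), tolerated failures=$(tolerated_failures 3)"
```

For the three-node cluster in this article, two nodes are enough to elect a Leader and one node may be down. A fourth node would raise the quorum to three without tolerating any more failures, which is why odd cluster sizes are the usual choice.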

2. Editing the ZooKeeper configuration file


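For a single-node ZooKeeper installation, see: What? ZooKeeper | Installation on Linux

On one machine (slave-01), install ZooKeeper and edit the configuration file conf/zoo.cfg so that it contains the following: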

[root@slave-01 conf]# vim zoo.cfg 

tickTime=2000
dataDir=/usr/local/zookeeper-3.4.11/data
dataLogDir=/usr/local/zookeeper-3.4.11/log
clientPort=2181
initLimit=5
syncLimit=2
server.1=slave-01:2888:3888
server.2=slave-02:2888:3888
server.3=slave-03:2888:3888

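In each server.X line, port 2888 is the port Followers use to connect to the Leader (data transfer and broadcast), and port 3888 is used for Leader election.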

Note: create the directories named by dataDir and dataLogDir.
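The directories can be created by hand with `mkdir -p`, or read straight from the configuration file so the two cannot drift apart. Here is a minimal sketch, assuming plain `key=value` lines as in the zoo.cfg above; `cfg_value` and `make_zk_dirs` are hypothetical helper names.

```shell
# cfg_value FILE KEY: print the value of the first KEY=value line in FILE.
cfg_value() {
    awk -F= -v key="$2" '$1 == key { print $2; exit }' "$1"
}

# make_zk_dirs FILE: create the dataDir and dataLogDir named in FILE, if set.
make_zk_dirs() {
    for key in dataDir dataLogDir; do
        dir=$(cfg_value "$1" "$key")
        if [ -n "$dir" ]; then
            mkdir -p "$dir"
        fi
    done
}

# Example use on slave-01 (not run here):
# make_zk_dirs /usr/local/zookeeper-3.4.11/conf/zoo.cfg
```

In this setup both directories sit under the installation directory, so creating them before the copy in the next section means they arrive on the other nodes along with everything else.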

For an explanation of these settings, see http://zookeeper.apache.org/doc/trunk/zookeeperStarted.html#sc_RunningReplicatedZooKeeper

3. Copying the installation to the other nodes


ZooKeeper is now configured on slave-01. The configured installation directory can be copied to the same path on every other node in the cluster:

[root@slave-01 local]# pwd
/usr/local
[root@slave-01 local]# ll | grep "zookeeper"
drwxr-xr-x  12   502 games 4096 Nov 30 14:11 zookeeper-3.4.11
[root@slave-01 local]# scp -r zookeeper-3.4.11/ root@slave-02:/usr/local/
[root@slave-01 local]# scp -r zookeeper-3.4.11/ root@slave-03:/usr/local/

4. Setting myid


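In the directory that dataDir points to, create a file named myid whose content is a single number identifying the current host. The number must match the X of this host's server.X entry in conf/zoo.cfg, for example: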

[root@slave-01 data]# echo "1" > /usr/local/zookeeper-3.4.11/data/myid

[root@slave-02 data]# echo "2" > /usr/local/zookeeper-3.4.11/data/myid

[root@slave-03 data]# echo "3" > /usr/local/zookeeper-3.4.11/data/myid
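Because the number has to agree with zoo.cfg, it can also be derived from the file rather than typed by hand. The sketch below assumes the `server.N=host:port:port` layout used above; `myid_for_host` is a hypothetical helper name.

```shell
# myid_for_host FILE HOST
# Print the N of the first "server.N=HOST:..." line in a zoo.cfg-style FILE.
myid_for_host() {
    awk -v host="$2" '
        /^server\.[0-9]+=/ {
            split($0, kv, "=")          # kv[1] = "server.N", kv[2] = "host:port:port"
            split(kv[2], addr, ":")     # addr[1] = host
            if (addr[1] == host) {
                sub(/^server\./, "", kv[1])
                print kv[1]
                exit
            }
        }
    ' "$1"
}

# Example use on any node (not run here), assuming `hostname` prints
# the same short name that appears in zoo.cfg:
# cd /usr/local/zookeeper-3.4.11
# myid_for_host conf/zoo.cfg "$(hostname)" > data/myid
```

If the function prints nothing, the host has no server.N entry in the file, and an empty myid should not be written.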

5. Starting the ZooKeeper cluster


On every node of the ZooKeeper cluster, run the script that starts the ZooKeeper service:

[root@slave-01 ~]# /usr/local/zookeeper-3.4.11/bin/zkServer.sh start

[root@slave-02 ~]# /usr/local/zookeeper-3.4.11/bin/zkServer.sh start

[root@slave-03 ~]# /usr/local/zookeeper-3.4.11/bin/zkServer.sh start

Taking node slave-01 as an example, the log looks like this:

[root@slave-01 bin]# tail -f zookeeper.out

2017-11-30 14:44:44,812 [myid:1] - WARN  [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:QuorumCnxManager@584] - Cannot open channel to 3 at election address slave-03/10.73.150.197:3888
java.net.ConnectException: 拒绝连接 (Connection refused)
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
        at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
        at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
        at java.net.Socket.connect(Socket.java:589)
        at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:558)
        at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectAll(QuorumCnxManager.java:610)
        at org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:845)
        at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:957)
2017-11-30 14:44:44,813 [myid:1] - INFO  [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:QuorumPeer$QuorumServer@184] - Resolved hostname: slave-03 to address: slave-03/10.73.150.197
2017-11-30 14:44:44,813 [myid:1] - INFO  [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@854] - Notification time out: 12800
2017-11-30 14:44:57,620 [myid:1] - WARN  [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:QuorumCnxManager@584] - Cannot open channel to 2 at election address slave-02/10.73.150.54:3888
java.net.ConnectException: 拒绝连接 (Connection refused)
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
        at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
        at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
        at java.net.Socket.connect(Socket.java:589)
        at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:558)
        at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectAll(QuorumCnxManager.java:610)
        at org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:845)
        at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:957)
2017-11-30 14:44:57,622 [myid:1] - INFO  [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:QuorumPeer$QuorumServer@184] - Resolved hostname: slave-02 to address: slave-02/10.73.150.54
2017-11-30 14:44:57,627 [myid:1] - WARN  [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:QuorumCnxManager@584] - Cannot open channel to 3 at election address slave-03/10.73.150.197:3888
java.net.ConnectException: 拒绝连接 (Connection refused)
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
        at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
        at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
        at java.net.Socket.connect(Socket.java:589)
        at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:558)
        at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectAll(QuorumCnxManager.java:610)
        at org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:845)
        at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:957)
2017-11-30 14:44:57,634 [myid:1] - INFO  [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:QuorumPeer$QuorumServer@184] - Resolved hostname: slave-03 to address: slave-03/10.73.150.197
2017-11-30 14:44:57,635 [myid:1] - INFO  [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@854] - Notification time out: 25600
2017-11-30 14:45:02,228 [myid:1] - INFO  [slave-01/10.73.150.4:3888:QuorumCnxManager$Listener@743] - Received connection request /10.73.150.54:34976
2017-11-30 14:45:02,237 [myid:1] - INFO  [WorkerReceiver[myid=1]:FastLeaderElection@602] - Notification: 1 (message format version), 2 (n.leader), 0x0 (n.zxid), 0x1 (n.round), LOOKING (n.state), 2 (n.sid), 0x1 (n.peerEpoch) LOOKING (my state)
2017-11-30 14:45:02,249 [myid:1] - INFO  [WorkerReceiver[myid=1]:FastLeaderElection@602] - Notification: 1 (message format version), 2 (n.leader), 0x0 (n.zxid), 0x1 (n.round), LOOKING (n.state), 1 (n.sid), 0x1 (n.peerEpoch) LOOKING (my state)
2017-11-30 14:45:02,260 [myid:1] - WARN  [WorkerSender[myid=1]:QuorumCnxManager@584] - Cannot open channel to 3 at election address slave-03/10.73.150.197:3888
java.net.ConnectException: 拒绝连接 (Connection refused)
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
        at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
        at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
        at java.net.Socket.connect(Socket.java:589)
        at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:558)
        at org.apache.zookeeper.server.quorum.QuorumCnxManager.toSend(QuorumCnxManager.java:534)
        at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.process(FastLeaderElection.java:454)
        at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.run(FastLeaderElection.java:435)
        at java.lang.Thread.run(Thread.java:748)
2017-11-30 14:45:02,260 [myid:1] - INFO  [WorkerSender[myid=1]:QuorumPeer$QuorumServer@184] - Resolved hostname: slave-03 to address: slave-03/10.73.150.197
2017-11-30 14:45:02,462 [myid:1] - INFO  [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:QuorumPeer@979] - FOLLOWING
2017-11-30 14:45:02,482 [myid:1] - INFO  [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Learner@86] - TCP NoDelay set to: true
2017-11-30 14:45:02,504 [myid:1] - INFO  [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Environment@100] - Server environment:zookeeper.version=3.4.11-37e277162d567b55a07d1755f0b31c32e93c01a0, built on 11/01/2017 18:06 GMT
2017-11-30 14:45:02,505 [myid:1] - INFO  [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Environment@100] - Server environment:host.name=slave-01
2017-11-30 14:45:02,505 [myid:1] - INFO  [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Environment@100] - Server environment:java.version=1.8.0_131
2017-11-30 14:45:02,505 [myid:1] - INFO  [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Environment@100] - Server environment:java.vendor=Oracle Corporation
2017-11-30 14:45:02,505 [myid:1] - INFO  [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Environment@100] - Server environment:java.home=/usr/local/src/jdk1.8.0_131/jre
2017-11-30 14:45:02,505 [myid:1] - INFO  [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Environment@100] - Server environment:java.class.path=/usr/local/zookeeper-3.4.11/bin/../build/classes:/usr/local/zookeeper-3.4.11/bin/../build/lib/*.jar:/usr/local/zookeeper-3.4.11/bin/../lib/slf4j-log4j12-1.6.1.jar:/usr/local/zookeeper-3.4.11/bin/../lib/slf4j-api-1.6.1.jar:/usr/local/zookeeper-3.4.11/bin/../lib/netty-3.10.5.Final.jar:/usr/local/zookeeper-3.4.11/bin/../lib/log4j-1.2.16.jar:/usr/local/zookeeper-3.4.11/bin/../lib/jline-0.9.94.jar:/usr/local/zookeeper-3.4.11/bin/../lib/audience-annotations-0.5.0.jar:/usr/local/zookeeper-3.4.11/bin/../zookeeper-3.4.11.jar:/usr/local/zookeeper-3.4.11/bin/../src/java/lib/*.jar:/usr/local/zookeeper-3.4.11/bin/../conf:.:/usr/local/src/jdk1.8.0_131/lib/dt.jar:/usr/local/src/jdk1.8.0_131/lib/tools.jar
2017-11-30 14:45:02,505 [myid:1] - INFO  [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Environment@100] - Server environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
2017-11-30 14:45:02,505 [myid:1] - INFO  [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Environment@100] - Server environment:java.io.tmpdir=/tmp
2017-11-30 14:45:02,506 [myid:1] - INFO  [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Environment@100] - Server environment:java.compiler=<NA>
2017-11-30 14:45:02,506 [myid:1] - INFO  [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Environment@100] - Server environment:os.name=Linux
2017-11-30 14:45:02,506 [myid:1] - INFO  [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Environment@100] - Server environment:os.arch=amd64
2017-11-30 14:45:02,506 [myid:1] - INFO  [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Environment@100] - Server environment:os.version=3.10.0-514.el7.x86_64
2017-11-30 14:45:02,506 [myid:1] - INFO  [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Environment@100] - Server environment:user.name=root
2017-11-30 14:45:02,506 [myid:1] - INFO  [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Environment@100] - Server environment:user.home=/root
2017-11-30 14:45:02,506 [myid:1] - INFO  [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Environment@100] - Server environment:user.dir=/usr/local/zookeeper-3.4.11/bin
2017-11-30 14:45:02,510 [myid:1] - INFO  [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:ZooKeeperServer@173] - Created server with tickTime 2000 minSessionTimeout 4000 maxSessionTimeout 40000 datadir /usr/local/zookeeper-3.4.11/data/version-2 snapdir /usr/local/zookeeper-3.4.11/log/version-2
2017-11-30 14:45:02,512 [myid:1] - INFO  [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Follower@65] - FOLLOWING - LEADER ELECTION TOOK - 30437
2017-11-30 14:45:02,518 [myid:1] - INFO  [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:QuorumPeer$QuorumServer@184] - Resolved hostname: slave-02 to address: slave-02/10.73.150.54
2017-11-30 14:45:02,538 [myid:1] - INFO  [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Learner@332] - Getting a diff from the leader 0x0
2017-11-30 14:45:09,796 [myid:1] - INFO  [slave-01/10.73.150.4:3888:QuorumCnxManager$Listener@743] - Received connection request /10.73.150.197:46546
2017-11-30 14:45:09,816 [myid:1] - INFO  [WorkerReceiver[myid=1]:FastLeaderElection@602] - Notification: 1 (message format version), 3 (n.leader), 0x100000000 (n.zxid), 0x1 (n.round), LOOKING (n.state), 3 (n.sid), 0x1 (n.peerEpoch) FOLLOWING (my state)

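I started the nodes in the order slave-01, slave-02, slave-03. When a ZooKeeper cluster starts, every node tries to connect to all the other nodes, and a node that starts early cannot reach the ones that have not started yet, so the exceptions in the first part of the log above can be ignored. The later part shows that once slave-02 connected (14:45:02), slave-01 entered the FOLLOWING state: a Leader had been elected and the cluster settled down.

The other nodes may log similar warnings; this is normal.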
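To tell the harmless startup race apart from a real problem, it helps to see which peers the warnings refer to and how often. Here is a minimal sketch that summarizes a log file such as zookeeper.out; `count_unreachable` is a hypothetical helper, and it only looks for the warning text shown in the log above.

```shell
# count_unreachable LOGFILE
# Print "<server id> <count>" for each peer named in a
# "Cannot open channel to N at election address" warning, sorted by id.
count_unreachable() {
    sed -n 's/.*Cannot open channel to \([0-9][0-9]*\) at election address.*/\1/p' "$1" |
        sort -n | uniq -c | awk '{ print $2, $1 }'
}

# Example use on slave-01 (not run here):
# count_unreachable /usr/local/zookeeper-3.4.11/bin/zookeeper.out
```

If the counts stop growing once all three nodes are running, the warnings were just the startup race described above. If they keep growing, something else is wrong, for example the name resolution problem covered in the appendix.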

6. Verifying the installation


The ZooKeeper script can report the startup status, including the role of each node in the cluster (Leader or Follower). Here is the result of the query on each node:

[root@slave-01 ~]# /usr/local/zookeeper-3.4.11/bin/zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /usr/local/zookeeper-3.4.11/bin/../conf/zoo.cfg
Mode: follower
[root@slave-02 bin]# /usr/local/zookeeper-3.4.11/bin/zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /usr/local/zookeeper-3.4.11/bin/../conf/zoo.cfg
Mode: leader
[root@slave-03 bin]# /usr/local/zookeeper-3.4.11/bin/zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /usr/local/zookeeper-3.4.11/bin/../conf/zoo.cfg
Mode: follower

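The status output shows that slave-02 is the Leader of the cluster and the other two nodes are Followers.

You can also connect to the ZooKeeper cluster with the client script. To a client, ZooKeeper is a single whole (an ensemble), and a connection to it feels like exclusive use of the entire cluster's service, so you can connect to the service through any node, for example: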
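The same check can be scripted: pull the `Mode:` line out of each status output and confirm there is exactly one Leader. This is a minimal sketch; `parse_mode` and `count_leaders` are hypothetical helper names, and the commented usage assumes passwordless ssh as root between the nodes, which this article does not set up.

```shell
# parse_mode: read `zkServer.sh status` output on stdin and print the
# value of its "Mode:" line (for example "leader" or "follower").
parse_mode() {
    sed -n 's/^Mode: *//p'
}

# count_leaders MODE...: print how many of the given modes are "leader".
count_leaders() {
    n=0
    for mode in "$@"; do
        if [ "$mode" = "leader" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

# Example use (not run here; assumes passwordless ssh):
# zk=/usr/local/zookeeper-3.4.11/bin/zkServer.sh
# m1=$(ssh root@slave-01 "$zk status" | parse_mode)
# m2=$(ssh root@slave-02 "$zk status" | parse_mode)
# m3=$(ssh root@slave-03 "$zk status" | parse_mode)
# echo "leaders: $(count_leaders "$m1" "$m2" "$m3")"
```

A healthy cluster should report one leader. Zero usually means no majority could be formed, and an empty mode means that node's status query did not return a `Mode:` line at all.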

[root@slave-01 ~]# /usr/local/zookeeper-3.4.11/bin/zkCli.sh -server slave-01:2181

Connecting to slave-01:2181
2017-11-30 14:53:00,809 [myid:] - INFO  [main:Environment@100] - Client environment:zookeeper.version=3.4.11-37e277162d567b55a07d1755f0b31c32e93c01a0, built on 11/01/2017 18:06 GMT
2017-11-30 14:53:00,814 [myid:] - INFO  [main:Environment@100] - Client environment:host.name=slave-01
2017-11-30 14:53:00,814 [myid:] - INFO  [main:Environment@100] - Client environment:java.version=1.8.0_131
2017-11-30 14:53:00,816 [myid:] - INFO  [main:Environment@100] - Client environment:java.vendor=Oracle Corporation
2017-11-30 14:53:00,816 [myid:] - INFO  [main:Environment@100] - Client environment:java.home=/usr/local/src/jdk1.8.0_131/jre
2017-11-30 14:53:00,816 [myid:] - INFO  [main:Environment@100] - Client environment:java.class.path=/usr/local/zookeeper-3.4.11/bin/../build/classes:/usr/local/zookeeper-3.4.11/bin/../build/lib/*.jar:/usr/local/zookeeper-3.4.11/bin/../lib/slf4j-log4j12-1.6.1.jar:/usr/local/zookeeper-3.4.11/bin/../lib/slf4j-api-1.6.1.jar:/usr/local/zookeeper-3.4.11/bin/../lib/netty-3.10.5.Final.jar:/usr/local/zookeeper-3.4.11/bin/../lib/log4j-1.2.16.jar:/usr/local/zookeeper-3.4.11/bin/../lib/jline-0.9.94.jar:/usr/local/zookeeper-3.4.11/bin/../lib/audience-annotations-0.5.0.jar:/usr/local/zookeeper-3.4.11/bin/../zookeeper-3.4.11.jar:/usr/local/zookeeper-3.4.11/bin/../src/java/lib/*.jar:/usr/local/zookeeper-3.4.11/bin/../conf:.:/usr/local/src/jdk1.8.0_131/lib/dt.jar:/usr/local/src/jdk1.8.0_131/lib/tools.jar
2017-11-30 14:53:00,816 [myid:] - INFO  [main:Environment@100] - Client environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
2017-11-30 14:53:00,816 [myid:] - INFO  [main:Environment@100] - Client environment:java.io.tmpdir=/tmp
2017-11-30 14:53:00,817 [myid:] - INFO  [main:Environment@100] - Client environment:java.compiler=<NA>
2017-11-30 14:53:00,817 [myid:] - INFO  [main:Environment@100] - Client environment:os.name=Linux
2017-11-30 14:53:00,817 [myid:] - INFO  [main:Environment@100] - Client environment:os.arch=amd64
2017-11-30 14:53:00,817 [myid:] - INFO  [main:Environment@100] - Client environment:os.version=3.10.0-514.el7.x86_64
2017-11-30 14:53:00,817 [myid:] - INFO  [main:Environment@100] - Client environment:user.name=root
2017-11-30 14:53:00,817 [myid:] - INFO  [main:Environment@100] - Client environment:user.home=/root
2017-11-30 14:53:00,817 [myid:] - INFO  [main:Environment@100] - Client environment:user.dir=/root
2017-11-30 14:53:00,819 [myid:] - INFO  [main:ZooKeeper@441] - Initiating client connection, connectString=slave-01:2181 sessionTimeout=30000 watcher=org.apache.zookeeper.ZooKeeperMain$MyWatcher@1de0aca6
2017-11-30 14:53:00,853 [myid:] - INFO  [main-SendThread(slave-01:2181):ClientCnxn$SendThread@1035] - Opening socket connection to server slave-01/10.73.150.4:2181. Will not attempt to authenticate using SASL (unknown error)
Welcome to ZooKeeper!
JLine support is enabled
2017-11-30 14:53:01,057 [myid:] - INFO  [main-SendThread(slave-01:2181):ClientCnxn$SendThread@877] - Socket connection established to slave-01/10.73.150.4:2181, initiating session
2017-11-30 14:53:01,143 [myid:] - INFO  [main-SendThread(slave-01:2181):ClientCnxn$SendThread@1302] - Session establishment complete on server slave-01/10.73.150.4:2181, sessionid = 0x100002b78380000, negotiated timeout = 30000

WATCHER::

WatchedEvent state:SyncConnected type:None path:null
[zk: slave-01:2181(CONNECTED) 0] ls /
[zookeeper]

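Whether data is modified through the leader node or through a follower node, the data on all three nodes stays in sync.

Appendix: Problems encountered during the experiment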


Hostname-to-IP mapping problem

When starting the ZooKeeper cluster, the log on node slave-01 may show errors like the following:

java.net.SocketTimeoutException  
        at sun.nio.ch.SocketAdaptor.connect(SocketAdaptor.java:109)  
        at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:371)  
        at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectAll(QuorumCnxManager.java:404)  
        at org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:688)  
        at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:622)  
2012-01-08 06:37:46,026 - INFO  [QuorumPeer:/0:0:0:0:0:0:0:0:2181:FastLeaderElection@697] - Notification time out: 6400  
2012-01-08 06:37:57,431 - WARN  [QuorumPeer:/0:0:0:0:0:0:0:0:2181:QuorumCnxManager@384] - Cannot open channel to 2 at election address slave-02/202.106.199.35:3888  
java.net.SocketTimeoutException  
        at sun.nio.ch.SocketAdaptor.connect(SocketAdaptor.java:109)  
        at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:371)  
        at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectAll(QuorumCnxManager.java:404)  
        at org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:688)  
        at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:622)  
2012-01-08 06:38:02,442 - WARN  [QuorumPeer:/0:0:0:0:0:0:0:0:2181:QuorumCnxManager@384] - Cannot open channel to 3 at election address slave-03/202.106.199.35:3888  

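Clearly, when slave-01 started and tried to connect to the other nodes in the cluster (slave-02, slave-03), the hostnames resolved to IP addresses that differ from the ones we actually configured, so the nodes could not establish links to each other and the whole ZooKeeper cluster failed to start.
In the error log above, slave-02/202.106.199.35:3888 should have been slave-02/192.168.0.178:3888, but name resolution returned the wrong mapping. The fix is to edit /etc/hosts on every node so that it maps the hostname of every node in the ZooKeeper cluster to its IP address.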
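When diagnosing this, the quickest evidence is the host/IP pair that ZooKeeper itself printed in each warning. The sketch below extracts those pairs from a log file; `election_addresses` is a hypothetical helper and relies only on the `host/ip:port` form visible in the log lines above.

```shell
# election_addresses LOGFILE
# Print the unique "host ip" pairs found in
# "... at election address host/ip:port" warnings.
election_addresses() {
    sed -n 's|.*at election address \([^/]*\)/\([0-9.]*\):[0-9]*.*|\1 \2|p' "$1" | sort -u
}

# Example use on the failing node (not run here):
# election_addresses /usr/local/zookeeper-3.4.11/bin/zookeeper.out
```

For the excerpt above this prints slave-02 and slave-03 with the same address, 202.106.199.35. Two different hostnames resolving to one address that belongs to neither node is a strong hint that the names are being answered by something other than the local /etc/hosts entries.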

Reposted from blog.csdn.net/liupeifeng3514/article/details/80503012