ZooKeeper introductory notes (2): a summary of ZooKeeper commands
Recently at work, I noticed that when colleagues on my team ran into a ZooKeeper (zk) fault, they were at a loss and did not know where to start; most of them could only read the prompts and logs. Here is a collection of zk commands for your reference.
Location of the zk command-line scripts
The zk module is used slightly differently from the other Hadoop environment modules. Its commands are mainly located in the bin directory under its installation directory.
[root@nn1 bin]# ls
zkCleanup.sh zkCli.sh zkEnv.sh zkServer-initialize.sh zkServer.sh zookeeper-client zookeeper-server zookeeper-server-cleanup zookeeper-server-initialize
[root@nn1 bin]# pwd
/usr/hdp/2.6.1.0-129/zookeeper/bin
zk startup and configuration
First, look at the zk configuration file, conf/zoo.cfg under the installation directory:
clientPort=2181
initLimit=10
autopurge.purgeInterval=24
syncLimit=5
tickTime=3000
dataDir=/hadoop/zookeeper
autopurge.snapRetainCount=30
server.1=dn1:2888:3888
server.2=dn2:2888:3888
server.3=dn3:2888:3888
clientPort
clientPort is the port on which the ZooKeeper server listens for client connections. Port 2181 is the commonly used value.
tickTime
tickTime is the basic time unit in ZooKeeper, in milliseconds. It is the interval between heartbeats (3000 ms in this configuration).
initLimit
initLimit is the number of heartbeat intervals (ticks) that can be tolerated while a connecting server completes its initial connection. The "client" here is not a user client connecting to the ZooKeeper server, but a Follower server connecting to the Leader within the ZooKeeper cluster. If the Leader has received no response from the Follower after initLimit ticks, the connection is considered to have failed. With the configuration above (initLimit=10, tickTime=3000), the total time is 10*3000 ms = 30 seconds.
dataDir
dataDir is the directory where ZooKeeper stores its persistent data.
server.x=A:B:C
Here x is the number of the server (it has no actual relationship with the 1, 2, 3 in the configured host names; it is just an id). A is the host name or IP address of server x. B is the port that server x uses to exchange information with the Leader of the cluster (it is a different port from clientPort and serves a different purpose). C is the port dedicated to leader election: if the Leader goes down, the servers communicate with each other on this port to elect a new Leader.
syncLimit
syncLimit limits how long a request and its response between the Leader and a Follower may take, measured in ticks (multiples of tickTime). The value here is 5, so the limit is 5*3000 ms = 15000 ms, that is, 15 seconds.
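The tick arithmetic for initLimit and syncLimit can be checked with a few lines of shell. This is only a sketch: zk_timeouts is a hypothetical helper name (not a ZooKeeper tool), and it assumes plain key=value lines like the zoo.cfg shown above.

```shell
# Hypothetical helper (not part of ZooKeeper): read zoo.cfg-style key=value
# lines on stdin and print the initLimit and syncLimit windows in milliseconds.
zk_timeouts() {
    awk -F= '
        $1 == "tickTime"  { tick = $2 }
        $1 == "initLimit" { init_limit = $2 }
        $1 == "syncLimit" { sync_limit = $2 }
        END {
            # Both limits are expressed in ticks, so multiply by tickTime.
            printf "initLimit window: %d ms\n", init_limit * tick
            printf "syncLimit window: %d ms\n", sync_limit * tick
        }
    '
}

# Usage (the path is an assumption; use your own conf directory):
#   zk_timeouts < /path/to/conf/zoo.cfg
```

With tickTime=3000, initLimit=10 and syncLimit=5, this prints 30000 ms and 15000 ms.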
In addition to zoo.cfg, the configuration directory contains several other files, such as log4j.properties and zookeeper-env.sh. log4j.properties is an ordinary Log4j logging configuration file. zookeeper-env.sh mainly sets the environment variables for ZooKeeper:
[root@namenode conf]# more zookeeper-env.sh
export JAVA_HOME=/usr/jdk64/jdk1.8.0_112
export ZOOKEEPER_HOME=/usr/hdp/current/zookeeper-server
export ZOO_LOG_DIR=/var/log/zookeeper
export ZOOPIDFILE=/var/run/zookeeper/zookeeper_server.pid
export SERVER_JVMFLAGS=-Xmx1024m
export JAVA=$JAVA_HOME/bin/java
export CLASSPATH=$CLASSPATH:/usr/share/zookeeper/*
export JMXPORT=22222
export JMXAUTH=false
export JMXSSL=false
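Note the JMX settings at the end (JMXPORT, JMXAUTH, JMXSSL). Here JMX listens on port 22222 with authentication and SSL both disabled, so this part needs special attention.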
After configuring the relevant information, you can start the zk service on this node with "bin/zkServer.sh start".
zk commands
The zk commands mainly consist of two scripts: zkServer.sh and zkCli.sh.
zkServer.sh
View server status
[root@namenode bin]# sh zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /usr/hdp/2.6.1.0-129/zookeeper/bin/../conf/zoo.cfg
Mode: leader
[root@dn2 bin]# sh zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /usr/hdp/2.6.1.0-129/zookeeper/bin/../conf/zoo.cfg
Mode: follower
The two examples above show the status of the zk server on the node where the command is executed: the first node is the leader, and the second is a follower.
Start and stop the server
[root@dn2 bin]# sh zkServer.sh
ZooKeeper JMX enabled by default
Using config: /usr/hdp/2.6.1.0-129/zookeeper/bin/../conf/zoo.cfg
Usage: zkServer.sh {start|start-foreground|stop|restart|status|upgrade|print-cmd}
Four-letter-word commands
This part has slightly more content than the status check above, and the method is different: it uses the four-letter-word commands specific to zk. As I understand it, a four-letter-word command is simply sent to the zk client port with the nc command, in the form echo xxxx | nc IP Port, where xxxx is the four-letter word.
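The echo | nc pattern can be wrapped in a small function so that the host and port have defaults. This is a sketch only: zk4lw is a hypothetical name, the defaults 127.0.0.1 and 2181 are assumptions taken from the examples in this article, and nc must be installed.

```shell
# Hypothetical wrapper around "echo xxxx | nc IP Port"; zk4lw is not a ZooKeeper tool.
# Usage: zk4lw WORD [HOST] [PORT]
zk4lw() {
    word=$1
    host=${2:-127.0.0.1}   # assumed default: local server
    port=${3:-2181}        # assumed default: the usual clientPort
    # Refuse anything that is not exactly four lowercase letters
    # before touching the network.
    case $word in
        [a-z][a-z][a-z][a-z]) ;;
        *) echo "zk4lw: '$word' is not a four-letter word" >&2; return 2 ;;
    esac
    echo "$word" | nc "$host" "$port"
}

# Examples against a running server:
#   zk4lw stat
#   zk4lw conf dn3 2181
```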
View server details and connected clients: stat
[root@dn2 bin]# echo stat|nc 127.0.0.1 2181
Zookeeper version: 3.4.6-129--1, built on 05/31/2017 03:01 GMT
Clients:
/10.10.0.23:58458[1](queued=0,recved=18672,sent=18672)
/10.10.0.23:58294[1](queued=0,recved=6314,sent=6314)
/10.10.0.21:33103[1](queued=0,recved=76025,sent=76036)
/10.10.0.23:59646[1](queued=0,recved=3062,sent=3062)
/10.10.0.21:34046[1](queued=0,recved=18186,sent=18186)
/10.10.0.22:57146[1](queued=0,recved=300427,sent=300428)
/10.10.0.22:58808[1](queued=0,recved=22203,sent=22203)
/10.10.0.23:58446[1](queued=0,recved=3091,sent=3091)
/10.10.0.23:58326[1](queued=0,recved=18557,sent=18557)
/10.10.0.22:57118[1](queued=0,recved=67287,sent=67287)
/10.10.0.23:59648[1](queued=0,recved=3062,sent=3062)
/10.10.0.23:58322[1](queued=0,recved=20252,sent=20252)
/127.0.0.1:46964[0](queued=0,recved=1,sent=0)
/10.10.0.24:50152[1](queued=0,recved=76042,sent=76053)
/10.10.0.23:58330[1](queued=0,recved=38120,sent=38120)
/10.10.0.22:57140[1](queued=0,recved=25628,sent=25628)
/10.10.0.21:33544[1](queued=0,recved=6112,sent=6112)
/10.10.0.22:57138[1](queued=0,recved=186846,sent=186846)
/10.10.0.22:58820[1](queued=0,recved=22203,sent=22203)
Latency min/avg/max: 0/0/1227
Received: 981083
Sent: 981109
Connections: 19
Outstanding: 0
Zxid: 0x9000193d0
Mode: follower
Node count: 519
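When checking several servers, usually only a few lines of the stat output matter. The following sketch pulls out Mode, Connections and Node count; zk_stat_summary is a hypothetical helper, and it assumes the "Key: value" layout shown above.

```shell
# Hypothetical helper: summarise `stat` output read from stdin.
zk_stat_summary() {
    awk -F': ' '
        $1 == "Mode"        { mode = $2 }
        $1 == "Connections" { conns = $2 }
        $1 == "Node count"  { nodes = $2 }
        END { printf "mode=%s connections=%s znodes=%s\n", mode, conns, nodes }
    '
}

# Usage against a running server:
#   echo stat | nc 127.0.0.1 2181 | zk_stat_summary
```

For the output above, this gives mode=follower connections=19 znodes=519.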
Output the details of the service configuration: conf
[root@dn2 bin]# echo conf|nc dn3 2181
clientPort=2181
dataDir=/hadoop/zookeeper/version-2
dataLogDir=/hadoop/zookeeper/version-2
tickTime=3000
maxClientCnxns=60
minSessionTimeout=6000
maxSessionTimeout=60000
serverId=3
initLimit=10
syncLimit=5
electionAlg=3
electionPort=3888
quorumPort=2888
peerType=0
The zoo.cfg configuration file on dn3:
clientPort=2181
initLimit=10
autopurge.purgeInterval=24
syncLimit=5
tickTime=3000
dataDir=/hadoop/zookeeper
autopurge.snapRetainCount=30
server.1=dn1:2888:3888
server.2=dn2:2888:3888
server.3=dn3:2888:3888
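Comparing the two carefully, the values that appear in both (clientPort, tickTime, initLimit, syncLimit) are identical, and serverId, quorumPort and electionPort correspond to the server.3=dn3:2888:3888 line. The conf output additionally shows values that are not set in the file, such as maxClientCnxns, and it reports dataDir with the version-2 subdirectory appended.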
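The comparison can also be done mechanically. The sketch below assumes the conf output has been saved to a file first; zk_conf_diff is a hypothetical helper that only compares keys present in both files and prints the ones whose values differ.

```shell
# Hypothetical helper: compare the keys that occur in both files.
#   $1 = saved output of "echo conf | nc HOST 2181"
#   $2 = zoo.cfg
zk_conf_diff() {
    awk -F= '
        NR == FNR { live[$1] = $2; next }      # first file: values of the running server
        ($1 in live) && live[$1] != $2 {        # second file: same key, different value
            printf "%s: running=%s file=%s\n", $1, live[$1], $2
        }
    ' "$1" "$2"
}

# Usage (file names are assumptions):
#   echo conf | nc dn3 2181 > /tmp/conf.txt
#   zk_conf_diff /tmp/conf.txt /path/to/conf/zoo.cfg
```

For the two listings above, the only line printed is the dataDir one, because the running server reports the version-2 subdirectory.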
Session and connection information: cons
[root@dn3 ~]# echo cons|nc dn2 2181
/10.10.0.23:58458[1](queued=0,recved=21454,sent=21454,sid=0x262b94c0cca0005,lop=PING,est=1523527255302,to=10000,lcxid=0x87,lzxid=0xffffffffffffffff,lresp=1523598357204,llat=0,minlat=0,avglat=0,maxlat=872)
/10.10.0.23:58294[1](queued=0,recved=7241,sent=7241,sid=0x262b94c0cca0000,lop=PING,est=1523527213803,to=30000,lcxid=0x8c,lzxid=0xffffffffffffffff,lresp=1523598354168,llat=0,minlat=0,avglat=0,maxlat=18)
/10.10.0.21:33103[1](queued=0,recved=95946,sent=95958,sid=0x262b92e6c950007,lop=GETC,est=1523527817543,to=30000,lcxid=0x18b5f,lzxid=0x90001d465,lresp=1523598358526,llat=0,minlat=0,avglat=0,maxlat=626)
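Each cons line packs many key=value pairs into one pair of parentheses, which is hard to scan by eye. The sketch below prints only the client address, session id (sid), last operation (lop), session timeout (to) and maximum latency (maxlat). zk_cons_fields is a hypothetical helper, and it assumes the line layout shown above.

```shell
# Hypothetical helper: reduce each `cons` line on stdin to a few fields.
zk_cons_fields() {
    awk '
        /\(.*=.*\)/ {
            client = $0
            sub(/\[.*/, "", client)          # keep the address before "["
            body = $0
            sub(/^[^(]*\(/, "", body)        # drop everything up to "("
            sub(/\).*$/, "", body)           # drop the closing ")"
            split("", f)                     # clear values from the previous line
            n = split(body, kv, ",")
            for (i = 1; i <= n; i++) {
                split(kv[i], p, "=")
                f[p[1]] = p[2]
            }
            print client, f["sid"], f["lop"], f["to"], f["maxlat"]
        }
    '
}

# Usage against a running server:
#   echo cons | nc dn2 2181 | zk_cons_fields
```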
List outstanding sessions and ephemeral nodes: dump
The dump output is long, so the returned content is described below in two parts.
Outstanding sessions (only available on the Leader node):
[root@namenode version-2]# echo dump|nc localhost 2181
SessionTracker dump:
Session Sets (20):
3 expire at Fri Apr 13 14:34:21 CST 2018:
0x262b40dd1156bf9
0x362b252fd64a76b
0x362b252fd64a76a
0 expire at Fri Apr 13 14:34:24 CST 2018:
0 expire at Fri Apr 13 14:34:27 CST 2018:
6 expire at Fri Apr 13 14:34:30 CST 2018:
0x362b252fd64a764
0x362b252fd64a76c
0x262b252fd1c0005
Ephemeral nodes:
ephemeral nodes dump:
Sessions with Ephemerals (16):
0x362b252fd632693:
/storm/supervisors/75138bac-0922-4978-82b5-301f6a8bbfe4
0x162b2607478005c:
/consumers/atlas/ids/atlas_namenode.hadoop.wish.me-1523413973525-43469beb
/consumers/atlas/owners/ATLAS_HOOK/0
0x362b252fd630014:
/hbase-unsecure/rs/datanode2.hadoop.wish.me,16021,1523411228465
0x262b252fd1c0004:
/hbase-unsecure/master
0x262b252fd1c0006:
/hbase-unsecure/rs/namenode.hadoop.wish.me,16021,1523411302942
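To see at a glance which sessions hold the most ephemeral nodes, the second half of the dump output can be counted per session. This is a sketch: zk_ephemerals_per_session is a hypothetical helper, and it assumes that a session line has the form "0x...:" and that each following line starting with "/" is one of its nodes, as in the output above.

```shell
# Hypothetical helper: count ephemeral znodes per session in `dump` output (stdin).
zk_ephemerals_per_session() {
    awk '
        { gsub(/^[ \t]+|[ \t]+$/, "") }          # trim indentation, if any
        /^0x[0-9a-f]+:$/ {                        # a session header such as "0x362b252fd632693:"
            sid = substr($0, 1, length($0) - 1)
            order[++n] = sid
            next
        }
        /^\// && sid != "" { count[sid]++ }       # a znode path owned by the current session
        END { for (i = 1; i <= n; i++) print order[i], count[order[i]] + 0 }
    '
}

# Usage against the Leader:
#   echo dump | nc localhost 2181 | zk_ephemerals_per_session
```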
List environment variables: envi
[root@namenode version-2]# echo envi|nc localhost 2181
Environment:
zookeeper.version=3.4.6-129--1, built on 05/31/2017 03:01 GMT
host.name=namenode.hadoop.wish.me
java.version=1.8.0_112
java.vendor=Oracle Corporation
java.home=/usr/jdk64/jdk1.8.0_112/jre
java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
java.io.tmpdir=/tmp
java.compiler=<NA>
os.name=Linux
os.arch=amd64
os.version=3.10.0-693.2.2.el7.x86_64
user.name=zookeeper
user.home=/home/zookeeper
user.dir=/home/zookeeper
Check whether the service is responding: ruok
[root@namenode version-2]# echo ruok|nc localhost 2181
imok
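If imok is returned, the server process is running and answering on its client port; if there is no response at all, there is a problem with the service. Note that imok alone does not show that the server has joined the quorum; use stat to check that.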
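In a monitoring script, the reply can be turned into an exit status. The sketch below uses a hypothetical helper, zk_ruok_verdict, that only interprets the reply string. The nc -w 2 option in the usage comment is assumed to give up after about two seconds; its exact behaviour differs between nc implementations.

```shell
# Hypothetical helper: interpret the reply to `ruok`, passed as $1.
zk_ruok_verdict() {
    if [ "$1" = "imok" ]; then
        echo "responding"
    else
        echo "no valid reply"
        return 1
    fi
}

# Usage against a running server:
#   zk_ruok_verdict "$(echo ruok | nc -w 2 localhost 2181)"
```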
For more details and more commands, refer to the official administrator's guide:
https://zookeeper.apache.org/doc/r3.3.3/zookeeperAdmin.html
zkCli.sh
As described in the previous section, the four-letter-word commands can be executed from anywhere, as long as nc can reach the server. zkCli.sh, by contrast, is a script like zkServer.sh above: you need to switch to the bin directory and run it manually. In essence, zkCli.sh creates a client that interacts with zkServer, much as other modules interact with zk. It is a piece of Java code that acts as the client. There is also a C client, but its source code has to be compiled first; if you are interested, refer to the instructions on the official website. The general usage is similar.
Connect to zkServer
Run the following command to connect to zkServer. Once the connection succeeds, all subsequent commands are executed over this connection:
[root@dn3 bin]# sh zkCli.sh -server dn3:2181
Connecting to dn3:2181
2018-04-13 15:24:57,212 - INFO [main:Environment@100] - Client environment:zookeeper.version=3.4.6-129--1, built on 05/31/2017 03:01 GMT
2018-04-13 15:24:57,215 - INFO [main:Environment@100] - Client environment:host.name=dn3
....
2018-04-13 15:24:57,220 - INFO [main:ZooKeeper@438] - Initiating client connection, connectString=dn3:2181 sessionTimeout=30000 watcher=org.apache.zookeeper.ZooKeeperMain$MyWatcher@63c12fb0
Welcome to ZooKeeper!
2018-04-13 15:24:57,245 - INFO [main-SendThread(dn3:2181):ClientCnxn$SendThread@1019] - Opening socket connection to server dn3/10.10.0.24:2181. Will not attempt to authenticate using SASL (unknown error)
JLine support is enabled
2018-04-13 15:24:57,296 - INFO [main-SendThread(dn3:2181):ClientCnxn$SendThread@864] - Socket connection established to dn3/10.10.0.24:2181, initiating session
2018-04-13 15:24:57,302 - INFO [main-SendThread(dn3:2181):ClientCnxn$SendThread@1279] - Session establishment complete on server dn3/10.10.0.24:2181, sessionid = 0x362b94920720016, negotiated timeout = 30000
WATCHER::
WatchedEvent state:SyncConnected type:None path:null
[zk: dn3:2181(CONNECTED) 0]
Note: The "Welcome to ZooKeeper!" message shown above means that the connection has basically succeeded. Since SASL authentication is not used here, the log contains a prompt about it ("Will not attempt to authenticate using SASL"). The prompt that appears at the end is where you type commands. You can view the list of commands with help (the transcript below types h, which prints the same usage list):
[zk: dn3:2181(CONNECTED) 0] h
ZooKeeper -server host:port cmd args
stat path [watch]
set path data [version]
ls path [watch]
delquota [-n|-b] path
ls2 path [watch]
setAcl path acl
setquota -n|-b val path
history
redo cmdno
printwatches on|off
delete path [version]
sync path
listquota path
rmr path
get path [watch]
create [-s] [-e] path data acl
addauth scheme auth
quit
getAcl path
close
connect host:port
Here are some commonly used commands
- ls: lists the children of a node (/ is the root node, and further down are the child nodes)
[zk: dn3:2181(CONNECTED) 1] ls /
[cluster, registry, controller, brokers, storm, zookeeper, infra-solr, hbase-unsecure, admin, isr_change_notification, templeton-hadoop, controller_epoch, hiveserver2, rmstore, consumers, ambari-metrics-cluster, config]
[zk: dn3:2181(CONNECTED) 2] ls /storm
[assignments, backpressure, blobstoremaxkeysequencenumber, credentials, nimbuses, logconfigs, leader-lock, storms, errors, supervisors, workerbeats, blobstore]
Listing the storm node shows exactly what is under it.
- create: creates a new znode (-s: sequential node; -e: ephemeral (temporary) node, which is removed when the session that created it ends). Command format: create [-s] [-e] path data acl
[zk: dn3:2181(CONNECTED) 3] create /zk_test test_data
Created /zk_test
[zk: dn3:2181(CONNECTED) 4] ls /
[cluster, registry, controller, brokers, storm, zookeeper, infra-solr, hbase-unsecure, admin, isr_change_notification, templeton-hadoop, controller_epoch, hiveserver2, rmstore, consumers, ambari-metrics-cluster, config, zk_test]
[zk: dn3:2181(CONNECTED) 5] ls /zk_test
[]
As you can see, a new zk_test node has been added under the root directory. If you need to select the node type, add -s or -e; ACL permission control can also be appended at the end.
- get: get node information
[zk: dn3:2181(CONNECTED) 9] get /storm
cZxid = 0x10000018f
ctime = Tue Feb 06 17:18:15 CST 2018
mZxid = 0x10000018f
mtime = Tue Feb 06 17:18:15 CST 2018
pZxid = 0x1000002dd
cversion = 12
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 1
numChildren = 12
Before explaining the node information, let's explain zxid. Every change to the zk state (creating a node, updating a node's data, or deleting a node) is assigned a transaction id, called the zxid, and each new zxid is larger than the previous one. Because the id always increases, if zxid1 < zxid2, then the change with zxid1 happened before the change with zxid2.
The first value returned by the command above, cZxid, is the zxid of the change that created the node, so the order in which nodes were created can be judged from it. Let's look at the cZxid of two nodes created one after the other:
[zk: dn3:2181(CONNECTED) 10] get /zk_test
test_data
cZxid = 0x9000201af
ctime = Fri Apr 13 15:45:34 CST 2018
mZxid = 0x9000201af
mtime = Fri Apr 13 15:45:34 CST 2018
pZxid = 0x9000201af
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 9
numChildren = 0
[zk: dn3:2181(CONNECTED) 11] get /zk_test_tmp
test_data_tmp
cZxid = 0x90002048e
ctime = Fri Apr 13 15:53:50 CST 2018
mZxid = 0x90002048e
mtime = Fri Apr 13 15:53:50 CST 2018
pZxid = 0x90002048e
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x362b94920720016
dataLength = 13
numChildren = 0
As can be seen above, the two nodes were created only about 8 minutes apart (15:45:34 and 15:53:50), so their cZxid values are close: 0x9000201af and 0x90002048e.
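A zxid is a 64-bit number: the high 32 bits are the leader epoch and the low 32 bits are a counter within that epoch. The sketch below splits a zxid with shell arithmetic; zxid_split is a hypothetical helper, and it assumes a shell with 64-bit arithmetic and a zxid small enough to fit in a signed 64-bit integer.

```shell
# Hypothetical helper: split a zxid (hex, e.g. 0x9000201af) into epoch and counter.
zxid_split() {
    # High 32 bits: epoch. Low 32 bits: counter within that epoch.
    printf 'epoch=%d counter=%d\n' "$(( $1 >> 32 ))" "$(( $1 & 0xffffffff ))"
}

zxid_split 0x9000201af   # cZxid of /zk_test
zxid_split 0x90002048e   # cZxid of /zk_test_tmp
```

Both nodes were created in epoch 9, and the counters (131503 and 132238) differ by 735, which is the number of transactions between the two creations.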
The remaining fields are as follows:
- ctime: the timestamp when the node was created.
- mZxid: the zxid of the most recent update to the node (after updating the data later, it can be compared with cZxid).
- mtime: the timestamp of the most recent update to the node.
- pZxid: the zxid when the node's list of children was last modified. Note: pZxid only changes when the list of children changes; changes to the content of a child do not affect it.
- cversion: the version number of the children. Since it is incremented on each change, it also gives the number of changes to the list of children.
- dataVersion: the version number of the node's data, that is, the number of times the data has been updated.
- aclVersion: the number of times the node's ACL has been changed.
- ephemeralOwner: if the node is an ephemeral (temporary) node, this is the id of the session that owns it; if the node is a persistent node, the value is 0x0. In the output above, /zk_test_tmp is owned by session 0x362b94920720016, which is the session id of the zkCli.sh connection shown earlier.
- dataLength: the number of bytes of node data.
- numChildren: the number of child nodes.
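The ephemeralOwner field is enough to tell the two node types apart. The sketch below reads the output of get or stat from stdin; zk_node_kind is a hypothetical helper, and it assumes the "name = value" layout shown above.

```shell
# Hypothetical helper: classify a znode from `get`/`stat` output on stdin.
zk_node_kind() {
    awk -F' = ' '
        $1 == "ephemeralOwner" {
            if ($2 == "0x0") {
                print "persistent"
            } else {
                print "ephemeral, owned by session " $2
            }
        }
    '
}

# Usage: paste the output of "get /zk_test_tmp" into a file and run
#   zk_node_kind < that_file
```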
- set: set node data
The set command updates the data of the given node. Command format: set path data [version]. Let's look at the node before setting:
[zk: dn1:2181(CONNECTED) 1] get /zk_test
test_data
cZxid = 0x9000201af
ctime = Fri Apr 13 15:45:34 CST 2018
mZxid = 0x9000201af
mtime = Fri Apr 13 15:45:34 CST 2018
pZxid = 0x9000201af
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 9
numChildren = 0
Look at the situation after setting:
[zk: dn1:2181(CONNECTED) 2] set /zk_test hellolst
cZxid = 0x9000201af
ctime = Fri Apr 13 15:45:34 CST 2018
mZxid = 0x90007bb7d
mtime = Mon Apr 16 14:07:26 CST 2018
pZxid = 0x9000201af
cversion = 0
dataVersion = 1
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 8
numChildren = 0
[zk: dn1:2181(CONNECTED) 3] get /zk_test
hellolst
cZxid = 0x9000201af
ctime = Fri Apr 13 15:45:34 CST 2018
mZxid = 0x90007bb7d
mtime = Mon Apr 16 14:07:26 CST 2018
pZxid = 0x9000201af
cversion = 0
dataVersion = 1
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 8
numChildren = 0
It can be seen that dataVersion has become 1 after the update, meaning the data has been updated once. mZxid and mtime (the zxid and timestamp of the latest update) have changed as well, and dataLength is exactly 8 bytes (hellolst).
- stat: view node status
It is basically the same as get, except that the data content is not shown.
- ls2 path
Unlike ls, in addition to the children of the node, it also shows the information of the node itself.
[zk: dn1:2181(CONNECTED) 8] ls2 /storm
[assignments, backpressure, blobstoremaxkeysequencenumber, credentials, nimbuses, logconfigs, leader-lock, storms, errors, supervisors, workerbeats, blobstore]
cZxid = 0x10000018f
ctime = Tue Feb 06 17:18:15 CST 2018
mZxid = 0x10000018f
mtime = Tue Feb 06 17:18:15 CST 2018
pZxid = 0x1000002dd
cversion = 12
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 1
numChildren = 12
[zk: dn1:2181(CONNECTED) 9] ls /storm
[assignments, backpressure, blobstoremaxkeysequencenumber, credentials, nimbuses, logconfigs, leader-lock, storms, errors, supervisors, workerbeats, blobstore]
- delete: deletes a znode.
[zk: dn1:2181(CONNECTED) 10] delete /zk_test
- quit: exit
[zk: dn1:2181(CONNECTED) 11] quit
Quitting...
2018-04-16 14:31:04,275 - INFO [main:ZooKeeper@684] - Session: 0x162b9559e1c0025 closed
2018-04-16 14:31:04,275 - INFO [main-EventThread:ClientCnxn$EventThread@524] - EventThread shut down