Basic introduction and installation tutorial of Zookeeper

1. Getting Started with Zookeeper

1.1 Overview

Zookeeper is an open source distributed Apache project that provides coordination services for distributed applications.

1.2 Features


1.3 Data structure


1.4 Application scenarios

The services provided include: unified naming service, unified configuration management, unified cluster management, dynamically bringing server nodes online and taking them offline, soft load balancing, etc.

Unified naming service

Unified configuration management
Unified cluster management
Dynamic server online and offline

Soft load balancing


1.5 Download address

1. Website homepage:

https://zookeeper.apache.org/

2. Apache archive download directory:

http://archive.apache.org/dist/zookeeper/

2. Zookeeper installation

2.1 Local mode installation and deployment

1. Preparations before installation
(1) Install the JDK
(2) Copy the Zookeeper installation package to the Linux system
(3) Extract it to the target directory
[ybb@hadoop102 software]$ tar -zxvf zookeeper-3.4.10.tar.gz -C /opt/module/
2. Configuration modification
(1) Rename zoo_sample.cfg in /opt/module/zookeeper-3.4.10/conf to zoo.cfg:
[ybb@hadoop102 conf]$ mv zoo_sample.cfg zoo.cfg
(2) Open the zoo.cfg file and modify the dataDir path:
[ybb@hadoop102 conf]$ vim zoo.cfg
Modify the following content:
dataDir=/opt/module/zookeeper-3.4.10/zkData
(3) Create the zkData folder in /opt/module/zookeeper-3.4.10/:
[ybb@hadoop102 zookeeper-3.4.10]$ mkdir zkData
3. Operating Zookeeper
(1) Start Zookeeper
[ybb@hadoop102 zookeeper-3.4.10]$ bin/zkServer.sh start
(2) Check whether the process is started
[ybb@hadoop102 zookeeper-3.4.10]$ jps
4020 Jps
4001 QuorumPeerMain
(3) Check the status:
[ybb@hadoop102 zookeeper-3.4.10]$ bin/zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /opt/module/zookeeper-3.4.10/bin/…/conf/zoo.cfg
Mode: standalone
(4) Start the client:
[ybb@hadoop102 zookeeper-3.4.10]$ bin/zkCli.sh
(5) Exit the client:
[zk: localhost:2181(CONNECTED) 0] quit
(6) Stop Zookeeper
[ybb@hadoop102 zookeeper-3.4.10]$ bin/zkServer.sh stop
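
When the status check above is scripted, it can help to pull the Mode value out of the command output. The following is a minimal sketch in Python, assuming the output looks like the sample shown in step (3); the function name parse_zk_mode is made up for this illustration and is not part of Zookeeper.

```python
import re
from typing import Optional


def parse_zk_mode(status_output: str) -> Optional[str]:
    """Return the word after 'Mode:' in zkServer.sh status output, or None."""
    match = re.search(r"^Mode:\s*(\S+)", status_output, flags=re.MULTILINE)
    return match.group(1) if match else None


# Sample text copied from the status output shown in this section.
SAMPLE_STATUS = """ZooKeeper JMX enabled by default
Using config: /opt/module/zookeeper-3.4.10/bin/…/conf/zoo.cfg
Mode: standalone
"""

print(parse_zk_mode(SAMPLE_STATUS))
```

In a real script the text would come from running bin/zkServer.sh status, for example through subprocess.run with capture_output=True.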

2.2 Interpretation of configuration parameters

The parameters in the Zookeeper configuration file zoo.cfg have the following meanings:
1. tickTime=2000: heartbeat interval, in milliseconds.
This is the basic time unit used by Zookeeper. A heartbeat is sent every tickTime milliseconds between servers and between clients and servers.
It also drives the session timeout: the minimum session timeout is twice the heartbeat interval (2*tickTime).
2. initLimit=10: Leader-Follower (LF) initial communication time limit.
The maximum number of heartbeats (multiples of tickTime) that a Follower may take to complete its initial connection to the Leader. With the values shown here, the limit is 10 * 2000 ms = 20 seconds.
3. syncLimit=5: LF synchronization communication time limit.
The maximum number of heartbeats allowed for a response between the Leader and a Follower. If a response takes longer than syncLimit * tickTime, the Leader considers the Follower dead and removes it from the server list.
4. dataDir: data file directory and persistence path,
mainly used to store the data held in Zookeeper.
5. clientPort=2181: client connection port.
The port on which the server listens for client connections.
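
The arithmetic behind these parameters can be checked with a short script. This is only a sketch: the helper names parse_zoo_cfg and derived_timeouts_ms are invented for the example, and the parser assumes simple key=value lines like the ones quoted above.

```python
def parse_zoo_cfg(text):
    """Parse key=value lines, skipping blank lines and '#' comments."""
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def derived_timeouts_ms(settings):
    """Turn the tick-based limits into milliseconds."""
    tick = int(settings["tickTime"])
    return {
        "min_session_timeout": 2 * tick,
        "init_limit": int(settings["initLimit"]) * tick,
        "sync_limit": int(settings["syncLimit"]) * tick,
    }


# Values quoted in this section.
SAMPLE_CFG = """
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/opt/module/zookeeper-3.4.10/zkData
clientPort=2181
"""

timeouts = derived_timeouts_ms(parse_zoo_cfg(SAMPLE_CFG))
print(timeouts)
```

With these values the minimum session timeout is 4000 ms, the initial connection limit is 20000 ms and the synchronization limit is 10000 ms.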

3. Internal principles of Zookeeper

3.1 Election mechanism

1) Majority mechanism: the cluster is available as long as more than half of its machines are alive. For this reason Zookeeper is best installed on an odd number of machines.
2) Zookeeper does not designate a Master and Slaves in its configuration file. At run time, however, one node acts as the Leader and the others are Followers. The Leader is chosen by an internal election.
3) A simple example illustrates the whole election process.
Suppose there is a Zookeeper cluster of five servers with IDs 1 to 5. They are all newly started, so none of them has historical data and they all hold the same amount of data. The servers are started one after another, as shown in the figure.
[Figure: election process as the five servers start in sequence]

(1) Server 1 starts. It is the only server running, so the messages it sends get no response and it stays in the LOOKING state.
(2) Server 2 starts. It communicates with server 1 and the two exchange their votes. Neither has historical data, so server 2, with the larger ID, wins the comparison. However, it does not yet have the votes of more than half of the servers (more than half is 3 in this example), so servers 1 and 2 both remain in the LOOKING state.
(3) Server 3 starts. By the same reasoning, server 3 wins among servers 1, 2 and 3. This time three servers have voted for it, which is a majority, so server 3 becomes the Leader.
(4) Server 4 starts. By the same reasoning it would have the largest ID among servers 1, 2, 3 and 4, but more than half of the servers have already elected server 3, so server 4 can only become a Follower.
(5) Server 5 starts and, like server 4, becomes a Follower.
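
The five steps above can be replayed with a toy simulation. This is a simplified model, not Zookeeper's actual election code: it assumes, as in the example, that no server has historical data, so only the server ID is compared (a real election also compares transaction history). The names quorum_size and simulate_startup are invented for the sketch.

```python
def quorum_size(total_servers):
    """Smallest number of servers that is more than half of the cluster."""
    return total_servers // 2 + 1


def simulate_startup(start_order, total_servers):
    """Start servers one by one; return (leader, list of state snapshots)."""
    running = []
    leader = None
    snapshots = []
    for server_id in start_order:
        running.append(server_id)
        if leader is None:
            # With no history anywhere, every running server ends up
            # voting for the largest ID it has seen so far.
            candidate = max(running)
            votes = len(running)
            if votes >= quorum_size(total_servers):
                leader = candidate
        snapshot = {}
        for sid in running:
            if leader is None:
                snapshot[sid] = "LOOKING"
            elif sid == leader:
                snapshot[sid] = "LEADING"
            else:
                snapshot[sid] = "FOLLOWING"
        snapshots.append(snapshot)
    return leader, snapshots


leader, snapshots = simulate_startup([1, 2, 3, 4, 5], total_servers=5)
print("leader:", leader)
for step, snapshot in enumerate(snapshots, start=1):
    print(step, snapshot)
```

The simulation elects server 3 at the third step, and servers 4 and 5 join as Followers, matching the walkthrough.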

3.2 Node type


3.3 Stat structure

1) czxid - the zxid of the transaction that created the node.
Every change to the ZooKeeper state receives a stamp in the form of a zxid, which is the ZooKeeper transaction ID.
The transaction ID defines a total order over all modifications in ZooKeeper. Each modification has a unique zxid, and if zxid1 is less than zxid2, then zxid1 happened before zxid2.
2) ctime - the time the znode was created, in milliseconds since 1970
3) mzxid - the zxid of the transaction that last modified the znode
4) mtime - the time the znode was last modified, in milliseconds since 1970
5) pzxid - the zxid of the transaction that last changed the children of the znode
6) cversion - the number of changes to the children of the znode
7) dataVersion - the number of changes to the data of the znode
8) aclVersion - the number of changes to the access control list of the znode
9) ephemeralOwner - if the znode is an ephemeral node, the session id of its owner. It is 0 if the znode is not an ephemeral node.
10) dataLength - the length of the data of the znode
11) numChildren - the number of children of the znode
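
A zxid is a 64-bit number whose high 32 bits hold the leader epoch and whose low 32 bits hold a counter within that epoch. A minimal sketch of the split, using a zxid value that appears in the client output later in this tutorial (the function name split_zxid is made up for the example):

```python
def split_zxid(zxid):
    """Split a 64-bit zxid into (epoch, counter)."""
    epoch = zxid >> 32            # high 32 bits
    counter = zxid & 0xFFFFFFFF   # low 32 bits
    return epoch, counter


print(split_zxid(0x100000003))
# Ordering check: a smaller zxid means the change happened earlier.
print(0x100000003 < 0x100000004)
```

Because the epoch sits in the high bits, comparing two zxids as plain integers gives the ordering described above.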

3.4 Listener principle


3.5 Writing data process


4. Zookeeper in action

4.1 Distributed installation and deployment

1. Cluster planning:
Deploy Zookeeper on three nodes: hadoop102, hadoop103 and hadoop104.

2. Unzip and install
(1) Unzip the Zookeeper installation package to the /opt/module/ directory
[ybb@hadoop102 software]$ tar -zxvf zookeeper-3.4.10.tar.gz -C /opt/module/
(2) Synchronize the contents of the /opt/module/zookeeper-3.4.10 directory to hadoop103 and hadoop104
[ybb@hadoop102 module]$ xsync zookeeper-3.4.10/
3. Configure the server number
(1) Create zkData in the directory /opt/module/zookeeper-3.4.10/
[ybb@hadoop102 zookeeper-3.4.10]$ mkdir -p zkData
(2) Create a myid file in the /opt/module/zookeeper-3.4.10/zkData directory
[ybb@hadoop102 zkData]$ touch myid
Note: the myid file must be created on Linux. A file created in Notepad++ may end up garbled.
(3) Edit the myid file
[ybb@hadoop102 zkData]$ vi myid
Add the number corresponding to this server to the file:
2
(4) Copy the myid file to the other machines
[ybb@hadoop102 zkData]$ xsync myid
Then change the content of the myid file on hadoop103 and hadoop104 to 3 and 4 respectively.
4. Configure the zoo.cfg file
(1) Rename zoo_sample.cfg in the directory /opt/module/zookeeper-3.4.10/conf to zoo.cfg
[ybb@hadoop102 conf]$ mv zoo_sample.cfg zoo.cfg
(2) Open the zoo.cfg file
[ybb@hadoop102 conf]$ vim zoo.cfg
Modify the data storage path:
dataDir=/opt/module/zookeeper-3.4.10/zkData
Add the following configuration:
#######################cluster##########################
server.2=hadoop102:2888:3888
server.3=hadoop103:2888:3888
server.4=hadoop104:2888:3888
(3) Synchronize the zoo.cfg configuration file
[ybb@hadoop102 conf]$ xsync zoo.cfg
(4) Interpretation of configuration parameters
server.A=B:C:D
A is a number that identifies the server. In cluster mode each server has a file named myid in its dataDir directory, and that file contains the value of A. Zookeeper reads the file at startup and compares the value with the configuration in zoo.cfg to determine which server it is.
B is the address of this server.
C is the port this server uses to exchange information with the Leader of the cluster.
D is the port the servers use to communicate with each other during an election. If the Leader goes down, a new Leader is elected over this port.
5. Cluster operation
(1) Start Zookeeper on each node
[ybb@hadoop102 zookeeper-3.4.10]$ bin/zkServer.sh start
[ybb@hadoop103 zookeeper-3.4.10]$ bin/zkServer.sh start
[ybb@hadoop104 zookeeper-3.4.10]$ bin/zkServer.sh start
(2) View status
[ybb@hadoop102 zookeeper-3.4.10]$ bin/zkServer.sh status
JMX enabled by default
Using config: /opt/module/zookeeper-3.4.10/bin/…/conf/zoo.cfg
Mode: follower
[ybb@hadoop103 zookeeper-3.4.10]$ bin/zkServer.sh status
JMX enabled by default
Using config: /opt/module/zookeeper-3.4.10/bin/…/conf/zoo.cfg
Mode: leader
[ybb@hadoop104 zookeeper-3.4.10]$ bin/zkServer.sh status
JMX enabled by default
Using config: /opt/module/zookeeper-3.4.10/bin/…/conf/zoo.cfg
Mode: follower
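
To make the server.A=B:C:D format concrete, the sketch below splits such a line into its parts and looks up the myid value a given host should carry. It is only an illustration: the helper names are invented, and the parser assumes exactly the three colon-separated fields used in this tutorial.

```python
def parse_server_line(line):
    """Split 'server.A=B:C:D' into id, host and the two ports."""
    key, _, value = line.strip().partition("=")
    prefix, _, server_id = key.partition(".")
    if prefix != "server" or not server_id.isdigit():
        raise ValueError(f"not a server line: {line!r}")
    host, quorum_port, election_port = value.split(":")
    return {
        "id": int(server_id),                 # A: must match the myid file
        "host": host,                         # B: server address
        "quorum_port": int(quorum_port),      # C: Follower <-> Leader traffic
        "election_port": int(election_port),  # D: leader election traffic
    }


def myid_for_host(server_lines, hostname):
    """Return the number that belongs in the myid file on the given host."""
    for line in server_lines:
        entry = parse_server_line(line)
        if entry["host"] == hostname:
            return entry["id"]
    raise KeyError(hostname)


# The cluster section configured in this tutorial.
CLUSTER = [
    "server.2=hadoop102:2888:3888",
    "server.3=hadoop103:2888:3888",
    "server.4=hadoop104:2888:3888",
]

for host in ("hadoop102", "hadoop103", "hadoop104"):
    print(host, myid_for_host(CLUSTER, host))
```

A check like this can catch a myid file that does not match zoo.cfg before the servers are started.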

4.2 Client command line operation


1. Start client

[ybb@hadoop103 zookeeper-3.4.10]$ bin/zkCli.sh

2. Show all operation commands

[zk: localhost:2181(CONNECTED) 1] help

3. View the content contained in the current znode

[zk: localhost:2181(CONNECTED) 0] ls /
[zookeeper]

4. View detailed data of the current node

[zk: localhost:2181(CONNECTED) 1] ls2 /
[zookeeper]
cZxid = 0x0
ctime = Thu Jan 01 08:00:00 CST 1970
mZxid = 0x0
mtime = Thu Jan 01 08:00:00 CST 1970
pZxid = 0x0
cversion = -1
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 0
numChildren = 1

5. Create two ordinary nodes

[zk: localhost:2181(CONNECTED) 3] create /sanguo "jinlian"
Created /sanguo
[zk: localhost:2181(CONNECTED) 4] create /sanguo/shuguo "liubei"
Created /sanguo/shuguo

6. Get the value of a node

[zk: localhost:2181(CONNECTED) 5] get /sanguo
jinlian
cZxid = 0x100000003
ctime = Wed Aug 29 00:03:23 CST 2018
mZxid = 0x100000003
mtime = Wed Aug 29 00:03:23 CST 2018
pZxid = 0x100000004
cversion = 1
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 7
numChildren = 1
[zk: localhost:2181(CONNECTED) 6]
[zk: localhost:2181(CONNECTED) 6] get /sanguo/shuguo
liubei
cZxid = 0x100000004
ctime = Wed Aug 29 00:04:35 CST 2018
mZxid = 0x100000004
mtime = Wed Aug 29 00:04:35 CST 2018
pZxid = 0x100000004
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 6
numChildren = 0

7. Create an ephemeral node

[zk: localhost:2181(CONNECTED) 7] create -e /sanguo/wuguo "zhouyu"
Created /sanguo/wuguo
(1) It can be viewed on the current client
[zk: localhost:2181(CONNECTED) 3] ls /sanguo
[wuguo, shuguo]
(2) Exit the current client and then restart the client
[zk: localhost:2181(CONNECTED) 12] quit
[ybb@hadoop104 zookeeper-3.4.10]$ bin/zkCli.sh
(3) Check again: the ephemeral node under /sanguo has been deleted
[zk: localhost:2181(CONNECTED) 0] ls /sanguo
[shuguo]

8. Create a node with a serial number

(1) First create an ordinary parent node /sanguo/weiguo

[zk: localhost:2181(CONNECTED) 1] create /sanguo/weiguo "caocao"
Created /sanguo/weiguo
(2) Create nodes with serial numbers
[zk: localhost:2181(CONNECTED) 2] create -s /sanguo/weiguo/xiaoqiao "jinlian"
Created /sanguo/weiguo/xiaoqiao0000000000
[zk: localhost:2181(CONNECTED) 3] create -s /sanguo/weiguo/daqiao "jinlian"
Created /sanguo/weiguo/daqiao0000000001
[zk: localhost:2181(CONNECTED) 4] create -s /sanguo/weiguo/diaocan "jinlian"
Created /sanguo/weiguo/diaocan0000000002
If the parent node has no children yet, the serial number starts at 0 and increases by one for each new node. If the parent node already has 2 children, numbering starts from 2, and so on.
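
The naming pattern in the output above is the requested path followed by a 10-digit, zero-padded counter kept per parent node. The toy class below reproduces the names; it is a sketch of the naming rule only, and the class and method names are invented for the example.

```python
class SequenceParent:
    """Toy model of a parent znode that hands out sequential suffixes."""

    def __init__(self, existing_children=0):
        # Numbering continues from the number of children already created.
        self._next = existing_children

    def create_sequential(self, path_prefix):
        name = f"{path_prefix}{self._next:010d}"
        self._next += 1
        return name


weiguo = SequenceParent()
names = [
    weiguo.create_sequential(f"/sanguo/weiguo/{child}")
    for child in ("xiaoqiao", "daqiao", "diaocan")
]
for name in names:
    print(name)

# A parent that already has 2 children starts numbering from 2.
print(SequenceParent(existing_children=2).create_sequential("/demo/node"))
```

The path /demo/node in the last line is a placeholder chosen for the illustration.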

9. Modify node data value

[zk: localhost:2181(CONNECTED) 6] set /sanguo/weiguo "simayi"

10. Node value change monitoring

(1) Register a watch for data changes of the /sanguo node on the hadoop104 host

[zk: localhost:2181(CONNECTED) 8] get /sanguo watch
(2) Modify the data of the /sanguo node on the hadoop103 host
[zk: localhost:2181(CONNECTED) 1] set /sanguo "xisi"
(3) Observe that the hadoop104 host receives the data change notification
WATCHER::
WatchedEvent state:SyncConnected type:NodeDataChanged path:/sanguo

11. Child node change monitoring (path change)

(1) Register a watch for child node changes of the /sanguo node on the hadoop104 host

[zk: localhost:2181(CONNECTED) 1] ls /sanguo watch
[aa0000000001, server101]
(2) Create a child node under /sanguo on the hadoop103 host
[zk: localhost:2181(CONNECTED) 2] create /sanguo/jin "simayi"
Created /sanguo/jin
(3) Observe that the hadoop104 host receives the child node change notification
WATCHER::
WatchedEvent state:SyncConnected type:NodeChildrenChanged path:/sanguo
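
In this version of Zookeeper a watch registered with get ... watch or ls ... watch is a one-time trigger: it fires on the next matching change and must be registered again to receive later ones. The in-memory toy below illustrates that behaviour for data watches. It is a conceptual sketch with invented names, not the Zookeeper client API.

```python
class ToyZnodeStore:
    """In-memory stand-in for a server that supports one-shot data watches."""

    def __init__(self):
        self.data = {}
        self.data_watches = {}  # path -> list of callbacks

    def get(self, path, watch=None):
        """Read a value and optionally register a one-shot watch."""
        if watch is not None:
            self.data_watches.setdefault(path, []).append(watch)
        return self.data.get(path)

    def set(self, path, value):
        """Write a value and fire (then discard) the watches on that path."""
        self.data[path] = value
        callbacks = self.data_watches.pop(path, [])  # removed before firing
        for callback in callbacks:
            callback("NodeDataChanged", path)


events = []
store = ToyZnodeStore()
store.set("/sanguo", "jinlian")
store.get("/sanguo", watch=lambda kind, path: events.append((kind, path)))
store.set("/sanguo", "xisi")    # the registered watch fires here
store.set("/sanguo", "simayi")  # no watch is left, so nothing fires
print(events)
```

Only the first change after registration produces an event, which is why a client that wants continuous notifications registers the watch again inside its handler.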

12. Delete node

[zk: localhost:2181(CONNECTED) 4] delete /sanguo/jin

13. Recursively delete nodes

[zk: localhost:2181(CONNECTED) 15] rmr /sanguo/shuguo

14. View node status

[zk: localhost:2181(CONNECTED) 17] stat /sanguo
cZxid = 0x100000003
ctime = Wed Aug 29 00:03:23 CST 2018
mZxid = 0x100000011
mtime = Wed Aug 29 00:21:23 CST 2018
pZxid = 0x100000014
cversion = 9
dataVersion = 1
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 4
numChildren = 1
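
The stat output above is a list of name = value lines, so it is easy to load into a dictionary for scripted checks. A minimal sketch, assuming the exact layout printed by the client in this tutorial (the function name parse_stat is made up):

```python
def parse_stat(text):
    """Turn 'name = value' lines into a dict, converting numbers."""
    fields = {}
    for line in text.splitlines():
        key, separator, value = line.partition(" = ")
        if not separator:
            continue  # skip lines that are not 'name = value'
        key, value = key.strip(), value.strip()
        if value.startswith("0x"):
            fields[key] = int(value, 16)   # zxids and session ids are hex
        elif value.lstrip("-").isdigit():
            fields[key] = int(value)
        else:
            fields[key] = value            # timestamps stay as text
    return fields


# Output of 'stat /sanguo' copied from this section.
SAMPLE_STAT = """cZxid = 0x100000003
ctime = Wed Aug 29 00:03:23 CST 2018
mZxid = 0x100000011
mtime = Wed Aug 29 00:21:23 CST 2018
pZxid = 0x100000014
cversion = 9
dataVersion = 1
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 4
numChildren = 1
"""

stat = parse_stat(SAMPLE_STAT)
is_ephemeral = stat["ephemeralOwner"] != 0
print(stat["dataVersion"], stat["numChildren"], is_ephemeral)
```

Here ephemeralOwner is 0, so /sanguo is not an ephemeral node, and dataVersion is 1 because its data was changed once.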

Origin blog.csdn.net/qq_44696532/article/details/135451853