Big Data Technology - ZooKeeper: Coordination Service for Distributed Frameworks (1. Getting Started)

1. Getting Started with Zookeeper

1.1 Overview

Zookeeper is an open source distributed Apache project that provides coordination services for distributed frameworks.
Zookeeper working mechanism
ZooKeeper can be understood from the perspective of design patterns: it is a distributed service-management framework based on the observer pattern. It stores and manages the data that everyone cares about and accepts registrations from observers; once the state of that data changes, ZooKeeper notifies the registered observers so they can respond accordingly.

1.2 Features


1) A ZooKeeper cluster consists of one Leader and multiple Followers.
2) As long as more than half of the nodes in the cluster are alive, the cluster can serve normally, which is why ZooKeeper is best deployed on an odd number of servers.
3) Global data consistency: every server keeps a copy of the same data, and a client sees consistent data no matter which server it connects to.
4) Update requests are executed sequentially; update requests from the same client are executed in the order in which they were sent.
5) Atomicity of data updates: an update either succeeds completely or fails completely.
6) Real-time: within a bounded time window, a client can read the latest data.

1.3 Data structure

The ZooKeeper data model is structured much like a Unix file system: it can be viewed as a tree in which each node is called a ZNode. By default each ZNode can store up to 1 MB of data, and each ZNode is uniquely identified by its path.
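For example, from the Java client (covered in section 3.3) such a tree is addressed purely by paths; the node names and server address below are illustrative only:

import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

public class ZNodeTreeSketch {
    public static void main(String[] args) throws Exception {
        ZooKeeper zk = new ZooKeeper("hadoop102:2181", 2000, event -> { });

        // Each ZNode is addressed by its full path and can hold data
        // while also having children, unlike a plain file.
        zk.create("/app1", "app1 config".getBytes(), ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
        zk.create("/app1/p_1", "node 1".getBytes(), ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
        zk.create("/app1/p_2", "node 2".getBytes(), ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);

        System.out.println(zk.getChildren("/app1", false));   // e.g. [p_1, p_2]
        zk.close();
    }
}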


1.4 Application scenarios

The services provided include unified naming, unified configuration management, unified cluster management, dynamic online/offline of server nodes, soft load balancing, and so on.
Unified naming service
In a distributed environment, applications/services often need uniform names so they are easy to identify.
For example: an IP address is hard to remember, while a domain name is easy to remember.
Unified configuration management
1) In a distributed environment, configuration file synchronization is very common.
(1) It is generally required that the configuration information of all nodes in a cluster be consistent, e.g. in a Kafka cluster.
(2) After a configuration file is modified, it should be synchronized quickly to every node.
2) Configuration management can be implemented with ZooKeeper (a sketch follows after this list).
(1) The configuration information is written to a Znode on ZooKeeper.
(2) Each client server watches this Znode.
(3) Once the data in the Znode is modified, ZooKeeper notifies each client server.
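A minimal sketch of what this can look like on the Java client side (the /config path, the reload message, and the server address are assumptions for illustration; the client API itself is covered in section 3.3):

import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooKeeper;

public class ConfigWatchSketch {
    public static void main(String[] args) throws Exception {
        ZooKeeper zk = new ZooKeeper("hadoop102:2181", 2000, event -> { });

        Watcher configWatcher = new Watcher() {
            @Override
            public void process(WatchedEvent event) {
                if (event.getType() == Event.EventType.NodeDataChanged) {
                    try {
                        // Re-read the configuration and re-register the watch.
                        byte[] data = zk.getData("/config", this, null);
                        System.out.println("Reloaded config: " + new String(data));
                    } catch (Exception e) {
                        e.printStackTrace();
                    }
                }
            }
        };

        // The initial read registers the watch; ZooKeeper notifies us on the next change.
        byte[] data = zk.getData("/config", configWatcher, null);
        System.out.println("Initial config: " + new String(data));

        Thread.sleep(Long.MAX_VALUE);   // keep the client alive
    }
}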
Unified cluster management
1) In a distributed environment it is necessary to know the status of every node in real time.
(1) Adjustments can then be made according to the real-time status of the nodes.
2) ZooKeeper can monitor node status changes in real time.
(1) Node information is written to a ZNode on ZooKeeper.
(2) Watching this ZNode yields its status changes in real time.
Servers dynamically go online and offline
Soft load balancing
Record the number of visits to each server in ZooKeeper and let the server with the fewest visits handle the latest client request (a sketch follows below).
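A minimal sketch of one way to implement this, assuming each server's visit count is kept as the data of a child node under a hypothetical /balance path (the path, data format, class name, and server address are illustrative):

import java.nio.charset.StandardCharsets;
import java.util.List;

import org.apache.zookeeper.ZooKeeper;

public class SoftLoadBalanceSketch {
    public static void main(String[] args) throws Exception {
        // Hypothetical layout: each child of /balance is one server,
        // and its data is that server's current visit count, e.g. "17".
        ZooKeeper zk = new ZooKeeper("hadoop102:2181", 2000, event -> { });

        List<String> servers = zk.getChildren("/balance", false);
        String chosen = null;
        long minVisits = Long.MAX_VALUE;

        for (String server : servers) {
            byte[] data = zk.getData("/balance/" + server, false, null);
            long visits = Long.parseLong(new String(data, StandardCharsets.UTF_8));
            if (visits < minVisits) {       // pick the server with the fewest visits
                minVisits = visits;
                chosen = server;
            }
        }

        System.out.println("Route the next request to: " + chosen);
        zk.close();
    }
}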

1.5 Download address

1) Official website homepage: https://zookeeper.apache.org/
2) Open the download page.
3) Download the tar package to be installed in the Linux environment: apache-zookeeper-3.5.7-bin.tar.gz

2. Zookeeper local installation

2.1 Local Mode

1) Preparation before installation
(1) Install JDK
(2) Copy the apache-zookeeper-3.5.7-bin.tar.gz installation package to the Linux system
(3) Unzip it to the specified directory

[xusheng@hadoop102 software]$ tar -zxvf apache-zookeeper-3.5.7-bin.tar.gz -C /opt/module/


(4) Modify the name

[xusheng@hadoop102 module]$ mv apache-zookeeper-3.5.7-bin/ zookeeper-3.5.7


2) Configuration modification
(1) Rename zoo_sample.cfg under /opt/module/zookeeper-3.5.7/conf to zoo.cfg;

[xusheng@hadoop102 conf]$ mv zoo_sample.cfg zoo.cfg

(2) Open the zoo.cfg file and modify the dataDir path:

[xusheng@hadoop102 zookeeper-3.5.7]$ vim zoo.cfg

Modify the following content:

dataDir=/opt/module/zookeeper-3.5.7/zkData


(3) Create a zkData folder in the directory /opt/module/zookeeper-3.5.7/

[xusheng@hadoop102 zookeeper-3.5.7]$ mkdir zkData


3) Operate Zookeeper
(1) Start Zookeeper

[xusheng@hadoop102 zookeeper-3.5.7]$ bin/zkServer.sh start

(2) Check whether the process is started

[xusheng@hadoop102 zookeeper-3.5.7]$ jps
4020 Jps
4001 QuorumPeerMain


(3) View status

[xusheng@hadoop102 zookeeper-3.5.7]$ bin/zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /opt/module/zookeeper-3.5.7/bin/../conf/zoo.cfg
Mode: standalone


(4) Start the client

[xusheng@hadoop102 zookeeper-3.5.7]$ bin/zkCli.sh

(5) Exit the client:

[zk: localhost:2181(CONNECTED) 0] quit

(6) Stop Zookeeper

[xusheng@hadoop102 zookeeper-3.5.7]$ bin/zkServer.sh stop

2.2 Interpretation of configuration parameters

The interpretation of the parameters in the configuration file zoo.cfg in Zookeeper is as follows:
1) tickTime = 2000: communication heartbeat time between the ZooKeeper server and the client, in milliseconds.
2) initLimit = 10: LF initial communication time limit. The maximum number of heartbeats (tickTime intervals) the Leader and a Follower may take when first establishing the connection (here 10 × 2000 ms = 20 s).
3) syncLimit = 5: LF synchronization communication time limit. If communication between the Leader and a Follower exceeds syncLimit × tickTime (here 5 × 2000 ms = 10 s), the Leader considers the Follower dead and removes it from the server list.
4) dataDir: the directory where ZooKeeper saves its data. Note: the default tmp directory is easily cleaned up periodically by the Linux system, so the default tmp directory is generally not used.
5) clientPort = 2181: the client connection port, usually left unchanged.

3. Zookeeper cluster operation

3.1 Cluster operation

3.1.1 Cluster installation

1) Cluster planning
Zookeeper is deployed on three nodes of hadoop102, hadoop103 and hadoop104.
Thinking: If there are 10 servers, how many Zookeepers need to be deployed?
2) Unzip and install
(1) Unzip the Zookeeper installation package in hadoop102 to the /opt/module/ directory

[xusheng@hadoop102 software]$ tar -zxvf apache-zookeeper-3.5.7-bin.tar.gz -C /opt/module/

(2) Modify the name of apache-zookeeper-3.5.7-bin to zookeeper-3.5.7

[xusheng@hadoop102 module]$ mv apache-zookeeper-3.5.7-bin/ zookeeper-3.5.7

3) Configure the server number
(1) Create zkData in the /opt/module/zookeeper-3.5.7/ directory

[xusheng@hadoop102 zookeeper-3.5.7]$ mkdir zkData

(2) Create a myid file in the /opt/module/zookeeper-3.5.7/zkData directory

[xusheng@hadoop102 zkData]$ vi myid


Add this server's number to the file (note: no blank lines above or below, and no spaces before or after the number)

2

Note: the myid file must be created in Linux; creating it in Notepad++ may produce garbled characters.
(3) Copy the configured zookeeper to other machines

[xusheng@hadoop102 module]$ xsync zookeeper-3.5.7

And modify the contents of the myid file to 3 and 4 on hadoop103 and hadoop104 respectively.
4) Configure the zoo.cfg file
(1) Rename zoo_sample.cfg in the directory /opt/module/zookeeper-3.5.7/conf to zoo.cfg

[xusheng@hadoop102 conf]$ mv zoo_sample.cfg zoo.cfg

(2) Open the zoo.cfg file

[xusheng@hadoop102 conf]$ vim zoo.cfg

#Modify data storage path configuration

dataDir=/opt/module/zookeeper-3.5.7/zkData

#Add the following configuration

#######################cluster##########################
server.2=hadoop102:2888:3888
server.3=hadoop103:2888:3888
server.4=hadoop104:2888:3888


(3) Interpretation of configuration parameters
server.A=B:C:D.

A is a number indicating the server's ID. In cluster mode a file named myid is configured; this file is located in the dataDir directory and contains a single value, which is A. When ZooKeeper starts, it reads this file and compares its value with the configuration in zoo.cfg to determine which server it is.
B is the address of the server;
C is the port used by the Follower on this server to exchange information with the Leader of the cluster;
D is the port used for elections: if the Leader goes down, the servers use this port to communicate with each other while electing a new Leader.

(4) Synchronize the zoo.cfg configuration file

[xusheng@hadoop102 conf]$ xsync zoo.cfg


5) Cluster operation
(1) Start Zookeeper separately

[xusheng@hadoop102 zookeeper-3.5.7]$ bin/zkServer.sh start
[xusheng@hadoop103 zookeeper-3.5.7]$ bin/zkServer.sh start
[xusheng@hadoop104 zookeeper-3.5.7]$ bin/zkServer.sh start

(2) View status

[xusheng@hadoop102 zookeeper-3.5.7]# bin/zkServer.sh status
JMX enabled by default
Using config: /opt/module/zookeeper-3.5.7/bin/../conf/zoo.cfg
Mode: follower
[xusheng@hadoop103 zookeeper-3.5.7]# bin/zkServer.sh status
JMX enabled by default
Using config: /opt/module/zookeeper-3.5.7/bin/../conf/zoo.cfg
Mode: leader
[xusheng@hadoop104 zookeeper-3.5.7]# bin/zkServer.sh status
JMX enabled by default
Using config: /opt/module/zookeeper-3.5.7/bin/../conf/zoo.cfg
Mode: follower

3.1.2 Election mechanism (interview focus)

Zookeeper election mechanism - first start
SID: server ID. It uniquely identifies a machine in a ZooKeeper cluster, must not be duplicated, and is the same as myid.
ZXID: transaction ID. The ZXID identifies one change of server state. At a given moment the ZXID values of the machines in the cluster may not be exactly the same; this depends on how each ZooKeeper server processes client "update requests".
Epoch: the code name of each Leader's term. While there is no Leader, the logical clock value within one round of voting is the same; this number increases with every vote cast.

(1) Server 1 starts and initiates an election. Server 1 votes for itself. It now has one vote, which is not more than half (3 votes), so the election cannot complete and server 1 stays in the LOOKING state.
(2) Server 2 starts and another election is initiated. Servers 1 and 2 each vote for themselves and exchange ballot information. Server 1 finds that server 2's myid is larger than the server it currently votes for (itself), so it changes its vote to server 2. Now server 1 has 0 votes and server 2 has 2 votes. There is still no majority, so the election cannot complete and servers 1 and 2 stay LOOKING.
(3) Server 3 starts and initiates an election. Both servers 1 and 2 now change their votes to server 3. The result: server 1 has 0 votes, server 2 has 0 votes, server 3 has 3 votes. Server 3 has more than half the votes and is elected Leader; servers 1 and 2 change their state to FOLLOWING and server 3 changes to LEADING.
(4) Server 4 starts and initiates an election. Servers 1, 2 and 3 are no longer LOOKING and will not change their ballots. After exchanging ballot information the result is: server 3 has 3 votes, server 4 has 1 vote. Server 4 therefore follows the majority, changes its ballot to server 3, and changes its state to FOLLOWING.
(5) Server 5 starts and, like server 4, simply follows.

Zookeeper election mechanism - non-first startup

(1) A server in the ZooKeeper cluster enters Leader election when either of the following happens:
• The server is initialized and started.
• The server cannot maintain its connection with the Leader while running.
(2) When a machine enters the Leader election process, the cluster may be in one of two states:
• A Leader already exists in the cluster.
In this case, when the machine tries to elect a Leader it is told who the current Leader is; it then only needs to establish a connection with the Leader machine and synchronize state.
• No Leader exists in the cluster.
Suppose the ZooKeeper cluster consists of 5 servers with SIDs 1, 2, 3, 4 and 5 and ZXIDs 8, 8, 8, 7 and 7, and the server with SID 3 is the Leader. At some point servers 3 and 5 fail, and Leader election begins among servers 1, 2 and 4.
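By the rules summarized in section 6.1 (larger EPOCH wins, then larger ZXID, then larger SID), the remaining servers compare ballots as sketched below; the Vote class here is purely illustrative, not ZooKeeper's internal type.

import java.util.Arrays;
import java.util.Comparator;
import java.util.Optional;

public class ElectionSketch {
    // Illustrative ballot, not ZooKeeper's internal Vote class.
    static class Vote {
        final long epoch, zxid, sid;
        Vote(long epoch, long zxid, long sid) { this.epoch = epoch; this.zxid = zxid; this.sid = sid; }
    }

    public static void main(String[] args) {
        // Remaining servers after SID 3 and SID 5 fail: same epoch, ZXIDs 8, 8, 7.
        Vote[] votes = { new Vote(1, 8, 1), new Vote(1, 8, 2), new Vote(1, 7, 4) };

        // Rule: larger EPOCH wins, then larger ZXID, then larger SID.
        Optional<Vote> winner = Arrays.stream(votes)
                .max(Comparator.<Vote>comparingLong(v -> v.epoch)
                        .thenComparingLong(v -> v.zxid)
                        .thenComparingLong(v -> v.sid));

        System.out.println("Elected leader SID: " + winner.get().sid);   // prints 2
    }
}

With the same epoch, servers 1 and 2 both have ZXID 8 (beating server 4's 7), and the larger SID breaks the tie, so server 2 would become the new Leader in this scenario.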


3.1.3 ZK cluster start and stop script

1) Create a script in the /home/xusheng/bin directory of hadoop102

[xusheng@hadoop102 bin]$ vim zk.sh


Write the following in the script

#!/bin/bash

case $1 in
"start"){
	for i in hadoop102 hadoop103 hadoop104
	do
		echo ---------- zookeeper $i start ------------
		ssh $i "/opt/module/zookeeper-3.5.7/bin/zkServer.sh start"
	done
}
;;
"stop"){
	for i in hadoop102 hadoop103 hadoop104
	do
		echo ---------- zookeeper $i stop ------------
		ssh $i "/opt/module/zookeeper-3.5.7/bin/zkServer.sh stop"
	done
}
;;
"status"){
	for i in hadoop102 hadoop103 hadoop104
	do
		echo ---------- zookeeper $i status ------------
		ssh $i "/opt/module/zookeeper-3.5.7/bin/zkServer.sh status"
	done
}
;;
esac

2) Increase the script execution permission

[xusheng@hadoop102 bin]$ chmod u+x zk.sh

3) Zookeeper cluster startup script

[xusheng@hadoop102 module]$ zk.sh start


4) Zookeeper cluster stop script

[xusheng@hadoop102 module]$ zk.sh stop

3.2 Client command line

3.2.1 Command line syntax

1) Start the client

[xusheng@hadoop102 zookeeper-3.5.7]$ bin/zkCli.sh -server hadoop102:2181

2) Display all operation commands

[zk: hadoop102:2181(CONNECTED) 1] help


3.2.2 znode node data information

1) View the content contained in the current znode

[zk: hadoop102:2181(CONNECTED) 0] ls /
[zookeeper]

2) View the detailed data of the current node

[zk: hadoop102:2181(CONNECTED) 5] ls -s /
[zookeeper]
cZxid = 0x0
ctime = Thu Jan 01 08:00:00 CST 1970
mZxid = 0x0
mtime = Thu Jan 01 08:00:00 CST 1970
pZxid = 0x0
cversion = -1
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 0
numChildren = 1

(1) czxid: the zxid of the transaction that created the node. Every change to the ZooKeeper state produces a ZooKeeper transaction ID (zxid). The zxid gives the total order of all modifications in ZooKeeper: every modification has a unique zxid, and if zxid1 is smaller than zxid2, then the zxid1 change happened before the zxid2 change.
(2) ctime: the time the znode was created, in milliseconds since 1970
(3) mzxid: the zxid of the transaction that last updated the znode
(4) mtime: the time the znode was last modified, in milliseconds since 1970
(5) pZxid: the zxid of the last update to this znode's children
(6) cversion: the change number of the znode's children, i.e. how many times the children have been modified
(7) dataVersion: the change number of the znode's data
(8) aclVersion: the change number of the znode's access control list
(9) ephemeralOwner: if the znode is an ephemeral node, the session id of its owner; 0 if it is not ephemeral
(10) dataLength: the length of the znode's data
(11) numChildren: the number of child nodes of the znode
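For reference, the same fields are exposed on the Java client through the org.apache.zookeeper.data.Stat object returned by exists() and getData(); a minimal sketch is shown below (the server address is an assumption, matching the cluster used elsewhere in this document).

import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.data.Stat;

public class StatSketch {
    public static void main(String[] args) throws Exception {
        ZooKeeper zk = new ZooKeeper("hadoop102:2181", 2000, event -> { });

        Stat stat = zk.exists("/", false);   // null if the path does not exist
        if (stat != null) {
            System.out.println("cZxid          = 0x" + Long.toHexString(stat.getCzxid()));
            System.out.println("mZxid          = 0x" + Long.toHexString(stat.getMzxid()));
            System.out.println("pZxid          = 0x" + Long.toHexString(stat.getPzxid()));
            System.out.println("dataVersion    = " + stat.getVersion());
            System.out.println("cversion       = " + stat.getCversion());
            System.out.println("aclVersion     = " + stat.getAversion());
            System.out.println("ephemeralOwner = 0x" + Long.toHexString(stat.getEphemeralOwner()));
            System.out.println("dataLength     = " + stat.getDataLength());
            System.out.println("numChildren    = " + stat.getNumChildren());
        }
        zk.close();
    }
}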

3.2.3 Node types (persistent/ephemeral, with/without sequence number)

1) Create two ordinary nodes (persistent nodes without sequence numbers)

[zk: localhost:2181(CONNECTED) 3] create /sanguo "diaochan"
Created /sanguo
[zk: localhost:2181(CONNECTED) 4] create /sanguo/shuguo "liubei"
Created /sanguo/shuguo


Note: assign a value when creating a node.
2) Get the value of the node

[zk: localhost:2181(CONNECTED) 5] get -s /sanguo
diaochan
cZxid = 0x100000003
ctime = Wed Aug 29 00:03:23 CST 2018
mZxid = 0x100000003
mtime = Wed Aug 29 00:03:23 CST 2018
pZxid = 0x100000004
cversion = 1
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 7
numChildren = 1
[zk: localhost:2181(CONNECTED) 6] get -s /sanguo/shuguo
liubei
cZxid = 0x100000004
ctime = Wed Aug 29 00:04:35 CST 2018
mZxid = 0x100000004
mtime = Wed Aug 29 00:04:35 CST 2018
pZxid = 0x100000004
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 6
numChildren = 0

3) Create nodes with sequence numbers (persistent nodes with sequence numbers)
(1) First create an ordinary root node /sanguo/weiguo

[zk: localhost:2181(CONNECTED) 1] create /sanguo/weiguo "caocao"
Created /sanguo/weiguo

(2) Create a node with a serial number

[zk: localhost:2181(CONNECTED) 2] create -s /sanguo/weiguo/zhangliao "zhangliao"
Created /sanguo/weiguo/zhangliao0000000000
[zk: localhost:2181(CONNECTED) 3] create -s /sanguo/weiguo/zhangliao "zhangliao"
Created /sanguo/weiguo/zhangliao0000000001
[zk: localhost:2181(CONNECTED) 4] create -s /sanguo/weiguo/xuchu "xuchu"
Created /sanguo/weiguo/xuchu0000000002


If the node has no sequential children yet, the sequence number starts from 0 and increases in order. If two child nodes already exist under the parent, the next sequential node starts from 2, and so on.
4) Create ephemeral nodes (ephemeral nodes, with or without a sequence number)
(1) Create an ephemeral node without a sequence number

[zk: localhost:2181(CONNECTED) 7] create -e /sanguo/wuguo "zhouyu"
Created /sanguo/wuguo

(2) Create an ephemeral node with a sequence number

[zk: localhost:2181(CONNECTED) 2] create -e -s /sanguo/wuguo "zhouyu"
Created /sanguo/wuguo0000000001


(3) It can be viewed on the current client

[zk: localhost:2181(CONNECTED) 3] ls /sanguo
[wuguo, wuguo0000000001, shuguo]

(4) Exit the current client and restart the client

[zk: localhost:2181(CONNECTED) 12] quit
[xusheng@hadoop104 zookeeper-3.5.7]$ bin/zkCli.sh

(5) Check again: the ephemeral nodes have been deleted

[zk: localhost:2181(CONNECTED) 0] ls /sanguo
[shuguo]


5) Modify the node data value

[zk: localhost:2181(CONNECTED) 6] set /sanguo/weiguo "simayi"


3.2.4 Listener principle

The client registers listeners on the directory nodes it cares about. When a directory node changes (data changes, node deletion, child nodes added or removed), ZooKeeper notifies the client. This listening mechanism ensures that any change to any data stored in ZooKeeper is quickly propagated to the applications listening on that node.

1. Detailed explanation of the listener principle

1) First there must be a main() thread.
2) When a ZooKeeper client is created in the main thread, two more threads are created: one responsible for network connection communication (connect) and one responsible for listening (listener).
3) Registered listening events are sent to ZooKeeper through the connect thread.
4) ZooKeeper adds the registered listening events to its list of registered listeners.
5) When ZooKeeper detects a data or path change, it sends the message to the listener thread.
6) The listener thread then calls the process() method.

2. Common monitoring

1) Monitor the change of node data
get path [watch]
2) Monitor the change of child node increase or decrease
ls path [watch]

1) Node value change monitoring
(1) On the hadoop104 host, register a listener for data changes of the /sanguo node

[zk: localhost:2181(CONNECTED) 26] get -w /sanguo


(2) Modify the data of the /sanguo node on the hadoop103 host

[zk: localhost:2181(CONNECTED) 1] set /sanguo "xisi"

(3) Observe the data-change notification received on the hadoop104 host

WATCHER::
WatchedEvent state:SyncConnected type:NodeDataChanged
path:/sanguo

Note: if the value of /sanguo is modified again on hadoop103, hadoop104 will not receive another notification, because one registration yields only one notification. To be notified again, you must register again.
2) Monitor changes to a node's children (path changes)
(1) On the hadoop104 host, register a listener for child-node changes of the /sanguo node

[zk: localhost:2181(CONNECTED) 1] ls -w /sanguo
[shuguo, weiguo]

(2) On the hadoop103 host, create a child node under /sanguo

[zk: localhost:2181(CONNECTED) 2] create /sanguo/jin "simayi"
Created /sanguo/jin

(3) Observe the child-node-change notification received on the hadoop104 host

WATCHER::
WatchedEvent state:SyncConnected type:NodeChildrenChanged
path:/sanguo

Note: a watch on a node's path (children) is likewise registered once and fires once. To be notified multiple times, you must register multiple times; a re-registration sketch follows below.
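A common way to keep receiving notifications is to re-register the watcher from inside its own process() callback. Below is a minimal, self-contained sketch of this pattern; the server address and the /sanguo path follow the examples above, and the class name is made up for illustration.

import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooKeeper;

public class ReRegisterWatchSketch {
    public static void main(String[] args) throws Exception {
        ZooKeeper zk = new ZooKeeper("hadoop102:2181", 200000, event -> { });

        Watcher childWatcher = new Watcher() {
            @Override
            public void process(WatchedEvent event) {
                try {
                    // Re-register by reading the children again with this same watcher;
                    // otherwise only the first change would be reported.
                    System.out.println(event.getType() + " on " + event.getPath());
                    System.out.println(zk.getChildren("/sanguo", this));
                } catch (Exception e) {
                    e.printStackTrace();
                }
            }
        };

        // Initial registration.
        System.out.println(zk.getChildren("/sanguo", childWatcher));

        Thread.sleep(Long.MAX_VALUE); // keep the client alive so callbacks can arrive
    }
}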

3.2.5 Node deletion and viewing

1) Delete a node

[zk: localhost:2181(CONNECTED) 4] delete /sanguo/jin


2) Recursively delete nodes

[zk: localhost:2181(CONNECTED) 15] deleteall /sanguo/shuguo

3) View node status

[zk: localhost:2181(CONNECTED) 17] stat /sanguo
cZxid = 0x100000003
ctime = Wed Aug 29 00:03:23 CST 2018
mZxid = 0x100000011
mtime = Wed Aug 29 00:21:23 CST 2018
pZxid = 0x100000014
cversion = 9
dataVersion = 1
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 4
numChildren = 1

3.3 Client API operations

3.3.1 IDEA environment

1) Create a project: zookeeper
2) Add pom

 <dependencies>
    <dependency>
        <groupId>junit</groupId>
        <artifactId>junit</artifactId>
        <version>RELEASE</version>
    </dependency>
    <dependency>
        <groupId>org.apache.logging.log4j</groupId>
        <artifactId>log4j-core</artifactId>
        <version>2.8.2</version>
    </dependency>
        <dependency>
            <groupId>org.apache.zookeeper</groupId>
            <artifactId>zookeeper</artifactId>
            <version>3.5.7</version>
        </dependency>
    </dependencies>

3) Add the log4j.properties file to the project.
Create a new file named "log4j.properties" under the project's src/main/resources directory and fill it with the following content.

log4j.rootLogger=INFO, stdout
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=%d %p [%c] - %m%n
log4j.appender.logfile=org.apache.log4j.FileAppender
log4j.appender.logfile.File=target/spring.log
log4j.appender.logfile.layout=org.apache.log4j.PatternLayout
log4j.appender.logfile.layout.ConversionPattern=%d %p [%c] - %m%n

4) Create the package com.xusheng.zk
5) Create the class zkClient

3.3.2 Create ZooKeeper client

    // Note: there must be no spaces around the commas
    private String connectString = "192.168.10.102:2181,192.168.10.103:2181,192.168.10.104:2181";
    private int sessionTimeout = 200000;
    private ZooKeeper zkClient;

    @Before
    public void init() throws IOException {

        zkClient = new ZooKeeper(connectString, sessionTimeout, new Watcher() {
            @Override
            public void process(WatchedEvent watchedEvent) {

                System.out.println("-------------------------------");
                List<String> children = null;
                try {
                    children = zkClient.getChildren("/", true);

                    for (String child : children) {
                        System.out.println(child);
                    }

                    System.out.println("-------------------------------");
                } catch (KeeperException e) {
                    e.printStackTrace();
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
            }
        });
    }

3.3.3 Create child nodes

    // Create a child node
    @Test
    public void create() throws KeeperException, InterruptedException {
        // Parameter 1: path of the node to create; parameter 2: node data;
        // parameter 3: node ACL; parameter 4: node type
        String nodeCreated = zkClient.create("/xusheng", "ss.avi".getBytes(), ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
    }

Test: view the created node on the hadoop102 zk client

[zk: localhost:2181(CONNECTED) 16] get -s /xusheng
shuaige


3.3.4 Get child nodes and listen to nodes

    // Get child nodes
    @Test
    public void getChildren() throws KeeperException, InterruptedException {
        List<String> children = zkClient.getChildren("/", true);

        for (String child : children) {
            System.out.println(child);
        }

        // Block to keep the watcher alive
        Thread.sleep(Long.MAX_VALUE);
    }

(1) See the following nodes on the IDEA console:

zookeeper
sanguo
xusheng

(2) Create another node /xusheng1 on the hadoop102 client and observe the IDEA console

[zk: localhost:2181(CONNECTED) 3] create /xusheng1 "xusheng1"

(3) Delete the node /xusheng1 on the hadoop102 client and observe the IDEA console

[zk: localhost:2181(CONNECTED) 4] delete /xusheng1 


3.3.5 Determine whether Znode exists

    // Check whether a znode exists
    @Test
    public void exist() throws KeeperException, InterruptedException {
        Stat stat = zkClient.exists("/xusheng", false);
        System.out.println(stat == null ? "not exist" : "exist");
    }

3.3.6 Complete code

package com.xusheng.zk;

import org.apache.zookeeper.*;
import org.apache.zookeeper.data.Stat;
import org.junit.Before;
import org.junit.Test;

import java.io.IOException;
import java.util.List;

public class zkClient {

    // Note: there must be no spaces around the commas
    private String connectString = "192.168.10.102:2181,192.168.10.103:2181,192.168.10.104:2181";
    private int sessionTimeout = 200000;
    private ZooKeeper zkClient;

    @Before
    public void init() throws IOException {

        zkClient = new ZooKeeper(connectString, sessionTimeout, new Watcher() {
            @Override
            public void process(WatchedEvent watchedEvent) {

               /* System.out.println("-------------------------------");
                List<String> children = null;
                try {
                    children = zkClient.getChildren("/", true);

                    for (String child : children) {
                        System.out.println(child);
                    }

                    System.out.println("-------------------------------");
                } catch (KeeperException e) {
                    e.printStackTrace();
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }*/
            }
        });
    }

    // Create a child node
    @Test
    public void create() throws KeeperException, InterruptedException {
        // Parameter 1: path of the node to create; parameter 2: node data;
        // parameter 3: node ACL; parameter 4: node type
        String nodeCreated = zkClient.create("/xusheng", "ss.avi".getBytes(), ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
    }

    // Get child nodes
    @Test
    public void getChildren() throws KeeperException, InterruptedException {
        List<String> children = zkClient.getChildren("/", true);

        for (String child : children) {
            System.out.println(child);
        }

        // Block to keep the watcher alive
        Thread.sleep(Long.MAX_VALUE);
    }

    // Check whether a znode exists
    @Test
    public void exist() throws KeeperException, InterruptedException {
        Stat stat = zkClient.exists("/xusheng", false);
        System.out.println(stat == null ? "not exist" : "exist");
    }
}

3.4 The process of writing data from the client to the server

The write request is sent directly to the Leader node: the Leader writes the data and forwards the write to the Followers; once more than half of the servers have written it, the Leader acknowledges the client, and the remaining servers are synchronized afterwards.

The write request is sent to a Follower node: the Follower forwards the request to the Leader, the Leader carries out the same majority write, then notifies the Follower that received the request, and that Follower acknowledges the client.

4. Case: dynamic monitoring of servers going online and offline

4.1 Requirements

In a distributed system there can be multiple master node servers, and they can go online and offline dynamically; every client must be able to perceive in real time which master node servers are online and offline.

4.2 Demand Analysis

The server goes online and offline dynamically

4.3 Implementation

(1) First create a /servers node on the cluster

[zk: localhost:2181(CONNECTED) 10] create /servers "servers"
Created /servers

(2) Create a package name in Idea: com.xusheng.zkcase1
(3) Code for the server side to register with ZooKeeper (a sketch follows below)
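The original listing is not reproduced here; the following is a minimal sketch of what the two classes typically look like, based on the behavior exercised in 4.4 (the server registers an ephemeral sequential node under /servers, the client watches the children of /servers). Class names, paths, and addresses follow the document's examples; details may differ from the original code, and both classes are shown in one listing for brevity.

package com.xusheng.zkcase1;

import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

import java.util.ArrayList;
import java.util.List;

// Server side: register an ephemeral sequential node under /servers when coming online.
public class DistributeServer {
    private static final String connectString = "hadoop102:2181,hadoop103:2181,hadoop104:2181";
    private static final int sessionTimeout = 2000;
    private ZooKeeper zk;

    public static void main(String[] args) throws Exception {
        DistributeServer server = new DistributeServer();
        server.getConnect();
        server.register(args[0]);       // e.g. "hadoop102"
        server.business();
    }

    private void getConnect() throws Exception {
        zk = new ZooKeeper(connectString, sessionTimeout, event -> { });
    }

    private void register(String hostname) throws Exception {
        // Ephemeral + sequential: the node disappears automatically when this server goes offline.
        String path = zk.create("/servers/" + hostname, hostname.getBytes(),
                ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL_SEQUENTIAL);
        System.out.println(hostname + " is online: " + path);
    }

    private void business() throws InterruptedException {
        Thread.sleep(Long.MAX_VALUE);   // stand-in for real work
    }
}

// Client side: watch the children of /servers and reprint the list on every change.
class DistributeClient {
    private static final String connectString = "hadoop102:2181,hadoop103:2181,hadoop104:2181";
    private static final int sessionTimeout = 2000;
    private ZooKeeper zk;

    public static void main(String[] args) throws Exception {
        DistributeClient client = new DistributeClient();
        client.getConnect();
        client.getServerList();
        Thread.sleep(Long.MAX_VALUE);
    }

    private void getConnect() throws Exception {
        zk = new ZooKeeper(connectString, sessionTimeout, event -> {
            if (event.getType() == Watcher.Event.EventType.NodeChildrenChanged) {
                try {
                    getServerList();    // re-register the watch on every notification
                } catch (Exception e) {
                    e.printStackTrace();
                }
            }
        });
    }

    private void getServerList() throws Exception {
        List<String> children = zk.getChildren("/servers", true);
        List<String> servers = new ArrayList<>();
        for (String child : children) {
            servers.add(new String(zk.getData("/servers/" + child, false, null)));
        }
        System.out.println(servers);
    }
}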

4.4 Testing

1) Add or remove servers from the Linux command line
(1) Start the DistributeClient client
(2) On the hadoop102 zk client, create ephemeral sequential nodes under the /servers directory

[zk: localhost:2181(CONNECTED) 1] create -e -s /servers/hadoop102 "hadoop102"
[zk: localhost:2181(CONNECTED) 2] create -e -s /servers/hadoop103 "hadoop103"


(3) Observe the changes in the IDEA console

[hadoop102, hadoop103]

(4) Execute the delete operation

[zk: localhost:2181(CONNECTED) 8] delete  /servers/hadoop1020000000000

(5) Observe the changes in the IDEA console

[hadoop103]


2) Add or remove servers from IDEA
(1) Start the DistributeClient client (no restart is needed if it is already running)
(2) Start the DistributeServer service

5. ZooKeeper distributed lock case

What is a distributed lock?
For example, when "process 1" uses a resource, it first acquires the lock. While "process 1" holds the lock it has exclusive use of the resource, so other processes cannot access it. When "process 1" is finished with the resource it releases the lock, and other processes can then acquire it. Through this lock mechanism we can ensure that multiple processes in a distributed system access the critical resource in an orderly manner. A lock used this way in a distributed environment is called a distributed lock.

5.1 Native Zookeeper implementation of distributed lock case

1) Distributed lock implementation

package com.xusheng.zk.case2;

import org.apache.zookeeper.*;
import org.apache.zookeeper.data.Stat;

import java.io.IOException;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;

public class DistributedLock {

    private final String connectString = "hadoop102:2181,hadoop103:2181,hadoop104:2181";
    private final int sessionTimeout = 2000;
    private final ZooKeeper zk;
    // Waits for the ZooKeeper connection to be established
    private CountDownLatch connectLatch = new CountDownLatch(1);
    // Waits for the preceding node to be deleted
    private CountDownLatch waitLatch = new CountDownLatch(1);
    // Child node this client is waiting on
    private String waitPath;
    // Child node created by this client
    private String currentMode;

    public DistributedLock() throws IOException, InterruptedException, KeeperException {

        // Get the connection
        zk = new ZooKeeper(connectString, sessionTimeout, new Watcher() {
            @Override
            public void process(WatchedEvent watchedEvent) {
                // Release connectLatch once connected to zk,
                // waking up the thread waiting on this latch
                if (watchedEvent.getState() == Event.KeeperState.SyncConnected) {
                    connectLatch.countDown();
                }

                // Release waitLatch when the node at waitPath has been deleted
                if (watchedEvent.getType() == Event.EventType.NodeDeleted && watchedEvent.getPath().equals(waitPath)) {
                    waitLatch.countDown();
                }
            }
        });

        // Wait until the zk connection is established before continuing
        connectLatch.await();

        // Check whether the root node /locks exists
        Stat stat = zk.exists("/locks", false);
        // If the root node does not exist, create it as a persistent node
        if (stat == null) {
            zk.create("/locks", "locks".getBytes(), ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
        }
    }

    // Acquire the lock
    public void zklock() {
        try {
            // Create an ephemeral sequential node under the root node;
            // the return value is the path of the created node
            currentMode = zk.create("/locks/" + "seq-", null, ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL_SEQUENTIAL);

            // Wait a moment so the result is easier to observe
            Thread.sleep(10);

            // Check whether the created node has the smallest sequence number;
            // if so the lock is acquired, otherwise watch the node just before it
            List<String> children = zk.getChildren("/locks", false);

            // If there is only one child, take the lock directly;
            // otherwise determine which node is the smallest
            if (children.size() == 1) {
                return;
            } else {
                Collections.sort(children);

                // Node name, e.g. seq-00000000
                String thisNode = currentMode.substring("/locks/".length());
                // Position of this node in the children list
                int index = children.indexOf(thisNode);

                if (index == -1) {
                    System.out.println("Unexpected data: node not found");
                } else if (index == 0) {
                    // index == 0 means thisNode is the smallest, so this client gets the lock
                    return;
                } else {
                    // Watch the node immediately before this one
                    waitPath = "/locks/" + children.get(index - 1);
                    // Register a watcher on waitPath; when it is deleted,
                    // ZooKeeper calls the watcher's process method
                    zk.getData(waitPath, true, new Stat());

                    // Wait for the notification
                    waitLatch.await();

                    return;
                }
            }
        } catch (KeeperException e) {
            e.printStackTrace();
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    }

    // Release the lock
    public void unZkLock() {
        // Delete the node
        try {
            zk.delete(this.currentMode, -1);
        } catch (InterruptedException e) {
            e.printStackTrace();
        } catch (KeeperException e) {
            e.printStackTrace();
        }
    }
}

2) Distributed lock test
(1) Create two threads

package com.xusheng.zk.case2;

import org.apache.zookeeper.KeeperException;
import java.io.IOException;

public class DistributedLockTest {

    public static void main(String[] args) throws InterruptedException, IOException, KeeperException {
        // Create distributed lock 1
        final DistributedLock lock1 = new DistributedLock();
        // Create distributed lock 2
        final DistributedLock lock2 = new DistributedLock();

        new Thread(new Runnable() {
            @Override
            public void run() {
                // Acquire the lock
                try {
                    lock1.zklock();
                    System.out.println("Thread 1 started, lock acquired");
                    Thread.sleep(5 * 1000);

                    lock1.unZkLock();
                    System.out.println("Thread 1 released the lock");
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
            }
        }).start();

        new Thread(new Runnable() {
            @Override
            public void run() {
                // Acquire the lock
                try {
                    lock2.zklock();
                    System.out.println("Thread 2 started, lock acquired");
                    Thread.sleep(5 * 1000);

                    lock2.unZkLock();
                    System.out.println("Thread 2 released the lock");
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
            }
        }).start();
    }
}

(2) Observe console changes:

5.2 Curator framework implements distributed lock case

1) Problems with developing against the native Java API:
(1) Session connection is asynchronous and must be handled by the developer, for example with a CountDownLatch.
(2) Watches must be registered repeatedly or they stop taking effect.
(3) Development complexity is still fairly high.
(4) Multi-node deletion and creation are not supported; you have to recurse yourself.
2) Curator is a framework dedicated to solving distributed lock problems, and it addresses the issues listed above for native Java API development.
For details, see the official documentation: https://curator.apache.org/index.html
3) Curator case practice
(1) Add dependencies

 		<dependency>
            <groupId>org.apache.curator</groupId>
            <artifactId>curator-framework</artifactId>
            <version>4.3.0</version>
        </dependency>
        <dependency>
            <groupId>org.apache.curator</groupId>
            <artifactId>curator-recipes</artifactId>
            <version>4.3.0</version>
        </dependency>
        <dependency>
            <groupId>org.apache.curator</groupId>
            <artifactId>curator-client</artifactId>
            <version>4.3.0</version>
        </dependency>

(2) Code implementation

package com.xusheng.zk.case3;

import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.framework.recipes.locks.InterProcessMutex;
import org.apache.curator.retry.ExponentialBackoffRetry;

public class CuratorLockTest {

    public static void main(String[] args) {
        // Create distributed lock 1
        InterProcessMutex lock1 = new InterProcessMutex(getCuratorFramework(), "/locks");
        // Create distributed lock 2
        InterProcessMutex lock2 = new InterProcessMutex(getCuratorFramework(), "/locks");

        new Thread(new Runnable() {
            @Override
            public void run() {
                // Acquire the lock
                try {
                    lock1.acquire();
                    System.out.println("Thread 1 acquired the lock");

                    lock1.acquire();  // test lock re-entrancy
                    System.out.println("Thread 1 acquired the lock again");

                    Thread.sleep(5 * 1000);

                    lock1.release();
                    System.out.println("Thread 1 released the lock");

                    lock1.release();
                    System.out.println("Thread 1 released the lock again");

                } catch (Exception e) {
                    e.printStackTrace();
                }
            }
        }).start();

        new Thread(new Runnable() {
            @Override
            public void run() {
                // Acquire the lock
                try {
                    lock2.acquire();
                    System.out.println("Thread 2 acquired the lock");

                    lock2.acquire();  // test lock re-entrancy
                    System.out.println("Thread 2 acquired the lock again");

                    Thread.sleep(5 * 1000);

                    lock2.release();
                    System.out.println("Thread 2 released the lock");

                    lock2.release();
                    System.out.println("Thread 2 released the lock again");

                } catch (Exception e) {
                    e.printStackTrace();
                }
            }
        }).start();
    }

    // Distributed lock initialization
    private static CuratorFramework getCuratorFramework() {
        // Retry policy: initial sleep time 3 seconds, retry 3 times
        ExponentialBackoffRetry policy = new ExponentialBackoffRetry(3000, 3);
        // Create the Curator client through the factory
        CuratorFramework client = CuratorFrameworkFactory.builder()
                .connectString("hadoop102:2181,hadoop103:2181,hadoop104:2181")
                .connectionTimeoutMs(2000)
                .sessionTimeoutMs(2000)
                .retryPolicy(policy)
                .build();
        // Start the client
        client.start();

        System.out.println("zookeeper client started successfully");
        return client;
    }
}

(3) Observe console changes:

6. Enterprise interview questions (interview focus)

6.1 Election mechanism

Majority mechanism: a decision passes once more than half of the servers vote for it.
(1) Election rules at first startup:
when more than half the votes are cast, the server with the larger server id wins.
(2) Election rules at non-first (subsequent) startups:
① the larger EPOCH wins directly;
② with equal EPOCH, the larger transaction id wins;
③ with equal transaction id, the larger server id wins.

6.2 How many ZooKeeper servers should be installed in a production cluster?

Install an odd number of units.
Production experience:
⚫ 10 servers: 3 zk;
⚫ 20 servers: 5 zk;
⚫ 100 servers: 11 zk;
⚫ 200 servers: 11 zk
More zk servers: benefit, higher reliability; drawback, higher communication latency.

6.3 Common commands

ls、get、create、delete


Origin blog.csdn.net/m0_52435951/article/details/124579370