Setting up a ZooKeeper environment (beginner-friendly, step-by-step tutorial)

1. Introduction to ZooKeeper

ZooKeeper is an indispensable component of many distributed systems.

ZooKeeper is a highly available framework for distributed data management and coordination, and it helps keep data consistent in a distributed environment.

ZooKeeper was created at Yahoo and is commonly described as an open source implementation of Google Chubby. Chubby's consistency is based on the Paxos algorithm, whereas ZK uses ZAB (the ZooKeeper Atomic Broadcast protocol), a Paxos-like protocol.

The main application scenarios of ZooKeeper include data publishing/subscription, load balancing, naming services, distributed coordination/notification, cluster management, Master election, distributed locks, and distributed queues. ZooKeeper is used as a core component in a growing number of distributed systems (Hadoop, HBase, Storm, Kafka).

A server has two main roles: Leader and Follower.

  • Leader: initiates and resolves votes and updates the system state;

  • Follower: receives client requests, returns results to the client, and votes during elections.

Figure: ZooKeeper cluster

A fun fact about the ZooKeeper name: in the early stage of the project, many internal projects were named after animals (such as the famous Pig project), and Yahoo engineers wanted an animal name for this one too. Raghu Ramakrishnan, chief scientist of the research institute at the time, joked: "If this goes on, we will become a zoo!" Taken together, Yahoo's distributed components did look like a large zoo, and this project coordinates the distributed environment, so the name ZooKeeper was born.

2. ZooKeeper installation

There are two ways to deploy ZooKeeper:

  • Standalone mode: a single server, used in development environments
  • Cluster mode (multiple servers): used in production environments, with an odd number of servers

Why should the number of ZK servers be odd?

ZooKeeper has the following property: the cluster is available to the outside world as long as more than half of its machines are working normally. With 2 servers, if 1 fails, the remaining 1 is not more than half, so the cluster becomes unavailable; the fault tolerance of 2 servers is 0. With 3 servers, if 1 fails, the remaining 2 are still more than half, so the fault tolerance of 3 servers is 1. Listing a few more (servers -> tolerance):

  • 2 -> 0
  • 3 -> 1
  • 4 -> 1
  • 5 -> 2
  • 6 -> 2

The rule is that 2n and 2n-1 servers have the same tolerance, n-1, so adding one server to an odd-sized cluster buys no extra fault tolerance.

It follows that a ZK cluster needs at least three servers to tolerate any failure.
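The tolerance rule above is simple arithmetic, and a small sketch can make it concrete. The shell functions below are illustrative helpers (the names quorum_size and fault_tolerance are made up for this article and are not part of ZooKeeper); they assume only that a quorum is a strict majority of the servers.

```shell
# Smallest strict majority of n servers.
quorum_size() {
    echo $(( $1 / 2 + 1 ))
}

# Number of servers that may fail while a majority is still running.
fault_tolerance() {
    echo $(( $1 - ($1 / 2 + 1) ))
}

# Print the table for cluster sizes 1 to 7.
for n in 1 2 3 4 5 6 7; do
    echo "servers=$n quorum=$(quorum_size "$n") tolerance=$(fault_tolerance "$n")"
done
```

Sizes 3 and 4 both print tolerance=1, and sizes 5 and 6 both print tolerance=2, which is why even cluster sizes are usually skipped.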

0. Preparations

Requirements list:

  • OS: Ubuntu-18.04

    For operating system installation steps, see: Virtual Machine Installation (Nanny Level Tutorial)

  • ZooKeeper 3.7.0

    Official website download address: https://zookeeper.apache.org/releases.html

  • JDK: JDK 1.8

    ZooKeeper runs on the JVM. The ZK version installed in this article requires JDK 1.8 or later (JDK 8 LTS, JDK 11 LTS, JDK 12; Java 9 and 10 are not supported).

    Official website download address: http://java.sun.com/javase/downloads/index.jsp

For beginners' convenience, the required software is also available on Baidu Netdisk:

Link: https://pan.baidu.com/s/1kjcuNNCY2FxYA5o7Z2tgkQ
Extraction code: nuli

Official installation steps:

1) Install JDK

2) Set the Java heap size

This step is important for avoiding swapping, which seriously degrades ZooKeeper performance. To determine the correct value, run load tests and make sure the heap stays well below the limit at which the machine starts to swap.

3) Install ZooKeeper

4) Create a configuration file. The file name is arbitrary, but it is recommended to put the file in ZooKeeper's conf directory and name it zoo.cfg, so that the service can be started without specifying the configuration file.

Complete the following configuration:

tickTime=2000
initLimit=5
syncLimit=2
dataDir=/var/lib/zookeeper/
clientPort=2181
maxClientCnxns=60
server.1=zoo1:2888:3888
server.2=zoo2:2888:3888
server.3=zoo3:2888:3888

Parameter description (parameter, usual value, meaning):

  • tickTime (2000): client-server heartbeat interval. The interval, in milliseconds, at which heartbeats are exchanged between ZooKeeper servers or between clients and servers; one heartbeat is sent every tickTime.

  • initLimit (10): time limit for the initial Leader-Follower connection. The maximum number of heartbeats (multiples of tickTime) tolerated while a follower server (F) establishes its initial connection to the leader server (L).

  • syncLimit (5): time limit for Leader-Follower synchronization. The maximum number of heartbeats (multiples of tickTime) tolerated between a request and its response between a follower and the leader.

  • dataDir (/tmp/zookeeper): data file directory. The directory in which ZooKeeper stores its data; by default, the transaction log files are also written here.

  • clientPort (2181): client connection port. ZooKeeper listens on this port and accepts client requests.

  • maxClientCnxns (60): maximum number of concurrent connections from a single client (identified by IP address).

  • server.id=host:port:port: cluster information (server number, server address, Leader-Follower communication port, election port). The entry is written in the format server.N=YYY:A:B, where:

    • N is the serial number of the server in the cluster; a file named myid must be created in the dataDir directory, and its content is the corresponding number N
    • YYY is the address of the server
    • A is the port used for communication between machines in the cluster (only the leader listens on this port)
    • B is the port used for leader election (every ZooKeeper server listens on this port)

For more information about parameters, please refer to: https://zookeeper.apache.org/doc/r3.7.0/zookeeperAdmin.html#sc_configuration
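initLimit and syncLimit are counted in ticks, so the real time limits depend on tickTime. As a rough sketch (the helper names cfg_value and limit_ms are hypothetical, and the scratch file only copies three values from the sample configuration earlier in this step), the limits can be converted to milliseconds like this:

```shell
# Copy three values from the sample configuration into a scratch file.
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
tickTime=2000
initLimit=5
syncLimit=2
EOF

# Print the value of one key from a key=value config file.
# Usage: cfg_value <key> <config-file>
cfg_value() {
    sed -n "s/^$1=//p" "$2"
}

# Convert a limit expressed in ticks to milliseconds.
# Usage: limit_ms <key> <config-file>
limit_ms() {
    echo $(( $(cfg_value "$1" "$2") * $(cfg_value tickTime "$2") ))
}

echo "initLimit: $(limit_ms initLimit "$cfg") ms"   # initLimit: 10000 ms
echo "syncLimit: $(limit_ms syncLimit "$cfg") ms"   # syncLimit: 4000 ms
rm -f "$cfg"
```

With these values a follower has 10 seconds to connect to the leader and 4 seconds to answer a synchronization request.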

5) Create the myid file

Create a file named myid in the dataDir directory set in the previous step.

The myid file contains a single line with that server's ID.

The ID must be between 1 and 255. If extended features such as TTL nodes are enabled, the ID must be between 1 and 254.

6) Create the initialize marker file

The initialize file is located in the dataDir directory and is created when a new cluster is started.
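Steps 5) and 6) can be scripted. The function below is only a sketch: prepare_data_dir is a hypothetical helper name, the range check uses the 1 to 255 limit quoted above (it does not handle the stricter 1 to 254 limit for TTL nodes), and the demonstration runs in a scratch directory rather than a real dataDir.

```shell
# Create <dataDir>/myid and the empty initialize marker for a new server.
# Usage: prepare_data_dir <dataDir> <id>
prepare_data_dir() {
    # Reject empty, non-numeric, or overlong values before comparing numbers.
    case "$2" in
        ''|*[!0-9]*|????*)
            echo "invalid server id: '$2'" >&2
            return 1 ;;
    esac
    if [ "$2" -lt 1 ] || [ "$2" -gt 255 ]; then
        echo "server id must be between 1 and 255: $2" >&2
        return 1
    fi
    mkdir -p "$1" && echo "$2" > "$1/myid" && touch "$1/initialize"
}

# Demonstration in a scratch directory that stands in for dataDir.
demo_dir=$(mktemp -d)
prepare_data_dir "$demo_dir" 1
cat "$demo_dir/myid"      # prints 1
rm -rf "$demo_dir"
```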

7) Start the ZooKeeper service as follows

$ java -cp zookeeper.jar:lib/*:conf org.apache.zookeeper.server.quorum.QuorumPeerMain zoo.conf

1. Stand-alone mode

Standalone mode is the first choice for beginners and for users with limited resources. This article mainly covers installing ZK in standalone mode.

The steps below assume the current user name is xiaobai. If yours is different, either create a xiaobai user or adjust the paths in the configuration to match your user name. Following the official installation steps, install as follows:

1) Install JDK

Skip this step if the JDK is already installed.

Convention:

  • Installation packages are uploaded or downloaded to the soft directory under the home directory: ~/soft

  • Software is installed in the opt directory under the home directory: ~/opt

Upload the JDK package to the ~/soft directory, then run ls ~/soft to verify that the file is there.

Next, extract the archive:

mkdir ~/opt
tar -xvf ~/soft/jdk-8u261-linux-x64.tar.gz  -C ~/opt

Create a soft link:

cd ~/opt
ln -s jdk1.8.0_261/ jdk

Configure the environment variables by opening the bash configuration file:

cd 
vi .bashrc

Press i to enter insert mode, add the following lines at the end of the file, press Esc to leave insert mode, then type :x to save and exit.

export JAVA_HOME=/home/xiaobai/opt/jdk
export PATH=$PATH:$JAVA_HOME/bin

Run the following commands to make the change take effect and to verify, with the java command, that the configuration succeeded:

source .bashrc
java

2) Install ZooKeeper

(1) Upload the ZooKeeper installation package apache-zookeeper-3.7.0-bin.tar.gz, downloaded from Baidu Netdisk, to the ~/soft directory

Alternatively, copy the download link from the official website and download the package with the wget command.

Run ls ~/soft to verify that the file has been uploaded.

(2) Extract the ZooKeeper installation package into the ~/opt directory

tar -xvf ~/soft/apache-zookeeper-3.7.0-bin.tar.gz  -C ~/opt
ls ~/opt/apache-zookeeper-3.7.0-bin

(3) Create soft links

cd ~/opt
ln -s apache-zookeeper-3.7.0-bin zookeeper

The installation directory contains:

  • bin: executable scripts, such as the commonly used zkServer.sh and zkCli.sh
  • conf: configuration files
  • docs: related documentation
  • lib: the required jar packages

(4) Modify the configuration file

cd ~/opt/zookeeper/conf
cp zoo_sample.cfg zoo.cfg
vi zoo.cfg

Change the value of dataDir to /home/xiaobai/opt/zookeeper/tmp

Note for beginners: if you don't want to use the vi command, you can run sudo gedit ~/opt/zookeeper/conf/zoo.cfg to edit the file in a graphical editor similar to Notepad on Windows. The same applies to similar situations later.

Basic vi usage: after opening the file, type i to enter insert mode => modify the file content => press Esc to return to command mode => type : to enter bottom-line mode => type x or wq to save and exit.

If you do not want to save your changes, enter bottom-line mode and type q! to exit without saving.
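For readers who prefer not to edit interactively, the dataDir change can also be made with sed. This is a minimal sketch rather than the official procedure: set_data_dir is a hypothetical helper, it assumes the file already has exactly one line starting with dataDir=, and the demonstration works on a scratch file instead of the real zoo.cfg.

```shell
# Set dataDir in a ZooKeeper config file without opening an editor.
# Usage: set_data_dir <config-file> <new-dataDir>
set_data_dir() {
    tmp=$(mktemp) || return 1
    # '|' is the sed delimiter because the path itself contains '/'.
    sed "s|^dataDir=.*|dataDir=$2|" "$1" > "$tmp" && cat "$tmp" > "$1"
    rm -f "$tmp"
}

# Demonstration on a scratch file with two config lines.
demo_cfg=$(mktemp)
printf 'tickTime=2000\ndataDir=/tmp/zookeeper\n' > "$demo_cfg"
set_data_dir "$demo_cfg" /home/xiaobai/opt/zookeeper/tmp
grep '^dataDir=' "$demo_cfg"    # dataDir=/home/xiaobai/opt/zookeeper/tmp
rm -f "$demo_cfg"
```

On the real file the call would be set_data_dir ~/opt/zookeeper/conf/zoo.cfg /home/xiaobai/opt/zookeeper/tmp.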

3) Configure environment variables

vi ~/.bashrc

Add the following at the end of the file:

export ZOOKEEPER_HOME=/home/xiaobai/opt/zookeeper
export PATH=$ZOOKEEPER_HOME/bin:$PATH

Make environment variables take effect:

source ~/.bashrc

4) Start Zookeeper

zkServer.sh start

View the process

Run jps to check whether startup succeeded; a QuorumPeerMain process should be listed.

View the status

Run zkServer.sh status to view the server status; in standalone mode the output includes Mode: standalone.

5) Client connection

Access methods:

  • Via client tools:

    • Command Line Tool: zkCli.sh

    • Interface tool: ZooInspector

      Download address: https://issues.apache.org/jira/secure/attachment/12436620/ZooInspector.zip

      It can also be downloaded from the Baidu network disk provided earlier.

  • Via the Java API

Here is a simple demonstration with the command-line tool zkCli.sh:

(1) Start the client

zkCli.sh -server localhost:2181

(2) Create a node

create /test 888
create -s /test/lock 666
create -s /test/lock 666

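Because of the -s flag, the two create -s commands produce sequential nodes, here /test/lock0000000000 and /test/lock0000000001.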
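The names of sequential nodes follow a fixed pattern: the requested path followed by a 10-digit, zero-padded counter maintained by the parent node. The sketch below only reproduces that naming rule locally (sequential_name is a hypothetical helper; it does not talk to ZooKeeper, and the real counter value is chosen by the server):

```shell
# Build the name a sequential znode receives for a given counter value.
# Usage: sequential_name <path-prefix> <counter>
sequential_name() {
    printf '%s%010d\n' "$1" "$2"
}

sequential_name /test/lock 0   # /test/lock0000000000
sequential_name /test/lock 1   # /test/lock0000000001
```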

(3) View nodes

ls /
ls -s /test

ZooKeeper maintains a tree-like hierarchy whose nodes are called znodes. Each znode stores its own data and a set of attributes, and is identified by a unique path. Note that the data of a znode cannot exceed 1 MB.

The ZooKeeper directory tree can also be browsed with the ZooInspector tool.

The command ls -s path shows detailed node information. The fields in its output are briefly explained below:

  • [lock0000000000, lock0000000001] //The child nodes under this path

  • cZxid = 0xd //Created ZXID, indicating the transaction ID when the ZNode was created

  • ctime = Thu Dec 16 20:52:57 CST 2021 //Created Time, indicating the time when the ZNode was created

  • mZxid = 0xd //Modified ZXID, indicating the transaction ID when the ZNode was last updated

  • mtime = Thu Dec 16 20:52:57 CST 2021 //Modified Time, indicating the last time the node was updated

  • pZxid = 0xf //Indicates the transaction ID when the child node list of this node was last modified. Note that pZxid will only be changed if the list of child nodes is changed, and changes to the content of child nodes will not affect pZxid.

  • cversion = 2 //Version number of the child node list (the number of changes to this node's children)

  • dataVersion = 0 //Version number of this node's data

  • aclVersion = 0 //ACL version number

  • ephemeralOwner = 0x0 //Session ID of the session that created this node, if it is an ephemeral node. If the node is persistent, the value is 0.

  • dataLength = 3 //Length of data content

  • numChildren = 2 //Number of child nodes
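The zxid values in this output are 64-bit numbers: the high 32 bits hold the leader epoch and the low 32 bits hold a transaction counter within that epoch. As a small sketch (zxid_epoch and zxid_counter are hypothetical helper names, and 0x100000003 is a made-up value used only to show a non-zero epoch), a zxid printed by zkCli can be split like this:

```shell
# Split a zxid, given in hex as printed by zkCli, into its two halves.
zxid_epoch() {
    echo $(( $1 >> 32 ))
}

zxid_counter() {
    echo $(( $1 & 0xffffffff ))
}

zxid_epoch 0xd              # 0
zxid_counter 0xd            # 13
zxid_epoch 0x100000003      # 1
zxid_counter 0x100000003    # 3
```

For the node shown above, cZxid = 0xd therefore means transaction 13 of epoch 0.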

(4) Delete node

delete /test/lock0000000001
deleteall /test
ls /

(5) Exit the client

quit

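Commonly used commands are listed below (category, command, syntax and description). The [watch] argument and ls2 reflect older clients; in newer clients such as the 3.7.0 one used here, ls -s replaces ls2 and watches are set with -w, so run help to see the exact syntax.

  • Help

    • help: view help

  • Create node

    • create: create [-s] [-e] path data acl. -s and -e mark the node as sequential or ephemeral respectively; if neither is given, the node is persistent. acl is used for permission control.

  • Read node

    • ls: ls path [watch]
    • get: get path [watch]
    • ls2: ls2 path [watch]
    • stat: stat path [watch]. Gets the status information of the node.

  • Update node

    • set: set path data [version]. data is the new content; version is the data version.

  • Delete node

    • delete: delete path [version]
    • deleteall: recursive delete command

  • Synchronize

    • sync: synchronizes the client's view of a znode with ZooKeeper

  • ACL

    • getAcl/setAcl: gets/sets the ACL of a znode

  • Quota

    • setquota: sets a quota on the number of child nodes or on data length. For example, setquota -n 4 /zookeeper/node sets the maximum number of child nodes of /zookeeper/node to 4.
    • delquota: deletes a quota; -n refers to the number of child nodes and -b to the data length, e.g. delquota -n 2
    • listquota: shows a quota, e.g. listquota /storm

  • Command history

    • history/redo: history lists recent commands; redo runs a command again, used as redo cmdid, e.g. redo 20

  • Session

    • connect: connects to a server
    • close: closes the current connection without exiting the client; connect can be used to connect again
    • quit: closes the connection and exits the client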

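2. Cluster mode

Cluster mode is only briefly introduced here. Assume there are three servers: node1, node2, and node3.

Follow the standalone-mode steps, with the following changes at step 2) Install ZooKeeper -> (4) Modify the configuration file:

1) When modifying the zoo.cfg file, append the following cluster information:

server.1=node1:2888:3888
server.2=node2:2888:3888
server.3=node3:2888:3888

2) On each of the three servers node1, node2, and node3, create two files in the /home/xiaobai/opt/zookeeper/tmp directory:

touch myid
touch initialize

  • myid: set the content of the myid file on node1, node2, and node3 to 1, 2, and 3 respectively. For example, server node1 has cluster ID 1, so its myid file contains 1

  • initialize: leave the initialize file empty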
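The server.N lines and the myid contents must agree, so it can help to derive both from one host list. The following is a sketch under that assumption: cluster_lines and myid_for_host are hypothetical helpers, hosts are numbered from 1 in the order given, and ports 2888 and 3888 are taken from the listing above.

```shell
# Print one server.N line per host, numbering hosts from 1.
# Usage: cluster_lines <host>...
cluster_lines() {
    id=1
    for host in "$@"; do
        echo "server.$id=$host:2888:3888"
        id=$((id + 1))
    done
}

# Print the ID that belongs in the myid file of one host.
# Usage: myid_for_host <wanted-host> <host>...
myid_for_host() {
    wanted=$1
    shift
    id=1
    for host in "$@"; do
        if [ "$host" = "$wanted" ]; then
            echo "$id"
            return 0
        fi
        id=$((id + 1))
    done
    return 1
}

cluster_lines node1 node2 node3
myid_for_host node2 node1 node2 node3    # 2
```

The output of cluster_lines can be appended to zoo.cfg on every server, while the output of myid_for_host is what goes into that server's own myid file.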

Notes:

  • If servers refer to each other by host name instead of IP address, remember to update the hosts file on each server
  • When configuring multiple servers, you can configure one server first, synchronize it to the others with the remote copy command scp, and then fine-tune each server, for example by modifying its myid file.

3. Common exceptions and solutions

1. The port is occupied

Error message: Address already in use

Solution:

  • On the one hand, you can stop the process that is currently occupying the port; to find it, use the command netstat -nltp in combination with grep

  • On the other hand, you can modify zoo.cfg and change the port number
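A plain grep for the port number can also match unrelated columns, such as a PID that happens to be 2181. The sketch below filters on the local-address column instead. listeners_on_port is a hypothetical helper, the sample lines are invented for illustration, and the column layout is assumed to be that of netstat -nltp on Linux.

```shell
# Read `netstat -nltp` style lines on stdin and print those whose
# local address (4th column) ends in the given port.
# Usage: netstat -nltp | listeners_on_port <port>
listeners_on_port() {
    awk -v port="$1" '$1 ~ /^tcp/ {
        n = split($4, parts, ":")
        if (parts[n] == port) print
    }'
}

# Demonstration with made-up sample lines instead of live netstat output.
listeners_on_port 2181 <<'EOF'
Proto Recv-Q Send-Q Local Address   Foreign Address   State    PID/Program name
tcp        0      0 0.0.0.0:22      0.0.0.0:*         LISTEN   801/sshd
tcp6       0      0 :::2181         :::*              LISTEN   1234/java
tcp6       0      0 :::8080         :::*              LISTEN   2181/nginx
EOF
```

Only the :::2181 line is printed; the last column of that line gives the PID and program name of the process holding the port.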

2. Not enough disk space

Error message: No space left on device

Solution: free up disk space or expand the disk

3. Unable to find myid file

Error message: myid file is missing

Solution: create a myid file in the directory configured as dataDir and set the correct content (the ID of that server)

4. The leader election port of other machines in the cluster is not open

Error message: Cannot open channel to 2 at election address /122.228.242.21:3888

Solution:

  • Check whether the firewall on each server is disabled, using the command sudo ufw status

  • Check whether the content of /etc/hosts is consistent across servers and whether the IPs of all nodes are configured

  • Check that the clocks of all servers are consistent

  • Modify the zoo.cfg on each server so that the host in that server's own cluster entry is 0.0.0.0

    For example, for the server node1 in the example, modify the cluster information of its zoo.cfg to

    server.1=0.0.0.0:2888:3888
    server.2=node2:2888:3888
    server.3=node3:2888:3888
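The last fix differs per server, since each one rewrites only its own entry. A minimal sketch of that rewrite follows; bind_own_entry is a hypothetical helper that prints the modified configuration to standard output and leaves the file untouched, and the scratch file stands in for the real zoo.cfg.

```shell
# Print a config with the host of server.<id> replaced by 0.0.0.0.
# Usage: bind_own_entry <config-file> <own-id>
bind_own_entry() {
    sed "s/^server\.$2=[^:]*:/server.$2=0.0.0.0:/" "$1"
}

# Demonstration for node1 (ID 1) on a scratch file.
demo_cfg=$(mktemp)
cat > "$demo_cfg" <<'EOF'
server.1=node1:2888:3888
server.2=node2:2888:3888
server.3=node3:2888:3888
EOF
bind_own_entry "$demo_cfg" 1
rm -f "$demo_cfg"
```

On node2 the same call with ID 2 would rewrite only the server.2 line.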
    

5. The most direct and efficient solution is to analyze the log files

Origin blog.csdn.net/tangyi2008/article/details/121984758