Getting started with ZooKeeper and a cluster setup example

What is ZooKeeper

ZooKeeper is a coordination service for distributed applications. It is an open-source implementation of Google's Chubby and the distributed coordination service used by Hadoop. Services such as synchronization, configuration maintenance, grouping, and naming can be implemented on top of its interfaces.

A ZooKeeper ensemble is a cluster of multiple servers: one leader and several followers. Each server keeps a copy of the data, the global data is kept consistent, reads and writes are distributed, and update requests are forwarded to and handled by the leader.

Update requests are applied in order, and updates from the same client are executed in the order in which they were sent. Each update is atomic: it either succeeds or fails. The data view is globally consistent: no matter which server a client connects to, it sees the same view of the data.

Why use ZooKeeper

Most distributed applications need a master, coordinator, or controller to manage their physically distributed sub-processes (resources, task allocation, and so on). Today most applications develop their own private coordination programs and lack a general mechanism. Writing coordination logic over and over is wasteful, and it is hard to make it general and scalable. ZooKeeper provides a general distributed lock service that can coordinate distributed applications.

How ZooKeeper works

The core of ZooKeeper is atomic broadcast, which keeps the servers synchronized. The protocol that implements this mechanism is called the Zab protocol. Zab has two modes: recovery mode and broadcast mode.

  1. When the service starts, or after the leader crashes, Zab enters recovery mode. Recovery mode ends when a leader has been elected and a majority of servers have synchronized their state with it; this synchronization guarantees that the leader and the servers share the same system state.

  2. Once the leader has synchronized its state with a majority of followers, it can start broadcasting messages, that is, enter broadcast mode. When a new server joins the ZooKeeper service at this point, it starts in recovery mode, discovers the leader, synchronizes its state with the leader, and then takes part in broadcasting messages once synchronization finishes.

Notes:

    Broadcast mode must process proposals in order, so zk uses an increasing transaction id (zxid) to guarantee this. Every proposal is tagged with a zxid when it is issued. In the implementation, the zxid is a 64-bit number: its high 32 bits are an epoch that identifies whether the leadership has changed (each newly elected leader gets a new epoch), and its low 32 bits are an incrementing counter.
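
As a rough standalone sketch of that bit layout (illustrative Java, not ZooKeeper's internal code), the epoch and counter can be packed into and unpacked from a 64-bit value like this:

// Sketch of the zxid layout described above: high 32 bits = epoch, low 32 bits = counter.
public class ZxidLayout {
    static long makeZxid(long epoch, long counter) {
        return (epoch << 32) | (counter & 0xFFFFFFFFL);
    }

    static long epochOf(long zxid)   { return zxid >>> 32; }
    static long counterOf(long zxid) { return zxid & 0xFFFFFFFFL; }

    public static void main(String[] args) {
        long zxid = makeZxid(3, 42);   // epoch 3, 42nd proposal within that epoch
        System.out.printf("zxid=0x%016x epoch=%d counter=%d%n",
                zxid, epochOf(zxid), counterOf(zxid));
    }
}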

    When the leader crashes or loses a majority of its followers, zk enters recovery mode, which elects a new leader and brings all servers back to a correct state.

    The ZooKeeper service stays in broadcast mode until the leader crashes or loses the support of a majority of followers.

    Broadcast mode is very similar to two-phase commit (2PC) in distributed transactions: the leader proposes a resolution, the followers vote on it, and the leader tallies the votes to decide whether the resolution passes (commits the transaction); otherwise nothing is done.

Leader election

After each server starts, it asks the other servers whom they are voting for. When queried, a server replies, based on its own state, with the id of its recommended leader and the zxid of its last transaction (at system startup every server recommends itself). After receiving replies from all servers, a server determines which server has the largest zxid and adopts that server as its vote for the next round. The server that gathers the most votes during this process is the winner; if the winner holds more than half of the votes, it is elected leader. Otherwise the process repeats until a leader is elected. The leader then waits for the other servers to connect; each follower connects to the leader and sends it its largest zxid, and the leader determines the synchronization point from the follower's zxid. When synchronization completes, the leader notifies the follower that it is up to date (uptodate). After receiving the uptodate message, the follower can again accept client requests.
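
The following is a highly simplified, hypothetical sketch of the voting rule described above: every server ends up backing the candidate with the largest zxid (ties broken here by the larger server id), and that candidate wins only with more than half of the votes. The real election exchanges votes over the network in rounds; this only models a single tally with made-up vote data.

import java.util.*;

public class LeaderElectionSketch {

    record Vote(long serverId, long lastZxid) {}

    // Pick the candidate every server would converge on: largest zxid, then largest id.
    static Vote bestCandidate(Collection<Vote> votes) {
        return votes.stream()
                .max(Comparator.comparingLong(Vote::lastZxid)
                        .thenComparingLong(Vote::serverId))
                .orElseThrow();
    }

    public static void main(String[] args) {
        // Initially every server proposes itself along with its last zxid.
        List<Vote> votes = List.of(
                new Vote(1, 0x100000005L),
                new Vote(2, 0x100000007L),
                new Vote(3, 0x100000006L));

        Vote winner = bestCandidate(votes);

        // Once the servers exchange votes they all switch to this candidate,
        // so it holds all 3 of 3 votes, which is more than half of the ensemble.
        int supporters = votes.size();
        boolean elected = supporters > votes.size() / 2;

        System.out.println("leader = server " + winner.serverId()
                + " (elected: " + elected + ")");   // server 2, which has the highest zxid
    }
}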

ZooKeeper's data model

A hierarchical directory structure, with names following conventional file-system conventions

Each node is called a znode in ZooKeeper and has a unique path identifier

A znode can hold data and have child nodes, but a znode of type EPHEMERAL cannot have child nodes

The data in a znode can have multiple versions; if a path holds multiple versions of the data, the version must be supplied when querying the data at that path

Client applications can set watches on a node; a znode does not support partial reads or writes, only reading or writing the whole node at once

ZooKeeper provides a good mechanism for managing distributed clusters: its data structure is a hierarchical directory tree, and by managing the nodes in this tree effectively, a variety of distributed data-management models can be designed
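
A small sketch of this data model using the standard ZooKeeper Java client is shown below. The connect string, session timeout, and the /demo path are made up for the example; it illustrates that a znode is read and written as a whole and that setData can be made conditional on the version returned by getData:

import java.util.concurrent.CountDownLatch;
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.data.Stat;

public class DataModelSketch {
    public static void main(String[] args) throws Exception {
        CountDownLatch connected = new CountDownLatch(1);
        ZooKeeper zk = new ZooKeeper("localhost:2181", 30000, event -> {
            if (event.getState() == Watcher.Event.KeeperState.SyncConnected) connected.countDown();
        });
        connected.await();   // wait for the session to be established

        // Create a persistent znode with a small payload if it does not exist yet.
        if (zk.exists("/demo", false) == null) {
            zk.create("/demo", "v1".getBytes(), ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
        }

        // Reads and writes always cover the whole znode; the Stat carries the data version.
        Stat stat = new Stat();
        byte[] data = zk.getData("/demo", false, stat);
        System.out.println("data=" + new String(data) + ", version=" + stat.getVersion());

        // Conditional update: succeeds only if the version still matches, i.e. nobody
        // else changed the znode since we read it.
        zk.setData("/demo", "v2".getBytes(), stat.getVersion());

        zk.close();
    }
}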

ZooKeeper nodes

There are two types of znodes: ephemeral and persistent

The type of a znode is fixed when it is created and cannot be changed afterwards

When the client session that created an ephemeral znode ends, ZooKeeper deletes the ephemeral znode; an ephemeral znode cannot have child nodes

Persistent znodes do not depend on the client session and are deleted only when a client explicitly deletes them

A znode can take one of four forms: PERSISTENT, PERSISTENT_SEQUENTIAL, EPHEMERAL, and EPHEMERAL_SEQUENTIAL.
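
As an illustration, these four forms map onto the CreateMode values of the standard Java client. The sketch below assumes an already-connected ZooKeeper handle zk and a pre-existing /apps parent znode; the paths and data are invented for the example:

import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

public class ZnodeTypesSketch {
    // Assumes zk is already connected and the parent /apps already exists.
    static void createExamples(ZooKeeper zk) throws Exception {
        byte[] data = "demo".getBytes();

        // Survives client disconnects; removed only by an explicit delete().
        zk.create("/apps/config", data, ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);

        // Persistent, and the server appends an increasing suffix, e.g. /apps/task0000000001.
        String task = zk.create("/apps/task", data, ZooDefs.Ids.OPEN_ACL_UNSAFE,
                CreateMode.PERSISTENT_SEQUENTIAL);

        // Removed automatically when this client's session ends; cannot have children.
        zk.create("/apps/worker-1", data, ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL);

        // Ephemeral plus sequential: a common building block for locks and group membership.
        String member = zk.create("/apps/member-", data, ZooDefs.Ids.OPEN_ACL_UNSAFE,
                CreateMode.EPHEMERAL_SEQUENTIAL);

        System.out.println("created " + task + " and " + member);
    }
}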

A znode can be watched, including for modification of the data stored in the node and for changes to its list of children. Once such a change happens, the clients that set the watch are notified. This is the most important feature that ZooKeeper offers to applications.

Features that can be built on top of this mechanism include centralized configuration management, cluster management, distributed locks, and so on.

ZooKeeper roles

The leader is responsible for initiating and resolving votes and for updating the system state.

Learners include followers and observers.

A follower accepts client requests, returns results to the client, and votes during leader election.

An observer accepts client connections and forwards write requests to the leader, but it does not take part in voting; it only synchronizes the leader's state. The purpose of observers is to scale the system and improve read throughput.

The client is the initiator of requests.

Watcher

Watcher is a core feature of ZooKeeper. A watcher can monitor changes to a directory node's data and changes to its subdirectories. Once such a state change occurs, the server notifies all watchers set on that directory node, so that every client quickly learns that the state of the directory node it cares about has changed and can react accordingly.

Operations that can set a watch: exists, getChildren, getData

Operations that can trigger a watch: create, delete, setData

A "watch" mechanism allows clients to be notified when a znode changes in some way.

Watches are set through certain operations of the ZooKeeper service and are triggered by other operations on the service.

For example, a client can call the exists operation on a znode and set a watch on it. If the znode does not exist at that moment, exists reports that it is absent. If the znode is later created by another client, the watch is triggered and the earlier client is notified.
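
A minimal sketch of that scenario with the standard Java client is shown below; the /demo/flag path is invented for the example, and zk is assumed to be an already-connected handle. In the Java API, exists returns null when the znode is absent, and the registered watcher fires once when another client creates it:

import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.data.Stat;

public class WatchSketch {
    // Assumes zk is an already-connected ZooKeeper handle.
    static void watchForCreation(ZooKeeper zk, String path) throws Exception {
        Watcher onChange = (WatchedEvent event) -> {
            if (event.getType() == Watcher.Event.EventType.NodeCreated) {
                System.out.println(event.getPath() + " was created, watch fired");
            }
        };

        Stat stat = zk.exists(path, onChange);   // registers a one-shot watch on the path
        System.out.println(path + (stat == null ? " does not exist yet" : " already exists"));

        // ZooKeeper watches are one-shot: after the event fires, the watch must be
        // re-registered (another exists/getData call) to keep observing the node.
    }
}

Usage would be watchForCreation(zk, "/demo/flag"); when some other client later runs create on /demo/flag, the message is printed.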

ZooKeeper cluster setup

ZooKeeper can provide service not only on a single machine but also as a cluster of multiple machines. In addition, ZooKeeper supports a pseudo-cluster mode, in which multiple ZooKeeper instances run on a single physical machine.

ZooKeeper achieves high availability through replication: as long as more than half of the machines in the ensemble are available, the service can continue.

Cluster fault tolerance:

As long as 2 of the 3 machines are available, a leader can be elected and the service remains available to clients (an ensemble of 2n+1 machines can tolerate n failures).
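
The arithmetic behind that rule is simply the strict-majority check; a tiny illustrative snippet (not part of ZooKeeper itself):

public class QuorumMath {
    // An ensemble stays available as long as a strict majority of its servers is alive.
    static boolean hasQuorum(int ensembleSize, int aliveServers) {
        return aliveServers > ensembleSize / 2;
    }

    public static void main(String[] args) {
        System.out.println(hasQuorum(3, 2));   // true: 2 of 3 can still elect a leader
        System.out.println(hasQuorum(3, 1));   // false: 1 of 3 cannot
        System.out.println(hasQuorum(5, 3));   // true: a 5-node ensemble tolerates 2 failures
    }
}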

Building a ZooKeeper pseudo-distributed environment:

1. Download the latest version of ZooKeeper from the official ZooKeeper website.

[root@localhost zookeeper-cluster]# pwd
/export/search/zookeeper-cluster
[root@localhost zookeeper-cluster]# ls
zookeeper-3.4.6.tar.gz
[root@localhost zookeeper-cluster]#
# Create the first cluster node
[root@localhost zookeeper-cluster]# tar -zxvf zookeeper-3.4.6.tar.gz
[root@localhost zookeeper-cluster]# mv zookeeper-3.4.6 zookeeper-3.4.6-node1
[root@localhost zookeeper-cluster]# cd zookeeper-3.4.6-node1
[root@localhost zookeeper-3.4.6-node1]# pwd
/export/search/zookeeper-cluster/zookeeper-3.4.6-node1
# Create the data storage path
[root@localhost zookeeper-3.4.6-node1]# mkdir data
[root@localhost zookeeper-3.4.6-node1]# cd ../
# Create the second and third cluster nodes
[root@localhost zookeeper-cluster]# cp zookeeper-3.4.6-node1 zookeeper-3.4.6-node2 -R
[root@localhost zookeeper-cluster]# cp zookeeper-3.4.6-node1 zookeeper-3.4.6-node3 -R
[root@localhost zookeeper-cluster]# ls
zookeeper-3.4.6-node1  zookeeper-3.4.6-node2  zookeeper-3.4.6-node3  zookeeper-3.4.6.tar.gz
[root@localhost zookeeper-cluster]# cd zookeeper-3.4.6-node1/conf/
[root@localhost conf]# ls
configuration.xsl  log4j.properties  zoo_sample.cfg
# Create the zoo.cfg file
[root@localhost conf]# cp zoo_sample.cfg zoo.cfg

2. Configure the zoo.cfg file:

# Configuration for zookeeper-3.4.6-node1
tickTime=2000
initLimit=10
syncLimit=5
clientPort=2181
dataDir=/export/search/zookeeper-cluster/zookeeper-3.4.6-node1/data
server.1=localhost:2887:3887
server.2=localhost:2888:3888
server.3=localhost:2889:3889

# Configuration for zookeeper-3.4.6-node2
tickTime=2000
initLimit=10
syncLimit=5
clientPort=2182
dataDir=/export/search/zookeeper-cluster/zookeeper-3.4.6-node2/data
server.1=localhost:2887:3887
server.2=localhost:2888:3888
server.3=localhost:2889:3889

# Configuration for zookeeper-3.4.6-node3
tickTime=2000
initLimit=10
syncLimit=5
clientPort=2183
dataDir=/export/search/zookeeper-cluster/zookeeper-3.4.6-node3/data
server.1=localhost:2887:3887
server.2=localhost:2888:3888
server.3=localhost:2889:3889

Parameter Description:

tickTime=2000:
tickTime is the interval (in milliseconds) used for heartbeats between ZooKeeper servers and between clients and servers; a heartbeat is sent every tickTime milliseconds;

initLimit=10:
The initLimit setting controls how long ZooKeeper allows for the initial connection, measured in heartbeat intervals. The "clients" here are not application clients but the follower servers in the ensemble that connect to the leader.
If the leader has not received a response after more than 10 heartbeats (tickTime intervals), the connection attempt is considered failed. The total allowed time is 10 * 2000 ms = 20 seconds;

syncLimit=5:
The syncLimit setting bounds the time allowed for requests and responses exchanged between the leader and a follower; this may not exceed syncLimit heartbeat intervals, for a total of 5 * 2000 ms = 10 seconds;

dataDir=/export/search/zookeeper-cluster/zookeeper-3.4.6-node1/data
dataDir is, as the name suggests, the directory where ZooKeeper stores its data; by default ZooKeeper also writes its transaction log files to this directory;

clientPort=2181
clientPort is the port on which the ZooKeeper server accepts client connections; ZooKeeper listens on this port for client access requests;

server.1=localhost:2887:3887
server.2=localhost:2888:3888
server.3=localhost:2889:3889
server.A=B:C:D:
A is a number identifying the server; B is the server's IP address (or hostname);
C, the first port, is used for information exchange among ensemble members (followers communicating with the leader);
D, the second port, is used to elect a new leader when the current leader fails.

3. Create the ServerID (myid file)

In cluster mode, in addition to editing the zoo.cfg configuration file, a file named myid must be created. This file lives in the dataDir directory and contains a single value: the A from the corresponding server.A line. Create the myid file in the dataDir path configured in zoo.cfg above.

[root@localhost zookeeper-cluster]# cat /export/search/zookeeper-cluster/zookeeper-3.4.6-node1/data/myid
1
[root@localhost zookeeper-cluster]# cat /export/search/zookeeper-cluster/zookeeper-3.4.6-node2/data/myid
2
[root@localhost zookeeper-cluster]# cat /export/search/zookeeper-cluster/zookeeper-3.4.6-node3/data/myid
3

4. Start ZooKeeper

[root@localhost zookeeper-cluster]# /export/search/zookeeper-cluster/zookeeper-3.4.6-node1/bin/zkServer.sh start
JMX enabled by default
Using config: /export/search/zookeeper-cluster/zookeeper-3.4.6-node1/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[root@localhost zookeeper-cluster]# /export/search/zookeeper-cluster/zookeeper-3.4.6-node2/bin/zkServer.sh start
JMX enabled by default
Using config: /export/search/zookeeper-cluster/zookeeper-3.4.6-node2/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[root@localhost zookeeper-cluster]# /export/search/zookeeper-cluster/zookeeper-3.4.6-node3/bin/zkServer.sh start
JMX enabled by default
Using config: /export/search/zookeeper-cluster/zookeeper-3.4.6-node3/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED

5. Check whether the cluster has started

[root@localhost zookeeper-cluster]# echo stat|nc localhost 2181
[root@localhost zookeeper-cluster]# echo stat|nc localhost 2182
[root@localhost zookeeper-cluster]# echo stat|nc localhost 2183
# or
[root@localhost zookeeper-cluster]# /export/search/zookeeper-cluster/zookeeper-3.4.6-node1/bin/zkCli.sh
[root@localhost zookeeper-cluster]# /export/search/zookeeper-cluster/zookeeper-3.4.6-node2/bin/zkCli.sh
[root@localhost zookeeper-cluster]# /export/search/zookeeper-cluster/zookeeper-3.4.6-node3/bin/zkCli.sh
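
Besides the stat four-letter command and zkCli.sh, the ensemble can also be verified from a client program. The sketch below (with an arbitrary session timeout) lists all three clientPorts in the connect string, waits for the session to be established with one of the instances, and lists the children of the root znode:

import java.util.concurrent.CountDownLatch;
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooKeeper;

public class ClusterConnectCheck {
    public static void main(String[] args) throws Exception {
        // All three clientPorts of the pseudo-cluster; the client picks one and
        // fails over to the others if it becomes unavailable.
        String connect = "localhost:2181,localhost:2182,localhost:2183";
        CountDownLatch connected = new CountDownLatch(1);

        ZooKeeper zk = new ZooKeeper(connect, 30000, (WatchedEvent event) -> {
            if (event.getState() == Watcher.Event.KeeperState.SyncConnected) {
                connected.countDown();   // session established with one of the servers
            }
        });

        connected.await();               // wait until the handshake completes
        System.out.println("connected, session id = 0x"
                + Long.toHexString(zk.getSessionId()));
        System.out.println("children of / : " + zk.getChildren("/", false));
        zk.close();
    }
}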

Notes on pseudo-cluster deployment:

All three servers are deployed on one machine; note that when multiple servers are deployed on the same machine, each instance must use a different clientPort.

For example, server.1 uses 2181, server.2 uses 2182, and server.3 uses 2183.

The only thing to note about the last few lines is that the X in server.X must match the number in data/myid. If you have written 1, 2, and 3 into the myid files of the three servers, then each server's zoo.cfg only needs the server.1, server.2, and server.3 entries and everything will work.


Reprinted from: https://www.linuxidc.com/Linux/2015-02/114230.htm

 

ZooKeeper service commands:
1. Start the ZK service: ./zkServer.sh start
2. View the ZK service status: ./zkServer.sh status
3. Stop the ZK service: ./zkServer.sh stop
4. Restart the ZK service: ./zkServer.sh restart

 

 

Zookeeper working principle, installation configuration, tool command introduction: https://www.cnblogs.com/kunpengit/p/4045334.html

 

 

        

 
