ZooKeeper principles and installation; cluster setup

 

ZooKeeper principles and installation

 

ZooKeeper is a highly available, high-performance coordination service.

What problems does it solve?
In distributed applications, partial failures are common: when a message is passed between nodes, the sender cannot know whether the receiver got it, because the network may have failed or the receiver process may have died.

Since partial failures are inherent in distributed systems, ZooKeeper cannot avoid them, but it can help you handle them correctly.

In order to solve this problem, ZooKeeper has the following characteristics:
1: ZooKeeper provides rich building blocks for implementing many coordination data structures and protocols.
2: Atomic access: a client either reads all the data or the read fails; it never reads only part of it.
3: ZooKeeper runs on a group of machines, is highly available, helps the system avoid single points of failure, and removes faulty servers.
4: Sequential consistency: update requests from any client are committed in the order in which they were sent.
5: Single system image: when a server fails and its clients need to connect to other servers, servers whose state is behind the failed server will not accept those clients' connections until they have caught up with the failed server.
6: Timeliness: the lag any client can see is bounded, no more than tens of seconds, and a sync operation is provided to force the server a client is connected to to synchronize with the leader.
7: Sessions: when connecting, each client tries the servers in its configuration list one by one, automatically moving on to the next if one fails, until it successfully connects to a server, thereby creating a session. The client can set a timeout for each session; once a session expires, all ephemeral znodes it created are lost. Because ZooKeeper automatically sends heartbeat packets, this rarely happens. (See the sketch after this list.)
8: Rendezvous mechanism: the coordinating parties need not know each other in advance, and need not even exist at the same time.
9: ACL: ZooKeeper provides 3 authentication modes: digest (by username and password), host (by host name), and ip (by IP address). Each ACL pairs an identity with a set of permissions, relying on ZooKeeper's authentication mechanism. For example, to grant read permission to clients from the demo.com domain, in Java you can write:
new ACL(Perms.READ, new Id("host", "demo.com"));
Ids.OPEN_ACL_UNSAFE grants everyone all permissions other than ADMIN.
ZooKeeper can also integrate third-party authentication systems. (The sketch after this list shows an ACL in use.)

10: It provides an open-source shared repository of implementations of common coordination patterns.
11: High performance (official statistics): benchmark throughput above 10,000 operations per second on 5 decent machines under write-heavy workloads.
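
As a minimal sketch of points 7 and 9, using the standard org.apache.zookeeper Java client; the connection string, session timeout, and znode path here are illustrative assumptions, not values from this article:

import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooDefs.Perms;
import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.data.ACL;
import org.apache.zookeeper.data.Id;

public class SessionAndAclDemo {
    public static void main(String[] args) throws Exception {
        CountDownLatch connected = new CountDownLatch(1);
        // Point 7: the client tries the servers in this list in turn until one
        // accepts the connection, creating a session with a 15-second timeout.
        ZooKeeper zk = new ZooKeeper(
                "127.0.0.1:2181,127.0.0.1:2182,127.0.0.1:2183", 15000,
                event -> {
                    if (event.getState() == Watcher.Event.KeeperState.SyncConnected) {
                        connected.countDown();
                    }
                });
        connected.await();

        // Point 9: an ACL granting read-only access to clients from demo.com.
        List<ACL> acl = Collections.singletonList(
                new ACL(Perms.READ, new Id("host", "demo.com")));

        // An ephemeral znode: removed automatically when the session expires.
        zk.create("/demo", "hello".getBytes(), acl, CreateMode.EPHEMERAL);
        zk.close();
    }
}

If the session later times out, the ephemeral /demo znode disappears along with it.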

Principle
ZooKeeper uses the ZAB protocol, which is similar to the Paxos algorithm but differs in operation. The protocol consists of two phases, which may repeat:
Leader election: all machines in the cluster elect a leader together, and the other machines become followers. Once more than half of the followers have synchronized their state with the leader, this phase is complete (official figures put this phase at around 200 milliseconds).
Atomic broadcast: all machines forward write operations to the leader, and the leader broadcasts each update to the followers. Only after more than half of the followers have persisted the modification does the leader commit the update, and the client then receives confirmation that the update succeeded.


At its core is a streamlined file system forming a tree-shaped data structure, built uniformly on the concept of a node (znode). A node can have child nodes, can also hold data, and has an associated ACL. Because ZooKeeper is designed for coordination services, which usually involve small data files, the data a znode can store is limited to 1 MB.
ZooKeeper uses slash-separated Unicode strings, like file-system paths, to refer to znodes, but paths must be canonical: special elements such as ./ are not supported. The /zookeeper subtree is reserved for management information.
Communication between client and server uses a long-lived TCP connection, and the two sides keep the session alive through heartbeats. Ephemeral nodes are deleted when the session expires.
Features such as cluster membership management, centralized configuration, and distributed locks are realized by watching nodes and node changes.
ZooKeeper achieves high availability through replication: as long as more than half of the machines in the cluster are available, it can provide service, which is why a cluster usually has an odd number of machines.

A ZooKeeper instance's life cycle has the following three states: CONNECTING, CONNECTED, and CLOSED.
A newly created ZooKeeper instance is in the CONNECTING state and enters CONNECTED once a connection is established. When the instance is disconnected and reconnects, it switches between CONNECTED and CONNECTING. Calling the close method, or a session timeout, moves it to the CLOSED state, from which it cannot recover.
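
A hedged illustration of these transitions, assuming zk is a ZooKeeper handle created as in the sketch above:

System.out.println(zk.getState()); // CONNECTING right after construction,
                                   // CONNECTED once a connection is established
zk.close();
System.out.println(zk.getState()); // CLOSED; the handle cannot be reused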

 

znode features
There are two types of znodes, ephemeral and persistent; the type is fixed at creation and cannot be changed. An ephemeral znode is removed when the client session ends, and it cannot have child nodes of any type.
If a znode is created with the sequence flag set, a sequence number drawn from a monotonically increasing counter maintained by the parent node is appended to its name; this sequence number can be used for global ordering.
The watch mechanism lets a client learn of changes to a znode, but a watch fires only once; to be notified repeatedly, the client must re-register the watch. (See the sketch below.)
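
A minimal sketch of both features, again assuming a connected ZooKeeper handle zk and assuming the znodes /app and /app/config already exist:

import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooDefs.Ids;

// A sequential znode: the server appends a number from a counter kept by the
// parent, e.g. /app/lock-0000000007, which can be used for global ordering.
String path = zk.create("/app/lock-", new byte[0],
        Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL_SEQUENTIAL);

// A watch fires only once; re-register inside the callback to keep observing.
Watcher watcher = new Watcher() {
    public void process(WatchedEvent event) {
        if (event.getType() == Event.EventType.NodeDataChanged) {
            try {
                zk.getData(event.getPath(), this, null); // re-register
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }
};
zk.getData("/app/config", watcher, null);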


Installation and configuration (simulating a cluster on a single machine):
Download the latest version of ZooKeeper.
Create 3 folders: server1, server2, server3.
Unzip the archive into each of the 3 folders.
Configure zoo.cfg (the configuration path should not contain Chinese characters). zoo.cfg is a Java properties file; it can be placed under conf or under the /etc/zookeeper subdirectory, and if the ZOOCFGDIR environment variable is set, it can also live in the directory that variable points to.

# Basic time unit, in milliseconds
tickTime=2000
# Time allowed for all followers to connect and synchronize with the leader.
# If more than half of the followers fail to complete synchronization within
# this time, the leader gives up leadership and another leader election is held.
initLimit=5
# Time allowed for a follower to synchronize with the leader. If a follower
# fails to complete synchronization within this time, it restarts itself, and
# all clients associated with it will connect to another follower.
syncLimit=2
# Local file system locations for persistent data
dataDir=xxxx/zookeeper/server1/data
dataLogDir=xxx/zookeeper/server1/dataLog
# Listening port for client connections
clientPort=2181
# The first port is for followers to connect to the leader; the second is an
# additional port used during the leader-election phase.
server.1=127.0.0.1:2888:3888
server.2=127.0.0.1:2889:3889
server.3=127.0.0.1:2890:3890

Create a myid file under each server's dataDir and write a number in it; the number must match the x in the corresponding server.x line.
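
For example (assuming each server's zoo.cfg points dataDir at its own data folder, as above), each myid file contains a single digit matching its server.x line:

xxxx/zookeeper/server1/data/myid  contains  1
xxxx/zookeeper/server2/data/myid  contains  2
xxxx/zookeeper/server3/data/myid  contains  3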

 

 

 

----------------------------------------------------------------------------------------------

ZooKeeper - cluster setup

 

ZooKeeper is management software for server clusters; it makes it convenient to manage the various resources in a cluster.

Straight to the point: the steps to build a ZooKeeper cluster.

1. Download ZooKeeper. You can get the latest release from the official website at http://zookeeper.apache.org/releases.html

2. Unzip the downloaded ZooKeeper archive locally. Assume the resulting path is ZOO_HOME.

3. Create a configuration file zoo.cfg in the ZOO_HOME/conf directory, or copy the content of zoo_sample.cfg to zoo.cfg

4. A quick aside: if ZooKeeper runs in standalone mode, only a few simple lines of configuration are needed:

 

tickTime=2000
dataDir=/var/lib/zookeeper
clientPort=2181

 

 

tickTime: in milliseconds, the basic time unit used for heartbeats. The minimum session expiration time is twice the tickTime.

dataDir: the location where ZooKeeper saves its data while running

clientPort: the port ZooKeeper listens on after startup to accept client connections

 

If it is a cluster environment, some additional configuration is needed:

 

tickTime=2000
dataDir=/var/lib/zookeeper
clientPort=2181
initLimit=5
syncLimit=2
server.1=zoo1:2888:3888
server.2=zoo2:2888:3888
server.3=zoo3:2888:3888

 

 

Before introducing the configuration items, let's first introduce two concepts in a ZooKeeper cluster: leader and follower.

A ZooKeeper cluster requires one server to act as the leader, responsible for accepting all write requests from clients. The other servers act as followers and keep their data synchronized with the leader. If communication between the current leader and the other servers becomes abnormal, the servers in the cluster elect another server as leader through a voting mechanism.

Next, we will introduce the newly added configuration items.

initLimit: the maximum time allowed for servers in the cluster to connect to the leader. For example, with initLimit=5 and tickTime=2000 as configured in the example, the maximum time is 5*2000 ms, i.e. 10 seconds.

syncLimit: the time allowed for data exchange between a follower and the leader in the cluster. It is measured the same way as initLimit (in ticks).

server.x=zoo1:2888:3888

x is the server's id; server.1, server.2, and server.3 form the list of servers in the cluster.

zoo1, zoo2, and zoo3 are the servers' IPs or domain names, followed by two ports: the first port is used to connect to the leader server, and the second port is used to elect the leader server.

 

5. Add a myid file to the dataDir of each cluster server. It records the server's id, the x in server.x, and tells the machine at startup which entry in the server list it is.

   The configuration files of all servers in the cluster can be identical, but each myid is different.

6. Start ZooKeeper.

   Execute on each server in the cluster:

   ZOO_HOME/bin/zkServer.sh start
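
   Once all servers are up, each one's role (one leader, the rest followers) can be checked with the status subcommand:

   ZOO_HOME/bin/zkServer.sh status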
