ZooKeeper Learning, Part One: A Brief Introduction to ZooKeeper

https://www.cnblogs.com/wuxl360/p/5817471.html

1. Distributed coordination technology

Before introducing ZooKeeper, let's first introduce a related technology: distributed coordination. So what is distributed coordination technology? In short, it is mainly used to solve synchronization control among multiple processes in a distributed environment, so that they access critical resources in an orderly way and avoid the consequences of "dirty data". At this point some readers may say: that's easy, just write a scheduling algorithm. Anyone who says this probably does not yet appreciate what a distributed system involves, which is where the misunderstanding comes from. If all of these processes ran on a single computer, coordination would indeed be relatively easy to handle. The problem is that they run in a distributed environment. So what is "distributed"? It cannot be explained in a sentence or two, so I drew a picture that I hope helps you understand; if you think it is wrong, feel free to point that out. Let's take a look at it, as shown in Figure 1.1.

Figure 1.1 Distributed system diagram

Let's analyze this picture. There are three machines, each running an application. We connect these three machines through a network to form a system that provides a service to users. To the user, the architecture of this system is transparent: he cannot tell what kind of architecture sits behind it. We can call such a system a distributed system.

Now let's look at how processes are scheduled in this distributed system. Suppose a resource is mounted on the first machine, and all three physically distributed processes have to compete for it, but we don't want them to access it at the same time. We then need a coordinator to let them access the resource in an orderly manner. This coordinator is the lock we often talk about. For example, when "process 1" wants to use the resource, it first acquires the lock; once it holds the lock it has exclusive access, and the other processes cannot touch the resource. When "process 1" is done with the resource, it releases the lock so that another process can acquire it. Through this lock mechanism, we can ensure that multiple processes in the distributed system access the critical resource in an orderly way. A lock used in such a distributed environment is called a distributed lock. This distributed lock is the core of what distributed coordination technology implements, and how to implement it is what we will discuss next.

2. The implementation of distributed locks

We now know that, in order to prevent multiple processes in a distributed system from interfering with each other, we need a distributed coordination technology to schedule these processes, and the core of that technology is implementing the distributed lock. So how is this lock implemented? It is actually quite difficult.

2.1 Problems faced

Looking at the distributed environment shown in Figure 1.1, one might feel that this is not difficult: it is just a matter of re-implementing, over the network, the primitives we use to schedule processes on a single machine. On the surface, yes. But the problem lies in the network. In a distributed system, the assumptions that hold when everything runs on the same machine no longer apply, because the network is unreliable.

For example, on a single machine, if your call to a service succeeds, it succeeded; if it fails, for example by throwing an exception, it failed. In a distributed environment, however, because the network is unreliable, a call that appears to fail is not necessarily a failure: it may have executed successfully while the response was lost on the way back. Likewise, if both A and B call service C and A's call is issued slightly before B's, can we conclude that A's request arrives before B's? Not necessarily. We have to rethink all of these single-machine assumptions and consider how they affect our design and coding. Furthermore, to improve reliability in a distributed environment we often deploy multiple copies of a service, but keeping those copies consistent is a hard problem, whereas synchronizing multiple processes on the same machine is relatively easy.

So distributed coordination is much harder than scheduling multiple processes on a single machine, and developing a separate coordinator for every distributed application is unattractive. On one hand, writing the coordinator over and over is wasteful and makes it difficult to produce something general and scalable; on the other hand, a poorly built coordinator adds overhead that hurts the performance of the original system. Therefore there is an urgent need for a general-purpose coordination mechanism with high reliability and high availability to coordinate distributed applications.

2.2 Implementers of distributed locks

At present, the best-known implementers of distributed locks in the distributed coordination space are Google's Chubby and Apache's ZooKeeper. Some people ask: since Chubby exists, why build ZooKeeper? Is Chubby not good enough? Not at all; the main reason is that Chubby is closed source and used only inside Google. Yahoo later built ZooKeeper in the spirit of Chubby, implementing a similar distributed lock service, and donated it to Apache as an open-source project, so that everyone can use the lock service ZooKeeper provides. It has been tested in the distributed field for a long time, and its reliability and availability have been verified both in theory and in practice. So when we build distributed systems, we can use a system like this as a starting point, which saves a great deal of cost and results in fewer bugs.

3. Overview of ZooKeeper

ZooKeeper is a highly available, high-performance, consistent, open-source coordination service designed for distributed applications. It provides one basic service: a distributed lock service. Because ZooKeeper is open source, developers have since built other uses on top of this foundation: configuration maintenance, group services, distributed message queues, distributed notification/coordination, and so on.

Note: ZooKeeper's performance characteristics make it usable in large, distributed systems. In terms of reliability, it does not crash because of a single node failure. Moreover, its strict sequential access control means that complex control primitives can be implemented on the client side. ZooKeeper's guarantees of consistency, availability and fault tolerance are also key to its success, and all of them rest on the protocol it adopts, the Zab protocol, which will be introduced later.

So how are all the services mentioned above, such as distributed locks, configuration maintenance and group services, actually implemented? I believe this is what everyone cares about. To implement these services, ZooKeeper first designed a new data structure, the Znode, and then defined a set of primitives on top of it, that is, operations on that data structure. Data structures and primitives alone are not enough, however, because ZooKeeper works in a distributed environment and its services are delivered to distributed applications as messages over the network, so it also needs a notification mechanism, the Watcher mechanism. To sum up, the services ZooKeeper provides are realized through three parts: the data structure, the primitives, and the watcher mechanism. I will introduce ZooKeeper from these three aspects.

4. ZooKeeper data model

4.1 ZooKeeper data model Znode

ZooKeeper has a hierarchical namespace, which is very similar to a standard file system, as shown in Figure 4.1 below.

Figure 4.1 ZooKeeper data model and file system directory tree

 

From the figure we can see that ZooKeeper's data model is structurally very similar to a standard file system: it adopts a tree-like hierarchical structure, and each node in the ZooKeeper tree is called a Znode. Like the directory tree of a file system, each node in a ZooKeeper tree can have child nodes. But there are also differences:

(1) Reference method

A Znode is referenced by path, like a file path in Unix. Paths must be absolute, so they must start with a slash character. In addition, they must be unique, that is, each path has exactly one representation, and these paths cannot be changed. In ZooKeeper, paths are made up of Unicode strings, with a few restrictions. The string "/zookeeper" is reserved for storing management information, such as quota information.

(2)  Znode structure

A Znode in the ZooKeeper namespace has the characteristics of both a file and a directory: like a file, it maintains data, meta-information, an ACL and timestamps; like a directory, it can also serve as part of a path and have children of its own. Each node in the figure is a Znode, and each Znode consists of three parts (a small sketch of reading them follows the list below):

① stat: status information describing the Znode's version, permissions and other metadata

② data: the data associated with the Znode

③ children: the child nodes under the Znode
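Here is that sketch, using the ZooKeeper Java client to read the data, stat and children of a node. The connection string "localhost:2181", the session timeout and the path "/app1" are assumptions made for illustration only.

```java
import java.util.List;
import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.data.Stat;

public class ZnodeParts {
    public static void main(String[] args) throws Exception {
        // Hypothetical connection string and path, for illustration only.
        ZooKeeper zk = new ZooKeeper("localhost:2181", 30000, event -> {});

        Stat stat = new Stat();
        // data: the bytes associated with the Znode (stat is filled in as a side effect)
        byte[] data = zk.getData("/app1", false, stat);
        // children: the names of the child nodes under the Znode
        List<String> children = zk.getChildren("/app1", false);

        System.out.println("data     = " + new String(data));
        System.out.println("version  = " + stat.getVersion());   // part of the stat (status) information
        System.out.println("children = " + children);
        zk.close();
    }
}
```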

Although a Znode can have data associated with it, ZooKeeper is not designed as a conventional database or a store for large data. Rather, it is used to manage coordination data, such as configuration information, status information and rendezvous locations in distributed applications. The common characteristic of this data is that it is small, usually measured in KB. Both ZooKeeper servers and clients strictly check and enforce that each Znode's data is at most 1 MB, and in normal use it should be much smaller than that.

(3) Data access

The data stored at each node in ZooKeeper is read and written atomically. That is, a read returns all of the data associated with the node, and a write replaces all of the node's data. In addition, each node has its own ACL (Access Control List), which specifies the user's permissions, that is, which operations a particular user may perform on that node.

(4) Node type

There are two types of nodes in ZooKeeper: ephemeral (temporary) nodes and persistent (permanent) nodes. A node's type is determined when it is created and cannot be changed afterwards.

① Ephemeral nodes: the life cycle of these nodes depends on the session that created them. Once the session ends, the ephemeral node is deleted automatically; of course, it can also be deleted manually. Although each ephemeral Znode is bound to one client session, it is still visible to all clients. In addition, an ephemeral node is not allowed to have children.

② Persistent nodes: the life cycle of these nodes does not depend on the session; they are removed only when a client explicitly deletes them (a small sketch of both node types follows below).
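Here is that sketch. The server address and the paths "/config" and "/workers/worker-1" are made-up assumptions, and "/workers" is assumed to already exist as a persistent node.

```java
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

public class NodeTypes {
    public static void main(String[] args) throws Exception {
        ZooKeeper zk = new ZooKeeper("localhost:2181", 30000, event -> {});

        // Persistent node: survives after the session that created it ends.
        zk.create("/config", "v1".getBytes(),
                ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);

        // Ephemeral node: deleted automatically when this session ends;
        // ephemeral nodes may not have children. (Assumes /workers exists.)
        zk.create("/workers/worker-1", "alive".getBytes(),
                ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL);

        zk.close(); // closing the session removes /workers/worker-1, but /config remains
    }
}
```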

(5)  Sequence node

When creating a Znode, the caller can request that an incrementing counter be appended to the end of the path. This counter is unique with respect to the node's parent and has the format "%010d" (ten digits, zero padded, e.g. "0000000001"). When the counter is incremented beyond 2^31 - 1 (2147483647), it overflows.
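A minimal sketch of creating sequence nodes, assuming a local server and a hypothetical parent node "/queue" that already exists:

```java
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

public class SequenceNodes {
    public static void main(String[] args) throws Exception {
        ZooKeeper zk = new ZooKeeper("localhost:2181", 30000, event -> {});

        // Each call appends a 10-digit, zero-padded counter kept by the parent "/queue",
        // e.g. /queue/task-0000000001, /queue/task-0000000002, ...
        String first  = zk.create("/queue/task-", new byte[0],
                ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT_SEQUENTIAL);
        String second = zk.create("/queue/task-", new byte[0],
                ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT_SEQUENTIAL);

        System.out.println(first);   // e.g. /queue/task-0000000001
        System.out.println(second);  // e.g. /queue/task-0000000002
        zk.close();
    }
}
```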

(6) Watches

Clients can set watches on nodes; we call these monitors. When the state of the node changes (the Znode is created, deleted, or its data is modified), the operation associated with the watch is triggered. When a watch fires, ZooKeeper sends the client exactly one notification, because a watch can only be triggered once; this also reduces network traffic.

4.2 Time in ZooKeeper

ZooKeeper has various forms of recording time, which include the following main properties:

(1) Zxid

Every operation that changes the state of a ZooKeeper node stamps the node with a timestamp in Zxid format, and these timestamps are globally ordered. In other words, every change to a node produces a unique Zxid; if the value of Zxid1 is smaller than the value of Zxid2, then the event corresponding to Zxid1 happened before the event corresponding to Zxid2. In fact, each ZooKeeper node maintains three Zxid values: cZxid, mZxid, and pZxid.

① cZxid: the Zxid of the operation that created the node.
② mZxid: the Zxid of the operation that last modified the node's data.
③ pZxid: the Zxid of the operation that last changed the node's children.

In the implementation, a Zxid is a 64-bit number. Its high 32 bits are the epoch, which identifies whether the leader relationship has changed: every time a new leader is elected, a new epoch begins. Its low 32 bits are a counter that increments within that epoch.
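As a small worked example of this layout, the following sketch splits a hypothetical zxid value into its epoch and counter using plain bit operations:

```java
// Minimal sketch: splitting a 64-bit zxid into its epoch (high 32 bits)
// and counter (low 32 bits), as described above.
public class ZxidDecoder {
    public static void main(String[] args) {
        long zxid = 0x100000005L;           // hypothetical zxid: epoch 1, counter 5
        long epoch = zxid >>> 32;           // high 32 bits: leader epoch
        long counter = zxid & 0xFFFFFFFFL;  // low 32 bits: per-epoch increment
        System.out.printf("epoch=%d, counter=%d%n", epoch, counter);
    }
}
```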

(2) Version number

Every operation on a node causes that node's version numbers to increase. Each node maintains three version numbers:

① version: data version number of the node
② cversion: version number of the node's children
③ aversion: version number of the node's ACL

4.3 ZooKeeper Node Properties

From the introduction above we can see that a node has many important attributes describing its state, as shown in the following figure.

Figure 4.2 Znode node attribute structure
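The following sketch prints the main node attributes exposed by the Java client's Stat object, which roughly correspond to the attributes in Figure 4.2 (the connection string and the path "/app1" are hypothetical):

```java
import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.data.Stat;

public class ZnodeStat {
    public static void main(String[] args) throws Exception {
        ZooKeeper zk = new ZooKeeper("localhost:2181", 30000, event -> {});

        Stat s = zk.exists("/app1", false);   // null if the node does not exist
        if (s != null) {
            System.out.println("czxid          = " + s.getCzxid());          // zxid when the node was created
            System.out.println("mzxid          = " + s.getMzxid());          // zxid of the last data change
            System.out.println("pzxid          = " + s.getPzxid());          // zxid of the last child-list change
            System.out.println("ctime / mtime  = " + s.getCtime() + " / " + s.getMtime());
            System.out.println("version        = " + s.getVersion());        // data version
            System.out.println("cversion       = " + s.getCversion());       // child version
            System.out.println("aversion       = " + s.getAversion());       // ACL version
            System.out.println("ephemeralOwner = " + s.getEphemeralOwner()); // session id if ephemeral, else 0
            System.out.println("dataLength     = " + s.getDataLength());
            System.out.println("numChildren    = " + s.getNumChildren());
        }
        zk.close();
    }
}
```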

5. Operations in the ZooKeeper service

There are 9 basic operations in ZooKeeper, as shown in the following figure:

Figure 5.1 ZooKeeper class method description

Update operations in ZooKeeper are conditional. When calling delete or setData you must supply the version number of the Znode being updated, which can be obtained by calling exists. If the version numbers do not match, the update fails.
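A minimal sketch of such a version-conditioned update with the Java client (the server address, the path "/config" and the values are made up): read the current version from the Stat returned by exists, then pass it to setData; if another client modified the node in between, the call fails with a BadVersion error instead of silently overwriting.

```java
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.data.Stat;

public class ConditionalUpdate {
    public static void main(String[] args) throws Exception {
        ZooKeeper zk = new ZooKeeper("localhost:2181", 30000, event -> {});

        Stat stat = zk.exists("/config", false);   // fetch the current version (assumes /config exists)
        try {
            // Succeeds only if no one else has modified /config since we read the Stat.
            zk.setData("/config", "v2".getBytes(), stat.getVersion());
        } catch (KeeperException.BadVersionException e) {
            System.out.println("Lost the race; retry or do something else: " + e.getMessage());
        }
        // Passing -1 as the version skips the check entirely (unconditional update).
        zk.close();
    }
}
```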

Update operations in ZooKeeper are also non-blocking. If a client loses an update (because another process updated the Znode at the same time), it can choose to retry or do something else, without blocking the execution of any other process.

Although ZooKeeper can be viewed as a file system, some file-system primitives are deliberately dropped for simplicity. Because the data stored is small and is read and written as a whole, there is no need for open, close or seek operations.

6. Watch trigger

(1) watch overview

Watches can be set on all of ZooKeeper's read operations, including exists(), getChildren(), and getData(). A watch event is a one-time trigger: when the state of the watched object changes, the watch event registered on that object fires. Watch events are sent to clients asynchronously, and ZooKeeper provides an ordering guarantee for the watch mechanism: a client will not see the new state of the watched object before it receives the corresponding watch event.

(2) watch type

The watches managed by ZooKeeper can be divided into two categories:

① Data watches: getData and exists set data watches
② Child watches: getChildren sets child watches

We can tell which type of watch is set by the data these operations return:

① getData and exists: return information about the node's data
② getChildren: return the list of children

Therefore:

① A successful setData operation will trigger the Znode's data watch.

② A successful create operation will trigger the new Znode's data watch, as well as the child watch of its parent.

③ A successful delete operation will trigger both the data watch and the child watch of the deleted Znode, as well as the child watch of its parent.

(3) watch registration and local triggering

The watch-setting operations and their corresponding triggers (Figure 6.1) are summarized below:

① The watch set by an exists operation is triggered when the monitored Znode is created, deleted, or has its data updated.
② The watch set by a getData operation is triggered when the monitored Znode is deleted or has its data updated. It cannot be triggered by creation, because getData only succeeds if the Znode already exists.
③ The watch set by a getChildren operation is triggered when a child of the monitored Znode is created or deleted, or when the Znode itself is deleted. The watch event type tells the two cases apart: NodeDeleted means the Znode itself was deleted, while NodeChildrenChanged means a child node was created or deleted. (A small sketch of this registration and re-registration pattern follows this list.)
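Here is the sketch referred to above: a watch set through exists and re-registered inside process() each time it fires. The server address and the path "/master" are assumptions for illustration.

```java
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooKeeper;

public class ExistsWatch implements Watcher {
    private ZooKeeper zk;

    public void run() throws Exception {
        zk = new ZooKeeper("localhost:2181", 30000, this);
        // Registers a one-time watch on /master: it fires on create, delete or data update.
        zk.exists("/master", this);
    }

    @Override
    public void process(WatchedEvent event) {
        System.out.println("event " + event.getType() + " on " + event.getPath());
        try {
            if (event.getPath() != null) {
                // The watch has been consumed; re-register it to keep observing /master.
                zk.exists(event.getPath(), this);
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    public static void main(String[] args) throws Exception {
        new ExistsWatch().run();
        Thread.sleep(60_000); // keep the session open long enough to see events
    }
}
```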

Watches are maintained locally by the ZooKeeper server the client is connected to, so they can be set, managed and dispatched very cheaply. When a client connects to a new server, its watches will be triggered for session events. Watch notifications are not received while the client is disconnected from a server; however, when the client reconnects, any previously registered watches are re-registered and triggered if needed.

(4) Points to note

Zookeeper's watch actually handles two types of events:

① Connection status event (type=None, path=null)

Events of this type do not need to be registered, nor do they need to be re-registered repeatedly; we only need to handle them.

② Node events

Node creation, deletion and data modification. These are one-time triggers, so we have to keep re-registering them, and events may be lost between triggers.

Both types of events are handled in the Watcher, that is, in the overridden process(WatchedEvent event) method.

Node events are registered through the exists, getData and getChildren functions, which therefore have a double effect:

① registering the event trigger (the watch)

② performing the operation itself

The operation itself can also be carried out asynchronously through a callback, with the result-handling logic placed in the overridden processResult() method.
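A minimal sketch of this double effect using the Java client's asynchronous getData: one call registers the watch and delivers the read result to processResult(). The server address and the path "/config" are assumptions.

```java
import org.apache.zookeeper.AsyncCallback;
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.data.Stat;

public class AsyncGetData {
    public static void main(String[] args) throws Exception {
        ZooKeeper zk = new ZooKeeper("localhost:2181", 30000, event -> {});

        Watcher watcher = (WatchedEvent event) ->
                System.out.println("watch fired: " + event.getType() + " on " + event.getPath());

        AsyncCallback.DataCallback callback = (int rc, String path, Object ctx, byte[] data, Stat stat) -> {
            // The "operation itself": the read result is delivered here asynchronously.
            if (KeeperException.Code.get(rc) == KeeperException.Code.OK) {
                System.out.println(path + " = " + new String(data));
            } else {
                System.out.println("getData failed: " + KeeperException.Code.get(rc));
            }
        };

        // One call, two effects: registers the watch on /config and
        // asynchronously fetches its data into the callback.
        zk.getData("/config", watcher, callback, null);

        Thread.sleep(5_000); // give the async callback time to run before closing
        zk.close();
    }
}
```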

7. ZooKeeper application example

To make ZooKeeper easier to understand, here is an example of how ZooKeeper implements its services, taking the basic service it provides, the distributed lock, as the example.

7.1 Distributed Lock Application Scenario

One of the most typical application scenarios of the distributed lock service is to solve single points of failure in a distributed system by electing a master for the cluster. What is a single point of failure in a distributed system? A distributed system usually adopts a master-slave design, in which one master machine connects to multiple processing nodes: the master node distributes tasks and the slave nodes process them. When the master node fails, the entire system is paralyzed, and we call such a failure a single point of failure. This is shown in Figures 7.1 and 7.2 below:

Figure 7.1 Master-slave distributed system
Figure 7.2 Single point of failure

     

7.2 Traditional solutions

The traditional approach is to use a standby node. The standby node periodically sends ping packets to the current master node, and when the master receives a ping it replies to the standby node with an ack. As long as the standby node receives the ack, it considers the current master node to be alive and lets it continue to provide the service, as shown in Figure 7.3:

Figure 7.3 Traditional solution

When the master node goes down, the standby node stops receiving replies; it then concludes that the master node is dead and takes over as the master node, as shown in Figure 7.4 below:

Figure 7.4 Traditional solution

But this approach has a hidden danger: network problems. Let's see what a network problem can lead to, as shown in Figure 7.5 below:

Figure 7.5 Network failure

That is, the master node has not gone down at all, but the network fails while the reply is on its way back. The standby node, receiving no reply, concludes that the master node is dead and starts its own master instance, so the distributed system now ends up with two master nodes, the so-called dual-master situation. Once this happens, the slave nodes report part of their work to one master and part to the other, and the whole service descends into chaos. To prevent this we introduce ZooKeeper. ZooKeeper cannot prevent network failures, but it can guarantee that there is only one master at any given moment. Let's look at how ZooKeeper achieves this.

7.3 ZooKeeper solution

(1) Master start

After introducing ZooKeeper, we start two master nodes, "master node A" and "master node B". When they start, each registers a node with ZooKeeper. Suppose "master node A" registers the node "master-00001" and "master node B" registers "master-00002". After registration an election is held, and the node with the smallest number wins it, acquires the lock and becomes the master; so "master node A" acquires the lock and becomes the master, while "master node B" is blocked and becomes the standby node. In this way the scheduling of the two master processes is accomplished. A rough sketch of this registration and election is given below.
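In this sketch, each candidate creates an ephemeral, sequential node under a hypothetical parent "/election" (assumed to already exist as a persistent node, with the server address also assumed), and the candidate holding the smallest sequence number becomes the master. This is only an outline of the idea, not a complete election recipe.

```java
import java.util.Collections;
import java.util.List;
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

public class MasterElection {
    public static void main(String[] args) throws Exception {
        ZooKeeper zk = new ZooKeeper("localhost:2181", 30000, event -> {});

        // Each candidate registers an ephemeral, sequential node under /election,
        // e.g. /election/master-0000000001; it disappears if the candidate's session ends.
        String me = zk.create("/election/master-", new byte[0],
                ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL_SEQUENTIAL);

        List<String> candidates = zk.getChildren("/election", false);
        Collections.sort(candidates);

        // The candidate with the smallest sequence number wins the election.
        if (me.endsWith(candidates.get(0))) {
            System.out.println("I am the master: " + me);
        } else {
            System.out.println("Standby, waiting for the master to fail: " + me);
            // A real implementation would set a watch on the preceding node here
            // and re-run the election when that node is deleted.
        }
    }
}
```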

Figure 7.6 ZooKeeper Master election

(2) Master failure

If "master node-A" hangs, the node he registered will be automatically deleted at this time, ZooKeeper will automatically sense the change of the node, and then issue an election again. At this time, "master node-B" will win the election and replace " Masternode-A" becomes the masternode.

Figure 7.7 ZooKeeper Master election

(3) Master recovery

Figure 7.8 ZooKeeper Master election

If "master node A" recovers, it registers a node with ZooKeeper again; this time the registered node is "master-00003". ZooKeeper senses the change and holds another election. "Master node B" wins again and remains the master node, while "master node A" serves as the standby node.

 

 