Zookeeper interview summary: getting a raise after the New Year made easy

This article is not an introductory tutorial; it assumes some basic familiarity with Zookeeper.

Zookeeper applications

Unified naming service

In a distributed system, each system may act as a service provider and expose services to the outside world. Service names then need to be planned uniformly, for example as "application group, service name, version number", and Zookeeper's directory-style storage structure is a natural fit for this naming scheme. Dubbo, for example, relies on this feature when Zookeeper is used as its registry.
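
As a rough illustration of how such a name could map onto a directory-style path, here is a minimal sketch. The `/services` root and the `service_path` helper are hypothetical and are not Dubbo's actual registry layout.

```python
def service_path(group: str, service: str, version: str, root: str = "/services") -> str:
    """Join the naming parts into a directory-style znode path (hypothetical layout)."""
    parts = [group, service, version]
    for part in parts:
        # A path component must be non-empty and must not contain the separator.
        if not part or "/" in part:
            raise ValueError(f"invalid path component: {part!r}")
    return "/".join([root.rstrip("/")] + parts)


print(service_path("trade", "OrderService", "1.0.0"))
# /services/trade/OrderService/1.0.0
```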

Configuration Center

In a distributed system, especially in a cluster environment, configuration is a shared resource. One of its most important requirements is that the users of a configuration must be notified when it changes. Zookeeper can serve as the store for such shared resources. Its data is persisted, but it is also held in memory, and reads are served directly from memory. Configuration is read far more often than it is written, so performance is not a problem, and the built-in Watch mechanism provides exactly the publish/subscribe function that is needed: when configuration stored on Zookeeper changes, the applications using it receive the corresponding change notification.
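
The publish/subscribe idea can be sketched with a small in-memory stand-in. This is not a Zookeeper client: `ConfigStore`, its methods, and the `/config/db.url` path are names made up for this illustration.

```python
from collections import defaultdict
from typing import Callable, Dict, List


class ConfigStore:
    """In-memory stand-in for configuration znodes (illustration only)."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}
        self._subscribers: Dict[str, List[Callable[[str, str], None]]] = defaultdict(list)

    def subscribe(self, path: str, callback: Callable[[str, str], None]) -> None:
        self._subscribers[path].append(callback)

    def get(self, path: str) -> str:
        # Reads are answered from memory.
        return self._data[path]

    def set(self, path: str, value: str) -> None:
        self._data[path] = value
        # Publish the change to every application subscribed to this path.
        for callback in list(self._subscribers[path]):
            callback(path, value)


store = ConfigStore()
seen = []
store.subscribe("/config/db.url", lambda path, value: seen.append((path, value)))
store.set("/config/db.url", "jdbc:mysql://db1/app")
store.set("/config/db.url", "jdbc:mysql://db2/app")
print(seen)
```

In this sketch a subscription stays active across changes, which is the behavior an application usually wants from a configuration center.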

Cluster management

Zookeeper can be used to monitor a cluster. For example, each server corresponds to a temporary (ephemeral) node in Zookeeper, and the parent of these nodes is /WatchNode. The monitoring platform watches the children of /WatchNode: when a machine is added or goes down, the change shows up as a change in the Zookeeper nodes, which achieves the purpose of monitoring.
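
On the monitoring side, the work boils down to comparing two snapshots of the children of /WatchNode. A minimal sketch, assuming the child names identify the servers (`diff_members` and the server names are hypothetical):

```python
from typing import Iterable, List, Tuple


def diff_members(before: Iterable[str], after: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Return (joined, left) between two snapshots of the child list."""
    old, new = set(before), set(after)
    return sorted(new - old), sorted(old - new)


before = ["server-a", "server-b", "server-c"]
after = ["server-a", "server-c", "server-d"]
joined, left = diff_members(before, after)
print(joined, left)  # ['server-d'] ['server-b']
```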

The servers in a cluster usually handle the same business, but some tasks may be IO-heavy or CPU-heavy. In that case Zookeeper can be used to run a leader election and pick one leader from the cluster. Because the leader is unique, it is the one that performs the special task. The leader election meant here is not the one inside Zookeeper's own implementation, which is more complicated and is a coded implementation of an algorithm. Electing a leader with Zookeeper is comparatively easy, because the core of it is simply selecting one unique node. One method: all servers in the cluster try to create a node with the same name at the same time. Because of Zookeeper's consistency, only one client succeeds, and whoever succeeds is the leader. Another method: each server creates a sequence node in Zookeeper, and every creation succeeds. One of these sequence nodes is then selected by a simple rule (such as the largest or the smallest), and its owner is the leader.
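
The second method can be sketched as follows, assuming the "smallest sequence number wins" rule and node names that end in a numeric sequence suffix. The `server-` prefix and `pick_leader` are hypothetical names.

```python
from typing import List


def pick_leader(children: List[str]) -> str:
    """Return the child node with the smallest sequence suffix."""
    if not children:
        raise ValueError("no candidates")
    # The sequence number is assumed to be the digits after the last '-'.
    return min(children, key=lambda name: int(name.rsplit("-", 1)[1]))


children = ["server-0000000012", "server-0000000007", "server-0000000009"]
print(pick_leader(children))  # server-0000000007
```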

Distributed lock

In a distributed system, some shared resources must be accessed exclusively, which calls for a distributed lock. A distributed lock can be implemented with Zookeeper's temporary (ephemeral) sequential nodes and the Watch mechanism.

  1. First define a node /Lock as the lock for a given resource
  2. Every server that needs the resource creates a temporary sequence node under /Lock
  3. The server whose node has the lowest sequence number holds the lock
  4. When a server releases the lock or goes down, its node is deleted; the other servers are notified through the Watch mechanism and compare the node sequence numbers again to determine whether they now hold the lock
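
The steps above can be sketched with an in-memory stand-in for the children of /Lock. This is a simulation, not a Zookeeper client; `LockDirectory`, the `lock-` prefix, and the server names are hypothetical.

```python
from typing import Dict, Optional


class LockDirectory:
    """In-memory stand-in for the children of /Lock (illustration only)."""

    def __init__(self) -> None:
        self._next_seq = 0
        self._children: Dict[str, str] = {}  # node name -> owning server

    def create_sequential(self, owner: str) -> str:
        # Zero-padded suffix so that string order matches numeric order.
        name = f"lock-{self._next_seq:010d}"
        self._next_seq += 1
        self._children[name] = owner
        return name

    def delete(self, name: str) -> None:
        # Models both an explicit release and the removal of a crashed server's node.
        self._children.pop(name, None)

    def holder(self) -> Optional[str]:
        """The owner of the lowest-numbered node holds the lock."""
        if not self._children:
            return None
        return self._children[min(self._children)]


lock = LockDirectory()
a = lock.create_sequential("server-a")
b = lock.create_sequential("server-b")
print(lock.holder())  # server-a
lock.delete(a)        # server-a releases the lock (or goes down)
print(lock.holder())  # server-b
```

In a real deployment the waiting servers would learn about the deletion through a Watch rather than by polling `holder()`.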

**Underlying principles of Zookeeper**

Consistency

Zookeeper uses the ZAB protocol to keep the data on all servers consistent. When a Zookeeper server receives a client request, it first puts the request into a queue and then uses a single thread to take requests off the queue and process them, which avoids concurrency problems. Transactional operations such as write, update, and delete are handed over to the Leader node of the cluster. The Leader processes a transactional request with something similar to a two-phase commit: first ask, then commit. However, the Leader does not wait for every node to agree. It applies a majority ("more than half") check instead: as soon as more than half of the nodes consider the transaction committable, the Leader sends a commit request to all nodes, and all other nodes commit.
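
The majority check can be sketched in a few lines. This is a simplified model, not the ZAB implementation; `can_commit` is a hypothetical helper, and counting the Leader's own acknowledgement toward the majority is an assumption of this sketch.

```python
from typing import List


def can_commit(follower_acks: List[bool]) -> bool:
    """Decide whether the Leader may send the commit request.

    follower_acks holds one entry per follower: True if it acknowledged
    the proposal, False if it did not (or has not answered yet).
    """
    voters = len(follower_acks) + 1   # followers plus the Leader itself
    acks = 1 + sum(follower_acks)     # assumption: the Leader counts its own ack
    return acks > voters // 2         # strictly more than half


print(can_commit([True, True, False, False]))   # True: 3 of 5
print(can_commit([True, False, False, False]))  # False: 2 of 5
```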

As far as the underlying principle is concerned, Zookeeper can be regarded as a database, similar to MySQL. Every database operation is logged, and Zookeeper works the same way: when a server receives a transactional request, it first persists a transaction log entry and then updates the in-memory data. The two stages in Zookeeper are therefore: 1. persist the transaction log, 2. update memory. Only after memory has been updated can a Zookeeper client actually read the latest data.

Watch mechanism
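
The "log first, then memory" order can be sketched as follows. The `Server` class is a hypothetical in-memory model; the list stands in for the transaction log on disk, and `recover` only illustrates why the log is written first.

```python
from typing import Dict, List, Tuple


class Server:
    """Toy model of one server's write path (illustration only)."""

    def __init__(self) -> None:
        self.txn_log: List[Tuple[int, str, str]] = []  # stand-in for the on-disk log
        self.memory: Dict[str, str] = {}

    def apply(self, zxid: int, path: str, value: str) -> None:
        self.txn_log.append((zxid, path, value))  # stage 1: persist the transaction log
        self.memory[path] = value                 # stage 2: update memory

    def recover(self) -> Dict[str, str]:
        """Rebuild the in-memory data by replaying the log in order."""
        rebuilt: Dict[str, str] = {}
        for _zxid, path, value in self.txn_log:
            rebuilt[path] = value
        return rebuilt


server = Server()
server.apply(1, "/config/a", "1")
server.apply(2, "/config/a", "2")
print(server.memory == server.recover())  # True
```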

In Zookeeper, the client defines a listener and binds it to a node. The client itself is responsible for storing which listener objects belong to which node. The server does not store the listeners themselves; it only records which client connections have set a watch on a node and sends them the node's operation event. After the client receives the event, it triggers the matching listeners according to the node-to-listener mapping it saved earlier. Note that a Watch in Zookeeper's native client is one-off: the node's listener is removed automatically by the native client after it fires once. Both ZkClient and Curator wrap this behavior so that listeners can be reused.
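
The client-side bookkeeping and the one-off behavior can be sketched like this. `WatchRegistry` is a hypothetical model of the mapping a client keeps, not the native client's actual class.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Listener = Callable[[str, str], None]


class WatchRegistry:
    """Client-side map from node path to listeners, with one-off semantics."""

    def __init__(self) -> None:
        self._listeners: Dict[str, List[Listener]] = defaultdict(list)

    def register(self, path: str, listener: Listener) -> None:
        self._listeners[path].append(listener)

    def dispatch(self, path: str, event_type: str) -> int:
        # Remove the listeners before calling them: each one fires at most once.
        listeners = self._listeners.pop(path, [])
        for listener in listeners:
            listener(path, event_type)
        return len(listeners)


registry = WatchRegistry()
events = []
registry.register("/config/app", lambda path, event_type: events.append(event_type))
first = registry.dispatch("/config/app", "NodeDataChanged")
second = registry.dispatch("/config/app", "NodeDataChanged")
print(first, second, events)  # 1 0 ['NodeDataChanged']
```

A reusable listener, in the style of the wrappers mentioned above, can be modeled by calling `register` again from inside the callback.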

Leader election

First of all, leader election is not a standalone mechanism; it is one step of the ZAB protocol. ZAB exists mainly to keep the cluster consistent, and the key to that consistency is having a single leader that manages the whole cluster. Leader election is how that leader is chosen.

Leader election usually happens when the whole cluster starts, when the leader goes down, or when followers go down and the leader finds that it no longer has more than half of the cluster following it, in which case the election is restarted.

Leader election is essentially a comparison of which server is "stronger". The rules are: 1. whoever has the newer data (the larger zxid) becomes the leader; 2. if the data is equally new, whoever has the larger server id (myid) becomes the leader. The comparison is carried out by voting among the servers. Each server receives the votes of the other servers, and every vote carries the zxid and myid mentioned above. The server then compares the votes head to head (a "PK"), and the weaker side changes its vote to the winner of that comparison, so at any moment each server has one candidate that it considers the strongest. Throughout the process each server also keeps a ballot box that records whom the other servers are currently voting for. From the ballot box a server can check whether more than half of the servers are voting for the same candidate as it is. If so, the leader is considered elected, namely that candidate. A server that finds itself to be that candidate starts leading (Leader); otherwise it follows (Follower).
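
The two comparison rules and the ballot box check can be sketched as follows. This models only what is described above, not Zookeeper's actual election code; `Vote`, `stronger`, and `elected` are hypothetical names.

```python
from collections import Counter
from typing import Dict, NamedTuple, Optional


class Vote(NamedTuple):
    zxid: int  # how new the candidate's data is
    myid: int  # the candidate's server id


def stronger(a: Vote, b: Vote) -> Vote:
    """Rule 1: larger zxid wins. Rule 2: on a tie, larger myid wins."""
    return max(a, b)  # tuples compare field by field, zxid first


def elected(ballot_box: Dict[int, Vote], cluster_size: int) -> Optional[Vote]:
    """Return the candidate backed by more than half of the servers, if any."""
    if not ballot_box:
        return None
    vote, count = Counter(ballot_box.values()).most_common(1)[0]
    return vote if count > cluster_size // 2 else None


# Server 1 compares its own vote with the one received from server 2.
mine = stronger(Vote(zxid=100, myid=1), Vote(zxid=100, myid=2))
print(mine)  # Vote(zxid=100, myid=2)

# Ballot box: voting server id -> the candidate it currently votes for.
box = {1: mine, 2: Vote(100, 2), 3: Vote(99, 3)}
print(elected(box, cluster_size=3))  # Vote(zxid=100, myid=2)
```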

Observer

An observer (Observer) does not take part in leader election or in the majority check. When the leader processes a transactional request, it only needs to verify that more than half of the followers agree, without considering whether the observers agree. An observer simply synchronizes the data at the end.

To improve the read performance of a Zookeeper cluster, you need to add nodes. Adding follower nodes does improve read performance, but followers take part in the majority check, so the more followers there are, the longer the leader waits for their feedback. This applies to both leader election and transactional requests. During an election, a server has to exchange votes with more nodes and wait for more voting results; during a write, the leader has to wait for acknowledgements from more follower nodes, which hurts write performance.

Adding Observer nodes does not affect the majority check and still improves read performance. Synchronizing data to observers costs a little performance, but the advantages far outweigh the disadvantages.
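
A small calculation shows why observers leave the majority check untouched. `quorum_size` is a hypothetical helper; it assumes that only voting members count toward the majority.

```python
def quorum_size(voters: int, observers: int = 0) -> int:
    """Smallest number of voting members that is more than half of the voters.

    Observers are accepted as a parameter only to show that they are ignored.
    """
    return voters // 2 + 1


print(quorum_size(voters=5))               # 3
print(quorum_size(voters=7))               # 4: two extra voting nodes raise the bar
print(quorum_size(voters=5, observers=2))  # 3: two observers do not
```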

Origin blog.csdn.net/yueyunyin/article/details/103755068