Things you should know about Zookeeper

This interview-oriented post is presented in Q&A style.


Question1: Briefly introduce the Zookeeper architecture?

Answer1:

Zookeeper literally means a zoo keeper. It is a cluster management tool, generally used to manage big data clusters, although it can also manage other clusters, such as a Solr index cluster.

Zookeeper is a distributed coordination service that can be used for service discovery, distributed locks, distributed leader election, and configuration management. It provides a tree structure similar to the Linux file system, together with a monitoring and notification mechanism. The tree can be thought of as a lightweight in-memory file system, but it is only suitable for storing small amounts of information, not large numbers of files or large files.

A Zookeeper cluster is a highly available cluster based on master-slave replication. Its architecture is shown in the figure:
[Figure: Zookeeper cluster architecture]

Each server takes on one of three roles: Leader, Follower, or Observer.

Leader

1. Maintain heartbeat : A Zookeeper cluster has only one actual leader at a time. The leader initiates and maintains heartbeats with each Follower and Observer (the ping and ack interaction in the figure above).
2. Commit a write once more than half of the nodes have written it : All write operations must go through the Leader, which broadcasts each write to the other servers. As soon as more than half of the nodes (not counting Observers) have written it successfully, the write request is committed (a 2PC-like protocol).

Follower

1. Maintain heartbeat : A Zookeeper cluster may have multiple Followers at the same time, and each of them responds to the Leader's heartbeat.
2. Serve reads locally and forward writes to the Leader : A Follower can process a client's read request and return the result directly, while forwarding write requests to the Leader for processing.
3. Vote on write requests : A Follower votes on each request when the Leader processes a write.

Observer

1. Maintain heartbeat : A Zookeeper cluster may have multiple Observers at the same time, and each of them responds to the Leader's heartbeat.
2. Serve reads locally and forward writes to the Leader : An Observer can process a client's read request and return the result directly, while forwarding write requests to the Leader for processing.
3. No voting right : An Observer does not vote on requests when the Leader processes a write.

The Observer role is similar to Follower, but without voting rights. Zookeeper needs to ensure high availability and strong consistency. Supporting more clients requires more servers, but as the number of voting servers grows, the delay of the voting phase grows too, which hurts performance. Observers solve this: they accept client connections and forward write requests to the leader, but do not take part in voting. Adding more Observer nodes therefore improves scalability without affecting write throughput.
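The effect of Observers on the vote can be sketched in a few lines. This is only an illustration, assuming a hypothetical `quorum_size` helper and plain role strings; it is not Zookeeper's own code.

```python
def quorum_size(roles):
    """Smallest number of ACKs that is more than half of the voting servers.

    `roles` is a list such as ["leader", "follower", "observer"].
    Observers are ignored because they have no vote.
    """
    voters = [r for r in roles if r != "observer"]
    return len(voters) // 2 + 1


base = ["leader", "follower", "follower"]

print(quorum_size(base))                      # 2 (3 voters)
print(quorum_size(base + ["follower"] * 2))   # 3 (5 voters)
print(quorum_size(base + ["observer"] * 2))   # 2 (still 3 voters)
```

Adding two Followers raises the number of ACKs the leader must wait for, while adding two Observers leaves it unchanged.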


Question2: Talk about your understanding of the ZAB protocol?

Answer2:

First, the transaction number Zxid (transaction request counter + epoch)

In the ZAB (ZooKeeper Atomic Broadcast) protocol, the transaction number Zxid is a 64-bit number. The lower 32 bits are a simple monotonically increasing counter that is incremented by 1 for each client transaction request; the upper 32 bits hold the epoch, that is, the number of the current leader cycle. Each time a new leader is elected, it takes the Zxid of the largest transaction in its local log, reads the epoch value from it, adds 1 to get the new epoch, and restarts the lower 32 bits from 0.

Zxid (transaction id) is similar to the transaction ID in an RDBMS and identifies the proposal of an update operation. To guarantee ordering, the zxid must increase monotonically.
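The 64-bit layout described above can be sketched with plain integer arithmetic. The function names here are made up for illustration, and the sketch assumes the 32-bit counter never overflows within one epoch.

```python
COUNTER_BITS = 32
COUNTER_MASK = (1 << COUNTER_BITS) - 1


def make_zxid(epoch, counter):
    """Pack the epoch into the upper 32 bits and the counter into the lower 32."""
    return (epoch << COUNTER_BITS) | (counter & COUNTER_MASK)


def epoch_of(zxid):
    """Read the leader epoch from the upper 32 bits."""
    return zxid >> COUNTER_BITS


def counter_of(zxid):
    """Read the per-epoch transaction counter from the lower 32 bits."""
    return zxid & COUNTER_MASK


def first_zxid_of_new_leader(last_logged_zxid):
    """New leader: epoch of its largest logged zxid plus 1, counter back to 0."""
    return make_zxid(epoch_of(last_logged_zxid) + 1, 0)


last = make_zxid(3, 7)
new = first_zxid_of_new_leader(last)
print(epoch_of(new), counter_of(new))   # 4 0
print(new > last)                       # True
```

Because the epoch sits in the high bits, every zxid of a new epoch is larger than every zxid of the previous one, so the ordering stays monotonic across leader changes.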

Second, epoch

epoch: It can be understood as the era or cycle the cluster is currently in. Each leader is like an emperor with its own era name, so every time the dynasty changes and a new leader takes over, the epoch increases by 1 relative to the previous one. As a result, even if the old leader recovers after a crash, nobody listens to it, because followers only obey the commands of the leader of the current epoch.
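The "nobody listens to the old emperor" rule can be sketched as a simple epoch check on the follower side. The `Follower` class and its method names are hypothetical and much simpler than the real protocol messages.

```python
class Follower:
    """Toy follower that only obeys the leader of its accepted epoch."""

    def __init__(self):
        self.accepted_epoch = 0
        self.log = []

    def on_new_epoch(self, epoch):
        """Adopt an epoch only if it is newer than the one already accepted."""
        if epoch > self.accepted_epoch:
            self.accepted_epoch = epoch
            return True
        return False

    def on_proposal(self, leader_epoch, payload):
        """Ignore proposals that do not come from the current epoch's leader."""
        if leader_epoch != self.accepted_epoch:
            return False
        self.log.append((leader_epoch, payload))
        return True


f = Follower()
f.on_new_epoch(1)
print(f.on_proposal(1, "set x=1"))   # True: current leader
f.on_new_epoch(2)                    # a new leader was elected
print(f.on_proposal(1, "set x=2"))   # False: the recovered old leader is ignored
```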

Third, the two modes of the ZAB protocol

The Zab protocol has two modes: recovery mode (leader election) and broadcast mode (synchronization). When the service starts, or after the leader crashes, Zab enters recovery mode. Recovery mode ends once a leader has been elected and most servers have synchronized their state with the leader. State synchronization ensures that the leader and the other servers have the same system state.

Fourth, the four stages of the ZAB protocol (the election process only involves followers and the leader; observers have no voting rights)

The first stage, Leader election (election stage: choose a quasi-leader) : All nodes start in the election stage. As soon as a node receives the votes of more than half of the nodes, it is elected quasi-leader. A quasi-leader only becomes the real leader once it reaches the broadcast stage. The purpose of this stage is to select a quasi-leader and then move on to the next stage.

The second stage, Discovery (discovery stage: accept proposals, generate an epoch, accept the epoch) : In this stage the followers communicate with the quasi-leader and report the latest transaction proposals they have received. The main purpose of this stage is to discover the latest proposals received by most of the nodes. The quasi-leader also generates a new epoch, which the followers accept, updating their accepted epoch.

A follower connects to only one leader. If a node f believes that another follower p is the leader, f is rejected when it tries to connect to p. After being rejected, f goes back to the election stage, and this repeats until f connects to the leader it supports.

The third stage, Synchronization (synchronization stage: bring follower replicas up to date) : In this stage the leader uses the latest proposal history obtained in the previous stage to synchronize all replicas in the cluster. Only when most of the nodes have been synchronized does the quasi-leader become the real leader. A follower only accepts proposals whose zxid is greater than its own lastZxid.
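The rule that a follower only accepts proposals newer than its lastZxid can be sketched as follows. This is a simplified illustration with a hypothetical `sync_follower` function; it assumes both logs are sorted by zxid and ignores the case where the follower holds proposals the leader does not have.

```python
def sync_follower(follower_log, leader_history):
    """Append the leader's proposals that are newer than the follower's lastZxid.

    Both arguments are lists of (zxid, payload) tuples in increasing zxid order.
    """
    last_zxid = follower_log[-1][0] if follower_log else -1
    for zxid, payload in leader_history:
        if zxid > last_zxid:          # older or equal proposals are skipped
            follower_log.append((zxid, payload))
            last_zxid = zxid
    return follower_log


leader_history = [(1, "a"), (2, "b"), (3, "c")]
follower_log = [(1, "a")]
print(sync_follower(follower_log, leader_history))
# [(1, 'a'), (2, 'b'), (3, 'c')]
```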

The fourth stage, Broadcast (broadcast stage: the leader broadcasts messages) : From this stage on, the Zookeeper cluster can formally serve transactions to clients and the leader can broadcast messages. If a new node joins, it first needs to be synchronized.

Unlike 2PC, ZAB does not need an ACK from every follower before committing. It only needs ACKs from more than half of the nodes.
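The difference between the two commit rules can be sketched with two small predicates. Both function names are invented for this comparison, and server ids are plain integers.

```python
def two_pc_can_commit(participants, acked):
    """Classic 2PC: every participant must have acknowledged."""
    return set(participants) <= set(acked)


def zab_can_commit(voters, acked):
    """ZAB: ACKs from more than half of the voting nodes are enough."""
    voters = set(voters)
    return len(voters & set(acked)) > len(voters) / 2


voters = {1, 2, 3, 4, 5}
acked = {1, 2, 3}            # servers 4 and 5 are slow or down

print(two_pc_can_commit(voters, acked))   # False: 2PC keeps waiting
print(zab_can_commit(voters, acked))      # True: 3 of 5 is a majority
```

This is why a Zookeeper ensemble keeps accepting writes while a minority of its servers is unavailable.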

Fifth, the Java implementation of the ZAB protocol (with FLE, the discovery and synchronization phases are merged into a Recovery Phase)

The Java implementation of the protocol differs somewhat from the definition above. The election phase uses Fast Leader Election (FLE), which also takes over the discovery responsibility: because FLE elects the node with the latest proposal history as leader, the step of discovering the latest proposals is no longer needed. The actual implementation merges the discovery and synchronization phases into a Recovery Phase. The implementation of ZAB therefore has only three phases: Fast Leader Election ; Recovery Phase ; Broadcast Phase .


Question3: Introduce the entire voting process of the Zookeeper cluster in detail?

Answer3:

Each server first votes for itself and then compares its own ballot with the ballots of the other servers. The ballot with the larger weight wins, and the server updates its own ballot box with it. The specific election process is as follows:

  1. After starting, each server asks the other servers whom they intend to vote for. When asked, a server replies, according to its own state, with the id of the leader it recommends and the zxid of its last transaction (at system startup every server recommends itself).
  2. After receiving all the replies, the server works out which server has the largest zxid and records it as the server to vote for in the next round.
  3. The server with the most votes in this process is the winner. If the winner has more than half of the votes, it is elected leader. Otherwise the process continues until a leader is elected.
  4. The leader starts waiting for the other servers to connect.
  5. Each Follower connects to the leader and sends it its largest zxid.
  6. The leader determines the synchronization point from the follower's zxid, and the election stage is complete.
  7. Once synchronization is complete, the leader notifies the follower that it is now uptodate.
  8. After the Follower receives the uptodate message, it can accept client requests and serve them again.
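Steps 1 to 3 above can be sketched as a ballot comparison plus a majority check. This is a minimal sketch, not the FLE code: `Vote`, `choose_vote`, and `winner` are hypothetical names, and the tie-break on the larger server id is taken from the example that follows in this answer.

```python
from collections import Counter
from typing import NamedTuple


class Vote(NamedTuple):
    zxid: int   # last transaction id of the recommended server
    sid: int    # id of the recommended server


def choose_vote(own, received):
    """Keep the ballot with the largest zxid; on a tie the larger server id wins."""
    return max([own, *received], key=lambda v: (v.zxid, v.sid))


def winner(ballots, ensemble_size):
    """ballots maps voter id -> id it votes for. Return the leader id or None."""
    if not ballots:
        return None
    candidate, count = Counter(ballots.values()).most_common(1)[0]
    return candidate if count > ensemble_size // 2 else None


print(choose_vote(Vote(5, 1), [Vote(7, 2), Vote(7, 3)]))   # Vote(zxid=7, sid=3)
print(winner({1: 3, 2: 3, 3: 3}, ensemble_size=5))          # 3
print(winner({1: 2, 2: 2}, ensemble_size=5))                # None
```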

Example : There are currently 5 servers, each of which has no data. Their numbers are 1, 2, 3, 4, and 5, respectively, and they are started in sequence.

  1. Server 1 starts, votes for itself, and sends out its vote. Since the other machines have not started yet, it receives no replies, so server 1 stays in the LOOKING state.
  2. Server 2 starts, votes for itself, and exchanges results with server 1, which started earlier. Server 2 wins because its number is larger, but it does not yet have more than half of the votes, so both servers remain in the LOOKING state.
  3. Server 3 starts, votes for itself, and exchanges information with servers 1 and 2. Server 3 wins because it has the highest number. This time the number of votes is more than half (3 of 5), so server 3 becomes the leader and servers 1 and 2 become followers.
  4. Server 4 starts, votes for itself, and exchanges information with servers 1, 2, and 3. Although server 4 has a larger number, server 3 has already won, so server 4 can only become a follower.
  5. Server 5 starts and, by the same logic as server 4, becomes a follower.
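The example can be replayed with a small simulation. It is only a sketch under the example's assumptions: all servers have empty logs, so votes converge on the highest id among the running servers, and an elected leader is never challenged by later arrivals. The function name `simulate_startup` is made up.

```python
def simulate_startup(server_ids, ensemble_size):
    """Start servers one at a time and record every server's state after each start."""
    leader = None
    running = []
    history = []
    for sid in server_ids:
        running.append(sid)
        if leader is None:
            # Every running server is still LOOKING with an empty log,
            # so all votes end up on the highest id seen so far.
            candidate = max(running)
            if len(running) > ensemble_size // 2:
                leader = candidate
        states = {}
        for s in running:
            if leader is None:
                states[s] = "LOOKING"
            elif s == leader:
                states[s] = "LEADING"
            else:
                states[s] = "FOLLOWING"
        history.append(states)
    return history


for step, states in enumerate(simulate_startup([1, 2, 3, 4, 5], 5), start=1):
    print(step, states)
```

After the third start the output shows server 3 as LEADING and servers 1 and 2 as FOLLOWING; servers 4 and 5 join as FOLLOWING.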

Question4: Introduce how Zookeeper works (or talk about atomic broadcasting)?

Answer4:

The steps of Zookeeper's entire workflow are as follows:

Step 1 : The core of Zookeeper is atomic broadcast. This mechanism keeps the servers synchronized, and the protocol that implements it is called the Zab protocol. The Zab protocol has two modes: recovery mode and broadcast mode.

Step 2 : When the service starts, or after the leader crashes, Zab enters recovery mode. Recovery mode ends once a leader has been elected and most of the servers have synchronized their state with the leader.

Step 3 : State synchronization ensures that the leader and the other servers have the same system state.

Step 4 : Once the leader has synchronized its state with most of the followers, it can start broadcasting messages, that is, it enters broadcast mode. If a server joins the Zookeeper service at this point, it starts in recovery mode, discovers the leader, and synchronizes its state with the leader. When synchronization is finished, it takes part in message broadcast as well. The Zookeeper service stays in broadcast mode until the leader crashes or loses the support of most of the followers.

Step 5 : Broadcast mode must guarantee that proposals are processed in order, so zk uses an increasing transaction id (zxid) to guarantee this. Every proposal is given a zxid when it is made.

Step 6 : In the implementation, zxid is a 64-bit number. Its upper 32 bits are the epoch, which identifies whether the leadership has changed: each time a leader is elected, it gets a new epoch. The lower 32 bits are an incrementing counter.

Step 7 : When the leader crashes or loses most of its followers, zk enters recovery mode. In recovery mode a new leader is elected so that all servers are restored to a correct state.


Question5: Introduce the Znode directory node

Answer5:

There are four types of Znode directory node:

  1. PERSISTENT: persistent node.
  2. EPHEMERAL: ephemeral (temporary) node.
  3. PERSISTENT_SEQUENTIAL: persistent, sequentially numbered node.
  4. EPHEMERAL_SEQUENTIAL: ephemeral, sequentially numbered node.
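The four types are combinations of two properties: whether the node disappears with the session that created it, and whether a sequence number is appended to its name. The in-memory sketch below illustrates this. `TinyTree` and its methods are hypothetical and not a Zookeeper client API; the 10-digit zero-padded suffix kept per parent follows Zookeeper's documented naming for sequential nodes.

```python
from enum import Enum


class CreateMode(Enum):
    # value = (ephemeral, sequential)
    PERSISTENT = (False, False)
    EPHEMERAL = (True, False)
    PERSISTENT_SEQUENTIAL = (False, True)
    EPHEMERAL_SEQUENTIAL = (True, True)

    @property
    def ephemeral(self):
        return self.value[0]

    @property
    def sequential(self):
        return self.value[1]


class TinyTree:
    """Toy znode store: path -> owning session id (None for persistent nodes)."""

    def __init__(self):
        self.nodes = {}
        self.counters = {}   # parent path -> next sequence number

    def create(self, path, mode, session_id):
        if mode.sequential:
            parent = path.rsplit("/", 1)[0]
            seq = self.counters.get(parent, 0)
            self.counters[parent] = seq + 1
            path = f"{path}{seq:010d}"
        self.nodes[path] = session_id if mode.ephemeral else None
        return path

    def close_session(self, session_id):
        """Ephemeral nodes vanish when the session that created them ends."""
        for path in [p for p, owner in self.nodes.items() if owner == session_id]:
            del self.nodes[path]


tree = TinyTree()
tree.create("/config", CreateMode.PERSISTENT, session_id=1)
print(tree.create("/lock/n-", CreateMode.EPHEMERAL_SEQUENTIAL, session_id=1))
print(tree.create("/lock/n-", CreateMode.EPHEMERAL_SEQUENTIAL, session_id=2))
tree.close_session(1)
print(sorted(tree.nodes))
```

After session 1 closes, its ephemeral node is gone while `/config` and the node owned by session 2 remain.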

Origin blog.csdn.net/qq_36963950/article/details/105200255