I Want to Get Into a Big Company: Big Data ZooKeeper Knowledge Points (2)

01 Let's learn big data together

Next comes the architecture-oriented part of ZooKeeper for big data. It is a little harder, and Lao Liu also spent a long time understanding and memorizing it. I hope it helps students who want to learn big data. Criticism and guidance are welcome.

02 Knowledge points

Point 10: Talk about the ZooKeeper cluster architecture

First of all, a ZooKeeper cluster uses a master-slave architecture with three roles: leader, follower, and observer. You need to understand what each of them does.

Leader: provides read and write services for clients and maintains the state of the cluster.
Follower: serves client reads directly and forwards client write requests to the leader, reports its own status to the leader, and takes part in the "more than half must write successfully" vote and in leader election.
Observer: a special follower. It serves reads and forwards writes in the same way and reports its own status to the leader, but it does not take part in the "more than half must write successfully" vote or in leader election.

How does the client perform read operations with the ZooKeeper cluster?
[Figure: a client reading from the ZooKeeper cluster]
Looking at this picture, you can see that a ZK read operation is very simple.

The client first establishes a session with one server in the ZK cluster, reads the data directly from that server, gets the result back, and finally closes the session. It's that simple!

How does the client write to the ZK cluster?

Does anyone think this is just as simple? Lao Liu thought so too. After reading up on it, he found that his understanding was too superficial. A ZK write operation is much more complicated than a ZK read operation.

Let me share a particularly good example that I came across. In Lao Liu's eyes, it describes the ZK write process very vividly.

1. A rich man comes to the bank and tells the young lady at the counter: "I deposited money here yesterday and you recorded 10 million too little. Please add it to my account."
2. For such a large amount, the lady at the counter does not have the authority to do it herself, so she reports it to the manager. The manager cannot just add the money either. To make sure everyone accepts the decision, he asks all of his subordinates for their views.
3. If most of them agree, the manager approves the request and informs all subordinates so that they write it down.
4. The lady at the counter who took the request then tells the rich man that the operation succeeded and the 10 million has been added.

After reading this example, look at the ZK write operation and you will find that it is almost exactly the same.
[Figure: a client writing to the ZooKeeper cluster]

① The client wants to write data to the ZK cluster, for example create /test, and establishes a session with the leftmost follower in the cluster.
② The follower forwards the write request to the leader.
③ After the leader receives the request, it sends out a proposal for create /test and tells every follower to write the proposal down first.
④ Voting now starts on whether to allow the create /test write. If more than half of the servers in the cluster (a quorum, counting the leader itself; this is described in detail later) agree, the leader commits the proposal and creates the ZNode /test locally.
⑤ After that, the leader notifies all followers to commit the proposal, and each follower creates the ZNode /test locally.
⑥ After all of the above is done, the leftmost follower responds to the client.
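
As a rough sketch of steps ② to ⑥, here is a toy in-memory model in Python. It is not ZooKeeper's real code: the names `Server` and `write` are made up for illustration, and networking, sessions and ordering are left out. It only shows that a write is committed once more than half of the voting servers have logged the proposal.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    """A voting member (leader or follower) in this toy model."""
    name: str
    alive: bool = True
    pending: list = field(default_factory=list)  # proposals logged, not yet committed
    data: set = field(default_factory=set)       # committed znode paths

    def log_proposal(self, path: str) -> bool:
        """Write the proposal down first and acknowledge it (step 3)."""
        if not self.alive:
            return False
        self.pending.append(path)
        return True

    def commit(self, path: str) -> None:
        """Apply a logged proposal to the local data (steps 4 and 5)."""
        if self.alive and path in self.pending:
            self.pending.remove(path)
            self.data.add(path)


def write(leader: Server, followers: list, path: str) -> bool:
    """Return True if the create request reached a quorum and was committed."""
    voters = [leader] + followers
    quorum = len(voters) // 2 + 1
    # The leader counts as one of the voters.
    acks = sum(1 for server in voters if server.log_proposal(path))
    if acks < quorum:
        return False
    for server in voters:
        server.commit(path)
    return True
```

With one leader and two followers, `write(...)` still succeeds when one follower is down, because 2 of 3 servers acknowledge. It fails when both followers are down.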

So, isn't that very similar to the bank example? Hahaha!

Point 11: Leader election in ZK cluster

There are two kinds of leader election: election in a brand-new cluster, and election in a cluster that is not brand-new. Here Lao Liu covers the brand-new case in detail. The other case is roughly the same, and you can look it up yourself.

Leader election has one very important principle: a leader can only be elected after more than half of the servers (a quorum) have started. How is this "half" calculated? For example, in a ZK cluster of 3 machines it is 3/2+1=2, that is, the number of servers in the cluster divided by 2 (rounded down), plus 1.

During the election, every server casts a vote. A vote has the structure (sid, zxid), where sid is the server id and zxid is the id of the latest transaction that server has seen. In a brand-new election, the initial votes are server1: (1, 0), server2: (2, 0), server3: (3, 0).

So how is the leader decided? Suppose server1 votes (sid1, zxid1) and server2 votes (sid2, zxid2). The votes are compared by zxid first, and the vote with the larger zxid wins. If the zxids are equal, the sids are compared, and the vote with the larger sid wins.
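
The "divide by 2, plus 1" rule can be written down as a tiny Python helper. The function name `quorum_size` is just an illustrative choice, and the division is integer division, which is what makes 3/2+1 come out as 2.

```python
def quorum_size(voters: int) -> int:
    """Smallest number of servers that is more than half of `voters`."""
    return voters // 2 + 1


for n in (3, 4, 5):
    # Servers that may fail while a quorum can still be formed.
    tolerated = n - quorum_size(n)
    print(n, quorum_size(n), tolerated)
```

For 3, 4 and 5 servers this gives quorums of 2, 3 and 3. A 4-node cluster tolerates only one failure, the same as a 3-node cluster, which is one reason odd cluster sizes are commonly recommended.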
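
The comparison rule above fits in a few lines of Python. This is only a sketch of the simplified rule in this article, and `Vote` and `better` are hypothetical names. As far as I know, the real implementation also compares an election epoch before the zxid, which is left out here.

```python
from typing import NamedTuple


class Vote(NamedTuple):
    sid: int   # id of the server being voted for
    zxid: int  # latest transaction id seen by that server


def better(a: Vote, b: Vote) -> Vote:
    """Pick the winning vote: larger zxid first, larger sid as tie-breaker."""
    return max(a, b, key=lambda vote: (vote.zxid, vote.sid))
```

For example, `better(Vote(1, 0), Vote(2, 0))` picks server 2 because the zxids are equal, while `better(Vote(1, 5), Vote(3, 4))` picks server 1 because its zxid is larger.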

That covers the basic knowledge points of leader election. Next, Lao Liu walks through the election steps in detail, again using a ZK cluster of three machines.

Assume ZK1, ZK2, and ZK3 are started one after another. The quorum is 2.
1. ZK1 starts and votes for itself with the vote (1, 0). Fewer than half of the servers are running, so no election can take place.
2. ZK2 starts. ZK1 and ZK2 each vote for themselves and send their vote to the other servers. ZK1's vote is (1, 0) and ZK2's vote is (2, 0).
3. Now 2 servers are running, so the election can take place, and the votes are processed first. ZK1 compares its own vote (1, 0) with the vote (2, 0) received from ZK2. Following the election rule, the zxids are both 0 and therefore equal, so the sids are compared. 2 > 1, so ZK1 updates its vote to (2, 0). ZK2 runs the same logic and keeps its vote at (2, 0).
4. After the votes are processed, the servers vote again. The votes on ZK1 and ZK2 are now both (2, 0), so ZK2 is elected leader. The server states then change: ZK2 becomes the leader and ZK1 becomes a follower.
5. Finally, when ZK3 starts, it finds that the cluster already has a leader, so no new election is held and ZK3 becomes a follower directly.
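
The five steps above can be replayed with a small Python simulation. It is a simplified sketch and not the real election protocol: `Cluster` and `start` are invented names, vote exchange is collapsed into a single `max`, and the returned strings borrow the server state names LOOKING, FOLLOWING and LEADING.

```python
class Cluster:
    """Toy model of a brand-new cluster whose servers start one by one."""

    def __init__(self, size: int):
        self.size = size
        self.running = []   # (sid, zxid) of every started server
        self.leader = None  # sid of the elected leader, if any

    def start(self, sid: int, zxid: int = 0) -> str:
        """Start one server and return the state it ends up in."""
        self.running.append((sid, zxid))
        if self.leader is not None:
            return "FOLLOWING"  # step 5: a leader already exists
        if len(self.running) < self.size // 2 + 1:
            return "LOOKING"    # step 1: no quorum yet, keep waiting
        # Steps 3 and 4: everyone converges on the best vote.
        best = max(self.running, key=lambda vote: (vote[1], vote[0]))
        self.leader = best[0]
        return "LEADING" if sid == self.leader else "FOLLOWING"


cluster = Cluster(size=3)
states = [cluster.start(1), cluster.start(2), cluster.start(3)]
```

Here `states` ends up as `["LOOKING", "LEADING", "FOLLOWING"]` and `cluster.leader` is 2, matching the walkthrough.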

Point 12: Summary of quorum (arbitration) knowledge points

One remark first: Lao Liu will cover the ZAB algorithm later, because he has not fully figured it out yet.

What is arbitration?

When a proposal is initiated, it takes effect as long as a majority agrees.

Why arbitration?

Because a proposal can take effect without waiting for every server to respond, which improves the response speed of the cluster.

How to choose the size of the quorum?

In a ZK cluster of 3 machines, it is 3/2+1=2, that is, the number of servers in the cluster divided by 2 (rounded down), plus 1.

Point 13: How ZooKeeper works

Read and write operations were described in point 10, so now let's talk about ZooKeeper state synchronization. After the leader election is complete, the ZK servers enter the state synchronization phase.

So how is state synchronization performed? Lao Liu will first write down what he remembers, don't worry!

1. The leader builds a NEWLEADER packet, which contains the leader's largest zxid, and broadcasts it to the followers.
2. After a follower receives it, the follower compares it with its own largest zxid. If its own largest zxid is smaller than the leader's, its data is not up to date and it needs to synchronize with the leader. Otherwise, no synchronization is needed.
3. If synchronization is required, the leader creates a LearnerHandler thread for each follower that needs it, and this thread handles that follower's data synchronization requests.
4. The leader's main thread waits for the results from the LearnerHandler threads. Only after a majority of followers have finished synchronizing does the leader start serving external write requests.
5. The above is the rough state synchronization process, but steps 3 and 4 only mention the LearnerHandler thread briefly. Here is what the LearnerHandler thread does in more detail:
① First, it receives the follower's FOLLOWERINFO packet, which contains the follower's largest zxid.
② It compares the follower's largest zxid with the leader's largest zxid. If they are equal, the follower is already up to date.
③ Otherwise, it decides how to synchronize. If the leader has committed proposals that the follower does not have yet, it sends a DIFF packet and then sends the missing data to the follower one by one as COMMIT packets for the follower to save. If instead the follower's largest zxid is bigger than the leader's, it sends a TRUNC packet telling the follower to remove the excess data. If the follower's largest zxid is so far behind the leader's that the difference cannot be sent incrementally, it sends a SNAP packet and transfers a full snapshot to the follower.
④ After these messages are done, it sends an UPTODATE packet to tell the follower that its data is now the latest, and synchronization is essentially finished.
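
The decision in steps ② to ④ can be sketched as one small Python function. This is a simplification under stated assumptions and not the real LearnerHandler logic. The function name `sync_packets` is made up. The parameter `oldest_cached_zxid` stands for the oldest committed proposal the leader can still replay incrementally, which is my assumption about where the DIFF/SNAP boundary lies.

```python
def sync_packets(follower_zxid: int, leader_zxid: int,
                 oldest_cached_zxid: int) -> list:
    """Return the packet types the leader would send to one follower."""
    if follower_zxid == leader_zxid:
        # Follower already has everything.
        return ["UPTODATE"]
    if follower_zxid > leader_zxid:
        # Follower has extra data that must be removed.
        return ["TRUNC", "UPTODATE"]
    if follower_zxid >= oldest_cached_zxid:
        # Follower is slightly behind: send only the missing proposals.
        return ["DIFF", "COMMIT", "UPTODATE"]
    # Follower is too far behind: send a full snapshot.
    return ["SNAP", "UPTODATE"]
```

With a leader at zxid 10 that can replay proposals back to zxid 5, a follower at 7 gets a DIFF, a follower at 12 gets a TRUNC, and a follower at 2 gets a SNAP.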

Point 14: A ZooKeeper use case: HDFS HA

After so much ZooKeeper theory, it is time for a ZooKeeper use case, so that you are prepared when an interviewer asks for one. This time the focus is HDFS HA. The high availability of HDFS HA depends mainly on ZooKeeper, and it has two parts: metadata synchronization and active/standby switching.

Let's start with metadata synchronization, which is the part circled in red in the figure below.
[Figure: HDFS HA architecture, with the metadata synchronization part circled in red]
Here is how metadata synchronization works. Two NameNodes run in the same HDFS cluster. One is the primary NameNode, in the Active state, and the other is the backup NameNode, in the Standby state. Only the Active NameNode serves external reads and writes. The Standby NameNode follows the status of the Active NameNode and switches to the Active state when the primary node fails.

However, during an active/standby switch, the new Active NameNode must make sure its metadata is synchronized with that of the original Active NameNode before it can serve requests.

How to synchronize metadata?

The JournalNode cluster is used here as a shared storage system. When a client operates on HDFS, the Active NameNode records the operation in its edits.log file and also writes the log records to the JournalNode cluster, which is responsible for storing the newly generated HDFS metadata. When new data is written to the JournalNode cluster, the Standby NameNode detects it and synchronizes the new data. In this way the Active NameNode and the Standby NameNode keep their metadata in sync. In addition, all DataNodes send block reports to both the active and the standby NameNode.

Now let's talk about active/standby switching! First, the flow chart:
[Figure: flow chart of NameNode active/standby switching]
Follow the process step by step with this flow chart.
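
To make the idea concrete, here is a minimal Python sketch of a shared edit log. It is a toy model under stated assumptions: `Journal` stands in for the whole JournalNode cluster as a single in-memory list, and the class names, method names, and the `mkdir`/`delete` operations are invented for illustration rather than taken from HDFS.

```python
class Journal:
    """In-memory stand-in for the shared JournalNode storage."""

    def __init__(self):
        self.edits = []

    def append(self, op) -> int:
        """Store one edit and return how many edits exist now."""
        self.edits.append(op)
        return len(self.edits)

    def read_from(self, applied: int) -> list:
        """Return every edit after the first `applied` ones."""
        return self.edits[applied:]


class NameNode:
    def __init__(self, journal: Journal):
        self.journal = journal
        self.namespace = set()  # the metadata: here just a set of paths
        self.applied = 0        # number of edits already applied

    def _apply(self, op) -> None:
        kind, path = op
        if kind == "mkdir":
            self.namespace.add(path)
        elif kind == "delete":
            self.namespace.discard(path)

    def client_op(self, kind: str, path: str) -> None:
        """Active side: log the edit to the journal, then apply it."""
        op = (kind, path)
        self.applied = self.journal.append(op)
        self._apply(op)

    def catch_up(self) -> None:
        """Standby side: replay edits it has not applied yet."""
        for op in self.journal.read_from(self.applied):
            self._apply(op)
            self.applied += 1
```

If the active node handles a few `client_op` calls and the standby then calls `catch_up()`, both end up with the same `namespace`, which is what lets the standby take over.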

1. Every NameNode node runs a ZKFC process, which is responsible for controlling the active/standby switching of its NameNode.
2. When ZKFC starts, it initializes the HealthMonitor and ActiveStandbyElector services and registers the corresponding callback methods with both. HealthMonitor monitors the health of the NameNode. ActiveStandbyElector receives election requests from ZKFC and tries to create the temporary (ephemeral) node ActiveStandbyElectorLock.
3. Next, the two ZKFCs each try to create the temporary node ActiveStandbyElectorLock in ZooKeeper through their own ActiveStandbyElector. Because of ZooKeeper's write consistency, only one ActiveStandbyElector succeeds.
4. The ActiveStandbyElector that succeeds calls back into ZKFC to switch its NameNode to the Active state, and the one that fails calls back into ZKFC to switch its NameNode to the Standby state.
5. Either way, whether it won the election or not, every ActiveStandbyElector registers a Watcher on the temporary node ActiveStandbyElectorLock to monitor changes to this node.
6. If the HealthMonitor of the Active NameNode detects that the NameNode is in an abnormal state, it notifies its ZKFC. ZKFC calls the ActiveStandbyElector to delete the temporary node ActiveStandbyElectorLock it created in ZooKeeper. The Watcher registered by the Standby NameNode's ActiveStandbyElector then receives this delete event.
7. After receiving the event, that ActiveStandbyElector starts an active/standby election and tries to create the temporary node ActiveStandbyElectorLock. If the creation succeeds, the Standby NameNode is elected as the Active NameNode.
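
Steps 3 to 7 boil down to "whoever creates the lock node first is active, everyone watches the node, and the standby retries when it disappears". Below is a small in-memory Python sketch of that pattern. It does not talk to a real ZooKeeper: `FakeZooKeeper` and `Elector` are hypothetical classes written only for this illustration, and the health check is reduced to a boolean.

```python
class FakeZooKeeper:
    """In-memory stand-in for the single lock znode used here."""

    def __init__(self):
        self.lock_owner = None
        self.watchers = []

    def create_ephemeral(self, owner: str) -> bool:
        """Only the first creator succeeds (step 3)."""
        if self.lock_owner is not None:
            return False
        self.lock_owner = owner
        return True

    def watch(self, callback) -> None:
        self.watchers.append(callback)

    def delete(self, owner: str) -> None:
        """Delete the lock and fire every watch once (step 6)."""
        if self.lock_owner != owner:
            return
        self.lock_owner = None
        fired, self.watchers = self.watchers, []
        for callback in fired:
            callback()


class Elector:
    def __init__(self, zk: FakeZooKeeper, name: str):
        self.zk, self.name = zk, name
        self.state = "INIT"
        self.healthy = True

    def join_election(self) -> None:
        won = self.zk.create_ephemeral(self.name)
        self.state = "ACTIVE" if won else "STANDBY"       # step 4
        self.zk.watch(self.on_lock_deleted)               # step 5

    def on_lock_deleted(self) -> None:
        if self.healthy:
            self.join_election()                          # step 7

    def report_unhealthy(self) -> None:
        self.healthy = False
        self.state = "STANDBY"
        self.zk.delete(self.name)                         # step 6


zk = FakeZooKeeper()
nn1, nn2 = Elector(zk, "nn1"), Elector(zk, "nn2")
nn1.join_election()
nn2.join_election()
nn1.report_unhealthy()
```

After the last line, `nn2.state` is `"ACTIVE"` and `zk.lock_owner` is `"nn2"`: the standby took over as soon as the lock node was deleted.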

Point 15: Split brain

What is split brain?

Split brain is when two leaders appear in a distributed system at the same time. It can have many causes, such as network delay. It is a very dangerous situation and must be avoided through the built-in fencing mechanism.

So how does fencing work?

1. After an ActiveStandbyElector successfully creates the temporary node ActiveStandbyElectorLock, it also creates a persistent node ActiveBreadCrumb, which stores the address of the Active NameNode.
2. When the Active NameNode closes its Session in a normal state, the temporary node ActiveStandbyElectorLock and the persistent node ActiveBreadCrumb are both deleted.
3. But if the ActiveStandbyElector's Session is closed in an abnormal state, the persistent node ActiveBreadCrumb remains.
4. When another NameNode changes from standby to active, it finds the ActiveBreadCrumb node left over from the previous Active NameNode and calls back to ZKFC to fence the old Active NameNode.
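
The breadcrumb idea can also be shown with a few lines of Python. This is again an in-memory sketch, not Hadoop code: the `Znodes` class, the three functions, and the `fence` callback are invented for illustration. Only the two node names come from the text above.

```python
LOCK = "ActiveStandbyElectorLock"
CRUMB = "ActiveBreadCrumb"


class Znodes:
    def __init__(self):
        self.ephemeral = {}   # path -> owner, gone when the session ends
        self.persistent = {}  # path -> data, survives session loss


def become_active(zk: Znodes, me: str, fence) -> bool:
    """Try to take the lock, fencing any previous active first."""
    if LOCK in zk.ephemeral:
        return False
    zk.ephemeral[LOCK] = me
    previous = zk.persistent.get(CRUMB)
    if previous is not None and previous != me:
        fence(previous)           # step 4: a stale breadcrumb was found
    zk.persistent[CRUMB] = me     # step 1: leave our own breadcrumb
    return True


def quit_gracefully(zk: Znodes, me: str) -> None:
    """Step 2: a normal shutdown removes both nodes."""
    if zk.ephemeral.get(LOCK) == me:
        del zk.ephemeral[LOCK]
        zk.persistent.pop(CRUMB, None)


def session_expired(zk: Znodes, me: str) -> None:
    """Step 3: an abnormal exit removes only the ephemeral lock."""
    if zk.ephemeral.get(LOCK) == me:
        del zk.ephemeral[LOCK]
```

If `nn1` becomes active and then loses its session, the breadcrumb still names `nn1`. When `nn2` calls `become_active`, the `fence` callback is invoked with `"nn1"` before `nn2` records itself as the new active node. After a graceful quit no breadcrumb is left, so nothing is fenced.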

03 Summary

Well, it's finally over! Lao Liu has summarized 15 knowledge points about ZooKeeper for big data. Every one of them matters to Lao Liu, and he has memorized them all. I hope they help those of you who want to learn big data, and I also hope to get criticism and guidance from the experts.
Finally, if you have any questions, contact me through the official account "Lao Liu who is working hard". If not, cheer up and work on getting into a big company together with Lao Liu.

Origin blog.csdn.net/qq_36780184/article/details/109731684