QJM solution for HDFS

Introduction

Quorum Journal Manager (QJM) is one of the HDFS high-availability (HA) solutions officially recommended by Hadoop.
It uses ZKFC, a ZooKeeper client, to switch between the active and standby NameNodes.
It uses a cluster of JournalNodes (JNs) to share the edit log and so keep the NameNodes synchronized.


ZKFailoverController (ZKFC) is a ZooKeeper client. Its main duties:

Monitoring and managing NameNode health
ZKFC periodically checks the health of its local NameNode and of the machine it runs on by issuing health-check commands.
Maintaining a session with the ZooKeeper cluster
If the local NameNode is healthy and ZKFC sees that no other node currently holds the lock znode, it tries to acquire the lock itself. If it succeeds, it has "won the election" and is responsible for running a failover to make its local NameNode active. If another node already holds the lock, the election fails for this ZKFC; it registers a watch on the znode and waits for the next election.
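
The election described above can be illustrated with a small simulation. This is only a sketch: `LockZnode` and `Zkfc` are hypothetical names invented here, the lock is a plain in-memory object rather than a real ZooKeeper znode, and fencing of the old active node is left out.

```python
class LockZnode:
    """Toy stand-in for the lock znode; this is not the ZooKeeper API."""

    def __init__(self):
        self.holder = None

    def try_acquire(self, node):
        # Only succeeds when nobody holds the lock.
        if self.holder is None:
            self.holder = node
            return True
        return False

    def release(self, node):
        if self.holder == node:
            self.holder = None


class Zkfc:
    """Hypothetical model of one failover controller next to its NameNode."""

    def __init__(self, name, lock):
        self.name = name
        self.lock = lock
        self.healthy = True
        self.state = "standby"

    def tick(self):
        """One monitoring round: check health, then take part in the election."""
        if not self.healthy:
            # An unhealthy NameNode gives up the lock so that a peer can win.
            self.lock.release(self.name)
            self.state = "standby"
        elif self.lock.holder == self.name or self.lock.try_acquire(self.name):
            self.state = "active"
        else:
            # Lost the election: stay standby and wait for the next round.
            self.state = "standby"
        return self.state


lock = LockZnode()
nn1, nn2 = Zkfc("nn1", lock), Zkfc("nn2", lock)
first_round = (nn1.tick(), nn2.tick())   # nn1 gets the lock first
nn1.healthy = False                      # simulate a failure of nn1
second_round = (nn1.tick(), nn2.tick())  # nn2 takes over the lock
print(first_round, second_round)
```

In the first round nn1 wins because it asks for the lock first; once nn1 is marked unhealthy, it releases the lock and nn2 wins the next round.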

Failover is the process of switching the active and standby roles. The biggest risk during the switch is split-brain, where two NameNodes both believe they are active. A fencing mechanism is therefore needed: the previous active node is isolated first, and only then is the standby converted to the active state.
The Hadoop common library provides two fencing implementations, sshfence and shellfence, so there is no need to implement your own.
sshfence logs in to the target node over SSH and kills the process with the fuser command. The process PID is located through its TCP port number, which is more accurate than using the jps command.
shellfence executes a user-defined shell command (script) to complete the isolation.
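
The "fence first, promote second" rule can be sketched as follows. This is an illustrative model, not Hadoop code: `fail_over`, `fake_sshfence`, and `fake_shell_fence` are hypothetical names, the two fencing functions only pretend to succeed or fail, and the sketch assumes that the configured methods are attempted in order until one reports success.

```python
def fail_over(fencing_methods, old_active):
    """Try each fencing method in order; promote only if one succeeds."""
    for name, method in fencing_methods:
        if method(old_active):
            # The old active is isolated, so promoting the standby is safe.
            return {"fenced_by": name, "standby_promoted": True}
    # Nothing confirmed the isolation: promoting now would risk split-brain.
    return {"fenced_by": None, "standby_promoted": False}


def fake_sshfence(host):
    # Hypothetical stand-in: pretend the SSH login to the old active failed.
    return False


def fake_shell_fence(host):
    # Hypothetical stand-in: pretend the user script reported success.
    return True


methods = [("sshfence", fake_sshfence), ("shellfence", fake_shell_fence)]
fallback_result = fail_over(methods, "nn1")      # falls through to the script
ssh_only_result = fail_over(methods[:1], "nn1")  # no method succeeds
print(fallback_result)
print(ssh_only_result)
```

With both methods listed, the failed SSH attempt falls through to the script and the standby is promoted. With only the failing method, the standby stays standby, which is the safe outcome.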

The JournalNode (JN) cluster is a lightweight distributed system, used mainly for fast reading, writing, and storage of data.
Usually 2N+1 JournalNodes are deployed to store the shared edit log, so the cluster tolerates the failure of up to N nodes. The underlying mechanism is similar to ZooKeeper's distributed consensus algorithm.
When any modification is executed on the active NN, the journaling process also records the edit to a majority of the JNs (more than half). The standby NN detects that the log on the JNs has changed, reads the new edits from the JNs, and replays the recorded operations to bring its own directory tree image up to date.
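
The majority rule and the standby's replay can be sketched with a toy model. Everything here is an assumption for illustration: JournalNodes are plain dictionaries, and `majority`, `write_edit`, and `tail_edits` are hypothetical helpers, not part of any Hadoop API.

```python
def majority(num_journal_nodes):
    """Smallest number of JNs that is more than half of the cluster."""
    return num_journal_nodes // 2 + 1


def write_edit(journal_nodes, txid, op):
    """Active side: send the edit to every reachable JN and count the acks."""
    acks = 0
    for jn in journal_nodes:
        if jn["up"]:
            jn["edits"].append((txid, op))
            acks += 1
    # The write only counts as committed when a majority acknowledged it.
    return acks >= majority(len(journal_nodes))


def tail_edits(journal_nodes, last_applied_txid):
    """Standby side: collect edits newer than last_applied_txid, in txid order."""
    seen = {}
    for jn in journal_nodes:
        if jn["up"]:
            for txid, op in jn["edits"]:
                if txid > last_applied_txid:
                    seen[txid] = op
    return [seen[txid] for txid in sorted(seen)]


# Three JNs (N = 1) with one node down: a majority of 2 is still reachable.
jns = [{"up": True, "edits": []}, {"up": True, "edits": []}, {"up": False, "edits": []}]
committed = write_edit(jns, 1, "mkdir /a")
replayed = tail_edits(jns, 0)

# Two of three JNs down: the write cannot reach a majority.
degraded = [{"up": True, "edits": []}, {"up": False, "edits": []}, {"up": False, "edits": []}]
degraded_commit = write_edit(degraded, 1, "mkdir /a")

# Failures tolerated by a cluster of 2N+1 nodes.
tolerated = [(n, n - majority(n)) for n in (3, 5)]
print(committed, replayed, degraded_commit, tolerated)
```

With three JNs the write still commits when one node is down but not when two are down, which matches the 2N+1 sizing: three nodes tolerate one failure and five tolerate two.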
