QJM solution for HDFS

Introduction

  • Quorum Journal Manager (QJM) is one of the HDFS high-availability (HA) solutions officially recommended by Hadoop.
  • It uses ZKFC, a ZooKeeper client, to switch between the active and standby NameNodes.
  • It uses a JournalNode (JN) cluster to share the edits log, which keeps the active and standby NameNodes in sync.

Active/standby switching and split-brain prevention: ZKFailoverController (ZKFC)

ZKFailoverController (ZKFC) is a ZooKeeper client. Its main duties are:

  • Monitoring and managing NameNode health
    ZKFC periodically checks the health of its local NameNode, and of the machine it runs on, by issuing a health-check command.
  • Maintaining a session with the ZooKeeper cluster and taking part in the election
    If the local NameNode is healthy and ZKFC sees that no other node currently holds the lock znode, it tries to acquire the lock itself. If it succeeds, it has "won the election" and is responsible for running a failover that makes its local NameNode active. If another node already holds the lock, the election attempt fails; ZKFC then registers a watch on the lock znode and waits for the next election.
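
The election described above can be sketched as a small in-memory simulation. This is only an illustration: `LockZnode`, `run_election`, and the names `zkfc-nn1` / `zkfc-nn2` are hypothetical stand-ins, not ZooKeeper or Hadoop APIs, and a real lock znode is ephemeral and tied to a ZooKeeper session.

```python
class LockZnode:
    """In-memory stand-in for the lock znode (hypothetical, not a ZooKeeper API)."""

    def __init__(self):
        self.holder = None
        self.watchers = []

    def try_acquire(self, zkfc_name):
        # Only one ZKFC can hold the lock at a time.
        if self.holder is None:
            self.holder = zkfc_name
            return True
        return False

    def watch(self, zkfc_name):
        # A ZKFC that lost the election registers to be told when the lock is freed.
        if zkfc_name not in self.watchers:
            self.watchers.append(zkfc_name)

    def release(self):
        # Models the holder's session ending or its NameNode becoming unhealthy.
        self.holder = None
        notified, self.watchers = self.watchers, []
        return notified


def run_election(lock, zkfc_name, namenode_healthy):
    """Return the role of the local NameNode after one election round."""
    if not namenode_healthy:
        return "not participating"
    if lock.try_acquire(zkfc_name):
        return "active"
    lock.watch(zkfc_name)
    return "standby"


lock = LockZnode()
roles = {
    "zkfc-nn1": run_election(lock, "zkfc-nn1", namenode_healthy=True),
    "zkfc-nn2": run_election(lock, "zkfc-nn2", namenode_healthy=True),
}
print(roles)

# The active side goes away; the waiting ZKFC is notified and runs again.
waiting = lock.release()
roles_after = {name: run_election(lock, name, namenode_healthy=True) for name in waiting}
print(roles_after)
```

The first round leaves one ZKFC as the lock holder and the other watching; after the lock is released, the watcher wins the next round.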

Active/standby switching and split-brain prevention: fencing mechanism

  • Failover is the process of switching the active and standby roles. The biggest risk during the switch is split-brain, where two NameNodes both act as active. A fencing mechanism is therefore needed: it isolates the previous active node first, and only then is the standby converted to the active state.
  • The Hadoop common library provides two fencing implementations, sshfence and shellfence (the latter is configured under the name shell), so you do not need to implement your own.
    sshfence logs in to the target node over SSH and uses the fuser command to kill the process. It locates the process PID by TCP port number, which is more accurate than using the jps command.
    shellfence runs a user-defined shell command (script) to perform the isolation.
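
The ordering that matters here (fence first, promote second) can be sketched as follows. This is a minimal illustration, assuming that fencing methods are tried in order until one reports success; `fence_then_promote` and the `fake_*` functions are hypothetical and do not call SSH, fuser, or any Hadoop code.

```python
def fence_then_promote(fencing_methods, promote_standby):
    """Try each fencing method in order; promote only after one succeeds.

    Returns the name of the method that fenced the old active node,
    or None if every method failed (in which case nothing is promoted).
    """
    for name, method in fencing_methods:
        try:
            fenced = method()
        except Exception:
            # A method that raises is treated like one that reports failure.
            fenced = False
        if fenced:
            promote_standby()
            return name
    return None


events = []


def fake_sshfence():
    # Hypothetical: pretend the old active host is unreachable over SSH.
    events.append("sshfence failed")
    return False


def fake_shell_fence():
    # Hypothetical: pretend a user-defined script isolated the old active node.
    events.append("shell fence succeeded")
    return True


def fake_promote():
    events.append("standby promoted to active")


used = fence_then_promote(
    [("sshfence", fake_sshfence), ("shell", fake_shell_fence)],
    fake_promote,
)
print(used, events)
```

If no method succeeds, the function returns None and never calls the promotion step, which is the property that prevents two active NameNodes.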

Synchronizing data state between the active and standby NameNodes

  • The JournalNode (JN) cluster is a lightweight distributed system, mainly used for high-speed reading, writing, and storage of data.
  • Usually 2N+1 JournalNodes are used to store the shared edits log, so the cluster can tolerate the failure of up to N nodes. The underlying mechanism is similar to ZooKeeper's distributed consensus algorithm.
  • When any modification is executed on the active NN, it also records the edit to a majority (more than half) of the JNs. The standby NN detects that the edits log in the JNs has changed, reads the new edits, and replays them against its own in-memory directory tree image to stay in sync.
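
The majority rule can be illustrated with a short simulation. This is a sketch under simplifying assumptions: each JournalNode is modeled as a plain dictionary, `majority` and `write_edit` are hypothetical helper names, and recovery of edits left behind by a failed write is not modeled.

```python
def majority(num_journal_nodes):
    """Smallest number of acknowledgements that is more than half."""
    return num_journal_nodes // 2 + 1


def write_edit(journal_nodes, edit):
    """Send `edit` to every reachable JN; succeed only with a majority of acks."""
    acks = 0
    for jn in journal_nodes:
        if jn["up"]:
            jn["edits"].append(edit)
            acks += 1
    return acks >= majority(len(journal_nodes))


# 2N+1 JournalNodes with N = 1.
jns = [{"up": True, "edits": []} for _ in range(3)]

ok_all_up = write_edit(jns, "mkdir /a")    # 3 of 3 acks
jns[0]["up"] = False
ok_one_down = write_edit(jns, "mkdir /b")  # 2 of 3 acks, still a majority
jns[1]["up"] = False
ok_two_down = write_edit(jns, "mkdir /c")  # 1 of 3 acks, no majority

print(ok_all_up, ok_one_down, ok_two_down)
```

With three JournalNodes, a write still succeeds when one node is down but fails when two are down, which matches the 2N+1 sizing with N = 1.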

HA cluster construction

Cluster basic environment preparation

HA cluster planning

Upload the installation package and configure environment variables

HA cluster initialization

Origin blog.csdn.net/weixin_49750432/article/details/132046266