HDFS-HA cluster

 HDFS-HA Automatic Failover Working Mechanism

 

Automatic failover for HA relies on the following features of ZooKeeper:

1) Failure detection: Each NameNode in the cluster maintains a persistent session in ZooKeeper. If the machine crashes, its ZooKeeper session expires, and ZooKeeper notifies the other NameNode that a failover needs to be triggered.

2) Active NameNode election: ZooKeeper provides a simple mechanism for electing exactly one node as active. If the currently active NameNode crashes, another node can acquire a special exclusive lock in ZooKeeper to indicate that it should become the next active NameNode.
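The two features above can be illustrated with a small in-memory model. This is only a sketch: `ToyZooKeeper` and its methods are hypothetical names invented for illustration, not the real ZooKeeper client API, and the model ignores networking, timeouts, and quorum entirely.

```python
class ToyZooKeeper:
    """In-memory stand-in for the two ZooKeeper features used by HA (not a real client)."""

    def __init__(self):
        self.sessions = set()    # names of live sessions
        self.lock_owner = None   # session that owns the ephemeral lock znode
        self.watchers = []       # callbacks fired when the lock znode disappears

    def open_session(self, name):
        self.sessions.add(name)

    def try_acquire_lock(self, name):
        """Create the lock znode if it does not exist; only a live session can own it."""
        if name in self.sessions and self.lock_owner is None:
            self.lock_owner = name
            return True
        return False

    def watch_lock(self, callback):
        self.watchers.append(callback)

    def expire_session(self, name):
        """Simulate a crash: the session ends and its ephemeral lock znode is deleted."""
        self.sessions.discard(name)
        if self.lock_owner == name:
            self.lock_owner = None
            for callback in list(self.watchers):
                callback()  # failure detection: notify whoever is watching


def run_demo():
    zk = ToyZooKeeper()
    zk.open_session("nn1")
    zk.open_session("nn2")
    assert zk.try_acquire_lock("nn1")       # nn1 becomes active
    assert not zk.try_acquire_lock("nn2")   # the lock is exclusive
    # nn2 watches the lock and tries to take it once it disappears
    zk.watch_lock(lambda: zk.try_acquire_lock("nn2"))
    zk.expire_session("nn1")                # nn1's machine crashes
    return zk.lock_owner


if __name__ == "__main__":
    print(run_demo())
```

In this toy run, the lock passes to `nn2` as soon as the session of `nn1` expires, which mirrors how session loss and the exclusive lock together drive the choice of a new active node.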

Automatic failover also introduces another new component, ZKFC (ZKFailoverController). ZKFC is a ZooKeeper client that also monitors and manages the state of the NameNode. Each host running a NameNode also runs a ZKFC process. ZKFC is responsible for:

1) Health monitoring: ZKFC periodically pings the NameNode on its own host with a health check command. As long as the NameNode reports its health status in time, ZKFC considers the node healthy. If the node crashes, freezes, or otherwise enters an unhealthy state, the health monitor marks it as unhealthy.

2) ZooKeeper session management: While the local NameNode is healthy, ZKFC keeps a session open in ZooKeeper. If the local NameNode is active, ZKFC also holds a special lock znode, which relies on ZooKeeper's support for ephemeral nodes. If the session terminates, the lock znode is deleted automatically.

3) ZooKeeper-based election: If the local NameNode is healthy and ZKFC sees that no other node currently holds the lock znode, it tries to acquire the lock itself. If it succeeds, it has won the election and is responsible for running the failover that makes its local NameNode Active. The failover process is similar to the manual failover described earlier: the previously active NameNode is first fenced (isolated) if necessary, and then the local NameNode transitions to the Active state.
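The three ZKFC responsibilities can be sketched as a single monitoring round. This is a toy model under stated assumptions, not Hadoop's implementation: `ToyLock`, `ToyZKFC`, and `tick` are hypothetical names, the lock is a plain Python object rather than a znode, and the sketch always fences the previous holder, whereas the text above says fencing happens only if necessary.

```python
class ToyLock:
    """Toy stand-in for the ephemeral lock znode (hypothetical, in-memory)."""

    def __init__(self):
        self.holder = None        # current owner of the lock
        self.last_holder = None   # remembered so the winner knows whom to fence

    def try_acquire(self, who):
        if self.holder is None:
            self.holder = who
            return True
        return False

    def release(self, who):
        if self.holder == who:
            self.last_holder = who
            self.holder = None


class ToyZKFC:
    def __init__(self, name, lock, events):
        self.name = name
        self.lock = lock
        self.events = events      # shared log of what happened, in order
        self.active = False

    def tick(self, namenode_healthy):
        """One round: health check result in, session and election decisions out."""
        if not namenode_healthy:
            # Session management: an unhealthy NameNode gives up its session,
            # so the ephemeral lock disappears.
            self.lock.release(self.name)
            self.active = False
            return
        if self.active:
            return
        previous = self.lock.last_holder
        # Election: take the lock only if nobody else holds it.
        if self.lock.try_acquire(self.name):
            if previous not in (None, self.name):
                self.events.append(f"fence {previous}")
            self.events.append(f"{self.name} becomes active")
            self.active = True


def run_failover_demo():
    lock, events = ToyLock(), []
    zkfc1 = ToyZKFC("nn1", lock, events)
    zkfc2 = ToyZKFC("nn2", lock, events)
    zkfc1.tick(True)    # nn1 wins the election
    zkfc2.tick(True)    # nn2 stays standby because the lock is taken
    zkfc1.tick(False)   # nn1 fails its health check
    zkfc2.tick(True)    # nn2 acquires the lock, fences nn1, becomes active
    return events


if __name__ == "__main__":
    for event in run_failover_demo():
        print(event)
```

The event order in the demo (fence first, then transition) follows the order described in item 3.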

Working mechanism of YARN-HA

 

Origin blog.csdn.net/qq_53368181/article/details/121809922