Hadoop's high availability mechanism and federation mechanism

1. Hadoop's high availability mechanism

The high availability mechanism mainly addresses the NameNode's single point of failure.

In Hadoop, the NameNode plays a critical role: it manages the metadata of the entire HDFS file system, so the availability of the NameNode directly determines the availability of Hadoop. If the NameNode process fails, the whole cluster can no longer be used normally. In practice, therefore, a high-availability (HA) cluster is generally deployed, with two NameNodes configured in the Hadoop cluster.

In a typical HA cluster, two independent machines are configured as NameNodes. At any time, one NameNode is in the Active state and the other is in the Standby state. The Active NameNode handles all client operations in the cluster, while the Standby NameNode acts as a hot backup, maintaining enough state to provide a fast failover when required.

[Figure: HA cluster architecture]

ZKFC components:

  • ZKFailoverController

    ZKFailoverController is a ZooKeeper-based failover controller responsible for switching the NameNode between active and standby. It monitors the health of the NameNode; when it detects a problem with the Active NameNode, it triggers a new election through ZooKeeper to complete the switch between the Active and Standby states.

  • HealthMonitor

    Periodically calls the NameNode's HAServiceProtocol RPC interface (monitorHealth and getServiceStatus) to monitor the NameNode's health, and reports the result to ZKFailoverController.

  • ActiveStandbyElector

    Receives election requests from ZKFC and completes the active/standby election automatically through ZooKeeper. Once the election finishes, it calls back ZKFailoverController's switching methods to move the NameNode between Active and Standby.
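
The way HealthMonitor feeds ZKFailoverController can be sketched in a few lines of Python. This is only a toy model under stated assumptions, not Hadoop's Java implementation: `FakeNameNode`, `check_once`, `health_changed` and the "join/quit election" events are hypothetical names chosen for illustration, and the real monitor runs on a timer rather than being called by hand.

```python
from enum import Enum
from typing import Callable, List, Optional


class Health(Enum):
    HEALTHY = "healthy"
    UNHEALTHY = "unhealthy"


class FakeNameNode:
    """Hypothetical stand-in for the NameNode's health RPC endpoint."""

    def __init__(self) -> None:
        self.alive = True

    def monitor_health(self) -> None:
        # Models the monitorHealth call: raise if the node does not respond.
        if not self.alive:
            raise ConnectionError("NameNode did not respond")


class HealthMonitor:
    """Polls one NameNode and reports health transitions to a callback."""

    def __init__(self, namenode: FakeNameNode,
                 on_change: Callable[[Health], None]) -> None:
        self.namenode = namenode
        self.on_change = on_change
        self.last: Optional[Health] = None

    def check_once(self) -> Health:
        try:
            self.namenode.monitor_health()
            state = Health.HEALTHY
        except ConnectionError:
            state = Health.UNHEALTHY
        if state != self.last:  # report only when the state changes
            self.last = state
            self.on_change(state)
        return state


class ZKFailoverController:
    """Reacts to health changes by joining or quitting the election."""

    def __init__(self) -> None:
        self.in_election = False
        self.events: List[str] = []

    def health_changed(self, state: Health) -> None:
        self.in_election = state is Health.HEALTHY
        self.events.append(
            "join election" if self.in_election else "quit election")


nn = FakeNameNode()
zkfc = ZKFailoverController()
monitor = HealthMonitor(nn, zkfc.health_changed)

monitor.check_once()  # healthy: ZKFC joins the election
monitor.check_once()  # unchanged: nothing is reported
nn.alive = False
monitor.check_once()  # unhealthy: ZKFC quits the election
print(zkfc.events)
```

Reporting only transitions, rather than every poll result, keeps the controller from re-running an election on each health check.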

DataNode

The NameNode holds the HDFS metadata and the data block information (the block map). The block information is actively reported by each DataNode to both the Active NameNode and the Standby NameNode.
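
Why reporting to both NameNodes matters can be seen in a small sketch: because every DataNode sends its block list to the Active and the Standby alike, the Standby's block map is already populated at failover time. The classes and method names below (`receive_block_report`, `send_block_report`) are hypothetical and only model the idea; they are not Hadoop APIs.

```python
from typing import Dict, List, Set


class NameNode:
    def __init__(self, name: str) -> None:
        self.name = name
        # block id -> ids of the DataNodes holding a replica
        self.block_map: Dict[int, Set[str]] = {}

    def receive_block_report(self, datanode_id: str,
                             block_ids: Set[int]) -> None:
        # Forget what this DataNode reported earlier, then record the
        # fresh report.
        for holders in self.block_map.values():
            holders.discard(datanode_id)
        for block_id in block_ids:
            self.block_map.setdefault(block_id, set()).add(datanode_id)


class DataNode:
    def __init__(self, datanode_id: str, blocks: Set[int]) -> None:
        self.datanode_id = datanode_id
        self.blocks = blocks

    def send_block_report(self, namenodes: List[NameNode]) -> None:
        # The same report goes to the Active and the Standby NameNode.
        for nn in namenodes:
            nn.receive_block_report(self.datanode_id, set(self.blocks))


active = NameNode("active")
standby = NameNode("standby")

for dn in (DataNode("dn1", {1, 2}), DataNode("dn2", {2, 3})):
    dn.send_block_report([active, standby])

# Both NameNodes now hold the same block map.
print(active.block_map == standby.block_map)
```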

Shared storage system

The shared storage system stores the HDFS metadata changes (the EditLog). The Active NameNode writes to it and the Standby NameNode reads from it, which keeps the metadata of the two in sync. During an active/standby switch, the new Active NameNode must finish synchronizing the metadata before it can serve clients.

[Figure: NameNode high availability workflow]
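
The write/read split on the shared edit log can be illustrated with a minimal in-memory model. Everything here is an assumption made for the sketch: `SharedEditLog`, `Namespace`, the two operations (`mkdir`, `delete`) and the sequential transaction ids are simplified stand-ins, not the real EditLog format or storage backend.

```python
from typing import List, Set, Tuple

Edit = Tuple[int, str, str]  # (transaction id, operation, path)


class SharedEditLog:
    """Hypothetical stand-in for the shared storage system."""

    def __init__(self) -> None:
        self.edits: List[Edit] = []

    def append(self, op: str, path: str) -> int:
        txid = len(self.edits) + 1
        self.edits.append((txid, op, path))
        return txid

    def read_from(self, txid: int) -> List[Edit]:
        return [edit for edit in self.edits if edit[0] >= txid]


class Namespace:
    """In-memory namespace rebuilt by replaying edits in order."""

    def __init__(self) -> None:
        self.paths: Set[str] = set()
        self.last_txid = 0

    def apply(self, edit: Edit) -> None:
        txid, op, path = edit
        if op == "mkdir":
            self.paths.add(path)
        elif op == "delete":
            self.paths.discard(path)
        self.last_txid = txid


log = SharedEditLog()
active = Namespace()
standby = Namespace()


def active_write(op: str, path: str) -> None:
    # Only the Active NameNode writes to the shared log.
    txid = log.append(op, path)
    active.apply((txid, op, path))


def standby_catch_up() -> None:
    # The Standby only reads: it replays every edit it has not seen yet.
    for edit in log.read_from(standby.last_txid + 1):
        standby.apply(edit)


active_write("mkdir", "/data")
active_write("mkdir", "/tmp")
active_write("delete", "/tmp")

standby_catch_up()  # must finish before the Standby may become Active
print(standby.paths == active.paths, standby.last_txid)
```

Tracking the last applied transaction id is what lets the Standby resume from where it stopped instead of replaying the whole log each time.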

The figure above shows how high availability works in practice:

  • The shared storage system keeps the data of the two NameNodes in sync
  • ZKFC and ZooKeeper together switch the NameNode state:
    1. HealthMonitor monitors the status of the working NameNode and immediately notifies ZKFailoverController when it finds that the NameNode is down.
    2. ZKFailoverController tells ActiveStandbyElector that the NameNode is down.
    3. ActiveStandbyElector asks ZooKeeper to elect a new Active NameNode.
    4. When the election completes, ZooKeeper returns the result to ActiveStandbyElector.
    5. ActiveStandbyElector reports the election result to ZKFailoverController.
    6. ZKFailoverController changes the state of the original Active NameNode.
    7. ZKFailoverController switches the newly elected NameNode to Active.
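
The seven steps above can be walked through with a small simulation. It rests on several simplifying assumptions: `FakeZooKeeper` is a hypothetical in-memory lock standing in for ZooKeeper's election, the HealthMonitor and ActiveStandbyElector roles are folded into one `ZKFC` class, and fencing of the old Active NameNode is left out entirely.

```python
from typing import Callable, List, Optional


class FakeZooKeeper:
    """Hypothetical in-memory lock; whoever holds it is the Active node."""

    def __init__(self) -> None:
        self.lock_owner: Optional[str] = None
        self.watchers: List[Callable[[], None]] = []

    def try_acquire(self, owner: str) -> bool:
        if self.lock_owner is None:
            self.lock_owner = owner
            return True
        return False

    def release(self, owner: str) -> None:
        if self.lock_owner == owner:
            self.lock_owner = None
            for notify in list(self.watchers):
                notify()  # tell every candidate that the lock is free


class NameNode:
    def __init__(self, name: str) -> None:
        self.name = name
        self.state = "standby"
        self.healthy = True


class ZKFC:
    """One controller per NameNode (monitor and elector roles combined)."""

    def __init__(self, namenode: NameNode, zk: FakeZooKeeper) -> None:
        self.nn = namenode
        self.zk = zk
        zk.watchers.append(self.run_election)

    def run_election(self) -> None:
        # Steps 3-5: take part in the election and learn the result.
        if self.nn.healthy and self.zk.try_acquire(self.nn.name):
            self.nn.state = "active"  # step 7: the winner becomes Active

    def on_health_check(self) -> None:
        # Steps 1-2: the failure is detected and reported.
        if not self.nn.healthy and self.nn.state == "active":
            self.nn.state = "standby"  # step 6: old Active changes state
            self.zk.release(self.nn.name)


zk = FakeZooKeeper()
nn1, nn2 = NameNode("nn1"), NameNode("nn2")
zkfc1, zkfc2 = ZKFC(nn1, zk), ZKFC(nn2, zk)

zkfc1.run_election()  # nn1 wins the first election
zkfc2.run_election()  # nn2 stays Standby
nn1.healthy = False   # the Active NameNode goes down
zkfc1.on_health_check()

print(nn1.state, nn2.state, zk.lock_owner)
```

Releasing the lock is what triggers the re-election here: the remaining healthy candidate is notified, acquires the lock, and promotes its NameNode.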

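Another point to note is that the SecondaryNameNode is not a standby NameNode in the high-availability sense: it periodically merges the edit log into the fsimage (checkpointing) and cannot take over when the NameNode fails.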

The main difference between the two can be shown in the following two figures

[Figures: SecondaryNameNode compared with the Standby NameNode]

For the specific differences, see this blog post: https://blog.csdn.net/andyguan01_2/article/details/88696239

2. Hadoop's federation mechanism

The federation mechanism mainly addresses the horizontal scaling of the NameNode.

The single-NameNode architecture creates potential scalability and performance problems for HDFS. When the cluster is large enough, the memory used by the NameNode process may reach hundreds of gigabytes, and the NameNode becomes a performance bottleneck. Federation was therefore proposed as a scheme for scaling the NameNode horizontally.

Under Federation, several NameNodes run in the cluster, and each one manages its own namespace. This differs from HA mode, where the two NameNodes share the same namespace. The namespaces are isolated from one another and each has its own ID; every DataNode creates a separate directory for each ID, so the data in different directories on a DataNode is managed by different namespaces.

[Figure: HDFS Federation architecture]

In summary:

Multiple NNs share the storage resources of one cluster, and each NN can serve requests independently.

Each NN defines a storage pool (called a block pool in HDFS) with its own ID, and every DN provides storage for all of the pools.

Each DN reports block information to the corresponding NN according to the pool ID. In addition, each DN reports its available local storage to all NNs.
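
The two kinds of reports in the summary above can be modelled in a short sketch: block information goes only to the NameNode that owns the pool, while capacity goes to every NameNode. The namespace paths, the pool ids `BP-1` and `BP-2`, and all class and method names are made up for illustration.

```python
from typing import Dict, List, Set


class NameNode:
    def __init__(self, namespace: str, pool_id: str) -> None:
        self.namespace = namespace
        self.pool_id = pool_id
        self.blocks: Dict[str, Set[int]] = {}  # DataNode id -> its blocks
        self.free_space: Dict[str, int] = {}   # DataNode id -> free bytes


class DataNode:
    def __init__(self, dn_id: str, free_bytes: int) -> None:
        self.dn_id = dn_id
        self.free_bytes = free_bytes
        # pool id -> block ids; models one local directory per pool
        self.pools: Dict[str, Set[int]] = {}

    def store(self, pool_id: str, block_id: int) -> None:
        self.pools.setdefault(pool_id, set()).add(block_id)

    def report(self, namenodes: List[NameNode]) -> None:
        for nn in namenodes:
            # Block report: only the blocks in this NameNode's own pool.
            nn.blocks[self.dn_id] = set(self.pools.get(nn.pool_id, set()))
            # Capacity report: sent to every NameNode.
            nn.free_space[self.dn_id] = self.free_bytes


nn_a = NameNode("/user", "BP-1")
nn_b = NameNode("/logs", "BP-2")

dn = DataNode("dn1", free_bytes=500)
dn.store("BP-1", 101)
dn.store("BP-1", 102)
dn.store("BP-2", 201)
dn.report([nn_a, nn_b])

print(nn_a.blocks, nn_b.blocks)
print(nn_a.free_space == nn_b.free_space)
```

Keying the DataNode's storage by pool id is what keeps the namespaces isolated even though they share the same physical disks.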

Limitations of HDFS Federation

HDFS Federation does not fully solve the single point of failure problem. Although there are multiple NameNodes and namespaces, each individual NameNode/namespace is still a single point of failure: if a NameNode fails, the files it manages become inaccessible. As in earlier HDFS deployments, each NameNode in a Federation is paired with a SecondaryNameNode, so that the metadata can be restored after the primary NameNode goes down.

Therefore, very large clusters adopt a combined HA + Federation deployment, in which each federated NameNode is itself deployed in HA mode.
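
As a rough picture of the combined layout, each namespace can be thought of as its own Active/Standby pair, so a failover in one namespace leaves the others untouched. The nameservice and host names below are invented for the sketch, and the dictionary is only an illustration, not a Hadoop configuration format.

```python
from typing import Dict, List, Tuple

# nameservice -> [(NameNode host, state)]; all names are hypothetical
cluster: Dict[str, List[Tuple[str, str]]] = {
    "ns1": [("nn1a", "active"), ("nn1b", "standby")],
    "ns2": [("nn2a", "standby"), ("nn2b", "active")],
}


def active_namenode(nameservice: str) -> str:
    """Return the single Active NameNode of one federated namespace."""
    actives = [host for host, state in cluster[nameservice]
               if state == "active"]
    if len(actives) != 1:
        raise RuntimeError(
            f"{nameservice} must have exactly one active NameNode")
    return actives[0]


def fail_over(nameservice: str) -> None:
    """Swap the Active and Standby roles inside one namespace only."""
    cluster[nameservice] = [
        (host, "standby" if state == "active" else "active")
        for host, state in cluster[nameservice]
    ]


before = active_namenode("ns1")
fail_over("ns1")
after = active_namenode("ns1")
print(before, after, active_namenode("ns2"))
```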


Origin blog.csdn.net/qq_24852439/article/details/104185496