Hadoop NameNode hot-standby switchover process and the role of the SecondaryNameNode

An HA Hadoop cluster generally has two NameNodes: one in the Active state and one in the Standby state. The Active NameNode handles all client operations in the cluster. Allowing only one Active NameNode follows from the underlying mechanism of HDFS: at any moment a file may be held by only one writer. If there were more than one, file offsets would become confused and the data would end up in an unusable format. The Standby NameNode acts only as a slave for the time being, so that whenever the Active NameNode goes down it can take over as quickly as possible and become the primary NameNode. This is what makes it a hot backup. In the HA architecture, the cold-standby role of the SecondaryNameNode no longer exists.

To keep the Standby NameNode's metadata consistent with the Active NameNode's, the two communicate through a group of lightweight daemon processes called JournalNodes. Whenever the Active NameNode performs a namespace modification, it also writes the edit log record to a majority (more than half) of the JournalNodes. The Standby NameNode watches the JournalNodes for changes to the edit log, reads the new edits, and applies them to its own in-memory directory tree.

When the Active NameNode fails, the Standby NameNode reads all remaining edits from the JournalNodes before it promotes itself to Active. This guarantees that its directory tree matches that of the failed NameNode, so it can take over serving client requests seamlessly, which is how high availability is achieved.
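The edit-log flow through the JournalNodes can be pictured with a small in-memory model. This is only a sketch under simplifying assumptions: `JournalNode`, `log_edit`, and `tail_edits` are hypothetical names invented for illustration, not Hadoop classes, and the real quorum journal protocol handles recovery and epochs that are ignored here.

```python
class JournalNode:
    """Toy stand-in for a JournalNode (hypothetical, not the Hadoop class)."""

    def __init__(self, name, up=True):
        self.name = name
        self.up = up
        self.edits = {}  # txid -> operation string

    def write(self, txid, op):
        """Store the edit and acknowledge it, unless this node is down."""
        if not self.up:
            return False
        self.edits[txid] = op
        return True


def log_edit(journal_nodes, txid, op):
    """Active side: an edit is committed only if a majority acknowledges it."""
    acks = sum(1 for jn in journal_nodes if jn.write(txid, op))
    return acks >= len(journal_nodes) // 2 + 1


def tail_edits(journal_nodes, last_applied_txid):
    """Standby side: return edits newer than last_applied_txid that are
    present on a majority of all JournalNodes, ordered by txid."""
    majority = len(journal_nodes) // 2 + 1
    counts, ops = {}, {}
    for jn in journal_nodes:
        if not jn.up:
            continue  # an unreachable node cannot be read
        for txid, op in jn.edits.items():
            if txid > last_applied_txid:
                counts[txid] = counts.get(txid, 0) + 1
                ops[txid] = op
    return [(t, ops[t]) for t in sorted(counts) if counts[t] >= majority]
```

With three JournalNodes, one of them may be down and edits still commit; with two down, `log_edit` returns `False`, which mirrors why an odd number of JournalNodes is typically deployed.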

For failover to be fast, the Standby must also have an up-to-date view of the whole cluster, so it too receives the block information reported by the DataNodes. Everything so far only describes how NameNode fault tolerance works. How does NameNode HA achieve unattended, automatic failover? That part requires ZooKeeper.

What ZooKeeper does during active-standby switchover:
(1) Failure detection. Once it starts, each NameNode maintains a persistent session in ZooKeeper. When a NameNode goes down, its session expires; ZooKeeper detects this and notifies the Standby NameNode: hey, dude, it's time for you to get to work.
(2) Election mechanism. ZooKeeper provides a simple exclusive lock for choosing the master. A NameNode that finds it has obtained the lock knows that it should transition to the Active state.
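The two points above, session expiry and an exclusive lock, can be sketched together with a toy model. Everything here is an assumption made for illustration: `ToyZooKeeper` and this `NameNode` class are invented in-memory stand-ins, not a real ZooKeeper client or Hadoop API, and in a real cluster this coordination is carried out by the ZKFailoverController process that runs alongside each NameNode.

```python
class ToyZooKeeper:
    """In-memory model of a single exclusive lock tied to a session."""

    def __init__(self):
        self.lock_owner = None  # session name currently holding the lock
        self.watchers = []      # one-shot callbacks fired when the lock frees

    def try_acquire(self, session):
        """Grant the lock only if nobody holds it."""
        if self.lock_owner is None:
            self.lock_owner = session
            return True
        return False

    def watch(self, callback):
        self.watchers.append(callback)

    def expire_session(self, session):
        """Session ended: release its lock and notify the waiting watchers."""
        if self.lock_owner == session:
            self.lock_owner = None
            pending, self.watchers = self.watchers, []
            for callback in pending:
                callback()


class NameNode:
    def __init__(self, name, zk):
        self.name = name
        self.zk = zk
        self.state = "standby"

    def join_election(self):
        """Become Active if the lock is obtained, otherwise wait as Standby."""
        if self.zk.try_acquire(self.name):
            self.state = "active"
        else:
            self.state = "standby"
            self.zk.watch(self.join_election)

    def crash(self):
        """Simulate the process dying, which ends its ZooKeeper session."""
        self.state = "down"
        self.zk.expire_session(self.name)
```

If `nn1` and `nn2` both call `join_election()`, the first one becomes Active and the second waits. Calling `nn1.crash()` releases the lock, the watch fires, and `nn2` promotes itself.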

So what is the SecondaryNameNode for?

1. Beginners often take the name at face value and assume that the SecondaryNameNode is a backup of the NameNode, or that the two are the same thing. In essence, it holds a snapshot (checkpoint) of the NameNode's metadata: it periodically fetches the metadata from the NameNode, and how often it does so is determined by values set in the configuration.

2. If the NameNode is damaged or lost and Hadoop cannot start, manual intervention is required to restore the state from the snapshot held by the SecondaryNameNode. This means the cluster will lose some amount of data (whatever changed after the last snapshot) and will suffer some downtime. For this reason, treat the SecondaryNameNode as seriously as the NameNode, and try not to put the SecondaryNameNode and the NameNode on the same machine.
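As a rough illustration of the "values set in the configuration" mentioned in point 1: to the best of my knowledge the relevant properties are `dfs.namenode.checkpoint.period` (default 3600 seconds) and `dfs.namenode.checkpoint.txns` (default 1,000,000 transactions), and a checkpoint is taken when either limit is reached. The function below is a hypothetical sketch of that decision, not Hadoop's own code.

```python
def should_checkpoint(seconds_since_last, uncheckpointed_txns,
                      period_secs=3600, txns_threshold=1_000_000):
    """Return True when either the time limit or the transaction limit
    has been reached since the last checkpoint.

    The defaults mirror dfs.namenode.checkpoint.period and
    dfs.namenode.checkpoint.txns as assumed in the text above.
    """
    return (seconds_since_last >= period_secs
            or uncheckpointed_txns >= txns_threshold)
```

The same numbers give a feel for point 2: with the assumed defaults, a restore from the SecondaryNameNode's snapshot could be missing up to about an hour of namespace changes.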


