SecondaryNameNode and checkpoints

 

SecondaryNameNode

The NameNode manages metadata, and the DataNodes store the actual data. So what does the SecondaryNameNode do? Many beginners find this confusing and wonder why it exists in HDFS at all. Its name suggests a backup of the NameNode, but that is not what it is.

Consider what happens after an HDFS cluster has been running for some time. The following problems appear:

• The edit log files grow very large, and managing them becomes a challenge.

• Restarting the NameNode takes a long time, because many changes have to be merged into the fsimage file.

• If the NameNode goes down, some changes could be lost, because the fsimage file is very old by that point.

To overcome these problems, we need an easily managed mechanism that keeps the edit logs small and produces an up-to-date fsimage file, which also reduces the load on the NameNode. This is much like a Windows restore point: Windows lets us take a snapshot of the OS, and when the system runs into trouble we can roll back to the most recent restore point.

The SecondaryNameNode exists to solve this problem: its job is to merge the NameNode's edit logs into the fsimage file.

 

 

Checkpoint

Whenever a trigger condition is met, the SecondaryNameNode downloads all the edits accumulated on the NameNode, together with the latest fsimage, to its local disk and loads them into memory to merge them. This process is called a checkpoint, as shown below:

Detailed checkpoint steps

• The NameNode manages metadata and persists it in two kinds of files: the edits operation log and the fsimage metadata image. A new operation is not merged into fsimage immediately, and it is not applied to the NameNode's memory first either. It is written to edits first (merging would consume a lot of resources), and memory is updated only after that write succeeds.

• Two settings control the trigger: dfs.namenode.checkpoint.period and dfs.namenode.checkpoint.txns. As soon as either condition is met, the SecondaryNameNode performs a checkpoint.

• When a checkpoint is triggered, the NameNode starts a new edits file (edits.new in the figure above), and the SecondaryNameNode copies the edits and fsimage files to its local disk using HTTP GET.

• The SecondaryNameNode loads the downloaded fsimage into memory and replays the updates in the edits file one by one, bringing the in-memory fsimage up to date. This is the merge of edits and fsimage, and it produces a new fsimage file (fsimage.ckpt in the figure above).

• The SecondaryNameNode copies the newly generated fsimage.ckpt file to the NameNode.

• On the NameNode, edits.new and fsimage.ckpt replace the original edits and fsimage files. This completes one cycle, and the NameNode once again holds a single edits file and a single fsimage file.

• The SecondaryNameNode then waits for the next checkpoint trigger, and the cycle repeats.
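
The cycle described above can be sketched in a few lines of Python. This is only a toy model under simplifying assumptions, not Hadoop code: the namespace is a plain set of paths, the edit log is a list of (operation, path) tuples, and the names NameNode, log, roll_edits, install and merge are invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class NameNode:
    """Toy NameNode: a persisted image plus an append-only edit log."""
    fsimage: set = field(default_factory=set)   # paths in the last persisted image
    edits: list = field(default_factory=list)   # operations logged since that image
    edits_new: list = None                      # only exists while a checkpoint runs

    def log(self, op, path):
        # While a checkpoint is in progress, new operations go to edits.new.
        target = self.edits if self.edits_new is None else self.edits_new
        target.append((op, path))

    def roll_edits(self):
        # Start edits.new and hand out copies of fsimage and edits.
        self.edits_new = []
        return set(self.fsimage), list(self.edits)

    def install(self, fsimage_ckpt):
        # fsimage.ckpt replaces fsimage, edits.new replaces edits.
        self.fsimage = fsimage_ckpt
        self.edits = self.edits_new
        self.edits_new = None


def merge(fsimage, edits):
    """Replay each logged operation on a copy of the loaded image."""
    image = set(fsimage)
    for op, path in edits:
        if op == "create":
            image.add(path)
        elif op == "delete":
            image.discard(path)
    return image


nn = NameNode()
nn.log("create", "/a")
nn.log("create", "/b")
nn.log("delete", "/a")

image_copy, edits_copy = nn.roll_edits()      # SecondaryNameNode downloads both files
nn.log("create", "/c")                        # arrives mid-checkpoint, lands in edits.new
fsimage_ckpt = merge(image_copy, edits_copy)  # merge done on the SecondaryNameNode
nn.install(fsimage_ckpt)                      # result is copied back and swapped in

print(sorted(nn.fsimage))  # ['/b']
print(nn.edits)            # [('create', '/c')]
```

Note that the operation logged during the checkpoint ends up in the new edits file and is not part of the merged image, which is why the checkpointed fsimage always lags slightly behind the live NameNode.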

Checkpoint trigger conditions

The checkpoint operation is controlled by two parameters, which can be set in hdfs-site.xml:

<property>
  <name>dfs.namenode.checkpoint.period</name>
  <value>3600</value>
  <description>
    The interval, in seconds, between two consecutive checkpoints. The default is 1 hour.
  </description>
</property>

<property>
  <name>dfs.namenode.checkpoint.txns</name>
  <value>1000000</value>
  <description>
    The maximum number of transactions that may accumulate without a checkpoint. Once this number is reached, a checkpoint is forced even if the checkpoint period has not yet elapsed. The default is 1 million.
  </description>
</property>
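
The way the two parameters combine can be illustrated with a small Python sketch. It is a simplified model of the "either condition" rule described in this article, not the actual Hadoop implementation; the function name should_checkpoint and its arguments are hypothetical.

```python
CHECKPOINT_PERIOD_S = 3600     # value of dfs.namenode.checkpoint.period above
CHECKPOINT_TXNS = 1_000_000    # value of dfs.namenode.checkpoint.txns above


def should_checkpoint(seconds_since_last, uncheckpointed_txns,
                      period=CHECKPOINT_PERIOD_S, txns=CHECKPOINT_TXNS):
    """Return True as soon as either threshold is reached."""
    return seconds_since_last >= period or uncheckpointed_txns >= txns


print(should_checkpoint(600, 1_200_000))  # True: transaction limit reached early
print(should_checkpoint(3600, 10))        # True: the period has elapsed
print(should_checkpoint(600, 10))         # False: neither condition is met
```

On a busy cluster the transaction limit usually fires first, while on a quiet one the time interval does.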

As the description above shows, the SecondaryNameNode is not a hot standby for the NameNode at all; it only merges fsimage and edits. Its fsimage is not fully up to date, because by the time it downloads fsimage and edits from the NameNode, new updates are already being written to edits.new, and those updates are not synchronized to the SecondaryNameNode. Still, if the fsimage on the NameNode is genuinely damaged, the fsimage on the SecondaryNameNode can be used to replace it. It is not the latest fsimage, but it keeps the loss to a minimum.

 




Origin www.cnblogs.com/TiePiHeTao/p/8890dcdcc9d89972e4f4f93a989b9cef.html