The role of the Secondary NameNode in a Hadoop cluster, and the relationship between fsimage and edits

First of all, the Secondary NameNode is not a backup of the NameNode. It is a helper process that takes the checkpointing work off the NameNode to reduce its load.

fsimage: short for filesystem image. A binary file that stores the metadata of HDFS files and directories.

edits: a binary file. All HDFS operations between one save of fsimage and the next are recorded in the edits file. Every operation that changes the metadata, such as creating, closing, or renaming files and directories, generates an edit record.

fstime: a binary file. After a checkpoint of fsimage completes, the latest timestamp is written to fstime.


Secondary NameNode (not the same as the standby node in an HA cluster, where the Standby NameNode performs the checkpoints itself):

  • Its role is to periodically merge the fsimage and the edits log so that the edits log file stays under a size limit
  • On a request from the Secondary NameNode, the NameNode rolls its edit log: it stops writing to the current edits file and starts a new one
  • The Secondary NameNode receives the fsimage file and the edit log from the NameNode (over HTTP)
  • The Secondary NameNode loads the fsimage into memory, applies the edit log, and generates a new fsimage file
  • The Secondary NameNode sends the new fsimage back to the NameNode (over HTTP)
  • The NameNode replaces the old fsimage with the new one and records in the fstime file when the checkpoint occurred
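
The merge at the heart of these steps can be sketched in a few lines of Python. This is only an illustration, not Hadoop's code: the dictionary standing in for the fsimage, the tuple format of the edit records, and the function name apply_edits are all assumptions made for the example (the real files are binary).

```python
from copy import deepcopy


def apply_edits(fsimage, edits):
    """Return a new snapshot: the old fsimage with every edit replayed in order."""
    namespace = deepcopy(fsimage)  # leave the old image untouched
    for op, *args in edits:
        if op == "create":
            (path,) = args
            namespace[path] = {"size": 0}
        elif op == "rename":
            src, dst = args
            namespace[dst] = namespace.pop(src)
        elif op == "delete":
            (path,) = args
            namespace.pop(path, None)
        else:
            raise ValueError(f"unknown edit op: {op}")
    return namespace


# Hypothetical metadata and edit records, for illustration only.
old_fsimage = {"/data/a.txt": {"size": 10}}
edits = [
    ("create", "/data/b.txt"),
    ("rename", "/data/a.txt", "/data/c.txt"),
    ("delete", "/data/b.txt"),
]
new_fsimage = apply_edits(old_fsimage, edits)
print(new_fsimage)
```

Once the new image has been produced, the edit records that went into it are no longer needed, which is what keeps the edits log small.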

How SecondaryNameNode works

For each file operation, the NameNode does not rewrite fsimage, because that would be very slow. Instead, every operation is recorded in the edits log once it is committed. When the edits file grows beyond 64 MB (configurable), or more than 1 hour has passed since the last checkpoint (also configurable), the Secondary NameNode starts a checkpoint and sends a request to the NameNode.

At this point the NameNode creates a temporary empty file, edits.new, and records subsequent operations there. The Secondary NameNode reads edits and fsimage from the NameNode and merges them into a checkpoint file, fsimage.ckpt, which it sends back to the NameNode over HTTP. The NameNode then renames fsimage.ckpt to fsimage (overwriting the original fsimage file) and renames edits.new to edits (overwriting the original edits file).
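
As a rough sketch, the trigger condition described above might be checked like this. The function and constant names are hypothetical. The defaults are the 64 MB and 1 hour values from the text, which in Hadoop 1.x correspond to the fs.checkpoint.size and fs.checkpoint.period properties.

```python
CHECKPOINT_SIZE_BYTES = 64 * 1024 * 1024  # 64 MB, the size limit from the text
CHECKPOINT_PERIOD_SECONDS = 60 * 60       # 1 hour, the time limit from the text


def should_checkpoint(edits_size_bytes, seconds_since_last_checkpoint,
                      max_size=CHECKPOINT_SIZE_BYTES,
                      max_period=CHECKPOINT_PERIOD_SECONDS):
    """Return True when either the size limit or the time limit is exceeded."""
    return (edits_size_bytes > max_size
            or seconds_since_last_checkpoint > max_period)


print(should_checkpoint(10 * 1024 * 1024, 600))   # small log, recent checkpoint
print(should_checkpoint(65 * 1024 * 1024, 600))   # edits log too large
print(should_checkpoint(10 * 1024 * 1024, 3601))  # last checkpoint too old
```

Either condition alone is enough, so a busy cluster checkpoints on size while an idle one still checkpoints once per period.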

Note that edits.new is a temporary file that exists only while the NameNode or the Secondary NameNode is performing a checkpoint.
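
The file handling on the NameNode side can be imitated with ordinary files in a temporary directory. This is a simplified sketch under stated assumptions: the function names roll_edit_log and install_checkpoint and the byte strings used as file contents are invented for the example, and the merged image is passed in directly rather than transferred over HTTP.

```python
import os
import tempfile


def roll_edit_log(name_dir):
    """NameNode side: create the empty edits.new that receives new operations."""
    open(os.path.join(name_dir, "edits.new"), "wb").close()


def install_checkpoint(name_dir, merged_image):
    """NameNode side: store the merged image, then swap both files into place."""
    ckpt_path = os.path.join(name_dir, "fsimage.ckpt")
    with open(ckpt_path, "wb") as f:
        f.write(merged_image)
    # fsimage.ckpt overwrites fsimage, edits.new overwrites edits.
    os.replace(ckpt_path, os.path.join(name_dir, "fsimage"))
    os.replace(os.path.join(name_dir, "edits.new"),
               os.path.join(name_dir, "edits"))


with tempfile.TemporaryDirectory() as name_dir:
    with open(os.path.join(name_dir, "fsimage"), "wb") as f:
        f.write(b"old-image")
    with open(os.path.join(name_dir, "edits"), "wb") as f:
        f.write(b"op1;op2;")

    roll_edit_log(name_dir)
    during = sorted(os.listdir(name_dir))  # edits.new exists mid-checkpoint
    install_checkpoint(name_dir, b"old-image+op1+op2")
    after = sorted(os.listdir(name_dir))   # edits.new is gone again
    edits_size = os.path.getsize(os.path.join(name_dir, "edits"))

print(during)
print(after)
print(edits_size)
```

In this sketch edits.new appears only between the two calls, which matches the note above that the file exists only during a checkpoint.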

How the NameNode reads fsimage at startup

When the NameNode is restarted, it loads the latest fsimage and edits files into memory according to the checkpoint time, creates a temporary empty file, edits.new, and merges the two to produce the fsimage.ckpt checkpoint. It then renames edits.new to edits (overwriting the original edits file), renames fsimage.ckpt to fsimage (overwriting the original fsimage file), and finally updates the time in fstime and the version in VERSION.

Reasons to use the Secondary NameNode:

fsimage is the file in which HDFS stores its metadata. It is not updated after every file operation (such as opening, querying, creating, or modifying files) in HDFS. Instead, each operation that changes the metadata adds a record to edits, so the edits log keeps growing.

This design does not affect the resilience of the system. If the NameNode fails, the latest state of the metadata can be restored by reading the fsimage file from disk into memory and then re-executing the operations recorded in edits, which is exactly what the NameNode does when it restarts. But if there are many edits records, the NameNode takes a long time to replay them at startup, and the HDFS file system is unavailable during that time.
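
A back-of-the-envelope sketch shows why regular checkpoints shorten the restart. It assumes, purely for simplicity, that a checkpoint is taken after a fixed number of operations rather than by size or time. The function name and the numbers are hypothetical.

```python
def edits_to_replay(total_ops, checkpoint_every=None):
    """Number of edit records a restarting NameNode has to replay."""
    if checkpoint_every is None:
        # No checkpoints: every operation since the first fsimage is replayed.
        return total_ops
    # With checkpoints: only operations after the most recent one remain.
    return total_ops % checkpoint_every


print(edits_to_replay(1_000_000))          # no checkpointing
print(edits_to_replay(1_000_000, 30_000))  # checkpoint every 30,000 operations
```

Without checkpoints the replay work grows with the whole history of the cluster. With them it is bounded by the checkpoint interval.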

To solve this problem, Hadoop runs a Secondary NameNode process on a node other than the NameNode. The Secondary NameNode periodically copies the fsimage and edits records from the NameNode to a temporary directory, merges them into a new fsimage, and uploads the new fsimage to the NameNode. The NameNode then switches to the new fsimage and deletes the original edit log. This process is called checkpointing.

Disadvantages of the Secondary NameNode:

  • The Secondary NameNode does not checkpoint in real time. If the NameNode suffers a hardware failure before the next checkpoint has run, and its metadata is not also stored on NFS, then all edits recorded on the NameNode between the last checkpoint and the failure are lost. At that moment the Secondary NameNode holds only the last fsimage file and not the latest edits, so the operations from that period cannot be recovered through the Secondary NameNode.

  • The Secondary NameNode is not a backup process for the NameNode. If the NameNode goes down while the Secondary NameNode stays up, the cluster still cannot work properly. To bring the cluster back, you need to manually copy the fsimage file from the Secondary NameNode to the new NameNode.
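
The loss window can be made concrete with a small timeline calculation. This is an illustrative sketch only: the function name lost_operations and the minute values are made up, and it assumes recovery uses nothing but the fsimage from the last completed checkpoint.

```python
def lost_operations(op_times, checkpoint_times, failure_time):
    """Return the operations that are missing from the last checkpointed fsimage."""
    completed = [t for t in checkpoint_times if t <= failure_time]
    last_checkpoint = max(completed, default=0)  # 0 means no checkpoint ran yet
    return [t for t in op_times if last_checkpoint < t <= failure_time]


# Hypothetical timeline, in minutes since cluster start.
op_times = [5, 30, 65, 90, 110]
checkpoint_times = [60, 120]
failure_time = 115

print(lost_operations(op_times, checkpoint_times, failure_time))
```

Here the checkpoint at minute 60 is the last one that completed, so the three operations recorded after it exist only in the edits on the failed NameNode.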

