NameNode and Secondary NameNode working mechanism

Question: where does the NameNode store its metadata?

Suppose first that the metadata were stored only on the NameNode's disk. The NameNode has to serve frequent random-access lookups in response to client requests, so disk-only storage would be far too slow. The metadata therefore has to live in memory. But if it lived only in memory, a power failure would wipe it out and the whole cluster would stop working. So the NameNode also keeps an FsImage, a backup of the metadata on disk.

This raises a new problem. If the FsImage were rewritten every time the in-memory metadata changed, performance would be too poor. If it were not updated, the two copies would become inconsistent, and a power failure on the NameNode would lose data. The Edits file is introduced to solve this: it is append-only, so writing to it is very efficient. Whenever metadata is added or updated, the change is applied to the metadata in memory and appended to Edits. If the NameNode then loses power, the metadata can be rebuilt by merging FsImage and Edits.
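The idea of a snapshot plus an append-only log can be sketched in a few lines of Python. This is only a toy model, not how HDFS is implemented: the class name TinyNameNode, the JSON record format, and the "create"/"delete" operations are all assumptions made for illustration.

```python
import json


class TinyNameNode:
    """Toy model of the FsImage + Edits scheme (all names here are hypothetical).

    `fsimage` (a JSON string) and `edits` (a list of JSON lines) stand in for
    the two files on disk; `namespace` stands in for the metadata in memory.
    """

    def __init__(self, fsimage="{}", edits=None):
        self.fsimage = fsimage
        self.edits = [] if edits is None else edits
        # Startup: load the snapshot, then replay every logged operation.
        self.namespace = json.loads(self.fsimage)
        for line in self.edits:
            self._apply(json.loads(line))

    def _apply(self, op):
        if op["op"] == "create":
            self.namespace[op["path"]] = op["blocks"]
        elif op["op"] == "delete":
            self.namespace.pop(op["path"], None)
        else:
            raise ValueError(f"unknown op: {op['op']}")

    def mutate(self, op):
        # Appending one line is cheap compared with rewriting the whole image.
        self.edits.append(json.dumps(op))
        self._apply(op)


nn = TinyNameNode()
nn.mutate({"op": "create", "path": "/a.txt", "blocks": ["blk_1"]})
nn.mutate({"op": "create", "path": "/b.txt", "blocks": ["blk_2"]})
nn.mutate({"op": "delete", "path": "/a.txt"})

# Simulated power loss: memory is gone, only fsimage and edits survive.
recovered = TinyNameNode(nn.fsimage, list(nn.edits))
print(recovered.namespace)
```

The recovered instance ends up with the same namespace as the one that "crashed", even though the snapshot itself was never rewritten.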

However, if operations keep being appended to Edits for a long time, the file grows very large. That hurts efficiency and makes recovery after a power failure take too long. FsImage and Edits therefore have to be merged periodically. Having the NameNode do the merge itself would slow it down, so a separate node, the SecondaryNameNode, is introduced specifically to merge FsImage and Edits.

1. The first stage: NameNode startup

(1) When the NameNode is started for the first time after being formatted, it creates the Fsimage and Edits files. On any later start, it loads the edit log and the image file directly into memory.

(2) The client requests to add, delete, or modify metadata.

(3) The NameNode records the operation in the edit log, updating the rolling log.

(4) The NameNode applies the addition, deletion, or modification to the metadata in memory.

2. The second stage: Secondary NameNode work

(1) The Secondary NameNode asks the NameNode whether a checkpoint is needed. The NameNode returns its answer directly.

(2) The Secondary NameNode requests that the checkpoint be executed.

(3) The NameNode rolls the Edits log that is currently being written.

(4) The edit log from before the roll and the image file are copied to the Secondary NameNode.

(5) The Secondary NameNode loads the edit log and the image file into memory and merges them.

(6) A new image file, fsimage.chkpoint, is generated.

(7) fsimage.chkpoint is copied to the NameNode.

(8) The NameNode renames fsimage.chkpoint to fsimage.
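Steps (3) to (8) can be walked through with a small Python sketch. It is a simplified model under stated assumptions: NameNodeDisk, the segment names "edits.1", "edits.2", and the JSON operation format are hypothetical, the "copy" between nodes is just an in-process copy, and deleting the merged segments at the end is a simplification of this sketch rather than documented HDFS behavior.

```python
import json


def apply_op(namespace, op):
    """Apply one logged operation to a namespace dict (hypothetical op format)."""
    if op["op"] == "create":
        namespace[op["path"]] = op["blocks"]
    elif op["op"] == "delete":
        namespace.pop(op["path"], None)


class NameNodeDisk:
    """Stands in for the files in the NameNode's storage directory."""

    def __init__(self):
        self.files = {"fsimage": "{}", "edits.inprogress": []}
        self.finalized = []  # names of rolled edit segments, oldest first

    def log(self, op):
        self.files["edits.inprogress"].append(json.dumps(op))

    def roll_edits(self):
        # Step 3: close the segment being written and open an empty one.
        name = f"edits.{len(self.finalized) + 1}"
        self.files[name] = self.files["edits.inprogress"]
        self.files["edits.inprogress"] = []
        self.finalized.append(name)


def checkpoint(nn):
    nn.roll_edits()                                        # step 3
    image = nn.files["fsimage"]                            # step 4: copy the image
    segments = [list(nn.files[n]) for n in nn.finalized]   # step 4: copy rolled edits
    namespace = json.loads(image)                          # step 5: load into memory
    for segment in segments:
        for line in segment:
            apply_op(namespace, json.loads(line))          # step 5: merge by replay
    merged = json.dumps(namespace, sort_keys=True)         # step 6: new image
    nn.files["fsimage.chkpoint"] = merged                  # step 7: copy back
    nn.files["fsimage"] = nn.files.pop("fsimage.chkpoint") # step 8: rename
    for name in nn.finalized:                              # simplification: drop merged segments
        del nn.files[name]
    nn.finalized.clear()


nn = NameNodeDisk()
nn.log({"op": "create", "path": "/a.txt", "blocks": ["blk_1"]})
nn.log({"op": "create", "path": "/b.txt", "blocks": ["blk_2"]})
checkpoint(nn)
nn.log({"op": "delete", "path": "/a.txt"})  # lands in the new edits.inprogress
```

After the checkpoint, fsimage already contains both files, and only the single delete written afterwards remains to be replayed on the next start.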

Fsimage: the file produced by serializing the metadata held in NameNode memory.

Edits: a record of every operation by which clients update the metadata (the metadata can be reconstructed by replaying Edits).

When the NameNode starts, it first rolls Edits and generates an empty edits.inprogress, then loads Edits and Fsimage into memory. At that point the NameNode's memory holds the latest metadata. Clients then begin sending requests to add, delete, or modify metadata. Each such operation is first recorded in edits.inprogress (queries are not recorded in Edits, because they do not change the metadata). If the NameNode goes down at this point, it reads the metadata back from Edits after restarting. Only after the operation has been logged does the NameNode apply the addition, deletion, or modification to the metadata in memory.
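The log-first ordering and the fact that queries are never logged can be illustrated with a short Python sketch. The function handle_request, the request dictionaries, and the "get"/"create"/"delete" operation names are hypothetical and exist only for this example.

```python
def handle_request(namespace, edits_inprogress, request):
    """Serve one client request against an in-memory namespace (toy model)."""
    op, path = request["op"], request["path"]
    if op == "get":
        # Queries do not change metadata, so nothing is written to the log.
        return namespace.get(path)
    if op not in ("create", "delete"):
        raise ValueError(f"unsupported op: {op}")
    # Record the operation first ...
    edits_inprogress.append(dict(request))
    # ... and only then change the metadata in memory.
    if op == "create":
        namespace[path] = request["blocks"]
    else:
        namespace.pop(path, None)
    return None


namespace, edits_inprogress = {}, []
handle_request(namespace, edits_inprogress,
               {"op": "create", "path": "/a.txt", "blocks": ["blk_1"]})
found = handle_request(namespace, edits_inprogress,
                       {"op": "get", "path": "/a.txt"})
```

Because the log entry is written before memory is touched, a crash between the two steps loses nothing: the operation is replayed from the log on restart.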

As more operations are recorded, the Edits file grows larger and larger, which makes the NameNode very slow to load Edits at startup. Edits and Fsimage therefore need to be merged. Merging means loading Edits and Fsimage into memory, replaying the operations in Edits one by one, and writing out a new Fsimage. The SecondaryNameNode's job is to help the NameNode perform this merge.

The SecondaryNameNode first asks the NameNode whether a checkpoint is needed. A checkpoint is triggered when either of two conditions is met: the scheduled interval has elapsed, or the data in Edits has reached its limit. The NameNode returns its answer directly. If a checkpoint is needed, the SecondaryNameNode performs it. First, the NameNode rolls Edits and generates an empty edits.inprogress. Rolling marks a boundary in Edits: all new operations are written to edits.inprogress, while the other, not-yet-merged Edits and the Fsimage are copied to the SecondaryNameNode's local storage. There they are loaded into memory and merged to produce fsimage.chkpoint, which is copied back to the NameNode and renamed to Fsimage, replacing the original. When the NameNode next starts, it only needs to load the Fsimage and the Edits that have not yet been merged, because the metadata from the merged Edits is already recorded in the Fsimage.
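The "either of two conditions" rule can be written as a small predicate. The sketch below is illustrative only: CheckpointPolicy is a hypothetical class, and the default values are taken from the Hadoop properties dfs.namenode.checkpoint.period (seconds) and dfs.namenode.checkpoint.txns (transaction count) as I understand them from hdfs-default.xml, so they should be checked against the Hadoop version in use.

```python
from dataclasses import dataclass


@dataclass
class CheckpointPolicy:
    # Assumed defaults; compare with hdfs-default.xml for your Hadoop version.
    period_seconds: int = 3600       # cf. dfs.namenode.checkpoint.period
    txn_threshold: int = 1_000_000   # cf. dfs.namenode.checkpoint.txns

    def should_checkpoint(self, seconds_since_last, uncheckpointed_txns):
        """Return True if either trigger condition is met."""
        timer_expired = seconds_since_last >= self.period_seconds
        edits_full = uncheckpointed_txns >= self.txn_threshold
        return timer_expired or edits_full


policy = CheckpointPolicy()
by_time = policy.should_checkpoint(seconds_since_last=3600, uncheckpointed_txns=10)
by_txns = policy.should_checkpoint(seconds_since_last=60, uncheckpointed_txns=1_000_000)
neither = policy.should_checkpoint(seconds_since_last=60, uncheckpointed_txns=10)
```

In this model "the data in Edits has reached its limit" is read as a count of operations not yet merged into the Fsimage, which is an interpretation of the text rather than something the article states.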

Origin blog.csdn.net/qq_53368181/article/details/121809524