HDFS note

-- copied from the Internet: https://www.cnblogs.com/wxplmm/p/7239342.html

NameNode : the master node, somewhat like the root directory in Linux. It manages the block mapping, handles read and write requests from clients, configures the replication policy, and manages the HDFS namespace.

SecondaryNameNode : keeps part of the NameNode's information (not all of it), which is used to recover data after the NameNode goes down. It is a cold backup of the NameNode. It merges fsimage and edits and then sends the result back to the NameNode. (This is one solution for keeping edits from growing too large.)

DataNodes : responsible for storing the data blocks sent by clients and for performing read and write operations on those blocks. They are the NameNode's subordinates.

Hot backup: b is a hot backup of a. If a fails, b immediately takes over a's work.

Cold backup: b is a cold backup of a. If a fails, b cannot immediately take over a's work. However, b stores some of a's information, which reduces the loss after a fails.

FsImage : the metadata image file (the directory tree of the file system).

edits : the metadata operation log (a record of the changes made to the file system).

What the NameNode holds in memory = fsimage + edits.

 

NameNode in Detail

Role:

The NameNode plays a commanding role: users access and operate on all other data through the NameNode, much like a root directory.

The NameNode holds two mappings: the mapping between directories and data blocks (implemented by fsimage and edits), and the mapping between data blocks and nodes.

The fsimage and edits files are the core files on the NameNode.

The NameNode itself stores only the directory tree. The location information for each block is reported to the NameNode by the DataNodes.
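As a rough illustration of that last point, the sketch below rebuilds a block-to-DataNode mapping purely from per-DataNode reports. The function name, the report format, and the node and block IDs are hypothetical, chosen only for this example; they are not the actual HDFS data structures.

```python
from collections import defaultdict

def build_block_map(block_reports):
    """Rebuild the block -> DataNodes mapping from per-DataNode reports.

    block_reports: dict mapping a DataNode name to the list of block IDs
    it holds (a simplified, hypothetical stand-in for a block report).
    """
    block_map = defaultdict(set)
    for datanode, block_ids in block_reports.items():
        for block_id in block_ids:
            block_map[block_id].add(datanode)
    return block_map

# Hypothetical reports from three DataNodes.
reports = {
    "dn1": ["blk_1", "blk_2"],
    "dn2": ["blk_2", "blk_3"],
    "dn3": ["blk_1", "blk_3"],
}
block_map = build_block_map(reports)
print(sorted(block_map["blk_2"]))  # ['dn1', 'dn2']
```

Because the mapping is derived from the reports, nothing about block locations needs to be saved in fsimage in this model.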

The directory tree is physically stored in the fsimage file. When the NameNode starts, it first reads this file and loads the directory tree into memory.

The edits file stores the log: after the NameNode starts, every addition, deletion, modification, or other operation on the directory structure is recorded in the edits file and is not synchronously written to fsimage.

When the NameNode shuts down, it does not merge fsimage with edits either. The merge actually happens during NameNode startup.

In other words, when the NameNode starts, it first loads the fsimage file, then applies the edits file, and finally writes the up-to-date directory tree to a new fsimage file and starts a new edits file.
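The startup sequence can be sketched as a small replay loop. This is a toy model under stated assumptions: the namespace is just a set of paths, and the operation names (mkdir, create, delete, rename) and tuple format are made up for illustration, not the real edits file format.

```python
def apply_edit(tree, op):
    """Apply one logged operation to the in-memory namespace (a set of paths)."""
    kind = op[0]
    if kind in ("mkdir", "create"):
        tree.add(op[1])
    elif kind == "delete":
        tree.discard(op[1])
    elif kind == "rename":
        tree.discard(op[1])
        tree.add(op[2])
    else:
        raise ValueError(f"unknown operation: {kind}")

def start_namenode(fsimage, edits):
    """Load fsimage, replay edits, then produce a new fsimage and empty edits."""
    tree = set(fsimage)          # 1. load the directory tree from fsimage
    for op in edits:             # 2. apply every logged change in order
        apply_edit(tree, op)
    new_fsimage = frozenset(tree)  # 3. save the merged state as the new fsimage
    new_edits = []                 # 4. start a fresh, empty edits log
    return tree, new_fsimage, new_edits

fsimage = {"/", "/data"}
edits = [
    ("mkdir", "/logs"),
    ("create", "/data/a.txt"),
    ("rename", "/data/a.txt", "/data/b.txt"),
    ("delete", "/logs"),
]
tree, new_fsimage, new_edits = start_namenode(fsimage, edits)
print(sorted(tree))  # ['/', '/data', '/data/b.txt']
```

The cost of step 2 grows with the length of the edits list, which is why a long log means a slow start.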

The whole process works, but it has one small flaw: if many changes are made after the NameNode starts, the edits file becomes very large. How large depends on how frequently the NameNode is updated.

Then, at the next NameNode startup, after reading the fsimage file the NameNode has to apply this very large edits file. This makes startup long and unpredictable; it may take several hours.

The oversized edits file is the main problem that the SecondaryNameNode is meant to solve.

The SecondaryNameNode wakes up according to certain rules and merges the fsimage and edits files, which prevents the edits file from growing too large and making NameNode startup too slow.
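The note does not say what the rules are. As a sketch, assume the rule is "merge when enough time has passed or enough operations have accumulated". The thresholds below mirror what I understand to be the Hadoop defaults for dfs.namenode.checkpoint.period (3600 seconds) and dfs.namenode.checkpoint.txns (1,000,000 transactions); treat both the values and the function itself as assumptions for illustration, not the actual implementation.

```python
# Assumed thresholds, modeled on dfs.namenode.checkpoint.period and
# dfs.namenode.checkpoint.txns; check your cluster's configuration.
CHECKPOINT_PERIOD_S = 3600
CHECKPOINT_TXNS = 1_000_000

def should_checkpoint(seconds_since_last, uncheckpointed_txns,
                      period_s=CHECKPOINT_PERIOD_S, txns=CHECKPOINT_TXNS):
    """Return True when either the time limit or the edit-count limit is reached."""
    return seconds_since_last >= period_s or uncheckpointed_txns >= txns

print(should_checkpoint(600, 5_000))      # False: neither limit reached
print(should_checkpoint(3600, 5_000))     # True: an hour has passed
print(should_checkpoint(600, 2_000_000))  # True: too many pending edits
```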

 

DataNode in Detail

DataNodes are where the data in HDFS is actually stored.

First, an explanation of the concept of a block:

  1. A DataNode stores, reads, and writes data in units of blocks. The block is the basic unit of reading and writing data in HDFS.
  2. Suppose a file is 100 GB. Starting from byte offset 0, every 128 MB is cut into one block, and so on, so the file is divided into many blocks. Each block is 128 MB.
  3. A block is essentially a logical concept: it is a way of dividing a file rather than a separate kind of storage.
  4. Each block is stored as multiple replicas. The advantage is data safety; the drawback is the extra storage space.
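The arithmetic in the list above can be checked with a short sketch. The function name is hypothetical, and the replication factor of 3 is an assumption (it is the usual HDFS default, but the note does not state it).

```python
MB = 1024 ** 2
GB = 1024 ** 3

def split_into_blocks(file_size, block_size=128 * MB):
    """Return (offset, length) pairs covering the file, starting at byte 0."""
    blocks = []
    offset = 0
    while offset < file_size:
        # The last block may be shorter than block_size.
        length = min(block_size, file_size - offset)
        blocks.append((offset, length))
        offset += length
    return blocks

blocks = split_into_blocks(100 * GB)
print(len(blocks))  # 800
print(blocks[1])    # (134217728, 134217728)

# A file that is not a multiple of the block size ends with a short block.
print([length // MB for _, length in split_into_blocks(300 * MB)])  # [128, 128, 44]

# Assumed replication factor of 3: raw storage is three times the file size.
replication = 3
print(100 * GB * replication // GB)  # 300
```

So the 100 GB file in the example becomes 800 blocks, and with three replicas it occupies about 300 GB of raw disk space across the cluster.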

 

SecondaryNameNode

Execution process: it downloads the metadata (fsimage, edits) from the NameNode, merges the two to produce a new fsimage, saves it locally, pushes it to the NameNode, and resets the NameNode's edits.

 

 

 

full path  https://www.cnblogs.com/wxplmm/p/7239342.html

 


Origin www.cnblogs.com/eiguleo/p/11718595.html