A detailed explanation of the HDFS distributed file system

1. HDFS distributed file system storage architecture

1. HDFS structure and architecture

HDFS structure: 

The NameNode is the heart of HDFS. It manages and maintains the entire HDFS file system. Its main functions are:

  • Receiving user operation requests;
  • Managing the file system namespace, cluster configuration information, and the replication of storage blocks;
  • Maintaining the file directory tree and the list of blocks that belongs to each file;
  • Managing the mapping between blocks and DataNodes.
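The two mappings in this list (file to blocks, and block to DataNodes) can be pictured with a small in-memory model. This is only a sketch for illustration: the class `ToyNameNode`, its method names, and the round-robin placement are assumptions made here and do not correspond to Hadoop's real classes or its placement policy. The replication factor of 3 matches the HDFS default.

```python
from dataclasses import dataclass, field


@dataclass
class ToyNameNode:
    """Illustrative in-memory model; not Hadoop's real data structures."""
    # Namespace: file path -> ordered list of block IDs that make up the file.
    file_blocks: dict = field(default_factory=dict)
    # Block map: block ID -> set of DataNode names holding a replica.
    block_locations: dict = field(default_factory=dict)
    replication: int = 3  # HDFS default replication factor
    _next_block_id: int = 0

    def create_file(self, path, num_blocks, datanodes):
        """Register a file and assign each block to `replication` DataNodes."""
        if len(datanodes) < self.replication:
            raise ValueError("not enough DataNodes for the replication factor")
        blocks = []
        for i in range(num_blocks):
            block_id = self._next_block_id
            self._next_block_id += 1
            # Round-robin placement, purely for illustration (hypothetical policy).
            targets = {
                datanodes[(i + r) % len(datanodes)]
                for r in range(self.replication)
            }
            self.block_locations[block_id] = targets
            blocks.append(block_id)
        self.file_blocks[path] = blocks
        return blocks

    def locate(self, path):
        """Return (block ID, sorted DataNodes) pairs for every block of a file."""
        return [(b, sorted(self.block_locations[b])) for b in self.file_blocks[path]]
```

Calling `create_file("/a.txt", 2, ["dn1", "dn2", "dn3", "dn4"])` on a fresh instance registers two blocks, and `locate("/a.txt")` then returns the DataNodes a client would contact for each block.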

In HDFS, FsImage and Edit Log are two very important files of the NameNode. Both are stored on the local disk of the NameNode machine, and together they hold the NameNode's metadata.

The FsImage file records information such as the mapping of data blocks to files and the structure and attributes of directories and files. It captures the state of all directories and files in the HDFS file system as of the last checkpoint.

The Edit Log file records operations such as creating, deleting, and renaming files; that is, every operation on the HDFS file system since the last checkpoint is recorded in the Edit Log. For example, when a file is created in HDFS, the NameNode appends a record to the Edit Log; similarly, changing a file's replication factor also appends a record to the Edit Log.
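The relationship between the two files can be sketched as "snapshot plus replay": the current namespace is the FsImage from the last checkpoint with every Edit Log record applied in order. The following is a minimal sketch under assumptions made here. The dictionary layout, the operation names (`create`, `delete`, `rename`, `set_replication`), and the function names are hypothetical and are not the on-disk formats or APIs that Hadoop uses.

```python
import copy


def apply_edit(namespace, edit):
    """Apply one logged operation to a namespace dict (path -> attributes)."""
    op = edit["op"]
    if op == "create":
        namespace[edit["path"]] = {"replication": edit.get("replication", 3)}
    elif op == "delete":
        namespace.pop(edit["path"], None)
    elif op == "rename":
        namespace[edit["dst"]] = namespace.pop(edit["src"])
    elif op == "set_replication":
        namespace[edit["path"]]["replication"] = edit["replication"]
    else:
        raise ValueError(f"unknown operation: {op}")


def recover(fsimage, edit_log):
    """Rebuild the current namespace from the last checkpoint plus later edits."""
    namespace = copy.deepcopy(fsimage)  # leave the stored image untouched
    for edit in edit_log:
        apply_edit(namespace, edit)
    return namespace


def checkpoint(fsimage, edit_log):
    """Fold the logged edits into a new image and start an empty log."""
    return recover(fsimage, edit_log), []
```

For example, starting from an image that contains only `/a` and replaying a log that creates `/b`, changes its replication factor, renames `/a` to `/c`, and then deletes `/b` leaves a namespace that contains only `/c`. A checkpoint in this model produces that namespace as the new image and resets the log, which is why the Edit Log only ever needs to cover operations since the last checkpoint.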

HDFS distributed file system architecture:

Origin blog.csdn.net/qq_35029061/article/details/132252490