【Big Data Hadoop】The Implementation Principle of HDFS 3.3.1 - Datanode - DataStorage

Foreword

The most important function of the Datanode is to manage the HDFS data blocks stored on its disks. The Datanode splits this management function into two parts: ① managing and organizing the disk storage directories (specified by dfs.datanode.data.dir), such as current, previous, detach, and tmp, which is implemented by the DataStorage class; ② managing and organizing the data blocks and their metadata files, which is mainly implemented by FsDatasetImpl and its related classes. This section describes the implementation of the DataStorage class.

Storage class inheritance relationship

The parent class of DataStorage is Storage, and the following figure shows the inheritance relationship of the Storage class. StorageInfo is the base class, which describes the basic information of storage. Its subclass Storage is an abstract class that provides abstract storage services for the Datanode and the Namenode.

[Figure: inheritance hierarchy of the Storage class]

A Storage can define multiple storage directories, each described by Storage's inner class StorageDirectory, which defines the common operations on a storage directory. Take the Datanode configuration as an example: a Datanode can define multiple storage directories to save data blocks. In the following configuration, the Datanode defines two data storage directories, "/data/hdfs/data_1" and "/data/hdfs/data_2". HDFS uses a single DataStorage object to manage the storage of the entire Datanode, while the two storage directories are managed by two StorageDirectory objects.

<property>
  <name>dfs.datanode.data.dir</name>
  <value>[DISK]/data/hdfs/data_1,[SSD]/data/hdfs/data_2</value>
</property>

Due to the Federation mechanism, a Datanode may manage multiple block pools. HDFS defines the BlockPoolSliceStorage class to manage a single block pool on the Datanode; each block pool is distributed across all the storage directories configured on the Datanode. The DataStorage class holds references to all BlockPoolSliceStorage objects and manages all the block pools on the Datanode through these references.
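To make the "one block pool spread across every storage directory" relationship concrete, the sketch below shows a typical on-disk layout for the two configured directories; the block pool ID is a made-up example.

  /data/hdfs/data_1/
    current/
      VERSION
      BP-1234567890-10.0.0.1-1600000000000/   <- slice of the block pool
        current/
          VERSION
          finalized/
          rbw/
  /data/hdfs/data_2/
    current/
      VERSION
      BP-1234567890-10.0.0.1-1600000000000/   <- slice of the same block pool
        current/
          VERSION
          finalized/
          rbw/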

StorageInfo

StorageInfo is used to describe the basic information of storage. The class has the following five fields.

  • layoutVersion: the layout version number of the storage system. Whenever the directory structure of the node's storage changes, or the format of the fsimage or editlog files changes, the layout version number is updated. This version number is usually a negative number.
  • namespaceID: the namespace identifier of the storage system.
  • cTime: the creation time of the storage system.
  • clusterID: the cluster ID of the storage system.
  • storageType: the node type, including DATA_NODE, NAME_NODE, JOURNAL_NODE, etc.

It should be noted that the information defined in StorageInfo is persisted in the VERSION file of the storage directory. The VERSION file is a typical Java properties file. In addition to the above five properties, the VERSION files of different node types also store other properties unique to them.

Most of the StorageInfo methods are getters/setters, and the logic that reads properties from the VERSION file and assigns them to fields is straightforward, so it is not covered in detail here.
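To make this concrete, here is a minimal sketch (not the HDFS source itself) that reads a VERSION file as an ordinary Java properties file. The path is hypothetical, and the exact keys present depend on the node type.

  import java.io.File;
  import java.io.FileInputStream;
  import java.io.IOException;
  import java.util.Properties;

  public class VersionFileReader {
    public static void main(String[] args) throws IOException {
      // Hypothetical path: a Datanode VERSION file lives under <storage dir>/current/.
      File versionFile = new File("/data/hdfs/data_1/current/VERSION");
      Properties props = new Properties();
      try (FileInputStream in = new FileInputStream(versionFile)) {
        props.load(in);  // VERSION is an ordinary key=value properties file
      }
      // The five StorageInfo properties described above:
      System.out.println("layoutVersion = " + props.getProperty("layoutVersion"));
      System.out.println("namespaceID   = " + props.getProperty("namespaceID"));
      System.out.println("cTime         = " + props.getProperty("cTime"));
      System.out.println("clusterID     = " + props.getProperty("clusterID"));
      System.out.println("storageType   = " + props.getProperty("storageType"));
    }
  }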

Storage.StorageState

The Storage class defines a very important inner enumeration class, StorageState, which completely describes all possible states of the storage space. During upgrades, rollbacks, upgrade finalization, checkpointing, and other operations, the storage space of a node (Datanode or Namenode) may be hit by all kinds of failures, such as misoperation, power failure, or machine crash, and may thus be left in an intermediate state. Introducing these intermediate states makes it easier for HDFS to recover from errors. The storage state is determined by the StorageDirectory.analyzeStorage() method, which we will introduce in the StorageDirectory subsection below.

  public enum StorageState {
    NON_EXISTENT,         // storage does not exist
    NOT_FORMATTED,        // storage is not formatted
    COMPLETE_UPGRADE,     // complete an upgrade
    RECOVER_UPGRADE,      // recover from an interrupted upgrade
    COMPLETE_FINALIZE,    // complete an upgrade finalization
    COMPLETE_ROLLBACK,    // complete a rollback
    RECOVER_ROLLBACK,     // recover from an interrupted rollback
    COMPLETE_CHECKPOINT,  // complete a checkpoint
    RECOVER_CHECKPOINT,   // recover from an interrupted checkpoint
    NORMAL;               // normal state
  }

Note also that the storage state is related to the startup options, which are defined in the HdfsServerConstants class, as shown below:

  enum StartupOption {
    FORMAT  ("-format"),                // format the storage
    CLUSTERID ("-clusterid"),           // specify the cluster ID
    GENCLUSTERID ("-genclusterid"),     // generate a cluster ID
    REGULAR ("-regular"),               // normal startup
    BACKUP  ("-backup"),                // start as a backup node
    CHECKPOINT("-checkpoint"),          // start as a checkpoint node
    UPGRADE ("-upgrade"),               // upgrade
    ROLLBACK("-rollback"),              // rollback
    ROLLINGUPGRADE("-rollingUpgrade"),  // rolling upgrade
    IMPORT  ("-importCheckpoint"),      // import a checkpoint
    BOOTSTRAPSTANDBY("-bootstrapStandby"),
    INITIALIZESHAREDEDITS("-initializeSharedEdits"),
    RECOVER  ("-recover"),
    FORCE("-force"),
    NONINTERACTIVE("-nonInteractive"),
    SKIPSHAREDEDITSCHECK("-skipSharedEditsCheck"),
    RENAMERESERVED("-renameReserved"),
    METADATAVERSION("-metadataVersion"),
    UPGRADEONLY("-upgradeOnly"),
    // The -hotswap constant should not be used as a startup option, it is
    // only used for StorageDirectory.analyzeStorage() in hot swap drive scenario.
    // TODO refactor StorageDirectory.analyzeStorage() so that we can do away with
    // this in StartupOption.
    HOTSWAP("-hotswap"),
    // Startup the namenode in observer mode.
    OBSERVER("-observer");
    // ...
  }

Storage.StorageDirectory

We know that both the Datanode and the Namenode can define multiple storage directories to store data. StorageDirectory is an inner class of Storage that defines common methods for managing a storage directory. It has the following fields.

  • root: the root of the storage directory, a java.io.File object.
  • dirType: the type of the current storage directory.
  • isShared: indicates whether the current directory is shared, for example between different Namenodes in an HA deployment, or between different block pools in a Federation deployment.
  • lock: an exclusive lock of type java.nio.channels.FileLock, used to support exclusive locking of the storage directory by the Datanode or Namenode process.
  • storageUuid: the identifier of the storage directory.

The methods of StorageDirectory fall into three categories: operations for obtaining files/folders, lock/unlock operations, and storage state recovery operations. Let's look at the implementation of these three categories in turn.

Folder operations

StorageDirectory provides methods to obtain each file/folder in the current storage directory structure. All the directories involved in the HDFS upgrade process can be obtained through these methods (a simplified sketch of how they resolve paths follows the list below):

[Figure: layout of the storage directory during the HDFS upgrade process]

  • getCurrentDir() - get the current directory.
  • getVersionFile() - get the VERSION file in the current directory.
  • getPreviousDir() - get the previous directory.
  • getPreviousVersionFile() - get the VERSION file in the previous directory.
  • getPreviousTmp() - get the previous.tmp directory.
  • getRemovedTmp() - get the removed.tmp directory.
  • getFinalizedTmp() - get the finalized.tmp directory.
  • getLastCheckpointTmp() - get the lastcheckpoint.tmp directory.
  • getPreviousCheckpoint() - get the previous.checkpoint directory.
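These getters are thin wrappers that resolve well-known names under the storage root. The following simplified sketch (modeled on Storage.StorageDirectory, not copied from it) shows the idea:

  import java.io.File;

  // Simplified stand-in for Storage.StorageDirectory; the directory and file
  // names match the on-disk layout used during upgrades.
  class StorageDirectorySketch {
    private final File root;

    StorageDirectorySketch(File root) {
      this.root = root;
    }

    File getCurrentDir()          { return new File(root, "current"); }
    File getVersionFile()         { return new File(getCurrentDir(), "VERSION"); }
    File getPreviousDir()         { return new File(root, "previous"); }
    File getPreviousVersionFile() { return new File(getPreviousDir(), "VERSION"); }
    File getPreviousTmp()         { return new File(root, "previous.tmp"); }
    File getRemovedTmp()          { return new File(root, "removed.tmp"); }
    File getFinalizedTmp()        { return new File(root, "finalized.tmp"); }
    File getLastCheckpointTmp()   { return new File(root, "lastcheckpoint.tmp"); }
    File getPreviousCheckpoint()  { return new File(root, "previous.checkpoint"); }
  }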

Lock/unlock operations

In the section on the Datanode disk storage structure, we introduced the in_use.lock file under each storage directory. This file locks the current storage directory to ensure exclusive use of the directory by the Datanode process; it is deleted when the Datanode process exits. StorageDirectory provides two methods, tryLock() and unlock(), which implement locking and unlocking of the storage directory respectively.

The actual locking work in StorageDirectory is done by the tryLock() method. tryLock() first constructs the lock file, then calls file.getChannel().tryLock() to try to acquire an exclusive lock on the storage directory. If another process already holds the lock, the call returns a null reference, indicating that another node is running on the current storage directory, and tryLock() throws an exception and exits. If locking succeeds, tryLock() writes the JVM information into the lock file.
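Below is a minimal sketch of that locking idea, reduced from what the real tryLock() does (the actual method also handles logging, deleteOnExit, and more); it is illustrative, not the HDFS source:

  import java.io.File;
  import java.io.IOException;
  import java.io.RandomAccessFile;
  import java.lang.management.ManagementFactory;
  import java.nio.channels.FileLock;
  import java.nio.channels.OverlappingFileLockException;

  class DirLockSketch {
    static FileLock tryLock(File storageRoot) throws IOException {
      File lockFile = new File(storageRoot, "in_use.lock");
      RandomAccessFile file = new RandomAccessFile(lockFile, "rws");
      FileLock lock;
      try {
        // Returns null if another process already holds the lock.
        lock = file.getChannel().tryLock();
      } catch (OverlappingFileLockException e) {
        file.close();
        throw new IOException("Directory already locked within this JVM", e);
      }
      if (lock == null) {
        file.close();
        throw new IOException("Cannot lock storage " + storageRoot
            + ": the directory is already locked by another node");
      }
      // Record which JVM holds the lock, e.g. "12345@hostname".
      file.writeBytes(ManagementFactory.getRuntimeMXBean().getName());
      return lock;
    }
  }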

Storage state recovery operations

While the Datanode is performing an upgrade, rollback, or finalize operation, all kinds of failures can occur, such as misoperation, power failure, or machine crash. So how does the Datanode resume the interrupted operation when it restarts? StorageDirectory provides two methods, analyzeStorage() and doRecover(): the Datanode first calls analyzeStorage() to analyze the storage state of the current node, then calls doRecover() to perform the recovery operation appropriate for that state.
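A hedged sketch of this analyze-then-recover pattern is shown below; the enum values and the two method names come from the HDFS sources, but the surrounding class and error handling are simplified stand-ins:

  import java.io.IOException;
  import org.apache.hadoop.hdfs.server.common.HdfsServerConstants.StartupOption;
  import org.apache.hadoop.hdfs.server.common.Storage;
  import org.apache.hadoop.hdfs.server.common.Storage.StorageDirectory;
  import org.apache.hadoop.hdfs.server.common.Storage.StorageState;

  class RecoverySketch {
    static void recoverDir(StorageDirectory sd, StartupOption startOpt,
                           Storage storage) throws IOException {
      StorageState curState = sd.analyzeStorage(startOpt, storage);
      switch (curState) {
        case NORMAL:         // consistent state, nothing to recover
          break;
        case NON_EXISTENT:   // the directory cannot be used
          throw new IOException("Storage directory " + sd + " does not exist");
        case NOT_FORMATTED:  // the caller must format it first
          throw new IOException("Storage directory " + sd + " is not formatted");
        default:             // an interrupted upgrade/rollback/checkpoint:
          sd.doRecover(curState);  // complete or undo it before proceeding
      }
    }
  }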

Storage

Storage is an abstract class that provides abstract storage services for the Datanode and the Namenode. The Storage class manages all the storage directories on the current node (which can be a Datanode or a Namenode); each storage directory is managed by a StorageDirectory object. Storage stores all the StorageDirectory objects it manages in a list field, storageDirs, and traverses them with its DirIterator iterator.
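As a small illustration (assuming only the public dirIterator() accessor), traversal looks like this:

  import java.util.Iterator;
  import org.apache.hadoop.hdfs.server.common.Storage;
  import org.apache.hadoop.hdfs.server.common.Storage.StorageDirectory;

  class DirWalkSketch {
    // Print the root of every storage directory the Storage object manages.
    static void listRoots(Storage storage) {
      for (Iterator<StorageDirectory> it = storage.dirIterator(); it.hasNext(); ) {
        StorageDirectory sd = it.next();
        System.out.println("storage dir root: " + sd.getRoot());
      }
    }
  }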

DataStorage

DataStorage inherits from the Storage abstract class and provides the function of managing Datanode storage space. This section describes the implementation of the DataStorage class.

In the HDFS Federation architecture, a Datanode can store data blocks of multiple namespaces, and each namespace has an independent block pool (BlockPool) on the Datanode's disks. A block pool is distributed under all the storage directories of the Datanode and collectively holds all the data blocks of that namespace on the current Datanode. HDFS defines the BlockPoolSliceStorage class to manage the storage of a single block pool on the Datanode (the implementation of BlockPoolSliceStorage is introduced in the next section), and the DataStorage class defines the bpStorageMap field to hold references to all the BlockPoolSliceStorage objects on the Datanode. As shown in the following code, the bpStorageMap field is of Map type and maintains the bpid -> BlockPoolSliceStorage mapping.

  private final Map<String, BlockPoolSliceStorage> bpStorageMap =
      Collections.synchronizedMap(new HashMap<String, BlockPoolSliceStorage>());

When the Datanode starts, it calls the methods provided by DataStorage to initialize its storage space. In the HDFS Federation architecture, a Datanode stores data blocks of multiple namespaces. For each namespace, the Datanode constructs a BPOfferService object to maintain communication with the Namenodes of that namespace (please refer to the BPOfferService subsection of the BlockManager section). As shown in Figure 4-15, when a BPServiceActor inside BPOfferService successfully shakes hands with the Namenode of its namespace, it calls DataNode.initBlockPool() to initialize the block pool of that namespace. DataNode.initBlockPool() eventually calls DataStorage.recoverTransitionRead() to perform the block pool storage initialization.
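To summarize the call path described above, here is a comment-only outline; the method names are taken from the Hadoop 3.x sources, but the arguments are elided and the outline is a simplification, not the exact call graph:

  // BPServiceActor.connectToNNAndHandshake()     // handshake with the Namenode
  //   -> DataNode.initBlockPool(...)             // runs after a successful handshake
  //     -> DataNode.initStorage(...)             // initializes Datanode storage
  //       -> DataStorage.recoverTransitionRead(...)
  //            // analyzeStorage()/doRecover() on each storage directory, then
  //            // loads or creates the BlockPoolSliceStorage for this namespace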

