[Big Data Hadoop] HDFS 3.3.1 - Namenode Cache Management

Foreword

Hadoop 2.3.0 added a Centralized Cache Management feature, which allows users to pin files and directories in the HDFS cache. The HDFS centralized cache is composed of off-heap memory distributed across the Datanodes and is managed uniformly by the Namenode.

HDFS clusters with centralized caching capabilities have the following significant advantages.

  • It prevents frequently used data from being evicted from memory.
  • Because the centralized cache is managed by the Namenode, HDFS clients can schedule tasks according to the cache status of data blocks, improving block read performance.
  • After a data block has been cached by a Datanode, clients can read it with a new, more efficient zero-copy mechanism. Because checksum verification is performed once when the block is cached, zero-copy reads of cached blocks incur no further checksum overhead.
  • It can improve the memory utilization of the cluster. When Datanodes rely on the operating system buffer cache, repeated reads of a block may pull all n replicas of the block into the OS buffers. With centralized caching, the user can pin only m of the n replicas, saving the memory of the remaining n - m replicas (for example, with 3 replicas and 1 pinned copy, roughly 2 replicas' worth of memory is saved).

This section first introduces the concepts and architecture of HDFS centralized caching, and then describes the implementation of CacheManager and CacheReplicationMonitor, the components in the Namenode responsible for cache management.

Cache concepts

HDFS centralized caching has two main concepts.

  • Cache directive (Cache Directive): a cache directive defines a path to be cached. The path can be a directory or a file. Note that directory caching is non-recursive: only the files in the first level of the directory are cached. A directive can also specify additional parameters, such as a cache replication factor and an expiration time. The cache replication factor sets the number of cached copies of the path; if multiple cache directives point to the same file, the maximum replication factor among them is applied.

  • Cache pool (Cache Pool): a cache pool is a management unit, a group that manages cache directives. A cache pool has UNIX-like permissions that restrict which users and groups can access it. Cache pools can also be used for resource management: a pool can set a maximum limit on the aggregate number of bytes that the directives in the pool may cache. (The sketch after this list shows how these parameters appear in the HDFS Java API.)
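To make these two concepts concrete, the sketch below builds a cache pool description and a cache directive with the kinds of parameters mentioned above. It uses the public HDFS Java API classes CachePoolInfo and CacheDirectiveInfo; the pool name, path, and parameter values are made up for illustration, and the objects are only constructed and printed, not submitted to a cluster.

```java
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.FsPermission;
import org.apache.hadoop.hdfs.protocol.CacheDirectiveInfo;
import org.apache.hadoop.hdfs.protocol.CachePoolInfo;

public class CacheConceptsExample {
  public static void main(String[] args) {
    // A cache pool: a management unit with UNIX-like permissions and an optional byte limit.
    CachePoolInfo pool = new CachePoolInfo("reports-pool")
        .setOwnerName("hive")
        .setGroupName("analysts")
        .setMode(new FsPermission((short) 0755))
        .setLimit(10L * 1024 * 1024 * 1024);   // directives in this pool may cache at most 10 GB

    // A cache directive: a path to cache (non-recursive for directories), with a cache
    // replication factor and a relative expiration time.
    CacheDirectiveInfo directive = new CacheDirectiveInfo.Builder()
        .setPath(new Path("/warehouse/reports"))
        .setPool("reports-pool")
        .setReplication((short) 2)             // keep two cached copies of each block
        .setExpiration(CacheDirectiveInfo.Expiration.newRelative(24L * 60 * 60 * 1000))
        .build();

    System.out.println(pool);
    System.out.println(directive);
  }
}
```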

Cache management commands

HDFS provides administrators and users with the "hdfs cacheadmin" command to manage the centralized cache, covering both cache directive control and cache pool control.

  • Cache directive control: an administrator can call the hdfs cacheadmin -addDirective command to cache a specified path; call the hdfs cacheadmin -removeDirective command to remove the cache directive with a specified id; call the hdfs cacheadmin -removeDirectives command to remove all cache directives for a specified path; and call the hdfs cacheadmin -listDirectives command to list the current cache directives. For reasons of space, the parameters of these commands are not listed in full here.

  • Cache pool control: an administrator can call the hdfs cacheadmin -addPool command to create a cache pool; call the hdfs cacheadmin -modifyPool command to modify the configuration of a cache pool; and call the hdfs cacheadmin -removePool command to delete a cache pool. (The programmatic equivalents of these commands are sketched below.)
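The commands above also have programmatic counterparts in the HDFS Java API, exposed to applications through DistributedFileSystem. The sketch below, with made-up pool and path names and assuming fs.defaultFS points at an HDFS cluster, shows rough equivalents of -addPool, -addDirective, -listDirectives and -removeDirective; it is an illustration rather than a complete reference for the API.

```java
import java.util.EnumSet;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.CacheFlag;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.RemoteIterator;
import org.apache.hadoop.hdfs.DistributedFileSystem;
import org.apache.hadoop.hdfs.protocol.CacheDirectiveEntry;
import org.apache.hadoop.hdfs.protocol.CacheDirectiveInfo;
import org.apache.hadoop.hdfs.protocol.CachePoolInfo;

public class CacheAdminApiExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // assumes fs.defaultFS is configured to point at an HDFS cluster
    DistributedFileSystem dfs = (DistributedFileSystem) FileSystem.get(conf);

    // roughly "hdfs cacheadmin -addPool hotPool"
    dfs.addCachePool(new CachePoolInfo("hotPool"));

    // roughly "hdfs cacheadmin -addDirective -path /hot/data -pool hotPool"
    long id = dfs.addCacheDirective(
        new CacheDirectiveInfo.Builder()
            .setPath(new Path("/hot/data"))
            .setPool("hotPool")
            .build(),
        EnumSet.noneOf(CacheFlag.class));

    // roughly "hdfs cacheadmin -listDirectives" (an empty filter matches everything)
    RemoteIterator<CacheDirectiveEntry> it =
        dfs.listCacheDirectives(new CacheDirectiveInfo.Builder().build());
    while (it.hasNext()) {
      System.out.println(it.next().getInfo());
    }

    // roughly "hdfs cacheadmin -removeDirective <id>"
    dfs.removeCacheDirective(id);
  }
}
```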

HDFS centralized cache architecture

[Figure: HDFS centralized cache architecture]

As shown in the figure, users send cache directives (Cache Directive) to the Namenode through the hdfs cacheadmin command or the HDFS API. The Namenode's CacheManager class saves the cache directives in in-memory data structures (the directivesById and directivesByPath fields of CacheManager) and also records them in the fsimage and editlog for persistence. The Namenode's CacheReplicationMonitor class then periodically scans the namespace and the active cache directives to determine which data blocks need to be cached or uncached, and assigns cache tasks to Datanodes.

The Namenode is also responsible for managing the off-heap caches of all Datanodes in the cluster. Datanodes periodically send cache reports to the Namenode, and the Namenode sends cache commands (cache or uncache a block) to Datanodes in heartbeat responses.

When DFSClient reads a data block, it sends a ClientProtocol.getBlockLocations request to the Namenode to obtain the block's location information. Besides the location information, the Namenode also returns the block's cache information, so that DFSClient can perform a local zero-copy read of the cached data block, improving read efficiency.
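As a client-side illustration of the zero-copy path mentioned above, the sketch below uses the enhanced byte-buffer read API of FSDataInputStream (read with a ByteBufferPool and ReadOption.SKIP_CHECKSUMS), which is what lets a client map a cached replica directly into its address space. The file path is made up, and this shows only the public API usage, not the DFSClient internals described in Chapter 5.

```java
import java.nio.ByteBuffer;
import java.util.EnumSet;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.ReadOption;
import org.apache.hadoop.io.ElasticByteBufferPool;

public class ZeroCopyReadExample {
  public static void main(String[] args) throws Exception {
    // assumes fs.defaultFS is configured to point at an HDFS cluster
    FileSystem fs = FileSystem.get(new Configuration());
    ElasticByteBufferPool bufferPool = new ElasticByteBufferPool();

    try (FSDataInputStream in = fs.open(new Path("/hot/data/part-00000"))) {
      ByteBuffer buf;
      // SKIP_CHECKSUMS is acceptable for cached replicas because they were checksummed when cached.
      while ((buf = in.read(bufferPool, 4 * 1024 * 1024,
                            EnumSet.of(ReadOption.SKIP_CHECKSUMS))) != null) {
        System.out.println("read " + buf.remaining() + " bytes");
        in.releaseBuffer(buf);   // always hand the buffer back to the stream
      }
    }
  }
}
```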

CacheManager class implementation

The CacheManager class is the core component of the Namenode's centralized cache management. It manages all cached data blocks distributed across the Datanodes of the HDFS cluster and is responsible for responding to the cache management requests issued through the hdfs cacheadmin command or the HDFS API (see the cache management commands section above).

CacheManager defines the following fields (a simplified sketch of their declarations follows the list).

  • directivesById: saves all cache directives, keyed by cache directive id.
  • directivesByPath: saves all cache directives, keyed by path.
  • cachePools: saves all cache pools, keyed by cache pool name.
  • monitor: the CacheReplicationMonitor object, responsible for scanning the namespace and active cache directives to determine which data blocks need to be cached or uncached, and for assigning cache tasks to Datanodes.
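The declarations of these fields look roughly as follows; this is a simplified sketch whose generic types follow the Hadoop source only approximately, not a verbatim excerpt.

```java
import java.util.List;
import java.util.TreeMap;

// Approximate sketch of the CacheManager fields listed above.
public final class CacheManager {
  // all cache directives, keyed by cache directive id
  private final TreeMap<Long, CacheDirective> directivesById = new TreeMap<>();
  // all cache directives, keyed by cached path (several directives may point to the same path)
  private final TreeMap<String, List<CacheDirective>> directivesByPath = new TreeMap<>();
  // all cache pools, keyed by cache pool name
  private final TreeMap<String, CachePool> cachePools = new TreeMap<>();
  // scans the namespace and active directives, and assigns cache/uncache work to Datanodes
  private CacheReplicationMonitor monitor;
  // ... other fields omitted
}
```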

The client calls the ClientProtocol.addCachePool() method to create a cache pool. After receiving the request, NameNodeRpcServer encapsulates the parameters of the cache pool creation request into a CachePoolInfo object (including the owner, group, mode, limit, maxTtl and other parameters) and then calls the CacheManager.addCachePool() method to respond to the request. CacheManager.addCachePool() first verifies that the request parameters (held in the info variable) are valid, and then checks whether the cache pool is already saved in the CacheManager.cachePools collection. If CacheManager has no record of this cache pool, it calls the CachePool.createFromInfoAndDefaults() method to create a new CachePool object from the request parameters; createFromInfoAndDefaults() fills any parameters that were not set in the request with default values and then constructs the CachePool object. After the CachePool object has been created successfully, addCachePool() saves it in the cachePools field. At this point, the operation of adding a cache pool is complete.

The flow of the CacheManager.addCachePool() method can be sketched as follows.
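The sketch below paraphrases the steps just described (validate the parameters, check for an existing pool, build the CachePool with defaults, register it); it is not the verbatim Hadoop source, and error handling, locking and logging are omitted.

```java
// Simplified paraphrase of the CacheManager.addCachePool() flow described above.
public CachePoolInfo addCachePool(CachePoolInfo info) throws IOException {
  CachePoolInfo.validate(info);                  // 1. verify the request parameters
  String poolName = info.getPoolName();
  CachePool pool = cachePools.get(poolName);     // 2. check whether the pool already exists
  if (pool != null) {
    throw new InvalidRequestException("Cache pool " + poolName + " already exists.");
  }
  // 3. fill unset parameters with defaults and construct the CachePool object
  pool = CachePool.createFromInfoAndDefaults(info);
  // 4. register the new pool in the cachePools field
  cachePools.put(pool.getPoolName(), pool);
  return pool.getInfo(true);
}
```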

After successfully creating the cache pool, the client calls the ClientProtocol.addCacheDirective() method to add a path to the cache. After NameNodeRpcServer receives this request, it encapsulates the parameters of the request into a CacheDirectiveInfo object and then calls the CacheManager.addInternal() method to respond to the request.
As shown in the sketch below, CacheManager.addInternal() adds the cache directive object to the directivesById and directivesByPath collections and updates the statistics of the cache pool. The method then calls setNeedsRescan() to trigger the execution of rescan() by CacheReplicationMonitor.
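A simplified paraphrase of CacheManager.addInternal(), following the description above (the cache pool statistics update is omitted from the sketch, and this is not the verbatim Hadoop source):

```java
private void addInternal(CacheDirective directive, CachePool pool) {
  // register the directive with its pool
  boolean addedDirective = pool.getDirectiveList().add(directive);
  assert addedDirective;
  // index the directive by id and by path
  directivesById.put(directive.getId(), directive);
  String path = directive.getPath();
  List<CacheDirective> directives = directivesByPath.get(path);
  if (directives == null) {
    directives = new ArrayList<CacheDirective>(1);
    directivesByPath.put(path, directives);
  }
  directives.add(directive);
  // no cache command is sent to any Datanode here; instead, ask the
  // CacheReplicationMonitor to rescan so the new directive takes effect
  setNeedsRescan();
}
```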

It can be seen that addInternal() only updates the cache information maintained by CacheManager in the directivesById, directivesByPath and cachePools fields; it does not directly issue any cache commands (cache or uncache) to Datanodes. Instead it calls the setNeedsRescan() method to trigger a rescan by CacheReplicationMonitor, a simple flag-plus-signal hand-off sketched below. The implementation of CacheReplicationMonitor is described in the next section.
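A rough sketch of that hand-off, assuming the flag and condition variable live in CacheReplicationMonitor (a paraphrase of the idea, not the verbatim source):

```java
// Called (via CacheManager.setNeedsRescan()) after cache metadata changes.
// The monitor thread waits on the doRescan condition in its run() loop; setting the
// flag and signalling the condition makes the next rescan start promptly instead of
// waiting for the full refresh interval.
public void setNeedsRescan() {
  lock.lock();
  try {
    this.needsRescan = true;
    this.doRescan.signalAll();
  } finally {
    lock.unlock();
  }
}
```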

CacheReplicationMonitor

CacheReplicationMonitor is a thread class started with the Namenode. It scans the namespace and the active cache directives at a fixed interval to determine which data blocks need to be cached or uncached, and then sends cache commands to the Datanodes (the interval is controlled by the dfs.namenode.path.based.cache.refresh.interval.ms configuration item and defaults to 30 seconds).

The CacheReplicationMonitor thread cyclically calls the rescan() method to execute the scanning logic. rescan() first calls rescanCacheDirectives() to traverse all cache directives (stored in the CacheManager.directivesByPath field) and adds the data blocks under each directive's path to the CacheReplicationMonitor.cachedBlocks collection, awaiting further processing.

rescan() then calls rescanCachedBlockMap() to traverse the CacheReplicationMonitor.cachedBlocks collection and decide, for each data block, whether a cache or an uncache operation is needed. For a block that needs to be cached, rescanCachedBlockMap() calls CacheReplicationMonitor.addNewPendingCached() to select a suitable Datanode for the block (among all Datanodes holding a replica of the block, the one with the most available cache memory is chosen) and adds the block to the pendingCached list of that Datanode's DatanodeDescriptor object. For a block that needs to be uncached, rescanCachedBlockMap() calls CacheReplicationMonitor.addNewPendingUncached(), which randomly selects one of the Datanodes that have cached the block and adds the block to the pendingUncached list of that Datanode's DatanodeDescriptor object.

The overall flow of rescan() can be sketched as follows.
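The sketch below paraphrases this rescan() flow (scan the directives, then reconcile the cached-block map); it is not the verbatim Hadoop source, and shutdown checks and statistics handling are simplified.

```java
private void rescan() throws InterruptedException {
  scannedDirectives = 0;
  scannedBlocks = 0;
  namesystem.writeLock();   // the scan runs under the namesystem write lock
  try {
    resetStatistics();
    // phase 1: traverse the active cache directives and collect the blocks under
    // each directive's path into the cachedBlocks collection
    rescanCacheDirectives();
    // phase 2: traverse cachedBlocks and, for each block, add it to the pendingCached
    // or pendingUncached list of a chosen Datanode's DatanodeDescriptor
    rescanCachedBlockMap();
    // let heartbeat processing send out the new caching directives promptly
    blockManager.getDatanodeManager().resetLastCachingDirectiveSentTime();
  } finally {
    namesystem.writeUnlock();
  }
}
```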

After a data block has been added to the pendingCached or pendingUncached list of a DatanodeDescriptor, the Namenode generates the corresponding name node command during heartbeat processing (please refer to the section on name node command generation in Datanode management) and sends it to the Datanode in the heartbeat response. After the Datanode receives the command, it performs the cache or uncache operation (please refer to the FsDatasetImpl section in Chapter 4). At this point, the Namenode-side logic for handling cached data blocks is complete. For the implementation of DFSClient reading cached data blocks in zero-copy mode, readers are referred to the relevant sections on zero-copy reads in Chapter 5.

The content of this article combines material from "Hadoop 2.X HDFS Source Code Analysis" with my own understanding.

I hope this is helpful to you. Remember to follow, comment, and favorite. Thank you!

Origin blog.csdn.net/u013412066/article/details/130083613