The data block (Block) in HDFS

In our summary of distributed storage principles, we learned the three characteristics of distributed storage:

  1. Data is split into blocks and stored across multiple machines
  2. Data blocks are stored redundantly on multiple machines to improve their availability
  3. The cluster uses a master/slave architecture

As an implementation of distributed storage, HDFS certainly has the above three characteristics.

In the mini-course video on data blocks in HDFS, we learned the basic characteristics of the data block in HDFS. Let us review them now, and then move on to a deeper understanding of data blocks.

Review

In HDFS, the default block size is 128MB. When we upload a file larger than 300MB to HDFS, that file is split into three blocks; for example, a 362.4MB file is split into blocks of 128MB, 128MB and 106.4MB:

(screenshot: the file's block list in the HDFS Web UI)

All of the data blocks are distributed across the DataNodes for storage:

(screenshot: the blocks distributed across the DataNodes in the HDFS Web UI)

To improve the availability of each data block, HDFS stores 3 replicas of every block by default. Here we see only one replica, because we have the following configuration in hdfs-site.xml:

<property>
    <name>dfs.replication</name>
    <value>1</value>
    <description>The number of replicas of each data block; it cannot be greater than the number of DataNodes. The default value is 3.</description>
</property>

We can also use a command to set all data blocks of the file /user/hadoop-twq/cmd/big_file.txt to be stored with 3 replicas:

hadoop fs -setrep 3 /user/hadoop-twq/cmd/big_file.txt
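If we then want to confirm the new replication factor (an extra check not shown in the original screenshots, using only standard commands), we can list the file; for a file, the second column of the ls output is its replication factor:

hadoop fs -ls /user/hadoop-twq/cmd/big_file.txt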

  

From the following we can see that each data block is stored redundantly with 3 replicas:

(screenshot: each block listed with 3 replicas in the HDFS Web UI)

At this point some students may ask: why do we see only two replicas here? That is because our cluster has only two DataNodes, so at most two replicas can be stored; setting the replication factor to 3 has no further effect. Therefore we generally set the number of replicas to be less than or equal to the number of DataNodes in the cluster.

Be sure to note: when we upload 362.4MB of data to HDFS and the block replication factor is 3, the amount of data actually stored in HDFS is: 362.4MB * 3 = 1087.2MB

Note: above we viewed the block information of an HDFS file via the HDFS Web UI. Besides that method, we can also check block information with the fsck command.
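For example, a check on the file used above could look like this (the -files, -blocks and -locations options make fsck print each block of the file and the DataNodes that hold its replicas):

hdfs fsck /user/hadoop-twq/cmd/big_file.txt -files -blocks -locations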

The implementation of the data block

In the HDFS implementation, a data block is abstracted as the class org.apache.hadoop.hdfs.protocol.Block (hereafter referred to simply as Block). The Block class has the following attribute fields:

public class Block implements Writable, Comparable<Block> {
    private long blockId;         // uniquely identifies a Block
    private long numBytes;        // size of the Block in bytes
    private long generationStamp; // timestamp at which the Block was generated
}

We can also see this block information on the Web UI:

(screenshot: block details, including blockId, size and generation stamp, in the HDFS Web UI)

Besides the three fields above, a Block also needs to know how many replicas it has and on which DataNode each replica is stored. To hold this information, HDFS uses a class called org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoContiguous (hereafter referred to as BlockInfo). The BlockInfo class extends the Block class, as follows:

(diagram: the BlockInfoContiguous class extending Block)

The core attribute of BlockInfo is an array called triplets, whose length is 3*replication, where replication is the replication factor of the data block. This array stores the DataNode information of every replica of the block. If we assume the replication factor is 3, the array length is 3*3=9, and the data stored in the array looks as follows:

(diagram: layout of the triplets array)

In other words, triplets contains the following information (a small sketch follows this list):

  • triplets[3*i]: the DataNode on which the i-th replica of the Block is located;
  • triplets[3*i + 1]: the previous Block on that DataNode;
  • triplets[3*i + 2]: the next Block on that DataNode;

where i denotes the i-th replica of the Block and takes values in [0, replication).
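The following is a minimal sketch of how such an array can be indexed. It is illustrative only: the class and method names are invented for this example, and the real BlockInfoContiguous stores Hadoop-specific types in the slots.

class BlockInfoSketch {
    // For the i-th replica: slot 3*i holds the DataNode, slot 3*i+1 the previous
    // block on that DataNode, and slot 3*i+2 the next block on that DataNode.
    private final Object[] triplets;

    BlockInfoSketch(int replication) {
        this.triplets = new Object[3 * replication];
    }

    Object getDatanode(int i)          { return triplets[3 * i]; }
    BlockInfoSketch getPrevious(int i) { return (BlockInfoSketch) triplets[3 * i + 1]; }
    BlockInfoSketch getNext(int i)     { return (BlockInfoSketch) triplets[3 * i + 2]; }

    void setDatanode(int i, Object datanode) { triplets[3 * i] = datanode; }
}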

When we discussed Namespace management on the HDFS NameNode, we mentioned that an HDFS file contains an array of BlockInfo, which indicates which blocks the file is divided into; that BlockInfo array is exactly the BlockInfoContiguous array we are talking about here. The following are the attributes of INodeFile:

public class INodeFile {
    private long header = 0L;             // encodes the storage policy ID, the replication factor and the block size
    private BlockInfoContiguous[] blocks; // the data blocks that make up the file
}
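The header field packs several small values into one long. The exact bit widths are not given in this article, so the following is only an assumed layout for illustration (48 bits for the block size, 12 bits for the replication factor, 4 bits for the storage policy ID):

class HeaderPackingSketch {
    // Assumed layout (illustrative only): bits 0-47 = preferred block size,
    // bits 48-59 = replication factor, bits 60-63 = storage policy ID.
    static long pack(long preferredBlockSize, int replication, int storagePolicyId) {
        return (preferredBlockSize & 0xFFFFFFFFFFFFL)
                | ((long) (replication & 0xFFF) << 48)
                | ((long) (storagePolicyId & 0xF) << 60);
    }

    static long preferredBlockSize(long header) { return header & 0xFFFFFFFFFFFFL; }
    static int replication(long header)         { return (int) ((header >>> 48) & 0xFFF); }
    static int storagePolicyId(long header)     { return (int) ((header >>> 60) & 0xF); }
}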

  

So far, we have learned this information: which Blocks a file contains, which DataNodes each Block is stored on, and the linked-list relationship among all the Blocks on each DataNode.

From the point of view of information completeness, this is already enough data to support all normal operations on the HDFS file system, but one heavily used scenario is still a problem: how do we quickly locate a BlockInfo from a blockId?

In fact, we could maintain the blockId-to-BlockInfo mapping on the NameNode with a HashMap, that is, a HashMap<Block, BlockInfo>, which would let us quickly locate a BlockInfo from a blockId. However, because of problems with memory usage, collision resolution and performance, the Hadoop team later re-implemented this with LightWeightGSet instead of HashMap. It is still essentially a hash table that resolves collisions by chaining, but it is better in terms of memory usage and performance.

So, to solve the problem of quickly locating a BlockInfo from a blockId, HDFS introduced BlocksMap, whose underlying implementation is LightWeightGSet.
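The idea can be sketched with a plain chained hash table keyed by blockId (a minimal illustration only; the real BlocksMap stores BlockInfo objects in a LightWeightGSet with a fixed-size index):

class BlocksMapSketch {
    static class Entry {
        final long blockId;
        final Object blockInfo; // would be a BlockInfo in real HDFS
        Entry next;             // chain used to resolve hash collisions
        Entry(long blockId, Object blockInfo) { this.blockId = blockId; this.blockInfo = blockInfo; }
    }

    // fixed-size index, analogous to LightWeightGSet never rehashing
    private final Entry[] entries = new Entry[1 << 20];

    private int index(long blockId) {
        int h = (int) (blockId ^ (blockId >>> 32)); // hash derived from the blockId
        return h & (entries.length - 1);            // entries.length is a power of two
    }

    void put(long blockId, Object blockInfo) {
        Entry e = new Entry(blockId, blockInfo);
        int i = index(blockId);
        e.next = entries[i];
        entries[i] = e;
    }

    Object get(long blockId) {
        for (Entry e = entries[index(blockId)]; e != null; e = e.next) {
            if (e.blockId == blockId) return e.blockInfo;
        }
        return null;
    }
}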

During HDFS cluster startup, each DataNode performs a BR (BlockReport: the DataNode reports the data blocks it stores to the NameNode). For every Block in the BR, a hash code is computed and the corresponding BlockInfo is inserted at the corresponding position, gradually building up the huge BlocksMap. We mentioned the BlockInfo collection held by INodeFile earlier; if we collected every BlockInfo from BlocksMap and every BlockInfo from the INodeFiles, we would find that the two sets are identical. In fact, every BlockInfo in BlocksMap is a reference to the corresponding BlockInfo of some INodeFile. When looking up the BlockInfo for a Block, the Block's hash code is computed first and the corresponding BlockInfo is located quickly from the result. At this point, the problems concerning HDFS file system metadata itself have basically been solved.

 

Estimating BlocksMap memory usage

An HDFS file is split into multiple Blocks according to a configured size. To ensure data reliability, each Block has multiple replicas stored on different DataNodes. Besides maintaining the information of the Block itself, the NameNode also needs to maintain the mapping from each Block to its list of DataNodes, which describes the actual physical location of every replica of that Block. BlocksMap is the structure used for this Block-to-DataNode-list mapping. BlocksMap is resident in memory and takes up a very large amount of it, so estimating the memory used by BlocksMap is very necessary. Let us look at the internal structure of BlocksMap:

The memory estimates below assume a 64-bit operating system without pointer compression enabled.

public class BlocksMap {
    private final int capacity;                       // 4 bytes
    // implemented with a GSet, specifically LightWeightGSet
    private GSet<Block, BlockInfoContiguous> blocks;  // reference type, 8 bytes
}

  

So the direct memory size of a BlocksMap object is: 16-byte object header + 4 bytes + 8 bytes = 28 bytes

The structure of Block is as follows:

public class Block implements Writable, Comparable<Block> {
    private long blockId;         // uniquely identifies a Block, 8 bytes
    private long numBytes;        // size of the Block in bytes, 8 bytes
    private long generationStamp; // timestamp at which the Block was generated, 8 bytes
}

  

So the direct memory size of a Block object is: 16-byte object header + 8 bytes + 8 bytes + 8 bytes = 40 bytes

The structure of BlockInfoContiguous is as follows:

public class BlockInfoContiguous extends Block {
    private BlockCollection bc;                               // reference type, 8 bytes
    private LightWeightGSet.LinkedElement nextLinkedElement;  // reference type, 8 bytes
    private Object[] triplets; // reference 8 bytes + array object header 24 bytes + 8 bytes * 3 * 3 (assuming a replication factor of 3) = 104 bytes
}

  

So the direct memory size of a BlockInfoContiguous object is: 16-byte object header + 8 bytes + 8 bytes + 104 bytes = 136 bytes

The structure of LightWeightGSet is as follows:

public class LightWeightGSet<K, E extends K> implements GSet<K, E> {
    private final LinkedElement[] entries; // reference 8 bytes + array object header 24 bytes = 32 bytes
    private final int hash_mask;           // 4 bytes
    private int size = 0;                  // 4 bytes
    private int modification = 0;          // 4 bytes
}

LightWeightGSet is essentially a hash table that resolves collisions by chaining. To avoid the performance overhead that rehashing would bring, LightWeightGSet takes 2% of the JVM's available memory as its index (entries) space at initialization, and this never changes afterwards. So the direct memory size of LightWeightGSet is: 16-byte object header + 32 bytes + 4 bytes + 4 bytes + 4 bytes + (2% * available JVM memory) = 60 bytes + (2% * available JVM memory)
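As a rough sketch of how a fixed index size could be derived from 2% of the available JVM memory (illustrative arithmetic only; the exact rounding used by the real LightWeightGSet is not reproduced here):

class IndexCapacitySketch {
    // Turn a percentage of the available memory into an entries[] length,
    // assuming 8 bytes per reference and rounding down to a power of two
    // so that (hash & (capacity - 1)) can be used as the slot index.
    static int computeCapacity(long availableMemoryBytes, double percentage) {
        long references = (long) (availableMemoryBytes * percentage / 100.0) / 8;
        long bounded = Math.min(references, Integer.MAX_VALUE / 2);
        int capacity = Integer.highestOneBit((int) bounded);
        return Math.max(capacity, 1);
    }
}

With 128GB available and 2%, for instance, this works out to roughly 343 million references, rounded down to 2^28 slots.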

 

Assume the cluster has 100 million Blocks and the NameNode has a fixed 128GB of available memory. The memory used by BlocksMap is then:

direct memory size of BlocksMap + (direct memory size of Block + direct memory size of BlockInfoContiguous) * 100M + direct memory size of LightWeightGSet
that is:
28 bytes + (40 bytes + 136 bytes) * 100M + 60 bytes + (2% * 128GB) ≈ 19.7475GB

  

Why do we multiply by 100M above? Because 100M = 100 * 1024 * 1024 = 104,857,600, which is approximately 100 million; and since all the sizes above are in bytes, multiplying by 100M corresponds to 100 million Blocks.
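As a quick sanity check of the arithmetic above (a throwaway snippet, not HDFS code):

public class BlocksMapMemoryEstimate {
    public static void main(String[] args) {
        long blocks = 100L * 1024 * 1024;           // "100M", roughly 100 million Blocks
        double gb = 1024.0 * 1024 * 1024;
        double perBlock = 40 + 136;                 // Block + BlockInfoContiguous, in bytes
        double fixedOverhead = 28 + 60;             // BlocksMap + LightWeightGSet fixed parts, in bytes
        double gsetIndex = 0.02 * 128 * gb;         // 2% of the 128GB available to the NameNode
        double total = fixedOverhead + perBlock * blocks + gsetIndex;
        System.out.printf("%.4f GB%n", total / gb); // prints about 19.7475 GB
    }
}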

The BlocksMap data is resident in memory for the NameNode's entire life cycle. As the amount of data grows, the number of Blocks grows correspondingly, and the JVM heap memory occupied by BlocksMap grows roughly linearly along with it.

 
