"In-depth HDFS" - HDFS memory storage

Memory Storage

HDFS supports several kinds of data storage. One of them is memory storage, in which the machine's memory serves as the carrier for the data.

Storing data in memory has potential drawbacks:
1. If the data is held only temporarily in memory, it is lost as soon as the service stops (or goes down).
2. If the data is held in memory and persisted to disk only when the service stops, an unexpected shutdown still loses it.
To avoid these problems, HDFS persists data asynchronously: as new data is stored in memory, the oldest data is persisted to disk.
HDFS implements memory storage with the LAZY_PERSIST storage policy.

The asynchronous storage process is as follows:
1) The storage policy of the target file or directory is set to LAZY_PERSIST, the memory storage policy.
2) The client process sends a create/write request for the file to the NameNode.
3) The client's request is directed to a specific DataNode. The DataNode writes the data blocks into RAM and, at the same time, starts an asynchronous service thread that persists the in-memory data to disk.
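The write path in step 3 can be illustrated with a toy model. The sketch below is not HDFS code: the class LazyPersistSketch and all of its methods are hypothetical names, a map stands in for the RAM disk, and a single background thread stands in for the DataNode's persistence thread. It only shows the idea that a write completes once the block is in memory, while the copy to disk happens later.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Toy model only, not HDFS code. All names here are hypothetical.
class LazyPersistSketch {
    // Stands in for the RAM_DISK block store.
    private final Map<String, byte[]> memory = new ConcurrentHashMap<>();
    // Stands in for the DataNode's asynchronous persistence thread.
    private final ExecutorService lazyWriter = Executors.newSingleThreadExecutor();
    private final Path diskDir;

    LazyPersistSketch(Path diskDir) {
        this.diskDir = diskDir;
    }

    // The write returns as soon as the block is in memory;
    // the copy to disk is queued and runs in the background.
    void writeBlock(String blockId, byte[] data) {
        final byte[] copy = data.clone();
        memory.put(blockId, copy);
        lazyWriter.execute(() -> {
            try {
                Files.write(diskDir.resolve(blockId), copy);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        });
    }

    byte[] readFromMemory(String blockId) {
        return memory.get(blockId);
    }

    boolean isPersisted(String blockId) {
        return Files.exists(diskDir.resolve(blockId));
    }

    // Stops accepting writes and waits for queued persistence work to finish.
    void shutdownAndWait() throws InterruptedException {
        lazyWriter.shutdown();
        lazyWriter.awaitTermination(10, TimeUnit.SECONDS);
    }

    public static void main(String[] args) throws Exception {
        Path dir = Files.createTempDirectory("lazy-persist-sketch");
        LazyPersistSketch store = new LazyPersistSketch(dir);
        store.writeBlock("blk_1", "hello".getBytes());
        System.out.println("in memory: " + (store.readFromMemory("blk_1") != null));
        store.shutdownAndWait();
        System.out.println("on disk: " + store.isPersisted("blk_1"));
    }
}
```

In this model, a crash between writeBlock and the background copy would lose the block, which is the trade-off that lazy persistence accepts in exchange for lower write latency.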

Setting the memory storage policy on a file

By default, files use the storage policy StoragePolicy.DEFAULT.
To use memory storage instead, set the policy in one of the following ways:

1. Command Line
    hdfs storagepolicies -setStoragePolicy -path <path> -policy LAZY_PERSIST
2. Program call
    // Needs org.apache.hadoop.fs.CreateFlag, org.apache.hadoop.fs.FSDataOutputStream,
    // org.apache.hadoop.fs.permission.FsPermission and java.util.EnumSet.
    FSDataOutputStream fos =
        fs.create(
            path,
            FsPermission.getFileDefault(),
            // LAZY_PERSIST asks for the file's blocks to be written to memory first
            EnumSet.of(CreateFlag.CREATE, CreateFlag.LAZY_PERSIST),
            bufferLength,
            replicationFactor,
            blockSize,
            null);

    // File creation method in DFSClient
    public DFSOutputStream create(String src, FsPermission permission,
        EnumSet<CreateFlag> flag, short replication, long blockSize,
        Progressable progress, int buffersize, ChecksumOpt checksumOpt)
        throws IOException {
      return create(src, permission, flag, true, replication, blockSize,
          progress, buffersize, checksumOpt, null);
    }

Using LAZY_PERSIST

Memory storage uses the RAM_DISK storage medium, so a virtual memory disk has to be set up before it can be used. This is normally a tmpfs filesystem. By default tmpfs is mounted at /dev/shm, and files placed in that directory are actually held in memory.
To use a different size or mount point, mount tmpfs yourself:

    sudo mount -t tmpfs -o size=16g tmpfs /mnt/dn-tmpfs/

Then add the virtual memory disk to dfs.datanode.data.dir, tagged with the RAM_DISK storage type, for example:

<property>
    <name>dfs.datanode.data.dir</name>
    <value>/grid/0,/grid/1,/grid/2,[RAM_DISK]/mnt/dn-tmpfs</value>
</property>
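To make the format of this value concrete, here is a minimal sketch of how such a comma-separated list can be split into storage type and path. It is an illustration, not the parser HDFS uses: the class DataDirSketch and its parse method are hypothetical, and the sketch assumes that an entry without a [TYPE] prefix is treated as DISK.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustration only: hypothetical helper, not part of the Hadoop API.
class DataDirSketch {
    // Matches entries of the form "[TYPE]/some/path".
    private static final Pattern TAGGED = Pattern.compile("^\\[(\\w+)\\](.+)$");

    // Returns one {storageType, path} pair per entry in the list.
    static List<String[]> parse(String value) {
        List<String[]> result = new ArrayList<>();
        for (String entry : value.split(",")) {
            String trimmed = entry.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            Matcher m = TAGGED.matcher(trimmed);
            if (m.matches()) {
                result.add(new String[] {m.group(1).toUpperCase(), m.group(2).trim()});
            } else {
                // Assumption: an untagged directory counts as ordinary disk storage.
                result.add(new String[] {"DISK", trimmed});
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String value = "/grid/0,/grid/1,/grid/2,[RAM_DISK]/mnt/dn-tmpfs";
        for (String[] location : parse(value)) {
            System.out.println(location[0] + " " + location[1]);
        }
    }
}
```

With the example value above, only /mnt/dn-tmpfs is reported as RAM_DISK, so only that directory would be a candidate for blocks written under LAZY_PERSIST.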

Two more settings are worth checking:
1. Confirm that the heterogeneous storage policy has not been turned off: the property dfs.storage.policy.enabled (true by default) must not be set to false.
2. Confirm that the maximum memory, dfs.datanode.max.locked.memory, has been set, and check whether it exceeds the maximum memory size defined for the DataNode.
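These checks can be automated with a small script. The sketch below reads the two properties from an hdfs-site.xml-style document using only the JDK's XML parser. The class LazyPersistConfigCheck and its methods are hypothetical, and the sketch assumes that an unset dfs.storage.policy.enabled means true, that an unset dfs.datanode.max.locked.memory means 0, and that the value is a plain number of bytes.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Illustration only: hypothetical helper, not part of the Hadoop API.
class LazyPersistConfigCheck {
    // Collects <property><name>..</name><value>..</value></property> pairs.
    static Map<String, String> readProperties(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        Map<String, String> props = new HashMap<>();
        NodeList nodes = doc.getElementsByTagName("property");
        for (int i = 0; i < nodes.getLength(); i++) {
            Element property = (Element) nodes.item(i);
            String name = property.getElementsByTagName("name").item(0).getTextContent().trim();
            String value = property.getElementsByTagName("value").item(0).getTextContent().trim();
            props.put(name, value);
        }
        return props;
    }

    // True when storage policies are enabled and a locked-memory limit is configured.
    static boolean looksReady(Map<String, String> props) {
        // Assumed defaults: policy enabled, no locked memory.
        boolean policyEnabled =
            Boolean.parseBoolean(props.getOrDefault("dfs.storage.policy.enabled", "true"));
        // Assumption: the value is a plain byte count without a unit suffix.
        long maxLockedBytes =
            Long.parseLong(props.getOrDefault("dfs.datanode.max.locked.memory", "0"));
        return policyEnabled && maxLockedBytes > 0;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<configuration>"
            + "<property><name>dfs.datanode.max.locked.memory</name><value>1073741824</value></property>"
            + "</configuration>";
        System.out.println("ready: " + looksReady(readProperties(xml)));
    }
}
```

The sketch does not compare the configured value against the operating system's locked-memory limit for the DataNode user. That comparison still has to be made on the host itself.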

Origin blog.csdn.net/SW_LCC/article/details/104043829