The DataNode Mechanism in HDFS

1. DataNode Working Mechanism

1) A data block on a DataNode is stored as two files on disk: one holding the data itself and one holding the metadata (the length of the block data, the block checksum, and a timestamp).

2) After starting, a DataNode registers with the NameNode and then periodically (every hour by default) reports all of its block information to the NameNode.

3) The DataNode sends a heartbeat every 3 seconds; the heartbeat response carries commands from the NameNode to the DataNode, such as copying a block to another machine or deleting a block. If the NameNode does not receive a heartbeat from a DataNode for more than 10 minutes, that node is considered unavailable.

4) Machines can safely join and leave the cluster while it is running.
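The register/report/heartbeat exchange above can be sketched as a toy simulation. All class and method names here are hypothetical for illustration, not actual Hadoop APIs:

```python
import time

HEARTBEAT_INTERVAL_S = 3   # dfs.heartbeat.interval default
DEAD_TIMEOUT_S = 630       # 10 min 30 s before a node is declared dead

class NameNodeSim:
    """Toy NameNode: tracks each DataNode's last heartbeat and queues commands."""
    def __init__(self):
        self.last_heartbeat = {}    # datanode id -> timestamp of last heartbeat
        self.pending_commands = {}  # datanode id -> [(command, block_id), ...]

    def register(self, dn_id):
        # Step 2: a DataNode registers after starting up.
        self.pending_commands.setdefault(dn_id, [])
        self.last_heartbeat[dn_id] = time.time()

    def block_report(self, dn_id, block_ids):
        # A real NameNode would rebuild its block map from this report.
        pass

    def heartbeat(self, dn_id, now=None):
        # Step 3: record the heartbeat and return any queued commands.
        self.last_heartbeat[dn_id] = now if now is not None else time.time()
        commands, self.pending_commands[dn_id] = self.pending_commands[dn_id], []
        return commands

    def is_alive(self, dn_id, now=None):
        # A node is considered dead once no heartbeat arrives within the timeout.
        now = now if now is not None else time.time()
        return now - self.last_heartbeat[dn_id] <= DEAD_TIMEOUT_S

class DataNodeSim:
    """Toy DataNode: registers, sends a block report, then heartbeats."""
    def __init__(self, dn_id, namenode):
        self.dn_id = dn_id
        self.nn = namenode
        self.blocks = {}  # block_id -> block data

    def start(self):
        self.nn.register(self.dn_id)
        self.nn.block_report(self.dn_id, list(self.blocks))

    def heartbeat_once(self):
        # Execute commands carried back by the heartbeat response.
        for command, block_id in self.nn.heartbeat(self.dn_id):
            if command == "delete":
                self.blocks.pop(block_id, None)
```

For example, after the NameNode queues a `("delete", "blk_1")` command, the next heartbeat from the DataNode picks it up and removes the block.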

2. Data Integrity

1) When a DataNode reads a block, it computes the checksum.

2) If the computed checksum differs from the value recorded when the block was created, the block has been damaged.

3) The client then reads the block from another DataNode.

4) The DataNode periodically verifies the checksums of its files after they are created.
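The verify-on-read logic can be sketched in a few lines of Python. This is a simplified illustration using plain CRC32 over the whole block; the function names are hypothetical (HDFS actually checksums fixed-size chunks within each block):

```python
import zlib

def write_block(data: bytes):
    """Store a block together with its checksum, as the .meta file does on disk."""
    return data, zlib.crc32(data)

def read_block(data: bytes, stored_checksum: int) -> bytes:
    """Recompute the checksum on read; a mismatch means the block is corrupt."""
    if zlib.crc32(data) != stored_checksum:
        raise IOError("block corrupted: checksum mismatch")
    return data
```

On a mismatch, a real client would report the bad replica and retry the read against another DataNode holding the same block.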

3. DataNode Offline Timeout Parameters

If the DataNode process dies or a network failure prevents the DataNode from communicating with the NameNode, the NameNode does not immediately declare the node dead; it waits for a period of time called the timeout. The default HDFS timeout is 10 minutes + 30 seconds. The timeout duration is calculated as:

timeout = 2 * dfs.namenode.heartbeat.recheck-interval + 10 * dfs.heartbeat.interval

The default value of dfs.namenode.heartbeat.recheck-interval is 5 minutes, and dfs.heartbeat.interval defaults to 3 seconds.
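Plugging the defaults into the formula confirms the 10-minute-30-second figure (converting everything to seconds first):

```python
# Defaults from hdfs-site.xml, converted to seconds.
# Note the differing units: recheck-interval is in ms, heartbeat interval in s.
recheck_interval_s = 300000 / 1000   # dfs.namenode.heartbeat.recheck-interval
heartbeat_interval_s = 3             # dfs.heartbeat.interval

timeout_s = 2 * recheck_interval_s + 10 * heartbeat_interval_s
print(timeout_s)  # 630.0 seconds = 10 minutes 30 seconds
```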

Note that in the hdfs-site.xml configuration file, dfs.namenode.heartbeat.recheck-interval is in milliseconds, while dfs.heartbeat.interval is in seconds.

<property>
    <name>dfs.namenode.heartbeat.recheck-interval</name>
    <value>300000</value>
</property>
<property>
    <name>dfs.heartbeat.interval</name>
    <value>3</value>
</property>

 



Origin www.cnblogs.com/MWCloud/p/11237222.html