Hadoop Safe Mode Illustrated

When the NameNode starts (except on the first start after formatting, when the filesystem has no blocks yet), it runs in safe mode, which means that it offers only a read-only view of the filesystem to clients. Strictly speaking, in safe mode only filesystem operations that access the filesystem metadata are guaranteed to work. Reading a file works only if its blocks are available on the current set of DataNodes in the cluster, and file modifications always fail.

The locations of blocks are not persisted by the NameNode; this information resides with the DataNodes, each of which holds a list of the blocks it stores. During normal operation of the system, the NameNode keeps a map of block locations in memory. Safe mode is needed to give the DataNodes time to check in with the NameNode and report their block lists, so that the NameNode learns enough block locations to run effectively. If the NameNode didn't wait for enough DataNodes to check in, it would start replicating blocks to new DataNodes, which would be unnecessary in most cases (because it only needed to wait for the remaining DataNodes to check in) and would put a great strain on the cluster's resources. Indeed, while in safe mode, the NameNode does not issue any block-replication or deletion instructions to DataNodes.

Safe mode is exited when the minimal replication condition is reached, plus an extension time of 30 seconds. The minimal replication condition is met when 99.9% of the blocks in the whole filesystem meet their minimum replication level (set by dfs.replication.min, which defaults to 1).
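
To make the exit condition concrete, here is a minimal sketch of the threshold arithmetic. The function name threshold_reached and its arguments are hypothetical, and this is only an illustration of the rule described above, not the NameNode's actual implementation. Treating an empty filesystem as "nothing to wait for" is also an assumption of the sketch.

```shell
# Hypothetical helper (not part of Hadoop): succeeds when the fraction of
# blocks that meet their minimum replication level reaches the threshold.
# Arguments: <safe_blocks> <total_blocks> [threshold, default 0.999]
threshold_reached() {
    awk -v safe="$1" -v total="$2" -v pct="${3:-0.999}" 'BEGIN {
        # Assumption: with no blocks at all there is nothing to wait for.
        if (total == 0) exit 0
        # Exit status 0 means the minimal replication condition is met.
        exit ((safe / total >= pct) ? 0 : 1)
    }'
}
```

With the default threshold, a filesystem of 10,000 blocks needs about 9,990 of them to be reported at minimum replication; the 30-second extension is counted only after that point.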

| Property name | Type | Default value | Description |
| --- | --- | --- | --- |
| dfs.replication.min | int | 1 | The minimum number of replicas that have to be written for a write to be successful. |
| dfs.safemode.threshold.pct | float | 0.999 | The proportion of blocks in the system that must meet the minimum replication level defined by dfs.replication.min before the NameNode will leave safe mode. Setting this value to 0 or less forces the NameNode not to start in safe mode. Setting this value to more than 1 means the NameNode never leaves safe mode. |
| dfs.safemode.extension | int | 30000 | The time, in milliseconds, to extend safe mode after the minimal replication condition defined by dfs.safemode.threshold.pct has been satisfied. For small clusters, it can be set to 0. |

To force Hadoop to leave safe mode, use:

$ hadoop dfsadmin -safemode leave
Safe mode is OFF

To force Hadoop to enter safe mode, use:

$ hadoop dfsadmin -safemode enter
Safe mode is ON

To see whether the NameNode is in safe mode, check the NameNode web UI or use:

$ hadoop dfsadmin -safemode get
Safe mode is ON
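
For scripts that must not write to HDFS until safe mode is off, the status check above can be wrapped in a small helper. The following is a rough sketch under stated assumptions: safe_mode_is_off, wait_until_writable, and the DFSADMIN variable are hypothetical names, and the match relies on the "Safe mode is OFF" output line shown earlier. dfsadmin also accepts -safemode wait, which blocks until safe mode is exited and may be simpler when no timeout is needed.

```shell
# Command used to query the NameNode. The variable name is an assumption;
# override it on newer releases, e.g. DFSADMIN="hdfs dfsadmin".
DFSADMIN="${DFSADMIN:-hadoop dfsadmin}"

# Succeeds (exit status 0) only when the NameNode explicitly reports OFF.
# A failed or unreachable command therefore counts as "not yet writable".
safe_mode_is_off() {
    $DFSADMIN -safemode get 2>/dev/null | grep -q "Safe mode is OFF"
}

# Polls until safe mode is OFF, giving up after max_attempts checks.
# Arguments: [max_attempts, default 30] [seconds between checks, default 10]
wait_until_writable() {
    max_attempts="${1:-30}"
    interval="${2:-10}"
    attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        if safe_mode_is_off; then
            return 0
        fi
        sleep "$interval"
        attempt=$((attempt + 1))
    done
    echo "NameNode still in safe mode after $max_attempts checks" >&2
    return 1
}
```

A job script might call wait_until_writable 60 5 before its first write and abort if the call returns a non-zero status.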

Reposted from puffsun.iteye.com/blog/1901048