CM Cluster Management Considerations

1. Taking nodes offline (decommissioning):

The first problem: a large batch of missing blocks (more than 2000).

Reason: when the nodes were taken offline, 10 nodes were decommissioned in a single batch.

Solution: with 3 replicas, it is best to decommission machines 2 at a time. Also watch the progress on the NameNode web UI (the 50070 main page); only remove a node after the pending-replication count has reached 0 (a command-line check is sketched below).
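As a rough command-line alternative to the 50070 page (a sketch; the exact summary labels can vary slightly between Hadoop versions), fsck reports the same counters:

# Summarize block health across the whole namespace,
# keeping only the under-replicated / missing counters
hdfs fsck / | grep -iE "under-replicated|missing"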


Explanation:

Decommissioning a DataNode means removing it from the HDFS cluster.

The DataNode stores actual data, so that data has to be migrated to other machines when the DataNode is decommissioned. While going offline, the DataNode goes through the following operations: 1: calculate block information; 2: delete blocks; 3: copy blocks; 4: verify block information.

Operation steps
1: On the NameNode, add the hostname of the DataNode to be decommissioned to the file specified by dfs.hosts.exclude (the configuration item lives in hdfs-site.xml); that is, tell the NameNode which DataNodes are to be decommissioned.
If dfs.hosts.exclude is not found in hdfs-site.xml, manually add the following content to hdfs-site.xml, and then write the machines to be decommissioned into the file /etc/hadoop/conf/dfs.exclude.

<property> 
<name>dfs.hosts.exclude</name> 
<value>/etc/hadoop/conf/dfs.exclude</value> 
</property>
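To check whether the property is already present, a simple grep against the config file is enough (the /etc/hadoop/conf/hdfs-site.xml path is an assumption; adjust to where the cluster keeps its configuration):

# Print the dfs.hosts.exclude property and the line after it, if configured
grep -A 1 "dfs.hosts.exclude" /etc/hadoop/conf/hdfs-site.xml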

Exclude file example: pslave1 pslave2 pslave3
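A minimal sketch of populating the exclude file with those example hosts (one hostname per line is a common layout; the path matches the dfs.hosts.exclude value above):

# Append the DataNodes to be decommissioned to the exclude file
cat >> /etc/hadoop/conf/dfs.exclude <<'EOF'
pslave1
pslave2
pslave3
EOF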

2: Start the decommission with the following command:

hdfs dfsadmin -refreshNodes

After refreshing, the node shows up as Decommission In Progress on the HDFS NameNode web page.

hadoop dfsadmin -report can also be used to view the status.
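As a sketch, the per-node decommission state can be filtered out of the report from the command line; the "Name:" and "Decommission Status" labels are what recent Hadoop releases print, and may differ slightly by version:

# List each DataNode together with its current decommission state
hdfs dfsadmin -report | grep -E "^Name:|Decommission Status"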

Note: to speed up the decommission, reduce the number of blocks that have to be copied off the node.

