Hadoop 3.0 disk balancer (diskbalancer) function and usage introduction

In HDFS, a DataNode stores data blocks in local file system directories, which are specified by the dfs.datanode.data.dir parameter in hdfs-site.xml. In a typical installation, multiple directories are configured, each mapped to a different device, such as HDDs (Hard Disk Drives) and SSDs (Solid State Drives).

When we write a new data block to HDFS, the DataNode uses a volume-choosing policy to select a disk for the block. Hadoop currently supports two such policies: round-robin and available space (see HDFS-1804 for details), selectable via the dfs.datanode.fsdataset.volume.choosing.policy parameter.

The round-robin policy distributes new blocks evenly across the available disks, while the available-space policy preferentially writes new data to the disk with the most available space (measured as a percentage of capacity). As shown in the figure below:
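To make the difference concrete, here is a small Python sketch of the two policies. This is an illustrative model only, not Hadoop's actual code; the real implementations are the DataNode's RoundRobinVolumeChoosingPolicy and AvailableSpaceVolumeChoosingPolicy classes, and the real available-space policy is threshold- and probability-based rather than always picking the emptiest disk.

```python
class Volume:
    def __init__(self, name, capacity, used):
        self.name, self.capacity, self.used = name, capacity, used

    @property
    def available(self):
        return self.capacity - self.used


class RoundRobinPolicy:
    """Cycle through the volumes in order, ignoring how full each one is."""
    def __init__(self):
        self._next = 0

    def choose(self, volumes):
        v = volumes[self._next % len(volumes)]
        self._next += 1
        return v


class AvailableSpacePolicy:
    """Simplified: always pick the volume with the largest free-space ratio."""
    def choose(self, volumes):
        return max(volumes, key=lambda v: v.available / v.capacity)


# Two volumes mirroring the example later in this article (sizes in GB).
vols = [Volume("/mnt/disk1", 5.8, 3.6), Volume("/mnt/disk2", 5.8, 0.013)]

rr = RoundRobinPolicy()
print(rr.choose(vols).name)                      # /mnt/disk1
print(rr.choose(vols).name)                      # /mnt/disk2
print(AvailableSpacePolicy().choose(vols).name)  # /mnt/disk2
```

Note how round-robin alternates regardless of fullness, while the available-space model keeps steering writes to the nearly empty disk.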

[Figure: round-robin vs. available-space volume-choosing policies]

By default, the DataNode writes new data blocks using the round-robin policy. In a long-running cluster, however, large-scale file deletions or the addition of new disks to a DataNode can still leave data distributed very unevenly across the disks of a single DataNode. Even with the available-space policy, imbalanced volumes can cause inefficient disk I/O: every newly written block goes to the newly added disk while the other disks sit idle, making the new disk the bottleneck of the whole node.

Recently, the Apache Hadoop community developed several offline scripts (see HDFS-1312 or hadoop-balancer) to alleviate this data imbalance. However, these scripts live outside the HDFS codebase, and the DataNode must be shut down before they can move data between disks. As a result, HDFS-1312 also introduced an online disk balancer, designed to rebalance disk data on a running DataNode based on various metrics. Like the existing HDFS balancer, the HDFS disk balancer runs as a thread inside the DataNode and moves data between volumes of the same storage type. Note the distinction: the disk balancer introduced in this article moves data between different disks within the same DataNode, whereas the traditional HDFS balancer moves data between different DataNodes.

In the following article, I will introduce how to use this new feature.
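For reference, the configuration parameters involved can be collected in one minimal hdfs-site.xml fragment. The directory paths below are illustrative, and the volume-choosing policy property is optional (the default is round-robin):

```xml
<configuration>
  <!-- Directories (typically one per physical disk) where the DataNode stores blocks -->
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/mnt/disk1/hdfs/data,/mnt/disk2/hdfs/data</value>
  </property>
  <!-- Optional: switch from the default round-robin policy to available space -->
  <property>
    <name>dfs.datanode.fsdataset.volume.choosing.policy</name>
    <value>org.apache.hadoop.hdfs.server.datanode.fsdataset.AvailableSpaceVolumeChoosingPolicy</value>
  </property>
  <!-- Must be true for the intra-DataNode disk balancer to run -->
  <property>
    <name>dfs.disk.balancer.enabled</name>
    <value>true</value>
  </property>
</configuration>
```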

Let's walk through this feature step by step with an example. First, make sure the dfs.disk.balancer.enabled parameter is set to true on all DataNodes. In this example, our DataNode already has one disk mounted (/mnt/disk1), and we now mount a new disk (/mnt/disk2) on it. We use the df command to display the disk usage:


# df -h
….
/var/disk1      5.8G  3.6G  1.9G  66% /mnt/disk1
/var/disk2      5.8G   13M  5.5G   1% /mnt/disk2

As the output above shows, the usage of the two disks is very uneven, so let's balance the data between them.

A typical disk balancer task involves three steps (via the hdfs diskbalancer command): plan, execute, and query. In the first step, the HDFS client reads the necessary information about the specified DataNode from the NameNode and generates an execution plan:


# hdfs diskbalancer -plan lei-dn-3.example.org
16/08/19 18:04:01 INFO planner.GreedyPlanner: Starting plan for Node : lei-dn-3.example.org:20001
16/08/19 18:04:01 INFO planner.GreedyPlanner: Disk Volume set 03922eb1-63af-4a16-bafe-fde772aee2fa Type : DISK plan completed.
16/08/19 18:04:01 INFO planner.GreedyPlanner: Compute Plan for Node : lei-dn-3.example.org:20001 took 5 ms
16/08/19 18:04:01 INFO command.Command: Writing plan to : /system/diskbalancer/2016-Aug-19-18-04-01

As the output shows, the HDFS disk balancer takes the disk-usage information that the DataNode reports to the NameNode and, together with a planner, computes the steps of a data-movement plan for the specified DataNode. Each step specifies the source volume, the target volume, and the amount of data expected to be moved.

As of this writing, HDFS supports only the GreedyPlanner, which keeps moving data from the most used device to the least used device until the data is evenly distributed across all devices. The user can also specify a space-utilization threshold with the plan command: if the difference in space utilization between disks falls below this threshold, the planner considers the disks balanced. We can also limit the disk I/O consumed during data movement with the --bandwidth parameter.
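The greedy idea is easy to sketch. The toy planner below is illustrative only (the real GreedyPlanner operates on volume sets reported by the DataNode and works in block-sized moves); it repeatedly pairs the fullest and emptiest volumes and moves each toward the node's average utilization until they are within the threshold:

```python
def greedy_plan(volumes, threshold=0.10):
    """Return (src, dst, GB) steps that move data from the fullest volume to
    the emptiest one until their utilizations differ by at most `threshold`.
    `volumes` maps name -> {"cap": GB, "used": GB} and is updated in place."""
    def util(name):
        return volumes[name]["used"] / volumes[name]["cap"]

    steps = []
    while True:
        fullest = max(volumes, key=util)
        emptiest = min(volumes, key=util)
        if util(fullest) - util(emptiest) <= threshold:
            return steps
        # Node-wide average utilization; move both volumes toward it.
        avg = sum(v["used"] for v in volumes.values()) / sum(v["cap"] for v in volumes.values())
        surplus = volumes[fullest]["used"] - avg * volumes[fullest]["cap"]
        deficit = avg * volumes[emptiest]["cap"] - volumes[emptiest]["used"]
        move = min(surplus, deficit)
        volumes[fullest]["used"] -= move
        volumes[emptiest]["used"] += move
        steps.append((fullest, emptiest, move))


# Mirrors the df output above: disk1 ~62% used, disk2 nearly empty (GB).
vols = {"/mnt/disk1": {"cap": 5.8, "used": 3.6},
        "/mnt/disk2": {"cap": 5.8, "used": 0.013}}
plan = greedy_plan(vols)
for src, dst, moved in plan:
    print(f"move {moved:.2f} GB from {src} to {dst}")
```

For this two-disk case a single step of roughly 1.8 GB brings both disks to about 31% utilization, well within the default 10% threshold.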

The execution plan generated by the disk balancer is a JSON file stored on HDFS; by default, these files are placed under the /system/diskbalancer directory:


# hdfs dfs -ls /system/diskbalancer/2016-Aug-19-18-04-01
Found 2 items
-rw-r--r--   3 hdfs supergroup       1955 2016-08-19 18:04 /system/diskbalancer/2016-Aug-19-18-04-01/lei-dn-3.example.org.before.json
-rw-r--r--   3 hdfs supergroup        908 2016-08-19 18:04 /system/diskbalancer/2016-Aug-19-18-04-01/lei-dn-3.example.org.plan.json

The generated plan can be executed on the DataNode with the following command:


$ hdfs diskbalancer -execute /system/diskbalancer/2016-Aug-17-17-03-56/172.26.10.16.plan.json
16/08/17 17:22:08 INFO command.Command: Executing "execute plan" command

This command submits the JSON plan to the DataNode, which starts a thread called BlockMover to execute it. We can use the query command to check the status of the diskbalancer task on the DataNode:


# hdfs diskbalancer -query lei-dn-3:20001
16/08/19 21:08:04 INFO command.Command: Executing "query plan" command.
Plan File: /system/diskbalancer/2016-Aug-19-18-04-01/lei-dn-3.example.org.plan.json
Plan ID: ff735b410579b2bbe15352a14bf001396f22344f7ed5fe24481ac133ce6de65fe5d721e223b08a861245be033a82469d2ce943aac84d9a111b542e6c63b40e75
Result: PLAN_DONE

The PLAN_DONE in the output above indicates that the disk-balancing task has completed. To verify the effect of the disk balancer, we can use df -h again to view the space usage of each disk:


# df -h
Filesystem      Size  Used Avail Use% Mounted on
….
/var/disk1      5.8G  2.1G  3.5G  37% /mnt/disk1
/var/disk2      5.8G  1.6G  4.0G  29% /mnt/disk2

The results above show that the disk balancer reduced the usage difference between /var/disk1 and /var/disk2 to below 10%, which means the task is complete!
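As a quick sanity check on that 10% figure, we can compute the utilization gap directly from the df numbers:

```python
# Utilization gap (as a fraction of the 5.8 GB capacity), from the df -h output.
before = (3.6 - 0.013) / 5.8   # gap before balancing
after = (2.1 - 1.6) / 5.8      # gap after balancing
print(f"gap before: {before:.1%}, gap after: {after:.1%}")
# gap before: 61.8%, gap after: 8.6%
```

The gap dropped from roughly 62 percentage points to under 9, which is indeed below the planner's default 10% threshold.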

Original text: https://blog.cloudera.com/how-to-use-the-new-hdfs-intra-datanode-disk-balancer-in-apache-hadoop/

Origin blog.51cto.com/15127589/2677861