HDFS Web UI parameter analysis

Background

        The HDFS management web UI in Hadoop has a Summary box that shows the current state of HDFS, which is very helpful for HDFS capacity management. The following is an analysis of each field it displays.

 

CLI output

        The same data can also be viewed with the hdfs dfsadmin -report command, which additionally prints detailed data for each individual node.
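        For scripting against this output, the summary fields can be pulled out of the report text. The following is a minimal Python sketch, not an official API: it assumes the summary lines have the shape "Key: <bytes> (<human-readable>)" and that per-node entries begin with a "Name:" line. SAMPLE_REPORT and parse_summary are hypothetical names, and the numbers are invented for illustration.

```python
import re

# Hypothetical sample in the "Key: <bytes> (<human-readable>)" shape assumed
# for the cluster summary of `hdfs dfsadmin -report`; the numbers are invented.
SAMPLE_REPORT = """\
Configured Capacity: 1000000 (976.56 KB)
Present Capacity: 900000 (878.91 KB)
DFS Remaining: 600000 (585.94 KB)
DFS Used: 300000 (292.97 KB)
DFS Used%: 33.33%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
"""

# Matches only byte-valued lines; percentage and block-count lines are skipped.
SUMMARY_LINE = re.compile(r"^(?P<key>[^:]+):\s+(?P<bytes>\d+)\s+\(")


def parse_summary(report_text):
    """Return {field name: size in bytes} for the cluster-wide summary."""
    summary = {}
    for line in report_text.splitlines():
        stripped = line.strip()
        # Assumption: per-node sections start with "Name:"; stop there so
        # node-level values do not overwrite the cluster-wide ones.
        if stripped.startswith("Name:"):
            break
        match = SUMMARY_LINE.match(stripped)
        if match:
            summary[match.group("key").strip()] = int(match.group("bytes"))
    return summary


if __name__ == "__main__":
    for key, value in parse_summary(SAMPLE_REPORT).items():
        print(f"{key}: {value} bytes")
```

        On a real cluster, the text could come from subprocess.run(["hdfs", "dfsadmin", "-report"], capture_output=True, text=True).stdout, provided the hdfs client is on the PATH.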

 

        It may be more intuitive to look at the following figure:

        

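Configured Capacity: the total capacity of HDFS, calculated as Configured Capacity = Total Disk Space - Reserved Space. Reserved Space is space set aside for OS-level operations; it is configured in hdfs-site.xml through dfs.datanode.du.reserved (in bytes per volume; for example, dfs.datanode.du.reserved=0 reserves nothing).
Present Capacity: the total capacity actually available for storing files once the space taken by metadata and open blocks (the Non DFS Used space) has been set aside. When a DataNode sends its report to the NameNode, it reports its own Present Capacity. Present Capacity = DFS Used + DFS Remaining
DFS Remaining: the total raw capacity still available on HDFS. Divide it by the replication factor to get the amount of data you can actually still store.
DFS Used: the capacity already used by HDFS. This value is counted after replication, so with dfs.replication=3 the real size of your data is DFS Used / 3.
DFS Used%: the capacity used by HDFS, shown as a percentage.
Non DFS Used: space on the DataNodes taken by data that is not under the configured dfs.datanode.data.dir directories, that is, how much of Configured Capacity is occupied by non-DFS use. Non DFS Used = Configured Capacity - DFS Remaining - DFS Used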
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0

Data check:
Present Capacity = Sum of [ DFS Used + DFS Remaining ] for all the Data Nodes
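
        The formulas above can be tied together in a short Python sketch. It is only an illustration under stated assumptions: the per-node dictionaries, their key names, and the numbers are hypothetical, and capacity_summary is not part of any Hadoop API.

```python
def capacity_summary(nodes, replication=3):
    """Aggregate per-DataNode figures using the formulas in this article.

    `nodes` is a list of dicts with hypothetical keys:
    disk_total, reserved, dfs_used, dfs_remaining (all in the same unit).
    """
    # Configured Capacity = Total Disk Space - Reserved Space
    configured = sum(n["disk_total"] - n["reserved"] for n in nodes)
    dfs_used = sum(n["dfs_used"] for n in nodes)
    dfs_remaining = sum(n["dfs_remaining"] for n in nodes)
    # Present Capacity = sum of [DFS Used + DFS Remaining] over all DataNodes
    present = dfs_used + dfs_remaining
    # Non DFS Used = Configured Capacity - DFS Remaining - DFS Used
    non_dfs_used = configured - dfs_remaining - dfs_used
    return {
        "configured_capacity": configured,
        "present_capacity": present,
        "dfs_used": dfs_used,
        "dfs_remaining": dfs_remaining,
        "non_dfs_used": non_dfs_used,
        # Raw figures count every replica, so divide by the replication factor
        "data_stored": dfs_used / replication,
        "data_still_storable": dfs_remaining / replication,
    }


if __name__ == "__main__":
    # Two invented DataNodes, sizes in GB
    example_nodes = [
        {"disk_total": 1000, "reserved": 100, "dfs_used": 300, "dfs_remaining": 450},
        {"disk_total": 1000, "reserved": 100, "dfs_used": 300, "dfs_remaining": 450},
    ]
    for name, value in capacity_summary(example_nodes).items():
        print(f"{name}: {value}")
```

        With these invented numbers, the raw DFS Used of 600 GB corresponds to 200 GB of actual data at a replication factor of 3.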

 

Legend

    ps: HDP connection

 


Origin http://43.154.161.224:23101/article/api/json?id=324956913&siteId=291194637