Metrics collection endpoint (using the HDFS of the HBase39 cluster as an example): http://xxxxxx:50070/jmx?qry=Hadoop:service=NameNode,name=*
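As a rough illustration of how this endpoint could be polled, the Python sketch below builds the query URL and indexes the returned beans by their short name. It assumes the servlet responds with JSON containing a top-level `beans` array whose entries carry a `name` field, which is the usual shape of Hadoop's `/jmx` output. The helper names `build_jmx_url`, `index_beans`, and `fetch_beans` are hypothetical, and the host argument stands in for the masked address above.

```python
import json
from urllib.request import urlopen


def build_jmx_url(host, port=50070, query="Hadoop:service=NameNode,name=*"):
    """Build the JMX servlet URL for a NameNode (host is a placeholder)."""
    return f"http://{host}:{port}/jmx?qry={query}"


def index_beans(payload):
    """Map each bean's short name (the part after ',name=') to its attributes."""
    beans = {}
    for bean in payload.get("beans", []):
        full_name = bean.get("name", "")
        short_name = full_name.rsplit(",name=", 1)[-1]
        beans[short_name] = bean
    return beans


def fetch_beans(host, port=50070, timeout=5):
    """Fetch and index all NameNode beans; needs network access to the NameNode."""
    with urlopen(build_jmx_url(host, port), timeout=timeout) as resp:
        return index_beans(json.load(resp))
```

With this in place, `fetch_beans(host)["FSNamesystem"]` would return the attribute dictionary for the first bean described below, provided the response has the assumed shape.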
I. NameNode file system details (core metrics)
Hadoop:service=NameNode,name=FSNamesystem
Hadoop:service=NameNode,name=FSNamesystemState
Metric | Type (GAUGE, COUNTER) | Unit | Description | Notes |
---|---|---|---|---|
MissingBlocks | GAUGE | | Current number of missing blocks | |
ExpiredHeartbeats | GAUGE | | Total number of expired heartbeats | |
TransactionsSinceLastCheckpoint | GAUGE | | Total number of transactions since last checkpoint | |
TransactionsSinceLastLogRoll | GAUGE | | Total number of transactions since last edit log roll | |
LastCheckpointTime | GAUGE | ms | Time in milliseconds since epoch of last checkpoint | |
CapacityTotal | GAUGE | Byte | Current raw capacity of DataNodes in bytes | |
CapacityUsed | GAUGE | Byte | Current used capacity across all DataNodes in bytes | |
CapacityRemaining | GAUGE | Byte | Current remaining capacity in bytes | |
TotalLoad | GAUGE | | Current number of connections | |
SnapshottableDirectories | GAUGE | | Current number of snapshottable directories | |
Snapshots | GAUGE | | Current number of snapshots | |
BlocksTotal | GAUGE | | Number of blocks | |
FilesTotal | GAUGE | | Number of files | |
NumLiveDataNodes | GAUGE | | Number of live DataNodes | |
NumDeadDataNodes | GAUGE | | Number of dead DataNodes | |
NumDecomLiveDataNodes | GAUGE | | Number of live DataNodes in the Decommission state | |
NumDecomDeadDataNodes | GAUGE | | Number of dead DataNodes in the Decommission state | |
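The gauges above lend themselves to simple point-in-time checks. The following sketch, with hypothetical function names and an illustrative 80% threshold that is not taken from this document, computes the used-capacity ratio and flags the conditions most likely to need attention. It assumes `metrics` is a plain dictionary holding the attributes of both beans merged together.

```python
def capacity_used_ratio(metrics):
    """Fraction of raw capacity in use, or None when the total is unknown."""
    total = metrics.get("CapacityTotal", 0)
    if not total:
        return None
    return metrics.get("CapacityUsed", 0) / total


def health_warnings(metrics, max_used_ratio=0.8):
    """Return a list of warning strings; the threshold is illustrative only."""
    warnings = []
    if metrics.get("MissingBlocks", 0) > 0:
        warnings.append(f"MissingBlocks={metrics['MissingBlocks']}")
    if metrics.get("NumDeadDataNodes", 0) > 0:
        warnings.append(f"NumDeadDataNodes={metrics['NumDeadDataNodes']}")
    ratio = capacity_used_ratio(metrics)
    if ratio is not None and ratio > max_used_ratio:
        warnings.append(f"capacity used {ratio:.0%}")
    return warnings
```

Because these are gauges, each sample can be evaluated on its own without keeping the previous one.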
II. NameNode JvmMetrics details (core metrics)
Hadoop:service=NameNode,name=JvmMetrics
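Metric | Type (GAUGE, COUNTER) | Unit | Description | Notes |
---|---|---|---|---|
GcCountParNew | COUNTER | | Number of young-generation (ParNew) GCs | |
GcTimeMillisParNew | COUNTER | ms | Time spent in young-generation (ParNew) GC | |
GcCountConcurrentMarkSweep | COUNTER | | Number of old-generation (ConcurrentMarkSweep) GCs | |
GcTimeMillisConcurrentMarkSweep | COUNTER | ms | Time spent in old-generation (ConcurrentMarkSweep) GC | |
GcCount | COUNTER | | Total number of GCs | |
GcTimeMillis | COUNTER | ms | Total time spent in GC | |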
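Since the GC metrics are cumulative counters, a single reading says little; the useful signal is the change between two samples. The sketch below is one possible way to turn two `JvmMetrics` samples into a GC overhead fraction. The function name and the restart handling (treating a decreasing counter as a fresh start from zero) are assumptions made for illustration.

```python
def gc_overhead(prev, curr, interval_ms):
    """Fraction of wall-clock time spent in GC between two samples."""
    if interval_ms <= 0:
        raise ValueError("interval_ms must be positive")
    delta = curr["GcTimeMillis"] - prev["GcTimeMillis"]
    if delta < 0:
        # Counter went backwards: assume the JVM restarted and counted from zero.
        delta = curr["GcTimeMillis"]
    return delta / interval_ms
```

For example, 600 ms of additional GC time over a 60-second sampling interval corresponds to an overhead of 1%.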
III. NameNode operation details (core metrics)
Hadoop:service=NameNode,name=NameNodeActivity
Metric | Type (GAUGE, COUNTER) | Unit | Description | Notes |
---|---|---|---|---|
CreateFileOps | COUNTER | | Total number of files created | |
FilesCreated | COUNTER | | Total number of files and directories created by create or mkdir operations | |
FilesAppended | COUNTER | | Total number of files appended | |
GetBlockLocations | COUNTER | | Total number of getBlockLocations operations | |
FilesRenamed | COUNTER | | Total number of rename operations (NOT number of files/dirs renamed) | |
GetListingOps | COUNTER | | Total number of directory listing operations | |
DeleteFileOps | COUNTER | | Total number of delete operations | |
FilesDeleted | COUNTER | | Total number of files and directories deleted by delete or rename operations | |
FileInfoOps | COUNTER | | Total number of getFileInfo and getLinkFileInfo operations | |
AddBlockOps | COUNTER | | Total number of successful addBlock operations | |
GetAdditionalDatanodeOps | COUNTER | | Total number of getAdditionalDatanode operations | |
CreateSymlinkOps | COUNTER | | Total number of createSymlink operations | |
GetLinkTargetOps | COUNTER | | Total number of getLinkTarget operations | |
FilesInGetListingOps | COUNTER | | Total number of files and directories listed by directory listing operations | |
AllowSnapshotOps | COUNTER | | Total number of allowSnapshot operations | |
DisallowSnapshotOps | COUNTER | | Total number of disallowSnapshot operations | |
CreateSnapshotOps | COUNTER | | Total number of createSnapshot operations | |
DeleteSnapshotOps | COUNTER | | Total number of deleteSnapshot operations | |
RenameSnapshotOps | COUNTER | | Total number of renameSnapshot operations | |
ListSnapshottableDirOps | COUNTER | | Total number of snapshottableDirectoryStatus operations | |
SnapshotDiffReportOps | COUNTER | | Total number of getSnapshotDiffReport operations | |
TransactionsNumOps | COUNTER | | Total number of Journal transactions | |
TransactionsAvgTime | GAUGE | ms | Average time of Journal transactions in milliseconds | |
SyncsNumOps | COUNTER | | Total number of Journal syncs | |
SyncsAvgTime | GAUGE | ms | Average time of Journal syncs in milliseconds | |
TransactionsBatchedInSync | COUNTER | | Total number of Journal transactions batched in sync | |
BlockReportNumOps | COUNTER | | Total number of block reports from DataNodes processed | |
BlockReportAvgTime | GAUGE | ms | Average time of processing block reports in milliseconds | |
CacheReportNumOps | COUNTER | | Total number of cache reports from DataNodes processed | |
CacheReportAvgTime | GAUGE | ms | Average time of processing cache reports in milliseconds | |
SafeModeTime | GAUGE | ms | Time in milliseconds between FSNamesystem startup and the last exit from safe mode | |
FsImageLoadTime | GAUGE | ms | Time loading FS Image at startup in milliseconds | |
GetEditNumOps | COUNTER | | Total number of edits downloads from SecondaryNameNode | |
GetEditAvgTime | GAUGE | ms | Average edits download time in milliseconds | |
GetImageNumOps | COUNTER | | Total number of fsimage downloads from SecondaryNameNode | |
GetImageAvgTime | GAUGE | ms | Average fsimage download time in milliseconds | |
PutImageNumOps | COUNTER | | Total number of fsimage uploads to SecondaryNameNode | |
PutImageAvgTime | GAUGE | ms | Average fsimage upload time in milliseconds | |
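The COUNTER rows in this table only ever increase while the NameNode is running, so dashboards typically plot their rate of change rather than the raw value. Below is a minimal sketch of that conversion for several counters at once. `counter_rates` is a hypothetical helper, and treating a decreasing value as a restart from zero is an assumption rather than documented NameNode behavior.

```python
def counter_rates(prev, curr, elapsed_s, names):
    """Per-second rates for the given COUNTER metrics between two samples."""
    if elapsed_s <= 0:
        raise ValueError("elapsed_s must be positive")
    rates = {}
    for name in names:
        if name not in prev or name not in curr:
            continue  # skip metrics missing from either sample
        delta = curr[name] - prev[name]
        if delta < 0:
            # Assume the process restarted and the counter began again at zero.
            delta = curr[name]
        rates[name] = delta / elapsed_s
    return rates
```

The GAUGE rows (for example the `*AvgTime` metrics) should not be passed through this function; they are meant to be read as-is.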
IV. NameNode RPC details (non-core metrics, not collected for now)
Hadoop:service=NameNode,name=RpcDetailedActivityForPort*
Metric | Type (GAUGE, COUNTER) | Unit | Description | Notes |
---|---|---|---|---|
SetSafeModeNumOps | COUNTER | | | |
SetSafeModeAvgTime | GAUGE | ms | | |
GetFileInfoNumOps | COUNTER | | Total number of getFileInfo and getLinkFileInfo operations | |
GetFileInfoAvgTime | GAUGE | ms | | |
GetBlockLocationsNumOps | COUNTER | | | |
GetBlockLocationsAvgTime | GAUGE | ms | | |
GetListingNumOps | COUNTER | | | |
GetListingAvgTime | GAUGE | ms | | |
GetContentSummaryNumOps | COUNTER | | | |
GetContentSummaryAvgTime | GAUGE | ms | | |
MkdirsNumOps | COUNTER | | | |
MkdirsAvgTime | GAUGE | ms | | |
SetPermissionNumOps | COUNTER | | | |
SetPermissionAvgTime | GAUGE | ms | | |
CreateNumOps | COUNTER | | | |
CreateAvgTime | GAUGE | ms | | |
AddBlockNumOps | COUNTER | | | |
AddBlockAvgTime | GAUGE | ms | | |
GetServerDefaultsNumOps | COUNTER | | | |
GetServerDefaultsAvgTime | GAUGE | ms | | |
CompleteNumOps | COUNTER | | | |
CompleteAvgTime | GAUGE | ms | | |
DeleteNumOps | COUNTER | | | |
DeleteAvgTime | GAUGE | ms | | |
AppendNumOps | COUNTER | | | |
AppendAvgTime | GAUGE | ms | | |
RenameNumOps | COUNTER | | | |
RenameAvgTime | GAUGE | ms | | |
FileNotFoundExceptionNumOps | COUNTER | | | |
FileNotFoundExceptionAvgTime | GAUGE | ms | | |
SetOwnerNumOps | COUNTER | | | |
SetOwnerAvgTime | GAUGE | ms | | |
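Every RPC method in this bean appears as a `<Method>NumOps` / `<Method>AvgTime` pair, and the bean name ends with the RPC port. If these metrics are collected later, a sketch like the one below could group the pairs per method and extract the port from the bean name. The helper names and output field names are hypothetical, and the port `8020` used in the usage note is only an example value.

```python
import re

# Matches the trailing port number in names such as "...RpcDetailedActivityForPort<port>".
PORT_BEAN = re.compile(r"name=RpcDetailedActivityForPort(\d+)$")


def rpc_port(bean_name):
    """Return the port encoded in the bean name, or None if it does not match."""
    match = PORT_BEAN.search(bean_name)
    return int(match.group(1)) if match else None


def pair_rpc_metrics(bean):
    """Group '<Method>NumOps' and '<Method>AvgTime' attributes by method name."""
    methods = {}
    for key, value in bean.items():
        for suffix, field in (("NumOps", "num_ops"), ("AvgTime", "avg_time_ms")):
            if key.endswith(suffix) and len(key) > len(suffix):
                method = key[: -len(suffix)]
                methods.setdefault(method, {})[field] = value
    return methods
```

Given a bean named `Hadoop:service=NameNode,name=RpcDetailedActivityForPort8020`, `rpc_port` returns `8020`, and `pair_rpc_metrics` yields one entry per method, such as `GetFileInfo`, holding its call count and average time.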