NameNode Custom Monitoring

Metrics collection endpoint (using the HDFS of the HBase39 cluster as an example): http://xxxxxx:50070/jmx?qry=Hadoop:service=NameNode,name=*
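As a rough illustration of how this endpoint might be polled, the sketch below builds the query URL and indexes the returned beans by name. It assumes the `/jmx` servlet answers with a JSON object holding a top-level `beans` array whose entries carry a `name` field. The function names, the host argument, and the sample payload are hypothetical and only serve the example.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen


def build_jmx_url(host, port=50070, query="Hadoop:service=NameNode,name=*"):
    """Build the /jmx query URL; host and port are placeholders to fill in."""
    # Keep ':', ',', '=' and '*' unescaped so the URL matches the form shown above.
    return f"http://{host}:{port}/jmx?qry={quote(query, safe=':,=*')}"


def index_beans(payload):
    """Map each bean's name to its attribute dict (assumed JSON layout)."""
    return {bean["name"]: bean for bean in json.loads(payload).get("beans", [])}


def fetch_beans(host, port=50070, timeout=10):
    """Fetch and index all NameNode beans; requires a reachable NameNode."""
    with urlopen(build_jmx_url(host, port), timeout=timeout) as resp:
        return index_beans(resp.read().decode("utf-8"))


# Offline example with a hand-written payload (the value is made up).
sample = '{"beans": [{"name": "Hadoop:service=NameNode,name=FSNamesystem", "MissingBlocks": 0}]}'
beans = index_beans(sample)
print(beans["Hadoop:service=NameNode,name=FSNamesystem"]["MissingBlocks"])
```

Indexing by bean name keeps later lookups simple, since each section below refers to its metrics by bean.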


1. NameNode file system details (core metrics)

Hadoop:service=NameNode,name=FSNamesystem
Hadoop:service=NameNode,name=FSNamesystemState
| Metric | Type (GAUGE, COUNTER) | Unit | Meaning | Notes |
| --- | --- | --- | --- | --- |
| MissingBlocks | GAUGE | | Current number of missing blocks | |
| ExpiredHeartbeats | GAUGE | | Total number of expired heartbeats | |
| TransactionsSinceLastCheckpoint | GAUGE | | Total number of transactions since last checkpoint | |
| TransactionsSinceLastLogRoll | GAUGE | | Total number of transactions since last edit log roll | |
| LastCheckpointTime | GAUGE | ms | Time in milliseconds since epoch of last checkpoint | |
| CapacityTotal | GAUGE | Byte | Current raw capacity of DataNodes in bytes | |
| CapacityUsed | GAUGE | Byte | Current used capacity across all DataNodes in bytes | |
| CapacityRemaining | GAUGE | Byte | Current remaining capacity in bytes | |
| TotalLoad | GAUGE | | Current number of connections | |
| SnapshottableDirectories | GAUGE | | Current number of snapshottable directories | |
| Snapshots | GAUGE | | Current number of snapshots | |
| BlocksTotal | GAUGE | | Number of blocks | |
| FilesTotal | GAUGE | | Number of files | |
| NumLiveDataNodes | GAUGE | | Number of live DataNodes | |
| NumDeadDataNodes | GAUGE | | Number of dead DataNodes | |
| NumDecomLiveDataNodes | GAUGE | | Number of live DataNodes in the "Decommission" state | |
| NumDecomDeadDataNodes | GAUGE | | Number of dead DataNodes in the "Decommission" state | |
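Because these are all gauges, each sample can be evaluated on its own. The following is a minimal sketch of turning one sample into a capacity percentage and a list of alerts. It assumes the attributes of the two beans above have already been merged into a single dict. The function name, the 80% threshold, and the alert labels are illustrative assumptions, not values from the source.

```python
def summarize_hdfs(metrics, max_used_pct=80.0):
    """Return (used capacity in percent, list of alert labels) for one sample."""
    total = metrics["CapacityTotal"]
    # Guard against a zero total, e.g. before any DataNode has registered.
    used_pct = 100.0 * metrics["CapacityUsed"] / total if total else 0.0

    alerts = []
    if metrics["MissingBlocks"] > 0:
        alerts.append("missing blocks")
    if metrics["NumDeadDataNodes"] > 0:
        alerts.append("dead DataNodes")
    if used_pct > max_used_pct:  # hypothetical threshold
        alerts.append("capacity")
    return round(used_pct, 2), alerts


# Made-up sample values for illustration only.
sample = {"CapacityTotal": 1000, "CapacityUsed": 850,
          "MissingBlocks": 0, "NumDeadDataNodes": 1}
print(summarize_hdfs(sample))
```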

2. NameNode JvmMetrics details (core metrics)

Hadoop:service=NameNode,name=JvmMetrics

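| Metric | Type (GAUGE, COUNTER) | Unit | Meaning | Notes |
| --- | --- | --- | --- | --- |
| GcCountParNew | COUNTER | | Number of young-generation GCs | |
| GcTimeMillisParNew | COUNTER | ms | Time spent in young-generation GC (ms) | |
| GcCountConcurrentMarkSweep | COUNTER | | Number of old-generation GCs | |
| GcTimeMillisConcurrentMarkSweep | COUNTER | ms | Time spent in old-generation GC (ms) | |
| GcCount | COUNTER | | Total number of GCs | |
| GcTimeMillis | COUNTER | ms | Total time spent in GC (ms) | |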
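Since the GC metrics are counters, a single reading says little. What usually matters is the change between two samples. The sketch below, with a hypothetical function name and made-up sample values, derives the number of collections and the share of wall-clock time spent in GC over one polling interval. It uses only the aggregate `GcCount` and `GcTimeMillis`, on the assumption that the per-collector names depend on which collectors the JVM is running.

```python
def gc_overhead(prev, curr, interval_ms):
    """Return (collections, percent of wall time in GC) between two samples."""
    count = curr["GcCount"] - prev["GcCount"]
    time_ms = curr["GcTimeMillis"] - prev["GcTimeMillis"]
    return count, 100.0 * time_ms / interval_ms


# Two made-up samples taken 60 seconds apart.
prev = {"GcCount": 100, "GcTimeMillis": 5000}
curr = {"GcCount": 112, "GcTimeMillis": 5600}
print(gc_overhead(prev, curr, interval_ms=60_000))
```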

3. NameNode operation information (core metrics)

Hadoop:service=NameNode,name=NameNodeActivity
| Metric | Type (GAUGE, COUNTER) | Unit | Meaning | Notes |
| --- | --- | --- | --- | --- |
| CreateFileOps | COUNTER | | Total number of files created | |
| FilesCreated | COUNTER | | Total number of files and directories created by create or mkdir operations | |
| FilesAppended | COUNTER | | Total number of files appended | |
| GetBlockLocations | COUNTER | | Total number of getBlockLocations operations | |
| FilesRenamed | COUNTER | | Total number of rename operations (NOT number of files/dirs renamed) | |
| GetListingOps | COUNTER | | Total number of directory listing operations | |
| DeleteFileOps | COUNTER | | Total number of delete operations | |
| FilesDeleted | COUNTER | | Total number of files and directories deleted by delete or rename operations | |
| FileInfoOps | COUNTER | | Total number of getFileInfo and getLinkFileInfo operations | |
| AddBlockOps | COUNTER | | Total number of addBlock operations succeeded | |
| GetAdditionalDatanodeOps | COUNTER | | Total number of getAdditionalDatanode operations | |
| CreateSymlinkOps | COUNTER | | Total number of createSymlink operations | |
| GetLinkTargetOps | COUNTER | | Total number of getLinkTarget operations | |
| FilesInGetListingOps | COUNTER | | Total number of files and directories listed by directory listing operations | |
| AllowSnapshotOps | COUNTER | | Total number of allowSnapshot operations | |
| DisallowSnapshotOps | COUNTER | | Total number of disallowSnapshot operations | |
| CreateSnapshotOps | COUNTER | | Total number of createSnapshot operations | |
| DeleteSnapshotOps | COUNTER | | Total number of deleteSnapshot operations | |
| RenameSnapshotOps | COUNTER | | Total number of renameSnapshot operations | |
| ListSnapshottableDirOps | COUNTER | | Total number of snapshottableDirectoryStatus operations | |
| SnapshotDiffReportOps | COUNTER | | Total number of getSnapshotDiffReport operations | |
| TransactionsNumOps | COUNTER | | Total number of Journal transactions | |
| TransactionsAvgTime | GAUGE | ms | Average time of Journal transactions in milliseconds | |
| SyncsNumOps | COUNTER | | Total number of Journal syncs | |
| SyncsAvgTime | GAUGE | ms | Average time of Journal syncs in milliseconds | |
| TransactionsBatchedInSync | COUNTER | | Total number of Journal transactions batched in sync | |
| BlockReportNumOps | COUNTER | | Total number of block reports processed from DataNodes | |
| BlockReportAvgTime | GAUGE | ms | Average time of processing block reports in milliseconds | |
| CacheReportNumOps | COUNTER | | Total number of cache reports processed from DataNodes | |
| CacheReportAvgTime | GAUGE | ms | Average time of processing cache reports in milliseconds | |
| SafeModeTime | GAUGE | ms | Interval between FSNamesystem startup and the last time safe mode was left, in milliseconds | |
| FsImageLoadTime | GAUGE | ms | Time loading FS Image at startup in milliseconds | |
| GetEditNumOps | COUNTER | | Total number of edits downloads from SecondaryNameNode | |
| GetEditAvgTime | GAUGE | ms | Average edits download time in milliseconds | |
| GetImageNumOps | COUNTER | | Total number of fsimage downloads from SecondaryNameNode | |
| GetImageAvgTime | GAUGE | ms | Average fsimage download time in milliseconds | |
| PutImageNumOps | COUNTER | | Total number of fsimage uploads to SecondaryNameNode | |
| PutImageAvgTime | GAUGE | ms | Average fsimage upload time in milliseconds | |
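The operation counters in this table only grow while the NameNode is running, so a collector typically converts them to per-second rates. A minimal sketch follows. The function name is hypothetical, and treating a decreasing counter as a NameNode restart is an assumption made for this example rather than behaviour described in the source.

```python
def counter_rates(prev, curr, interval_s, names):
    """Per-second rates for COUNTER metrics between two samples."""
    rates = {}
    for name in names:
        delta = curr[name] - prev[name]
        if delta < 0:
            # Counter went backwards: assume a restart reset it to zero,
            # so only the value accumulated since then is counted.
            delta = curr[name]
        rates[name] = delta / interval_s
    return rates


# Made-up samples 60 seconds apart; DeleteFileOps simulates a reset.
prev = {"CreateFileOps": 1000, "DeleteFileOps": 500}
curr = {"CreateFileOps": 1300, "DeleteFileOps": 20}
rates = counter_rates(prev, curr, 60, ["CreateFileOps", "DeleteFileOps"])
print(rates["CreateFileOps"])
```

The `*AvgTime` gauges need no such conversion and can be reported as sampled.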

4. NameNode RPC details (non-core metrics, not collected for now)

Hadoop:service=NameNode,name=RpcDetailedActivityForPort*

| Metric | Type (GAUGE, COUNTER) | Unit | Meaning | Notes |
| --- | --- | --- | --- | --- |
| SetSafeModeNumOps | COUNTER | | | |
| SetSafeModeAvgTime | GAUGE | ms | | |
| GetFileInfoNumOps | COUNTER | | Total number of getFileInfo and getLinkFileInfo operations | |
| GetFileInfoAvgTime | GAUGE | ms | | |
| GetBlockLocationsNumOps | COUNTER | | | |
| GetBlockLocationsAvgTime | GAUGE | ms | | |
| GetListingNumOps | COUNTER | | | |
| GetListingAvgTime | GAUGE | ms | | |
| GetContentSummaryNumOps | COUNTER | | | |
| GetContentSummaryAvgTime | GAUGE | ms | | |
| MkdirsNumOps | COUNTER | | | |
| MkdirsAvgTime | GAUGE | ms | | |
| SetPermissionNumOps | COUNTER | | | |
| SetPermissionAvgTime | GAUGE | ms | | |
| CreateNumOps | COUNTER | | | |
| CreateAvgTime | GAUGE | ms | | |
| AddBlockNumOps | COUNTER | | | |
| AddBlockAvgTime | GAUGE | ms | | |
| GetServerDefaultsNumOps | COUNTER | | | |
| GetServerDefaultsAvgTime | GAUGE | ms | | |
| CompleteNumOps | COUNTER | | | |
| CompleteAvgTime | GAUGE | ms | | |
| DeleteNumOps | COUNTER | | | |
| DeleteAvgTime | GAUGE | ms | | |
| AppendNumOps | COUNTER | | | |
| AppendAvgTime | GAUGE | ms | | |
| RenameNumOps | COUNTER | | | |
| RenameAvgTime | GAUGE | ms | | |
| FileNotFoundExceptionNumOps | COUNTER | | | |
| FileNotFoundExceptionAvgTime | GAUGE | ms | | |
| SetOwnerNumOps | COUNTER | | | |
| SetOwnerAvgTime | GAUGE | ms | | |
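Should these RPC metrics be collected later, the trailing `*` in the bean name means one bean per RPC port, and every operation appears as a `NumOps`/`AvgTime` pair. The sketch below groups such pairs by port and operation. It assumes beans have already been indexed by name into a dict. The function name, the output field names, and port 8020 in the sample are illustrative assumptions.

```python
import re

# Matches the per-port bean name and captures the port number.
PORT_BEAN = re.compile(
    r"^Hadoop:service=NameNode,name=RpcDetailedActivityForPort(\d+)$"
)


def rpc_ops_by_port(beans):
    """Group <Op>NumOps / <Op>AvgTime attributes by RPC port and operation."""
    result = {}
    for name, attrs in beans.items():
        match = PORT_BEAN.match(name)
        if not match:
            continue  # not an RPC detail bean
        ops = result.setdefault(int(match.group(1)), {})
        for key, value in attrs.items():
            for suffix, field in (("NumOps", "count"), ("AvgTime", "avg_ms")):
                if key.endswith(suffix):
                    ops.setdefault(key[: -len(suffix)], {})[field] = value
    return result


# Made-up beans; the port number is only an example.
beans = {
    "Hadoop:service=NameNode,name=RpcDetailedActivityForPort8020": {
        "MkdirsNumOps": 7,
        "MkdirsAvgTime": 1.5,
    },
    "Hadoop:service=NameNode,name=JvmMetrics": {"GcCount": 3},
}
print(rpc_ops_by_port(beans))
```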
         

Reposted from blog.csdn.net/mnasd/article/details/81029905