Monitoring Flume (Monitor)

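When collecting logs in real time with Flume, the transaction mechanism guards against data loss, but you still need to keep an eye on whether messages are flowing normally between Source, Channel, and Sink: for example, how many messages went from Source → Channel, how many from Channel → Sink, and whether the two counts differ by too much.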

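Flume provides a monitoring mechanism for this (http://flume.apache.org/FlumeUserGuide.html#monitoring): it exposes its internal counters through Reporting. There are four reporting methods: JMX Reporting, Ganglia Reporting, JSON Reporting, and Custom Reporting. This post uses the simplest one, JSON Reporting, as the example.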

Add two parameters when starting the Flume agent:

flume-ng agent -n agent_lxw1234 --conf . -f agent_lxw1234_file_2_kafka.properties -Dflume.monitoring.type=http -Dflume.monitoring.port=34545

flume.monitoring.type=http sets the reporting method to HTTP, and flume.monitoring.port sets the port of the HTTP service.

Once started, the agent runs an HTTP service on its own machine. Opening http://<hostname>:34545/metrics returns a JSON document:

 
{
  "SINK.sink_lxw1234":{
    "ConnectionCreatedCount":"0",
    "BatchCompleteCount":"0",
    "BatchEmptyCount":"72",
    "EventDrainAttemptCount":"0",
    "StartTime":"1518400034824",
    "BatchUnderflowCount":"43",
    "ConnectionFailedCount":"0",
    "ConnectionClosedCount":"0",
    "Type":"SINK",
    "RollbackCount":"0",
    "EventDrainSuccessCount":"244",
    "KafkaEventSendTimer":"531",
    "StopTime":"0"
  },
  "CHANNEL.file_channel_lxw1234":{
    "Unhealthy":"0",
    "ChannelSize":"0",
    "EventTakeAttemptCount":"359",
    "StartTime":"1518400034141",
    "Open":"true",
    "CheckpointWriteErrorCount":"0",
    "ChannelCapacity":"10000",
    "ChannelFillPercentage":"0.0",
    "EventTakeErrorCount":"0",
    "Type":"CHANNEL",
    "EventTakeSuccessCount":"244",
    "Closed":"0",
    "CheckpointBackupWriteErrorCount":"0",
    "EventPutAttemptCount":"244",
    "EventPutSuccessCount":"244",
    "EventPutErrorCount":"0",
    "StopTime":"0"
  },
  "SOURCE.source_lxw1234":{
    "EventReceivedCount":"244",
    "AppendBatchAcceptedCount":"45",
    "Type":"SOURCE",
    "AppendReceivedCount":"0",
    "EventAcceptedCount":"244",
    "StartTime":"1518400034767",
    "AppendAcceptedCount":"0",
    "OpenConnectionCount":"0",
    "AppendBatchReceivedCount":"45",
    "StopTime":"0"
  }
}
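Every value in this JSON is a string, including the numeric counters, so a script that consumes the endpoint has to convert them first. Here is a minimal Python sketch of that step. The function names parse_metrics and fetch_metrics are hypothetical, the default port 34545 is simply the one used in the command above, and the host is whatever machine your agent runs on.

```python
import json
from urllib.request import urlopen


def parse_metrics(payload: str) -> dict:
    """Parse the /metrics JSON text and convert numeric strings to numbers."""

    def convert(value):
        # Every value arrives as a string; try int, then float, and leave
        # anything else (e.g. "SINK", "true") unchanged.
        for cast in (int, float):
            try:
                return cast(value)
            except (ValueError, TypeError):
                continue
        return value

    raw = json.loads(payload)
    return {
        component: {name: convert(value) for name, value in counters.items()}
        for component, counters in raw.items()
    }


def fetch_metrics(host: str, port: int = 34545, timeout: float = 5.0) -> dict:
    """Fetch and parse metrics from a running agent (host/port are assumptions)."""
    with urlopen(f"http://{host}:{port}/metrics", timeout=timeout) as response:
        return parse_metrics(response.read().decode("utf-8"))
```

With an agent running, fetch_metrics("<hostname>") would return a nested dict keyed by component name, e.g. "CHANNEL.file_channel_lxw1234"; parse_metrics can also be applied to a saved copy of the JSON.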

In my example the Source is TAILDIR, the Channel is a File Channel, and the Sink is a Kafka Sink. The three JSON objects report the counters of these three components.

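For example, "EventReceivedCount":"244" under SOURCE means the Source read 244 messages from the file;

"EventPutSuccessCount":"244" under CHANNEL means 244 messages were successfully put into the Channel;

"EventDrainSuccessCount":"244" under SINK means 244 messages were successfully sent to Kafka.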
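These three counters are enough for the check described at the start: whether the Source → Channel and Channel → Sink message counts have drifted apart. The Python sketch below is only an illustration. The function name check_flow and the choice of which counters to subtract are assumptions, and the sample values are copied from the JSON above.

```python
def check_flow(metrics: dict, source: str, channel: str, sink: str) -> dict:
    """Compare event counts across Source, Channel, and Sink.

    `metrics` is the parsed /metrics JSON; the other arguments are the
    component names from the agent configuration.
    """
    src = metrics[f"SOURCE.{source}"]
    ch = metrics[f"CHANNEL.{channel}"]
    snk = metrics[f"SINK.{sink}"]

    # Counters are reported as strings, so convert before doing arithmetic.
    accepted = int(src["EventAcceptedCount"])
    put = int(ch["EventPutSuccessCount"])
    drained = int(snk["EventDrainSuccessCount"])

    return {
        # Events the Source handed over vs. events the Channel stored.
        "source_to_channel_gap": accepted - put,
        # Events stored in the Channel vs. events the Sink delivered.
        "channel_to_sink_gap": put - drained,
        # Events currently waiting in the Channel.
        "channel_backlog": int(ch["ChannelSize"]),
    }


# Sample values taken from the JSON shown in this post.
sample = {
    "SOURCE.source_lxw1234": {"EventAcceptedCount": "244"},
    "CHANNEL.file_channel_lxw1234": {
        "EventPutSuccessCount": "244",
        "ChannelSize": "0",
    },
    "SINK.sink_lxw1234": {"EventDrainSuccessCount": "244"},
}
report = check_flow(sample, "source_lxw1234", "file_channel_lxw1234", "sink_lxw1234")
print(report)
```

For the sample above all three values are 0, meaning every event the Source accepted has reached Kafka. A channel_to_sink_gap that keeps growing between polls would suggest the Sink is falling behind; the threshold worth alerting on depends on your traffic.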


Reposted from blog.csdn.net/mnasd/article/details/81947543