Filebeat outputs multiple logs to different Kafka topics

Link to the original text of this article: https://blog.csdn.net/xzk9381/article/details/114324351

Usually, when using Filebeat on a physical machine to collect logs and output them to Kafka, you write multiple Filebeat configuration files and start multiple Filebeat processes, each collecting logs from a different path and pushing them to a different topic. If you instead put all the log paths into a single Filebeat configuration file, you need to set the topic per log.

Logstash can also achieve this, but here we only demonstrate the implementation in Filebeat. The steps and explanations are as follows:

If you want to collect different logs into different indexes, you can refer to another article of mine: https://blog.csdn.net/xzk9381/article/details/109535450

1. For example, there are now the following three log files, which need to be output to different topics:
access.log  ---->  Topic:topic-for-access-log
error.log   ---->  Topic:topic-for-error-log
info.log    ---->  Topic:topic-for-info-log
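The mapping follows a uniform `topic-for-<type>-log` naming pattern, so the three topic names can be derived in a quick shell loop (a minimal sketch of the naming convention only):

```shell
# Print the three topic names, following the topic-for-<type>-log
# naming pattern used throughout this article.
for type in access error info; do
  echo "topic-for-${type}-log"
done
# Output:
#   topic-for-access-log
#   topic-for-error-log
#   topic-for-info-log
```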
2. Create three topics in Kafka:
[@localhost ~]# for type in access error info; do /opt/kafka_cluster/kafka/bin/kafka-topics.sh --create --topic topic-for-${type}-log --zookeeper 10.16.12.204:2181 --partitions 1 --replication-factor 1; done
Created topic topic-for-access-log.
Created topic topic-for-error-log.
Created topic topic-for-info-log.
3. Check whether the corresponding topics already exist in ZooKeeper:
[@localhost ~]# /opt/kafka_cluster/zookeeper/bin/zkCli.sh
Connecting to localhost:2181
[zk: localhost:2181(CONNECTED) 0] ls /brokers/topics
[__consumer_offsets, topic-for-access-log, topic-for-error-log, topic-for-info-log]
4. Write the following Filebeat configuration file:
filebeat.idle_timeout: 2s
filebeat.name: filebeat-shiper
filebeat.spool_size: 50000

filebeat.inputs:                      # Define the path, type, and collection options for each log
- type: log                           # Collect this input as type log
  paths:
   - /opt/log_test/access.log         # Path to access.log
  fields:                             # Add a custom field to tag this log
    topic: topic-for-access-log       # The key "topic" holds the name of the topic already created in Kafka
  enabled: true
  backoff: 1s
  backoff_factor: 2
  close_inactive: 1h
  encoding: plain
  harvester_buffer_size: 262144
  max_backoff: 10s
  max_bytes: 10485760
  scan_frequency: 10s
  tail_lines: true
- type: log
  paths:
   - /opt/log_test/error.log          # Path to error.log
  fields:                             # Add a custom field to tag this log
    topic: topic-for-error-log        # The key "topic" holds the name of the topic already created in Kafka
  enabled: true
  backoff: 1s
  backoff_factor: 2
  close_inactive: 1h
  encoding: plain
  harvester_buffer_size: 262144
  max_backoff: 10s
  max_bytes: 10485760
  scan_frequency: 10s
  tail_lines: true
- type: log
  paths:
   - /opt/log_test/info.log           # Path to info.log
  fields:                             # Add a custom field to tag this log
    topic: topic-for-info-log         # The key "topic" holds the name of the topic already created in Kafka
  enabled: true
  backoff: 1s
  backoff_factor: 2
  close_inactive: 1h
  encoding: plain
  harvester_buffer_size: 262144
  max_backoff: 10s
  max_bytes: 10485760
  scan_frequency: 10s
  tail_lines: true

output.kafka:                       # Send the output to Kafka
  bulk_flush_frequency: 0
  bulk_max_size: 2048
  codec.format:
    string: '%{[message]}'
  compression: gzip
  compression_level: 4
  hosts:
  - 10.16.12.204:9092               # Kafka address; for a cluster, list every broker address here
  max_message_bytes: 10485760
  partition.round_robin:
    reachable_only: true
  required_acks: 1
  topic: '%{[fields.topic]}'        # Route each event to the topic named in its fields.topic value
  workers: 4

setup.ilm.enabled: false
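As an aside, Filebeat's Kafka output also supports a `topics` array of conditional selector rules, so similar routing could be expressed without tagging events via `fields`. This is a sketch for comparison, not the method used in this article:

```yaml
# Alternative sketch (not used above): choose the topic by matching
# the source file path instead of a custom field.
output.kafka:
  hosts: ["10.16.12.204:9092"]
  topic: 'topic-for-info-log'          # fallback topic
  topics:
    - topic: 'topic-for-access-log'
      when.contains:
        log.file.path: 'access.log'
    - topic: 'topic-for-error-log'
      when.contains:
        log.file.path: 'error.log'
```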
5. Write the following lines to the access, error, and info log files:
echo "this is access log" > /opt/log_test/access.log
echo "this is error log" > /opt/log_test/error.log
echo "this is info log" > /opt/log_test/info.log
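The three echo commands can also be written as a single loop; `LOG_DIR` is an assumption added here so the sketch can run outside /opt, defaulting to the article's path:

```shell
# Write one sample line into each of the three log files.
# LOG_DIR is an added assumption; the article uses /opt/log_test.
LOG_DIR="${LOG_DIR:-/opt/log_test}"
mkdir -p "$LOG_DIR"
for type in access error info; do
  echo "this is ${type} log" > "${LOG_DIR}/${type}.log"
done
```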
6. Start Filebeat:
/opt/filebeat-7.3.0/filebeat run -c /opt/filebeat-7.3.0/conf/test.yaml -httpprof 0.0.0.0:15502 -path.logs /opt/filebeat-7.3.0/logs/filebeat_15502 -e
7. Connect to the Kafka cluster and consume the logs:
[@k8s-master1 ~]# /opt/kafka_cluster/kafka/bin/kafka-console-consumer.sh --topic topic-for-access-log --bootstrap-server 10.16.12.204:9092 --from-beginning
this is access log

[@k8s-master1 ~]# /opt/kafka_cluster/kafka/bin/kafka-console-consumer.sh --topic topic-for-error-log --bootstrap-server 10.16.12.204:9092 --from-beginning
this is error log

[@k8s-master1 ~]# /opt/kafka_cluster/kafka/bin/kafka-console-consumer.sh --topic topic-for-info-log --bootstrap-server 10.16.12.204:9092 --from-beginning
this is info log

You can see that the logs from different paths have been collected into different Kafka topics.
