Link to the original text of this article: https://blog.csdn.net/xzk9381/article/details/114324351
When running Filebeat on a physical machine to collect logs and ship them to Kafka, a common setup is to write several Filebeat configuration files and start one Filebeat process per file, each collecting a different log path and pushing to a different topic. If you instead put all the log paths into a single Filebeat configuration file, you need a way to set the Kafka topic per log.
Logstash can also achieve this, but this article only demonstrates the Filebeat approach. The steps and explanations are as follows:
If you want to collect different logs into different indexes, you can refer to another article of mine: https://blog.csdn.net/xzk9381/article/details/109535450
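As noted above, Logstash can route logs the same way. A minimal Logstash pipeline sketch of the equivalent idea, reusing the article's paths and topic names (the field layout and options here are assumptions, not something tested against this setup):

```
# Hypothetical Logstash sketch: tag each input with a topic field,
# then reference it in the kafka output's topic_id (sprintf syntax).
input {
  file { path => "/opt/log_test/access.log" add_field => { "[fields][topic]" => "topic-for-access-log" } }
  file { path => "/opt/log_test/error.log"  add_field => { "[fields][topic]" => "topic-for-error-log" } }
  file { path => "/opt/log_test/info.log"   add_field => { "[fields][topic]" => "topic-for-info-log" } }
}
output {
  kafka {
    bootstrap_servers => "10.16.12.204:9092"
    topic_id => "%{[fields][topic]}"
  }
}
```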
- For example, suppose the following three log files need to be output to different topics:
access.log ----> Topic: topic-for-access-log
error.log ----> Topic: topic-for-error-log
info.log ----> Topic: topic-for-info-log
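The mapping above follows a simple naming convention; a quick sketch of how each topic name is derived (the names are the article's examples):

```shell
# Derive each Kafka topic name from the log file's base name
# (convention used in this article: <name>.log -> topic-for-<name>-log).
for name in access error info; do
  echo "${name}.log -> topic-for-${name}-log"
done
```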
- Create three topics in Kafka:
[@localhost ~]# for type in access error info; do /opt/kafka_cluster/kafka/bin/kafka-topics.sh --create --topic topic-for-${type}-log --zookeeper 10.16.12.204:2181 --partitions 1 --replication-factor 1; done
Created topic topic-for-access-log.
Created topic topic-for-error-log.
Created topic topic-for-info-log.
- Check that the corresponding topics now exist in ZooKeeper:
[@localhost ~]# /opt/kafka_cluster/zookeeper/bin/zkCli.sh
Connecting to localhost:2181
[zk: localhost:2181(CONNECTED) 0] ls /brokers/topics
[__consumer_offsets, topic-for-access-log, topic-for-error-log, topic-for-info-log]
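The same check can also be done with the `kafka-topics.sh` tool instead of the zkCli shell. A sketch using the article's install path (an assumption; adjust for your environment):

```shell
KAFKA_BIN=/opt/kafka_cluster/kafka/bin
# List topics via the older --zookeeper form, matching the Kafka
# version used in this article; print a notice if the tool is not
# present at this (assumed) path.
if [ -x "$KAFKA_BIN/kafka-topics.sh" ]; then
  "$KAFKA_BIN/kafka-topics.sh" --list --zookeeper 10.16.12.204:2181
else
  echo "kafka-topics.sh not found under $KAFKA_BIN"
fi
```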
- Write the following Filebeat configuration file:
filebeat.idle_timeout: 2s
filebeat.name: filebeat-shiper
filebeat.spool_size: 50000
filebeat.inputs:                    # define the path, type, and collection options for each log
  - type: log                       # collect input of type log
    paths:
      - /opt/log_test/access.log    # path to access.log
    fields:                         # custom fields used to tag this log
      topic: topic-for-access-log   # key "topic" whose value is a topic already created in Kafka
    enabled: true
    backoff: 1s
    backoff_factor: 2
    close_inactive: 1h
    encoding: plain
    harvester_buffer_size: 262144
    max_backoff: 10s
    max_bytes: 10485760
    scan_frequency: 10s
    tail_lines: true
  - type: log
    paths:
      - /opt/log_test/error.log     # path to error.log
    fields:                         # custom fields used to tag this log
      topic: topic-for-error-log    # key "topic" whose value is a topic already created in Kafka
    enabled: true
    backoff: 1s
    backoff_factor: 2
    close_inactive: 1h
    encoding: plain
    harvester_buffer_size: 262144
    max_backoff: 10s
    max_bytes: 10485760
    scan_frequency: 10s
    tail_lines: true
  - type: log
    paths:
      - /opt/log_test/info.log      # path to info.log
    fields:                         # custom fields used to tag this log
      topic: topic-for-info-log     # key "topic" whose value is a topic already created in Kafka
    enabled: true
    backoff: 1s
    backoff_factor: 2
    close_inactive: 1h
    encoding: plain
    harvester_buffer_size: 262144
    max_backoff: 10s
    max_bytes: 10485760
    scan_frequency: 10s
    tail_lines: true
output.kafka:                       # send output to Kafka
  bulk_flush_frequency: 0
  bulk_max_size: 2048
  codec.format:
    string: '%{[message]}'
  compression: gzip
  compression_level: 4
  hosts:
    - 10.16.12.204:9092             # Kafka address; for a cluster, list all broker addresses
  max_message_bytes: 10485760
  partition.round_robin:
    reachable_only: true
  required_acks: 1
  topic: '%{[fields.topic]}'        # route each event to the topic named by its fields.topic
  workers: 4
setup.ilm.enabled: false
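Before starting Filebeat it is worth validating this file; Filebeat provides `test config` and `test output` subcommands for that. A sketch using the article's (assumed) install paths:

```shell
FB=/opt/filebeat-7.3.0/filebeat
CFG=/opt/filebeat-7.3.0/conf/test.yaml
# Check the configuration syntax, then verify the Kafka output is
# reachable; print a notice if filebeat is not at this (assumed) path.
if [ -x "$FB" ]; then
  "$FB" test config -c "$CFG"
  "$FB" test output -c "$CFG"
else
  echo "filebeat not found at $FB"
fi
```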
- Write the following to the access, error, and info log files:
echo "this is access log" > /opt/log_test/access.log
echo "this is error log" > /opt/log_test/error.log
echo "this is info log" > /opt/log_test/info.log
- Start Filebeat:
/opt/filebeat-7.3.0/filebeat run -c /opt/filebeat-7.3.0/conf/test.yaml -httpprof 0.0.0.0:15502 -path.logs /opt/filebeat-7.3.0/logs/filebeat_15502 -e
- Connect to the Kafka cluster and consume each topic to verify:
[@k8s-master1 ~]# /opt/kafka_cluster/kafka/bin/kafka-console-consumer.sh --topic topic-for-access-log --bootstrap-server 10.16.12.204:9092 --from-beginning
this is access log
[@k8s-master1 ~]# /opt/kafka_cluster/kafka/bin/kafka-console-consumer.sh --topic topic-for-error-log --bootstrap-server 10.16.12.204:9092 --from-beginning
this is error log
[@k8s-master1 ~]# /opt/kafka_cluster/kafka/bin/kafka-console-consumer.sh --topic topic-for-info-log --bootstrap-server 10.16.12.204:9092 --from-beginning
this is info log
You can see that the logs from the different paths have been collected into their respective Kafka topics.