Fluentd (td-agent) log processing

1. What is td-agent

td-agent is the stable, packaged distribution of the Fluentd log collector. It provides a wealth of plug-ins to adapt to different data sources, output destinations, and so on.

In use, a simple configuration lets us collect information from various sources and deliver logs to different destinations: events are first sent to Fluentd, which then forwards them through the configured plug-ins to places such as files, SaaS platforms, and databases, or even to another Fluentd instance.

2. How to install td-agent

Linux distribution: CentOS

2.1 Execute the script

curl -L https://toolbelt.treasuredata.com/sh/install-redhat-td-agent3.sh | sh

2.2 Check if it is installed

rpm -qa|grep td-agent

2.3 Start command

Start td-agent:       systemctl start td-agent
Start the service:    /etc/init.d/td-agent start
Check service status: /etc/init.d/td-agent status
Stop the service:     /etc/init.d/td-agent stop
Restart the service:  /etc/init.d/td-agent restart

2.4 Default configuration file path

/etc/td-agent/td-agent.conf

2.5 Default log file path

/var/log/td-agent/td-agent.log

3. Explanation of terms

source : specifies the data source (input)
match : specifies the output destination
filter : specifies an event-processing pipeline
system : sets system-wide configuration
label : groups output and filter directives for internal routing
@include : includes other configuration files
plug-ins : Fluentd collects and sends logs through plug-ins. Some are built in; non-built-in plug-ins must be installed before they can be used in the configuration file.
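Putting these directives together, a minimal td-agent.conf skeleton might look like the sketch below. The tags, paths, and filter rule are illustrative placeholders, not taken from the original article:

```
# illustrative skeleton, not a production configuration
<system>
  log_level info                          # system-wide settings
</system>

<source>                                  # input: tail a log file
  @type tail
  path /var/log/myapp/app.log
  pos_file /var/log/td-agent/app.log.pos
  format none
  tag myapp.log
</source>

<filter myapp.*>                          # processing step: keep only records
  @type grep                              # whose "message" contains "error"
  <regexp>
    key message
    pattern error
  </regexp>
</filter>

<match myapp.*>                           # output: write matching events to file
  @type file
  path /var/log/td-agent/myapp
</match>

@include conf.d/*.conf                    # pull in extra configuration files
```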

4. Configuration file analysis

# Receive events from 20000/tcp
# This is used by log forwarding and the fluent-cat command
<source>
  @type forward
  port 20000
</source>

# http://this.host:8081/myapp.access?json={"event":"data"}
<source>
  @type http
  port 8081
</source>

<source>
  @type tail
  path /root/shell/test.log
  tag myapp.access
</source>

# Match events tagged with "myapp.access" and
# store them to /var/log/td-agent/access.%Y-%m-%d
# Of course, you can control how you partition your data
# with the time_slice_format option.
<match myapp.access>
  @type file
  path /var/log/td-agent/access
</match>

source: configures the source of the log data

@type : specifies the input plug-in to use

forward: receives events from another Fluentd instance

http: receives events as parameters of an HTTP request

tail: reads events from a log file

port: the designated port to listen on when receiving data transmitted from other machines

path: the location of the data to read

tag: the tag attached to the data; it is matched against the tag configured in match
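The tag set in source is matched by match (and filter) directives, which also accept wildcard patterns. The examples below are illustrative, not from the original article:

```
# exact match on the tag myapp.access
<match myapp.access>
  @type file
  path /var/log/td-agent/access
</match>

# "*" matches exactly one tag part: myapp.access, myapp.error, etc.
<match myapp.*>
  @type stdout
</match>

# "**" matches zero or more tag parts; often used as a catch-all at the end
<match **>
  @type null
</match>
```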

match: configures data forwarding

myapp.access: the output tag, which must match the input tag

@type: the output destination; events can be written to Kafka, a local file, a database, MongoDB, and so on

path: the file path when outputting to a file. Other destinations have their own specific options; for example, when forwarding to Kafka, the match block also carries a number of Kafka-related settings.


With the copy output, each store block inside a match tag represents one storage destination for events carrying that tag; we can configure one store to write to Kafka and another to write to a local file.
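As a sketch of what such Kafka-specific settings look like (the broker address and topic name below are placeholders, and the output assumes the fluent-plugin-kafka plug-in is installed):

```
<match mytail>
  @type kafka
  brokers localhost:9092      # comma-separated list of Kafka brokers
  default_topic test1         # topic to publish events to
  required_acks -1            # wait for all in-sync replicas to acknowledge
  flush_interval 1            # flush buffered events every second
</match>
```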

<source>
  @type tail
  format none
  path /var/log/apache2/access_log
  pos_file /var/log/apache2/access_log.pos
  tag mytail
</source>

5. Some parameter explanations

format: configures a parsing expression to filter the data; only strings that match the format expression are passed on for storage in the store blocks of match.

type tail: tail is Fluentd's built-in input, which continuously reads incremental log entries from the source file, similar to the Linux command tail. Other inputs such as http and forward can also be used, as can input plug-ins: change tail to the corresponding plug-in name, e.g. type tail_ex (note that tail_ex contains an underscore).

format apache: specifies Fluentd's built-in Apache log parser. You can also configure your own expression.

path /var/log/apache2/access_log: specifies the location of the log file to collect.

pos_file /var/log/apache2/access_log.pos: this parameter is strongly recommended. The access_log.pos file is generated automatically; pay attention to its write permissions, because the length already read from access_log is written into it. Its main purpose is to let collection continue after the fluentd service goes down and restarts, avoiding loss of log data and ensuring the completeness of collection.
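The bookkeeping that pos_file performs can be sketched with ordinary shell commands. The file names below are made up for the demonstration, and td-agent's real .pos file has its own format; this only illustrates the resume-from-offset idea:

```shell
# Write a first log line and record how many bytes have been "read" so far.
printf 'line1\n' > /tmp/demo_access.log
wc -c < /tmp/demo_access.log > /tmp/demo_access.log.pos

# New log data arrives while the collector is down.
printf 'line2\n' >> /tmp/demo_access.log

# On "restart", resume from the saved offset instead of re-reading the file;
# only the new line is emitted.
tail -c +"$(( $(cat /tmp/demo_access.log.pos) + 1 ))" /tmp/demo_access.log
```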
 

6. Configuration file case

6.1. Transfer data to logs and Kafka at the same time through http

 

# http://this.host:8081/mytail?json={"event":"data"}
<source>
  @type http
  port 8081
</source>



<match mytail>
  @type copy
  <store>
    @type kafka
    brokers localhost:9092
    default_topic test1
    default_message_key message
    ack_timeout 2000
    flush_interval 1
    required_acks -1
  </store>
  <store>
    @type file
    path /var/log/td-agent/access
  </store>
</match>

6.2 Transfer data to logs and Kafka at the same time by reading files



<source>
  @type tail
  format none
  path /var/log/apache2/access_log
  pos_file /var/log/apache2/access_log.pos
  tag mytail
</source>

<match mytail>
  @type copy
  <store>
    @type kafka
    brokers localhost:9092
    default_topic test1
    default_message_key message
    ack_timeout 2000
    flush_interval 1
    required_acks -1
  </store>
  <store>
    @type file
    path /var/log/td-agent/access
  </store>
</match>


Reprinted from: blog.csdn.net/qq_45171957/article/details/122639120