Filebeat Chinese Guide
This post is an example of how Filebeat is used at my company. You can follow it directly, and leave feedback if you have any questions.
Filebeat Quick Start: http://www.cnblogs.com/kerwinC/p/8866471.html
1. Overview
Filebeat is a log file shipping tool. After you install the client on a server, Filebeat monitors the log directories or specific log files you configure, tails them (tracking file changes and reading new content continuously), and forwards the data to Elasticsearch or Logstash.
Here is how Filebeat works: when you start Filebeat, it launches one or more prospectors that watch the log directories or files you specify. For each log file a prospector finds, Filebeat starts a harvester; each harvester reads the new content of one log file and sends the new log data to the spooler, which aggregates the events. Finally, Filebeat sends the aggregated data to the destination you configured.
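The flow above can be sketched as a minimal filebeat.yml (a sketch only, not the full configuration from this post; the path and host are placeholders):

```yaml
filebeat.prospectors:        # prospectors find the files to read
- input_type: log
  paths:
    - /var/log/*.log         # each matched file gets its own harvester
output.logstash:             # the spooler batches events and ships them here
  hosts: ["127.0.0.1:5044"]
```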
(Personally, I think of Filebeat as a lightweight Logstash. When the machine is modestly provisioned or the volume of data to collect is not especially large, use Filebeat to collect logs. In daily use Filebeat is very stable; I have never seen it crash.)
2. Getting Started with Filebeat
Before starting to configure the use of filebeat, you need to install and configure these dependencies:
Elasticsearch serves as storage and indexing for this data.
Kibana as a showcase platform.
Logstash (optional) to insert data into elasticsearch.
For details, see Getting started with beat and elastic
After installing the Elastic cluster, read the next sections to learn how to install, configure, and run Filebeat.
Step 1: Install filebeat
Pick the download-and-install command for your system (deb for Debian/Ubuntu, rpm for Red Hat/CentOS/Fedora, mac for OS X, and win for Windows).
If you use yum or apt, you can install or upgrade to newer versions more easily from our package repository.
deb:
curl -L -O https://artifacts.elastic.co/downloads/beats/filebeat/filebeat-5.1.1-amd64.deb
sudo dpkg -i filebeat-5.1.1-amd64.deb
rpm:
curl -L -O https://artifacts.elastic.co/downloads/beats/filebeat/filebeat-5.1.1-x86_64.rpm
sudo rpm -vi filebeat-5.1.1-x86_64.rpm
mac:
curl -L -O https://artifacts.elastic.co/downloads/beats/filebeat/filebeat-5.1.1-darwin-x86_64.tar.gz
tar xzvf filebeat-5.1.1-darwin-x86_64.tar.gz
win:
(omitted)
Step 2: Configure filebeat
Edit the configuration file to configure Filebeat. For rpm or deb installs, the configuration file is /etc/filebeat/filebeat.yml. For Mac or Windows, look in the directory where you unpacked the archive.
Here is a sample of a simple filebeat.yml configuration file; Filebeat falls back to defaults for most options.
#--------------- prospectors block ----------------
filebeat.prospectors:
- input_type: log
  paths:
    - /var/log/*.log
    - c:\programdata\elasticsearch\logs\*
Let's configure filebeat:
1. Define the path(s) to your log file(s)
For the most basic Filebeat configuration, you can define a single prospector with a single path, for example:
#--------------- prospectors block ----------------
filebeat.prospectors:
- input_type: log
  paths:
    - /var/log/*.log
  #json.keys_under_root: true  # enable this if the logs you collect are in JSON format
In this example, the prospector collects every file matching /var/log/*.log, which means Filebeat will collect all files ending in .log under /var/log. All patterns supported by Golang Glob are also supported here.
To get files from subdirectories at a fixed depth, you can use a pattern such as /var/log/*/*.log. This finds every file ending in .log in the immediate subdirectories of /var/log, but it does not find .log files in the /var/log folder itself. At present Filebeat cannot recursively fetch all log files in all subdirectories.
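You can check which files a pattern matches by trying the same glob locally. Here is a small Python sketch (the directory and file names are made up for illustration) showing that a `*/*.log`-style pattern matches one directory level down but not the top directory itself:

```python
# Demonstrates glob semantics: '*' matches within one path segment and
# does not cross directory separators.
import glob
import os
import tempfile

root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "log", "app1"))
open(os.path.join(root, "log", "top.log"), "w").close()           # directly under log/
open(os.path.join(root, "log", "app1", "deep.log"), "w").close()  # one level down

# log/*.log matches only files directly inside log/
direct = glob.glob(os.path.join(root, "log", "*.log"))
# log/*/*.log matches files exactly one subdirectory down, not log/ itself
one_level = glob.glob(os.path.join(root, "log", "*", "*.log"))

print([os.path.basename(p) for p in direct])
print([os.path.basename(p) for p in one_level])
```

Running this prints `['top.log']` for the first pattern and `['deep.log']` for the second, mirroring the non-recursive behavior described above.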
If you set the output to Elasticsearch, you need to set the Elasticsearch IP address and port in the Filebeat configuration file:
#----------------- elasticsearch output (direct output to Elasticsearch is generally not recommended) -----------------
output.elasticsearch:
  hosts: ["192.168.1.42:9200"]
If you set the output to Logstash, see step 3: configure Filebeat to use Logstash.
Step 3: Configure Filebeat to use Logstash
If you want to use logstash to perform other processing on the data collected by filebeat, you need to configure filebeat to use logstash.
Edit the Filebeat config file, comment out the elasticsearch section, and uncomment the logstash section:
#----------------------------- Logstash output --------------------------------
output.logstash:
  hosts: ["127.0.0.1:5044"]
The hosts option needs to specify the address and port on which the logstash service is listening.
Note: to test your configuration file, change to the directory containing the Filebeat executable and run ./filebeat -configtest -e on the command line. Make sure your configuration file is in the default configuration directory; see Directory Layout.
Before using this configuration, you need to set up logstash in advance to receive data.
If you want to store data directly in Elasticsearch without going through Logstash, comment out the logstash section and uncomment the elasticsearch section (writing directly to ES is not recommended):
output.elasticsearch:
  hosts: ["localhost:9200"]
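For reference, the receiving side could be a minimal Logstash pipeline like the sketch below (the port and Elasticsearch host are assumptions matching the examples above; where the pipeline file lives depends on your Logstash install):

```
# Minimal Logstash pipeline: listen for Filebeat on port 5044,
# then write the received events to Elasticsearch.
input {
  beats {
    port => 5044
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
  }
}
```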
Step 4: Start filebeat
rpm install:
sudo /etc/init.d/filebeat start
Now, filebeat is ready to read your log file and send it to your defined output!
2018.04.17 Update Filebeat Quick Start: http://www.cnblogs.com/kerwinC/p/8866471.html
Filebeat Quick Start
What can Filebeat do?

- Written in: Go
- Multiple outputs: supported
- Multiple inputs: supported
- Modifying log content: supported
- Can data be lost: (not stated)
- Merging multi-line files: supported
- Fuzzy matching of multi-level directories: supported
- Installation and configuration: simple
- Memory footprint: about 10 MB
Filebeat installation

- System version: CentOS 7.2
- Filebeat version: 5.5.1
- Download: wget https://artifacts.elastic.co/downloads/beats/filebeat/filebeat-5.5.1-x86_64.rpm
- Install command: rpm -Uvh filebeat-5.5.1-x86_64.rpm
- Configuration file path: /etc/filebeat/filebeat.yml
- Log file path: /var/log/filebeat (note: each startup creates a new filebeat log file, and the previous one is renamed to filebeat.1)
- Start command: systemctl restart filebeat
Configuration file template
Notice! The file format is yml, and the format requirements are strict (indent with spaces, and be careful not to omit the "-").
#=========================== Filebeat prospectors =============================
filebeat.prospectors:                  # file prospectors
- input_type: log                      # input type: log file
  paths:
    - /data/w/www/*/logs/request.log   # note the wildcard: every project with the same directory layout will be collected
  #json.keys_under_root: true          # enable this if the logs you collect are in JSON format
  document_type: request               # log type, used as the Elasticsearch index type; see note 1 below
  fields:
    topic: log_common                  # adds a fields.topic field, used for multi-topic Kafka output
- input_type: log
  paths:
    - /data/w/www/*/logs/dubbo-access-consumer.log
  multiline.pattern: '^[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}'  # multi-line log handling; see note 2 below
  multiline.negate: true
  multiline.match: after
  document_type: dubbo_consumer
  fields:
    topic: application_log
#----------------------------- kafka output --------------------------------
output.kafka:                          # output to Kafka
  hosts: ["kafka4.dp.data.cn1.wormpex.com:9092", "kafka5.dp.data.cn1.wormpex.com:9092", "kafka6.dp.data.cn1.wormpex.com:9092"]  # Kafka broker addresses
  topic: '%{[fields.topic]}'           # which topic to write to: the fields.topic defined on each prospector, so different log types are routed to different topics automatically
  partition.round_robin:
    reachable_only: false
  required_acks: 1
  compression: gzip
  max_message_bytes: 100000000         # maximum size of a single log event (100000000 bytes, roughly 100 MB; at my company a single log line can reach several MB)
#----------------------------- Logstash output --------------------------------
output.logstash:
  # The Logstash hosts
  hosts: ["logstash1.ops.sys.cn1.wormpex.com:5044"]  # Logstash must have the beats input plugin enabled and listening on port 5044
Notes:
Note 1: fields.topic
Defines which Kafka topic this type of log is sent to; the topic option in the kafka output references it as a variable.
Note 2: multiline.pattern
Anyone whose company runs Java projects as web servers knows that in production, Java often prints stack traces into the log, similar to:
2018-04-17 15:12:25.185 IndexNotFoundException[no such index]
at org.elasticsearch.cluster.metadata.IndexNameExpressionResolver$WildcardExpressionResolver.resolve(IndexNameExpressionResolver.java:566)
at org.elasticsearch.cluster.metadata.IndexNameExpressionResolver.concreteIndices(IndexNameExpressionResolver.java:133)
at org.elasticsearch.cluster.metadata.IndexNameExpressionResolver.concreteIndices(IndexNameExpressionResolver.java:77)
at org.elasticsearch.action.admin.indices.delete.TransportDeleteIndexAction.checkBlock(TransportDeleteIndexAction.java:75)
If these lines are collected one by one, the stack frames have no context in Kibana and are impossible to read.
This configuration merges the stack trace into the preceding line that starts with a date and sends them out as a single event, so the full trace is clearly visible in Kibana.
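The merging rule can be illustrated with a short Python sketch. This mimics what multiline.negate: true plus multiline.match: after do (it is not Filebeat's actual code, and the log lines are abbreviated versions of the stack trace above):

```python
# Lines matching the timestamp regex start a new event; every
# non-matching line (e.g. a Java stack frame) is appended to the
# previous event.
import re

pattern = re.compile(r'^[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}')

lines = [
    "2018-04-17 15:12:25.185 IndexNotFoundException[no such index]",
    "    at WildcardExpressionResolver.resolve(IndexNameExpressionResolver.java:566)",
    "    at IndexNameExpressionResolver.concreteIndices(IndexNameExpressionResolver.java:133)",
    "2018-04-17 15:12:26.001 next request handled",
]

events = []
for line in lines:
    if pattern.match(line) or not events:
        events.append(line)        # a timestamped line starts a new event
    else:
        events[-1] += "\n" + line  # stack frames join the previous event

print(len(events))  # 2: the exception with its full stack trace, then the next log line
```

The four input lines collapse into two events, so the whole stack trace arrives in Kibana as one document instead of three unrelated lines.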