Filebeat collects Tomcat and Nginx logs

1. Introduction

Beats is a newer member of the ELK stack: a family of lightweight log shippers. The collector we used earlier was Logstash, but Logstash consumes a lot of resources and is nowhere near as lightweight as Beats, so Beats is the recommended choice for log collection. Beats is also extensible and supports custom-built shippers.

Official introduction: https://www.elastic.co/en/products/beats

2. Architecture

From bottom to top: filebeat is well suited as the client that collects logs because it is light enough; it ships the log data to a redis cluster; logstash pulls the logs out of redis, filters them, and stores them in the elasticsearch cluster; finally, kibana visualizes the data from the es cluster. Simple, right?

3. Installation

Remember the rpm package you downloaded earlier? Now it's its turn:

yum install -y filebeat-7.3.2-x86_64.rpm
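If you want a quick sanity check that the rpm installed cleanly, filebeat can report its own version, and rpm can list where the files went (paths assume the default package layout):

filebeat version
rpm -ql filebeat | head   # binary, config and module files installed by the package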

4. Configuration

Delete the original configuration directly and replace it with a new one

rm -rf /etc/filebeat/filebeat.yml
vim /etc/filebeat/filebeat.yml
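If you would rather keep the stock file around for reference instead of deleting it, copying it aside first works just as well (purely optional, nothing below depends on it):

cp /etc/filebeat/filebeat.yml /etc/filebeat/filebeat.yml.bak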

The following is the configuration content

#=========================== Filebeat inputs =============================
# Some older system versions do not recognize "inputs"; use "prospectors" instead:
#filebeat.prospectors:
# CentOS 7 uses "inputs":
filebeat.inputs:

# Each - is an input. Most options can be set at the input level, so
# you can use different inputs for various configurations.
# Below are the input specific configurations.

  
#=========== nginx error日志 ==============

- document_type: "172.17.19.137‐nginx‐error‐log"
  enabled: true
  paths:
    - /opt/server/nginx/logs/error.log
  encoding: utf-8
  scan_frequency: 10s
  tail_files: true
  tags: "nginx-error-qudao-searching"
#=========== searching日志 ==============

- document_type: "172.17.19.137‐searching‐log"
  enabled: true
  paths:
    - /opt/server/tomcat_searching/logs/catalina.out
  encoding: utf-8
  scan_frequency: 10s
  include_lines: ['ERROR', 'WARN']
  multiline.pattern: ^\d{4}-\d{2}-\d{2}
  multiline.negate: true
  multiline.match: after
  tail_files: true
  tags: "searching"

#=========== qudao日志 ==============

- document_type: "172.17.19.137‐qudao‐log"
  enabled: true
  paths:
    - /opt/server/tomcat_qudao/logs/catalina.out
  encoding: utf-8
  scan_frequency: 10s
  include_lines: ['ERROR', 'WARN']
  multiline.pattern: ^\d{4}-\d{2}-\d{2}
  multiline.negate: true
  multiline.match: after
  tail_files: true
  tags: "qudao"
  
#=========== redis output  ===============================

output.redis:
   hosts: ["172.17.19.149:7227"]   #输出到redis的机器
   password: "redis配置的密码"
   key: "filebeat_19.137"   #redis中日志数据的key值ֵ
   db: 0
   timeout: 5

#============================= Filebeat modules ===============================

filebeat.config.modules:
  # Glob pattern for configuration loading
  path: ${path.config}/modules.d/*.yml

A few notes on the configuration. The filebeat config has two parts, input and output. As the client, filebeat keeps scanning the log files at 10-second intervals (scan_frequency in the config is that scanning interval), and output defines where the data goes next. This configuration uses a redis cluster as the relay; Kafka is also commonly used here, so pick whichever you prefer.

**filebeat.inputs:** this part is a common trip-up, so pay attention, otherwise filebeat will not start.
Some older system versions do not recognize inputs and use prospectors instead, i.e.:
filebeat.prospectors:
CentOS 7 uses inputs, i.e.:
filebeat.inputs:
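A quick way to catch this kind of mistake before starting the service is to let filebeat validate the configuration itself (the rpm build points it at /etc/filebeat/filebeat.yml by default):

filebeat test config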

document_type: this is the key point. Every log gets identified by this label; think of it as a title, or the start of a paragraph. In earlier versions it was called type, now it is document_type.

include_lines: ['ERROR', 'WARN']
multiline.pattern: ^\d{4}-\d{2}-\d{2}
multiline.negate: true
multiline.match: after

These are the keyword and regex filters applied to the log lines; they fit tomcat logs and can be copied as-is. Filebeat has had its own filtering and matching capability since version 6.0, and from my own practice, if even INFO-level logs are collected the redis cluster gets overwhelmed outright: logstash simply cannot keep up with the rate at which entries are pushed into redis, so this step is necessary. You can also see from this configuration that I did not add any filter conditions to the nginx error log; it is shipped in its original form.
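To make the multiline settings concrete, here is a hypothetical fragment of catalina.out (the class name is invented for illustration): the pattern matches lines starting with a yyyy-MM-dd timestamp, negate: true flags every line that does not match, and match: after appends those lines to the preceding matching line, so a stack trace travels with the ERROR line that produced it:

2021-03-26 10:15:02 ERROR com.example.SearchService - query failed        <- starts a new event
java.lang.NullPointerException                                             <- no timestamp, appended to the event above
    at com.example.SearchService.query(SearchService.java:88)              <- still part of the same ERROR event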

**tags:** these exist for logstash to filter on, and will be used later in the logstash configuration.

In the output part, only key and db need explaining.
key: the list the logs are stored under in redis; logstash will pull from these lists for analysis and processing.
db: the redis database number; keys can repeat across databases with different numbers.
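To see what actually lands under these keys, the standard redis list commands are enough (adjust port and password to your setup; filebeat pushes each log event onto the list as a JSON document):

redis-cli -p <redis port> -a <redis password>
127.0.0.1:6379> LLEN filebeat_19.137         # how many events are queued
127.0.0.1:6379> LRANGE filebeat_19.137 0 0   # peek at the first queued event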

Well, that's all for the configuration, it really is simple. But don't rush to start the service yet. First make sure redis is running normally, otherwise the filebeat log will simply tell you the connection failed (there is also a built-in check for this, shown below). Also, try to start filebeat only after logstash is already running; otherwise data is only queued and never consumed, and the log volume will blow redis up. Filebeat's own log output can be viewed in /var/log/messages.
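Filebeat can check that connection for you before the service is started, using its built-in output test (again assuming the rpm's default config path):

filebeat test output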

#centos7
systemctl start filebeat
#centos6
service filebeat start
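To have filebeat come back after a reboot as well, enabling it is one extra command (optional):

#centos7
systemctl enable filebeat
#centos6
chkconfig filebeat on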

After starting the service you can go into redis and see whether a key has come through. Of course, if nothing shows up at first, that is normal; after all, we only let WARN and ERROR lines through. During testing you can remove that filter condition to verify the connection, as shown in the snippet below.
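For such a test it is enough to temporarily comment out the include_lines line in the relevant input and restart filebeat; just remember to restore it afterwards:

  #include_lines: ['ERROR', 'WARN']   # commented out temporarily so every line is shipped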

To go into the redis cluster and check whether the keys are there:

redis-cli -p <redis port> -a <redis password>
127.0.0.1:6379> keys *
1) "filebeat_19.137"
2) "filebeat_19.138"


You can see that a few keys have come through. The next step is to write the logstash configuration file. Since each project will likely need its own handling, it is recommended to store the config in the conf.d directory under the project name.

vim /etc/logstash/conf.d/<project name>.conf
input {
    redis {
        data_type => "list"
        key => "filebeat_19.137"
        host => "127.0.0.1"
        port => "<redis port>"
        password => "<redis password>"
        db => "0"
        #codec => json
    }
    redis {
        data_type => "list"
        key => "filebeat_19.138"
        host => "127.0.0.1"
        port => "<redis port>"
        password => "<redis password>"
        db => "0"
        #codec => json
    }
}

#filter {
#    if "sfa-web" in [tags]{
#        grok {
#            match => ["message", "%{TIMESTAMP_ISO8601:time}\s* \s*%{NOTSPACE:thread-id}\s* \s*%{LOGLEVEL:level}\s* \s*%{JAVACLASS:class}\s* \- \s*%{JAVALOGMESSAGE:logmessage}\s*"]
#        }
#    }
#    mutate {
#        remove_field => "log"
#        remove_field => "beat"
#        remove_field => "meta"
#        remove_field => "prospector"
#        remove_field => "[host][os]"
#    }
#}


output {
    if "nginx-error" in [tags]{
        elasticsearch {
            hosts => ["127.0.0.1:9200"]
            index => "qudao-nginx-error-%{+yyyy.MM.dd}"
            user => "elastic"
            password => "<es ACL password>"
        }
    }
    if "qudao" in [tags]{
        elasticsearch {
            hosts => ["127.0.0.1:9200"]
            index => "qudao-%{+yyyy.MM.dd}"
            user => "elastic"
            password => "<es ACL password>"
        }
    }
    if "searching" in [tags]{
        elasticsearch {
            hosts => ["127.0.0.1:9200"]
            index => "searching-%{+yyyy.MM.dd}"
            user => "elastic"
            password => "<es ACL password>"
        }
    }
}

A few words about this configuration file. The logstash config is divided into three parts: input is the extraction part, filter is the filtering part, and output is the storage part.
The input part needs little explanation: the keys are pulled from the local redis cluster for processing.
In the filter part I commented out the logstash filter conditions; uncomment and adapt them as needed.
The output part routes by the data pulled from redis: here I classify on the tags that were attached in filebeat earlier, and store entries whose tags match nginx-error, qudao, and searching in different ES indexes. With this approach, logs from different backend servers behind a load balancer can be aggregated into the same per-project index for centralized handling.
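Before starting, logstash can parse the file and report syntax errors without actually running the pipeline (the binary path assumes the rpm install):

/usr/share/logstash/bin/logstash --config.test_and_exit -f /etc/logstash/conf.d/<project name>.conf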

Next comes the moment to witness the miracle: start logstash!

systemctl start logstash
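If nothing seems to happen, the logstash log is the first place to look (default location for an rpm install):

tail -f /var/log/logstash/logstash-plain.log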

After starting, you can watch the keys in redis being consumed bit by bit:

127.0.0.1:6379> keys *
(empty list or set)

Go to the ES server and check the indexes. You can see the newly added nginx-error, qudao, and searching indexes with the date suffix. If one of them does not show up, that is fine too; after all, only warnings and errors are collected. The point is that filebeat and ES are communicating normally:

curl 'elastic:<es ACL password>@127.0.0.1:9200/_cat/indices?v'
yellow open   searching-2021.03.26                                          tavRmrvmR5CgeCpNgmjy8w   1   1        115            0     95.4kb         95.4kb
yellow open   qudao-2021.03.26                                              L3RSZ5zpRTaTFknRGoZx8A   1   1       1088            0    674.7kb        674.7kb
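To confirm that the fields (message, tags, timestamp) made it through intact, you can pull a single document back out of one of the new indexes (same credentials as above; the index name is taken from the listing):

curl 'elastic:<es ACL password>@127.0.0.1:9200/searching-2021.03.26/_search?size=1&pretty'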

The indexes also show up in Kibana, where the index pattern can be configured.


Origin blog.csdn.net/qq_35855396/article/details/115318209