Log collection and analysis example

Background: A set of business landing pages has generated a batch of CDN logs. The landing-page addresses need to be extracted from those logs, and then content analysis performed on them.

I have built an Elastic Stack solution before, including a real-time Nginx log collection system, and I am partial to Elastic products: quick to configure, little development work, simple and efficient. So I decided to keep using Elastic products as the tools here.

1. Solution: Filebeat + Logstash + MySQL + a scripting language

2. Preparations (the specific process is omitted):

1. Log files to be collected.
2. Install MySQL (a sketch of the target table follows this list).
3. Install Filebeat.
4. Install Logstash.
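
A minimal sketch of the MySQL table the pipeline writes into. The database name (land_page) and the column names are taken from the Logstash config later in this post; the column types and lengths are my own assumptions, not from the original.

CREATE DATABASE IF NOT EXISTS land_page DEFAULT CHARACTER SET utf8mb4;

-- Column names match the INSERT statement in the Logstash output below;
-- types and lengths are assumptions.
CREATE TABLE IF NOT EXISTS land_page.land_page_log (
    id           BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    visited_ip   VARCHAR(45),   -- fits IPv4 and IPv6 addresses
    visited_time VARCHAR(64),   -- grok captures HTTPDATE as a string
    referer_url  TEXT,
    target_url   TEXT,
    content_type VARCHAR(255),
    user_agent   TEXT
) ENGINE=InnoDB;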

3. Detailed process:

1. Configure Filebeat to collect logs.
Modify filebeat.yml:

# Configure Filebeat's input
# ============================== Filebeat inputs ===============================
# Enable Filebeat's input configuration here
filebeat.inputs:

# Specify the input type
- type: log
  enabled: true

  paths:
    # Path to the log files
    - /Users/zhang****/Downloads/test_log/logs/*

# Configure Filebeat's output, here to Logstash.
# (Note: comment out any other outputs first.)
# ------------------------------ Logstash Output -------------------------------
output.logstash:
  # The Logstash hosts
  hosts: ["localhost:5044"]

2. Install the logstash-output-jdbc plugin and download mysql-connector-java.

# Install the logstash-output-jdbc plugin
./bin/logstash-plugin install logstash-output-jdbc

# Download mysql-connector-java
# A suitable version can be chosen here: http://mvnrepository.com/artifact/mysql/mysql-connector-java
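
For example, a sketch of fetching the jar from Maven Central with curl; the 8.0.16 version and the target directory are assumptions chosen to match the driver_jar_path used in the Logstash config below.

# Assumption: version 8.0.16 and this directory layout match the
# driver_jar_path referenced later in logstash.conf.
mkdir -p ./mysql-connector-java
curl -L -o ./mysql-connector-java/mysql-connector-java-8.0.16.jar \
  https://repo1.maven.org/maven2/mysql/mysql-connector-java/8.0.16/mysql-connector-java-8.0.16.jar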

3. Configure the input, filter, and output of Logstash.
Modify the configuration file ./config/logstash.conf:

# Configure the input source as Beats (Filebeat is one of Elastic's Beats products)
input {
  beats {
    port => 5044

    # Client inactivity timeout
    client_inactivity_timeout => 60000
  }
}

# Configure log filtering; the grok plugin is used to match the log format
# Grok patterns: https://github.com/logstash-plugins/logstash-patterns-core
filter {
  grok {
    match => {
      "message" => "\[%{HTTPDATE:visited_time}\] %{IP:visited_ip} .* \"%{GREEDYDATA:referer_url}\" \"GET %{URI:target_url}\" .* \"%{GREEDYDATA:user_agent}\" \"%{GREEDYDATA:content-type}\""
    }
  }

  # Drop log lines whose Content-Type is an icon
  if [content-type] == "image/x-icon" {
    drop {}
  }
}
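
For reference, here is a hypothetical log line in the shape this grok pattern expects; the field values are illustrative and not taken from the original logs:

[10/Apr/2021:13:55:36 +0800] 203.0.113.7 - "https://example.com/campaign" "GET http://landing.example.com/page1.html" 200 "Mozilla/5.0" "text/html"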

# Configure the output of the results
output {
  #elasticsearch {
  #  hosts => ["http://localhost:9200"]
  #  index => "%{[@metadata][beat]}-%{[@metadata][version]}-%{+YYYY.MM.dd}"
  #  #user => "elastic"
  #  #password => "changeme"
  #}

  #stdout {
  #  codec => rubydebug {}
  #}

  # Requires the logstash-output-jdbc plugin and the mysql-connector-java jar
  # installed in step 2
  jdbc {
    connection_string => "jdbc:mysql://localhost:3306/land_page?serverTimezone=Asia/Shanghai&useUnicode=true&characterEncoding=utf8&useSSL=false&allowMultiQueries=true"
    username => "root"
    password => "***"
    driver_jar_path => "/Users/zhang***/Downloads/logstash-7.11.2/mysql-connector-java/mysql-connector-java-8.0.16.jar"
    driver_class => "com.mysql.cj.jdbc.Driver"
    statement => [ "INSERT INTO land_page_log (visited_ip,visited_time,referer_url,target_url,content_type,user_agent) VALUES(?,?,?,?,?,?)", "[visited_ip]", "[visited_time]", "[referer_url]", "[target_url]", "[content-type]", "[user_agent]" ]
  }
}

4. Start the services

# Start the Logstash service
cd <Logstash install directory>
./bin/logstash -f ./config/logstash.conf

# Start the Filebeat service
cd <Filebeat install directory>
./filebeat -e -c ./filebeat.yml
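
Once both services are running, a quick sanity check that rows are arriving in MySQL (my own suggestion, not from the original post):

-- Run in the mysql client
SELECT COUNT(*) FROM land_page.land_page_log;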

5. Log analysis

The log content to be analyzed is now collected in the MySQL database for persistent storage, and it can be analyzed at leisure with other tools later. I will not elaborate on that here, beyond the small example query below.
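
As one illustrative starting point (my own example, not from the original post), counting visits per landing page directly in SQL:

-- Top landing pages by visit count, from the collected logs
SELECT target_url, COUNT(*) AS visits
FROM land_page.land_page_log
GROUP BY target_url
ORDER BY visits DESC
LIMIT 20;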

Origin: blog.51cto.com/phpme/2676191