Log Analysis System [2] --- Installing and Using Logstash, Installing and Using Filebeat

An enterprise-grade log analysis system.

I. Getting Started with Logstash

Official website: https://www.elastic.co/logstash

1. Installation

This guide uses logstash-7.10.0.
Run the most basic Logstash pipeline to test the installation:

[root@ela1 ~]# ls
logstash-7.10.0-linux-x86_64.tar.gz
[root@ela1 ~]# tar xf logstash-7.10.0-linux-x86_64.tar.gz 
[root@ela1 ~]# cd logstash-7.10.0
[root@ela1 logstash-7.10.0]# bin/logstash -e 'input { stdin { } } output { stdout {} }'

When output like the following appears:

[2020-12-28T06:51:35,703][INFO ][logstash.agent           ] Successfully started Logstash API endpoint {:port=>9600}
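
You can also confirm from another terminal that the Logstash monitoring API is up (a quick sanity check, assuming the default port 9600):

curl -XGET 'localhost:9600/?pretty'

It returns basic node information as JSON.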

Try typing hello.
The output is:

hello
{
      "@version" => "1",
    "@timestamp" => 2020-12-28T12:01:42.559Z,
       "message" => "hello",
          "host" => "ela1"
}

The installation succeeded.

2. Configuring Input and Output

1. Pipeline configuration file

Create a Logstash pipeline that takes Apache web logs from standard input, parses them to produce specific named fields, and then writes the parsed data to an Elasticsearch cluster. Instead of defining the pipeline on the command line, define it in a configuration file.

Create a file named first-pipeline.conf with the following content; it will serve as the Logstash pipeline configuration file:

[root@ela1 logstash-7.10.0]# cat first-pipeline.conf

input {
    stdin { }
}

output {
    stdout { }
}

Test the configuration file:

bin/logstash -f first-pipeline.conf --config.test_and_exit

Start Logstash:

bin/logstash -f first-pipeline.conf --config.reload.automatic

With --config.reload.automatic, Logstash reloads the pipeline automatically whenever the configuration file changes, so you do not have to restart it.
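
By default the file is polled for changes every 3 seconds; the interval can be adjusted if needed (a sketch, assuming the standard duration syntax):

bin/logstash -f first-pipeline.conf --config.reload.automatic --config.reload.interval 5s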

After startup, enter:

83.149.9.216 - - [04/Jan/2015:05:13:42 +0000] "GET /presentations/logstash-monitorama-2013/imageskibana-search.png HTTP/1.1" 200 203023 "http://semicomplete.com/presentations/logstash-monitorama-2013/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.77 Safari/537.36"

Output:

{
    "@timestamp" => 2020-12-28T12:32:09.982Z,
      "@version" => "1",
       "message" => "",
          "host" => "ela1"
}
{
    "@timestamp" => 2020-12-28T12:32:10.035Z,
      "@version" => "1",
       "message" => "83.149.9.216 - - [04/Jan/2015:05:13:42 +0000] \"GET /presentations/logstash-monitorama-2013/imageskibana-search.png HTTP/1.1\" 200 203023 \"http://semicomplete.com/presentations/logstash-monitorama-2013/\" \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.77 Safari/537.36\"",
          "host" => "ela1"
}

Troubleshooting

Failed to execute action {:action=>LogStash::PipelineAction::Create/pipeline_id:main, :exception=>"LogStash::ConfigurationError", :message=>"Expected one of [ \\t\\r\\n], \"#\", \"input\", \"filter\", \"output\" at line 1, col

This usually means the .conf file has a syntax error; check it carefully.
If the file really is correct, a stale Logstash process is still running; kill it and start again.
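
A sketch of how to locate and kill the stale process (the <PID> placeholder is hypothetical; substitute the number you actually see):

ps -ef | grep logstash      # find the running Logstash process
ss -lntp | grep 9600        # or see what is holding the API port
kill <PID>                  # replace <PID> with the PID found above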

2. Parsing web logs with the grok filter plugin

Use the grok filter plugin to parse log messages and create specific named fields, turning unstructured log data into structured, queryable content.
The grok filter assigns field names to the pieces of content you are interested in and binds those pieces to their names.

How does grok know what you are interested in? It recognizes fields through predefined patterns, which you select by configuring the filter with different patterns.

The pattern used here is %{COMBINEDAPACHELOG}.

%{COMBINEDAPACHELOG} structures an Apache log line using the following field mapping:

Original information            Field name
IP address                      clientip
User ID                         ident
User authentication             auth
Timestamp                       timestamp
HTTP request method             verb
Requested URL                   request
HTTP version                    httpversion
Response code                   response
Response body size              bytes
Referrer                        referrer
Client agent (browser)          agent

For more grok usage, see the grok filter documentation in the Logstash reference.
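
The general grok syntax is %{PATTERN:field}: whatever PATTERN matches is bound to field. As a minimal sketch (not part of the original tutorial), this filter pulls only the leading client IP out of each line:

filter {
    grok {
        # IP is a predefined pattern; the match is stored in the "clientip" field
        match => { "message" => "^%{IP:clientip}" }
    }
}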

Note that automatic reloading of the configuration file does not work when the input is stdin,
so here we use the file input instead.

Add log files to make importing convenient:

[root@ela1 logstash-7.10.0]# cat /usr/local/logstash-7.10.0/access_log 
83.149.9.216 - - [04/Jan/2015:05:13:42 +0000] "GET /presentations/logstash-monitorama-2013/imageskibana-search.png HTTP/1.1" 200 203023 "http://semicomplete.com/presentations/logstash-monitorama-2013/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.77 Safari/537.36"
[root@ela1 logstash-7.10.0]# cat /usr/local/logstash-7.10.0/error_log 
 2020/12/29 15:25:10 [warn] 3380#3380: *161 an upstream response is buffered to a temporary file /var/cache/nginx/proxy_temp/5/00/0000000005 while reading upstream, client: 10.9.29.234, server: localhost, request: "GET /35949/bundles/plugin/data/data.plugin.js HTTP/1.1", upstream: "http://127.0.0.1:5601/35949/bundles/plugin/data/data.plugin.js", host: "10.9.12.250:8080", referrer: "http://10.9.12.250:8080/app/home"

match => { "message" => "%{COMBINEDAPACHELOG}" } means:
apply the pattern %{COMBINEDAPACHELOG} to the "message" field and map the matched pieces to the fields above.

After the configuration is complete, verify again:

[root@ela1 logstash-7.10.0]# cat /usr/local/logstash-7.10.0/second-pipeline.conf 
input {
  file {
    path => "/usr/local/logstash-7.10.0/access_log"
    start_position => "beginning"
  }
}

filter {
    grok {
        match => { "message" => "%{COMBINEDAPACHELOG}" }
        remove_field => [ "message" ]
    }
}

output {
  stdout {
     codec => rubydebug
  }
}

Output:

{
     "@timestamp" => 2020-12-29T06:26:15.259Z,
           "path" => "/usr/local/logstash-7.10.0/access_log",
       "clientip" => "83.149.9.216",
    "httpversion" => "1.1",
           "host" => "localhost",
       "referrer" => "\"http://semicomplete.com/presentations/logstash-monitorama-2013/\"",
          "agent" => "\"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.77 Safari/537.36\"",
       "response" => "200",
          "bytes" => "203023",
       "@version" => "1",
           "verb" => "GET",
           "auth" => "-",
          "ident" => "-",
      "timestamp" => "04/Jan/2015:05:13:42 +0000",
        "request" => "/presentations/logstash-monitorama-2013/imageskibana-search.png"
}

The data that was unstructured is now structured.
Note that grok by itself leaves the original message field in place; since it is rarely needed once parsing succeeds, the configuration above already uses one of grok's common options, remove_field, to drop it, which is why message is absent from the output.
In fact remove_field can remove any field, and the value it accepts is an array of field names.
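
For example, a single remove_field can drop several fields at once; removing ident and auth here is purely illustrative:

filter {
    grok {
        match => { "message" => "%{COMBINEDAPACHELOG}" }
        remove_field => [ "message", "ident", "auth" ]
    }
}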

The modified pipeline configuration file is as follows:

[root@localhost logstash-7.10.0]# cat first-pipeline.conf 
input {
  file {
    path => "/usr/local/logstash-7.10.0/access_log"
    start_position => "beginning"
  }
}

filter {
    grok {
        match => { "message" => "%{COMBINEDAPACHELOG}" }
        remove_field => [ "message" ]
    }
}

output {
  stdout { codec => rubydebug }
}

Run it again:

bin/logstash -f first-pipeline.conf --config.reload.automatic

The message field is gone.

Use the following command to append a new line to the sample log file:

[root@localhost logstash-7.10.0]# echo '83.149.9.216 - - [04/Jan/2015:05:13:42 +0000] "GET /presentations/logstash-monitorama-2013/imageskibana-search.png HTTP/1.1" 200 203023 "http://semicomplete.com/presentations/logstash-monitorama-2013/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.77 Safari/537.36"' >> /usr/local/logstash-7.10.0/access_log

Output:

{
          "bytes" => "203023",
           "path" => "/usr/local/logstash-7.10.0/access_log",
      "timestamp" => "04/Jan/2015:05:13:42 +0000",
          "agent" => "\"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.77 Safari/537.36\"",
       "clientip" => "83.149.9.216",
       "response" => "200",
           "verb" => "GET",
       "referrer" => "\"http://semicomplete.com/presentations/logstash-monitorama-2013/\"",
       "@version" => "1",
          "ident" => "-",
     "@timestamp" => 2020-12-29T06:36:51.119Z,
        "request" => "/presentations/logstash-monitorama-2013/imageskibana-search.png",
           "host" => "localhost",
    "httpversion" => "1.1",
           "auth" => "-"
}

3. Enriching the data with the geoip filter plugin

The new pipeline configuration file:

input {
    stdin { }
}

filter {
    grok {
        match => { "message" => "%{COMBINEDAPACHELOG}" }
    }
    geoip {
        source => "clientip"
    }
}

output {
    stdout { codec => rubydebug }
}

Enter the same line as before:

83.149.9.216 - - [04/Jan/2015:05:13:42 +0000] "GET /presentations/logstash-monitorama-2013/imageskibana-search.png HTTP/1.1" 200 203023 "http://semicomplete.com/presentations/logstash-monitorama-2013/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.77 Safari/537.36"

Output:

{
        "message" => "83.149.9.216 - - [04/Jan/2015:05:13:42 +0000] \"GET /presentations/logstash-monitorama-2013/imageskibana-search.png HTTP/1.1\" 200 203023 \"http://semicomplete.com/presentations/logstash-monitorama-2013/\" \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.77 Safari/537.36\"",
           "auth" => "-",
          "geoip" => {
          "country_name" => "Russia",
              "latitude" => 55.7527,
              "location" => {
            "lon" => 37.6172,
            "lat" => 55.7527
        },
                    "ip" => "83.149.9.216",
         "country_code2" => "RU",
         "country_code3" => "RU",
             "city_name" => "Moscow",
        "continent_code" => "EU",
              "timezone" => "Europe/Moscow",
             "longitude" => 37.6172,
           "postal_code" => "144700",
           "region_code" => "MOW",
           "region_name" => "Moscow"
    },
       "@version" => "1",
       "clientip" => "83.149.9.216",
      "timestamp" => "04/Jan/2015:05:13:42 +0000",
        "request" => "/presentations/logstash-monitorama-2013/imageskibana-search.png",
    "httpversion" => "1.1",
       "response" => "200",
           "verb" => "GET",
          "bytes" => "203023",
          "ident" => "-",
       "referrer" => "\"http://semicomplete.com/presentations/logstash-monitorama-2013/\"",
     "@timestamp" => 2020-12-29T02:19:09.153Z,
          "agent" => "\"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.77 Safari/537.36\"",
           "host" => "ela1"
}

For details, see the grok and geoip filter documentation in the Logstash reference.
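
If the full geoip object is more than you need, the filter accepts a fields option that keeps only the listed sub-fields (a sketch):

geoip {
    source => "clientip"
    fields => ["city_name", "country_name", "location"]
}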

II. Installing and Using Filebeat

1. Installation

curl -L -O https://artifacts.elastic.co/downloads/beats/filebeat/filebeat-7.10.0-linux-x86_64.tar.gz

tar xzvf filebeat-7.10.0-linux-x86_64.tar.gz -C /usr/local
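
A quick check that the unpacked binary works:

cd /usr/local/filebeat-7.10.0-linux-x86_64
./filebeat version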

2. Enabling and Configuring Data Collection Modules

Filebeat uses modules to collect and parse log data.

1 List the modules that can be enabled

[root@localhost ~]# cd /usr/local/filebeat-7.10.0-linux-x86_64/
[root@localhost filebeat-7.10.0-linux-x86_64]# ./filebeat modules list
Enabled:

Disabled:
activemq
apache
...

2 In the installation directory, enable one or more modules.

2.1 Enable the nginx module

For example, the following command enables the nginx module:

[root@localhost filebeat-7.10.0-linux-x86_64]# ./filebeat modules enable  nginx
Enabled nginx

What this command actually does is rename the file nginx.yml.disabled in the modules.d/ directory to nginx.yml.
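
You can verify the rename directly:

ls modules.d/ | grep nginx
# nginx.yml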

2.2 Configure the nginx module

Contents of nginx.yml:

- module: nginx
  access:
    # Collect access logs
    enabled: true
    #var.paths:
  error:
    # Collect error logs
    enabled: true
    #var.paths:
  ingress_controller:
    # Parses ingress-nginx logs in Kubernetes environments; disabled by default
    enabled: false
    #var.paths:
var.paths

sets custom paths for the log files. If this option is not set, Filebeat chooses paths based on your operating system.
For example:

/var/log/nginx/access.log
/var/log/nginx/error.log

var.paths accepts an array, so it can be configured as follows:

- module: nginx
  access:
    # Collect access logs
    enabled: true
    var.paths: ["/var/log/nginx/access.log*"]  # custom location

Deployment steps

1. Logstash: add a beats input
beats {
    port => 5044
}

2. Filebeat

Data source: where to collect logs from.

Module:
./filebeat modules enable nginx

var.paths: ["/path/to/*.log"]

Configure the output in filebeat.yml:
output to Logstash, using the Logstash host's IP address.

3. Start filebeat

./filebeat           # foreground

nohup ./filebeat &   # background

3 Modify the configuration files

[root@localhost ~]# cat /usr/local/logstash-7.10.0/first-pipeline.conf 
input {
    beats {
        port => 5044
    }
}

filter {
    grok {
        match => { "message" => "%{COMBINEDAPACHELOG}" }
        remove_field => [ "message" ]
    }
    geoip {
        source => "clientip"
    }
}

output {
  stdout { codec => rubydebug }
}

Then edit the output section of /usr/local/filebeat-7.10.0-linux-x86_64/filebeat.yml so that Filebeat sends its events to Logstash.
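
A minimal sketch of the relevant part of filebeat.yml, assuming Logstash listens on port 5044 on 192.168.116.167 (use your own Logstash host's IP):

# Comment out the default Elasticsearch output ...
#output.elasticsearch:
#  hosts: ["localhost:9200"]

# ... and send events to Logstash instead
output.logstash:
  hosts: ["192.168.116.167:5044"]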

4 Startup

[root@localhost ~]# nohup /usr/local/filebeat-7.10.0-linux-x86_64/filebeat &   # run in the background
[root@localhost ~]# cd /usr/local/logstash-7.10.0/
[root@localhost logstash-7.10.0]# bin/logstash -f first-pipeline.conf --config.reload.automatic

Output:

[2020-12-29T02:22:28,710][INFO ][logstash.agent           ] Successfully started Logstash API endpoint {:port=>9600}
{
       "clientip" => "10.9.29.250",
      "timestamp" => "29/Dec/2020:08:46:43 +0800",
       "referrer" => "\"http://10.9.12.250:8080/app/home\"",
           "verb" => "POST",
          "ident" => "-",
           "auth" => "-",
            "ecs" => {
        "version" => "1.5.0"
    },
          "event" => {
        "timezone" => "-05:00",
         "dataset" => "nginx.access",
          "module" => "nginx"
    },
     "@timestamp" => 2020-12-29T07:21:11.293Z,
           "host" => {
         "architecture" => "x86_64",
                 "name" => "localhost",
                   "ip" => [
            [0] "192.168.116.167",
            [1] "fe80::2bcc:46ea:d75d:d5dc"
        ],
...

III. A Simple Production Setup: Elasticsearch + Logstash + Filebeat

Together with Elasticsearch, the collected logs can be queried across the cluster:

[root@localhost filebeat-7.10.0-linux-x86_64]# curl -X GET "192.168.116.167:9200/_cat/nodes"
192.168.116.155 33 96  6 0.33 0.23 0.17 cdhilmrstw * ela2
192.168.116.166 31 96  6 0.10 0.11 0.13 cdhilmrstw - ela3
192.168.116.167 29 96 39 1.39 1.67 1.84 cdhilmrstw - ela1
[root@ela3 ~]# curl -X GET "192.168.116.167:9200/_cat/indices?v"
health status index                      uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   logstash-2020.12.28-000001 W1R0NfkXToCDbgy_nSmr8A   1   1          0            0    
[root@localhost ~]# cat /usr/local/logstash-7.10.0/first-pipeline.conf 
input {
    beats {
        port => 5044
    }
}

filter {
    grok {
        match => { "message" => "%{COMBINEDAPACHELOG}" }
        remove_field => [ "message" ]
    }
    geoip {
        source => "clientip"
    }
}

output {
  stdout { codec => rubydebug }

  elasticsearch {
    # Output to the Elasticsearch cluster
    hosts => ["192.168.116.167:9200","192.168.116.155:9200","192.168.116.166:9200"]
  }
}

As new lines are appended to the log, processing continues.
Output:

{
        "request" => "/api/ui_metric/report",
       "clientip" => "10.9.29.250",
        "fileset" => {
        "name" => "access"
    },
       "referrer" => "\"http://10.9.12.250:8080/app/home\"",
      "timestamp" => "29/Dec/2020:08:46:43 +0800",
        "service" => {
        "type" => "nginx"
    },
          "agent" => {
        "ephemeral_id" => "68962cc2-f4d6-465a-b7bc-8cc3aa91429d",
                  "id" => "dce975d3-24f5-421f-a7ca-0dadfc6348f1",
            "hostname" => "localhost",
                "type" => "filebeat",
                "name" => "localhost",
             "version" => "7.10.0"
    },
          "geoip" => {},
            "log" => {
          "file" => {
            "path" => "/var/log/nginx/access.log"
        },
        "offset" => 0
    },
    "httpversion" => "1.1",
          "bytes" => "0",
           "tags" => [
        [0] "beats_input_codec_plain_applied",
        [1] "_geoip_lookup_failure"
    ],
       "response" => "499",
          "input" => {
        "type" => "log"
    },
           "verb" => "POST",
           "host" => {
                   "id" => "38b8887c97c045caa0333f41031ea4ea",
             "hostname" => "localhost",
                  "mac" => [
            [0] "00:0c:29:6d:70:86"
        ],
                   "os" => {
            "platform" => "centos",
                "name" => "CentOS Linux",
            "codename" => "Core",
              "family" => "redhat",
             "version" => "7 (Core)",
              "kernel" => "3.10.0-1127.19.1.el7.x86_64"
        },
         "architecture" => "x86_64",
        "containerized" => false,
                 "name" => "localhost",
                   "ip" => [
            [0] "192.168.116.167",
            [1] "fe80::2bcc:46ea:d75d:d5dc"
        ]
    },
       "@version" => "1",
          "ident" => "-",
           "auth" => "-",
            "ecs" => {
        "version" => "1.5.0"
    },
          "event" => {
          "module" => "nginx",
        "timezone" => "-05:00",
         "dataset" => "nginx.access"
    },
     "@timestamp" => 2020-12-29T09:53:38.427Z
}
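
To confirm the events actually reached the Elasticsearch cluster, a simple query against the logstash-* indices works (a sketch; the clientip value is taken from the output above):

curl -X GET "192.168.116.167:9200/logstash-*/_search?q=clientip:10.9.29.250&pretty"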


Reprinted from blog.csdn.net/Houaki/article/details/111997131