Getting Started with Logstash Basics


1. Introduction

Logstash is an open-source data collection engine with real-time pipelining capabilities. It can collect data from different sources, filter it in a unified way, and output it to a destination according to the developer's specifications.

As the name implies, the data objects that Logstash collects are log files. Log files come from many sources (system logs, server logs, and so on) and their content is messy, which makes them hard for humans to read. Logstash lets us collect and filter log files in a unified way and turn them into highly readable content, so that developers or operations staff can inspect them easily, analyze system/application performance effectively, and set up monitoring and alerting.

2. Installation

Logstash depends on JDK 1.8, so before installing it, make sure JDK 1.8 is installed and configured on the machine.
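
You can verify the Java version with:

java -version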

Download Logstash from the official website, then extract it and enter the directory:

tar -zxvf logstash-5.6.3.tar.gz -C /usr

cd logstash-5.6.3

3. Composition

Logstash works through pipelines. A pipeline has two required elements, input and output, and one optional element, filter.

Input plugins read data from a data source, filter plugins modify the data according to the format the user specifies, and output plugins write the data to a destination. As shown below:

(figure: data flows through the pipeline from inputs, through filters, to outputs)

Let's start with a simple case:

bin/logstash -e 'input { stdin { } } output { stdout {} }'

After Logstash starts, type Hello World. The result is as follows:

[root@localhost logstash-5.6.3]# bin/logstash -e 'input { stdin { } } output { stdout {} }'
Sending Logstash's logs to /usr/logstash-5.6.3/logs which is now configured via log4j2.properties
[2017-10-27T00:17:43,438][INFO ][logstash.modules.scaffold] Initializing module {:module_name=>"fb_apache", :directory=>"/usr/logstash-5.6.3/modules/fb_apache/configuration"}
[2017-10-27T00:17:43,440][INFO ][logstash.modules.scaffold] Initializing module {:module_name=>"netflow", :directory=>"/usr/logstash-5.6.3/modules/netflow/configuration"}
[2017-10-27T00:17:43,701][INFO ][logstash.pipeline        ] Starting pipeline {"id"=>"main", "pipeline.workers"=>1, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>5, "pipeline.max_inflight"=>125}
[2017-10-27T00:17:43,744][INFO ][logstash.pipeline        ] Pipeline main started
The stdin plugin is now waiting for input:
[2017-10-27T00:17:43,805][INFO ][logstash.agent           ] Successfully started Logstash API endpoint {:port=>9600}
Hello World
2017-10-27T07:17:51.034Z localhost.localdomain Hello World

Through the Logstash pipeline, the input Hello World became the output 2017-10-27T07:17:51.034Z localhost.localdomain Hello World: Logstash added a timestamp and hostname to the event.

In a production environment, Logstash's pipeline is much more complex and may require multiple input, filter, and output plugins to be configured.

Therefore, a configuration file is needed to manage the input, filter, and output configuration. The format of the configuration file is as follows:

# input
input {
  ...
}

# filter
filter {
  ...
}

# output
output {
  ...
}

Simply configure the input, filter, output, and codec plugins in the corresponding positions according to your own needs.
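
For example, here is a minimal sketch combining all four kinds of plugins; the log path and added field are purely illustrative:

input {
    file {
        path  => "/tmp/app.log"      # hypothetical log file
        codec => "json"              # codec attached to an input: parse each line as JSON
    }
}

filter {
    # add a field to every event
    mutate {
        add_field => { "env" => "dev" }
    }
}

output {
    stdout {
        codec => rubydebug           # pretty-print events to the console
    }
}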

4. Plugin Usage

Before using the plugins, let's understand a concept: events.

Each piece of data that Logstash reads in is called an event.
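
For example, printed with the rubydebug codec (used in the examples below), the Hello World event from earlier would look roughly like this (exact field values depend on your machine):

{
      "@version" => "1",
    "@timestamp" => 2017-10-27T07:17:51.034Z,
          "host" => "localhost.localdomain",
       "message" => "Hello World"
}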

Create a configuration file named logstash.conf in the Logstash directory (you can name it whatever you like).

4.1 Input plugins

An input plugin reads events from a specific source into the Logstash pipeline. It is configured in input {}, and multiple inputs can be set.
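
For example, a pipeline can read from stdin and a file at the same time; this is just a sketch with an illustrative path:

input {
    stdin { }
    file { path => "/tmp/app.log" }
}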

Modify the configuration file:

input {
    # Read log entries from a file
    file {
        path => "/var/log/messages"
        type => "system"
        start_position => "beginning"
    }
}

# filter {
#
# }

output {
    # Standard output
    stdout { codec => rubydebug }
}

Here, /var/log/messages is the system log file.

Save the file, then type:

bin/logstash -f logstash.conf
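
Optionally, you can first check the configuration file for syntax errors without starting the pipeline; in Logstash 5.x this is done with the --config.test_and_exit flag (-t for short):

bin/logstash -f logstash.conf --config.test_and_exit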

The results in the console are as follows:

{
      "@version" => "1",
          "host" => "localhost.localdomain",
          "path" => "/var/log/messages",
    "@timestamp" => 2017-10-29T07:30:02.601Z,
       "message" => "Oct 29 00:30:01 localhost systemd: Starting Session 16 of user root.",
          "type" => "system"
}
......

4.2 Output plugins

An output plugin sends event data to a specific destination. It is configured in output {}, and multiple outputs can be set.
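
For example, a sketch that writes every event both to the console and to Elasticsearch (the host address matches the one used below):

output {
    stdout { codec => rubydebug }
    elasticsearch { hosts => ["192.168.2.41:9200"] }
}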

Modify the configuration file:

input {
    # Read log entries from a file
    file {
        path => "/var/log/error.log"
        type => "error"
        start_position => "beginning"
    }
}

# filter {
#
# }

output {
    # Output to Elasticsearch
    elasticsearch {
        hosts => ["192.168.2.41:9200"]
        index => "error-%{+YYYY.MM.dd}"
    }
}

The content of error.log looks like this:

2017-08-04 13:57:30.378 [http-nio-8080-exec-1] ERROR c.g.a.global.ResponseResultAdvice -设备数据为空
com.light.pay.common.exceptions.ValidationException: 设备数据为空
    at com.light.pay.common.validate.Check.isTrue(Check.java:31)
    at com.light.attendance.controllers.cloudApi.DevicePushController.deviceInfoPush(DevicePushController.java:44)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
    at java.lang.Thread.run(Thread.java:745)
2017-08-04 13:57:44.495 [http-nio-8080-exec-2] ERROR c.g.a.global.ResponseResultAdvice -Failed to invoke remote method: pushData, provider: dubbo://192.168.2.100:20880/com.light.attendance.api.DevicePushApi?application=salary-custom&default.check=false&default.timeout=30000&dubbo=2.8.4&interface=com.light.attendance.api.DevicePushApi&methods=getAllDevices,getDeviceById,pushData&organization=com.light.attendance&ow
......

The configuration file uses the elasticsearch output plugin, so the log events are saved to Elasticsearch, with the index name following the format set by the index parameter.
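
The %{+YYYY.MM.dd} part expands from the event's @timestamp field (by default, the time at which Logstash read the line), so events are written to daily indices. For example, an event read on 2017-10-29 would be stored in the index:

error-2017.10.29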

If you are not familiar with the basics of Elasticsearch, see the article "Introduction to Elasticsearch Basics" on this site or consult other references.

Save the file, then type:

bin/logstash -f logstash.conf

Open a browser, visit http://192.168.2.41:9100, and use the head plugin to view the Elasticsearch data. The result is as follows:
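
If the head plugin is not available, the index can also be inspected directly through the Elasticsearch REST API, for example:

curl 'http://192.168.2.41:9200/error-*/_search?pretty&size=1'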

(figure: the head plugin shows each log line, including every stack-trace line, indexed as a separate event)

Pitfall warning: the file input plugin uses "\n" by default to determine the boundary of each line in the log. error.log is an error log that I edited by hand; when copying and pasting the log content, I forgot to add a newline at the end, so that log data could not be imported into Elasticsearch. Keep this detail in mind.
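
A quick way to check for, and fix, a missing trailing newline with standard shell tools:

# prints the last byte; output is empty if the file ends with "\n"
tail -c 1 /var/log/error.log

# append a newline only when the last byte is not "\n"
[ -n "$(tail -c 1 /var/log/error.log)" ] && echo >> /var/log/error.log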

4.3 Codec plugins

A codec plugin is essentially a stream filter; it is used together with an input or output plugin.
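
For example, a codec can sit on either side of the pipeline. This sketch parses each incoming stdin line as JSON and pretty-prints outgoing events:

input {
    stdin { codec => "json" }
}

output {
    stdout { codec => "rubydebug" }
}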

From the figure above we can see a problem: the Java exception log has been split into single-line events and recorded in Elasticsearch, which does not match how developers or operations staff read such logs. Therefore, we need to re-encode the log information so that the multiple lines of one log entry are recorded as a single event.

To do this we configure the multiline codec plugin, which merges multiple lines of log output into one line and processes it as a single event.

Logstash does not install this plugin by default; you need to install it yourself. Type:

bin/logstash-plugin install logstash-codec-multiline
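
You can verify that the plugin was installed with the plugin manager:

bin/logstash-plugin list | grep multiline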

Modify the configuration file:

input {
    # Read log entries from a file
    file {
        path => "/var/log/error.log"
        type => "error"
        start_position => "beginning"
        # Use the multiline codec
        codec => multiline {
            # Match lines with a regular expression; adjust it to your own log format.
            # Here, any line that does NOT start with a digit (negate => true)
            # belongs to the previous event (what => "previous"),
            # so stack-trace lines are folded into the log entry above them.
            pattern => "^\d"
            negate => true
            what => "previous"
        }
    }
}

# filter {
#
# }

output {
    # Output to Elasticsearch
    elasticsearch {
        hosts => ["192.168.2.41:9200"]
        index => "error-%{+YYYY.MM.dd}"
    }
}

Save the file, then type:

bin/logstash -f logstash.conf

Use the head plugin to view the Elasticsearch data again. The result is as follows:

(figure: the head plugin now shows each exception, stack trace included, stored as a single event)

4.4 Filter plugins

Filter plugins sit in the middle of the Logstash pipeline and perform filtering on events. They are configured in filter {}, and multiple filters can be set.

This test demonstrates the grok plugin, which is used to parse messy content into structured fields and make it readable.

Install:

bin/logstash-plugin install logstash-filter-grok

Modify the configuration file:

input {
    stdin { }
}

filter {
    grok {
        match => {
            "message" => "%{IP:client} %{WORD:method} %{URIPATHPARAM:request} %{NUMBER:bytes} %{NUMBER:duration}"
        }
    }
}

output {
    stdout { codec => "rubydebug" }
}

Save the file, then type:

bin/logstash -f logstash.conf

After startup succeeds, we enter:

55.3.244.1 GET /index.html 15824 0.043

The console returns:

[root@localhost logstash-5.6.3]# bin/logstash -f logstash.conf 
Sending Logstash's logs to /root/logstash-5.6.3/logs which is now configured via log4j2.properties
[2017-10-30T08:23:20,456][INFO ][logstash.modules.scaffold] Initializing module {:module_name=>"fb_apache", :directory=>"/root/logstash-5.6.3/modules/fb_apache/configuration"}
[2017-10-30T08:23:20,459][INFO ][logstash.modules.scaffold] Initializing module {:module_name=>"netflow", :directory=>"/root/logstash-5.6.3/modules/netflow/configuration"}
[2017-10-30T08:23:21,447][INFO ][logstash.pipeline        ] Starting pipeline {"id"=>"main", "pipeline.workers"=>1, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>5, "pipeline.max_inflight"=>125}
The stdin plugin is now waiting for input:
[2017-10-30T08:23:21,516][INFO ][logstash.pipeline        ] Pipeline main started
[2017-10-30T08:23:21,573][INFO ][logstash.agent           ] Successfully started Logstash API endpoint {:port=>9600}
55.3.244.1 GET /index.html 15824 0.043
{
      "duration" => "0.043",
       "request" => "/index.html",
    "@timestamp" => 2017-10-30T15:23:23.912Z,
        "method" => "GET",
         "bytes" => "15824",
      "@version" => "1",
          "host" => "localhost.localdomain",
        "client" => "55.3.244.1",
       "message" => "55.3.244.1 GET /index.html 15824 0.043"
}

Each part of the input line is matched and stored under the corresponding field name from the grok pattern.
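
Grok patterns take the form %{SYNTAX:SEMANTIC}: SYNTAX is the name of a predefined pattern to match, and SEMANTIC is the field name to store the match under. As a sketch, the error.log entries from section 4.2 could be structured the same way; the field names here are illustrative, and the pattern assumes the multiline codec has already merged each stack trace into a single event:

filter {
    grok {
        # e.g. "2017-08-04 13:57:30.378 [http-nio-8080-exec-1] ERROR c.g.a.global.ResponseResultAdvice -..."
        match => {
            "message" => "%{TIMESTAMP_ISO8601:logtime} \[%{DATA:thread}\] %{LOGLEVEL:level} %{GREEDYDATA:detail}"
        }
    }
}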
