Introduction to Logstash

Introduction

Logstash is an open-source, server-side data processing pipeline that can ingest data from multiple sources simultaneously, transform it, and send it to your favorite repository (ours, of course, is Elasticsearch).


Going back to our Elastic Stack architecture diagram, we can see that Logstash fills the data-processing role: data that needs processing is sent to Logstash first, while data that does not can go straight to Elasticsearch.

Usage

Logstash can accept a variety of inputs, such as documents, graphs, and databases, process them, and then send the results to Elasticsearch.


Deployment and installation

Logstash mainly processes data from the data source line by line, and can also filter and split it along the way.


First, go to the official website to download Logstash.

Choose the version we need to download:


Or download it directly with wget:

# Check the JDK environment; JDK 1.8+ is required
java -version
# Download
wget https://artifacts.elastic.co/downloads/logstash/logstash-8.8.1-linux-x86_64.tar.gz
# Unpack the archive
tar -xvf logstash-8.8.1-linux-x86_64.tar.gz
mv logstash-8.8.1 logstash
# First Logstash example -- define standard input and output
bin/logstash -e 'input { stdin { } } output { stdout {} }'

Test

We type hello into the console and immediately see it echoed in the output.

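For example, typing hello produces an event printed by the stdout codec (rubydebug by default). The exact fields vary by Logstash version, but the output looks roughly like this (the timestamp and hostname are illustrative):

{
       "message" => "hello",
    "@timestamp" => 2023-06-16T12:00:00.000Z,
      "@version" => "1",
          "host" => "elk-node1"
}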

Configuration details

A Logstash configuration has three parts, as shown below:

input {
    # input
    stdin { ... }    # standard input
}
filter {
    # filter: split, truncate, and otherwise process the data
    ...
}
output {
    # output
    stdout { ... }   # standard output
}
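For instance (a minimal sketch, not from the original post), all three sections can be supplied inline with -e; the hypothetical mutate filter here just stamps each event with an extra field before printing it:

bin/logstash -e 'input { stdin { } } filter { mutate { add_field => { "source" => "console" } } } output { stdout { codec => rubydebug } }'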

Input

  • Logstash collects data of all shapes, sizes, and sources; data often lives in many systems in many different formats, distributed or centralized.
  • Logstash supports a variety of inputs that can capture events from many common sources at once, letting you easily ingest your logs, metrics, web applications, data stores, and various AWS services in a continuous stream (see the sketch below).
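As a concrete sketch (not from the original post; the path and port are hypothetical), an input section that tails a file and also accepts events from Filebeat could look like this:

input {
    # tail a log file, starting from its beginning on the first run
    file {
        path => "/var/log/nginx/access.log"
        start_position => "beginning"
    }
    # listen for events shipped by Filebeat
    beats {
        port => 5044
    }
}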

Filter

  • Parses and transforms data in real time.
  • As data travels from source to store, Logstash filters parse each event, identify named fields to build structure, and transform them into a common format for easier, faster analysis and greater business value (see the sketch below).
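As a hedged sketch (not from the original post), a filter section for combined-format web access logs might pair grok with date; the COMBINEDAPACHELOG pattern ships with Logstash:

filter {
    # parse the raw line into named fields (clientip, verb, response, ...)
    grok {
        match => { "message" => "%{COMBINEDAPACHELOG}" }
    }
    # use the time recorded in the log line as the event's @timestamp
    date {
        match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
    }
}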

Output

Logstash offers numerous output options for routing data wherever it needs to go, with the flexibility to unlock a wide range of downstream use cases; a sketch follows.
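For example (a sketch; the index name is hypothetical and the host is the one used later in this post), events can be sent to Elasticsearch and echoed to the console at the same time:

output {
    # index events into Elasticsearch, one index per day
    elasticsearch {
        hosts => ["192.168.40.150:9200"]
        index => "app-log-%{+yyyy.MM.dd}"
    }
    # also print each event to the console for debugging
    stdout { codec => rubydebug }
}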

Reading a custom log

Earlier, we read the nginx log with Filebeat. A log with a custom structure, however, needs to be parsed before it can be used, and this is where Logstash comes in: its powerful processing capabilities can handle all kinds of scenarios.

Log structure

2023-06-17 21:21:21|ERROR|读取数据出错|参数:id=1002

As you can see, the fields in the log line are separated by "|", so we will split the data on that character when processing it.

Write the configuration file

vim shengxia-pipeline.conf

Then add the following content

input {
    file {
        path => "/opt/elk/logs/app.log"
        start_position => "beginning"
    }
}
filter {
    mutate {
        split => { "message" => "|" }
    }
}
output {
    stdout { codec => rubydebug }
}

Start up

# Start
./bin/logstash -f ./shengxia-pipeline.conf

Then we insert our test data

echo "2023-06-17 21:21:21|ERROR|读取数据出错|参数:id=1002" >> app.log

Then we can see that Logstash captures the data we just inserted and splits it as expected.

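Concretely, once the mutate split filter has run, the message field becomes an array. Printed with the rubydebug codec, the event looks roughly like this (illustrative; the exact layout varies by version):

{
    "message" => [
        [0] "2023-06-17 21:21:21",
        [1] "ERROR",
        [2] "读取数据出错",
        [3] "参数:id=1002"
    ],
    ...
}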

Output to Elasticsearch

We can modify the configuration file so that the log records are output to Elasticsearch.

input {
    file {
        path => "/opt/elk/logs/app.log"
        start_position => "beginning"
    }
}
filter {
    mutate {
        split => { "message" => "|" }
    }
}
output {
    elasticsearch {
        hosts => ["192.168.40.150:9200","192.168.40.137:9200","192.168.40.138:9200"]
    }
}
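To confirm that the documents arrived, you can also query Elasticsearch directly. A quick check (assuming the first node from the hosts list above; replace INDEX with the index name you find):

# list the indices to spot the one Logstash wrote to
curl "http://192.168.40.150:9200/_cat/indices?v"
# fetch the newest documents from that index
curl "http://192.168.40.150:9200/INDEX/_search?pretty"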

Then restart Logstash:

./bin/logstash -f ./shengxia-pipeline.conf

Then insert two entries into the log:

echo "2023-06-17 21:57:21|ERROR|读取数据出错|参数:id=1002" >> app.log
echo "2023-06-17 21:58:21|ERROR|读取数据出错|参数:id=1003" >> app.log

Finally, you can see the data we just inserted.

