[Teach you through ELK] Logstash input, filter, and output plugins

Yuxian: CSDN content partner, CSDN new-star mentor, 51CTO top celebrity and expert blogger, GitHub open-source enthusiast (secondary development of the go-zero source code, game back-end architecture; https://github.com/Peakchen)

 

Logstash is an open-source tool for collecting, transforming, and shipping data. Its pipeline consists of three stages: input plugins, filter plugins, and output plugins. Input plugins collect data from data sources, filter plugins process, transform, and clean the data, and output plugins send the processed data to target systems.
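
To make the three stages concrete, here is a minimal pipeline sketch that reads lines from the console, applies no filtering, and prints each event back out; it uses only the stdin and stdout plugins and the rubydebug codec that ship with Logstash:

input {
  stdin { }          # read events line by line from the console
}

filter {
  # no filters: events pass through unchanged
}

output {
  stdout {
    codec => rubydebug   # pretty-print each event with all of its fields
  }
}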

Input plugins: Logstash input plugins collect data from data sources. Logstash supports a wide variety of input plugins covering common sources such as files, network protocols, message queues, and databases. Input plugins can also decode, decompress, or otherwise pre-process incoming data (typically via codecs) to preserve its integrity and usability.
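
As an illustration of that variety (the paths, port, broker address, and topic below are placeholders), a single input block can combine several sources and attach a codec to decode the incoming data:

input {
  file {
    path => "/var/log/app/*.log"            # placeholder log path
    start_position => "beginning"
  }
  beats {
    port => 5044                             # listen for Filebeat/Beats agents
  }
  kafka {
    bootstrap_servers => "localhost:9092"    # placeholder broker address
    topics => ["app-logs"]                   # placeholder topic
    codec => json                            # decode each record as JSON
  }
}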

Filter plugins: Logstash filter plugins process, transform, and clean the input data. Logstash ships with a variety of built-in filter plugins, such as grok, mutate, and date, which handle tasks like log parsing, field extraction, data type conversion, and date formatting. Users can also write custom filter plugins to meet specific data processing needs.
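
Below is a sketch of a filter block that chains the built-in plugins mentioned above; the grok pattern and field names are illustrative and not tied to any particular log format:

filter {
  grok {
    # extract structured fields from the raw line
    match => { "message" => "%{TIMESTAMP_ISO8601:timestamp} %{IP:client_ip} %{WORD:method} %{URIPATH:path}" }
  }
  mutate {
    rename    => { "client_ip" => "client" }   # rename an extracted field
    lowercase => [ "method" ]                   # normalize the method name
  }
  date {
    match => [ "timestamp", "ISO8601" ]         # parse the application timestamp into @timestamp
  }
}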

Output plugins: Logstash output plugins send processed data to target systems. Logstash supports a variety of output plugins, including common targets such as Elasticsearch, Redis, Kafka, and MySQL. Output plugins can format, compress, or encrypt the data to ensure reliability and security during delivery.
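
An illustrative output block that fans the same events out to two targets (the cluster address, index name, broker address, and topic are placeholders):

output {
  elasticsearch {
    hosts => ["localhost:9200"]             # placeholder cluster address
    index => "app-logs-%{+YYYY.MM.dd}"      # one index per day
  }
  kafka {
    bootstrap_servers => "localhost:9092"   # placeholder broker address
    topic_id => "processed-logs"            # placeholder topic
    codec => json                           # serialize each event as JSON
  }
}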

Logstash is often used in the following scenarios:

  • Log processing and analysis: Logstash can collect, analyze and filter log data generated by various applications and systems, which can be used for log monitoring, report analysis, troubleshooting, etc.

  • Data collection and ETL: Logstash can collect data from various data sources, transform and clean it, and feed data warehouses, data analysis, and BI tools (a sketch follows this list).

  • Data pipeline and stream processing: Logstash can transfer data from one system to another, supporting real-time streaming and batch processing.
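
As a sketch of the ETL scenario mentioned above, the jdbc input plugin can poll a database on a schedule and index the resulting rows into Elasticsearch; the connection string, credentials, SQL statement, and index name below are placeholders:

input {
  jdbc {
    jdbc_connection_string => "jdbc:mysql://localhost:3306/shop"    # placeholder connection string
    jdbc_user              => "reader"                              # placeholder credentials
    jdbc_driver_class      => "com.mysql.cj.jdbc.Driver"            # requires the JDBC driver on the classpath
    statement              => "SELECT id, name, updated_at FROM users"
    schedule               => "*/5 * * * *"                         # poll every five minutes
  }
}

output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "users"                                                # placeholder index name
  }
}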


Below is a simple Logstash example that demonstrates how to read data from a file, filter it, and output it to Elasticsearch:

  1. Prepare a sample data file named sample.log with the following content:
2023-08-04 12:00:00,123 INFO [com.example.app] - Request received: GET /api/users/123
2023-08-04 12:00:01,234 ERROR [com.example.app] - Internal server error occurred
2023-08-04 12:00:02,345 WARN [com.example.app] - Slow response time: 500ms

This file contains some simple log information, including fields such as timestamp, log level, class name, and log message.

  2. Create a configuration file named logstash.conf with the following content:
input {
  file {
    path => "/path/to/sample.log"
    start_position => "beginning"
  }
}

filter {
  grok {
    match => { "message" => "%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:loglevel} \[%{DATA:class}\] - %{GREEDYDATA:message}" }
    overwrite => [ "message" ]   # keep only the parsed message body in the message field
  }
  date {
    match => [ "timestamp", "yyyy-MM-dd HH:mm:ss,SSS" ]
    target => "@timestamp"
  }
}

output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "logstash-%{+YYYY.MM.dd}"
  }
}

This configuration file specifies a file input plugin, grok and date filter plugins, and an Elasticsearch output plugin. It reads data from the /path/to/sample.log file, uses the grok filter to parse each log line into separate fields, uses the date filter to convert the parsed timestamp into a format Elasticsearch accepts (writing it to @timestamp), and sends the processed events to Elasticsearch.
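
While debugging a pipeline like this one, it is common to temporarily add a stdout output with the rubydebug codec alongside the elasticsearch output, so that each parsed event is also printed to the console:

output {
  # keep the elasticsearch block from the configuration above, and add:
  stdout {
    codec => rubydebug   # print each event with all of its parsed fields
  }
}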

  3. Start Logstash, specifying the configuration file:
bin/logstash -f logstash.conf

This will start Logstash and load the logstash.conf configuration file.

  4. View the output in Elasticsearch:
GET /logstash-2023.08.04/_search
{
  "query": {
    "match_all": {}
  }
}

This query returns all the log documents indexed into Elasticsearch for that day, and you can see that the value of each field has been correctly parsed and stored separately.
