Logstash: How to Create Maintainable and Reusable Logstash Pipelines

Logstash is an open-source data processing pipeline that ingests events from one or more inputs, transforms them, and then sends each event to one or more outputs. Some Logstash implementations may include many lines of code and may process events from multiple input sources. To make such implementations more maintainable, I will show how to improve the code's reusability by building pipelines from modular components.

 

Motivation for writing this article

It is often necessary for Logstash to apply a subset of common logic to events from multiple input sources. This is usually achieved in one of two ways:

  • Processing events from all of the different input sources in a single pipeline, so that the common logic can easily be applied to all events from all sources. In such implementations, there is usually a large amount of conditional logic in addition to the common logic. This approach can therefore result in Logstash implementations that are complicated and hard to understand.

  • Implementing a unique pipeline for each unique input source. This approach requires copying the common functionality into every pipeline, which makes it difficult to maintain the common portions of the code.

The technique described in this blog stores modular pipeline components in different files and then constructs pipelines by combining these components, thereby addressing the drawbacks of both approaches above. This technique can reduce pipeline complexity and eliminate code duplication.

 

Modular pipeline construction

A Logstash pipeline executes input, filter, and output components, which are defined in one or more Logstash configuration files.
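To make this concrete, here is a minimal sketch of a single-file pipeline that combines all three component types. It is only an illustration, not one of the example files used later in this article, and the field name example_field is a placeholder:

input { 
  # generate a single test event
  generator { 
    lines => ["example line"] 
    count => 1 
  } 
}

filter { 
  # add a placeholder field to every event (example_field is illustrative only)
  mutate { 
    add_field => { "example_field" => "example value" } 
  } 
}

output { 
  # print events to the console
  stdout { codec => rubydebug } 
}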
 

In more advanced setups, a single Logstash instance typically executes multiple pipelines. By default, when Logstash is started without arguments, it reads a file called pipelines.yml and instantiates the pipelines specified there.

Logstash inputs, filters, and outputs can be stored in multiple files, and the files to include in a pipeline can be selected by specifying a glob expression. The files that match a glob expression are combined in alphabetical order. Since the execution order of filters is often important, it can be helpful to include numeric identifiers in the file names to ensure that the files are combined in the desired order.
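For example, using the filter files defined later in this article, a wildcard glob would match and combine them in numeric-prefix order (a complete pipeline would of course also need to match input and output files):

path.config: "<path>/*_filter.cfg"
# matches 01_filter.cfg, 02_filter.cfg, 03_filter.cfg and combines them in that order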

In the following, we will define two unique pipelines, each of which is a combination of several modular Logstash components. We will store the Logstash components in the following files:

  • Input declarations: 01_in.cfg, 02_in.cfg
  • Filter declarations: 01_filter.cfg, 02_filter.cfg, 03_filter.cfg
  • Output declarations: 01_out.cfg

Then, in pipelines.yml, we use glob expressions to define custom pipelines composed of the required components, as follows:

- pipeline.id: my-pipeline_1
  path.config: "<path>/{01_in,01_filter,02_filter,01_out}.cfg"
- pipeline.id: my-pipeline_2
  path.config: "<path>/{02_in,02_filter,03_filter,01_out}.cfg"

Notice that both pipelines include the file 02_filter.cfg. This file demonstrates how code that is common to two pipelines can be defined and maintained in a single file, and how that code is then executed by multiple pipelines.

 

Testing the pipelines

In this section, we provide concrete examples of the files that are combined into the pipelines defined in the pipelines.yml above. We then run Logstash with these files and show the output it generates.

Config files

Input file: 01_in.cfg

This file defines an input that uses the generator plugin. The generator input is designed for testing Logstash; in this case, it generates a single event.

input { 
  generator { 
    lines => ["Generated line"] 
    count => 1 
  } 
}

Input file: 02_in.cfg

This file defines a Logstash input that listens on stdin.

input { 
  stdin {} 
}

Filter file: 01_filter.cfg

filter { 
  mutate { 
    add_field => { "filter_name" => "Filter 01" } 
  } 
}

Filter file: 02_filter.cfg

filter { 
  mutate { 
    add_field => { "filter_name" => "Filter 02" } 
  } 
}

Filter file: 03_filter.cfg

filter { 
  mutate { 
    add_field => { "filter_name" => "Filter 03" } 
  } 
}

Output file: 01_out.cfg

output { 
  stdout { codec =>  "rubydebug" } 
}

 

Executing the pipelines

Starting Logstash without any options will execute the pipelines.yml file that we defined earlier. Run Logstash as follows:

./bin/logstash

Because the pipeline my-pipeline_1 executes a generator to simulate an input event, we should see the following output once Logstash has finished initializing. This shows that the contents of 01_filter.cfg and 02_filter.cfg were executed by this pipeline as expected.

{
     "@timestamp" => 2020-02-29T02:44:40.024Z,
           "host" => "liuxg-2.local",
       "sequence" => 0,
        "message" => "Generated line",
       "@version" => "1",
    "filter_name" => [
        [0] "Filter 01",
        [1] "Filter 02"
    ]
}

As the other pipeline, my-pipeline_2, waits for input on stdin, we have not yet seen that pipeline process any events. Type something into the terminal where Logstash is running and press Return to create an event for this pipeline. Once you have done this, you should see something like the following:

hello, the world!
{
        "message" => "hello, the world!",
       "@version" => "1",
     "@timestamp" => 2020-02-29T02:48:26.142Z,
           "host" => "liuxg-2.local",
    "filter_name" => [
        [0] "Filter 02",
        [1] "Filter 03"
    ]
}

From the above we can see that the logic from 02_filter.cfg and 03_filter.cfg is applied as intended.

 

Execution order

Please note that Logstash does not pay attention to the order of the files in a glob expression. It uses the glob expression only to determine which files to include, and then orders them alphabetically. This means that even if we were to change the definition of my-pipeline_2 so that 03_filter.cfg appears before 02_filter.cfg in the glob expression, each event would still pass through the filter defined in 02_filter.cfg before the one defined in 03_filter.cfg.
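For example, the following variant of the my-pipeline_2 definition lists 03_filter before 02_filter in the glob expression, yet it behaves exactly like the original definition:

- pipeline.id: my-pipeline_2
  path.config: "<path>/{02_in,03_filter,02_filter,01_out}.cfg"
  # 02_filter.cfg still runs before 03_filter.cfg: the matched files are
  # combined alphabetically, not in the order listed in the glob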

 

Conclusion

Using glob expressions allows Logstash pipelines to be composed of modular components, which are stored as separate files. This improves code maintainability, reusability, and readability.

Incidentally, in addition to the technique described in this blog, we should also consider pipeline-to-pipeline communication to see whether it can improve the modularity of a Logstash implementation.
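For readers who want to explore that direction, the following is a minimal sketch of pipeline-to-pipeline communication using Logstash's pipeline input and output plugins. It is separate from the examples above, and the virtual address commonAddress is an arbitrary name chosen for illustration:

- pipeline.id: upstream
  # forward every event read from stdin to the downstream pipeline
  config.string: "input { stdin {} } output { pipeline { send_to => ['commonAddress'] } }"
- pipeline.id: downstream
  # receive forwarded events and apply the shared filter logic in one place
  config.string: "input { pipeline { address => 'commonAddress' } } filter { mutate { add_field => { 'filter_name' => 'common filter' } } } output { stdout { codec => rubydebug } }"

With this layout, the shared filter logic lives only in the downstream pipeline, so each upstream pipeline no longer needs to include the common filter files at all.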

Reference:

[1] https://www.elastic.co/blog/how-to-create-maintainable-and-reusable-logstash-pipelines
