What is Logstash and what is it used for? A detailed introduction

Introduction

  • Logstash is an open source data collection engine with real-time pipelining capabilities. It can dynamically unify data from different sources and normalize it into the destinations of your choice, and it provides a large number of plugins that help parse, enrich, transform, and buffer any kind of data.

How it works

  • A pipeline (Logstash pipeline) is an independent unit of execution in Logstash. Each pipeline contains two required elements, an input and an output, and an optional element, a filter; the event processing pipeline coordinates their execution. Inputs and outputs support codecs, which let you encode or decode data as it enters or leaves the pipeline without needing a separate filter. Examples: json, multiline, etc.

Inputs (input stage):

  • Events are generated here. Examples: file, kafka, beats, etc.

Filters (filter stage):

  • Events are modified using a combination of filters and conditionals. Examples: grok, mutate, etc.

Outputs (output stage):

  • Sends the event data to a specific destination; once output processing is complete, the event has finished execution. Examples: elasticsearch, file, etc.

Codecs (encoders/decoders):

  • Codecs are essentially stream filters that operate as part of an input or output, making it easy to separate the transport of messages from the serialization process. A minimal pipeline sketch follows below.
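
To make the three stages and codecs concrete, here is a minimal pipeline configuration sketch. The file path, index name, and hosts are illustrative placeholders, not values from the original article:

```
input {
  file {
    path => "/var/log/app/app.log"   # hypothetical source file
    codec => json                    # codec decodes each line as JSON on the way in
  }
}

filter {
  mutate {
    add_field => { "env" => "production" }   # simple enrichment example
  }
}

output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "app-%{+YYYY.MM.dd}"
  }
  stdout {
    codec => rubydebug               # codec formats events on the way out
  }
}
```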

1. Working principle

Each input stage in a Logstash pipeline runs in its own thread, and inputs write events to a central queue, either in memory or on disk. Each pipeline worker takes a batch of events off the queue, runs the batch through the configured filters, and then runs the filtered events through all outputs. The batch size and the number of worker threads can be configured via pipeline.batch.size and pipeline.workers.

By default, Logstash uses in-memory queues to buffer events between pipeline stages. If Logstash terminates unexpectedly, the events held in memory are lost. To prevent data loss, you can set queue.type: persisted in the Logstash configuration to persist in-flight events to disk.
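
These settings live in logstash.yml. A minimal sketch with illustrative values, not tuning recommendations:

```
# logstash.yml
pipeline.workers: 4        # worker threads that pull batches from the central queue
pipeline.batch.size: 125   # events per batch handed to filters and outputs
queue.type: persisted      # buffer in-flight events on disk instead of in memory
```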

2. Event ordering

By default, Logstash does not guarantee event ordering. Reordering can occur in two places:

  • Events in a batch can be reordered during filter processing

  • Batches can be reordered when one or more batches are processed faster than others

When maintaining the order of events is important, use the pipeline.ordered setting (a sketch follows this list):

  1. If pipeline.ordered: auto is set together with pipeline.workers: 1, ordering is enabled automatically.

  2. Setting pipeline.ordered: true ensures that batches are executed one by one and that events keep their order within each batch.

  3. Setting pipeline.ordered: false disables ordering processing and saves the cost of the ordering work.
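
A logstash.yml sketch of the order-preserving configuration described above:

```
# logstash.yml - preserve event ordering
pipeline.workers: 1        # ordering requires a single worker thread
pipeline.ordered: auto     # ordering is enabled automatically because workers == 1
```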

Logstash modules

  • Logstash modules provide a fast, end-to-end solution for ingesting data and visualizing it with purpose-built dashboards.

  • Each module ships with a prepackaged Logstash configuration, Kibana dashboards, and other metadata files, making it easier to set up the Elastic Stack for a particular use case or data source.

  • To make getting started easier, Logstash modules perform three basic functions; the following steps run when a module is started (a configuration sketch follows this list):

  • Creates an Elasticsearch index

  • Sets up the index patterns, searches, and visualizations needed for the Kibana dashboards to visualize the data

  • Runs the Logstash pipeline with the module's configuration
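
Modules can be configured in logstash.yml. The sketch below uses the netflow module and its UDP port variable purely as an illustration; the module names and variables available depend on your Logstash version:

```
# logstash.yml - module configuration (illustrative)
modules:
  - name: netflow
    var.input.udp.port: 2055                  # port the module's input listens on
    var.elasticsearch.hosts: "localhost:9200"
```

On the first run, starting Logstash with the --setup flag tells the module to create the Elasticsearch index and load the Kibana dashboards before the pipeline starts.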

Data resiliency

  • As data flows through the event processing pipeline, Logstash may encounter conditions that prevent events from being delivered to the output, such as unexpected data types or abnormal termination. To guard against data loss and keep events flowing through the pipeline, Logstash provides two features:

    • Persistent queues

    • Dead letter queues (DLQ)

Persistent queues

  • By default, Logstash uses bounded in-memory queues to buffer events between pipeline stages (inputs → pipeline workers). The size of these in-memory queues is fixed and not configurable. If the machine running Logstash fails temporarily, the data in the in-memory queue is lost.

  • To prevent data loss during abnormal termination, Logstash has a persistent queue feature that stores the message queue on disk, providing data durability. Persistent queues are also useful for Logstash deployments that need a large buffer: instead of deploying and managing a message broker (Kafka, Redis, etc.) to provide a buffered publish-subscribe model, you can enable persistent queues to buffer events on disk and drop the message broker entirely.

  • Use queue.max_bytes to configure the total on-disk capacity of the queue. When the queue is full, Logstash applies back pressure to the inputs to stop the flow of data and stops accepting new events. This mechanism controls the rate of data flow at the input stage without overwhelming the outputs.
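
A logstash.yml sketch for sizing the persistent queue; the capacity and path are illustrative:

```
# logstash.yml - persistent queue sizing (illustrative values)
queue.type: persisted
queue.max_bytes: 4gb                    # total on-disk capacity before back pressure kicks in
path.queue: /var/lib/logstash/queue     # hypothetical directory for the queue's page files
```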

Benefits of persistent queues:

  • Data is not lost when Logstash terminates abnormally or restarts: events are stored on disk until their delivery has succeeded at least once.

  • No need for an external message broker such as Kafka to buffer data: the persistent queue provides a large buffer and absorbs bursts of events.

Problems persistent queues do not solve:

  • Persistent queues do not protect against data loss from permanent machine failures such as disk corruption. Input plugins with acknowledgment capabilities, such as beats and http, are well protected by persistent queues.

  • For input plugins that do not use a request-response protocol (e.g. tcp, udp), persistent queues cannot protect against data loss.

How persistent queues work

  • The queue sits between the input and filter stages: input → queue → filter + output.

  • When an input has an event ready to process, it writes the event to the queue. After a successful write, the input can send an acknowledgment to its data source.

  • When processing events from the queue, Logstash acknowledges (ACKs) an event only once it has been fully processed, meaning it has been handled by all configured filters and outputs; the queue records which events the pipeline has already completed.

  • During a graceful shutdown, Logstash stops reading from the queue and finishes the in-flight events currently being processed by filters and outputs. After a restart, Logstash resumes processing the events in the persistent queue and accepting new events from the inputs.

  • If Logstash terminates abnormally, any in-flight events are not recorded as ACKed and are reprocessed by filters and outputs when Logstash restarts. Because Logstash processes events in batches, an abnormal termination can leave batches that completed successfully but were never recorded as ACKed; those events will be processed again, giving at-least-once rather than exactly-once semantics.

Pages

  • The queue itself is a collection of pages, divided into one head page and a set of tail pages. There is only one head page; when it reaches a specific size (queue.page_capacity) it becomes a tail page and a new head page is created. Tail pages are immutable, while the head page is append-only. Each page is a file. Once every event in a page has been acknowledged, the page is deleted. If even one event in an older page is unacknowledged, the entire page is kept on disk until every event on it has been successfully processed.

Checkpoints

  • When the persistent queue feature is enabled, Logstash commits data to disk through a mechanism called checkpointing. The checkpoint file records details about the queue (page information, acknowledgments, and so on) in a separate file. When checkpointing, Logstash invokes a sync operation on the head page and atomically writes the current state of the queue to disk. Checkpointing is atomic: if it succeeds, all modifications to the file are saved. If Logstash terminates, or if there is a hardware-level failure, any data buffered in the persistent queue that has not yet been checkpointed is lost. You can force Logstash to checkpoint more frequently by setting queue.checkpoint.writes. To ensure maximum durability and avoid losing data, set queue.checkpoint.writes to 1 to force a checkpoint after every event.
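
The durability/throughput trade-off in logstash.yml, sketched with illustrative values:

```
# logstash.yml - checkpoint frequency
queue.checkpoint.writes: 1      # checkpoint after every written event: maximum durability, lower throughput
# queue.checkpoint.writes: 1024 # checkpoint less often: higher throughput, more unACKed events at risk on a crash
```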

Dead letter queues

  • Dead letter queues provide another layer of data resiliency. (The dead letter queue is currently supported only for the Elasticsearch output, and only for documents with response codes 400 or 404, both of which indicate events that cannot be retried.) By default, when Logstash encounters an event it cannot process because of a data error, it hangs the pipeline or drops the failed event. To prevent data loss, you can configure unsuccessful events to be written to a dead letter queue instead of being discarded. Each entry written to the dead letter queue contains the original event, the reason it could not be processed, information about the plugin that wrote it, and the event's timestamp. To process events from the dead letter queue, you create a pipeline configuration that uses the dead_letter_queue input plugin to read from it.
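
The dead letter queue is enabled with dead_letter_queue.enable: true in logstash.yml; its entries can then be reprocessed with a pipeline like the sketch below. The path, pipeline id, and index pattern are illustrative assumptions:

```
input {
  dead_letter_queue {
    path => "/var/lib/logstash/data/dead_letter_queue"   # hypothetical DLQ directory
    pipeline_id => "main"          # read entries written by the "main" pipeline
    commit_offsets => true         # remember how far this reader has progressed
  }
}

filter {
  # fix or strip whatever made the event unprocessable, e.g. with mutate
}

output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "dlq-reprocessed-%{+YYYY.MM.dd}"
  }
}
```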

How the dead letter queue works

  • If an HTTP request fails because Elasticsearch cannot be reached, the Elasticsearch output plugin retries the entire request indefinitely; the dead letter queue does not intercept events in these scenarios.

Deployment and scaling

  • From operational log and metric analytics to enterprise and application search, the Elastic Stack is used for a multitude of use cases. Ensuring scalable, durable, and secure data transport into Elasticsearch is extremely important, especially for mission-critical environments. This article focuses on the most common architectural patterns for Logstash and how to scale efficiently as your deployment grows. The emphasis is on operational logs, metrics, and security analytics use cases, since they tend to require the largest deployments.

Beats to Elasticsearch

  • Filebeat modules let you quickly collect, parse, and index popular log types and come with pre-built Kibana dashboards. In this scenario, Beats sends data directly to Elasticsearch, where ingest nodes process and index it.

Beats and Logstash to Elasticsearch

  • Together, Beats and Logstash provide a comprehensive solution that is both scalable and resilient. Beats runs on thousands of edge hosts, collecting, tailing, and shipping logs to Logstash. Logstash scales horizontally and can form groups of nodes running the same pipeline. Its adaptive buffering keeps delivery smooth even under fluctuating throughput, and if Logstash becomes a bottleneck, you simply add more nodes to scale out. Some recommendations:

Scaling:

  • Beats should load balance across a set of Logstash nodes

  • It is recommended to use at least two Logstash nodes for high availability

  • Typically only one Beats input is deployed per Logstash node, but it is also possible to deploy multiple Beats inputs per Logstash node.

Resiliency:

  • At-least-once delivery is guaranteed when Filebeat or Winlogbeat is used for log collection.

  • Both communication protocols, from Filebeat/Winlogbeat to Logstash and from Logstash to Elasticsearch, are synchronous and support acknowledgments. The other Beats do not yet support acknowledgments.

Processing:

  • Logstash typically uses grok or dissect to extract fields and add geographic information, and it can further enrich events with lookup datasets from files, databases, or Elasticsearch (see the pipeline sketch after this list).

  • Processing complexity affects overall throughput and CPU utilization, so be sure to review the other available filter plugins.
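
A sketch of the Logstash tier in this architecture, assuming a Beats input, grok parsing, and GeoIP enrichment; the port, field names, hosts, and index pattern are illustrative assumptions:

```
input {
  beats {
    port => 5044                     # Filebeat/Winlogbeat ship events to this port
  }
}

filter {
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG}" }   # extract fields from access-log lines
  }
  geoip {
    source => "clientip"             # add geographic data for the client IP (newer versions may also require a target)
  }
}

output {
  elasticsearch {
    hosts => ["https://es-node1:9200", "https://es-node2:9200"]
    index => "weblogs-%{+YYYY.MM.dd}"
  }
}
```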

Integrating with Messaging Queues

  • If your existing infrastructure already includes a message queue, getting that data into the Elastic Stack is easy. If you are using a message queue only to buffer data for Logstash, it is recommended to use Logstash persistent queues instead and eliminate the unnecessary complexity.

Performance tuning

  • This section covers performance troubleshooting as well as tuning and profiling Logstash performance.

JVM

  • The recommended heap size is no less than 4 GB and no more than 8 GB. If the heap is too small, the JVM performs garbage collection constantly, driving up CPU utilization.

  • The heap should not approach the total amount of physical memory; some memory must be left for the OS and other processes. As a general rule, give the heap no more than 50-75% of physical memory.

  • Set the minimum (Xms) and maximum (Xmx) heap sizes to the same value to prevent heap resizing at runtime, which is a very expensive operation (see the jvm.options sketch after this list).
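
A jvm.options sketch reflecting these recommendations; 4 GB is an illustrative size, not a universal recommendation:

```
# jvm.options
# Keep the initial and maximum heap sizes equal to avoid heap resizing at runtime.
-Xms4g
-Xmx4g
```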

Tuning and analyzing Logstash performance

  • Logstash provides the following options for tuning pipeline performance: pipeline.workers, pipeline.batch.size, and pipeline.batch.delay.

pipeline.workers

  • This setting determines how many threads run the filter and output stages. If events are backing up or the CPU is not saturated, consider increasing this value to make better use of the available processing power.

pipeline.batch.size

  • This setting defines the maximum number of events an individual worker thread collects before attempting to execute its filters and outputs. Larger batch sizes are generally more efficient but increase memory overhead; some hardware configurations require increasing the JVM heap space in the jvm.options file to avoid performance degradation. Values outside the optimal range cause performance loss through frequent garbage collection or out-of-memory JVM crashes. Output plugins can process each batch as a logical unit: the Elasticsearch output, for example, issues one bulk request per batch it receives, so tuning pipeline.batch.size also tunes the size of the bulk requests sent to Elasticsearch.

pipeline.batch.delay

  • This setting rarely needs adjustment. It tunes the latency of the Logstash pipeline: the batch delay is the maximum time, in milliseconds, that Logstash waits for new events after receiving an event in the current pipeline worker thread before it starts executing filters and outputs. The maximum time Logstash waits between receiving an event and processing it in a filter is the product of pipeline.batch.delay and pipeline.batch.size.

Pipeline configuration and optimization

  • The total number of in-flight events is the product of pipeline.workers and pipeline.batch.size. Note that a pipeline that intermittently receives large events needs enough memory to handle those spikes. The number of worker threads can be set higher than the number of CPU cores, because outputs often spend time idle in I/O wait.
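
A logstash.yml tuning sketch; the values are an illustrative starting point, not recommendations. With these values, up to 8 × 250 = 2000 events can be in flight at once:

```
# logstash.yml - pipeline tuning (illustrative starting point)
pipeline.workers: 8          # roughly one per CPU core, or more if outputs wait on I/O
pipeline.batch.size: 250     # larger batches are more efficient but need more heap
pipeline.batch.delay: 50     # ms to wait for a batch to fill before flushing to filters
```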

Origin: blog.csdn.net/Andrew_Chenwq/article/details/129633618