Getting started with ES Logstash data synchronization

1 Introduction

Official website address: https://www.elastic.co/cn/logstash

Logstash is a powerful tool that integrates with a wide range of deployments. It provides a large number of plug-ins that help you parse, enrich, transform, and buffer data from various sources. If your data needs additional processing that is not available in Beats, add Logstash to the deployment.

Logstash is an important part of the Elastic Stack, but it is not limited to working with Elasticsearch. It can ingest data from a wide variety of sources, and its filters can parse, enrich, and transform that data.

Finally, it can output the processed data to various storage destinations, including Elasticsearch.

To put it simply, Logstash is a data flow engine:

  • An open source streaming ETL engine for data logistics
  • Builds a data flow pipeline in minutes
  • Horizontally scalable and resilient, with adaptive buffering
  • Data source agnostic
  • A plug-in ecosystem with more than 200 integrations and processors
  • Deployments can be monitored and managed with the Elastic Stack

We can control automatic data synchronization by tracking either the auto-increment primary key (id) or a time field. Logstash uses the tracked value as the marker that identifies what has already been synchronized.

  • id: Suppose there are 1000 rows now. Logstash synchronizes them and records the last id as 1000. As new rows are added to the database, the id keeps increasing. Logstash runs a scheduled task, and when it finds rows with an id greater than 1000, it adds that increment to ES.
  • Time: This works the same way. Each of the initial 1000 rows has a time field. After the first synchronization completes, Logstash records that time and compares against it during the next run; any row with a later time is synchronized. This approach picks up both new rows and modified rows, because a change to the time of an existing row is detected, whereas its id does not change.
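The difference between the two tracking strategies can be illustrated with a small Python simulation. This is only a sketch of the idea, not how Logstash is implemented: the `Row` class, the `updated_at` field, and the `sync_once` function are hypothetical names invented for this example, and plain dictionaries stand in for the ES index.

```python
from dataclasses import dataclass


@dataclass
class Row:
    id: int
    updated_at: int  # hypothetical "time" field, kept as a plain integer
    body: str


def sync_once(rows, index, last_value, key):
    """Copy rows whose tracking column exceeds last_value; return the new marker."""
    changed = [r for r in rows if getattr(r, key) > last_value]
    for r in changed:
        index[r.id] = r.body  # upsert by id, so a modified row replaces its old copy
    return max((getattr(r, key) for r in changed), default=last_value)


# One source table, two targets: one tracked by id, one tracked by time.
table = [Row(1, 100, "a"), Row(2, 101, "b")]
by_id, by_time = {}, {}
last_id = sync_once(table, by_id, 0, "id")
last_time = sync_once(table, by_time, 0, "updated_at")

# Later, a new row is inserted and an existing row is modified.
table.append(Row(3, 102, "c"))
table[0] = Row(1, 103, "a-edited")

# The next scheduled run only looks past the recorded marker.
last_id = sync_once(table, by_id, last_id, "id")
last_time = sync_once(table, by_time, last_time, "updated_at")

print(by_id)    # row 1 still holds the old body
print(by_time)  # row 1 holds the edited body
```

Tracking by id picks up the new row but misses the edit to row 1, because its id is still below the recorded marker. Tracking by time picks up both.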

2 Working principle

Official website address: https://www.elastic.co/guide/en/logstash/current/index.html

Logstash consists of three main parts: inputs, filters, and outputs. To use Logstash you define these stages in its configuration, although not all of them are required; in some cases a pipeline has no filters at all. The filter stage is where data from the source is parsed, enriched, and otherwise processed.
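As a rough mental model of the three stages, the following Python sketch passes events from inputs through an optional chain of filters and on to outputs. It is a conceptual illustration only, not Logstash's configuration syntax or its implementation; `run_pipeline`, `lines_input`, and `parse_filter` are hypothetical names, and the sample log lines are made up.

```python
def run_pipeline(inputs, filters, outputs):
    """Send each event from each input through the filters in order, then to every output."""
    for source in inputs:
        for event in source():
            for apply_filter in filters:
                event = apply_filter(event)
                if event is None:  # a filter may drop an event
                    break
            if event is not None:
                for emit in outputs:
                    emit(event)


def lines_input():
    # Stand-in for a data source: yields raw events.
    yield {"message": "GET /index.html 200"}
    yield {"message": "POST /login 500"}


def parse_filter(event):
    # Parse the raw message and enrich the event with structured fields.
    method, path, status = event["message"].split()
    return {**event, "method": method, "path": path, "status": int(status)}


# With a filter: events reach the output parsed and enriched.
collected = []
run_pipeline([lines_input], [parse_filter], [collected.append])

# With no filters: events pass from input to output unchanged.
passthrough = []
run_pipeline([lines_input], [], [passthrough.append])

print(collected)
print(passthrough)
```

The second call shows the case mentioned above where the filter stage is left out entirely.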

Origin blog.csdn.net/qq_15769939/article/details/114995419