Configuring Logstash to synchronize MySQL data to Elasticsearch (ES)

For Logstash installation, see: https://www.cnblogs.com/toov5/p/10301727.html

Logstash is an open source data collection engine with real-time pipelining capabilities. It can dynamically unify data from different sources and normalize that data into the destination of your choice.

The configuration options in detail:

jdbc_driver_library: path to the MySQL JDBC driver jar, which was downloaded in the previous step
jdbc_driver_class: name of the driver class; for MySQL, use com.mysql.jdbc.Driver
jdbc_connection_string: JDBC address of the MySQL server
jdbc_user: MySQL user
jdbc_password: MySQL password
schedule: when to run the SQL, using crontab-like scheduling syntax
statement: the SQL to execute. Names that begin with ":" are variables, which can be set through parameters. sql_last_value is a built-in variable; here it holds the update_time value from the previous execution of the SQL. The condition on update_time uses >= because several rows may share the same timestamp; without the equals sign, some incremental rows could be missed
use_column_value: use the value of a column as the incrementing value
tracking_column_type: type of the incrementing column; numeric means a numeric type, timestamp means a timestamp type
tracking_column: name of the incrementing column; here it is the update_time column, which is of type timestamp
last_run_metadata_path: file that records the last synchronization point. It is read again on restart and can be edited by hand
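To illustrate why the statement uses >= rather than >, here is a minimal Python sketch. It is not Logstash itself: it uses an in-memory SQLite table as a stand-in for MySQL, and the table contents and timestamps are made up for the example.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the MySQL table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user (id INTEGER PRIMARY KEY, name TEXT, update_time TEXT)"
)
conn.executemany(
    "INSERT INTO user VALUES (?, ?, ?)",
    [
        (1, "alice", "2019-08-15 10:00:00"),
        (2, "bob", "2019-08-15 10:00:05"),
    ],
)

# Pretend the previous run ended at bob's timestamp (the sync point).
sql_last_value = "2019-08-15 10:00:05"

# A third row is written in the same second, after the previous run finished.
conn.execute("INSERT INTO user VALUES (3, 'carol', '2019-08-15 10:00:05')")


def fetch_ids(operator):
    # operator is ">" or ">="; the value is bound as a named parameter,
    # much like :sql_last_value in the Logstash statement.
    sql = (
        "SELECT id FROM user "
        f"WHERE update_time {operator} :sql_last_value ORDER BY id"
    )
    return [row[0] for row in conn.execute(sql, {"sql_last_value": sql_last_value})]


strict_ids = fetch_ids(">")      # carol shares the boundary timestamp and is missed
inclusive_ids = fetch_ids(">=")  # bob is read again, but carol is picked up
print(strict_ids, inclusive_ids)
```

The price of >= is that rows at the boundary timestamp are read more than once, so the destination has to tolerate repeats.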

Note:

Crontab expression reference: https://tool.lu/crontab/   Note: the crontab expression works in units of minutes.
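As a rough aid for reading schedule values, the sketch below labels the parts of a cron expression. It assumes the common five-field crontab layout (minute, hour, day of month, month, day of week); describe_schedule is a hypothetical helper written for this example, not part of Logstash.

```python
# Field names of a standard five-field crontab expression (assumed layout).
CRON_FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")


def describe_schedule(expression):
    """Split a five-field cron expression and label each field."""
    parts = expression.split()
    if len(parts) != len(CRON_FIELDS):
        raise ValueError(
            f"expected {len(CRON_FIELDS)} fields, got {len(parts)}: {expression!r}"
        )
    return dict(zip(CRON_FIELDS, parts))


every_minute = describe_schedule("* * * * *")         # the value used in this article
every_five_minutes = describe_schedule("*/5 * * * *")  # run every fifth minute
print(every_minute)
print(every_five_minutes)
```

The smallest field is the minute, which matches the note that the expression works in units of minutes.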

Start: ./bin/logstash -f mysql.conf

 

How it works:

Logstash -> sends queries to MySQL

Logstash -> sends the query results to ES

 

The first run queries all data (the starting time is 1970). The update_time of the last record returned is saved and becomes the condition value for the next query. Whenever a row is added, modified, or deleted in the database, the time is recorded.

where update_time >= xxxx-xx-xx

The query runs at regular intervals. The table must have an update_time field.
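The cycle described above can be sketched in Python as follows. This is a simplified model under stated assumptions, not how Logstash is implemented: SQLite stands in for MySQL, the sync point is kept as plain text (the real file format may differ), and load_sync_point, save_sync_point and run_once are hypothetical names.

```python
import os
import sqlite3
import tempfile

EPOCH = "1970-01-01 00:00:00"  # starting point used when no sync point exists yet


def load_sync_point(path):
    """Return the saved sync point, or the 1970 epoch on the very first run."""
    if not os.path.exists(path):
        return EPOCH
    with open(path) as f:
        return f.read().strip() or EPOCH


def save_sync_point(path, value):
    with open(path, "w") as f:
        f.write(value)


def run_once(conn, path):
    """One scheduled run: query from the sync point, then advance it."""
    last_value = load_sync_point(path)
    rows = conn.execute(
        "SELECT id, name, update_time FROM user "
        "WHERE update_time >= :sql_last_value ORDER BY update_time",
        {"sql_last_value": last_value},
    ).fetchall()
    if rows:
        # Remember the update_time of the last record for the next query.
        save_sync_point(path, rows[-1][2])
    return rows


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user (id INTEGER PRIMARY KEY, name TEXT, update_time TEXT)"
)
conn.execute("INSERT INTO user VALUES (1, 'alice', '2019-08-15 10:00:00')")

sync_file = os.path.join(tempfile.mkdtemp(), "syncpoint_table")

first_run = run_once(conn, sync_file)   # full load, starting from 1970
conn.execute("INSERT INTO user VALUES (2, 'bob', '2019-08-15 10:01:00')")
second_run = run_once(conn, sync_file)  # alice again (because of >=) plus bob
print(first_run, second_run, load_sync_point(sync_file))
```

Because the sync point lives in a file, a restarted process resumes from the last recorded update_time instead of reloading everything.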

 

Synchronization methods:

  1. By auto-increment primary key

  2. By update_time

 

Comparison:

MySQL data is read with the logstash-input-jdbc plugin. The plugin's principle is fairly simple: it executes a SQL statement on a schedule and writes the result into the stream. Increments are not obtained through the binlog; instead, an incrementing field is used as the query condition, and the current position is recorded after each query. Because the field only ever increases, each run only needs to query for records larger than the recorded position to obtain everything added in the meantime. There are generally two kinds of incrementing field: an AUTO_INCREMENT primary key id, and an update_time field declared with ON UPDATE CURRENT_TIMESTAMP. The id field only suits tables that are inserted into and never updated; update_time is more general, so it is a good idea to add an update_time field to every table when designing a MySQL schema.
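The difference between the two incrementing fields can be shown with a small Python sketch. It assumes SQLite as a stand-in for MySQL; since SQLite has no ON UPDATE CURRENT_TIMESTAMP clause, the UPDATE below sets update_time by hand to mimic it. All row values are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT, update_time TEXT)"
)
conn.execute(
    "INSERT INTO user (name, update_time) VALUES ('alice', '2019-08-15 10:00:00')"
)
conn.execute(
    "INSERT INTO user (name, update_time) VALUES ('bob', '2019-08-15 10:00:01')"
)

# Positions recorded after the first synchronization.
last_id = 2
last_update_time = "2019-08-15 10:00:01"

# An existing row is modified and a new row is inserted afterwards.
# In MySQL, ON UPDATE CURRENT_TIMESTAMP would refresh update_time automatically.
conn.execute(
    "UPDATE user SET name = 'alice2', update_time = '2019-08-15 10:05:00' "
    "WHERE id = 1"
)
conn.execute(
    "INSERT INTO user (name, update_time) VALUES ('carol', '2019-08-15 10:06:00')"
)

# Tracking by id only sees rows inserted after the recorded id.
by_id = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM user WHERE id > ? ORDER BY id", (last_id,)
    )
]

# Tracking by update_time also sees the modified row.
by_update_time = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM user WHERE update_time >= ? ORDER BY update_time",
        (last_update_time,),
    )
]
print(by_id, by_update_time)
```

The id-based query returns only the new row, so the change to the existing row never reaches the destination; the update_time query returns both.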

 

To sum up, the full configuration:

cd /home/elasticsearch/logstash-6.4.3/config

input {
  jdbc {
    jdbc_driver_library => "/home/mysql5.7/mysqlDriver/mysql-connector-java-8.0.13.jar"
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    # Version 8.0 and above: be sure to add serverTimezone=UTC
    jdbc_connection_string => "jdbc:mysql://192.168.124.8:3306/test?characterEncoding=utf8&useSSL=false&serverTimezone=UTC&rewriteBatchedStatements=true"
    jdbc_user => "root"
    jdbc_password => "root"
    schedule => "* * * * *"
    statement => "SELECT * FROM user WHERE update_time >= :sql_last_value"
    use_column_value => true
    tracking_column_type => "timestamp"
    tracking_column => "update_time"
    last_run_metadata_path => "syncpoint_table"
  }
}

output {
  elasticsearch {
    # IP address and port of ES
    hosts => ["192.168.91.66:9200"]
    # Index name; can be customized
    index => "user"
    # The table needs an id field, which is used as the document ID
    document_id => "%{id}"
    document_type => "user"
  }
  stdout {
    # Output in JSON lines format
    codec => json_lines
  }
}
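One consequence of mapping document_id to the row's id is that rows read more than once overwrite the same document instead of piling up. The Python sketch below models this with a plain dictionary as a stand-in for the ES index; index_rows and the "auto-N" keys are hypothetical and only mimic the idea of an auto-generated ID.

```python
def index_rows(index, rows, use_document_id):
    """Write rows into a dict that stands in for an ES index."""
    for row in rows:
        if use_document_id:
            # Same id -> same key, so a repeated row replaces the earlier copy.
            key = str(row["id"])
        else:
            # Stand-in for an auto-generated document ID: always a new key.
            key = f"auto-{len(index)}"
        index[key] = row
    return index


# Two consecutive runs; bob sits on the boundary timestamp and is read twice
# because the statement uses >=.
first_batch = [{"id": 1, "name": "alice"}, {"id": 2, "name": "bob"}]
second_batch = [{"id": 2, "name": "bob"}, {"id": 3, "name": "carol"}]

with_id = {}
index_rows(with_id, first_batch, use_document_id=True)
index_rows(with_id, second_batch, use_document_id=True)

without_id = {}
index_rows(without_id, first_batch, use_document_id=False)
index_rows(without_id, second_batch, use_document_id=False)

print(len(with_id), len(without_id))
```

With the id as the key there are three documents for three rows; without it, the repeated row shows up as a fourth document.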

 

Put the configuration file in the config directory. It is named mysql.conf here, but any name will do.

Start: ./bin/logstash -f ./config/mysql.conf

Note: I am using a recent 8.x version of the MySQL driver, so the database URL in the configuration must include serverTimezone=UTC!

Meanwhile, in the MySQL database:

grant all privileges on *.* to 'root'@'%' identified by 'root' with grant option;

FLUSH PRIVILEGES;

 

Startup is quite slow; watch the log while it runs:

 

Kibana:

 


Origin www.cnblogs.com/toov5/p/11355596.html