Using Logstash to synchronize MySQL data to Elasticsearch

Introduction to ELK

ELK is an abbreviation for three open source projects: Elasticsearch, Logstash, and Kibana. A fourth component, Filebeat, has since been added to the stack: a lightweight log collection and shipping agent. Filebeat uses few resources, which makes it well suited to collecting logs on individual servers and forwarding them to Logstash; it is also the officially recommended tool for this purpose.

Elasticsearch is an open source distributed search engine that collects, analyzes, and stores data. Its features include distributed operation, zero configuration, automatic discovery, automatic index sharding, an index replica mechanism, a RESTful interface, support for multiple data sources, and automatic search load balancing.

Logstash is mainly a tool for collecting, parsing, and filtering logs, and it supports a large number of data acquisition methods. It typically works in a client/server architecture: a client is installed on each host whose logs need to be collected, while the server filters and transforms the logs received from each node and forwards them to Elasticsearch.

Kibana is also an open source, free tool. It provides a friendly web interface for log analysis on top of Logstash and Elasticsearch, helping you summarize, analyze, and search important log data.

Using Logstash

For installing Elasticsearch (with the head plugin) and installing Kibana, you can refer to the two earlier blog posts on those topics.

Now let's walk through how to use Logstash to synchronize the data in a MySQL table to Elasticsearch.
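For illustration, assume the user table being synced looks something like this (a hypothetical schema; the only thing the config below really depends on is an auto-increment id column, which serves as the tracking column):

-- Hypothetical example table; adjust to your own schema
CREATE TABLE user (
  id   INT UNSIGNED NOT NULL AUTO_INCREMENT,  -- auto-increment key, used as tracking_column
  name VARCHAR(50),
  age  INT,
  PRIMARY KEY (id)
);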

1. Download and install Logstash

2. Unzip it to a directory of your choice, then create a mysql folder alongside the bin directory to hold the MySQL JDBC driver JAR. Download the MySQL driver (Connector/J) if you do not already have it.
3. Create a logstash.conf file in the bin folder with the following configuration:

input {
  # To sync several tables, just add one jdbc block per table
  jdbc {
      # MySQL connection string; "test" is the database name
      jdbc_connection_string => "jdbc:mysql://localhost:3306/test?useUnicode=true&characterEncoding=utf8&serverTimezone=UTC"
      # Username and password
      jdbc_user => "xxxx"
      jdbc_password => "xxxx"

      # JDBC driver JAR
      # Change this to your own installation path
      jdbc_driver_library => "D:/software/logstash-7.3.2/mysql/mysql-connector-java-8.0.22.jar"

      # Driver class name
      jdbc_driver_class => "com.mysql.cj.jdbc.Driver"
      jdbc_validate_connection => "true"

      # Whether to page through query results
      #jdbc_paging_enabled => "true"
      #jdbc_page_size => "1000"
      # Time zone
      jdbc_default_timezone => "Asia/Shanghai"

      # SQL statement to execute directly
      statement => "select * from user"
      # Alternatively, the path + name of a SQL file to execute
      # statement_filepath => "/hw/elasticsearch/logstash-6.2.4/bin/test.sql"

      # Polling schedule; the fields are (left to right) minute, hour, day of month,
      # month, day of week. All asterisks means run every minute.
      #schedule => "* * * * *"
      # Run every 10 minutes
      #schedule => "*/10 * * * *"
      # Whether to record the last run: if true, the last value of the tracking_column
      # field is saved to last_run_metadata_path
      record_last_run => true
      # File that records the latest sync offset
      #last_run_metadata_path => "D:/es/logstash-7.3.2/logs/last_id.txt"

      use_column_value => true
      # Type of the tracking column: "numeric" for numbers, "timestamp" for timestamps
      tracking_column_type => "numeric"
      tracking_column => "id"
      clean_run => false
      # Index type
      #type => "jdbc"
  }
}

output {
  elasticsearch {
      # Elasticsearch host and port
      hosts => ["http://localhost:9200"]
      # Elasticsearch index name (your own choice)
      index => "test"
      # Document type
      document_type => "_doc"
      # Use the database id column as the document id
      document_id => "%{id}"
  }
  stdout {
      codec => json_lines
  }
}
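Note that with statement => "select * from user", every scheduled run re-reads the entire table, even though record_last_run, use_column_value, and tracking_column are set. For a true incremental sync, the statement normally references Logstash's built-in :sql_last_value parameter, which holds the last recorded value of the tracking column. A minimal sketch, assuming the auto-increment id column above:

statement => "select * from user where id > :sql_last_value order by id asc"

On each run, Logstash substitutes the saved offset for :sql_last_value, so only rows added since the previous run are fetched (updates and deletes to existing rows are not picked up this way).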

4. Start Logstash
Go to Logstash's bin directory and run the following in a command-line window:

logstash -f logstash.conf
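If startup fails, the cause is usually a syntax error in the config file. You can ask Logstash to validate the file without actually starting the pipeline first:

logstash -f logstash.conf --config.test_and_exit

If the file parses cleanly, Logstash reports that the configuration is valid and exits.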

Note: the Logstash installation path must not contain Chinese characters, or startup will fail with an error.

Once it is running, you can view the data in the index through Elasticsearch's head plugin.
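If you don't have the head plugin handy, you can also query the index directly through Elasticsearch's REST API (assuming the default port 9200 and the index name test from the config above):

curl "http://localhost:9200/test/_search?pretty"

The response includes a total hit count, which should match the number of rows synced so far.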
