Syncing MySQL data to Elasticsearch in real time with logstash-input-jdbc

1. Versions

Linux version: CentOS release 6.5 (Final)

JDK version: java version "1.8.0_102"

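Name                  Version           Notes
Elasticsearch         6.1.2             Latest version at the time of writing
logstash              5.5.1             Don't pick too new a logstash version; a matching plugin version is hard to find
logstash-input-jdbc   4.2.2             Plugin that syncs mysql to ES
ruby                  9.1.13.0 (jruby)  The plugin is written in ruby and needs the ruby package; installing with rvm is recommended, since it lets you choose the version
Elasticsearch-head    master            Built on node.js; needs the npm, node and grunt packages
mysql-connector-java  5.1.44            mysql driver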

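Note: rvm is a version manager for ruby. It can be installed with yum, and a specific ruby version is then installed through rvm. ruby can also be installed directly with yum, but the version is then hard to control.

[root@test1 ~]# rvm list known   -- list the ruby versions rvm knows about; one of them is installed with rvm install below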
[ruby-]1.8.6[-p420]
[ruby-]1.8.7[-head] # security released on head
[ruby-]1.9.1[-p431]
[ruby-]1.9.2[-p330]
[ruby-]1.9.3[-p551]
[ruby-]2.0.0[-p648]
[ruby-]2.1[.10]
[ruby-]2.2[.7]
[ruby-]2.3[.4]
[ruby-]2.4[.1]
ruby-head

# for forks use: rvm install ruby-head-<name> --url https://github.com/github/ruby.git --branch 2.2

# JRuby
jruby-1.6[.8]
jruby-1.7[.27]
jruby[-9.1.13.0]
jruby-head
[root@test1 ~]# rvm install 2.0.0   -- install ruby 2.0.0 with rvm


2. Installing ES

ES installation guide: http://blog.csdn.net/jjshouji/article/details/78450847

ES Head installation: http://blog.csdn.net/jjshouji/article/details/78449769


3. Installing logstash
Installing logstash is easy: unpack the archive and it is ready to use, with no configuration needed.

Run a test command:

[elk@test1 logstash-5.5.1]$ bin/logstash  -e 'input { stdin { } } output { stdout {} }'
ERROR StatusLogger No log4j2 configuration file found. Using default configuration: logging only errors to the console.
Sending Logstash's logs to /home/elk/logstash-5.5.1/logs which is now configured via log4j2.properties
[2018-01-26T11:08:28,330][INFO ][logstash.pipeline        ] Starting pipeline {"id"=>"main", "pipeline.workers"=>6, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>5, "pipeline.max_inflight"=>750}
[2018-01-26T11:08:28,434][INFO ][logstash.pipeline        ] Pipeline main started
The stdin plugin is now waiting for input:
[2018-01-26T11:08:28,760][INFO ][logstash.agent           ] Successfully started Logstash API endpoint {:port=>9601}
 hello hxw -- typed in by hand
2018-01-26T03:08:48.654Z test1 hello hxw -- logstash output

Note: input { stdin { } } output { stdout {} } uses standard input as the logstash input and writes the result to standard output.


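4. Installing logstash-input-jdbc

A quick word on how ruby, rvm, gem and bundle relate.

rvm: installs and manages ruby versions

ruby: a scripting language created for simple, fast object-oriented programming

gem: a gem is a packaged Ruby application or library

bundle: you declare the third-party packages your application depends on in a Gemfile, and bundle downloads and installs them for you, including the packages they depend on

gem and bundle both install ruby software: gem installs packages by hand, while bundle installs the packages listed in the Gemfile.

Before installing the plugin, switch the gem and bundle sources: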

[root@test1 logs]#  gem sources --add https://gems.ruby-china.org/ --remove https://rubygems.org/
[root@test1 logs]#  gem sources -l
*** CURRENT SOURCES ***
https://gems.ruby-china.org/
[elk@test1 ~]$ cd logstash-5.5.1
[elk@test1 logstash-5.5.1]$ bin/logstash-plugin install logstash-input-jdbc
Validating logstash-input-jdbc
Installing logstash-input-jdbc
Installation successful    -- installed successfully
Note: the plugin can also be installed from a local file:
bin/logstash-plugin install file:///path/to/logstash-input-jdbc-6.1.2.zip

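5. Example
The query joins many tables, so under high business concurrency it becomes very slow and occasionally hangs. For now ES serves as the final query store, with mysql data synced to ES in real time.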
[elk@test1 logstash-5.5.1]$ ps -ef |grep sample
root     26241     1  2 Jan25 ?        00:33:13 /usr/java/jdk1.8.0_102//bin/java -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+DisableExplicitGC -Djava.awt.headless=true -Dfile.encoding=UTF-8 -XX:+HeapDumpOnOutOfMemoryError -Djava.security.egd=file:/dev/urandom -Xmx1g -Xms256m -Xss2048k -Djffi.boot.library.path=/home/elk/logstash-5.5.1/vendor/jruby/lib/jni -Xbootclasspath/a:/home/elk/logstash-5.5.1/vendor/jruby/lib/jruby.jar -classpath :.:/usr/java/jdk1.8.0_102//lib/dt.jar:/usr/java/jdk1.8.0_102//lib/tools.jar -Djruby.home=/home/elk/logstash-5.5.1/vendor/jruby -Djruby.lib=/home/elk/logstash-5.5.1/vendor/jruby/lib -Djruby.script=jruby -Djruby.shell=/bin/sh org.jruby.Main /home/elk/logstash-5.5.1/lib/bootstrap/environment.rb logstash/runner.rb -f /home/elk/logstash-5.5.1/data_config/sample.conf
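The command that was run:
The job is simply put in the background. With the schedule configured here it syncs once per minute, and the interval can be changed. To keep it running in the background, start it with nohup:
[root@test1 ~]# nohup /home/elk/logstash-5.5.1/bin/logstash -f /home/elk/logstash-5.5.1/data_config/sample.conf  >>/dev/null &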
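Configuration in detail:
[elk@test1 data_config]$ cat  sample.conf
input {
  jdbc {
    # Database connection string
    jdbc_connection_string => "jdbc:mysql://IP:PORT/DATABASE"
    # Connection user name and password
    jdbc_user => "******"
    jdbc_password => "********"
    # Path to the JDBC driver jar
    jdbc_driver_library => "/home/elk/mysql-connector-java-5.1.44-bin.jar"
    # Driver class
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    jdbc_paging_enabled => "true"
    jdbc_page_size => "100000"
    # The SQL can live in its own file, which is easier to manage
    statement_filepath => "/home/elk/logstash-5.5.1/data_config/jdbc.sql"
    # Schedule in cron format (minute hour day-of-month month day-of-week), same as the Linux cron
    schedule => "* * * * *"
    # With several inputs feeding different types, the type field below can tell them apart,
    # and the output section can then branch on it (if [type] == "jdbc" { ... })
    # type => "jdbc"    # not needed here
    # Incremental sync needs a tracking column
    use_column_value => true
    # Use the lowercase alias: column names are lowercased by default, so even if the
    # SQL alias has mixed case, only the lowercase form is recognized here
    tracking_column => "updatetime"
    # Each run continues from the last tracked updatetime; this is where that value is stored.
    # In special cases it can be edited by hand before syncing again
    last_run_metadata_path => "/home/elk/logstash-5.5.1/data_config/last_run_value.log"
  }
}

#### If filtering is needed, add a filter block here to drop, modify or truncate fields.

output {
  elasticsearch {
    hosts => ["10.0.7.71:9200"]
    index => "jw"    ## index the documents go into
    document_type => "order"     ## document type
    ## Use orderno from the SQL as the document id; a later row with the same orderno
    ## replaces the existing document, which is how updates happen. Mind the case: lowercase only
    document_id => "%{orderno}"
  }
  stdout { codec => rubydebug }
}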
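
How tracking_column, :sql_last_value and document_id work together can be illustrated with a small Python simulation. This is only a sketch of the idea, not the plugin's code: the in-memory rows, the index dict and the function name sync_once are made up for illustration, and taking the maximum updatetime is a simplification of how the plugin records the tracked value.

```python
from datetime import datetime

# Hypothetical stand-ins: a list acting as the mysql table, a dict acting as the ES index.
rows = [
    {"orderno": "A001", "status": "1", "updatetime": datetime(2018, 1, 26, 11, 0, 0)},
    {"orderno": "A002", "status": "2", "updatetime": datetime(2018, 1, 26, 11, 0, 30)},
]
index = {}  # document_id -> document


def sync_once(rows, index, last_value):
    """One scheduled run: fetch rows newer than last_value and upsert them by orderno."""
    changed = [r for r in rows if r["updatetime"] > last_value]
    for r in changed:
        index[r["orderno"]] = dict(r)  # same id overwrites the existing document
    # Remember the newest tracking value seen (simplified; stands in for the
    # value persisted at last_run_metadata_path).
    if changed:
        last_value = max(r["updatetime"] for r in changed)
    return last_value


last_value = datetime(1970, 1, 1)
last_value = sync_once(rows, index, last_value)  # first run picks up both orders

# An order is paid later; only this row is newer than the stored value.
rows[0].update(status="2", updatetime=datetime(2018, 1, 26, 11, 1, 10))
last_value = sync_once(rows, index, last_value)

print(len(index), index["A001"]["status"], last_value)
```

The second run touches only the changed order, and because the document id is the order number, the index still holds two documents rather than three.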


SQL file:
SELECT ol.order_no AS orderNo, ol.sno AS sno, ol.student_name AS sname, date_add(ol.create_time,INTERVAL 8 hour) AS createTime
, DATE_FORMAT(ol.create_time, '%Y-%c-%d %H:%i:%s') AS createTimeStr, date_add(ol.pay_time,INTERVAL 8 hour) AS payTime
, DATE_FORMAT( IFNULL((
SELECT oph.pay_time
FROM order_payment_history oph
WHERE oph.order_no = ol.order_no
ORDER BY id ASC
LIMIT 1
), ol.pay_time), '%Y-%c-%d %H:%i:%s') AS payTimeStr
, ol.receipt_no AS receiptNo, op.real_pay AS realPay, ol.status AS status
, CASE ol.STATUS
WHEN '1' THEN '未支付'
WHEN '2' THEN '支付成功'
WHEN '3' THEN '取消支付'
WHEN '4' THEN '支付失效' ELSE '' END AS payStatus,
IFNULL(i.status,1) AS invoiceStatus, op.credit_card_no AS creditCardNo,
op.credit_card_money AS creditCardMoney, op.transfer_money AS transferMoney, op.transfer_receipts AS transferReceipts, op.cash AS cash, op.cash_note AS cashNote
, op.cheque AS cheque, op.cheque_receipts AS chequeReceipts, op.payment_type AS paymentType
, if(ow.webpay_money IS NULL, 0.00, ow.webpay_money) AS webpayMoney
, ow.webpay_receipts AS webpayReceipts,
date_add(ol.update_time,INTERVAL 8 hour) as updatetime
FROM order_list  ol
LEFT JOIN order_payment op ON ol.order_no = op.order_no
LEFT JOIN invoice i ON ol.order_no = i.order_no
LEFT JOIN student_tel st ON ol.sno = st.sno
LEFT JOIN order_webpay ow ON ow.order_no = ol.order_no
where ol.update_time > date_sub(:sql_last_value,INTERVAL 8 hour)
-- ORDER BY ol.create_time desc  (sorting is no longer needed)


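Note: watch out for the time zone difference between mysql time fields and the same fields after import into ES. They usually differ by 8 hours, because logstash converts timestamps to UTC and the server in this example runs at UTC+8. The values can be adjusted in the SQL with the date_add and date_sub functions.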
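
The 8-hour gap can be reproduced outside mysql. The Python sketch below assumes the server runs at UTC+8, which is consistent with the log lines in section 3, where 11:08 local time is printed as 03:08Z. The helper names are made up for illustration and only mirror the arithmetic that date_add and date_sub perform in the SQL above.

```python
from datetime import datetime, timedelta, timezone

OFFSET = timedelta(hours=8)  # assumption: server time zone is UTC+8
LOCAL = timezone(OFFSET)


def date_add_8h(value):
    """Same arithmetic as date_add(col, INTERVAL 8 hour)."""
    return value + OFFSET


def date_sub_8h(value):
    """Same arithmetic as date_sub(col, INTERVAL 8 hour)."""
    return value - OFFSET


# The instant logstash printed in UTC (03:08:48Z), read off the local clock.
utc_instant = datetime(2018, 1, 26, 3, 8, 48, tzinfo=timezone.utc)
local_clock = utc_instant.astimezone(LOCAL).replace(tzinfo=None)
print(local_clock)

# Shifting forward for the output columns and back for the
# :sql_last_value comparison cancels out.
stored = datetime(2018, 1, 26, 11, 8, 48)
print(date_sub_8h(date_add_8h(stored)) == stored)
```

This is why the WHERE clause applies date_sub to :sql_last_value: the tracked updatetime was shifted forward by 8 hours on the way out, so it has to be shifted back before it is compared with ol.update_time.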

Log in to head to check that the data is being refreshed.




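Appendix
bin/logstash-plugin list  -- list installed plugins
bin/logstash-plugin remove [name]  -- remove an installed plugin; name is the plugin name
bin/logstash-plugin list --verbose  -- list installed plugins with version information
More on logstash plugin management: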
https://www.elastic.co/guide/en/logstash/current/working-with-plugins.html
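The ES index may show an unhealthy color (yellow) because fewer replica shards are allocated than the index is configured for.
In ES 6.x the defaults are number_of_replicas: 1 and number_of_shards: 5, so a single-node cluster has nowhere to place the replicas. number_of_replicas can be changed dynamically through the index _settings API (do this with care; data safety is not considered here).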
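
As a rough sketch of such a dynamic change, the Python snippet below builds a PUT request against the index _settings endpoint. The host and index name are taken from the sample.conf above. Setting the replica count to 0 is an assumption that only suits a single-node test cluster, and the function name is made up for illustration. The request is built but deliberately not sent.

```python
import json
import urllib.request


def replica_settings_request(host, index, replicas):
    """Build (but do not send) a PUT <index>/_settings request."""
    body = json.dumps({"index": {"number_of_replicas": replicas}}).encode("utf-8")
    return urllib.request.Request(
        url=f"http://{host}/{index}/_settings",
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Assumed values: host and index from sample.conf, 0 replicas for a single node.
req = replica_settings_request("10.0.7.71:9200", "jw", 0)
print(req.get_method(), req.full_url, req.data.decode("utf-8"))
```

To apply the change against a live cluster, pass the request to urllib.request.urlopen(req). With 0 replicas the index has no redundant copy, so this trades the yellow status for the loss of failover.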

Recommended reading:
Elasticsearch: The Definitive Guide (Chinese translation):   https://es.xiaoleilu.com/


Reposted from blog.csdn.net/jjshouji/article/details/79169072