Using canal to incrementally synchronize MySQL data to Elasticsearch

This article describes how to use canal to incrementally synchronize MySQL data to Elasticsearch. (Note: incremental only!)

1 Introduction

1.1 Introduction to canal

Canal is a high-performance data synchronization system based on the MySQL binary log (binlog). Canal is widely used within Alibaba (including https://www.taobao.com ) to provide reliable, low-latency incremental data pipelines. GitHub address: https://github.com/alibaba/canal

Canal Server can parse the MySQL binlog and subscribe to data changes, and Canal Client can broadcast those changes anywhere, for example to databases or Apache Kafka.

It has the following features:

  1. Supports all platforms.
  2. Supports fine-grained monitoring backed by Prometheus.
  3. Supports parsing and subscribing to the MySQL binlog in different ways, for example by GTID.
  4. Supports high-performance, real-time data synchronization. (See Performance.)
  5. Canal Server and Canal Client support HA/scalability, backed by Apache ZooKeeper.
  6. Supports Docker.

Disadvantages:

It does not support full synchronization; only incremental updates are supported.

The full wiki is at: https://github.com/alibaba/canal/wiki

1.2 How it works

The principle is simple:

  1. Canal emulates the interaction protocol of a MySQL slave, disguises itself as a MySQL slave, and sends a dump request to the MySQL master.
  2. The MySQL master receives the dump request and starts pushing the binary log to the slave (i.e. canal).
  3. Canal parses the binary log objects (raw byte streams) into its own data types.
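The three steps above can be sketched conceptually in Python. This is only an illustration of the data flow, not the real replication protocol: the function names and event layout here are invented for the example, and the actual binlog is a binary format, not JSON:

```python
import json

# Steps 1/2: a stand-in for the MySQL master. After receiving a dump
# request it pushes binlog events as raw byte streams (here: UTF-8
# encoded JSON; the real binlog is binary).
def master_dump(binlog_events):
    for event in binlog_events:
        yield json.dumps(event).encode("utf-8")

# Step 3: a stand-in for canal, parsing each raw byte stream back into a
# typed row-change tuple that a client could broadcast to ES, Kafka, etc.
def canal_parse(raw_stream):
    changes = []
    for raw in raw_stream:
        event = json.loads(raw.decode("utf-8"))
        changes.append((event["type"], event["table"], event["data"]))
    return changes

events = [{"type": "INSERT", "table": "test",
           "data": {"id": 7, "name": "北京", "address": "北京市朝阳区"}}]
print(canal_parse(master_dump(events)))
```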

As the diagram shows:

[image: canal working principle]

1.3 Synchronizing to ES

Synchronizing data to ES requires an adapter: canal-adapter. The latest version is 1.1.3; download: https://github.com/alibaba/canal/releases .

It seems that only ES 6.x is currently supported; 7.x is not yet supported!

2. Preparation

2.1 ES and JDK

ES installation reference: https://www.dalaoyang.cn/article/78

JDK installation reference: https://www.dalaoyang.cn/article/16

2.2 Installing canal-server

Download canal.deployer-1.1.3.tar.gz:

wget https://github.com/alibaba/canal/releases/download/canal-1.1.3/canal.deployer-1.1.3.tar.gz

Unzip the file:

tar -zxvf canal.deployer-1.1.3.tar.gz

Enter the unzipped folder:

cd canal.deployer-1.1.3

Modify the conf/example/instance.properties file. The main properties to pay attention to are:

  • canal.instance.master.address: database address, e.g. 127.0.0.1:3306
  • canal.instance.dbUsername: database user
  • canal.instance.dbPassword: database password

The complete file reads as follows:

#################################################
## mysql serverId , v1.0.26+ will autoGen
# canal.instance.mysql.slaveId=0

# enable gtid use true/false
canal.instance.gtidon=false

# position info
canal.instance.master.address=127.0.0.1:3306
canal.instance.master.journal.name=
canal.instance.master.position=
canal.instance.master.timestamp=
canal.instance.master.gtid=

# rds oss binlog
canal.instance.rds.accesskey=
canal.instance.rds.secretkey=
canal.instance.rds.instanceId=

# table meta tsdb info
canal.instance.tsdb.enable=true
#canal.instance.tsdb.url=
#canal.instance.tsdb.dbUsername=
#canal.instance.tsdb.dbPassword=

#canal.instance.standby.address =
#canal.instance.standby.journal.name =
#canal.instance.standby.position =
#canal.instance.standby.timestamp =
#canal.instance.standby.gtid=

# username/password
canal.instance.dbUsername=root
canal.instance.dbPassword=12345678
canal.instance.connectionCharset = UTF-8
# enable druid Decrypt database password
canal.instance.enableDruid=false
#canal.instance.pwdPublicKey=MFwwDQYJKoZIhvcNAQEBBQADSwAwSAJBALK4BUxdDltRRE5/zXpVEVPUgunvscYFtEip3pmLlhrWpacX7y7GCMo2/JM6LeHmiiNdH1FWgGCpUfircSwlWKUCAwEAAQ==

# table regex
canal.instance.filter.regex=.*\\..*
# table black regex
canal.instance.filter.black.regex=

# mq config
#canal.mq.topic=example
# dynamic topic route by schema or table regex
#canal.mq.dynamicTopic=mytest1.user,mytest2\\..*,.*\\..*
#canal.mq.partition=0
# hash partition config
#canal.mq.partitionsNum=3
#canal.mq.partitionHash=test.table:id^name,.*\\..*
#################################################

Go back to the canal.deployer-1.1.3 directory and start canal:

sh bin/startup.sh

View the log:

vi logs/canal/canal.log

View the log of a specific instance:

vi logs/example/example.log

To stop canal:

sh bin/stop.sh

2.3 Installing canal-adapter

Download canal.adapter-1.1.3.tar.gz:

wget https://github.com/alibaba/canal/releases/download/canal-1.1.3/canal.adapter-1.1.3.tar.gz

Unzip the file:

tar -zxvf canal.adapter-1.1.3.tar.gz

Enter the unzipped folder:

cd canal.adapter-1.1.3

Modify the conf/application.yml file. The main items to pay attention to are listed below; since it is a YAML file, note that I refer to them here by property path:

  • server.port: canal-adapter port
  • canal.conf.canalServerHost: canal-server address and port
  • canal.conf.srcDataSources.defaultDS.url: database URL
  • canal.conf.srcDataSources.defaultDS.username: database username
  • canal.conf.srcDataSources.defaultDS.password: database password
  • canal.conf.canalAdapters.groups.outerAdapters.hosts: ES host address (TCP port)

The complete file reads as follows:

server:
  port: 8081
spring:
  jackson:
    date-format: yyyy-MM-dd HH:mm:ss
    time-zone: GMT+8
    default-property-inclusion: non_null


canal.conf:
  mode: tcp
  canalServerHost: 127.0.0.1:11111
  batchSize: 500
  syncBatchSize: 1000
  retries: 0
  timeout:
  accessKey:
  secretKey:
  srcDataSources:
    defaultDS:
      url: jdbc:mysql://127.0.0.1:3306/test?useUnicode=true
      username: root
      password: 12345678
  canalAdapters:
  - instance: example
    groups:
    - groupId: g1
      outerAdapters:
      - name: es
        hosts: 127.0.0.1:9300
        properties:
         cluster.name: elasticsearch

You also need to configure the conf/es/*.yml files; the adapter automatically loads every configuration file ending in .yml under conf/es. Before introducing that configuration, let me describe the table structure used in this example:

CREATE TABLE `test` (
  `id` int(11) NOT NULL,
  `name` varchar(200) NOT NULL,
  `address` varchar(1000) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

The ES index needs to be created manually. Here it was created using es-head, as shown below:

[image: creating the index in es-head]

The structure of the index test is as follows:

{
    "mappings":{
        "_doc":{
            "properties":{
                "name":{
                    "type":"text"
                },
                "address":{
                    "type":"text"
                }
            }
        }
    }
}

Next, create test.yml (the file name is arbitrary). The content is easy to understand: _index is the index name and sql is the corresponding query statement:

dataSourceKey: defaultDS
destination: example
groupId:
esMapping:
  _index: test
  _type: _doc
  _id: _id
  upsert: true
  sql: "select a.id as _id,a.name,a.address from test a"
  commitBatch: 3000
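Conceptually, for each changed row the adapter evaluates the sql mapping, uses the column aliased as _id as the ES document id, and upserts the remaining columns as the document body into _index. A minimal Python sketch of that transformation (the function name and dict layout are mine, not the adapter's API):

```python
# Sketch of what the esMapping in test.yml expresses: the column aliased
# as _id becomes the ES document id, the remaining selected columns become
# the document body, and upsert: true means insert-or-update.
def map_row_to_es(row):
    doc = {"name": row["name"], "address": row["address"]}
    return {"_index": "test", "_type": "_doc", "_id": row["id"], "doc": doc}

row = {"id": 7, "name": "北京", "address": "北京市朝阳区"}
print(map_row_to_es(row))
```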

Once configured, go back to the canal-adapter root directory and start it:

bin/startup.sh

View the log:

vi logs/adapter/adapter.log

To stop canal-adapter:

bin/stop.sh

3. Test

After everything has started successfully, first look at es-head; as shown, there is no data yet.

[image: es-head with no data]

Next, insert a test record into the database:

INSERT INTO `test`.`test`(`id`, `name`, `address`) VALUES (7, '北京', '北京市朝阳区');

Then look at es-head again:

[image: es-head showing the new document]

Next, look at the adapter log:

2019-06-22 17:54:15.385 [pool-2-thread-1] DEBUG c.a.otter.canal.client.adapter.es.service.ESSyncService - DML: {"data":[{"id":7,"name":"北京","address":"北京市朝阳区"}],"database":"test","destination":"example","es":1561197255000,"groupId":null,"isDdl":false,"old":null,"pkNames":["id"],"sql":"","table":"test","ts":1561197255384,"type":"INSERT"} 
Affected indexes: test 
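The JSON payload in that DEBUG line is canal's standard message format. A few lines of Python can pull out the interesting fields when inspecting logs by hand (a quick helper for reading the log, not part of canal):

```python
import json

# The JSON payload from the adapter's DEBUG line above.
log_payload = '{"data":[{"id":7,"name":"北京","address":"北京市朝阳区"}],"database":"test","destination":"example","es":1561197255000,"groupId":null,"isDdl":false,"old":null,"pkNames":["id"],"sql":"","table":"test","ts":1561197255384,"type":"INSERT"}'

msg = json.loads(log_payload)
# type is the DML kind, pkNames the primary-key columns, and data the
# after-image of each changed row.
print(msg["type"], msg["table"], msg["pkNames"], msg["data"][0]["id"])
```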

Tip: the log-viewing method described above may not be very convenient. I recommend the following command instead, for example to follow the last 200 lines of the log:

tail -200f logs/adapter/adapter.log

4. Summary

1. Full synchronization cannot be achieved, but incremental inserts, updates, and deletes can.
2. Be sure to create the index in advance.
3. The ES port to configure is the TCP port, e.g. the default 9300.

Origin: www.cnblogs.com/dalaoyang/p/11069850.html