The MySQL protocol and how canal implements it

Foreword

In the previous article, we learned that canal can sense data changes in MySQL. It does so by mimicking the MySQL slave interaction protocol: it disguises itself as a MySQL slave and thereby takes part in master-slave replication.

Having learned this, two questions kept lingering in my mind:

  • How does it simulate the MySQL slave interaction protocol?
  • How does it parse the binlog?

Today, with these two questions in mind, I will poke through the canal code and find out.

I. MySQL master-slave replication

Before turning to canal, let's review how MySQL master-slave replication works.

The figure summarizes the process as follows:

  • The MySQL master writes data changes to its binary log (the records in it are called binary log events);
  • The MySQL slave copies the master's binary log events to its relay log;
  • The MySQL slave replays the events in the relay log, applying the data changes to its own database.

II. canal principles

The figure gives a vivid picture of canal's role. Its principle is simple:

  • canal simulates the MySQL slave interaction protocol, disguises itself as a MySQL slave, and sends the dump protocol to the MySQL master;
  • the MySQL master receives the dump request and starts pushing the binary log to the slave (i.e. canal);
  • canal parses the binary log objects (originally a byte stream);
  • canal distributes the parsed objects according to the business scenario, for example to MySQL, RocketMQ, or ES.

III. Starting from the source

Having covered MySQL master-slave replication and canal's principles, I forked the source on GitHub and imported it locally to make debugging easier.

The class com.alibaba.otter.canal.deployer.CanalLauncher is the entry class that starts the standalone version of canal.

Running its main method directly starts canal, with the same effect as /canal/bin/startup.sh.

In fact, much of canal's code goes into its architecture and design, which is divided into many modules, such as the event parser, event consumer, in-memory storage, server instance, metadata, and high availability.

This article does not intend to describe each of them in detail; that would take a whole series of articles on canal. It focuses on the two questions raised at the beginning.

IV. How does canal simulate a slave?

As already mentioned, CanalLauncher is the entry class that starts canal.

After the main method runs, canal does a lot of preparatory work: for example, loading the configuration file, initializing the message queue, starting canal Admin, loading the Spring configuration, and registering hooks.

canal's simulation of the slave protocol starts in the EventParser module.

In canal's code, the whole process simplifies to the following:

// Start replication
// 1. Build the Erosa connection
ErosaConnection erosaConnection = buildErosaConnection();
// 2. Start a heartbeat thread
startHeartBeat(erosaConnection);
// 3. Prepare before the dump
preDump(erosaConnection);
erosaConnection.connect(); // connect
// Query the master's serverId
long queryServerId = erosaConnection.queryServerId();
if (queryServerId != 0) {
    serverId = queryServerId;
}
// 4. Find the last binlog position
EntryPosition position = findStartPosition(erosaConnection);
final EntryPosition startPosition = position;
// Load table metadata
processTableMeta(startPosition);
// Reconnect: finding the position may leave state on the connection, so disconnect and rebuild it
erosaConnection.reconnect();
// 5. Start dumping data
erosaConnection.dump(startPosition.getJournalName(), startPosition.getPosition(), sinkHandler);

1. Handshake and authentication

Before it begins, canal must first establish a connection to the MySQL server and complete client authentication.

In MySQL, the connection process follows this protocol:

In the code, let's look at its connect method:

Here, the negotiate method is the concrete implementation of the handshake and client authentication. Following the MySQL protocol specification, it reads and writes network data over the Socket channel created above.
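To make "reading data according to the protocol specification" more concrete, here is a minimal sketch of decoding the start of the server's initial handshake packet. It is not canal's negotiate code: HandshakeSketch and samplePacket are hypothetical names, and the sample bytes are made up. It relies only on the MySQL packet framing (a 3-byte little-endian payload length plus a 1-byte sequence id) and the first fields of the version 10 handshake; the remaining fields (auth data, capability flags, and so on) are ignored.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper, not a canal class.
final class HandshakeSketch {
    final int payloadLength;    // 3-byte little-endian length from the packet header
    final int sequenceId;       // 1-byte sequence id from the packet header
    final int protocolVersion;  // first payload byte, 10 for the current handshake format
    final String serverVersion; // NUL-terminated string
    final long connectionId;    // 4-byte little-endian connection (thread) id

    private HandshakeSketch(int payloadLength, int sequenceId, int protocolVersion,
                            String serverVersion, long connectionId) {
        this.payloadLength = payloadLength;
        this.sequenceId = sequenceId;
        this.protocolVersion = protocolVersion;
        this.serverVersion = serverVersion;
        this.connectionId = connectionId;
    }

    // Decodes the packet header and the first three handshake fields.
    static HandshakeSketch parse(byte[] packet) {
        ByteBuffer buf = ByteBuffer.wrap(packet).order(ByteOrder.LITTLE_ENDIAN);
        int payloadLength = (buf.get() & 0xFF) | ((buf.get() & 0xFF) << 8) | ((buf.get() & 0xFF) << 16);
        int sequenceId = buf.get() & 0xFF;
        int protocolVersion = buf.get() & 0xFF;
        StringBuilder version = new StringBuilder();
        for (byte b = buf.get(); b != 0; b = buf.get()) {
            version.append((char) b); // server version is plain ASCII
        }
        long connectionId = buf.getInt() & 0xFFFFFFFFL;
        return new HandshakeSketch(payloadLength, sequenceId, protocolVersion,
                version.toString(), connectionId);
    }

    // Builds a made-up, truncated handshake packet for the demo below.
    static byte[] samplePacket(String version, int connectionId) {
        byte[] v = version.getBytes(java.nio.charset.StandardCharsets.US_ASCII);
        int payloadLength = 1 + v.length + 1 + 4;
        ByteBuffer buf = ByteBuffer.allocate(4 + payloadLength).order(ByteOrder.LITTLE_ENDIAN);
        buf.put((byte) payloadLength).put((byte) (payloadLength >>> 8)).put((byte) (payloadLength >>> 16));
        buf.put((byte) 0);          // sequence id of the first packet
        buf.put((byte) 10);         // protocol version
        buf.put(v).put((byte) 0);   // server version + NUL terminator
        buf.putInt(connectionId);
        return buf.array();
    }

    public static void main(String[] args) {
        HandshakeSketch h = parse(samplePacket("5.7.29-log", 139));
        System.out.println("protocol=" + h.protocolVersion + " server=" + h.serverVersion
                + " connectionId=" + h.connectionId);
    }
}
```

A real client would go on to read the auth data from the same packet and answer with a handshake response; that part is left out here.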

2. Preparation before the dump

After connecting to MySQL successfully and before sending the dump command, canal still has to initialize some configuration information.

The approach is to execute SQL statements through a MySQL query executor and read the information from the results.

I won't paste the code, but the statements it executes are as follows:

show variables like 'binlog_format'      # get the binlog format
show variables like 'binlog_row_image'   # get the binlog row image format
show variables like 'server_id'          # get the master's serverId
show master status                       # get the binlog file name and position

3. Registering the slave

Next, canal calls erosaConnection.dump(binlogfilename, binlogPosition, func) to register as a slave and send the dump command.

Before using COM_BINLOG_DUMP to request binlog events, the slave must register with the master server; the command for that is COM_REGISTER_SLAVE.

After registering, it sends the dump request, whose command is COM_BINLOG_DUMP.
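As an illustration of what a dump request looks like on the wire, here is a minimal sketch that builds a COM_BINLOG_DUMP payload and wraps it in a MySQL packet. It is not canal's implementation: BinlogDumpSketch, its method names, and the file name, position, and server id in main are hypothetical. The layout follows the MySQL protocol: command byte 0x12, 4-byte binlog position, 2-byte flags, 4-byte slave server id, then the binlog file name running to the end of the packet, all little-endian.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not a canal class.
final class BinlogDumpSketch {
    static final byte COM_BINLOG_DUMP = 0x12;

    // Builds the command payload (without the 4-byte packet header).
    static byte[] binlogDumpPayload(String binlogFileName, long binlogPosition, long slaveServerId) {
        byte[] name = binlogFileName.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buf = ByteBuffer.allocate(1 + 4 + 2 + 4 + name.length).order(ByteOrder.LITTLE_ENDIAN);
        buf.put(COM_BINLOG_DUMP);
        buf.putInt((int) binlogPosition);  // position to start streaming from
        buf.putShort((short) 0);           // flags: 0 keeps the stream open and blocking
        buf.putInt((int) slaveServerId);   // id this client registered with
        buf.put(name);                     // file name, no terminator
        return buf.array();
    }

    // Prepends the packet header: 3-byte little-endian length + 1-byte sequence id.
    static byte[] frame(byte[] payload, int sequenceId) {
        ByteBuffer buf = ByteBuffer.allocate(4 + payload.length);
        buf.put((byte) payload.length).put((byte) (payload.length >>> 8)).put((byte) (payload.length >>> 16));
        buf.put((byte) sequenceId);
        buf.put(payload);
        return buf.array();
    }

    public static void main(String[] args) {
        // A new command starts at sequence id 0.
        byte[] packet = frame(binlogDumpPayload("mysql-bin.000001", 4L, 1234L), 0);
        System.out.println("packet length = " + packet.length);
    }
}
```

COM_REGISTER_SLAVE (command byte 0x15) is framed the same way, with a payload carrying the slave's server id, host, user, password, and port.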

After this code runs, we can use show processlist; to see the state of this dump thread.

id    user    host              db     command       time   state
139   canal   localhost:62901   null   Binlog Dump   3      Master has sent all binlog to slave; waiting for more updates

V. How is binlog data parsed?

In the previous section we saw that the MySQL master has accepted canal as a slave. So once canal receives binlog content, how does it parse it?

First, recall that when configuring the MySQL server we set binlog-format to ROW mode, i.e. row-based replication.

Each data change in the binlog can be called an event. In ROW mode, there are several major event types:

Event               SQL command   Rows content
TABLE_MAP_EVENT     none          Definition of the table to be changed
WRITE_ROWS_EVENT    insert        The row data to insert
DELETE_ROWS_EVENT   delete        The row data deleted
UPDATE_ROWS_EVENT   update        The original data + the changed data

Every data change triggers two events: the first tells you which table is about to change, and the second tells you the row content that changed.

For example, TABLE_MAP_EVENT + WRITE_ROWS_EVENT.
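To show how such events can be told apart in the byte stream, here is a minimal sketch that decodes the common binlog event header. It is not canal's parser: EventHeaderSketch is a hypothetical name and the sample values are made up. It assumes the standard 19-byte little-endian header (timestamp, type code, server id, event size, next position, flags) and the usual type codes: 19 for TABLE_MAP_EVENT, and 23/24/25 (v1) or 30/31/32 (v2) for the write/update/delete row events.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper, not a canal class.
final class EventHeaderSketch {
    static final int HEADER_LENGTH = 19;

    final long timestamp;    // seconds since the epoch
    final int typeCode;      // event type, see typeName below
    final long serverId;     // server that produced the event
    final long eventSize;    // header + body, in bytes
    final long nextPosition; // binlog position of the next event
    final int flags;

    private EventHeaderSketch(long timestamp, int typeCode, long serverId,
                              long eventSize, long nextPosition, int flags) {
        this.timestamp = timestamp;
        this.typeCode = typeCode;
        this.serverId = serverId;
        this.eventSize = eventSize;
        this.nextPosition = nextPosition;
        this.flags = flags;
    }

    // Decodes the first 19 bytes of a raw binlog event.
    static EventHeaderSketch parse(byte[] event) {
        ByteBuffer buf = ByteBuffer.wrap(event, 0, HEADER_LENGTH).order(ByteOrder.LITTLE_ENDIAN);
        long timestamp = buf.getInt() & 0xFFFFFFFFL;
        int typeCode = buf.get() & 0xFF;
        long serverId = buf.getInt() & 0xFFFFFFFFL;
        long eventSize = buf.getInt() & 0xFFFFFFFFL;
        long nextPosition = buf.getInt() & 0xFFFFFFFFL;
        int flags = buf.getShort() & 0xFFFF;
        return new EventHeaderSketch(timestamp, typeCode, serverId, eventSize, nextPosition, flags);
    }

    // Maps the ROW-mode type codes discussed in this article to their names.
    static String typeName(int typeCode) {
        switch (typeCode) {
            case 19: return "TABLE_MAP_EVENT";
            case 23: case 30: return "WRITE_ROWS_EVENT";
            case 24: case 31: return "UPDATE_ROWS_EVENT";
            case 25: case 32: return "DELETE_ROWS_EVENT";
            default: return "OTHER(" + typeCode + ")";
        }
    }

    public static void main(String[] args) {
        ByteBuffer sample = ByteBuffer.allocate(HEADER_LENGTH).order(ByteOrder.LITTLE_ENDIAN);
        sample.putInt(1582800000).put((byte) 19).putInt(1).putInt(52).putInt(400).putShort((short) 0);
        EventHeaderSketch header = parse(sample.array());
        System.out.println(typeName(header.typeCode) + " size=" + header.eventSize);
    }
}
```

A parser built this way would read the header first, then hand the remaining eventSize - 19 bytes to a decoder chosen by the type code.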

After receiving binlog data, canal does not immediately parse it into the JSON we are familiar with; it does so only when sending.

For example, if we choose RocketMQ, canal converts the byte arrays from the binlog into objects just before sending.

// Build concurrently
EntryRowData[] datas = MQMessageUtils.buildMessageData(message, executor);
// Partition serially
List<FlatMessage> flatMessages = MQMessageUtils.messageConverter(datas, message.getId());

These two methods complete the conversion from byte array to object. The resulting FlatMessage object is the data structure we consume from the message queue.

public class FlatMessage implements Serializable {
    private long                      id;
    private String                    database;
    private String                    table;
    private List<String>              pkNames;
    private Boolean                   isDdl;
    private String                    type;
    // binlog executeTime
    private Long                      es;
    // dml build timeStamp
    private Long                      ts;
    private String                    sql;
    private Map<String, Integer>      sqlType;
    private Map<String, String>       mysqlType;
    private List<Map<String, String>> data;
    private List<Map<String, String>> old;
}
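To give a feel for how a consumer might read this structure, here is a minimal sketch. It is not canal code: FlatMessageSketch and Msg are hypothetical, Msg is a trimmed stand-in holding only the fields used here, and the sample values are made up. It also assumes, as the field names suggest, that data holds the rows after the change and old holds the previous values of the changed columns, index-aligned with data.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical consumer-side helper, not a canal class.
final class FlatMessageSketch {
    // Trimmed stand-in for FlatMessage: only the fields this sketch reads.
    static final class Msg {
        String database;
        String table;
        String type;
        List<Map<String, String>> data;
        List<Map<String, String>> old;
    }

    // One line per changed column when a row has an "old" entry, otherwise one line per row.
    static List<String> describe(Msg msg) {
        List<String> lines = new ArrayList<>();
        if (msg.data == null) {
            return lines; // no row data to describe
        }
        String target = msg.type + " " + msg.database + "." + msg.table;
        for (int i = 0; i < msg.data.size(); i++) {
            Map<String, String> after = msg.data.get(i);
            Map<String, String> before =
                    (msg.old != null && i < msg.old.size()) ? msg.old.get(i) : null;
            if (before == null) {
                lines.add(target + " " + after);
                continue;
            }
            for (Map.Entry<String, String> e : before.entrySet()) {
                lines.add(target + "." + e.getKey() + ": " + e.getValue()
                        + " -> " + after.get(e.getKey()));
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        Map<String, String> after = new LinkedHashMap<>();
        after.put("id", "1");
        after.put("name", "bob");
        Map<String, String> before = new LinkedHashMap<>();
        before.put("name", "alice");

        Msg msg = new Msg();
        msg.database = "shop";
        msg.table = "user";
        msg.type = "UPDATE";
        msg.data = Collections.singletonList(after);
        msg.old = Collections.singletonList(before);

        describe(msg).forEach(System.out::println);
    }
}
```

Because every column value arrives as a String, a real consumer would likely use the sqlType or mysqlType maps to convert values back to typed columns before writing them elsewhere.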

Summary

As I said at the beginning of this article, when I first learned about canal's mechanism, it felt really strange.

Hey, how does it simulate a MySQL slave? I always felt there must be some black magic inside...

In fact, that was down to the author's ignorance of MySQL.

MySQL has long since defined its various interface protocols; how to connect, authenticate, register, and dump is all written out in black and white.

As the saying goes: the flowers are already in bloom, just waiting for you to arrive ~


Origin juejin.im/post/5e57ac4cf265da57663fd721