[canal series] Canal cluster exception: Could not find first log file name in binary log index file

First, a note: the canal version used throughout this article is 1.1.5.

Before diving into the problem, we need a basic understanding of the canal architecture.

canal working principle

  • canal emulates the MySQL slave interaction protocol: it disguises itself as a MySQL slave and sends a dump request to the MySQL master.
  • The MySQL master receives the dump request and starts pushing binary logs to the slave (i.e., canal).
  • canal parses the binary log objects (originally a raw byte stream).
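The dump request above can also be issued by hand: mysqlbinlog speaks the same replication protocol when given --read-from-remote-server, registering as a fake slave just like canal does. A minimal sketch, where the host, user, and binlog file name are placeholders:

```shell
# Build the same kind of dump request canal sends; mysqlbinlog also
# registers as a fake replica when --read-from-remote-server is set.
# Host, user, and binlog file name are placeholders for your environment.
CMD="mysqlbinlog --read-from-remote-server --host=192.168.6.168 --user=canal --password --raw mysql-bin.000192"
echo "$CMD"   # run the printed command against the master to stream the raw binlog bytes
```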

Several components of the canal environment

canal-server (canal-deploy): subscribes to MySQL's binlog directly by disguising itself as a MySQL slave; it only receives data and does not process it.

canal-adapter: acts as canal's client; it fetches data from canal-server and synchronizes it to stores such as MySQL, Elasticsearch, and HBase.

canal-admin: provides overall configuration management, node operation and maintenance, and other ops-oriented functions for canal, along with a relatively friendly WebUI so users can operate it quickly and safely.

canal cluster building architecture

I found a few useful articles on this:

Canal Admin High Availability Cluster Usage Tutorial - Tencent Cloud Developer Community

Build a canal cluster environment

These cluster-setup articles give a general picture of the cluster environment, the components it requires, and what each component does.

Could not find first log file name in binary log index file

2023-09-07 16:15:57.322 [destination = example , address = /192.168.6.168:3306 , EventParser] WARN  c.a.o.c.p.inbound.mysql.rds.RdsBinlogEventParserProxy - ---> find start position successfully, EntryPosition[included=false,journalName=mysql-bin.000192,position=270118817,serverId=101,gtid=,timestamp=1662998460000] cost : 394ms , the next step is binlog dump
2023-09-07 16:15:57.334 [destination = example , address = /192.168.6.168:3306 , EventParser] ERROR c.a.o.canal.parse.inbound.mysql.dbsync.DirectLogFetcher - I/O error while reading from client socket
java.io.IOException: Received error packet: errno = 1236, sqlstate = HY000 errmsg = Could not find first log file name in binary log index file
	at com.alibaba.otter.canal.parse.inbound.mysql.dbsync.DirectLogFetcher.fetch(DirectLogFetcher.java:102) ~[canal.parse-1.1.5.jar:na]
	at com.alibaba.otter.canal.parse.inbound.mysql.MysqlConnection.dump(MysqlConnection.java:238) [canal.parse-1.1.5.jar:na]
	at com.alibaba.otter.canal.parse.inbound.AbstractEventParser$1.run(AbstractEventParser.java:262) [canal.parse-1.1.5.jar:na]
	at java.lang.Thread.run(Thread.java:748) [na:1.8.0_181]
2023-09-07 16:15:57.335 [destination = example , address = /192.168.6.168:3306 , EventParser] ERROR c.a.o.c.p.inbound.mysql.rds.RdsBinlogEventParserProxy - dump address /192.168.6.168:3306 has an error, retrying. caused by 
java.io.IOException: Received error packet: errno = 1236, sqlstate = HY000 errmsg = Could not find first log file name in binary log index file
	at com.alibaba.otter.canal.parse.inbound.mysql.dbsync.DirectLogFetcher.fetch(DirectLogFetcher.java:102) ~[canal.parse-1.1.5.jar:na]
	at com.alibaba.otter.canal.parse.inbound.mysql.MysqlConnection.dump(MysqlConnection.java:238) ~[canal.parse-1.1.5.jar:na]
	at com.alibaba.otter.canal.parse.inbound.AbstractEventParser$1.run(AbstractEventParser.java:262) ~[canal.parse-1.1.5.jar:na]
	at java.lang.Thread.run(Thread.java:748) [na:1.8.0_181]
2023-09-07 16:15:57.336 [destination = example , address = /192.168.6.168:3306 , EventParser] ERROR com.alibaba.otter.canal.common.alarm.LogAlarmHandler - destination:example[java.io.IOException: Received error packet: errno = 1236, sqlstate = HY000 errmsg = Could not find first log file name in binary log index file
	at com.alibaba.otter.canal.parse.inbound.mysql.dbsync.DirectLogFetcher.fetch(DirectLogFetcher.java:102)
	at com.alibaba.otter.canal.parse.inbound.mysql.MysqlConnection.dump(MysqlConnection.java:238)
	at com.alibaba.otter.canal.parse.inbound.AbstractEventParser$1.run(AbstractEventParser.java:262)
	at java.lang.Thread.run(Thread.java:748)

In a cluster environment, the canal server records the binlog position of the last successful consumption in ZooKeeper. Error 1236 here means that the stored position points at a binlog file (mysql-bin.000192) that the master no longer has, usually because it was purged, so canal can never resume from it. The fix is to delete the stale node: connect a ZooKeeper client to the cluster and remove the data under the following path.

/otter/canal/destinations/{instance name}/1001/cursor
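A minimal sketch of the deletion, assuming zkCli.sh from the ZooKeeper distribution and an instance named example (substitute your own instance name; the ZooKeeper address is a placeholder):

```shell
# Build the cursor node path for the instance; "example" is the instance
# name from the logs above, substitute your own.
INSTANCE=example
CURSOR="/otter/canal/destinations/${INSTANCE}/1001/cursor"
echo "$CURSOR"

# Against the cluster's ZooKeeper (address is a placeholder):
#   zkCli.sh -server 127.0.0.1:2181 get    "$CURSOR"    # inspect the stored position first
#   zkCli.sh -server 127.0.0.1:2181 delete "$CURSOR"    # remove the stale cursor
```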

Then restart canal and verify whether table synchronization has recovered.

After deleting the node, I checked the instance in canal-admin again; the log now showed a different error.

com.alibaba.otter.canal.parse.exception.CanalParseException: column size is not match for table

This error blocks the parsing thread: canal stops receiving and parsing binlog events.

Caused by: com.alibaba.otter.canal.parse.exception.CanalParseException: column size is not match for table:mx_oms.om_logistics_task_header,138 vs 132
2023-09-07 17:38:46.312 [destination = example , address = /192.168.6.168:3306 , EventParser] ERROR com.alibaba.otter.canal.common.alarm.LogAlarmHandler - destination:example[com.alibaba.otter.canal.parse.exception.CanalParseException: com.alibaba.otter.canal.parse.exception.CanalParseException: parse row data failed.
Caused by: com.alibaba.otter.canal.parse.exception.CanalParseException: parse row data failed.
Caused by: com.alibaba.otter.canal.parse.exception.CanalParseException: column size is not match for table:mx_oms.om_logistics_task_header,138 vs 132

My first reaction was that this must be related to the table metadata, and indeed canal has a place where table metadata is configured.

Log in to the canal-admin interface and you can see that canal uses H2, an open-source lightweight embedded database, to store table-structure data.

This was introduced to solve the table-structure consistency problem that existed in earlier canal versions.
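That cached copy is exactly where the "138 vs 132" in the log comes from: the binlog row event carries 138 columns while the cached definition still has 132, typically because the table gained columns after the cache was written. A quick check of the live column count, where the schema and table names come from the log and the connection details are placeholders:

```shell
# Column count of the live table, to compare with the two numbers in the
# error. Schema and table names come from the error message above.
SQL="SELECT COUNT(*) FROM information_schema.columns WHERE table_schema='mx_oms' AND table_name='om_logistics_task_header';"
echo "$SQL"
# mysql -h 192.168.6.168 -u canal -p -N -e "$SQL"   # host/credentials are placeholders
```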

A side note here: when building a canal cluster, tsdb is what keeps DDL table-structure changes consistent, and in a cluster it must be backed by MySQL rather than the local H2 files. The cluster I was investigating turned out to have this configuration disabled.

For example, use a tsdb configuration like the following:

canal.instance.tsdb.enable=true
canal.instance.tsdb.url=jdbc:mysql://192.168.6.168:3306/canal_manager
canal.instance.tsdb.dbUsername=xxxxx
canal.instance.tsdb.dbPassword=xxxxx
canal.instance.tsdb.spring.xml=classpath:spring/tsdb/mysql-tsdb.xml

Enter the canal-server installation directory. For docker containers, go to /home/admin/canal-server/conf/example.

Delete the files whose names start with h2., then restart canal-server.
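A sketch of that cleanup, assuming the docker path mentioned above (adjust CONF_DIR for a bare-metal install; the restart script path is the standard canal-server layout):

```shell
# Remove the cached H2 table-structure files; canal-server rebuilds them
# from the live schema on the next start.
CONF_DIR=${CONF_DIR:-/home/admin/canal-server/conf/example}
ls "${CONF_DIR}"/h2.* 2>/dev/null    # see what is cached (h2.mv.db etc.)
rm -f "${CONF_DIR}"/h2.*
# /home/admin/canal-server/bin/restart.sh    # then restart canal-server
```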

I tested table synchronization again and, sure enough, the data finally synchronized successfully.

Origin blog.csdn.net/run_boy_2022/article/details/132740697