How the master and slave verify GTIDs on reconnect in MySQL GTID replication

  • Environment: MySQL 5.7.18, multi-threaded replication
  • First, run show master status on the master to check its Executed_Gtid_Set
root@localhost : (none) 01:37:02> show master status;
+------------------+----------+--------------+------------------+--------------------------------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set                          |
+------------------+----------+--------------+------------------+--------------------------------------------+
| mysql-bin.000028 |     4313 |              |                  | 1a324bb7-4a61-11e7-811f-fa163e85255f:1-138 |
+------------------+----------+--------------+------------------+--------------------------------------------+
1 row in set (0.00 sec)
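  • Run show slave status on the slave to check its Retrieved_Gtid_Set and Executed_Gtid_Set
    [Screenshot: output of show slave status]
  • Run show master status on the slave
    [Screenshot: output of show master status]
  • Before running set global gtid_purged='', @@GLOBAL.GTID_EXECUTED must be emptied first, which means running reset master first
    [Screenshot]

    As shown, the Executed_Gtid_Set in show slave status, the Executed_Gtid_Set in show master status, and the value of select @@GLOBAL.GTID_EXECUTED are all the same.

  • Run reset master on the slave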

root@localhost : (none) 01:42:49> reset master;
Query OK, 0 rows affected (0.09 sec)
  • Run show slave status to check Retrieved_Gtid_Set and Executed_Gtid_Set
                  Master_UUID: 1a324bb7-4a61-11e7-811f-fa163e85255f
             Master_Info_File: mysql.slave_master_info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
           Master_Retry_Count: 86400
                  Master_Bind: 
      Last_IO_Error_Timestamp: 
     Last_SQL_Error_Timestamp: 
               Master_SSL_Crl: 
           Master_SSL_Crlpath: 
           Retrieved_Gtid_Set: 1a324bb7-4a61-11e7-811f-fa163e85255f:125-138
            Executed_Gtid_Set: 
                Auto_Position: 1
         Replicate_Rewrite_DB: 
                 Channel_Name: 
           Master_TLS_Version: 
1 row in set (0.00 sec)

After reset master, the Executed_Gtid_Set in show slave status is cleared.

  • Then run set global gtid_purged
root@localhost : (none) 01:43:28> set global gtid_purged='1a324bb7-4a61-11e7-811f-fa163e85255f:1-100';
Query OK, 0 rows affected (0.02 sec)
  • Check Retrieved_Gtid_Set and Executed_Gtid_Set in show slave status again
                  Master_UUID: 1a324bb7-4a61-11e7-811f-fa163e85255f
             Master_Info_File: mysql.slave_master_info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
           Master_Retry_Count: 86400
                  Master_Bind: 
      Last_IO_Error_Timestamp: 
     Last_SQL_Error_Timestamp: 
               Master_SSL_Crl: 
           Master_SSL_Crlpath: 
           Retrieved_Gtid_Set: 1a324bb7-4a61-11e7-811f-fa163e85255f:125-138
            Executed_Gtid_Set: 1a324bb7-4a61-11e7-811f-fa163e85255f:1-100
                Auto_Position: 1
         Replicate_Rewrite_DB: 
                 Channel_Name: 
           Master_TLS_Version: 
1 row in set (0.00 sec)
root@localhost : (none) 01:43:36> show master status;
+------------------+----------+--------------+------------------+--------------------------------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set                          |
+------------------+----------+--------------+------------------+--------------------------------------------+
| mysql-bin.000001 |      154 |              |                  | 1a324bb7-4a61-11e7-811f-fa163e85255f:1-100 |
+------------------+----------+--------------+------------------+--------------------------------------------+
1 row in set (0.00 sec)

root@localhost : (none) 01:58:53> select @@GLOBAL.GTID_EXECUTED;
+--------------------------------------------+
| @@GLOBAL.GTID_EXECUTED                     |
+--------------------------------------------+
| 1a324bb7-4a61-11e7-811f-fa163e85255f:1-100 |
+--------------------------------------------+
1 row in set (0.00 sec)

GTIDs that have been purged are treated as already executed and are recorded in Executed_Gtid_Set.
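
The bookkeeping behind the two statements above can be sketched in a few lines of Python. This is a toy model rather than MySQL's implementation: the class name GtidState and its methods are hypothetical, and only a single source UUID is modelled, so each GTID is reduced to its transaction number.

```python
class GtidState:
    """Toy model of GTID bookkeeping on one MySQL 5.7 server (hypothetical helper)."""

    def __init__(self, executed=()):
        self.executed = set(executed)  # transaction numbers in gtid_executed
        self.purged = set()            # transaction numbers in gtid_purged

    def reset_master(self):
        # RESET MASTER empties both gtid_executed and gtid_purged.
        self.executed.clear()
        self.purged.clear()

    def set_gtid_purged(self, numbers):
        # MySQL 5.7 accepts the assignment only while gtid_executed is empty.
        if self.executed:
            raise RuntimeError("gtid_executed is not empty; run RESET MASTER first")
        self.purged = set(numbers)
        # Purged GTIDs count as executed, so they show up in Executed_Gtid_Set.
        self.executed = set(numbers)


slave = GtidState(executed=range(1, 139))  # 1-138 before the experiment
slave.reset_master()
slave.set_gtid_purged(range(1, 101))       # gtid_purged = 1-100
print(min(slave.executed), max(slave.executed), len(slave.executed))  # 1 100 100
```

Calling set_gtid_purged without reset_master first raises, which mirrors why reset master had to be run on the slave before gtid_purged could be set.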

  • Now create a table on the master.
root@localhost : wukong 02:07:48> create table b(id int);
Query OK, 0 rows affected (10.31 sec)
  • Check show master status on the master
root@localhost : wukong 02:08:07> show master status;
+------------------+----------+--------------+------------------+--------------------------------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set                          |
+------------------+----------+--------------+------------------+--------------------------------------------+
| mysql-bin.000028 |     4483 |              |                  | 1a324bb7-4a61-11e7-811f-fa163e85255f:1-139 |
+------------------+----------+--------------+------------------+--------------------------------------------+
1 row in set (0.00 sec)
  • Check show slave status on the slave
                  Master_UUID: 1a324bb7-4a61-11e7-811f-fa163e85255f
             Master_Info_File: mysql.slave_master_info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
           Master_Retry_Count: 86400
                  Master_Bind: 
      Last_IO_Error_Timestamp: 
     Last_SQL_Error_Timestamp: 
               Master_SSL_Crl: 
           Master_SSL_Crlpath: 
           Retrieved_Gtid_Set: 1a324bb7-4a61-11e7-811f-fa163e85255f:125-139
            Executed_Gtid_Set: 1a324bb7-4a61-11e7-811f-fa163e85255f:1-100:139
                Auto_Position: 1
         Replicate_Rewrite_DB: 
                 Channel_Name: 
           Master_TLS_Version: 
1 row in set (0.00 sec)
  • Next, create another table on the master
root@localhost : wukong 02:12:56> create table c(id int);
Query OK, 0 rows affected (2.36 sec)
  • Check show master status on the master
root@localhost : wukong 02:13:05> show master status;
+------------------+----------+--------------+------------------+--------------------------------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set                          |
+------------------+----------+--------------+------------------+--------------------------------------------+
| mysql-bin.000028 |     4653 |              |                  | 1a324bb7-4a61-11e7-811f-fa163e85255f:1-140 |
+------------------+----------+--------------+------------------+--------------------------------------------+
1 row in set (0.00 sec)
  • Check show slave status on the slave again
root@localhost : (none) 02:12:30> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.1.12
                  Master_User: repl
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000028
          Read_Master_Log_Pos: 4653
               Relay_Log_File: mysql-relay-bin.000007
                Relay_Log_Pos: 1730
        Relay_Master_Log_File: mysql-bin.000028
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 4653
              Relay_Log_Space: 5333
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 0
               Last_SQL_Error: 
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 330612
                  Master_UUID: 1a324bb7-4a61-11e7-811f-fa163e85255f
             Master_Info_File: mysql.slave_master_info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
           Master_Retry_Count: 86400
                  Master_Bind: 
      Last_IO_Error_Timestamp: 
     Last_SQL_Error_Timestamp: 
               Master_SSL_Crl: 
           Master_SSL_Crlpath: 
           Retrieved_Gtid_Set: 1a324bb7-4a61-11e7-811f-fa163e85255f:125-140
            Executed_Gtid_Set: 1a324bb7-4a61-11e7-811f-fa163e85255f:1-100:139-140
                Auto_Position: 1
         Replicate_Rewrite_DB: 
                 Channel_Name: 
           Master_TLS_Version: 
1 row in set (0.00 sec)

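The slave's Executed_Gtid_Set picks up again at 139. The reason: before gtid_purged was set, the slave's Executed_Gtid_Set was 1a324bb7-4a61-11e7-811f-fa163e85255f:1-138, so new transactions continue from 139. Whenever the gtid_purged value differs from the original Executed_Gtid_Set, a hole like this one (here 101-138) appears.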
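
To make the hole concrete, here is a minimal sketch that parses the interval part of a GTID set and lists the missing transaction numbers. The helpers parse_intervals, format_intervals, and holes are hypothetical names written for this illustration, and the UUID prefix is assumed to have been stripped already.

```python
def parse_intervals(text):
    """Parse the interval part of a GTID set, e.g. '1-100:139-140', into a set of ints."""
    numbers = set()
    if not text:
        return numbers
    for part in text.split(":"):
        start, _, end = part.partition("-")
        # A single number such as '139' has no end, so it is its own range.
        numbers.update(range(int(start), int(end or start) + 1))
    return numbers


def format_intervals(numbers):
    """Format a set of ints back into 'a-b:c' interval notation."""
    ordered = sorted(numbers)
    parts = []
    i = 0
    while i < len(ordered):
        j = i
        # Extend the run while the next number is consecutive.
        while j + 1 < len(ordered) and ordered[j + 1] == ordered[j] + 1:
            j += 1
        lo, hi = ordered[i], ordered[j]
        parts.append(str(lo) if lo == hi else f"{lo}-{hi}")
        i = j + 1
    return ":".join(parts)


def holes(numbers):
    """Return the numbers missing between 1 and the highest executed transaction."""
    if not numbers:
        return set()
    return set(range(1, max(numbers) + 1)) - numbers


executed = parse_intervals("1-100:139-140")  # slave's Executed_Gtid_Set above
print(format_intervals(holes(executed)))     # 101-138
```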

  • Now run stop slave followed by start slave
root@localhost : (none) 02:13:06> stop slave;
Query OK, 0 rows affected (0.21 sec)

root@localhost : (none) 02:13:35> start slave;
Query OK, 0 rows affected (1.05 sec)
root@localhost : (none) 02:13:39> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.1.12
                  Master_User: repl
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000028
          Read_Master_Log_Pos: 4653
               Relay_Log_File: mysql-relay-bin.000008
                Relay_Log_Pos: 454
        Relay_Master_Log_File: mysql-bin.000027
             Slave_IO_Running: Yes
            Slave_SQL_Running: No
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 1062
                   Last_Error: Coordinator stopped because there were error(s) in the worker(s). The most recent failure being: Worker 2 failed executing transaction '1a324bb7-4a61-11e7-811f-fa163e85255f:103' at master log mysql-bin.000027, end_log_pos 2231. See error log and/or performance_schema.replication_applier_status_by_worker table for more details about this failure or others, if any.
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 1454
              Relay_Log_Space: 11130
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 1062
               Last_SQL_Error: Coordinator stopped because there were error(s) in the worker(s). The most recent failure being: Worker 2 failed executing transaction '1a324bb7-4a61-11e7-811f-fa163e85255f:103' at master log mysql-bin.000027, end_log_pos 2231. See error log and/or performance_schema.replication_applier_status_by_worker table for more details about this failure or others, if any.
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 330612
                  Master_UUID: 1a324bb7-4a61-11e7-811f-fa163e85255f
             Master_Info_File: mysql.slave_master_info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: 
           Master_Retry_Count: 86400
                  Master_Bind: 
      Last_IO_Error_Timestamp: 
     Last_SQL_Error_Timestamp: 170728 02:13:40
               Master_SSL_Crl: 
           Master_SSL_Crlpath: 
           Retrieved_Gtid_Set: 1a324bb7-4a61-11e7-811f-fa163e85255f:101-140
            Executed_Gtid_Set: 1a324bb7-4a61-11e7-811f-fa163e85255f:1-102:139-140
                Auto_Position: 1
         Replicate_Rewrite_DB: 
                 Channel_Name: 
           Master_TLS_Version: 
1 row in set (0.00 sec)
  • Check the error details
root@localhost : (none) 02:14:21> select * from performance_schema.replication_applier_status_by_worker\G;
*************************** 1. row ***************************
         CHANNEL_NAME: 
            WORKER_ID: 1
            THREAD_ID: NULL
        SERVICE_STATE: OFF
LAST_SEEN_TRANSACTION: 1a324bb7-4a61-11e7-811f-fa163e85255f:102
    LAST_ERROR_NUMBER: 0
   LAST_ERROR_MESSAGE: 
 LAST_ERROR_TIMESTAMP: 0000-00-00 00:00:00
*************************** 2. row ***************************
         CHANNEL_NAME: 
            WORKER_ID: 2
            THREAD_ID: NULL
        SERVICE_STATE: OFF
LAST_SEEN_TRANSACTION: 1a324bb7-4a61-11e7-811f-fa163e85255f:103
    LAST_ERROR_NUMBER: 1062
   LAST_ERROR_MESSAGE: Worker 2 failed executing transaction '1a324bb7-4a61-11e7-811f-fa163e85255f:103' at master log mysql-bin.000027, end_log_pos 2231; Could not execute Write_rows event on table pxs.dd; Duplicate entry '123123123' for key 'PRIMARY', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log FIRST, end_log_pos 2231
 LAST_ERROR_TIMESTAMP: 2017-07-28 14:13:40

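If the slave is not stopped and restarted around the gtid_purged change, it keeps receiving new transactions from the master, continuing from its existing Retrieved_Gtid_Set, which is why no error appeared earlier in this example. After stop slave and start slave, however, the slave sends UNION(@@global.gtid_executed, Retrieved_gtid_set) to the master (before MySQL 5.7.5 the last received GTID was first subtracted from Retrieved_gtid_set, which gives the same result here). In this example that is UNION(Executed_Gtid_Set: 1a324bb7-4a61-11e7-811f-fa163e85255f:1-100:139-140, Retrieved_Gtid_Set: 1a324bb7-4a61-11e7-811f-fa163e85255f:125-140) = 1a324bb7-4a61-11e7-811f-fa163e85255f:1-100:125-140.

The master compares this set with its own Executed_Gtid_Set, here 1a324bb7-4a61-11e7-811f-fa163e85255f:1-140, and concludes that the slave has not executed the transactions for GTIDs 101-124 (note: not 101-138), so it sends 101-124 to the slave. In reality the slave has already executed transaction 103, so a duplicate-key error is reported. If the master's binlogs containing 101-124 had already been purged, the slave would instead report error 1236: Got fatal error 1236 from master when reading data from binary log: 'The slave is connecting using CHANGE MASTER TO MASTER_AUTO_POSITION = 1, but the master has purged binary logs containing GTIDs that the slave requires.'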
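
The comparison the master performs can be sketched with plain Python sets. This is an illustration only, not MySQL's code: the function transactions_to_send and the exception class are hypothetical names, GTIDs are reduced to transaction numbers of a single UUID, and the purged range in the usage note is an invented value.

```python
class MasterHasPurgedRequiredGtids(Exception):
    """Stands in for ER_MASTER_HAS_PURGED_REQUIRED_GTIDS (seen as error 1236 on the slave)."""


def transactions_to_send(master_executed, master_purged, slave_executed, slave_retrieved):
    # The slave reports everything it has executed or already received.
    slave_known = slave_executed | slave_retrieved
    # The master must send whatever it has executed that the slave did not report.
    missing = master_executed - slave_known
    # If any of those were purged from the master's binlogs, replication cannot start.
    lost = missing & master_purged
    if lost:
        raise MasterHasPurgedRequiredGtids(sorted(lost))
    return missing


master_executed = set(range(1, 141))             # 1-140
slave_executed = set(range(1, 101)) | {139, 140}  # 1-100:139-140
slave_retrieved = set(range(125, 141))            # 125-140

missing = transactions_to_send(master_executed, set(), slave_executed, slave_retrieved)
print(min(missing), max(missing), len(missing))   # 101 124 24
```

Passing a non-empty master_purged that overlaps 101-124, for example set(range(1, 111)), makes the function raise instead, which corresponds to the error 1236 case.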

  • How the master and slave verify GTIDs

When using GTIDs, the slave tells the master which transactions it has already received, executed, or both. To compute this set, it reads the global value of gtid_executed and the value of the Retrieved_gtid_set column from SHOW SLAVE STATUS. The GTID of the last transmitted transaction is included in Retrieved_gtid_set only when the full transaction is received. The slave computes the following set:

UNION(@@global.gtid_executed, Retrieved_gtid_set)
Prior to MySQL 5.7.5, the GTID of the last transmitted transaction was included in Retrieved_gtid_set even if the transaction was only partially transmitted, and the last received GTID was subtracted from this set. (Bug #17943188) Thus, the slave computed the following set:

UNION(@@global.gtid_executed, Retrieved_gtid_set - last_received_GTID)
This set is sent to the master as part of the initial handshake, and the master sends back all transactions that it has executed which are not part of the set. If any of these transactions have been already purged from the master’s binary log, the master sends the error ER_MASTER_HAS_PURGED_REQUIRED_GTIDS to the slave, and replication does not start.
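
The two formulas quoted above can be compared side by side. The sketch below assumes a single UUID and uses a hypothetical function handshake_set; the numbers in the first case (101-105 with 105 only partially received) are made up for illustration, while the second case uses the sets from this article.

```python
def handshake_set(gtid_executed, retrieved, last_received=None):
    """Set the slave reports to the master during the initial handshake.

    Pass last_received to model the pre-5.7.5 formula, where the last
    (possibly partial) transaction is subtracted from Retrieved_gtid_set.
    """
    if last_received is not None:
        retrieved = retrieved - {last_received}
    return gtid_executed | retrieved


# Hypothetical case: 101-104 fully received, 105 only partially received.
executed = set(range(1, 101))
new_style = handshake_set(executed, set(range(101, 105)))                    # 5.7.5 and later
old_style = handshake_set(executed, set(range(101, 106)), last_received=105)  # before 5.7.5
print(new_style == old_style)  # True

# Sets from this article: 140 is also in gtid_executed, so subtracting it changes nothing.
article_executed = set(range(1, 101)) | {139, 140}
article_retrieved = set(range(125, 141))
same = (handshake_set(article_executed, article_retrieved)
        == handshake_set(article_executed, article_retrieved, last_received=140))
print(same)  # True
```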

Reposted from blog.csdn.net/wukong_666/article/details/77636258