Anyone who has used MySQL has almost certainly heard of read-write splitting, probably more often than they care to remember. So how is read-write splitting achieved? The most common approach is to set up MySQL master-slave replication: the master handles writes and the slave handles reads, which gives the application read-write splitting.
Newcomers to development or operations roles should first understand what read-write splitting is and which business problems it solves. Only with that understanding can they use a read-write splitting architecture well.
Without further ado, let's look at the two most common errors in master-slave replication.
The first type: primary key conflict (Error_code: 1062).
The second type: missing record. An update or delete operation replayed on the slave cannot find the corresponding record (Error_code: 1032).
Let's simulate the missing-record case in detail and walk through the whole fix.
Check whether master-slave replication is healthy
[root@localhost] 11:34:29 [testdb]>show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.0.1
Master_User: repl
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: binlog.000029
Read_Master_Log_Pos: 3683
Relay_Log_File: mysql-relay-bin.000003
Relay_Log_Pos: 2207
Relay_Master_Log_File: binlog.000029
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Slave_IO_Running and Slave_SQL_Running are both Yes, so the IO thread and the SQL thread are running normally.
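The same check can be scripted. The following is a minimal sketch, assuming the text of the status output has already been captured as a string; the helper names parse_slave_status and replication_healthy are made up for illustration and are not part of MySQL.

```python
import re

# Matches "Key: value" lines of the vertical (\G) output. Keys contain only
# letters and underscores, which excludes prompts, row separators and the
# continuation lines of Executed_Gtid_Set.
FIELD_RE = re.compile(r"^\s*([A-Za-z_]+):\s?(.*)$")


def parse_slave_status(text):
    """Turn the text of SHOW SLAVE STATUS\\G into a dict of field -> value."""
    status = {}
    for line in text.splitlines():
        match = FIELD_RE.match(line)
        if match:
            status[match.group(1)] = match.group(2).strip()
    return status


def replication_healthy(status):
    """True only when both replication threads report Yes."""
    return (
        status.get("Slave_IO_Running") == "Yes"
        and status.get("Slave_SQL_Running") == "Yes"
    )


sample = """\
Slave_IO_State: Waiting for master to send event
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
"""
print(replication_healthy(parse_slave_status(sample)))  # prints True
```

Requiring both threads to be Yes matters because a stopped SQL thread can hide behind a healthy IO thread, which is exactly the failure simulated in this article.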
Create a test table and some records
[root@localhost] 11:25:48 [testdb]>show create table test1\G
*************************** 1. row ***************************
Table: test1
Create Table: CREATE TABLE `test1` (
`id` int(11) NOT NULL,
`name1` char(10) COLLATE utf8mb4_unicode_ci NOT NULL DEFAULT '',
`name2` char(20) COLLATE utf8mb4_unicode_ci NOT NULL DEFAULT '',
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci
1 row in set (0.07 sec)
insert into test1 values(1,'test1','test1');
insert into test1 values(2,'test2','test2');
insert into test1 values(3,'test3','test3');
Simulate a replication failure caused by a missing record on the slave
Step 1: Delete the record with id=2 on the slave
[root@localhost] 11:26:41 [testdb]>delete from test1 where id=2;
Query OK, 1 row affected (0.44 sec)
[root@localhost] 11:26:52 [testdb]>select * from test1;
+----+-------+-------+
| id | name1 | name2 |
+----+-------+-------+
| 1 | test1 | test1 |
| 3 | test3 | test3 |
+----+-------+-------+
2 rows in set (0.00 sec)
Step 2: Delete the record with id=2 on the master
[root@localhost] 11:27:11 [testdb]>delete from test1 where id=2;
Query OK, 1 row affected (0.17 sec)
[root@localhost] 11:27:51 [testdb]>select * from test1;
+----+-------+-------+
| id | name1 | name2 |
+----+-------+-------+
| 1 | test1 | test1 |
| 3 | test3 | test3 |
+----+-------+-------+
2 rows in set (0.00 sec)
Check the replication status on the slave
[root@localhost] 11:34:05 [testdb]>show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.0.1
Master_User: repl
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: binlog.000029
Read_Master_Log_Pos: 3683
Relay_Log_File: mysql-relay-bin.000003
Relay_Log_Pos: 1929
Relay_Master_Log_File: binlog.000029
Slave_IO_Running: Yes
Slave_SQL_Running: No
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 1032
Last_Error: Could not execute Delete_rows event on table testdb.test1; Can't find record in 'test1', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log binlog.000029, end_log_pos 3652
Skip_Counter: 0
Exec_Master_Log_Pos: 3405
Relay_Log_Space: 2414
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 1032
Last_SQL_Error: Could not execute Delete_rows event on table testdb.test1; Can't find record in 'test1', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log binlog.000029, end_log_pos 3652
Replicate_Ignore_Server_Ids:
Master_Server_Id: 111213106
Master_UUID: 3ada166e-c4db-11ea-b21d-000c29cc2388
Master_Info_File: mysql.slave_master_info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State:
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp: 200904 11:33:10
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set: 3ada166e-c4db-11ea-b21d-000c29cc2388:84830-84835
Executed_Gtid_Set: 3ada166e-c4db-11ea-b21d-000c29cc2388:1-84834,
3ada166e-c4db-11ea-b21d-000c29cc2389:1-4
Auto_Position: 1
Replicate_Rewrite_DB:
Channel_Name:
Master_TLS_Version:
1 row in set (0.00 sec)
At this point the slave's SQL thread has stopped (Slave_SQL_Running: No) and the slave is no longer in sync with the master. The SQL thread reported error 1032 when it tried to apply the delete event.
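The Last_SQL_Error text carries the details needed for troubleshooting: the event type, the table, the error code, and the binlog file and position. A minimal sketch for pulling them out follows. The regular expression is modeled only on the message shown above, and parse_sql_error is a hypothetical helper name; messages worded differently will simply not match.

```python
import re

# Pattern modeled on the Last_SQL_Error text from this article (assumption:
# other replication errors may use a different wording).
ERROR_RE = re.compile(
    r"Could not execute (?P<event>\w+) event on table (?P<table>[\w.]+);"
    r".*?Error_code: (?P<code>\d+);"
    r".*?master log (?P<log_file>[^\s,]+), end_log_pos (?P<end_pos>\d+)"
)


def parse_sql_error(message):
    """Return the parsed fields as a dict, or None if the text does not match."""
    match = ERROR_RE.search(message)
    if match is None:
        return None
    info = match.groupdict()
    info["code"] = int(info["code"])
    info["end_pos"] = int(info["end_pos"])
    return info


message = (
    "Could not execute Delete_rows event on table testdb.test1; "
    "Can't find record in 'test1', Error_code: 1032; "
    "handler error HA_ERR_KEY_NOT_FOUND; "
    "the event's master log binlog.000029, end_log_pos 3652"
)
print(parse_sql_error(message))
```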
There are three ways to resolve the 1032 error.
Solution 1: Manually export the missing record from the master, import it into the slave, and then start the slave's SQL thread. But wait, there is a problem: which record should be exported from the master? The error message does not say, although it does give a hint: the event's master log binlog.000029, end_log_pos 3652. So you have to analyze the contents of the binlog to find the record being operated on, which is a bit troublesome. Don't panic, there are also solutions 2 and 3.
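For Solution 1, the positions in the status output narrow down which part of the binlog to read: Exec_Master_Log_Pos (3405) is where the slave stopped applying, and end_log_pos (3652) is where the failing event ends. A minimal sketch that builds a mysqlbinlog command for that range is shown here. The helper name mysqlbinlog_command is hypothetical, and it is an assumption that the command is run on the master from the directory that holds binlog.000029.

```python
import shlex


def mysqlbinlog_command(log_path, start_pos, stop_pos):
    """Build the argument list for decoding one position range of a binlog."""
    return [
        "mysqlbinlog",
        "--base64-output=DECODE-ROWS",  # do not print the raw base64 row events
        "--verbose",                    # show row events as commented pseudo-SQL
        f"--start-position={start_pos}",
        f"--stop-position={stop_pos}",
        log_path,
    ]


# Positions taken from the SHOW SLAVE STATUS output in this article.
# The bare file name assumes the current directory is the binlog directory.
cmd = mysqlbinlog_command("binlog.000029", 3405, 3652)
print(shlex.join(cmd))
```

The decoded output lists the row the master deleted, which tells you which record the slave was expected to have.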
Solution 2: MySQL provides a parameter, slave_skip_errors, which makes the slave skip SQL statements that fail with the specified error codes, for example slave_skip_errors=1032. Unfortunately, this parameter cannot be changed online; the instance has to be restarted for a change to take effect. Not very friendly, is it?
[root@localhost] 11:28:57 [testdb]>set global slave_skip_errors=1032;
ERROR 1238 (HY000): Variable 'slave_skip_errors' is a read only variable
Solution 3: Use the pt-slave-restart tool from Percona Toolkit to automatically skip SQL statements that fail with the specified error codes during replication. This method is less invasive and does not require restarting the MySQL instance.
[mysql@mysql ~]$ pt-slave-restart --user=root --password=root --socket=/data/mysql/run/3306/mysql.sock --error-numbers=1032
# A software update is available:
2020-09-04T11:32:07 S=/data/mysql/run/3306/mysql.sock,p=...,u=root mysql-relay-bin.000003 1651 1032
After the failing SQL statement is skipped and replication resumes, pt-slave-restart keeps running and re-checks the slave for new 1032 errors, waiting up to 64 seconds between checks.
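The 64-second figure is an upper bound rather than a fixed interval. As far as I recall from the tool's documentation, pt-slave-restart halves its sleep time after finding an error and doubles it otherwise, bounded by --min-sleep and --max-sleep. The sketch below only illustrates that idea. It is not the tool's actual code, and the default values used (initial sleep 1, minimum 0.015625, maximum 64) should be treated as assumptions to verify against your installed version.

```python
def next_sleep(current, found_error, min_sleep=0.015625, max_sleep=64.0):
    """Halve the sleep after an error, double it otherwise, within bounds."""
    proposed = current / 2 if found_error else current * 2
    return max(min_sleep, min(max_sleep, proposed))


sleep = 1.0  # assumed initial sleep in seconds
history = []
for _ in range(8):
    # No new error found on this check, so the tool can back off.
    sleep = next_sleep(sleep, found_error=False)
    history.append(sleep)
print(history)  # prints [2.0, 4.0, 8.0, 16.0, 32.0, 64.0, 64.0, 64.0]
```

On a quiet slave the wait therefore settles at the maximum, which is where the 64 seconds comes from.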
Other similar errors can be handled with the same three methods. The third method is the recommended one.