116. SQL线程故障

项目背景:
开发人员误操作 将从库当主库使,导致主从环境产生故障
报错信息如下:

2019-12-31T09:53:24.916228Z 12 [ERROR] Slave SQL for channel '': Error 'Can't create database 'rep'; database exists' on query. Default database: 'rep'. Query: 'create database rep charset utf8', Error_code: 1007
2019-12-31T09:53:24.916472Z 12 [Warning] Slave: Can't create database 'rep'; database exists Error_code: 1007
2019-12-31T09:53:24.916494Z 12 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'mysql-bin.000007' position 194.

处理方法(以从库为核心的处理方案):
方案一:

stop slave; 
set global sql_slave_skip_counter = 1;
#将同步指针向下移动一个,如果多次不同步,可以重复操作。
start slave;

从库操作:
mysql> stop slave;
Query OK, 0 rows affected (0.00 sec)

mysql> set global sql_slave_skip_counter=1;
Query OK, 0 rows affected (0.00 sec)

mysql> start slave;
Query OK, 0 rows affected (0.00 sec)

主库测试:

mysql> use rep;
Database changed
mysql> create table t1 (id int,name varchar(20));
Query OK, 0 rows affected (0.00 sec)

mysql> insert into t1 values(1,'test');
Query OK, 1 row affected (0.01 sec)

mysql> commit;
Query OK, 0 rows affected (0.00 sec)

mysql> select * from t1;
+------+------+
| id   | name |
+------+------+
|    1 | test |
+------+------+
1 row in set (0.00 sec)

从库检查:

mysql> use rep;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
mysql> show tables;
+---------------+
| Tables_in_rep |
+---------------+
| t1            |
+---------------+
1 row in set (0.00 sec)

mysql> select * from t1;
Empty set (0.00 sec)

mysql> show slave status \G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 10.0.0.51
                  Master_User: repl
                  Master_Port: 3306
                Connect_Retry: 10
              Master_Log_File: mysql-bin.000007
          Read_Master_Log_Pos: 800
               Relay_Log_File: db02-relay-bin.000008
                Relay_Log_Pos: 498
        Relay_Master_Log_File: mysql-bin.000007
             Slave_IO_Running: Yes
            Slave_SQL_Running: No
                   Last_Errno: 1677
                   Last_Error: Column 1 of table 'rep.t1' cannot be converted from type 'varchar(60(bytes))' to type 'varchar(20(bytes) latin1)'
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 541
              Relay_Log_Space: 1345
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
        Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 1677
               Last_SQL_Error: Column 1 of table 'rep.t1' cannot be converted from type 'varchar(60(bytes))' to type 'varchar(20(bytes) latin1)'
               
报错详解:因为主库和从库的字符集不一致;导致录入不了数据

方案二:

vim  /etc/my.cnf
slave-skip-errors = 1032,1062,1007
常见错误代码:
1007:对象已存在
1032:无法执行DML
1062:主键冲突,或约束冲突
但是,以上操作有时是有风险的,最安全的做法就是重新构建主从。把握一个原则,一切以主库为主

方案三:一劳永逸的方法

(1) 可以设置从库只读.
db01 [(none)]>show variables like '%read_only%';
mysql> set global read_only=1;		#临时生效,永久需修改my.cnf
Query OK, 0 rows affected (0.00 sec)

注意:
只会影响到普通用户,对管理员用户无效。

(2)采用架构的方式 加中间件
使主从环境 读写分离

主从环境读写分离架构图:
在这里插入图片描述
结果分析:
1、SQL线程功能:

(1)读写relay-log.info 
(2)relay-log损坏,断节,找不到
(3)接收到的SQL无法执行

2、导致SQL线程故障原因分析:

1. 版本差异,参数设定不同,比如:数据类型的差异,SQL_MODE影响
2.要创建的数据库对象,已经存在
3.要删除或修改的对象不存在  
4.DML语句不符合表定义及约束时.  
归根揭底的原因都是由于从库发生了写入操作.
Last_SQL_Error: Error 'Can't create database 'db'; database exists' on query. Default database: 'db'. Query: 'create database db'
发布了148 篇原创文章 · 获赞 65 · 访问量 7626

猜你喜欢

转载自blog.csdn.net/chengyinwu/article/details/103787666