深入研究insert into select语句锁表故障(上)

故障描述
前几天,一个mysql数据库运维同事,在生产上用insert into select from语句,在生产上备份了一张表,结果将备份表全表锁住了,业务影响了大约10分钟。
看到这个语句,我第一反应就是select语句也能锁表,可是生产上的故障,证明确实锁表了。所以,需要将insert into select
from获取锁的情况彻底研究明白。

故障复盘
创建模拟表和模拟记录

[root@localhost] 17:39:55 [testdb1]>show create table t_test_1\G;
*************************** 1. row ***************************
       Table: t_test_1
Create Table: CREATE TABLE `t_test_1` (
  `id` int(11) NOT NULL,
  `name` char(10) COLLATE utf8mb4_unicode_ci NOT NULL DEFAULT '',
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci
1 row in set (0.00 sec)

[root@localhost] 17:40:30 [testdb1]>select * from t_test_1;
+----+-------+
| id | name  |
+----+-------+
|  1 | trest |
|  2 | e99e  |
|  3 | test  |
|  4 | fresd |
|  5 | fsfa  |
+----+-------+
5 rows in set (0.00 sec)

[root@localhost] 17:40:17 [testdb1]>show create table t_test_2\G;
*************************** 1. row ***************************
       Table: t_test_2
Create Table: CREATE TABLE `t_test_2` (
  `id` int(11) NOT NULL,
  `name` char(10) COLLATE utf8mb4_unicode_ci NOT NULL DEFAULT '',
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci
1 row in set (0.00 sec)

模拟insert into select操作


[root@localhost] 17:41:32 [testdb1]>begin;
Query OK, 0 rows affected (0.00 sec)

[root@localhost] 17:41:33 [testdb1]>insert into t_test_2 select * from t_test_1 where name like 'trest';
Query OK, 1 row affected (0.02 sec)
Records: 1  Duplicates: 0  Warnings: 0

获取innodb的lock信息


[root@localhost] 17:42:00 [(none)]>show engine innodb status\G;
TRANSACTIONS
------------
Trx id counter 182551
Purge done for trx's n:o < 182551 undo n:o < 0 state: running but idle
History list length 0
LIST OF TRANSACTIONS FOR EACH SESSION:
---TRANSACTION 421524582451936, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
---TRANSACTION 182546, ACTIVE 20 sec
3 lock struct(s), heap size 1136, 6 row lock(s), undo log entries 1
MySQL thread id 7, OS thread handle 140049254979328, query id 82 localhost root

从innodb引擎获取的lock信息,太少了,只能看到有3 lock struct(s),6 row lock(s),不清楚那表申请的锁,申请什么类型的锁,不知道这些信息,就研究不明白故障到底怎么发生的。

幸运的是,mysql数据库提供一个参数innodb_status_output_locks,可以打印更详细的lock信息。

启用innodb_status_output_locks参数
启用innodb_status_output_locks参数,默认是不开启,所以需要开启。

[root@localhost] 17:31:12 [(none)]>show variables like 'innodb_status_output_locks';
+----------------------------+-------+
| Variable_name              | Value |
+----------------------------+-------+
| innodb_status_output_locks | OFF   |
+----------------------------+-------+
1 row in set (0.00 sec)

[root@localhost] 17:47:41 [(none)]>set global innodb_status_output_locks=on;
Query OK, 0 rows affected (0.00 sec)

[root@localhost] 17:47:41 [(none)]>show variables like 'innodb_status_output_locks';
+----------------------------+-------+
| Variable_name              | Value |
+----------------------------+-------+
| innodb_status_output_locks | ON    |
+----------------------------+-------+
1 row in set (0.00 sec)

获取innodb的lock详细信息
下面是开启innodb_status_output_locks参数之后,获取的详细lock信息


[root@localhost] 17:48:28 [(none)]>show engine innodb status\G;
TRANSACTIONS
------------
Trx id counter 182552
Purge done for trx's n:o < 182551 undo n:o < 0 state: running but idle
History list length 0
LIST OF TRANSACTIONS FOR EACH SESSION:
---TRANSACTION 421524582451936, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
---TRANSACTION 182551, ACTIVE 5 sec
3 lock struct(s), heap size 1136, 6 row lock(s), undo log entries 1
MySQL thread id 11, OS thread handle 140049254979328, query id 100 localhost root
TABLE LOCK table `testdb1`.`t_test_1` trx id 182551 lock mode IS
RECORD LOCKS space id 97 page no 3 n bits 72 index PRIMARY of table `testdb1`.`t_test_1` trx id 182551 lock mode S
Record lock, heap no 1 PHYSICAL RECORD: n_fields 1; compact format; info bits 0
 0: len 8; hex 73757072656d756d; asc supremum;;

Record lock, heap no 2 PHYSICAL RECORD: n_fields 4; compact format; info bits 0
 0: len 4; hex 80000001; asc     ;;
 1: len 6; hex 00000002c710; asc       ;;
 2: len 7; hex af000000310110; asc     1  ;;
 3: len 10; hex 74726573742020202020; asc trest     ;;

Record lock, heap no 3 PHYSICAL RECORD: n_fields 4; compact format; info bits 0
 0: len 4; hex 80000002; asc     ;;
 1: len 6; hex 00000002c710; asc       ;;
 2: len 7; hex af00000031011c; asc     1  ;;
 3: len 10; hex 65393965202020202020; asc e99e      ;;

Record lock, heap no 4 PHYSICAL RECORD: n_fields 4; compact format; info bits 0
 0: len 4; hex 80000003; asc     ;;
 1: len 6; hex 00000002c710; asc       ;;
 2: len 7; hex af000000310128; asc     1 (;;
 3: len 10; hex 74657374202020202020; asc test      ;;

Record lock, heap no 5 PHYSICAL RECORD: n_fields 4; compact format; info bits 0
 0: len 4; hex 80000004; asc     ;;
 1: len 6; hex 00000002c710; asc       ;;
 2: len 7; hex af000000310134; asc     1 4;;
 3: len 10; hex 66726573642020202020; asc fresd     ;;

Record lock, heap no 6 PHYSICAL RECORD: n_fields 4; compact format; info bits 0
 0: len 4; hex 80000005; asc     ;;
 1: len 6; hex 00000002c710; asc       ;;
 2: len 7; hex af000000310140; asc     1 @;;
 3: len 10; hex 66736661202020202020; asc fsfa      ;;

TABLE LOCK table `testdb1`.`t_test_2` trx id 182551 lock mode IX

从上面的信息,可以很清晰看到,t_test_1获取到IS锁,并且有5个Record lock信息,即锁了5条记录,而此表只有5条记录,所以锁全表。
TABLE LOCK table testdb1.t_test_1 trx id 182551 lock mode IS

锁全表解决方案
insert into t_test_2 select * from t_test_1 where name like 'trest';这个sql语句中,t_test_1表的name字段没有索引,索引走了全表扫描,如果在name字段创建索引呢,会有什么变化呢
创建索引

[root@localhost] 17:54:33 [testdb1]>alter table t_test_1 add index idx_t_test_1_name (name);
Query OK, 0 rows affected (0.24 sec)
Records: 0  Duplicates: 0  Warnings: 0

[root@localhost] 17:54:52 [testdb1]>begin;
Query OK, 0 rows affected (0.00 sec)

[root@localhost] 17:54:55 [testdb1]>insert into t_test_2 select * from t_test_1 where name like 'trest';
Query OK, 1 row affected (0.00 sec)
Records: 1  Duplicates: 0  Warnings: 0

重新获取innodb的lock详细信息


TRANSACTIONS
------------
Trx id counter 182565
Purge done for trx's n:o < 182565 undo n:o < 0 state: running but idle
History list length 0
LIST OF TRANSACTIONS FOR EACH SESSION:
---TRANSACTION 421524582451936, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
---TRANSACTION 182560, ACTIVE 3 sec
3 lock struct(s), heap size 1136, 2 row lock(s), undo log entries 1
MySQL thread id 11, OS thread handle 140049254979328, query id 105 localhost root
TABLE LOCK table `testdb1`.`t_test_1` trx id 182560 lock mode IS
RECORD LOCKS space id 97 page no 4 n bits 72 index idx_t_test_1_name of table `testdb1`.`t_test_1` trx id 182560 lock mode S
Record lock, heap no 1 PHYSICAL RECORD: n_fields 1; compact format; info bits 0
 0: len 8; hex 73757072656d756d; asc supremum;;

Record lock, heap no 6 PHYSICAL RECORD: n_fields 2; compact format; info bits 0
 0: len 10; hex 74726573742020202020; asc trest     ;;
 1: len 4; hex 80000001; asc     ;;

TABLE LOCK table `testdb1`.`t_test_2` trx id 182560 lock mode IX

看到没有,在这里,现在只有一个Record lock,不再是锁全表了。

此故障分析未完,待续。

猜你喜欢

转载自blog.51cto.com/15061930/2642055