一个mysql事务引发的思考(血案)

版权声明:本文为博主原创文章,未经博主允许不得转载。 https://blog.csdn.net/hl_java/article/details/80918062

问题简述

大家都知道mysql是支持ACID,支持事务的,事务是非常重要的一个特性,要不都执行成功,要不都不成功。我们在coding时也会大量用到,但是随着业务代码的累加与使用人数的增加,系统有某些场景下会出现下面这个错误

11:48:40.265[http-nio-8999-exec-3] ERROR c.x.e.l.c.e.ExceptionHandleFilter - exception message: {} com.xxx.common.exception.ExceptionHandleFilter:45
org.springframework.dao.CannotAcquireLockException: 
### Error updating database.  Cause: java.sql.SQLException: Lock wait timeout exceeded; try restarting transaction
### The error may involve com.xxx.dao.issuegroup.CatalogTaskDao.insert-Inline
### The error occurred while setting parameters
### SQL: INSERT INTO `catalog_task`  (`issue_id`,`catalog_id`,`status`,`cuserid`)  VALUES ( ?,?,?,?)
### Cause: java.sql.SQLException: Lock wait timeout exceeded; try restarting transaction
; SQL []; Lock wait timeout exceeded; try restarting transaction; nested exception is java.sql.SQLException: Lock wait timeout exceeded; try restarting transaction
    at org.springframework.jdbc.support.SQLErrorCodeSQLExceptionTranslator.doTranslate(SQLErrorCodeSQLExceptionTranslator.java:259) ~[spring-jdbc-4.2.3.RELEASE.jar:4.2.3.RELEASE]
    at org.springframework.jdbc.support.AbstractFallbackSQLExceptionTranslator.translate(AbstractFallbackSQLExceptionTranslator.java:73) ~[spring-jdbc-4.2.3.RELEASE.jar:4.2.3.RELEASE]
    at org.mybatis.spring.MyBatisExceptionTranslator.translateExceptionIfPossible(MyBatisExceptionTranslator.java:75) ~[mybatis-spring-1.3.0.jar:1.3.0]
    at org.mybatis.spring.SqlSessionTemplate$SqlSessionInterceptor.invoke(SqlSessionTemplate.java:447) ~[mybatis-spring-1.3.0.jar:1.3.0]
    at com.sun.proxy.$Proxy99.insert(Unknown Source) ~[na:na]
    at org.mybatis.spring.SqlSessionTemplate.insert(SqlSessionTemplate.java:279) ~[mybatis-spring-1.3.0.jar:1.3.0]
    at org.apache.ibatis.binding.MapperMethod.execute(MapperMethod.java:56) ~[mybatis-3.4.0.jar:3.4.0]
    at org.apache.ibatis.binding.MapperProxy.invoke(MapperProxy.java:53) ~[mybatis-3.4.0.jar:3.4.0]

问题跟踪

复现操作场景,跟踪期间执行的sql语句
开启mysql generallog日志,并找到日志路径,如果未开启的话执行 set global general_log = on; 即可。

mysql> show global variables like '%general%';
ERROR 2006 (HY000): MySQL server has gone away
No connection. Trying to reconnect...
Connection id:    16082
Current database: lean_dev
+------------------+---------------------------------+
| Variable_name    | Value                           |
+------------------+---------------------------------+
| general_log      | ON                              |
| general_log_file | /var/lib/mysql/9905e4ad340b.log |
+------------------+---------------------------------+
2 rows in set (0.61 sec)

需要注意的是问题复现期间建议使用单独的mysql机器,我使用的是已经废弃不用的机器(ip:10.94.120.110),这样才能得到特定场景操作下的纯粹的sql日志,截取具体的日志,
找到可能存在锁竞争的sql(过滤掉select语句):

2018-07-04T12:04:11.847548Z 16087 Query SET autocommit=0
2018-07-04T12:04:11.901111Z 16087 Query UPDATE lean_archive la SET la.status = -1, mtime = '2018-07-04 20:04:11.875'  WHERE la.object_type = 1      AND la.object_id = 195959      AND la.status = 0
2018-07-04T12:04:12.083234Z 16087 Query insert into  lean_change  ( `issueid`,`cuserid`,`ctime`, `type`, `status`, `field`, `oldvalue`, `newvalue`, `desc`,source)  values  (null,120859,now(),'list
',11,'status','195959',null,'柳志崇恢复了列“{2}”',0)
2018-07-04T12:04:12.099792Z 16087 Query update  lean_list  set  name = 'done2',  ctime = '2018-05-09 16:55:45.0',  cuserid = 120859,  list_order = 3,  status = 0,  boardid = 10114, finished = 0 wh
ere id = 195959
2018-07-04T12:04:15.023635Z 16087 Query UPDATE `catalog`  SET `status` = 0,mtime=now() where  status !=-1 and id in    (    1649   ,   1650   ,   3602   ,   3603   ,   3604   ,   3605   )
2018-07-04T12:04:15.036100Z 16087 Query UPDATE `catalog_task`  SET `status` = 0,mtime=now()   where status!=-1 and issue_id in    (    118859   )
2018-07-04T12:04:15.051886Z 16087 Query update lean_issue SET mtime = now()        ,boardid=10114  where id = 118859
2018-07-04T12:04:23.394968Z 16087 Query update lean_issue SET mtime = now()        ,boardid=10114  where id = 118864
2018-07-04T12:04:24.308721Z 16085 Query SET autocommit=0
2018-07-04T12:04:24.497692Z 16085 Query insert ignore into lean_issue set  title = 'bbbb',  content = null,  cuserid = 120859,  status = 0,  listid = 195406,  boardid = 10114,  priority = null,
   source_type = 0,   starttime = null,  endtime = null,  sequnce = 'LR4605-83',  parentid = null,  ctime = now(),  mtime = null,  workflowid = null,  type = 4722,  source = null,  issue_order = null
2018-07-04T12:04:25.218430Z 16085 Query insert into  lean_change  ( `issueid`,`cuserid`,`ctime`, `type`, `status`, `field`, `oldvalue`, `newvalue`, `desc`,source)  values  (123096,120859,now(),'is
sue',1,'list','','195406','柳志崇在“Done1”添加了任务',0)
2018-07-04T12:04:25.620718Z 16085 Query insert into  lean_card_propose_user  set  issue_id = 123096,  user_id = 120859,  status = null
2018-07-04T12:04:25.639585Z 16085 Query insert into  lean_issue_user set  userid = 120859,  issueid = 123096,  type = 2
2018-07-04T12:04:25.832163Z 16085 Query update lean_board SET mtime = '2018-07-04 20:04:25' ,count = 83  where id = 10114
2018-07-04T12:04:26.449587Z 16087 Query UPDATE `catalog`  SET `status` = 0,mtime=now() where  status !=-1 and id in    (    3602   ,   3604   ,   3605   )
2018-07-04T12:04:26.458029Z 16087 Query UPDATE `catalog_task`  SET `status` = 0,mtime=now()   where status!=-1 and issue_id in    (    118864   )
2018-07-04T12:04:26.468883Z 16087 Query update lean_issue SET mtime = now()        ,boardid=10114  where id = 118864
2018-07-04T12:04:26.951996Z 16085 Query INSERT INTO `catalog_task`  (`issue_id`,`catalog_id`,`status`,`cuserid`)  VALUES ( 123096,1649,0,120859)
...
2018-07-04T12:05:13.319018Z 16087 Query UPDATE `catalog_task`  SET `status` = 0,mtime=now()   where status!=-1 and issue_id in    (    118873   )
2018-07-04T12:05:13.330249Z 16087 Query update lean_issue SET mtime = now()        ,boardid=10114  where id = 118873
2018-07-04T12:05:16.033715Z 16087 Query UPDATE `catalog`  SET `status` = 0,mtime=now() where  status !=-1 and id in    (    1649   ,   1650   ,   3602   ,   3603   ,   3604   ,   3605   )
2018-07-04T12:05:16.042514Z 16087 Query UPDATE `catalog_task`  SET `status` = 0,mtime=now()   where status!=-1 and issue_id in    (    118874   )
2018-07-04T12:05:16.053007Z 16087 Query update lean_issue SET mtime = now()        ,boardid=10114  where id = 118874
2018-07-04T12:05:17.967792Z 16085 Query rollback
2018-07-04T12:05:17.972078Z 16085 Query SET autocommit=1
2018-07-04T12:05:18.769333Z 16087 Query UPDATE `catalog`  SET `status` = 0,mtime=now() where  status !=-1 and id in    (    1649   ,   1650   ,   3602   ,   3603   ,   3604   ,   3605   )
2018-07-04T12:05:18.779075Z 16087 Query UPDATE `catalog_task`  SET `status` = 0,mtime=now()   where status!=-1 and issue_id in    (    118874   )
2018-07-04T12:05:18.791102Z 16087 Query update lean_issue SET mtime = now()        ,boardid=10114  where id = 118874
2018-07-04T12:05:21.380525Z 16087 Query UPDATE `catalog`  SET `status` = 0,mtime=now() where  status !=-1 and id in    (    1649   ,   1650   ,   3602   ,   3603   ,   3604   ,   3605   )
2018-07-04T12:05:21.386858Z 16087 Query UPDATE `catalog_task`  SET `status` = 0,mtime=now()   where status!=-1 and issue_id in    (    118875   )
...
2018-07-04T12:08:29.544130Z 16087 Query update lean_issue SET mtime = now()        ,boardid=10114  where id = 122925
2018-07-04T12:08:30.424652Z 16087 Query UPDATE `catalog`  SET `status` = 0,mtime=now() where  status !=-1 and id in    (    1649   ,   1650   ,   3602   ,   3603   ,   3604   ,   3605   )
2018-07-04T12:08:30.437569Z 16087 Query UPDATE `catalog_task`  SET `status` = 0,mtime=now()   where status!=-1 and issue_id in    (    122925   )
2018-07-04T12:08:30.448584Z 16087 Query update lean_issue SET mtime = now()        ,boardid=10114  where id = 122925
2018-07-04T12:08:30.718356Z 16087 Query INSERT INTO `lean_change`  ( `issueid`,`cuserid`,`ctime`, `type`, `status`, `field`, `oldvalue`, `newvalue`, `desc`,source)  VALUES     (118859,120859,now()
,'issue',10,'status','-1','0','lean恢复了任务',0)  ,  (118859,120859,now(),'issue',10,'status','-1','0','lean恢复了任务',0)  ,  (118861,120859,now(),'issue',10,'status','-1','0','lean恢复了任务',0)  ,  (118861,120859,now(),'issue',10,
'status','-1','0','lean恢复了任务',0)  ,  (118862,120859,now(),'issue',10,'status','-1','0','lean恢复了任务',0)  ,  (118862,120859,now(),'issue',10,'status','-1','0','lean恢复了任务',0)  ,  (118864,120859,now(),'issue',10,'status','-1
','0','lean恢复了任务',0)  ,  (118864,120859,now(),'issue',10,'status','-1','0','lean恢复了任务',0)  ,  (118865,120859,now(),'issue',10,'status','-1','0','lean恢复了任务',0)  ,  (118865,120859,now(),'issue',10,'status','-1','0','lean恢复了任务'
,0)  ,  (118866,120859,now(),'issue',10,'status','-1','0','lean恢复了任务',0)  ,  (118866,120859,now(),'issue',10,'status','-1','0','lean恢复了任务',0)  ,  (118867,120859,now(),'issue',10,'status','-1','0','lean恢复了任务',0)  ,  (118
867,120859,now(),'issue',10,'status','-1','0','lean恢复了任务',0) ...
2018-07-04T12:08:30.733035Z 16087 Query UPDATE lean_archive la, lean_issue li SET la.status = -1, la.mtime = now(), li.status = la.object_status, li.mtime = now() WHERE la.object_type = 2 AND la.s
tatus = 0 AND li.listid = 195959  AND li.status = -1 AND li.id = la.object_id
2018-07-04T12:08:30.879701Z 16087 Query UPDATE lean_board_list_lane SET status = 0, archive_id = null WHERE  status = -1   and board_id = 10114   and list_id = 195959   and archive_id = 57354
2018-07-04T12:08:31.623011Z 16087 Query insert into  notice_info  (`title`, `user_id`, `type`, `desc`, `url`, `from`, `from_id`, `c_time`) values  ('柳志崇恢复了列“done2”。',38604,1,'','http://lean-dev.intra.xiaoju
keji.com/project/LR4605','柳志崇',120859,'2018-07-04 20:08:31.618')
2018-07-04T12:08:32.241940Z 16087 Query commit
2018-07-04T12:08:32.244986Z 16087 Query SET autocommit=1

至此问题基本锁定了表catalog_task(为什么这么说呢,只有这张表同时涉及到了update与insert变更操作)

场景还原

接下来就简化业务场景,准备测试表,测试数据进行验证

初始化

drop table catalog_task_test;
CREATE TABLE `catalog_task_test` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `issue_id` int(11) NOT NULL,
  `catalog_id` int(11) NOT NULL COMMENT '目录id',
  `status` tinyint(3) NOT NULL,
  `cuserid` int(11) NOT NULL,
  `ctime` timestamp NULL DEFAULT CURRENT_TIMESTAMP COMMENT '创建时间',
  `mtime` timestamp NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP COMMENT '更新时间',
  PRIMARY KEY (`id`) 
) ENGINE=InnoDB AUTO_INCREMENT=7199 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci ;
INSERT INTO `catalog_task_test`  (`issue_id`,`catalog_id`,`status`,`cuserid`)  VALUES ( 111,1649,0,120859) ;
INSERT INTO `catalog_task_test`  (`issue_id`,`catalog_id`,`status`,`cuserid`)  VALUES ( 222,1649,0,120859) ;

测试脚本

session1:
set autocommit=0;
INSERT INTO `catalog_task_test`  (`issue_id`,`catalog_id`,`status`,`cuserid`)  VALUES ( 333,1649,0,120859) ;
session2:(session1语句执行完成后再执行session2中的语句)
set autocommit=0; 
select now(); update catalog_task_test set status=1 where issue_id=111; select now();

session2实际执行时会报错,提示事务超时,错误结果:

mysql> select now(); update catalog_task_test set status=1 where issue_id=111; select now();
+---------------------+
| now()               |
+---------------------+
| 2018-07-03 10:23:26 |
+---------------------+
1 row in set (0.01 sec)
ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction
+---------------------+
| now()               |
+---------------------+
| 2018-07-03 10:24:17 |
+---------------------+
1 row in set (0.01 sec) 

疑问点

  • Q: insert into难道是表级锁吗,为什么在insert into语句没有执行commit; 时 update需要等待他的提交
  • A: 以mysql的InnoDB引擎为例,有索引并且使用了该索引当条件的时候就是行锁,没有索引的时候就是表锁。innodb的行锁是在有索引的情况下,没有索引的表是锁定全表的。insert一般都是行锁, update能确定且走索引的是行锁,否则表锁。

结合上面的解答,补充2个问题

  • Q: 那么先在session1中执行update ,然后在session2中执行insert 会出现锁超时的问题么?
  • A: 会,表级锁被占用的情况下是拿不到行级锁的
  • Q: 那么先在session1中执行insert ,然后在session2中执行insert 会出现锁超时的问题么?
  • A:不会,因为insert 均是行级锁
  • Q: 那么先在session1中执行update ,然后在session2中执行update 会出现锁超时的问题么?
  • A:会

解决思路1

将update表级锁降为行级锁(可以通过按主键进行更新或者where条件列增加索引)
上述案例添加索引sql如下:

ALTER TABLE `catalog_task_test` ADD INDEX idx_issue_id ( `issue_id` )  ;

解决思路2

上述案例insert并不是拿不到锁就立即抛异常的,它会等待innodb_lock_wait_timeout时间后还拿不到锁才会抛出异常,可以优化先拿到锁的业务逻辑,在innodb_lock_wait_timeout时间以内处理完并释放锁即可的

获取锁的超时时间的方法(在这个时间内会阻塞一直等待锁,到时间还拿不到锁的话就会抛异常)

show variables like '%innodb_lock_wait_timeout%'"
mysql: [Warning] Using a password on the command line interface can be insecure.
+--------------------------+-------+
| Variable_name            | Value |
+--------------------------+-------+
| innodb_lock_wait_timeout | 50    |
+--------------------------+-------+

猜你喜欢

转载自blog.csdn.net/hl_java/article/details/80918062