Two MySQL lock troubleshooting cases

Deadlock case analysis:

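First, some business background. The remittance system (Remittance) sends a remittance plan over HTTP to Mocky, the server that validates remittance plans and submits them. When Mocky receives the request it validates it; if validation passes, it executes the remittance plan, wraps the result, and sends it to the asynchronous notification center (NotifyServer). At startup, NotifyServer creates a number of threads according to its configuration; these poll a log table or queue of completed messages and forward each task over HTTP to Remittance, which then updates the status of the remittance plan. Once validation is done, Mocky immediately returns a status indicating whether the remittance plan was accepted. Because NotifyServer is asynchronous, it cannot guarantee timely delivery, and if NotifyServer goes down, the status of every remittance plan completed before that cannot be updated in time either. For this reason the Remittance system also has a backfill thread (task) that queries Mocky directly over HTTP at a fixed interval; the Mocky interface it calls returns the remittance plan information synchronously instead of pushing it asynchronously through NotifyServer. The details follow.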

1. How Remittance handles the request that Mocky sends asynchronously through NotifyServer (partial; only the code involved in the deadlock is shown):

fastDataWithTemperatureHandler.updateData(plan);
this.genericDaoSupport.executeSQL("UPDATE t_remittance_plan_exec_log "
				+ " SET pg_account_name =:pgAccountName, " + " pg_account_no =:pgAccountNo, "
				+ " opposite_receive_date =:oppositeReceiveDate, " + " complete_payment_date =:completePaymentDate, "
				+ " actual_total_amount =:actualTotalAmount, " + " execution_status =:executionStatus, "
				+ " transaction_recipient =:transactionRecipient, " + " execution_remark =:executionRemark, "
				+ " transaction_serial_no =:transactionSerialNo, " + " exec_rsp_no =:execRspNo, "
				+ " last_modified_time =:lastModifiedTime, "
				+ " plan_credit_cash_flow_check_number =:planCreditCashFlowCheckNumber "
				+ "WHERE remittance_plan_uuid =:remittancePlanUuid " + "AND exec_req_no =:execReqNo "
				+ "AND execution_status =:processingStatus", params);
Here fastDataWithTemperatureHandler.updateData(plan) ultimately executes the following SQL. (The transaction propagation is set to PROPAGATION_REQUIRED, so this SQL runs in the same transaction as the SQL above.)

String sql = "UPDATE t_remittance_plan " + " SET pg_account_name =:pgAccountName, "
				+ " pg_account_no =:pgAccountNo, " + " complete_payment_date =:completePaymentDate, "
				+ " actual_total_amount =:actualTotalAmount, " + " execution_status =:executionStatus, "
				+ " execution_remark =:executionRemark, " + " transaction_serial_no =:transactionSerialNo, "
				+ " last_modified_time =:lastModifiedTime " + "WHERE remittance_plan_uuid =:remittancePlanUuid "
				+ "AND execution_status =:processingStatus";

Next, the code with which the backfill task processes the information Mocky returns directly:

this.genericDaoSupport.executeSQL(
					"UPDATE t_remittance_plan_exec_log "
							+ " SET pg_account_name =:pgAccountName, "
							+ " pg_account_no =:pgAccountNo, "
							+ " opposite_receive_date =:oppositeReceiveDate, "
							+ " complete_payment_date =:completePaymentDate, "
							+ " actual_total_amount =:actualTotalAmount, "
							+ " execution_status =:executionStatus, "
							+ " transaction_recipient =:transactionRecipient, "
							+ " execution_remark =:executionRemark, "
							+ " transaction_serial_no =:channelSequenceNo, "
							+ " exec_rsp_no =:execRspNo, "
							+ " last_modified_time =:lastModifiedTime, "
							+ " plan_credit_cash_flow_check_number =:planCreditCashFlowCheckNumber "
							+ "WHERE remittance_plan_uuid =:remittancePlanUuid "
							+ "AND exec_req_no =:execReqNo "
							+ "AND execution_status =:processingStatus", params);
			if(execReqNo.equals(latestExecReqNo)) {
				this.genericDaoSupport.executeSQL(
						"UPDATE t_remittance_plan "
							+ " SET pg_account_name =:pgAccountName, "
							+ " pg_account_no =:pgAccountNo, "
							+ " complete_payment_date =:completePaymentDate, "
							+ " actual_total_amount =:actualTotalAmount, "
							+ " execution_status =:executionStatus, "
							+ " execution_remark =:executionRemark, "
							+ " transaction_serial_no =:channelSequenceNo, "
							+ " last_modified_time =:lastModifiedTime "
							+ "WHERE remittance_plan_uuid =:remittancePlanUuid "
							+ "AND execution_status =:processingStatus", params);
			}
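From this we can see one thing: the first path, which handles the callback from the notification center, updates the plan table first and then the log table, while the second path updates the log table first and then the plan table. Under normal conditions this causes no problem, but once the task frequency was raised (scanning the database every 30s, selecting the matching records and sending them to Mocky), deadlocks appeared. The output of show engine innodb status was as follows: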

*** (1) TRANSACTION:
TRANSACTION 55375121, ACTIVE 0 sec starting index read
mysql tables in use 1, locked 1
LOCK WAIT 6 lock struct(s), heap size 1136, 4 row lock(s), undo log entries 1
MySQL thread id 69147, OS thread handle 140007280789248, query id 945290175 192.168.122.7 root updating
UPDATE t_remittance_plan_exec_log  SET pg_account_name ='测试专户号',  pg_account_no ='600000000001',  opposite_receive_date ='2017-10-26 19:49:15',  complete_payment_date ='2017-10-26 19:49:15',  actual_total_amount =1500.00,  execution_status =2,  transaction_recipient =1,  execution_remark ='测试置成功',  transaction_serial_no =null,  exec_rsp_no ='121014412319903744',  last_modified_time ='2017-10-26 19:49:16.618',  plan_credit_cash_flow_check_number =3 WHERE remittance_plan_uuid ='c1af7440-14e2-494f-bf7e-f8c5b0ae2d4d' AND exec_req_no ='9030800c-7390-4424-9c05-96601ebe16c0' AND execution_status =1
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 437 page no 13445 n bits 384 index remittance_plan_uuid of table `yunxin`.`t_remittance_plan_exec_log` trx id 55375121 lock_mode X waiting
Record lock, heap no 97 PHYSICAL RECORD: n_fields 2; compact format; info bits 0
 0: len 30; hex 63316166373434302d313465322d343934662d626637652d663863356230; asc c1af7440-14e2-494f-bf7e-f8c5b0; (total 36 bytes);
 1: len 8; hex 0000000000098792; asc         ;;

*** (2) TRANSACTION:
TRANSACTION 55375122, ACTIVE 0 sec starting index read
mysql tables in use 1, locked 1
6 lock struct(s), heap size 1136, 4 row lock(s), undo log entries 1
MySQL thread id 69090, OS thread handle 140007617246976, query id 945290177 192.168.122.7 root updating
UPDATE t_remittance_plan  SET pg_account_name ='测试专户号',  pg_account_no ='600000000001',  complete_payment_date ='2017-10-26 19:49:15',  actual_total_amount =1500.00,  execution_status =2,  execution_remark ='测试置成功',  transaction_serial_no =null,  last_modified_time ='2017-10-26 19:49:16.619' WHERE remittance_plan_uuid ='c1af7440-14e2-494f-bf7e-f8c5b0ae2d4d' AND execution_status =1
*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 437 page no 13445 n bits 384 index remittance_plan_uuid of table `yunxin`.`t_remittance_plan_exec_log` trx id 55375122 lock_mode X
Record lock, heap no 97 PHYSICAL RECORD: n_fields 2; compact format; info bits 0
 0: len 30; hex 63316166373434302d313465322d343934662d626637652d663863356230; asc c1af7440-14e2-494f-bf7e-f8c5b0; (total 36 bytes);
 1: len 8; hex 0000000000098792; asc         ;;

*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 236 page no 26239 n bits 376 index remittance_plan_uuid of table `yunxin`.`t_remittance_plan` trx id 55375122 lock_mode X waiting
Record lock, heap no 309 PHYSICAL RECORD: n_fields 2; compact format; info bits 0
 0: len 30; hex 63316166373434302d313465322d343934662d626637652d663863356230; asc c1af7440-14e2-494f-bf7e-f8c5b0; (total 36 bytes);
 1: len 8; hex 0000000000098794; asc         ;;

*** WE ROLL BACK TRANSACTION (2)
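The cause of the deadlock is clear. The first transaction is waiting for the X lock on the relevant record in the log table (from the code we know it already holds the X lock on the relevant record in the plan table), while the second transaction already holds the X lock on that log table record and is waiting for the X lock on the plan table record. The two transactions wait on each other, which is a deadlock, so InnoDB automatically rolled back the second transaction and let the first one proceed. This happens because, under high concurrency, the task's active query and the NotifyServer callback can be processed at the same moment for the same remittance plan uuid. Both update the plan table and the log table, but in opposite order, so each transaction acquires the row lock in a different table and then waits for the other to release the lock it needs in the second table. Once the cause is known the fix is simple: make both code paths update the two tables in the same order.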
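
The fix described above, one agreed update order on both code paths, can be illustrated without a database. The following is only a minimal simulation, not the project's code: two ReentrantLock objects stand in for the row locks on t_remittance_plan and t_remittance_plan_exec_log, and the names LockOrderDemo, updateInFixedOrder and runConcurrently are hypothetical. Plan-first is chosen arbitrarily here; what matters is that both callers agree.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Simplified model: each lock stands in for the InnoDB row lock on one table's record.
class LockOrderDemo {
    private final ReentrantLock planRowLock = new ReentrantLock();
    private final ReentrantLock execLogRowLock = new ReentrantLock();
    private final List<String> updates = Collections.synchronizedList(new ArrayList<>());

    // Both the NotifyServer callback path and the backfill task call this method,
    // so the locks are always taken in the same order: plan first, then exec log.
    void updateInFixedOrder(String caller) {
        planRowLock.lock();
        try {
            updates.add(caller + ":plan");
            execLogRowLock.lock();
            try {
                updates.add(caller + ":execLog");
            } finally {
                execLogRowLock.unlock();
            }
        } finally {
            planRowLock.unlock();
        }
    }

    // Runs both paths concurrently; returns true if every call finished in time.
    static boolean runConcurrently(int rounds) {
        LockOrderDemo demo = new LockOrderDemo();
        ExecutorService pool = Executors.newFixedThreadPool(2);
        for (int i = 0; i < rounds; i++) {
            pool.submit(() -> demo.updateInFixedOrder("callback"));
            pool.submit(() -> demo.updateInFixedOrder("task"));
        }
        pool.shutdown();
        try {
            boolean finished = pool.awaitTermination(10, TimeUnit.SECONDS);
            // Each call records two updates, and there are two calls per round.
            return finished && demo.updates.size() == rounds * 4;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("completed without deadlock: " + runConcurrently(1000));
    }
}
```

If one caller took execLogRowLock first while the other took planRowLock first, the same two threads could each hold one lock and wait for the other, which is the cycle InnoDB reported.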


Gap lock case analysis:

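Business background: after the remittance system (Remittance) accepts an external remittance application (remittanceApplication), it runs a series of logical and business validations (for business reasons the two kinds of validation are split across different systems) and then updates the status of the application according to the validation result. If validation fails, the application is marked as failed and the external caller is notified through NotifyServer; once that callback completes, the status is updated again. If validation succeeds, the remittance plans parsed from the application are submitted to the Mocky system; when Mocky finishes processing it calls back, and Remittance, on receiving the request, updates once more the status of the remittance application that the submitted plans belong to. All of these updates use application_uuid as one of their conditions. As before, the code works well in most scenarios, but under the 'high' concurrency of our tests (500*1000), after running for a while it reliably threw a lock wait timeout exception, which means some SQL statements in the system kept failing to acquire a lock. Checking InnoDB's state with show engine innodb status showed gap locks. The output from that time was not saved, so the following status line is shown as an analogous example: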

RECORD LOCKS space id 0 page no 1665 n bits 72 index `number` of table `learn`.`test` trx id BA91 lock_mode X locks gap before rec insert intention waiting
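A gap lock is a range lock that InnoDB uses to prevent phantom reads (see my previous post on MySQL locks for details). Under the default REPEATABLE READ isolation level, an update that locates rows through an ordinary (non-unique) index locks not only the matching index records but also the gaps around them. In this case application_uuid is exactly such an ordinary index. Under high concurrency every update produced a gap lock, so after a while there were many gap locks covering different gaps. If a new remittance application was then created and had to be inserted, and its uuid happened to fall inside one of the locked gaps, the insert had to wait for that gap lock to be released, and eventually timed out. The fix is simple: when designing statements that will perform large numbers of updates, replace the ordinary index in the condition with a unique index.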
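
To make the mechanism concrete, here is a deliberately simplified model. It assumes the default REPEATABLE READ isolation level and an update that finds its rows through a non-unique index, and it is not how InnoDB is actually implemented: it ignores primary-key ordering at the gap boundaries and every other lock type. GapLockModel and its methods are hypothetical names. The index is a sorted set of keys; locking a key also locks the range between its nearest smaller and nearest larger neighbour, and an insert whose key falls strictly inside a locked range has to wait.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Simplified model of record-plus-gap locking on a non-unique index.
class GapLockModel {
    private final TreeSet<Integer> indexKeys = new TreeSet<>();
    // Each entry is {lowerExclusive, upperExclusive}; null means unbounded.
    private final List<Integer[]> lockedRanges = new ArrayList<>();

    GapLockModel(int... existingKeys) {
        for (int k : existingKeys) {
            indexKeys.add(k);
        }
    }

    // Models UPDATE ... WHERE key = k: the record and the gaps on both sides are locked.
    void lockForUpdate(int key) {
        Integer lower = indexKeys.lower(key);   // nearest smaller key, or null
        Integer higher = indexKeys.higher(key); // nearest larger key, or null
        lockedRanges.add(new Integer[] {lower, higher});
    }

    // Models INSERT: it must wait if the new key lies strictly inside a locked range.
    boolean insertWouldBlock(int newKey) {
        for (Integer[] range : lockedRanges) {
            boolean aboveLower = range[0] == null || newKey > range[0];
            boolean belowUpper = range[1] == null || newKey < range[1];
            if (aboveLower && belowUpper) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        GapLockModel index = new GapLockModel(10, 20, 30, 40);
        index.lockForUpdate(20);                        // locks the range (10, 30)
        System.out.println(index.insertWouldBlock(15)); // true: inside the locked range
        System.out.println(index.insertWouldBlock(25)); // true: inside the locked range
        System.out.println(index.insertWouldBlock(35)); // false: outside (10, 30)
    }
}
```

With a unique index and an equality condition that matches exactly one row, InnoDB locks only that index record, which in this model would correspond to lockedRanges staying empty. That is why switching the condition to a unique index removes the blocked inserts.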

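Summary: when you run into a MySQL lock problem, the first step should be to run show engine innodb status (if the storage engine is InnoDB) and look at InnoDB's state. It records the state and type of the locks in detail, together with the SQL each transaction is currently executing. Use that information to pinpoint the code and find a solution.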
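
As a small aid for that diagnosis step, the lock lines in the status output can be picked apart mechanically. The sketch below is a hypothetical helper, not part of any MySQL tooling: the class name LockLineParser is made up, and the regular expression is written only against the RECORD LOCKS line format that appears in the output quoted in this post, so other MySQL versions may need a different pattern.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LockLineParser {
    // Written against lines of the form quoted in this post, for example:
    // RECORD LOCKS space id 437 page no 13445 n bits 384 index remittance_plan_uuid
    //   of table `yunxin`.`t_remittance_plan_exec_log` trx id 55375121 lock_mode X waiting
    private static final Pattern RECORD_LOCK = Pattern.compile(
        "RECORD LOCKS .* index `?(\\w+)`? of table `(\\w+)`\\.`(\\w+)`"
            + " trx id (\\w+) lock_mode (\\w+)(.*)");

    // Returns one short summary per RECORD LOCKS line found in the status text.
    static List<String> summarize(String statusText) {
        List<String> result = new ArrayList<>();
        for (String line : statusText.split("\\R")) {
            Matcher m = RECORD_LOCK.matcher(line.trim());
            if (m.matches()) {
                // Anything after the lock mode that mentions "waiting" is a pending request.
                String state = m.group(6).contains("waiting") ? "WAITING" : "HELD";
                result.add("trx " + m.group(4) + " " + state + " " + m.group(5)
                    + " on " + m.group(3) + "." + m.group(1));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String sample = String.join("\n",
            "*** (2) HOLDS THE LOCK(S):",
            "RECORD LOCKS space id 437 page no 13445 n bits 384 index remittance_plan_uuid"
                + " of table `yunxin`.`t_remittance_plan_exec_log` trx id 55375122 lock_mode X",
            "*** (2) WAITING FOR THIS LOCK TO BE GRANTED:",
            "RECORD LOCKS space id 236 page no 26239 n bits 376 index remittance_plan_uuid"
                + " of table `yunxin`.`t_remittance_plan` trx id 55375122 lock_mode X waiting");
        summarize(sample).forEach(System.out::println);
    }
}
```

Listing which transaction holds and which waits on each table and index makes an opposite-order update pattern like the one in the deadlock case easier to spot.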






Reposted from blog.csdn.net/summermangozz/article/details/78374365