Deadlock Case Study No. 13

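1 Preface
Deadlocks are an interesting and challenging technical problem. Probably every DBA and some developers will run into one at some point in their work. I am writing an ongoing series of deadlock case analyses, in the hope that it helps anyone who wants to understand deadlocks.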

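2 Case analysis
2.1 Business scenario
When a user enters a product, the application first checks whether an identical record already exists. If it does, the record is deleted and then re-inserted; if not, the record is inserted directly.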

2.2 Environment
MySQL 5.7.22, with the transaction isolation level set to RC (READ COMMITTED).

create table t (
    id int not null auto_increment primary key,
    a int not null default 0,
    b int not null default 0,
    c int not null default 0,
    unique key uk_ab (a, b)
) engine = innodb;

insert into t (a, b, c) values (1,1,1), (3,3,2), (6,6,3), (9,9,5);

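2.3 Background knowledge
Knowledge point one
When an INSERT inserts or updates a record and finds a duplicate key, or a delete-marked duplicate key (the case in this article), a plain INSERT/UPDATE takes a next-key lock with the LOCK_S attribute. SQL such as REPLACE INTO or INSERT ... ON DUPLICATE takes an X lock instead. The behavior also differs by index type:

Code location: row0ins.cc:2013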

if (flags & BTR_NO_LOCKING_FLAG) {
    /* Set no locks when applying log
    in online table rebuild. */
} else if (allow_duplicates) {
    /* If the SQL-query will update or replace
    duplicate key we will take X-lock for
    duplicates ( REPLACE, LOAD DATAFILE REPLACE,
    INSERT ON DUPLICATE KEY UPDATE). */

    err = row_ins_set_exclusive_rec_lock(
        lock_type, block, rec, index, offsets, thr);
} else {
    err = row_ins_set_shared_rec_lock(
        lock_type, block, rec, index, offsets, thr);
}
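The three-way branch quoted above can be mirrored in a few lines of Python. This is only a toy model of the decision, not InnoDB code: the names `LockMode` and `duplicate_check_lock_mode` and the two boolean parameters are assumptions introduced for illustration.

```python
from enum import Enum


class LockMode(Enum):
    NONE = "no lock"
    SHARED = "LOCK_S"
    EXCLUSIVE = "LOCK_X"


def duplicate_check_lock_mode(no_locking: bool, allow_duplicates: bool) -> LockMode:
    """Hypothetical mirror of the branch in the quoted row0ins.cc excerpt."""
    if no_locking:
        # Stands in for (flags & BTR_NO_LOCKING_FLAG): no lock is set.
        return LockMode.NONE
    if allow_duplicates:
        # REPLACE, INSERT ... ON DUPLICATE KEY UPDATE: X lock on the duplicate.
        return LockMode.EXCLUSIVE
    # Plain INSERT / UPDATE: S lock on the duplicate.
    return LockMode.SHARED


if __name__ == "__main__":
    for label, allow in (("plain INSERT", False), ("REPLACE INTO", True)):
        print(label, "->", duplicate_check_lock_mode(False, allow).value)
```

The plain INSERT in this article's case follows the last branch, which is why the deadlock log shows a request for `lock mode S`.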

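Knowledge point two
Whenever a record is inserted into a data page, the function lock_rec_insert_check_and_lock is called to check locks (inserts made while building an index are the exception). It checks whether a lock object exists on the record that follows the insert position. "Following" here means next in logical order, not physically adjacent. If no lock object exists on the next record: when the record belongs to a secondary index, the maximum transaction ID on the secondary index page is first updated to the current transaction's ID, and then success is returned directly.

If a lock object does exist on the next record, InnoDB has to determine whether that lock covers the gap. If the gap is locked and the lock is judged to conflict with an insert intention gap lock, the current operation must wait: it requests a lock of type LOCK_X | LOCK_GAP | LOCK_INSERT_INTENTION and enters the wait state.
Code location: lock0lock.cc:5965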

*inherit = TRUE;

/* If another transaction has an explicit lock request which locks
the gap, waiting or granted, on the successor, the insert has to wait.

An exception is the case where the lock by the another transaction
is a gap type lock which it placed to wait for its turn to insert. We
do not consider that kind of a lock conflicting with our insert. This
eliminates an unnecessary deadlock which resulted when 2 transactions
had to wait for their insert. Both had waiting gap type lock requests
on the successor, which produced an unnecessary deadlock. */

const ulint type_mode = LOCK_X | LOCK_GAP | LOCK_INSERT_INTENTION;

const lock_t* wait_for = lock_rec_other_has_conflicting(
    type_mode, block, heap_no, trx);

if (wait_for != NULL) {
    RecLock rec_lock(thr, index, block, heap_no, type_mode);

    trx_mutex_enter(trx);

    err = rec_lock.add_to_waitq(wait_for);

    trx_mutex_exit(trx);
} else {
    err = DB_SUCCESS;
}
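The rule described in the quoted comment can be sketched as a small Python model. It is a simplification under stated assumptions, not the InnoDB implementation: the `HeldLock` class, its field names, and `insert_must_wait` are hypothetical, and the model covers only the insert intention request discussed here.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class HeldLock:
    """Hypothetical description of one lock request on the successor record."""
    trx: str
    mode: str                      # "S" or "X"
    kind: str                      # "next-key", "gap" or "rec-not-gap"
    insert_intention: bool = False


def insert_must_wait(inserting_trx: str,
                     locks_on_successor: Iterable[HeldLock]) -> Optional[HeldLock]:
    """Return the first lock that blocks an insert intention request, else None."""
    for lock in locks_on_successor:
        if lock.trx == inserting_trx:
            continue  # a transaction never waits for its own lock
        if lock.kind == "rec-not-gap":
            continue  # the gap before the record is not covered
        if lock.insert_intention:
            continue  # the exception named in the quoted comment
        return lock   # S or X, gap or next-key: the insert has to wait
    return None


if __name__ == "__main__":
    s_next_key = HeldLock("sess1", "S", "next-key")
    print(insert_must_wait("sess2", [s_next_key]))
```

In this model an S next-key lock held by another session is enough to block an insert into the gap, while a record-only X lock is not.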

I verified this with the following tests. The table structure and data are the test data built in section 2.2.

Test case one

sess1
mysql> delete from t where a=3 and b=3;
Query OK, 1 row affected (0.00 sec)

sess2
mysql> update t set c=6 where a=6 and b=6 and c=3;

sess1
mysql> insert into t(a,b,c) values(3,3,5);
-- a lock wait occurs here

The insert of (3,3,5) requests a lock S and is blocked by the Lock X row lock held by sess2's update.

show engine innodb status does not show in full what kind of lock this lock S is, so we keep testing.

Test case two

T1 sess1
mysql> delete from t where a=3 and b=3;
mysql> insert into t(a,b,c) values(3,3,5);

T2 sess2
mysql> insert into t(a,b,c) values(3,2,6);

T3 sess3
mysql> insert into t(a,b,c) values(3,4,5);

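Here sess2 and sess3 both wait for lock_mode X locks gap before rec insert intention waiting. They are evidently blocked by the LOCK_S next-key lock held by sess1, and that lock covers both gaps, (1,3) and (3,6).

It follows that in test case one, the lock X record lock but not gap that sess2 holds on record (6,6) blocks the LOCK_S next-key lock requested by the insert of (3,3).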

2.4 Test case

2.5 Deadlock log
2019-04-27 23:26:16 0x7f26cc77b700
*** (1) TRANSACTION:
TRANSACTION 2489, ACTIVE 43 sec inserting
mysql tables in use 1, locked 1
LOCK WAIT 5 lock struct(s), heap size 1136, 4 row lock(s), undo log entries 2
MySQL thread id 121125, OS thread handle 139804595451648, query id 526 localhost msandbox update
insert into t(a,b,c) values(3,3,3),(3,1,2)
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 26 page no 4 n bits 80 index uk_ab of table `test`.`t` trx id 2489 lock mode S waiting
*** (2) TRANSACTION:
TRANSACTION 2490, ACTIVE 36 sec inserting
mysql tables in use 1, locked 1
6 lock struct(s), heap size 1136, 6 row lock(s), undo log entries 3
MySQL thread id 121123, OS thread handle 139804615882496, query id 528 localhost msandbox update
insert into t(a,b,c) values(6,6,6),(6,5,4)
*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 26 page no 4 n bits 80 index uk_ab of table `test`.`t` trx id 2490 lock_mode X locks rec but not gap
*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 26 page no 4 n bits 80 index uk_ab of table `test`.`t` trx id 2490 lock_mode X locks gap before rec insert intention waiting
*** WE ROLL BACK TRANSACTION (1)

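2.6 Analyzing the deadlock log
T1: delete a=3 and b=3, not committed. It holds the row lock on secondary index record (3,3), and the record is delete-marked.

T2: delete a=6 and b=6, not committed. It holds the row lock on secondary index record (6,6), and the record is delete-marked.

T3: insert (3,3,3) finds the delete-marked (3,3) and requests a LOCK_S next-key lock. During the check it finds that the next record holds a Lock X record lock, so it waits.

T4: insert (6,5,4) writes into the (3,6) gap and requests lock_mode X locks gap before rec insert intention, but it has to wait for the LOCK_S next-key lock of the T3 session. The two sessions wait for each other, and a deadlock occurs.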
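The mutual wait can be pictured as a cycle in a wait-for graph. The sketch below is a generic depth-first cycle search, not InnoDB's own deadlock detector; the function name `find_cycle` and the dictionary layout are assumptions, and the edges are taken from the transaction IDs in the log (2489 waits for 2490, and 2490 waits for 2489).

```python
def find_cycle(waits_for):
    """Return one cycle as a list of transaction ids, or None if there is none.

    waits_for maps a transaction id to the ids it is waiting on.
    """
    visiting, done, path = set(), set(), []

    def dfs(trx):
        visiting.add(trx)
        path.append(trx)
        for blocker in waits_for.get(trx, ()):
            if blocker in visiting:
                # Back edge: the cycle runs from blocker around to blocker.
                return path[path.index(blocker):] + [blocker]
            if blocker not in done:
                cycle = dfs(blocker)
                if cycle:
                    return cycle
        visiting.discard(trx)
        done.add(trx)
        path.pop()
        return None

    for trx in list(waits_for):
        if trx not in done:
            cycle = dfs(trx)
            if cycle:
                return cycle
    return None


if __name__ == "__main__":
    # Edges read off the deadlock log in section 2.5.
    print(find_cycle({2489: [2490], 2490: [2489]}))
```

Once such a cycle exists, no transaction in it can make progress, which is why InnoDB rolls one of them back (transaction 1 in the log).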

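2.7 Solution
At root, the deadlock comes from concurrent operations on adjacent records. After discussing it with the developers, we changed the business logic: if the number of product records being entered equals the number of existing records, update them; records that do not exist are inserted directly. This lowers the chance of operating directly on adjacent records.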
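One way to picture the revised logic is a small planner that chooses UPDATE for keys that already exist and INSERT for the rest, so that no delete-marked records are left in uk_ab. This is a simplified per-row reading of the change, not the application's actual code: `plan_statements` is a hypothetical helper, and the `%s` placeholders assume a Python MySQL driver that uses that parameter style.

```python
def plan_statements(existing_keys, incoming_rows):
    """Return (sql, params) pairs: UPDATE for existing (a, b) keys, INSERT otherwise.

    existing_keys: set of (a, b) tuples already present in table t (assumed to
    have been read by the application beforehand).
    incoming_rows: iterable of (a, b, c) tuples entered by the user.
    """
    plan = []
    for a, b, c in incoming_rows:
        if (a, b) in existing_keys:
            # Update in place instead of delete followed by insert.
            plan.append(("UPDATE t SET c = %s WHERE a = %s AND b = %s", (c, a, b)))
        else:
            plan.append(("INSERT INTO t (a, b, c) VALUES (%s, %s, %s)", (a, b, c)))
    return plan


if __name__ == "__main__":
    for sql, params in plan_statements({(3, 3)}, [(3, 3, 3), (3, 1, 2)]):
        print(sql, params)
```

Each pair could then be passed to a driver's `cursor.execute(sql, params)`; the planner itself does not touch the database.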

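3 Summary
The analysis above rests on my self-taught ability to read the source code and may not be entirely correct. If you see it differently, criticism is welcome.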

Original link

https://mp.weixin.qq.com/s?__biz=MzI4NjExMDA4NQ%3D%3D&mid=2648451307&idx=1&sn=715b8dedbb9888c398bde8346a483025&chksm=f3c97001c4bef91799526bfd7909e5a8bebf960bd5a13ea4b93549b28ecc48eeaf0ca10904b3&mpshare=1&scene=23&srcid=%23rd

Reposted from blog.csdn.net/weixin_44476888/article/details/89789321