An analysis of the MySQL lock mechanism prompted by an online problem | JD Logistics Technical Team

Background

Recently, during the Double Eleven period, an online problem occurred in the group because of a MySQL deadlock. Monitoring at the time showed the number of active database connections soaring, which filled up the application-layer database connection pool; all subsequent requests then failed because they could not obtain a connection.

The business code, simplified, is as follows:

// Simplified business logic: delete the row, then re-insert it in the same transaction.
@Transactional
public void service(Integer id) {
    delete(id);
    insert(id);
}


Database instance monitoring:

After analyzing the upstream problem and dealing with the traffic-limiting issue at the time, I found time to re-analyze the root cause, which is summarized here. This article first analyzes the various locks in MySQL, including mutex locks, gap locks and insert intention locks, so that the usage scenario of each lock is understood, and then analyzes this problem on that basis. I hope it helps everyone locate the problem quickly when encountering similar scenarios in the future.

MySQL lock mechanism

To solve the problem of concurrent writes to the same row, MySQL introduces a row lock mechanism: multiple transactions cannot modify a row of data at the same time. When a transaction needs to modify a row, it first checks whether the row is locked. If not, the current transaction locks it successfully and can carry out the modification. If the row has already been locked by another transaction, the current transaction must wait for that transaction to release the lock; only then can it acquire the lock and continue with the modification.

Table creation statements used in all experiments in this article:

create table `test` (
    `id` int(11) NOT NULL,
    `num` int(11) NOT NULL,
    PRIMARY KEY (`id`),
    KEY `num` (`num`)
) ENGINE = InnoDB;

insert into
    test
values
(10, 10),
(20, 20),
(30, 30),
(40, 40),
(50, 50);




Shared and Exclusive Locks

A shared (S) lock allows the transaction that holds it on a row to read that row. A shared lock can be added with the statement select ... from test lock in share mode. It is rarely used in practice, so it is not elaborated on here.

An exclusive (X) lock is a mutex lock. When a transaction updates or deletes a row, it must first acquire the X lock on the record. If another transaction already holds the X lock on the record, the current transaction blocks and waits until that transaction releases it.

S locks are not mutually exclusive with each other: multiple transactions can hold the S lock on a record at the same time. X locks are mutually exclusive: multiple transactions cannot hold the X lock on the same record at the same time. S locks and X locks are also mutually exclusive: one transaction cannot hold the S lock on a record while another holds the X lock on it.
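The compatibility rules above can be written down as a small lookup. This is a minimal sketch only: the names LockCompat, Mode and compatible are hypothetical, and the code models the S/X rules as described in this article rather than InnoDB's internal implementation.

```java
public class LockCompat {
    /** The two row lock modes discussed in this section. */
    public enum Mode { S, X }

    /**
     * Returns true when a lock in mode `requested` can be granted while
     * another transaction already holds `held` on the same record.
     */
    public static boolean compatible(Mode held, Mode requested) {
        // Only S with S can coexist; every pairing that involves X conflicts.
        return held == Mode.S && requested == Mode.S;
    }

    public static void main(String[] args) {
        for (Mode held : Mode.values()) {
            for (Mode requested : Mode.values()) {
                String outcome = compatible(held, requested) ? "granted" : "wait";
                System.out.println(held + " held, " + requested + " requested -> " + outcome);
            }
        }
    }
}
```

Of the four combinations, only S requested while S is held is granted immediately; the other three make the requester wait.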

When multiple transactions update the same index record at the same time, each must first obtain the X lock on the record. A lock is simply a data structure generated in memory that records the owning transaction, the lock type, and whether the transaction is waiting. In the figure below, T1 and T2 update the row with id = 30 at the same time. T1 obtains the lock successfully: the is_waiting field in its lock structure is false, and the transaction can continue executing. T2 fails to obtain the lock: the is_waiting field in its lock structure is true, and it blocks until T1 releases the lock.

In the MySQL lock log, a mutex lock appears as: lock_mode X locks rec but not gap
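The lock structure described above (owning transaction, lock type, waiting flag) can be modeled in a few lines. This is a simplified sketch under stated assumptions: RecordLockQueue, LockStruct, requestX and release are hypothetical names, only X locks on a single record are modeled, and real InnoDB lock structures carry far more information.

```java
import java.util.ArrayList;
import java.util.List;

public class RecordLockQueue {
    /** In-memory lock structure: owning transaction, lock type, waiting flag. */
    public static final class LockStruct {
        public final String trx;
        public final String type;
        public boolean isWaiting;

        LockStruct(String trx, String type, boolean isWaiting) {
            this.trx = trx;
            this.type = type;
            this.isWaiting = isWaiting;
        }
    }

    // Lock structures on one record, in request order (index 0 is the holder).
    private final List<LockStruct> locks = new ArrayList<>();

    /** Requests the X lock: the first requester is granted, later ones wait. */
    public LockStruct requestX(String trx) {
        boolean mustWait = !locks.isEmpty();
        LockStruct lock = new LockStruct(trx, "X", mustWait);
        locks.add(lock);
        return lock;
    }

    /** Drops the locks of a finished transaction and grants the next waiter. */
    public void release(String trx) {
        locks.removeIf(l -> l.trx.equals(trx));
        if (!locks.isEmpty()) {
            locks.get(0).isWaiting = false;
        }
    }

    public static void main(String[] args) {
        RecordLockQueue row30 = new RecordLockQueue();
        LockStruct t1 = row30.requestX("T1");
        LockStruct t2 = row30.requestX("T2");
        System.out.println("T1 waiting: " + t1.isWaiting + ", T2 waiting: " + t2.isWaiting);
        row30.release("T1"); // T1 commits, T2 is granted the lock
        System.out.println("T2 waiting after T1 commits: " + t2.isWaiting);
    }
}
```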

RECORD LOCKS space id 58 page no 3 n bits 72 index `PRIMARY` of table `test`.`t`
trx id 10078 lock_mode X locks rec but not gap
Record lock, heap no 2 PHYSICAL RECORD: n_fields 3; compact format; info bits 0
 0: len 4; hex 8000000a; asc     ;;
 1: len 6; hex 00000000274f; asc     'O;;
 2: len 7; hex b60000019d0110; asc        ;;




Gap Locks

The previous section introduced exclusive locks. They prevent multiple transactions from updating the same row at the same time, but they cannot solve the problem of phantom reads. A phantom read occurs when a transaction queries the same range twice and the second query returns records that the first query did not.

   | session A                                                             | session B
T1 | select num from test where num > 10 and num < 15 for update; (0 rows) |
T2 |                                                                       | insert into test values(12, 12);
T3 | select num from test where num > 10 and num < 15 for update; (1 row)  |

In the above scenario, session A performs two range queries at T1 and T3, and session B inserts a row within that range at T2. If session A can see the row inserted by session B at T3, a phantom read has occurred. A mutex lock alone cannot prevent this: the record with num = 12 does not exist in the database yet, so no mutex lock can be placed on it to block session B's insert at T2.

Therefore, solving the phantom read problem requires a new lock mechanism: gap locks. Gap locks differ from mutex locks. A mutex lock is a row lock that locks one specific record, while a gap lock locks the gap between two records to prevent other transactions from inserting new records into that gap.

After gap locks are introduced, session A places a gap lock on the record with id = 20 at time T1. When session B wants to insert a record at time T2, it must first check whether there is a gap lock on the record that follows the insert position. A gap lock already exists on the record with id = 20, so session B generates an insert intention lock on that record and enters a lock wait.

In the MySQL lock log, a gap lock appears as: lock_mode X locks gap before rec

RECORD LOCKS space id 133 page no 3 n bits 80 index PRIMARY of table `test`.`test` trx id 38849 lock_mode X locks gap before rec
Record lock, heap no 4 PHYSICAL RECORD: n_fields 4; compact format; info bits 0
 0: len 4; hex 8000001e; asc     30 ;;
 1: len 6; hex 00000000969c; asc       ;;
 2: len 7; hex a60000011a0128; asc       (;;
 3: len 4; hex 8000001e; asc     ;;


Although gap locks solve the phantom read problem, each one locks an entire gap, which greatly reduces the overall concurrency of the database. And because gap locks are not mutually exclusive with each other, different transactions can hold gap locks on the same gap at the same time, which is often the source of various deadlocks.

Next-Key Locks

A next-key lock is a combination of a shared or exclusive record lock and a gap lock. When session A adds an exclusive next-key lock to a record R, it holds both the X lock on R and the gap lock on the gap before R.

In the gap lock example above, the first transaction (session A) actually adds a next-key lock: both an X lock and a gap lock are placed on the record with id = 20.

Under the repeatable read isolation level, update and delete operations add next-key locks to the records they scan by default. (When the statement matches an existing row through a unique index with an equality condition, only the record lock is taken, as in the lock_mode X locks rec but not gap log shown earlier.) In the MySQL lock log, a next-key lock appears as: lock_mode X

RECORD LOCKS space id 58 page no 3 n bits 72 index `PRIMARY` of table `test`.`t`
trx id 10080 lock_mode X
Record lock, heap no 1 PHYSICAL RECORD: n_fields 1; compact format; info bits 0
 0: len 8; hex 73757072656d756d; asc supremum;;

Record lock, heap no 2 PHYSICAL RECORD: n_fields 3; compact format; info bits 0
 0: len 4; hex 8000000a; asc     ;;
 1: len 6; hex 00000000274f; asc     'O;;
 2: len 7; hex b60000019d0110; asc        ;


Insert Intention Locks

Insert Intention Locks are also a kind of gap lock, acquired by the INSERT operation before row data is inserted.

Before inserting a record, the transaction first locates the record's position in the B+ tree and then checks whether there is a gap lock on the next record after the insert position. If there is, the insert must block and wait until the transaction that owns the gap lock commits or rolls back. Meanwhile, the waiting transaction also generates a lock structure in memory, indicating that it wants to insert a new record into a certain gap but is currently blocked. This lock structure is the insert intention lock.
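The check described above (find the next record after the insert position, then see whether another transaction holds a gap lock on it) can be sketched as follows. This is an illustrative model, not InnoDB code: GapLockTable, lockGapFor and blockersOfInsert are hypothetical names, the keys are the five sample rows of this article, and the supremum pseudo-record is deliberately left out.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class GapLockTable {
    // Primary keys of the sample rows used throughout this article.
    private final TreeSet<Integer> keys = new TreeSet<>(List.of(10, 20, 30, 40, 50));
    // Next record -> transactions holding a gap lock on the gap before it.
    private final Map<Integer, Set<String>> gapHolders = new HashMap<>();

    /**
     * Locks the gap that a missing key falls into. Gap locks never wait for
     * each other, so this always succeeds. Returns the record whose preceding
     * gap was locked, or null when the key lies beyond the last row
     * (the supremum is not modeled here).
     */
    public Integer lockGapFor(String trx, int missingKey) {
        Integer next = keys.higher(missingKey);
        if (next != null) {
            gapHolders.computeIfAbsent(next, k -> new HashSet<>()).add(trx);
        }
        return next;
    }

    /** Transactions whose gap locks an insert of newKey by trx has to wait for. */
    public Set<String> blockersOfInsert(String trx, int newKey) {
        Integer next = keys.higher(newKey);
        Set<String> blockers = new HashSet<>(gapHolders.getOrDefault(next, Set.of()));
        blockers.remove(trx); // a transaction is not blocked by its own gap lock
        return blockers;
    }

    public static void main(String[] args) {
        GapLockTable table = new GapLockTable();
        // A locking read for the missing id 25 locks the gap before record 30.
        System.out.println("gap locked before record " + table.lockGapFor("session1", 25));
        // Another session inserting id 26 into the same gap must wait.
        System.out.println("insert of 26 waits for " + table.blockersOfInsert("session2", 26));
    }
}
```

An insert into a different gap, for example id 35, has no blockers in this model, which matches the idea that a gap lock only protects its own gap.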

The experimental simulation is as follows:

   | session 1                                    | session 2                                  | session 3
T1 | begin;                                       |                                            |
T2 | select * from test where id = 25 for update; |                                            |
T3 |                                              | insert into test values(26, 26); (blocked) |
T4 |                                              |                                            | insert into test values(26, 26); (blocked)

For the statement select * from test where id = 25 for update, the record does not exist in the table, so under the repeatable read isolation level a gap lock is added to the gap (20, 30) to avoid phantom reads.

From the lock log, we can see that session 1 added a gap lock on record 30 (lock_mode X locks gap before rec):

RECORD LOCKS space id 133 page no 3 n bits 80 index PRIMARY of table `test`.`test` trx id 38849 lock_mode X locks gap before rec
Record lock, heap no 4 PHYSICAL RECORD: n_fields 4; compact format; info bits 0
 0: len 4; hex 8000001e; asc     30 ;;
 1: len 6; hex 00000000969c; asc       ;;
 2: len 7; hex a60000011a0128; asc       (;;
 3: len 4; hex 8000001e; asc     ;;




When session 2 inserts record 26, it first locates the insert position in the B+ tree and then checks whether the gap at that position is gap-locked, that is, whether there is a gap lock on the next record (id = 30). If there is, it generates an insert intention lock on that record and waits:

RECORD LOCKS space id 133 page no 3 n bits 80 index PRIMARY of table `test`.`test` trx id 38850 lock_mode X locks gap before rec insert intention waiting
Record lock, heap no 4 PHYSICAL RECORD: n_fields 4; compact format; info bits 0
 0: len 4; hex 8000001e; asc    30 ;;
 1: len 6; hex 00000000969c; asc       ;;
 2: len 7; hex a60000011a0128; asc       (;;
 3: len 4; hex 8000001e; asc     ;;




At this point, both session 2 and session 3 have added insert intention locks on the record with id = 30 and are waiting for the gap lock held by session 1 to be released. The generated lock records are as follows:

Online problem analysis

With a clear understanding of the various lock structures in MySQL, let's go back to the online problem described at the beginning.

// Simplified business logic: delete the row, then re-insert it in the same transaction.
@Transactional
public void service(Integer id) {
    delete(id);
    insert(id);
}


The following two situations may exist for the above business code:

  • The parameter id passed in does not exist in the original database
  • The parameter id passed in exists in the original database

Here we mainly analyze the case where the id record does not exist in the original database.

   | session 1                        | session 2                        | session 3
T1 | delete from test where id = 15;  |                                  |
T2 |                                  | delete from test where id = 15;  | delete from test where id = 15;
T3 | insert into test values(15, 15); |                                  |
T4 |                                  | insert into test values(15, 15); |
T5 |                                  |                                  | insert into test values(15, 15);

Because id = 15 does not exist in the database, session 1 adds a gap lock on the next record after the gap (id = 20) at time T1. Because gap locks are not mutually exclusive, session 2 and session 3 also obtain the gap lock on the record with id = 20 at time T2.

In the figure below, tx: T1, T2, and T3 represent session 1, session 2, and session 3 respectively.

When session 1 inserts the record with id = 15 at time T3, it checks whether the record after the insert position has gap locks. Since it does, session 1 must generate an insert intention lock on that record and wait for the transactions holding the gap locks to release them.

When session 2 executes its insert statement at time T4, it must likewise generate an insert intention lock and wait, because the record after the insert position has gap locks. At this point a deadlock has clearly formed: session 1 has generated an insert intention lock and is waiting for the gap locks of session 2 and session 3 to be released, while session 2 has generated an insert intention lock and is waiting for the gap locks of session 1 and session 3 to be released.

After the deadlock is detected at T4, MySQL selects one of the transactions to roll back. Assume that session 2 is rolled back and all lock resources it holds are released. Can session 1 continue to execute? Obviously not: session 1 is still waiting for the gap lock held by session 3 to be released, so it continues to block.

At time T5, session 3 starts executing its insert statement and, just as at T4, a deadlock forms: the insert intention lock generated by session 1 is waiting for the gap lock held by session 3 to be released, and the insert intention lock generated by session 3 is waiting for the gap lock held by session 1 to be released. This time session 3 is rolled back and releases all its lock resources, and session 1 can finally execute successfully.
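The T3 to T5 sequence can be replayed on a small wait-for graph, where an edge from A to B means that A's insert intention lock is waiting for B's gap lock. This is a sketch for illustration only: WaitForGraph and its methods are hypothetical names, the victim at each step is chosen as in the walk-through above, and InnoDB's actual detector and victim selection are more involved.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class WaitForGraph {
    // Waiter -> transactions whose gap locks it is waiting for.
    private final Map<String, Set<String>> edges = new HashMap<>();

    public void addWait(String waiter, Set<String> holders) {
        edges.computeIfAbsent(waiter, k -> new HashSet<>()).addAll(holders);
    }

    /** Rollback: the transaction stops waiting and nobody waits for it any more. */
    public void rollback(String trx) {
        edges.remove(trx);
        for (Set<String> targets : edges.values()) {
            targets.remove(trx);
        }
    }

    /** True if following wait-for edges from start leads back to start. */
    public boolean inCycle(String start) {
        Set<String> seen = new HashSet<>();
        List<String> stack = new ArrayList<>(edges.getOrDefault(start, Set.of()));
        while (!stack.isEmpty()) {
            String current = stack.remove(stack.size() - 1);
            if (current.equals(start)) {
                return true;
            }
            if (seen.add(current)) {
                stack.addAll(edges.getOrDefault(current, Set.of()));
            }
        }
        return false;
    }

    /** True if trx is still waiting for at least one other transaction. */
    public boolean isBlocked(String trx) {
        return !edges.getOrDefault(trx, Set.of()).isEmpty();
    }

    public static void main(String[] args) {
        WaitForGraph g = new WaitForGraph();
        g.addWait("session1", Set.of("session2", "session3")); // T3: insert blocks
        g.addWait("session2", Set.of("session1", "session3")); // T4: insert blocks
        System.out.println("deadlock at T4: " + g.inCycle("session2"));
        g.rollback("session2");
        System.out.println("session1 blocked after T4 rollback: " + g.isBlocked("session1"));
        g.addWait("session3", Set.of("session1"));             // T5: insert blocks
        System.out.println("deadlock at T5: " + g.inCycle("session3"));
        g.rollback("session3");
        System.out.println("session1 blocked after T5 rollback: " + g.isBlocked("session1"));
    }
}
```

In this model session 1 only becomes unblocked after the second rollback, which mirrors the two rounds of deadlock detection described above.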

Having analyzed the deadlock among three concurrent threads, some may think that although deadlocks occur, deadlock detection finds them quickly and the program still runs normally, so where is the problem? In fact, nothing goes wrong above mainly because the concurrency is low and deadlocks are detected quickly. If the concurrency grows by 100 or even 1000 times, will there still be no problem?

Consider the number of calls to the interface at the time the online problem occurred.

Next, the concurrent execution of 300 threads is simulated locally. Because analyzing the concurrent execution of all transactions by hand would be very complicated, only transaction 1 is analyzed here.

As can be seen from the figure, when T1 executes its insert statement, it needs to wait for the gap locks held by T2-T101 to be released. T2-T6 may then execute their insert statements at the same time, triggering deadlock detection and transaction rollback. It seems that as long as each subsequent transaction that executes its insert statement is rolled back by deadlock detection, everything will proceed normally. However, while deadlock detection is running, new transactions (T101-T200) acquire gap locks, so more and more transactions pile up in the lock wait queue. Moreover, the overall time complexity of MySQL deadlock detection is O(n^2): when there are many transactions in the lock wait queue, every new transaction that starts waiting for a lock must traverse the transactions waiting before it in the queue to determine whether its own arrival forms a cycle. Detection then consumes a lot of CPU, which lowers the overall performance of the database, lengthens deadlock detection, and sharply increases the number of active MySQL connections. Because these connections cannot be released while they wait for locks, the application-layer connection pool eventually fills up.
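A rough count shows why the cost grows so quickly in this scenario. Assume, as a simplification, that n transactions all hold the gap lock on the same gap and that each blocked insert waits for every other holder; the number of wait-for relationships is then n * (n - 1). The class and method names below are hypothetical, and the figures are arithmetic on this simplified model, not measurements of MySQL.

```java
public class WaitEdgeCount {
    /**
     * Wait-for edges when `waiters` transactions are blocked on insert and
     * each of them waits for the other `holders - 1` gap lock holders.
     */
    public static long edges(long holders, long waiters) {
        return waiters * (holders - 1);
    }

    public static void main(String[] args) {
        for (long n : new long[] {3, 30, 300}) {
            System.out.println(n + " transactions -> up to " + edges(n, n) + " wait-for edges");
        }
    }
}
```

Going from 3 to 300 concurrent transactions multiplies the transaction count by 100 but raises the upper bound on wait-for edges from 6 to 89700, which is the quadratic growth the paragraph above refers to.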

Based on the above analysis, the main cause of this problem was a large number of concurrent requests that, within a short period of time, first deleted and then inserted the same row of data (updating first and then inserting behaves the same way). This led to deadlock waits and filled the application-layer connection pool; a large number of upstream requests then timed out and were retried, which caused further lock waits and ultimately affected every business that relied on the database.

Therefore, wherever similar logic exists in business code in the future, duplicate requests must be prevented so that the same row is not concurrently deleted (or updated) and then inserted within a short period of time. In addition, under the repeatable read isolation level, update and delete operations add next-key locks by default, and the introduction of gap locks makes deadlocks easy to trigger under concurrency. This also needs to be considered when implementing business logic.

Summary

This article uses an online problem as the background to summarize the various lock mechanisms in MySQL in detail, and analyzes when each lock is taken and in which scenarios it is used. Particular attention should be paid to gap locks: because gap locks are not mutually exclusive with each other, deadlocks can easily form when multiple transactions execute concurrently.
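One possible form of duplicate prevention at the application layer is to reject a request while another request for the same id is still in flight, so that concurrent calls never reach the database together. The sketch below is only one option under stated assumptions: InFlightGuard and runExclusive are hypothetical names, and the guard works within a single JVM only, so a deployment with several application instances would need a shared mechanism instead.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.IntConsumer;

public class InFlightGuard {
    // Ids that currently have a delete-then-insert call running.
    private final Set<Integer> inFlight = ConcurrentHashMap.newKeySet();

    /**
     * Runs the action for id unless another call for the same id is still
     * running. Returns true if the action ran, false if it was rejected.
     */
    public boolean runExclusive(int id, IntConsumer action) {
        if (!inFlight.add(id)) {
            // Duplicate request: reject it here instead of letting it
            // queue up on database locks.
            return false;
        }
        try {
            action.accept(id);
            return true;
        } finally {
            inFlight.remove(id);
        }
    }

    public static void main(String[] args) {
        InFlightGuard guard = new InFlightGuard();
        boolean ran = guard.runExclusive(15, id ->
                System.out.println("delete + insert for id " + id));
        System.out.println("request executed: " + ran);
    }
}
```

Rejected callers can return an error or retry later; either way the database sees at most one delete-then-insert per id from this instance at a time.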

Author: JD Logistics Zhang Gongyan

Source: JD Cloud Developer Community, Ziyuanqishuo Tech. Please indicate the source when reprinting.


Origin my.oschina.net/u/4090830/blog/10143236