Analyzing MySQL lock waits and deadlocks

Foreword:

During MySQL operation and maintenance, lock waits and deadlocks are a real headache for DBAs and developers. They cause transaction rollbacks, stalls, and other failures, and on busy systems the impact of a deadlock is even more serious. In this article, let's look at what lock waits and deadlocks are and how to analyze and handle them.

1. Understanding lock waits and deadlocks

Lock waits and deadlocks occur because accessing the database requires locks. You may ask why locking is needed at all: it guarantees data correctness under concurrent updates and enforces the isolation of database transactions.

Imagine you are going to the library to borrow a copy of "High Performance MySQL". To prevent someone else from borrowing it first, you can reserve it (lock it) in advance. How could this lock be placed?

  • Lock the entire library (database-level lock)
  • Lock all database-related books (table-level lock)
  • Lock only the MySQL-related books (page-level lock)
  • Lock only the book "High Performance MySQL" (row-level lock)

The finer the lock granularity, the higher the achievable concurrency, but the more complicated the implementation.
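
As a rough illustration of these granularities in MySQL itself, the statements below take a global read lock, a table lock, and a row lock. The table `book` and its indexed `title` column are hypothetical and exist only for this sketch; InnoDB does not provide page-level locks, so that level is not shown.

```sql
-- Hypothetical table `book` with an index on `title`; not part of the original article.

-- Global level: block writes on the whole instance, then release.
FLUSH TABLES WITH READ LOCK;
UNLOCK TABLES;

-- Table level: lock one table for writing, then release.
LOCK TABLES book WRITE;
UNLOCK TABLES;

-- Row level (InnoDB): lock only the matching rows inside a transaction.
BEGIN;
SELECT * FROM book WHERE title = 'High Performance MySQL' FOR UPDATE;
COMMIT;
```

If `title` were not indexed, InnoDB would lock every row it scans, so the row-level case depends on the index assumption.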

A lock wait can also be called a transaction wait: a later transaction waits for an earlier transaction to release a lock. If the wait exceeds MySQL's lock wait timeout, an error is raised with the message "Lock wait timeout exceeded...".

A deadlock occurs when two transactions each wait for the other to release a lock it holds, so neither can ever proceed. The error "Deadlock found when trying to get lock..." is reported immediately after the deadlock is detected.

2. Reproducing and handling the problems

Let's take MySQL 5.7.23 as an example (the isolation level is REPEATABLE READ, RR) to reproduce both problems.

mysql> show create table test_tb\G
*************************** 1. row ***************************
       Table: test_tb
Create Table: CREATE TABLE `test_tb` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `col1` varchar(50) NOT NULL DEFAULT '',
  `col2` int(11) NOT NULL DEFAULT '1',
  `col3` varchar(20) NOT NULL DEFAULT '',
  PRIMARY KEY (`id`),
  KEY `idx_col1` (`col1`)
) ENGINE=InnoDB AUTO_INCREMENT=4 DEFAULT CHARSET=utf8
1 row in set (0.00 sec)

mysql> select * from test_tb;
+----+------+------+------+
| id | col1 | col2 | col3 |
+----+------+------+------+
|  1 | fdg  |    1 | abc  |
|  2 | a    |    2 | fg   |
|  3 | ghrv |    2 | rhdv |
+----+------+------+------+
3 rows in set (0.00 sec)

# Transaction 1 runs first
mysql> begin;
Query OK, 0 rows affected (0.00 sec)

mysql> select * from test_tb where col1 = 'a' for update;
+----+------+------+------+
| id | col1 | col2 | col3 |
+----+------+------+------+
|  2 | a    |    2 | fg   |
+----+------+------+------+
1 row in set (0.00 sec)

# Transaction 2 runs next
mysql> begin;
Query OK, 0 rows affected (0.01 sec)

mysql> update test_tb set col2 = 1 where col1 = 'a';
ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction

This error occurs because transaction 2 is waiting for a row lock held by transaction 1, which has not committed, so the wait eventually times out. The InnoDB row lock wait timeout is controlled by the innodb_lock_wait_timeout parameter, whose default value is 50 (in seconds). That is, by default transaction 2 waits for 50s; if it still cannot obtain the row lock, it reports the timeout error and rolls back the current statement.
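
The timeout described above can be inspected and adjusted per session. This is a minimal sketch; the value 10 is an arbitrary example, not a recommendation.

```sql
-- Check the current timeout in seconds (50 by default).
SHOW VARIABLES LIKE 'innodb_lock_wait_timeout';

-- Shorten it for the current session only; 10 is an arbitrary example value.
SET SESSION innodb_lock_wait_timeout = 10;

-- Shows whether a timeout rolls back the whole transaction
-- instead of only the current statement (OFF by default).
SHOW VARIABLES LIKE 'innodb_rollback_on_timeout';
```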

In version 5.7, when a lock wait occurs, we can query several system tables in information_schema to check the transaction status.

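  • innodb_trx: all transactions currently running.
  • innodb_locks: the locks currently present.
  • innodb_lock_waits: the mapping between waiting transactions and the locks that block them.

# When a lock wait occurs, the innodb_trx table shows all transactions
# A trx_state of LOCK WAIT means the transaction is waiting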

mysql> select * from information_schema.innodb_trx\G
*************************** 1. row ***************************
                    trx_id: 38511
                 trx_state: LOCK WAIT
               trx_started: 2021-03-24 17:20:43
     trx_requested_lock_id: 38511:156:4:2
          trx_wait_started: 2021-03-24 17:20:43
                trx_weight: 2
       trx_mysql_thread_id: 1668447
                 trx_query: update test_tb set col2 = 1 where col1 = 'a'
       trx_operation_state: starting index read
         trx_tables_in_use: 1
         trx_tables_locked: 1
          trx_lock_structs: 2
     trx_lock_memory_bytes: 1136
           trx_rows_locked: 1
         trx_rows_modified: 0
   trx_concurrency_tickets: 0
       trx_isolation_level: REPEATABLE READ
         trx_unique_checks: 1
    trx_foreign_key_checks: 1
trx_last_foreign_key_error: NULL
 trx_adaptive_hash_latched: 0
 trx_adaptive_hash_timeout: 0
          trx_is_read_only: 0
trx_autocommit_non_locking: 0
*************************** 2. row ***************************
                    trx_id: 38510
                 trx_state: RUNNING
               trx_started: 2021-03-24 17:18:54
     trx_requested_lock_id: NULL
          trx_wait_started: NULL
                trx_weight: 4
       trx_mysql_thread_id: 1667530
                 trx_query: NULL
       trx_operation_state: NULL
         trx_tables_in_use: 0
         trx_tables_locked: 1
          trx_lock_structs: 4
     trx_lock_memory_bytes: 1136
           trx_rows_locked: 3
         trx_rows_modified: 0
   trx_concurrency_tickets: 0
       trx_isolation_level: REPEATABLE READ
         trx_unique_checks: 1
    trx_foreign_key_checks: 1
trx_last_foreign_key_error: NULL
 trx_adaptive_hash_latched: 0
 trx_adaptive_hash_timeout: 0
          trx_is_read_only: 0
trx_autocommit_non_locking: 0
2 rows in set (0.00 sec)

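# Meaning of the innodb_trx fields
trx_id: transaction ID.
trx_state: transaction state, one of RUNNING, LOCK WAIT, ROLLING BACK, or COMMITTING.
trx_started: transaction start time.
trx_requested_lock_id: ID of the lock the transaction is currently waiting for; can be JOINed with the INNODB_LOCKS table for more detail.
trx_wait_started: time at which the transaction started waiting.
trx_weight: weight of the transaction.
trx_mysql_thread_id: thread ID of the transaction; can be JOINed with the PROCESSLIST table.
trx_query: SQL statement the transaction is executing.
trx_operation_state: current operation of the transaction.
trx_tables_in_use: number of tables used by the SQL statement the transaction is executing.
trx_tables_locked: number of tables on which the current SQL statement holds row locks.
trx_lock_structs: number of locks reserved by the transaction.
trx_isolation_level: isolation level of the transaction.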
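
The JOINs mentioned in the field notes can be combined into one query that pairs each waiting transaction with the transaction blocking it. This is a sketch for MySQL 5.7, where the information_schema tables innodb_locks and innodb_lock_waits exist; the column aliases are chosen here for readability.

```sql
-- Pair each waiting transaction with its blocker (MySQL 5.7 tables).
SELECT
    r.trx_id              AS waiting_trx_id,
    r.trx_mysql_thread_id AS waiting_thread,
    r.trx_query           AS waiting_query,
    b.trx_id              AS blocking_trx_id,
    b.trx_mysql_thread_id AS blocking_thread,
    b.trx_query           AS blocking_query,
    l.lock_table,
    l.lock_index,
    l.lock_mode
FROM information_schema.innodb_lock_waits w
JOIN information_schema.innodb_trx   r ON r.trx_id  = w.requesting_trx_id
JOIN information_schema.innodb_trx   b ON b.trx_id  = w.blocking_trx_id
JOIN information_schema.innodb_locks l ON l.lock_id = w.requested_lock_id;
```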

# The sys.innodb_lock_waits view also shows lock waits and provides the SQL to kill the connection
mysql> select * from sys.innodb_lock_waits\G
*************************** 1. row ***************************
                wait_started: 2021-03-24 17:20:43
                    wait_age: 00:00:22
               wait_age_secs: 22
                locked_table: `testdb`.`test_tb`
                locked_index: idx_col1
                 locked_type: RECORD
              waiting_trx_id: 38511
         waiting_trx_started: 2021-03-24 17:20:43
             waiting_trx_age: 00:00:22
     waiting_trx_rows_locked: 1
   waiting_trx_rows_modified: 0
                 waiting_pid: 1668447
               waiting_query: update test_tb set col2 = 1 where col1 = 'a'
             waiting_lock_id: 38511:156:4:2
           waiting_lock_mode: X
             blocking_trx_id: 38510
                blocking_pid: 1667530
              blocking_query: NULL
            blocking_lock_id: 38510:156:4:2
          blocking_lock_mode: X
        blocking_trx_started: 2021-03-24 17:18:54
            blocking_trx_age: 00:02:11
    blocking_trx_rows_locked: 3
  blocking_trx_rows_modified: 0
     sql_kill_blocking_query: KILL QUERY 1667530
sql_kill_blocking_connection: KILL 1667530

The sys.innodb_lock_waits view consolidates the lock wait information and also provides statements for killing the blocking session. Whether to actually kill the connection, however, still needs careful consideration.
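
Before killing anything, it may help to look at what the blocking connection is doing. The sketch below uses the blocking_pid 1667530 from the sample output above; on another system the ID will differ.

```sql
-- Inspect the blocking connection first (1667530 is the blocking_pid from the sample output).
SELECT id, user, host, db, command, time, state, info
FROM information_schema.processlist
WHERE id = 1667530;

-- Only if the session is confirmed safe to terminate:
-- killing the connection rolls back its open transaction and releases its locks.
KILL 1667530;
```

When blocking_query is NULL, as in the sample output, the blocker is idle inside an open transaction, so KILL QUERY has no running statement to interrupt and only killing the connection releases the lock.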

Deadlocks differ slightly from lock waits, so let's also reproduce a simple deadlock.

# Start two transactions
# Transaction 1 executes
mysql> update test_tb set col2 = 1 where col1 = 'a';
Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0

# Transaction 2 executes
mysql> update test_tb set col2 = 1 where id = 3;
Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0

# Back in transaction 1: after pressing Enter, this statement enters a lock wait
mysql> update test_tb set col1 = 'abcd' where id = 3;
Query OK, 1 row affected (5.71 sec)
Rows matched: 1  Changed: 1  Warnings: 0

# Back in transaction 2: the two transactions now wait for each other and a deadlock occurs
mysql> update test_tb set col3 = 'gddx' where col1 = 'a';
ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction

After a deadlock occurs, InnoDB selects one transaction to roll back. To find the cause, run show engine innodb status and read the deadlock log in its output, then combine it with the business logic to locate the root cause.
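
A minimal sketch of collecting the deadlock log from the mysql client. The second statement is optional and needs a privilege that allows setting global variables.

```sql
-- The LATEST DETECTED DEADLOCK section of the output describes the most recent deadlock only.
SHOW ENGINE INNODB STATUS\G

-- Optionally write every deadlock to the MySQL error log, not just the latest one.
SET GLOBAL innodb_print_all_deadlocks = ON;
```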

In practice, we should try to avoid deadlocks. The following measures help:

  • Keep transactions as small as possible; do not put complex logic into a single transaction.
  • When multiple rows are involved, have all transactions access them in the same agreed order.
  • Commit or roll back transactions promptly in the application to reduce the probability of deadlocks.
  • Make sure tables have suitable indexes.
  • Consider changing the isolation level to READ COMMITTED (RC).
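
Some of these measures can be sketched against the test_tb table used earlier. This is an illustration under the assumption that both transactions from the deadlock example need rows 2 and 3; it is one possible way to apply the advice, not the only one.

```sql
-- Same access order: every transaction locks the rows it needs
-- in ascending primary-key order before modifying them.
BEGIN;
SELECT id FROM test_tb WHERE id IN (2, 3) ORDER BY id FOR UPDATE;
UPDATE test_tb SET col2 = 1 WHERE id = 2;
UPDATE test_tb SET col1 = 'abcd' WHERE id = 3;
COMMIT;

-- Suitable index: check that the statement uses idx_col1 rather than scanning the table.
EXPLAIN UPDATE test_tb SET col2 = 1 WHERE col1 = 'a';

-- Isolation level: switch the current session to READ COMMITTED.
SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;
```

With this ordering, the second transaction simply waits at the SELECT ... FOR UPDATE until the first one commits, instead of each holding one row and requesting the other.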

Summary:

This article briefly introduced the causes of lock waits and deadlocks. Deadlocks in real business systems are harder to analyze and require some accumulated experience. This article is aimed at beginners; I hope it gives you a basic picture of lock waits and deadlocks.

Origin blog.51cto.com/10814168/2677268