A deadlock case caused by running mysqldump while the slave has MTS enabled

Author: 八怪 (Gao Peng), database technology expert

Original: https://www.jianshu.com/p/39db1bb5041c

1. The source of the problem

This is a case provided by a customer. The screenshot of show processlist is as follows:

Once this problem occurs, the replication threads cannot continue unless someone intervenes manually and kills the FTWRL session. The version is Community Edition 5.7.26.

2. Blocking diagram

Analyzing the blocking described above, we can draw the following diagram:

3. The waits of worker threads w1 and w3

Here we need to focus on the parameter slave_preserve_commit_order, which is described in detail in my forthcoming book "In-depth Understanding of MySQL Master-Slave Principles". A brief description follows:

  • This parameter ensures that the order in which the worker threads on the slave commit their transactions matches the order in which the transactions were executed on the master. It takes effect before the flush stage of the ordered commit (order commit). While a worker thread's transaction waits for its turn to commit, it is blocked in the state 'Waiting for preceding transaction to commit'.

We also know that the MDL lock in the MDL_key::COMMIT namespace is acquired before the flush stage of the ordered commit. Therefore the w1 and w3 worker threads are waiting for their turn to commit, but unfortunately the transaction on w2 cannot commit because it cannot obtain the global read lock. At the same time, w1 and w3 block FTWRL.
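
For reference, a minimal sketch of the replication settings under which this scenario can appear on a 5.7 slave (the statements below are illustrative and not taken from the customer's environment):

SHOW GLOBAL VARIABLES LIKE 'slave_parallel_workers';      -- > 0 means MTS is enabled
SHOW GLOBAL VARIABLES LIKE 'slave_parallel_type';         -- LOGICAL_CLOCK for group-commit based parallelism
SHOW GLOBAL VARIABLES LIKE 'slave_preserve_commit_order'; -- ON enforces commit in master order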

4. The waits during FTWRL

I have described this many times before; the FTWRL process is roughly as follows:

Step 1: acquire an MDL lock of namespace GLOBAL at level S. The wait state, if it appears, is 'Waiting for global read lock'. Note that SELECT statements do not lock at the GLOBAL level, but DML/DDL/SELECT ... FOR UPDATE statements take an IX lock at the GLOBAL level. This wait occurs because the IX lock and the S lock are incompatible. Here is the compatibility matrix:

          | Type of active   |
  Request |   scoped lock    |
   type   | IS(*)  IX   S  X |
 ---------+------------------+
 IS       |  +      +   +  + |
 IX       |  +      +   -  - |
 S        |  +      -   +  - |
 X        |  +      -   -  - |
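
To make the incompatibility concrete, here is a two-session sketch (the table name is made up) that reproduces the 'Waiting for global read lock' state:

-- Session 1: a long-running DML holds the GLOBAL-level IX lock for the duration of the statement
UPDATE big_table SET pad = pad;       -- assume big_table is large enough for this to run a while

-- Session 2: FTWRL requests GLOBAL S, which conflicts with the IX lock above
FLUSH TABLES WITH READ LOCK;          -- shown as 'Waiting for global read lock' in show processlist

-- Session 3: observe the wait
SHOW PROCESSLIST;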

Step 2: advance the global table cache version; in the source code this is simply the global variable refresh_version++.

Step 3: release the unused table cache entries. You can refer to the function close_cached_tables for details.

Step 4: check whether any table cache entry is still in use, and if so wait for the occupier to release it. The wait state is 'Waiting for table flush'. This step compares each share's table cache version with the global version; if they do not match, FTWRL waits as follows:

for (uint idx= 0; idx < table_def_cache.records; idx++)
{
  share= (TABLE_SHARE*) my_hash_element(&table_def_cache, idx); // walk the whole table cache share hash
  if (share->has_old_version()) // the share's version differs from the current refresh_version
  {
    found= TRUE;
    break; // stop at the first share that still has an old version
  }
}
...
if (found) // an old version was found, so we must wait
{
  /*
    The method below temporarily unlocks LOCK_open and frees
    share's memory.
  */
  if (share->wait_for_old_version(thd, &abstime,
                                  MDL_wait_for_subgraph::DEADLOCK_WEIGHT_DDL))
  {
    mysql_mutex_unlock(&LOCK_open);
    result= TRUE;
    goto err_with_reopen;
  }
}

The wait ends when the occupier of the table cache entry releases it. This release happens in the function close_thread_table, as follows:

if (table->s->has_old_version() || table->needs_reopen() ||
    table_def_shutdown_in_progress)
{
  tc->remove_table(table);       // close the table cache instance
  mysql_mutex_lock(&LOCK_open);
  intern_close_table(table);     // remove the table cache definition (the share)
  mysql_mutex_unlock(&LOCK_open);
}

Eventually the function MDL_wait::set_status is called to wake up FTWRL; in other words, the occupied table cache entry is released not by the FTWRL session but by the occupier itself. Either way, the entire table cache is eventually emptied: if you check Open_table_definitions and Open_tables after FTWRL, you will find that the counters have started over. Here is the code of the wake-up function, which is self-explanatory:

bool MDL_wait::set_status(enum_wait_status status_arg)
{
  bool was_occupied= TRUE;
  mysql_mutex_lock(&m_LOCK_wait_status);
  if (m_wait_status == EMPTY)
  {
    was_occupied= FALSE;
    m_wait_status= status_arg;
    mysql_cond_signal(&m_COND_wait_status); // wake up the waiting session
  }
  mysql_mutex_unlock(&m_LOCK_wait_status);  // unlock
  return was_occupied;
}
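
A quick way to observe the claim above that the table cache is emptied is to compare the status counters before and after FTWRL (a sketch; the counters are the standard server status variables):

SHOW GLOBAL STATUS LIKE 'Open_table_definitions';
SHOW GLOBAL STATUS LIKE 'Open_tables';
FLUSH TABLES WITH READ LOCK;
UNLOCK TABLES;
SHOW GLOBAL STATUS LIKE 'Open_table_definitions';  -- the counts start again from a small value
SHOW GLOBAL STATUS LIKE 'Open_tables';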

Step 5: acquire an MDL lock of namespace COMMIT at level S. The wait state, if it appears, is 'Waiting for commit lock'. This wait is likely to occur when a large transaction is committing.

Pay attention to step 5. It is precisely because w1 and w3 have already acquired the COMMIT MDL lock, while they wait for w2's transaction to commit, that FTWRL has to wait.

5. The wait of worker thread w2

There are two possible reasons:

  • In multi-threaded parallel execution, the order in which threads run is inherently non-deterministic. A thread may fall behind the others simply because it loses the CPU, since the smallest unit of CPU scheduling is the thread. That is why techniques such as mutexes and atomic variables are needed to keep a shared-memory operation consistent as a whole.

  • If the transaction applied by w2 contains multiple DML statements, then its hold on the GLOBAL-level lock is intermittent: the lock is released at the end of each statement and acquired again when the next statement opens its tables (a sketch follows this list).
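
Here is a made-up example of the kind of multi-statement transaction described in the second point; in row format, each statement below becomes one Table_map event plus one or more row events, and the last row event of each statement carries the STMT_END_F flag:

BEGIN;
UPDATE t1 SET c1 = c1 + 1 WHERE id < 1000;   -- statement 1: Table_map + Update_rows (last event flagged STMT_END_F)
UPDATE t2 SET c1 = c1 + 1 WHERE id < 1000;   -- statement 2: Table_map + Update_rows (last event flagged STMT_END_F)
COMMIT;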

Let's now look at the second point in more detail, considering only binlogs in row format.

We know that a transaction can contain multiple statements. Each statement consists of a Table_map event and one or more row (DML) events. The last event of a statement is flagged with STMT_END_F, and it is at that point that the GLOBAL-level lock is released; the source code is as follows:

if (get_flags(STMT_END_F))
  {
    if((error= rows_event_stmt_cleanup(rli, thd)))

Stack:
#0  MDL_context::release_lock (this=0x7fffa8000a08, duration=MDL_STATEMENT, ticket=0x7fffa800ea40) at /opt/percona-server-locks-detail-5.7.22/sql/mdl.cc:4350
#1  0x0000000001464bf1 in MDL_context::release_locks_stored_before (this=0x7fffa8000a08, duration=MDL_STATEMENT, sentinel=0x0) at /opt/percona-server-locks-detail-5.7.22/sql/mdl.cc:4521
#2  0x000000000146541b in MDL_context::release_statement_locks (this=0x7fffa8000a08) at /opt/percona-server-locks-detail-5.7.22/sql/mdl.cc:4813
#3  0x0000000001865c75 in Relay_log_info::slave_close_thread_tables (this=0x341e8b0, thd=0x7fffa8000970) at /opt/percona-server-locks-detail-5.7.22/sql/rpl_rli.cc:2014
#4  0x0000000001865873 in Relay_log_info::cleanup_context (this=0x341e8b0, thd=0x7fffa8000970, error=false) at /opt/percona-server-locks-detail-5.7.22/sql/rpl_rli.cc:1886
#5  0x00000000017e8fc7 in rows_event_stmt_cleanup (rli=0x341e8b0, thd=0x7fffa8000970) at /opt/percona-server-locks-detail-5.7.22/sql/log_event.cc:11782
#6  0x00000000017e8c79 in Rows_log_event::do_apply_event (this=0x7fffa8017dc0, rli=0x341e8b0) at /opt/percona-server-locks-detail-5.7.22/sql/log_event.cc:11660
#7  0x00000000017cfdcd in Log_event::apply_event (this=0x7fffa8017dc0, rli=0x341e8b0) at /opt/percona-server-locks-detail-5.7.22/sql/log_event.cc:3570
#8  0x00000000018476dc in apply_event_and_update_pos (ptr_ev=0x7fffec14f880, thd=0x7fffa8000970, rli=0x341e8b0) at /opt/percona-server-locks-detail-5.7.22/sql/rpl_slave.cc:4766
#9  0x0000000001848d9a in exec_relay_log_event (thd=0x7fffa8000970, rli=0x341e8b0) at /opt/percona-server-locks-detail-5.7.22/sql/rpl_slave.cc:5300
#10 0x000000000184f9cc in handle_slave_sql (arg=0x33769a0) at /opt/percona-server-locks-detail-5.7.22/sql/rpl_slave.cc:7543
(gdb) p ticket->m_lock->key.mdl_namespace()
$1 = MDL_key::GLOBAL
(gdb) p ticket->m_type
$2 = MDL_INTENTION_EXCLUSIVE
(gdb) p ticket->m_duration
$3 = MDL_STATEMENT

When the next statement starts, the GLOBAL-level lock is acquired again; this is what I mean by intermittent acquisition.


At this point, all the conditions for the deadlock are in place. Once this situation is hit, human intervention is required before replication can continue.
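
When you hit this state, the MDL waits can be confirmed from performance_schema (a sketch, assuming the MDL instrument has been enabled; it is off by default in 5.7):

-- enable the metadata lock instrument if it is not already on
UPDATE performance_schema.setup_instruments
   SET ENABLED = 'YES', TIMED = 'YES'
 WHERE NAME = 'wait/lock/metadata/sql/mdl';

-- see who holds or waits for the GLOBAL / COMMIT scoped MDL locks
SELECT OBJECT_TYPE, LOCK_TYPE, LOCK_STATUS, OWNER_THREAD_ID
  FROM performance_schema.metadata_locks
 WHERE OBJECT_TYPE IN ('GLOBAL', 'COMMIT');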

6. About mysqldump

The Community Edition adds FTWRL in the following situations:

  • --master-data is set

  • --single-transaction and --flush-logs are both set

The Percona version adds FTWRL in the following situation:

  • --single-transaction and --flush-logs are both set

Let's look at the Community Edition code (code version 8.0.21); the following is the flow from FTWRL to UNLOCK TABLES during the dump:

 if ((opt_lock_all_tables || opt_master_data || // if master-data is set, FLUSH TABLES WITH READ LOCK is needed
       (opt_single_transaction && flush_logs)) && // likewise if single-transaction and flush-logs are both set
      do_flush_tables_read_lock(mysql)) // issue FLUSH TABLES WITH READ LOCK
    goto err;
  /*
    Flush logs before starting transaction since
    this causes implicit commit starting mysql-5.5.
  */
  if (opt_lock_all_tables || opt_master_data ||
      (opt_single_transaction && flush_logs) || opt_delete_master_logs) {
    if (flush_logs || opt_delete_master_logs) { // if flush-logs is set, flush the logs
      if (mysql_refresh(mysql, REFRESH_LOG)) { // do the log flush
        DB_error(mysql, "when doing refresh");
        goto err;
      }
      verbose_msg("-- main : logs flushed successfully!\n");
    }

    /* Not anymore! That would not be sensible. */
    flush_logs = false;
  }

  if (opt_delete_master_logs) {
    if (get_bin_log_name(mysql, bin_log_name, sizeof(bin_log_name))) goto err;
  }

  if (opt_single_transaction && start_transaction(mysql)) goto err; // start the transaction (REPEATABLE READ)

  /* Add 'STOP SLAVE to beginning of dump */
  if (opt_slave_apply && add_stop_slave()) goto err;

  /* Process opt_set_gtid_purged and add SET @@GLOBAL.GTID_PURGED if required.
   */
  if (process_set_gtid_purged(mysql)) goto err; // handle GTID; if gtid_purged handling is set, this function skips the work

  if (opt_master_data && do_show_master_status(mysql)) goto err; // get the master binlog position
  if (opt_slave_data && do_show_slave_status(mysql)) goto err; // slave data is obtained from SHOW SLAVE STATUS
  if (opt_single_transaction &&
      do_unlock_tables(mysql)) /* unlock but no commit! */
    goto err;
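
For reference, the statement sequence that the code above sends to the server is roughly the following (a sketch of a --single-transaction --master-data run; the exact statements can differ slightly between versions):

FLUSH /*!40101 LOCAL */ TABLES;                             -- do_flush_tables_read_lock: a plain flush first
FLUSH TABLES WITH READ LOCK;                                -- then the global read lock
SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ;    -- start_transaction
START TRANSACTION /*!40100 WITH CONSISTENT SNAPSHOT */;
SHOW MASTER STATUS;                                         -- do_show_master_status
UNLOCK TABLES;                                              -- do_unlock_tables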

The Percona version adds the check function check_consistent_binlog_pos, as follows (not described in detail here):

  if (opt_single_transaction && opt_master_data)
  {
    /*
       See if we can avoid FLUSH TABLES WITH READ LOCK with Binlog_snapshot_*
       variables.
    */
    consistent_binlog_pos= check_consistent_binlog_pos(NULL, NULL);
  }

  if ((opt_lock_all_tables || (opt_master_data && !consistent_binlog_pos) || // consistent_binlog_pos: 0 = FTWRL needed, 1 = not needed
       (opt_single_transaction && flush_logs)))
  {
    if (do_flush_tables_read_lock(mysql))
      goto err;
  }
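
check_consistent_binlog_pos relies on the Percona Server Binlog_snapshot_* status variables mentioned in the comment above. A quick sketch of how to look at them on Percona Server:

START TRANSACTION WITH CONSISTENT SNAPSHOT;
SHOW STATUS LIKE 'Binlog_snapshot_%';   -- Binlog_snapshot_file / Binlog_snapshot_position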

7. How to solve it

Summarized as follows:

  • Backups generally include --master-data, so they should be run only during off-peak periods to minimize the impact.

  • Consider turning off the parameter slave_preserve_commit_order. The blocking caused by FTWRL still exists, but it no longer escalates into a deadlock.

  • If the replication pressure is not high, consider disabling MTS. Again, the blocking caused by FTWRL still exists, but it does not escalate into a deadlock. (A sketch of the commands for these last two options follows this list.)
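
A minimal sketch of the corresponding commands on the slave (the changes take effect once the SQL thread is restarted):

STOP SLAVE SQL_THREAD;
SET GLOBAL slave_preserve_commit_order = OFF;   -- option 2: keep MTS but drop the commit-order guarantee
-- or
SET GLOBAL slave_parallel_workers = 0;          -- option 3: disable MTS entirely
START SLAVE SQL_THREAD;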

End of the article.

Enjoy MySQL :)




Origin: blog.csdn.net/n88Lpo/article/details/111148053