An example of a MySQL 5.6 master-slave replication error

1. Symptoms

Version: MySQL 5.6, with master-slave replication configured in the traditional binlog file and position mode.
[Figure: screenshot of the master-slave replication error]
After the instance restarted, master-slave replication failed with the error shown in the figure above.

2. Meaning of the error

The error message has two parts:

  • Client requested master to start replication from position > file
    size;
  • the first event 'mysql-bin.000398' at 163800795, the last event read
    from './mysql-binlog.000398' at 4, the last byte read from
    './mysql-bin.000398' at 4

The first part
This part is raised by the dump thread on the master, along the following call path:

mysql_binlog_send
  ->sender.run()
    ->Binlog_sender::init
       ->Binlog_sender::check_start_file

  if ((file= open_binlog_file(&cache, m_linfo.log_file_name, &errmsg)) < 0) 
  {
    set_fatal_error(errmsg);
    return 1;
  }

  size= my_b_filelength(&cache);
  end_io_cache(&cache);
  mysql_file_close(file, MYF(MY_WME)); 

  if (m_start_pos > size)
  {
    set_fatal_error("Client requested master to start replication from "
                    "position > file size");
    return 1;
  }

The key is the two values m_start_pos and size. m_start_pos is the position the slave asks to start reading from, and size is the size of the binlog file on the master. If the position requested by the slave's I/O thread is larger than the size of the binlog file, the request cannot be satisfied and the master raises this error.
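The check above can be isolated into a small standalone sketch. This is not MySQL code: StartCheck and check_start_pos are hypothetical names used only for illustration, and the file size is passed in directly instead of being read from an IO cache.

```cpp
#include <cstdint>
#include <string>

// Hypothetical stand-in for the outcome of Binlog_sender::check_start_file.
struct StartCheck {
    bool ok;
    std::string error;  // empty when ok is true
};

// start_pos: position requested by the slave's I/O thread (m_start_pos).
// file_size: current size of the binlog file on the master (size).
StartCheck check_start_pos(std::uint64_t start_pos, std::uint64_t file_size) {
    if (start_pos > file_size) {
        // Same message text as in the excerpt above.
        return {false,
                "Client requested master to start replication from "
                "position > file size"};
    }
    return {true, ""};
}
```

With the position from the error (163800795) and any smaller file size, check_start_pos reports the failure. A position equal to the file size passes, because the comparison is strictly greater-than.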

The second part
This part also comes from the dump thread on the master:

mysql_binlog_send
  ->sender.run()
     ->Binlog_sender::init
     ->while (!has_error() && !m_thd->killed)
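     # Normally the loop that reads binlog events starts here; if an earlier step failed, the loop is skipped and execution continues with the logic below
     # If there was a read error, it is reported here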
       my_snprintf(error_text, sizeof(error_text),
                  "%s; the first event '%s' at %lld, "
                  "the last event read from '%s' at %lld, "
                  "the last byte read from '%s' at %lld.",
                  m_errmsg,
                  m_start_file, m_start_pos, m_last_file, m_last_pos,
                  log_file, my_b_tell(&log_cache));

Here the values to look at are m_start_pos and m_last_pos. m_start_pos is the position the slave asked to read from, the same value that triggered the first part of the error. m_last_pos is maintained by the dump thread and holds the position of the last event it read. Since the dump thread failed before reading a single event, the reported position is still the starting one, pos 4.
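To see how the values map onto the final message, here is a minimal sketch that reuses the format string from the excerpt. It uses std::snprintf as a stand-in for MySQL's my_snprintf, and format_dump_error is a hypothetical helper name, not a function in the server.

```cpp
#include <cstdio>
#include <string>

// Builds the second part of the error from the same fields the dump thread
// uses: (m_errmsg, m_start_file, m_start_pos, m_last_file, m_last_pos,
// log_file, current read offset).
std::string format_dump_error(const std::string& errmsg,
                              const std::string& start_file, long long start_pos,
                              const std::string& last_file, long long last_pos,
                              const std::string& log_file, long long log_pos) {
    char buf[512];
    std::snprintf(buf, sizeof(buf),
                  "%s; the first event '%s' at %lld, "
                  "the last event read from '%s' at %lld, "
                  "the last byte read from '%s' at %lld.",
                  errmsg.c_str(), start_file.c_str(), start_pos,
                  last_file.c_str(), last_pos, log_file.c_str(), log_pos);
    return std::string(buf);
}
```

Calling it with the requested position 163800795 and a last-read position of 4 reproduces the shape of the message from section 2: the first number is what the slave asked for, and the other two show that the master never got past the start of the file.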

3. Possible causes

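After analysis, the most likely cause appears to be related to sync_binlog.

If sync_binlog is not set to 1, binlog writes may still be sitting in the OS cache without having been flushed to disk. If the master's server then crashes and restarts, those writes are lost, the binlog file ends up shorter than the position the slave has already recorded, and this error is easy to hit.

A quick Google search showed that most reports of this error involve a server crash with sync_binlog not set to 1.

Finally, checking the master of the affected instance confirmed that it was indeed not configured with the "double 1" settings (sync_binlog=1 and innodb_flush_log_at_trx_commit=1).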
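The crash scenario can be illustrated with a toy model. This is a deliberately simplified sketch, not MySQL code: BinlogModel and slave_ahead_after_crash are hypothetical names, every commit is assumed to append 100 bytes, the slave is assumed to have received everything written before the crash, and the OS is assumed to have flushed nothing on its own.

```cpp
#include <cstdint>

// Toy model: bytes appended to the binlog sit in the OS cache until synced.
struct BinlogModel {
    std::uint64_t cached_size = 4;   // size seen while the server is up (4-byte file header)
    std::uint64_t durable_size = 4;  // size that survives a server crash
    unsigned sync_every;             // models sync_binlog; 0 = no explicit sync
    unsigned pending_commits = 0;

    explicit BinlogModel(unsigned sync_binlog) : sync_every(sync_binlog) {}

    void commit(std::uint64_t event_bytes) {
        cached_size += event_bytes;
        ++pending_commits;
        if (sync_every != 0 && pending_commits >= sync_every) {
            durable_size = cached_size;  // models the sync to disk
            pending_commits = 0;
        }
    }

    // A crash discards everything that was not synced.
    void crash() {
        cached_size = durable_size;
        pending_commits = 0;
    }
};

// True if, after a crash, the slave's recorded position exceeds the file size.
bool slave_ahead_after_crash(unsigned sync_binlog, int commits) {
    BinlogModel master(sync_binlog);
    for (int i = 0; i < commits; ++i) master.commit(100);
    const std::uint64_t slave_pos = master.cached_size;  // slave is fully caught up
    master.crash();
    return slave_pos > master.cached_size;
}
```

In this model, slave_ahead_after_crash(1, 10) is false because every commit is synced, while slave_ahead_after_crash(0, 10) is true: the slave remembers a position that no longer exists in the master's file, which is the condition the dump thread rejects.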

Reference link:

An example of MySQL 5.6 master-slave error: https://mp.weixin.qq.com/s/7uQk5cRjfgUxyDcLS7CURg
