foreword
Recently, a project of the company reported that the master-slave replication of mysql failed, and was recorded as a major mistake by the operation and maintenance department, which affected the acceptance of the project. So what is the reason? And what is the principle of master-slave replication? This article makes a record of the process of investigation and analysis.
Master-slave replication principle
Let's first briefly understand the principle of MySQL master-slave replication.
- The main library
master
server will write SQL records to the binary log throughdump
threadsbinary log
; slave
Open aio thread
thread from the library server to send a request to the server andmaster
request to the main librarybinary log
. After the main librarymaster
server receives the request, it willbinary log
send toslave
the server according to the offset.- After
slave
receiving a new from the library server, it is written to its own , which is the so-called relay log.binary log
relay log
- From the library
slave
server, after opening asql thread
readrelay log
, write it into its own data, so as to ensure the consistency of the master-slave data.
The above is a brief principle of MySQL master-slave replication, and more details will not be discussed. According to the operation and maintenance feedback, the failure of master-slave replication is mainly due to the timeout of the binary log obtained by the IO thread. The log of the master database has reached 4G, which is bin log
normal binlog
. Under the circumstances, according to the configuration, it should not exceed 300M.
binlog writing mechanism
To understand binlog
why it reaches 4 G, let's look at the binlog writing mechanism.
binlog
The timing of writing is also very simple. During the execution of the transaction, the log is written first binlog cache
, and when the transaction is committed, it is binlog cache
written binlog
to the file. Because a transaction binlog
cannot be disassembled, no matter how large the transaction is, it must be written at one time, so the system will allocate a block of memory to each thread binlog cache
.
- The above figure
write
refers to writing the log to the file systempage cache
, and does not persist the data to the disk, so the speed is relatively fast - The above figure
fsync
is the operation of persisting data to disk and generatingbinlog
logs
In production, binlog
the configuration in MySQL max_binlog_size
is 250M, max_binlog_size
but it is used to control the size of a single binary log. When the current log file size exceeds this variable, the switching action will be executed. , this setting cannot strictly control the size of the Binlog, especially binlog
when it is close to the maximum value and encounters a relatively large transaction, in order to ensure the integrity of the transaction, the action of switching logs may not be performed, and all $ QL is recorded into the current log until the end of the transaction. In general, the default value can be adopted.
So we doubt whether we have encountered a big transaction, so we need to see which transaction caused the content in the binlog.
View binlog logs
We can use mysqlbinlog
this tool to view the content in the binlog. For specific usage, refer to the official website: https://dev.mysql.com/doc/refman/8.0/en/mysqlbinlog.html
.
- view
binlog
log
./mysqlbinlog --no-defaults --base64-output=decode-rows -vv /mysqldata/mysql/binlog/mysql-bin.004816|more
复制代码
binlog
Statistics of the byte size occupied by the log file in units of transactions
./mysqlbinlog --no-defaults --base64-output=decode-rows -vv /mysqldata/mysql/binlog/mysql-bin.004816|grep GTID -B1|grep '^# at' | awk '{print $3}' | awk 'NR==1 {tmp=$1} NR>1 {print ($1-tmp, tmp, $1); tmp=$1}'|sort -n -r|more
复制代码
A certain transaction in production actually occupies 4 G.
- Pass
start-position
andstop-position
count the byte size occupied by each SQL of this transaction
./mysqlbinlog --no-defaults --base64-output=decode-rows --start-position='xxxx' --stop-position='xxxxx' -vv /mysqldata/mysql/binlog/mysql-bin.004816 |grep '^# at'| awk '{print $3}' | awk 'NR==1 {tmp=$1} NR>1 {print ($1-tmp, tmp, $1); tmp=$1}'|sort -n -r|more
复制代码
It was found that the largest SQL took up 32M in size, so how many more than 10M are there?
- By the number of more than 10M size
./mysqlbinlog --no-defaults --base64-output=decode-rows --start-position='xxxx' --stop-position='xxxxx' -vv /mysqldata/mysql/binlog/mysql-bin.004816|grep '^# at' | awk '{print $3}' | awk 'NR==1 {tmp=$1} NR>1 {print ($1-tmp, tmp, $1); tmp=$1}'|awk '$1>10000000 {print $0}'|wc -l
复制代码
The statistical results show that there are more than 200, gross estimates, there are nearly 4 G
- 根据pos, 我们看下究竟是什么SQL导致的
./mysqlbinlog --no-defaults --base64-output=decode-rows --start-position='xxxx' --stop-position='xxxxx' -vv /mysqldata/mysql/binlog/mysql-bin.004816|grep '^# atxxxx' -C5| grep -v '###' | more
复制代码
根据sql,分析了下,这个表正好有个blob
字段,统计了下blob字段总合大概有3个G大小,然后我们业务上有个导入操作,这是一个非常大的事务,会频繁更新这表中记录的更新时间,导致生成binlog
非常大。
问题: 明明只是简单的修改更新时间的语句,压根没有动blob
字段,为什么生产的binlog
这么大?因为生产的binlog采用的是row模式。
binlog的模式
binlog
日志记录存在3种模式,而生产使用的是row
模式,它最大的特点,是很精确,你更新表中某行的任何一个字段,会记录下整行的内容,这也就是为什么blob
字段都被记录到binlog
中,导致binlog
非常大。此外,binlog
还有statement
和mixed
两种模式。
- STATEMENT模式 ,基于SQL语句的复制
- 优点: 不需要记录每一行数据的变化,减少
binlog
日志量,节约IO,提高性能。 - 缺点: 由于只记录语句,所以,在
statement leve
l下 已经发现了有不少情况会造成MySQL的复制出现问题,主要是修改数据的时候使用了某些定的函数或者功能的时候会出现。
- ROW模式,基于行的复制
5.1.5版本的MySQL才开始支持,不记录每条sql语句的上下文信息,仅记录哪条数据被修改了,修改成什么样了。
- 优点:
binlog
中可以不记录执行的sql语句的上下文相关的信息,仅仅只需要记录那一条被修改。所以rowlevel
的日志内容会非常清楚的记录下每一行数据修改的细节。不会出现某些特定的情况下的存储过程或function
,以及trigger
的调用和触发无法被正确复制的问题 - 缺点: 所有的执行的语句当记录到日志中的时候,都将以每行记录的修改来记录,会产生大量的日志内容。
- MIXED模式
从5.1.8版本开始,MySQL提供了Mixed
格式,实际上就是Statement
与Row
的结合。
在Mixed
模式下,一般的语句修改使用statment
格式保存binlog
。如一些函数,statement
无法完成主从复制的操作,则采用row格式
保存binlog
。
总结
最终分析下来,我们定位到原来是由于大事务+blob字段大致binlog非常大,最终我们采用了修改业务代码,将blob字段单独拆到一张表中解决。所以,在设计开发过程中,要尽量避免大事务,同时在数据库建模的时候特别考虑将blob字段独立成表。
欢迎关注个人公众号【JAVA旭阳】交流学习