"MySQL Practical Combat 45 Lectures" - Study Notes 26 "How to catch up with the main database in the standby database"

This article introduces strategies that let the standby database catch up with the primary database when replication delay occurs: the per-table distribution strategy for MySQL 5.5, the per-database parallel replication strategy of MySQL 5.6, MariaDB's parallel replication strategy based on group commit, and MySQL 5.7's optimization of the MariaDB strategy.

Why is it hard for a single-threaded standby database to "catch up" with the primary database?

The previous article introduced the causes of primary-standby delay: a performance gap between the physical machines hosting the primary and the standby, extra read load on the standby, or large transactions committed on the primary, which make sbm (seconds_behind_master) grow continuously.

In day-to-day experience, primary-standby delay usually appears after the primary performs a large number of updates in a short period; once the burst of updates ends, the delay gradually shrinks.

Consider a question: if the primary and standby machines have the same performance, but the standby applies changes with a single thread, can primary-standby delay still occur? The answer is yes!

The reason is simple. On the primary, concurrency is limited by locks. Because the InnoDB engine supports row locks (fine granularity), it handles concurrent business workloads well, except in the extreme case where all concurrent transactions update the same row (a hot row). If the standby consumes the relay log with a single thread while the primary executes many transactions concurrently, the standby cannot keep up, even when both machines have the same performance.

Recall the flow chart of primary-standby synchronization. When it comes to parallel replication capability, the two black arrows in that figure matter: one represents clients writing to the primary, and the other represents sql_thread executing the relay log on the standby.

If arrow thickness represents parallelism, the first arrow is clearly thicker than the second. With a single-threaded sql_thread, the standby applies the log too slowly, which causes primary-standby delay.
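A back-of-the-envelope sketch of why the delay builds up during a burst of updates and then drains afterwards. The function name and the throughput figures (4000 transactions per second on the primary, 1000 per second on a single-threaded standby) are made up for illustration, not measurements.

```python
def backlog_over_time(write_tps, apply_tps, burst_seconds, total_seconds):
    """Return the number of unapplied transactions on the standby after each second.

    write_tps: transactions/second committed on the primary during the burst
    apply_tps: transactions/second the standby is able to replay
    """
    backlog = 0
    history = []
    for second in range(total_seconds):
        incoming = write_tps if second < burst_seconds else 0
        backlog = max(0, backlog + incoming - apply_tps)
        history.append(backlog)
    return history


# Hypothetical workload: a 3-second burst on the primary, observed for 12 seconds.
history = backlog_over_time(write_tps=4000, apply_tps=1000,
                            burst_seconds=3, total_seconds=12)
print(history)
```

The backlog peaks when the burst ends and needs several more seconds to reach zero, which matches the pattern described earlier: the delay appears during concentrated updates and shrinks gradually once they stop.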

The multi-threaded replication model of the standby database

Following the analysis above, the standby needs to replicate with multiple threads in order to catch up with the primary, using a model like the following:

The coordinator is the original sql_thread, but it no longer updates data directly; it is only responsible for reading the relay log and distributing transactions. The threads that actually apply the log are the worker threads.

The precondition for applying changes in parallel on the standby: because the standby also serves queries while it replicates, its data must not only be eventually consistent with the primary, but must also stay consistent while replication is in progress.

Therefore, the coordinator must satisfy the following two basic requirements when it distributes transactions:

  • Principle 1: Updates must not overwrite each other. Two transactions that update the same row must be distributed to the same worker, so that they are executed in their original order.

  • Principle 2: A transaction must not be split. All of its changes must go to the same worker, so that the isolation of the transaction is not broken.

The multi-threaded replication strategies of the MySQL versions described below all follow these two basic principles.
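The coordinator/worker model can be sketched in a few lines. This is a toy model, not MySQL's implementation: `run_replication`, `pick_worker`, and the row-key format are hypothetical names, and a dictionary stands in for the standby's tables. The coordinator only distributes; each transaction is handed whole to one worker (Principle 2), and the routing function is assumed to send transactions on the same row to the same worker (Principle 1).

```python
import queue
import threading


def run_replication(transactions, num_workers, pick_worker):
    """Coordinator loop: read transactions in log order, give each one whole to a worker.

    transactions: list of transactions; each is a list of (row_key, value) writes
    pick_worker:  hypothetical routing function, transaction -> worker index
    """
    data = {}                       # stands in for the standby's tables
    lock = threading.Lock()
    queues = [queue.Queue() for _ in range(num_workers)]

    def worker(q):
        while True:
            trx = q.get()
            if trx is None:         # sentinel: no more work
                break
            with lock:              # apply the whole transaction as one unit
                for row_key, value in trx:
                    data[row_key] = value

    threads = [threading.Thread(target=worker, args=(q,)) for q in queues]
    for t in threads:
        t.start()
    for trx in transactions:        # the coordinator never applies changes itself
        queues[pick_worker(trx)].put(trx)
    for q in queues:
        q.put(None)
    for t in threads:
        t.join()
    return data


# Two rows, each updated twice; same-row transactions go to the same worker.
log = [[("t1:1", "a")], [("t2:7", "x")], [("t1:1", "b")], [("t2:7", "y")]]
route = lambda trx: 0 if trx[0][0].startswith("t1") else 1
result = run_replication(log, num_workers=2, pick_worker=route)
print(result)
```

Because each worker drains its own FIFO queue, updates to the same row are applied in log order, so the later value always wins.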

MySQL 5.5: the per-table distribution strategy

The basic idea of distributing transactions by table is that two transactions updating different tables can run in parallel, which naturally guarantees that two workers never update the same row. The point that needs care is how to assign a transaction that spans multiple tables.

As shown in the figure, each worker thread has its own hash table, which records the tables touched by the transactions currently in that worker's execution queue. The key is "database name.table name", and the value is a number indicating how many transactions in the queue modify that table.

Hash table update rules: when a transaction is assigned to a worker, the tables it touches are added to that worker's hash table; when the worker finishes executing the transaction, the counts for those tables are decremented, and a table whose count reaches zero is removed from the hash table.
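A minimal sketch of this bookkeeping, assuming one counter per worker keyed by "database name.table name". The class and method names are hypothetical, and `pending` is an invented stand-in for "how busy the worker is".

```python
from collections import Counter


class WorkerState:
    """Bookkeeping for one worker: which tables its queued transactions touch."""

    def __init__(self):
        self.tables = Counter()   # "db.table" -> number of queued transactions touching it
        self.pending = 0          # hypothetical: number of queued transactions

    def assign(self, trx_tables):
        """Record a transaction that was just placed in this worker's queue."""
        for table in set(trx_tables):
            self.tables[table] += 1
        self.pending += 1

    def finish(self, trx_tables):
        """Record that the worker finished a previously assigned transaction."""
        for table in set(trx_tables):
            self.tables[table] -= 1
            if self.tables[table] == 0:
                del self.tables[table]   # no queued transaction touches it any more
        self.pending -= 1

    def conflicts_with(self, trx_tables):
        return any(table in self.tables for table in trx_tables)


w = WorkerState()
w.assign(["db1.t1"])
w.assign(["db1.t1", "db1.t2"])
w.finish(["db1.t1"])
print(dict(w.tables))
```

After the first transaction finishes, db1.t1 is still present with a count of 1 because a second queued transaction also touches it.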

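An example illustrates the assignment rules. In the figure, hash_table_1 shows that worker_1's queue of pending transactions contains 4 transactions touching table db1.t1 and 1 transaction touching table db1.t2; hash_table_2 shows that worker_2 has 1 transaction touching table db1.t3. The coordinator now reads a new transaction T from the relay log, and the rows T modifies belong to tables db1.t1 and db1.t3. The assignment proceeds as follows:

  1. Transaction T touches db1.t1 and db1.t3, and worker_1 has queued transactions modifying t1, so T conflicts with worker_1.

  2. Following the same logic, the coordinator checks T against each worker's queue in turn and finds that T also conflicts with worker_2.

  3. Because T conflicts with more than one worker, the coordinator thread waits. Meanwhile every worker keeps executing and updating its hash_table.

  4. Suppose worker_2 finishes the transaction touching db1.t3; it then removes the db1.t3 entry from hash_table_2.

  5. The coordinator now finds that worker_1 is the only worker that conflicts with T, so it assigns T to worker_1.

  6. The coordinator continues reading the relay log and assigning transactions.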

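In essence, when a transaction is distributed, its conflict relationship with the workers falls into one of three cases:

  • If it conflicts with no worker, the coordinator thread assigns it to the idlest worker.

  • If it conflicts with more than one worker, the coordinator thread waits until only one conflicting worker remains.

  • If it conflicts with exactly one worker, the coordinator thread assigns it to that worker.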
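The three cases can be written as a single decision function. This is a sketch under assumptions: `choose_worker` and `queue_lengths` are hypothetical, each worker's hash table is a plain dict, and returning `None` stands for "the coordinator waits and retries". The data replays the example with hash_table_1, hash_table_2, and transaction T.

```python
def choose_worker(trx_tables, hash_tables, queue_lengths):
    """Return the index of the worker that should receive the transaction,
    or None if the coordinator has to wait.

    hash_tables:   one dict per worker, "db.table" -> count of queued transactions
    queue_lengths: one int per worker (hypothetical measure of how busy it is)
    """
    conflicting = [
        i for i, tables in enumerate(hash_tables)
        if any(t in tables for t in trx_tables)
    ]
    if not conflicting:
        # No conflict at all: pick the idlest worker.
        return min(range(len(hash_tables)), key=lambda i: queue_lengths[i])
    if len(conflicting) == 1:
        # Exactly one conflicting worker: it must take the transaction.
        return conflicting[0]
    # More than one conflicting worker: wait until only one remains.
    return None


hash_table_1 = {"db1.t1": 4, "db1.t2": 1}
hash_table_2 = {"db1.t3": 1}
trx_T = ["db1.t1", "db1.t3"]

first_try = choose_worker(trx_T, [hash_table_1, hash_table_2], [5, 1])
del hash_table_2["db1.t3"]          # worker_2 finishes its db1.t3 transaction
second_try = choose_worker(trx_T, [hash_table_1, hash_table_2], [5, 0])
print(first_try, second_try)
```

On the first attempt T conflicts with both workers, so the coordinator waits; once worker_2 has drained db1.t3, T goes to index 0, which is worker_1.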

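Advantage: when most transactions touch a single table, work is spread evenly across the worker threads and runs concurrently, so execution is efficient.

Disadvantage: with a hot table, for example when every update transaction touches the same table, all transactions are assigned to the same worker and replication degenerates into single-threaded replication, which is inefficient.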

MySQL 5.6: the per-database parallel replication strategy

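Compared with the per-table strategy above, this strategy has a coarser granularity and is simpler. The only difference is that, in the hash table used to decide distribution, the key changes from "database name.table name" to "database name".

Advantage: compared with per-table distribution, the hash value is quick to build because only the database name is needed. How well this strategy parallelizes depends on the workload: if the primary has multiple databases and the load is balanced across them, it works very well.

Disadvantage: as with per-table distribution, if all tables on the primary live in the same database, or only one database is hot, this strategy has no effect.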
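A short sketch of how the choice of key changes what counts as a conflict. The helper names are hypothetical; the only thing taken from the text is that the key is either "database name.table name" or "database name".

```python
def conflict_keys(tables, granularity):
    """Map the tables a transaction touches to the keys used for conflict detection.

    tables:      iterable of "db.table" names
    granularity: "table" (per-table strategy) or "database" (per-database strategy)
    """
    if granularity == "table":
        return set(tables)
    if granularity == "database":
        return {name.split(".", 1)[0] for name in tables}
    raise ValueError(f"unknown granularity: {granularity}")


def can_run_in_parallel(trx_a, trx_b, granularity):
    """Two transactions may run on different workers only if their keys do not overlap."""
    return conflict_keys(trx_a, granularity).isdisjoint(conflict_keys(trx_b, granularity))


same_db_by_table = can_run_in_parallel(["db1.t1"], ["db1.t2"], "table")
same_db_by_database = can_run_in_parallel(["db1.t1"], ["db1.t2"], "database")
different_db = can_run_in_parallel(["db1.t1"], ["db2.t1"], "database")
print(same_db_by_table, same_db_by_database, different_db)
```

Two transactions on different tables of the same database are parallel under the per-table key but serialized under the per-database key, which is why a single hot database removes all parallelism.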

MariaDB: the parallel replication strategy based on group commit

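A brief introduction: MariaDB is a fork of MySQL maintained mainly by the open source community, whereas MySQL is maintained by Oracle. MariaDB is highly compatible with MySQL.

A brief recap of MySQL's group commit mechanism: after the first transaction has written to the redo log buffer / binlog files, fsync is called as late as possible, so that more redo log / binlog writes can join the batch (the commit group) before a single fsync covers the whole group. The more members a group commit has, the more disk IOPS it saves.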

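MariaDB's parallel replication strategy exploits exactly this feature:

  • Transactions that can commit in the same group never modify the same row.

  • Transactions that can run in parallel on the primary can also run in parallel on the standby.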

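MariaDB implements it as follows:

  1. Transactions committed together in one group share the same commit_id; the next group gets commit_id+1.

  2. The commit_id is written directly into the binlog.

  3. When the log is applied on the standby, transactions with the same commit_id are distributed to multiple workers.

  4. After the whole group has finished executing, the coordinator fetches the next batch.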
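The four steps can be sketched as a loop that replays one commit group at a time. This is a toy model rather than MariaDB's code: the relay log is assumed to be a list of `(commit_id, transaction)` pairs in binlog order, and `apply_trx` is a hypothetical callback that applies one transaction.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import groupby


def apply_by_commit_group(relay_log, apply_trx, num_workers=4):
    """Replay transactions group by group.

    relay_log: list of (commit_id, trx) pairs in binlog order
    apply_trx: hypothetical callback that applies one transaction on the standby
    Returns the groups in the order they were executed.
    """
    executed_groups = []
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        for commit_id, entries in groupby(relay_log, key=lambda entry: entry[0]):
            group = [trx for _, trx in entries]
            # Transactions sharing a commit_id run concurrently on the workers.
            # list() drains the results, so the next group starts only after
            # every transaction of this group has finished.
            list(pool.map(apply_trx, group))
            executed_groups.append((commit_id, group))
    return executed_groups


applied = []
log = [(1, "trx1"), (1, "trx2"), (1, "trx3"), (2, "trx4"), (2, "trx5")]
groups = apply_by_commit_group(log, applied.append)
print(groups)
```

The barrier between groups is what keeps the standby consistent, and it is also the source of the throughput problem: no transaction of group 2 can start while any transaction of group 1 is still running.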

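As the figure shows, this strategy has a problem: it does not truly reproduce the primary's degree of concurrency. On the primary, while one group of transactions is committing, the next group is already executing. On the standby, the second group cannot start until the first group has completely finished, so system throughput falls short.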

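Advantage: this strategy is an elegant innovation. Earlier approaches in the industry all focused on analyzing the binlog and splitting it across workers, whereas MariaDB's strategy aims to reproduce the primary's parallel execution pattern. It requires very few changes to the original system, and the implementation is clean.

Disadvantage: this scheme is easily held back by large transactions. Suppose trx2 in the figure's group is a very large transaction: when the group is applied on the standby, trx1 and trx3 finish first, and the next group cannot start until trx2 has completely finished. During that time only one worker thread is working, which wastes resources.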

MySQL 5.7: optimization based on MariaDB's strategy

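The official MySQL 5.7 release combines the MySQL 5.6 scheme and the MariaDB scheme, and the parameter slave-parallel-type selects the parallel replication strategy:

  • Set to DATABASE, it uses the per-database parallel strategy of MySQL 5.6.

  • Set to LOGICAL_CLOCK, it uses a strategy similar to MariaDB's, with improved parallelism.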

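First consider a question: why does MariaDB choose to run in parallel the transactions that are committing in the same group on the primary? Would it work to run in parallel all the transactions that are in the running state at the same time in the figure above?

The answer is no! Among the transactions that are simultaneously in the running state, some may be waiting on locks because of lock conflicts. If such transactions were assigned to different workers on the standby, the standby could become inconsistent with the primary. The core of the MariaDB strategy described above is that all transactions in the commit state can run in parallel: a transaction that has reached the committing state has already passed lock conflict checks.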

Recall MySQL's group commit optimization from the earlier article:

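As the figure shows, there is no need to wait for the commit phase: once a transaction reaches the redo log prepare phase, it has already passed lock conflict checks. The idea behind MySQL 5.7's parallel replication strategy is therefore:

  • Transactions that are in the prepare state at the same time can run in parallel on the standby.

  • A transaction in the prepare state and a transaction in the commit state can also run in parallel on the standby.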
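One way to picture these two rules is as overlapping intervals: two transactions may be applied in parallel if, at some moment on the primary, both were somewhere between entering prepare and finishing commit. The sketch below uses invented logical timestamps (`prepare_at`, `commit_at`) purely for illustration; it is not how MySQL records this information in the binlog.

```python
from dataclasses import dataclass


@dataclass
class TrxTimeline:
    """Hypothetical logical timestamps recorded on the primary for one transaction."""
    name: str
    prepare_at: int   # when the transaction entered redo log prepare
    commit_at: int    # when its commit finished


def can_apply_in_parallel(a, b):
    """True if a and b were both between prepare and commit at some moment."""
    return a.prepare_at <= b.commit_at and b.prepare_at <= a.commit_at


trx1 = TrxTimeline("trx1", prepare_at=1, commit_at=4)
trx2 = TrxTimeline("trx2", prepare_at=3, commit_at=6)   # prepared while trx1 was committing
trx3 = TrxTimeline("trx3", prepare_at=7, commit_at=9)   # prepared after both had finished

print(can_apply_in_parallel(trx1, trx2))
print(can_apply_in_parallel(trx1, trx3))
```

Under this model trx2 may start on the standby while trx1 is still being applied, whereas a strict commit-group barrier would make every later group wait for the whole earlier group.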

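MySQL's group commit optimization works through the parameters that control the delay before fsync (a delay time and a transaction count, binlog_group_commit_sync_delay and binlog_group_commit_sync_no_delay_count). They lengthen the interval between binlog write and fsync and so reduce the number of binlog disk writes. Under MySQL 5.7's parallel replication strategy, the same parameters can be used to create more transactions that are in the prepare phase at the same time, which increases the standby's replication parallelism.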
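The effect of the two knobs can be approximated with a simplified model: the first transaction of a group waits up to a fixed delay before fsync, and stops waiting early once a given number of transactions have joined. The function below and its arrival times are hypothetical; the real server's grouping also depends on how long each fsync takes.

```python
def group_commits(arrivals_us, sync_delay_us, no_delay_count):
    """Split transactions into fsync groups under a simplified delay model.

    arrivals_us:    sorted times (microseconds) at which transactions finish writing the binlog
    sync_delay_us:  how long the first transaction of a group waits before fsync
    no_delay_count: stop waiting once this many transactions have joined (0 = no cap)
    """
    groups = []
    i = 0
    while i < len(arrivals_us):
        deadline = arrivals_us[i] + sync_delay_us
        group = [arrivals_us[i]]
        i += 1
        while i < len(arrivals_us) and arrivals_us[i] <= deadline:
            if no_delay_count and len(group) >= no_delay_count:
                break           # enough members, fsync without waiting further
            group.append(arrivals_us[i])
            i += 1
        groups.append(group)
    return groups


arrivals = [0, 100, 250, 400, 900, 950, 2000]
no_wait = group_commits(arrivals, sync_delay_us=0, no_delay_count=0)
with_wait = group_commits(arrivals, sync_delay_us=500, no_delay_count=0)
capped = group_commits(arrivals, sync_delay_us=500, no_delay_count=3)
print(len(no_wait), with_wait, capped)
```

With no delay every transaction syncs alone; a 500 microsecond delay merges the same arrivals into three groups, so more transactions share a group and the standby has more that it can apply in parallel. The trade-off is that each commit on the primary may wait longer before it returns.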

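Next article: "MySQL Practical Combat 45 Lectures" - Study Notes 28 "Read/write splitting / solutions to primary-standby delay / GTID"

Reference for this chapter: 26 | Why can the standby database lag by several hours?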

Origin blog.csdn.net/minghao0508/article/details/128918530