Database Transaction Series: The MySQL Cross-Row Transaction Model

MySQL and I seem fated to meet: my first job after graduation was on the RDS team, where I was mainly responsible for turning MySQL into a database service on the cloud. Although I dealt with MySQL all day, to be honest I did not understand the MySQL kernel very deeply at the time. Most of my work was building management and control systems around MySQL, which sits at a fairly high level. Fortunately I was surrounded by MySQL kernel experts, and under their influence I picked up some scattered notes and a rough understanding of MySQL fundamentals. That foundation matters a lot for today's attempt to organize my understanding of the MySQL cross-row transaction model. More importantly, I can still turn to those experts when something puzzles me.

There are many introductions to the MySQL transaction model online. Before writing this article I read a lot of material as reference, in order to understand the topic more deeply and more completely. After reading most of those articles, I found that they are incomplete. Some only describe how MySQL behaves under the various isolation levels and do not interpret the behavior from a technical perspective. Others are thorough but somewhat disorganized, which makes them hard to follow. That is what I want to do differently here: interpret the model from a technical point of view, in a way that is easy to understand.

MySQL transaction atomicity guarantee

Atomicity requires that the operations in a transaction either all complete or none take effect; a transaction can never be left half done. Atomicity is easy to achieve for a single atomic operation, which is why row-level transaction atomicity in HBase is relatively simple to implement. For a transaction made up of multiple statements, however, if an exception occurs partway through execution, the only way to preserve atomicity is to roll back to the state before the transaction started, as if the transaction had never happened. How is this achieved?

MySQL's rollback relies entirely on the undo log. As an aside, besides guaranteeing atomicity, the undo log is also used to implement MVCC, which is covered later. To achieve atomicity with undo, before any data is modified, its pre-modification value is first recorded in the undo log, and only then is the actual modification made. If an exception requires a rollback, the system uses the backup in the undo log to restore the data to its state before the transaction began. In the basic data structure that MySQL uses to represent a transaction, the undo-related fields are insert_undo and update_undo, which point to the undo logs produced by that transaction.

To roll back, the transaction follows update_undo (or insert_undo) to the corresponding undo log and performs the reverse operations. For data that was marked as deleted, the delete mark is cleared; for updated data, the update is simply reverted. Inserts are slightly more complex, because not only the data has to be removed but also the related clustered index and secondary index records.
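
The rollback idea can be sketched in a few lines of Python. This is only a toy model under stated assumptions: a dict stands in for the table, a list stands in for the undo log, and the class name ToyTransaction and its methods are hypothetical. It also deletes rows outright instead of delete-marking them, and it ignores indexes and persistence.

```python
class ToyTransaction:
    """Toy model: a dict stands in for a table, a list for the undo log."""

    def __init__(self, table):
        self.table = table
        self.undo_log = []  # entries: (operation, key, value before the change)

    def insert(self, key, value):
        if key in self.table:
            raise KeyError(f"duplicate key {key!r}")
        self.undo_log.append(("insert", key, None))  # record undo first
        self.table[key] = value                      # then modify

    def update(self, key, value):
        self.undo_log.append(("update", key, self.table[key]))
        self.table[key] = value

    def delete(self, key):
        self.undo_log.append(("delete", key, self.table[key]))
        del self.table[key]

    def rollback(self):
        # Apply the inverse of each operation, newest first.
        while self.undo_log:
            op, key, old_value = self.undo_log.pop()
            if op == "insert":
                del self.table[key]
            else:  # update or delete: put the old value back
                self.table[key] = old_value

    def commit(self):
        self.undo_log.clear()


table = {1: "a", 2: "b"}
trx = ToyTransaction(table)
trx.update(1, "A")
trx.delete(2)
trx.insert(3, "c")
trx.rollback()
print(table)  # {1: 'a', 2: 'b'}
```

The key ordering is the one described above: the old value goes into the undo log before the data is touched, so a rollback always has something to restore from.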

The undo log is a very important part of the MySQL kernel, and it involves a lot of complex knowledge, for example:

  • 1. The undo log must be persisted before the data is modified. Does persisting the undo log itself need to be recorded in the redo log to guard against a crash? If so, crash recovery is involved as well ...

  • 2. How is MVCC implemented on top of the undo log?

  • 3. Which undo logs can be reclaimed, and in what scenarios? How are they cleaned up?

MySQL transaction consistency guarantee: strongly consistent transactions

MySQL transaction isolation levels


Read Uncommitted (technical interpretation: write-write concurrency uses X locks)

Read Uncommitted implements only write-write concurrency control and has no effective read-write concurrency control. As a result, the current transaction may read modifications that other transactions have not yet committed. Such data is unreliable (it may be rolled back), so any assumption built on it is unreliable as well. In practice, very few applications choose this isolation level.

The write-write concurrency mechanism is no different from HBase's: both use the two-phase locking protocol and take row locks on the affected records. The row lock mechanism in MySQL is more complex, though. Depending on whether the column in the condition has a primary key index, a unique index, a non-unique index, or no index at all, there are several locking cases. Take a write whose condition is id = 15 as an example:

  • 1. If the id column is the primary key, MySQL locks only the clustered index record.

  • 2. If the id column has a unique secondary index, MySQL locks both the secondary index leaf entry and the corresponding clustered index record.

  • 3. If the id column has a non-unique index, MySQL locks every secondary index leaf entry that satisfies the condition (id = 15) together with the corresponding clustered index records.

  • 4. If the id column has no index, the SQL statement performs a full table scan over the clustered index, and the scanned rows are passed to the MySQL Server layer for filtering. InnoDB therefore locks every record it scans; if the Server layer's filter finds that a record does not satisfy the condition, InnoDB releases the lock on it. In other words, InnoDB locks every record it scans. Scary, huh!
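
The four cases can be summarized as a small lookup function. This is a rough sketch, not InnoDB code: the function name locks_taken, the index-kind strings, and the row identifiers are all hypothetical, and for the no-index case it only reports the locks taken during the scan, before any are released.

```python
def locks_taken(index_kind, matching_rows, scanned_rows):
    """Return the (index, row) pairs that receive an X lock in this toy model."""
    if index_kind == "primary":
        # Case 1: only the clustered index record is locked.
        return [("clustered", row) for row in matching_rows]
    if index_kind in ("unique", "non_unique"):
        # Cases 2 and 3: secondary index entry plus clustered index record,
        # once for every row that satisfies the condition.
        locks = []
        for row in matching_rows:
            locks.append(("secondary", row))
            locks.append(("clustered", row))
        return locks
    if index_kind == "none":
        # Case 4: full scan, so every scanned record is locked first.
        return [("clustered", row) for row in scanned_rows]
    raise ValueError(f"unknown index kind: {index_kind}")


rows = ["r1", "r2", "r3"]
print(locks_taken("primary", ["r2"], rows))
print(locks_taken("non_unique", ["r2", "r3"], rows))
print(locks_taken("none", ["r2"], rows))
```

The last call makes the cost of a missing index visible: one matching row, but three records locked during the scan.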

From here on, whether the level is RC, RR, or Serializable, write-write concurrency control uses the mechanism above, so it is not repeated. The analysis below focuses on the read-write concurrency control mechanisms of the RC and RR isolation levels.

Before going into RC and RR, it is necessary to introduce MySQL's MVCC mechanism first, because both RC and RR use MVCC to handle read-write concurrency between transactions. The two differ only in some implementation details, which are discussed afterwards.

MVCC in MySQL

The MVCC mechanism in MySQL is much more complex than in HBase, and so are the data structures involved. To keep the explanation clear, it is built around one example: suppose there is currently one row record with four user columns and two hidden columns.

The first four columns are the actual column values of the row. The ones to focus on are the two hidden columns DB_TRX_ID and DB_ROLL_PTR (invisible to the user). DB_TRX_ID is the ID of the transaction that modified the row, and DB_ROLL_PTR is a pointer into the rollback segment for the row. All versions of the row are organized as a linked list in undo, and this value points to the row's history list in undo.

Now suppose a transaction trx2 modifies this row. DB_TRX_ID then holds the ID of the transaction that most recently modified the row (trx2), and DB_ROLL_PTR points to the undo history list that holds the previous version.
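
A minimal sketch of this version chain, assuming a simplified in-memory model: the class RowVersion and the helpers update_row and version_chain are hypothetical names, the column values are made up, and the initial writer ID 1 is an assumption. Real InnoDB updates the row in place and stores the old image in the undo log; here each version is simply an object that points to its predecessor.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RowVersion:
    values: dict                                # user-visible column values
    db_trx_id: int                              # transaction that wrote this version
    db_roll_ptr: Optional["RowVersion"] = None  # previous version, if any


def update_row(current, trx_id, **changes):
    """Create the new current version and chain the old one behind it."""
    return RowVersion({**current.values, **changes}, trx_id, db_roll_ptr=current)


def version_chain(row):
    """Walk DB_ROLL_PTR and list the writer IDs, newest first."""
    ids = []
    while row is not None:
        ids.append(row.db_trx_id)
        row = row.db_roll_ptr
    return ids


row = RowVersion({"id": 1, "name": "a"}, db_trx_id=1)
row = update_row(row, trx_id=2, name="b")
print(row.values)          # {'id': 1, 'name': 'b'}
print(version_chain(row))  # [2, 1]
```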

With the row record understood, let's look at the basic structure of a transaction, which is the MySQL transaction data structure mentioned above. After a transaction is opened, a data structure is created to store transaction-related information, lock information, the undo log, and the very important read_view.

read_view holds the list of all transactions that were active in MySQL when the current transaction was opened. In this example, the active transactions at that moment are trx4, trx6, trx7, and trx10. In addition, up_trx_id is the smallest transaction ID in that list when the current transaction starts, and low_trx_id is the largest transaction ID in that list when the current transaction starts.

read_view is a key part of the MVCC implementation: it determines which version of a record is visible to the current transaction. If the current transaction reads a row whose version number (transaction ID) is trxid, then:

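  • 1. If trxid < up_trx_id, the transaction that wrote this version committed before the current transaction was created, so the version is visible to the current transaction.

  • 2. If trxid > low_trx_id, the transaction that wrote this version was opened after the current transaction was created, so the version is not visible to the current transaction.

  • 3. If up_trx_id <= trxid <= low_trx_id, the transaction that wrote this version may have been active when the current transaction was created. Walk the active list from up_trx_id to low_trx_id: if trxid equals one of the transaction IDs in it, the version is not visible; otherwise it is visible.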

Take a row record with several versions (trx2, trx5, trx7, and trx12) as an example, where trx12 is the newest. Let's see which version is visible to the current transaction.

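  • 1. The newest version of the row is trx12. Compared against the current transaction's read_view, trx12 is greater than trx10, the largest transaction in the active list, which means trx12 was opened after the current transaction was created, so it is not visible.

  • 2. The second newest version is trx7. It lies between the smallest and largest transaction IDs in the active list, so its transaction may have been active when the current transaction was created. Walking the active list shows that trx7 is indeed in it, which means that transaction has not committed, so this version is not visible to the current transaction.

  • 3. The third newest version is trx5, which also lies between the smallest and largest transaction IDs in the active list. Walking the list shows that trx5 is not in it, which means the trx5 transaction has already committed. (Note: commit order is unrelated to transaction ID; a transaction with a larger ID may commit before one with a smaller ID.) The trx5 version is therefore visible to the current transaction and is returned directly.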
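
The visibility rules and the worked example can be checked with a short Python sketch. It follows the article's simplified model, not the actual InnoDB source: the class ReadView, its field active_ids, and the helper first_visible are hypothetical names, the active list is assumed to be non-empty, and a transaction's own writes are not modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReadView:
    active_ids: frozenset  # transactions active when the view was created

    @property
    def up_trx_id(self):
        return min(self.active_ids)  # smallest active transaction ID

    @property
    def low_trx_id(self):
        return max(self.active_ids)  # largest active transaction ID

    def is_visible(self, trx_id):
        if trx_id < self.up_trx_id:
            return True   # committed before the view was created
        if trx_id > self.low_trx_id:
            return False  # started after the view was created
        return trx_id not in self.active_ids  # in range: visible only if committed


def first_visible(versions_newest_first, view):
    """Return the writer ID of the newest version the view may read."""
    for trx_id in versions_newest_first:
        if view.is_visible(trx_id):
            return trx_id
    return None


view = ReadView(frozenset({4, 6, 7, 10}))
print(first_visible([12, 7, 5, 2], view))  # 5
```

The walk rejects trx12 (too new) and trx7 (still active), then stops at trx5, matching the three steps above.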


Read Committed (technical interpretation: write-write concurrency uses X locks; read-write concurrency uses MVCC to avoid dirty reads)

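The sections above introduced how MVCC is implemented in MySQL. To understand transaction visibility under the RC isolation level, one more key point is needed: under RC, a transaction generates a fresh read_view on every select, replacing the previous one.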

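Consider two transactions. Transaction 1 queries the record with id=1 three times at different points in time. Transaction 2 updates the record with id=1. Before the update the record has only one version; after the update it has two.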

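Under RC, transaction 1 generates a fresh read_view for every select. For the first two queries, the global active transaction list contains trx2, so by the MVCC rules the old version is returned. The last query happens after transaction 2 has committed, so the active list no longer contains trx2, and by the MVCC rules the newest version is returned.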

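Repeatable Read (technical interpretation: write-write concurrency uses X locks; read-write concurrency uses MVCC to avoid non-repeatable reads; current reads use gap locks to avoid phantom reads)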

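Unlike RC, a transaction under RR does not generate a fresh read_view on every select. It generates the read_view on its first select and never changes it until the transaction ends. This effectively avoids non-repeatable reads, so the data the transaction reads stays consistent for its whole lifetime.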

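This is easy to understand: all three queries use the same global active transaction list, namely the read_view generated the first time, so later queries necessarily return the same record as the first.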
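
The only difference between the two levels, when the read_view is taken, can be simulated directly. The sketch below is a toy model with hypothetical names (ToyDatabase, Reader, run). As an extra assumption beyond the text, a read_view also remembers the largest transaction ID seen so far, so that transactions started after the snapshot stay invisible, and the initial version is attributed to an already committed transaction 0.

```python
class ToyDatabase:
    def __init__(self):
        self.active = set()           # IDs of uncommitted transactions
        self.max_trx_id = 0           # largest transaction ID handed out so far
        self.versions = [(0, "old")]  # (writer trx ID, value), newest first

    def begin(self, trx_id):
        self.active.add(trx_id)
        self.max_trx_id = max(self.max_trx_id, trx_id)

    def update(self, trx_id, value):
        self.versions.insert(0, (trx_id, value))

    def commit(self, trx_id):
        self.active.discard(trx_id)

    def new_read_view(self):
        return frozenset(self.active), self.max_trx_id


class Reader:
    def __init__(self, db, trx_id, isolation):
        self.db, self.trx_id, self.isolation = db, trx_id, isolation
        self.read_view = None

    def select(self):
        # RC: fresh read_view on every select. RR: only on the first select.
        if self.isolation == "RC" or self.read_view is None:
            self.read_view = self.db.new_read_view()
        active, max_id = self.read_view
        for writer, value in self.db.versions:
            if writer == self.trx_id or (writer <= max_id and writer not in active):
                return value
        return None


def run(isolation):
    db = ToyDatabase()
    db.begin(1)
    db.begin(2)
    reader = Reader(db, 1, isolation)
    results = [reader.select()]      # before transaction 2 writes
    db.update(2, "new")
    results.append(reader.select())  # transaction 2 wrote but has not committed
    db.commit(2)
    results.append(reader.select())  # after transaction 2 commits
    return results


print(run("RC"))  # ['old', 'old', 'new']
print(run("RR"))  # ['old', 'old', 'old']
```

Under RC the third select picks up the committed update; under RR the frozen read_view still lists trx2 as active, so the old version keeps being returned.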

Can the RR isolation level avoid phantom reads?

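If phantom reads are unfamiliar, see the first article in this series. Consider this scenario: transaction 1 runs a query with the filter id>1 three times, and transaction 2 performs one insert whose record happens to satisfy id>1. The three queries return the same data, which is guaranteed by the MVCC mechanism of the RR isolation level. That looks like phantom reads have been avoided. However, when transaction 1 finally inserts a record at id=2, MySQL returns a Duplicate entry error, so the apparent absence of phantom reads is an illusion.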

Strictly avoiding phantom reads (technical interpretation: current reads use gap locks to avoid phantom reads)

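All the RR-level select statements mentioned so far are called snapshot reads. Snapshot reads avoid non-repeatable reads, but they cannot avoid phantom reads. MySQL therefore introduces the concept of a "current read". Common current-read statements are: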

1.  select for update

2.  select lock in share mode

3.  update / delete

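It is further specified that under RR, current-read statements place a special kind of lock on records, the gap lock. A gap lock does not lock a specific record; it locks the gap between records, guaranteeing that no new record can be inserted into that gap.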

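In this example, transaction 1 first executes a current-read select, which places gap locks on all gaps with id > 0. When transaction 2 then tries to insert at id = 3, the system returns a Lock wait timeout exceeded error. Other transactions can of course still insert successfully where id <= 0.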
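
The effect of the gap lock in this scenario can be sketched as a range check. This is a toy model with hypothetical names (ToyGapLocks, lock_gap, insert_allowed): it treats the locked region as one open interval and simply answers yes or no, whereas a real insert would wait for the lock and fail only after the lock wait timeout.

```python
import math


class ToyGapLocks:
    def __init__(self):
        # Each entry is (owner trx ID, low, high) for the open interval low < key < high.
        self.ranges = []

    def lock_gap(self, trx_id, low, high=math.inf):
        self.ranges.append((trx_id, low, high))

    def release(self, trx_id):
        # Called when the owning transaction commits or rolls back.
        self.ranges = [r for r in self.ranges if r[0] != trx_id]

    def insert_allowed(self, trx_id, key):
        # An insert is blocked if another transaction holds a gap containing the key.
        return not any(
            owner != trx_id and low < key < high
            for owner, low, high in self.ranges
        )


locks = ToyGapLocks()
locks.lock_gap(trx_id=1, low=0)     # transaction 1: current read with id > 0
print(locks.insert_allowed(2, 3))   # False: id = 3 falls inside the locked gap
print(locks.insert_allowed(2, 0))   # True: id <= 0 is outside the locked gap
locks.release(1)
print(locks.insert_allowed(2, 3))   # True once transaction 1 has finished
```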

Serializable (technical interpretation: S locks for reads + X locks for writes)

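Serializable is the strictest isolation level. Every read request takes a read lock, with no distinction between snapshot reads and current reads, and every write takes a write lock. Naturally, because of the locking overhead, this isolation level has the worst performance.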

MySQL transaction durability guarantee

MySQL's durability strategy is basically the same as HBase's, but more components are involved, mainly doublewrite, the redo log, and the binlog:

1. MySQL data persistence (DoubleWrite)

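MySQL actually writes real data twice: first to an area called DoubleWrite, and only after that succeeds to the data's actual location on disk. Why write twice? Because the MySQL data page size differs from the size of one atomic disk write, a partial write is possible. For example, the default InnoDB data page is 16K while one atomic disk write is 512 bytes (the sector size), so writing one data page takes multiple IOs, and if an exception occurs midway the page is left partially written and its data is lost. Note also that DoubleWrite does not hurt performance much, because writes to DoubleWrite are sequential.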
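
The torn-page problem and the doublewrite fix can be simulated in memory. This is a sketch under stated assumptions: pages are byte strings with a CRC32 trailer (a stand-in for the real page checksum), and make_page, torn_write, and recover are hypothetical helpers. A real recovery would also validate the doublewrite copy itself.

```python
import zlib

PAGE_SIZE = 16 * 1024  # default InnoDB page size
SECTOR_SIZE = 512      # atomic write unit assumed in the text


def make_page(payload: bytes) -> bytes:
    """Pad the payload to a full page and append a 4-byte checksum."""
    body = payload.ljust(PAGE_SIZE - 4, b"\0")
    return body + zlib.crc32(body).to_bytes(4, "big")


def is_intact(page: bytes) -> bool:
    return zlib.crc32(page[:-4]) == int.from_bytes(page[-4:], "big")


def torn_write(old_page: bytes, new_page: bytes, sectors_done: int) -> bytes:
    """Simulate a crash after only `sectors_done` sectors reached the disk."""
    cut = sectors_done * SECTOR_SIZE
    return new_page[:cut] + old_page[cut:]


def recover(data_page: bytes, doublewrite_copy: bytes) -> bytes:
    """On restart, replace a torn data page with its doublewrite copy."""
    return data_page if is_intact(data_page) else doublewrite_copy


old_page = make_page(b"balance=100")
new_page = make_page(b"balance=250")

doublewrite_copy = new_page                                 # write 1 completes
torn_page = torn_write(old_page, new_page, sectors_done=1)  # write 2 crashes early

print(is_intact(torn_page))                              # False
print(recover(torn_page, doublewrite_copy) == new_page)  # True
```

Without the first write, the torn page would be neither the old nor the new version, and there would be no intact copy to restore from.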

2. redo log persistence policy (innodb_flush_log_at_trx_commit)

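The redo log is InnoDB's WAL: a change is written to the redo log and persisted before the updated page in the buffer pool is written back to the data files. The redo log persistence policy matches that of the hlog in HBase. The default value is 1, meaning the log is persisted to disk at every transaction commit. A value of 0 means an asynchronous thread persists the log to disk about once per second; in this case a MySQL crash may lose some data. A value of 2 means the log is flushed to the operating system cache at every transaction commit, and the operating system then flushes it to disk asynchronously; in this case a MySQL crash loses no data, but a machine crash may lose some.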
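
The three settings differ only in how far a committed log record has travelled when a crash happens. The following toy model makes that explicit. It assumes three storage layers (process memory, operating system cache, disk), and the class ToyRedoLog and its methods are hypothetical names, not part of MySQL.

```python
class ToyRedoLog:
    def __init__(self, flush_log_at_trx_commit):
        self.policy = flush_log_at_trx_commit
        self.log_buffer = []  # inside the MySQL process
        self.os_cache = []    # written to the OS but not yet synced
        self.disk = []        # durable

    def commit(self, record):
        self.log_buffer.append(record)
        if self.policy in (1, 2):
            self._write()     # hand the log to the operating system
        if self.policy == 1:
            self._sync()      # and force it to disk

    def background_flush(self):
        """Stand-in for the asynchronous flush that runs about once per second."""
        self._write()
        self._sync()

    def _write(self):
        self.os_cache.extend(self.log_buffer)
        self.log_buffer.clear()

    def _sync(self):
        self.disk.extend(self.os_cache)
        self.os_cache.clear()

    def surviving_records(self, crash):
        if crash == "mysql":    # process dies, operating system keeps running
            return self.disk + self.os_cache
        if crash == "machine":  # power loss: only synced data survives
            return list(self.disk)
        raise ValueError(f"unknown crash type: {crash}")


for policy in (0, 1, 2):
    log = ToyRedoLog(policy)
    log.commit("trx-1")
    print(policy, log.surviving_records("mysql"), log.surviving_records("machine"))
```

With value 0 the record is lost in both crash types until the background flush runs; with value 2 it survives a MySQL crash but not a machine crash; with value 1 it survives both.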

3. binlog persistence policy (sync_binlog)

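The binlog is the logging system of the Server layer. It records the database's operations sequentially as events, and it can also record how long each operation took. The official MySQL documentation describes the two core purposes of the binlog as backup and replication, so binlog persistence affects, to some degree, the completeness of backups and replication. As with the redo persistence policy, the possible values are 0, 1, and N. A value of 0 (the default in older versions; MySQL 5.7.7 and later default to 1) means the binlog is written to the operating system cache and flushed to disk asynchronously. A value of 1 means it is synchronously written to disk. A value of N means a flush is performed once every N writes to the operating system cache.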

To sum up, this is the third article in the database transaction series. It introduced the core of the single-node MySQL cross-row transaction model, with a fairly detailed explanation of the lock-based isolation techniques and the MVCC mechanism involved, and a brief analysis of atomicity and durability. The next article will look at distributed transaction models and how they differ from single-node ones.

Origin blog.51cto.com/14230003/2448194