MySQL in Practice: How Is MVCC Implemented?

What does MVCC do?

To let reads and writes run concurrently without locking, MySQL uses a mechanism called MVCC (multi-version concurrency control). Let's look at how MVCC makes that possible.

For tables that use the InnoDB storage engine, every clustered index record contains the following two hidden columns:

trx_id : Each time a transaction changes a clustered index record, it writes its own transaction id into the trx_id hidden column.

roll_pointer : Each time a clustered index record is changed, the old version is written to the undo log. This hidden column acts as a pointer through which the version of the record before the modification can be found.

If the name of a record is changed from Diaochan to Wang Zhaojun and then to Xi Shi, each change leaves the previous version in the undo log. The versions are linked through roll_pointer, newest first, and together they form a version chain.

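As a rough illustration, the version chain can be modelled as a singly linked list. This is only a sketch: the RecordVersion class, the update and chain_names helpers, and the transaction ids 10 and 100 are assumptions made for the example, not InnoDB's actual structures.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RecordVersion:
    """One version of a clustered index record (hypothetical model)."""
    name: str
    trx_id: int                              # id of the transaction that wrote this version
    roll_pointer: Optional["RecordVersion"]  # previous version in the undo log, if any


def update(current: RecordVersion, new_name: str, trx_id: int) -> RecordVersion:
    """Return the new head of the chain; the old version stays reachable via roll_pointer."""
    return RecordVersion(name=new_name, trx_id=trx_id, roll_pointer=current)


def chain_names(head: Optional[RecordVersion]) -> list:
    """Walk the chain from newest to oldest and collect the name of each version."""
    names = []
    while head is not None:
        names.append(head.name)
        head = head.roll_pointer
    return names


# Transaction ids 10 and 100 are illustrative values only.
head = RecordVersion("Diaochan", trx_id=10, roll_pointer=None)
head = update(head, "Wang Zhaojun", trx_id=100)
head = update(head, "Xi Shi", trx_id=100)
print(chain_names(head))  # ['Xi Shi', 'Wang Zhaojun', 'Diaochan']
```

The newest version sits at the head of the list, and each roll_pointer leads one step back in time.
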
First, let's review the isolation levels so that the rest of the article is easier to follow.

√ means it can happen, × means it cannot happen

isolation level     dirty read    non-repeatable read    phantom read
read uncommitted    √             √                      √
read committed      ×             √                      √
repeatable read     ×             ×                      √
serializable        ×             ×                      ×

read committed

Create the following table

CREATE TABLE `account` (
  `id` int(2) NOT NULL AUTO_INCREMENT,
  `name` varchar(10) DEFAULT NULL,
  `balance` int(3) DEFAULT '0',
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=4 DEFAULT CHARSET=utf8mb4;

Set the isolation level to read committed. In the session below, the row with id = 2 starts with a balance of 0.

time  client            statements
T1    Client A (Tab A)  set session transaction isolation level read committed;
                        start transaction;
                        select * from account where id = 2;
                        -- the query returns a balance of 0
T2    Client B (Tab B)  set session transaction isolation level read committed;
                        start transaction;
                        update account set balance = balance + 1000 where id = 2;
                        select * from account where id = 2;
                        -- the query returns a balance of 1000
                        commit;
T3    Client A (Tab A)  select * from account where id = 2;
                        -- the query returns a balance of 1000
                        commit;

A non-repeatable read happens when transaction 1 reads a row, transaction 2 then modifies that row and commits before transaction 1 ends, and transaction 1 reads the row again. Because of transaction 2's change, the two reads in transaction 1 can return different values, hence the name. In the example above, Client A reads a balance of 0 at T1 and 1000 at T3 within the same transaction.

repeatable read

Set the isolation level to repeatable read. As before, the row with id = 2 starts with a balance of 0.

time  client            statements
T1    Client A (Tab A)  set session transaction isolation level repeatable read;
                        start transaction;
                        select * from account where id = 2;
                        -- the query returns a balance of 0
T2    Client B (Tab B)  set session transaction isolation level repeatable read;
                        start transaction;
                        update account set balance = balance + 1000 where id = 2;
                        select * from account where id = 2;
                        -- the query returns a balance of 1000
                        commit;
T3    Client A (Tab A)  select * from account where id = 2;
                        -- the query returns a balance of 0
                        commit;

Compare the T3 output of this example with that of the previous one and the meaning of repeatable read becomes clear. When the isolation level of the current session is repeatable read, every read inside the transaction returns the same result set, whether or not other transactions have committed in the meantime.

When I first ran this experiment I was baffled: how does MySQL support these two isolation levels? Let's read on.

How is MVCC implemented?

To determine which version in the version chain is visible to the current transaction, MySQL introduces the concept of a ReadView. Its four important fields are as follows:

m_ids : List of transaction ids that are active in the current system when ReadView is generated

min_trx_id : When generating ReadView, the smallest transaction id active in the current system, that is, the smallest value in m_ids

max_trx_id : The transaction id value that the system should assign to the next transaction when generating the ReadView

creator_trx_id : The transaction id of the transaction that generated the ReadView

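A transaction is assigned a unique transaction id only when it changes records in a table, that is, when it executes an insert, delete, or update statement. Otherwise its transaction id defaults to 0.

max_trx_id is not the largest value in m_ids; transaction ids are assigned in increasing order. For example, suppose there are three transactions with ids 1, 2, and 3, and the transaction with id 3 then commits. When a new transaction generates a ReadView, m_ids contains 1 and 2, min_trx_id is 1, and max_trx_id is 4.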
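The example with transactions 1, 2, and 3 can be reproduced with a small Python sketch. The ReadView dataclass and the make_read_view helper are hypothetical names used only for illustration; in particular, falling back to next_trx_id when no transaction is active is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ReadView:
    m_ids: List[int]      # ids of the transactions active when the view was created
    min_trx_id: int       # smallest id in m_ids
    max_trx_id: int       # id the system would assign to the next transaction
    creator_trx_id: int   # id of the transaction that created the view


def make_read_view(active_ids, next_trx_id, creator_trx_id=0):
    """Build a ReadView snapshot from the ids active right now (hypothetical helper)."""
    m_ids = sorted(active_ids)
    # Assumption of this sketch: with no active transactions, use next_trx_id.
    min_trx_id = m_ids[0] if m_ids else next_trx_id
    return ReadView(m_ids, min_trx_id, next_trx_id, creator_trx_id)


# Transactions 1, 2, 3 were started; 3 has committed, so the next id to assign is 4.
view = make_read_view(active_ids={1, 2}, next_trx_id=4)
print(view.m_ids, view.min_trx_id, view.max_trx_id)  # [1, 2] 1 4
```

Note that max_trx_id is 4 even though the largest active id is 2, because id 3 has already been handed out.
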

MVCC decides which version in the version chain is visible to the current transaction by going through the following steps:

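  1. If the trx_id of the version being accessed equals creator_trx_id, the current transaction is reading a record it modified itself, so the version is visible to it.
  2. If the trx_id of the version is less than min_trx_id, the transaction that produced the version committed before the current transaction generated its ReadView, so the version is visible.
  3. If the trx_id of the version is greater than or equal to max_trx_id, the transaction that produced the version started after the current transaction generated its ReadView, so the version is not visible; move on to the next version in the chain as described in 4.1.
  4. Otherwise, check whether the trx_id of the version is in the m_ids list.
    4.1 Yes: the transaction that produced the version was still active when the ReadView was created, so the version is not visible. Follow the version chain to the next version and repeat the steps above. If even the last version is not visible, the record is completely invisible to the current transaction.
    4.2 No: the transaction that produced the version had already committed when the ReadView was created, so the version is visible.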
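The steps above can be sketched as a single Python function. This is a simplified model, not InnoDB source code: the ReadView dataclass and the is_visible name are assumptions for illustration, and the sample ids are arbitrary.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ReadView:
    m_ids: List[int]
    min_trx_id: int
    max_trx_id: int
    creator_trx_id: int


def is_visible(trx_id: int, view: ReadView) -> bool:
    """Return True if a version written by trx_id is visible under this ReadView."""
    if trx_id == view.creator_trx_id:
        return True                   # step 1: the transaction's own change
    if trx_id < view.min_trx_id:
        return True                   # step 2: committed before the view was created
    if trx_id >= view.max_trx_id:
        return False                  # step 3: started after the view was created
    return trx_id not in view.m_ids   # step 4: visible only if no longer active


# Illustrative view: transactions 100 and 200 are active, the next id is 201.
view = ReadView(m_ids=[100, 200], min_trx_id=100, max_trx_id=201, creator_trx_id=0)
for trx_id in (10, 100, 150, 200, 201):
    print(trx_id, is_visible(trx_id, view))
```

Here 150 is visible because it lies between min_trx_id and max_trx_id but is absent from m_ids, which means that transaction had already committed when the view was created.
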

If the steps still feel abstract, an example will help.

Create the following table:

CREATE TABLE `girl` (
  `id` int(11) NOT NULL,
  `name` varchar(255),
  `age` int(11),
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

Read Committed

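Under Read Committed, a ReadView is generated before every read.

The example involves three transactions. Transaction 100 changes the name from Diaochan to Wang Zhaojun and then to Xi Shi; transaction 200 later changes it to Yang Yuhuan; a third transaction only runs select statements. The timeline is divided into numbered time points, and the analysis below looks at the select statements at time points 5 and 8.

First, let's analyze how the select at time point 5 executes.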

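  1. Two transactions, with ids 100 and 200, are active in the system.
  2. Executing the select statement generates a ReadView with m_ids=[100,200], min_trx_id=100, max_trx_id=201, creator_trx_id=0 (the selecting transaction has made no changes, so its transaction id defaults to 0).
  3. The newest version has the name Xi Shi and a trx_id of 100, which is in m_ids, so it is not visible; follow roll_pointer to the next version.
  4. The next version has the name Wang Zhaojun and a trx_id of 100, which is also in m_ids, so it is not visible either; move on to the next version.
  5. The next version has the name Diaochan and a trx_id of 10, which is less than min_trx_id, so it is visible and the name returned is Diaochan.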
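The walk through time point 5 can be replayed in Python. This is a minimal sketch under the same simplified model: Version, ReadView, is_visible, and first_visible are hypothetical names, and the prev field stands in for roll_pointer.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Version:
    name: str
    trx_id: int
    prev: Optional["Version"] = None  # plays the role of roll_pointer


@dataclass(frozen=True)
class ReadView:
    m_ids: List[int]
    min_trx_id: int
    max_trx_id: int
    creator_trx_id: int


def is_visible(trx_id, view):
    if trx_id == view.creator_trx_id or trx_id < view.min_trx_id:
        return True
    if trx_id >= view.max_trx_id:
        return False
    return trx_id not in view.m_ids


def first_visible(head, view):
    """Follow the chain from the newest version to the first visible one (or None)."""
    version = head
    while version is not None and not is_visible(version.trx_id, view):
        version = version.prev
    return version


# Version chain at time point 5, newest first.
chain = Version("Xi Shi", 100, Version("Wang Zhaojun", 100, Version("Diaochan", 10)))
view = ReadView(m_ids=[100, 200], min_trx_id=100, max_trx_id=201, creator_trx_id=0)

result = first_visible(chain, view)
print(result.name)  # Diaochan
```

The two versions written by transaction 100 are skipped because 100 is in m_ids, and the version written by transaction 10 is returned because 10 is below min_trx_id.
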

Next, let's analyze how the select at time point 8 executes.

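  1. One transaction, with id 200, is active in the system (the transaction with id 100 has already committed).
  2. Executing the select statement generates a ReadView with m_ids=[200], min_trx_id=200, max_trx_id=201, creator_trx_id=0.
  3. The newest version has the name Yang Yuhuan and a trx_id of 200, which is in m_ids, so it is not visible; follow roll_pointer to the next version.
  4. The next version has the name Xi Shi and a trx_id of 100, which is less than min_trx_id, so it is visible and the name returned is Xi Shi.

Once the transaction with id 200 commits, a subsequent query returns Yang Yuhuan as the name.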

Repeatable Read

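Under Repeatable Read, a ReadView is generated only on the first read.

Because the ReadView is created only once, every read sees the same version, so the name stays Diaochan throughout. The process has already been walked through twice above, so I won't repeat it here; you should be able to work through it yourself.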
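The difference between the two isolation levels comes down to when the ReadView is created, which the following Python sketch tries to make concrete. The Reader class and its select method are hypothetical helpers, and the active id lists at each time point are taken from the analysis above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Version:
    name: str
    trx_id: int
    prev: Optional["Version"] = None  # plays the role of roll_pointer


@dataclass(frozen=True)
class ReadView:
    m_ids: List[int]
    min_trx_id: int
    max_trx_id: int
    creator_trx_id: int = 0


def is_visible(trx_id, view):
    if trx_id == view.creator_trx_id or trx_id < view.min_trx_id:
        return True
    if trx_id >= view.max_trx_id:
        return False
    return trx_id not in view.m_ids


def first_visible(head, view):
    while head is not None and not is_visible(head.trx_id, view):
        head = head.prev
    return head


class Reader:
    """A read-only transaction (hypothetical) that selects the same row repeatedly."""

    def __init__(self, repeatable_read: bool):
        self.repeatable_read = repeatable_read
        self.view = None

    def select(self, head, active_ids, next_trx_id):
        # Read Committed: build a new ReadView on every read.
        # Repeatable Read: keep the ReadView from the first read.
        if self.view is None or not self.repeatable_read:
            m_ids = sorted(active_ids)
            self.view = ReadView(m_ids, m_ids[0] if m_ids else next_trx_id, next_trx_id)
        return first_visible(head, self.view).name


diaochan = Version("Diaochan", 10)
at_5 = Version("Xi Shi", 100, Version("Wang Zhaojun", 100, diaochan))
at_8 = Version("Yang Yuhuan", 200, at_5)  # transaction 100 has committed by now

reads = {}
for label, reader in (("RC", Reader(False)), ("RR", Reader(True))):
    reads[label] = [
        reader.select(at_5, active_ids=[100, 200], next_trx_id=201),  # time point 5
        reader.select(at_8, active_ids=[200], next_trx_id=201),       # time point 8
    ]
print(reads)  # {'RC': ['Diaochan', 'Xi Shi'], 'RR': ['Diaochan', 'Diaochan']}
```

Under Repeatable Read the view from time point 5 still lists transaction 100 as active, so its versions stay hidden at time point 8 even though it has committed.
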

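MVCC stands for multi-version concurrency control. It lets a transaction read a specific historical version of a record, relying on the undo log to make sure the value read matches the transaction's isolation level, and so resolves read-write conflicts without locking.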

References

"How MySQL Works: Understanding MySQL from the Root"
[1] https://blog.csdn.net/qq_35190492/article/details/106915564

Origin blog.csdn.net/zzti_erlie/article/details/123743604