MVCC (Multi-Version Concurrency Control)

Read and write problems caused by concurrency

MySQL uses a client/server architecture: one server accepts connections from many clients, so several threads may read or write the same resource (the same record) at the same time.

Dirty write: a transaction modifies data that another transaction has changed but not yet committed.

Dirty read: a transaction reads data that another transaction has changed but not yet committed.

Non-repeatable read: transaction A queries records by some condition, transaction B then modifies some of those records, and when transaction A runs the same query again it finds that the values differ from the first result. (The emphasis is on modification.)

Phantom read: transaction A queries records by some condition, transaction B then inserts or deletes rows, and when transaction A runs the same query again it gets more or fewer rows than before. (The emphasis is on insertion and deletion.)

What transaction isolation levels solve

Full isolation would avoid all of these concurrent read and write problems, but it is expensive, so database designers give up some isolation in exchange for performance. There are four isolation levels, from lowest to highest:

read uncommitted, read committed, repeatable read, serializable

The lower the isolation level, the more of the problems above can occur.

Serializable has the highest performance cost, because it guarantees that threads can only modify the same resource serially.
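The relationship between levels and problems can be written down as a small lookup table. The sketch below follows the SQL standard's definitions, not the lost figure from the original post; the names `ALLOWED_ANOMALIES` and `can_occur` are made up for illustration. In practice InnoDB's REPEATABLE READ also avoids most phantom reads through MVCC and next-key locks, so treat the table as the standard's minimum guarantee.

```python
# Anomalies each isolation level still permits, per the SQL standard.
# Dirty writes are not listed because every level prevents them.
ALLOWED_ANOMALIES = {
    "READ UNCOMMITTED": {"dirty read", "non-repeatable read", "phantom read"},
    "READ COMMITTED": {"non-repeatable read", "phantom read"},
    "REPEATABLE READ": {"phantom read"},
    "SERIALIZABLE": set(),
}


def can_occur(level: str, anomaly: str) -> bool:
    """Return True if the standard allows `anomaly` at isolation `level`."""
    return anomaly in ALLOWED_ANOMALIES[level]


for level, anomalies in ALLOWED_ANOMALIES.items():
    print(f"{level:<17} {', '.join(sorted(anomalies)) or 'none'}")
```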

Set transaction isolation level

SET [GLOBAL|SESSION] TRANSACTION ISOLATION LEVEL level;

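level can be one of four values:

  • READ UNCOMMITTED (reads the latest data, even if it is uncommitted)
  • READ COMMITTED (reads only committed data; implemented with MVCC)
  • REPEATABLE READ (repeated reads in one transaction return the same data; implemented with MVCC)
  • SERIALIZABLE (transactions run as if serially; implemented directly with locks)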

The origin of MVCC

Why does MySQL need MVCC

Concurrency causes the data access problems described above. Solving them must not cost too much efficiency, so locking every access is not an option. The RC (read committed) and RR (repeatable read) levels are therefore implemented with the MVCC mechanism.

Before discussing MVCC, you need to understand two concepts: current read and snapshot read.

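  • Current read: reads the latest data and may affect that data. Statements such as select xxx for update, insert, delete, and update all trigger a current read. When a current read is used under repeatable read, the MVCC mechanism no longer applies, and row locks (next-key lock = record lock + gap lock) are needed to guarantee isolation.
  • Snapshot read: uses the MVCC mechanism to read one particular version of the data, so the data stays consistent throughout the transaction, but it is not guaranteed to be the latest. An ordinary select uses a snapshot read.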

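Version chain

For the InnoDB storage engine, every record in the clustered index has two hidden columns:

  • trx_id: each time a transaction changes a clustered index record, the transaction id of that transaction is assigned to the trx_id hidden column.
  • roll_pointer: the rollback pointer, which points to the old version of the record (in the undo log).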

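Suppose two transactions with transaction ids 100 and 200 now start updating this record.

Each update of the record adds one more undo log entry for it. As the number of modifications grows, the undo log entries are linked by their roll_pointer attributes into a singly linked list, which we call the version chain. The head node of the version chain is the latest value of the current record.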
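A version chain can be modeled as an ordinary singly linked list. The following is a minimal sketch, not InnoDB's actual storage layout: the class `RecordVersion`, the helpers `update` and `walk`, the values `v0` to `v3`, and the initial transaction id 80 are all assumptions made for illustration. Only the transaction ids 100 and 200 come from the text.

```python
from dataclasses import dataclass
from typing import Iterator, Optional


@dataclass
class RecordVersion:
    value: str
    trx_id: int                                      # transaction that wrote this version
    roll_pointer: Optional["RecordVersion"] = None   # link to the previous version


def update(head: RecordVersion, new_value: str, trx_id: int) -> RecordVersion:
    """Create a new head version; the old head stays reachable via roll_pointer."""
    return RecordVersion(new_value, trx_id, roll_pointer=head)


def walk(head: Optional[RecordVersion]) -> Iterator[RecordVersion]:
    """Yield versions from newest to oldest by following roll_pointer."""
    node = head
    while node is not None:
        yield node
        node = node.roll_pointer


head = RecordVersion("v0", trx_id=80)   # hypothetical initial insert
head = update(head, "v1", trx_id=100)
head = update(head, "v2", trx_id=100)
head = update(head, "v3", trx_id=200)

print([(v.value, v.trx_id) for v in walk(head)])
```

The list is printed newest first, which matches the statement that the head of the chain holds the latest value.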

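readView (read view)

  • For [read uncommitted], it is enough to read the latest record, so the version chain is not needed at all.
  • For [serializable], the isolation level is the highest: concurrency is turned into serial execution, so there are no concurrency problems, only lower efficiency. The version chain is not needed here either.
  • For [read committed] and [repeatable read], a transaction must never read data that other transactions have not committed. It therefore has to decide which version on the version chain is visible to the current transaction, and that is what the readView is for.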

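readView structure

  • m_ids: the list of ids of the transactions that are still uncommitted (active) when the readView is generated.
  • min_trx_id: the smallest transaction id among the uncommitted transactions when the readView is generated.
  • max_trx_id: the id that would be assigned to the next transaction when the readView is generated. Note that this is not the largest value in m_ids.
  • creator_trx_id: the transaction id of the transaction that generated this readView.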
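The four fields can be sketched as a small immutable structure. This is only an illustration of the description above: `create_read_view` and its parameters are hypothetical names, and the choice of setting min_trx_id to the next transaction id when no transaction is active is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable


@dataclass(frozen=True)
class ReadView:
    m_ids: FrozenSet[int]   # ids of transactions active when the view was created
    min_trx_id: int         # smallest id in m_ids
    max_trx_id: int         # id the next new transaction would receive
    creator_trx_id: int     # id of the transaction that created the view


def create_read_view(active_trx_ids: Iterable[int],
                     next_trx_id: int,
                     creator_trx_id: int) -> ReadView:
    """Take a snapshot of the transaction state (hypothetical helper)."""
    m_ids = frozenset(active_trx_ids)
    # Assumption: with no active transactions, every id below next_trx_id
    # belongs to a committed transaction, so min_trx_id equals next_trx_id.
    min_trx_id = min(m_ids) if m_ids else next_trx_id
    return ReadView(m_ids, min_trx_id, next_trx_id, creator_trx_id)


# Transactions 100 and 200 are active; a read-only transaction (id 0) queries.
view = create_read_view({100, 200}, next_trx_id=201, creator_trx_id=0)
print(view)
```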

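Note that a transaction is assigned a transaction id only when it changes records in a table (that is, when it executes an INSERT, DELETE, or UPDATE statement). In a read-only transaction the transaction id defaults to 0.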

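Using the readView to decide which versions on the version chain are visible to a transaction

  1. If the trx_id of the version equals the creator_trx_id of the readView, the current transaction is accessing a record it modified itself, so the version is visible.
  2. If the trx_id of the version is less than the min_trx_id of the readView, the transaction that generated the version had already committed before the readView was created, so the version is visible.
  3. If the trx_id of the version is greater than or equal to the max_trx_id of the readView, the transaction that generated the version started after the readView was created, so the version is not visible.
  4. If the trx_id of the version is between the min_trx_id and the max_trx_id of the readView, check whether the trx_id is in the m_ids list. If it is, the transaction that generated the version was still active when the readView was created, and the version is not visible. If it is not, that transaction had already committed when the readView was created, and the version is visible.

If a version is not visible, follow its roll_pointer to the next older version on the chain and apply the same rules again.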
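The four rules translate almost directly into code. The sketch below is an illustration under assumptions, not InnoDB's implementation: `is_visible` and `first_visible` are hypothetical names, the version chain is simplified to a newest-first list of `(value, trx_id)` pairs, and a trx_id equal to max_trx_id is treated as not visible because that id had not yet been assigned when the view was created.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional, Tuple


@dataclass(frozen=True)
class ReadView:
    m_ids: FrozenSet[int]
    min_trx_id: int
    max_trx_id: int
    creator_trx_id: int


def is_visible(trx_id: int, view: ReadView) -> bool:
    """Apply the four visibility rules to one version's trx_id."""
    if trx_id == view.creator_trx_id:   # rule 1: our own change
        return True
    if trx_id < view.min_trx_id:        # rule 2: committed before the view
        return True
    if trx_id >= view.max_trx_id:       # rule 3: started after the view
        return False
    return trx_id not in view.m_ids     # rule 4: visible only if already committed


def first_visible(chain: List[Tuple[str, int]], view: ReadView) -> Optional[str]:
    """Walk the chain from newest to oldest and return the first visible value."""
    for value, trx_id in chain:
        if is_visible(trx_id, view):
            return value
    return None


# Newest first; values and the id 80 are made up for the example.
chain = [("v3", 200), ("v2", 100), ("v1", 100), ("v0", 80)]
both_active = ReadView(frozenset({100, 200}), 100, 201, 0)
only_200_active = ReadView(frozenset({200}), 200, 201, 0)

print(first_visible(chain, both_active))      # v0
print(first_visible(chain, only_200_active))  # v2
```

With both writers still active, the reader falls back to the oldest version; once transaction 100 has committed, its last write `v2` becomes the visible one.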

When the readView is generated under read committed and repeatable read

  • [read committed]: a new readView is generated for every query within the transaction.
  • [repeatable read]: a readView is generated only by the first query in the transaction and is reused by all later queries.
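The effect of this timing difference can be simulated in a few lines. This is a toy model under assumptions: `make_view`, `read`, and `run` are hypothetical helpers, the record has only two versions, and the reader is a read-only transaction with id 0. Transaction 100 commits between the reader's two queries.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class ReadView:
    m_ids: FrozenSet[int]
    min_trx_id: int
    max_trx_id: int
    creator_trx_id: int = 0


def make_view(active: Iterable[int], next_trx_id: int) -> ReadView:
    m_ids = frozenset(active)
    return ReadView(m_ids, min(m_ids) if m_ids else next_trx_id, next_trx_id)


def is_visible(trx_id: int, view: ReadView) -> bool:
    if trx_id == view.creator_trx_id or trx_id < view.min_trx_id:
        return True
    if trx_id >= view.max_trx_id:
        return False
    return trx_id not in view.m_ids


def read(chain: List[Tuple[str, int]], view: ReadView) -> Optional[str]:
    """Return the newest value in the chain that the view may see."""
    return next((value for value, trx_id in chain if is_visible(trx_id, view)), None)


def run(level: str) -> List[Optional[str]]:
    chain = [("v1", 100), ("v0", 80)]   # trx 100 wrote v1 and has not committed yet
    next_trx_id = 101
    results = []
    view = None
    # First query sees trx 100 as active; it commits before the second query.
    for active in ({100}, set()):
        if level == "READ COMMITTED" or view is None:
            view = make_view(active, next_trx_id)   # fresh view
        results.append(read(chain, view))
    return results


print(run("READ COMMITTED"))   # ['v0', 'v1']
print(run("REPEATABLE READ"))  # ['v0', 'v0']
```

Under read committed the second query builds a fresh view and sees the newly committed `v1`, which is exactly a non-repeatable read. Under repeatable read the first view is reused, so both queries return `v0`.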

To sum up: for each version on the version chain, check whether the transaction that produced it had already committed when the readView was created. If it had, the version is visible. If that transaction was still active or started later, the version is not visible, unless it was written by the current transaction itself.

Origin blog.csdn.net/ZHUXIUQINGIT/article/details/129172751