MySQL basics (35) multi-version concurrency control

1. What is MVCC

MVCC (Multiversion Concurrency Control) is, as the name suggests, a technique that implements concurrency control in a database by managing multiple versions of each data row. It is what makes consistent reads possible under InnoDB's transaction isolation levels. In other words, a query can read rows that another transaction is currently updating and see the values from before the update, so the query does not have to wait for the other transaction to release its lock.

2. Snapshot reading and current reading

InnoDB implements MVCC mainly to improve the concurrency performance of the database. It gives a better way to handle read-write conflicts, so that reads can proceed concurrently without locking and without blocking even when a read and a write conflict. The read in question is a snapshot read, not a current read. A current read is a locking operation and an implementation of pessimistic locking, whereas MVCC in essence follows the idea of optimistic locking.

2.1 Snapshot reading

Snapshot read, also called consistent read, reads snapshot data. A plain SELECT without a locking clause is a snapshot read, that is, a non-blocking read that takes no locks; for example:

SELECT * FROM player WHERE ...

Snapshot reads exist to improve concurrency performance. They are implemented on top of MVCC, which in many cases avoids locking and reduces overhead.

Because it is based on multiple versions, a snapshot read does not necessarily return the latest version of the data; it may return an earlier historical version.

Snapshot reads require that the isolation level is not SERIALIZABLE. Under SERIALIZABLE, a snapshot read degenerates into a current read.

2.2 Current reading

A current read reads the latest version of the record (the latest data, not a historical version). It must also ensure that other concurrent transactions cannot modify the record while it is being read, so the record is locked. A locking SELECT, or any insert, delete, or update of data, performs a current read. For example:

SELECT * FROM student LOCK IN SHARE MODE; # shared lock
SELECT * FROM student FOR UPDATE; # exclusive lock
INSERT INTO student values ... # exclusive lock
DELETE FROM student WHERE ... # exclusive lock
UPDATE student SET ... # exclusive lock

3. Review

3.1 Let’s talk about isolation levels again

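We know that transactions have four isolation levels, and that three concurrency problems may occur: dirty reads, non-repeatable reads, and phantom reads.
[Figure: isolation levels and the concurrency problems that can occur at each]
[Figure: another illustration of isolation levels and concurrency problems]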

3.2 Hidden fields and Undo Log version chain

Looking back at the version chain of the undo log: for tables that use the InnoDB storage engine, every clustered index record contains two required hidden columns.

  • trx_id: every time a transaction changes a clustered index record, the ID of that transaction is assigned to the trx_id hidden column.
  • roll_pointer: every time a clustered index record is modified, the old version is written to the undo log. This hidden column acts as a pointer through which the record's contents before the modification can be found.

An insert undo log is only needed when the transaction is rolled back. Once the transaction commits, this type of undo log is no longer useful, and the Undo Log Segment it occupies is reclaimed by the system (that is, the linked list of undo pages occupied by the undo log is either reused or released).

Assume that two transactions, with transaction IDs 10 and 20, each perform UPDATE operations on this record. The operation process is as follows:
[Figure: UPDATE operations performed on the record by transactions 10 and 20]
Every time the record is modified, an undo log entry is written, and each undo log entry also has a roll_pointer attribute (the undo log entry for an INSERT operation has no such attribute, because the record has no earlier version). These undo log entries can be linked together into a list:
[Figure: undo log entries linked into a list through roll_pointer]
Each time the record is updated, the old value is placed in an undo log entry, which counts as an old version of the record. As the number of updates grows, all versions are linked by their roll_pointer attributes into a linked list, which we call the version chain. The head node of the version chain is the latest value of the current record.

Each version also contains the transaction ID of the transaction that generated it.
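
The version chain can be pictured with a small Python sketch. The names RowVersion, update, and chain are hypothetical, real InnoDB keeps old versions as undo log records rather than objects, and the ID 8 for the transaction that inserted the row is an assumption made only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RowVersion:
    """One version of a record; the fields mirror the two hidden columns."""
    name: str
    trx_id: int                            # transaction that wrote this version
    roll_pointer: Optional["RowVersion"]   # previous version, None for the insert


def update(head: RowVersion, name: str, trx_id: int) -> RowVersion:
    """Return a new head whose roll_pointer points at the old head."""
    return RowVersion(name=name, trx_id=trx_id, roll_pointer=head)


def chain(head: Optional[RowVersion]) -> list:
    """Walk the chain from newest to oldest as (name, trx_id) pairs."""
    out = []
    while head is not None:
        out.append((head.name, head.trx_id))
        head = head.roll_pointer
    return out


# Assumed history: trx 8 inserted the row, then trx 10 updated it twice.
head = RowVersion("张三", 8, None)
head = update(head, "李四", 10)
head = update(head, "王五", 10)
print(chain(head))
```

The head of the chain is always the newest value, and following roll_pointer repeatedly leads back to the originally inserted version.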

4. MVCC implementation principle: ReadView

The implementation of MVCC depends on three things: hidden fields, the Undo Log, and the Read View.

4.1 What is ReadView

4.2 Design ideas

A transaction using the READ UNCOMMITTED isolation level can read records modified by uncommitted transactions, so it can simply read the latest version of the record directly.

For a transaction using the SERIALIZABLE isolation level, InnoDB requires locking to access records.

Transactions using the READ COMMITTED and REPEATABLE READ isolation levels must only read records modified by transactions that have already committed. If another transaction has modified a record but has not yet committed, the latest version of the record cannot be read directly. The core problem is to determine which version in the version chain is visible to the current transaction, and this is the main problem that ReadView solves.

A ReadView mainly contains four important fields:

  1. creator_trx_id: the ID of the transaction that created this ReadView.

    Note: A transaction is assigned a transaction ID only when it changes records in a table (when it executes INSERT, DELETE, or UPDATE statements). Otherwise, the transaction ID of a read-only transaction defaults to 0.

  2. trx_ids: the list of transaction IDs of the read-write transactions that were active in the system when the ReadView was generated.

  3. up_limit_id: the smallest transaction ID among the active transactions.

  4. low_limit_id: the ID that would be assigned to the next transaction in the system at the time the ReadView is generated, that is, the largest transaction ID allocated so far plus one. Note that this refers to transaction IDs allocated in the system as a whole, not to the IDs of the active transactions.

    Note: low_limit_id is not the maximum value in trx_ids; transaction IDs are allocated incrementally. For example, suppose there are three transactions with IDs 1, 2, and 3, and the transaction with ID 3 then commits. When a new read transaction generates a ReadView, trx_ids contains 1 and 2, up_limit_id is 1, and low_limit_id is 4.
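
As a rough illustration, the four fields can be built from the set of active transactions and the next ID to be allocated. This is a sketch, not InnoDB's actual code: make_read_view and its parameters are hypothetical names, and the handling of an empty active list is an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ReadView:
    creator_trx_id: int
    trx_ids: List[int]
    up_limit_id: int
    low_limit_id: int


def make_read_view(creator_trx_id: int, active_trx_ids, next_trx_id: int) -> ReadView:
    """Build a ReadView from the active read-write transactions.

    next_trx_id is the ID the system would hand to the next transaction.
    """
    trx_ids = sorted(active_trx_ids)
    # Assumption: with no active transactions, up_limit_id equals low_limit_id.
    up_limit_id = trx_ids[0] if trx_ids else next_trx_id
    return ReadView(creator_trx_id, trx_ids, up_limit_id, next_trx_id)


# IDs 1, 2, 3 were allocated; 3 has committed, so only 1 and 2 are active.
# The reader is read-only, so its own transaction ID is 0.
view = make_read_view(creator_trx_id=0, active_trx_ids={1, 2}, next_trx_id=4)
print(view)
```

This reproduces the numbers in the note: trx_ids is [1, 2], up_limit_id is 1, and low_limit_id is 4 even though 4 is not an active transaction.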

4.3 ReadView rules

With a ReadView in hand, the following steps decide whether a given version of a record is visible when the record is accessed.

  • If the trx_id of the accessed version equals creator_trx_id in the ReadView, the current transaction is accessing a record it modified itself, so this version is visible to the current transaction.
  • If the trx_id of the accessed version is smaller than up_limit_id in the ReadView, the transaction that generated this version had already committed before the current transaction generated the ReadView, so this version is visible to the current transaction.
  • If the trx_id of the accessed version is greater than or equal to low_limit_id in the ReadView, the transaction that generated this version started after the current transaction generated the ReadView, so this version is not visible to the current transaction.
  • If the trx_id of the accessed version is between up_limit_id (inclusive) and low_limit_id (exclusive) of the ReadView, check whether the trx_id is in the trx_ids list.
    • If it is, the transaction that generated this version was still active when the ReadView was created, and this version is not visible.
    • If it is not, the transaction that generated this version had already committed when the ReadView was created, and this version is visible.
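
The rules above translate almost line by line into a small function. This is a minimal sketch with a hypothetical function name and plain integers for the ReadView fields; it only illustrates the decision order described in this section.

```python
def is_visible(version_trx_id, creator_trx_id, trx_ids, up_limit_id, low_limit_id):
    """Apply the four ReadView rules to one version of a record."""
    if version_trx_id == creator_trx_id:
        return True                     # rule 1: the reader's own change
    if version_trx_id < up_limit_id:
        return True                     # rule 2: committed before the ReadView existed
    if version_trx_id >= low_limit_id:
        return False                    # rule 3: started after the ReadView was created
    # rule 4: in [up_limit_id, low_limit_id); visible only if no longer active
    return version_trx_id not in trx_ids


# ReadView from the note in 4.2: transactions 1 and 2 active, 3 committed.
view = dict(creator_trx_id=0, trx_ids=[1, 2], up_limit_id=1, low_limit_id=4)
for trx_id in (1, 2, 3, 4):
    print(trx_id, is_visible(trx_id, **view))
```

Only the version written by transaction 3 is visible here: it falls inside the range but is absent from trx_ids, which means it had committed by the time the ReadView was created.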

4.4 MVCC overall operation process

With these concepts in place, here is how the system finds a record through MVCC when it is queried:

  1. First, get the version number of the transaction itself, that is, its transaction ID;
  2. Get the ReadView;
  3. Query the data and compare the trx_id of the version found with the transaction IDs in the ReadView;
  4. If the version does not satisfy the ReadView rules, fetch a historical snapshot from the Undo Log;
  5. Finally, return the data that satisfies the rules.
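
The five steps can be sketched as a loop over the version chain. The representation is hypothetical: a chain is a newest-first list of (value, trx_id) pairs, a ReadView is a plain dict, and the ID 8 for the oldest version is an assumption for illustration.

```python
from typing import List, Optional, Tuple

Version = Tuple[str, int]  # (value, trx_id), a hypothetical representation


def is_visible(trx_id: int, view: dict) -> bool:
    if trx_id == view["creator_trx_id"] or trx_id < view["up_limit_id"]:
        return True
    if trx_id >= view["low_limit_id"]:
        return False
    return trx_id not in view["trx_ids"]


def snapshot_read(chain: List[Version], view: dict) -> Optional[str]:
    """Return the newest value in the chain that the ReadView may see."""
    for value, trx_id in chain:        # newest first, as if following roll_pointer
        if is_visible(trx_id, view):
            return value
    return None                        # no visible version: the row does not exist for this reader


chain = [("王五", 10), ("李四", 10), ("张三", 8)]
view = {"creator_trx_id": 0, "trx_ids": [10, 20], "up_limit_id": 10, "low_limit_id": 21}
print(snapshot_read(chain, view))
```

With transactions 10 and 20 still active, the two newest versions are skipped and the read falls back to the version written by the committed transaction.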

When the isolation level is READ COMMITTED, every SELECT query in a transaction obtains a new Read View.

As shown in the table:
[Table: under READ COMMITTED, each SELECT in a transaction obtains its own Read View]

Note that repeating the same query obtains a new Read View each time. If the two Read Views differ, non-repeatable reads or phantom reads may occur.

When the isolation level is REPEATABLE READ, non-repeatable reads are avoided. This is because a transaction obtains a Read View only once, at its first SELECT, and all subsequent SELECTs reuse that Read View, as shown in the following table:
[Table: under REPEATABLE READ, all SELECTs in a transaction share the Read View obtained by the first one]
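
The difference in timing can be sketched with a hypothetical Reader class that counts how many Read Views it creates. The class, its method, and the dict layout are assumptions for illustration only.

```python
class Reader:
    """Hypothetical read-only transaction that tracks its Read Views."""

    def __init__(self, isolation_level):
        self.isolation_level = isolation_level
        self.view = None
        self.views_created = 0

    def view_for_select(self, active_trx_ids, next_trx_id):
        # REPEATABLE READ reuses the view from the first SELECT;
        # READ COMMITTED builds a new one for every SELECT.
        reuse = self.isolation_level == "REPEATABLE READ" and self.view is not None
        if not reuse:
            ids = sorted(active_trx_ids)
            self.view = {
                "creator_trx_id": 0,  # read-only reader, so its ID is 0
                "trx_ids": ids,
                "up_limit_id": ids[0] if ids else next_trx_id,
                "low_limit_id": next_trx_id,
            }
            self.views_created += 1
        return self.view


rc = Reader("READ COMMITTED")
rr = Reader("REPEATABLE READ")
for reader in (rc, rr):
    reader.view_for_select({10, 20}, 21)   # first SELECT: 10 and 20 are active
    reader.view_for_select({20}, 21)       # second SELECT: 10 has committed

print(rc.views_created, rc.view["trx_ids"])
print(rr.views_created, rr.view["trx_ids"])
```

The READ COMMITTED reader ends up with a view that no longer lists transaction 10, so it can see what transaction 10 committed; the REPEATABLE READ reader still holds the view from its first SELECT.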

5. Examples

5.1 READ COMMITTED isolation level

READ COMMITTED: a ReadView is generated before every read of data.
Suppose two transactions, with transaction IDs 10 and 20, are currently executing:

# Transaction 10
BEGIN;
UPDATE student SET name="李四" WHERE id=1;
UPDATE student SET name="王五" WHERE id=1;
# Transaction 20
BEGIN;
# updated some records in other tables
...

At this moment, the version chain of the record with id 1 in the student table is as follows:
[Figure: version chain of the record with id 1 after the updates by transaction 10]
Assume that a transaction using the READ COMMITTED isolation level starts to execute:

# Transaction using the READ COMMITTED isolation level
BEGIN;
# SELECT1: Transactions 10 and 20 have not committed
SELECT * FROM student WHERE id = 1; # the value of column name is '张三'

After that, we commit the transaction whose transaction ID is 10:

# Transaction 10
BEGIN;
UPDATE student SET name="李四" WHERE id=1;
UPDATE student SET name="王五" WHERE id=1;
COMMIT;

Then, in the transaction whose transaction ID is 20, update the record with id 1 in the student table:

# Transaction 20
BEGIN;
# updated some records in other tables
...
UPDATE student SET name="钱七" WHERE id=1;
UPDATE student SET name="宋八" WHERE id=1;

At this moment, the version chain of the record with id 1 in the student table looks like this:
[Figure: version chain of the record with id 1 after the updates by transaction 20]
Then continue to look up the record with id 1 in the transaction that uses the READ COMMITTED isolation level, as follows:

# Transaction using the READ COMMITTED isolation level
BEGIN;
# SELECT1: neither Transaction 10 nor Transaction 20 has committed
SELECT * FROM student WHERE id = 1; # the value of column name is '张三'
# SELECT2: Transaction 10 has committed, Transaction 20 has not
SELECT * FROM student WHERE id = 1; # the value of column name is '王五'

5.2 REPEATABLE READ isolation level

A transaction using the REPEATABLE READ isolation level generates a ReadView only when it executes a query statement for the first time; subsequent queries do not generate a new one.

For example, suppose two transactions, with transaction IDs 10 and 20, are executing in the system:

# Transaction 10
BEGIN;
UPDATE student SET name="李四" WHERE id=1;
UPDATE student SET name="王五" WHERE id=1;
# Transaction 20
BEGIN;
# updated some records in other tables
...

At this moment, the version chain of the record with id 1 in the student table is as follows:
[Figure: version chain of the record with id 1 after the updates by transaction 10]
Assume that a transaction using the REPEATABLE READ isolation level starts to execute:

# Transaction using the REPEATABLE READ isolation level
BEGIN;
# SELECT1: Transactions 10 and 20 have not committed
SELECT * FROM student WHERE id = 1; # the value of column name is '张三'

After that, we commit the transaction whose transaction ID is 10, like this:

# Transaction 10
BEGIN;
UPDATE student SET name="李四" WHERE id=1;
UPDATE student SET name="王五" WHERE id=1;
COMMIT;

Then, in the transaction whose transaction ID is 20, update the record with id 1 in the student table:

# Transaction 20
BEGIN;
# updated some records in other tables
...
UPDATE student SET name="钱七" WHERE id=1;
UPDATE student SET name="宋八" WHERE id=1;

At this moment, the version chain of the record with id 1 in the student table is as follows:
[Figure: version chain of the record with id 1 after the updates by transaction 20]
Then continue to look up the record with id 1 in the transaction that uses the REPEATABLE READ isolation level, as follows:

# Transaction using the REPEATABLE READ isolation level
BEGIN;
# SELECT1: neither Transaction 10 nor Transaction 20 has committed
SELECT * FROM student WHERE id = 1; # the value of column name is '张三'
# SELECT2: Transaction 10 has committed, Transaction 20 has not
SELECT * FROM student WHERE id = 1; # the value of column name is still '张三'
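
The walkthroughs in 5.1 and 5.2 can be replayed in a few lines of Python. This is a sketch under stated assumptions: the version chains are written out by hand as newest-first lists, the reader is read-only (creator_trx_id 0), low_limit_id is taken to be 21, and the original '张三' version is assumed to come from a committed transaction with the hypothetical ID 8.

```python
from typing import List, NamedTuple, Optional, Tuple


class ReadView(NamedTuple):
    creator_trx_id: int
    trx_ids: List[int]
    up_limit_id: int
    low_limit_id: int


def snapshot_read(chain: List[Tuple[str, int]], view: ReadView) -> Optional[str]:
    """Return the newest name in the chain that is visible to the view."""
    for name, trx_id in chain:
        if trx_id == view.creator_trx_id or trx_id < view.up_limit_id:
            return name
        if view.up_limit_id <= trx_id < view.low_limit_id and trx_id not in view.trx_ids:
            return name
    return None


# Chain at SELECT1 (only transaction 10 has written) and at SELECT2.
chain_before = [("王五", 10), ("李四", 10), ("张三", 8)]
chain_after = [("宋八", 20), ("钱七", 20)] + chain_before

view_select1 = ReadView(0, [10, 20], 10, 21)   # 10 and 20 both active
view_select2_rc = ReadView(0, [20], 20, 21)    # fresh view after 10 committed

select1 = snapshot_read(chain_before, view_select1)
rc_select2 = snapshot_read(chain_after, view_select2_rc)   # READ COMMITTED
rr_select2 = snapshot_read(chain_after, view_select1)      # REPEATABLE READ reuses view 1
print(select1, rc_select2, rr_select2)
```

The same chain gives different answers only because the second SELECT uses a different ReadView under READ COMMITTED and the original one under REPEATABLE READ.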

5.3 How to solve phantom reading

Next, we explain how InnoDB solves phantom reads.

Assume the student table contains only one row, with primary key id=1 and hidden column trx_id=10. Its undo log is as shown in the figure below.
[Figure: undo log of the single row in the student table]
Assume that transaction A and transaction B execute concurrently. Transaction A has transaction ID 20, and transaction B has transaction ID 30.

Step 1: Transaction A starts querying data for the first time, and the query SQL statement is as follows.

select * from student where id >= 1;

Before the query starts, MySQL generates a ReadView for transaction A. The content of the ReadView at this point is: trx_ids=[20,30], up_limit_id=20, low_limit_id=31, creator_trx_id=20.

The student table contains only one row at this point, and it satisfies the condition where id>=1, so it is found. According to the ReadView mechanism, the row's trx_id=10 is smaller than up_limit_id in transaction A's ReadView. This means the row was committed by another transaction before transaction A started, so transaction A can read it.

Conclusion: The first query of transaction A reads one row, with id=1.

Step 2: Next, transaction B (trx_id=30) inserts two new rows into the student table and commits.

insert into student(id,name) values(2,'李四');
insert into student(id,name) values(3,'王五');

At this point there are three rows in the student table, and the corresponding undo logs are as shown in the figure below:
[Figure: undo logs of the three rows in the student table]
Step 3: Transaction A then runs its second query. Under the rules of the REPEATABLE READ isolation level, transaction A does not generate a new ReadView. All three rows in the student table satisfy the condition where id>=1, so they are all found first. The ReadView mechanism then decides, row by row, whether transaction A can see each one.

1) First, the row with id=1 can be seen by transaction A, as explained above.

2) Next is the row with id=2, whose trx_id=30. Transaction A finds that this value lies between up_limit_id and low_limit_id, so it must check whether 30 is in the trx_ids array. Transaction A's trx_ids=[20,30], so 30 is in the array. This means the row with id=2 was written by a transaction that was still active when transaction A created its ReadView, so transaction A cannot see this row.

3) Similarly, the trx_id of the row with id=3 is also 30, so transaction A cannot see it either.

Conclusion: The second query of transaction A returns only the row with id=1. This is the same as the result of its first query, so no phantom read occurs. Therefore, under MySQL's REPEATABLE READ isolation level, snapshot reads do not suffer from phantom reads.
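
The three steps can be checked with a short Python sketch. It simplifies each row to its id and the trx_id of its only version (the inserted rows have no older versions), and the fresh_view at the end is a hypothetical contrast showing what a newly generated ReadView would see; none of the names come from InnoDB itself.

```python
def is_visible(trx_id, view):
    if trx_id == view["creator_trx_id"] or trx_id < view["up_limit_id"]:
        return True
    if trx_id >= view["low_limit_id"]:
        return False
    return trx_id not in view["trx_ids"]


def select_ids(rows, view):
    """Emulate: select * from student where id >= 1 (returning ids only)."""
    return [r["id"] for r in rows if r["id"] >= 1 and is_visible(r["trx_id"], view)]


# Table contents after transaction B (trx_id=30) has inserted and committed.
rows = [
    {"id": 1, "trx_id": 10},
    {"id": 2, "trx_id": 30},
    {"id": 3, "trx_id": 30},
]

# ReadView created by transaction A at its first query and reused afterwards.
view_a = {"creator_trx_id": 20, "trx_ids": [20, 30], "up_limit_id": 20, "low_limit_id": 31}
# Hypothetical contrast: a view generated after B committed.
fresh_view = {"creator_trx_id": 20, "trx_ids": [20], "up_limit_id": 20, "low_limit_id": 31}

print(select_ids(rows, view_a))
print(select_ids(rows, fresh_view))
```

Reusing view_a hides the rows inserted by transaction 30, while a freshly generated view would return all three rows, which is exactly the phantom that reusing the ReadView prevents.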

6. Summary

This article has described how, with MVCC, transactions at the READ COMMITTED and REPEATABLE READ isolation levels access a record's version chain when performing snapshot reads. This lets read-write and write-read operations of different transactions execute concurrently, which improves system performance.

The core point is the principle of ReadView. A major difference between READ COMMITTED and REPEATABLE READ is the timing of ReadView generation:

  • READ COMMITTED generates a ReadView before every ordinary SELECT operation.
  • REPEATABLE READ generates a ReadView only before the first ordinary SELECT operation and reuses that ReadView for subsequent queries.


Origin blog.csdn.net/zhufei463738313/article/details/130706259