See through MySQL (9): MVCC multi-version concurrency control

1. MVCC

MVCC stands for Multi-Version Concurrency Control. It is a concurrency control method generally used in database management systems to provide concurrent access to the database, and in programming languages to implement transactional memory.

MVCC is implemented in MySQL InnoDB mainly to improve the database's concurrency performance and to handle read-write conflicts in a better way, so that even when reads and writes conflict, reads can proceed concurrently without locking and without blocking.

2. Current read

Operations such as select ... lock in share mode (shared lock), select ... for update, update, insert, and delete (exclusive lock) are all current reads. They are called current reads because they read the latest version of the record. While reading, they lock the record so that other concurrent transactions cannot modify it.

3. Snapshot read

A select without locking is a snapshot read, that is, a non-blocking read that takes no locks. The precondition is that the isolation level is not serializable; at the serializable level a snapshot read degenerates into a current read. Snapshot reads exist to improve concurrency performance, and they are implemented on top of multi-version concurrency control, namely MVCC. MVCC can be regarded as a variant of row locking, but in many cases it avoids locking and so reduces overhead. Because it is based on multiple versions, a snapshot read does not necessarily return the latest version of the data; it may return an earlier historical version.

4. Current read, snapshot read, MVCC relationship

MVCC means maintaining multiple versions of a piece of data so that read and write operations do not conflict. Snapshot read is the non-blocking read feature through which MySQL implements MVCC. Concretely, MVCC in MySQL is built from three components: three implicit fields in each record, the undo log, and the Read View.

5. Problems solved by MVCC

There are three database concurrency scenarios:

1. Read-read: no problem arises, and no concurrency control is required.

2. Read-write: there are thread safety issues that can break transaction isolation, leading to dirty reads, phantom reads, and non-repeatable reads.

3. Write-write: there are thread safety issues, and updates may be lost.

MVCC is a lock-free concurrency control mechanism for resolving read-write conflicts. Each transaction is assigned a monotonically increasing timestamp, every modification saves a version associated with the transaction timestamp, and a read operation reads only the snapshot of the database as it was before the transaction started. MVCC therefore solves the following problems for the database:

1. Under concurrent reads and writes, a read does not have to block a write and a write does not have to block a read, which improves the database's concurrent read-write performance.

2. It solves transaction isolation problems such as dirty reads, phantom reads, and non-repeatable reads, but it cannot solve the lost update problem.
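To see why snapshot reads alone do not prevent lost updates, here is a deliberately tiny sketch in Python. The balance variable and the two transactions T1 and T2 are hypothetical and do not correspond to any MySQL API; the only thing being illustrated is the order of reads and writes.

```python
balance = 100  # committed value of a hypothetical row

# Both transactions take a snapshot read before either one writes.
seen_by_t1 = balance
seen_by_t2 = balance

balance = seen_by_t1 + 10  # T1 writes and commits
balance = seen_by_t2 + 10  # T2 overwrites, based on its stale snapshot

print(balance)  # 110, not 120: T1's update is lost
```

A current read such as select ... for update, described in section 2, locks the record while reading it, which is the usual way to avoid this pattern.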

6. MVCC realization principle

MVCC is implemented mainly with three hidden fields in each record, the undo log, and the Read View.

Hidden fields

Besides the columns we define, each row also contains fields that the database defines implicitly, such as DB_TRX_ID, DB_ROLL_PTR, and DB_ROW_ID.

DB_TRX_ID

6 bytes. The ID of the transaction that most recently modified the record, that is, the transaction that created the record or last changed it.

DB_ROLL_PTR

7 bytes. The rollback pointer. It works together with the undo log and points to the previous version of the record.

DB_ROW_ID

6 bytes. The hidden primary key. If the table has no primary key, InnoDB automatically generates a 6-byte row ID.

The record is shown in the figure:

[Figure: a row record with the hidden fields DB_ROW_ID, DB_TRX_ID, and DB_ROLL_PTR]

In the figure above, DB_ROW_ID is the unique implicit primary key that the database generates by default for the row, DB_TRX_ID is the ID of the transaction that last operated on the record, and DB_ROLL_PTR is the rollback pointer that works with the undo log and points to the previous version.
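As a rough mental model, a row with its hidden fields can be pictured as the following Python structure. This is only a sketch: the user columns name and age and the value zhangsan are hypothetical, and the real on-disk layout in InnoDB is a packed binary format, not an object with a reference.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RowVersion:
    # User-defined columns (hypothetical, for illustration only)
    name: str
    age: int
    # Hidden fields described in the article
    db_row_id: int   # DB_ROW_ID: implicit primary key
    db_trx_id: int   # DB_TRX_ID: id of the transaction that last wrote the row
    db_roll_ptr: Optional["RowVersion"] = None  # DB_ROLL_PTR: previous version


# A freshly inserted row has no previous version to point to.
row = RowVersion(name="zhangsan", age=30, db_row_id=1, db_trx_id=1)
print(row)
```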

undo log

The undo log is the rollback log: it is generated during insert, delete, and update operations so that they can be rolled back.

For an insert, the undo log is needed only if the transaction rolls back, so it can be discarded as soon as the transaction commits.

For an update or delete, the undo log is needed not only for rollback but also for snapshot reads, so it cannot be deleted arbitrarily. Only when no snapshot read or transaction rollback involves the log is it cleared by the purge thread. (When data is updated or deleted, only the deleted_bit of the old record is set; the outdated record is not actually removed. To save disk space, InnoDB has a dedicated purge thread that clears records whose deleted_bit is true. If a record's deleted_bit is true and its DB_TRX_ID is visible to the purge thread's read view, the record can be cleared at some point.)

Now let's look at the record chain that the undo log produces.

1. Suppose transaction 1 inserts a record into the table. The state of the row at this point is:

[Figure: the row after transaction 1 inserts it]

2. Suppose transaction 2 then modifies the record's name, changing it to lisi.

When transaction 2 modifies the row, the database places an exclusive lock on it.

The row is then copied to the undo log as an old record, so the undo log holds a copy of the current row.

After the copy, the row's name is changed to lisi, the hidden transaction ID field is set to 2 (the ID of the current transaction), and the rollback pointer is set to point to the copy in the undo log.

After the transaction commits, the lock is released.

[Figure: the row and its undo log after transaction 2's update]

3. Suppose transaction 3 then changes the record's age to 32.

When transaction 3 modifies the row, the database places an exclusive lock on it.

The row is then copied to the undo log as an old record. Because the row already has an undo log entry, the newest old record becomes the head of the linked list, inserted at the front of the row's undo log.

The row's age is changed to 32, the hidden transaction ID field is set to 3 (the ID of the current transaction), and the rollback pointer is set to point to the copy just written to the undo log.

The transaction commits and the lock is released.

[Figure: the row and its undo log after transaction 3's update]

From the figures above, you can see that modifications to the same record, whether by different transactions or by the same transaction, make the record's undo log form a linear list of versions, that is, a linked list. The head of the chain is the most recent old record, and the tail is the earliest old record.
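The three steps above can be replayed with a small Python sketch. It models only the copy-then-modify behaviour and the resulting linked list; locking and commit are left out, and the initial values zhangsan and 30 are hypothetical because the article's figures are not available.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Version:
    name: str
    age: int
    trx_id: int                            # plays the role of DB_TRX_ID
    roll_ptr: Optional["Version"] = None   # plays the role of DB_ROLL_PTR


def update(current: Version, trx_id: int, **changes) -> Version:
    """Keep the old row as an undo record and return the new current row."""
    old_copy = current  # the copy that goes to the head of the undo chain
    return replace(current, trx_id=trx_id, roll_ptr=old_copy, **changes)


def chain(row: Optional[Version]):
    """Yield versions from newest to oldest by following roll_ptr."""
    while row is not None:
        yield row
        row = row.roll_ptr


row = Version(name="zhangsan", age=30, trx_id=1)  # transaction 1 inserts
row = update(row, trx_id=2, name="lisi")          # transaction 2 renames
row = update(row, trx_id=3, age=32)               # transaction 3 changes age

for v in chain(row):
    print(v.trx_id, v.name, v.age)
```

Walking the chain prints the versions written by transactions 3, 2, and 1 in that order, which matches the description of the chain head being the most recent old record.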

Read View

With the process above understood, the next concept is the Read View.

A Read View is the read view produced when a transaction performs a snapshot read. At the moment of the snapshot read, a snapshot of the current state of the database system is generated, and the IDs of the transactions currently active in the system are recorded and maintained. Transaction IDs are assigned in increasing order.

The main role of the Read View is visibility judgment. When a transaction performs a snapshot read, it creates a Read View and uses it as the condition for deciding which version of the data the transaction can see: it may be the latest data, or it may be one of the versions in the undo log of the row.

The visibility algorithm takes the DB_TRX_ID (the ID of the last modifying transaction) from the latest version of the record being read and compares it with the IDs of the other active transactions in the system. If DB_TRX_ID fails the visibility check against the Read View's attributes, the DB_ROLL_PTR rollback pointer is followed to fetch the DB_TRX_ID of the previous version in the undo log, and the comparison is repeated. The chain is traversed this way until a DB_TRX_ID that meets the conditions is found; the old record carrying that DB_TRX_ID is the newest version the current transaction can see.

​ The visibility rules of Read View are as follows:

First, we need to know three global attributes of the Read View:

trx_list: the list of IDs of the transactions that were active in the system when the Read View was generated

up_limit_id: the smallest transaction ID in trx_list

low_limit_id: the next transaction ID that the system had not yet allocated when the Read View was generated

​ The specific comparison rules are as follows:

1. First check whether DB_TRX_ID < up_limit_id. If so, the current transaction can see the record carrying that DB_TRX_ID. Otherwise, go on to the next check.

2. Next, check whether DB_TRX_ID >= low_limit_id. If so, the record carrying that DB_TRX_ID appeared only after the Read View was generated, so it is definitely not visible to the current transaction. Otherwise, go on to the next check.

3. Check whether DB_TRX_ID is in the list of active transactions. If it is, that transaction was still active and had not committed when the Read View was generated, so its changes are not visible to the current transaction. If it is not, that transaction committed before the Read View was generated, so its changes are visible.
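The three comparison rules can be written down as a short Python function. This is a minimal sketch that follows only the rules as listed here, using the attribute names from the article; the ReadView class, the is_visible function, and the transaction IDs in the usage lines are illustrative and are not InnoDB's actual code.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class ReadView:
    trx_list: FrozenSet[int]  # ids active when the Read View was created
    up_limit_id: int          # smallest id in trx_list
    low_limit_id: int         # next id not yet allocated


def is_visible(db_trx_id: int, view: ReadView) -> bool:
    # Rule 1: older than every active transaction, so already committed.
    if db_trx_id < view.up_limit_id:
        return True
    # Rule 2: the id was allocated after the Read View was created.
    if db_trx_id >= view.low_limit_id:
        return False
    # Rule 3: visible only if the writer was no longer active.
    return db_trx_id not in view.trx_list


# Hypothetical ids: transactions 10 and 12 are active, 15 is the next id.
view = ReadView(trx_list=frozenset({10, 12}), up_limit_id=10, low_limit_id=15)
print(is_visible(8, view), is_visible(11, view),
      is_visible(12, view), is_visible(15, view))
```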

7. The overall processing flow of MVCC

Assume that four transactions are executing at the same time, as shown in the following table:

| Transaction 1     | Transaction 2     | Transaction 3     | Transaction 4          |
|-------------------|-------------------|-------------------|------------------------|
| Transaction start | Transaction start | Transaction start | Transaction start      |
|                   |                   |                   | Modified and committed |
| In progress       | Snapshot read     | In progress       |                        |

From the table above, when transaction 2 performs a snapshot read on a row, the database generates a Read View for it. At that moment transactions 1 and 3 are still active, while transaction 4 committed its update just before transaction 2's snapshot read. The Read View therefore records the currently active transactions 1 and 3 in a list. The value of up_limit_id is 1 and the value of low_limit_id is 5, as shown in the following figure:

[Figure: the Read View of transaction 2, with trx_list = 1, 3, up_limit_id = 1, low_limit_id = 5]

In this example, only transaction 4 modified the row, and it committed before transaction 2 performed the snapshot read, so the undo log of the row's current data is as follows:

[Figure: the row's current version and its undo log after transaction 4's update]

When transaction 2 performs the snapshot read on the row, it compares the row's DB_TRX_ID with up_limit_id, low_limit_id, and the active transaction list to determine which version of the row transaction 2 can see.

The process is as follows. First, the row's transaction ID (4) is compared with the Read View's up_limit_id (1): 4 is not less than 1, so the first condition is not met. Next, 4 is compared with low_limit_id (5): 4 is not greater than or equal to 5, so the second condition is not met either. Finally, trx_list is checked for transaction 4: it is not in the list, so the visibility condition is met, and the latest result committed by transaction 4 is visible to transaction 2's snapshot read. The version that transaction 2 reads is therefore the one committed by transaction 4, which is also the latest version from a global perspective, as shown below:

[Figure: transaction 2 reads the version committed by transaction 4]
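The walk-through above can be checked with a few lines of Python. The check function is a hypothetical helper written for this example; it applies the three rules in order and also reports which rule made the decision. The numbers are the ones from the example.

```python
def check(db_trx_id, trx_list, up_limit_id, low_limit_id):
    """Apply the three rules in order and report which one decided."""
    if db_trx_id < up_limit_id:
        return True, "rule 1: older than every active transaction"
    if db_trx_id >= low_limit_id:
        return False, "rule 2: started after the Read View was created"
    if db_trx_id in trx_list:
        return False, "rule 3: still active when the Read View was created"
    return True, "rule 3: committed before the Read View was created"


# Values from the example: transactions 1 and 3 are active, 4 has committed.
trx_list = [1, 3]
up_limit_id = min(trx_list)  # 1
low_limit_id = 5             # next id not yet allocated

visible, reason = check(4, trx_list, up_limit_id, low_limit_id)
print(visible, reason)
```

With these values the first two rules do not apply and the third one decides that the version written by transaction 4 is visible, in line with the text.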

Once the above is understood, the relationship between these core concepts should be clear. Next, let's look at how snapshot reads differ across isolation levels.

8. How InnoDB snapshot reads differ between the RC and RR levels

Because the Read View is generated at different times, snapshot reads give different results under the RC (read committed) and RR (repeatable read) levels.

1. At the RR level, the first snapshot read in a transaction creates a snapshot, the Read View, which records the other transactions active in the system at that moment. Later snapshot reads in the transaction reuse the same Read View. So as long as the current transaction performed a snapshot read before another transaction committed its update, subsequent snapshot reads use that same Read View and the later changes are not visible.

2. At the RR level, when a snapshot read generates the Read View, the view records all other transactions active at that moment. Changes made by those transactions are invisible to the current transaction, while changes made by transactions that committed before the Read View was created are visible.

3. At the RC level, every snapshot read in a transaction generates a new snapshot and Read View. This is why a transaction at the RC level can see updates committed by other transactions.

Summary: at the RC isolation level, every snapshot read generates and uses a fresh Read View; at the RR isolation level, only the first snapshot read in a transaction creates a Read View, and subsequent snapshot reads reuse it.
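The difference in Read View timing can be simulated with a toy single-row store in Python. Everything here (the Db and Reader classes, the values v0 and v1) is hypothetical and heavily simplified; it is a sketch of the idea under those assumptions, not a model of InnoDB internals.

```python
class Db:
    """A toy single-row store; not InnoDB's real data structures."""

    def __init__(self):
        self.next_trx_id = 1
        self.active = set()
        self.versions = [(0, "v0")]  # (trx_id, value), newest first

    def begin(self):
        trx_id = self.next_trx_id
        self.next_trx_id += 1
        self.active.add(trx_id)
        return trx_id

    def write(self, trx_id, value):
        self.versions.insert(0, (trx_id, value))

    def commit(self, trx_id):
        self.active.discard(trx_id)

    def read_view(self, creator):
        # (ids active right now other than the creator, next unallocated id)
        return frozenset(self.active - {creator}), self.next_trx_id

    def snapshot_read(self, view):
        trx_list, low_limit_id = view
        for trx_id, value in self.versions:
            # Same outcome as the three rules: ids below up_limit_id are
            # never in trx_list, so rule 1 is folded into this test.
            if trx_id < low_limit_id and trx_id not in trx_list:
                return value
        return None


class Reader:
    def __init__(self, db, level):
        self.db, self.level = db, level
        self.trx_id = db.begin()
        self.view = None

    def select(self):
        # RC: new Read View on every snapshot read. RR: only on the first.
        if self.level == "RC" or self.view is None:
            self.view = self.db.read_view(self.trx_id)
        return self.db.snapshot_read(self.view)


def run(level):
    db = Db()
    reader = Reader(db, level)
    first = reader.select()   # snapshot read before the other commit
    writer = db.begin()
    db.write(writer, "v1")
    db.commit(writer)
    return first, reader.select()


print("RR:", run("RR"))
print("RC:", run("RC"))
```

Under RR the second read still returns v0 because the Read View from the first read is reused; under RC the second read builds a new Read View and returns v1.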

Origin blog.csdn.net/u013277209/article/details/114360409