MySQL Advanced: MVCC Version Control

What is MVCC?

MVCC (Multiversion Concurrency Control) implements database concurrency control, as the name suggests, by keeping multiple versions of each data row. This technique is what makes consistent reads possible under InnoDB's transaction isolation levels. In other words, a query that touches rows being updated by another transaction sees the values from before the update, so it does not have to wait for that transaction to release its lock.

What are the benefits of using MVCC (problems solved)

The SQL standard defines 4 isolation levels: read uncommitted, read committed, repeatable read, and serializable. The table below shows which of three read anomalies can occur at each level (× means the anomaly can occur):

                    dirty read    non-repeatable read    phantom read
Read uncommitted    ×             ×                      ×
Read committed      -             ×                      ×
Repeatable read     -             -                      ×
Serializable        -             -                      -

As the table shows, the isolation level rises from top to bottom: concurrency anomalies become fewer, but performance drops, and serializable is the slowest. With MVCC, however, InnoDB's repeatable read level can also avoid phantom reads, which gives much better performance than falling back to serializable. The result looks like this:

                

                                dirty read    non-repeatable read    phantom read
Read uncommitted                ×             ×                      ×
Read committed                  -             ×                      ×
Repeatable read/serializable    -             -                      -

Repeatable read/serializable uses the MVCC + next-key lock mechanism.

MVCC does not rely on locks for reads; it solves non-repeatable reads and phantom reads in an optimistic, lock-free manner. It can replace row-level locks in most cases and reduce system overhead.

Implementation principle of MVCC

MVCC is implemented with three components: hidden fields, the undo log version chain, and the ReadView.

What is ReadView

In the MVCC mechanism, when multiple transactions update the same row, multiple historical snapshots of that row are generated and stored in the undo log. If a transaction wants to query the row, which version should it read? This is the visibility problem, and the ReadView is what solves it.
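The way hidden fields link historical snapshots into a chain can be sketched in Python. This is only a toy model: InnoDB documents hidden columns named DB_TRX_ID and DB_ROLL_PTR, but the `RowVersion` class, the `update` helper, and the values "v1" to "v3" are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RowVersion:
    value: str
    db_trx_id: int                        # models the hidden DB_TRX_ID field
    db_roll_ptr: Optional["RowVersion"]   # models DB_ROLL_PTR: link to the older version


def update(current: RowVersion, new_value: str, trx_id: int) -> RowVersion:
    """Return the new current version; the old one stays reachable as history."""
    return RowVersion(new_value, trx_id, current)


# Transaction 8 wrote the row, then transaction 10 updated it twice.
row = RowVersion("v1", 8, None)
row = update(row, "v2", 10)
row = update(row, "v3", 10)

# Walk the chain from the newest version back to the oldest.
history = []
node = row
while node is not None:
    history.append((node.value, node.db_trx_id))
    node = node.db_roll_ptr
print(history)  # [('v3', 10), ('v2', 10), ('v1', 8)]
```

Each update pushes a new head onto the chain, so a reader can always step back to an older version when the newest one is not visible to it.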

A ReadView is the read view that a transaction generates when it performs a snapshot read under MVCC. At that moment a snapshot of the current state of the database system is taken: InnoDB builds an array for each transaction that records the IDs of the transactions currently active in the system.

The ReadView mainly contains 4 important fields:

1. creator_trx_id: the ID of the transaction that created this ReadView

2. trx_ids: the list of IDs of the transactions that were active when the ReadView was generated

3. up_limit_id: the smallest transaction ID among the active transactions

4. low_limit_id: the system's maximum transaction ID + 1
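As a rough illustration, the four fields can be modeled as a small Python record. This is a sketch, not InnoDB's actual internal structure; the helper `make_read_view`, its parameters, and the choice of transaction 20 as the creator are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ReadView:
    creator_trx_id: int   # transaction that created this ReadView
    trx_ids: List[int]    # IDs of the transactions active at creation time
    up_limit_id: int      # smallest active transaction ID
    low_limit_id: int     # largest transaction ID in the system + 1


def make_read_view(creator_trx_id: int, active_ids: List[int], max_trx_id: int) -> ReadView:
    """Hypothetical helper: derive both limits from the active list."""
    ids = sorted(active_ids)
    # Assumption: with no active transactions, use the upper bound as the lower one too.
    up_limit = ids[0] if ids else max_trx_id + 1
    return ReadView(creator_trx_id, ids, up_limit, max_trx_id + 1)


view = make_read_view(creator_trx_id=20, active_ids=[20, 10], max_trx_id=20)
print(view.trx_ids, view.up_limit_id, view.low_limit_id)  # [10, 20] 10 21
```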

MVCC operation process

When we query a record, we first obtain the version number of the querying transaction, which is its own transaction ID, and then generate a ReadView. The query then walks the undo log (the historical snapshots) and compares the trx_id of each version with the ReadView's trx_ids. If that trx_id is in the active list, the version is not visible and the search continues down the chain, until it reaches a version whose transaction had already committed, for example one whose trx_id is smaller than the smallest active transaction ID.
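The lookup can be sketched as a visibility check plus a walk down the version chain. The four-branch rule below is the way InnoDB's check is commonly described, not code taken from MySQL; the names `Version`, `is_visible`, and `read_row`, and the transaction IDs used, are assumptions for illustration.

```python
from typing import List, NamedTuple, Optional


class Version(NamedTuple):
    trx_id: int   # transaction that wrote this version
    value: str    # row contents at that version


def is_visible(trx_id: int, creator_trx_id: int, trx_ids: List[int],
               up_limit_id: int, low_limit_id: int) -> bool:
    if trx_id == creator_trx_id:
        return True                 # the reader's own change
    if trx_id < up_limit_id:
        return True                 # committed before the view was created
    if trx_id >= low_limit_id:
        return False                # started after the view was created
    return trx_id not in trx_ids    # in between: visible only if not active


def read_row(chain: List[Version], **view) -> Optional[str]:
    """Return the newest visible value, walking from newest to oldest."""
    for version in chain:
        if is_visible(version.trx_id, **view):
            return version.value
    return None


chain = [Version(30, "c"), Version(27, "b"), Version(5, "a")]
value = read_row(chain, creator_trx_id=25, trx_ids=[25, 30],
                 up_limit_id=25, low_limit_id=31)
print(value)  # b
```

Transaction 30 is still active, so its version is skipped. Transaction 27 lies between the two limits but is not in the active list, which means it committed before the view was created, so its version is the one returned.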

This is hard to follow in words, so let's go straight to the picture.

Note:

The 3 isolation levels that MVCC applies to:

Read committed: a new view is generated for each query

Repeatable read: until the transaction commits, every query uses the view from the first query

Serializable: same as repeatable read
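The difference between the two view policies can be sketched with a toy session object. Everything here is hypothetical (the `Session` class, the `snapshot_active` callback, and reducing a view to its list of active IDs); it only shows when a new view is taken.

```python
from typing import Callable, List, Optional


class Session:
    """Toy session that decides when a new view (list of active IDs) is taken."""

    def __init__(self, isolation: str, snapshot_active: Callable[[], List[int]]):
        self.isolation = isolation                # "READ COMMITTED" or "REPEATABLE READ"
        self._snapshot_active = snapshot_active   # returns the currently active trx IDs
        self._view: Optional[List[int]] = None

    def view_for_query(self) -> List[int]:
        # Read committed: new view per query. Repeatable read: only on the first query.
        if self.isolation == "READ COMMITTED" or self._view is None:
            self._view = list(self._snapshot_active())
        return self._view


active = [10, 20]
rc = Session("READ COMMITTED", lambda: active)
rr = Session("REPEATABLE READ", lambda: active)

first_rc, first_rr = rc.view_for_query(), rr.view_for_query()
active.remove(10)   # transaction 10 commits between the two queries
second_rc, second_rr = rc.view_for_query(), rr.view_for_query()

print(second_rc)  # [20]
print(second_rr)  # [10, 20]
```

After transaction 10 commits, the read committed session no longer treats it as active and can see its changes, while the repeatable read session keeps its first view and still hides them.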

MVCC read committed process

You can take a look at a screenshot I found online.

Under the read committed isolation level, a new view is generated for each query.

Suppose there are two transactions, transaction 10 and transaction 20.

Transaction 10 has updated some data and transaction 20 has not. Each update generates an undo log snapshot, which is saved. When a transaction reads the row that is currently being modified, a ReadView is created. It holds the 4 key fields described above: creator_trx_id, trx_ids, up_limit_id, and low_limit_id. Here trx_ids is [10,20], up_limit_id is 10, and low_limit_id is 21.

The query then walks the version chain in the undo log. The first version is Wang Wu, with trx_id 10. That ID is in the ReadView's trx_ids, which means the transaction is uncommitted, so this version cannot be read and the search moves down the chain. Li Si's version was also written by an active transaction. On reaching Zhang San, whose trx_id is 8, the ID is not in the active list, which means that transaction has committed and the data can be read, so Zhang San's data is returned!
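The walkthrough can be replayed step by step with the numbers given above. This is a sketch under two assumptions: the text does not state Li Si's trx_id, so 10 is used because that version is said to belong to an active transaction, and the creator_trx_id check is left out because the text does not say which transaction is reading. The `trace` function is a hypothetical name.

```python
from typing import List, Optional, Tuple

# Version chain from the walkthrough, newest first: (value, trx_id).
# Li Si's trx_id is not stated in the text; 10 is an assumption.
chain = [("Wang Wu", 10), ("Li Si", 10), ("Zhang San", 8)]

trx_ids = [10, 20]
up_limit_id = 10
low_limit_id = 21


def trace(versions: List[Tuple[str, int]]) -> Tuple[List[str], Optional[str]]:
    """Return the per-version decisions and the value finally read."""
    steps = []
    for value, trx_id in versions:
        if trx_id < up_limit_id:
            steps.append(f"{value} (trx_id {trx_id}): committed before the view, read it")
            return steps, value
        if trx_id >= low_limit_id or trx_id in trx_ids:
            steps.append(f"{value} (trx_id {trx_id}): not visible, keep going")
            continue
        steps.append(f"{value} (trx_id {trx_id}): committed, read it")
        return steps, value
    return steps, None


steps, result = trace(chain)
for line in steps:
    print(line)
print(result)  # Zhang San
```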

Next, let's look at the repeatable read level, where MVCC solves both phantom reads and non-repeatable reads.

MVCC repeatable read process

Again, it is easier to understand by looking at the pictures directly.

The value obtained is Zhang San, the same as in the read committed case above.

The ReadView generated by this query is: trx_ids is [10,20]; up_limit_id is 10; low_limit_id is 21.

Because the current isolation level is repeatable read, subsequent queries in the same transaction reuse this ReadView.

The undo log version chain has since changed, but the ReadView is still the one created by the first query, and its trx_ids are still [10,20].

Therefore, when the trx_id values in the undo log are compared with the ReadView, the result is again Zhang San's data!

MVCC phantom read process

Once the two isolation levels above are understood, phantom reads are easy to understand. Under repeatable read, the ReadView is generated by the first query, so no matter how many times the query runs, there is only this one view. After a new row is inserted, its trx_id in the undo log chain is compared with the ReadView and found to be among the active IDs, so the new row cannot be read. The search then continues down the chain until it finds a version whose trx_id is not among the active IDs.
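A small sketch shows why a reused ReadView hides a newly inserted row from snapshot reads. The table layout, the helper names, and the inserted row "Zhao Liu" are hypothetical; the view values are the ones used earlier in the article. It covers snapshot reads only and does not model next-key locks.

```python
from typing import Dict, List, Optional, Tuple

Chain = List[Tuple[int, str]]   # (trx_id, value), newest version first

# The ReadView taken by the first query and reused by later queries.
TRX_IDS, UP_LIMIT_ID, LOW_LIMIT_ID = [10, 20], 10, 21


def visible_value(chain: Chain) -> Optional[str]:
    for trx_id, value in chain:
        if trx_id < UP_LIMIT_ID:
            return value
        if trx_id < LOW_LIMIT_ID and trx_id not in TRX_IDS:
            return value
    return None   # no visible version: the row does not exist for this view


def select_all(table: Dict[int, Chain]) -> Dict[int, str]:
    """Return every row that has at least one visible version."""
    result = {}
    for row_id, chain in table.items():
        value = visible_value(chain)
        if value is not None:
            result[row_id] = value
    return result


table: Dict[int, Chain] = {1: [(8, "Zhang San")]}
before = select_all(table)

table[2] = [(20, "Zhao Liu")]   # hypothetical row inserted by active transaction 20
after = select_all(table)

print(before == after)  # True
```

The inserted row has only one version, and it was written by a transaction in the active list, so the walk finds nothing visible and the row is simply absent from the second result.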

 

 

Origin blog.csdn.net/wang20000102/article/details/132294018