MySQL Knowledge Learning 05 (InnoDB Storage Engine's Implementation of MVCC)

1. Consistent non-locking read and locking read

Consistent non-locking read

A consistent non-locking read (Consistent Nonlocking Read) is usually implemented by adding a version number or timestamp field to each record: every update increments the version number or refreshes the timestamp. A query then compares the version it is currently allowed to see with the version of each record. If the record's version is smaller than the visible version, the record is visible.

In the InnoDB storage engine, multi-versioning is the implementation of non-locking reads. If the row being read is undergoing a DELETE or UPDATE operation, the read does not wait for the row lock to be released. Instead, the InnoDB storage engine reads a snapshot of the row. This way of reading historical data is called a snapshot read.

Under the Repeatable Read and Read Committed isolation levels, an ordinary select statement (that is, excluding select ... lock in share mode and select ... for update) uses a consistent non-locking read (MVCC). Under Repeatable Read, MVCC also provides repeatable reads and prevents some phantom reads.

2. Locking read

The following statements perform a locking read (Locking Read):

  • select … lock in share mode
  • select … for update
  • insert, update, delete operations

A locking read reads the latest version of the data, which is why it is also known as a current read. Locking reads lock the records they read:

  • select ... lock in share mode: adds an S lock to the record. Other transactions can still add an S lock, but a request for an X lock is blocked
  • select ... for update, insert, update, delete: adds an X lock to the record. Other transactions cannot add any lock to it

Under a consistent non-locking read, a record can still be read even if another transaction holds an X lock on it, because the read returns snapshot data.

As mentioned above, under Repeatable Read, MVCC prevents some phantom reads. "Some" means that a consistent non-locking read sees only the data inserted before the first query (visibility is judged against the Read View, and the Read View is generated at the first query). A current read, however, reads the latest data every time, so if another transaction inserts data between two queries, a phantom read occurs. For this reason, when InnoDB implements Repeatable Read, a current read takes a Next-key Lock on the records it reads, which prevents other transactions from inserting data into the gaps.
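The S/X rules in the two bullets above boil down to a small compatibility check. The following is a minimal sketch, not InnoDB's lock manager; `LockMode` and `is_compatible` are names made up for illustration.

```cpp
// Lock modes taken by locking reads. The names are illustrative only.
enum class LockMode { Shared, Exclusive };

// Returns true when a lock in mode `requested` can be granted on a record
// on which another transaction already holds a lock in mode `held`.
// Only S + S is compatible; every combination involving X has to wait.
constexpr bool is_compatible(LockMode held, LockMode requested) {
  return held == LockMode::Shared && requested == LockMode::Shared;
}
```

A snapshot read never calls a check like this at all, which is why it can proceed even when another transaction holds an X lock on the record.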

3. InnoDB's implementation of MVCC

The implementation of MVCC depends on hidden fields, the Read View, and the undo log. Internally, InnoDB judges the visibility of a row version using the row's DB_TRX_ID and the Read View. If the version is not visible, InnoDB follows the row's DB_ROLL_PTR to find a historical version in the undo log. The data version read by each transaction may therefore differ. Within one transaction, the user sees only the modifications committed before the transaction created its Read View, plus the modifications made by the transaction itself.

Hidden fields

Internally, the InnoDB storage engine adds three hidden fields to each row of data:

  • DB_TRX_ID (6 bytes): the ID of the last transaction that inserted or updated the row. A delete is treated internally as an update, except that the deleted_flag in the record header is set to mark the row as deleted
  • DB_ROLL_PTR (7 bytes): the rollback pointer, which points to the row's undo log record. It is null if the row has not been updated
  • DB_ROW_ID (6 bytes): if no primary key is set and the table has no unique non-null index, InnoDB uses this ID to generate the clustered index

ReadView

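class ReadView {
  /* ... */
private:
  trx_id_t m_low_limit_id;      /* Transactions with an ID >= this value are not visible */

  trx_id_t m_up_limit_id;       /* Transactions with an ID < this value are visible */

  trx_id_t m_creator_trx_id;    /* ID of the transaction that created this Read View */

  trx_id_t m_low_limit_no;      /* Transaction number; undo logs with a number below this can be purged */

  ids_t m_ids;                  /* Active transactions at the time the Read View was created */

  bool m_closed;                /* Whether the Read View has been closed */
};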

The Read View is mainly used for visibility judgment. It records the other active transactions whose changes are currently invisible to this transaction.

Its main fields are:

  • m_low_limit_id: the largest transaction ID seen so far plus 1, that is, the next transaction ID to be assigned. Data versions with an ID greater than or equal to this value are not visible
  • m_up_limit_id: the smallest transaction ID in the active transaction list m_ids. If m_ids is empty, m_up_limit_id equals m_low_limit_id. Data versions with an ID smaller than this value are visible
  • m_ids: the IDs of the other active (uncommitted) transactions at the time the Read View was created. Even if these transactions modify a row, their changes are not visible to the current transaction. m_ids does not include the current transaction itself or committed transactions (the list is kept in memory)
  • m_creator_trx_id: the ID of the transaction that created the Read View
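To make the relationship between these fields concrete, here is a rough sketch of how they could be derived from the list of active transactions. `ReadViewSketch` and `make_read_view` are hypothetical names for this article, not InnoDB code, and the inputs (`active`, `next_trx_id`) are assumed to be supplied by the caller.

```cpp
#include <algorithm>
#include <cstdint>
#include <utility>
#include <vector>

using trx_id_t = std::uint64_t;

// Simplified stand-in for InnoDB's ReadView: only the four fields
// described above are modelled.
struct ReadViewSketch {
  trx_id_t low_limit_id;      // next ID to be assigned; IDs >= this are invisible
  trx_id_t up_limit_id;       // smallest active ID; IDs < this are visible
  trx_id_t creator_trx_id;    // transaction that created the view
  std::vector<trx_id_t> ids;  // sorted IDs of the other active transactions
};

// Hypothetical helper: `active` holds the uncommitted transaction IDs at
// snapshot time, `next_trx_id` is the next ID the system would hand out.
ReadViewSketch make_read_view(std::vector<trx_id_t> active,
                              trx_id_t next_trx_id, trx_id_t creator) {
  // The creating transaction never appears in its own m_ids.
  active.erase(std::remove(active.begin(), active.end(), creator),
               active.end());
  std::sort(active.begin(), active.end());

  ReadViewSketch view;
  view.low_limit_id = next_trx_id;
  // With no other active transaction, both limits coincide.
  view.up_limit_id = active.empty() ? next_trx_id : active.front();
  view.creator_trx_id = creator;
  view.ids = std::move(active);
  return view;
}
```

Keeping `ids` sorted is a deliberate choice in this sketch: it makes the smallest active ID available at the front and allows a binary search later.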

[Figure: schematic diagram of transaction visibility]

Undo log

The undo log has two main functions:

  • When a transaction is rolled back, it is used to restore the data to its state before the modification
  • It supports MVCC: when a record being read is locked by another transaction, or its current version is not visible to the reading transaction, the previous version can be read through the undo log, which makes non-locking reads possible

There are two types of undo log in the InnoDB storage engine, insert undo log and update undo log:

1. Insert undo log: the undo log generated by an insert operation. Because the inserted record is visible only to the inserting transaction and not to other transactions, this undo log can be deleted directly once the transaction commits. No purge operation is needed

[Figure: initial state of the data after an insert]

2. Update undo log: the undo log generated by an update or delete operation. It may still be needed by MVCC, so it cannot be deleted when the transaction commits. At commit it is placed on the undo log linked list, and the purge thread performs the final deletion

[Figure: the data after the first modification]

[Figure: the data after the second modification]

When different transactions, or the same transaction, modify the same row several times, the undo log records of that row form a linked list. The head of the chain is the latest version, and the tail is the oldest.
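The version chain can be pictured as an ordinary singly linked list. The sketch below is a simplification under stated assumptions: `RowVersion`, `update_row`, and `chain_length` are invented for illustration, and a `shared_ptr` stands in for DB_ROLL_PTR, which in InnoDB is a 7-byte pointer into the undo log rather than a memory pointer.

```cpp
#include <cstdint>
#include <memory>
#include <string>
#include <utility>

using trx_id_t = std::uint64_t;

// One version of a row. `roll_ptr` plays the role of DB_ROLL_PTR and
// points at the previous (older) version kept in the undo log.
struct RowVersion {
  trx_id_t trx_id;                       // DB_TRX_ID of the writer
  std::string value;                     // row data, reduced to one column
  std::shared_ptr<RowVersion> roll_ptr;  // nullptr for the oldest version
};

// Simulates an update: the current version moves into the "undo log" and
// a new head version written by `writer` takes its place.
std::shared_ptr<RowVersion> update_row(std::shared_ptr<RowVersion> current,
                                       trx_id_t writer, std::string value) {
  return std::make_shared<RowVersion>(
      RowVersion{writer, std::move(value), std::move(current)});
}

// Counts the versions reachable from the head of the chain.
int chain_length(const RowVersion* head) {
  int n = 0;
  for (; head != nullptr; head = head->roll_ptr.get()) ++n;
  return n;
}
```

Passing `nullptr` as `current` models the initial insert, whose version has no predecessor.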

4. Data visibility algorithm

In the InnoDB storage engine, a snapshot (Read View) is created after a transaction starts and before it executes a select statement. The Read View stores the IDs of the transactions that are active (not yet committed) in the database system at that moment. Put simply, it saves the list of other transaction IDs (m_ids) whose changes this transaction should not see. When the transaction reads a row, InnoDB compares the row's DB_TRX_ID with the variables in the Read View and with the current transaction ID to decide whether the visibility condition is met.

The specific comparison algorithm is as follows:

  1. If DB_TRX_ID < m_up_limit_id, the transaction that last modified the row committed before the current transaction created its snapshot, so the row version is visible to the current transaction. The same holds when DB_TRX_ID equals m_creator_trx_id, because a transaction always sees its own modifications
  2. If DB_TRX_ID >= m_low_limit_id, the transaction that last modified the row did so after the current transaction created its snapshot, so the row version is not visible to the current transaction. Skip to step 5
  3. If m_ids is empty, the transaction that modified the row committed before the current transaction created its snapshot, so the row version is visible to the current transaction
  4. If m_up_limit_id <= DB_TRX_ID < m_low_limit_id, the transaction that last modified the row may have been either active or committed when the current transaction created its snapshot, so the active transaction list m_ids must be searched (the source code uses a binary search, because the list is ordered)
    • If DB_TRX_ID is found in m_ids, then either ① the transaction modified the row before the snapshot was created but had not yet committed, or ② the transaction modified the row after the snapshot was created. In both cases the row version is not visible to the current transaction. Skip to step 5
    • If DB_TRX_ID is not found in m_ids, the transaction had already committed its modification before the current transaction created its snapshot, so the row version is visible to the current transaction
  5. Follow the row's DB_ROLL_PTR to the previous version in the undo log, take that version's DB_TRX_ID, and go back to step 1. Repeat until a visible version is found or the chain ends, in which case nothing is returned
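The steps above can be sketched in a few lines of C++. This is an illustrative reduction, not InnoDB's source: the struct and function names are chosen for this article, and the check against `creator_trx_id` is included on the assumption, stated in section 3, that a transaction sees its own modifications.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

using trx_id_t = std::uint64_t;

struct ReadViewSketch {
  trx_id_t low_limit_id;      // IDs >= this are invisible
  trx_id_t up_limit_id;       // IDs < this are visible
  trx_id_t creator_trx_id;    // the reading transaction itself
  std::vector<trx_id_t> ids;  // active transactions, sorted ascending
};

// Steps 1 to 4: is a version written by `trx_id` visible to `view`?
bool changes_visible(const ReadViewSketch& view, trx_id_t trx_id) {
  if (trx_id < view.up_limit_id || trx_id == view.creator_trx_id) {
    return true;   // step 1: committed before the snapshot, or own change
  }
  if (trx_id >= view.low_limit_id) {
    return false;  // step 2: written after the snapshot was created
  }
  if (view.ids.empty()) {
    return true;   // step 3: nobody else was active
  }
  // Step 4: visible only if the writer was not active at snapshot time.
  return !std::binary_search(view.ids.begin(), view.ids.end(), trx_id);
}

struct RowVersion {
  trx_id_t trx_id;         // DB_TRX_ID
  int value;               // row data, reduced to one column
  const RowVersion* prev;  // older version (DB_ROLL_PTR), nullptr at the tail
};

// Step 5: walk the chain until a visible version is found.
// Returns nullptr when no version is visible.
const RowVersion* visible_version(const ReadViewSketch& view,
                                  const RowVersion* head) {
  while (head != nullptr && !changes_visible(view, head->trx_id)) {
    head = head->prev;
  }
  return head;
}
```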

5. The difference between MVCC under RC and RR isolation level

Under both RC and RR (RR is the default transaction isolation level of the InnoDB storage engine), InnoDB uses MVCC (consistent non-locking reads), but the two levels generate the Read View at different times

  • Under the RC isolation level, a Read View (m_ids list) is generated before every select query
  • Under the RR isolation level, a Read View (m_ids list) is generated only before the first select after the transaction starts

6. MVCC solves the non-repeatable read problem

Although both RC and RR use MVCC to read snapshot data, the Read View is generated at different times, so only the RR level achieves repeatable reads.

For example:

[Figure: timeline of the example transactions, referred to below by time points such as T4, T6, and T9]

7. The generation of ReadView under RC

  1. Assume the timeline has reached T4. The version chain of the data row with id = 1 is then:

[Figure: version chain at T4]

At the RC level a Read View is generated for every query. Transactions 101 and 102 have not committed, so the Read View generated by transaction 103 has m_ids: [101,102], m_low_limit_id: 104, m_up_limit_id: 101, m_creator_trx_id: 103

  • The DB_TRX_ID of the latest record is 101, and m_up_limit_id <= 101 < m_low_limit_id, so m_ids must be searched. DB_TRX_ID is in the list, so this record is not visible
  • Follow DB_ROLL_PTR to the previous version in the undo log. Its DB_TRX_ID is still 101, so it is not visible either
  • Continue to the version before that. Its DB_TRX_ID is 1, which satisfies 1 < m_up_limit_id, so it is visible. Transaction 103 therefore reads name = cauliflower
  2. The timeline reaches T6, and the version chain of the data is:

[Figure: version chain at T6]

At the RC level the Read View is regenerated. Transaction 101 has committed by now, but 102 has not, so the new Read View has m_ids: [102], m_low_limit_id: 104, m_up_limit_id: 102, m_creator_trx_id: 103

  • The DB_TRX_ID of the latest record is 102, and m_up_limit_id <= 102 < m_low_limit_id, so m_ids must be searched. DB_TRX_ID is in the list, so this record is not visible
  • Follow DB_ROLL_PTR to the previous version in the undo log. Its DB_TRX_ID is 101, which satisfies 101 < m_up_limit_id, so it is visible. The query at T6 therefore returns name = Li Si, which differs from the result at T4. The read is not repeatable!
  3. The timeline reaches T9, and the version chain of the data is:

[Figure: version chain at T9]

The Read View is regenerated again. Both transactions 101 and 102 have committed, so m_ids is empty and m_up_limit_id = m_low_limit_id = 104. The DB_TRX_ID of the latest version is 102, which satisfies 102 < m_up_limit_id, so it is visible and the query returns name = Zhao Liu

Summary: under the RC isolation level, a transaction generates a new Read View at the start of every query, which is what causes non-repeatable reads
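The RC behaviour can be replayed with a small sketch in which every read builds its view from whatever is active at that moment. `rc_read` and `Version` are names invented here, and the caller is assumed to pass the version chain newest first and the active list sorted. Because the original figures are not available, any value in a chain that the text does not name would have to be a placeholder.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <vector>

using trx_id_t = std::uint64_t;

struct Version {
  trx_id_t trx_id;   // DB_TRX_ID of the writer
  std::string name;  // value of the name column in this version
};

// One RC query by transaction `self`. `chain` lists the row versions from
// newest to oldest; `active` is the sorted list of other uncommitted
// transactions at query time; `next_trx_id` is the next ID to be assigned.
// A fresh view is derived from these arguments on every call.
std::string rc_read(const std::vector<Version>& chain,
                    const std::vector<trx_id_t>& active,
                    trx_id_t next_trx_id, trx_id_t self) {
  const trx_id_t up_limit = active.empty() ? next_trx_id : active.front();
  for (const Version& v : chain) {
    if (v.trx_id == self || v.trx_id < up_limit) return v.name;
    if (v.trx_id >= next_trx_id) continue;  // written after the snapshot
    if (!std::binary_search(active.begin(), active.end(), v.trx_id)) {
      return v.name;  // writer had already committed
    }
  }
  return "";  // no visible version
}
```

Called with the active lists from the walkthrough ([101,102] at T4, [102] at T6, empty at T9) and next_trx_id 104, the three reads by transaction 103 return cauliflower, Li Si, and Zhao Liu in turn, which is exactly the non-repeatable read described above.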

8. The generation of ReadView under RR

At the repeatable read level, a Read View (m_ids list) is generated only when data is read for the first time after the transaction starts

  1. The version chain at T4 is:

[Figure: version chain at T4]

The select statement executed now generates the Read View, with m_ids: [101,102], m_low_limit_id: 104, m_up_limit_id: 101, m_creator_trx_id: 103. The steps are the same as under the RC level:

  • The DB_TRX_ID of the latest record is 101, and m_up_limit_id <= 101 < m_low_limit_id, so m_ids must be searched. DB_TRX_ID is in the list, so this record is not visible
  • Follow DB_ROLL_PTR to the previous version in the undo log. Its DB_TRX_ID is still 101, so it is not visible either
  • Continue to the version before that. Its DB_TRX_ID is 1, which satisfies 1 < m_up_limit_id, so it is visible. Transaction 103 therefore reads name = cauliflower
  2. At time point T6:

[Figure: version chain at T6]

Under the RR level the Read View is generated only once, so the existing one is still used, with m_ids: [101,102], m_low_limit_id: 104, m_up_limit_id: 101, m_creator_trx_id: 103

  • The DB_TRX_ID of the latest record is 102, and m_up_limit_id <= 102 < m_low_limit_id, so m_ids must be searched. DB_TRX_ID is in the list, so this record is not visible. Follow DB_ROLL_PTR to the previous version in the undo log. Its DB_TRX_ID is 101, which is also in m_ids, so it is not visible
  • Follow DB_ROLL_PTR again to the version before that. Its DB_TRX_ID is still 101, so it is not visible either
  • Continue to the next older version. Its DB_TRX_ID is 1, which satisfies 1 < m_up_limit_id, so it is visible. Transaction 103 therefore reads name = cauliflower
  3. At time point T9:

[Figure: version chain at T9]

The situation is exactly the same as at T6. The Read View has already been generated, so m_ids is still [101,102] and the query result is still name = cauliflower
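For comparison, an RR transaction can be sketched as an object that captures its view on the first read and ignores the state of the system afterwards. `RrTransaction`, `SystemState`, and `Version` are hypothetical names for this illustration, and chains are again assumed to be ordered newest first.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <vector>

using trx_id_t = std::uint64_t;

struct Version {
  trx_id_t trx_id;   // DB_TRX_ID of the writer
  std::string name;  // value of the name column in this version
};

// State of the system at the moment a query runs.
struct SystemState {
  std::vector<trx_id_t> active;  // sorted IDs of other uncommitted transactions
  trx_id_t next_trx_id;          // next transaction ID to be assigned
};

class RrTransaction {
 public:
  explicit RrTransaction(trx_id_t id) : id_(id) {}

  // `now` is consulted only on the first read; later reads reuse the view.
  std::string read(const std::vector<Version>& chain, const SystemState& now) {
    if (!has_view_) {
      view_ = now;  // RR: the Read View is created exactly once
      has_view_ = true;
    }
    const trx_id_t up_limit =
        view_.active.empty() ? view_.next_trx_id : view_.active.front();
    for (const Version& v : chain) {
      if (v.trx_id == id_ || v.trx_id < up_limit) return v.name;
      if (v.trx_id >= view_.next_trx_id) continue;
      if (!std::binary_search(view_.active.begin(), view_.active.end(),
                              v.trx_id)) {
        return v.name;
      }
    }
    return "";  // no visible version
  }

 private:
  trx_id_t id_;
  bool has_view_ = false;
  SystemState view_{};
};
```

The only difference from a per-query view is the `has_view_` guard. With the first read taken at T4, the later reads at T6 and T9 keep returning cauliflower even though transactions 101 and 102 have committed in the meantime.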

9. MVCC + Next-key Lock to prevent phantom reads

The InnoDB storage engine solves the phantom read problem at the RR level through MVCC and Next-key Lock:

1. An ordinary select reads data through an MVCC snapshot read

For snapshot reads, the RR isolation level generates a Read View only for the first query after the transaction starts and uses it until the transaction commits. Row versions that other transactions update or insert after the Read View is generated are therefore not visible to the current transaction. This achieves repeatable reads and prevents phantom reads under snapshot reads

2. select ... for update / lock in share mode, insert, update, delete, and similar statements perform a current read

A current read always reads the latest data. If another transaction inserts a new record that happens to fall within the query range of the current transaction, a phantom read occurs!

InnoDB uses Next-key Lock to prevent this. A current read locks the records it reads and also locks the gaps around them, which stops other transactions from inserting data within the query range. If nothing can be inserted, there can be no phantom reads
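As a rough mental model only, a next-key lock can be treated as a locked key interval that rejects inserts from other transactions. The sketch below makes strong simplifying assumptions: keys are plain integers, each lock covers the interval (lower, upper], and `NextKeyLock` and `can_insert` are invented names. InnoDB's real lock system works on index records and gaps and is considerably more involved.

```cpp
#include <cstdint>
#include <vector>

using trx_id_t = std::uint64_t;

// A next-key lock modelled as the key interval (lower, upper]: the gap in
// front of an index record plus the record itself.
struct NextKeyLock {
  trx_id_t owner;  // transaction holding the lock
  long lower;      // exclusive lower bound of the gap
  long upper;      // inclusive upper bound (the locked record's key)
};

// Returns true when transaction `trx` may insert a row with index key `key`.
// An insert has to wait if the key falls inside an interval locked by
// another transaction; a transaction is not blocked by its own locks.
bool can_insert(const std::vector<NextKeyLock>& locks, trx_id_t trx, long key) {
  for (const NextKeyLock& lock : locks) {
    if (lock.owner != trx && key > lock.lower && key <= lock.upper) {
      return false;
    }
  }
  return true;
}
```

In this model, a current read by one transaction over the keys between 10 and 20 would leave a lock such as {owner, 10, 20}, and any other transaction trying to insert key 15 is refused until that lock is released, so a repeated current read cannot see a new row in the range.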

Origin blog.csdn.net/ldy007714/article/details/130489608