MySQL Transaction Isolation Levels and MVCC Explained in Detail

1. Transaction isolation level

First create a table:

# The primary key is named number rather than id, to distinguish it from the transaction id used later
CREATE TABLE hero (
    number INT,
    name VARCHAR(100),
    country VARCHAR(100),
    PRIMARY KEY (number)
) Engine=InnoDB CHARSET=utf8;

INSERT INTO hero VALUES(1, '刘备', '蜀');

Several clients can connect to the same server. Each client's connection to the server is called a session.
In theory, while one transaction is accessing a piece of data, other transactions should wait in a queue and continue only after that transaction commits. This would hurt performance badly, though. We want to preserve transaction isolation while keeping server performance as high as possible when multiple transactions access the same data, so part of the isolation is sacrificed in exchange for part of the performance.

1. Problems encountered in concurrent transaction execution

① Dirty write:
If one transaction modifies data that has been modified by another uncommitted transaction, a dirty write has occurred.
For example, suppose Session A updates a record that Session B has modified but not yet committed. If the transaction in Session B is then rolled back, the update made in Session A no longer exists either.

② Dirty read:
If a transaction reads data modified by another uncommitted transaction, a dirty read has occurred.

③ Non-repeatable read:
If a transaction can only read data that other transactions have committed, but every time another transaction modifies and commits that data the first transaction sees the new value when it queries again, a non-repeatable read has occurred.

④ Phantom read:
If a transaction first queries some records by a certain condition, another transaction then inserts records that satisfy the same condition, and the original transaction can read those newly inserted records when it queries again by that condition, a phantom read has occurred.

What about a record that was read earlier but can no longer be read later?
That counts as a non-repeatable read on that individual record. A phantom read refers specifically to reading records that the earlier read did not return.

2. The four transaction isolation levels

Ordered by severity, the problems encountered in concurrent transaction execution are: dirty write > dirty read > non-repeatable read > phantom read

The trade of some isolation for some performance mentioned above shows up here: several isolation levels are defined, and the lower the isolation level, the more serious the problems that can occur.

The SQL standard defines four isolation levels:

  • READ UNCOMMITTED : Read uncommitted.
  • READ COMMITTED : Read committed.
  • REPEATABLE READ : Repeatable read.
  • SERIALIZABLE : Serializable.

Isolation level    | Dirty read   | Non-repeatable read | Phantom read
-------------------|--------------|---------------------|-------------
READ UNCOMMITTED   | Possible     | Possible            | Possible
READ COMMITTED     | Not possible | Possible            | Possible
REPEATABLE READ    | Not possible | Not possible        | Possible
SERIALIZABLE       | Not possible | Not possible        | Not possible

  1. Under the SERIALIZABLE isolation level, none of these problems can occur.
  2. Dirty writes are too serious a problem, so they are not allowed to happen at any isolation level.
  3. MySQL's default isolation level is REPEATABLE READ.
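
The table follows a simple pattern: each step up in isolation level rules out one more problem. A minimal Python sketch of that pattern follows; the names ANOMALIES, LEVELS and possible_anomalies are illustrative and are not part of MySQL.

```python
# Illustrative names only; this encodes the SQL-standard table, not MySQL internals.
ANOMALIES = ("dirty read", "non-repeatable read", "phantom read")
LEVELS = ("READ UNCOMMITTED", "READ COMMITTED", "REPEATABLE READ", "SERIALIZABLE")


def possible_anomalies(level):
    """Return the problems that can still occur at the given isolation level."""
    # The position of the level in LEVELS equals the number of problems it prevents,
    # counted from the most severe one.
    prevented = LEVELS.index(level)
    return ANOMALIES[prevented:]


for level in LEVELS:
    remaining = ", ".join(possible_anomalies(level)) or "none"
    print(f"{level}: {remaining}")
```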

Set the isolation level:

SET [GLOBAL|SESSION] TRANSACTION ISOLATION LEVEL level;
/*level: {
    REPEATABLE READ
  | READ COMMITTED
  | READ UNCOMMITTED
  | SERIALIZABLE
  }*/

2. Principle of MVCC

MVCC (Multi-Version Concurrency Control) refers to the process of accessing a record's version chain when ordinary SELECT operations are executed under the READ COMMITTED and REPEATABLE READ isolation levels.

1. Version chain

For a table using the InnoDB storage engine, each clustered index record contains two mandatory hidden columns (row_id is not mandatory: a table that has a primary key or a non-NULL UNIQUE key does not contain a row_id column):

  • trx_id : Every time a transaction changes a clustered index record, the transaction id of the transaction is assigned to the trx_id hidden column.
  • roll_pointer: Every time a clustered index record is changed, the old version is written to the undo log. This hidden column acts as a pointer through which the contents of the record before the modification can be found.

Every time a record is changed, an undo log entry is written, and each undo log entry also has a roll_pointer attribute (the undo log for an INSERT operation does not have this attribute, because the record has no earlier version). These undo log entries can therefore be linked together into a linked list. After each update, the old value is placed in an undo log entry as an old version of the record. As the number of updates grows, all versions are connected by the roll_pointer attribute into a linked list, which we call the version chain. The head node of the version chain is the latest value of the current record.
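
As a rough illustration, the version chain can be modeled as a singly linked list in Python. This is a simplified sketch, not InnoDB's actual storage layout: RecordVersion, update and walk are hypothetical names, the transaction ids 80 and 100 are made up, and '关羽' and '张飞' are hypothetical updated values for the row inserted earlier.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RecordVersion:
    name: str                                        # column value in this version
    trx_id: int                                      # transaction that wrote this version
    roll_pointer: Optional["RecordVersion"] = None   # previous (older) version, if any


def update(head, new_name, trx_id):
    """Create a new head version whose roll_pointer leads to the old head."""
    return RecordVersion(new_name, trx_id, roll_pointer=head)


def walk(head):
    """Yield versions from newest to oldest by following roll_pointer."""
    version = head
    while version is not None:
        yield version
        version = version.roll_pointer


# The inserted version has no roll_pointer because nothing older exists.
head = RecordVersion("刘备", trx_id=80)
head = update(head, "关羽", trx_id=100)
head = update(head, "张飞", trx_id=100)

for version in walk(head):
    print(version.trx_id, version.name)
```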

2. ReadView

  • For transactions using the READ UNCOMMITTED isolation level, records modified by uncommitted transactions may be read, so it is enough to read the latest version of the record directly.
  • For transactions using the SERIALIZABLE isolation level, records are accessed by locking.
  • For transactions using the READ COMMITTED and REPEATABLE READ isolation levels, only records modified by committed transactions may be read. That is, if another transaction has modified a record but has not yet committed, the latest version of the record cannot be read directly.

The core problem is: it is necessary to determine which version in the version chain is visible to the current transaction.

To this end, the concept of a ReadView is introduced. A ReadView contains four important pieces of information:

  1. m_ids: The list of transaction ids of the read-write transactions that were active in the system when the ReadView was generated.
  2. min_trx_id: The smallest transaction id among the read-write transactions that were active in the system when the ReadView was generated, that is, the smallest value in m_ids.
  3. max_trx_id: The id that the system would assign to the next transaction at the time the ReadView was generated.

Note that max_trx_id is not the maximum value in m_ids; transaction ids are allocated in increasing order. For example, suppose there are three transactions with ids 1, 2, and 3, and the transaction with id 3 then commits. When a new read transaction generates a ReadView, m_ids contains 1 and 2, min_trx_id is 1, and max_trx_id is 4.

  4. creator_trx_id: The transaction id of the transaction that generated the ReadView.
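
The four fields and the example with transactions 1, 2, and 3 can be sketched in Python as follows. This is only an illustration: make_read_view is a hypothetical helper, and using 0 as the creator_trx_id of a read-only transaction, as well as falling back to the next id when no transaction is active, are assumptions of this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReadView:
    m_ids: tuple          # ids of read-write transactions active at creation time
    min_trx_id: int       # smallest value in m_ids
    max_trx_id: int       # id the system would assign to the next transaction
    creator_trx_id: int   # id of the transaction that generated this ReadView


def make_read_view(active_trx_ids, next_trx_id, creator_trx_id):
    """Build a ReadView snapshot from the currently active transaction ids."""
    m_ids = tuple(sorted(active_trx_ids))
    # Assumption of this sketch: with no active transactions, use the next id.
    min_trx_id = m_ids[0] if m_ids else next_trx_id
    return ReadView(m_ids, min_trx_id, next_trx_id, creator_trx_id)


# Transactions 1, 2 and 3 were started and 3 has committed, so 1 and 2 are
# still active and the next id to hand out is 4.
view = make_read_view({1, 2}, next_trx_id=4, creator_trx_id=0)
print(view)
```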

Steps to use ReadView to determine whether a version of a record is visible:

  1. If the trx_id of the accessed version equals the creator_trx_id in the ReadView, the current transaction is accessing a record it modified itself, so this version can be accessed by the current transaction.
  2. If the trx_id of the accessed version is less than the min_trx_id in the ReadView, the transaction that generated this version committed before the current transaction generated the ReadView, so this version can be accessed by the current transaction.
  3. If the trx_id of the accessed version is greater than or equal to the max_trx_id in the ReadView, the transaction that generated this version started after the current transaction generated the ReadView, so this version cannot be accessed by the current transaction.
  4. If the trx_id of the accessed version lies between min_trx_id (inclusive) and max_trx_id (exclusive), check whether the trx_id is in the m_ids list. If it is, the transaction that generated this version was still active when the ReadView was created, so the version cannot be accessed. If it is not, the transaction that generated this version had already committed when the ReadView was created, so the version can be accessed.
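
The four steps can be written as one small function. The Python below is a sketch of the rule, not InnoDB source code; is_visible and the sample ids are hypothetical. A version whose trx_id equals max_trx_id is treated as not visible here, because that id had not yet been handed out when the ReadView was generated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReadView:
    m_ids: frozenset      # ids of read-write transactions active at creation time
    min_trx_id: int
    max_trx_id: int
    creator_trx_id: int


def is_visible(trx_id, view):
    """Decide whether a version written by trx_id is visible through view."""
    if trx_id == view.creator_trx_id:
        return True               # step 1: the transaction's own change
    if trx_id < view.min_trx_id:
        return True               # step 2: committed before the ReadView existed
    if trx_id >= view.max_trx_id:
        return False              # step 3: started after the ReadView was created
    return trx_id not in view.m_ids   # step 4: visible only if already committed


# Hypothetical snapshot: transactions 100 and 200 are active, 201 is the next id.
view = ReadView(frozenset({100, 200}), min_trx_id=100, max_trx_id=201, creator_trx_id=200)

for trx_id in (80, 100, 150, 200, 201):
    print(trx_id, is_visible(trx_id, view))
```

To read a record, a transaction would walk the version chain from the head and return the first version for which this check succeeds.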

A very big difference between the READ COMMITTED and REPEATABLE READ isolation levels is the timing of when they generate the ReadView.

  • READ COMMITTED: generates a new ReadView before every read.
  • REPEATABLE READ: generates a ReadView when data is read for the first time and reuses that ReadView for subsequent queries.
  • When a DELETE statement, or an UPDATE statement that changes the primary key, is executed, the corresponding record is not removed from the page immediately. Instead a so-called delete mark operation is performed, which only sets a delete flag on the record. This exists mainly to serve MVCC.
  • MVCC only takes effect for ordinary SELECT queries.
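
The effect of the ReadView timing can be seen in a small simulation. The Python below is a toy model under stated assumptions, not InnoDB code: Version, ReadView, Reader and read are hypothetical names, the transaction ids are made up, and '关羽' is a hypothetical value written by transaction 100.

```python
class Version:
    def __init__(self, value, trx_id, older=None):
        self.value, self.trx_id, self.older = value, trx_id, older


class ReadView:
    def __init__(self, active, next_id, creator=0):
        self.m_ids = set(active)
        self.min_trx_id = min(active) if active else next_id
        self.max_trx_id = next_id
        self.creator_trx_id = creator

    def sees(self, trx_id):
        if trx_id == self.creator_trx_id or trx_id < self.min_trx_id:
            return True
        if trx_id >= self.max_trx_id:
            return False
        return trx_id not in self.m_ids


def read(head, view):
    """Walk the version chain and return the first visible value."""
    version = head
    while version is not None and not view.sees(version.trx_id):
        version = version.older
    return version.value if version else None


class Reader:
    """A reading transaction that follows the ReadView timing of its level."""

    def __init__(self, level):
        self.level, self.view = level, None

    def select(self, head, active, next_id):
        # READ COMMITTED builds a fresh ReadView per read;
        # REPEATABLE READ builds one on the first read and reuses it.
        if self.level == "READ COMMITTED" or self.view is None:
            self.view = ReadView(active, next_id)
        return read(head, self.view)


head = Version("刘备", trx_id=80)
head = Version("关羽", trx_id=100, older=head)   # written by uncommitted trx 100

rc, rr = Reader("READ COMMITTED"), Reader("REPEATABLE READ")
first = (rc.select(head, {100}, 101), rr.select(head, {100}, 101))
# Transaction 100 commits, so it is no longer in the active set.
second = (rc.select(head, set(), 101), rr.select(head, set(), 101))

print(first)    # ('刘备', '刘备')
print(second)   # ('关羽', '刘备')
```

After transaction 100 commits, the READ COMMITTED reader sees the new value because its fresh ReadView no longer lists 100 as active, while the REPEATABLE READ reader keeps returning the old value from its original ReadView.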

Origin blog.csdn.net/myjess/article/details/115868211