The records read by MySQL are inconsistent with my imagination

Abstract: When transactions run concurrently, several phenomena can cause consistency problems. This article analyzes them in detail.

This article is shared from Huawei Cloud Community "The records read by MySQL are inconsistent with my imagination - transaction isolation level and MVCC", author: Zhuanyeyang__.

1. Introduction to the characteristics of transactions

1.1 Atomicity

Either all operations are performed or none are: the series of operations in a transaction is indivisible. If an error occurs partway through, the operations already executed are rolled back to the state before they ran. For example, in a transfer it must never happen that one party's balance is debited while the other party's balance is not credited.

1.2 Isolation

The state transition performed by one transaction must not be affected by the operations of other transactions. For example, suppose A makes two transfers of 5 yuan to B at almost the same time, and both transactions read a card balance of 12 yuan. After the first transaction deducts 5 yuan from A, does the second transaction compute 12 - 5 or 7 - 5? MySQL needs some measures to keep such operations isolated from each other.
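The transfer example can be played through as a tiny Python sketch. It is only a toy model of two unisolated transactions (the variable names are made up for illustration, and no database is involved), but it shows why the answer to "12 - 5 or 7 - 5?" matters:

```python
# Toy model of the example above: both transfers read the balance
# before either one writes its result back. No real database is used.
balance = 12

read_by_t1 = balance          # transaction 1 reads 12
read_by_t2 = balance          # transaction 2 also reads 12 (T1 has not written yet)

balance = read_by_t1 - 5      # T1 writes 7
balance = read_by_t2 - 5      # T2 overwrites it with 7 again, not 2

unisolated_result = balance   # one deduction has been lost
serial_result = 12 - 5 - 5    # what running the transfers one after another gives
print(unisolated_result, serial_result)
```

Without isolation the second deduction starts from a stale 12, so one of the two transfers is silently lost.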

1.3 Consistency

If all the data in the database conforms to the constraints of the real world, the data is said to be consistent.

For example, a balance cannot be less than 0, and some business ids cannot be empty. The database itself can enforce some consistency requirements for us, such as NOT NULL rejecting the insertion of NULL values, but most of them have to be guaranteed by the programmers who write the business code. For example, in Spring Boot, input parameters can be validated with annotations such as @NotNull or @NotBlank.

Checking consistency inside the database costs performance. For example, a trigger on a table checks whether its condition is met every time a record is inserted or updated; if the check involves a computation on some column, it can seriously slow down inserts and updates.

Try not to put parameter validation (consistency checks) into MySQL statements: it slows down inserts and updates, and it also spends time on database round trips. Whatever can be checked at the business level should be checked there.

Tip: before MySQL 8.0.16, a CHECK clause in CREATE TABLE is parsed but ignored, so it is useless for consistency checking: MySQL does not verify that the constraint holds. From MySQL 8.0.16 onward, CHECK constraints are enforced. For example:

CREATE TABLE test (
    id INT UNSIGNED NOT NULL AUTO_INCREMENT COMMENT 'primary key id',
    name VARCHAR(100) COMMENT 'name',
    balance INT COMMENT 'balance',
    PRIMARY KEY (id),
    CHECK (balance >= 0)
);

1.4 Durability

Once a transaction has committed, its modifications must be persisted to disk, and whatever happens afterwards, the effect of the operation must not be lost. For example, after a transfer has succeeded, the accounts must not revert to their state before the transfer; otherwise the money would simply vanish.

Taking the first letters of these four properties (Atomicity, Consistency, Isolation, Durability) gives an English word that is easy to remember: ACID.

2. Create table

CREATE TABLE hero (
    number INT,
    name VARCHAR(100),
    country VARCHAR(100),
    PRIMARY KEY (number),
    KEY idx_hero_name (name)
) ENGINE=InnoDB CHARSET=utf8;

Here, the primary key of the hero table is named number to distinguish it from the subsequent transaction id. For the sake of simplicity, constraints and comments are not written.

Then insert a record into this table:

INSERT INTO hero VALUES(1, '刘备', '蜀');

The data in the table now looks like this (the text below refers to 刘备 as 'Liu Bei'):

number | name | country
1      | 刘备 | 蜀

3. Transaction isolation level

MySQL is software with a client/server architecture. Several clients can connect to the same server, and each connection forms a session. Each client can send request statements to the server in its own session, and a statement may be part of a transaction. A server can therefore handle multiple transactions from multiple clients at the same time.

3.1 Consistency problems encountered when transactions are executed concurrently

Depending on the isolation level, several phenomena may occur while the database is running, as follows:

3.1.1 Dirty Write (useful for understanding the ACID properties; dirty writes cannot happen in practice)

If a transaction modifies data that another uncommitted transaction has already modified, a dirty write has occurred. For example:

Assume two sessions each open a transaction, TA and TB:

  • Initially x=0 and y=0. TA first sets x=3, then TB sets x=1 and y=1 and commits, and finally TA rolls back.
    If TA's rollback restores x to 0, then atomicity is broken for TB, because its change to x is undone while its change to y remains.
    If TA's rollback undoes all of TB's modifications, then TB's durability is broken: TB has already committed, so how can an uncommitted TA undo it?
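The first outcome can be reproduced with a small Python sketch. It assumes a toy key-value store in which each transaction keeps a private undo list and nothing prevents overlapping writes; the names db, undo, write and rollback are invented for illustration and do not correspond to anything in MySQL:

```python
# Toy store with per-transaction undo lists and NO write locks.
# All names here are hypothetical; this is not how InnoDB is implemented.
db = {"x": 0, "y": 0}
undo = {"TA": [], "TB": []}

def write(trx, key, value):
    """Remember the old value in the transaction's undo list, then overwrite."""
    undo[trx].append((key, db[key]))
    db[key] = value

def rollback(trx):
    """Restore old values in reverse order, ignoring what others wrote since."""
    for key, old in reversed(undo[trx]):
        db[key] = old
    undo[trx].clear()

write("TA", "x", 3)      # TA modifies x and does not commit
write("TB", "x", 1)      # dirty write: TB overwrites TA's uncommitted x
write("TB", "y", 1)
undo["TB"].clear()       # TB commits, so its undo entries are dropped
rollback("TA")           # TA rolls back and puts x back to 0

print(db)                # TB committed x=1 and y=1, but only y=1 survives
```

The committed transaction TB ends up half applied, which is exactly the broken atomicity described above.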

Dirty writes are not allowed at any isolation level, so they serve here only as a lead-in to the transaction properties; a general understanding is enough.

3.1.2 Dirty Read

If a transaction reads data that another uncommitted transaction has modified, a dirty read has occurred. For example:

Session A and Session B each start a transaction. The transaction in Session B first updates the name column of the record with number 1 to 'Guan Yu', and then the transaction in Session A queries that record. If it reads the name as 'Guan Yu' and the transaction in Session B later rolls back, the transaction in Session A has read data that never existed. This is a dirty read.

In this example the transaction in Session B rolls back. Even if it had committed, so that the final state of the database is consistent, the transaction in Session A would still have seen an intermediate, inconsistent state when it read the record with number 1. An inconsistent state of the database should not be exposed to users.

A stricter statement: suppose transactions T1 and T2 run concurrently and both access data item X. T1 first modifies X, then T2 reads the value written by the still uncommitted T1, and then T1 aborts while T2 commits. T2 has then read a value that never existed at all.

3.1.3 Non-Repeatable Read

If a transaction modifies data that another uncommitted transaction has read, a non-repeatable read (also called a fuzzy read) has occurred.

For example, a transaction reads the record with number 1 several times while other transactions keep committing updates to it: the name first read as 'Liu Bei' is later read as 'Guan Yu', and after that as 'Zhang Fei'.

A stricter statement: suppose transactions T1 and T2 run concurrently and both access data item X. T1 first reads X, then T2 modifies the X that the still uncommitted T1 has read, and T2 commits. When T1 reads X again, it gets a value different from its first read.

3.1.4 Phantom Read

If a transaction first queries some records by a certain condition, another transaction then inserts records matching that condition, and the original transaction, querying by the same condition again, also reads the records inserted by the other transaction, a phantom read has occurred. For example, Session A queries with the condition number > 0 twice, and between the two queries Session B inserts a new matching record.

A stricter statement: suppose transactions T1 and T2 run concurrently. T1 first reads the records matching search condition P, then T2 writes records matching P. When T1 later reads the records matching P again, the two reads return different records.

If Session B deletes some records matching number > 0 instead of inserting new ones, Session A reads fewer records for the condition number > 0. Does that count as a phantom read? No. A phantom read is specifically the case where a transaction, reading several times by the same condition, reads records it did not read before.

We only consider what the SQL standard says here, not the descriptions in other papers. For MySQL, a phantom read means that when a transaction reads records several times by the same search condition, a later read returns a record that an earlier read did not, which can be caused by another transaction's INSERT. What about a record that was read before but cannot be read later? We treat that as a non-repeatable read of that individual record in the result set.

For example, suppose the first read returns the three records a, b, c and the second returns a, b, d: record d is new and record c is missing. How should this be analyzed?
For record c, a non-repeatable read has occurred; for record d, a phantom read has occurred. Consistency problems can be analyzed record by record.
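The per-record analysis can be written down as a short Python sketch. The function name classify is hypothetical, and the sketch only encodes the convention used in this article (missing record = non-repeatable read, new record = phantom read):

```python
def classify(first_read, second_read):
    """Label every record that differs between two reads under the same condition."""
    first, second = set(first_read), set(second_read)
    labels = {}
    for rec in first - second:
        labels[rec] = "non-repeatable read"   # read before, cannot be read now
    for rec in second - first:
        labels[rec] = "phantom read"          # not read before, appears now
    return labels

result = classify(["a", "b", "c"], ["a", "b", "d"])
print(result)
```

Records a and b appear in both reads, so they receive no label at all.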

The test for whether a consistency problem may occur is this: if the column values you expect to find at the moment you prepare to read may differ from what the query actually returns, a consistency problem may occur.

To sum up: Dirty reads, non-repeatable reads, and phantom reads may cause consistency problems.

Since these problems can arise, the SQL standard defines some rules for dealing with them. Let's take a look.

3.2 Four isolation levels in the SQL standard

Let's rank these phenomena according to the severity that may lead to consistency problems:

Dirty read > Non-repeatable read > Phantom read

Giving up part of isolation in exchange for performance shows up here: several isolation levels are defined, and the lower the level, the more serious the problems that can occur. A group of people (not the designers of MySQL) drew up the SQL standard, which defines four isolation levels:

  • READ UNCOMMITTED: read uncommitted.
  • READ COMMITTED: read committed (abbreviated RC).
  • REPEATABLE READ: repeatable read (abbreviated RR).
  • SERIALIZABLE: serializable.

According to the SQL standard (the standard, not MySQL's implementation), each isolation level allows different phenomena among concurrent transactions:

Isolation level  | Dirty read   | Non-repeatable read | Phantom read
READ UNCOMMITTED | Possible     | Possible            | Possible
READ COMMITTED   | Not possible | Possible            | Possible
REPEATABLE READ  | Not possible | Not possible        | Possible
SERIALIZABLE     | Not possible | Not possible        | Not possible

The SQL92 standard does not mention dirty writes. Dirty writes damage consistency so badly that they are not allowed at any isolation level, so they are left out here.

3.3 Four isolation levels supported in MySQL

Different database vendors support the four isolation levels of the SQL standard to different degrees. For example, Oracle supports only READ COMMITTED (Oracle's default isolation level) and SERIALIZABLE. MySQL, discussed here, supports all four, but what each level permits differs somewhat from the SQL standard: under REPEATABLE READ, MySQL can largely prevent phantom reads (how it does so is explained in detail later).

The default isolation level of MySQL is REPEATABLE READ. My own project runs at READ COMMITTED in production. Some related interfaces may operate on the same account in the same table at the same time, and concurrency is high, so before entering a transaction I always take a Redis distributed lock on the account. Only one operation on a given account can proceed at a time, so multiple transactions never operate concurrently on that account's data, and non-repeatable reads and phantom reads cannot occur for those records.

3.3.1 How to set the isolation level of the transaction

We can modify the isolation level of the transaction through the following statement (in actual development, developers will not be allowed to do this at will, you can try it on your own computer):

SET [GLOBAL|SESSION] TRANSACTION ISOLATION LEVEL level;

There are four possible values for level:

level: {
 REPEATABLE READ
 | READ COMMITTED
 | READ UNCOMMITTED
 | SERIALIZABLE
}

In the statement for setting the isolation level of the transaction, the GLOBAL keyword, the SESSION keyword or nothing can be placed after the SET keyword, which will have different effects on transactions of different scopes, as follows:

  • Use the GLOBAL keyword (takes effect at global scope):

For example:

SET GLOBAL TRANSACTION ISOLATION LEVEL SERIALIZABLE;

Then:

  1. It only takes effect for sessions created after the statement is executed.
  2. It has no effect on sessions that already exist.

As for what counts as a new session: if you are using Navicat, you have to close the connection and open it again. Merely opening a new query window stays in the same session, so you will not see the isolation level change.

  • Use the SESSION keyword (takes effect at session scope):

For example:

SET SESSION TRANSACTION ISOLATION LEVEL SERIALIZABLE;

Then:

  1. It applies to all subsequent transactions of the current session.
  2. The statement can be executed in the middle of an open transaction, but it does not affect that transaction.
  3. If executed between transactions, it applies to the transactions that follow.

  • Use neither keyword (affects only the next transaction that starts after the SET statement):

For example:

SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;

Then:

  1. Only valid for the next upcoming transaction in the current session.
  2. After the next transaction finishes executing, subsequent transactions will revert to the previous isolation level.
  3. This statement cannot be executed in the middle of an already opened transaction, otherwise an error will be reported.
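To make the three scopes concrete, here is a small Python model of the rules listed above. It is only a sketch: Server, Session and their methods are hypothetical names, no MySQL connection is made, and the "error inside an open transaction" case is not modeled.

```python
class Server:
    def __init__(self):
        self.global_level = "REPEATABLE READ"      # MySQL's default level

    def connect(self):
        return Session(self)

class Session:
    def __init__(self, server):
        self.server = server
        self.session_level = server.global_level   # copied when the session is created
        self.next_level = None                     # one-shot override for the next transaction

    def set_global(self, level):                   # models SET GLOBAL TRANSACTION ...
        self.server.global_level = level           # existing sessions keep their own copy

    def set_session(self, level):                  # models SET SESSION TRANSACTION ...
        self.session_level = level

    def set_next(self, level):                     # models SET TRANSACTION ...
        self.next_level = level

    def begin(self):
        """Return the isolation level the newly started transaction runs at."""
        level = self.next_level or self.session_level
        self.next_level = None                     # revert after a single transaction
        return level

server = Server()
old = server.connect()
old.set_global("SERIALIZABLE")
new = server.connect()
levels_after_global = (old.begin(), new.begin())

old.set_next("READ COMMITTED")
levels_after_next = (old.begin(), old.begin())

old.set_session("READ UNCOMMITTED")
levels_after_session = (old.begin(), old.begin())
print(levels_after_global, levels_after_next, levels_after_session)
```

In the model, the existing session keeps REPEATABLE READ after the global change, the one-shot setting lasts for exactly one transaction, and the session setting sticks.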

To change the default isolation level at server startup, set the startup option transaction-isolation. For example, starting the server with --transaction-isolation=SERIALIZABLE changes the default isolation level from REPEATABLE READ to SERIALIZABLE.

You can determine the default isolation level of the current session by viewing the value of the system variable transaction_isolation:

SHOW VARIABLES LIKE 'transaction_isolation';

Note: transaction_isolation was introduced in MySQL 5.7.20 to replace tx_isolation. If you are using a previous version of MySQL, please replace the above system variable transaction_isolation with tx_isolation.

Or use a more convenient way of writing:

SELECT @@transaction_isolation;

When we used the SET TRANSACTION syntax to set the isolation level, we were in fact setting the system variable transaction_isolation indirectly. We can also set the isolation level by modifying transaction_isolation directly. System variables generally have only two scopes, GLOBAL and SESSION, but transaction_isolation has three (GLOBAL, SESSION, and next transaction only), and its setting syntax is somewhat special. For more details, please refer to the documentation for transaction_isolation.
In summary:

  • SET GLOBAL TRANSACTION ISOLATION LEVEL level: global scope; applies to sessions created afterwards.
  • SET SESSION TRANSACTION ISOLATION LEVEL level: session scope; applies to all subsequent transactions of the current session.
  • SET TRANSACTION ISOLATION LEVEL level: applies only to the next transaction of the current session.

4. MVCC principle

4.1 Version chain

An earlier article covered the record header information in MySQL's underlying row format. For a table using the InnoDB storage engine, each clustered index record contains two mandatory hidden columns (row_id is optional: it is not added when the table has a primary key or a UNIQUE key on NOT NULL columns):

  • trx_id: Every time a transaction modifies a clustered index record, the transaction id of the transaction is assigned to the trx_id hidden column.
  • roll_pointer: Every time a clustered index record is modified, the old version is written into the undo log. This hidden column acts as a pointer through which the record's contents before the modification can be found.

Suppose the hero table still contains only the one record inserted above.

Assume the transaction that inserted this record has transaction id 80, so the record's trx_id hidden column currently holds 80.

Assume that two transactions with transaction ids 100 and 200 then perform UPDATE operations on this record: transaction 100 changes name to 'Guan Yu' and then to 'Zhang Fei', and transaction 200 changes name to 'Zhao Yun' and then to 'Zhuge Liang'.

Can two transactions update the same record in an interleaved fashion? No. That would be one transaction modifying data already modified by another uncommitted transaction, that is, a dirty write. InnoDB uses locks to prevent dirty writes: the first transaction locks a record before updating it, and another transaction that wants to update the same record must wait until the first transaction commits and releases the lock. So transaction 200 stays blocked on the lock until transaction 100 commits. Locks will be covered in a follow-up article.

Every time a record is changed, an undo log is written. Each undo log also has a roll_pointer attribute (except the undo log of an INSERT, because the record has no earlier version), so the undo logs can be linked together into a list. After the four updates, the list runs from newest to oldest: 'Zhuge Liang' (trx_id 200), 'Zhao Yun' (trx_id 200), 'Zhang Fei' (trx_id 100), 'Guan Yu' (trx_id 100), 'Liu Bei' (trx_id 80).

After each update of the record, the old value is put into an undo log, which counts as an old version of the record. As the number of updates grows, all versions are linked into a list by the roll_pointer attribute. We call this linked list the version chain; its head node is the latest value of the record. Each version also contains the id of the transaction that generated it (this is very important). We use a record's version chain to control how concurrent transactions access the same record, and we call this mechanism Multi-Version Concurrency Control (MVCC).

In short, the roll_pointer attributes of the clustered index record and of the undo logs link the versions of a record into a version chain. The same record can exist in several versions in the system at once, and that is what "multi-version" means in the database's multi-version concurrency control (MVCC).
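A minimal Python sketch of such a version chain follows. It is a simplification under stated assumptions: the Version class and the update helper are invented for illustration, every version stores the full name value (real UPDATE undo logs store only some columns), and the update sequence is the one used in this article's walk-through.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Version:
    trx_id: int                                  # transaction that produced this version
    name: str
    roll_pointer: Optional["Version"] = None     # link to the previous (older) version

def update(head: Version, trx_id: int, name: str) -> Version:
    """The new version becomes the head; the old head is kept as an older version."""
    return Version(trx_id, name, roll_pointer=head)

head = Version(80, "Liu Bei")                    # inserted by transaction 80
head = update(head, 100, "Guan Yu")
head = update(head, 100, "Zhang Fei")
head = update(head, 200, "Zhao Yun")
head = update(head, 200, "Zhuge Liang")

# Walk the chain from the newest version to the oldest via roll_pointer.
chain = []
v = head
while v is not None:
    chain.append((v.trx_id, v.name))
    v = v.roll_pointer
print(chain)
```

The head of the chain is always the latest value, and following roll_pointer walks back through history until the version written by the INSERT.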

The undo log generated by an UPDATE records only the information of some index columns and of the updated columns, not of all columns. Version-chain diagrams usually draw every column in each undo log only to make things easier to understand, because that shows directly what each column's value is in that version. For example, the undo log for the version with trx_id 80 does not record the country column, so how do we know the value of country in that version? A column that was not updated has the same value as in the adjacent newer version. If that version's undo log does not record the column either, look at the next newer version, and so on. If no undo log in the chain records the column, the column has never been updated, so the country value of the version with trx_id 80 is the same as the country value in the clustered index record in the data page.

4.2 ReadView

A record may have gone through many versions. Which version in the version chain is visible to the current transaction? The answer differs between isolation levels:

  • For transactions using the READ UNCOMMITTED isolation level, records modified by uncommitted transactions may be read, so it is enough to read the latest version of the record directly. (No ReadView is generated.)
  • For transactions using the SERIALIZABLE isolation level, the designers of InnoDB chose to access records by locking. (No ReadView is generated.)
  • For transactions using the READ COMMITTED and REPEATABLE READ isolation levels, only records modified by committed transactions may be read. That is, if another transaction has modified a record but not yet committed, the latest version of the record cannot be read directly. (Only these two levels, RC and RR, generate a ReadView when reading data.)

Note that a ReadView does not exist without a transaction: it is generated by a transaction and applies to the whole database.

To decide which version is visible, the designers of InnoDB introduced the concept of a ReadView (sometimes translated as "consistent view").

Note that MySQL has two notions of "view":

One is the view: a virtual table defined by a query statement, which runs the query and produces results when it is referenced. It is created with create view ... and queried the same way as a table. The other is the consistent read view that InnoDB uses to implement MVCC, which supports the RC and RR isolation levels. A ReadView has no physical structure; it defines "what data can I see" during transaction execution.

A ReadView contains four important pieces of information:

  1. m_ids: the list of transaction ids of the read-write transactions that are active in the system when the ReadView is generated. "Active" means started but not yet committed.
  2. min_trx_id: the smallest transaction id among the active read-write transactions when the ReadView is generated, that is, the minimum value in m_ids.
  3. max_trx_id: the transaction id that the system would assign to the next transaction when the ReadView is generated.
    Note that max_trx_id is not the maximum value in m_ids; transaction ids are allocated in increasing order. For example, suppose there are three transactions with ids 1, 2, and 3, and the transaction with id 3 commits. When a new read transaction then generates a ReadView, m_ids contains 1 and 2, min_trx_id is 1, and max_trx_id is 4.
  4. creator_trx_id: the transaction id of the transaction that generated the ReadView.
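The article uses these four fields in its walk-throughs without spelling out the comparison rule. The Python sketch below encodes the rule as it is commonly described for InnoDB; treat the class and method names (ReadView, is_visible) and the exact order of the checks as assumptions made for illustration, not as InnoDB's actual source code.

```python
from dataclasses import dataclass
from typing import List

@dataclass(frozen=True)
class ReadView:
    m_ids: List[int]        # ids of read-write transactions active at creation time
    min_trx_id: int         # smallest id in m_ids
    max_trx_id: int         # id the system would hand out next
    creator_trx_id: int     # id of the transaction that created the view

    def is_visible(self, trx_id: int) -> bool:
        """Decide whether a version written by trx_id can be seen through this view."""
        if trx_id == self.creator_trx_id:
            return True                     # the transaction sees its own changes
        if trx_id < self.min_trx_id:
            return True                     # committed before the view was created
        if trx_id >= self.max_trx_id:
            return False                    # started after the view was created
        return trx_id not in self.m_ids     # in between: visible only if already committed

# The example from the list above: ids 1, 2, 3 were started and 3 has committed.
view = ReadView(m_ids=[1, 2], min_trx_id=1, max_trx_id=4, creator_trx_id=0)
visibility = {t: view.is_visible(t) for t in (1, 2, 3, 4)}
print(visibility)
```

With this view, only the changes of transaction 3 are visible: 1 and 2 are still active, and 4 had not started when the view was taken.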

A transaction is assigned a transaction id only when it first changes records in a table (by executing INSERT, DELETE, or UPDATE). Until then, and for the whole life of a read-only transaction, its transaction id defaults to 0. Accordingly, creator_trx_id is 0 before an id is assigned and equals the transaction's id afterwards.

In MySQL, a very big difference between READ COMMITTED and REPEATABLE READ isolation levels is that they generate ReadView at different times. Let's still take the table hero as an example, assuming that there is only one record inserted by the transaction whose transaction id is 80 in the table hero:

Note: once a ReadView is generated, the values of m_ids, min_trx_id, max_trx_id, and creator_trx_id are fixed. For example, if a transaction commits afterwards, the m_ids list of active transactions in that ReadView does not change. A ReadView is like a snapshot: once taken, it stays the same until a new one is generated.

Next, let's take a look at the so-called timing of generating ReadView between READ COMMITTED and REPEATABLE READ.

4.2.1 READ COMMITTED: a ReadView is generated before every read in a transaction

For example, two transactions with transaction ids 100 and 200 are currently executing in the system. Transaction 100 has updated the name of the record with number 1 twice, first to 'Guan Yu' and then to 'Zhang Fei', and has not committed. Transaction 200 has so far only updated some records in other tables.

Again, during execution a transaction is assigned a unique, increasing transaction id only when it first actually modifies a record (with INSERT, DELETE, or UPDATE). That is why Transaction 200 updates some records in other tables: to get a transaction id assigned.

At this moment, the version chain of the record with number 1 in the hero table is, from newest to oldest: 'Zhang Fei' (trx_id 100), 'Guan Yu' (trx_id 100), 'Liu Bei' (trx_id 80).

Assuming that a transaction using the READ COMMITTED isolation level is now executed:

# Transaction using the READ COMMITTED isolation level
BEGIN;
# SELECT1: Transactions 100 and 200 have not committed
SELECT * FROM hero WHERE number = 1; # the value of column name is '刘备' (Liu Bei)

SELECT1 is executed as follows:

  1. When the SELECT statement is executed, a ReadView is generated first. Its m_ids list is [100, 200], min_trx_id is 100, max_trx_id is 201, and creator_trx_id is 0.
  2. Then the visible version is picked from the version chain. The name column of the latest version is 'Zhang Fei' and its trx_id is 100. That id is in the m_ids list, meaning transaction 100 has not committed yet, so this version is not visible. Follow roll_pointer to the next version.
  3. The name column of the next version is 'Guan Yu' and its trx_id is also 100, which is in the m_ids list, so it is not visible either. Continue to the next version.
  4. The name column of the next version is 'Liu Bei' and its trx_id is 80, which is smaller than the ReadView's min_trx_id of 100. This means transaction 80 has already committed, so this version is visible, and the version returned to the user is the record whose name is 'Liu Bei'.
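The four steps above can be traced with a short Python sketch. The functions visible and read are hypothetical helpers that follow the commonly described InnoDB visibility rule, and the version chain is reduced to (trx_id, name) pairs:

```python
# (trx_id, name) pairs from newest to oldest, as in the walk-through above.
version_chain = [(100, "Zhang Fei"), (100, "Guan Yu"), (80, "Liu Bei")]

def visible(trx_id, m_ids, min_trx_id, max_trx_id, creator_trx_id):
    """Visibility check against the four ReadView fields."""
    if trx_id == creator_trx_id or trx_id < min_trx_id:
        return True
    if trx_id >= max_trx_id:
        return False
    return trx_id not in m_ids

def read(chain, **view):
    """Return the first visible name plus the names of the versions skipped on the way."""
    skipped = []
    for trx_id, name in chain:
        if visible(trx_id, **view):
            return name, skipped
        skipped.append(name)
    return None, skipped

select1, skipped = read(
    version_chain,
    m_ids=[100, 200], min_trx_id=100, max_trx_id=201, creator_trx_id=0,
)
print(select1, skipped)
```

The two versions written by the uncommitted transaction 100 are skipped, and the version written by transaction 80 is the one returned.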

After that, we commit the transaction with transaction id 100, as follows:

# Transaction 100
BEGIN;
UPDATE hero SET name = '关羽' WHERE number = 1;
UPDATE hero SET name = '张飞' WHERE number = 1;
COMMIT;

Then go to the transaction with transaction id 200 to update the record with number 1 in the table hero:

# Transaction 200
BEGIN;
# updated some records in other tables
...
UPDATE hero SET name = '赵云' WHERE number = 1;
UPDATE hero SET name = '诸葛亮' WHERE number = 1;

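At this moment, the version chain of the record with number 1 in the hero table is, from newest to oldest: 'Zhuge Liang' (trx_id 200), 'Zhao Yun' (trx_id 200), 'Zhang Fei' (trx_id 100), 'Guan Yu' (trx_id 100), 'Liu Bei' (trx_id 80).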

Then, back in the transaction using the READ COMMITTED isolation level, query the record with number 1 again:

# Transaction using the READ COMMITTED isolation level
BEGIN;
# SELECT1: neither Transaction 100 nor Transaction 200 has committed
SELECT * FROM hero WHERE number = 1; # the value of column name is '刘备' (Liu Bei)
# SELECT2: Transaction 100 has committed, Transaction 200 has not
SELECT * FROM hero WHERE number = 1; # the value of column name is '张飞' (Zhang Fei)

SELECT2 is executed as follows:

  1. When the SELECT statement is executed, a new ReadView is generated. Its m_ids list is [200] (transaction 100 has committed, so it no longer appears when the ReadView is generated again), min_trx_id is 200, max_trx_id is 201, and creator_trx_id is 0.
  2. Then the visible version is picked from the version chain. The name column of the latest version is 'Zhuge Liang' and its trx_id is 200, which is in the m_ids list, so it is not visible. Follow roll_pointer to the next version.
  3. The name column of the next version is 'Zhao Yun' and its trx_id is 200, which is also in the m_ids list, so it is not visible either. Continue to the next version.
  4. The name column of the next version is 'Zhang Fei' and its trx_id is 100, which is smaller than the ReadView's min_trx_id of 200, so this version is visible, and the version returned to the user is the record whose name is 'Zhang Fei'.

By analogy, if the transaction with transaction id 200 also commits later, then querying the record with number 1 in the hero table again from the READ COMMITTED transaction returns 'Zhuge Liang'. To sum up: a transaction using the READ COMMITTED isolation level generates an independent ReadView at the start of every query.

Note: under RC, once a query statement in a transaction has finished, the ReadView it generated is discarded, and the next query must generate a new one.
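The READ COMMITTED behaviour described in this section can be simulated with a small Python sketch. It assumes the commonly described visibility rule and keeps the system state in plain variables; make_read_view, visible and read are hypothetical helper names, and 201 as the next transaction id is taken from the walk-through.

```python
def make_read_view(active, next_trx_id, creator_trx_id=0):
    """Snapshot the currently active read-write transactions."""
    m_ids = sorted(active)
    return {
        "m_ids": m_ids,
        "min_trx_id": m_ids[0] if m_ids else next_trx_id,
        "max_trx_id": next_trx_id,
        "creator_trx_id": creator_trx_id,
    }

def visible(trx_id, view):
    if trx_id == view["creator_trx_id"] or trx_id < view["min_trx_id"]:
        return True
    if trx_id >= view["max_trx_id"]:
        return False
    return trx_id not in view["m_ids"]

def read(chain, view):
    """Return the name of the newest version visible through the view."""
    return next((name for trx_id, name in chain if visible(trx_id, view)), None)

active = {100, 200}
chain = [(100, "Zhang Fei"), (100, "Guan Yu"), (80, "Liu Bei")]

# READ COMMITTED: every SELECT builds a fresh ReadView.
select1 = read(chain, make_read_view(active, 201))

active.discard(100)                                            # transaction 100 commits
chain = [(200, "Zhuge Liang"), (200, "Zhao Yun")] + chain      # transaction 200 updates
select2 = read(chain, make_read_view(active, 201))

active.discard(200)                                            # transaction 200 commits
select3 = read(chain, make_read_view(active, 201))
print(select1, select2, select3)
```

Because each SELECT takes a new snapshot of the active list, every commit by another transaction becomes visible to the very next query.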

4.2.2 REPEATABLE READ: a ReadView is generated only at the first read in a transaction

By the definition of repeatable read, a transaction sees the results of all transactions that had committed when it started, and updates made by other transactions while it runs are not visible to it.

A transaction using the REPEATABLE READ isolation level generates a ReadView only when it executes its first query; later queries reuse it instead of generating a new one. Let's analyze this with the same example as before.

For example, two transactions with transaction ids 100 and 200 are currently executing in the system, in the same state as at the start of Section 4.2.1.

At this moment, the version chain of the record with number 1 in the hero table is, from newest to oldest: 'Zhang Fei' (trx_id 100), 'Guan Yu' (trx_id 100), 'Liu Bei' (trx_id 80).

Assuming that a transaction using the REPEATABLE READ isolation level is now executed:

# Transaction using the REPEATABLE READ isolation level
BEGIN;
# SELECT1: Transactions 100 and 200 have not committed
SELECT * FROM hero WHERE number = 1; # the value of column name is '刘备' (Liu Bei)

The analysis process here is exactly the same as the SELECT1 analysis process of the READ COMMITTED isolation level in Section 4.2.1, so I won’t repeat it here. The result of the query is the record whose name is 'Liu Bei'.

Next we commit the transaction with transaction id 100, as follows:

# Transaction 100
BEGIN;
UPDATE hero SET name = '关羽' WHERE number = 1;
UPDATE hero SET name = '张飞' WHERE number = 1;
COMMIT;

Then go to the transaction with transaction id 200 to update the record with number 1 in the table hero:

# Transaction 200
BEGIN;
# updated some records in other tables
...
UPDATE hero SET name = '赵云' WHERE number = 1;
UPDATE hero SET name = '诸葛亮' WHERE number = 1;

At this moment, the version chain of the record with number 1 in the hero table is, from newest to oldest: 'Zhuge Liang' (trx_id 200), 'Zhao Yun' (trx_id 200), 'Zhang Fei' (trx_id 100), 'Guan Yu' (trx_id 100), 'Liu Bei' (trx_id 80).

Up to this point, the example is the same as the READ COMMITTED analysis in Section 4.2.1. What comes next is different.

Then go to the transaction that just used the REPEATABLE READ isolation level to continue to find the record with the number 1, as follows:

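# Transaction 300, using the REPEATABLE READ isolation level
BEGIN;
# SELECT1: neither Transaction 100 nor Transaction 200 has committed
SELECT * FROM hero WHERE number = 1; # the value of column name is '刘备' (Liu Bei)
# SELECT2: Transaction 100 has committed, Transaction 200 has not
SELECT * FROM hero WHERE number = 1; # the value of column name is still '刘备' (Liu Bei)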

SELECT2 is executed as follows:

  1. Because the current transaction's isolation level is REPEATABLE READ and a ReadView was already generated when SELECT1 was executed, that ReadView is reused directly. Its m_ids list is [100, 200], min_trx_id is 100, max_trx_id is 201, and creator_trx_id is 0.
  2. Then the visible version is picked from the version chain. The name column of the latest version is 'Zhuge Liang' and its trx_id is 200, which is in the m_ids list, so it is not visible. Follow roll_pointer to the next version.
  3. The name column of the next version is 'Zhao Yun' and its trx_id is 200, which is also in the m_ids list, so it is not visible either. Continue to the next version.
  4. The name column of the next version is 'Zhang Fei' and its trx_id is 100. The m_ids list contains 100, so this version is not visible. The same holds for the following version, whose name is 'Guan Yu'. Continue to the next version.
  5. The name column of the next version is 'Liu Bei' and its trx_id is 80, which is smaller than the ReadView's min_trx_id of 100, so this version is visible, and the version returned to the user is the record whose name is 'Liu Bei'.

That is to say, under the REPEATABLE READ isolation level, both queries of the transaction return the same result: the value of the name column is 'Liu Bei'. This is why non-repeatable reads do not occur under RR. If we later commit the transaction with transaction id 200 and then query the record with number 1 again in the same REPEATABLE READ transaction, the result is still 'Liu Bei'.

What should you do if you want to read the latest name value, 'Zhuge Liang'?

Premise: the transactions with transaction ids 100 and 200 have both been committed.

  1. If you now commit the transaction with transaction id 300, its ReadView is discarded. The next time you start a new transaction and run a query, a new ReadView is generated. Its m_ids list contains neither 100 nor 200, so the query returns the name 'Zhuge Liang'.
  2. If the new query is not run inside an explicit transaction, there is no earlier ReadView to reuse, and a plain SELECT finds the record named 'Zhuge Liang' directly, because transactions 100 and 200 have been committed.

Note the comparison:

Under RR, the ReadView generated by a transaction is discarded only when the transaction commits.

Under RC, the ReadView generated for a query statement is discarded as soon as that statement finishes, and a new ReadView must be generated for the next query in the same transaction.
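The difference in ReadView timing can be reproduced with a small simulation. This is only a sketch, not InnoDB's implementation: the `Reader` class, `make_read_view`, and the tuple layout of the version chain are hypothetical names chosen for illustration, and the reader is assumed to be read-only, so the creator_trx_id rule is left out.

```python
from typing import List, Optional, Set, Tuple

Chain = List[Tuple[str, int]]  # (name, trx_id), newest version first


def make_read_view(active: Set[int], next_trx_id: int) -> dict:
    """Build a simplified ReadView from the currently active transaction ids."""
    ids = sorted(active)
    return {
        "m_ids": ids,
        "min_trx_id": ids[0] if ids else next_trx_id,
        "max_trx_id": next_trx_id,
    }


def visible(view: dict, trx_id: int) -> bool:
    if trx_id < view["min_trx_id"]:
        return True                      # committed before the ReadView existed
    if trx_id >= view["max_trx_id"]:
        return False                     # started after the ReadView was created
    return trx_id not in view["m_ids"]   # visible only if already committed


class Reader:
    """Hypothetical read-only transaction at isolation level 'RC' or 'RR'."""

    def __init__(self, level: str) -> None:
        self.level = level
        self.view: Optional[dict] = None

    def select(self, chain: Chain, active: Set[int], next_trx_id: int) -> Optional[str]:
        # RC: a fresh ReadView per statement. RR: only for the first statement.
        if self.level == "RC" or self.view is None:
            self.view = make_read_view(active, next_trx_id)
        for name, trx_id in chain:
            if visible(self.view, trx_id):
                return name
        return None


rc, rr = Reader("RC"), Reader("RR")

# SELECT1: transactions 100 and 200 are both active.
chain1: Chain = [("Zhang Fei", 100), ("Guan Yu", 100), ("Liu Bei", 80)]
rc_first = rc.select(chain1, {100, 200}, 201)
rr_first = rr.select(chain1, {100, 200}, 201)

# Transaction 100 commits, then transaction 200 updates the record twice.
chain2: Chain = [("Zhuge Liang", 200), ("Zhao Yun", 200)] + chain1
rc_second = rc.select(chain2, {200}, 201)
rr_second = rr.select(chain2, {200}, 201)

print(rc_first, rr_first)    # Liu Bei Liu Bei
print(rc_second, rr_second)  # Zhang Fei Liu Bei
```

Under RC the second query sees 'Zhang Fei' because its new ReadView no longer lists transaction 100, while under RR the reused ReadView still treats transaction 100 as active.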

Hint:

Under RR, if you start a transaction with the START TRANSACTION WITH CONSISTENT SNAPSHOT statement, a ReadView is generated immediately after that statement is executed, not when the first SELECT statement is executed.

Because START TRANSACTION WITH CONSISTENT SNAPSHOT creates a ReadView that lasts for the entire transaction, it is meaningless under the RC isolation level (where a ReadView is created for every read); there it is equivalent to an ordinary START TRANSACTION.

4.2.3 Summary of ReadView's Visibility Rules

When accessing a record, you only need to follow the steps below to determine whether a version of the record is visible:

  1. When trx_id = creator_trx_id, it means that the current transaction is accessing its own modified records, so this version can be accessed by the current transaction.
  2. When trx_id < min_trx_id, it indicates that the transaction that generated this version has been committed before the current transaction generates ReadView, so this version can be accessed by the current transaction.
  3. When trx_id ≥ max_trx_id, the transaction that generated this version started after the current transaction generated its ReadView, so this version cannot be accessed by the current transaction.
  4. When min_trx_id ≤ trx_id < max_trx_id, you need to check whether trx_id is in the m_ids list. If it is, the transaction that generated this version was still active when the ReadView was created, and this version cannot be accessed. If it is not, the transaction that generated this version had already committed when the ReadView was created, and this version can be accessed.

If the data of a certain version is not visible to the current transaction, follow the version chain to find the data of the next version, continue to judge the visibility according to the above steps, and so on, until the last version in the version chain. If the last version is also not visible, it means that the record is completely invisible to the transaction, and the query result does not include the record.
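The four rules and the walk along the version chain can be written down in a few lines. The following is a minimal sketch for illustration, not InnoDB's actual code: the `ReadView` and `Version` classes and the `first_visible` helper are hypothetical, while the field names follow the attributes used in this article.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ReadView:
    m_ids: List[int]       # ids of transactions active when the ReadView was created
    min_trx_id: int        # smallest id in m_ids
    max_trx_id: int        # id the system would assign to the next transaction
    creator_trx_id: int    # id of the transaction that created the ReadView


@dataclass
class Version:
    name: str              # value of the name column in this version
    trx_id: int            # transaction that produced this version


def is_visible(view: ReadView, trx_id: int) -> bool:
    if trx_id == view.creator_trx_id:   # rule 1: our own modification
        return True
    if trx_id < view.min_trx_id:        # rule 2: committed before the ReadView
        return True
    if trx_id >= view.max_trx_id:       # rule 3: started after the ReadView
        return False
    return trx_id not in view.m_ids     # rule 4: visible only if not active


def first_visible(view: ReadView, chain: List[Version]) -> Optional[Version]:
    """Walk the chain from newest to oldest; None means the record is invisible."""
    for version in chain:
        if is_visible(view, version.trx_id):
            return version
    return None


# Version chain of the record with number 1, newest first (from Section 4.2.2).
chain = [
    Version("Zhuge Liang", 200),
    Version("Zhao Yun", 200),
    Version("Zhang Fei", 100),
    Version("Guan Yu", 100),
    Version("Liu Bei", 80),
]

rr_view = ReadView(m_ids=[100, 200], min_trx_id=100, max_trx_id=201, creator_trx_id=0)
print(first_visible(rr_view, chain).name)  # Liu Bei
```

With the ReadView from the REPEATABLE READ example, the walk skips the four versions written by transactions 200 and 100 and stops at the version written by transaction 80.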

As mentioned above, a ReadView covers the entire database. If a database holds 100GB of data, does MySQL copy all 100GB when a transaction starts? That would be extremely slow, yet ordinary transactions start and run very quickly.

In fact, the 100GB of data does not need to be copied. InnoDB uses the version chain and the list of active transaction ids, so a ReadView can be created almost instantly.

Thinking questions:

Under the RR isolation level, transactions T1 and T2 are executed concurrently. T1 first reads 3 records that match a certain search condition. Transaction T2 then inserts a record that matches the same search condition and commits. Finally, T1 runs the query again with the same search condition. What is the result?

Analysis: According to the version chain and ReadView rules, T1 generates its ReadView when it reads the 3 records for the first time. T2 had not committed at that moment, so the version inserted by T2 is invisible to T1, and T1's second query still returns 3 records. In this case, phantom reads are avoided under RR.

Because of how MySQL implements it, the RR isolation level cannot avoid phantom reads completely (it only avoids them to a large extent). Only locking can avoid them completely.
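One well-known case in which a phantom row still shows up under RR is when T1 itself modifies the row that T2 inserted: the newest version then carries T1's own transaction id and becomes visible through the creator_trx_id rule. The sketch below models this with strong simplifications that are assumptions of the example, not InnoDB internals: only the newest version of each row is tracked, the transaction ids (80, 200, 201) are made up, and T1's id is simply written into the ReadView when T1 performs its first write.

```python
from typing import Dict, List


def visible(view: dict, trx_id: int) -> bool:
    if trx_id == view["creator_trx_id"]:
        return True                      # T1's own modification
    if trx_id < view["min_trx_id"]:
        return True                      # committed before the ReadView
    if trx_id >= view["max_trx_id"]:
        return False                     # started after the ReadView
    return trx_id not in view["m_ids"]


def snapshot_read(view: dict, rows: Dict[int, int]) -> List[int]:
    """Return the row numbers whose newest version is visible to the ReadView."""
    return [number for number, trx_id in rows.items() if visible(view, trx_id)]


# number -> trx_id of the newest version; three rows written by transaction 80.
rows: Dict[int, int] = {1: 80, 2: 80, 3: 80}

# T1 runs its first SELECT: no other transaction is active, next id is 200.
t1_view = {"m_ids": [], "min_trx_id": 200, "max_trx_id": 200, "creator_trx_id": 0}
first = snapshot_read(t1_view, rows)

# T2 (id 200) inserts row 4 and commits.
rows[4] = 200
second = snapshot_read(t1_view, rows)

# T1 now runs an UPDATE matching the same condition. An UPDATE reads the
# latest committed data, so it also modifies row 4. T1 gets id 201.
t1_view["creator_trx_id"] = 201
for number in rows:
    rows[number] = 201
third = snapshot_read(t1_view, rows)

print(first)   # [1, 2, 3]
print(second)  # [1, 2, 3]
print(third)   # [1, 2, 3, 4]
```

The second read confirms the analysis above, while the third read shows why plain snapshot reads cannot rule out phantoms in every situation.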

4.3 Why is it not recommended to use long transactions?

When discussing the version chain, we said that each time a record is updated, an undo log entry (referred to in this section as a rollback segment) is recorded at the same time. By applying the rollback operations, the value of any previous state can be obtained.

The current name of the record whose number is 1 is 'Zhuge Liang', but transactions started at different times query this record with different ReadViews. As shown in the figure, to get the version whose name is 'Liu Bei', all the rollback operations in the figure must be applied to the current value in sequence.

  • The rollback segment takes up a lot of space, so when is it deleted?

From the figure above, we can see that the rollback segment contains the record versions overwritten by earlier transactions. An old version is no longer needed once no transaction can still read it, so a rollback segment can be deleted only after every transaction that might still need it has committed.

  • Why is long transaction not recommended?

A long transaction means that a very old ReadView remains in the system. Until the transaction commits, the old record versions it may read must continue to exist. Because such a transaction may access any data in the database at any time, every rollback record it might use must be kept until it commits, which occupies a large amount of storage space.

In MySQL 5.5 and earlier versions, the rollback log is stored in the ibdata file together with the data dictionary. Even after the long transaction finally commits and the rollback segment is cleaned up, the file does not shrink. There are cases where the data was only 20GB but the rollback segment had grown to 200GB, and in the end the entire database had to be rebuilt to reclaim the space.

Besides their impact on the rollback segment, long transactions also hold lock resources and may drag down the entire database.

  • How to query long transactions?

Long transactions can be found in the innodb_trx table of the information_schema database. For example, the following statement finds transactions that have been running for more than 60 seconds.

select * from information_schema.innodb_trx where TIME_TO_SEC(timediff(now(),trx_started))>60

4.4 Non-clustered index and MVCC

As mentioned earlier, only clustered index records have the hidden columns trx_id and roll_pointer. If a query statement is executed through a secondary index, how is visibility judged?

begin;
select * from hero where name = '刘备';

The search condition here is name, which has an ordinary non-clustered index. Without trx_id and roll_pointer, how can visibility be judged from the version chain and the ReadView?

Note: trx_id is the column that stores the id of the transaction that modified the record. Its absence only means that non-clustered index records do not store this value. It does not mean that the transaction has no transaction id when it executes.

The process is as follows:

Step 1: The Page Header of a non-clustered index page has an attribute named PAGE_MAX_TRX_ID. Whenever a transaction adds, deletes, or modifies a record in the page, the following check is performed:

// Pseudocode; trx_id is the id of the transaction modifying the page
if (trx_id > PAGE_MAX_TRX_ID) {
    PAGE_MAX_TRX_ID = trx_id;
}

So the value of PAGE_MAX_TRX_ID is the largest id among the transactions that have modified the non-clustered index page.

When a SELECT statement finds a non-clustered index record that matches the condition, it performs the following check:

if (min_trx_id of the ReadView > PAGE_MAX_TRX_ID) {
    // all records in this page are visible to the ReadView
} else {
    // go to step 2
}

Step 2: Look up the clustered index by primary key. After obtaining the clustered index record that matches the search condition, follow the version chain to find the first version visible to the ReadView, then check whether the value of the indexed column in that version equals the value used in the non-clustered index lookup. In this example, that means checking whether the name of the visible version is 'Liu Bei'. If it is, the record is sent to the client (if the WHERE clause contains other search conditions, they are evaluated first); otherwise the record is skipped.
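The two steps can be put together in a short sketch. It is a simplified model for illustration only: the function `secondary_record_visible`, the dictionary form of the ReadView, and the sample PAGE_MAX_TRX_ID values are hypothetical, and the creator_trx_id rule is omitted because the reading transaction is assumed to be read-only.

```python
from typing import List, Tuple

Chain = List[Tuple[str, int]]  # clustered-index versions (name, trx_id), newest first


def visible(view: dict, trx_id: int) -> bool:
    if trx_id < view["min_trx_id"]:
        return True
    if trx_id >= view["max_trx_id"]:
        return False
    return trx_id not in view["m_ids"]


def secondary_record_visible(view: dict, page_max_trx_id: int,
                             searched_name: str, chain: Chain) -> bool:
    """Decide whether a secondary-index hit on `searched_name` is returned."""
    # Step 1: every transaction that touched the page committed before the
    # ReadView was created, so all records in the page are visible.
    if view["min_trx_id"] > page_max_trx_id:
        return True
    # Step 2: go back to the clustered index, find the first visible version,
    # and compare its indexed column with the value that was searched for.
    for name, trx_id in chain:
        if visible(view, trx_id):
            return name == searched_name
    return False  # no version is visible at all


chain: Chain = [
    ("Zhuge Liang", 200),
    ("Zhao Yun", 200),
    ("Zhang Fei", 100),
    ("Guan Yu", 100),
    ("Liu Bei", 80),
]
view = {"m_ids": [100, 200], "min_trx_id": 100, "max_trx_id": 201}

# The page was last modified by transaction 200, so step 2 is required.
print(secondary_record_visible(view, 200, "Liu Bei", chain))      # True
print(secondary_record_visible(view, 200, "Zhuge Liang", chain))  # False
```

The index entry for 'Zhuge Liang' exists physically, but the first version visible to this ReadView is 'Liu Bei', so only the search for 'Liu Bei' returns the record.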

4.5 MVCC Summary

MVCC (Multi-Version Concurrency Control) refers to the process of accessing a record's version chain when ordinary SELECT operations are executed under the READ COMMITTED and REPEATABLE READ isolation levels. It allows read-write and write-read operations of different transactions to run concurrently, which improves system performance. A major difference between READ COMMITTED and REPEATABLE READ is the timing of ReadView generation: READ COMMITTED generates a ReadView before every ordinary SELECT operation in a transaction, whereas REPEATABLE READ generates a ReadView only before the first ordinary SELECT operation in a transaction and reuses it for subsequent queries.

5. Review and summary: questions you should be able to answer

  1. What is a transaction?
  2. What do MySQL's transaction isolation levels read uncommitted, read committed, repeatable read, and serializable mean?
  3. How are read committed and repeatable read implemented through view construction?
  4. How is transaction isolation implemented through ReadView (read view)?
  5. What is multi-version concurrency control (MVCC) and how is it implemented?
  6. What are the disadvantages of long transactions? Why can long transactions drag down the entire database?
  7. How do you find long-running transactions?
  8. How can long transactions be avoided?

 


Origin my.oschina.net/u/4526289/blog/10075914