[Intermediate] In-depth analysis of transaction isolation level

This article introduces the four transaction isolation levels in detail, uses examples to show which read phenomena each level prevents, and describes how relational databases implement each level.

In a DBMS, transactions ensure that a sequence of operations either executes completely or not at all (atomicity) and moves the database from one consistent state to another (consistency). Because transactions are durable, data persists once a transaction has been committed. Because transactions are isolated, multiple transactions that process the same data at the same time should not affect each other. When multiple transactions run concurrently, however, read phenomena such as dirty reads, non-repeatable reads, or phantom reads may occur if the isolation level is not chosen carefully.

Among the four ACID properties of database transactions, isolation is the one most commonly relaxed. A database can use its lock mechanism or multi-version concurrency control (MVCC) to provide a higher isolation level during data operations. However, as the isolation level increases, the degree of concurrency decreases. How to make a good trade-off between concurrency and isolation is therefore a crucial question.

In software development, there are established best practices for almost every type of problem. For this one, many DBMSs define several "transaction isolation levels" that control the degree of locking and concurrency.

ANSI/ISO SQL defines four standard isolation levels. From highest to lowest they are: Serializable, Repeatable read, Read committed, Read uncommitted.
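As a quick reference, the relationship between the four levels and the read phenomena discussed in the rest of this article can be sketched in Python. The names IsolationLevel, PERMITTED and prevents are made up for illustration and are not part of any database API. The table reflects the lock-based behaviour described in this article; a real DBMS may be stricter at a given level.

```python
from enum import IntEnum


class IsolationLevel(IntEnum):
    """The four ANSI/ISO SQL levels, ordered from lowest to highest."""
    READ_UNCOMMITTED = 1
    READ_COMMITTED = 2
    REPEATABLE_READ = 3
    SERIALIZABLE = 4


# Read phenomena each level still permits (illustrative table, not a DBMS API).
PERMITTED = {
    IsolationLevel.READ_UNCOMMITTED: {"dirty read", "non-repeatable read", "phantom read"},
    IsolationLevel.READ_COMMITTED: {"non-repeatable read", "phantom read"},
    IsolationLevel.REPEATABLE_READ: {"phantom read"},
    IsolationLevel.SERIALIZABLE: set(),
}


def prevents(level, phenomenon):
    """Return True if the given level rules out the given read phenomenon."""
    return phenomenon not in PERMITTED[level]


# Print the table from the highest level to the lowest.
for level in sorted(IsolationLevel, reverse=True):
    allowed = ", ".join(sorted(PERMITTED[level])) or "none"
    print(f"{level.name}: still permits {allowed}")
```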

The following sections introduce, in turn, the concept, usage, and problems (read phenomena) of each of these four isolation levels.

Read uncommitted


Read uncommitted (READ UNCOMMITTED) is the lowest isolation level. As the name suggests, under this isolation level one transaction can read the uncommitted data of another transaction.

Locking under read uncommitted (implementation principle)

A transaction does not lock data while reading it.

When a transaction modifies data, it adds only a row-level shared lock to that data.

Phenomena:

While transaction 1 reads a row, transaction 2 can also read and update that row (because transaction 1 does not add any lock to the data).

After transaction 2 updates the row, transaction 1 can read the row again and see transaction 2's modified version, even though the modification has not yet been committed (because transaction 2 holds only a shared lock, which does not block transaction 1's read).

While transaction 1 updates a row, transaction 2 cannot update that row until transaction 1 ends (because transaction 1 holds a shared lock on the row, transaction 2 cannot add the exclusive write lock it needs to modify the data).

For example

Transaction 1                               Transaction 2
/* Query 1 */
SELECT age FROM users WHERE id = 1;
/* will read 20 */
                                            /* Query 2 */
                                            UPDATE users SET age = 21 WHERE id = 1;
                                            /* No commit here */
/* Query 1 */
SELECT age FROM users WHERE id = 1;
/* will read 21 */
                                            ROLLBACK;
                                            /* lock-based DIRTY READ */

The example above is borrowed from my article Analysis of Database Read Phenomenon. It illustrates how two transactions interact under the read uncommitted isolation level.

Transaction 1 runs the same query twice. Between the two queries, transaction 2 modifies the data without committing, yet transaction 1's second query sees transaction 2's modification. The article on read phenomena introduced this behaviour, which is called a dirty read.

Therefore, read uncommitted allows dirty reads.
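The dirty read in the example can be reproduced with a small in-memory model. This is only a sketch: InPlaceStore and its methods are hypothetical names, the store holds a single table mapping id to age, and it assumes that writes are applied in place while readers take no locks, which is the behaviour described for read uncommitted above.

```python
class InPlaceStore:
    """Toy store: updates are written in place and readers take no locks."""

    def __init__(self, rows):
        self.rows = dict(rows)
        self.undo = {}  # transaction id -> list of (key, previous value)

    def read(self, key):
        # No lock is taken, so uncommitted changes are visible.
        return self.rows[key]

    def update(self, txn, key, value):
        # Remember the old value so the change can be rolled back.
        self.undo.setdefault(txn, []).append((key, self.rows[key]))
        self.rows[key] = value

    def commit(self, txn):
        self.undo.pop(txn, None)

    def rollback(self, txn):
        # Restore previous values in reverse order of modification.
        for key, old in reversed(self.undo.pop(txn, [])):
            self.rows[key] = old


store = InPlaceStore({1: 20})      # users.id = 1 has age 20
first = store.read(1)              # transaction 1, first query
store.update("T2", 1, 21)          # transaction 2 updates, no commit
second = store.read(1)             # transaction 1 sees the uncommitted value
store.rollback("T2")               # transaction 2 rolls back
after_rollback = store.read(1)     # the value transaction 1 saw never existed

print(first, second, after_rollback)
```

The second read returns a value that is later rolled back, which is exactly what makes the read dirty.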

Read committed


Read committed (READ COMMITTED) can likewise be understood from its name: while a transaction is modifying data and has not yet committed, other transactions cannot read that data.

Locking under read committed

A transaction adds a row-level shared lock to the data it is currently reading (the lock is taken only at the moment of reading) and releases it as soon as the row has been read.

At the moment a transaction updates data (that is, the moment the update occurs), it must first add a row-level exclusive lock to that data, which is not released until the end of the transaction.

Phenomena:

While transaction 1 is reading a row, transaction 2 can also read that row (because transaction 1 holds only a row-level shared lock on the row, transaction 2 can add its own shared lock and read the data).

At the moment transaction 1 reads a row, transaction 2 cannot modify that row, but as soon as transaction 1 has finished reading it, transaction 2 can. (A transaction adds a shared lock to the data at the moment of reading, and while that lock is held no other transaction can add an exclusive lock to the row. The shared lock is released as soon as the row has been read, and once it is released, transaction 2 can add an exclusive lock and modify the data.)

While transaction 1 updates a row, transaction 2 cannot update that row until transaction 1 ends. (A transaction that updates data adds an exclusive lock to the row and holds it until the transaction ends. Until the updating transaction commits, no other transaction can add a shared lock to read the row. Read committed therefore prevents dirty reads.)

For example

Transaction 1                               Transaction 2
/* Query 1 */
SELECT * FROM users WHERE id = 1;
                                            /* Query 2 */
                                            UPDATE users SET age = 21 WHERE id = 1;
                                            COMMIT;
                                            /* in multiversion concurrency control, or lock-based READ COMMITTED */
/* Query 1 */
SELECT * FROM users WHERE id = 1;
COMMIT;
/* lock-based REPEATABLE READ */

Under the read committed isolation level, transaction 1 cannot read the row that transaction 2 is modifying until transaction 2 commits. Only after transaction 2 has committed can transaction 1 read the data.

The example above also shows that the two reads in transaction 1 return different results, so read committed does not prevent non-repeatable reads.

In short, the read committed isolation level guarantees that any data read is committed data, which avoids dirty reads. It does not guarantee that a transaction sees the same data when it reads again, because other transactions can modify a row as soon as it has been read.
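The same trace can be replayed against a model in which readers only ever see committed values. This is a deliberately simplified sketch and not the lock-based mechanism described above: instead of blocking the reader, the hypothetical CommittedOnlyStore buffers each transaction's writes until commit. It is enough to show that dirty reads disappear while non-repeatable reads remain.

```python
class CommittedOnlyStore:
    """Toy store: writes stay private to their transaction until commit."""

    def __init__(self, rows):
        self.committed = dict(rows)
        self.pending = {}  # transaction id -> {key: new value}

    def read(self, key):
        # Readers only ever see the last committed value.
        return self.committed[key]

    def update(self, txn, key, value):
        self.pending.setdefault(txn, {})[key] = value

    def commit(self, txn):
        # Publish all buffered writes of this transaction at once.
        self.committed.update(self.pending.pop(txn, {}))

    def rollback(self, txn):
        self.pending.pop(txn, None)


store = CommittedOnlyStore({1: 20})   # users.id = 1 has age 20
first = store.read(1)                 # transaction 1, first query
store.update("T2", 1, 21)             # transaction 2 updates, not yet committed
during = store.read(1)                # uncommitted change is not visible
store.commit("T2")                    # transaction 2 commits
second = store.read(1)                # transaction 1 now sees the new value

print(first, during, second)
```

The read taken while transaction 2 is still uncommitted matches the first read, so there is no dirty read. The read taken after the commit differs from the first one, which is the non-repeatable read.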

Repeatable reads


Repeatable read (REPEATABLE READ): because the read committed isolation level allows non-repeatable reads, an isolation level one step higher is needed to prevent them. That level is called repeatable read (a rather blunt name, isn't it!).

Locking under repeatable read

At the moment a transaction reads data (that is, the moment it starts to read), it must first add a row-level shared lock to that data, which is not released until the end of the transaction.

At the moment a transaction updates data (that is, the moment the update occurs), it must first add a row-level exclusive lock to that data, which is not released until the end of the transaction.

Phenomena:

While transaction 1 is reading a row, transaction 2 can also read that row (because transaction 1 holds only a row-level shared lock on the row, transaction 2 can add its own shared lock and read the data).

From the moment transaction 1 reads a row until transaction 1 ends, transaction 2 cannot modify that row. (Transaction 1 adds a shared lock when it reads and does not release it until the transaction is committed, so during that whole time no other transaction can add an exclusive lock to the row. Repeatable read therefore prevents non-repeatable reads.)

While transaction 1 updates a row, transaction 2 cannot update that row until transaction 1 ends. (A transaction that updates data adds an exclusive lock to the row and holds it until the transaction ends. Until the updating transaction commits, no other transaction can add a shared lock to read the row. Like read committed, repeatable read therefore prevents dirty reads.)
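The only difference between read committed and repeatable read in this lock-based description is how long the shared read lock is held. A minimal sketch of that difference follows. LockManager and writer_can_update_after_read are hypothetical names, the lock manager never waits (it just reports whether a lock could be granted), and the key "users:1" is an arbitrary label for the row with id = 1.

```python
class LockManager:
    """Toy row-lock table with shared (S) and exclusive (X) locks."""

    def __init__(self):
        self.shared = {}     # row key -> set of transaction ids holding S
        self.exclusive = {}  # row key -> transaction id holding X

    def acquire_shared(self, txn, key):
        owner = self.exclusive.get(key)
        if owner is not None and owner != txn:
            return False  # another transaction holds X on this row
        self.shared.setdefault(key, set()).add(txn)
        return True

    def acquire_exclusive(self, txn, key):
        owner = self.exclusive.get(key)
        if owner is not None and owner != txn:
            return False  # another transaction holds X on this row
        if self.shared.get(key, set()) - {txn}:
            return False  # another transaction holds S on this row
        self.exclusive[key] = txn
        return True

    def release_shared(self, txn, key):
        self.shared.get(key, set()).discard(txn)


def writer_can_update_after_read(hold_until_end):
    """T1 reads a row, then T2 tries to update it before T1 has ended."""
    locks = LockManager()
    locks.acquire_shared("T1", "users:1")
    if not hold_until_end:
        # Read committed: the shared lock is dropped right after the read.
        locks.release_shared("T1", "users:1")
    return locks.acquire_exclusive("T2", "users:1")


print("read committed:", writer_can_update_after_read(hold_until_end=False))
print("repeatable read:", writer_can_update_after_read(hold_until_end=True))
```

When the shared lock is released immediately, the writer gets its exclusive lock and a later re-read can return a different value. When the lock is held to the end of the transaction, the writer is refused.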

For example

Transaction 1                               Transaction 2
/* Query 1 */
SELECT * FROM users WHERE id = 1;
COMMIT;
                                            /* Query 2 */
                                            UPDATE users SET age = 21 WHERE id = 1;
                                            COMMIT;
                                            /* lock-based REPEATABLE READ: the UPDATE waits until transaction 1 has committed */

In the example above, transaction 2 can change the row only after transaction 1 has committed. Therefore, from the beginning to the end of transaction 1, no matter how many times it reads the row, the result is the same.

From this example we can conclude that the repeatable read isolation level prevents non-repeatable reads. There is, however, another read phenomenon that repeatable read cannot prevent: the phantom read. Consider the following example:

Transaction 1                                       Transaction 2
/* Query 1 */
SELECT * FROM users WHERE age BETWEEN 10 AND 30;
                                                    /* Query 2 */
                                                    INSERT INTO users VALUES ( 3, 'Bob', 27 );
                                                    COMMIT;
/* Query 1 */
SELECT * FROM users WHERE age BETWEEN 10 AND 30;

The two transactions execute as follows:

1. The first query of transaction 1 uses the condition age BETWEEN 10 AND 30. Suppose ten records meet the condition. Transaction 1 adds row-level shared locks to those ten records, so no other transaction can change them.

2. Transaction 2 executes an SQL statement that inserts a record into the table. Because no transaction holds a table-level lock on the table at this point, the insert succeeds.

3. When transaction 1 executes SELECT * FROM users WHERE age BETWEEN 10 AND 30; again, eleven records are returned. The extra record is the one just inserted by transaction 2.

Therefore, the two range queries of transaction 1 return different results. This is the phantom read mentioned earlier.
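The three steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the table is a plain dictionary, the two initial rows ("Joe" and "Jill") are invented sample data, and the row-level shared locks of transaction 1 are modelled as a set of locked row ids. The inserted row (3, 'Bob', 27) is the one from the example.

```python
# users table: id -> (name, age). The two initial rows are invented sample data.
rows = {1: ("Joe", 20), 2: ("Jill", 25)}


def range_query(table, low, high):
    """Return the ids of all rows whose age lies in [low, high]."""
    return sorted(rid for rid, (_, age) in table.items() if low <= age <= high)


# Step 1: transaction 1 runs the range query and locks every row it read.
first = range_query(rows, 10, 30)
locked_rows = set(first)          # row-level shared locks held by transaction 1

# Step 2: transaction 2 inserts a new row. Row-level locks only cover rows
# that already exist, so nothing blocks the insert.
new_id = 3
insert_blocked = new_id in locked_rows
if not insert_blocked:
    rows[new_id] = ("Bob", 27)

# Step 3: transaction 1 repeats the same range query.
second = range_query(rows, 10, 30)

print(first, second)
```

The rows transaction 1 locked are unchanged, yet the second result contains an additional row. Locking existing rows cannot protect a range against new rows.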

Serializable


Serializable (SERIALIZABLE) is the highest isolation level. The phantom reads that none of the previous isolation levels can prevent are prevented at the serializable level.

Phantom reads occur because a transaction does not take range locks when it runs a range query. (A range lock covers the range described by the WHERE clause of a SELECT query.)

Locking under serializable

When a transaction reads data, it must first add a table-level shared lock to the table, which is not released until the end of the transaction.

When a transaction updates data, it must first add a table-level exclusive lock to the table, which is not released until the end of the transaction.

Phenomena:

While transaction 1 is reading records in table A, transaction 2 can also read table A but cannot update, insert into, or delete from table A until transaction 1 ends. (Because transaction 1 holds a table-level shared lock on the table, other transactions can only add shared locks to read data and cannot perform any other operation.)

While transaction 1 is updating a record in table A, transaction 2 cannot read any record in table A, nor can it update, insert into, or delete from table A, until transaction 1 ends. (Transaction 1 holds a table-level exclusive lock on the table, so other transactions can add neither shared nor exclusive locks to it and therefore cannot perform any operation.)
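Both cases follow from the usual compatibility rule for shared (S) and exclusive (X) locks, applied at table level. A minimal sketch follows. The function names compatible and can_run are hypothetical, and the rule that every SELECT needs S while every other statement needs X is the simplification used in this article rather than the behaviour of a particular DBMS.

```python
def compatible(held, requested):
    """Two locks on the same table coexist only if both are shared."""
    return held == "S" and requested == "S"


def can_run(operation, held_by_other):
    """Can a statement run while another transaction holds a table lock?

    operation:      "select", "insert", "update" or "delete"
    held_by_other:  "S", "X", or None if no other transaction holds a lock
    """
    needed = "S" if operation == "select" else "X"
    return held_by_other is None or compatible(held_by_other, needed)


for held in ("S", "X"):
    for op in ("select", "insert", "update", "delete"):
        verdict = "allowed" if can_run(op, held) else "blocked"
        print(f"other transaction holds {held}: {op} is {verdict}")
```

With a shared lock held by a reader, only further reads are allowed. With an exclusive lock held by a writer, every statement from other transactions is blocked.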

Although serializable prevents dirty reads, non-repeatable reads, and phantom reads, serializable transactions have the following effects:

1. A transaction cannot read records that other transactions have modified but not yet committed.

2. Until the current transaction completes, other transactions cannot modify the records it has read.

3. Until the current transaction completes, other transactions cannot insert new records whose index key values fall within the index key range read by any statement of the current transaction.
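The third effect is what a range lock provides. The following sketch shows the idea under stated assumptions: RangeLocks is a hypothetical class, a range is a closed interval of key values (here the age values from the earlier example), and a blocked insert is simply reported as not allowed instead of waiting.

```python
class RangeLocks:
    """Toy range-lock table: each lock covers a closed key interval."""

    def __init__(self):
        self.ranges = []  # list of (transaction id, low, high)

    def lock_range(self, txn, low, high):
        # Taken by a transaction when it reads the key range [low, high].
        self.ranges.append((txn, low, high))

    def insert_allowed(self, txn, key):
        # An insert is refused if the key falls in a range locked by another
        # transaction; a transaction is never blocked by its own ranges.
        return all(
            owner == txn or not (low <= key <= high)
            for owner, low, high in self.ranges
        )

    def release(self, txn):
        # Called when the transaction ends.
        self.ranges = [r for r in self.ranges if r[0] != txn]


locks = RangeLocks()
locks.lock_range("T1", 10, 30)                 # T1: WHERE age BETWEEN 10 AND 30

inside = locks.insert_allowed("T2", 27)        # Bob, age 27, falls in the range
outside = locks.insert_allowed("T2", 35)       # a key outside the range
locks.release("T1")                            # T1 ends
after_end = locks.insert_allowed("T2", 27)     # the range is free again

print(inside, outside, after_end)
```

Because the insert into the locked range is refused until transaction 1 ends, a repeated range query inside transaction 1 cannot see a phantom row.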

From read uncommitted to serializable, the four isolation levels offer increasing isolation but decreasing concurrency. Several levels exist so that developers can choose the one that best fits their business needs.

Origin blog.51cto.com/13626762/2545861