A Brief Analysis of Must-Know Database Read Phenomena

"Read phenomena" are anomalies that can occur when data is read while multiple transactions execute concurrently. Knowing them first makes it easier to understand what each isolation level means. They include dirty reads, non-repeatable reads, and phantom reads.

dirty read

A dirty read, also known as reading invalid data, occurs when transaction T1 modifies a value, transaction T2 then reads that value, and T1 later rolls back the modification for some reason, so the data read by T2 is invalid.

In other words, a dirty read happens when one transaction has modified data but has not yet committed the change, and another transaction reads and uses that data. Because the data is uncommitted, what the second transaction read is dirty data, and any operation based on it may be incorrect.

For example:

In the example below, transaction 2 modifies a row but does not commit it, and transaction 1 reads the uncommitted data. If transaction 2 then rolls back the modification or changes the row again, the data read by transaction 1 is incorrect.


Transaction 1                           Transaction 2
/* Query 1 */
SELECT age FROM users WHERE id = 1;
/* will read 20 */
                                        /* Query 2 */
                                        UPDATE users SET age = 21 WHERE id = 1;
                                        /* No commit here */
/* Query 1 */
SELECT age FROM users WHERE id = 1;
/* will read 21 */
                                        ROLLBACK;
                                        /* lock-based DIRTY READ */

In this example, once transaction 2 rolls back, no row with id 1 and age 21 exists. Transaction 1 has therefore read dirty data.
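The sequence above can be replayed with a toy model in Python. This is only a sketch under stated assumptions: `ToyStore` and its `allow_dirty` flag are hypothetical names made up for illustration, and a real database tracks uncommitted changes with locks or row versions rather than a single pending value.

```python
class ToyStore:
    """Toy single-value store: one committed value plus one uncommitted write."""

    def __init__(self, value):
        self.committed = value
        self.pending = None  # uncommitted write of the in-flight writer, if any

    def write(self, value):
        # Record the change without making it durable (no commit yet).
        self.pending = value

    def rollback(self):
        # Discard the uncommitted change.
        self.pending = None

    def read(self, allow_dirty):
        # A dirty-read-capable reader sees the uncommitted value when one exists.
        if allow_dirty and self.pending is not None:
            return self.pending
        return self.committed


age = ToyStore(20)
first_read = age.read(allow_dirty=True)      # transaction 1 reads 20
age.write(21)                                # transaction 2 updates, no commit
dirty_read = age.read(allow_dirty=True)      # transaction 1 sees the uncommitted 21
safe_read = age.read(allow_dirty=False)      # a committed-only reader still sees 20
age.rollback()                               # transaction 2 rolls back
after_rollback = age.read(allow_dirty=True)  # 21 never existed as committed data

print(first_read, dirty_read, safe_read, after_rollback)  # 20 21 20 20
```

The value 21 that transaction 1 observed was never committed, which is exactly what makes the read dirty.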

non-repeatable read

A non-repeatable read occurs when two identical queries within a single transaction return different data, because another transaction committed a modification between the two queries. For example, transaction T1 reads a piece of data, transaction T2 reads and modifies it, and when T1 reads the data again to verify the value it read, it gets a different result.

Put more simply: a transaction reads the same data more than once. Before it finishes, a second transaction accesses the same data and modifies it between the first transaction's two reads. The two reads within the first transaction can therefore return different data. This is why it is called a non-repeatable read: the original read cannot be repeated.

For example:

In lock-based concurrency control, a non-repeatable read can occur when no read lock is acquired for a SELECT, or when the read lock is released as soon as the SELECT completes. In multiversion concurrency control, a non-repeatable read can occur when a transaction affected by a commit conflict is not required to roll back.

Transaction 1                           Transaction 2
/* Query 1 */
SELECT * FROM users WHERE id = 1;
                                        /* Query 2 */
                                        UPDATE users SET age = 21 WHERE id = 1;
                                        COMMIT;
                                        /* in multiversion concurrency
                                           control, or lock-based READ COMMITTED */
/* Query 1 */
SELECT * FROM users WHERE id = 1;
COMMIT;
/* lock-based REPEATABLE READ */
 

In this example, transaction 2 commits successfully, so its changes to the row with id 1 become visible to other transactions. But transaction 1 has already read a different "age" value from that row.
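A small Python sketch can make the difference visible. It is a toy model, not how any real engine works: the `Txn` class and its `repeatable` flag are hypothetical, and remembering previously read values only loosely imitates what read locks or MVCC snapshots achieve.

```python
class Txn:
    """Toy reader transaction over a shared dict of committed rows."""

    def __init__(self, table, repeatable):
        self.table = table
        self.repeatable = repeatable
        self.seen = {}  # values this transaction has already read

    def read(self, key):
        # A repeatable reader returns what it saw the first time.
        if self.repeatable and key in self.seen:
            return self.seen[key]
        value = self.table[key]
        self.seen[key] = value
        return value


ages = {1: 20}  # committed rows: id -> age

t1_committed = Txn(ages, repeatable=False)   # loosely like READ COMMITTED
t1_repeatable = Txn(ages, repeatable=True)   # loosely like REPEATABLE READ
before = (t1_committed.read(1), t1_repeatable.read(1))

ages[1] = 21  # transaction 2: UPDATE users SET age = 21 WHERE id = 1; COMMIT;

after = (t1_committed.read(1), t1_repeatable.read(1))
print(before, after)  # (20, 20) (21, 20)
```

The first reader gets 20 and then 21 inside one transaction, which is the non-repeatable read; the second reader gets 20 both times.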

phantom read

A phantom read is a phenomenon that occurs when transactions are not executed in isolation from one another. For example, the first transaction modifies data in a table, and the modification covers all rows in the table. At the same time, a second transaction also changes the table by inserting a new row. The user running the first transaction then finds that the table still contains a row that was not modified, as if seeing a phantom. The usual way to prevent phantom reads is to add a range lock (RangeS) that locks the searched range as read-only.

A phantom read is a special case of a non-repeatable read: it may occur when a transaction performs a SELECT ... WHERE operation without acquiring a range lock.

For example:

When transaction 1 executes SELECT...

Transaction 1                           Transaction 2
/* Query 1 */
SELECT * FROM users
WHERE age BETWEEN 10 AND 30;
                                        /* Query 2 */
                                        INSERT INTO users VALUES ( 3, 'Bob', 27 );
                                        COMMIT;
/* Query 1 */
SELECT * FROM users
WHERE age BETWEEN 10 AND 30;

In this example, transaction 1 executes the same query twice. Between the two queries, however, a row that satisfies transaction 1's query condition is added to the database, so the second query returns a row that the first did not: a phantom read.
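The same range query can be replayed in Python as a rough illustration. The starting rows for ids 1 and 2 are assumed sample data (the article only shows the inserted row), `ages_between` is a hypothetical helper, and the copied dict merely stands in for whatever snapshot or range-locking mechanism a real database would use.

```python
def ages_between(rows, low, high):
    """Return the ids of rows whose age lies in [low, high], like the BETWEEN query."""
    return sorted(row_id for row_id, (_name, age) in rows.items() if low <= age <= high)


# Assumed starting rows: id -> (name, age). Only the row with id 3 comes from the example.
users = {1: ("Alice", 20), 2: ("Carol", 25)}

snapshot = dict(users)                    # view frozen when transaction 1 started
first = ages_between(users, 10, 30)       # transaction 1, first query

users[3] = ("Bob", 27)                    # transaction 2: INSERT ...; COMMIT;

second = ages_between(users, 10, 30)      # same query again, reading live data
isolated = ages_between(snapshot, 10, 30) # same query against the frozen view

print(first, second, isolated)  # [1, 2] [1, 2, 3] [1, 2]
```

Row 3 is the phantom: no existing row changed, yet the result set of the identical query grew. Reading from the frozen view avoids it in this toy model.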

solution

To prevent read phenomena such as dirty reads, non-repeatable reads, and phantom reads, raise the transaction isolation level. However, the higher the isolation level, the lower the concurrency. Readers therefore need to make trade-offs according to business needs.
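As a rough aid for that trade-off, the sketch below encodes which phenomena each of the four SQL-standard isolation levels still permits and picks the weakest level that rules out the ones you care about. The article does not list the levels itself, so the table reflects the SQL standard's definitions; `weakest_level_preventing` is a hypothetical helper, and actual engines may be stricter than the standard requires.

```python
# Phenomena each SQL-standard isolation level still permits, weakest level first.
ALLOWED = {
    "READ UNCOMMITTED": {"dirty read", "non-repeatable read", "phantom read"},
    "READ COMMITTED": {"non-repeatable read", "phantom read"},
    "REPEATABLE READ": {"phantom read"},
    "SERIALIZABLE": set(),
}


def weakest_level_preventing(unwanted):
    """Return the first (weakest) level that permits none of the unwanted phenomena."""
    unwanted = set(unwanted)
    for level, allowed in ALLOWED.items():  # dicts keep insertion order
        if not (allowed & unwanted):
            return level
    raise ValueError("no isolation level found")  # not reached: SERIALIZABLE permits none


print(weakest_level_preventing({"dirty read"}))           # READ COMMITTED
print(weakest_level_preventing({"non-repeatable read"}))  # REPEATABLE READ
print(weakest_level_preventing({"phantom read"}))         # SERIALIZABLE
```

Choosing the weakest sufficient level keeps as much concurrency as the application's correctness requirements allow.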

