SQL transactions and isolation

Transactions

Definition of a transaction

A transaction is a group of statements that together complete one task. If any one of these statements fails, the whole transaction fails: even the statements that have already been executed are undone.

For example:

I go to the bank to transfer money to Brother Wang. The transfer has two steps: first the money is taken out of my account, then it is put into Brother Wang's account. What if the money has been taken out of my account and the bank blows up before it reaches Brother Wang's account? My money would disappear into thin air. That is unacceptable, so we put the two operations into one transaction: if the deposit into Brother Wang's account does not complete, the money is returned to my account.

Properties of transactions

Transactions have four main properties. Taking the first letter of each gives ACID ("acid"):

  1. Atomicity: a transaction is an indivisible unit. Either all of its statements execute successfully and the transaction completes, or, if any statement fails, the statements that have already been executed are undone.
  2. Consistency: transactions always take the database from one consistent state to another. For example, there will never be an order without its order items.
  3. Isolation: transactions are isolated from each other and do not interfere with each other, especially when they access the same data. Specifically, if several transactions want to modify the same data, the data is locked and can be modified by only one transaction at a time; the other transactions must wait for that transaction to finish.
  4. Durability: once a transaction has been committed, its changes are permanent. A power failure or system crash will not undo them.

Creating a transaction

START TRANSACTION;



COMMIT;

Put the statements that make up the transaction between these two commands.

If we notice a mistake partway through, we can issue ROLLBACK instead of COMMIT to undo everything done since START TRANSACTION.
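The bank transfer from the earlier example can be sketched as follows. This is a minimal illustration, not a production design: the accounts table, its columns and the amounts are hypothetical and do not come from the original text.

```sql
-- Hypothetical schema and data, for illustration only.
DROP TABLE IF EXISTS accounts;
CREATE TABLE accounts (
    id      INT PRIMARY KEY,
    owner   VARCHAR(50)    NOT NULL,
    balance DECIMAL(12, 2) NOT NULL
);
INSERT INTO accounts (id, owner, balance)
VALUES (1, 'me', 500.00), (2, 'Brother Wang', 100.00);

START TRANSACTION;

-- Step 1: take the money out of my account.
UPDATE accounts SET balance = balance - 100.00 WHERE id = 1;

-- Step 2: put the money into Brother Wang's account.
UPDATE accounts SET balance = balance + 100.00 WHERE id = 2;

-- Both steps succeeded, so make the changes permanent.
COMMIT;

-- If step 2 had failed, the client would issue ROLLBACK here instead
-- of COMMIT, and the withdrawal from step 1 would be undone as well.
```

Until COMMIT is issued, both updates stay pending in the session that made them and can still be rolled back together.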

Autocommit

Every single SELECT, INSERT, UPDATE or DELETE statement we execute outside an explicit transaction is automatically wrapped in a transaction by MySQL and committed as soon as it succeeds.

SHOW VARIABLES LIKE 'autocommit';

We can see that this variable is ON, which is the default.
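As a rough sketch of what autocommit changes in practice, the session below turns it off, after which statements stay pending until an explicit COMMIT or ROLLBACK. The autocommit_demo table is a hypothetical name used only for this illustration.

```sql
-- Hypothetical table, for illustration only.
DROP TABLE IF EXISTS autocommit_demo;
CREATE TABLE autocommit_demo (id INT PRIMARY KEY, note VARCHAR(50));

-- Turn autocommit off for the current session only.
SET autocommit = 0;

INSERT INTO autocommit_demo (id, note) VALUES (1, 'pending');
-- The row is not committed yet; undo it.
ROLLBACK;

INSERT INTO autocommit_demo (id, note) VALUES (2, 'kept');
-- This time make the change permanent.
COMMIT;

-- Restore the default behaviour.
SET autocommit = 1;

-- Only the committed row (id = 2) remains.
SELECT id, note FROM autocommit_demo;
```

An explicit START TRANSACTION has the same effect for a single transaction without changing the session setting.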

Concurrency issues

Lost updates

Transaction A has not finished its update when transaction B reads the same value, still in its old state. Both compute a new value from that old value, so when B writes its result after A, A's update is overwritten and lost.
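The interleaving described above can be sketched with two sessions. The customers table and the point values are hypothetical; the comments mark which session runs each statement, and the statements are meant to be executed in the order shown.

```sql
-- Setup (run once). Hypothetical table, for illustration only.
DROP TABLE IF EXISTS customers;
CREATE TABLE customers (id INT PRIMARY KEY, points INT NOT NULL);
INSERT INTO customers (id, points) VALUES (1, 10);

-- Session A: read the current value, intending to add 10.
START TRANSACTION;
SELECT points FROM customers WHERE id = 1;   -- reads 10

-- Session B: read the same, still old, value, intending to add 5.
START TRANSACTION;
SELECT points FROM customers WHERE id = 1;   -- reads 10

-- Session A: write the value computed in the application (10 + 10).
UPDATE customers SET points = 20 WHERE id = 1;
COMMIT;

-- Session B: write the value computed from the stale read (10 + 5).
UPDATE customers SET points = 15 WHERE id = 1;
COMMIT;

-- Final value is 15: A's +10 has been overwritten.
SELECT points FROM customers WHERE id = 1;
```

One common way to sidestep this particular pattern, not covered in the text above, is to let the database compute the new value in a single statement such as UPDATE customers SET points = points + 5 WHERE id = 1.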

Dirty reads

Transaction A increases a customer's points from 10 to 20, but before A commits, transaction B reads the new value. Transaction B decides on a discount based on these uncommitted points, and then transaction A is rolled back, so the customer's points are in fact still 10. Transaction B has read data that was never committed to the database and made a decision based on it. This is called a dirty read.
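A minimal two-session sketch of this scenario, assuming InnoDB tables and a hypothetical customers table. Session B lowers its own isolation level to Read Uncommitted so that the dirty read becomes visible.

```sql
-- Setup (run once). Hypothetical table, for illustration only.
DROP TABLE IF EXISTS customers;
CREATE TABLE customers (id INT PRIMARY KEY, points INT NOT NULL);
INSERT INTO customers (id, points) VALUES (1, 10);

-- Session A: raise the points but do not commit yet.
START TRANSACTION;
UPDATE customers SET points = 20 WHERE id = 1;

-- Session B: read at the lowest isolation level.
SET SESSION TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
START TRANSACTION;
SELECT points FROM customers WHERE id = 1;   -- sees 20 (uncommitted)
COMMIT;

-- Session A: change of plan, undo the update.
ROLLBACK;

-- Either session: the committed value was 10 all along.
SELECT points FROM customers WHERE id = 1;
```

If session B runs at Read Committed or higher instead, its SELECT returns 10 while A's update is still uncommitted.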

Non-repeatable reads (or inconsistent reads)

Transaction A needs to read a customer's points twice. The first read returns 10 points. Transaction B then updates the points to 0 and commits. When transaction A reads the points a second time, it gets 0. The same read inside one transaction returned two different values: this is a non-repeatable (or inconsistent) read.

Sometimes we want transaction A to see consistent data and to make decisions based on the state of the data when it started. In that case we must raise the isolation level of transaction A so that it cannot see changes made by other transactions while it runs, even committed ones. SQL defines a standard isolation level for this, Repeatable Read: no matter what other transactions change in the meantime, every read returns the data as it was when transaction A started.
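The difference can be tried out with two sessions, assuming InnoDB and a hypothetical customers table. Session A runs at Read Committed here, which is where the non-repeatable read shows up.

```sql
-- Setup (run once). Hypothetical table, for illustration only.
DROP TABLE IF EXISTS customers;
CREATE TABLE customers (id INT PRIMARY KEY, points INT NOT NULL);
INSERT INTO customers (id, points) VALUES (1, 10);

-- Session A: first read.
SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;
START TRANSACTION;
SELECT points FROM customers WHERE id = 1;   -- 10

-- Session B: change the value and commit.
START TRANSACTION;
UPDATE customers SET points = 0 WHERE id = 1;
COMMIT;

-- Session A: second read inside the same transaction.
SELECT points FROM customers WHERE id = 1;   -- 0 at Read Committed
COMMIT;
```

Repeating the experiment with REPEATABLE READ in session A makes the second SELECT return 10 again, because both reads use the same snapshot.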

Phantom reads

A phantom read is usually described as transaction A running the same SELECT twice and getting different sets of rows, for example 10 records from the first SELECT and 11 from the second, because another transaction inserted a row in between. With plain SELECTs this behaviour appears at the RU and RC levels, but not at MySQL's default RR isolation level, where both reads come from the same snapshot.

At the RR level the more important symptom is a different one: the state of the data shown by a SELECT cannot support the business operation that follows. To be specific: a SELECT shows that a record does not exist, so the transaction goes on to insert it, but the INSERT fails because the record already exists. It is as if the transaction had been hallucinating.
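The "record does not exist, but the insert fails" situation can be sketched with two sessions at the default Repeatable Read level. The users table and its contents are hypothetical and InnoDB is assumed.

```sql
-- Setup (run once). Hypothetical table, for illustration only.
DROP TABLE IF EXISTS users;
CREATE TABLE users (id INT PRIMARY KEY, name VARCHAR(50) NOT NULL);

-- Session A: check whether the record exists.
START TRANSACTION;
SELECT * FROM users WHERE id = 5;            -- no rows

-- Session B: insert the same record (autocommit commits it at once).
INSERT INTO users (id, name) VALUES (5, 'from B');

-- Session A: the record "does not exist", so insert it.
INSERT INTO users (id, name) VALUES (5, 'from A');
-- Fails with a duplicate-key error, although the SELECT above found nothing.

-- Session A: the snapshot read still shows no such record.
SELECT * FROM users WHERE id = 5;            -- still no rows
ROLLBACK;
```

Session A's snapshot and the actual table contents disagree, which is exactly why the insert appears to fail for no reason.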

Isolation levels

Read Uncommitted

The lowest isolation level, which solves none of the problems above: other transactions can read a transaction's changes before it commits. It can be thought of as no isolation at all.

Read Committed

Only committed data can be read, which prevents dirty reads.

Repeatable Read

Repeated reads within a transaction return the same result. Even if the data is changed and committed by another transaction in the meantime, the transaction keeps seeing the data as it was at its start.

Serializable

This level prevents all of the problems above, including phantom reads. If another transaction is modifying data that could affect our result, our transaction must wait for it to finish. This clearly puts a burden on the server, because managing waiting transactions consumes additional storage and CPU resources.

There is a handy, if simplified, way to remember these levels. Read Committed restricts the read itself (uncommitted data cannot be read), so it solves dirty reads.

Repeatable Read can be thought of as locking the rows, so it solves the problems caused by concurrent modification of a row (lost updates, non-repeatable reads).

Serializable can be thought of as locking the table, so it also solves the problems caused by rows being inserted into or deleted from the table (phantom reads).

A more detailed explanation:

  • For transactions at the Read Uncommitted isolation level, data modified by uncommitted transactions can be read, so reads simply return the latest version of the data.
  • For transactions at the Serializable isolation level, parallel access is avoided by taking read and write locks.
  • For transactions at the Read Committed and Repeatable Read isolation levels, isolation is implemented through a Read View. The two levels differ in when the Read View is created. You can think of a Read View as a snapshot of the data, like a photo that freezes the scenery at one moment. Read Committed creates a new Read View before each statement is executed, while Repeatable Read creates one Read View when the transaction starts and uses it for the whole transaction.
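The timing of the snapshot can be made explicit in MySQL. In InnoDB, a plain START TRANSACTION does not take the snapshot right away; the first consistent read in the transaction does. The WITH CONSISTENT SNAPSHOT modifier asks for the snapshot at the moment the transaction starts. The sketch below assumes InnoDB at the Repeatable Read level and uses a hypothetical snapshot_demo table.

```sql
-- Setup (run once). Hypothetical table, for illustration only.
DROP TABLE IF EXISTS snapshot_demo;
CREATE TABLE snapshot_demo (id INT PRIMARY KEY, val INT NOT NULL);
INSERT INTO snapshot_demo (id, val) VALUES (1, 100);

-- Session A: take the snapshot immediately, before any read.
SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ;
START TRANSACTION WITH CONSISTENT SNAPSHOT;

-- Session B: change the row and commit (autocommit).
UPDATE snapshot_demo SET val = 200 WHERE id = 1;

-- Session A: the read uses the snapshot taken at the start.
SELECT val FROM snapshot_demo WHERE id = 1;   -- 100
COMMIT;

-- Session A: a new transaction gets a new snapshot.
SELECT val FROM snapshot_demo WHERE id = 1;   -- 200
```

With a plain START TRANSACTION in session A, the first SELECT would create the snapshot after B's commit and would therefore return 200.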

Lower isolation levels allow more concurrency: more users can access the same data at the same time, and because fewer locks are needed for isolation, performance is higher. The price is more concurrency problems.

Conversely, higher isolation levels reduce concurrency problems, but at the expense of performance and scalability, because more locks and resources are needed.

The default level in MySQL is Repeatable Read. It prevents the concurrency problems above except phantom reads, and it is faster than Serializable. Although it cannot fully prevent phantom reads, it avoids them in most cases, so most of the time this default should be kept.

If preventing phantom reads matters for a specific transaction, that transaction can be switched to Serializable.

For batch reports that do not need strict consistency, or for data that is rarely updated, the two lowest levels (Read Uncommitted and Read Committed) can be considered for better performance.

In general, keep the default isolation level and change it only when necessary.

To set the isolation level:

SET [SESSION | GLOBAL] TRANSACTION ISOLATION LEVEL SERIALIZABLE;

SESSION sets the isolation level for all subsequent transactions in the current session (connection). GLOBAL sets it for all transactions in all sessions opened afterwards. With neither keyword, the setting applies only to the next transaction in the current session.

Write the desired isolation level after LEVEL.
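A few variations of the statement, as a sketch. The transaction_isolation system variable used to inspect the current level exists in MySQL 8.0 and in 5.7.20 or later; older versions call it tx_isolation. Setting the level globally requires sufficient privileges.

```sql
-- Inspect the current global and session levels.
SELECT @@global.transaction_isolation, @@session.transaction_isolation;

-- All following transactions in this session use Read Committed.
SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;

-- Only the next transaction in this session uses Serializable.
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
START TRANSACTION;
SELECT 1;   -- placeholder for the statements that need Serializable
COMMIT;

-- Sessions opened from now on start at Repeatable Read.
SET GLOBAL TRANSACTION ISOLATION LEVEL REPEATABLE READ;
```

The form without SESSION or GLOBAL is convenient when a single transaction needs a stricter level and everything else should stay at the session default.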

Let's look at an example.

An account balance table contains one record with a balance of 1 million. Two transactions run concurrently: transaction A only queries the balance, and transaction B changes the balance to 2 million. Transaction A reads the balance three times; call the results V1, V2 and V3. Judging from the results below, V1 is read after transaction B has made its change but before B commits, V2 after B commits, and V3 after transaction A itself has committed.

What are the results under different isolation levels?

Read Uncommitted: V1, V2 and V3 are all 2 million. Transaction A can see transaction B's change before B commits.

Read Committed: V1 is 1 million; V2 and V3 are 2 million. Transaction B's change is not visible to V1 because it has not been committed yet.

Repeatable Read: V1 and V2 are 1 million; V3 is 2 million. By definition, repeated reads within one transaction return consistent results. In terms of implementation, the whole transaction uses the snapshot taken at its start.

Serializable: V1 and V2 are 1 million; V3 is 2 million. When transaction A starts reading, it locks the data, and the lock is not released until transaction A ends. Only then can transaction B modify the data.
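The experiment can be approximated with the two-session script below. It is a reconstruction under assumptions, not the original timeline: the account_balance table is hypothetical, and the initial read in session A before B's change is assumed so that A's snapshot (or lock) exists before B writes.

```sql
-- Setup (run once). Hypothetical table, for illustration only.
DROP TABLE IF EXISTS account_balance;
CREATE TABLE account_balance (id INT PRIMARY KEY, balance BIGINT NOT NULL);
INSERT INTO account_balance (id, balance) VALUES (1, 1000000);

-- Session A: choose the level under test, then start reading.
SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;  -- swap in another level
START TRANSACTION;
SELECT balance FROM account_balance WHERE id = 1;   -- initial read

-- Session B: change the balance, do not commit yet.
START TRANSACTION;
UPDATE account_balance SET balance = 2000000 WHERE id = 1;
-- At SERIALIZABLE in session A, this UPDATE waits until A commits.

-- Session A
SELECT balance FROM account_balance WHERE id = 1;   -- V1

-- Session B
COMMIT;

-- Session A
SELECT balance FROM account_balance WHERE id = 1;   -- V2
COMMIT;
SELECT balance FROM account_balance WHERE id = 1;   -- V3
```

At Serializable the order changes slightly in practice: B's UPDATE and COMMIT only complete after A's COMMIT, so V3 should be read once B has finished.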

Deadlock

The INSERT, UPDATE and DELETE statements in a transaction lock the rows they touch. If two concurrent transactions each hold a lock on a row that the other needs for its next step, a deadlock occurs. Deadlocks cannot be avoided completely, but there are ways to make them less likely.

For example, when transaction 1 reaches its second update, it finds that the row is locked by transaction 2.

When transaction 2 reaches its second update, it finds that the row is locked by transaction 1. Neither can continue, because a lock is released only when the transaction as a whole finishes.
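A minimal two-session sketch of such a deadlock, assuming InnoDB and a hypothetical accounts table. The two transactions update the same two rows in opposite order.

```sql
-- Setup (run once). Hypothetical table, for illustration only.
DROP TABLE IF EXISTS accounts;
CREATE TABLE accounts (id INT PRIMARY KEY, balance INT NOT NULL);
INSERT INTO accounts (id, balance) VALUES (1, 100), (2, 100);

-- Session 1: first update locks row 1.
START TRANSACTION;
UPDATE accounts SET balance = balance - 10 WHERE id = 1;

-- Session 2: first update locks row 2.
START TRANSACTION;
UPDATE accounts SET balance = balance - 10 WHERE id = 2;

-- Session 1: second update needs row 2, which session 2 holds. It waits.
UPDATE accounts SET balance = balance + 10 WHERE id = 2;

-- Session 2: second update needs row 1, which session 1 holds.
UPDATE accounts SET balance = balance + 10 WHERE id = 1;
-- InnoDB detects the cycle and rolls back one of the two transactions
-- with ERROR 1213 (Deadlock found when trying to get lock).

-- The surviving session can then finish with COMMIT; the other one
-- has to retry its transaction from the start.
```

With InnoDB's deadlock detection enabled (the default), the error appears almost immediately rather than after a lock wait timeout.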

Ways to reduce deadlocks:

  1. Pay attention to the order of statements. If two transactions keep deadlocking, check their code (they may be part of a stored procedure) and look at the order of the statements in each transaction. If the transactions update records in opposite orders, deadlocks are very likely. To reduce them, update multiple records in the same order everywhere.
  2. Keep transactions small and short so that they are less likely to conflict with other transactions.
  3. If a transaction operates on a very large table, it may run for a long time and the risk of conflict is high. See whether it can run outside peak hours, when fewer users are active.
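The first point can be sketched as follows, assuming InnoDB and a hypothetical accounts table. Both transactions touch the rows in the same order (row 1, then row 2), so the second one simply waits for the first instead of deadlocking with it.

```sql
-- Setup (run once). Hypothetical table, for illustration only.
DROP TABLE IF EXISTS accounts;
CREATE TABLE accounts (id INT PRIMARY KEY, balance INT NOT NULL);
INSERT INTO accounts (id, balance) VALUES (1, 100), (2, 100);

-- Session 1: row 1 first.
START TRANSACTION;
UPDATE accounts SET balance = balance - 10 WHERE id = 1;

-- Session 2: also row 1 first. This statement waits for session 1.
START TRANSACTION;
UPDATE accounts SET balance = balance + 5 WHERE id = 1;

-- Session 1: row 2 is free, so the transaction can finish.
UPDATE accounts SET balance = balance + 10 WHERE id = 2;
COMMIT;

-- Session 2: its first update now proceeds; continue with row 2.
UPDATE accounts SET balance = balance - 5 WHERE id = 2;
COMMIT;
```

The waiting session is delayed but never has to be rolled back, which is the benefit of a consistent update order.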


Origin blog.csdn.net/chara9885/article/details/131494340