Four characteristics of database transactions

ACID properties

A transaction in a database management system has four characteristics, known by the acronym ACID: atomicity (Atomicity), consistency (Consistency), isolation (Isolation), and durability (Durability).

A transaction is a sequence of operations that either all execute or none execute; it is an indivisible unit of work. (A group of instructions or operations that performs a single logical function is called a transaction.)

Detailed explanation

1. Atomicity

Atomicity means that a transaction is an indivisible unit of work: either all of the operations in the transaction occur, or none of them do.

The classic "transfer from A to B" example illustrates this: the debit from A and the credit to B must succeed or fail together.

In a DBMS, each SQL statement is by default a separate, automatically committed transaction. Only by explicitly issuing START TRANSACTION to open a transaction can a whole block of statements be executed inside one transaction.
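
As a sketch of this behavior, here is a minimal example using Python's standard sqlite3 module (the account table and the amounts are illustrative, not from the original post): a simulated failure between the debit and the credit causes the whole transaction to roll back, so neither change survives.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)",
                 [("A", 1000), ("B", 1000)])
conn.commit()

try:
    with conn:  # opens a transaction; commits on success, rolls back on error
        conn.execute("UPDATE account SET balance = balance - 100 WHERE name = 'A'")
        raise RuntimeError("simulated crash between the debit and the credit")
        # the credit to B below never runs
        conn.execute("UPDATE account SET balance = balance + 100 WHERE name = 'B'")
except RuntimeError:
    pass

# Because the transaction rolled back, the debit from A is gone too.
print(conn.execute("SELECT balance FROM account ORDER BY name").fetchall())
# [(1000,), (1000,)]
```

Here `with conn:` plays the role of START TRANSACTION ... COMMIT/ROLLBACK: sqlite3 commits on a clean exit and rolls back when an exception escapes the block.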

2. Consistency

Consistency means that the integrity constraints of the database are violated neither before the transaction starts nor after it ends. In other words, a database transaction cannot break the integrity of the relational data or the consistency of the business logic.

For example, if A transfers money to B, the total of the two deposits remains unchanged whether or not the transfer succeeds (this is consistency of the business logic; consistency of the database's relational integrity constraints is even easier to understand).

Guarantee mechanisms (again from two sides): at the database level, before and after a transaction executes, the data must satisfy the constraints you have set (unique constraints, foreign key constraints, check constraints, etc.) and any triggers; in addition, the database's internal data structures (such as B-tree indexes or doubly linked lists) must remain correct. Consistency of the business logic is generally the developer's responsibility, although parts of it can be pushed down to the database level.
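
As a hedged illustration of the database-level side, the following sketch (Python's sqlite3; the table and the "balance never negative" constraint are hypothetical) shows a CHECK constraint rejecting an update that would violate it, so the data stays consistent:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The CHECK constraint lets the database itself enforce "balance never negative".
conn.execute("""
    CREATE TABLE account (
        name    TEXT PRIMARY KEY,
        balance INTEGER NOT NULL CHECK (balance >= 0)
    )
""")
conn.execute("INSERT INTO account VALUES ('A', 1000)")
conn.commit()

try:
    # Withdrawing 2000 would leave the balance at -1000 and violate the constraint.
    conn.execute("UPDATE account SET balance = balance - 2000 WHERE name = 'A'")
except sqlite3.IntegrityError:
    conn.rollback()  # the offending update is discarded

print(conn.execute("SELECT balance FROM account").fetchone())  # (1000,)
```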

3. Isolation

When multiple transactions access the database concurrently, the transactions are isolated from one another: one transaction should not affect the outcome of another.

In a concurrent environment, when different transactions manipulate the same data at the same time, each transaction has its own complete data space. Modifications made by a concurrent transaction must be isolated from those made by any other concurrent transaction. When a transaction reads data that another transaction is updating, it sees either the state before the other transaction's modification or the state after it, never an intermediate state.

The most complex transaction problems arise from isolation. Complete isolation is unrealistic: it would require the database to execute only one transaction at a time, which would seriously hurt performance.

For the transaction isolation levels (the degrees of influence allowed between transactions), see the corresponding blog post.

4. Durability

This is the easiest property to understand: durability means that once a transaction completes, the changes it made to the database are permanently saved and will not be rolled back. (A completed transaction is a permanent part of the system; its effect on the system is permanent, and the modification persists even through a fatal system failure.)

Write-ahead logging: SQL Server, for example, uses WAL (Write-Ahead Logging) to guarantee the durability of transactions. Before data is written to the database files, the changes are first written to the log; only then are the logged changes applied to storage.
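
Although the post mentions SQL Server, the same idea can be observed in SQLite through Python's standard sqlite3 module. This sketch (file name and table are illustrative) switches a database to WAL mode; after that, committed changes land in the `-wal` log file before being checkpointed into the main database file:

```python
import os
import sqlite3
import tempfile

# WAL needs a file-backed database (an in-memory DB has no WAL file).
path = os.path.join(tempfile.mkdtemp(), "demo.db")
conn = sqlite3.connect(path)

# Switch from the default rollback journal to write-ahead logging.
mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
print(mode)  # wal

conn.execute("CREATE TABLE t (x INTEGER)")
conn.execute("INSERT INTO t VALUES (1)")
conn.commit()

# The committed change is appended to demo.db-wal first,
# and is merged into demo.db at the next checkpoint.
print(os.path.exists(path + "-wal"))  # True
```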

Three types of data reading problems

  1. Dirty read

  2. Non-repeatable read

  3. Phantom read

Two types of data update problems

  1. The first type of lost update

  2. The second type of lost update

First, let's look at "dirty read". Seeing the word "dirty", you might think of something filthy; how can data be dirty? It is simply what we often call "garbage data". For example, suppose two transactions are executing concurrently (that is, competing); the timeline table below shows exactly what happens:

[Timeline table of transactions A and B through time point T6]
The balance should be 1100 yuan! But look at time point T6: transaction A's query returns a balance of 900 yuan. That is dirty data, produced by transaction B's uncommitted update. Clearly the transactions were not isolated; one seeped into the other and made a mess.

So dirty reads matter and must be prevented; isolating transactions is the bottom line.
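
To make the timeline concrete without a real multi-session setup, here is a toy simulation (all class names, values, and amounts are illustrative, not from the original post): a pending, uncommitted write is visible to a READ UNCOMMITTED-style read, then vanishes when the writer rolls back.

```python
# A toy "database" with commit/rollback, replaying the dirty-read timeline.
class ToyDB:
    def __init__(self, balance):
        self.committed = balance   # the durable, committed value
        self.uncommitted = None    # pending write of an open transaction

    def write(self, value):        # a transaction updates but does not commit
        self.uncommitted = value

    def read(self, allow_dirty):   # READ UNCOMMITTED vs READ COMMITTED behavior
        if allow_dirty and self.uncommitted is not None:
            return self.uncommitted
        return self.committed

    def rollback(self):            # the transaction aborts; pending write is gone
        self.uncommitted = None

db = ToyDB(1000)
db.write(900)                      # transaction B withdraws 100, not yet committed
dirty = db.read(allow_dirty=True)  # transaction A reads under READ UNCOMMITTED: 900
db.rollback()                      # B aborts; officially, the 900 never existed
print(dirty, db.read(allow_dirty=False))  # 900 1000
```

Transaction A acted on a value (900) that was never part of any committed state, which is exactly what makes the data "dirty".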

 

How to explain a non-repeatable read? Again, a similar example illustrates it:

[Timeline table: transaction A queries the balance twice; transaction B withdraws and commits in between]

In fact, transaction A did nothing but query twice, yet between the two reads the money changed from 1000 to 0; that is a non-repeatable read. Understandably, someone else did it, not A. And in fact this is reasonable: transaction B committed, the database persisted the result, so when transaction A reads again it naturally sees the new value.

This phenomenon is basically understandable, but in some very strict scenarios it is not allowed. It too is caused by a lack of isolation between transactions, though in many cases we can live with it.
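
A minimal simulation of the timeline (plain Python; the values are illustrative): transaction A issues the same read twice and gets different answers because B commits in between.

```python
# The committed state of the database, as every read sees it.
committed_balance = 1000

def read():
    # Transaction A's query: under READ COMMITTED it always sees the latest commit.
    return committed_balance

first = read()         # A's first read: 1000
committed_balance = 0  # B withdraws everything and COMMITs in between
second = read()        # A's second read, inside the same transaction: 0
print(first, second)   # 1000 0 -- the same query, two different answers
```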

 

Phantom read. The word "phantom" means a ghost or apparition, which is why it is translated as "phantom read" rather than "ghost read": the data seems to change as if a ghost were at work, and you cannot tell why; it is genuinely dizzying. Again, an example:

[Timeline table: a bank teller sums the total deposits twice; another transaction inserts a new account in between]

The bank teller sees a different result every time the total deposits are counted. In a sense that is quite normal: the total went up because someone was depositing money at that moment. But if a banking system were really designed this way, it would be game over. This, too, is caused by a lack of transaction isolation, yet for most application systems it seems normal, understandable, and allowed. Banking systems, however, have very strict requirements: while counting, they may even block all other operations, which corresponds to a very high isolation level (presumably SERIALIZABLE).
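
A sketch of the teller's experience using Python's sqlite3 (a single connection issuing statements sequentially stands in for the two interleaved sessions; names and amounts are illustrative): the second SUM differs because a new "phantom" row appeared between the two counts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)",
                 [("A", 1000), ("B", 2000)])
conn.commit()

# The teller counts the total deposits.
total1 = conn.execute("SELECT SUM(balance) FROM account").fetchone()[0]  # 3000

# Meanwhile another session opens a new account and commits -- the phantom row.
conn.execute("INSERT INTO account VALUES ('C', 500)")
conn.commit()

# The teller repeats the very same count and gets a different answer.
total2 = conn.execute("SELECT SUM(balance) FROM account").fetchone()[0]  # 3500
print(total1, total2)  # 3000 3500
```

Unlike a non-repeatable read (an existing row changed), the phantom comes from a row that did not exist during the first query.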

 

The first type of lost update: when transaction A is rolled back, it overwrites the committed updates of transaction B. This kind of error can cause serious problems, as the following withdrawal-and-transfer scenario shows:

However, none of the four standard isolation levels allows this to happen; otherwise things would be utter chaos. A transaction that never committed, merely rolled back, overwriting other people's committed data would be terrifying.


The second type of lost update: transaction B overwrites data that transaction A has already committed, causing transaction A's update to be lost.
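
The read-modify-write race behind the second type of lost update can be sketched with Python's sqlite3 (one connection, with sequential statements standing in for two interleaved sessions; names and amounts are illustrative), together with the usual fix of making the update a single atomic statement:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account VALUES ('X', 1000)")
conn.commit()

def balance():
    return conn.execute("SELECT balance FROM account WHERE name='X'").fetchone()[0]

# Lost update: both sessions read 1000, then each writes back its own
# computed value; B's write silently overwrites A's deposit.
a_read = balance()   # A reads 1000
b_read = balance()   # B reads 1000
conn.execute("UPDATE account SET balance=? WHERE name='X'", (a_read + 100,))  # A: 1100
conn.execute("UPDATE account SET balance=? WHERE name='X'", (b_read - 50,))   # B: 950
print(balance())     # 950 -- A's +100 is lost; the correct answer was 1050

# Fix: push the read-modify-write into one atomic UPDATE statement,
# so each change is applied to the current value, not a stale one.
conn.execute("UPDATE account SET balance=1000 WHERE name='X'")  # reset for the demo
conn.execute("UPDATE account SET balance = balance + 100 WHERE name='X'")
conn.execute("UPDATE account SET balance = balance - 50 WHERE name='X'")
print(balance())     # 1050
```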

 

To sum up, the read-related problems caused by concurrent transactions can each be described in one sentence:

  1. Dirty read: transaction A reads data that transaction B has not yet committed, and acts on it.

  2. Non-repeatable read: transaction A reads data that transaction B has changed and committed.

  3. Phantom read: transaction A reads new rows that transaction B has inserted and committed.

The first must be firmly resisted; the latter two can be tolerated in most cases.

This is why transaction isolation levels exist: like walls, they separate transactions from one another. The standard levels handle the concurrency problems as follows:

  Isolation level    | Dirty read | Non-repeatable read | Phantom read
  READ UNCOMMITTED   | possible   | possible            | possible
  READ COMMITTED     | prevented  | possible            | possible
  REPEATABLE READ    | prevented  | prevented           | possible
  SERIALIZABLE       | prevented  | prevented           | prevented


Origin blog.csdn.net/mmk27_word/article/details/108666098