Database transaction

Database transaction (1)

Overview

  A database transaction is a series of operations performed as a single logical unit of work. Transaction processing ensures that data-oriented resources are not permanently updated unless all operations within the transactional unit complete successfully. By combining a set of related operations into a unit that either succeeds or fails as a whole, error recovery is simplified and applications are made more reliable. For a logical unit of work to qualify as a transaction, it must satisfy the so-called ACID properties: Atomicity, Consistency, Isolation, and Durability.

Operating procedures

  Consider an online shopping transaction. The payment process includes at least the following database operations:

  - Update the inventory information for the purchased item
  - Record the customer's payment (which may involve interaction with a banking system)
  - Generate the order and save it to the database
  - Update customer-related information, such as the number of purchases

  Under normal circumstances these operations all proceed smoothly, the transaction succeeds, and all database information related to it is updated. If any step in the series fails, however (for example, an exception occurs while updating the inventory, or the customer's bank account has insufficient funds), the transaction fails. Once the transaction fails, all information in the database must remain in the state it was in before the transaction began. If, say, the final step of updating customer information fails, it must be guaranteed that the failed transaction leaves no trace: the inventory is not updated, the customer is not charged, and no order is generated. Otherwise the information in the database becomes chaotic and unpredictable.
  Database transactions are precisely the technique used to guarantee this all-or-nothing behavior.
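This all-or-nothing behavior can be sketched with Python's built-in sqlite3 module. The schema, table names, and amounts below are illustrative inventions, not part of the original article; the point is only that every step runs inside one transaction that is either committed or rolled back:

```python
import sqlite3

# In-memory database with a toy inventory and orders table (illustrative schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE inventory (item TEXT PRIMARY KEY, stock INTEGER)")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT)")
conn.execute("INSERT INTO inventory VALUES ('book', 10)")
conn.commit()

def place_order(item, payment_ok):
    """Run all steps of the purchase as one transaction."""
    try:
        conn.execute("UPDATE inventory SET stock = stock - 1 WHERE item = ?", (item,))
        if not payment_ok:                  # simulate a failed bank payment
            raise RuntimeError("payment declined")
        conn.execute("INSERT INTO orders (item) VALUES (?)", (item,))
        conn.commit()                       # all steps succeeded: make them permanent
    except Exception:
        conn.rollback()                     # any failure: undo every step

place_order("book", payment_ok=False)       # payment fails, so nothing is changed
stock = conn.execute("SELECT stock FROM inventory WHERE item = 'book'").fetchone()[0]
print(stock)   # 10 -- the inventory update was rolled back
```

Because the inventory update and the order insertion share one transaction, the failed payment rolls both back together.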

ACID properties of database transactions

Atomicity
  A transaction must be an atomic unit of work: either all of its data modifications are performed, or none of them are. The operations in a transaction usually share a common goal and are interdependent; if the system performed only a subset of them, the overall goal of the transaction could be violated. Atomicity eliminates the possibility of the system processing only a subset of the operations.
Consistency
  When it completes, a transaction must leave all data in a consistent state. In a relational database, all integrity rules must be applied to the transaction's modifications, and at the end of the transaction all internal data structures (such as B-tree indexes or doubly linked lists) must be correct. Some of the responsibility for maintaining consistency rests with the application developer, who must ensure that the application enforces all known integrity constraints. For example, when developing an application that transfers money, you should avoid moving the decimal point of the amount during the transfer.
Isolation
  Modifications made by concurrent transactions must be isolated from one another. A transaction sees data either in the state it was in before another concurrent transaction modified it, or in the state after that transaction completed; it never sees an intermediate state. This property is called serializability, because it makes it possible to reload the starting data and replay a series of transactions so that the data ends up in the same state as the original execution. Serializable is the highest isolation level: at this level, a set of concurrently executing transactions produces the same results as running each transaction one after another. Because high isolation limits the number of transactions that can run in parallel, some applications lower the isolation level in exchange for greater throughput.
Durability
  Once a transaction has completed, its effect on the system is permanent: the modifications persist even in the event of a fatal system failure.
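Durability can be observed with a file-backed SQLite database: once a transaction commits, its data survives the connection going away (and, with a real server, the process or machine restarting). The file path and values are illustrative:

```python
import os
import sqlite3
import tempfile

# Use a real file so the data outlives the connection (illustrating durability).
path = os.path.join(tempfile.mkdtemp(), "bank.db")

conn = sqlite3.connect(path)
conn.execute("CREATE TABLE account (owner TEXT, balance INTEGER)")
conn.execute("INSERT INTO account VALUES ('alice', 1000)")
conn.commit()          # the commit makes the change permanent
conn.close()           # simulate the application shutting down

# Re-open the database: the committed balance is still there.
conn2 = sqlite3.connect(path)
balance = conn2.execute("SELECT balance FROM account").fetchone()[0]
print(balance)   # 1000
```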

Database transaction (2)

The problem of data concurrency
    A database may have many clients, and these clients can access it concurrently, so the same data may be accessed by several transactions at once. If the necessary isolation measures are not taken, this leads to various concurrency problems that can destroy the integrity of the data. The problems fall into 5 categories: 3 kinds of data read problems (dirty reads, non-repeatable reads, and phantom reads) and 2 kinds of data update problems (first-type lost updates and second-type lost updates). The scenarios that cause each problem are illustrated below with examples.

Dirty read
    Before explaining dirty reads, a joke: a man with a stutter is pacing in front of the counter of a beverage shop, and the owner greets him enthusiastically: "Have a bottle?" The man stammers, "I... dri... drink...", the owner swiftly opens a can and hands it over, and the man finally gets his words out: "I... drink... I can't afford it!" In this joke, the shop owner performed a dirty read on the stutterer.
    A dirty read occurs when transaction A reads data that transaction B has changed but not yet committed, and operates on the basis of that data. If transaction B is then rolled back, the data that A read was never valid at all. Consider the dirty read caused by a concurrent transfer transaction and withdrawal transaction:
     

time | transfer transaction A                       | withdrawal transaction B
-----|----------------------------------------------|------------------------------------------------
T1   |                                              | start transaction
T2   | start transaction                            |
T3   |                                              | query balance: 1000 yuan
T4   |                                              | withdraw 500 yuan, balance becomes 500 yuan
T5   | query balance: 500 yuan (dirty read)         |
T6   |                                              | roll back; balance restored to 1000 yuan
T7   | deposit 100 yuan, balance becomes 600 yuan   |
T8   | commit transaction                           |

  In this scenario, B intended to withdraw 500 yuan but then rolled the withdrawal back, while A transferred 100 yuan into the same account. Because transaction A read data that transaction B had not yet committed, the account loses 500 yuan for nothing.
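The timeline above can be simulated with a toy in-memory "database" that keeps uncommitted changes in a write buffer, where a dirty reader looks at the buffer instead of the committed value. All names here are illustrative; real databases implement this with locks or multi-version concurrency, not a dict:

```python
# Toy model: committed state plus an uncommitted write buffer (illustrative only).
committed = {"balance": 1000}
uncommitted = {}           # pending writes of the in-flight transaction B

def write_uncommitted(key, value):
    uncommitted[key] = value

def dirty_read(key):
    # A dirty reader sees uncommitted data whenever any exists.
    return uncommitted.get(key, committed[key])

def rollback():
    uncommitted.clear()    # B abandons its changes

# T4: B withdraws 500 yuan but has not committed yet.
write_uncommitted("balance", committed["balance"] - 500)
# T5: A performs a dirty read and sees 500 instead of 1000.
seen_by_a = dirty_read("balance")
# T6: B rolls back; the committed balance is still 1000.
rollback()
# T7-T8: A deposits 100 on top of the dirty value and commits.
committed["balance"] = seen_by_a + 100
print(committed["balance"])   # 600 -- the account has lost 500 yuan
```

Had A read only committed data, it would have seen 1000 and the final balance would have been 1100.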

Non-repeatable read (unrepeatable read)
   A non-repeatable read occurs when transaction A reads changed data that transaction B has already committed. Suppose that while A runs a withdrawal transaction, B transfers 100 yuan out of the account; A's two reads of the account balance then disagree:

 
time | withdrawal transaction A                              | transfer transaction B
-----|-------------------------------------------------------|-----------------------------------------------
T1   |                                                       | start transaction
T2   | start transaction                                     |
T3   |                                                       | query balance: 1000 yuan
T4   | query balance: 1000 yuan                              |
T5   |                                                       | transfer out 100 yuan, balance becomes 900 yuan
T6   |                                                       | commit transaction
T7   | query balance: 900 yuan (differs from the read at T4) |

   Within the same transaction, the balances read at T4 and T7 are different.
Phantom read
    A phantom read occurs when transaction A reads new rows that transaction B has inserted and committed. Phantom reads typically arise in transactions that compute statistics. For example, suppose a banking system totals the deposits of all accounts twice within the same transaction, and between the two computations a new deposit account holding 100 yuan happens to be created. The two totals will then disagree:

time | statistics transaction A                        | transfer transaction B
-----|-------------------------------------------------|------------------------------------------------
T1   |                                                 | start transaction
T2   | start transaction                               |
T3   | total deposits: 10000 yuan                      |
T4   |                                                 | create a new account with a 100 yuan deposit
T5   |                                                 | commit transaction
T6   | total deposits again: 10100 yuan (phantom read) |

  If the newly inserted rows happen to satisfy the transaction's query condition, they enter the transaction's view, producing two inconsistent totals.
  Phantom reads and non-repeatable reads are easy to confuse: the former means reading new rows inserted by other committed transactions, while the latter means reading changes (updates or deletes) made by other committed transactions. The countermeasures differ accordingly. To prevent reading changed data, it is enough to place row-level locks on the rows being operated on, blocking changes to those rows; to prevent reading newly inserted data, a table-level lock is usually needed, locking the whole table against inserts.
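The row-lock versus table-lock distinction can be sketched with non-blocking lock acquisition. This is a toy model of the idea only, not how any particular database implements locking; the account names are invented:

```python
import threading

# Toy model: one lock per existing row, plus one lock for the whole table.
row_locks = {"alice": threading.Lock(), "bob": threading.Lock()}
table_lock = threading.Lock()

# Transaction A reads alice's row under a row-level lock: a concurrent
# UPDATE of that row is blocked, which prevents non-repeatable reads of it...
row_locks["alice"].acquire()
update_blocked = not row_locks["alice"].acquire(blocking=False)

# ...but a row lock cannot stop another transaction from INSERTing a brand
# new row (there is no lock for a row that does not exist yet), so phantom
# reads remain possible. A table-level lock blocks inserts as well.
table_lock.acquire()
insert_blocked = not table_lock.acquire(blocking=False)

print(update_blocked, insert_blocked)   # True True
```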

First-type lost update
    When transaction A rolls back, it overwrites updated data that transaction B has already committed. This error can cause serious problems, as the following withdrawal and transfer example shows:
    

time | withdrawal transaction A                     | transfer transaction B
-----|----------------------------------------------|---------------------------------------------
T1   | start transaction                            |
T2   |                                              | start transaction
T3   | query balance: 1000 yuan                     |
T4   |                                              | query balance: 1000 yuan
T5   |                                              | deposit 100 yuan, balance becomes 1100 yuan
T6   |                                              | commit transaction
T7   | withdraw 100 yuan, balance becomes 900 yuan  |
T8   | roll back transaction                        |
T9   | balance restored to 1000 yuan (lost update)  |


  When transaction A rolls back, it "accidentally" wipes out the amount that transaction B has already transferred into the account.
Second-type lost update
  Transaction A overwrites data that transaction B has already committed, so the work done by B is lost:

time | transfer transaction A                      | withdrawal transaction B
-----|---------------------------------------------|---------------------------------------------
T1   |                                             | start transaction
T2   | start transaction                           |
T3   |                                             | query balance: 1000 yuan
T4   | query balance: 1000 yuan                    |
T5   |                                             | withdraw 100 yuan, balance becomes 900 yuan
T6   |                                             | commit transaction
T7   | deposit 100 yuan                            |
T8   | commit transaction                          |
T9   | balance becomes 1100 yuan (lost update)     |


    In the example above, the transfer transaction overwrites the update that the withdrawal transaction made to the balance, so the bank ultimately loses 100 yuan; conversely, if the transfer transaction had committed first, the customer's account would have lost 100 yuan.
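The second-type lost update is essentially a read-modify-write race: both transactions read the same starting balance, and the later writer silently overwrites the earlier one. A minimal simulation with illustrative values:

```python
# Committed account state (illustrative).
balance = 1000

# T3-T4: both transactions read the same starting balance.
read_by_a = balance
read_by_b = balance

# T5-T6: B withdraws 100 based on its read and commits.
balance = read_by_b - 100   # balance is now 900

# T7-T8: A deposits 100 based on its stale read and commits,
# overwriting B's committed update.
balance = read_by_a + 100   # balance is now 1100, not the correct 1000

print(balance)   # 1100 -- B's withdrawal has been lost
```

With proper isolation, A would either be blocked until B committed and then re-read the balance, or be forced to abort and retry.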

 

Database transaction (3)

 

Transaction isolation levels
    Although databases let users place locks explicitly through DML statements, managing locks directly is very cumbersome, so databases provide an automatic locking mechanism. The user only specifies the session's transaction isolation level; the database then analyzes the SQL statements in the transaction and automatically places appropriate locks on the data resources the transaction operates on. The database also maintains these locks, and when too many locks accumulate on one resource it automatically escalates the lock to improve performance. This whole process is transparent to the user.
    The ANSI/ISO SQL-92 standard defines 4 transaction isolation levels. In the same data environment, with the same input and the same work, different isolation levels can produce different results, because the levels differ in which data concurrency problems they are able to prevent.
        Table 1. Which concurrency problems each isolation level resolves

Isolation level  | Dirty read  | Non-repeatable read | Phantom read | First-type lost update | Second-type lost update
-----------------|-------------|---------------------|--------------|------------------------|------------------------
READ UNCOMMITTED | allowed     | allowed             | allowed      | not allowed            | allowed
READ COMMITTED   | not allowed | allowed             | allowed      | not allowed            | allowed
REPEATABLE READ  | not allowed | not allowed         | allowed      | not allowed            | not allowed
SERIALIZABLE     | not allowed | not allowed         | not allowed  | not allowed            | not allowed


    Transaction isolation and database concurrency are opposing forces: as one increases, the other decreases. In general, a database running at READ UNCOMMITTED has the highest concurrency and throughput, while a database running at SERIALIZABLE has the lowest concurrency.

    SQL-92 defines READ UNCOMMITTED mainly to provide non-blocking reads. Oracle accepts READ UNCOMMITTED but does not actually allow dirty reads, because its multi-version mechanism completely eliminates reading dirty data during non-blocking reads while guaranteeing read consistency; according to this article, Oracle's READ COMMITTED isolation level thereby already satisfies the SQL-92 REPEATABLE READ requirements.

    SQL-92 recommends REPEATABLE READ to guarantee read consistency, but users can choose whichever isolation level suits their application.
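How the isolation level is actually set depends on the database. On servers such as PostgreSQL or MySQL the standard statement is `SET TRANSACTION ISOLATION LEVEL ...`; as one runnable sketch, SQLite (through Python's sqlite3 module) exposes a per-connection `read_uncommitted` pragma, which permits dirty reads only in shared-cache mode, though the flag itself can always be set and inspected:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Ask this session to tolerate dirty reads (only effective when the
# connection uses a shared cache; here we just set and read back the flag).
cur.execute("PRAGMA read_uncommitted = 1")
level = cur.execute("PRAGMA read_uncommitted").fetchone()[0]
print(level)   # 1
```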
