Database transactions and concurrency control

The concept of a transaction: A transaction is a user-defined sequence of database operations that are either all performed or not performed at all; it is an indivisible unit of work. A transaction can be a single SQL statement, a group of SQL statements, or an entire program, and in general one program contains multiple transactions. The user can control where a transaction starts and ends. If the user does not define transactions explicitly, the DBMS delimits them automatically according to its defaults. In SQL, three statements define a transaction: BEGIN TRANSACTION / COMMIT / ROLLBACK.

The four properties of transactions (ACID):

Atomicity: A transaction is a logical unit of work in the database, and the operations included in the transaction are either all done or not done at all.

Consistency: A transaction must take the database from one consistent state to another. The database is in a consistent state when it contains only the results of successfully committed transactions. If the system fails while running, some transactions are forced to abort before they complete. If some of the changes made by these unfinished transactions have already been written to the physical database, the database is left in an incorrect (inconsistent) state. Consistency and atomicity are therefore closely related.

Isolation: The execution of a transaction must not be interfered with by other transactions; transactions that execute concurrently cannot affect one another.

Persistence (durability): Once a transaction is committed, its changes to the data in the database should be permanent.

Execution of transactions: Transactions can be executed serially, one at a time, so that each transaction starts only after the previous one has completed. However, a transaction needs different resources at different points in its execution: sometimes the CPU, sometimes database access, sometimes I/O, and sometimes communication. Under serial execution many system resources sit idle, so to make full use of them, multiple transactions should be allowed to run in parallel.

Parallel execution of transactions (single processor): On a single processor, parallel transactions are actually executed by interleaving their operations in turn. On a multiprocessor system, truly simultaneous execution is possible.

Note: The transaction is the basic unit of concurrency control. When transactions run in parallel, several of them may access the same piece of data at the same time, which can corrupt the data. Locking was introduced to address this.

Locking (blocking): Locking is a very important technique for implementing concurrency control. Before transaction A operates on a data object such as a table or record, it first asks the system to lock that object. Once the lock is granted, A has a degree of control over the object, and other transactions cannot update it until A releases the lock. Locks are divided into write locks and read locks.

Write lock (exclusive lock): If transaction A places a write lock on a data object, only A may read and modify it, and no other transaction can place any lock on it until A releases the lock. (Put simply: once a transaction holds a write lock on an object, no other transaction can lock it.)

Read lock (shared lock): If transaction A places a read lock on a data object, A may read the object but not modify it. Other transactions may also place read locks on it, but none can place a write lock on it until A releases its read lock.
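The compatibility rules for read and write locks can be tried out with the JDK's ReentrantReadWriteLock. This is only a sketch of the rules: the main thread stands in for transaction A, a helper thread stands in for "another transaction", and the class and method names are made up for the illustration. A real DBMS lock manager works on database objects, not on Java threads.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class ReadWriteLockDemo {
    // Tries to take `lock` from a separate thread (standing in for another
    // transaction) without waiting; releases it again if it was granted.
    static boolean otherTransactionCanLock(Lock lock) {
        AtomicBoolean granted = new AtomicBoolean(false);
        Thread other = new Thread(() -> {
            if (lock.tryLock()) {
                granted.set(true);
                lock.unlock();
            }
        });
        other.start();
        try {
            other.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return granted.get();
    }

    public static void main(String[] args) {
        ReentrantReadWriteLock rw = new ReentrantReadWriteLock();

        rw.readLock().lock();   // transaction A holds a read lock
        System.out.println("read lock while A reads:   " + otherTransactionCanLock(rw.readLock()));
        System.out.println("write lock while A reads:  " + otherTransactionCanLock(rw.writeLock()));
        rw.readLock().unlock();

        rw.writeLock().lock();  // transaction A holds a write lock
        System.out.println("read lock while A writes:  " + otherTransactionCanLock(rw.readLock()));
        System.out.println("write lock while A writes: " + otherTransactionCanLock(rw.writeLock()));
        rw.writeLock().unlock();
    }
}
```

Only the first request is granted: a second read lock is compatible with A's read lock, while every request that involves a write lock is refused.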

Livelock: Transaction A locks data X, then transaction B requests a lock on X, and then transaction C requests a lock on X. When A releases its lock, the system grants the lock to C first. Then transaction D requests a lock on X, and when C releases its lock the system grants it to D. In this way B may wait indefinitely, which is the so-called livelock. Livelock can be avoided with a first-come, first-served policy.
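A first-come, first-served policy can be sketched as a queue of waiting transactions in front of the lock. The class below is a single-threaded simulation with hypothetical names (FifoLockQueue, request, release); it models only the order in which the lock is granted, not real blocking.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class FifoLockQueue {
    private String holder;                                   // transaction that currently holds the lock
    private final Deque<String> waiting = new ArrayDeque<>(); // waiters in arrival order

    // Returns true if the lock was granted at once, false if the caller has to wait.
    boolean request(String txn) {
        if (holder == null) {
            holder = txn;
            return true;
        }
        waiting.addLast(txn);
        return false;
    }

    // Releases the lock and hands it to the longest-waiting transaction.
    // Returns the new holder, or null if nobody was waiting.
    String release() {
        holder = waiting.pollFirst();
        return holder;
    }

    public static void main(String[] args) {
        FifoLockQueue x = new FifoLockQueue();
        x.request("A");
        x.request("B");
        x.request("C");
        System.out.println("after A releases: " + x.release());
        x.request("D");
        System.out.println("next release:     " + x.release());
        System.out.println("next release:     " + x.release());
    }
}
```

Because the lock always goes to the head of the queue, B is served before C and D, so no transaction can be overtaken indefinitely.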

Deadlock: Transaction A locks data X and then needs data Y, while transaction B locks data Y and then needs data X. Neither releases what it holds, so each waits for the other forever. This situation is a deadlock.

Several solutions to deadlock:

1. Deadlock Prevention

1.1 One-shot locking: a transaction must lock all the data it needs at once, before it starts; otherwise it cannot execute. A drawback of this method: locking a large amount of data up front reduces the concurrency of the system.

1.2 Ordered locking: define a locking order for the data objects in advance, and require every transaction to acquire its locks in that order.
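Ordered locking can be illustrated with two in-memory accounts that are always locked in ascending id order, whichever direction the money moves. This is a sketch under assumptions: the Account class, the use of the id as the global order, and the requirement that ids are distinct are all choices made for the example.

```java
import java.util.concurrent.locks.ReentrantLock;

class OrderedLocking {
    static class Account {
        final int id;          // assumed unique; doubles as the global lock order
        long balance;
        final ReentrantLock lock = new ReentrantLock();

        Account(int id, long balance) {
            this.id = id;
            this.balance = balance;
        }
    }

    // Locks the account with the smaller id first, regardless of transfer direction,
    // so two opposing transfers can never hold one lock each and wait for the other.
    static void transfer(Account from, Account to, long amount) {
        Account first = from.id < to.id ? from : to;
        Account second = first == from ? to : from;
        first.lock.lock();
        try {
            second.lock.lock();
            try {
                from.balance -= amount;
                to.balance += amount;
            } finally {
                second.lock.unlock();
            }
        } finally {
            first.lock.unlock();
        }
    }

    // Runs x->y and y->x transfers concurrently and waits for both threads to finish.
    static void runOpposingTransfers(Account x, Account y, int rounds) {
        Thread a = new Thread(() -> { for (int i = 0; i < rounds; i++) transfer(x, y, 1); });
        Thread b = new Thread(() -> { for (int i = 0; i < rounds; i++) transfer(y, x, 1); });
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Account x = new Account(1, 1000);
        Account y = new Account(2, 1000);
        runOpposingTransfers(x, y, 10_000);
        System.out.println(x.balance + " " + y.balance);
    }
}
```

If each transfer instead locked `from` first and `to` second, the two threads could each take one lock and wait for the other, which is exactly the deadlock described earlier.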

2. Diagnosis and removal of deadlock:

2.1 Timeout method: set a time limit; if a transaction waits longer than that limit, a deadlock is assumed.

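2.2 Wait-for graph method: build a directed graph that records which transaction is waiting for which; a cycle in the graph means a deadlock.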
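The wait-for graph check comes down to finding a cycle in a directed graph. The sketch below uses a depth-first search; the class name, the use of strings as transaction ids, and the method names are illustrative assumptions, and a real DBMS would also have to pick a victim transaction to roll back once a cycle is found.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class WaitForGraph {
    // waiter -> transactions it is waiting for
    private final Map<String, Set<String>> waitsFor = new HashMap<>();

    void addWait(String waiter, String holder) {
        waitsFor.computeIfAbsent(waiter, k -> new HashSet<>()).add(holder);
    }

    // A deadlock exists exactly when the graph contains a cycle.
    boolean hasDeadlock() {
        Set<String> done = new HashSet<>();    // fully explored, known cycle-free
        Set<String> onPath = new HashSet<>();  // on the current DFS path
        for (String txn : waitsFor.keySet()) {
            if (hasCycleFrom(txn, done, onPath)) {
                return true;
            }
        }
        return false;
    }

    private boolean hasCycleFrom(String txn, Set<String> done, Set<String> onPath) {
        if (onPath.contains(txn)) return true;   // reached a node already on the path: cycle
        if (done.contains(txn)) return false;
        onPath.add(txn);
        for (String next : waitsFor.getOrDefault(txn, Collections.emptySet())) {
            if (hasCycleFrom(next, done, onPath)) return true;
        }
        onPath.remove(txn);
        done.add(txn);
        return false;
    }

    public static void main(String[] args) {
        WaitForGraph g = new WaitForGraph();
        g.addWait("A", "B");   // A waits for Y, which B holds
        g.addWait("B", "A");   // B waits for X, which A holds
        System.out.println("deadlock: " + g.hasDeadlock());
    }
}
```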

Serializability of concurrent schedules: A concurrent execution of several transactions is correct if and only if its result is the same as the result of executing those transactions serially in some order. Such a schedule is called serializable.
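To make the definition concrete, the sketch below runs two made-up transactions against a single integer: T1 adds 100 and T2 doubles the value, each as a read followed by a write. It then compares the result of a schedule with both serial orders. Note the assumption: it checks the result for one given starting value only, whereas serializability proper requires equivalence for every starting state.

```java
class SerializabilityDemo {
    static final String[] T1_THEN_T2 = {"T1.read", "T1.write", "T2.read", "T2.write"};
    static final String[] T2_THEN_T1 = {"T2.read", "T2.write", "T1.read", "T1.write"};

    // Executes the steps in the given order and returns the final database value.
    static int run(String[] schedule, int initial) {
        int db = initial;
        int t1Local = 0;   // T1's private copy of the value it read
        int t2Local = 0;   // T2's private copy of the value it read
        for (String step : schedule) {
            switch (step) {
                case "T1.read":  t1Local = db;       break;
                case "T1.write": db = t1Local + 100; break;
                case "T2.read":  t2Local = db;       break;
                case "T2.write": db = t2Local * 2;   break;
                default: throw new IllegalArgumentException("unknown step: " + step);
            }
        }
        return db;
    }

    // True if the schedule ends in the same state as one of the two serial orders.
    static boolean matchesSomeSerialOrder(String[] schedule, int initial) {
        int result = run(schedule, initial);
        return result == run(T1_THEN_T2, initial) || result == run(T2_THEN_T1, initial);
    }

    public static void main(String[] args) {
        String[] lostUpdate = {"T1.read", "T2.read", "T1.write", "T2.write"};
        System.out.println("serial T1,T2: " + run(T1_THEN_T2, 100));
        System.out.println("serial T2,T1: " + run(T2_THEN_T1, 100));
        System.out.println("interleaved:  " + run(lostUpdate, 100));
        System.out.println("acceptable:   " + matchesSomeSerialOrder(lostUpdate, 100));
    }
}
```

Starting from 100, the serial orders give 400 and 300, while the interleaved schedule gives 200 because T2 overwrites T1's update. That schedule is therefore not serializable.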

Two-phase locking protocol (a common rule for data locking): A transaction is divided into two phases. The first is the growing phase, in which locks are acquired: the transaction may request any type of lock on any data item but may not release any lock. The second is the shrinking phase, in which locks are released: the transaction may release any type of lock on any data item but may not request any new lock.
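The two-phase rule can be expressed as a small guard object that remembers whether a transaction has already released a lock. This is a sketch of the rule only: the class and method names are hypothetical, and it tracks which locks one transaction holds without doing any real locking between transactions.

```java
import java.util.HashSet;
import java.util.Set;

class TwoPhaseTransaction {
    private final Set<String> held = new HashSet<>();
    private boolean shrinking = false;   // becomes true at the first release

    // Growing phase: allowed only while no lock has been released yet.
    void lock(String item) {
        if (shrinking) {
            throw new IllegalStateException("cannot lock " + item + " after releasing a lock");
        }
        held.add(item);
    }

    // The first unlock moves the transaction into the shrinking phase for good.
    void unlock(String item) {
        if (!held.remove(item)) {
            throw new IllegalStateException("lock not held: " + item);
        }
        shrinking = true;
    }

    boolean holds(String item) {
        return held.contains(item);
    }

    public static void main(String[] args) {
        TwoPhaseTransaction t = new TwoPhaseTransaction();
        t.lock("X");
        t.lock("Y");
        t.unlock("X");
        try {
            t.lock("Z");   // violates the protocol
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```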

Programmatic transactions: With programmatic transactions you add the transaction-handling logic directly to the code, explicitly calling transaction management methods such as beginTransaction(), commit(), and rollback(). For example, if a method a needs transaction handling, you begin the transaction at the start of a, do the work, and end the transaction (commit or roll back) at the end of the method.
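The shape of a programmatic transaction can be shown without a database. In the sketch below, InMemoryAccounts is a made-up stand-in for a transactional resource: beginTransaction() copies the committed state, commit() publishes the copy, and rollback() throws it away. Only the begin / work / commit-or-rollback pattern in transfer() is the point; a real application would call the equivalent methods of its transaction API.

```java
import java.util.HashMap;
import java.util.Map;

class InMemoryAccounts {
    private Map<String, Long> committed = new HashMap<>();
    private Map<String, Long> working;   // non-null only while a transaction is open

    // Creates an account outside any transaction (setup helper for the example).
    void open(String account, long balance) { committed.put(account, balance); }

    long balance(String account) { return committed.get(account); }

    void beginTransaction() { working = new HashMap<>(committed); }
    void commit()           { committed = working; working = null; }
    void rollback()         { working = null; }

    void withdraw(String account, long amount) {
        Long current = working.get(account);
        if (current == null || current < amount) {
            throw new IllegalStateException("cannot withdraw " + amount + " from " + account);
        }
        working.put(account, current - amount);
    }

    void deposit(String account, long amount) {
        Long current = working.get(account);
        if (current == null) {
            throw new IllegalArgumentException("no such account: " + account);
        }
        working.put(account, current + amount);
    }

    // Programmatic transaction: the calling code brackets the work explicitly.
    static boolean transfer(InMemoryAccounts db, String from, String to, long amount) {
        db.beginTransaction();
        try {
            db.withdraw(from, amount);
            db.deposit(to, amount);
            db.commit();
            return true;
        } catch (RuntimeException e) {
            db.rollback();   // the withdrawal is undone if the deposit failed
            return false;
        }
    }

    public static void main(String[] args) {
        InMemoryAccounts db = new InMemoryAccounts();
        db.open("A", 1000);
        db.open("B", 0);
        System.out.println(transfer(db, "A", "B", 1000) + " A=" + db.balance("A") + " B=" + db.balance("B"));
        System.out.println(transfer(db, "B", "Z", 500) + " B=" + db.balance("B"));
    }
}
```

The second transfer fails at the deposit step because account Z does not exist, and the rollback leaves B's balance untouched even though the withdrawal had already been applied to the working copy.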

Declarative transactions: With declarative transactions you annotate the method a, or declare in a configuration file that a requires transaction handling. In Spring, calls to a are then intercepted before and after the method runs, based on that configuration, and the transaction handling is added around it.
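The interception idea can be imitated with a JDK dynamic proxy. The sketch below is not Spring's implementation: the @InTransaction annotation, the OrderService interface, and the log that records "begin", "commit", and "rollback" are all invented for the illustration. It shows only how a wrapper can add transaction handling around a marked method while the business code contains none.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

class DeclarativeTxDemo {
    // Hypothetical marker annotation, standing in for a framework's own annotation.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface InTransaction {}

    interface OrderService {
        @InTransaction
        void placeOrder(String item);
    }

    // Business code: no transaction calls anywhere.
    static class PlainOrderService implements OrderService {
        private final List<String> log;
        PlainOrderService(List<String> log) { this.log = log; }

        public void placeOrder(String item) {
            if (item.isEmpty()) throw new IllegalArgumentException("empty order");
            log.add("placeOrder(" + item + ")");
        }
    }

    // Wraps target so that annotated methods run between "begin" and "commit"/"rollback".
    static <T> T withTransactions(T target, Class<T> iface, List<String> log) {
        Object proxy = Proxy.newProxyInstance(
            iface.getClassLoader(),
            new Class<?>[] { iface },
            (p, method, args) -> {
                boolean transactional = method.isAnnotationPresent(InTransaction.class);
                if (transactional) log.add("begin");
                try {
                    Object result = method.invoke(target, args);
                    if (transactional) log.add("commit");
                    return result;
                } catch (InvocationTargetException e) {
                    if (transactional) log.add("rollback");
                    throw e.getCause();   // rethrow the exception thrown by the target method
                }
            });
        return iface.cast(proxy);
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        OrderService service = withTransactions(new PlainOrderService(log), OrderService.class, log);
        service.placeOrder("book");
        System.out.println(log);
    }
}
```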

Difference between the two: programmatic transactions are more invasive, but they allow finer-grained control.

--------------------------------------------------------------Transaction types in Java (reposted below)------------------------------------------------------------------------

Reposted from: http://lavasoft.blog.51cto.com/62575/53815/

The content is as follows:

Summary of Java transaction processing

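I. What is a Java transaction

It is commonly thought that transactions are relevant only to databases.

Transactions must obey the ACID principles defined by ISO/IEC. ACID stands for atomicity, consistency, isolation, and durability. Atomicity means that any failure during the execution of a transaction invalidates every change the transaction has made. Consistency means that when a transaction fails, all data affected by it must be restored to the state it was in before the transaction started. Isolation means that changes made to data during a transaction are not visible to other transactions until the transaction commits. Durability means that committed data must remain in a correct state even if a failure occurs.

Put simply, a transaction is a group of operations treated as one atomic unit. From the database's point of view, it is a group of SQL statements that must all succeed; if any one of them fails for some reason, all the statements already executed are undone. Even more simply: either everything succeeds, or everything is undone as if nothing had been executed.

Since the concept of a transaction comes from databases, what is a Java transaction, and how are the two related?

In practice, a Java application that works with a database does so through JDBC. Inserts, updates, and deletes are all carried out indirectly through the corresponding methods, and transaction control likewise moves into the Java code. For this reason, transactions on database operations are customarily called Java transactions.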

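II. Why transactions are needed

Transactions were introduced to make data operations safe; transaction control is in effect control over safe access to data. A simple example is a bank transfer: account A transfers 1000 yuan to account B, so A's balance must first be reduced by 1000 yuan and then B's balance increased by 1000 yuan. Suppose the network fails in between: the deduction from A has already completed, but the update to B fails because of the outage. The whole operation has then failed, and the system must step in and undo the deduction from A to keep the business data correct. This requires a transaction: the debit from A and the credit to B are placed in one transaction, so that either both succeed or both are undone, which keeps the data safe.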

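III. Types of Java transactions

There are three types of Java transactions: JDBC transactions, JTA (Java Transaction API) transactions, and container transactions.

1. JDBC transactions

A JDBC transaction is controlled through a Connection object. The JDBC Connection interface (java.sql.Connection) offers two transaction modes: auto-commit and manual commit. java.sql.Connection provides the following methods for controlling transactions:

public void setAutoCommit(boolean)

public boolean getAutoCommit()

public void commit()

public void rollback()

With JDBC transaction demarcation you can combine multiple SQL statements into a single transaction. One drawback of JDBC transactions is that their scope is limited to a single database connection; a JDBC transaction cannot span multiple databases.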
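Applied to the bank transfer described earlier, the four Connection methods fit together roughly as follows. This is a sketch under assumptions: the table and column names (accounts, id, balance) are hypothetical, and the caller is expected to supply an open Connection from whatever driver or data source it uses, so the class has no main method of its own.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

class JdbcTransfer {
    // Moves `amount` from one account to another inside a single JDBC transaction.
    // The table and column names are assumptions made for this example.
    static void transfer(Connection conn, long fromId, long toId, long amount) throws SQLException {
        boolean previousAutoCommit = conn.getAutoCommit();
        conn.setAutoCommit(false);   // manual commit: the statements below form one transaction
        try (PreparedStatement debit = conn.prepareStatement(
                 "UPDATE accounts SET balance = balance - ? WHERE id = ?");
             PreparedStatement credit = conn.prepareStatement(
                 "UPDATE accounts SET balance = balance + ? WHERE id = ?")) {
            debit.setLong(1, amount);
            debit.setLong(2, fromId);
            debit.executeUpdate();

            credit.setLong(1, amount);
            credit.setLong(2, toId);
            credit.executeUpdate();

            conn.commit();           // both updates become permanent together
        } catch (SQLException e) {
            conn.rollback();         // undo the debit if anything failed
            throw e;
        } finally {
            conn.setAutoCommit(previousAutoCommit);   // restore the caller's mode
        }
    }
}
```

Both statements run on the same Connection, which is why the transaction cannot extend to a second database.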

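2. JTA (Java Transaction API) transactions

JTA is a high-level API that is independent of any implementation or protocol; applications and application servers can use JTA to access transactions.

JTA allows applications to perform distributed transactions, that is, to access and update data on two or more networked computer resources, where the data may be spread across multiple databases. JTA support in JDBC drivers greatly enhances data access capabilities.

If you plan to demarcate transactions with JTA, you need a JDBC driver that implements the javax.sql.XADataSource, javax.sql.XAConnection, and javax.transaction.xa.XAResource interfaces. A driver that implements these interfaces can take part in JTA transactions. An XADataSource object is a factory for XAConnection objects, and XAConnections are JDBC connections that take part in JTA transactions.

You will need to set up the XADataSource using the application server's administration tools. Consult the documentation of the application server and the JDBC driver for guidance.

J2EE applications look up the data source using JNDI. Once the application has found the data source object, it calls javax.sql.DataSource.getConnection() to obtain a connection to the database.

XA connections differ from non-XA connections. Always remember that an XA connection takes part in a JTA transaction. This means that XA connections do not support JDBC's auto-commit feature. Likewise, the application must not call java.sql.Connection.commit() or java.sql.Connection.rollback() on an XA connection. Instead, it should use UserTransaction.begin(), UserTransaction.commit(), and UserTransaction.rollback().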

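3. Container transactions

Container transactions are provided mainly by J2EE application servers and are mostly built on JTA, which is a fairly complex, JNDI-based API implementation. Instead of coding JTA transaction management by hand, we can achieve the same thing with the container-managed transaction (CMT) mechanism of the EJB container, a feature provided by the J2EE application server. It lets us simply specify which methods take part in a transaction; once that is specified, the container takes care of transaction management. This is the recommended approach, because it keeps transaction code out of the business logic and leaves all the hard parts to the J2EE container. Another benefit of EJB CMT is that programmers need not write any JTA API code; the catch is that, in principle, we must use EJB.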

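IV. Differences between the three kinds of transaction

1. JDBC transactions are limited to a single database connection, but they are simple to use.

2. JTA transactions are powerful and can span multiple databases or multiple DAOs, but they are more complex to use.

3. Container transactions are the transaction management provided by J2EE application servers, and are limited to EJB applications.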

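V. Summary

Transaction control is an indispensable part of building J2EE applications, and choosing the right kind of transaction is crucial to the whole system. Generally speaking, JDBC transactions are a good choice when a single JDBC connection is involved; JTA transactions are needed when the work spans multiple connections or databases; and if EJB is in use, EJB container transactions are worth considering.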
