Transactions and locks

Definition of a transaction
A transaction generally refers to something that is to be done or that is done. In database terms, a transaction is a program execution unit that accesses, and possibly updates, various data items in the database. Transactions usually arise from the execution of user programs written in high-level data-manipulation or programming languages (such as SQL, C++ or Java), and are delimited by statements (or function calls) of the form begin transaction and end transaction. A transaction consists of all the operations performed between the begin transaction and the end transaction. --From Baidu Encyclopedia

In short: a transaction is the basic unit of concurrency control.
A transaction is a sequence of operations that are either all executed or none executed; it is an indivisible unit of work.
For example, in a bank transfer, account A transfers money to account B. This involves two operations: subtracting the transfer amount from account A and adding it to account B. Either both operations are executed or neither is, so the two operations should be treated as a single transaction to guarantee data consistency.

The ACID properties of transactions
Atomicity: a transaction is an indivisible unit of work; the operations it contains are either all performed or none performed.
Consistency: a transaction must take the database from one consistent state to another consistent state. Consistency is closely related to atomicity.
Isolation: the execution of a transaction must not be interfered with by other transactions. The internal operations of a transaction and the data it uses are isolated from other concurrently executing transactions, so concurrent transactions cannot interfere with one another.
Durability: also called permanence, meaning that once a transaction is committed, its changes to the data in the database are permanent. Subsequent operations or failures must not affect them.
Statements corresponding to a transaction
BEGIN TRANSACTION starts the transaction
COMMIT TRANSACTION commits the transaction
ROLLBACK TRANSACTION rolls back the transaction
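As an illustration, here is a minimal sketch of the bank transfer above written with these statements (T-SQL-style syntax; the accounts table, its column names and the amount are assumptions made for the example):

    BEGIN TRANSACTION;
    -- move 100 from account A to account B (table and column names are hypothetical)
    UPDATE accounts SET balance = balance - 100 WHERE account_id = 'A';
    UPDATE accounts SET balance = balance + 100 WHERE account_id = 'B';
    -- if both updates succeed, make the changes permanent
    COMMIT TRANSACTION;
    -- if anything goes wrong, the application would instead issue:
    -- ROLLBACK TRANSACTION;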
Transaction concurrency control
Problems that arise when concurrent transactions do not take isolation into account:
Dirty read: one transaction reads data that another transaction has modified but not yet committed. If the modifying transaction then rolls back, the reading transaction has read data that never validly existed.
Non-repeatable read: within one transaction, the same row is read several times with different results (one transaction reads data committed by another transaction in between).
Phantom read (virtual read): within one transaction, data inserted by another transaction is read, so successive reads are inconsistent. For example, when reading an entire table, the first read returns 3 records and the second read returns 4. The difference from a non-repeatable read is that a non-repeatable read concerns the values of the data, while a phantom read concerns the amount of data.
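As a minimal two-session sketch, assuming an accounts table and the READ COMMITTED level, a non-repeatable read looks like this:

    -- Session 1                                   -- Session 2
    BEGIN;
    SELECT balance FROM accounts WHERE id = 1;     -- returns 1000
                                                   BEGIN;
                                                   UPDATE accounts SET balance = 500 WHERE id = 1;
                                                   COMMIT;
    SELECT balance FROM accounts WHERE id = 1;     -- now returns 500: a non-repeatable read
    COMMIT;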
Database transaction isolation levels (defined by the SQL standard)
READ UNCOMMITTED (uncommitted read): modifications made in a transaction are visible to other transactions even before they are committed. This easily causes problems such as dirty reads, so it is rarely used unless strictly necessary.
READ COMMITTED (committed read): the default isolation level of most database systems (MySQL being a notable exception). A transaction can only see the results of transactions that have already committed; work still in progress is not visible to other transactions. At this level, re-reading the same data can return different values after another transaction commits, which is the non-repeatable read problem.
REPEATABLE READ (repeatable read): solves the dirty read problem and guarantees that repeated reads of the same rows within a transaction return consistent results, i.e. it also solves the non-repeatable read problem described above. However, it does not solve phantom rows: while one transaction is reading a range of data, another transaction inserts rows into that range, so repeated reads of the range return different numbers of rows. This is the phantom read. This is MySQL's default isolation level.
SERIALIZABLE (serializable): the highest isolation level. It avoids phantom reads by forcing transactions to execute serially (note: serially). Because it takes a large number of locks, it can cause many request timeouts, so performance is lower. This level may be considered when strong data consistency is required and high concurrency is not.
Isolation level      Dirty reads possible    Non-repeatable reads possible    Phantom reads possible    Locking reads
READ UNCOMMITTED     Yes                     Yes                              Yes                       No
READ COMMITTED       No                      Yes                              Yes                       No
REPEATABLE READ      No                      No                               Yes                       No
SERIALIZABLE         No                      No                               No                        Yes
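For reference, a rough sketch of inspecting and changing the isolation level in MySQL (the variable name @@transaction_isolation assumes MySQL 5.7.20+/8.0; older versions use @@tx_isolation):

    -- show the isolation level of the current session
    SELECT @@transaction_isolation;
    -- change the isolation level for the current session
    SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;
    -- or only for the next transaction (standard SQL form)
    SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;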
Database locks
Basic types of database locks:
X lock (exclusive lock): used for write operations.
- A transaction can add an X lock to a data object only if the object holds no other lock; once the X lock is held, no other transaction can add any lock to that object.
S lock (shared lock): used for read operations.
- After a transaction adds an S lock to a data object, other transactions cannot add an X lock to it, but they can add further S locks.
U lock (update lock):
- When a transaction wants to update a data object, it first acquires a U lock on the object. While the object holds the U lock, other transactions may still add S locks to it. Only when the transaction finally writes does it upgrade the U lock to an X lock, so the X lock does not have to be held for the whole process.
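In MySQL's InnoDB, for example, shared and exclusive row locks can be requested explicitly with locking reads (a sketch; the accounts table is assumed, and FOR SHARE requires MySQL 8.0, with LOCK IN SHARE MODE as the older spelling):

    BEGIN;
    -- S lock on the matching row: other transactions may also S-lock it, but cannot X-lock or modify it
    SELECT balance FROM accounts WHERE id = 1 FOR SHARE;
    -- X lock on the matching row: other transactions can neither lock nor modify it until we finish
    SELECT balance FROM accounts WHERE id = 2 FOR UPDATE;
    UPDATE accounts SET balance = balance - 100 WHERE id = 2;
    COMMIT;   -- the locks are released here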
Locking protocols of different levels
First-level locking protocol (does not prevent dirty reads or non-repeatable reads)
Before any transaction writes a data item, it must add an X lock to it and release the lock only when the transaction ends. No S locks are used, so reading data requires no lock at all.
The end of a transaction means either a normal end (COMMIT) or an abnormal end (ROLLBACK).
Second-level locking protocol (does not prevent non-repeatable reads)
In addition to the first-level protocol, any transaction must add an S lock to a data item before reading it and may release the lock as soon as the read is finished. This additionally prevents dirty reads.
Third-level locking protocol (prevents both dirty reads and non-repeatable reads)
In addition to the first-level protocol, any transaction must add an S lock to a data item before reading it and must hold the lock until the transaction ends.
We can see the correspondence between isolation levels and locking protocols:
First-level locking protocol -> READ UNCOMMITTED
Second-level locking protocol -> READ COMMITTED
Third-level locking protocol -> REPEATABLE READ

Other locking protocols
Two-phase locking protocol:

The transaction is divided into two phases: a locking phase followed by an unlocking phase. In the locking phase the transaction may acquire locks and operate on data, but may not release any lock. Once the transaction releases its first lock it enters the unlocking phase, during which it may only release locks (while still operating on data) and may not acquire new ones. The two-phase locking protocol allows a fairly high degree of concurrency, because unlocking does not have to wait until the end of the transaction. Its shortcoming is that it does not prevent deadlock, since it imposes no ordering requirement within the locking phase: for example, two transactions may lock data items A and B respectively and then each request the other's lock, at which point they are deadlocked.
Theorem: if all transactions obey the two-phase locking protocol, then every interleaved schedule of these transactions is serializable.
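The deadlock described above can be sketched as two concurrent sessions (MySQL/InnoDB syntax with an assumed accounts table; InnoDB would detect the cycle and roll one of the transactions back):

    -- Session 1                                        -- Session 2
    BEGIN;                                              BEGIN;
    SELECT * FROM accounts WHERE id = 1 FOR UPDATE;
                                                        SELECT * FROM accounts WHERE id = 2 FOR UPDATE;
    SELECT * FROM accounts WHERE id = 2 FOR UPDATE;     -- blocks, waiting for Session 2
                                                        SELECT * FROM accounts WHERE id = 1 FOR UPDATE;  -- blocks, waiting for Session 1: deadlock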
Multi-granularity locking protocol

Row-level locks: high overhead and slow to acquire; deadlocks can occur; the smallest locking granularity, the lowest probability of lock conflicts, and the highest concurrency. Implemented only at the storage engine layer (in MySQL, for example, by InnoDB).
Page-level locks: overhead and locking time between those of table locks and row locks; deadlocks can occur; locking granularity between table locks and row locks; moderate concurrency.
Table-level locks: low overhead and fast to acquire; no deadlocks; the largest locking granularity, the highest probability of lock conflicts, and the lowest concurrency.
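For example, a rough sketch of an explicit table-level lock in MySQL versus an InnoDB row-level lock (table name assumed):

    -- table-level lock: a WRITE lock blocks all other sessions from reading or writing the whole table
    LOCK TABLES accounts WRITE;
    UPDATE accounts SET balance = balance - 100 WHERE id = 1;
    UNLOCK TABLES;

    -- row-level lock (InnoDB): only the matching row is X-locked, inside a transaction
    BEGIN;
    UPDATE accounts SET balance = balance - 100 WHERE id = 1;
    COMMIT;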

Reference: https://blog.csdn.net/qq_33983617/article/details/81836526
