This article is drawn from "Introduction to Database Systems". It covers basic concepts only and does not go into implementation; links to related articles on implementation are provided.
I. Overview of Concurrency Control
Data inconsistencies caused by concurrent operations
1. Lost update (Lost Update)
2. Non-repeatable read (Non-Repeatable Read)
3. Reading "dirty" data (Dirty Read)
A non-repeatable read means that after transaction T1 reads some data, transaction T2 performs an update, so that T1 cannot reproduce the result of its earlier read. It covers three cases: after T1 first reads the data, T2 inserts, deletes, or modifies data; when T1 then reads again in the same way, it does not get the same result as last time.
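Of the three anomalies listed above, the lost update is the easiest to replay by hand. The following is a minimal sketch, not taken from the book: the data item `A`, its starting value, and the function names are assumptions made for illustration. The interleaving is written out step by step so the outcome is deterministic.

```python
# Deterministic replay of a lost update: both transactions read the same
# value before either one writes, so the second write overwrites the first.
INITIAL = {"A": 16}  # hypothetical data item and starting value


def run_interleaved(db):
    t1_local = db["A"]       # T1 reads A
    t2_local = db["A"]       # T2 reads A (T1 has not written yet)
    db["A"] = t1_local - 1   # T1 writes back its result
    db["A"] = t2_local - 1   # T2 writes back; T1's update is overwritten
    return db["A"]


def run_serial(db):
    db["A"] = db["A"] - 1    # T1 runs to completion
    db["A"] = db["A"] - 1    # then T2 runs
    return db["A"]


lost = run_interleaved(dict(INITIAL))
serial = run_serial(dict(INITIAL))
print(lost, serial)  # 15 14
```

The interleaved run ends one update short of the serial run, which is exactly the inconsistency that locking is meant to prevent.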
II. Lock types
Basic lock types: exclusive lock / X lock (write lock), shared lock / S lock (read lock)
1. An exclusive lock allows its holder to read and modify the data, and excludes all other locks on that data
2. A shared lock allows its holder to read the data but not modify it; other transactions may add shared locks on the data but may not add an exclusive lock
3. Compatibility: a read lock is compatible with another read lock; every other combination is incompatible (this applies between two different transactions)
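The compatibility rule above can be written as a small lookup table. This is only a sketch; the names `COMPATIBLE` and `can_grant` are hypothetical and do not come from any particular database system.

```python
# Keys are (lock held by another transaction, lock being requested).
COMPATIBLE = {
    ("S", "S"): True,    # read lock next to read lock: allowed
    ("S", "X"): False,   # write lock while others read: refused
    ("X", "S"): False,   # read lock while another writes: refused
    ("X", "X"): False,   # write lock while another writes: refused
}


def can_grant(requested, held_by_others):
    """True if `requested` is compatible with every lock held by other transactions."""
    return all(COMPATIBLE[(held, requested)] for held in held_by_others)


print(can_grant("S", ["S", "S"]))  # True
print(can_grant("X", ["S"]))       # False
print(can_grant("X", []))          # True
```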
III. Locking protocols (to solve the data inconsistency problems)
First-level locking protocol (write locks only)
(1) Modifying data: a transaction must acquire a write lock before modifying data and holds it until the transaction ends; this prevents lost updates
(2) Reading data: a transaction that only reads data takes no lock, so neither repeatable reads nor freedom from dirty reads is guaranteed (without a lock, nothing stops other transactions from modifying the data)
Second-level locking protocol (adds read locks)
(1) A transaction must acquire a read lock before reading data (so that other transactions cannot modify it while it is being read) and may release the lock once the read is finished; together with the write-lock rule of the first-level protocol, this prevents lost updates and dirty reads
(2) Repeatable reads are not guaranteed, because the read lock is released right after reading
Third-level locking protocol (read and write locks)
(1) Modifying data: a transaction must acquire a write lock before modifying data and releases it at the end of the transaction; this prevents lost updates
(2) Reading data: a transaction must acquire a read lock before reading data and releases it at the end of the transaction; this prevents dirty reads and non-repeatable reads
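The three protocols can be summarized as a lookup table. The sketch below assumes the usual textbook reading, in which the second-level read lock is released right after the read; the dictionary layout and the helper name `weakest_level_preventing` are hypothetical.

```python
PROTOCOLS = {
    1: {"write lock": "held to end of transaction",
        "read lock": "none",
        "prevents": {"lost update"}},
    2: {"write lock": "held to end of transaction",
        "read lock": "released after the read",
        "prevents": {"lost update", "dirty read"}},
    3: {"write lock": "held to end of transaction",
        "read lock": "held to end of transaction",
        "prevents": {"lost update", "dirty read", "non-repeatable read"}},
}


def weakest_level_preventing(anomaly):
    """Return the lowest protocol level that rules out the given anomaly."""
    for level in sorted(PROTOCOLS):
        if anomaly in PROTOCOLS[level]["prevents"]:
            return level
    return None


print(weakest_level_preventing("dirty read"))  # 2
```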
IV. Livelock and deadlock (problems caused by locking protocols, and how to resolve them)
Livelock
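1. Problem description
A transaction's lock request may wait forever, because requests from other transactions keep being granted ahead of it
2. Avoiding livelock
Use a first-come, first-served policy: grant lock requests on a data object in the order in which they arrived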
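A first-come, first-served policy can be sketched with a queue of waiting transactions. The class `FifoLock` below is a hypothetical, single-threaded model of one exclusive lock, not a real lock manager.

```python
from collections import deque


class FifoLock:
    """Hypothetical exclusive lock that grants requests strictly in arrival order."""

    def __init__(self):
        self.holder = None
        self.waiting = deque()

    def request(self, txn):
        # Grant immediately only if the lock is free and nobody is queued ahead.
        if self.holder is None and not self.waiting:
            self.holder = txn
            return True
        self.waiting.append(txn)
        return False

    def release(self):
        # Hand the lock to the longest-waiting transaction, if there is one.
        self.holder = self.waiting.popleft() if self.waiting else None
        return self.holder


lock = FifoLock()
lock.request("T1")  # granted
lock.request("T2")  # queued
lock.request("T3")  # queued behind T2
order = [lock.release(), lock.release(), lock.release()]
print(order)  # ['T2', 'T3', None]
```

Because T3 can never overtake T2, no request waits indefinitely while later requests are served.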
Deadlock
1. Problem Description
Two transactions T1 and T2 each wait for a lock held by the other, so neither can ever finish; they wait in a cycle
2. Deadlock prevention: break the conditions under which deadlock arises (not well suited to the characteristics of databases)
One-time locking method: each transaction must lock all the data it will use at once, otherwise it cannot proceed (this reduces the degree of concurrency) (this method obeys the two-phase locking protocol)
Ordered locking method: define a locking order over the data objects in advance, and have all transactions lock in that order (difficult to achieve in practice)
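The ordered locking method amounts to sorting each transaction's lock requests by one global order. A minimal sketch, assuming a hypothetical list `GLOBAL_ORDER` of data objects:

```python
GLOBAL_ORDER = ["A", "B", "C", "D"]  # hypothetical predefined locking order


def lock_order(needed, global_order=GLOBAL_ORDER):
    """Return the objects a transaction needs, sorted by the predefined order."""
    rank = {obj: i for i, obj in enumerate(global_order)}
    return sorted(needed, key=rank.__getitem__)


t1 = lock_order({"C", "A"})
t2 = lock_order({"A", "C"})
print(t1, t2)  # ['A', 'C'] ['A', 'C']
```

Both transactions request A before C, so neither can hold C while waiting for A, and a wait cycle between them cannot form.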
3. Deadlock detection and resolution (the more commonly used approach)
(1) Deadlock detection
Timeout method: if a transaction waits longer than a prescribed time limit, a deadlock is assumed to have occurred
Wait-for graph method: a wait-for graph dynamically reflects the waiting relationships among all transactions (if T1 waits for T2, a directed edge is drawn from T1 to T2); if a cycle is found in the graph, a deadlock has occurred in the system
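Finding a cycle in a wait-for graph is a standard depth-first search. The sketch below assumes the graph is given as a dictionary mapping each transaction to the set of transactions it waits for; the function name `has_deadlock` is hypothetical.

```python
def has_deadlock(wait_for):
    """Detect a cycle in a wait-for graph given as {txn: set of txns it waits for}."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color = {}

    def visit(node):
        color[node] = GREY
        for nxt in wait_for.get(node, ()):
            state = color.get(nxt, WHITE)
            if state == GREY:            # back edge: the path returns to itself
                return True
            if state == WHITE and visit(nxt):
                return True
        color[node] = BLACK
        return False

    return any(color.get(n, WHITE) == WHITE and visit(n) for n in list(wait_for))


print(has_deadlock({"T1": {"T2"}, "T2": {"T1"}}))  # True
print(has_deadlock({"T1": {"T2"}, "T2": set()}))   # False
```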
(2) Deadlock resolution
Select the transaction in the deadlock whose rollback costs the least, abort it, and release all the locks it holds so that the other transactions can continue running
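Victim selection can be sketched as picking the cheapest transaction in the cycle and removing it from the wait-for graph. The cost values and the function name `resolve` are assumptions for illustration; how a real system measures rollback cost is not specified here.

```python
def resolve(wait_for, cycle, cost):
    """Abort the cheapest transaction in `cycle` and return (victim, remaining graph)."""
    # Ties on cost are broken by transaction name so the choice is deterministic.
    victim = min(cycle, key=lambda t: (cost[t], t))
    remaining = {
        txn: {w for w in waits if w != victim}
        for txn, waits in wait_for.items()
        if txn != victim
    }
    return victim, remaining


graph = {"T1": {"T2"}, "T2": {"T1"}}
victim, remaining = resolve(graph, ["T1", "T2"], {"T1": 5, "T2": 2})
print(victim, remaining)  # T2 {'T1': set()}
```

After the victim is removed, T1 no longer waits for anything, which models the release of the victim's locks.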
V. Serializability of concurrent schedules
Serializable schedule
Serial schedules and serializable schedules are both regarded as correct schedules
1. Serializable schedule
A concurrent execution of several transactions is correct if and only if its result is the same as the result of executing those transactions serially in some order
2. Serializability
Serializability is the criterion for the correctness of a concurrent schedule: a given concurrent schedule is considered correct if and only if it is serializable
The following three figures help in understanding this statement:
Conflict serializable schedule
Definition: if a schedule Sc, while keeping the order of conflicting operations unchanged, can be turned into another schedule Sc' by swapping the order of non-conflicting operations of two transactions, and Sc' is serial, then Sc is called a conflict serializable schedule
This can be used to judge whether a schedule is conflict serializable (a conflict serializable schedule is always a serializable schedule)
1. Conflicting operations: read-write and write-write operations by different transactions on the same data (recall: while data is write-locked, no other transaction may lock it; while it is read-locked, no other transaction may write-lock it)
2. Operations that cannot be swapped: two operations of the same transaction, and conflicting operations of different transactions
The following figures illustrate this:
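The definition above is stated in terms of swapping operations. An equivalent and commonly used test, swapped in here for convenience, builds a precedence graph (an edge Ti -> Tj whenever an operation of Ti conflicts with a later operation of Tj) and checks that it has no cycle. The sketch below assumes a schedule is a list of `(transaction, "r" or "w", data item)` tuples; the function names are hypothetical.

```python
def conflicts(a, b):
    """Different transactions, same data item, at least one write."""
    (t1, op1, x1), (t2, op2, x2) = a, b
    return t1 != t2 and x1 == x2 and "w" in (op1, op2)


def equivalent_serial_order(schedule):
    """Return a conflict-equivalent serial order of transactions, or None."""
    txns = []
    for t, _, _ in schedule:
        if t not in txns:
            txns.append(t)
    # Precedence edges: earlier conflicting operation -> later one.
    edges = {t: set() for t in txns}
    for i, a in enumerate(schedule):
        for b in schedule[i + 1:]:
            if conflicts(a, b):
                edges[a[0]].add(b[0])
    # Topological sort; it covers every transaction only if the graph is acyclic.
    indegree = {t: 0 for t in txns}
    for targets in edges.values():
        for t in targets:
            indegree[t] += 1
    ready = [t for t in txns if indegree[t] == 0]
    order = []
    while ready:
        t = ready.pop(0)
        order.append(t)
        for nxt in sorted(edges[t]):
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)
    return order if len(order) == len(txns) else None


# r1(A) w1(A) r2(A) w2(A) r1(B) w1(B) r2(B) w2(B)
sc1 = [("T1", "r", "A"), ("T1", "w", "A"), ("T2", "r", "A"), ("T2", "w", "A"),
       ("T1", "r", "B"), ("T1", "w", "B"), ("T2", "r", "B"), ("T2", "w", "B")]
# r1(A) r2(A) w1(A) w2(A)
sc2 = [("T1", "r", "A"), ("T2", "r", "A"), ("T1", "w", "A"), ("T2", "w", "A")]

print(equivalent_serial_order(sc1))  # ['T1', 'T2']
print(equivalent_serial_order(sc2))  # None
```

In `sc1` every conflict points from T1 to T2, so the schedule is equivalent to running T1 and then T2. In `sc2` the conflicts point both ways, so no serial order preserves them.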
VI. Two-phase locking protocol (makes concurrent schedules serializable) (deadlock is possible)
Database management systems commonly use the two-phase locking protocol to make concurrent schedules serializable and so ensure their correctness
A schedule in which every transaction complies with the two-phase locking protocol is necessarily a serializable schedule
1. Two phases: locking and unlocking
Locking phase (also called the growing phase): the transaction may request locks, but may not release any lock
Unlocking phase (also called the shrinking phase): the transaction may release locks, but may not request any new lock
2. After releasing any lock, a transaction no longer requests or obtains any other lock
If all concurrent transactions obey the two-phase locking protocol, then every concurrent schedule of those transactions is serializable
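Whether a single transaction obeys the two-phase rule can be checked by scanning its lock and unlock events in order. A minimal sketch, assuming events are `("lock" or "unlock", data item)` pairs; the function name is hypothetical.

```python
def obeys_two_phase_locking(events):
    """True if no lock is requested after the first unlock of the transaction."""
    shrinking = False
    for kind, _item in events:
        if kind == "unlock":
            shrinking = True       # the unlocking phase has begun
        elif kind == "lock" and shrinking:
            return False           # a lock request after a release breaks the rule
    return True


ok = [("lock", "A"), ("lock", "B"), ("unlock", "A"), ("unlock", "B")]
bad = [("lock", "A"), ("unlock", "A"), ("lock", "B"), ("unlock", "B")]
print(obeys_two_phase_locking(ok), obeys_two_phase_locking(bad))  # True False
```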
VII. Lock granularity
Definition: the size of the object being locked
Lockable objects: logical units, physical units
In a relational database, the lockable objects are:
- Logical units: attribute values, sets of attribute values, tuples, relations, index entries, entire indexes, the entire database
- Physical units: pages (data pages or index pages), physical records, etc.
Lock granularity
1. The larger the lock granularity, the fewer data units the database can lock, the lower the degree of concurrency, and the smaller the system overhead
2. The smaller the lock granularity, the higher the degree of concurrency, but the greater the system overhead
Granularity: database > relation > tuple
Example:
If the lock granularity is a data page and transaction T1 needs to modify tuple L1, T1 must lock the entire data page A that contains L1. If, after T1 has locked A, transaction T2 needs to modify tuple L2 in A, T2 is forced to wait until T1 releases A
If the lock granularity is a tuple, T1 and T2 can lock L1 and L2 at the same time without waiting for each other, which increases the system's degree of parallelism
As another example, suppose a transaction T needs to read an entire table. If the lock granularity is a tuple, T must lock every tuple in the table, which is extremely expensive
3. Choosing the lock granularity (weigh locking overhead against concurrency and choose an appropriate granularity)
User transactions that process large numbers of tuples from multiple relations: use the database as the locking unit
User transactions that process large numbers of tuples: use the relation as the locking unit
User transactions that process only a small number of tuples: use the tuple as the locking unit
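The page-versus-tuple example can be replayed in a few lines. The mapping `PAGE_OF` and both function names are hypothetical; they only encode the situation described, in which tuples L1 and L2 both live on data page A.

```python
# Hypothetical placement of tuples on data pages, following the example.
PAGE_OF = {"L1": "A", "L2": "A"}


def lock_unit(tuple_id, granularity):
    """Return the object that actually gets locked when a tuple is modified."""
    return PAGE_OF[tuple_id] if granularity == "page" else tuple_id


def must_wait(tuple_a, tuple_b, granularity):
    """True if writers of two different tuples end up locking the same unit."""
    return lock_unit(tuple_a, granularity) == lock_unit(tuple_b, granularity)


print(must_wait("L1", "L2", "page"))   # True
print(must_wait("L1", "L2", "tuple"))  # False
```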
Multi-granularity locking
1. Multi-granularity tree
A tree structure represents the multiple levels of lock granularity: the root node is the entire database, representing the largest data granularity, and the leaf nodes represent the smallest data granularity
A three-level granularity tree:
2. Multi-granularity locking protocol
- Each node in the multi-granularity tree may be locked independently
- Locking a node means that all descendants of that node are also locked with the same type of lock
- In multi-granularity locking, a data object may be locked in two ways: explicitly and implicitly. Explicit lock: a lock placed directly on the data object. Implicit lock: the data object itself is not locked directly, but it is locked because a higher-level node is locked
3. When a node in the multi-granularity tree is to be locked, a lock conflict check is needed: check whether the data object itself, its ancestors, or its descendants already carry an incompatible lock
Intent locks
1. Purpose: to make it more efficient for the system to check locks when locking a data object
2. Requirements
If an intent lock is placed on a node, it indicates that a lower-level node of that node is being locked
Before placing any basic lock on a node, an intent lock must first be placed on its upper-level nodes
3. Types of intent locks
Intent shared lock (IS lock)
Intent exclusive lock (IX lock)
Shared intent exclusive lock (SIX lock)
To lock a data object, first place the corresponding intent lock on its ancestors
For transaction T1 to place an S lock on a tuple in R1, it must first place IS locks on the database and on relation R1
For transaction T1 to place an X lock on a tuple in R1, it must first place IX locks on the database and on relation R1
Placing a SIX lock on a data object means placing an S lock and then an IX lock on it, i.e. SIX = S + IX. For example, a transaction places a SIX lock on a table when it will read the entire table (hence the S lock on the table) and will also update individual tuples (hence the IX lock on the table)
4. Intent lock compatibility matrix
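The matrix itself is not reproduced here, so the sketch below encodes the standard compatibility matrix for the five lock modes as it is usually given in the literature; it is not recovered from the original figure. The names `COMPATIBLE_WITH` and `compatible` are hypothetical.

```python
MODES = ["S", "X", "IS", "IX", "SIX"]

# For each mode held by one transaction, the modes another transaction
# may still be granted on the same node.
COMPATIBLE_WITH = {
    "IS": {"IS", "IX", "S", "SIX"},
    "IX": {"IS", "IX"},
    "S": {"IS", "S"},
    "SIX": {"IS"},
    "X": set(),
}


def compatible(held, requested):
    return requested in COMPATIBLE_WITH[held]


# Print the matrix row by row: Y = compatible, N = incompatible.
print("held\\req " + " ".join(f"{m:>3}" for m in MODES))
for held in MODES:
    row = " ".join(f"{'Y' if compatible(held, m) else 'N':>3}" for m in MODES)
    print(f"{held:>8} {row}")
```

The relation is symmetric: whichever of two transactions asks first, the answer is the same. An X lock is compatible with nothing, and a SIX lock tolerates only IS.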
5. Lock strength
6. Order of locking and unlocking
Locking: top-down
Unlocking: bottom-up
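The top-down and bottom-up orders can be sketched as a function that turns a path in the granularity tree into a list of lock requests. The node names and the function `lock_plan` are hypothetical; only plain S and X requests on the target node are handled.

```python
INTENT = {"S": "IS", "X": "IX"}  # intent lock required on every ancestor


def lock_plan(path, mode):
    """path lists nodes from the root down to the target, e.g. database, relation, tuple."""
    *ancestors, target = path
    # Lock top-down: intent locks on the ancestors, then the real lock on the target.
    locks = [(node, INTENT[mode]) for node in ancestors] + [(target, mode)]
    # Unlock bottom-up: the exact reverse of the locking order.
    unlocks = [node for node, _ in reversed(locks)]
    return locks, unlocks


locks, unlocks = lock_plan(["database", "R1", "tuple"], "X")
print(locks)    # [('database', 'IX'), ('R1', 'IX'), ('tuple', 'X')]
print(unlocks)  # ['tuple', 'R1', 'database']
```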
VIII. Other concurrency control mechanisms
Timestamp method
Optimistic control method
Multi-version concurrency control