Database management (transactions, ACID, concurrency, locking, serializability, isolation)

1. Database transactions

1.1 A database transaction is a series of operations performed as a single logical unit of work.

1.2 The four properties of a transaction (ACID):
(1) Atomicity: A transaction must be an atomic unit of work: either all of its data modifications are performed or none are. The operations in a transaction usually share a common goal and depend on one another; atomicity rules out the system applying only a subset of them.
(2) Consistency: When a transaction completes, all data must be left in a consistent state. At the end of the transaction, all internal data structures (such as B-tree indexes or doubly linked lists) must be correct.
(3) Isolation: Modifications made by a concurrent transaction must be isolated from those made by any other concurrent transaction. A transaction sees data either in the state before another concurrent transaction modified it or in the state after that modification, never in an intermediate state. This property is called serializability, because it makes it possible to reload the starting data and replay a series of transactions so that the data ends up in the same state as after the original execution. A transaction gets the highest isolation level when it is serializable: at this level, the result of a set of concurrently executed transactions is the same as the result of running them one after another. Because high isolation limits how many transactions can run in parallel, some applications lower the isolation level in exchange for greater throughput. Isolation also keeps concurrent transactions from losing each other's updates.
(4) Durability: After a transaction completes, its effects on the system are permanent. The modifications persist even in the event of a fatal system failure.
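
Atomicity can be illustrated with a minimal sketch using Python's built-in sqlite3 module. The `account` table, the `transfer` and `balances` helpers, and the CHECK constraint are assumptions made up for this illustration; they are not part of the original text. The credit is deliberately applied before the debit so that a failing debit leaves a half-finished transfer that the rollback must undo.

```python
import sqlite3

# In-memory database. isolation_level=None stops the driver from opening
# transactions on its own, so BEGIN/COMMIT/ROLLBACK below are all explicit.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE account ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER CHECK (balance >= 0))"  # hypothetical business rule
)
conn.execute("INSERT INTO account VALUES ('alice', 100), ('bob', 50)")


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one atomic unit of work."""
    conn.execute("BEGIN")
    try:
        # Credit first, debit second: if the debit violates the CHECK
        # constraint, the credit has already been applied inside the transaction.
        conn.execute(
            "UPDATE account SET balance = balance + ? WHERE name = ?", (amount, dst)
        )
        conn.execute(
            "UPDATE account SET balance = balance - ? WHERE name = ?", (amount, src)
        )
        conn.execute("COMMIT")
        return True
    except sqlite3.IntegrityError:
        # Undo everything done since BEGIN, including the credit.
        conn.execute("ROLLBACK")
        return False


def balances(conn):
    """Return the current balances as a {name: balance} dict."""
    return dict(conn.execute("SELECT name, balance FROM account ORDER BY name"))
```

Calling `transfer(conn, "alice", "bob", 500)` fails on the debit and rolls back, so `balances(conn)` still shows the starting values; a transfer of 30 commits both updates together.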

1.3 There are three transaction models:
(1) Implicit transaction: each data operation statement automatically becomes its own transaction, with no explicit start or end marker.
(2) Explicit transaction: a transaction with explicit start and end markers, or one whose start is implicit but whose end is explicitly marked (BEGIN TRANSACTION starts the transaction, COMMIT ends it normally, ROLLBACK undoes it on error).
(3) Automatic transaction: the system default; neither the start nor the end needs to be marked.
Terminology varies between products. SQL Server, for example, calls the per-statement mode "autocommit" and uses "implicit transaction" for a transaction that starts automatically but must be ended explicitly.
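
The difference between a statement that commits on its own and an explicitly marked transaction can be sketched with Python's sqlite3 module (SQLite calls the per-statement behaviour "autocommit mode"). The `log` table and the variable names are assumptions for illustration only.

```python
import sqlite3

# isolation_level=None: the driver never issues BEGIN on our behalf.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE log (msg TEXT)")

# No start/end markers: the statement is its own transaction
# and is committed as soon as it finishes.
conn.execute("INSERT INTO log VALUES ('committed on its own')")
after_single_statement = conn.in_transaction  # no transaction left open

# Explicit transaction: marked start, then ROLLBACK as the marked end.
conn.execute("BEGIN")
conn.execute("INSERT INTO log VALUES ('will be undone')")
inside_explicit = conn.in_transaction  # a transaction is open here
conn.execute("ROLLBACK")

# Only the first row survives; the rolled-back insert is gone.
rows = [r[0] for r in conn.execute("SELECT msg FROM log")]
```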

2. Concurrency control

2.1 Common concurrency consistency problems include the lost update, the dirty read, the non-repeatable read, and the phantom read (also called a phantom or ghost read, and often grouped with non-repeatable reads).
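
As a rough illustration of the first of these problems, the sketch below replays a lost update by hand, with no real database or threads involved. The row, the amounts, and the function names are invented for the example.

```python
def lost_update_demo(start=100):
    """Interleave two read-modify-write transactions without any locking."""
    row = {"balance": start}

    # Both transactions read the same starting value...
    t1_read = row["balance"]
    t2_read = row["balance"]

    # ...and each writes back a result computed from its own stale read.
    row["balance"] = t1_read + 10  # T1 deposits 10
    row["balance"] = t2_read - 30  # T2 withdraws 30 and overwrites T1's write

    return row["balance"]


def serial_demo(start=100):
    """Run the same two transactions one after the other."""
    row = {"balance": start}
    row["balance"] = row["balance"] + 10  # T1 runs to completion
    row["balance"] = row["balance"] - 30  # then T2 runs
    return row["balance"]
```

Starting from 100, the serial run ends at 80, while the interleaved run ends at 70: T1's deposit has been lost.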

2.2 To address these concurrency inconsistencies, the SQL standard defines four isolation levels, each with rules that determine which changes made inside and outside a transaction are visible and which are not. Lower isolation levels generally allow more concurrency and have lower system overhead.
(1) Read Uncommitted: At this level, a transaction can see the results of other transactions that have not yet committed. It is rarely used in practice because its performance is not much better than that of the other levels. Reading uncommitted data is known as a dirty read.
(2) Read Committed: This is the default isolation level of most database systems (such as SQL Server and Oracle, but not MySQL). It satisfies the simple definition of isolation: a transaction sees only changes made by committed transactions. This level still permits non-repeatable reads: other transactions may commit new changes while this transaction is running, so the same SELECT can return different results.
(3) Repeatable Read: This is MySQL's default isolation level. It ensures that repeated reads of the same rows within one transaction return the same data. In theory, though, phantom reads can still occur.
(4) Serializable: This is the highest isolation level. It solves the phantom read problem by forcing transactions into an order in which they cannot conflict with each other. In a lock-based implementation, this amounts to placing a shared lock on every row that is read. At this level, frequent timeouts and heavy lock contention can result.

 

Isolation level     Dirty read    Non-repeatable read    Phantom read
Read Uncommitted    possible      possible               possible
Read Committed      ×             possible               possible
Repeatable Read     ×             ×                      possible
Serializable        ×             ×                      ×

(× = the anomaly cannot occur at this level)
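
The same mapping can be written as a small lookup, which is sometimes handy when deciding which level a piece of code needs. This is only a sketch of the standard's table; the names `Level`, `PREVENTED_FROM`, `possible_anomalies`, and `weakest_level_preventing` are invented here, and real engines may be stricter than the standard requires.

```python
from enum import IntEnum


class Level(IntEnum):
    """The four SQL isolation levels, ordered from weakest to strongest."""
    READ_UNCOMMITTED = 0
    READ_COMMITTED = 1
    REPEATABLE_READ = 2
    SERIALIZABLE = 3


# Anomaly -> the weakest level at which it can no longer occur.
PREVENTED_FROM = {
    "dirty read": Level.READ_COMMITTED,
    "non-repeatable read": Level.REPEATABLE_READ,
    "phantom read": Level.SERIALIZABLE,
}


def possible_anomalies(level):
    """List the anomalies that the given level still allows."""
    return [name for name, first in PREVENTED_FROM.items() if level < first]


def weakest_level_preventing(anomalies):
    """Return the lowest level that rules out every anomaly in `anomalies`."""
    return max(
        (PREVENTED_FROM[name] for name in anomalies),
        default=Level.READ_UNCOMMITTED,
    )
```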

2.3 To implement the isolation levels, the database uses locking.

(1) S lock (shared lock, read lock): the transaction holding the lock can read the locked object but cannot modify it, and other transactions can also place S locks on it.
(2) X lock (exclusive lock, write lock): only the transaction holding the lock can read and modify the locked object; other transactions can neither place any lock on the object nor read or modify it.
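
The compatibility rules for S and X locks can be sketched as a tiny in-memory lock table. This is a simplified illustration, not how any particular database implements locking: `LockTable`, `try_lock`, and `release_all` are hypothetical names, and a refused request simply returns False instead of making the caller wait.

```python
# (held mode, requested mode) -> may both be held by different transactions?
COMPATIBLE = {
    ("S", "S"): True,
    ("S", "X"): False,
    ("X", "S"): False,
    ("X", "X"): False,
}


class LockTable:
    def __init__(self):
        self.held = {}  # object -> {transaction: mode}

    def try_lock(self, txn, obj, mode):
        """Grant `mode` on `obj` to `txn` if no other holder conflicts."""
        holders = self.held.setdefault(obj, {})
        for other, other_mode in holders.items():
            if other != txn and not COMPATIBLE[(other_mode, mode)]:
                return False  # conflict: a real system would block here
        # Record the lock; an existing X lock is never downgraded to S.
        if holders.get(txn) != "X":
            holders[txn] = mode
        return True

    def release_all(self, txn):
        """Drop every lock held by `txn` (for example at commit or rollback)."""
        for holders in self.held.values():
            holders.pop(txn, None)
```

Two transactions can both take an S lock on the same row, but neither can then take an X lock on it until the other has released its S lock.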

2.4 Locking introduces the deadlock problem

There are two approaches to handling deadlocks:
(1) Prevention: one-shot locking (each transaction must lock all the data it will use at once, otherwise it cannot proceed) and ordered locking (transactions lock the data they use in a preset order).
(2) Detection and resolution: the timeout method (if a transaction waits longer than a specified time, a deadlock is assumed) and the wait-for graph method. A wait-for graph is a directed graph G = (T, U), where T is a set of nodes, each representing a running transaction, and U is a set of edges, each representing a wait. If transaction T1 is waiting for transaction T2, there is a directed edge from T1 to T2. If the graph contains a cycle, the system is deadlocked.
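
The wait-for graph check can be sketched as a depth-first search for a cycle. The input format (a dict mapping each transaction to the set of transactions it waits for) and the name `find_deadlock` are assumptions for this example; how a real system picks a victim to roll back once a cycle is found is not covered here.

```python
def find_deadlock(waits_for):
    """Return a list of transactions forming a cycle, or None if there is none.

    waits_for maps a transaction to the set of transactions it is waiting for,
    i.e. the outgoing edges of the wait-for graph.
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color = {}
    path = []

    def visit(txn):
        color[txn] = GREY
        path.append(txn)
        for other in waits_for.get(txn, ()):
            state = color.get(other, WHITE)
            if state == GREY:
                # `other` is already on the current path: we closed a cycle.
                return path[path.index(other):]
            if state == WHITE:
                cycle = visit(other)
                if cycle:
                    return cycle
        path.pop()
        color[txn] = BLACK
        return None

    for txn in list(waits_for):
        if color.get(txn, WHITE) == WHITE:
            cycle = visit(txn)
            if cycle:
                return cycle
    return None
```

For example, if T1 waits for T2, T2 for T3, and T3 for T1, the function reports those three transactions; removing any one of the edges makes it return None.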

2.5 Locking Protocol

2.5.1 The three-level locking protocol for ensuring data consistency

Incorrect scheduling of concurrent operations can cause three kinds of data inconsistency: lost updates, non-repeatable reads, and dirty reads. The three levels of the locking protocol address these problems to different degrees.
(1) Level 1 locking protocol: Before transaction T modifies data R, it must place an X lock on R and hold it until the end of the transaction, whether the transaction ends normally or abnormally. The level 1 protocol prevents lost updates and guarantees that T is recoverable. Because a transaction that only reads data does not need to lock it, level 1 guarantees neither repeatable reads nor the absence of dirty reads.
(2) Level 2 locking protocol: The level 1 protocol, plus: transaction T must place an S lock on data R before reading it, and may release the S lock as soon as the read is done. Besides preventing lost updates, level 2 also prevents dirty reads. Because the S lock can be released right after the read, level 2 does not guarantee repeatable reads.
(3) Level 3 locking protocol: The level 1 protocol, plus: transaction T must place an S lock on data R before reading it and hold it until the end of the transaction. Besides preventing lost updates and dirty reads, level 3 also prevents non-repeatable reads.

2.5.2 The two-phase locking protocol for ensuring serializability of concurrent schedules

Serializability is the criterion for the correctness of a concurrent schedule. The two-phase locking protocol (2PL for short) is a locking protocol designed to guarantee that concurrent schedules are serializable.
The two-phase locking protocol requires that a transaction obtain a lock on any data item before reading or writing it, and that once a transaction has released a lock it obtain no further locks.
"Two-phase" means that the transaction is divided into two phases: the first, in which locks are acquired, is called the growing (expanding) phase; the second, in which locks are released, is called the shrinking (contracting) phase.
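
The two rules can be checked mechanically for a single transaction's sequence of operations. The sketch below assumes a made-up encoding in which each operation is an `(action, item)` pair with action `"lock"`, `"unlock"`, `"read"`, or `"write"`; it does not distinguish S locks from X locks.

```python
def obeys_two_phase_locking(ops):
    """Check one transaction's operation list against the two 2PL rules."""
    held = set()
    shrinking = False  # becomes True after the first unlock

    for action, item in ops:
        if action == "lock":
            if shrinking:
                return False  # rule 2 broken: lock acquired after a release
            held.add(item)
        elif action == "unlock":
            held.discard(item)
            shrinking = True
        elif action in ("read", "write"):
            if item not in held:
                return False  # rule 1 broken: access without holding the lock
        else:
            raise ValueError(f"unknown action: {action!r}")
    return True
```

A transaction that locks A, locks B, and only then starts unlocking passes the check; one that unlocks A and afterwards locks B does not, even if every read and write is individually protected by a lock.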

2.6 Serializability

A schedule is a time-ordered sequence of the significant operations of one or more transactions.
A schedule is serial if it consists of all the actions of one transaction, then all the actions of another, and so on, with no interleaving.
The correctness principle for transactions tells us that every serial schedule preserves the consistency of the database state. If, regardless of the initial state of the database, a schedule has the same effect on the database state as some serial schedule, we say that the schedule is serializable. [A concurrent schedule is serializable if its result is equivalent to the result of a serial schedule.]
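
This definition can be tried out on a toy example by running a schedule and comparing its final state with that of every serial order. The sketch below is only a test against one chosen initial state, whereas the definition requires equivalence for every initial state, so a True result here is necessary but not sufficient. The transactions T1 and T2, the step encoding, and all function names are invented for the illustration.

```python
from itertools import permutations


def read(item):
    """Step that copies db[item] into the transaction's local workspace."""
    def step(db, local):
        local[item] = db[item]
    return step


def write(item, compute):
    """Step that writes compute(local) back to db[item]."""
    def step(db, local):
        db[item] = compute(local)
    return step


# Two toy transactions on a single item A.
TXNS = {
    "T1": [read("A"), write("A", lambda local: local["A"] + 100)],
    "T2": [read("A"), write("A", lambda local: local["A"] * 2)],
}


def execute(order, txns, initial):
    """Run a schedule: `order` names whose next step runs at each point in time."""
    db = dict(initial)
    local = {t: {} for t in txns}
    position = {t: 0 for t in txns}
    for t in order:
        txns[t][position[t]](db, local[t])
        position[t] += 1
    return db


def serial_results(txns, initial):
    """Final states of all serial schedules (one per ordering of the transactions)."""
    results = []
    for perm in permutations(txns):
        order = [t for t in perm for _ in txns[t]]
        results.append(execute(order, txns, initial))
    return results


def matches_some_serial(order, txns, initial):
    """True if the schedule ends in the same state as some serial schedule."""
    return execute(order, txns, initial) in serial_results(txns, initial)
```

With `{"A": 50}` as the initial state, the two serial orders end with A at 300 and 200. The interleaving `["T1", "T2", "T1", "T2"]` ends with A at 100, which matches neither, so that schedule is not serializable.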
