Detailed explanation of database transactions and locks

1. What is a transaction?

A transaction is a series of operations performed as a single logical unit of work: either all of them take effect or none of them do. Transaction processing ensures that data-oriented resources are not permanently updated unless every operation within the transactional unit completes successfully. By combining a set of related operations into an all-or-nothing unit, error recovery is simplified and applications become more reliable. For a logical unit of work to qualify as a transaction, it must satisfy the so-called ACID properties: Atomicity, Consistency, Isolation, and Durability. The transaction is the basic logical unit of work in database operations, and the transaction management subsystem is responsible for processing it.

Take an example to deepen the understanding: the classic bank transfer. A transfers 1,000 yuan to B. There are two operations here: the first deducts 1,000 yuan from account A, and the second adds 1,000 yuan to account B. Together they make up the transfer transaction.

If both operations succeed, account A is debited 1,000 yuan, account B is credited 1,000 yuan, and the transaction succeeds. If both operations fail, the balances of account A and account B remain unchanged, and the transaction fails.

Finally, think about it: how could account A be debited 1,000 yuan while account B's balance stays unchanged? If the two operations are placed in one transaction, using the transaction support provided by the database itself, this cannot happen. But if developers put the two operations in two separate transactions and the second one fails, an intermediate state appears. In practice, a hand-rolled distributed transaction that is not handled properly can also leave such an intermediate state. That is not the fault of the transaction concept itself, which stipulates that no intermediate state is visible; it is a flaw in the implementer's solution.
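To make this concrete, here is a minimal SQL sketch of the transfer as one transaction, assuming a hypothetical account table with id and balance columns on MySQL/InnoDB (the table name and ids are illustrative, not from any particular system):

    -- Transfer 1,000 yuan from account A (id = 1) to account B (id = 2)
    -- as a single atomic unit of work.
    START TRANSACTION;

    UPDATE account SET balance = balance - 1000 WHERE id = 1;  -- debit A
    UPDATE account SET balance = balance + 1000 WHERE id = 2;  -- credit B

    -- If both updates succeed, make the changes permanent; on any error,
    -- issue ROLLBACK instead and neither balance changes.
    COMMIT;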

2. The four characteristics of a transaction

Atomicity: A transaction must be an atomic unit of work; either all of its data modifications are performed or none are. The operations in a transaction typically share a common goal and depend on one another. If the system performed only a subset of them, the overall goal of the transaction could be violated. Atomicity rules out the possibility of the system processing only a subset of the operations.

Consistency: The database must be in a consistent state both before and after the transaction executes. A database is said to be consistent when its state satisfies all integrity constraints.

Isolation: Modifications made by concurrent transactions must be isolated from the modifications made by every other concurrent transaction. What state of the data a transaction sees, whether the state before another transaction ran or some intermediate state, and how much the transactions influence one another, is controlled by setting the isolation level.

Durability: Once a transaction has ended, its results are made permanent; the data written to the database files is not lost even if the machine goes down, and its effect on the system persists.

3. Transaction concurrency control

Looking at it from the other direction: if we do not control the concurrency of transactions at all, what abnormal situations can occur during concurrent database operations? Some of them are acceptable to us and some are not. Note that "anomaly" here is relative to a specific context; it is not necessarily an outright error. Suppose there is an order table with a field called count, used as a counter, whose current value is 100.

The first type of lost update (rollback loss): this kind of lost update is caused by a rollback, so it is also called rollback loss. Two transactions update count at the same time and both read 100. Transaction one updates successfully and commits, so count = 100 + 1 = 101. Transaction two fails to update for some reason and rolls back, restoring count to the 100 it read at the beginning, and transaction one's update is lost.
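As a timeline, assuming the order table is called orders and has an id column (both names are illustrative); T1 and T2 are the two transactions. This is the textbook formulation: with the write locks and undo logs of modern engines, T2's rollback only undoes T2's own changes, so in practice the anomaly is normally prevented.

    -- T1: SELECT count FROM orders WHERE id = 1;             -- reads 100
    -- T2: SELECT count FROM orders WHERE id = 1;             -- reads 100
    -- T1: UPDATE orders SET count = 101 WHERE id = 1; COMMIT;
    -- T2: UPDATE orders SET count = 101 WHERE id = 1;
    -- T2: ROLLBACK;   -- in the textbook model the row is restored to the 100 that T2
    --                 -- originally read, silently erasing T1's committed update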

Dirty Read: this anomaly occurs because one transaction reads data that another transaction has modified but not yet committed. For example, transaction 1 updates count to 101 but does not commit; transaction 2 now reads count and gets 101 instead of 100; then transaction 1 rolls back for some reason. Anything transaction 2 does based on the 101 it read is the beginning of a nightmare.
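A sketch of the interleaving as two MySQL sessions against the same hypothetical orders table (count starts at 100). The reading session must be running at READ UNCOMMITTED for the dirty value to be visible:

    -- session T1
    START TRANSACTION;
    UPDATE orders SET count = 101 WHERE id = 1;   -- modified but not committed

    -- session T2 (isolation level READ UNCOMMITTED)
    SELECT count FROM orders WHERE id = 1;        -- sees the dirty value 101

    -- session T1
    ROLLBACK;                                     -- the 101 never officially existed,
                                                  -- but T2 may already have acted on it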

Non-Repeatable Read: a transaction runs the same query against the same row two or more times and gets different results; in other words, within one transaction you cannot repeatedly read a row and be guaranteed the same result each time — it may be the same, or it may not. The cause is that another transaction updated the row between the two queries. For example, transaction 1 queries count and gets 100; transaction 2 then updates count to 101; when transaction 1 reads count again, the value has become 101, so the two reads give different results.
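The same scenario as two sessions; this interleaving is observable at READ COMMITTED and is what REPEATABLE READ is designed to prevent:

    -- session T1
    START TRANSACTION;
    SELECT count FROM orders WHERE id = 1;        -- reads 100

    -- session T2
    UPDATE orders SET count = 101 WHERE id = 1;
    COMMIT;

    -- session T1 (still the same transaction)
    SELECT count FROM orders WHERE id = 1;        -- now reads 101: same query, different answer
    COMMIT;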

The second type of lost update (overwrite loss): here the update is lost because it is overwritten by another transaction. For example, two transactions update count at the same time and both read the initial value 100. Transaction one updates successfully and commits, so count = 100 + 1 = 101. Transaction two then also updates successfully and commits, again computing count = 100 + 1 = 101. Since transaction two still incremented from 100, transaction one's update is lost.
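As two sessions, this is the classic read-modify-write race; it is also exactly the case the optimistic locking described later is meant to catch:

    -- session T1
    START TRANSACTION;
    SELECT count FROM orders WHERE id = 1;        -- reads 100

    -- session T2
    START TRANSACTION;
    SELECT count FROM orders WHERE id = 1;        -- also reads 100

    -- session T1
    UPDATE orders SET count = 101 WHERE id = 1;   -- 100 + 1
    COMMIT;

    -- session T2
    UPDATE orders SET count = 101 WHERE id = 1;   -- still 100 + 1, overwriting T1's increment
    COMMIT;                                       -- count should be 102; T1's update is lost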

Phantom Read: a phantom read is similar to a non-repeatable read, but what changes is not the value of the data, it is the number of rows. Within one transaction, two identical queries return different numbers of rows, which feels like a hallucination — hence the name. For example, transaction 1 queries how many records the order table has, transaction 2 inserts a new record, and when transaction 1 counts the records again it gets a different answer than the first time. That is a phantom read.
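As two sessions; whether the second count actually changes depends on the engine and isolation level (for example, InnoDB's REPEATABLE READ snapshot hides the new row from plain SELECTs), so treat this as a sketch of the concept:

    -- session T1
    START TRANSACTION;
    SELECT COUNT(*) FROM orders;                  -- e.g. 10 rows

    -- session T2
    INSERT INTO orders (id, count) VALUES (11, 0);
    COMMIT;

    -- session T1 (still the same transaction)
    SELECT COUNT(*) FROM orders;                  -- may now report 11: a "phantom" row appeared
    COMMIT;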

4. Database transaction isolation levels

Seeing the problems above, you might think: what am I supposed to do with so many pitfalls? In fact, not all of these situations have to be avoided; it depends on your business requirements, and on factors such as the load your database carries. Notice that all of the anomalies above arise from transactions influencing one another, which means there needs to be some way to separate two transactions to a certain degree and reduce or avoid that mutual influence. This is where database transaction isolation levels come in, and isolation levels are generally implemented through database locks.

Read Uncommitted: at this isolation level, other transactions can read a transaction's changes even before its update statements are committed, so any of the anomalies above may occur. It is very error-prone, offers essentially no safety, and is rarely used.

Read Committed: at this isolation level a transaction can only see the committed updates of other transactions, not their uncommitted ones, which eliminates dirty reads and the first type of lost update. This is the default isolation level of most databases, such as Oracle and SQL Server.

Repeatable Read: at this isolation level, running the same query for the same data two or more times within a transaction yields the same result, although the number of rows returned is not guaranteed to stay the same; data that has been read may not be written by other transactions until the transaction ends. This eliminates non-repeatable reads and the second type of lost update, and it is the default isolation level of MySQL.

Serializable: while this transaction executes, other transactions are not allowed to execute concurrently against the same data; reads are fully serialized with writes, so any data that has been read cannot be written by others, although concurrent reads are still allowed. This eliminates phantom reads and is the highest transaction isolation level. Although it is the safest and most worry-free, it is too inefficient and is generally not used.

The following table shows how each isolation level controls each anomaly (Y = the anomaly can occur at that level, N = it is prevented):

    Isolation level     Type 1 lost update   Dirty read   Non-repeatable read   Type 2 lost update   Phantom read
    Read Uncommitted    Y                    Y            Y                     Y                    Y
    Read Committed      N                    N            Y                     Y                    Y
    Repeatable Read     N                    N            N                     N                    Y
    Serializable        N                    N            N                     N                    N
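The isolation level can be inspected and changed per session or per transaction. A MySQL sketch (the variable name differs across versions: transaction_isolation in MySQL 8.0, tx_isolation in older releases):

    -- Check the current isolation level (MySQL 8.0 and later).
    SELECT @@transaction_isolation;

    -- Change it for the rest of the current session ...
    SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;

    -- ... or only for the next transaction that starts.
    SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
    START TRANSACTION;
    -- statements here run under SERIALIZABLE
    COMMIT;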

5. Database lock classification

Database locks can broadly be divided into two categories: pessimistic locks and optimistic locks. Pessimistic locking is what we usually mean by the database's own lock mechanism. Optimistic locking generally refers to a locking scheme implemented by the user, such as the optimistic locking provided by Hibernate; the idea of optimistic locking also shows up in programming languages themselves.

Pessimistic lock: as the name implies, it is pessimistic. It takes a conservative attitude toward data being modified by the outside world, assuming the data may be modified at any time, so the data is kept locked during the whole time it is being processed. Pessimistic locks generally rely on the lock mechanism provided by the relational database; the row locks and table locks in a relational database, whether read locks or write locks, are all pessimistic locks.

Pessimistic locks, classified by how they are used (a MySQL sketch of taking these locks follows this list):

Shared lock (S lock): also called a read lock. After transaction A places an S lock on object T, other transactions can only place S locks on T; multiple transactions can read T at the same time, but none of them can write it until A releases its S lock.

Exclusive lock (X lock): also called a write lock. After transaction A places an X lock on object T, no other transaction can place any lock on T; only transaction A can read and write T until A releases the X lock.

Update lock (U lock): used to reserve the right to place an X lock on an object later. It allows other transactions to read the object, but does not allow them to place a U lock or an X lock on it; when the object that was read is about to be updated, the U lock is upgraded to an X lock. Its main purpose is to prevent deadlock. With only shared locks, modifying data takes two steps: first a shared lock is acquired and the data is read, then the shared lock is upgraded to an exclusive lock and the modification is performed. If two or more transactions hold a shared lock on the same object at the same time, each must upgrade its shared lock to an exclusive lock in order to modify the data, but none of them will release its shared lock while waiting for the others to release theirs, which causes a deadlock. If a transaction instead acquires an update lock before modifying the data and upgrades it to an exclusive lock only when the modification happens, the deadlock is avoided.
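In MySQL/InnoDB, shared and exclusive row locks can be taken explicitly inside a transaction, as in the sketch below (LOCK IN SHARE MODE is spelled FOR SHARE from MySQL 8.0 onward). The update lock (U lock) described above is a SQL Server concept with no direct MySQL equivalent:

    -- Shared (S) lock: other transactions may also read and take S locks on this row,
    -- but cannot modify it until this transaction ends.
    START TRANSACTION;
    SELECT count FROM orders WHERE id = 1 LOCK IN SHARE MODE;
    COMMIT;

    -- Exclusive (X) lock: other transactions can neither lock nor modify this row;
    -- typically used when you intend to update what you just read.
    START TRANSACTION;
    SELECT count FROM orders WHERE id = 1 FOR UPDATE;
    UPDATE orders SET count = count + 1 WHERE id = 1;
    COMMIT;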

Pessimistic locks, classified by their scope:

Row lock: the scope of the lock is a single row. The database uses a row lock when it can determine exactly which rows need to be locked; if it cannot tell in advance which rows will be affected, it has to lock more broadly, up to the entire table. For example, take a user table with a primary key id and a birthday column. With a statement like update ... where id = ?, the database knows exactly which row is affected and uses a row lock. With a statement like update ... where birthday = ?, it does not know in advance which rows will be affected, and a table-level lock may be used.

Table lock: the scope of the lock is the entire table.
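Roughly how this plays out in MySQL/InnoDB (a sketch assuming the hypothetical user table above, where id is the primary key and birthday has no index):

    -- id is the primary key: InnoDB can pinpoint the row, so only that row is locked.
    UPDATE user SET birthday = '1990-01-01' WHERE id = 1;

    -- birthday is not indexed: InnoDB has to scan, and locks every row it scans,
    -- which in effect locks the whole table against writes.
    UPDATE user SET birthday = '1990-01-01' WHERE birthday = '1989-12-31';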

Optimistic lock: as the name implies, it is optimistic. Every time it operates on the data, it assumes nobody else will modify it in the meantime, so it does not lock; only at update time does it check whether the data has been modified during that period, and this check is something you implement yourself. Since the database already provides pessimistic locks that are easy to use, why use optimistic locks? When read operations far outnumber writes, a lock taken for every update would block all of those reads and reduce throughput, and acquiring and releasing locks carries its own overhead; we only need to solve the synchronization problem for the small number of updates. Conversely, if the read/write ratio is not lopsided, or your system's responsiveness and throughput are not the bottleneck, do not use optimistic locking: it adds complexity and brings extra risk.

Optimistic locking implementations:

Version number (recorded as version): add a version identifier to the data, i.e. an extra version field on the table, and increment it by 1 on every update. When reading the data, read the version as well; when updating, compare it against the current version in the database. If the version is still the one you read, the update may proceed; if the current version is already larger than the old one, another transaction has updated the data and bumped the version, and you get a notification that the update cannot be applied, based on which you decide what to do, for example start over. The key point is that checking the version and applying the update must execute as one atomic unit; otherwise another transaction could change the version between your check and your update, and your update would then overwrite theirs, producing the second type of lost update. So you use a statement such as update ... where ... and version = "old version" and look at the number of affected rows: 0 means the update failed because the version had changed, non-zero means it succeeded. (A SQL sketch follows this list.)

Timestamp: essentially the same as the version number, except the comparison uses a timestamp. Note that it must be the database server's timestamp, not the business system's time.

Field to be updated: similar to the version-number approach, except that no extra field is added; the real data field itself serves as the version-control information, which is useful when the database table structure of an old system cannot be changed. Suppose the field to be updated is count: first read count, and when updating, check whether the value of count in the database is still the value you expect (the value you originally read); if it is, write the new value to the field, otherwise the update fails. Java's atomic classes such as AtomicInteger follow the same idea.

All fields: similar to the field-to-be-updated approach, except that all fields together serve as the version-control information; the update is applied only if none of the fields has changed.
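A SQL sketch of the version-number approach, assuming the hypothetical orders table has been given an extra version column; the version check and the update are one statement, so they execute atomically:

    -- 1. Read the data together with its version.
    SELECT count, version FROM orders WHERE id = 1;    -- e.g. count = 100, version = 7

    -- 2. Compute the new value in the application, then update only if the version
    --    is still the one we read, bumping the version at the same time.
    UPDATE orders
       SET count = 101, version = version + 1
     WHERE id = 1 AND version = 7;

    -- 3. If this reports 0 affected rows, another transaction got there first:
    --    re-read and retry (or report a conflict) instead of overwriting.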

The difference between several methods of optimistic locking:

For a newly designed system, the version-number and timestamp methods can be used; they require adding a field, and their scope is the whole row: no matter which field is modified, the version changes, so two transactions updating two unrelated fields of the same record are still mutually exclusive and cannot proceed concurrently. For an old system whose table structure cannot be modified, a data field itself is used as the version-control information and no new field is needed; with the field-to-be-updated method, as long as the fields modified by other transactions do not overlap with those of the current transaction, the updates can proceed concurrently, so concurrency is higher.

 

Reference link: https://www.2cto.com/database/201609/546060.html
