Locking makes schedules serializable
Column contents:
- Handwritten database toadb
This column mainly introduces how to develop a database from scratch: the development steps, the principles involved, the problems encountered, and so on, so that everyone can follow along and develop together, and anyone who wants to can become a participant.
This column will be updated regularly, and the corresponding code will also be updated regularly. The code at each stage will be tagged to facilitate learning at each stage.
Motto: As heaven moves with vigor, a gentleman strives ceaselessly for self-improvement; as the earth is broad and receptive, a gentleman carries all things with great virtue.
Preface
With the rapid development of information technology, data has penetrated into various fields and become one of the most important assets of modern society. In this era of big data, database theory plays a vital role in data management, storage and processing. However, many readers may be confused about database theory and do not know how to choose a suitable database, how to design an effective database structure, and how to process and manage large amounts of data. Therefore, this column aims to provide readers with a comprehensive and in-depth guide to database theory to help them better understand and apply database technology.
Database theory is the study of how to effectively manage, store and retrieve data. In the modern information society, the amount of data is growing exponentially, and how to efficiently process and manage this data has become an important issue. At the same time, with the continuous development of emerging technologies such as cloud computing, the Internet of Things, and big data, the importance of database theory has become increasingly prominent.
This column therefore hopes to improve everyone's knowledge and understanding of database theory and to help anyone who is interested.
Overview
In database concurrency control, the most commonly used scheduler design locks the database elements being accessed in order to prevent non-serializable behavior.
To put it simply, before a transaction accesses an element it first acquires the element's lock, so that other transactions cannot access the element in a non-serializable way.
This article introduces this locking approach, which lets the scheduler produce serializable action sequences, and the problems that come with it.
Lock
In concurrent programming, we come across many kinds of system variables that let us serialize the execution of a region of code. Databases use the same kind of variables to implement their locks.
A lock ensures that the database element it protects can only be accessed serially. A transaction acquires the lock before the access; if the acquisition fails, it can only wait or abort. Once it holds the lock it performs the operation, and when the operation completes it releases the lock.
Real databases have many types of locks, which will be introduced in detail later.
When the scheduler uses locks, two kinds of rules must hold:
- Transaction structure
First, a transaction may read or write an element only after locking it and before releasing the lock;
Second, if a transaction locks an element, it must release the lock after use;
- Scheduling structure
An element can be locked by only one transaction at a time; another transaction that attempts to lock it must either abort or wait. Different databases implement this differently.
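The two kinds of rules above can be checked mechanically. The following is a minimal sketch, not toadb code: the step format `(transaction, action, element)` and the function name `check_schedule` are assumptions made for illustration.

```python
def check_schedule(schedule):
    """Return a list of rule violations for a schedule.

    Each step is (txn, action, element); action is one of
    "lock", "unlock", "read", "write". (Hypothetical format.)
    """
    holder = {}   # element -> transaction currently holding its lock
    errors = []
    for i, (txn, action, elem) in enumerate(schedule):
        if action == "lock":
            if elem in holder:
                # Scheduling rule: only one transaction may hold a lock.
                errors.append(f"step {i}: {txn} locks {elem} held by {holder[elem]}")
            else:
                holder[elem] = txn
        elif action == "unlock":
            if holder.get(elem) != txn:
                errors.append(f"step {i}: {txn} unlocks {elem} without holding it")
            else:
                del holder[elem]
        else:  # read or write
            if holder.get(elem) != txn:
                # Transaction rule 1: access only between lock and unlock.
                errors.append(f"step {i}: {txn} does {action}({elem}) without the lock")
    # Transaction rule 2: every lock must be released after use.
    for elem, txn in holder.items():
        errors.append(f"end: {txn} never unlocks {elem}")
    return errors


ok = [("T1", "lock", "A"), ("T1", "read", "A"),
      ("T1", "write", "A"), ("T1", "unlock", "A")]
bad = [("T1", "lock", "A"), ("T2", "read", "A"), ("T2", "lock", "A")]

print(check_schedule(ok))    # no violations
for line in check_schedule(bad):
    print(line)
```

The second schedule breaks all three rules: T2 reads A without the lock, T2 requests a lock that T1 holds, and T1 never releases A.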
Locking scheduler
The figure below shows the architecture of a locking scheduler.
Let's use a simple lock model to introduce it. Suppose each database element has only one kind of lock: a transaction must acquire it before reading or writing the element and release it after use.
To make its decisions, the scheduler keeps a lock table that records the lock status of each database element. If an element is locked, a transaction is using it and other transactions cannot access it.
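A lock table for this simple model can be sketched as a dictionary from element to holding transaction. The class name `LockTable` and its methods are hypothetical and only illustrate the idea; here a denied request simply returns `False`, and the caller decides whether to wait or abort.

```python
class LockTable:
    """One exclusive lock per database element (illustrative sketch)."""

    def __init__(self):
        self._holder = {}  # element -> id of the transaction holding the lock

    def lock(self, txn, elem):
        """Grant the lock if the element is free; return False if denied."""
        if elem in self._holder:
            return False
        self._holder[elem] = txn
        return True

    def unlock(self, txn, elem):
        """Release the lock; only the holder may release it."""
        if self._holder.get(elem) != txn:
            raise RuntimeError(f"{txn} does not hold the lock on {elem}")
        del self._holder[elem]

    def holder(self, elem):
        """Return the transaction holding the lock, or None if free."""
        return self._holder.get(elem)


table = LockTable()
print(table.lock("T1", "A"))   # True: A was free
print(table.lock("T2", "A"))   # False: A is held by T1
table.unlock("T1", "A")
print(table.lock("T2", "A"))   # True: A is free again
```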
- Examples are as follows:
| Transaction T1 | Transaction T2 | Data A | Data B |
|---|---|---|---|
| | | 25 | 25 |
| lock(A); read(A); | | | |
| A = A + 100 | | | |
| write(A); unlock(A); | | 125 | |
| | lock(A); read(A); | | |
| | A = A*2 | | |
| | write(A); unlock(A); | 250 | |
| | lock(B); read(B); | | |
| | B = B*2 | | |
| | write(B); unlock(B); | | 50 |
| lock(B); read(B); | | | |
| B = B + 100 | | | |
| write(B); unlock(B); | | | 150 |
This example shows an interesting phenomenon. Although the scheduler locks every access, so that the transactions access each database element serially, the resulting schedule is not serializable and the final state is inconsistent.
What we have described so far is only simple locking, and the schedule in the example is not conflict-serializable.
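To see why the result is not serializable, we can replay the interleaving and compare it with the two possible serial orders. This is only a sketch: the helper names `run` and `steps` are made up for illustration, and the lock and unlock steps are left out because they do not change any data.

```python
def run(schedule, a=25, b=25):
    """Execute (txn, op, element) steps in order and return the final data."""
    db = {"A": a, "B": b}
    local = {"T1": {}, "T2": {}}          # each transaction's private copies
    update = {"T1": lambda v: v + 100,    # T1 adds 100
              "T2": lambda v: v * 2}      # T2 doubles
    for txn, op, elem in schedule:
        if op == "read":
            local[txn][elem] = db[elem]
        elif op == "compute":
            local[txn][elem] = update[txn](local[txn][elem])
        elif op == "write":
            db[elem] = local[txn][elem]
    return db


def steps(txn, elem):
    """read, compute, write of one element by one transaction."""
    return [(txn, "read", elem), (txn, "compute", elem), (txn, "write", elem)]


# The interleaving from the table: T1 on A, T2 on A, T2 on B, T1 on B.
interleaved = steps("T1", "A") + steps("T2", "A") + steps("T2", "B") + steps("T1", "B")
serial_t1_t2 = steps("T1", "A") + steps("T1", "B") + steps("T2", "A") + steps("T2", "B")
serial_t2_t1 = steps("T2", "A") + steps("T2", "B") + steps("T1", "A") + steps("T1", "B")

print(run(interleaved))    # {'A': 250, 'B': 150}
print(run(serial_t1_t2))   # {'A': 250, 'B': 250}
print(run(serial_t2_t1))   # {'A': 150, 'B': 150}
```

The interleaved result matches neither serial order: on A the effect is "T1 then T2", but on B it is "T2 then T1".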
Two-phase locking
Here we introduce a locking condition called two-phase locking (2PL). Under this surprising condition, transaction schedules are guaranteed to be conflict-serializable.
Principle
Two-phase locking must satisfy the following condition:
- In each transaction, all lock requests must precede all unlock requests;
This divides the transaction into two phases:
- The first phase is the locking phase. In this phase, the database elements that need to be accessed are locked one after another;
- The second phase is the unlocking phase. In this phase, no more lock requests are allowed; the locks are released one after another;
Like the consistency rules for transactions, two-phase locking constrains the order of actions within a transaction. Transactions that satisfy this condition are called two-phase-locked transactions.
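The 2PL condition is easy to test on a single transaction: once the first unlock has been seen, no lock request may follow. A minimal sketch, where the `(action, element)` step format and the name `is_two_phase` are assumptions for illustration:

```python
def is_two_phase(actions):
    """True if no lock request appears after the first unlock."""
    unlocked = False
    for action, _elem in actions:
        if action == "unlock":
            unlocked = True          # the unlocking phase has started
        elif action == "lock" and unlocked:
            return False             # a lock after an unlock violates 2PL
    return True


# T1 with simple locking: unlock(A) comes before lock(B).
t1_simple = [("lock", "A"), ("read", "A"), ("write", "A"), ("unlock", "A"),
             ("lock", "B"), ("read", "B"), ("write", "B"), ("unlock", "B")]

# T1 rewritten for 2PL: lock(B) is moved before unlock(A).
t1_2pl = [("lock", "A"), ("read", "A"), ("write", "A"), ("lock", "B"),
          ("unlock", "A"), ("read", "B"), ("write", "B"), ("unlock", "B")]

print(is_two_phase(t1_simple))  # False
print(is_two_phase(t1_2pl))     # True
```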
- Example
Then let’s look at the above example again, after scheduling it according to 2PL:
| Transaction T1 | Transaction T2 | Data A | Data B |
|---|---|---|---|
| | | 25 | 25 |
| lock(A); read(A); | | | |
| A = A + 100 | | | |
| write(A); lock(B); unlock(A); | | 125 | |
| | lock(A); read(A); | | |
| | A = A*2 | | |
| | write(A); | 250 | |
| | lock(B); rejected | | |
| read(B); | | | |
| B = B + 100 | | | |
| write(B); unlock(B); | | | 125 |
| | lock(B); unlock(A); read(B); | | |
| | B = B*2 | | |
| | write(B); unlock(B); | | 250 |
With a schedule that satisfies the 2PL condition, the final result is conflict-serializable: A = 250 and B = 250, the same result as executing T1 and then T2 serially.
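The 2PL schedule can also be replayed with a small lock table to watch the rejected request turn into a wait. This is a sketch under assumptions: the function `run_2pl`, the step tuples, and the `order` list (which transaction gets the next turn) are made up for illustration, and the code does not check that reads and writes hold the lock.

```python
def add100(v):
    return v + 100


def double(v):
    return v * 2


def txn(update):
    """Two-phase-locked transaction: lock(B) comes before unlock(A)."""
    return [("lock", "A"), ("read", "A"), ("calc", "A", update), ("write", "A"),
            ("lock", "B"), ("unlock", "A"),
            ("read", "B"), ("calc", "B", update), ("write", "B"), ("unlock", "B")]


def run_2pl(txns, order, db):
    """Give one turn per entry in `order`; a denied lock wastes the turn."""
    holder, denied = {}, []               # lock table and log of rejections
    pc = {t: 0 for t in txns}             # next step of each transaction
    local = {t: {} for t in txns}         # private copies of values read
    for t in order:
        step = txns[t][pc[t]]
        op, elem = step[0], step[1]
        if op == "lock":
            if elem in holder:
                denied.append((t, elem))  # wait and retry on a later turn
                continue
            holder[elem] = t
        elif op == "unlock":
            del holder[elem]
        elif op == "read":
            local[t][elem] = db[elem]
        elif op == "calc":
            local[t][elem] = step[2](local[t][elem])
        elif op == "write":
            db[elem] = local[t][elem]
        pc[t] += 1
    return db, denied


txns = {"T1": txn(add100), "T2": txn(double)}
# Same turn order as the table: T1, T2 (blocked on B), T1, T2.
order = ["T1"] * 6 + ["T2"] * 5 + ["T1"] * 4 + ["T2"] * 6
final, denied = run_2pl(txns, order, {"A": 25, "B": 25})
print(final)    # {'A': 250, 'B': 250}
print(denied)   # [('T2', 'B')]
```

T2's request for B is rejected once, because T1 locked B before releasing A; after T1 releases B, T2 continues and the result equals the serial order T1, T2.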
Analysis
How does two-phase locking work?
If we observe carefully, we find that the equivalent serial order of the transactions is the order in which they perform their first unlock. In other words, on every database element the transactions execute in the order of their first unlock, which gives the effect of serial execution and achieves conflict serializability.
This can be proved rigorously by induction, but I won't go into the details here.
Deadlock
Although two-phase locking makes transaction schedules conflict-serializable, it leaves one problem unsolved: deadlock. The scheduler makes a transaction wait for a lock held by another transaction, while that transaction is in turn waiting for a lock held by the first.
Examples are as follows:
| Transaction T1 | Transaction T2 | Data A | Data B |
|---|---|---|---|
| | | 25 | 25 |
| lock(A); read(A); | | | |
| | lock(B); read(B); | | |
| A = A + 100 | | | |
| | B = B*2 | | |
| write(A); | | 125 | |
| | write(B); | | 50 |
| lock(B) rejected | | | |
| | lock(A) rejected | | |
Now neither transaction can make progress, and both will wait forever. How to solve this problem will be covered in a later article.
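The state at the end of the table can be written down as "who holds what" and "who waits for what", and the circular wait becomes visible by following those links. This only illustrates the situation; it is not the solution promised for the later article, and the name `find_deadlock` and both dictionaries are assumptions made for this sketch.

```python
def find_deadlock(holder, waiting):
    """Follow "waits for" links and return a cycle of transactions, if any.

    holder:  element -> transaction holding its lock
    waiting: transaction -> element it is waiting to lock
    """
    for start in waiting:
        path, t = [], start
        while t in waiting and t not in path:
            path.append(t)
            t = holder.get(waiting[t])   # who holds what t is waiting for
        if t in path:
            return path[path.index(t):]  # the transactions on the cycle
    return None


# State after the last row of the table.
holder = {"A": "T1", "B": "T2"}
waiting = {"T1": "B", "T2": "A"}
print(find_deadlock(holder, waiting))             # ['T1', 'T2']

# An ordinary wait is not a deadlock: T1 is not waiting for anything.
print(find_deadlock({"A": "T1"}, {"T2": "A"}))    # None
```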
Summary
Through the blocking mode, the scheduler can produce conflict-serializable scheduling, but when multiple database elements are involved in each transaction, there is a risk of deadlock.
End
Thank you very much for your support. Don’t forget to leave your valuable comments while browsing. If you think it is worthy of encouragement, please like and save it. I will work harder!
If there are any errors or omissions, please point them out and learn from each other.