[Database] A lock-based database scheduler that achieves conflict serializability and ensures transaction and schedule consistency

Locking for serializability

Column contents

  • Handwritten database toadb
    This column mainly introduces how to develop a database from scratch: the development steps, the principles involved, the problems encountered along the way, and so on, so that everyone can follow along and develop together, and anyone who is interested can become a participant.
    This column will be updated regularly, and the corresponding code will also be updated regularly. The code at each stage will be tagged to facilitate learning at each stage.

Open source contribution

Personal homepage: My homepage
Managed community: Open source database
Motto: As heaven moves with vigor, a gentleman strives ceaselessly to improve himself; as the earth is receptive, a gentleman carries all things with great virtue.

Preface

With the rapid development of information technology, data has penetrated into various fields and become one of the most important assets of modern society. In this era of big data, database theory plays a vital role in data management, storage and processing. However, many readers may be confused about database theory and do not know how to choose a suitable database, how to design an effective database structure, and how to process and manage large amounts of data. Therefore, this column aims to provide readers with a comprehensive and in-depth guide to database theory to help them better understand and apply database technology.

Database theory is the study of how to effectively manage, store and retrieve data. In the modern information society, the amount of data is growing exponentially, and how to efficiently process and manage this data has become an important issue. At the same time, with the continuous development of emerging technologies such as cloud computing, the Internet of Things, and big data, the importance of database theory has become increasingly prominent.

This column is therefore shared in the hope of improving everyone's knowledge and understanding of database theory and of helping interested readers.

Overview

In database concurrency control, the most commonly used scheduler architecture places locks on the database elements being accessed in order to prevent non-serializable behavior.

Put simply, before a transaction accesses an element it first acquires the lock on that element, which keeps other transactions from accessing it in a non-serializable way.

This article introduces this locking approach, which lets the scheduler produce serializable sequences of actions, and the problems that come with it.

Lock

In concurrent programming, we come across many kinds of synchronization primitives that let us serialize the execution of a region of code. Databases use the same primitives to implement their locks.

A lock makes the database element it protects accessible only in a serial manner. A transaction acquires the lock before accessing the element; if the acquisition fails, it can only wait or abort. Once the lock is acquired, the operation can proceed, and the lock is released after the operation completes.
Real databases have many types of locks, which will be introduced in detail later.
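
The acquire, operate, release pattern described above can be illustrated with an ordinary mutex. The following is only a minimal sketch, assuming Python's standard `threading.Lock` stands in for a database lock; the names `lock_a` and `try_update` are hypothetical and not taken from toadb.

```python
import threading

# Hypothetical lock guarding one database element "A".
lock_a = threading.Lock()

def try_update(txn_name, wait):
    """Acquire the lock before touching the element, release it afterwards.

    wait=True blocks until the lock is free; wait=False gives up at once,
    which corresponds to the "abort" choice.
    """
    acquired = lock_a.acquire(blocking=wait)
    if not acquired:
        return f"{txn_name}: lock busy, abort"
    try:
        return f"{txn_name}: got lock, operating on A"
    finally:
        lock_a.release()  # always release once the operation is done

lock_a.acquire()                            # another transaction holds the lock
busy_result = try_update("T2", wait=False)  # T2 cannot get it and aborts
lock_a.release()                            # the holder finishes
free_result = try_update("T2", wait=False)  # now T2 succeeds
```

Passing `wait=True` instead would make the caller block until the holder releases the lock, which is the "wait" choice.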

When a scheduler uses locks, two structural conditions must hold:

  • Transaction structure (consistency of transactions)

First, a transaction may read or write an element only after locking it and before releasing the lock;
Second, if a transaction locks an element, it must release the lock after use;

  • Schedule structure (legality of schedules)

At any time, an element can be locked by only one transaction; another transaction that tries to lock it must either wait or abort, and different databases implement this differently.
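
Both conditions can be checked mechanically. The sketch below is illustrative only: it assumes a transaction is written as a list of `(op, element)` pairs and a schedule as a list of `(txn, op, element)` triples, and the function names `is_consistent` and `is_legal` are hypothetical.

```python
def is_consistent(actions):
    """Transaction rule: read/write only while holding the lock, and every
    lock is released by the end. `actions` is a list of (op, element)."""
    held = set()
    for op, elem in actions:
        if op == "lock":
            held.add(elem)
        elif op == "unlock":
            if elem not in held:
                return False      # unlocking something that is not locked
            held.remove(elem)
        elif elem not in held:    # read or write without holding the lock
            return False
    return not held               # nothing may stay locked at the end


def is_legal(schedule):
    """Schedule rule: at most one transaction holds the lock on an element.
    `schedule` is a list of (txn, op, element)."""
    owner = {}                    # element -> transaction holding its lock
    for txn, op, elem in schedule:
        if op == "lock":
            if owner.get(elem, txn) != txn:
                return False      # someone else already holds this lock
            owner[elem] = txn
        elif op == "unlock" and owner.get(elem) == txn:
            del owner[elem]
    return True


good_txn = [("lock", "A"), ("read", "A"), ("write", "A"), ("unlock", "A")]
bad_txn = [("lock", "A"), ("read", "A")]                    # never unlocks A
good_schedule = [("T1", "lock", "A"), ("T1", "unlock", "A"), ("T2", "lock", "A")]
bad_schedule = [("T1", "lock", "A"), ("T2", "lock", "A")]   # two holders of A
```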

Locking scheduler

The figure below shows the architecture of a locking scheduler.
[Figure: architecture of a locking scheduler]

Let's introduce it with a simple lock model: each database element has a single kind of lock, which a transaction must acquire before reading or writing the element and release after use.

To make its decisions, the scheduler keeps a lock table that records the lock status of each database element. If an element has a lock on it, some transaction is using that element and other transactions cannot access it.
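
A lock table of this kind can be sketched as a dictionary from element to holding transaction. This is a minimal illustration under that assumption, not the structure toadb actually uses; the class name `LockTable` and its methods are hypothetical.

```python
class LockTable:
    """Maps each locked element to the transaction holding its lock."""

    def __init__(self):
        self.holder = {}   # element -> transaction id

    def request(self, txn, elem):
        """Grant the lock if the element is free; otherwise deny it."""
        if elem in self.holder:
            return False   # the caller decides whether to wait or abort
        self.holder[elem] = txn
        return True

    def release(self, txn, elem):
        """Drop the entry, but only if txn really holds the lock."""
        if self.holder.get(elem) != txn:
            raise ValueError(f"{txn} does not hold the lock on {elem}")
        del self.holder[elem]


table = LockTable()
first = table.request("T1", "A")    # granted: A was free
second = table.request("T2", "A")   # denied: T1 holds A
table.release("T1", "A")
third = table.request("T2", "A")    # granted now that A is free again
```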

  • Examples are as follows:
Transaction T1          Transaction T2          A      B
                                                25     25
lock(A); read(A);
A = A + 100;
write(A); unlock(A);                            125
                        lock(A); read(A);
                        A = A * 2;
                        write(A); unlock(A);    250
                        lock(B); read(B);
                        B = B * 2;
                        write(B); unlock(B);           50
lock(B); read(B);
B = B + 100;
write(B); unlock(B);                                   150

This example shows an interesting phenomenon. Although the scheduler locks every access and the transactions access each database element serially, the resulting schedule is not serializable and the final state is inconsistent: running T1 and then T2 serially would give A = B = 250, running T2 and then T1 would give A = B = 150, but this schedule ends with A = 250 and B = 150.

What we have described so far is only simple locking, and the schedule in the example is not conflict-serializable.
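
The arithmetic can be replayed with a few lines of code. This is a minimal sketch that assumes each locked read-modify-write on one element behaves as a single atomic step, since the lock keeps other transactions out in between; the helper names `t1`, `t2` and `run` are hypothetical.

```python
def t1(db, elem):
    db[elem] = db[elem] + 100   # T1 adds 100 to the element

def t2(db, elem):
    db[elem] = db[elem] * 2     # T2 doubles the element

def run(steps):
    """Apply (transaction_step, element) pairs to a fresh database."""
    db = {"A": 25, "B": 25}
    for step, elem in steps:
        step(db, elem)
    return db

serial_t1_t2 = run([(t1, "A"), (t1, "B"), (t2, "A"), (t2, "B")])
serial_t2_t1 = run([(t2, "A"), (t2, "B"), (t1, "A"), (t1, "B")])
# The schedule from the table: T1 on A, T2 on A, T2 on B, T1 on B.
interleaved = run([(t1, "A"), (t2, "A"), (t2, "B"), (t1, "B")])

print(serial_t1_t2)   # {'A': 250, 'B': 250}
print(serial_t2_t1)   # {'A': 150, 'B': 150}
print(interleaved)    # {'A': 250, 'B': 150}
```

The interleaved result matches neither serial order, which is what makes the final state inconsistent.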

Two-phase locking

Here we introduce a locking condition called two-phase locking (2PL). Under this perhaps surprising condition, schedules are guaranteed to be conflict-serializable.

Principle

Under two-phase locking, lock and unlock requests must satisfy the following condition:

  • In each transaction, all lock requests must precede all unlock requests;

This divides the transaction into two phases:

  • The first phase is the locking phase, in which the database elements to be accessed are locked one after another;
  • The second phase is the unlocking phase, in which no further lock requests are allowed and the locks are released one after another;

Two-phase locking, like transaction consistency, restricts the order of actions within a transaction. A transaction that satisfies this condition is called a two-phase-locked transaction.
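
The 2PL condition is easy to test for a single transaction. Below is a minimal sketch, assuming a transaction is given as a list of `(op, element)` pairs; the function name `is_two_phase` is hypothetical.

```python
def is_two_phase(actions):
    """True if no lock request appears after the first unlock."""
    unlocked = False
    for op, _elem in actions:
        if op == "unlock":
            unlocked = True          # the unlocking phase has begun
        elif op == "lock" and unlocked:
            return False             # a lock request after an unlock
    return True

# T1 from the earlier example: it unlocks A before locking B.
t1_simple = [("lock", "A"), ("read", "A"), ("write", "A"), ("unlock", "A"),
             ("lock", "B"), ("read", "B"), ("write", "B"), ("unlock", "B")]
# T1 rearranged so that lock(B) comes before unlock(A).
t1_2pl = [("lock", "A"), ("read", "A"), ("write", "A"), ("lock", "B"),
          ("unlock", "A"), ("read", "B"), ("write", "B"), ("unlock", "B")]

print(is_two_phase(t1_simple))   # False
print(is_two_phase(t1_2pl))      # True
```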

  • Example

Now let's revisit the example above, this time scheduled according to 2PL:

Transaction T1                    Transaction T2                    A      B
                                                                    25     25
lock(A); read(A);
A = A + 100;
write(A); lock(B); unlock(A);                                       125
                                  lock(A); read(A);
                                  A = A * 2;
                                  write(A);                         250
                                  lock(B); rejected
read(B);
B = B + 100;
write(B); unlock(B);                                                       125
                                  lock(B); unlock(A); read(B);
                                  B = B * 2;
                                  write(B); unlock(B);                     250

With a schedule that satisfies the 2PL condition, the final result is A = 250 and B = 250, the same as executing T1 and then T2 serially, so the schedule is conflict-serializable.

Analysis

Why does two-phase locking work?

If we look carefully, we find that the equivalent serial order of the transactions matches the order of their first unlock operations. In other words, on every database element the transactions execute serially in the order in which they first unlock, which gives the effect of serial execution and thus achieves conflict serializability.

This can of course be proved rigorously by induction, so I won't go into details here.
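
The "order of first unlock" observation can be made concrete. The sketch below is illustrative only: it assumes a schedule is a list of `(txn, op, element)` triples, omits the read actions for brevity, and uses a hypothetical helper named `serial_order`. The schedule encodes the 2PL example from this article.

```python
def serial_order(schedule):
    """Order transactions by the position of their first unlock."""
    first_unlock = {}
    for pos, (txn, op, _elem) in enumerate(schedule):
        if op == "unlock" and txn not in first_unlock:
            first_unlock[txn] = pos
    return sorted(first_unlock, key=first_unlock.get)

schedule_2pl = [
    ("T1", "lock", "A"), ("T1", "write", "A"),
    ("T1", "lock", "B"), ("T1", "unlock", "A"),
    ("T2", "lock", "A"), ("T2", "write", "A"),
    ("T1", "write", "B"), ("T1", "unlock", "B"),
    ("T2", "lock", "B"), ("T2", "unlock", "A"),
    ("T2", "write", "B"), ("T2", "unlock", "B"),
]

order = serial_order(schedule_2pl)
print(order)   # ['T1', 'T2']
```

T1 unlocks first, and the final values A = 250, B = 250 are exactly those of running T1 and then T2 serially.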

Deadlock

Although two-phase locking makes schedules conflict-serializable, it leaves one problem unsolved: deadlock. The scheduler forces a transaction to wait for a lock held by another transaction, while that transaction is in turn waiting for a lock held by the first.

An example follows:

Transaction T1          Transaction T2          A      B
                                                25     25
lock(A); read(A);
                        lock(B); read(B);
A = A + 100;
                        B = B * 2;
write(A);                                       125
                        write(B);                      50
lock(B); rejected
                        lock(A); rejected

Now neither transaction can make progress, and both will wait forever. A later article will cover how to solve this problem.
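
The mutual waiting in this example can be made explicit as a cycle in a "waits-for" graph. The sketch below only detects the cycle and is not a resolution strategy or the approach this column will present; the graph representation and the name `has_deadlock` are assumptions made for illustration.

```python
def has_deadlock(waits_for):
    """Return True if the waits-for graph contains a cycle.

    waits_for maps a transaction to the transactions whose locks it needs.
    """
    visiting, done = set(), set()

    def visit(txn):
        if txn in done:
            return False
        if txn in visiting:
            return True            # came back to a transaction on the path
        visiting.add(txn)
        found = any(visit(nxt) for nxt in waits_for.get(txn, ()))
        visiting.discard(txn)
        done.add(txn)
        return found

    return any(visit(txn) for txn in waits_for)

# State after the last two rows of the table: T1 holds A and waits for B
# (held by T2); T2 holds B and waits for A (held by T1).
deadlocked = has_deadlock({"T1": ["T2"], "T2": ["T1"]})
# For contrast: T1 waits for T2, but T2 waits for nobody.
no_deadlock = has_deadlock({"T1": ["T2"], "T2": []})

print(deadlocked)    # True
print(no_deadlock)   # False
```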

Summary

With locking, and two-phase locking in particular, the scheduler can produce conflict-serializable schedules, but when each transaction involves multiple database elements there is a risk of deadlock.

Closing

Thank you very much for your support. Don’t forget to leave your valuable comments while browsing. If you think it is worthy of encouragement, please like and save it. I will work harder!

Author’s email: [email protected]
If there are any errors or omissions, please point them out and learn from each other.

Origin blog.csdn.net/senllang/article/details/134759655