[Database] Goals of database concurrency control, analysis of serializable schedules, and the concurrency control scheduler model

Database concurrency control

Column contents

  • Handwritten database toadb
    This column explains how to develop a database from scratch: the development steps, the principles involved, and the problems encountered along the way, so that everyone can follow along and anyone who is interested can join in as a contributor.
    The column is updated regularly, and so is the corresponding code. The code for each stage is tagged to make it easier to study stage by stage.

Open source contribution

Personal homepage: My homepage
Manage community: Open source database
Motto: As heaven moves with vigor, a gentleman strives unceasingly for self-improvement; as the earth is broad and receptive, a gentleman carries all things with great virtue.



Preface

With the rapid development of information technology, data has penetrated into various fields and become one of the most important assets of modern society. In this era of big data, database theory plays a vital role in data management, storage and processing. However, many readers may be confused about database theory and do not know how to choose a suitable database, how to design an effective database structure, and how to process and manage large amounts of data. Therefore, this column aims to provide readers with a comprehensive and in-depth guide to database theory to help them better understand and apply database technology.

Database theory is the study of how to effectively manage, store and retrieve data. In the modern information society, the amount of data is growing exponentially, and how to efficiently process and manage this data has become an important issue. At the same time, with the continuous development of emerging technologies such as cloud computing, the Internet of Things, and big data, the importance of database theory has become increasingly prominent.

This column therefore aims to improve everyone's knowledge and understanding of database theory and to help anyone who is interested in the subject.

Overview

A database runs many transactions at the same time; some are initiated by clients and some are generated inside the database system. When these transactions execute concurrently and affect one another, they can leave the database in an inconsistent state.

This can happen even when every transaction executes correctly on its own and no failure or error occurs, so correct individual transactions alone do not guarantee correct data.

The database therefore has to coordinate them centrally, so that concurrently executing transactions proceed in an order that follows certain rules. This is the job of the scheduler in the database.

This article discusses the database's concurrency scheduler.

Concurrency scheduler

Concurrency control is the process by which the database scheduler lets transactions execute concurrently while keeping the database state consistent.

When a transaction needs to read or write database elements, it sends a request to the scheduler. In most cases the scheduler performs the read or write directly; if the database element is not in the buffer, it first asks the buffer manager to load it into the buffer.

In some cases it is unsafe to execute a request immediately, and the scheduler delays it. With some concurrency control techniques the scheduler may even reject the request, which causes the transaction to abort.

Serializable

How does the scheduler decide whether an execution is safe, that is, whether the concurrent execution of transactions keeps the database state consistent? In database terms, this property is called serializability.

There is also a stronger and more important condition called conflict serializability, which is what the schedulers of most databases actually enforce.

Serializability concept

When a transaction executes in isolation (that is, with no other transaction running concurrently), it takes the database from one consistent state to another consistent state. In practice other transactions usually run concurrently with it, so this principle cannot be applied directly.

We therefore need a serializable scheduling strategy: the result of executing the transactions concurrently under the schedule must be the same as the result of executing them one at a time in some order. The sequence of actions produced by such scheduling is called a serializable schedule.

Case analysis

Suppose there are two transactions, T1 and T2, that operate on data items A and B, whose initial values are both 25. T1 adds 100 to each item and T2 doubles each item.
For each calculation, a transaction first reads the data item, modifies it, and then writes it back.

  • Serial order 1: T1 runs to completion, then T2 runs.
Transaction T1    Transaction T2    Data A    Data B
                                    25        25
read(A,t)
t = t + 100
write(A,t)                          125
read(B,t)
t = t + 100
write(B,t)                                    125
                  read(A,t)
                  t = t*2
                  write(A,t)        250
                  read(B,t)
                  t = t*2
                  write(B,t)                  250
  • Serial order 2: T2 runs to completion, then T1 runs.
Transaction T1    Transaction T2    Data A    Data B
                                    25        25
                  read(A,t)
                  t = t*2
                  write(A,t)        50
                  read(B,t)
                  t = t*2
                  write(B,t)                  50
read(A,t)
t = t + 100
write(A,t)                          150
read(B,t)
t = t + 100
write(B,t)                                    150

Both serial executions start from the same initial state, but the final state depends on the order: running T1 and then T2 gives A = B = 250, while running T2 and then T1 gives A = B = 150. The result of executing two transactions serially therefore depends on the order in which they are executed.

The above are the results of executing the two transactions serially. When the transactions run concurrently, will the result match one of the serial executions?

  • One possible interleaving when the two transactions execute concurrently
Transaction T1    Transaction T2    Data A    Data B
                                    25        25
read(A,t)
t = t + 100
write(A,t)                          125
                  read(A,t)
                  t = t*2
                  write(A,t)        250
                  read(B,t)
                  t = t*2
                  write(B,t)                  50
read(B,t)
t = t + 100
write(B,t)                                    150

In this schedule the final result is A = 250 and B = 150, which differs from the result of both serial executions above. The final state is inconsistent, so this schedule is not serializable.

How can a schedule be made serializable? The database achieves this goal through a serializable model.

Serializable model

Simply executing the transactions one after another always produces a consistent result. A serializable schedule lets the actions of multiple transactions interleave while still producing the same result as some serial execution, so it can complete the work more efficiently than strictly serial execution.

Most databases build this model from three techniques, locking, timestamps, and validation, to make concurrent transactions serializable and to preserve the transaction properties.

Summary

The goal of database concurrency control is that, when transactions execute concurrently, their execution sequence is serializable and the database state remains consistent.

To implement the visitor pattern in C, we can first define structures that represent element objects and visitor objects. Element objects can be visited by visitors, and visitor objects visit element objects and perform operations on them.

The following is a simple example that defines an element object of type string and a visitor object that outputs a string. In the main function, we create an element object of type string and then use the visitor object to access it and output "Hello, world!".

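#include <stdio.h>
#include <stdlib.h>
#include <string.h>

// Element object that holds a string
typedef struct Element {
    char* str;
} Element;

// Visitor object that outputs a string
typedef struct Visitor {
    void (*visit)(Element*);
} Visitor;

// Create a string element; returns NULL if allocation fails
Element* create_element(const char* str) {
    Element* element = (Element*)malloc(sizeof(Element));
    if (element == NULL) {
        return NULL;
    }
    element->str = (char*)malloc(strlen(str) + 1);
    if (element->str == NULL) {
        free(element);
        return NULL;
    }
    strcpy(element->str, str);
    return element;
}

// Destroy a string element (a NULL element is ignored)
void destroy_element(Element* element) {
    if (element == NULL) {
        return;
    }
    free(element->str);
    free(element);
}

// Visit operation that prints the element's string; its signature
// matches the visit function pointer in Visitor
static void print_element(Element* element) {
    printf("%s\n", element->str);
}

// Let the visitor visit the element
void visit_element(Visitor* visitor, Element* element) {
    visitor->visit(element);
}

// Create a visitor that outputs a string; returns NULL if allocation fails
Visitor* create_visitor(void) {
    Visitor* visitor = (Visitor*)malloc(sizeof(Visitor));
    if (visitor == NULL) {
        return NULL;
    }
    visitor->visit = print_element;
    return visitor;
}

// Destroy the visitor
void destroy_visitor(Visitor* visitor) {
    free(visitor);
}

int main(void) {
    // Create a string element with the value "Hello, world!"
    Element* element = create_element("Hello, world!");
    // Create a visitor that outputs a string
    Visitor* visitor = create_visitor();
    if (element == NULL || visitor == NULL) {
        destroy_element(element);
        destroy_visitor(visitor);
        return 1;
    }
    // Use the visitor to visit the element and output "Hello, world!"
    visit_element(visitor, element);
    // Destroy the element and the visitor to release their memory
    destroy_element(element);
    destroy_visitor(visitor);
    return 0;
}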

End

Thank you very much for your support. Don’t forget to leave your valuable comments while browsing. If you think it is worthy of encouragement, please like and save it. I will work harder!

Author’s email: [email protected]
If there are any errors or omissions, please point them out and learn from each other.

Origin blog.csdn.net/senllang/article/details/134725616