[Cantian Engine] Huawei Cantian Engine core architecture: source code structure, multi-threaded services, data node management, and metadata management across multiple nodes

Cantian Engine source code structure

Column content :

  • Cantian Engine kernel architecture
    This column discusses the Cantian Engine kernel architecture: how it achieves multi-read, multi-write access across multi-machine database nodes, how it differs from traditional primary/standby and MPP architectures, where the technical difficulties lie, how data and metadata are synchronized, and how fault recovery is supported when one of the multi-master nodes fails.

  • Developing the handwritten database toadb from scratch
    This column mainly introduces the development steps, the principles involved in the development process, and the problems encountered, so that everyone can follow along and develop together, and anyone who wants to can become a participant.
    This column will be updated regularly, and so will the corresponding code. The code at each stage will be tagged to make each stage easier to study.

Personal homepage : My homepage
Management community : Open source database
Motto: As heaven moves with vigor, a gentleman strives ceaselessly for self-improvement; as the earth is receptive, a gentleman carries all things with great virtue.

Preface

Domestic databases are developing rapidly, and good news is heard at conferences every year. Beyond the technical evolution of databases themselves, this year Huawei released Cantian Engine, which serves as a foundation for databases: any database can be built on top of Cantian Engine to form a database system with a multi-master distributed architecture, which is why it is called an engine.

This column describes the internal architecture of Cantian Engine in detail and explains how to adapt a database to it.

Overview

Most of the core code of Cantian Engine has now been open-sourced, and enthusiasts are eager to try it. Today I will walk through its source code structure.

Source code address

gitee address

Source code directory structure

The source code is mainly in the pkg/src directory:

[senllang@hatch src]$ ll
total 44
drwxr-xr-x.  2 senllang develops 4096 Dec  9 14:03 cluster
-rw-r--r--.  1 senllang develops  792 Dec  9 14:03 CMakeLists.txt
drwxr-xr-x.  4 senllang develops   94 Dec  9 14:03 cmd
drwxr-xr-x.  4 senllang develops   56 Dec  9 14:03 cms
drwxr-xr-x.  3 senllang develops 8192 Dec  9 14:03 common
drwxr-xr-x.  3 senllang develops   39 Dec  9 14:03 driver
drwxr-xr-x.  2 senllang develops 4096 Dec  9 14:03 gstbox
drwxr-xr-x. 18 senllang develops 4096 Dec  9 14:03 kernel
drwxr-xr-x.  2 senllang develops 4096 Dec  9 14:03 mec
drwxr-xr-x.  2 senllang develops 4096 Dec  9 14:03 protocol
drwxr-xr-x.  2 senllang develops   66 Dec  9 14:03 rc
drwxr-xr-x.  3 senllang develops 4096 Dec  9 14:03 server
drwxr-xr-x.  4 senllang develops 4096 Dec  9 14:03 tse
drwxr-xr-x.  2 senllang develops   60 Dec  9 14:03 upgrade_check
drwxr-xr-x.  4 senllang develops   52 Dec  9 14:03 utils
drwxr-xr-x.  2 senllang develops   72 Dec  9 14:03 version

As shown above, each directory is named after the module it contains.

Introduction to main modules

There are several main modules:

cantianLib

cantianLib can run as a separate node, such as a data node or a coordination node. It is generally used as a data node, where it manages the stored data. It provides only management and coordination functions; the actual handling of data and devices is the responsibility of the CMS node.

cantianLib is the bridge between the database engine and the CMS node. It handles data requests in the various SQL execution scenarios and records transaction logs.

Source code directory

cantianLib code is located in
./pkg/src/server/

System structure

[Figure: cantianLib system structure]

cantianLib is implemented with a multi-threaded architecture. Its front end is the DB agent: cantianLib listens for network requests from the agent, such as logging in to the database or executing DDL and DML statements, converts them into data requests, and sends those to the CMS. It mainly includes the following services:

  • Kernel service: mainly processes data and lock requests, and is responsible for buffer management, catalog metadata management, and so on.
  • DB background service: mainly handles requests from the DB agent and converts SQL scenarios into data requests. By binding sessions to service threads, it handles concurrent access to data.
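
The idea of binding each session to its own service thread can be illustrated with a minimal sketch. This is not Cantian code: the names `Session`, `to_data_request`, `serve_session`, and `run_sessions` are hypothetical, and the "data request" is reduced to a read/write label, purely to show how one thread per session keeps concurrent clients from interleaving inside a single session.

```python
import queue
import threading


class Session:
    """Hypothetical session: one per logged-in client connection."""

    def __init__(self, session_id):
        self.session_id = session_id
        self.requests = queue.Queue()  # SQL statements arriving from the agent
        self.results = []              # data requests produced for this session


def to_data_request(sql):
    """Toy translation of an SQL statement into a data request (assumption)."""
    verb = sql.strip().split()[0].upper()
    kind = "read" if verb == "SELECT" else "write"
    return {"kind": kind, "sql": sql}


def serve_session(session):
    """Service thread body: handles only the session it is bound to."""
    while True:
        sql = session.requests.get()
        if sql is None:  # sentinel: the session was closed
            break
        session.results.append(to_data_request(sql))


def run_sessions(workload):
    """Start one thread per session, feed it statements, and collect results."""
    sessions, threads = [], []
    for session_id, statements in workload.items():
        session = Session(session_id)
        thread = threading.Thread(target=serve_session, args=(session,))
        thread.start()
        for statement in statements:
            session.requests.put(statement)
        session.requests.put(None)
        sessions.append(session)
        threads.append(thread)
    for thread in threads:
        thread.join()
    return {s.session_id: [r["kind"] for r in s.results] for s in sessions}
```

Because each session owns its queue and its thread, statements within a session are processed in order while different sessions proceed concurrently. A real server would more likely draw threads from a pool than create one per session.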

CMS module

CMS is a core service that mainly schedules and manages distributed resources, distributed locks, and storage devices. As a whole it is a multi-threaded architecture that interacts with the front end over the network, responds to resource requests, and obtains the requested resources from the storage file system.

Source code directory

The source code of the CMS module is located in
./pkg/src/cms/cms

System structure

Its code structure is shown in the figure below:
[Figure: CMS code structure]

The module has three main stages:

  • Initialization: loads and initializes the configuration, initializes each service, and starts the service threads, devices, and network.
  • Running: mainly handles events from messages and processes, and responds to messages. The main requests here are for allocating various resources and locks.
  • Exit: when cms exits, it cleans up and flushes data.
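
The three stages can be sketched as a small service with an initialization step, a message loop, and a shutdown step. This is only an illustration under stated assumptions: `MiniCms`, its message tuple layout, and the `flushed` attribute are hypothetical and do not mirror the real cms code. Locks are modeled as a plain dictionary from resource name to owning node.

```python
import queue
import threading


class MiniCms:
    """Hypothetical stand-in for a CMS-like service; not the real cms code."""

    def __init__(self, config):
        # Stage 1, initialization: load config, set up state, start the thread.
        self.config = dict(config)
        self.inbox = queue.Queue()
        self.locks = {}      # resource name -> owning node
        self.flushed = None  # stands in for state persisted at exit
        self.worker = threading.Thread(target=self._run)
        self.worker.start()

    def _run(self):
        # Stage 2, running: take messages one at a time and reply to each.
        while True:
            msg = self.inbox.get()
            if msg is None:  # sentinel posted by shutdown()
                break
            kind, resource, node, reply = msg
            if kind == "lock":
                owner = self.locks.setdefault(resource, node)
                reply.put(owner == node)
            elif kind == "unlock":
                released = self.locks.get(resource) == node
                if released:
                    del self.locks[resource]
                reply.put(released)
            else:
                reply.put(False)  # unknown request kinds are rejected

    def request(self, kind, resource, node):
        """Send one message to the service thread and wait for its answer."""
        reply = queue.Queue()
        self.inbox.put((kind, resource, node, reply))
        return reply.get()

    def shutdown(self):
        # Stage 3, exit: stop the thread, then save and clear remaining state.
        self.inbox.put(None)
        self.worker.join()
        self.flushed = dict(self.locks)
        self.locks.clear()
```

Usage: create `MiniCms({"node_id": 0})`, call `request("lock", "page-1", "node-a")` to acquire a lock, and call `shutdown()` when done. Because all lock state is touched only by the single service thread, the dictionary itself needs no mutex in this sketch.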

Summary

Generally speaking, Cantian Engine is divided into three major parts:

  • The first is the agent, which is integrated with the database compute layer. I have not seen open source code for it. It mainly requests from the data node, and later releases, the data needed during database computing.
  • The second is cantianLib, which is the data node. This part maintains database-related data, such as the data cache and database metadata, and synchronizes it across multiple nodes. It also runs services that require coordination among multiple nodes, such as backup, checkpoint, and transaction commit.
  • The third is CMS, which is the physical storage part. It distributes data rows across storage devices, manages data distribution, and reads and writes data on the physical devices when data is requested.

Of course, the corresponding physical storage device can be considered a fourth part, which can be NFS or DBstor. The optimizations in this part are not open source.

End

Thank you very much for your support. Don’t forget to leave your valuable comments while browsing. If you think it is worthy of encouragement, please like and save it. I will work harder!

Author's email: [email protected]
If there are any errors or omissions, please point them out and learn from each other.

Origin blog.csdn.net/senllang/article/details/135075873