Introduction to GaussDB database management system

1. Development of GaussDB

Insert image description here

2. GaussDB ecosystem

  1. Internal:
    Cloud plus automation. Moving the database operations infrastructure to the cloud automates the daily work of DBAs (database administrators) and operations and maintenance personnel.
  2. External:
    Ecosystem connection and integration. Partners around the database are connected and certified, which addresses ecosystem problems such as the difficulty of finding developers and DBAs and the difficulty of integrating applications.

3. GaussDB characteristics and technical competitiveness

  1. Distributed:
    Distributed transaction capabilities plus cross-DC (data center) high availability address the bottlenecks of traditional relational databases, such as limited scalability and availability.
  2. Cloud architecture:
    A cloud architecture that covers public cloud, private cloud, and hybrid cloud scenarios and meets cloud database requirements across diverse use cases.
  3. Mixed load:
    Run multiple workloads in a single database. This simplifies system deployment, eliminates the data consistency problems caused by copying or relocating data, and improves the system's performance, reliability, and real-time behavior.
  4. Multi-model and heterogeneous computing:
    Build a new database to manage multi-model data from mobile Internet, Internet of Things, and artificial intelligence workloads, such as time series and images, and transform and optimize the database architecture to make full use of the computing power of "general-purpose processor + heterogeneous accelerator".
  5. AI+DB:
    Relying on the accuracy and range of application of AI algorithms, it supports problem solving in specific scenarios such as database parameter tuning and SQL execution optimization. It also supports extracting structured information from unstructured data such as images, language, and text.

4. Design ideas and user objects

Design idea: Use cloud technology and AI technology to provide infrastructure for cloud-deployed database system services that is managed across an extremely wide scope, so that computing resources can be shared.

  1. Public cloud database system services: aimed at the database needs of small and medium-sized enterprises. Using public cloud database services significantly reduces the operating costs of such enterprises.
  2. Private cloud database system services: aimed at the database service needs of medium and large enterprises. Such an enterprise purchases a large amount of equipment in-house and builds the related PaaS and SaaS layers, where the database service is a critical service. This lets new information systems across the enterprise and its departments share resources and data, and lowers overall maintenance costs, ultimately reducing the total cost of ownership.
  3. Choosing between the two:
    Which database system services go to the public cloud and which go to the private cloud is decided mainly by what reduces the system's total cost of ownership (TCO), including construction costs, operation and maintenance costs, depreciation expenses, and so on.

5. Elastically scalable multi-tenant database architecture

Insert image description here

6. Cloud database cloning and replication

The production database system can be cloned and replicated. The resulting copy can be used in non-production systems, for development and testing or for benchmark tests.

The database of the user's non-production system holds the same data as the current production system. In addition, some of the updated data in the production system can be synchronized to the non-production database in real time, which keeps the two sets of data consistent.
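
The clone-then-synchronize idea can be illustrated with a small in-memory sketch. This is only an illustration under stated assumptions, not how GaussDB implements cloning: the class names `ProductionDatabase` and `NonProductionDatabase`, the `rows` dictionary, and the synchronous push of every change are all hypothetical.

```python
import copy


class NonProductionDatabase:
    """Hypothetical clone used for development, testing, or benchmarks."""

    def __init__(self, rows):
        self.rows = rows

    def apply_change(self, key, value):
        # Apply one change forwarded from the production system.
        self.rows[key] = value


class ProductionDatabase:
    """Hypothetical production system that can be cloned and then kept in sync."""

    def __init__(self):
        self.rows = {}
        self.clones = []

    def clone(self):
        # Deep-copy the current data so the clone starts identical to production
        # but shares no mutable state with it.
        replica = NonProductionDatabase(copy.deepcopy(self.rows))
        self.clones.append(replica)
        return replica

    def upsert(self, key, value):
        # Write locally, then forward the change to every registered clone.
        self.rows[key] = value
        for replica in self.clones:
            replica.apply_change(key, value)
```

Because the clone owns its own copy of the data, test workloads can modify it without touching production, while later production updates still reach the clone.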

7. Design ideas of the multi-model database

Design idea: Provide unified multi-model data management and processing capabilities, together with unified operation and maintenance capabilities, on top of the database system.

  1. Multi-model data storage: A unified multi-model database system must provide storage capabilities for multiple data models, including relational, time series, stream, graph, spatial, etc.
  2. Multi-model data processing: A unified multi-model database system must provide processing capabilities for multiple data models, including relational, time series, stream, graph, spatial, etc.
  3. Conversion between multi-model data: In most cases a customer has only one data source, so the data arrives in a single data model. Subsequent processing, however, may need several models to represent the physical world, or may need multiple models to cooperate on a single task. Data conversion between different models is therefore also extremely important.

8. Multi-model database system architecture

The Multi-Model Database Uniform Framework is introduced to give users a unified data access and maintenance interface for multi-model databases such as relational databases, graph databases, and time series databases.
This interface lowers the learning and usage costs for operations staff and application developers, and improves the security of data use (data does not need to be moved between multiple systems, which reduces the time it is exposed on the network).

9. Overall architecture of GaussDB database

GaussDB mainly includes four logical modules:

9.1 Database front-end

Handles transaction submission. It is based on MySQL 8.0 and is 100% compatible with it.

9.2 Storage Abstraction Layer (SAL)

Responsible for data sharding, fault recovery, and remote data storage.

9.3 Log Store (log storage)

Log Store is a service that runs in the storage layer and is responsible for storing log records. Once all log records belonging to a transaction have been persisted, completion of the transaction can be confirmed to the client.
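
The acknowledgement rule above can be sketched as a small tracker. This is a minimal illustration, assuming each log record is identified by a transaction ID and an LSN; the class `TransactionLogTracker` and its method names are hypothetical and are not part of GaussDB.

```python
class TransactionLogTracker:
    """Tracks which log records of each transaction are still waiting to be persisted."""

    def __init__(self):
        # txn_id -> set of LSNs that have been written but not yet persisted
        self._pending = {}

    def append(self, txn_id, lsn):
        # A new log record was produced for this transaction.
        self._pending.setdefault(txn_id, set()).add(lsn)

    def mark_persisted(self, txn_id, lsn):
        # The Log Store reported that this record is durable.
        self._pending.get(txn_id, set()).discard(lsn)

    def can_acknowledge(self, txn_id):
        # The client may be told "committed" only when the transaction is known
        # and none of its log records is still pending.
        return txn_id in self._pending and not self._pending[txn_id]
```

For example, a transaction with records 101 and 102 cannot be acknowledged after only record 101 is persisted; it can be acknowledged once record 102 is persisted as well.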

9.4 Page Store (page storage)

The Page Store server is another service in the storage layer. A GaussDB database is divided into fixed-size (10 GB) partitions called slices. Each Page Store server handles multiple slices from different databases and receives the logs that belong to the slices it is responsible for. A database can have multiple slices, and each slice is replicated to 3 Page Stores to ensure durability and availability.
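
The slice arithmetic can be made concrete with a short sketch. The 10 GB slice size and the 3 replicas per slice come from the text; the round-robin placement in `place_replicas` and all function names are assumptions made for illustration, since the source does not describe the actual placement policy.

```python
import math

SLICE_SIZE_GB = 10       # fixed slice size stated in the text
REPLICAS_PER_SLICE = 3   # each slice is replicated to 3 Page Stores


def slice_count(database_size_gb):
    """Number of slices needed to hold a database of the given size."""
    return math.ceil(database_size_gb / SLICE_SIZE_GB)


def slice_for_offset(offset_gb):
    """Index of the slice that contains the given logical offset."""
    return int(offset_gb // SLICE_SIZE_GB)


def place_replicas(slice_id, page_stores):
    """Pick 3 distinct Page Stores for a slice (hypothetical round-robin policy)."""
    if len(page_stores) < REPLICAS_PER_SLICE:
        raise ValueError("need at least 3 Page Stores to place a slice")
    start = slice_id % len(page_stores)
    return [page_stores[(start + i) % len(page_stores)]
            for i in range(REPLICAS_PER_SLICE)]
```

Under these assumptions a 25 GB database occupies 3 slices, and with four Page Stores the replicas of consecutive slices land on overlapping but different sets of servers, so each server ends up holding slices from several databases.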

10. Deployment mode

10.1 Single AZ deployment

  • 3 replicas: each replica is on a different node.
  • Log Store: all 3 replicas must be persisted before the transaction can commit; data can be read from any replica.
  • Page Store: a write succeeds once any one of the 3 replicas is persisted; data can then be synchronized between the replicas.

10.2 Multi-AZ deployment

  • 6 replicas: each AZ contains two replicas.
  • Log Store: 6 replicas; a write needs 4 successful replica writes, and a read needs 3 replicas to be valid.
  • Page Store: a write succeeds once any one of the 6 replicas is persisted; data can then be synchronized between the replicas.
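
The replica counts for both deployment modes can be written down as a small quorum check. The numbers are taken from the lists above; the `ReplicationPolicy` type and the function names are hypothetical, and the overlap test (`write acks + read acks > replicas`) is the standard quorum condition rather than something the source states explicitly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReplicationPolicy:
    replicas: int          # total replicas
    log_write_acks: int    # Log Store replicas that must persist a write
    log_read_acks: int     # Log Store replicas needed for a valid read
    page_write_acks: int   # Page Store replicas that must persist a write


# Values from the single-AZ and multi-AZ descriptions in the text.
SINGLE_AZ = ReplicationPolicy(replicas=3, log_write_acks=3, log_read_acks=1, page_write_acks=1)
MULTI_AZ = ReplicationPolicy(replicas=6, log_write_acks=4, log_read_acks=3, page_write_acks=1)


def log_write_committed(policy, acks):
    """True when enough Log Store replicas have persisted the write."""
    return acks >= policy.log_write_acks


def page_write_succeeded(policy, acks):
    """True when at least the required number of Page Store replicas persisted the write."""
    return acks >= policy.page_write_acks


def read_overlaps_latest_write(policy):
    """Quorum condition: every read set shares at least one replica with every write set."""
    return policy.log_write_acks + policy.log_read_acks > policy.replicas
```

Both configurations satisfy the overlap condition (3 + 1 > 3 and 4 + 3 > 6). In the multi-AZ case, losing one AZ removes 2 of the 6 replicas and still leaves the 4 needed for a Log Store write.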

11. Writing process

Insert image description here

12. Reading process

The database front end reads data in units of pages. To read or modify data, the front end must first bring the corresponding page into the buffer pool. When a new page is needed but the buffer pool is full, the system must evict a page to make room.

  1. GaussDB has modified the page eviction algorithm so that a dirty page is evicted only after all of its log records have been successfully written to at least one Page Store. GaussDB thus guarantees that, until the log records reach a Page Store, the corresponding page is still available in the buffer pool; once the page has been evicted, it can be read from the Page Store.
  2. For each slice, SAL records the LSN (log sequence number) of the last log record sent to that slice. When the master node reads a page, the read operation reaches SAL, and SAL issues a read request carrying that LSN. The request is sent to a Page Store node known to have low latency. If the selected node is unavailable, or has not received all log records up to the specified LSN, it returns a read error, and SAL tries the next Page Store node that stores the slice until it finds one that can satisfy the request.
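
The read path in item 2 can be sketched as follows. This is a simplified in-memory model, not GaussDB code: `PageStoreNode`, `StorageAbstractionLayer`, `ReadError`, and the `latency_ms` field used to order the replicas are all hypothetical names chosen for the illustration.

```python
from dataclasses import dataclass, field


class ReadError(Exception):
    """Raised when a Page Store node cannot serve a read at the required LSN."""


@dataclass
class PageStoreNode:
    name: str
    latency_ms: float
    available: bool = True
    applied_lsn: dict = field(default_factory=dict)  # slice_id -> highest LSN received
    pages: dict = field(default_factory=dict)        # (slice_id, page_id) -> page bytes

    def read_page(self, slice_id, page_id, required_lsn):
        if not self.available:
            raise ReadError(f"{self.name} is unavailable")
        if self.applied_lsn.get(slice_id, -1) < required_lsn:
            raise ReadError(f"{self.name} has not received LSN {required_lsn}")
        return self.pages[(slice_id, page_id)]


class StorageAbstractionLayer:
    def __init__(self, replicas_by_slice):
        self.replicas_by_slice = replicas_by_slice  # slice_id -> list of PageStoreNode
        self.last_sent_lsn = {}                     # slice_id -> LSN of last record sent

    def record_log_sent(self, slice_id, lsn):
        # Remember the newest LSN sent to this slice.
        self.last_sent_lsn[slice_id] = max(lsn, self.last_sent_lsn.get(slice_id, 0))

    def read_page(self, slice_id, page_id):
        required = self.last_sent_lsn.get(slice_id, 0)
        # Try the lowest-latency replica first, then fall back to the others.
        for node in sorted(self.replicas_by_slice[slice_id], key=lambda n: n.latency_ms):
            try:
                return node.read_page(slice_id, page_id, required)
            except ReadError:
                continue
        raise ReadError(f"no Page Store replica can serve slice {slice_id} at LSN {required}")
```

Carrying the LSN in the request is what lets SAL reject a replica that is reachable but stale: the node answers only if it has already received every log record up to that point.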

13. Log Store (log storage) failure recovery

  1. Temporary failure:
    The Log Store switches to read-only mode and receives no new requests, and the node is marked as temporarily failed. Once the node comes back, no separate recovery procedure is required; any data it missed can be retrieved from other replicas.
  2. Permanent failure:
    The faulty node is removed from the cluster, and the data lost on that node is rebuilt on other replicas.
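
The two failure cases can be modelled as a small state machine. This is a sketch under stated assumptions: `NodeState`, `LogStoreCluster`, and the `rebuild_queue` list are hypothetical names, and the actual detection and rebuild mechanisms are not described in the source.

```python
from enum import Enum


class NodeState(Enum):
    HEALTHY = "healthy"
    TEMPORARILY_FAILED = "temporarily_failed"  # read-only, receives no new requests


class LogStoreCluster:
    def __init__(self, node_names):
        self.states = {name: NodeState.HEALTHY for name in node_names}
        self.rebuild_queue = []  # nodes whose data must be rebuilt on other replicas

    def on_temporary_failure(self, name):
        # The node stays in the cluster but stops receiving new requests.
        self.states[name] = NodeState.TEMPORARILY_FAILED

    def on_recovered(self, name):
        # No rebuild is needed; the node simply rejoins.
        if self.states.get(name) is NodeState.TEMPORARILY_FAILED:
            self.states[name] = NodeState.HEALTHY

    def on_permanent_failure(self, name):
        # The node leaves the cluster and its data is scheduled for rebuilding.
        self.states.pop(name, None)
        self.rebuild_queue.append(name)

    def writable_nodes(self):
        return [name for name, state in self.states.items() if state is NodeState.HEALTHY]
```

The distinction matters for cost: a temporary failure only changes where new writes go, while a permanent failure triggers data reconstruction on the remaining replicas.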

Origin blog.csdn.net/qq_20143059/article/details/124385637