Ape Chuang Call for Papers | Analysis of 3 key technologies behind GaussDB's performance improvement, from one million tpmC on a single machine to tens of millions of tpmC in distributed mode

1 Panorama of HUAWEI CLOUD database services

1.1 HUAWEI CLOUD database service

HUAWEI CLOUD database services fall into two main types: self-developed database services and cloud-hosted database services.


1.2 Relational and non-relational databases

Among relational databases, Huawei's self-developed GaussDB currently supports several different ecosystems:

  1. Interfaces and syntax compatible with mainstream databases such as MySQL
  2. Data warehouse scenarios
  3. Both TP (transaction processing) and AP (analytical processing) workloads

Among non-relational databases there are two major platforms. One is Huawei's self-developed GaussDB, which supports the mainstream application scenarios on the market today (document, wide-column, cache, and time series) and is compatible with mainstream database interfaces.

The other is cloud hosting of databases such as MongoDB. HUAWEI CLOUD also provides many tools to help developers:

  1. Database migration
  2. Data migration
  3. Database operation and maintenance (for different roles)
  4. Data middleware services

2 Key Technologies of Huawei Self-developed Database

2.1 GaussDB (for openGauss) enterprise-level distributed database

Positioned as an enterprise-level distributed cloud database, its architecture combines the enterprise-grade capabilities of traditional databases with the high scalability and high availability of Internet-style distributed databases.


Architecture diagram of GaussDB (for openGauss)

GaussDB (for openGauss) targets the core application scenarios of enterprises and is benchmarked against first-class databases.

2.1.1 Problems encountered by GaussDB (for openGauss)

Memory access asymmetry problem


In a traditional server, the CPU is much faster than memory, and a north bridge is needed to connect the CPU and memory and keep latency down.

In this setup, the database does not need to pay attention to how the hardware is connected.

However, as CPUs became faster and memory grew larger, the north bridge mattered less and less and was gradually phased out.

Today, the CPUs in a host are connected to each other through a high-speed bus, and each CPU can also be connected directly to memory.

At this point the database needs to take the hardware architecture into account.

Cache consistency problem


CPU and memory run at different speeds. To bridge the gap, the CPU has a first-level cache, a second-level cache, a third-level cache, and so on. But when there are many CPUs and the database is not aware of the hardware architecture, every CPU may load the same variable into its cache at the same time. The CPUs must then keep reads and writes consistent, which causes contention and can reduce efficiency by an order of magnitude.
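
To make the asymmetry described above concrete, here is a toy cost model in Python. It is only a sketch: the node layout, the `LOCAL_COST` and `REMOTE_COST` figures, and all function names are illustrative assumptions, not measurements or GaussDB internals. It compares a data placement that ignores the hardware topology with one that keeps each partition next to the threads that use it.

```python
# Toy cost model. The cost figures are illustrative assumptions, not measurements.
LOCAL_COST = 1   # assumed cost of reading memory attached to the same NUMA node
REMOTE_COST = 3  # assumed cost of reading memory attached to another node


def access_cost(thread_node: int, data_node: int) -> int:
    """Cost of one access from a thread on thread_node to data on data_node."""
    return LOCAL_COST if thread_node == data_node else REMOTE_COST


def total_cost(placement: dict[int, int], accesses: list[tuple[int, int]]) -> int:
    """Sum the cost of (thread_node, partition_id) accesses.

    placement maps each partition id to the NUMA node that holds it.
    """
    return sum(access_cost(node, placement[part]) for node, part in accesses)


# Two NUMA nodes, four data partitions. Threads on node 0 work on partitions
# 0 and 1; threads on node 1 work on partitions 2 and 3.
accesses = [(0, 0), (0, 1), (1, 2), (1, 3)] * 100

unaware = {0: 1, 1: 0, 2: 0, 3: 1}  # placement that ignores the topology
aware = {0: 0, 1: 0, 2: 1, 3: 1}    # each partition lives next to its threads

print("topology-unaware cost:", total_cost(unaware, accesses))
print("topology-aware cost:  ", total_cost(aware, accesses))
```

Under these assumed costs, the topology-aware placement halves the total access cost, because every access stays on the local node.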

2.1.2 Solutions

GaussDB (for openGauss) black technology 1: NUMA-Aware extreme optimization

In principle, NUMA-Aware technology makes the database aware of the asymmetry of the CPU and other hardware architecture:


  1. Partition the data stored in the database to reduce conflicts between CPUs as much as possible, improve parallelism, and raise overall CPU utilization
  2. For global variables in the database, one specific CPU is responsible for read operations while the other CPUs perform write operations, which reduces data access conflicts

GaussDB (for openGauss) black technology 2: GTM-Lite distributed expansion technology


Distributed strong consistency: the distributed GTM-Lite solution manages global transaction commit numbers to achieve strong consistency, without a central node becoming a performance bottleneck


  • CSN timestamps are decoupled from transaction IDs (TXID);
  • GTM-Lite only manages global timestamps;
  • Local timestamps and global timestamps are managed in two dimensions;
  • The global timestamp is advanced only on transaction commit;
  • The global timestamp is accessed only for cross-shard access; otherwise the local timestamp is used;
  • GTM-Lite synchronizes asynchronously, which reduces GTM-Lite access delay;
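
The rules in the list above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: `GlobalTimestampService`, `Shard`, and `commit` are hypothetical names invented for this example, and snapshot visibility, TXID allocation, and asynchronous synchronization are not modeled.

```python
import threading


class GlobalTimestampService:
    """Hypothetical stand-in for GTM-Lite: it only hands out global commit numbers."""

    def __init__(self) -> None:
        self._csn = 0
        self._lock = threading.Lock()
        self.requests = 0  # how many times a node had to contact this service

    def advance(self) -> int:
        """Advance the global timestamp; called only when a transaction commits."""
        with self._lock:
            self.requests += 1
            self._csn += 1
            return self._csn


class Shard:
    """A data node that keeps its own local commit counter."""

    def __init__(self) -> None:
        self.local_csn = 0

    def commit_local(self) -> int:
        self.local_csn += 1
        return self.local_csn


def commit(shards: list[Shard], gts: GlobalTimestampService) -> tuple[str, int]:
    """Commit a transaction that touched the given shards; return (kind, csn)."""
    if len(shards) == 1:
        # Single-shard transaction: the local timestamp is enough.
        return ("local", shards[0].commit_local())
    # Cross-shard transaction: go to the global service once, at commit time.
    return ("global", gts.advance())


gts = GlobalTimestampService()
shard_a, shard_b = Shard(), Shard()
print(commit([shard_a], gts))           # single-shard, no global round trip
print(commit([shard_a, shard_b], gts))  # cross-shard, one global round trip
print("global requests:", gts.requests)
```

The point of the sketch is the request counter: only the cross-shard commit contacts the global service, which is why the central node sees far less traffic than it would if every transaction asked for a global timestamp.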

Obtaining the global time requires multiple network hops, which increases latency. The solution is to have the coordinator nodes synchronize it.

GaussDB (for openGauss) black technology 3: The distributed optimizer provides the ultimate distributed expansion capability


To make good use of system resources:

  1. Improve parallel execution capabilities

    Make full use of multi-core hardware and improve system throughput through multi-threaded concurrent execution

  2. The optimizer generates execution plans that are run in vectorized form to improve execution efficiency

  3. Use compiled execution to improve the utilization of CPU instructions

    Switch from interpreted execution to compiled execution to reduce the number of instructions per operator and improve operational efficiency
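
As an illustration of the vectorized execution idea in item 2, the sketch below runs the same filter-and-sum query twice: once one row at a time, and once over column batches. The function names and the batch size are assumptions made for this example. In pure Python the batch version only shows the shape of the technique; the actual speedup in a database engine comes from tight loops over columnar batches in compiled code.

```python
from typing import Iterator


def filter_sum_row_at_a_time(rows: list[tuple[int, int]], threshold: int) -> int:
    """Row-at-a-time execution: one (key, value) row is processed per step."""
    total = 0
    for key, value in rows:
        if key > threshold:
            total += value
    return total


def batches(keys: list[int], values: list[int], size: int) -> Iterator[tuple[list[int], list[int]]]:
    """Yield column slices of at most `size` entries each."""
    for start in range(0, len(keys), size):
        yield keys[start:start + size], values[start:start + size]


def filter_sum_vectorized(keys: list[int], values: list[int], threshold: int,
                          batch_size: int = 4) -> int:
    """Batch execution: each operator call handles a whole batch of column values."""
    total = 0
    for key_batch, value_batch in batches(keys, values, batch_size):
        total += sum(v for k, v in zip(key_batch, value_batch) if k > threshold)
    return total


keys = list(range(10))
values = [k * 2 for k in keys]
rows = list(zip(keys, values))

print(filter_sum_row_at_a_time(rows, threshold=4))
print(filter_sum_vectorized(keys, values, threshold=4))
```

Both functions return the same result; only the unit of work per operator call differs.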

  • AI-powered database self-optimization and self-diagnosis


Deep reinforcement learning is combined with global optimization algorithms for fine-grained tuning of different categories of parameters

  1. Tuning time: from days to minutes
  2. Index recommendation: recommend the optimal index based on the user's single statement or batch load


Database operation data is collected continuously, and intelligent monitoring is built on algorithms such as time series forecasting and anomaly detection

  1. Fault warning: predicts resource and performance trends and intelligently detects abnormal faults
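
The article does not say which anomaly detection algorithm is used. As a plain statistical stand-in, and not the product's method, the sketch below flags a metric sample when it deviates from the trailing window by more than a chosen number of standard deviations. The function name, window size, and threshold are all assumptions.

```python
from statistics import mean, pstdev


def detect_anomalies(samples: list[float], window: int = 5,
                     threshold: float = 3.0) -> list[int]:
    """Return the indexes of samples that deviate strongly from the trailing window."""
    flagged = []
    for i in range(window, len(samples)):
        history = samples[i - window:i]
        mu = mean(history)
        sigma = pstdev(history)
        if sigma == 0:
            # A flat history: any change at all counts as a deviation.
            if samples[i] != mu:
                flagged.append(i)
            continue
        if abs(samples[i] - mu) / sigma > threshold:
            flagged.append(i)
    return flagged


# Hypothetical CPU utilization samples (percent) with one spike.
cpu_usage = [50, 51, 49, 50, 52, 51, 50, 95, 51, 50]
print(detect_anomalies(cpu_usage))
```

With these sample values, only the spike at index 7 is flagged; the samples right after it are not, because the spike inflates the standard deviation of their trailing window.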

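2.1.3 GaussDB (for openGauss) realization effect


The optimization effect is significant: as the title indicates, performance goes from the million-tpmC level on a single machine to the ten-million-tpmC level in distributed deployment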

2.1.4 Examples of GaussDB (for openGauss) application scenarios

HUAWEI CLOUD GaussDB Supports Localized Distributed Transformation of ICBC's Core Business System


2.2 Analysis of Internet-oriented cloud-native database architecture

2.2.1 Challenges of Open Source MySQL

  1. The RTO time of active/standby switchover is long
  2. Rebuilding an instance takes too long
  3. The freshness of read-only data on the standby machine is low
  4. Computing and storage are not decoupled, resulting in low utilization
  5. Network Resource Utilization
  6. The number of standby machines is limited (affecting host performance, complex networking)


2.2.2 Solutions

Black technology 1: LOG IS DATABASE, cloud-native separation of storage and compute


Black technology 2: Near Data Processing (NDP) + parallelism for ultimate performance



Operator pushdown supports projection, predicate, and aggregation operators as well as MVCC visibility checks. Pages are evaluated and processed on the storage node, so the returned data is much smaller and network I/O is reduced. For count(*) queries, I/O throughput is reduced by a factor of 40 to 80.
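
The effect of pushing a predicate and an aggregate down to storage can be sketched as follows. `StorageNode`, `Page`, and the two `count_*` helpers are hypothetical names for this example, not a real GaussDB API; the second element of each result simply counts how many items cross the network in this simplified model.

```python
from dataclasses import dataclass
from typing import Callable

Row = dict
Predicate = Callable[[Row], bool]


@dataclass
class Page:
    rows: list


class StorageNode:
    """Hypothetical storage node holding pages."""

    def __init__(self, pages: list) -> None:
        self.pages = pages

    def read_pages(self) -> list:
        """No pushdown: every page is shipped to the compute node."""
        return self.pages

    def count_where(self, predicate: Predicate) -> int:
        """Pushdown: the predicate and the count run next to the data."""
        return sum(1 for page in self.pages for row in page.rows if predicate(row))


def count_without_pushdown(node: StorageNode, predicate: Predicate) -> tuple:
    pages = node.read_pages()
    rows_shipped = sum(len(page.rows) for page in pages)
    matches = sum(1 for page in pages for row in page.rows if predicate(row))
    return matches, rows_shipped


def count_with_pushdown(node: StorageNode, predicate: Predicate) -> tuple:
    # Only a single number travels back to the compute node.
    return node.count_where(predicate), 1


def is_paid(row: Row) -> bool:
    return row["paid"]


# Three pages of four rows each; rows with an even id are marked as paid.
node = StorageNode([
    Page([{"id": i, "paid": i % 2 == 0} for i in range(p * 4, p * 4 + 4)])
    for p in range(3)
])

print(count_without_pushdown(node, is_paid))
print(count_with_pushdown(node, is_paid))
```

Both paths return the same count, but the pushdown path ships one value instead of all twelve rows, which is the source of the reduced network I/O.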

Highly parallel (three layers of parallelism)

The first layer: multiple workers on the computing node run the query in parallel.
The second layer: instead of reading one page at a time serially, multiple pages are read in a batch. Pages are distributed across different slices, and multiple slices are read in parallel.
The third layer: on a single storage node, multiple NDP processing threads process pages in parallel and evaluate the pushed-down operator logic.

Black technology 3: Ultimate backup and recovery

A dedicated distributed storage system for the database delivers ultimate data backup and recovery performance


Overall Recovery Time Comparison


2.2.3 Advantages of GaussDB compared to open source databases


2.2.4 Application examples

Yongan Insurance successfully migrated from a mainstream commercial database to GaussDB(for MySQL)



2.3 Decryption of HUAWEI CLOUD GaussDB (for Influx) billion-level timeline technology

2.3.1 Time series data model and main application scenarios

The two areas where the time series data model is most widely used are IoT and monitoring


2.3.2 Public Cloud SRE Trends

Public cloud SRE trend 1: Explosive growth of time series data

Taking CloudMonitorCenter as an example, the time series database supports the collection and storage of raw monitoring data in two business scenarios:

  • System indicators: mainly monitor CPU, disk, network card, load, TCP, NTP, Ping, etc.
  • Custom indicators: including container monitoring indicators, database indicators, process indicators, middleware indicators, log indicators, business indicators, etc. The indicator names, data labels, and data types of different indicator types are different.

Public cloud SRE trend 2: The value of time series data is getting higher and higher


2.3.3 Challenges faced by the cloud operation and maintenance monitoring system

Architecture status

Data expansion + business complexity + rapid business change

Challenges in the rapid growth of HUAWEI CLOUD business:

  1. There are many business types and rapid changes, and it is difficult to quickly satisfy analysis demands
  2. Large data scale, fast growth, high data processing efficiency requirements
  3. Different businesses are highly correlated, and it is difficult to analyze the root cause of failures

Multi-mode NoSQL service GaussDB NoSQL

GaussDB NoSQL is a multi-mode NoSQL database service with an active-active, fully distributed architecture, built on Huawei's latest generation of DFV architecture that separates computing from storage. It is highly compatible with four mainstream NoSQL interfaces: MongoDB, Cassandra, Redis, and InfluxDB. Compared with the community versions, it offers minute-level computing expansion, second-level storage expansion, strong data consistency, ultra-low latency, and high-speed backup and recovery. It is suitable for IoT, weather, Internet, games, and other fields.


GaussDB(for Influx) : One-stop time series data storage, analysis and insight platform


2.3.4 GaussDB (for Influx) Architecture of Cloud Native Time Series Database

Distributed + separation of storage and computing + high availability


2.3.5 Decryption of GaussDB (for Influx) billion-level timeline technology

90% of the hot data in a time series workload is recent data, so the design must store massive amounts of data while still providing extreme performance


Two-Level Partitioning Strategy

  • Time partition: data is partitioned by time period, and the partition duration is configurable. You can define all-in-memory, all-hot-storage, or all-cold-storage databases according to your needs.
  • Timeline partition: Range/Hash partitioning based on the shard key.
  • The partition strategy (hash/range) can be switched in a new time partition as needed
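
A minimal sketch of the two-level routing described above, assuming hash partitioning at the second level. The partition duration, the shard count, the series key format, and the function names are all illustrative assumptions.

```python
import zlib

PARTITION_SECONDS = 24 * 3600  # assumed partition duration; the text says it is configurable
SHARDS_PER_PARTITION = 4       # assumed number of timeline shards per time partition


def time_partition(timestamp: int) -> int:
    """First level: which time partition a Unix timestamp (in seconds) falls into."""
    return timestamp // PARTITION_SECONDS


def hash_shard(series_key: str) -> int:
    """Second level: hash the shard key to pick a timeline shard.

    crc32 is used because it is stable across runs, unlike Python's built-in hash().
    """
    return zlib.crc32(series_key.encode("utf-8")) % SHARDS_PER_PARTITION


def route(series_key: str, timestamp: int) -> tuple[int, int]:
    """Return (time partition, timeline shard) for one data point."""
    return time_partition(timestamp), hash_shard(series_key)


# The same series lands in the same shard, but in a new time partition a day later.
print(route("cpu,host=server01", 0))
print(route("cpu,host=server01", 86_400))
```

Because the second level depends only on the series key, all points of one timeline stay together within a time partition, while old time partitions can be moved or dropped as a whole.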

Data tiering

  • Data is tiered according to time period
  • Hot data is stored in memory
  • Warm data on SSD and cold data on HDD
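
The tiering rule above amounts to a small age-based lookup. In this sketch the two age boundaries are invented for illustration; the article does not state the actual thresholds.

```python
HOT_WINDOW_SECONDS = 3600            # assumed: data younger than 1 hour is hot
WARM_WINDOW_SECONDS = 7 * 24 * 3600  # assumed: data younger than 7 days is warm


def storage_tier(point_timestamp: int, now: int) -> str:
    """Pick a storage medium for a data point based on its age in seconds."""
    age = now - point_timestamp
    if age < HOT_WINDOW_SECONDS:
        return "memory"
    if age < WARM_WINDOW_SECONDS:
        return "ssd"
    return "hdd"


now = 1_700_000_000
for age in (60, 24 * 3600, 30 * 24 * 3600):
    print(age, storage_tier(now - age, now))
```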

Dedicated storage engine


Write performance acceleration

  • Batch stream combined with pre-aggregated data
  • Asynchronous logging
  • Row-column mixed memory layout, reducing data conversion overhead

Time series file layout

  • Dedicated layout for time series data
  • Column storage, multi-level index
  • Smooth upgrade of multiple versions

Multi-stage compression algorithm

  • Time series similarity compression algorithm
  • Time delta compression
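
To show why delta-based encoding suits timestamps, the sketch below uses delta-of-delta encoding, a common variant of time delta compression. This is an assumption for illustration; the article does not specify the exact scheme GaussDB (for Influx) uses, and a real encoder would also pack the small integers into a compact bit stream.

```python
def delta_of_delta_encode(timestamps: list[int]) -> list[int]:
    """Keep the first timestamp, then store how each interval differs from the previous one."""
    if not timestamps:
        return []
    encoded = [timestamps[0]]
    previous_delta = 0
    for previous, current in zip(timestamps, timestamps[1:]):
        delta = current - previous
        encoded.append(delta - previous_delta)
        previous_delta = delta
    return encoded


def delta_of_delta_decode(encoded: list[int]) -> list[int]:
    """Invert delta_of_delta_encode."""
    if not encoded:
        return []
    timestamps = [encoded[0]]
    delta = 0
    for change in encoded[1:]:
        delta += change
        timestamps.append(timestamps[-1] + delta)
    return timestamps


# Samples collected every 10 seconds, with the last one arriving 1 second late.
samples = [1000, 1010, 1020, 1030, 1041]
encoded = delta_of_delta_encode(samples)
print(encoded)
print(delta_of_delta_decode(encoded) == samples)
```

For regularly sampled data almost every encoded value after the first two is zero, which is what makes the later numeric compression stages effective.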

Multimodal index

  • Dimension index: an inverted index locates the data source
  • Timeline index: a key-value index over massive numbers of timelines
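
A dimension index of this kind can be pictured as a mapping from each tag key/value pair to the set of timeline IDs that carry it. The `TagIndex` class below is a hypothetical in-memory sketch, not the actual on-disk structure.

```python
from collections import defaultdict


class TagIndex:
    """Hypothetical in-memory inverted index from (tag key, tag value) to series ids."""

    def __init__(self) -> None:
        self._postings: dict[tuple[str, str], set[int]] = defaultdict(set)

    def add(self, series_id: int, tags: dict[str, str]) -> None:
        """Register a timeline under every one of its tags."""
        for key, value in tags.items():
            self._postings[(key, value)].add(series_id)

    def lookup(self, **conditions: str) -> set[int]:
        """Return the series ids that match all of the given tag conditions."""
        sets = [self._postings.get(item, set()) for item in conditions.items()]
        return set.intersection(*sets) if sets else set()


index = TagIndex()
index.add(1, {"host": "a", "region": "cn"})
index.add(2, {"host": "b", "region": "cn"})
index.add(3, {"host": "a", "region": "eu"})

print(index.lookup(region="cn"))
print(index.lookup(host="a", region="cn"))
```

A query that combines several dimensions becomes a set intersection over posting lists, so only the matching timelines need to be read from storage.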

Query acceleration

  • Add item-level BRIN index, reduce aggregation calculation, and speed up Scan
  • Multi-level cache to speed up aggregation queries
  • Optimizer cost evaluation, large query control, select secondary index or scan
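
The general idea behind a BRIN-style (block range) index is to keep a small summary per block and skip blocks whose value range cannot match. The sketch below shows that idea with hypothetical names; the exact summary contents used by GaussDB (for Influx) are not described in the article.

```python
from dataclasses import dataclass


@dataclass
class BlockSummary:
    min_value: float
    max_value: float
    count: int


def build_summaries(blocks: list[list[float]]) -> list[BlockSummary]:
    """Summarize each non-empty block by its minimum, maximum, and row count."""
    return [BlockSummary(min(block), max(block), len(block)) for block in blocks]


def blocks_to_scan(summaries: list[BlockSummary], low: float, high: float) -> list[int]:
    """Indexes of blocks whose [min, max] range overlaps the query range [low, high]."""
    return [
        i for i, summary in enumerate(summaries)
        if summary.max_value >= low and summary.min_value <= high
    ]


blocks = [[1, 2, 3], [10, 12, 11], [20, 25, 22]]
summaries = build_summaries(blocks)

print(blocks_to_scan(summaries, 11, 21))  # the first block is skipped
print(blocks_to_scan(summaries, 4, 9))    # nothing overlaps, so nothing is scanned
```

Aggregates such as count, min, and max over fully covered blocks can also be answered from the summaries alone, which is one way such an index reduces aggregation work.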

Adaptive Compression Algorithm

Different numerical transformation algorithms are applied depending on the data types and change trends of the timestamp and field data. An adaptive numerical compression algorithm is then chosen according to the distribution of the transformed data. Finally, a high-performance dictionary encoding method is added to achieve efficient adaptive compression of time series data. In addition, similarity compression is applied to the time differences in the TSM file to further reduce the storage cost of time series data.


High-performance multi-dimensional aggregation query

Aggregation query performance on large data sets is 2 to 5 times that of the open source version, and queries that combine conditions on multiple dimensions are supported.

Technical solution

  • MPP architecture: a query statement is executed concurrently on multiple nodes and multiple cores;
  • Vectorized query engine: each iteration returns data in batches, so query performance is better with large data volumes;
  • Incremental aggregation engine: for aggregation queries over sliding windows, most results are served directly from the aggregation result cache, and only the incremental data needs to be aggregated;
  • Multi-dimensional inverted index: supports queries that combine conditions on multiple dimensions and avoids scanning large amounts of data;
  • Storage summary index: filters out irrelevant data faster;
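
The incremental aggregation idea can be sketched with a per-bucket result cache: when a sliding window moves forward, only the buckets that were not seen before are aggregated. `IncrementalWindowSum` is a hypothetical class for this example, and it assumes that window bounds are bucket-aligned and that cached buckets receive no late data.

```python
class IncrementalWindowSum:
    """Caches per-bucket sums so a sliding-window query only aggregates new buckets."""

    def __init__(self, bucket_seconds: int) -> None:
        self.bucket_seconds = bucket_seconds
        self._cache: dict[int, float] = {}
        self.buckets_computed = 0  # counts how many buckets were actually aggregated

    def _bucket_sum(self, bucket: int, points: list[tuple[int, float]]) -> float:
        if bucket not in self._cache:
            start = bucket * self.bucket_seconds
            end = start + self.bucket_seconds
            self._cache[bucket] = sum(v for t, v in points if start <= t < end)
            self.buckets_computed += 1
        return self._cache[bucket]

    def query(self, points: list[tuple[int, float]], window_start: int,
              window_end: int) -> float:
        """Sum of values with window_start <= t < window_end (bucket-aligned bounds)."""
        first = window_start // self.bucket_seconds
        last = window_end // self.bucket_seconds
        return sum(self._bucket_sum(b, points) for b in range(first, last))


points = [(t, 1.0) for t in range(60)]  # one point per second, each with value 1.0
agg = IncrementalWindowSum(bucket_seconds=10)

print(agg.query(points, 0, 30), agg.buckets_computed)   # three buckets aggregated
print(agg.query(points, 10, 40), agg.buckets_computed)  # only one new bucket aggregated
```

The second query overlaps the first in two of its three buckets, so only the newest bucket is computed and the rest comes from the cache.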


Storage Analysis Report


3 Summary

There are many open source databases currently on the market, and they are widely used. Many open source databases are weak in terms of usability and supporting capabilities, and require continuous maintenance. Moreover, once data loss occurs, it is difficult to recover quickly, causing immeasurable losses. Therefore, open source databases on the cloud can only solve the demands of small and medium-sized enterprises such as simplified deployment, operation and maintenance, tuning, and extreme cost performance.

HUAWEI CLOUD GaussDB (for Influx) builds on the multi-mode NoSQL service GaussDB NoSQL and a distributed, storage-compute-separated, high-availability architecture, which gives it significant advantages over traditional database products in IoT and monitoring scenarios.

On the basis of supporting traditional businesses, GaussDB continues to build competitive features, providing infinite possibilities for enterprises to face the challenges of the 5G era.

Origin blog.csdn.net/qq_43475285/article/details/127138827