Peerless Chinese Database Rankings - February 2022

Fishing alone in the cold river and snow

Across a thousand mountains no bird flies; on ten thousand paths no footprint remains. In a lone boat, an old man in a straw cloak and bamboo hat fishes alone in the cold river snow.

Boss Mo (Mo Tianlun) is wearing a coir raincoat and a bamboo hat, and is rowing alone on the Tianchi Lake in Changbai Mountain.

The boat walks in the painting with a leaf, and I forget the sky of Shu in a happy mood. Crouching Dragon, Hidden Tiger and Three Treasures Land, climbing the peak and facing the water, a pool of sky.

Suddenly, a white eagle flew in from the northern sky with a red scroll in its claws. The eagle loosened its grip, and the scroll flew toward Boss Mo. Holding the fishing rod in his right hand and gazing calmly at the lake, Boss Mo stretched out two fingers of his left hand and accurately caught the scroll as it flew in from the upper left. This was the supreme skill of splitting the mind between two tasks, the art of the left and right hands fighting each other. When he opened the scroll, only two large characters could be seen, meaning "The guests have arrived!".

It turns out that today is the database martial arts conference of February 2022. The Wudang School (relational database), the Emei School (document database), the Kongtong School (key-value database), the Kunlun School (wide-column storage database), the Diancang School (graph database), the Huashan School (distributed database), the Qingcheng School (cloud native), and the Songshan Shaolin School (time-series database) have gathered on Changbai Mountain, and Boss Mo (Mo Tianlun) is the host of this martial arts conference.

Boss Mo got up and tapped the edge of the boat with his toes. His body was already flying forward, and he landed gently on the bank using the Wudang Cloud Ladder leap. He put on his skis and slid toward the southeast at high speed. He took off, spun in the air, and pulled off a difficult 1620. As he came down, he used a weight-sinking technique, and his feet landed firmly on the ground. The database martial arts conference of February 2022 officially begins!

The heroes of the world are born in my generation

Boss Mo (Mo Tianlun) turned his back to the crowd and drew a sharp breath. Suddenly a lion's roar burst from his mouth: the Lion's Roar, the famous stunt of the Golden Lion King Xie Xun. A large sheet of snow fell away, revealing a large block of text. It turned out that when Boss Mo stepped onto his skis and flew down from the jumping platform, he had swung several palm strikes toward the snowy slope. Like a thousand palms at once, the Great Mercy and Great Compassion Thousand-Leaf Palm of the Shaolin School had carved the text into the snow. When everyone looked up, the text turned out to be the ranking list of the February 2022 database martial arts conference.

The top five are:

First place: Linghu Chong of the Huashan School (TiDB database)

Second place: Daoist Chongxu of the Wudang School (openGauss database)

Third place: Feng Buping of the Huashan School's Sword Sect (Jianzong) (OceanBase database)

Fourth place: Zhang Sanfeng of the Wudang School (Dameng database)

Fifth place: Yu Canghai of the Qingcheng School (GaussDB database)

 Motianlun China Database Popularity Ranking

It can be seen that Linghu Chong of the Huashan School (TiDB database) has dominated the list ever since winning the position of martial arts leader in early 2020.

Martial arts leader Linghu Chong (TiDB)

Although Linghu Chong of the Huashan School (TiDB) started relatively late, having been established only in 2016, his talent is high and his progress is unmatched. TiDB is an open source distributed relational database independently designed and developed by PingCAP. It is an integrated distributed database product that supports both online transaction processing and online analytical processing (Hybrid Transactional and Analytical Processing, HTAP). Its goal is to provide users with a one-stop solution for OLTP (Online Transactional Processing), OLAP (Online Analytical Processing), and HTAP.

data distribution

Linghu Chong (TiDB) has brought his famous stunt, the Dugu Nine Swords, to a high level of mastery. Under the MPP architecture in particular, his data distribution technique is just as remarkable.

For parallel databases based on the Shared Nothing architecture, data distribution is unavoidable, and how the data of the whole system is spread across the processing units also determines overall system performance. If the distribution is uneven and data skew is severe, a short-board (weakest-link) effect appears: the most heavily loaded unit becomes the bottleneck for the whole system. How to distribute data across the system is therefore a key consideration in the overall design of such a database.

The martial arts seniors adopted HASH distribution and random distribution one after another, and both run into performance problems in certain scenarios, such as:

  1. Data skew, and the resulting performance problems, when an unsuitable HASH distribution column is chosen at table creation.
  2. When multiple tables are joined and the join column is not the HASH distribution column, data must be dynamically redistributed or a replicated table must be pulled, which is inefficient.
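The first problem can be made concrete with a small sketch. Everything here is hypothetical and for illustration only: the order table, the four-node cluster, and the use of CRC32 as the hash function are assumptions, not how any particular database hashes rows.

```python
from collections import Counter
import zlib

def node_for(value, num_nodes):
    """Map a distribution-column value to a node by hashing (illustrative only)."""
    return zlib.crc32(str(value).encode("utf-8")) % num_nodes

def rows_per_node(rows, column, num_nodes):
    """Count how many rows land on each node when `column` is the distribution column."""
    counts = Counter()
    for row in rows:
        counts[node_for(row[column], num_nodes)] += 1
    return [counts[node] for node in range(num_nodes)]

# Hypothetical order table: order_id is unique, status has only two distinct values.
rows = [{"order_id": i, "status": "PAID" if i % 10 else "REFUNDED"}
        for i in range(10_000)]

by_id = rows_per_node(rows, "order_id", num_nodes=4)
by_status = rows_per_node(rows, "status", num_nodes=4)

print("distributed by order_id:", by_id)
print("distributed by status:  ", by_status)
```

With the unique column every node receives a share of the rows. With the two-valued column at most two nodes receive anything, and the node holding the common value carries most of the table, which is exactly the short-board effect described above.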

Linghu Chong (TiDB) of the Huashan School instead takes a different path. He uses the Sword-Breaking Stance of the Dugu Nine Swords: TiKV, TiDB's storage layer, stores its data in RocksDB, and RocksDB is responsible for actually persisting the data to disk. The reason for this choice is that developing a standalone storage engine takes a great deal of work, especially a high-performance one, which requires all kinds of meticulous optimization. RocksDB is an excellent standalone KV storage engine open sourced by Facebook, and it meets TiKV's requirements for a standalone engine. For a KV system, there are two typical schemes for distributing data across multiple machines:

  • Hash: the Key is hashed, and the storage node is selected according to the Hash value.
  • Range: the Key space is divided into ranges, and each contiguous range of Keys is stored on one storage node.

TiKV chooses the second method. It divides the entire Key-Value space into many segments, each a series of contiguous Keys. Each segment is called a Region, and TiKV tries to keep the data stored in each Region within a certain size; the default is currently 96MB. Each Region can be described by a left-closed, right-open interval [StartKey, EndKey).
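A minimal sketch of Range-based routing, assuming a sorted list of Region start keys held in memory. The RegionMap class, its method names, and the sample split keys are hypothetical and are not TiKV's actual data structures; the sketch only shows how a left-closed, right-open interval lookup behaves.

```python
import bisect

class RegionMap:
    """Toy routing table: each Region covers a left-closed, right-open key range."""

    def __init__(self, split_keys):
        # split_keys are the boundaries between adjacent Regions.
        # b"" stands for "minus infinity" as the first StartKey (an assumption of this sketch).
        self.start_keys = [b""] + sorted(split_keys)

    def region_for(self, key):
        """Return the index of the Region whose [StartKey, EndKey) contains key."""
        return bisect.bisect_right(self.start_keys, key) - 1

    def bounds(self, region_id):
        """Return (StartKey, EndKey); EndKey is None for the last, unbounded Region."""
        start = self.start_keys[region_id]
        has_next = region_id + 1 < len(self.start_keys)
        end = self.start_keys[region_id + 1] if has_next else None
        return start, end

    def split(self, split_key):
        """Split the Region containing split_key into two Regions at split_key."""
        if split_key not in self.start_keys:
            bisect.insort(self.start_keys, split_key)

# Four Regions: ["", g), [g, n), [n, t), [t, +inf)
regions = RegionMap([b"g", b"n", b"t"])
for key in (b"apple", b"g", b"m", b"zebra"):
    print(key, "-> Region", regions.region_for(key), regions.bounds(regions.region_for(key)))
```

Because neighbouring Keys stay in the same Region, a scan over a key range touches only a few Regions, and a Region that grows too large can be split at a key without rehashing anything else.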

This data distribution method seems to solve the problems caused by HASH-based distribution. There is no need to specify a HASH distribution column when creating a table, and no need to worry about data distribution. There is no data skew caused by a badly chosen HASH distribution column, and no performance problem when a multi-table join is not on the HASH distribution column. Whether it causes other problems, however, needs further investigation.

On January 29, 2022, the head office of one of the four major state-owned banks announced the winning bid for its HTAP database centralized procurement project, and the TiDB enterprise-level distributed database from PingCAP (Pingkai Xingchen) won the bid. The TiDB HTAP database has become an important engine for banking business innovation.

Daoist Chongxu (openGauss)

Daoist Chongxu of the Wudang School (openGauss database), ranked second, is on the rise and closing in on the throne of the martial arts leader. His force value is 552.15 points, only 37 points behind first place.

Daoist Chongxu (openGauss database) became famous for his unique Tai Chi swordsmanship, which is subtle and mysterious. It emphasizes that the spirit leads the sword, in an unbroken flow. It answers every change with constancy and meets the enemy's edge with its own bluntness, like a great net that is cast out and then gradually tightened toward the center. This swordsmanship consists entirely of circles, large and small, in front and behind. In terms of moves, it can be said to have only one, yet that single move is endlessly adaptable.

The openGauss database is based on the PG-XC project and deeply integrates Huawei's years of experience with enterprise-level application scenarios in the database field. Of the 950,000 lines of kernel code in total, Huawei has modified or added 700,000 lines of core code, retaining the PostgreSQL interfaces and common function code while focusing its optimization and changes on the architecture, transactions, storage engine, optimizer, and Kunpeng chip support.

Jianzong Feng Buping (OceanBase)

Feng Buping, master of the Sword Sect (Jianzong), has a force score of 514.71 this month, an increase of 29.7 points over last month, and his ranking rose from fourth to third.

Feng Buping's swordsmanship is not only exquisite in its moves but also fierce in its force. He does not win by sword moves alone: his opponent is like a small boat amid hundreds of meters of raging torrent, with the wind howling and waves like mountains. One white-capped wave after another crashes toward the boat, which rises and falls with the waves until it is swallowed by them.

OceanBase is an enterprise-level distributed relational database independently developed by Ant Group. Built on a distributed architecture and general-purpose servers, it achieves financial-grade reliability and data consistency. It owns 100% of its intellectual property rights and was founded in 2010. OceanBase features strong data consistency, high availability, high performance, online scaling, high compatibility with SQL standards and mainstream relational databases, and low cost. It has stably supported Double 11 for 8 consecutive years and introduced a new city-level disaster recovery standard of "three places and five centers". It is the only domestically developed distributed database in the world to have set new world records in both the TPC-C and TPC-H tests.

For HTAP mixed workloads, OceanBase uses a single high-performance parallel execution engine, combined with its own data storage format, and optimizes transaction and analysis scenarios separately and in depth. It isolates the computing resources used by different workloads so that analysis and transaction scenarios do not interfere with each other.

The increase in OceanBase score this month is inseparable from the release of new versions. On January 6, at the DC2021 Distributed Database Developers Conference, OceanBase CTO Yang Chuanhui announced the official release of OceanBase Community Edition 3.1.2, and launched the community edition tool system, achieving a double leap in usability and ease of use.

The OceanBase database adopts the Shared-Nothing architecture, and all nodes are fully equal peers. Each node has its own SQL engine and storage engine and runs in a cluster of ordinary PC servers. Its core features include scalability, high availability, high performance, low cost, and cloud native support.

The overall architecture of the OceanBase database is shown in the figure below

data distribution

From the perspective of data distribution, OceanBase implements the partitioned table of a database in a distributed system, and its syntax and usage are compatible with partitioned tables in traditional databases. When the capacity or service capability of a table is insufficient, you only need to add more table partitions through OceanBase management commands.

Distributed Partition Table

Traditional databases support partitioned tables. Common partitioning methods include Hash partitioning, Range partitioning, and List partitioning, and secondary (composite) partitioning is also supported. The OceanBase database keeps the same partitioned-table usage, but the partitions can be evenly distributed across any nodes of the database. The OceanBase database is built on this distributed partitioned-table data model. On the one hand, data storage and processing capacity can be scaled horizontally, reaping the benefits of distributed technology. On the other hand, the distributed database is used in the same way as a traditional one, with complete global indexes and global constraints.

The OceanBase database has requirements for the selection of partition keys. If the table is set with Primary Keys, the partition key must be a column in Primary Keys. If the table is not set with Primary Keys, then the partition key is not required.
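The rule can be written down as a tiny check. This is only a sketch: the function name is hypothetical, and reading "a column in Primary Keys" as "every partition-key column belongs to the primary key" is an assumption made here, not a statement of OceanBase's exact validation logic.

```python
def partition_key_allowed(primary_key_columns, partition_key_columns):
    """Check the rule above: with a primary key, partition columns must come from it."""
    if not primary_key_columns:
        # No primary key defined: the rule places no requirement on the partition key.
        return True
    # Assumption of this sketch: every partition-key column must be a primary-key column.
    return set(partition_key_columns) <= set(primary_key_columns)

print(partition_key_allowed(["user_id", "order_id"], ["user_id"]))     # allowed
print(partition_key_allowed(["user_id", "order_id"], ["created_at"]))  # rejected
print(partition_key_allowed([], ["created_at"]))                       # no primary key
```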

Hash partition

Hash partitioning requires specifying the partition key and the number of partitions. The Hash partition expression produces an int result, and that result modulo the number of partitions determines which partition a row belongs to. It is typically used for point queries on a given partition key, for example partitioning by user ID. Hash partitioning usually eliminates query hot spots.

The following example creates table t1, selects column c1 as the partition key for Hash partitioning, and the number of partitions is 5.

create table t1 (c1 int, c2 int) partition by hash(c1) partitions 5;
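The modulus step for the t1 example can be sketched as follows. It assumes that the partition expression on an int column simply yields the column value, and that the partition index is the absolute value modulo the partition count; a real system may apply a different hash function, so treat the helper names and the mapping as illustrative only.

```python
def hash_partition(value, partitions):
    """Return the partition index for an integer partition-key value.

    Assumption: the partition expression already yields an int, and the
    index is abs(value) % partitions. Real systems may hash differently.
    """
    return abs(value) % partitions

PARTITIONS = 5  # matches "partitions 5" in the t1 example

# Hypothetical rows of t1 as (c1, c2) pairs.
rows = [(c1, c1 * 10) for c1 in range(12)]

layout = {p: [] for p in range(PARTITIONS)}
for c1, c2 in rows:
    layout[hash_partition(c1, PARTITIONS)].append((c1, c2))

def partitions_to_scan(c1):
    """A point query on the partition key only needs to visit one partition."""
    return [hash_partition(c1, PARTITIONS)]

for p, part_rows in layout.items():
    print("partition", p, part_rows)
print("where c1 = 11 scans partitions", partitions_to_scan(11))
```

Consecutive c1 values land in different partitions, which is why a stream of point queries on user IDs tends to spread across nodes instead of concentrating on one.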

For tables that are joined with each other, it is recommended to use the join key as the partition key and the same partitioning method for each table, and to use a Table Group to place matching partitions on the same node, which reduces cross-node data interaction.
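Why this reduces cross-node traffic can be seen in a small sketch. The two tables, the partition count, and the modulo placement rule are all hypothetical; the point is only that when both tables are partitioned the same way on the join key, partition i of one table never needs rows from any other partition of the other table.

```python
from collections import defaultdict

def partition_rows(rows, key_index, partitions):
    """Group rows by (key % partitions); the modulo rule is an assumption of this sketch."""
    parts = defaultdict(list)
    for row in rows:
        parts[row[key_index] % partitions].append(row)
    return parts

def local_join(left_parts, right_parts, partitions):
    """Join partition i of the left table only with partition i of the right table."""
    result = []
    for p in range(partitions):
        right_by_key = defaultdict(list)
        for right in right_parts.get(p, []):
            right_by_key[right[0]].append(right)
        for left in left_parts.get(p, []):
            for right in right_by_key.get(left[0], []):
                result.append((left[0], left[1], right[1]))
    return sorted(result)

users = [(1, "ann"), (2, "bob"), (7, "cat")]                  # (user_id, name)
orders = [(1, "book"), (7, "pen"), (7, "ink"), (9, "cup")]    # (user_id, item)

N = 4  # same partition count for both tables
joined = local_join(partition_rows(users, 0, N), partition_rows(orders, 0, N), N)
print(joined)
```

The partition-wise result matches a full join of the two tables, yet each partition pair could be processed on its own node without shipping rows elsewhere.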

Zhang Sanfeng of the Wudang School (DM)

Zhang Sanfeng of the Wudang school has a force score of 483.91 this month, a decrease of 35.4 points from the previous month, and his ranking dropped from third to fourth.

On the morning of February 9, 2022, the kick-off meeting of the "Spring Dawn Action" serving high-tech enterprises in Hubei Province in 2022 was held in Wuhan. At the meeting, the Hubei Provincial Department of Science and Technology and the Hubei Science and Technology Information Research Institute released the 2021 Top 100 High-tech Enterprises in Hubei Province list. With its strong strength, Dameng Data was listed among the top 100 high-tech companies in Hubei Province!

According to the "China Relational Database Software Market Tracking Report for the First Half of 2021" released by the market research firm IDC, the market share of Wuhan Dameng Database Co., Ltd. in the Chinese relational database market increased by 229% year-on-year in the first half of 2021, ranking No. 1 among mainstream domestic and foreign vendors.

Dameng has a variety of database products:

Among them is the DM database management system software, referred to as the "DM database". Like Oracle, the DM database is positioned as a large-scale general-purpose relational database.

Dameng has always adhered to original innovation and independent research and development. It has mastered core cutting-edge technologies in data management and data analysis, owns all of its source code, and holds fully independent intellectual property rights. The DM database draws on commercial database software such as Oracle and open source database products such as PostgreSQL in its database concepts and feature presentation, but it is unrelated to these products at the code level.

Dameng analytical large-scale data processing cluster DMMPP

Dameng analytical large-scale data processing cluster software (DMMPP) is a fully peer-to-peer, shared-nothing parallel cluster component developed on top of the Dameng database management system. It organizes multiple DM8 nodes into a parallel computing network that provides a unified database service. It supports up to 1024 nodes and TB to PB-level data storage and analysis, and provides high availability and dynamic scaling. It is a cost-effective general solution for ultra-large data applications.

data distribution

DMMPP supports multiple data distribution methods, including HASH distribution, range distribution, and random distribution. It supports horizontal partitioning, vertical partitioning, and multi-level hybrid partitioning of tables, and allows data distribution and data partitioning to be combined, which gives a high degree of flexibility.

Yu Canghai of the Qingcheng School (GaussDB)

Qingcheng faction Yu Canghai (GaussDB) has a score of 427.86 for force this month, an increase of 39.55 points compared to last month. The ranking remains unchanged and is still fifth.

GaussDB is the brand name of Huawei's database products, a tribute to the mathematician Gauss.

It has a variety of database products:

GaussDB T(OLTP)

The GaussDB T (OLTP) database is a distributed database independently developed by Huawei. Its predecessor is GaussDB 100, which is a comprehensive rework of the in-memory database that Huawei developed in house in 2007 and commercialized in the telecom billing field. It supports the x86 and Huawei Kunpeng hardware architectures. Built on an innovative database kernel, it provides real-time processing of high-concurrency transactions, financial-grade high availability with a two-site, three-center deployment, and distributed high scalability, and it is used to support core critical systems in industries such as finance, government, and telecommunications. It currently supports mainstream deployment modes such as standalone, active/standby, and distributed. It is easy to use and offers 98% Oracle syntax compatibility.

GaussDB A(OLAP)

Its predecessor is GaussDB 200, a distributed database with analysis and mixed load capabilities. Since 2011, it has been independently developed on the basis of PostgreSQL 9.2.4. It supports x86 and Huawei Kunpeng hardware architectures, supports row storage and column storage, and provides PB (Petabyte) level data analysis capability, multi-mode analysis capability and real-time processing capability.

GaussDB A adopts the MPP (Massively Parallel Processing) architecture, which has huge advantages over traditional databases in terms of core technology. It can solve the data processing performance problems of users in many industries, and can provide a cost-effective general computing platform for ultra-large-scale data management. It can be used to support various data warehouse systems, BI (Business Intelligence) systems and decision support systems, and provide unified services for decision analysis of upper-level applications.

Once you enter the jianghu, the years press on:

DB-Engines Ranking - popularity ranking of database management systems

On the overseas DB-Engines list, the top two are still from the Sun Moon God Sect (Oracle Corporation).

Ren Woxing (Oracle), leader of the Sun Moon God Sect, ranks first. His unique skill, the Star-Absorbing Great Method (OLTP), has few rivals in the world. With profound internal skill and powerful martial arts, his position as leader remains unshakable even though his domestic market is gradually shrinking.

Dongfang Bubai (MySQL), ranked second, has improved rapidly. This month's force value is as high as 1214.68 points, only 42 points lower than Ren Woxing (Oracle), which is the closest the gap has ever been. MySQL is the most popular open source database. 90% of the world's top 20 Internet sites use MySQL as a database, and 80% of big data platforms are used in combination with MySQL.

Huashan faction Linghu Chong (TiDB) ranks 94th in the Overseas Gods List (DB-Engines).

School Book:

I still remember that when I first entered the database industry, mastering Oracle alone was enough to roam the world. Later I came into contact with relational databases such as MySQL, DB2, and SQL Server, and then, at work, with NoSQL databases such as Redis and MongoDB. Having learned these database technologies, one was reasonably competitive in this industry. However, with the rapid development of domestic databases, there are currently 195 of them. How to learn domestic databases has become a question that DBAs often think about.

When learning a domestic database, the first thing that comes to mind is the official technical documentation. However, there are many domestic database vendors, their technical capabilities are uneven, and the technical documents they publish differ greatly. For some domestic databases, only pre-sales slide decks and similar materials can be found through channels such as Baidu and the Motianlun community, which is very unfriendly to DBAs who want to learn. Whether because they are closed source or because of their sales model, some vendors have incomplete technical documentation or have not published it at all.

The following are the technical documents of the top five database vendors in terms of popularity. The completeness, technical details, and technical depth of each technical document are different.

TiDB Martial Arts Manual (Technical Documentation)

Introduction to TiDB | PingCAP Docs

openGauss Martial Arts Manual (Technical Documentation)

Quickstart | openGauss

OceanBase Martial Arts Manual (Technical Documentation)

OceanBase enterprise-level distributed relational database

Dameng School Martial Arts Manual (Technical Documentation)

DM Database Quick Start Guide | Dameng Technical Documentation

GaussDB Martial Arts Manual (Technical Documentation)

Huawei GaussDB A Configuration Manual, Product Documentation, PDF - Huawei

The technical documents cannot be viewed online; only downloading is supported.

Ordinary users do not have download permission.

Only customers or partners are granted download permission.

Advice from IT workers

The following represent only personal thoughts.

At present, there are many ways to rank databases, such as ranking by popularity, ranking by sales, etc. It is recommended to consider introducing rankings such as the completeness of technical documents and technical depth of domestic database manufacturers in the future to improve the efficiency of technology promotion.

Comparing a single feature across domestic databases often means consulting a large amount of material, or even building a test environment by hand. It is recommended to consider ranking some important capabilities of domestic databases, such as data distribution technology in domestic distributed databases, locking mechanisms, data compression technology, horizontal scale-out and scale-in technology, disaster recovery technology, maintenance difficulty, and so on.

We hope that comparing databases along more dimensions will give us a deeper and more objective understanding of domestic databases, instead of being misled by promotional keywords such as "the world's first" and "the industry's first", or by malicious belittling of competitors.

  • Remarks: Various metaphors are used in this article. Please do not take the martial arts schools, martial arts figures, and so on as literal counterparts.

Original link: https://www.modb.pro/db/331031

Disclaimer: This article is the original content of Chen Juchao, a Motianlun author, and represents the author's point of view. If you have comments or suggestions on the above content, please share them in the comment area below, or leave a message on the author's Motianlun homepage.
