Who Will Dominate the Arena: China's Database Rankings for August and Thoughts on Domestic Database Selection

Introduction

The world's turmoil comes from my generation, and the years rush me as soon as I enter the world.

Some people say that where there is a database, there is a "jianghu".

Many predecessors devoted themselves passionately to the domestic database business for decades, yet those decades passed in a flash, and in the end some were happy and some were sad.

As of August 2023, Motianlun includes 286 domestic databases in its ranking. The development of domestic databases appears to be in full swing, but in fact dangerous undercurrents run beneath the surface. One wrong step and you may lose everything.

Regarding the "grievances" and "love and hatred" in the database arena, let's take a look at the Motianlun Chinese database popularity rankings in August 2023.

Domestic database ranking

The following data comes from Motianlun: https://www.modb.pro/dbRank

A total of 286 database products participated in the ranking in August 2023. The top ten are:

1: Ant Group’s OceanBase database

2: PingCAP's TiDB database

3: Huawei’s openGauss database

4: Alibaba’s PolarDB database

5: Huawei's GaussDB database

6: Dameng database of Wuhan Dameng Company

7: Renda Jincang database of Renda Jincang Company

8: GBase database of Nanda General Company

9: Tencent Cloud's TDSQL database

10: Alibaba Cloud's AnalyticDB database

Database vendor:

Looking at the vendors, the tiers among domestic database manufacturers are already very clear. The top ten include three of the four long-established domestic database vendors: Dameng, Renda Jincang, and Nanda General. All were founded around 2000 and were among the first to develop domestic databases; they rank 6th, 7th, and 8th respectively. The Alibaba family has three entries, ranking 1st, 4th, and 10th. Huawei has two, ranking 3rd and 5th. Tencent has one, ranking 9th.

PingCAP was founded in 2015. Compared with the vendors above, it was established relatively late and the company's overall strength is somewhat weaker. However, its database product TiDB is very strong and is the "dark horse" of domestic databases: it topped the list for a long time and ranks second this month.

DB-Engines ranking list:

DB-Engines Ranking - popularity ranking of database management systems

420 systems in ranking, August 2023

Domestic database

Judging from the international DB-Engines popularity ranking, domestic databases rank relatively low: TiDB ranks 103rd and OceanBase ranks 147th. Domestic databases still have a long way to go before they reach the world stage.

Domestic database statistics

Number of domestic databases

The number of domestic database products increases almost every month. In September 2020 the list counted 104 products; by August 2023 it had grown to 286. By the rules of the market, the strong survive and the weak are eliminated, so the number is expected to fall significantly in the coming years. Personally, I think not many domestic database products will survive in the end. I hope the easy-to-use ones can gain wide adoption as soon as possible.

Statistics by model

Relational databases still dominate and are the most competitive field.

Model          Quantity
Relational     176
Multi-model    3
Key-value      12
Wide-column    4
Time series    41
Spatial        5
Vector         7
Search         6
Graph          32

Statistics by processing scenario

Processing scenario   OLTP   OLAP   HTAP
Quantity              108    33     29

Statistics by technical architecture

Technical architecture   Centralized   Distributed
Quantity                 116           137

Statistics by deployment mode

Deployment mode   Local deployment   Cloud native
Quantity          200                47

Statistics by open source/commercial

Open source/commercial   Open source   Commercial
Quantity                 48            238

The two heroes compete for hegemony: OceanBase vs TiDB

From January 2020, TiDB topped the rankings for a total of 34 months. In December 2022 OceanBase finally overtook TiDB to take first place, and it has since held the top spot for nine consecutive months. This month its score is 609.61, which is 20.09 points higher than TiDB's. Judging from the popularity rankings, OceanBase still has a clear advantage.

Both database products are distributed databases, but there are still big differences in specific technical implementation:

Just do a simple comparison:

It cannot be said outright that one technology is better than the other. Different scenarios call for different technologies.

From a technical point of view, both databases have their own advantages and disadvantages, such as:

Distributed architecture:

TiDB: separation of storage and compute

Advantages:

Management nodes, compute nodes, and storage nodes are separated, so elastic scale-out and scale-in are strong.

Disadvantages:

1. There are more components, so the system is more complex and maintenance costs are high;

2. Compared with OceanBase's peer-to-peer architecture, a TiDB compute node cannot cache data locally the way a traditional database does. Data cannot be read directly from the compute node; it must be fetched from a storage node. The storage node can of course cache: RocksDB divides the files stored on disk into blocks of a certain size, and when reading a block it first checks the in-memory BlockCache. If the block is there, it is read directly from memory without touching the disk.
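The block-cache read path described above can be sketched in a few lines of Python. This is only a toy model under stated assumptions, not RocksDB's or TiKV's actual code: the `BlockCache` and `StorageNode` classes, the LRU eviction policy, and the 4-byte block size are all hypothetical choices made for illustration.

```python
from collections import OrderedDict

BLOCK_SIZE = 4  # bytes per block; tiny on purpose so the example is easy to follow


class BlockCache:
    """Toy LRU cache keyed by block number (hypothetical, not RocksDB's implementation)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()

    def get(self, block_no):
        if block_no not in self.blocks:
            return None
        self.blocks.move_to_end(block_no)  # mark as most recently used
        return self.blocks[block_no]

    def put(self, block_no, data):
        self.blocks[block_no] = data
        self.blocks.move_to_end(block_no)
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)  # evict the least recently used block


class StorageNode:
    """Serves block reads from the cache first and falls back to the 'disk' bytes."""

    def __init__(self, file_bytes, cache_capacity=2):
        self.file_bytes = file_bytes
        self.cache = BlockCache(cache_capacity)
        self.disk_reads = 0

    def read_block(self, block_no):
        cached = self.cache.get(block_no)
        if cached is not None:
            return cached  # cache hit: no disk access
        self.disk_reads += 1  # cache miss: read the block from disk
        start = block_no * BLOCK_SIZE
        data = self.file_bytes[start:start + BLOCK_SIZE]
        self.cache.put(block_no, data)
        return data


node = StorageNode(b"aaaabbbbccccdddd")
first = node.read_block(1)   # miss: goes to disk
second = node.read_block(1)  # hit: served from memory
print(first, second, node.disk_reads)
```

The second read of the same block does not increase `disk_reads`, which is the saving the cache provides. The compute node still has to make a network round trip to the storage node either way, which is the disadvantage the text points out.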

Data sharding method:

TiDB: automatic sharding

Advantages: Compared with OceanBase, sharding in TiDB is transparent to developers when they run table creation statements, and the table creation syntax is highly compatible with MySQL;

Disadvantages: Because persisting data is handled by the RocksDB storage engine, the read path is long and data is amplified, so more disk space is needed and the demands on disk performance are higher.

Recent activities:

OceanBase recently launched its third database competition. Many universities are taking part, which raises the database's profile. Competition among domestic databases may eventually become a competition for domestic database talent, so this will be a great help to OceanBase's later development.

As a fully self-developed domestic distributed database, OceanBase has already held two database competitions in a row. The third has been upgraded to the National College Student Computer System Ability Competition, and registration opened on August 14, 2023.

The competition was initiated by the System Ability Training Research Expert Group, is co-sponsored by the National College Computer Education Research Association and the universities that initiated the System Ability Training Research Project, and is hosted by OceanBase. It is open to college students and aims to use subject competitions to promote program development in computing and reform of how innovative, systems-capable talent is trained. The competition encourages students to design and implement comprehensive computer systems, develops their system-level design, analysis, optimization and application skills, and strengthens their technical innovation, engineering practice and teamwork. It serves the national talent training strategy, uses competition to promote learning, and builds an open platform for communication, display and cooperation for the growth of high-level computer talent. The prize money is very generous.

Thoughts on domestic database selection

What are the concerns when selecting domestic databases:

Note: The following represents only my personal views

Take the selection of domestic databases for traditional financial transaction systems as an example (bank payment systems, etc.):

1. Database type

It must be a relational OLTP database, ideally with successful cases in core financial transaction systems. Replacements in peripheral, non-core systems are not very convincing.

Non-relational and OLAP databases can be excluded: 110 non-relational databases plus the 33 OLAP databases among the relational ones. That removes 143 databases in total, which is half of all domestic databases.
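The exclusion count follows directly from the statistics tables earlier in the article. As a quick check, here is the arithmetic in Python; the variable names are only illustrative, and the input numbers are the ones quoted in the article.

```python
# Counts quoted in the "Statistics by model" and "Statistics by processing scenario" tables.
total = 286       # all domestic databases in the August 2023 ranking
relational = 176  # relational databases
olap = 33         # OLAP databases among the relational ones

non_relational = total - relational  # databases excluded for not being relational
excluded = non_relational + olap     # non-relational plus relational OLAP
remaining = total - excluded         # candidates left after this first filter
share_excluded = excluded / total

print(non_relational, excluded, remaining, f"{share_excluded:.0%}")
# prints: 110 143 143 50%
```

The 143 candidates that remain still have to pass the other criteria discussed below.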

2. Database Architecture

Supports stand-alone, active-standby, shared cluster, and distributed cluster architecture.

The evolution of database architecture mirrors the evolution of requirements. In the earliest days, when there were no requirements for disaster recovery, high availability, RTO, RPO and so on, a stand-alone architecture met basic business needs. When disaster recovery became a requirement, active-standby was introduced. When high availability became a requirement, shared clusters were introduced. When data volume and concurrent business load grew too high, distributed clusters were introduced.

For some small companies or small systems, where concurrent load and data volume are modest, an active-standby architecture basically meets business needs and a distributed cluster is unnecessary. This is especially true for start-ups whose initial business volume is very low: taking hardware, software and maintenance costs into account, an active-standby architecture is entirely sufficient.

Some people may say that domestic database replacement is originally intended for large enterprises, and small enterprises can completely choose open source databases.

But sometimes the application is bound to the database. For example, an OA system purchased by a small enterprise may support only the domestic database TiDB and not the open source database the enterprise has in mind. There are also small companies that are growing steadily. As the business grows, the database architecture may gradually move from active-standby to distributed. At that point the company wants to change only the database architecture, not the database product.

At present some domestic databases support only a distributed architecture, not stand-alone, active-standby or shared cluster. This is unfriendly to small enterprises and small systems. For example, before version 4.0 OceanBase supported only distributed deployment, and building a cluster required very high hardware costs. On August 10, 2022, at the OceanBase annual conference, OceanBase 4.0, code-named "Xiaoyu", was officially released as the industry's first integrated stand-alone and distributed database. It supports stand-alone deployment and combines the scalability of a distributed architecture with the performance advantages of a centralized one. It not only breaks through the stand-alone performance bottleneck of distributed databases but also achieves a historic "leap" for the industry in which stand-alone performance surpasses that of centralized databases. More importantly, it has lower deployment costs and less operational complexity, flexibly meets the needs of different usage scenarios, and greatly lowers the threshold for small and medium-sized enterprises to use a distributed database, laying a solid foundation for making distributed databases "inclusive" and bringing them to thousands of industries.

I also hope that domestic database vendors will give more consideration to small and medium-sized companies when developing their products.

Therefore, I hope the chosen domestic database supports stand-alone, active-standby, shared cluster and distributed cluster architectures alike, so that it can serve the company at every stage of its growth.

3. Hardware requirements

In the replacement cost of domestic databases, the hardware cost accounts for a large part. For customers, it is hoped that the hardware investment cost after the replacement will not be much higher than the previous cost.

For example, suppose the previous system ran mainly on an Oracle Data Guard (DG) architecture. It needed only two virtual machines, and a minimum configuration of 4C 8G 100GB was enough to run it. If the replacement needs far more hardware, and perhaps an expansion of the machine room as well, ordinary small businesses will find that hard to accept.

If you consider centralized deployment instead, with dozens or hundreds of servers forming one large database cluster shared by all systems, the cost of trial and error is too high. In the early stage, when enterprises do not yet have enough confidence in domestic databases, they dare not rashly move all their systems onto one large domestic distributed database.

Therefore, I hope the chosen domestic database has low minimum hardware requirements for installation and a low hardware resource cost.

4. Deployment method

Support for private deployment

Some cloud-native databases can run only on a specific vendor's public cloud. This ties the database completely to the cloud provider and leaves the customer in a very weak position in commercial negotiations. Many companies also do not want to put their own data on a public cloud.

Therefore, the chosen domestic database needs to support private deployment.

5. Motianlun popularity

1. Ranks near the top of the Motianlun popularity list

Motianlun's popularity score for domestic databases is calculated from a combination of factors such as search engines, trend indexes, third-party evaluations, ecosystem, patents, papers, job postings, books, Gartner market share rankings, and Magic Quadrants. A very low score may have several causes:

(1) Newly added to the Motianlun list; (2) A newly developed database product; (3) Little publicity; (4) A small user base; (5) Too niche; (6) A weak technical community.

Whatever the reason, as a technician it is hard to judge from these factors whether a low-scoring domestic database is a good fit.

Therefore, when selecting a domestic database you can consider looking only at the top 20 of the Motianlun popularity ranking; the trial-and-error cost of low-scoring products is higher.

The popularity tiers below are divided roughly by score. This is the overall list across all database categories. If you are evaluating a particular type of database, filter by type first and then evaluate.

T0 King of Glory: greater than or equal to 500 points

OceanBase (609.61), TiDB (589.52), openGauss (582.52), PolarDB (576.69), GaussDB (570.75)

T1 Strongest King: more than 300 and less than 500

Dameng (482.95), Renda Jincang (451.51), GBase (364.75), TDSQL (344.63)

T2 Supreme Starlight: more than 100 and less than 300

AnalyticDB (213.10), AntDB (165.68), TDengine (109.06), GoldenDB (101.32)

T3 Eternal Diamond: more than 50 and less than 100

Shenzhou General (78.81), MogDB (71.55), Doris (68.08), DolphinDB (64.65), Kyligence (60.62)...

T4 Noble Platinum: less than 50

Wanli Database, KunDB, SelectDB...
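The tier boundaries above are easy to apply mechanically when filtering a longer list. Below is a minimal Python sketch; the `tier` function is hypothetical, and because the article does not say where a score of exactly 300, 100 or 50 belongs, the sketch assumes such scores fall into the higher tier.

```python
def tier(score):
    """Map a Motianlun popularity score to the article's tier label."""
    # Assumption: a score exactly on a boundary (300, 100, 50) goes to the higher tier.
    if score >= 500:
        return "T0"
    if score >= 300:
        return "T1"
    if score >= 100:
        return "T2"
    if score >= 50:
        return "T3"
    return "T4"


# A few August 2023 scores quoted in the article.
scores = {
    "OceanBase": 609.61,
    "TiDB": 589.52,
    "Dameng": 482.95,
    "TDSQL": 344.63,
    "AnalyticDB": 213.10,
    "GoldenDB": 101.32,
    "MogDB": 71.55,
}

by_tier = {}
for name, score in scores.items():
    by_tier.setdefault(tier(score), []).append(name)

for label in sorted(by_tier):
    print(label, by_tier[label])
```

Restricting an evaluation to T0 and T1, for example, would then be a matter of keeping only the names under those two labels.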

6. Official technical documents

I hope there is mature official technical documentation.

For example, detailed documents on installation and deployment, upgrades, patch maintenance, backup and recovery, architecture and so on, for each architecture and platform. In terms of technical documentation, I hope domestic databases will eventually be on a par with Oracle.

In October 2022 I published an article on the public account "IT Xiao Chen" titled "Comparison of official technical documents of domestic databases (backup and recovery part)", covering the backup and recovery documentation of six popular domestic databases (TiDB, Dameng, OceanBase, openGauss, GaussDB, GBase). Of these, GBase, Dameng, and TiDB had relatively good backup and recovery documentation; at least it would not drive people away at first sight. openGauss was somewhat worse, and OceanBase and GaussDB were seriously lacking, with very little description of technical details and principles. Nearly a year has passed, and I do not know whether this has improved. Many other database vendors place heavy restrictions on their documents; for example, only internal customers have permission to view them. Some domestic databases have no technical documents at all.

Therefore, I hope the official technical documentation of the chosen domestic database is detailed enough to give technical staff the confidence to learn and promote it.

7. After-sales service quality

I hope for comprehensive after-sales service.

For example, if a serious bug appears in a domestic database or the customer has a new requirement, how quickly can the R&D team deliver a fix or a new version? After-sales support should have a complete and fast process based on the customer's importance level, the bug severity, and so on.

Because the number of domestic databases has grown so fast, technical staff are in short supply. In several contacts with domestic database technicians, I found that some after-sales staff had no database background, could not tell basic database concepts apart, and had little appreciation of how important database operation and maintenance is. In order to hire many technicians as quickly as possible, some vendors may recruit people with no database background at all, train them on the company's product for 2-3 weeks, and then send them out to support projects. This was unimaginable a few years ago. I still remember that when recruiting DBAs, companies used to place great weight on age and work experience. It was hard to get a DBA job if you were too young, because without a certain amount of experience companies would not dare put important data in your hands. Now it seems to be the opposite: if you are older, you are passed over. With age limits this strict, things are getting harder and harder for front-line technicians.

8. Database certification

I hope the domestic database offers certification

For example, the DCA, DCP, and DCM certifications for the Dameng database, or the GBase 8s, GBase 8a, and GBase 8c certifications for Nanda General's databases.

Taking part in database certification gives you a better understanding of the product and lets you make more reasonable judgments when selecting a database.

Of course, domestic database vendors differ in their certification and training capabilities. Among the domestic database certifications, I personally feel that Dameng's and GBase's are relatively professional. I have not been exposed to the others, so I will not comment on them for now.

Nanda General's GBase 8s, GBase 8a, and GBase 8c certifications are available for free. The training videos and documents are detailed, and there is a dedicated WeChat group for Q&A. The overall training and examination process is very friendly.

9. Independent and controllable

I hope domestic databases can be like the "Nine Yang Manual", unafraid of any bug or requirement: let him be strong, the breeze still brushes the hills; let him be fierce, the bright moon still shines on the river; let him be as ruthless and vicious as he likes, my own breath of true energy is enough.

Being independent and controllable does not rule out learning from other databases, but I do not recommend building directly on an existing open source or commercial database, that is, combining distributed proxy middleware with an open source database and presenting the result as a new distributed database.

Almost all domestic databases are marketed as completely independent and controllable. However, when a bug occurs in the product, not every vendor is able to locate and fix it quickly.

I have always been puzzled by domestic databases that build directly on an existing open source database, for example using MySQL as-is for the data layer while the compute and scheduling layers may be self-developed. It is not clear whether such a product counts as a purely domestic database. When a major MySQL bug occurs in such a database, must the vendor wait for Oracle or another upstream maintainer to resolve it? Or can the vendor modify and repair the current version of MySQL itself? It is also not clear whether doing so would conflict with the GPL open source license. I know little about this area, so I will not discuss it further.

10. Company nature

State-owned enterprise or private enterprise, large company or small company

It is best for the vendor to be a state-owned enterprise or a large company

Competition among domestic database vendors is extremely fierce. A small company that is not managed well later on runs the risk of bankruptcy. A large company avoids this kind of problem to a large extent.

11. Customer groups

It is best to have successful replacement cases in core financial transaction systems. Replacements in peripheral, non-core systems are not very convincing.

12. POC

Domestic database selection usually includes a POC testing stage, and it is very important. If the POC test fails, no one is willing to take the risk.

The database vendor and the company's technical staff jointly carry out thorough testing of installation and deployment, high availability, disaster recovery, backup and recovery, data migration, database synchronization, syntax compatibility and so on, and decide from the final test scores whether the database meets the company's requirements.

13. Fees

Reasonable fees

Including software, hardware, technical support and other costs.

14. Operation and maintenance costs

1. High syntax compatibility

For example, in the OceanBase distributed database you need to specify the partition key manually when creating a table. For developers, changing their table creation habits carries a learning cost. If database migration is involved, there is also the extra work of adjusting the syntax. If the partition key is poorly chosen, performance problems may appear later, which adds to the operation and maintenance staff's optimization workload.

Relatively speaking, TiDB's syntax compatibility is higher. There is no need to manually specify the partition key when creating a table. Data distribution is automatically implemented at the bottom.
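To make the risk of a poorly chosen partition key concrete, here is a toy Python sketch. It does not reproduce OceanBase's actual partitioning logic: the partition count, the `partition_of` helper and the sample order rows are all hypothetical. It only illustrates why a low-cardinality column tends to make a bad hash-partition key.

```python
import zlib
from collections import Counter

PARTITIONS = 4  # hypothetical number of hash partitions


def partition_of(key):
    """Toy hash partitioning: integers by modulo, other values by CRC32 of their text."""
    if isinstance(key, int):
        return key % PARTITIONS
    return zlib.crc32(str(key).encode()) % PARTITIONS


# Hypothetical order rows: (order_id, status). Nine in ten orders are "PAID".
rows = [(order_id, "PAID" if order_id % 10 else "REFUNDED") for order_id in range(1, 1001)]

# Partitioning on the high-cardinality order_id spreads rows evenly.
by_order_id = Counter(partition_of(order_id) for order_id, _ in rows)

# Partitioning on the low-cardinality status piles most rows into one partition.
by_status = Counter(partition_of(status) for _, status in rows)

print("rows per partition, key=order_id:", sorted(by_order_id.values()))
print("rows per partition, key=status:  ", sorted(by_status.values()))
```

With `order_id` as the key, each of the four partitions holds 250 rows. With `status` as the key, at least 900 of the 1,000 rows land in a single partition, so one node would carry almost all of the load. This is the kind of skew that operation and maintenance staff later have to correct.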

2. High compatibility of operation and maintenance methods

For DBAs, the higher the compatibility of the operation and maintenance method, the faster they can master it.

For me, the databases I know best are Oracle and MySQL; I am less familiar with PostgreSQL, and I have had even less exposure to Informix.

Therefore, I can get started faster with domestic databases developed on the basis of Oracle or MySQL, especially where the system structure, data dictionary, backup and recovery, high-availability architecture and so on are very similar. With the Dameng database, for example, some problem-solving approaches closely resemble Oracle's, so I can reason by analogy. Domestic databases based on Informix are somewhat harder for me to learn: when a problem arises, I lack the Informix background needed to solve it by analogy.

15. Version iteration

Have stable version iterations

For example, there is a small version iteration every few months, and a major version iteration every few years, with a stable version iteration cycle.

If a database product version has not been iteratively updated for many years, there may be several reasons:

1. The product was at its peak on debut, has zero bugs, and meets all customer needs (in reality this does not happen).

2. The product has few customers and no in-depth users, so few bugs and requirements are reported.

3. The R&D team has weak capabilities and the product update cycle is long.

Therefore, it is hoped that the localized database selected should have a stable version iteration.

Summary:

The domestic database popularity rankings released monthly by the Motianlun platform are a good reference for selecting domestic databases and can help us make more objective judgments.

For a new product, the larger and more varied the customer base, the more requirements it receives and the more polished the product becomes. Conversely, if a domestic database product has only a few customers, or is used only inside the vendor's own company, its usage scenarios are few, its business model is limited, and the cost of trial and error is very high. This is why I do not recommend directly adopting products that rank especially low in the Motianlun popularity rankings.

Personally, I believe an ideal domestic OLTP database should have the following characteristics:

It is an OLTP database. The vendor has a state-owned enterprise or large-company background. The architecture supports stand-alone, active-standby, shared cluster, distributed cluster and so on. Hardware resource requirements and hardware costs are low. It is completely self-developed and independently controllable. It has comprehensive after-sales support, high popularity, mature official technical documentation, and official technical certification. It has successful replacement cases in core banking systems and has fully passed the company's internal POC test. Version iterations are stable, costs are moderate, and compatibility with existing syntax and operation and maintenance habits is high.

There is no 100% perfect domestic database product yet. Of course, just as no person is perfect, there is no perfect product, only a suitable one. Selecting a domestic database also involves trade-offs. Some key points must be met: for example, it must be an OLTP database, because mistakenly choosing an OLAP database could seriously affect business stability and the integrity and consistency of the data. Other points can be relaxed as appropriate, such as compatibility of operation and maintenance methods. Each concern can be given a weight, and a suitable domestic database can then be selected based on the total score.
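The weighted-score idea in the last paragraph can be sketched as follows. Everything in this example is hypothetical: the four concerns picked, their weights, the candidate names and the 0-10 ratings are placeholders, not evaluations of any real product. The OLTP requirement is treated as a hard filter, and the remaining concerns are combined by weight.

```python
# Hypothetical weights (in percent) for a few of the concerns discussed above.
WEIGHTS = {"architecture": 30, "hardware_cost": 30, "documentation": 20, "after_sales": 20}

# Placeholder candidates with made-up 0-10 ratings; these are not real evaluations.
CANDIDATES = {
    "candidate_a": {"is_oltp": True, "architecture": 9, "hardware_cost": 5,
                    "documentation": 6, "after_sales": 7},
    "candidate_b": {"is_oltp": True, "architecture": 6, "hardware_cost": 8,
                    "documentation": 8, "after_sales": 8},
    "candidate_c": {"is_oltp": False, "architecture": 9, "hardware_cost": 9,
                    "documentation": 9, "after_sales": 9},
}


def total_score(ratings):
    """Return the weighted total, or None if the hard OLTP requirement is not met."""
    if not ratings["is_oltp"]:
        return None  # must-have criterion: excluded regardless of other ratings
    return sum(WEIGHTS[name] * ratings[name] for name in WEIGHTS)


scores = {name: total_score(ratings) for name, ratings in CANDIDATES.items()}
qualified = {name: score for name, score in scores.items() if score is not None}
best = max(qualified, key=qualified.get)

print(scores)
print("best:", best)
```

In this made-up data, `candidate_c` has the highest ratings but is excluded because it is not an OLTP database, and `candidate_b` wins with 740 points against 680 for `candidate_a`. In practice the weights would be agreed within the company before the POC, so that the scoring is not adjusted after the fact.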


Original link: https://www.modb.pro/db/1693833019517915136

Disclaimer: This article is an exclusive contribution by Chen Juchao, a special author of the Motianlun community. The content is original and represents only the author's personal views. Everyone is welcome to exchange ideas and discuss. To reprint, please contact the author or Motianlun officially. If you have comments or suggestions on the above, please share them in the comment area below, or leave a message on the author's Motianlun homepage.


Origin blog.csdn.net/Era666/article/details/132560378