Yelang-style arrogance in the database circle is dangerous!

Author: Yin Haiwen, Oracle ACE, OCM 11g/12c/19c, MySQL 8.0 OCP, Mo Tianlun MVP, technical expert, ITPUB core expert, OCM lecturer, NetSilicon DBA director.

WeChat public account: The fish tank of the fat-headed fish

Today I saw an invited article by Chief Xue on OSCHINA, "The emergence and disappearance of domestic databases are not technical problems", along with his comments. They struck such a chord that I couldn't help writing an article of my own.

The art of 0 and 1

Today's computers are, in the final analysis, the art of using diodes, that is, on and off, 0 and 1. Whatever runs on the hardware, whether an operating system, application software, or anything else, works in binary, and in essence it is still mathematics. Take the database as an example: every algorithm in it rests on the most basic mathematics, on how basic binary operations combine with hardware firmware, hardware instruction sets, hardware drivers, and the running software. What I have never understood is why people keep saying that, because many of the mathematical foundations of databases have not advanced in recent decades, we can catch up with ease?!

I am a DBA who studied liberal arts in high school and majored in landscape design at university, so I cannot explain some things at a deep mathematical level. But the results can be shown with a simple method: honestly run exactly the same workload, with the same configuration, on Oracle 11g vs 19c, MySQL 5.6 vs 8.0, and PG 9 vs 16. That cancels out the so-called progress contributed by hardware and shows whether performance has actually improved. (Don't say this is hard to do; at most it means taking turns testing on a single 4C16GB virtual machine.)

Bandwidth bottleneck

Whenever bandwidth bottlenecks came up, the talk was always about disk IO. A traditional HDD, at no more than about 300MB/s and somewhere between a few hundred and a thousand or so IOPS, really was not good enough. Today, however, it is transmission bandwidth that is in an awkward position. Most networks are still 10 Gigabit. As I mentioned in a previous article, a single mainstream PCIe 4.0 x4 NVMe SSD can reach a peak bandwidth of 4000MB/s. In other words, one SSD can saturate a 10 Gigabit network (1250MB/s), and one SSD can also saturate a mainstream 32Gbps HBA card. A 40Gbps IB switch can barely carry one SSD, and a 100Gbps RoCE switch can barely carry three. Beyond that you can aggregate multiple links, and so on. In short, now that NVMe SSDs keep getting cheaper, even allowing for the performance drop under random reads and writes, a small number of SSDs is enough to fill the bandwidth of a single machine.
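The arithmetic behind these comparisons can be checked with a short Python sketch. It assumes the link rates are nominal figures in gigabits per second, uses decimal units, ignores encoding and protocol overhead, and takes the 4000MB/s per-SSD figure from the paragraph above. The function names are illustrative only.

```python
# Back-of-the-envelope check: how many 4000MB/s SSDs does each link carry?
SSD_MB_PER_S = 4000  # peak bandwidth of one PCIe 4.0 x4 NVMe SSD, as quoted in the text

# Nominal line rates in gigabits per second (overhead ignored: an assumption).
LINKS_GBIT_PER_S = {
    "10 Gigabit Ethernet": 10,
    "32Gbps HBA": 32,
    "40Gbps IB": 40,
    "100Gbps RoCE": 100,
}


def gbit_to_mb_per_s(gbit_per_s):
    """Convert Gb/s to MB/s with decimal units (1 Gb = 1000 Mb, 8 bits per byte)."""
    return gbit_per_s * 1000 / 8


def ssds_to_saturate(gbit_per_s, ssd_mb_per_s=SSD_MB_PER_S):
    """Number of SSDs running at full speed needed to fill the link."""
    return gbit_to_mb_per_s(gbit_per_s) / ssd_mb_per_s


if __name__ == "__main__":
    for name, rate in LINKS_GBIT_PER_S.items():
        print(f"{name}: {gbit_to_mb_per_s(rate):.0f} MB/s, "
              f"{ssds_to_saturate(rate):.2f} SSDs to saturate")
```

Under these assumptions the 10 Gigabit link is filled by less than a third of one SSD, the 32Gbps link by exactly one, the 40Gbps link by 1.25, and the 100Gbps link by a little over three.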

In the current era of disk performance > transfer performance, I still hear people say that dedicated storage devices are better than Oracle Exadata's distributed storage built on x86 servers. What does the Oracle Exadata storage layer actually do? Exadata Storage Software combines hardware features with software to filter data inside the storage, which reduces the bandwidth needed at the network transmission level and lets high-performance devices such as NVMe SSD and PMEM do real work instead of flooding the network. Dedicated storage, by contrast, cannot break through the transfer bottleneck of the servers that use it. No matter how strong its IO is and how large its outbound bandwidth, all it can do is serve more servers. As for standalone servers fitted with many high-performance disks, especially when they are used for cross-shard operations in a distributed database, the network struggles as soon as the data volume grows a little (and the cause may not be disk or memory at all). This is why distributed databases recommend keeping related data in one shard (which means many requirements have to be dropped at the database level, or simply cannot be implemented with a distributed database).
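The effect of filtering inside the storage layer can be illustrated with a toy Python model. This is only a sketch of the general idea of pushing a predicate down to storage, not a description of how Exadata Storage Software is implemented; the row layout, function names, and data are all hypothetical.

```python
# Toy model: bytes crossing the network when the database server filters
# versus when the storage node filters. All names and data are hypothetical.
import struct

# order_id (int32), region_id (int32), amount (float64); "<" disables padding.
ROW_FORMAT = "<iid"
ROW_SIZE = struct.calcsize(ROW_FORMAT)  # 16 bytes per row


def make_rows(n):
    """Generate n synthetic rows spread evenly over 100 regions."""
    return [(i, i % 100, float(i)) for i in range(n)]


def ship_all_then_filter(rows, predicate):
    """Storage sends every row; the database server applies the predicate."""
    bytes_sent = len(rows) * ROW_SIZE
    result = [row for row in rows if predicate(row)]
    return result, bytes_sent


def filter_in_storage(rows, predicate):
    """Storage applies the predicate and sends only the matching rows."""
    result = [row for row in rows if predicate(row)]
    bytes_sent = len(result) * ROW_SIZE
    return result, bytes_sent


if __name__ == "__main__":
    rows = make_rows(100_000)

    def wanted(row):
        return row[1] == 7  # keep a single region

    _, naive_bytes = ship_all_then_filter(rows, wanted)
    _, pushed_bytes = filter_in_storage(rows, wanted)
    print(f"filter at database server: {naive_bytes} bytes on the wire")
    print(f"filter inside storage:     {pushed_bytes} bytes on the wire")
```

Both paths return the same rows. The only difference is how much data has to cross the network, which in this model shrinks in proportion to the selectivity of the predicate.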

Looking back at the start of this section: why did we go distributed in the first place? Personally I think it was because a single machine was not powerful enough and more machines had to be stacked up. But now a small SSD has better overall IO performance than the clusters of dozens or hundreds of machines of the past. So has distribution become a pseudo-requirement again? Look at Exadata's approach: isn't distributed storage that makes full use of the hardware, combined with a centralized database, the more reasonable design?

“Far ahead”

I don't know exactly when it started, though I guess it was with the "crouching dragon and phoenix chick" line in the movie "The Tomato Man": many words that used to be compliments have taken on a derogatory meaning, including the "far ahead" of a certain loudmouth at a certain major company. More than once, in exchanges about domestic databases, in promotional materials (including what they imply between the lines), and in industry briefings, I have seen claims that many of our products are far ahead of foreign database products such as Oracle, DB2, SQLServer, MySQL, and PostgreSQL (though I don't know why there are so many re-skinned "shell" products built on the last two).

To return to what I have written before: a database is a piece of foundational systems engineering that takes a long time to hone. Can a database product really become powerful just by stitching together a pile of so-called "advanced concepts, advanced architectures, and advanced algorithms (especially algorithms that people can use without knowing how they are implemented)"?! That only produces the appearance of catching up quickly. If we really want to catch up, it cannot be done by neglecting the foundation.

Summary

In the database circle, facing up to the gap is not shameful; arrogance is. The so-called "far ahead" ultimately puts the system in danger.

Origin my.oschina.net/u/3859945/blog/10321420