Integrated or separated storage and compute? On architectural choices for database infrastructure

Let's start with a user case.

A user in the financial industry asked: for a database, is it better to use the server's local disks or external storage? Intuitively, the shorter physical path of a local disk should mean better performance. The test results, however, were unexpected: under the same medium-concurrency, mixed random read/write workload, the server's local SSDs delivered a total of only 40,000 IOPS with latency as high as 6 ms, a regression to the level of the mechanical-disk era, while the external storage sustained 300,000 IOPS under the same pressure with latency stable at 0.2 ms.

| I/O model | TDSQL + local SSD latency (ms) | TDSQL + external all-flash storage latency (ms) |
| --- | --- | --- |
| 8KB sequential read | 0.95 | 0.19 |
| 8KB random read | 6.1 | 0.33 |
| 8KB sequential write | 1.8 | 0.32 |
| 8KB mixed random read/write (read) | 6.4 | 0.2 |
| 8KB mixed random read/write (write) | 5.7 | 0.25 |
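As a rough sanity check on these figures (a sketch added here for illustration, not part of the original test report), Little's law says the average number of in-flight I/Os equals IOPS × latency, so the two quoted results imply very different queue depths:

```python
# Little's law: in-flight I/Os = IOPS x latency (latency in seconds).
# Applied to the two results quoted above; a sanity check, not a reproduction
# of the original benchmark.

def in_flight_ios(iops: float, latency_ms: float) -> float:
    """Average number of outstanding I/Os implied by throughput and latency."""
    return iops * (latency_ms / 1000.0)

local_ssd = in_flight_ios(iops=40_000, latency_ms=6.0)    # ~240 outstanding I/Os
external = in_flight_ios(iops=300_000, latency_ms=0.2)    # ~60 outstanding I/Os

print(f"local SSD:        ~{local_ssd:.0f} I/Os in flight")
print(f"external storage: ~{external:.0f} I/Os in flight")
```

At the quoted numbers, the local-disk path keeps roughly four times as many I/Os queued while delivering less than a seventh of the throughput, which reads as a bottleneck inside the I/O stack rather than a short physical path paying off.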


It can be seen that a shorter physical path or a pile of hardware resources alone does not guarantee efficiency; more advanced architecture and algorithms are the decisive competitive factor. In fact, storage-compute integration and storage-compute separation have alternated repeatedly in the history of database development, and every swing has been driven by technological progress.

The first storage-compute separation

Early IT systems were small, and storage and compute were integrated.

The first database management system, IDS (Integrated Data Store), was developed by General Electric in 1961 and could only run on GE's own host. Its data was kept in a single file on the local disk, which the local machine's CPU and memory read and wrote according to hand-coded instructions. In short, early database systems were deployed on local disks in a storage-compute integrated fashion, for two main reasons:

  • Databases at the time were high-end applications, tightly bound to mainframes and midrange computers. The classic IBM System/360 mainframe, for example, on which the later well-known DB2 database first ran, combined powerful compute with local storage and took an early leading position in major financial institutions and research laboratories around the world.

  • Data volumes were small. Take a bank credit management system of that era as an example: its data amounted to only about 10 GB, which the mainframe's local storage could hold with room to spare.

IT system data volumes exploded, and compute and storage began to separate.

With the rise of Internet technology, data volumes in database systems began to surge, and a traditional single database server could no longer store and serve the growing data. UNIX systems also became popular, and almost every UNIX host could attach to an independent storage server. During this period, storage area network (SAN) solutions for block devices and network attached storage (NAS) solutions for network file systems came to dominate, and the mainstream architectural choice shifted from storage-compute integration to storage-compute separation, with external centralized shared storage devices becoming widespread.


Starting with version 9i, the commercial database giant Oracle launched RAC based on shared storage. Each RAC compute node is separated from the shared storage, which solves the problems of horizontally scaling database compute and tolerating node failures. At the same time, taking advantage of the then newly popular 1000 Mbps networks, Oracle also built a shared cache on the same idea, the Cache Fusion technology the industry still talks about, and the resulting efficiency gain qualitatively improved Oracle's concurrent processing performance.


There is no doubt that the advent of RAC was revolutionary at the time, and it established Oracle's dominance in core enterprise database systems. The key technologies supporting this round of storage-compute separation were mature networking and storage networking technology, along with Oracle's own killer feature, Cache Fusion.
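To make the Cache Fusion idea concrete, here is a minimal, hypothetical sketch (not Oracle code; the class and method names are invented for illustration): a node first checks its own buffer cache, then asks its peers to ship the block over the interconnect, and only falls back to shared storage when no one has it cached.

```python
# Hypothetical illustration of the cache-fusion idea: prefer a peer's cached
# block over a much slower read from shared storage. Not Oracle internals.

class ClusterNode:
    def __init__(self, name, shared_storage):
        self.name = name
        self.buffer_cache = {}                # block_id -> block contents
        self.peers = []                       # other nodes on the interconnect
        self.shared_storage = shared_storage  # dict standing in for shared disk

    def get_block(self, block_id):
        # 1. Local buffer cache hit: no network, no disk.
        if block_id in self.buffer_cache:
            return self.buffer_cache[block_id]
        # 2. Cache-to-cache transfer over the interconnect.
        for peer in self.peers:
            if block_id in peer.buffer_cache:
                block = peer.buffer_cache[block_id]
                self.buffer_cache[block_id] = block
                return block
        # 3. Last resort: read from shared storage, then cache locally.
        block = self.shared_storage[block_id]
        self.buffer_cache[block_id] = block
        return block

storage = {"blk-1": b"row data"}
a, b = ClusterNode("A", storage), ClusterNode("B", storage)
a.peers, b.peers = [b], [a]
a.get_block("blk-1")  # first access: read from shared storage
b.get_block("blk-1")  # served from node A's cache over the interconnect
```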

Storage-compute integration returns, spurred by massive data

In the first two decades of the 21st century, with the rapid development of Web 2.0 and 3.0, more and more business moved onto the Internet, and the surge in concurrent access brought exponential growth in data volume. People began to realize that it was not enough to capture the data needed to run the business; there was also value to be extracted by processing that massive data. As a result, industries began to build business intelligence systems on top of their original business to form new core business systems. At this stage, the biggest challenges customers faced were cost and throughput.


Traditional minicomputers and high-end storage arrays remained expensive, so building an in-house analysis system required a large investment in hardware and licenses. Worse, even after spending heavily, running analytics on the database was still painfully slow. When customers asked the vendor to diagnose the problem, the answer was always the same: the disks are too slow, the network bandwidth is insufficient, the CPU can't keep up... in short, keep spending on hardware.

The root cause is an I/O bottleneck in the storage-compute separation architecture built on traditional centralized shared storage: the host side only processes data and does not store it, so every read has to fetch data from the shared storage. That adds one layer of latency for the network hop and another when the storage side reads from mechanical HDDs; these layers stack up and sharply reduce read/write efficiency.
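A back-of-the-envelope sketch of how those layers stack up (the per-layer numbers below are assumed orders of magnitude for illustration, not measurements from the article):

```python
# Illustrative latency budget for a single 8KB read through a centralized
# shared-storage stack of that era. The values are rough assumptions; the
# point is only that the layers add up and the mechanical disk dominates.

layers_us = {
    "host I/O stack": 50,          # syscall, driver, HBA queuing
    "SAN network hop": 200,        # round trip to the array under load
    "array controller": 300,       # cache-miss handling, RAID logic
    "HDD seek + rotation": 8000,   # the dominant term on mechanical disks
}

total_us = sum(layers_us.values())
print(f"estimated read latency ~ {total_us / 1000:.1f} ms")  # ~8.6 ms
# Hadoop's answer (next section) is to move the computation to where the
# data lives instead of pushing every read through these layers.
```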

The Hadoop distributed framework developed under the Apache Foundation breaks through this I/O bottleneck with multi-node parallelism and localized computation. It was designed as a storage-compute integrated system from birth: by dispatching compute tasks to wherever the data is stored, it sidestepped the long-standing network bottleneck.


What is special about Hadoop is data locality under its integrated storage-compute architecture:

  • Each instance in the cluster is responsible both for storing data (as a DataNode) and for executing compute tasks (as an executor).

  • When scheduling, compute tasks are sent, as far as possible, to the instances that hold the data they need to process.

Hadoop's first-generation framework uses distribution to solve the concurrency and bandwidth problems and localized computation to solve the network bottleneck, as sketched below. At the same time, its core components, including the HDFS distributed file system, can be deployed on large numbers of low-cost commodity servers, improving cost effectiveness for users.
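A minimal, hypothetical sketch of locality-aware placement (invented names, not the actual Hadoop scheduler): given the block-to-DataNode placement, each task is assigned to a node that already holds its input block whenever that node has a free slot.

```python
# Hypothetical locality-aware task placement, illustrating the idea behind
# "move computation to the data". Not Hadoop source code.

def schedule(tasks, block_locations, free_slots):
    """tasks: list of (task_id, block_id); block_locations: block_id -> [nodes];
    free_slots: node -> available task slots. Returns {task_id: node}."""
    assignment = {}
    for task_id, block_id in tasks:
        # Prefer a node that stores the input block locally (data-local read).
        local = [n for n in block_locations.get(block_id, []) if free_slots.get(n, 0) > 0]
        if local:
            chosen = local[0]
        else:
            # Fall back to any node with capacity (remote read over the network).
            candidates = [n for n, slots in free_slots.items() if slots > 0]
            if not candidates:
                break  # cluster saturated; remaining tasks wait for slots
            chosen = candidates[0]
        assignment[task_id] = chosen
        free_slots[chosen] -= 1
    return assignment

# t1's block lives on node2, so it runs there and reads from the local disk.
placement = schedule(
    tasks=[("t1", "blk-7"), ("t2", "blk-9")],
    block_locations={"blk-7": ["node2"], "blk-9": ["node1", "node3"]},
    free_slots={"node1": 1, "node2": 1, "node3": 1},
)
print(placement)  # {'t1': 'node2', 't2': 'node1'}
```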

The cloud computing era brings a new generation of storage-compute separation

With the rapid development of cloud computing, the pay-as-you-go mindset has taken root, and even core systems deployed on-premises are expected to have their resources supplied on demand. A tightly coupled storage-compute integrated architecture can no longer satisfy these needs. Database technology is likewise moving toward resources as a service, taking a big step toward a loosely coupled architecture of independent compute over shared storage.

Kubernetes, the open source project Google announced in 2014, puts the elastic scalability of cloud computing to full use. It is a system for deploying and managing cloud native applications on top of container technology. Applications deployed on Kubernetes are usually split into multiple sub-modules packaged in containers, known as "microservices", and each module is classified as stateful or stateless according to whether it holds persistent data. When an application needs to serve traffic, it can first use a storage interface component to connect to external storage and build a stateful, persistent data resource pool; on top of that, stateless resources are loaded dynamically according to the needs of the upper-layer business. Running microservices this way is like loading and unloading shipping containers: every container has a standardized unit volume, more or fewer containers can be hauled depending on the power of the locomotive, every train pulls only as much as it can handle, and containers can easily be swapped between locomotives. Kubernetes thus laid the technical foundation for decoupling storage and compute in cloud infrastructure.

Databases, driven by the cloud native trend, have also evolved into serverless database services that can adapt to changing and uncertain business demand. Traditional cloud databases merely deploy the database on cloud infrastructure without improving or optimizing it; constrained by their integrated storage-compute architecture, the ratio of storage to compute resources is confined to a narrow range, so their elasticity and resource utilization are heavily restricted. In addition, because the nodes share nothing, adding a new node inevitably triggers full cross-node data replication, and performance drops by at least 20%. PolarDB, launched by Alibaba Cloud, is a typical serverless database: it automatically adjusts the database's resource scale to match actual business load. The following figure shows PolarDB's technical architecture.

[Figure: PolarDB technical architecture]

PolarDB first decouples compute from storage; this storage-compute separation is the foundation of its serverless capability. The separated data is stored uniformly in a shared resource pool, so every compute node sees all of the data. Users get a single-machine database experience while resources are used efficiently: storage space scales automatically on demand in a serverless fashion, and no matter how many compute nodes are added, there is always only one copy of the data.
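A toy comparison of the two scaling models described above (purely illustrative, not PolarDB internals): in a share-nothing cluster, adding a node means copying data to it, while with shared storage a new compute node simply attaches to the single existing copy.

```python
# Toy model of scale-out cost: share-nothing vs. shared storage.
# Classes and numbers are invented for this sketch.

class ShareNothingCluster:
    def __init__(self, nodes, total_data_gb):
        self.nodes = nodes
        self.total_data_gb = total_data_gb  # data is partitioned across nodes

    def add_node(self):
        self.nodes += 1
        # Rebalancing: roughly 1/N of the data must be copied to the new node.
        return self.total_data_gb / self.nodes  # GB that must move over the network

class SharedStorageCluster:
    def __init__(self, compute_nodes, total_data_gb):
        self.compute_nodes = compute_nodes
        self.total_data_gb = total_data_gb  # one shared copy for all nodes

    def add_compute_node(self):
        self.compute_nodes += 1
        return 0  # no data movement: the new node mounts the shared volume

print(ShareNothingCluster(nodes=3, total_data_gb=4000).add_node())                    # 1000.0 GB copied
print(SharedStorageCluster(compute_nodes=3, total_data_gb=4000).add_compute_node())   # 0
```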

Aurora, the cloud database launched by AWS, takes full advantage of distribution on top of storage-compute separation: it turns the storage tier from a centralized resource pool into a distributed cluster volume, further strengthening data reliability and fault tolerance. Let's take a brief look at Aurora's technical architecture:

[Figure: Aurora technical architecture]

Aurora's architecture has the following outstanding features:

  1. The storage layer is a cross-data-center distributed service, built independently of the compute layer, and can be scaled flexibly and freely on demand;

  2. Compute-side and storage-side operations are separated and do not interfere with each other: the compute side pushes data-processing logic, including redo log application, down to the storage side for asynchronous execution (see the sketch after this list);

  3. Both the compute layer and the storage layer are built from existing EC2 instances with no special underlying facilities; in other words, Aurora is assembled from AWS's existing services and conforms to cloud computing service standards.
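A minimal, hypothetical sketch of the "push the redo log down" idea from item 2 above (invented classes, not Aurora's actual protocol): the compute node ships only redo log records to the storage replicas, which apply them to pages lazily and materialize a page when it is read.

```python
# Hypothetical sketch of shipping redo records to storage nodes that apply
# them asynchronously. Not Aurora's implementation.

from collections import defaultdict

class StorageNode:
    def __init__(self):
        self.pages = defaultdict(dict)        # page_id -> materialized page content
        self.pending_log = defaultdict(list)  # page_id -> unapplied redo records

    def receive_redo(self, page_id, record):
        # The write path only appends the log record; no page is written here.
        self.pending_log[page_id].append(record)

    def read_page(self, page_id):
        # Apply outstanding redo records lazily, then serve the page.
        for key, value in self.pending_log.pop(page_id, []):
            self.pages[page_id][key] = value
        return self.pages[page_id]

class ComputeNode:
    def __init__(self, storage_nodes):
        self.storage_nodes = storage_nodes  # replicas holding the same data

    def write(self, page_id, key, value):
        # Only the small redo record crosses the network, not whole data pages.
        for node in self.storage_nodes:
            node.receive_redo(page_id, (key, value))

replicas = [StorageNode() for _ in range(3)]
db = ComputeNode(replicas)
db.write("page-1", "balance", 100)
print(replicas[0].read_page("page-1"))  # {'balance': 100}, applied at read time
```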

The storage-compute separation architecture provides the basis for fully exploiting the dividends of cloud computing. Workloads in different scenarios and under different pressures can flexibly add or remove the corresponding service resources at any time, and because storage is decoupled from compute, it too can be scaled out on demand, greatly improving resource utilization.

Storage-compute separation is the optimal infrastructure architecture for the multi-database era

It is clear that, at today's level of technology, the IT stack is being decoupled layer by layer, with specialized problems handled by specialized solutions, and storage-compute separation has once again become the choice of the technological trend. AWS Aurora, Huawei Cloud GaussDB, Alibaba Cloud PolarDB and others all use shared storage under a "storage-compute separation" architecture to improve the overall capabilities of the database.

When the storage-compute separation architecture is applied to database infrastructure, it brings the following value:

  • Improved reliability: the reliability of external shared storage, combined with the failover capability of the database cluster itself, removes the weakness of the integrated approach, which can rely only on the upper-layer cluster for reliability.

  • Capability reuse: mature capabilities of professional shared distributed storage, such as snapshots and clones, data verification, and sub-health detection, quickly strengthen the overall database solution.

  • Open architecture: a storage foundation built on an open ecosystem can quickly support many different types of databases, which helps database applications land smoothly.

  • Fast iteration: a software-defined storage (SDS) architecture that decouples software from hardware makes it easy to improve storage capabilities continuously through software releases, and it matches and evolves in step with the development paths of databases from different technical schools.

Yunhe Enmo zData

Yunhe Enmo's zData X platform can be configured on demand to meet the performance, reliability, and scalability requirements of databases of different sizes and to improve database management efficiency. It is suitable for scenarios such as accelerating core database performance and pooling storage resources for multiple heterogeneous databases.

Two features of the zData X architecture:

  • Storage and compute are separated, and both can be elastically scaled on demand.


  • The storage side uses the high-performance distributed storage software zStorage in place of the centralized storage of the traditional architecture. zStorage combines the high scalability and software-defined cloud capabilities of distributed storage with the low latency and rich data protection features of centralized storage, giving the database a cloud-oriented, high-performance, highly reliable, and highly scalable data foundation. Compared with both storage-compute integration and storage-compute separation built on traditional centralized storage, zData X has clear advantages.



Data drives the future, and Yunhe Enmo lives up to the trust placed in it!


Founded in 2011, Yunhe Enmo is an intelligent data technology provider with the mission of "data-driven, achieving the future". We are committed to bringing data technology to every industry, every organization, and every person, and building a data-driven intelligent future.

Yunhe Enmo provides reliable products, services, and solutions in the areas of data carrying (distributed storage, continuous data protection), management (database base software, database cloud management platform, data technology services), processing (application development quality control, data model governance, digital transformation consulting), and application (data service management platform, intelligent data analysis and processing, privacy computing). Focusing on user needs, it continues to create value for customers, unleash the potential of data, and help build an agile and efficient digital future.

