The considerations behind cloud database product and architecture design

Abstract: At the Alibaba Cloud Database Technology Summit, Xiao Shaocong (Tie'an), senior product expert for Alibaba Cloud databases, introduced the full Alibaba Cloud database product lineup and shared the architecture behind it, to help the audience understand which practical scenarios the lineup addresses and the principles by which it addresses them.

       On August 24, the Alibaba Cloud Database Technology Summit took place. The summit invited veteran database engineers from Alibaba Group and Alibaba Cloud to share front-line practical experience and in-depth technical material.

The following content is compiled from the video of the talk and the speaker's slides.

       This talk introduces how Alibaba Cloud designs the architecture of its cloud database products and the stories behind those designs. It does not go deep into the underlying technical details; the hope is that you come away understanding how to plan your use of ApsaraDB cloud databases and what Alibaba Cloud had in mind when designing the ApsaraDB products.

1. Market background of cloud database

Multi-product type mix

       The market offers a wide variety of database products. If you are still relying on a single relational database such as SQL Server or MySQL today, the scenarios your business can cover may be limited. Medium-sized and larger companies are often already using many different database products.

 

System architecture becomes more complex

       As shown in the figure above, a business usually starts with a SQL or NoSQL database. As it grows, it adds a key-value cache database, and later it may move on to a data warehouse and then to a big data system.

       Next, I will walk through how database architecture evolves as a company grows from a small business into a large enterprise, and how database operations change at each stage. As shown in the figure below, the architecture at the initial stage is relatively simple. The figure shows only three servers, meaning the company may start with just a few database servers, which could be SQL or NoSQL systems. At this stage, DBAs and managers do not need to do much architectural analysis. The DBA, however, often plays several roles, handling development and operating-system operations and maintenance in addition to managing the database.

 

       As the enterprise begins to scale, the database often can no longer run in single-node mode and has to work with clusters and other databases. For example, a company that started with only a SQL database or only a NoSQL database may now use both, along with a key-value cache database, mixing them to meet the overall needs of the business. The DBA plays a remarkable role here: besides ordinary SQL administration, the DBA has to manage the NoSQL and key-value databases, and in many cases the monitoring and maintenance of the whole system also rest entirely on the DBA. The DBA at this stage can be called the "magic DBA", because he or she has to manage everything.

 

       What happens when the business develops to the next stage? Many companies have only one project when they start out. Once the business reaches a certain scale, the number of projects gradually increases, and each project, as it matures and grows, uses its own complete database stack. At this point the DBA is no longer a single person; there are often two or three DBAs, each of whom should be able to work independently and should have complete architectural experience. This is also when the enterprise faces its biggest challenge. In their careers, technical staff want to take on more work or advance further, so a gap in expectations can open up between the enterprise and its technical staff. If the enterprise grows more slowly than its technical staff's skills, they are likely to look for other jobs. When technical experts leave, the enterprise's further development is hindered.

 

Database disaster recovery: two locations, N centers

       As the system architecture grows more complex, the enterprise runs into more than staffing problems. Once the business reaches a certain stage, business requirements and regulatory requirements often call for more elaborate data architectures, such as two locations and three centers (three data centers across two cities), and this can multiply operating costs. Building a database cluster inside a single data center is relatively easy, but an architecture like two locations and three centers also requires infrastructure such as intra-city and long-distance optical fiber. These costs stall many companies at this point.

 

       To break through this bottleneck, enterprises often need still more database architectures, such as data warehouses for OLAP and large-scale big data analysis. In that case, the cost of building the DBA team and the overall system rises sharply.

 

Alibaba Cloud database: product concept

       The previous section covered the database requirements that can arise as an enterprise grows from small to medium-sized, and on through periods of rapid growth and further expansion. Next, I will share the product concepts we hope to deliver with Alibaba Cloud's cloud databases.

 

1. First, different stages of business development call for different databases, combinations of databases, and tiers of database products. Alibaba Cloud therefore designs its database products so that each tier has a suitable offering.

2. Second, Alibaba Cloud tries to help enterprises improve their development efficiency as much as possible: enterprises can expand their resources easily, and those resources translate directly into productivity for users.

3. Third, Alibaba Cloud lowers the barrier to building the entire architecture. In a traditional enterprise architecture, moving from one data center to several, or from one server to many, runs into many technical, process, and cost barriers. Alibaba Cloud hopes to minimize these barriers through the underlying architecture of the cloud database, including the Apsara architecture.

4. Last, and the point I most want to share this time: the ultimate goal Alibaba Cloud hopes to achieve with cloud databases is to free up the DBA. In most companies DBAs do not get much attention, because their daily work is largely operations and maintenance such as deployment and backup. These tasks take up 50% to 60% of a DBA's time and bring no direct productivity to the enterprise. On a cloud database, all of this operations and maintenance work is handled by the automated management of the cloud architecture, and the DBA can spend more time optimizing the business architecture. What does that mean? For example, poorly designed table structures, SQL performance problems, and designs that have become outdated as the business evolved all need optimizing, and DBAs should be the ones doing it. DBAs are also the group most likely to grow into the enterprise's core architects. Their work should contribute to the enterprise's real production and technical capabilities rather than being consumed by tedious daily deployment and backup work.

       The above covers the market demand that we saw while designing the cloud database, and Alibaba Cloud's product design concept for it.

2. The eternal topic: demand

       Everything described so far comes down to requirements, and every requirement has to be met. So how do Alibaba Cloud's cloud databases address these requirements step by step?

Tiering: extend boundaries to cover users at different tiers

       The first step is tiering. Those who have used Alibaba Cloud databases may remember that the edition originally launched was the High-availability Edition, which is probably still the most widely used edition today. It runs two database servers, one primary and one standby, which provide very good performance and fast switchover. Under this architecture, however, the cost is effectively doubled. Many users, especially entry-level users just starting with ApsaraDB, do not need a primary/standby pair and would rather pay less. For them Alibaba Cloud launched the Basic Edition of ApsaraDB, which has a single node and lowers the cost. A natural question is how availability is guaranteed with a single node. The Basic Edition still provides an availability guarantee, but instead of relying on two nodes, the whole database runs on the Apsara architecture. If the database or its host fails, the Apsara system automatically finds a new host and a new node to run the system. The switchover takes somewhat longer, but the system will not be disconnected for an extended period.

 

       Going a step further, many advanced users, especially in finance, have higher requirements for data stability and for RPO when a failure occurs. For them Alibaba Cloud provides a Finance Edition that extends the two-node design to clusters of three or more nodes. With this work Alibaba Cloud expanded the boundaries of its cloud database products: from only a high-availability dual-node edition at the start to a single-node Basic Edition and a multi-node Finance Edition, so that users with different needs can get cloud database services of the specification they need.

Efficiency: Simplify complexity and release workload

       Once the database environment is running, users typically have promotions and campaigns in certain periods, during which query demand peaks very high. Read-only nodes can absorb these peaks. Alibaba Cloud provides read-only instances directly, so users do not need to build them themselves. If you have built your own database you may know the process: setting up a read-only instance often means writing or editing three or four configuration files, and planning user permissions, password synchronization, and more between the hosts. This is difficult and tedious for a junior DBA, who at the same time has to keep the whole system running stably as the business grows.

 

       On Alibaba Cloud, building a read-only instance is simpler, because the underlying architecture performs all the configuration and verification automatically. Users only need to click a button in the console to add a read-only instance, and they can create 5 or even 10 of them. Note, however, that although Alibaba Cloud adds the read-only instance and completes the synchronization, the business side, that is, the application on the ECS cloud server shown in the figure above, still has to separate read-write requests from read-only requests itself. This is intrusive for application development: a program originally written to work against one database now has to pick out all of its read-only queries and send them to other nodes. That intrusiveness may keep many developers from using read-only instances directly, or may require a lot of redevelopment work.
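To make the intrusiveness concrete, here is a minimal Python sketch of what application-side splitting tends to look like. It is an illustration only: the `Endpoints` class, the `load_order` and `cancel_order` functions, and the host names are hypothetical and not part of any Alibaba Cloud SDK.

```python
import itertools


class Endpoints:
    """Holds one primary address and a rotating list of read-only addresses."""

    def __init__(self, primary, read_only):
        self.primary = primary
        self._readers = itertools.cycle(read_only) if read_only else None

    def pick(self, readonly):
        # The caller, not the database, decides whether a query is read-only.
        if readonly and self._readers is not None:
            return next(self._readers)
        return self.primary


def load_order(endpoints, order_id):
    # Every read path in the application has to be edited like this one.
    target = endpoints.pick(readonly=True)
    return target, ("SELECT * FROM orders WHERE id = %s", (order_id,))


def cancel_order(endpoints, order_id):
    # Write paths keep going to the primary.
    target = endpoints.pick(readonly=False)
    return target, ("UPDATE orders SET status = 'cancelled' WHERE id = %s", (order_id,))


if __name__ == "__main__":
    eps = Endpoints("primary.example.internal",
                    ["ro-1.example.internal", "ro-2.example.internal"])
    print(load_order(eps, 1)[0])
    print(cancel_order(eps, 1)[0])
```

The point of the sketch is the `readonly=True` flag: each call site in the program has to know which kind of query it issues, which is exactly the redevelopment work described above.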

Efficiency: Simplify complexity, release workload, and directly support read-write separation

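       In response to this problem, and to the requests and reports it received from users, Alibaba Cloud optimized its databases further. On top of read-only instances, ApsaraDB also provides a read/write splitting access address. The Proxy layer collects all SQL and classifies it. If a transaction contains both reads and writes, its operations are automatically submitted to the primary node. If every operation in a transaction is a read, the Proxy layer distributes those read-only queries evenly across the read-only nodes. The application does not need to change its own code: Alibaba Cloud performs the read/write splitting automatically. This feature greatly reduces development and maintenance work.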
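The routing rule just described can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the actual Proxy implementation: statements are classified by their leading keyword, `SELECT ... FOR UPDATE` is treated as a write, "evenly" is modeled as round-robin, and the names `ProxyRouter` and `is_read_only` are hypothetical.

```python
import itertools

READ_VERBS = {"SELECT", "SHOW"}


def is_read_only(statements):
    """Return True when every statement in the transaction is a plain read."""
    for sql in statements:
        parts = sql.split(None, 1)
        if not parts or parts[0].upper() not in READ_VERBS:
            return False
        if "FOR UPDATE" in sql.upper():
            # Locking reads must run on the primary.
            return False
    return True


class ProxyRouter:
    """Sends mixed or write transactions to the primary, pure reads to replicas."""

    def __init__(self, primary, read_only_nodes):
        self.primary = primary
        self._readers = itertools.cycle(read_only_nodes) if read_only_nodes else None

    def route(self, statements):
        if self._readers is not None and statements and is_read_only(statements):
            # Round-robin keeps the read load roughly even across replicas.
            return next(self._readers)
        return self.primary


if __name__ == "__main__":
    proxy = ProxyRouter("primary", ["ro1", "ro2"])
    print(proxy.route(["SELECT * FROM t", "UPDATE t SET a = 1"]))
    print(proxy.route(["SELECT 1"]))
```

Because the decision is made per transaction behind a single address, the application keeps one connection target and its code stays unchanged.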

 

Efficiency: the evolution of a new generation of relational databases

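       Alibaba Cloud's database work is far from finished. If you have followed Alibaba Cloud's recent news or product announcements, you will know that the latest lineup includes clusters of the relational database PolarDB. The product is expected to launch in October. It not only solves read/write splitting but also uses the latest hardware technology to achieve a better ratio of read and write resources. With read/write splitting, when data is written to the primary node, all of it has to be synchronized to every read-only node. Network latency between the primary node and the read-only nodes may cause data read from the primary to differ from data read from a read-only node, and the business side or the developers need to be aware of this and handle it in the application.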
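The stale-read effect of replication delay can be reproduced with a toy model. The sketch below is an assumption-laden illustration in Python, not how any real engine replicates: the primary keeps an ordered change log, and a read-only node only sees changes it has replayed. The class names `Primary` and `ReadOnlyNode` are hypothetical.

```python
class Primary:
    """Accepts writes and records them in an ordered change log."""

    def __init__(self):
        self.data = {}
        self.log = []  # ordered (key, value) changes

    def write(self, key, value):
        self.data[key] = value
        self.log.append((key, value))


class ReadOnlyNode:
    """Replays the primary's log; anything not yet replayed is invisible here."""

    def __init__(self):
        self.data = {}
        self.applied = 0  # number of log entries replayed so far

    def catch_up(self, primary, max_entries=None):
        pending = primary.log[self.applied:]
        if max_entries is not None:
            pending = pending[:max_entries]
        for key, value in pending:
            self.data[key] = value
        self.applied += len(pending)

    def lag(self, primary):
        # Number of changes the primary has that this node has not applied.
        return len(primary.log) - self.applied


if __name__ == "__main__":
    primary, replica = Primary(), ReadOnlyNode()
    primary.write("stock", 10)
    replica.catch_up(primary)
    primary.write("stock", 9)  # the replica has not replayed this yet
    print(primary.data["stock"], replica.data["stock"], replica.lag(primary))
```

Until `catch_up` runs again, a read on the replica returns the old value, which is the inconsistency the application has to tolerate or work around.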

 

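       PolarDB tries a new architecture. The underlying storage nodes are integrated and managed over an RDMA network, and a distributed Raft protocol turns them into a complete underlying cluster. As a result, when the primary node writes data, the Raft-based storage cluster automatically spreads the data across three or more storage nodes, and once the data is written it can be read on the other read-only nodes. This architecture reduces data replication between nodes and consumes less network bandwidth, and the data latency between the primary node and the read-only nodes is essentially zero: as soon as data is written, the read-only nodes can read it, in keeping with ACID. All data is stored on no fewer than three nodes, so the failure of any single node or data module does not cause data loss. This architecture gives database users a better price-performance ratio.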
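The reason three copies tolerate one failure is the majority rule that Raft-style replication relies on. The following Python sketch shows only that rule; it is not Raft (no leader election, terms, or log repair) and not PolarDB's storage code, and the names `StorageNode` and `replicate` are hypothetical.

```python
class StorageNode:
    """A storage replica that can be up or down."""

    def __init__(self, name):
        self.name = name
        self.up = True
        self.entries = []

    def append(self, entry):
        if not self.up:
            return False
        self.entries.append(entry)
        return True


def replicate(nodes, entry):
    """Return True when a majority of nodes stored the entry.

    A real protocol would also clean up entries left on a minority of
    nodes after a failed write; this sketch leaves them in place.
    """
    acks = sum(1 for node in nodes if node.append(entry))
    return acks >= len(nodes) // 2 + 1


if __name__ == "__main__":
    nodes = [StorageNode(n) for n in ("a", "b", "c")]
    print(replicate(nodes, "w1"))
    nodes[0].up = False
    print(replicate(nodes, "w2"))
```

With three nodes the majority is two, so a write still succeeds with one node down, and every acknowledged write exists on at least two nodes.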

Barrier: the high bar for managing a combined system

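       The above covered Alibaba Cloud's thinking and products in the evolution of relational databases. Now for a different problem. As shown in the figure below, once a business has accumulated a large volume of data, that data also needs to be analyzed. For example, after two or three years of data have been stored, the business owner will want to use it for business analysis. In a traditional architecture, as in the figure, the data is moved through ETL into a data warehouse, and OLAP analysis is done on top of the warehouse. And because the data volume keeps growing, distributed database middleware is also needed to manage the database as sharded databases and tables. Such features are already widely used in the Internet industry, but note the four blue management markers in the figure below: every layer requires separate manual operation and intervention on the database.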
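For readers unfamiliar with sharding middleware, the core of it is a routing function from a sharding key to a physical database and table. The Python sketch below shows one common scheme (hash, then modulo); the shard counts, the `db_N.table_M` naming, and the function names are assumptions made for illustration and are not taken from any particular middleware.

```python
import zlib


def shard_for(key, db_count, tables_per_db):
    """Map a sharding key to (database index, table index)."""
    # crc32 gives a hash that is stable across processes and restarts.
    slot = zlib.crc32(str(key).encode("utf-8")) % (db_count * tables_per_db)
    return slot // tables_per_db, slot % tables_per_db


def physical_table(logical_table, key, db_count=4, tables_per_db=8):
    """Name of the physical table that holds the row for this key."""
    db, tbl = shard_for(key, db_count, tables_per_db)
    return f"db_{db}.{logical_table}_{tbl}"


if __name__ == "__main__":
    for user_id in (1001, 1002, 1003):
        print(user_id, physical_table("orders", user_id))
```

Someone has to choose the key, the shard counts, and the naming, and keep them consistent across every layer, which is part of the manual planning the figure's management markers stand for.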

 

Simplified sharding management: one copy of the data for OLTP + OLAP = HTAP

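       In the workflow in the figure above, every node has to be configured and planned, and all of that takes time. In addition, the business system is an OLTP system: all online business runs in the OLTP system on the left of the figure. Analysis cannot be run directly on the business system because it would hurt its performance, so the data has to be extracted once more into the OLAP data warehouse and queried there. This adds redundant copies of the data and rules out real-time "T+0" analysis, and all the operations and maintenance consume more resources. Alibaba Cloud therefore also offers HybridDB for MySQL. A HybridDB for MySQL database consolidates the entire data path shown in the figure, including the distributed database middleware and the data warehouse, into a single database that handles OLTP transactions directly and also accepts OLAP analytical processing. The whole system is also a distributed system.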

 

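       The biggest benefit of this system is that users no longer need to manage two separate systems, one for OLTP and one for OLAP. It also separates compute from storage: if compute resources run short, compute can be added on its own to speed up queries. The system is directly compatible with the MySQL ecosystem, so users do not need to change much of their query logic and can connect with MySQL clients and tools. HybridDB for MySQL thus realizes a new form called HTAP: the same database handles both OLTP and OLAP workloads, and its storage can scale beyond the petabyte level.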

Release: peace of mind comes from transparency and proactive alerts

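       Beyond the features above, keeping DBAs free of worry depends on monitoring of resources and of the database engine. I will not go into detail here, because you can see it in the products: Alibaba Cloud has packaged its experience from Tmall, Taobao, and elsewhere, and provides in-depth monitoring of TPS, QPS, cache hit rate, and more, with charts generated directly. For alerting, you can configure the alarms you need in CloudMonitor; when a metric exceeds the configured level, it can send SMS, email, or even phone alerts to prompt the DBA to scale up or to catch a problem in time. Going further, Alibaba Cloud will also provide a cloud DBA assistant tool, and even services such as index recommendations and analysis of alarms and errors.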
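As a rough illustration of the metrics and threshold alarms mentioned here, the Python sketch below computes QPS and cache hit rate from raw counters and checks them against limits. The metric names, the threshold values, and the `check_alerts` function are hypothetical; this is not the CloudMonitor API.

```python
def qps(queries, seconds):
    """Queries per second over a sampling window."""
    return queries / seconds


def cache_hit_rate(hits, misses):
    """Fraction of lookups served from cache; 0.0 when there were none."""
    total = hits + misses
    return hits / total if total else 0.0


def check_alerts(metrics, thresholds):
    """Return the names of metrics that crossed their threshold.

    thresholds maps a metric name to (direction, limit), where direction
    is "above" or "below".
    """
    fired = []
    for name, (direction, limit) in thresholds.items():
        value = metrics.get(name)
        if value is None:
            continue
        if direction == "above" and value > limit:
            fired.append(name)
        elif direction == "below" and value < limit:
            fired.append(name)
    return fired


if __name__ == "__main__":
    metrics = {
        "qps": qps(12000, 60),
        "cache_hit_rate": cache_hit_rate(900, 100),
        "disk_used_pct": 91.0,
    }
    thresholds = {
        "qps": ("above", 500.0),
        "cache_hit_rate": ("below", 0.95),
        "disk_used_pct": ("above", 85.0),
    }
    print(check_alerts(metrics, thresholds))
```

In a managed service the notification step (SMS, email, phone) sits behind such a check; the sketch stops at deciding which alarms would fire.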

 

Releasing user costs: small and medium-sized enterprises can also get high-end services

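       As an enterprise grows, it needs to protect business continuity and stability better. Many enterprises have only one IDC data center, and if that data center loses power or fails, the business cannot continue. Alibaba Cloud's database system has already implemented the overall architecture shown in the figure below. On the purchase page for Alibaba Cloud database products, you will find not only the single-center dual-node model but also, in many regions, a multi-zone model. In the multi-zone model, the primary and standby nodes are placed in two different zones of the same city. The user buys only one instance, but it is deployed across two centers in one city; if the primary center fails as a whole, the user's business continues to be served after switching to the standby center. Without a cloud architecture, building a multi-zone model yourself could be very costly, because laying fiber between sites in the same city is very expensive.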
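The multi-zone switchover can be pictured as a role swap behind one instance. The Python sketch below is a simplified model under assumptions: the class `MultiZoneInstance`, the zone names, and the idea that the application keeps a single unchanged endpoint are illustrative choices, not a description of Alibaba Cloud's actual failover mechanism.

```python
class MultiZoneInstance:
    """One purchased instance whose primary and standby live in two zones."""

    def __init__(self, endpoint, primary_zone, standby_zone):
        self.endpoint = endpoint  # assumed to stay the same across a switchover
        self.active_zone = primary_zone
        self.standby_zone = standby_zone

    def on_health_report(self, zone_health):
        """Swap roles when the active zone is down and the standby is healthy.

        zone_health maps a zone name to True (healthy) or False (down).
        Returns True when a switchover happened.
        """
        active_ok = zone_health.get(self.active_zone, False)
        standby_ok = zone_health.get(self.standby_zone, False)
        if not active_ok and standby_ok:
            self.active_zone, self.standby_zone = self.standby_zone, self.active_zone
            return True
        return False


if __name__ == "__main__":
    inst = MultiZoneInstance("db.example.internal", "zone-a", "zone-b")
    inst.on_health_report({"zone-a": False, "zone-b": True})
    print(inst.endpoint, inst.active_zone)
```

The user-facing unit stays a single instance; which data center serves it is a detail the platform manages.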

 

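       In addition, Alibaba Cloud provides enterprises with cross-data-center access. Many users may say their company is not yet big enough to need this kind of protection, but that view is often mistaken. As shown in the figure above, intra-city dual-center disaster recovery currently adds no cost for the user, so an enterprise can adopt this architecture from the very beginning, because any technical failure or service outage during its growth tends to cause greater losses later. Being able to achieve intra-city disaster recovery and cross-region operation early, without much added cost, can greatly help the business. Alibaba Cloud's own technical architecture frees users from this cost: users do not need to lay their own fiber and can reuse the existing network, so that even at the initial stage an enterprise can use all the high-end database services that large enterprises use.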

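       As you can see, Alibaba Cloud's database business focuses on the five points shown in the figure below: helping users save costs, achieving higher database performance, maintaining business continuity, business scalability, and data disaster recovery. All of these capabilities are planned and enabled through the cloud hosting platform.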

 

3. The power of the ecosystem

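       The sections above covered what drives cloud database products, what protection Alibaba Cloud databases provide, and how they meet user needs. Next, I will share the power of the database ecosystem.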

Alibaba Cloud MySQL ecosystem

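       When Alibaba Cloud first planned its database products, the first product it built was MySQL, because MySQL was the most widely used database at the time, especially in the Internet industry. As more Internet businesses and enterprise customers moved onto Alibaba Cloud, more database types were needed to support them. Many users need not only a SQL database but also caching and data analysis. In the figure below you can see Alibaba Cloud gradually adding more engines. According to DB-Engines statistics, Alibaba Cloud databases can now cover 70% of database products, which is also driven by the market. Some of these products are developed in-house by Alibaba Cloud, such as the previously mentioned HybridDB for MySQL and PolarDB. Beyond its in-house products, Alibaba Cloud also takes other users' needs into account and offers database products such as SQL Server, HBase, MongoDB, and Redis. This is the overall ecosystem of Alibaba Cloud database products today.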

 

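       There is also another ecosystem model. For example, many users today use MySQL. Around MySQL, Alibaba Cloud provides several database forms and modes so that users can stay entirely within the MySQL ecosystem. As shown on the left of the figure below, RDS for MySQL comes in Basic, High-availability, and Finance editions so that users can get started quickly. PolarDB, which Alibaba Cloud will release in the future, will also support the MySQL engine first, serving large customers who have tens or hundreds of terabytes of data and want a more stable database system. Many people also believe MySQL is not suited to OLAP analysis. But if all your developers know MySQL, do not want to leave the MySQL framework, and want to do business analysis on MySQL, HybridDB for MySQL can carry that workload, and it can combine OLTP and OLAP in the same system. In planning its database products, Alibaba Cloud therefore fully considers users' attachment to a particular database: when users strongly prefer one database, Alibaba Cloud builds a series of products around it, so that users can do all their technical work with one technology rather than spreading it across different databases and redeveloping against different SQL models.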

 

Alibaba Cloud PostgreSQL ecosystem

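       Besides MySQL, the PostgreSQL ecosystem has also become very popular in recent years, and the Alibaba Cloud database team follows the same approach there. Alibaba Cloud does not offer only a standalone RDS for PostgreSQL. Because PostgreSQL is fairly similar to Oracle, it also offers RDS for PPAS, an enhanced edition based on PostgreSQL, to help Oracle users migrate their data. Alibaba Cloud will also launch HybridDB for PostgreSQL, aimed at data warehousing, for data analysis. All of these systems can operate on OSS through external tables: a single copy of the data can be placed on OSS, and the various OLTP and OLAP database products can read, write, and analyze it, which keeps the whole ecosystem working together.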

 

4. SQL + NoSQL + Big Data: one-stop data integration

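       In addition to the solutions Alibaba Cloud has built separately for the MySQL and PostgreSQL ecosystems, Alibaba Cloud databases are also integrated with Alibaba Cloud's data pipeline software. As the figure below shows, data tools such as DTS and CDP connect the front-end key-value cache layer, OLTP, NoSQL, analytics, and big data. With these tools plus some development around the business architecture model, data on the cloud can flow seamlessly between data products and be exchanged across them. Once the data has been exchanged, each data system can deliver more of its business value.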

 

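       Today databases are flourishing: there are many engines and many different approaches. Alibaba Cloud's cloud databases still hold to these product concepts. Alibaba Cloud provides database products at different tiers so that users can meet different needs and buy products at different price, reliability, and performance points. Alibaba Cloud hopes that integration through the cloud platform lets users build databases as quickly as possible: rather than waiting weeks or even a month of planning for an architecture change, users should be able to click a few buttons to get a new database or build a new cluster that connects seamlessly to the existing ones. On cost, because the cloud infrastructure is already in place, users do not need to build expensive fiber links, racks, or other hardware themselves and can go straight to production instances. A cloud database that a user buys actually represents many things, including software, machines, racks, and network, all of which Alibaba Cloud has already set up. Users can buy one month, two months, or a year of usage according to their needs, without paying the full cost up front. Finally, Alibaba Cloud hopes that automated deployment, management, and monitoring will relieve DBAs of their workload, freeing them from operations and maintenance tasks such as deployment and management, so they can put more time and energy into optimizing the business and using technology to create more core productivity for the enterprise.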

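This article is original content from the Yunqi Community and may not be reproduced without permission. To request permission to reproduce it, send an email to [email protected]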

