DTCC 2020 | Alibaba Cloud's Zhang Xin: Alibaba Cloud's Cloud-Native Geo-Distributed Multi-Active Solution

Introduction: Geo-distributed multi-active, as the name implies, means serving traffic from multiple sites in different locations at the same time. The main difference from traditional disaster recovery is that every site in a multi-active deployment serves traffic simultaneously. As business systems grow more complex and disaster recovery requirements grow stricter, implementing a cloud-native geo-distributed multi-active solution has become a challenge that medium and large enterprises must face. At the 11th China Database Technology Conference (DTCC 2020), Alibaba Cloud senior database expert Zhang Xin presented Alibaba Cloud's cloud-native geo-distributed multi-active solution.

This article is compiled from the recording and slides of the talk.

Speaker introduction:

Zhang Xin (nickname: Six Gold) is a senior database expert at Alibaba Cloud. He previously worked as a DBA supporting Alibaba's internal core systems, including transactions and advertising. In the past two years he has moved to the proprietary cloud market, providing database solutions to large government and enterprise customers.

This talk covers three topics:

  1. Analysis of disaster recovery architectures
  2. Alibaba Cloud's multi-active solution
  3. Customer cases

1. Analysis of disaster recovery architectures

The necessity of disaster recovery

Geo-distributed multi-active grew out of disaster recovery, so we start with why disaster recovery is necessary. A production system can encounter three types of failure. The first is host-level failure, such as an overloaded single node or data corruption. The second is data-center-level failure, such as a power outage or a network failure in the data center. The third is region-level failure, such as a natural disaster. Of the three, region-level failures have the greatest impact but the lowest probability, while host-level failures have a smaller impact but are not necessarily rare. Reviewing its own failures over the years, Alibaba found that as business systems become more complex, a single point of failure can also have a global impact, and beyond a certain level of complexity, diagnosing and recovering from such a failure becomes very difficult. Disaster recovery has therefore become a required part of enterprise IT.

Disaster recovery industry analysis

From an industry perspective, the disaster recovery market is substantial. According to industry reports, the global disaster recovery market will reach 11.59 billion U.S. dollars in 2020, and the customer base is very broad: government, finance, energy, Internet, telecommunications, and more. Essentially, wherever there is an information system, there is a need for disaster recovery. Alibaba Cloud currently has 100,000 enterprise users and 400,000 database instances, all of which need disaster recovery guarantees. There are also strict compliance requirements at the national level: large government and enterprise customers in particular are now required to build disaster recovery in accordance with GB/T 20988, the "Information System Disaster Recovery Specification".

Evolution of disaster recovery architecture

Disaster recovery architectures have evolved through several stages. Intra-city disaster recovery is the simplest: the business runs in one IDC, a backup system and database are deployed in a second data center in the same city, data is replicated between them asynchronously or synchronously, and all business traffic goes to one side while the other serves only as a standby. This later evolved into intra-city active-active, which takes advantage of the short distance and low network latency between two data centers in the same city: because the latency is acceptable, services can be deployed at both ends. After that came the cross-region stage, namely "two sites, three centers" and its derivative "two sites, four centers". These add a disaster recovery center on top of intra-city active-active. The disaster recovery center receives no traffic under normal conditions and takes over only when a region-level failure occurs.

Traditional disaster recovery solutions

To recap the traditional solutions: intra-city disaster recovery and intra-city active-active are simple to deploy and very cheap to adopt, but they protect only within one city and reach only level 1 of GB/T 20988, so large customers cannot choose them. Remote cold standby is also simple to deploy and minimally intrusive to the business, and because it is deployed remotely its disaster recovery capability is higher, reaching levels 2 to 5. Its disadvantages are that the redundant cold standby unit is costly and wastes resources, and that because the standby unit receives no traffic all year round, nobody knows whether it will actually work when a failure requires a switch. "Two sites, three centers" combines intra-city active-active with remote cold standby. It has the advantages of both, but the cold standby center is still a wasted cost, and operators still do not dare to switch over when a region-level failure occurs.

2. Alibaba Cloud's multi-active solution

Alibaba Cloud multi-active architecture

The slide for this part shows the overall architecture of Alibaba Cloud's geo-distributed multi-active solution. In essence, geo-distributed multi-active is achieved by isolating traffic top-down through the service stack. Alibaba Cloud divides the architecture into three layers. The first is the access layer. To go multi-active, you first define a traffic splitting strategy for the business, for example by region or by user. Once the strategy is defined, the access layer performs the split: traffic that belongs to the local unit is passed down transparently, and traffic that does not is forwarded to the correct unit. The second is the service layer, the business systems that serve external requests. Depending on the capability they provide, services are classified as unitized, centralized, or ordinary. The third is the data layer, which must provide bidirectional cross-region synchronization with loop prevention and must guarantee data quality during a traffic switch.
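
The access-layer behavior described above can be sketched as follows. This is only an illustration, not MSFE's actual logic: the modulo-based splitting rule, the unit names, and the function names are all assumptions made for the example.

```python
from dataclasses import dataclass

# Hypothetical units and splitting rule (user ID modulo the number of units).
UNITS = ["unit-a", "unit-b"]


def unit_for_user(user_id: int) -> str:
    """Return the unit that owns this user under the assumed splitting rule."""
    return UNITS[user_id % len(UNITS)]


@dataclass
class Decision:
    action: str  # "pass_through" or "forward"
    target: str  # unit that should serve the request


def route(local_unit: str, user_id: int) -> Decision:
    """Serve locally if the user belongs to this unit, otherwise forward
    the request to the unit that owns the user."""
    owner = unit_for_user(user_id)
    if owner == local_unit:
        return Decision("pass_through", owner)
    return Decision("forward", owner)
```

For example, a request for user 7 that arrives at `unit-a` would be forwarded to `unit-b`, while a request for user 42 would be passed down within `unit-a`.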

Alibaba Cloud has refined the multi-active architecture for two business scenarios, OLTP and OLAP, which are introduced in turn below.

Multi-active architecture for OLTP workloads

For OLTP workloads, Alibaba Cloud provides a multi-active architecture with several key elements. First, multi-active configuration: MSHA provides one-stop configuration, defining the traffic splitting strategy and deciding which databases go multi-active. Second, multi-active traffic control: MSFE applies the established rules and is responsible for identifying, distributing, and correcting traffic. Third, multi-active data synchronization: this is done mainly through DTS, a data synchronization tool that has gained many features for multi-active scenarios, such as loop prevention, network optimization, and coordination with traffic switches. Fourth, multi-active disaster recovery switching: this is also implemented by MSHA, which pushes the rules down to every layer and performs a global state check before the switch. Fifth, operations in multi-active scenarios: this is handled by DMS. In a multi-active setup, DDL changes and data maintenance face double-write problems and synchronization delay, so DDL and DML changes need different strategies, and DMS has been adapted to multi-active scenarios accordingly.

Multi-active architecture for OLAP workloads

The OLAP multi-active architecture differs little from the OLTP one, and the elements are basically the same. The only difference is that the OLTP architecture implements bidirectional data synchronization at the bottom layer, which is not recommended for OLAP. There are two main reasons. First, cross-region synchronization bandwidth is expensive: if the data has already been synchronized at the OLTP layer, it is better to propagate it to OLAP within each cloud than to synchronize again across regions at the OLAP layer. Second, consistency: if data is synchronized once at the OLTP layer and again at the OLAP layer, consistency becomes harder to guarantee. Alibaba Cloud therefore recommends doing all cross-region synchronization at the OLTP layer and filling in the OLAP data within the cloud.

Typical active-active architecture: two regions, four AZs

A typical active-active architecture spans two regions, each with two AZs. It provides AZ-level disaster recovery first, and region-level disaster recovery takes over if a whole region fails. Under this architecture, MSFE and the business systems are deployed across AZs, giving AZ-level high availability within each cloud. The databases are deployed in primary/standby mode across AZ1 and AZ2 and across AZ3 and AZ4, with bidirectional synchronization through DTS underneath. The data is stored in four copies and total business redundancy is 200%: each AZ is provisioned for 50% of the traffic but carries only 25% in normal operation. How the business itself is deployed is up to the user. For unplanned switches, minute-level RTO can be achieved.
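
The capacity figures in this paragraph can be checked with a few lines of arithmetic. The sketch below only restates the numbers given above (four AZs, each provisioned for 50% of total traffic). The function name and the dictionary keys are made up for the example, and reading the per-region sum as "one region can carry all traffic" is an interpretation, not a statement from the talk.

```python
def az_capacity_plan(regions: int = 2, azs_per_region: int = 2,
                     per_az_capacity: float = 0.5) -> dict:
    """Return capacity figures as fractions of total business traffic."""
    az_count = regions * azs_per_region
    return {
        # 4 AZs * 50% each = 200% total provisioned capacity
        "total_provisioned": az_count * per_az_capacity,
        # traffic spread evenly over 4 AZs = 25% per AZ in normal operation
        "normal_load_per_az": 1.0 / az_count,
        # 2 AZs * 50% = 100%, so one region alone could carry all traffic
        "capacity_per_region": azs_per_region * per_az_capacity,
    }
```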

Service types in a multi-active architecture

As mentioned earlier, the service layer has three types of service. The first is the unitized service, the main service type in a multi-active architecture. Updating a Taobao buyer's profile is a typical example: traffic is split by buyer ID, and along that dimension all calls can be completed within the unit without depending on data in the peer unit. The underlying data synchronization only has to ensure that the peer's data is complete at switch time, so the business can run immediately after a switch. The second is the centralized service, for global configuration or for business with strong centralized read/write requirements, such as inventory deduction: the same inventory must not be deducted in several places at once, so these requests always go to the central database, and the data is replicated one way underneath. This type of service offers disaster recovery rather than multi-active capability. The third is the ordinary service. When the business is split along one dimension, some coupled peripheral services cannot be split along the same dimension; for example, Taobao transactions are split by buyer ID, so seller data cannot be split that way. Such services can be deployed as ordinary services. They tolerate synchronization delay, that is, eventual consistency, but not access latency, so they mainly serve reads and are not recommended for write scenarios.

Cross-cloud data synchronization

These three service types use different synchronization methods underneath, so there are two cross-cloud synchronization types. The first is COPY, mainly for centralized and ordinary services: data is synchronized one way, and the unit side is read-only. Synchronization tasks are configured through a whitelist plus DDL release. The second is UNIT, mainly for unitized and ordinary services: data is synchronized in both directions and every unit can read and write. In this mode, loops must be prevented, for example with a transaction table, and conflicts must be avoided with a global sequence.
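
The difference between the two synchronization types can be summarized in a small lookup. This is a sketch of the properties listed above, not DTS configuration syntax; the enum, the property names, and the `can_write` helper are hypothetical.

```python
from enum import Enum


class SyncType(Enum):
    COPY = "copy"  # one-way synchronization, unit side is read-only
    UNIT = "unit"  # two-way synchronization, every unit reads and writes


# Properties of each type as described in the text.
SYNC_PROPERTIES = {
    SyncType.COPY: {"bidirectional": False, "unit_writable": False,
                    "needs_loop_prevention": False,
                    "needs_global_sequence": False},
    SyncType.UNIT: {"bidirectional": True, "unit_writable": True,
                    "needs_loop_prevention": True,
                    "needs_global_sequence": True},
}


def can_write(sync_type: SyncType, is_center: bool) -> bool:
    """Return whether a write is allowed on this side of the sync link."""
    if is_center:
        return True
    return SYNC_PROPERTIES[sync_type]["unit_writable"]
```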

Loop prevention and Sequence

Alibaba Cloud PolarDB and RDS databases implement both loop prevention and sequences. There are two main approaches to loop prevention. The first is the transaction table. When the business writes to the database, the transaction commits and a binlog entry is generated. DTS pulls and parses the binlog, and when it writes to the database in the target unit it also inserts a custom record into a transaction table, so the transaction that lands in the target unit contains one small extra event in addition to the original business logic. When the DTS instance on the target side parses the binlog, it finds this extra operation, knows the transaction came from DTS rather than from the business system, and filters it out, which prevents the data from looping. The second approach uses THREAD_ID, an optimization customized in the AliSQL kernel. The THREAD_ID of the native MySQL kernel is changed from 8 bytes to 5 bytes, so connections from the business can only receive IDs between 0x00000 and 0xFFFFF, and the high bits are reserved for DTS connections, which lets the central database distinguish the two kinds of connection. The binlog records every THREAD_ID, so DTS can tell from the parsed log whether an operation came from the business or from DTS: operations from the business are synchronized, and operations from DTS are dropped, which prevents loops. The first approach is somewhat intrusive to the business; the second is a fully native capability with little impact on users or on the kernel.
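
Both loop-prevention checks can be sketched as filters over parsed transactions. This is an illustration only: the `Transaction` shape and the marker table name are hypothetical, and the thread ID boundary simply reuses the 0xFFFFF figure quoted above. A real deployment would use one of the two methods, not both; they are combined here only to show them side by side.

```python
from dataclasses import dataclass, field

# Per the text, business connections stay within 0x00000..0xFFFFF.
BUSINESS_THREAD_ID_MAX = 0xFFFFF
# Hypothetical name for the transaction table written by the sync tool.
MARKER_TABLE = "dts_tx_marker"


@dataclass
class Transaction:
    thread_id: int
    tables: list = field(default_factory=list)  # tables touched by the transaction


def from_dts_by_marker(tx: Transaction) -> bool:
    """Method 1: a synchronized transaction carries an extra marker-table write."""
    return MARKER_TABLE in tx.tables


def from_dts_by_thread_id(tx: Transaction) -> bool:
    """Method 2: thread IDs above the business range belong to DTS connections."""
    return tx.thread_id > BUSINESS_THREAD_ID_MAX


def should_replicate(tx: Transaction) -> bool:
    """Forward business-originated transactions and drop DTS-originated ones."""
    return not (from_dts_by_marker(tx) or from_dts_by_thread_id(tx))
```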

The Sequence feature addresses the case where data is written on both sides at the same time and must not conflict. Alibaba Cloud built a globally unique sequence capability into PolarDB-X, extending the native DDL with options that specify the number of units and the index of each unit. For a table created this way with an internal step of 100,000 and two units, for example, the resulting ID allocation is shown on the slide, and this is how the global sequence is achieved.
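
One way such a sequence could hand out non-overlapping IDs is to interleave blocks of `step` values between units. The sketch below assumes that scheme; the article does not spell out the exact allocation, so the block layout, the starting value of 1, and the function names are assumptions rather than PolarDB-X's documented behavior.

```python
from itertools import count, islice


def unit_sequence(unit_index: int, unit_count: int = 2, step: int = 100_000):
    """Yield IDs for one unit, one block of `step` values at a time."""
    if not 0 <= unit_index < unit_count:
        raise ValueError("unit_index must be in [0, unit_count)")
    for block in count():
        # Blocks are interleaved: unit 0 gets block 0, unit 1 gets block 1,
        # then unit 0 gets block 2, and so on.
        start = (block * unit_count + unit_index) * step + 1
        yield from range(start, start + step)


def first_ids(unit_index: int, n: int, unit_count: int = 2,
              step: int = 100_000) -> list:
    """Return the first n IDs a unit would hand out."""
    return list(islice(unit_sequence(unit_index, unit_count, step), n))
```

Under these assumptions, with a step of 100,000 and two units, unit 0 starts at 1 and unit 1 starts at 100,001, and the two units never produce the same ID.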

Data protection in multi-active scenarios

In a multi-active scenario, the biggest difference from a single-site deployment is that availability is no longer the main concern; data quality is. Data quality problems rarely arise in a single data center, but a multi-active deployment accepts writes in more than one place, so conflicts arise easily. Ultimately, all of these data quality problems come from double writes, so protection measures are needed for them. Alibaba Cloud has defined unit protection along three dimensions. The first is the steady state, with protection at the access, application, and data layers: every write is checked against the multi-active splitting rules. Traffic that does not belong to the local unit is forwarded at the access and application layers, but at the data layer it is blocked outright. The second is the change state, covering data maintenance changes such as batch data corrections. Alibaba Cloud provides checks before the change and remediation afterwards. DMS pre-checks data change tasks in multi-active scenarios and blocks them if the synchronization delay is too large, which reduces the chance of double writes, and data is kept consistent through verification before and after the change. The third is the switch state, the protection applied while traffic is being switched, which includes absolute write prohibition, delayed write prohibition, before-image matching during synchronization, and delay checks.
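
The steady-state and change-state checks can be sketched as two small guards. The layer names follow the text; the function names, return values, and the delay threshold are hypothetical and only illustrate the decision logic.

```python
# Hypothetical threshold; a real limit would be configured per deployment.
MAX_SYNC_DELAY_SECONDS = 5.0


def guard_write(layer: str, owner_unit: str, local_unit: str) -> str:
    """Steady-state check: validate a write against the splitting rules.

    Returns "accept" when the write belongs to the local unit, "forward"
    when the access or application layer sees a misrouted write, and
    "block" when a misrouted write reaches the data layer.
    """
    if layer not in ("access", "application", "data"):
        raise ValueError(f"unknown layer: {layer}")
    if owner_unit == local_unit:
        return "accept"
    return "block" if layer == "data" else "forward"


def precheck_change(sync_delay_seconds: float) -> bool:
    """Change-state check: allow a data change only while sync delay is small."""
    return sync_delay_seconds <= MAX_SYNC_DELAY_SECONDS
```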

Multi-active traffic switch process

When traffic is switched, before-image matching is enabled first. In a multi-active scenario, data written by the business is generally considered more important than data arriving through synchronization, so business writes must not be overwritten by synchronized data. If synchronization is lagging during the switch, the before image is therefore taken from the binlog and included in the SQL that is executed, so that business data is not overwritten. With before-image matching on, the new traffic splitting rules are pushed to every layer. Once the rules are issued, absolute write prohibition is enabled, and during this phase no traffic from the users involved in the switch can be executed. During write prohibition, the system first checks that the rules have converged at all three layers, and then that they have converged on every node within each layer. The goal is for every server to hold the same rules so that double writes cannot occur. When these conditions are met, absolute write prohibition is lifted and delayed write prohibition, which is user-configurable, takes effect. When data synchronization has caught up, write prohibition and before-image matching are lifted, and the switch is complete.
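
Before-image matching can be illustrated with a conditional UPDATE: the synchronized change is applied only if the target row still looks the way it did before the change on the source side. The sketch below uses an in-memory SQLite database as a stand-in for the target unit. The table, column, and function names are hypothetical, and this is a simplified reading of the technique rather than DTS's actual implementation.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Create an in-memory table standing in for the target unit's database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, nickname TEXT)")
    conn.execute("INSERT INTO account (id, nickname) VALUES (1, 'old')")
    conn.commit()
    return conn


def apply_synced_update(conn: sqlite3.Connection, row_id: int,
                        before: str, after: str) -> bool:
    """Apply a synchronized UPDATE only if the row matches the before image.

    Returns True if the row was updated, False if it was skipped because a
    local business write had already changed it.
    """
    cur = conn.execute(
        "UPDATE account SET nickname = ? WHERE id = ? AND nickname = ?",
        (after, row_id, before),
    )
    conn.commit()
    return cur.rowcount == 1
```

If the business has already changed the row locally, the before image no longer matches, the statement affects zero rows, and the business write is preserved.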

Summary of the value of geo-distributed multi-active

To summarize the value of geo-distributed multi-active: first, multi-active is itself a form of disaster recovery, but unlike traditional disaster recovery it does not set aside a dedicated standby unit. The business itself provides the disaster recovery, and the business system and the disaster recovery system are tightly integrated. Second, it guarantees business continuity and gives the business high availability. Third, it supports rapid business growth: a multi-active deployment is divided into many atomic units, resources can be allocated sensibly per unit, and the system ultimately gains the ability to scale horizontally across regions. Fourth, traffic is effectively isolated. With Alibaba Cloud's solution, traffic can be allocated very flexibly: rules can be set by different dimensions or by weight, so traffic volume can be adjusted freely and technical trials can be run in the smallest unit with controlled risk. Fifth, it reduces cost and improves efficiency. Traditional disaster recovery cannot get below 200% redundancy cost, whereas three-active and four-active deployments can bring it under 200%.
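
The cost claim can be made concrete with a simple model. Assume N equally sized units and require that the remaining N - 1 units absorb all traffic when one unit fails. Each unit must then be provisioned for 1/(N - 1) of total traffic, so total provisioned capacity is N/(N - 1). This model is an assumption introduced for the example; the talk only states the 200% figure and that three-active and four-active deployments fall below it.

```python
from fractions import Fraction


def redundancy_cost(units: int) -> Fraction:
    """Total provisioned capacity as a multiple of peak traffic,
    assuming the deployment must survive the loss of one unit."""
    if units < 2:
        raise ValueError("at least two units are needed to survive a failure")
    return Fraction(units, units - 1)
```

Under this model two units cost 200%, three units 150%, and four units about 133%.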

Difficulties of building geo-distributed multi-active yourself

Users who build geo-distributed multi-active on their own face many difficulties: traffic management is hard, data synchronization strategies are complex, data quality is hard to guarantee during a disaster recovery switch, and unified management of multiple data centers is hard. Accumulated capability is what drives a product-level solution: with Alibaba Cloud's geo-distributed multi-active solution, users only need to understand how to split their traffic.

Advantages of Alibaba Cloud's cloud-native solution

At present, very few vendors offer product-level geo-distributed multi-active. After eight years of accumulated experience, Alibaba Cloud's cloud-native multi-active solution has many advantages.

3. Customer cases

Customer case: a tax core system

The geo-distributed multi-active deployment of a tax core system also follows the three-layer architecture. At the access layer, it supports traffic splitting along two dimensions: province and natural-person file number. At the service layer, CSB is used for cross-cloud invocation of ordinary services. At the data layer, synchronization with different disaster recovery levels is implemented for different service types. The result is multi-active along two dimensions, second-level switching, and the equivalent of level 6 under the national standard. Because both units carry traffic, the deployment has a cost advantage and also supports gray releases.

Customer case: a telecom operator's customer service system

A telecom operator's customer service system splits traffic by province: DNS directs users to the northern or southern center according to their geographic location. The access layer checks and corrects routing according to the routing rules. At the business layer, the customer's existing system was adapted so that services are synchronized between the two centers. At the data layer, bidirectional synchronization is achieved with PolarDB-X and DTS. As a result, multiple services of the customer service system are now split by region. In several disaster recovery drills, switches completed within seconds with zero data loss. And because both units carry traffic in normal operation, cost is reduced as well.

Original link: https://developer.aliyun.com/article/781031?
