Essential for Disaster Recovery Switching: An Introduction to Global Traffic Management

People rely more and more on Internet services, and delivering low-latency, highly available services has become a pressing requirement for a growing number of Internet service providers. Have you ever run into the following problems?
1. You need to reduce network latency but do not know how to route users to the nearest service node.
2. You need to run a grayscale (canary) release but do not know how to control the share of traffic each node receives or how to allocate resources.
3. You know that guaranteeing service availability is critical, but when a failure occurs you do not know how to quickly redirect traffic to other available nodes.
A global traffic management product can help you solve these problems.

What is global traffic management

Global Traffic Manager (GTM) is a product launched by Alibaba Cloud in 2019. DNS is the most common traffic scheduling method on the Internet today. As distributed service technologies have developed, multi-node architectures such as active/standby deployment and remote multi-active deployment have gradually become mainstream, so managing business traffic effectively to achieve low latency and high availability is particularly important. GTM grew out of years of traffic scheduling and management experience within the Alibaba economy and practice in many business scenarios (disaster recovery switching, large-scale relocation, moving the Alibaba economy to the cloud, and so on), and it helps users manage business traffic efficiently.

GTM principle

GTM essentially implements traffic scheduling through DNS. It is built on two underlying products, "Cloud Resolution DNS" and "Cloud Monitoring", and combines the intelligent DNS resolution of the former with the application service monitoring of the latter. This gives customers nearest access for users on different networks or in different regions, health checks on the running status of application services, automatic failover, and other capabilities.


Figure 1: GTM schematic


GTM provides users with a CNAME access domain name and requires them to configure address pools (Pool). As with a CDN access domain name, users must point their own business domain name to the access domain name with a CNAME record in order to use GTM.

  • An address pool represents a group of addresses that provide the same application service, generally IP addresses or domain names that share the same carrier (ISP) or region attribute.
  • An access policy then associates the access domain name with address pools. GTM can resolve requests to an address pool by carrier or by geographic region, and the addresses within a pool support load balancing and weighted round-robin strategies.
  • Finally, a health check (HealthCheck) monitors the availability of the addresses in each pool. When an address becomes unavailable, it is automatically isolated; when GTM considers the entire address pool unavailable, it automatically switches to the standby address pool.
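
The three pieces above (address pool, access policy, health check) can be sketched as a small resolution function. This is only an illustrative model, not GTM's implementation: the class names, the `source` labels, and the documentation-range IP addresses are all assumptions, and a pool is treated as unavailable only when it has no healthy address at all.

```python
from dataclasses import dataclass, field


@dataclass
class AddressPool:
    name: str
    # Maps each address to its current health status (True = healthy).
    addresses: dict[str, bool] = field(default_factory=dict)

    def healthy_addresses(self) -> list[str]:
        return [addr for addr, ok in self.addresses.items() if ok]


@dataclass
class AccessPolicy:
    source: str                 # hypothetical request-source label, e.g. "overseas"
    default_pool: AddressPool
    standby_pool: AddressPool


def resolve(policies: list[AccessPolicy], source: str) -> list[str]:
    """Return the addresses a DNS answer would contain for this request source."""
    for policy in policies:
        if policy.source == source:
            healthy = policy.default_pool.healthy_addresses()
            # Simplification: the default pool is unavailable only when it is empty.
            if healthy:
                return healthy
            return policy.standby_pool.healthy_addresses()
    return []  # no policy matches this source


singapore = AddressPool("Singapore", {"203.0.113.10": True, "203.0.113.11": False})
hangzhou = AddressPool("CN-Hangzhou", {"198.51.100.20": True})
policies = [AccessPolicy("overseas", singapore, hangzhou)]

print(resolve(policies, "overseas"))  # ['203.0.113.10']
```

The unhealthy address is simply left out of the answer, and the standby pool is consulted only when the default pool has nothing healthy left to return.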

Conceptually, GTM is easily confused with Load Balancing (SLB) and the Global Load Balancing (GSLB) feature of Cloud Resolution DNS.

The difference between global traffic management (GTM) and load balancing (SLB):

  • GTM resolves a domain name to multiple IP addresses through DNS, so different users access different IP addresses and application traffic is distributed among them. The list of IP addresses returned by DNS is updated dynamically based on health checks, which provides fault isolation and failover. End users connect directly to the service IP address; their traffic does not pass through GTM.
  • SLB acts as a proxy for user requests and distributes them to different servers in real time, so end-user traffic must pass through SLB. Generally, SLB is used for load balancing within a single region; when there are multiple SLB addresses in different regions, GTM can balance load across them.
    The comparison between the two is shown in the following table:
Item | Network layer | Backend address | Weighted round robin | Cross-region difficulty | Failover time | Session persistence
GTM | Layer 3 | Domain name, IP | Supported | Simple | Minute level | Not supported
SLB | Layer 4, Layer 7 | IP | Supported | Difficult | Second level | Supported

 

Table 1: Comparison of GTM and SLB

The difference between global traffic management (GTM) and global load balancing (GSLB):

GTM is the upgrade of, and replacement for, the global load balancing (GSLB) feature in the existing Cloud Resolution DNS. Compared with GSLB, GTM supports more monitoring methods, offers more advanced IP address management, and provides more stable and faster monitoring feedback.

The comparison between the two is shown in the following table:

Item | Service access | Health check | Failover time | Multi-line access | Link backup
GTM | CNAME access | ping, tcp, http(s) | Minute level, not limited by the subdomain TTL | China Telecom, China Unicom, China Mobile, Dr. Peng | Automatic failover, controllable
GSLB | Enabled per subdomain | Not supported | Limited by the subdomain TTL | Not supported | Random selection among normal links, not controllable

 

Table 2: Comparison of GTM and GSLB

GTM features

Address pool:

Traditional DNS resolves a domain name to a single address, whereas GTM introduces the concept of an address pool, through which the IP addresses of an application service are managed as a unit. End-user requests are resolved to the address pool of the application service, which allows both traffic sharing under high load and custom traffic distribution. When the address pool as a whole is unavailable, GTM can switch to a backup pool.
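
As a rough illustration of custom traffic distribution within a pool, the sketch below uses a plain weighted round robin: each address appears in the rotation as many times as its weight. The addresses and weights are hypothetical, and GTM's actual weighting algorithm is not described in this article, so this shows only the general idea.

```python
from itertools import cycle, islice


def weighted_rotation(weights: dict[str, int]) -> list[str]:
    """Expand {address: weight} into one rotation; each address appears `weight` times."""
    rotation = []
    for address, weight in weights.items():
        rotation.extend([address] * weight)
    return rotation


# Hypothetical pool: the first address should receive twice the traffic of the second.
pool_weights = {"203.0.113.10": 2, "203.0.113.11": 1}

# Simulate the answers given to six consecutive DNS queries.
answers = list(islice(cycle(weighted_rotation(pool_weights)), 6))
print(answers.count("203.0.113.10"), answers.count("203.0.113.11"))  # 4 2
```

Over any whole number of rotations, the share of answers per address matches the ratio of the weights.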

Access policy:

The access policy decides which address pool to use based on the source of the request and the health of the address pools. It provides both intelligent resolution at the address-pool level and automatic failover.
GTM performs intelligent DNS resolution across the 4 major carriers and 7 major regions in China and 6 continents overseas, so that users on different networks or in different regions access a nearby node and get faster access.
When an address pool fails as a whole, GTM switches address pools within minutes according to the user-defined policy, and switches back when the original pool recovers.

Health check:

Relying on the distributed monitoring capability of Cloud Monitoring, GTM includes a HealthCheck module that probes the application service IP addresses in an address pool from multiple regions. It currently supports three methods: http/https, tcp, and ping. When an address in the pool fails, the HealthCheck module detects the abnormality and works with DNS to remove the failed address from the resolution list; when the address recovers, it is automatically restored to the list.
In repeated tests by the test team, when an application service failed, GTM switched about 90% of the service's traffic within 5 minutes. GTM's failover effective time = failure discovery time + DNS switching synchronization time.
Failure discovery time: with the current default health check configuration, a fault is detected about 3 minutes after it occurs.
DNS switching synchronization time: the TTL of the GTM CNAME access domain name is currently set to 60 seconds, so in theory a switch takes effect within 60 seconds; in practice it depends on how long carriers across the country cache the record.
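
Plugging the figures above into the formula gives a quick estimate. The function name is illustrative; the two inputs are the approximate values quoted in this section, and the result is the theoretical case in which resolvers honor the 60-second TTL.

```python
def failover_time_seconds(detection_seconds: int, dns_ttl_seconds: int) -> int:
    """Failover effective time = failure discovery time + DNS switching synchronization time."""
    return detection_seconds + dns_ttl_seconds


detection = 3 * 60  # about 3 minutes with the default health check configuration
ttl = 60            # TTL of the GTM CNAME access domain name
total = failover_time_seconds(detection, ttl)
print(total, total / 60)  # 240 4.0
```

About 4 minutes in the theoretical case is consistent with the observation that roughly 90% of traffic switches within 5 minutes; clients behind resolvers that cache records for longer than the TTL account for the remainder.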

Application scenario

Next, we use a remote active-active deployment as an example to show how GTM achieves rapid disaster recovery switching. As shown in the following figure, the users of a service are divided mainly into overseas users and domestic users, and the back-end services use a single deployment scheme at every site. GTM schedules user requests from different regions intelligently and routes their traffic to different access points: overseas users visit the Singapore center (Singapore), and domestic users visit the Hangzhou center (CN-Hangzhou). The two sites back each other up, so when one site fails, traffic is served by the other, which keeps the service highly available.


Figure 2: Application of GTM in multiple locations

Five steps to set up GTM for remote disaster recovery:

(1) Global configuration:
The basic configuration, which mainly covers the load balancing strategy, global TTL, alarm notification group, and related settings.

(2) Address pool configuration:
Create two address pools, Singapore and CN-Hangzhou. Configure each pool with multiple service IP addresses in its region and a minimum number of available addresses. When the number of live addresses in a pool falls below this minimum, the pool is considered unavailable. Traffic within a pool is distributed automatically according to the load balancing strategy in the global configuration.
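
The minimum-available-addresses rule can be written as a one-line check. In this sketch the function name, the addresses, and the threshold of 2 are hypothetical; only the rule itself (a pool is unavailable when its live addresses fall below the configured minimum) comes from the step above.

```python
def pool_is_available(address_health: dict[str, bool], min_available: int) -> bool:
    """A pool is available while its live addresses are not fewer than the configured minimum."""
    live = sum(1 for healthy in address_health.values() if healthy)
    return live >= min_available


singapore = {"203.0.113.10": True, "203.0.113.11": True, "203.0.113.12": False}
print(pool_is_available(singapore, min_available=2))  # True

singapore["203.0.113.11"] = False  # a second address goes down
print(pool_is_available(singapore, min_available=2))  # False
```

A minimum greater than 1 lets the pool be declared unavailable before its last address fails, which avoids sending all traffic to a single surviving address.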

(3) Enable health check:
Configure health checks for the IP addresses in each address pool. Once enabled, GTM monitors the availability of the addresses in real time, automatically isolates an address when it becomes unavailable, and notifies the corresponding alarm group. When the address recovers, it is automatically added back to the resolution list. When an address pool as a whole has a problem, an automatic switch between the default address pool and the standby address pool is triggered.
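
The isolate-and-restore behavior can be modeled with a small state tracker. Everything specific here is an assumption made for illustration: the class name, the threshold of three consecutive failed probes, and the choice to restore an address after a single successful probe. GTM's real thresholds are set in its health check configuration.

```python
class AddressHealth:
    """Tracks one address; isolates it after `fail_threshold` consecutive failed probes."""

    def __init__(self, fail_threshold: int = 3):
        self.fail_threshold = fail_threshold  # hypothetical value, not a GTM default
        self.consecutive_failures = 0
        self.in_resolution_list = True

    def record_probe(self, success: bool) -> bool:
        """Record one probe result and return whether the address is still resolvable."""
        if success:
            # Recovery: the address rejoins the resolution list.
            self.consecutive_failures = 0
            self.in_resolution_list = True
        else:
            self.consecutive_failures += 1
            if self.consecutive_failures >= self.fail_threshold:
                # Isolation: the address is removed from the resolution list.
                self.in_resolution_list = False
        return self.in_resolution_list


address_a = AddressHealth(fail_threshold=3)
history = [address_a.record_probe(ok) for ok in (True, False, False, False, True)]
print(history)  # [True, True, True, False, True]
```

Requiring several consecutive failures before isolating an address trades detection speed for fewer false alarms, which is one reason failure discovery takes minutes rather than seconds.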

(4) Access policy configuration:
Set which address pool end users access according to the source of their requests. As shown in Figure 2, for overseas users to access the Singapore address pool, configure an access policy with the request source set to the overseas region, the default address pool set to Singapore, and the standby address pool set to CN-Hangzhou. Under normal conditions, overseas users visit the Singapore center; after a failure, they are quickly switched to the CN-Hangzhou center.

(5) CNAME access configuration:
Point the main domain name that users access to the GTM instance domain name with a CNAME record; only then do disaster recovery and intelligent access take effect for the application service. In the figure, for example, www.cloud-example.com is pointed by CNAME to the access domain name that GTM provides.

After the configuration is complete, GTM probes the addresses in the address pools in real time according to the health check configuration. When an address raises an alarm, GTM follows the process in Figure 3 to carry out disaster recovery switching. Take an alarm on IP address A in Figure 2 as an example: if the default address pool (Singapore) is still available, address A is removed from the resolution list; if the default address pool is unavailable as a whole, traffic is switched to the standby address pool (CN-Hangzhou). The switch is completed automatically and takes only minutes, which makes remote disaster recovery switching efficient.

Figure 3: GTM disaster recovery switching process


Origin www.cnblogs.com/yunqishequ/p/12714864.html