Learning architecture from scratch - high-performance load balancing

High-performance load balancing

No matter how well optimized a single server is, and no matter how good its hardware, there is always a performance ceiling. When a single server cannot meet business needs, we need to design a high-performance cluster to improve the overall processing capacity of the system. The essence of a high-performance cluster is very simple: increase the overall computing capacity of the system by adding more servers. Computing has a special characteristic: the same input data and the same logic should produce the same output no matter which server executes them. Therefore, the complexity of high-performance cluster design lies mainly in task allocation: we need a reasonable task allocation strategy to distribute computing tasks across multiple servers.

The complexity of a high-performance cluster shows up mainly in two places: adding a task allocator, and choosing an appropriate task allocation algorithm. The more popular and general term for the task allocator is "load balancer". But this name is somewhat misleading: it suggests that the purpose of task allocation is to keep the load of each computing unit balanced. In fact, task allocation does not consider only the load of the computing units. Different task allocation algorithms have different goals: some are based on load, some on performance (throughput, response time), and some on business considerations. Since "load balancing" has become the de facto standard term, I also use "load balancing" rather than "task allocation" here, but please keep in mind that load balancing is not only about keeping the load of computing units balanced.

Classification of load balancing

Common load balancing systems include three types: DNS load balancing, hardware load balancing, and software load balancing.

DNS load balancing

DNS is the simplest and most common load balancing method, and is generally used to achieve geographic-level balancing. For example, users in northern China access the Beijing data center, while users in southern China access the Shenzhen data center. The essence of DNS load balancing is that DNS can resolve the same domain name to different IP addresses. For example, for the same domain www.baidu.com, a northern user's resolution returns 61.135.165.224 (the IP of the Beijing data center), while a southern user's resolution returns 14.215.177.38 (the IP of the Shenzhen data center).

The following is a simple schematic diagram of DNS load balancing:
[Figure: DNS load balancing schematic]
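Before weighing the pros and cons, a minimal Python sketch (standard library only; www.baidu.com is simply the example domain used above) makes the mechanism concrete: a single domain name can resolve to several IP addresses, and which one a client actually gets depends on which DNS server answers.

```python
import socket

# Resolve a domain name. DNS may return several A records, and
# different resolvers (or user regions) may return different sets,
# which is exactly how DNS load balancing steers traffic.
hostname, aliases, ip_addresses = socket.gethostbyname_ex("www.baidu.com")

print(f"Domain: {hostname}")
for ip in ip_addresses:
    print(f"Resolved IP: {ip}")
```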
DNS load balancing is simple to implement and low in cost, but it also has shortcomings such as coarse granularity and a limited set of load balancing algorithms. Looking at the advantages and disadvantages more carefully, the advantages are:

  • Simple and low cost: the load balancing work is handled by the DNS server, so there is no load balancing equipment to develop or maintain yourself.

  • Nearby access improves speed: DNS can resolve a domain to the server address closest to the user based on the request's source IP, which speeds up access and improves performance.

The disadvantages are:

  • Updates are not timely: DNS caching lasts a long time. After a DNS configuration change, many users will keep accessing the old IP because of caching. Such access fails, which defeats the purpose of load balancing and also disrupts users' normal use of the service.

  • Poor scalability: control over DNS load balancing rests with the domain name provider, so it is impossible to add customized functions or extended features based on business characteristics.

  • The allocation strategy is relatively simple: DNS load balancing supports few algorithms; it cannot differentiate between servers (it cannot judge load based on system and service status); and it cannot perceive the status of back-end servers.

To address some of these shortcomings, for services that are sensitive to latency and failures, some companies have implemented HTTP-DNS themselves, that is, a private DNS system built on the HTTP protocol. The advantages and disadvantages of such a solution are roughly the opposite of those of general DNS load balancing.
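HTTP-DNS APIs are vendor-specific, so as a rough illustration only, the sketch below assumes a hypothetical resolution endpoint and JSON response format:

```python
import json
import urllib.request

def httpdns_resolve(domain):
    """Resolve a domain via a private HTTP-DNS service.

    The endpoint URL and the JSON response shape used here are
    hypothetical; real HTTP-DNS providers each define their own API.
    """
    url = f"https://httpdns.example.com/resolve?host={domain}"
    with urllib.request.urlopen(url, timeout=2) as resp:
        data = json.load(resp)
    # Assume a response like {"ips": ["1.2.3.4", ...], "ttl": 300}.
    # The client caches the IPs itself and honors the TTL, bypassing
    # the public DNS caches that make ordinary DNS slow to update.
    return data["ips"]
```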

Hardware load balancing

Hardware load balancing implements the load balancing function through a dedicated hardware device. This type of device is similar to routers and switches and can be understood as a basic network device for load balancing. There are currently two typical hardware load balancing devices in the industry: F5 and A10. These devices have strong performance and powerful features, but they are not cheap; generally only "wealthy" companies consider using them. Ordinary companies cannot afford them, and in any case their business volume is usually not large enough to need them, so buying such devices would be wasteful.

The advantages of hardware load balancing are:

  • Powerful functions: full support for load balancing at every layer, a comprehensive set of load balancing algorithms, and support for global load balancing.

  • Strong performance: in comparison, software load balancing supports concurrency on the order of 100,000 at most, while hardware load balancing can support more than 1 million concurrent connections.

  • High stability: commercial hardware load balancers have been through thorough, strict testing and large-scale use, and are highly stable.

  • Security protection: in addition to load balancing, hardware devices also provide security functions such as firewalls and anti-DDoS protection.

The disadvantages of hardware load balancing are:

  • Expensive: the most common F5 model costs about as much as a Mazda 6, and a better one costs as much as an Audi Q7.

  • Poor scalability: hardware devices can be configured for the business, but cannot be extended or customized.

Software load balancing

Software load balancing implements the load balancing function through load balancing software. The most common are Nginx and LVS: Nginx is layer-7 software load balancing, while LVS is layer-4 load balancing built into the Linux kernel. The difference between layer 4 and layer 7 lies in the protocol and in flexibility. Nginx supports the HTTP and E-mail protocols, while LVS, being layer-4 load balancing, is protocol-independent and can be used for almost any application, such as chat services and databases.

The main difference between software and hardware is performance: hardware load balancing performance is much higher than software. Nginx's performance is on the order of tens of thousands of requests per second; an Nginx instance on an ordinary Linux server can reach about 50,000 requests/second. LVS's performance is on the order of hundreds of thousands, reportedly up to 800,000/second. F5's performance is on the order of millions, ranging from 2 million to 8 million per second (these figures come from the Internet and are for reference only; if you need to rely on them, run performance tests against your actual business scenario). Of course, the biggest advantage of software load balancing is that it is cheap: an ordinary Linux server costs about 10,000 yuan. Compared with the price of an F5, that is the difference between a bicycle and a BMW.

In addition to using open-source systems for load balancing, if the business has special needs you can also customize an open-source system (for example, with Nginx plug-ins), or even develop your own.

The following is a schematic diagram of Nginx's load balancing architecture:
[Figure: Nginx load balancing architecture]
The advantages of software load balancing are:

  • Simple: Both deployment and maintenance are relatively simple.

  • Cheap: Just buy a Linux server and install the software.

  • Flexible: layer-4 or layer-7 load balancing can be chosen according to the business; it can also be conveniently extended for the business, for example by implementing business-specific functions through Nginx plug-ins.

In fact, the following shortcomings are relative to hardware load balancing; they do not mean that software load balancing is unusable.

  • Average performance: one Nginx instance can support roughly 50,000 concurrent connections.

  • Functionality is not as powerful as hardware load balancing.

  • Generally no security functions such as firewalls or anti-DDoS protection.

Typical architecture of load balancing

Earlier we introduced three common load balancing mechanisms: DNS load balancing, hardware load balancing, and software load balancing. Each has its own advantages and disadvantages, but that does not mean real applications must make an either-or choice among them based on those trade-offs; instead, they are combined according to their strengths. Specifically, the basic principles of combination are:

  • DNS load balancing is used to achieve geographic level load balancing;

  • Hardware load balancing is used to achieve cluster-level load balancing;

  • Software load balancing is used to achieve machine-level load balancing.

[Figure: typical three-layer load balancing architecture]

The load balancing of the whole system is divided into three layers.

  • Geographic-level load balancing: www.xxx.com is deployed in the Beijing, Guangzhou, and Shanghai data centers. When a user visits, DNS determines which data center's IP to return based on the user's geographic location. In the figure it returns the IP of the Guangzhou data center, so the user accesses the Guangzhou data center.

  • Cluster-level load balancing: the Guangzhou data center uses F5 devices for load balancing. After receiving a user request, F5 performs cluster-level load balancing and sends the request to one of three local clusters. Here we assume F5 sends the user request to "Guangzhou Cluster 2".

  • Machine-level load balancing: Guangzhou Cluster 2 uses Nginx for load balancing. After receiving a user request, Nginx sends it to one of the servers in the cluster, which processes the business request and returns the response.

It should be noted that the figure above is just an example; such an architecture is generally used for large-scale businesses. If the business volume is not that large, there is no need to copy this architecture strictly. For example, a university forum needs neither DNS load balancing nor F5 devices; a single Nginx doing simple load balancing is enough.

Load balancing algorithms

There are a large number of load balancing algorithms, and they can also be customized for particular business characteristics. Setting details aside, the algorithms can be roughly divided into the following categories by their intended goals.

  1. Task equal-allocation class: the load balancing system distributes received tasks evenly across the servers. "Evenly" here can mean equal absolute numbers, or equal by proportion or weight.

  2. Load balancing class: the load balancing system allocates tasks according to server load. The load here is not necessarily "CPU load" in the usual sense; it is the current pressure on the system, which can be measured by CPU load, or by connection count, I/O usage, network card throughput, and so on.

  3. Best performance class: the load balancing system allocates tasks according to server response time, assigning new tasks first to the server that responds fastest.

  4. Hash class: the load balancing system performs a hash calculation on certain key information in the task and assigns requests with the same hash value to the same server. Common examples are source address hash, destination address hash, session ID hash, user ID hash, and so on.

Next, we will introduce load balancing algorithms and their advantages and disadvantages.

Round robin

After receiving a request, the load balancing system assigns it to servers in sequence, one after another.

Round robin is the simplest strategy: it pays no attention to the servers' own state. For example:

If one server is stuck in an infinite loop because of a program bug, causing high CPU load, the load balancing system is unaware of this and keeps sending it request after request.

If the cluster contains new 32-core machines and old 16-core machines, the load balancing system does not care either: the new and old machines are assigned the same number of tasks.

Note that the load balancing system does not need to pay attention to "the server's own running state"; the key word is "own". That is, as long as the server is running, its running state is ignored. However, if a server goes down, or the connection between the server and the load balancing system is broken, the load balancing system can perceive this and must respond, for example by removing the server from the list of allocatable servers; otherwise tasks would keep being assigned to a server that is already down, which is obviously unreasonable.

All in all, "simplicity" is both the advantage and the disadvantage of the round robin algorithm.
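As a rough illustration of just how simple it is, here is a minimal round robin sketch in Python (the server addresses are made up):

```python
import itertools

class RoundRobinBalancer:
    """Hand each incoming request to the next server in a fixed cycle,
    ignoring the servers' own running state entirely."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def pick(self):
        return next(self._cycle)

balancer = RoundRobinBalancer(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
for _ in range(6):
    print(balancer.pick())  # 10.0.0.1, 10.0.0.2, 10.0.0.3, 10.0.0.1, ...
```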

Weighted round robin

The load balancing system allocates tasks according to server weights. The weights are generally configured statically based on hardware configuration; computing them dynamically would fit the business better, but is more complex.

Weighted round robin is a special form of round robin. Its main purpose is to solve the problem of servers with different processing capacities. For example, if a cluster has new 32-core machines and old 16-core machines, we can theoretically assume the new machines have twice the processing capacity of the old ones, and the load balancing system can allocate tasks in a 2:1 ratio, giving more tasks to the new machines to make full use of their performance.

Weighted round robin solves round robin's inability to allocate tasks according to differences in server configuration, but it still cannot allocate tasks according to differences in server state.
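One well-known refinement is "smooth" weighted round robin, which interleaves picks so that a heavily weighted server is not hit in long bursts. The sketch below is a simplified illustration of that idea, using the 2:1 new/old weights from the example above (server names are made up):

```python
class SmoothWeightedRoundRobin:
    """Pick servers in proportion to static weights, interleaving smoothly.

    With weights {new: 2, old: 1}, the pick sequence is
    new, old, new, new, old, new, ... rather than new, new, old, ...
    """

    def __init__(self, weights):
        self.weights = dict(weights)            # static, e.g. set by core count
        self.current = {s: 0 for s in weights}  # dynamic counters

    def pick(self):
        total = sum(self.weights.values())
        for server, weight in self.weights.items():
            self.current[server] += weight
        chosen = max(self.current, key=self.current.get)
        self.current[chosen] -= total
        return chosen

balancer = SmoothWeightedRoundRobin({"new-32core": 2, "old-16core": 1})
print([balancer.pick() for _ in range(6)])
```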

Lowest load priority

The load balancing system assigns tasks to the server with the lowest current load, where load can be measured by different metrics depending on the task type and business scenario. For example:

LVS, a layer-4 network load balancing device, can judge server state by "connection count": the more connections a server has, the greater the pressure on it.

Nginx, a layer-7 network load balancing system, can judge server state by "HTTP request count" (Nginx's built-in load balancing algorithms do not support this; it requires an extension).

If we develop our own load balancing system, we can choose pressure metrics based on business characteristics: for CPU-intensive workloads, "CPU load" can measure system pressure; for I/O-intensive workloads, "I/O load" can.

Lowest-load-priority algorithms solve the round robin algorithm's inability to perceive server state, but the cost is considerable complexity. For example:

The least-connections-first algorithm requires the load balancing system to count the connections currently established to each server; it applies only when every connection received by the balancer is forwarded to a server. If the balancer maintains a fixed connection pool to the servers, this algorithm is not suitable. For example, LVS can use this algorithm, but a load balancing system that accesses a MySQL cluster through a connection pool cannot.

The lowest-CPU-load-first algorithm requires the load balancing system to collect each server's CPU load in some way, and also to decide whether to base decisions on the 1-minute load or the 15-minute load. There is no rule that 1 minute is definitely better or worse than 15 minutes; the optimal interval differs by business. If the interval is too short, decisions fluctuate frequently; if it is too long, reaction may be slow when a peak arrives.

Lowest-load-priority algorithms can address the shortcomings of round robin fairly well, because with them the load balancing system perceives the servers' current running state. The price, of course, is a large increase in complexity. Roughly speaking, round robin may be implementable in 5 lines of code, while a lowest-load-priority algorithm might take 1,000 lines, and may even require developing code on both the load balancing system and the servers. If the algorithm itself is poorly designed, or poorly matched to the business's operating characteristics, it can become a performance bottleneck or cause many inexplicable problems. So although the lowest-load-priority approach looks good, in practice it is used in fewer scenarios than round robin (including weighted round robin).
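As a sketch of the least-connections idea reduced to its core: the balancer itself must do the bookkeeping, tracking how many connections each server currently has open (server addresses are made up):

```python
class LeastConnectionsBalancer:
    """Track the connections open to each server and send every new
    request to the server that currently has the fewest."""

    def __init__(self, servers):
        self.active = {s: 0 for s in servers}

    def acquire(self):
        server = min(self.active, key=self.active.get)
        self.active[server] += 1  # a connection is being opened
        return server

    def release(self, server):
        self.active[server] -= 1  # the connection has closed

balancer = LeastConnectionsBalancer(["10.0.0.1", "10.0.0.2"])
server = balancer.acquire()  # goes to the least-busy server
# ... forward the request and wait for the response ...
balancer.release(server)
```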

Best performance class

Lowest-load-priority algorithms allocate from the server's perspective, while best-performance-priority algorithms allocate from the client's perspective: tasks are assigned first to the server that currently processes them fastest, achieving the fastest response for the client.

Like lowest-load priority, best-performance priority essentially also perceives server state; it merely measures that state through the external metric of response time. It therefore has similar problems, and its complexity is also very high, mainly in the following ways:

The load balancing system needs to collect and analyze the response time of every task on every server. When a large number of tasks are being processed, this collection and accounting itself consumes noticeable performance.

To reduce this accounting cost, sampling can be used: instead of recording the response time of every task, sample some tasks and estimate the overall response time from them. Sampling reduces the performance cost but adds complexity, because an appropriate sampling rate must be chosen: too low and the results are inaccurate, too high and the performance cost is still large. Finding the right sampling rate is itself a complicated matter.

Whether using full statistics or sampling, an appropriate window must also be chosen: best performance within the last 10 seconds, the last 1 minute, or the last 5 minutes... There is no one-size-fits-all window; it must be judged and chosen based on the actual business, and may even need continuous tuning after the system goes live to reach an optimal setting.
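Below is a minimal sketch of the best-performance idea, smoothing each server's measured response time with an exponentially weighted moving average; the smoothing factor plays the role of the statistical window discussed above and is exactly the kind of knob that needs business-specific tuning:

```python
class FastestResponseBalancer:
    """Prefer the server whose smoothed response time is currently lowest."""

    def __init__(self, servers, alpha=0.2):
        # alpha is the EWMA smoothing factor: higher reacts faster but
        # fluctuates more -- the same trade-off as the window choice above.
        self.alpha = alpha
        self.avg_ms = {s: 0.0 for s in servers}

    def pick(self):
        # Servers start at 0.0, so untried servers are picked first;
        # a production system would need explicit warm-up handling.
        return min(self.avg_ms, key=self.avg_ms.get)

    def record(self, server, elapsed_ms):
        # Blend the new sample into the server's moving average.
        old = self.avg_ms[server]
        self.avg_ms[server] = (1 - self.alpha) * old + self.alpha * elapsed_ms

balancer = FastestResponseBalancer(["10.0.0.1", "10.0.0.2"])
server = balancer.pick()
# ... time the request to `server`, then feed the measurement back:
balancer.record(server, elapsed_ms=42.0)
```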

Hash class

The load balancing system performs a hash calculation on certain key information in the task and allocates requests with the same hash value to the same server. The purpose is mainly to meet specific business needs. For example:

  1. Source address hash

Tasks from the same source IP address are assigned to the same server, which suits businesses with transactions or sessions. For example, when we log in to online banking through a browser, session information is generated. It is temporary and becomes invalid when the browser is closed. The online banking back end does not need to persist the session information; it only needs to keep the session temporarily on one server, but it must ensure the user reaches that same server for as long as the session exists. Source address hash can implement this business scenario.

  2. ID hash

The business identified by some ID is assigned to the same server, where the ID is generally the ID of temporary data (such as a session ID). In the online banking example above, session ID hash can likewise ensure that the user reaches the same server throughout a session.
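Both variants reduce to the same pattern: hash a stable key from the request and map it onto the server list. A minimal sketch follows (note that plain modulo mapping reshuffles most keys whenever the server list changes; consistent hashing is the usual remedy, which this sketch does not implement):

```python
import hashlib

def pick_server(key, servers):
    """Map a stable request key (source IP, session ID, user ID...)
    to a server. Using hashlib rather than Python's built-in hash()
    keeps the mapping stable across processes and restarts."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return servers[int(digest, 16) % len(servers)]

servers = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]
print(pick_server("203.0.113.7", servers))     # source address hash
print(pick_server("session-8f3a2c", servers))  # session ID hash
```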

Source: blog.csdn.net/zkkzpp258/article/details/130169053