The most comprehensive and concise LVS load balancing (Introduction to LVS, three working modes, ten scheduling algorithms)

Copyright statement: This article is the original article of the CSDN blogger "chenhuyang". It follows the CC 4.0 BY-SA copyright agreement. Please attach the original source link and this statement for reprinting.
Original link: https://blog.csdn.net/weixin_40470303/article/details/80541639

 

Table of Contents

1. Introduction to LVS

2. Analysis of the three working modes.

1. LVS mode load balancing based on NAT

2. LVS load balancing based on TUN

3. LVS load balancing based on DR

3. LVS load balancing scheduling algorithms

1. Round-robin scheduling

2. Weighted round-robin scheduling

3. Minimum connection scheduling

4. Weighted least connection scheduling

5. Based on local minimum connection

6. Minimal connections based on locality with replication

7. Target address hash scheduling

8. Source address hash scheduling

9. The shortest expected delay

10. Least queue scheduling


1. Introduction to LVS

LVS (Linux Virtual Server) is an open source load balancing project led by Dr. Zhang Wensong, and it is now integrated into the Linux kernel as a module. The project implements IP-based load balancing of data requests inside the Linux kernel. Its architecture is shown in Figure 1. Internet users access the company's public load balancing server from outside, and each Web request arrives at the LVS scheduler. The scheduler then chooses a back-end Web server according to its preset algorithm; the round-robin algorithm, for example, distributes external requests evenly across all back-end servers. Although the user addresses the LVS scheduler, the request is forwarded to a real back-end server. If the real servers share the same storage, they all provide the same service, so the content is identical no matter which real server handles the request. The whole cluster is transparent to the user. Finally, depending on the LVS working mode, the real server returns the requested data to the user in different ways. LVS has three working modes: NAT mode, TUN mode, and DR mode.

2. Analysis of the three working modes.

1. LVS mode load balancing based on NAT

NAT (Network Address Translation) rewrites packet headers so that hosts with private IP addresses inside a company can reach the external network, and so that external users can reach the private-IP hosts inside the company. The VS/NAT topology is shown in Figure 2. The LVS load scheduler uses two network cards configured with different IP addresses: eth0 carries the private IP and connects to the internal network through a switch, and eth1 carries the public IP and connects to the external network.

In the first step, the user's lookup through an Internet DNS server resolves to the public address on the company's load balancer. In contrast to the addresses of the real servers, this public LVS address is called the VIP (Virtual IP Address). By accessing the VIP the user reaches a back-end real server, and all of this is transparent to the user. The user believes he is accessing the real server; he does not know that the VIP belongs to a scheduler, where the real back-end servers are, or how many of them there are.

In the second step, the user sends the request to 124.126.147.168. LVS selects one of the back-end real servers (192.168.0.1~192.168.0.3) according to the preset algorithm and forwards the request packet to it. Before forwarding, LVS rewrites the destination address and destination port in the packet to the IP address and port of the selected real server.

In the third step, the real server returns the response packet to the LVS scheduler. On receiving it, the scheduler rewrites the source address and source port to the VIP and the scheduler's corresponding port, then sends the response packet back to the user. The LVS scheduler also keeps a connection hash table that records each connection and its forwarding information. When the next packet of the same connection reaches the scheduler, the earlier record is found in the hash table and the same real server and port are selected from it.
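The three NAT steps above can be sketched as a small Python model. This is only an illustration, not the kernel code: the class `NatDirector`, the `Packet` type, the client address 203.0.113.7 and the use of round robin for new connections are all assumptions made for the example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Packet:
    src: tuple  # (ip, port)
    dst: tuple  # (ip, port)

class NatDirector:
    """Toy model of VS/NAT address rewriting (hypothetical, for illustration)."""

    def __init__(self, vip, real_servers):
        self.vip = vip                    # (ip, port) that clients connect to
        self.real_servers = real_servers  # list of (ip, port)
        self.connections = {}             # client (ip, port) -> real server
        self._next = 0

    def inbound(self, pkt):
        # Packets of a known connection reuse the recorded real server.
        rs = self.connections.get(pkt.src)
        if rs is None:
            rs = self.real_servers[self._next % len(self.real_servers)]
            self._next += 1
            self.connections[pkt.src] = rs
        # Rewrite the destination from the VIP to the chosen real server.
        return Packet(src=pkt.src, dst=rs)

    def outbound(self, pkt):
        # Rewrite the source back to the VIP before replying to the client.
        return Packet(src=self.vip, dst=pkt.dst)

director = NatDirector(
    ("124.126.147.168", 80),
    [("192.168.0.1", 80), ("192.168.0.2", 80), ("192.168.0.3", 80)],
)
client = ("203.0.113.7", 40000)  # made-up client address
first = director.inbound(Packet(src=client, dst=director.vip))
second = director.inbound(Packet(src=client, dst=director.vip))
reply = director.outbound(Packet(src=first.dst, dst=client))
```

Both inbound packets of the same connection land on the same real server, and the reply leaves with the VIP as its source, so the client never sees a 192.168.0.x address.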

2. LVS load balancing based on TUN

In an LVS (NAT) cluster, every request and response packet has to be forwarded by the LVS scheduler, so once the number of back-end servers grows beyond about 10, the scheduler becomes the bottleneck of the whole cluster. Request packets are usually much smaller than response packets, because the response carries the data the client actually asked for. The idea of LVS (TUN) is therefore to separate requests from responses: the scheduler handles only the incoming requests, and the real servers send their response packets directly to the client. The VS/TUN topology is shown in Figure 3. IP tunneling is a packet encapsulation technique: the original packet is wrapped unchanged inside a new packet with a new IP header. In this way the packet that the client sent to the scheduler's VIP is encapsulated, given a new header whose destination is the IP address of the real server selected by the scheduler, and forwarded to that real server through the tunnel. LVS (TUN) mode requires that the real servers can connect to the external network directly; after receiving the request packet, the real server responds directly to the client host.
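The encapsulation idea can be illustrated with a short Python sketch. It is a simplified model under stated assumptions: `IPPacket`, `tunnel_to_real_server` and `real_server_reply` are hypothetical names, and the addresses 203.0.113.7, 198.51.100.1 and 198.51.100.11 are made up for the example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class IPPacket:
    src: str
    dst: str
    payload: object

def tunnel_to_real_server(request, director_ip, real_server_ip):
    # Wrap the untouched client packet inside a new outer IP header.
    return IPPacket(src=director_ip, dst=real_server_ip, payload=request)

def real_server_reply(outer, vip, body):
    inner = outer.payload  # strip the outer header to recover the request
    if inner.dst != vip:
        raise ValueError("inner packet is not addressed to the VIP")
    # Reply straight to the client, using the VIP as the source address.
    return IPPacket(src=vip, dst=inner.src, payload=body)

VIP = "124.126.147.168"
request = IPPacket(src="203.0.113.7", dst=VIP, payload="GET /")
outer = tunnel_to_real_server(request, "198.51.100.1", "198.51.100.11")
reply = real_server_reply(outer, VIP, "200 OK")
```

The original request travels inside `outer` unchanged, and the reply goes from the real server to the client without passing through the scheduler again.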

3. LVS load balancing based on DR

In LVS (TUN) mode, the tunnel that has to be created between the LVS scheduler and each real server adds load to the servers. DR mode, also called direct routing mode, is similar to LVS (TUN); its architecture is shown in Figure 4. In this mode LVS still handles only the inbound requests and selects a suitable real server according to the algorithm, and the real server sends the response packet back to the client itself. Unlike tunnel mode, direct routing requires the scheduler and the back-end servers to be on the same LAN, and the VIP address must be shared by the scheduler and all back-end servers. This is because the real server, when it responds to the client, must set the source IP to the VIP and the destination IP to the client's IP. The client sends its request to the scheduler's VIP and receives a response whose source address is still the VIP (the VIP configured on the real server), so it never notices the back-end servers. Because several machines hold the same VIP, only the scheduler's VIP may be visible to the outside in direct routing mode, so that clients send their request packets to the scheduler. On every real server the VIP must be configured on a non-ARP network device, that is, a device that does not announce its MAC address and the corresponding IP address. The real server's VIP is therefore invisible to the outside world, yet the real server still accepts requests whose destination address is the VIP and uses the VIP as the source address of its responses.
After selecting a real server according to the algorithm, the scheduler leaves the IP packet unmodified, rewrites the destination MAC address of the data frame to the MAC address of the selected real server, and sends the frame to the real server through the switch. During the whole process, the VIP of the real server does not need to be visible to the outside world.
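A minimal Python sketch of the frame rewrite in DR mode follows. The `Frame` type, the helper names and the MAC addresses are hypothetical and exist only to show that the destination MAC changes while the IP addresses stay untouched.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Frame:
    dst_mac: str
    src_ip: str
    dst_ip: str

def dr_forward(frame, real_server_mac):
    # Only the destination MAC changes; the IP packet is left as it is.
    return replace(frame, dst_mac=real_server_mac)

def real_server_accepts(frame, own_mac, local_ips):
    # The real server takes the frame because the MAC is its own and the
    # VIP is configured locally on a device that does not answer ARP.
    return frame.dst_mac == own_mac and frame.dst_ip in local_ips

VIP = "124.126.147.168"
DIRECTOR_MAC = "02:00:00:00:00:01"  # made-up addresses
REAL_MAC = "02:00:00:00:00:11"

incoming = Frame(dst_mac=DIRECTOR_MAC, src_ip="203.0.113.7", dst_ip=VIP)
forwarded = dr_forward(incoming, REAL_MAC)
accepted = real_server_accepts(forwarded, REAL_MAC, {"192.168.0.1", VIP})
```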

3. LVS load balancing scheduling algorithms

From the introduction above we know the three working modes of LVS. Whichever mode is used in practice, the scheduling strategy and algorithm are the core technology of LVS. LVS implements ten scheduling algorithms in the kernel.

1. Round-robin scheduling

The Round Robin (RR) algorithm dispatches requests to the servers in turn. Its main feature is that it is simple to implement. Round robin assumes that all servers have the same capacity to process requests, and the scheduler distributes all requests evenly across the real servers.
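As a rough illustration, round robin can be written in a few lines of Python. The server names are placeholders.

```python
from itertools import cycle

def round_robin(servers):
    # Hand out the servers in order, starting over after the last one.
    return cycle(servers)

picker = round_robin(["rs1", "rs2", "rs3"])
picks = [next(picker) for _ in range(6)]
```

Six requests are spread as two per server, regardless of how busy each server is.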

2. Weighted round-robin scheduling

The Weighted Round Robin (WRR) algorithm refines the round robin algorithm. LVS takes the performance of each server into account by assigning each server a weight. If server A has weight 1 and server B has weight 2, the scheduler dispatches twice as many requests to server B as to server A. The higher a server's weight, the more requests it handles.
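One way to sketch weighted round robin in Python is the interleaved scheme below, which lowers a "current weight" threshold by the greatest common divisor on each pass. It assumes positive integer weights; the function name and the exact interleaving are illustrative and are not claimed to match the kernel line for line.

```python
from functools import reduce
from itertools import islice
from math import gcd

def weighted_round_robin(weights):
    """Yield server names in proportion to their positive integer weights."""
    names = list(weights)
    max_weight = max(weights.values())
    if max_weight <= 0:
        raise ValueError("at least one server needs a positive weight")
    step = reduce(gcd, weights.values())
    i, current = -1, 0
    while True:
        i = (i + 1) % len(names)
        if i == 0:
            # Start a new pass with a lower threshold; wrap at zero.
            current -= step
            if current <= 0:
                current = max_weight
        if weights[names[i]] >= current:
            yield names[i]

picks = list(islice(weighted_round_robin({"A": 1, "B": 2}), 6))
```

With weights 1 and 2, server B receives four of the six requests and server A receives two, matching the ratio described above.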

3. Minimum connection scheduling

The Least Connections (LC) algorithm assigns each new connection request to the server with the fewest connections. It is a dynamic scheduling algorithm that estimates a server's load from its number of currently active connections. The scheduler records the number of established connections for each server: when a request is dispatched to a server, its count increases by one; when a connection terminates or times out, its count decreases by one.

(When the real servers in the cluster have similar performance, the least connection scheduling algorithm balances the load well.)
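The bookkeeping described above can be sketched as follows. The class name `LeastConnections` and its methods are hypothetical; ties are broken by the order in which the servers were listed.

```python
class LeastConnections:
    def __init__(self, servers):
        self.active = {s: 0 for s in servers}  # server -> open connections

    def acquire(self):
        # Pick the server with the fewest connections and count the new one.
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        # Called when a connection closes or times out.
        self.active[server] -= 1

lb = LeastConnections(["rs1", "rs2", "rs3"])
first_three = [lb.acquire() for _ in range(3)]
lb.release("rs2")
next_pick = lb.acquire()
```

After `rs2` releases its connection it is the least loaded server, so the next request goes to it.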

4. Weighted least connection scheduling

The Weighted Least Connections (WLC) algorithm is a superset of least connection scheduling in which each server's weight represents its processing capacity. The default weight of a server is 1, and the system administrator can change a server's weight dynamically. When scheduling new connections, weighted least connections tries to keep each server's number of established connections proportional to its weight. The scheduler can automatically query the load of the real servers and adjust their weights dynamically.
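A minimal sketch of the selection rule, assuming the scheduler simply picks the server with the smallest connections-to-weight ratio and skips servers whose weight is 0. The function name and the sample numbers are made up.

```python
def weighted_least_connections(active, weights):
    # Ignore servers with weight 0, then minimise connections / weight.
    candidates = [s for s in active if weights[s] > 0]
    return min(candidates, key=lambda s: active[s] / weights[s])

active = {"A": 2, "B": 3, "C": 3}
weights = {"A": 1, "B": 2, "C": 3}
choice = weighted_least_connections(active, weights)
```

Here the ratios are 2, 1.5 and 1, so the new connection goes to C even though A holds fewer connections in absolute terms.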

5. Based on local minimum connection

The Locality-Based Least Connections (LBLC) algorithm balances load according to the destination IP address of the request packet. It is currently used mainly in cache cluster systems, because in a cache cluster the destination IP address of client request packets varies. The algorithm assumes that any back-end server can handle any request. Its design goal is to send requests with the same destination IP address to the same server as long as the server loads stay roughly balanced, which improves each server's access locality and cache hit rate and so raises the processing capacity of the whole cluster. LBLC first looks up the server most recently used for the request's destination IP address. If that server is available and not overloaded, the request is sent to it. If no such server exists, or if it is overloaded and another server is at half of its workload, the scheduler selects an available server by the "least connections" principle and sends the request there.
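The LBLC decision can be sketched as below. The per-server `capacity` value is an assumption introduced for the example: a server counts as overloaded when its connections exceed its capacity, and as "at half its workload" when its connections are below half its capacity. Availability checks are left out.

```python
class LBLC:
    def __init__(self, capacity):
        self.capacity = capacity                # server -> assumed load limit
        self.active = {s: 0 for s in capacity}  # server -> open connections
        self.last_server = {}                   # destination IP -> server

    def _overloaded(self, server):
        return self.active[server] > self.capacity[server]

    def _half_loaded_exists(self):
        return any(self.active[s] * 2 < self.capacity[s] for s in self.active)

    def schedule(self, dest_ip):
        server = self.last_server.get(dest_ip)
        if server is None or (self._overloaded(server)
                              and self._half_loaded_exists()):
            # Fall back to least connections and remember the new choice.
            server = min(self.active, key=self.active.get)
            self.last_server[dest_ip] = server
        self.active[server] += 1
        return server

lb = LBLC({"cache1": 2, "cache2": 2})
picks = [lb.schedule("10.0.0.1") for _ in range(4)]
```

The first three requests for 10.0.0.1 stay on `cache1` for locality; the fourth finds `cache1` overloaded while `cache2` is idle, so the mapping moves to `cache2`.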

6. Minimal connections based on locality with replication

The Locality-Based Least Connections with Replication (LBLCR) algorithm also balances load according to the destination IP address, and it is currently used mainly in cache cluster systems. It differs from LBLC in that it maintains a mapping from a destination IP address to a group of servers, whereas LBLC maintains a mapping from a destination IP address to a single server. The scheduler selects a server from the server group by the "least connections" principle. If that server is not overloaded, the request is sent to it. If it is overloaded, the scheduler selects a server from the whole cluster by the "least connections" principle, adds that server to the server group, and sends the request to it. In addition, when the server group has not been modified for a period of time, the busiest server is removed from the group to reduce the degree of replication.
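A sketch of the server-group variant follows. As in any toy model, several things are assumptions: the `capacity` threshold that defines "overloaded", the class and method names, and the `shrink` method, which the caller would invoke from a timer once a group has gone unmodified for a while (the timer itself is omitted).

```python
class LBLCR:
    def __init__(self, capacity):
        self.capacity = capacity                # server -> assumed load limit
        self.active = {s: 0 for s in capacity}  # server -> open connections
        self.groups = {}                        # destination IP -> servers

    def _least(self, servers):
        return min(servers, key=self.active.get)

    def schedule(self, dest_ip):
        group = self.groups.setdefault(dest_ip, [])
        server = self._least(group) if group else None
        if server is None or self.active[server] > self.capacity[server]:
            # Group is empty or its best member is overloaded:
            # take the least loaded server of the whole cluster.
            server = self._least(self.active)
            if server not in group:
                group.append(server)
        self.active[server] += 1
        return server

    def shrink(self, dest_ip):
        # Drop the busiest member to reduce the degree of replication.
        group = self.groups.get(dest_ip, [])
        if len(group) > 1:
            group.remove(max(group, key=self.active.get))

lb = LBLCR({"cache1": 1, "cache2": 1})
picks = [lb.schedule("10.0.0.1") for _ in range(3)]
group_before = list(lb.groups["10.0.0.1"])
lb.shrink("10.0.0.1")
group_after = list(lb.groups["10.0.0.1"])
```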

7. Target address hash scheduling

The Destination Hashing (DH) algorithm uses the destination IP address of the request as a hash key to look up a server in a statically allocated hash table. If that server is available and not overloaded, the request is sent to it; otherwise the lookup returns nothing.
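The lookup can be sketched like this. The table size of 256, the modulo hash and the `is_usable` callback are assumptions chosen to keep the example small; the kernel uses its own hash function.

```python
import ipaddress

def build_table(servers, size=256):
    # Statically assign every bucket to a server.
    return [servers[i % len(servers)] for i in range(size)]

def destination_hash(dest_ip, table, is_usable):
    bucket = int(ipaddress.ip_address(dest_ip)) % len(table)
    server = table[bucket]
    # Return nothing when the mapped server is down or overloaded.
    return server if is_usable(server) else None

table = build_table(["rs1", "rs2", "rs3"])
chosen = destination_hash("192.168.0.10", table, lambda s: True)
again = destination_hash("192.168.0.10", table, lambda s: True)
unavailable = destination_hash("192.168.0.10", table, lambda s: s != "rs2")
```

The same destination address always maps to the same bucket, and the scheduler does not look for a substitute when that server cannot take the request.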

8. Source address hash scheduling

The Source Hashing (SH) algorithm uses the source IP address of the request as a hash key to look up a server in a statically allocated hash table. If that server is available and not overloaded, the request is sent to it; otherwise the lookup returns nothing. It uses the same hash function as destination address hash scheduling, and its flow is essentially the same as well.
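A short sketch of source hashing, keyed on the client address instead of the destination. As before, the modulo hash, the table size and the client addresses are made-up illustration values.

```python
import ipaddress

def source_hash(client_ip, table, is_usable):
    # Same lookup as destination hashing, but keyed on the source address.
    bucket = int(ipaddress.ip_address(client_ip)) % len(table)
    server = table[bucket]
    return server if is_usable(server) else None

servers = ["rs1", "rs2", "rs3"]
table = [servers[i % len(servers)] for i in range(256)]

def always_up(server):
    return True

client_a = [source_hash("203.0.113.7", table, always_up) for _ in range(3)]
client_b = source_hash("203.0.113.9", table, always_up)
```

Every request from the same client reaches the same real server, which is why this scheduler is often chosen when a client should stay on one server.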

9. The shortest expected delay

The Shortest Expected Delay (SED) algorithm is based on the WLC algorithm. Suppose three servers A, B, and C have weights 1, 2, and 3 and, in this example, currently hold 1, 2, and 3 connections respectively. Under WLC every server then has the same connections-to-weight ratio, so a new request may be assigned to any of A, B, or C. SED instead computes (1 + connections) / weight for each server:

A: (1+1)/1 = 2
B: (1+2)/2 = 3/2
C: (1+3)/3 = 4/3

The request is then handed to the server with the smallest result, here C.
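The calculation above can be reproduced in Python. The function name is hypothetical, and the connection counts 1, 2 and 3 are the values implied by the worked figures.

```python
from fractions import Fraction

def shortest_expected_delay(active, weights):
    # Expected delay if the new request were added: (connections + 1) / weight.
    return min(active, key=lambda s: Fraction(active[s] + 1, weights[s]))

weights = {"A": 1, "B": 2, "C": 3}
active = {"A": 1, "B": 2, "C": 3}
costs = {s: Fraction(active[s] + 1, weights[s]) for s in active}
choice = shortest_expected_delay(active, weights)
```

`Fraction` keeps the results exact, so the costs come out as 2, 3/2 and 4/3 and the request goes to C.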

10. Least queue scheduling

The Never Queue (NQ) algorithm avoids queueing. If a real server has 0 connections, the request is assigned to it directly without any SED calculation; otherwise the SED calculation is used.
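A small sketch of this rule, assuming the scheduler falls back to the SED formula when no server is idle. The function name and the sample weights are made up.

```python
from fractions import Fraction

def never_queue(active, weights):
    # An idle server gets the request immediately, whatever its weight.
    for server, connections in active.items():
        if connections == 0:
            return server
    # Nobody is idle: fall back to shortest expected delay.
    return min(active, key=lambda s: Fraction(active[s] + 1, weights[s]))

weights = {"A": 1, "B": 4}
with_idle = never_queue({"A": 0, "B": 1}, weights)
all_busy = never_queue({"A": 1, "B": 1}, weights)
```

In the first call SED alone would prefer B (2/4 against 1/1), but A is idle and therefore wins; in the second call both servers are busy and the SED rule picks B.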
 


Origin blog.csdn.net/hugo_lei/article/details/106802339