Two, LVS load balancing

One, a basic introduction to LVS load balancing


LVS (Linux Virtual Server) is an open source load balancing project led by Dr. Zhang Wensong; LVS has since been integrated into the Linux kernel as a module. The project implements an IP-based load balancing solution for request scheduling. When Internet end users access the company's external load balancing server, their web requests are sent to the LVS scheduler in the Linux kernel. The scheduler decides, according to a preset algorithm, which back-end web server the request should be sent to; for example, the round robin algorithm can distribute external requests evenly among all back-end servers. Although the end user only talks to the LVS scheduler, the request is actually forwarded to a real back-end server. If the real servers share the same storage and provide the same service, then no matter which real server handles the request, the end user gets the same content, and the whole cluster is transparent to the user. Finally, depending on the forwarding mode used by LVS, the real servers return the requested data to the end user in different ways; the LVS operation modes are divided into NAT mode, TUN mode and DR mode.


Two, the basic working principle of LVS


  1. When a user sends a request to the load balancing scheduler (Director Server), the request is passed into kernel space.
  2. The PREROUTING chain receives the user request first and checks whether the destination IP is a local address; if so, the packet is sent to the INPUT chain.
  3. IPVS works on the INPUT chain. When the user request reaches INPUT, IPVS compares it with the cluster services it has defined; if the request matches a defined cluster service, IPVS forcibly rewrites the packet's destination IP address and port, and sends the new packet on to the POSTROUTING chain.
  4. POSTROUTING finds that the packet's destination IP address is exactly one of its back-end real servers, and the packet is finally routed to that back-end server (a toy sketch of this decision flow follows the list).
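
To make the flow above a little more concrete, here is a purely illustrative Python sketch of the decision made at the INPUT hook; it is a toy model, not kernel code, and the service table, addresses and field names are all invented for the example.

```python
# Toy model of the IPVS decision at the INPUT hook (illustrative only, not kernel code).
CLUSTER_SERVICES = {("10.0.0.100", 80): [("192.168.1.11", 80), ("192.168.1.12", 80)]}
LOCAL_IPS = {"10.0.0.100"}              # addresses (VIPs) owned by the Director

def handle_packet(packet, pick_real_server):
    # PREROUTING: only packets addressed to a local IP are handed to INPUT.
    if packet["dst_ip"] not in LOCAL_IPS:
        return "FORWARD (not addressed to us)"
    # INPUT: IPVS compares the packet with the defined cluster services.
    key = (packet["dst_ip"], packet["dst_port"])
    if key not in CLUSTER_SERVICES:
        return "deliver locally (not a cluster service)"
    # Matched a cluster service: rewrite destination and hand off to POSTROUTING.
    rip, rport = pick_real_server(CLUSTER_SERVICES[key])
    packet["dst_ip"], packet["dst_port"] = rip, rport
    return f"POSTROUTING -> routed to real server {rip}:{rport}"

print(handle_packet({"dst_ip": "10.0.0.100", "dst_port": 80},
                    lambda servers: servers[0]))
```
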

Three, LVS composition

The LVS program consists of two parts: ipvs and ipvsadm.

  1. ipvs (ip virtual server): a piece of code that works in kernel space; it is the part that actually performs the scheduling.
  2. ipvsadm: another piece that works in user space; it is responsible for writing rules for the ipvs kernel framework, i.e. defining which is the cluster service and which are the real back-end servers (Real Server). A short usage sketch follows this list.
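
As an illustration of this division of labour, the sketch below drives the user-space ipvsadm tool from Python's subprocess module so that it programs the in-kernel ipvs table. The VIP, real-server addresses and port are made-up examples; running it for real requires root privileges and the ipvsadm package.

```python
import subprocess

VIP = "10.0.0.100:80"                                   # hypothetical virtual service (VIP:port)
REAL_SERVERS = ["192.168.1.11:80", "192.168.1.12:80"]   # hypothetical RIPs

def run(cmd):
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)                     # needs root and ipvsadm installed

# ipvsadm (user space) writes the rules; ipvs (kernel space) does the scheduling.
run(["ipvsadm", "-A", "-t", VIP, "-s", "rr"])           # define the cluster service, round robin
for rip in REAL_SERVERS:
    run(["ipvsadm", "-a", "-t", VIP, "-r", rip, "-m", "-w", "1"])  # add real servers (NAT mode)
run(["ipvsadm", "-Ln"])                                 # show the kernel's ipvs table
```
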

Four, LVS related terms

  1. DS:Director Server. Load balancing refers to the front end node.
  2. RS:Real Server. The back-end server real work.
  3. VIP:Director Virtual IP To the outside directly to the user request, IP address, as the target user request.
  4. DIP:Director Server IP, IP address, and it is mainly used for internal communication with the host.
  5. RIP:Real Server IP, IP address back-end server.
  6. CIP:Client IP, IP address access client.

Five, the three ways LVS implements load balancing

The server running the IPVS software plays the role of the scheduler for the whole load balancing cluster, distributing the requests coming from clients to the real servers. LVS has three forwarding methods: NAT (Network Address Translation), TUN (IP tunneling) and DR (Direct Routing).

Analysis of the three operating modes

1, NAT-based LVS load balancing (LVS-NAT, Network Address Translation)

NAT (Network Address Translation) means network address translation. Its role is to modify packet headers so that private IP addresses inside an enterprise can access the Internet, and so that external users can access hosts with private IP addresses located inside the company.

The LVS/NAT approach is based on IP masquerading (MASQUERADE); in principle it is multi-target DNAT, so both the request and the response pass through the Director scheduler.
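
As a rough mental model (not the kernel implementation), the Python sketch below shows the two rewrites LVS-NAT performs: the destination is rewritten on the way in (multi-target DNAT) and the source is rewritten back to the VIP on the way out. All addresses are invented.

```python
# Toy model of LVS-NAT address rewriting (illustrative only).
VIP = "10.0.0.100"                                  # hypothetical Director VIP
REAL_SERVERS = ["192.168.1.11", "192.168.1.12"]     # hypothetical RIPs

def nat_request(packet, rip):
    # Director rewrites the destination (DNAT) so the packet reaches the chosen RS.
    packet["dst_ip"] = rip
    return packet

def nat_response(packet):
    # The RS replies via its gateway (the DIP); before the response leaves for the
    # client, the Director rewrites the source back to the VIP.
    packet["src_ip"] = VIP
    return packet

req = nat_request({"src_ip": "203.0.113.5", "dst_ip": VIP}, REAL_SERVERS[0])
resp = nat_response({"src_ip": REAL_SERVERS[0], "dst_ip": "203.0.113.5"})
print(req, resp)   # both directions pass through the Director
```
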


Advantages and disadvantages of LVS-NAT

Advantages:

  • Port mapping is supported.
  • RS can use any operating system.
  • Public IP addresses are saved.
  • DIP and RIP should be private addresses on the same network, and the RS gateway must point to the DIP.
  • Another benefit of NAT is that the back-end hosts are relatively well protected.

Disadvantages:

  • Both request and response packets are forwarded by the Director; under heavy load the Director can become the bottleneck of the whole system (in other words, it is less efficient).
2, TUN-based LVS load balancing (LVS-TUN, IP Tunneling)

In an LVS (NAT) cluster environment, all request and response packets have to be forwarded by the LVS scheduler, so if the number of back-end servers grows beyond about 10, the scheduler becomes the bottleneck of the whole cluster. We know that request packets are always much smaller than response packets, because the response packets carry the actual data the client needs. The idea of LVS (TUN) is therefore to separate requests from responses: the scheduler only processes the request packets, and the real servers send the response packets back to the client directly. IP tunneling is a packet encapsulation technique: the original packet is wrapped and a new header is added (with a new source and destination address), so that a packet whose destination is the scheduler's VIP can be encapsulated and forwarded through the tunnel to the selected back-end real server. In other words, the scheduler takes the original packet sent by the client, encapsulates it, and adds a new header whose destination is the real server it has selected. LVS (TUN) mode requires that the real servers can reach the external network directly, because once a real server receives and de-encapsulates the request, it sends the response packet with the requested data straight back to the client.

The LVS/TUN scheduling steps, in outline:
It is based on tunnel encapsulation: an IP packet is wrapped inside another IP packet.
When the Director receives a request, it selects a Real Server according to the scheduling algorithm.
When the Real Server receives the request from the Director, it responds to the CIP directly using the VIP configured on its lo interface.
So the client requests resources from the VIP, and the response it receives also comes from the VIP.
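
To make the encapsulation idea concrete, here is a toy Python sketch (not real tunneling code): the client's packet is left untouched and wrapped inside an outer header addressed to the chosen real server, which strips the wrapper and answers the client directly from the VIP. Every address below is hypothetical.

```python
# Toy model of the IP-in-IP encapsulation used by LVS-TUN (illustrative only).
VIP = "10.0.0.100"

def tun_forward(original_packet, dip, rip):
    # Director wraps the unmodified packet in an outer header DIP -> RIP.
    return {"outer_src": dip, "outer_dst": rip, "inner": original_packet}

def tun_respond(encapsulated):
    inner = encapsulated["inner"]
    # RS strips the outer header and answers the client directly from the VIP
    # configured on its lo/tunnel interface; the reply bypasses the Director.
    return {"src_ip": VIP, "dst_ip": inner["src_ip"], "payload": "response"}

client_packet = {"src_ip": "203.0.113.5", "dst_ip": VIP, "payload": "GET /"}
wrapped = tun_forward(client_packet, dip="198.51.100.1", rip="198.51.100.20")
print(tun_respond(wrapped))
```
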


Advantages and disadvantages of LVS-TUN

Advantages:

  • RIP, VIP and DIP should all be public addresses, and the RS gateway does not (and need not) point to the DIP.
  • The Director only handles inbound requests, which solves the bottleneck problem of LVS-NAT and reduces its load.
  • Request packets pass through the Director for scheduling, but response packets do not pass through the Director.

Disadvantages:

  • Port mapping is not supported, because responses do not pass back through the Director.
  • The RS operating system must support tunneling.
  • Tunnel encapsulation carries an extra performance cost and increases overhead.
3, DR-based LVS load balancing (LVS-DR, Direct Routing)

In LVS (TUN) mode a tunnel has to be established between the scheduler and the real servers, which again adds to the servers' burden. Like LVS (TUN), DR mode, also called direct routing mode, still lets the LVS scheduler handle only the inbound requests and choose a suitable real server according to the scheduling algorithm, while the back-end real server is ultimately responsible for sending the response packet back to the client. Unlike tunnel mode, direct routing mode (DR mode) requires the scheduler and the back-end servers to be in the same LAN, and the VIP address has to be shared by the scheduler and all back-end servers, because in the end the real server must set the source IP of the response packet to the VIP address and the destination IP to the client's IP. In this way the client accesses the VIP on the scheduler, and the response also comes from the VIP address (the VIP configured on the real server), so the client is unaware that back-end servers exist. Since many machines carry the same VIP address, direct routing mode requires that only the scheduler's VIP be visible to the outside, so that client requests are sent to the scheduler; the VIP on every real server must be configured on a non-ARP network device, i.e. a device that does not announce its MAC address and the corresponding IP to the network. The real servers' VIP is therefore invisible to the outside world, yet the real servers can still accept network requests whose destination is the VIP and use the VIP as the source address of their response packets. After the scheduler chooses a real server according to the algorithm, it does not modify the packet's IP data; it only changes the destination MAC address of the data frame to the MAC address of the selected real server and sends the frame to that real server through the switch. Throughout the process, the real servers' VIP does not need to be visible to the outside world.

The LVS/DR forwarding is based on rewriting the destination MAC address of the data frame; the IP packet itself is not encapsulated or modified.
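
A toy sketch of the DR forwarding step, purely illustrative: the Director leaves the IP packet alone and only swaps the frame's destination MAC to that of the selected real server, which answers from the VIP it holds on an ARP-suppressed interface. The MAC and IP values are invented.

```python
# Toy model of LVS-DR frame rewriting (illustrative only).
VIP = "10.0.0.100"
RS_MACS = {"rs1": "00:11:22:33:44:01", "rs2": "00:11:22:33:44:02"}  # hypothetical MACs

def dr_forward(frame, chosen_rs):
    # Only the layer-2 destination changes; source and destination IP stay untouched.
    frame["dst_mac"] = RS_MACS[chosen_rs]
    return frame

frame = {"dst_mac": "00:11:22:33:44:ff",            # Director's own MAC
         "src_ip": "203.0.113.5", "dst_ip": VIP}
print(dr_forward(frame, "rs1"))
# The chosen RS accepts the frame because it also holds the VIP (on lo, ARP-suppressed)
# and replies straight to 203.0.113.5 with source IP = VIP, bypassing the Director.
```
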

Advantages and disadvantages of LVS-DR

Advantages:

  • RIP can be a private address or a public address.
  • DIP and RIP only need to be on the same network segment.
  • Request packets pass through the Director for scheduling, but response packets do not pass through the Director.
  • RS can run most operating systems.

Disadvantages:

  • Port mapping is not supported.
  • The Director and the RS cannot be in different LANs (it does not work across a LAN).

Although each of the three modes has its advantages and disadvantages, DR is the most widely used LVS mode because of its performance and convenience.


Six, LVS load balancing scheduling algorithms

In the previous sections we learned about the three operating modes of LVS, but no matter which mode is used in an actual environment, the scheduling strategy and the scheduling algorithm are the core technology of LVS. The LVS kernel mainly implements the following ten scheduling algorithms.

1. Round robin scheduling

The round robin scheduling (Round Robin, 'RR') algorithm dispatches requests to the different servers in a cyclic manner; its biggest feature is simplicity. The round robin algorithm assumes that all servers have the same capacity to process requests, and the scheduler distributes requests evenly across all real servers.
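
A minimal round robin sketch in Python, assuming a hypothetical list of real servers:

```python
from itertools import cycle

servers = ["rs1", "rs2", "rs3"]     # hypothetical real servers
rr = cycle(servers)                  # cycle through them in order

for _ in range(6):
    print(next(rr))                  # rs1 rs2 rs3 rs1 rs2 rs3
```
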

2. Weighted round robin scheduling

Weighted round robin (Weight Round Robin, 'WRR') is mainly an optimization of and supplement to the round robin algorithm. LVS takes the performance of each server into account and assigns a weight to each server: if server A has weight 1 and server B has weight 2, the scheduler dispatches twice as many requests to server B as to server A. The higher a server's weight, the more requests it handles.
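
A simple way to picture WRR (not the exact kernel algorithm, which interleaves the servers more smoothly): with weights A=1 and B=2, B appears twice as often in the schedule. The server names and weights are invented.

```python
# Naive weighted round robin sketch (the kernel uses a smoother interleaving).
weights = {"A": 1, "B": 2}          # hypothetical servers and weights

schedule = [s for s, w in weights.items() for _ in range(w)]
for i in range(6):
    print(schedule[i % len(schedule)])   # A B B A B B
```
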

3. Least connection scheduling

The least connection scheduling (Least Connections, 'LC') algorithm assigns new connection requests to the server with the smallest number of current connections. Least connection scheduling is a dynamic scheduling algorithm: it estimates the load of each server by the number of connections currently active on it. The scheduler needs to record the number of connections established to each server; when a request is dispatched to a server, its connection count is incremented by 1, and when a connection terminates or times out, its connection count is decremented by 1.

(When the real servers in the cluster have similar performance, the least connection scheduling algorithm balances the load well.)
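
A least-connection sketch that keeps the per-server connection counter described above; the server names are hypothetical.

```python
active = {"rs1": 0, "rs2": 0, "rs3": 0}   # current connection count per server

def lc_pick():
    server = min(active, key=active.get)  # server with the fewest active connections
    active[server] += 1                   # +1 when a request is dispatched
    return server

def lc_release(server):
    active[server] -= 1                   # -1 when the connection ends or times out

print([lc_pick() for _ in range(4)])      # ['rs1', 'rs2', 'rs3', 'rs1']
```
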

4. Weighted least connection scheduling

Weighted least connection (Weight Least Connections, 'WLC') scheduling is a superset of least connection scheduling in which each server is given a weight representing its processing capacity. The default weight of a server is 1, and the system administrator can set a server's weight dynamically. Weighted least connection scheduling dispatches new connections so that the number of established connections on each server is, as far as possible, proportional to its weight. The scheduler can also query the load of the real servers automatically and adjust their weights dynamically.
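
WLC can be sketched as choosing the server with the smallest connections-to-weight ratio (a simplification of the kernel's comparison, which avoids division); the state below is invented.

```python
servers = {"rs1": {"weight": 1, "conns": 3},   # hypothetical per-server state
           "rs2": {"weight": 3, "conns": 5}}

def wlc_pick():
    # Keep established connections roughly proportional to the weights.
    return min(servers, key=lambda s: servers[s]["conns"] / servers[s]["weight"])

print(wlc_pick())   # rs2, because 5/3 < 3/1
```
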

5. Locality-based least connection scheduling

The locality-based least connection (Locality-Based Least Connections, 'LBLC') scheduling algorithm balances load according to the destination IP address of the request and is mainly used for cache cluster systems, because in a cache cluster the destination IP addresses of client request packets vary. It assumes that any back-end server can handle any request. The algorithm is designed to dispatch requests with the same destination IP address to the same server while keeping the load roughly balanced, in order to improve each server's locality and cache hit rate and thereby improve the processing capacity of the whole cluster. The LBLC algorithm first looks up the server most recently used for the request's destination IP address; if that server is available and not overloaded, the request is sent to it. If that server does not exist, or it is overloaded while some other server is only at half of its workload, an available server is chosen according to the "least connections" principle and the request is sent to that server.
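
A much-simplified LBLC sketch: remember which server last served a destination IP and reuse it while it is not overloaded, otherwise fall back to least connections. The overload check is reduced to a single threshold here, and all names and numbers are hypothetical.

```python
active = {"rs1": 2, "rs2": 0}        # hypothetical connection counts
OVERLOAD = 10                         # toy overload threshold
last_server_for = {}                  # destination IP -> server mapping

def lblc_pick(dst_ip):
    server = last_server_for.get(dst_ip)
    if server is None or active[server] > OVERLOAD:
        server = min(active, key=active.get)   # fall back to least connections
        last_server_for[dst_ip] = server       # remember the choice for locality
    return server

print(lblc_pick("198.51.100.7"), lblc_pick("198.51.100.7"))  # same server twice
```
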

6. Locality-based least connection scheduling with replication

The locality-based least connection with replication (Locality-Based Least Connections with Replication, 'LBLCR') algorithm also balances load by destination IP address and is mainly used for cache cluster systems. It differs from the LBLC algorithm in that it maintains a mapping from a destination IP address to a group of servers, whereas LBLC maintains a mapping from a destination IP address to a single server. A server is selected from the server group according to the "least connections" principle; if that server is not overloaded, the request is sent to it. If it is overloaded, a server is selected from the whole cluster according to the "least connections" principle, added to the server group, and the request is sent to it. At the same time, if the server group has not been modified for a while, the busiest server is removed from the group in order to reduce the degree of replication.
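
LBLCR keeps a set of servers per destination IP instead of a single one; a highly simplified sketch (the periodic removal of the busiest group member is omitted, and all names and numbers are invented):

```python
active = {"rs1": 2, "rs2": 0, "rs3": 1}   # hypothetical connection counts
OVERLOAD = 10                              # toy overload threshold
server_set_for = {}                        # destination IP -> set of servers

def lblcr_pick(dst_ip):
    group = server_set_for.setdefault(dst_ip, set())
    if group:
        server = min(group, key=lambda s: active[s])   # least connections inside the group
        if active[server] <= OVERLOAD:
            return server
    # Group empty, or its best member overloaded: pick from the whole cluster
    server = min(active, key=active.get)
    group.add(server)                                  # replicate the mapping
    return server

print(lblcr_pick("198.51.100.7"), server_set_for)
```
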

7. Destination address hashing scheduling

The destination address hashing (Destination Hashing, 'DH') scheduling algorithm uses the destination IP address of the request as the hash key to look up the corresponding server in a statically assigned hash table; if that server is available and not overloaded, the request is sent to it, otherwise nothing is returned.
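
A destination-hash sketch with a static server list; md5 here just stands in for any stable hash (the kernel uses its own hash function, and Python's built-in hash() is randomized per process, so it is avoided). Names and addresses are invented.

```python
import hashlib

SERVERS = ["rs1", "rs2", "rs3"]      # static, hypothetical assignment table

def dh_pick(dst_ip):
    digest = hashlib.md5(dst_ip.encode()).hexdigest()
    return SERVERS[int(digest, 16) % len(SERVERS)]   # same destination -> same server

print(dh_pick("198.51.100.7"), dh_pick("198.51.100.7"))
```
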

8. Source address hashing scheduling

The source address hashing (Source Hashing, 'SH') scheduling algorithm uses the source IP address of the request as the hash key to look up the corresponding server in a statically assigned hash table; if that server is available and not overloaded, the request is sent to it, otherwise nothing is returned. It uses the same hash function as destination address hashing, and its procedure is essentially the same as that of the destination address hashing algorithm.
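
Source hashing is the same sketch keyed on the client IP instead, which pins each client to one server; again the list and address are hypothetical.

```python
import hashlib

SERVERS = ["rs1", "rs2", "rs3"]      # static, hypothetical assignment table

def sh_pick(src_ip):
    digest = hashlib.md5(src_ip.encode()).hexdigest()
    return SERVERS[int(digest, 16) % len(SERVERS)]   # same client -> same server

print(sh_pick("203.0.113.5"))
```
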

9. Shortest expected delay scheduling

Shortest expected delay scheduling (Shortest Expected Delay, 'SED') is based on the WLC algorithm. As an example, suppose three servers A, B and C have weights 1, 2 and 3 respectively, and currently have 1, 2 and 3 active connections respectively. With the WLC algorithm a new incoming request might be given to any of A, B or C. With the SED algorithm the following computation is performed first:

A: (1 + 1) / 1 = 2; B: (1 + 2) / 2 = 3/2; C: (1 + 3) / 3 = 4/3. The request is given to the server with the smallest result, in this case C. (The quantity computed is (active connections + 1) / weight.)
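
The same arithmetic in a short sketch, using overhead = (active connections + 1) / weight; the weights and connection counts mirror the example above.

```python
# SED overhead = (active connections + 1) / weight; pick the smallest.
servers = {"A": {"weight": 1, "conns": 1},
           "B": {"weight": 2, "conns": 2},
           "C": {"weight": 3, "conns": 3}}

def sed_pick():
    overhead = {s: (v["conns"] + 1) / v["weight"] for s, v in servers.items()}
    print(overhead)                  # A: 2.0, B: 1.5, C: 1.33...
    return min(overhead, key=overhead.get)

print(sed_pick())                    # C has the smallest expected delay
```
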

10. Minimum queue scheduling

The minimum queue scheduling (Never Queue, 'NQ') algorithm needs no queue: if some real server's connection count is 0, the request is assigned to it directly, without performing the SED computation; otherwise SED is used.
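
NQ can be sketched as a small wrapper around SED: an idle server (zero connections) is used immediately, and only otherwise does the SED computation decide. The state below is invented.

```python
servers = {"A": {"weight": 1, "conns": 1},
           "B": {"weight": 2, "conns": 0},   # idle server
           "C": {"weight": 3, "conns": 3}}

def nq_pick():
    for name, v in servers.items():
        if v["conns"] == 0:
            return name                       # never queue behind a busy server
    # No idle server: fall back to the SED overhead (conns + 1) / weight.
    return min(servers, key=lambda s: (servers[s]["conns"] + 1) / servers[s]["weight"])

print(nq_pick())                              # B is idle, so B is chosen
```
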


Reference:
https://blog.csdn.net/weixin_40470303/article/details/80541639
https://www.jianshu.com/p/8a61de3f8be9
