Linux load balancing software LVS, part one (concepts)

1. Introduction to LVS
LVS is short for Linux Virtual Server. It is a free software project started by Dr. Zhang Wensong, and its official website is www.linuxvirtualserver.org. LVS is now part of the standard Linux kernel. Before the Linux 2.4 kernel, using LVS required recompiling the kernel to add the LVS function modules. Since the Linux 2.4 kernel, these modules have been built in, so the functions LVS provides can be used directly without applying any kernel patch.
The goal of LVS is to build a high-performance, highly available server cluster from LVS load balancing and the Linux operating system, with good reliability, scalability, and manageability, so that optimal service performance is achieved at low cost.
LVS started in 1998 and has since developed into a relatively mature project. It can be used to build highly scalable, highly available network services, such as WWW, cache, DNS, FTP, mail, and video/audio on-demand services. Many well-known websites and organizations run cluster systems built with LVS, for example the Linux portal www.linux.com, Real (www.real.com), known for the audio and video services behind RealPlayer, and sourceforge.net, the world's largest open source website.
2. LVS Architecture
A server cluster built with LVS consists of three parts: the front-end load balancing layer (Load Balancer), the middle server group layer (Server Array), and the bottom shared data storage layer (Shared Storage). All of this is transparent to users, who simply see a high-performance service provided by a single virtual server.
The LVS architecture is shown in Figure 1:

Figure 1 Architecture of LVS
 

The following is a detailed introduction to the various components of LVS:
• Load Balancer layer: Located at the front of the cluster, it consists of one or more load schedulers (Director Servers). The LVS module is installed on the Director Server, whose main function is similar to that of a router: it holds the routing tables configured for LVS and uses them to distribute user requests to the application servers (Real Servers) in the Server Array layer. A monitoring module for the Real Server services, Ldirectord, should also be installed on the Director Server. It monitors the health of each Real Server service, removes a Real Server from the LVS routing table when it becomes unavailable, and adds it back when it recovers.
• Server Array layer: A group of machines that actually run the application services. A Real Server can be one or more of a web server, mail server, FTP server, DNS server, or video server. The Real Servers can be connected over a LAN or distributed across a WAN. In practice, a Director Server can also act as a Real Server at the same time.
• Shared Storage layer: A storage area that gives all Real Servers shared storage space and consistent content. Physically, it usually consists of disk array devices. To keep content consistent, data is generally shared through the NFS network file system. On a busy production system, however, NFS does not perform very well, and a cluster file system can be used instead, such as Red Hat's GFS or Oracle's OCFS2.
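The monitoring behavior described for the Load Balancer layer (drop a Real Server when its service fails, add it back when it recovers) can be sketched as follows. This is a minimal illustration in Python, not Ldirectord's actual implementation; the function names `tcp_check` and `refresh_table` and the addresses in the usage note are hypothetical.

```python
import socket
from typing import Callable, Iterable, Set, Tuple

Server = Tuple[str, int]  # (host, port) of a Real Server service


def tcp_check(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def refresh_table(table: Set[Server],
                  real_servers: Iterable[Server],
                  check: Callable[[str, int], bool] = tcp_check) -> Set[Server]:
    """Return an updated forwarding table after one round of health checks.

    Healthy servers are added (or kept); failed servers are removed.
    """
    updated = set(table)
    for host, port in real_servers:
        if check(host, port):
            updated.add((host, port))
        else:
            updated.discard((host, port))
    return updated
```

A monitor would call `refresh_table` periodically, for example `table = refresh_table(table, [("10.0.0.11", 80), ("10.0.0.12", 80)])`, and hand the result to the scheduler. Passing the check as a parameter keeps the table logic separate from the probe, so an HTTP or application-level check could be swapped in.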
As the overall structure shows, the Director Server is the core of LVS. Currently, only Linux and FreeBSD can be used as the operating system for a Director Server. The Linux 2.6 kernel supports LVS without any extra setup, whereas FreeBSD is rarely used as a Director Server and its performance in that role is not very good.
A Real Server can run on almost any platform: Linux, Windows, Solaris, AIX, and the BSD family are all well supported.

3. Features of LVS Cluster
3.1 IP Load Balancing and Load Scheduling Algorithms

1. IP load balancing technology
There are many ways to implement load balancing, including DNS round-robin resolution, client-side scheduling, application-layer scheduling based on system load, and scheduling based on IP addresses. Among these approaches, IP load balancing is the most efficient.
LVS implements IP load balancing through the IPVS module, the core software of an LVS cluster. IPVS is installed on the Director Server, where a virtual IP address is created; users must access the service through this address. It is generally called the VIP (Virtual IP) of LVS. A request first reaches the load scheduler through the VIP, and the load scheduler then selects a service node from the Real Server list to respond to it.
How the scheduler forwards a request to the Real Server that provides the service, and how the Real Server returns data to the user, is the key technology in IPVS. IPVS implements three load balancing mechanisms: NAT, TUN, and DR. The details are as follows:
• VS/NAT (Virtual Server via Network Address Translation): the virtual server is implemented with network address translation. When a user request reaches the scheduler, the scheduler rewrites the destination address of the request packet (the virtual IP address) to the address of the selected Real Server, changes the destination port to the corresponding port on that Real Server, and then sends the packet to it. When the Real Server returns data to the user, the response must pass through the load scheduler again, which changes the source address and source port of the packet back to the virtual IP address and its port before sending the data to the user. This completes the load scheduling process.
In NAT mode, therefore, both request and response packets must have their addresses rewritten by the Director Server. As the number of user requests grows, the processing capacity of the scheduler becomes a bottleneck.
• VS/TUN (Virtual Server via IP Tunneling): the virtual server is implemented with IP tunneling. Connection scheduling and management are the same as in VS/NAT, but packets are forwarded differently. In VS/TUN, the scheduler forwards the user request to a Real Server through an IP tunnel, and that Real Server responds to the user directly, without going back through the front-end scheduler. There is also no restriction on where the Real Server is located: it can be on the same network segment as the Director Server or on a separate network. In TUN mode, therefore, the scheduler handles only the request packets, and the throughput of the cluster is greatly improved.
• VS/DR (Virtual Server via Direct Routing): the virtual server is implemented with direct routing. Connection scheduling and management are the same as in VS/NAT and VS/TUN, but packets are forwarded differently. VS/DR sends the request to the Real Server by rewriting the MAC address of the request packet, and the Real Server returns the response directly to the client, which avoids the IP tunnel overhead of VS/TUN. This mode has the highest performance of the three, but it requires that the Director Server and every Real Server each have a network card connected to the same physical network segment.
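The difference between the three modes comes down to which header fields the scheduler changes. The sketch below models that in Python with a toy `Packet` type. It is an illustration only, not IPVS code: the class, the function names, and the addresses in the usage note are hypothetical.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Packet:
    """A toy packet header holding only the fields the three modes touch."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    dst_mac: str
    inner: Optional["Packet"] = None  # original packet when IP-in-IP encapsulated


def nat_forward(request: Packet, real_ip: str, real_port: int) -> Packet:
    """VS/NAT, inbound: rewrite destination IP and port to the chosen Real Server."""
    return replace(request, dst_ip=real_ip, dst_port=real_port)


def nat_reply(reply: Packet, vip: str, vport: int) -> Packet:
    """VS/NAT, outbound: rewrite source IP and port back to the virtual service."""
    return replace(reply, src_ip=vip, src_port=vport)


def tun_forward(request: Packet, director_ip: str, real_ip: str) -> Packet:
    """VS/TUN: wrap the untouched request in an outer header sent to the Real Server."""
    # Ports are set to 0 here because the outer IP header carries none.
    return Packet(director_ip, 0, real_ip, 0, request.dst_mac, inner=request)


def dr_forward(request: Packet, real_mac: str) -> Packet:
    """VS/DR: rewrite only the destination MAC; the IP header still targets the VIP."""
    return replace(request, dst_mac=real_mac)
```

For a request such as `Packet("198.51.100.7", 40000, "203.0.113.10", 80, "aa:aa:aa:aa:aa:01")`, `nat_forward` changes the destination IP and port, `tun_forward` leaves the request intact inside `inner`, and `dr_forward` changes nothing but `dst_mac`. Only the NAT path needs `nat_reply`, which is why the scheduler sees response traffic in NAT mode and not in the other two.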

2. Load Scheduling Algorithm
As mentioned above, the load scheduler dynamically selects a Real Server to respond to each user request according to the load of the servers. How is this dynamic selection made? That is the job of the load scheduling algorithm. To suit different network service requirements and server configurations, IPVS implements eight load scheduling algorithms. The four most commonly used are described in detail here; for the remaining four, please refer to other materials.
• Round Robin
"Round Robin" scheduling is also called 1:1 scheduling. The scheduler assigns incoming user requests to the Real Servers in the cluster one by one, in order. The algorithm treats every Real Server equally, regardless of its actual load or number of connections.
• Weighted Round Robin
"Weighted Round Robin" scheduling distributes requests according to the processing capacity of each Real Server. A different scheduling weight can be set for each Real Server: a higher weight for a more powerful server and a lower weight for a weaker one. This ensures that servers with more processing power handle more traffic and that server resources are used fully and sensibly. The scheduler can also query the load of the Real Servers automatically and adjust their weights dynamically.
• Least Connections
"Least Connections" scheduling dynamically sends each network request to the server with the fewest established connections. If the Real Servers in the cluster have similar performance, this algorithm balances the load well.
• Weighted Least Connections
"Weighted Least Connections" scheduling is a superset of "Least Connections" scheduling. Each service node expresses its processing capacity with a weight, which the system administrator can set dynamically; the default weight is 1. When assigning new connections, Weighted Least Connections scheduling tries to keep the number of established connections on each service node proportional to its weight.
The other four scheduling algorithms are Locality-Based Least Connections, Locality-Based Least Connections with Replication, Destination Hashing, and Source Hashing. They are not described in this article. For more detail on them, see the LVS Chinese site zh.linuxvirtualserver.org.
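The four algorithms described in detail above can be sketched in a few lines of Python. This is a minimal illustration, not the IPVS kernel code: the `RealServer` class and function names are hypothetical, and the weighted round-robin uses one common gcd-based interleaving that spreads picks out instead of sending consecutive bursts to the heaviest server.

```python
from dataclasses import dataclass
from functools import reduce
from itertools import cycle
from math import gcd
from typing import Iterator, List


@dataclass
class RealServer:
    name: str
    weight: int = 1   # default weight of 1, as in the text
    active: int = 0   # currently established connections


def round_robin(servers: List[RealServer]) -> Iterator[RealServer]:
    """Yield servers in order, one request each, ignoring load."""
    return cycle(servers)


def weighted_round_robin(servers: List[RealServer]) -> Iterator[RealServer]:
    """Yield servers so that each appears in proportion to its weight."""
    weights = [s.weight for s in servers]
    step = reduce(gcd, weights, 0)   # gcd of all weights
    top = max(weights, default=0)    # largest weight
    i, current = -1, 0
    while top > 0:
        i = (i + 1) % len(servers)
        if i == 0:
            # Start of a pass: lower the threshold, wrapping back to the top.
            current -= step
            if current <= 0:
                current = top
        if servers[i].weight >= current:
            yield servers[i]


def least_connections(servers: List[RealServer]) -> RealServer:
    """Pick the server with the fewest established connections."""
    return min(servers, key=lambda s: s.active)


def weighted_least_connections(servers: List[RealServer]) -> RealServer:
    """Pick the server with the lowest connections-to-weight ratio."""
    usable = [s for s in servers if s.weight > 0]
    return min(usable, key=lambda s: s.active / s.weight)
```

With servers A, B, and C weighted 4, 3, and 2, the weighted round-robin generator yields A, A, B, A, B, C, A, B, C and then repeats. If those servers hold 6, 5, and 4 connections, `least_connections` picks C, while `weighted_least_connections` picks A, because 6/4 is the lowest ratio.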

3.2 High Availability
LVS works at the kernel level, so it has high processing performance, and a load balancing cluster built with LVS has excellent processing capacity. The failure of any single service node does not affect the normal use of the whole system. Because the load is balanced sensibly, the application can serve very high loads and support millions of concurrent connection requests. With a 100 Mbit/s NIC and VS/TUN or VS/DR scheduling, the throughput of the whole cluster can reach 1 Gbit/s; with a Gigabit NIC, the maximum throughput can approach 10 Gbit/s.
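The throughput figures above fit a simple back-of-the-envelope model: in VS/TUN and VS/DR the scheduler's NIC carries only requests, so cluster output can exceed the NIC's capacity by roughly the ratio of response size to request size. The sketch below assumes a 10:1 ratio purely for illustration; that ratio and the function name are assumptions, not values taken from LVS documentation.

```python
def max_cluster_throughput_mbps(director_nic_mbps: float,
                                response_to_request_ratio: float) -> float:
    """Rough upper bound on response traffic when the director carries requests only."""
    return director_nic_mbps * response_to_request_ratio
```

Under that assumed ratio, `max_cluster_throughput_mbps(100, 10)` gives 1000 Mbit/s (1 Gbit/s) and `max_cluster_throughput_mbps(1000, 10)` gives 10000 Mbit/s (10 Gbit/s). In VS/NAT the same NIC must also carry every response, so no such multiplier applies.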

3.3 High reliability
LVS load balancing cluster software is widely used in enterprises, schools, and other sectors, and many large, critical websites at home and abroad also use it, so its reliability has been well confirmed in practice. Many LVS load balancing systems have run for long periods without ever being restarted, which illustrates the stability and reliability of LVS.

3.4 Applicable environment
LVS currently supports only Linux and FreeBSD for the front-end Director Server, but it supports most TCP- and UDP-based protocols. Supported TCP applications include HTTP, HTTPS, FTP, SMTP, POP3, IMAP4, PROXY, LDAP, and SSMTP. Supported UDP applications include DNS, NTP, ICP, and video and audio streaming protocols.
LVS places no restrictions on the operating system of a Real Server. A Real Server can run any operating system that supports TCP/IP, including Linux, various Unixes (such as FreeBSD, Sun Solaris, and HP-UX), Mac OS, and Windows.

3.5 Open source software 
LVS cluster software is free software released under the GPL (GNU General Public License). Users can therefore obtain the source code and modify it to suit their needs, but modifications must be released under the GPL.
