The road to learning cloud computing - LVS load balancing

LVS

1. Introduction to load balancing clusters

1. What is a cluster?

Cluster technology is a relatively new approach that achieves comparatively large gains in performance, reliability, flexibility, and so on at low cost. Task scheduling is the core technology of a cluster system.

Once a cluster is formed, multiple computers can work together to handle massive numbers of requests (load balancing), achieving high processing efficiency; they can also back each other up (high availability), so that the system as a whole keeps working normally even if any single machine fails.

2. Load balancing cluster technology

Load Balance: load balancing clusters give enterprises an effective answer to capacity problems, spreading the load as evenly as possible across all the computers in the cluster.

Load usually includes application processing load and network traffic load. Each node can bear a certain amount of processing load, and different load balancing algorithms can be used to distribute that load dynamically among the nodes.

3. Implementation methods and products of load balancing cluster technology

Types of load balancing technology: layer 4 load balancing and layer 7 load balancing
Implementation methods: hardware load balancing devices or software load balancing
Hardware load balancing products: F5, Sangfor, Radware
Software load balancing products: LVS (Linux Virtual Server), Haproxy, Nginx, ATS (Apache Traffic Server)

4. Load balancing implementation diagram


5. Load balancing classification

1. Layer 2 load balancing (MAC):
Generally a virtual MAC address is used: external requests target the virtual MAC address, and on receiving them the load balancer rewrites them to the real MAC address of a backend server.
2. Layer 3 load balancing (IP):
Generally a virtual IP address is used: external requests target the virtual IP address, and on receiving them the load balancer rewrites them to the real IP address of a backend server.
3. Layer 4 load balancing (TCP):
Building on layer 3 load balancing, it uses IP + port to accept requests and then forwards them to the corresponding machine, i.e. it balances on IP address and port.
Products that implement layer 4 load balancing include:
F5: hardware load balancer with rich functionality but high cost.
LVS: heavyweight layer 4 load balancing software.
nginx: lightweight layer 4 load balancing software with caching and flexible regular expressions.
haproxy: emulates layer 4 forwarding and is more flexible.
4. Layer 7 load balancing (http)
Accepts requests based on virtual URL, IP, or host name, then forwards them to the appropriate server. Layer 7 load balancing works on virtual URLs or host names. It builds on layer 4 load balancing (there is no layer 7 without layer 4) and additionally considers application-layer characteristics: for example, when balancing the same web servers, besides deciding whether to handle traffic based on VIP + port 80, it can also decide based on the layer 7 URL or the browser type.
The software that implements layer 7 load balancing includes:
haproxy: a natural load balancer with full layer 7 proxying, session persistence, marking, and path-based forwarding;
nginx: strong mainly on the http and mail protocols, with performance similar to haproxy.
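As a quick illustration of the difference, a layer 7 balancer can route two requests for the same IP and port to different backends purely on the Host header, which a layer 4 balancer cannot tell apart (illustrative addresses and hostnames, assuming a layer 7 balancer listening at 192.168.1.2):

curl -H "Host: site-a.example.com" http://192.168.1.2/
curl -H "Host: site-b.example.com" http://192.168.1.2/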

6. The difference between layer 4 and layer 7 load balancing

Difference         Layer 4 load balancing                  Layer 7 load balancing
Based on           IP + port                               Virtual URL or host name
Similar to         Router                                  Proxy server
Complexity         Low                                     High
Performance        High; no need to parse content          Moderate; must parse URLs, cookies, HTTP headers, etc.
Security           Low                                     High
Extra functions    None                                    Session persistence, image compression, etc.

Summary: the biggest differences between layer 4 and layer 7 load balancing come down to efficiency and functionality. A layer 4 architecture is relatively simple: it never parses message content, so it achieves higher network throughput and processing capacity. Layer 7 load balancing's strengths are its rich features and flexible, powerful control. When designing a business architecture, the choice between layer 4 and layer 7 must be weighed against the specific situation.

2. Introduction to LVS

1. Introduction to LVS
LVS is the abbreviation of Linux Virtual Server. It is a free software project initiated by Dr. Zhang Wensong; its official website is www.linuxvirtualserver.org. LVS is now part of the standard Linux kernel, so its performance is high: the load balancing function is carried out by the Linux kernel itself.

What the LVS software does: using the load balancing technology provided by LVS together with the Linux operating system, it builds a high-performance, highly available server cluster with good reliability, scalability, and manageability, delivering optimal service performance at low cost.

2. Advantages and disadvantages of LVS
Advantages:
High concurrency: LVS works at the kernel network level and has excellent carrying capacity and concurrent processing power; a single LVS load balancer supports tens of thousands of concurrent connections.
Strong stability: LVS distributes traffic at layer 4 of the network, which gives it the best performance and stability among load balancing software, with extremely low memory and CPU consumption.
Low cost: hardware load balancers cost anywhere from tens of thousands to hundreds of thousands or even millions, while LVS can be deployed for free on a single server, making it extremely cost-effective.
Simple configuration: LVS needs only a few commands to configure, and the commands can be scripted for management.
Multiple algorithms: LVS supports several scheduling algorithms that can be deployed flexibly according to the business scenario.
Multiple working modes: different working modes can be chosen to match how the production environment needs requests handled.
Wide applicability: because LVS works at layer 4, it can load balance almost any application, including http, databases, DNS, ftp services, and so on.
Disadvantages: it works at layer 4 and does not support layer 7 rule modification, and its mechanism is relatively heavyweight, so it is not well suited to small-scale applications.

3. Composition of LVS

LVS consists of two components: ipvs and ipvsadm.

1) ipvs (IP Virtual Server): code that works in kernel space. It is what actually performs the scheduling, forwarding requests according to user-defined cluster services.

2) ipvsadm: a command-line tool in user space, used to manage cluster services and the RSs (Real Servers) behind them.
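Since ipvs runs inside the kernel, you can check that the module is available before configuring anything (a quick sanity check; ip_vs is the standard module name in mainline kernels):

[root@lvs-dr ~]# modprobe ip_vs                # load the IPVS kernel module if it is not already loaded
[root@lvs-dr ~]# lsmod | grep ip_vs            # confirm ip_vs (and any scheduler modules) are present
[root@lvs-dr ~]# cat /proc/net/ip_vs           # the kernel virtual server table; empty until rules are added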

4. LVS professional terminology
VS: Virtual Server #Virtual service
RS: Real Server #Back-end real request processing server
CIP: Client IP #Client IP
VIP: Virtual IP #Virtual IP address, the IP address published to users for access
DIP: Director Server IP #The IP address mainly used for communication with internal hosts.
RIP: Real Server IP # The IP address of the back-end real request processing server

3. LVS working mode

1. Four working modes of LVS load balancing

LVS/NAT: network address translation mode; both inbound and outbound traffic passes through the director (IP load balancing: it rewrites IP addresses) - uses layer 3
LVS/DR: direct routing mode; only inbound traffic passes through the director (data-link-layer load balancing: it rewrites the destination MAC address) - uses layer 2 MAC addresses
LVS/TUN: tunnel mode; only inbound traffic passes through the director
LVS/FULL-NAT: bidirectional translation; the request's source address is rewritten to the DIP and its destination to an RIP, while the response's source address is rewritten to the VIP and its destination to the CIP

2. Principles, advantages and disadvantages of the four working modes

1) NAT mode:
Principle: the director rewrites the destination IP in the header of the packet sent by the client to the IP address of one of the RSs and sends the packet to that RS for processing. When the RS finishes, the director rewrites the packet's source IP to its own IP (the VIP) and the destination address to the client's IP. Both inbound and outbound traffic must pass through the load balancer.
Advantages: the physical servers in the cluster can run any operating system that supports TCP/IP; only the load balancer needs a legal IP address.
Disadvantages: limited scalability. When the number of server nodes (ordinary PC servers) grows too large, the load balancer becomes the bottleneck of the entire system, because every request packet and response packet flows through it. With too many nodes, large numbers of packets converge on the load balancer and everything slows down.
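As a minimal sketch of NAT mode (illustrative addresses, reusing this article's VIP; the -m flag selects NAT/masquerading, as listed in the ipvsadm options below):

ipvsadm -A -t 192.168.58.140:80 -s rr
ipvsadm -a -t 192.168.58.140:80 -r 192.168.58.162:80 -m
ipvsadm -a -t 192.168.58.140:80 -r 192.168.58.155:80 -m

Each RS must also use the director's DIP as its default gateway, so that responses flow back through the director.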
2) DR (direct routing) mode:
Principle: the load balancer and the RSs all serve the same VIP externally, but only the Director answers ARP requests for it; every RS stays silent about ARP requests for that IP. In other words, the gateway directs every request for the service IP to the Director. On receiving a packet, the Director chooses an RS according to the scheduling algorithm, rewrites the destination MAC address to that RS's MAC (the IP stays the same), and forwards the request to that RS. The RS receives the packet and, since it also holds the VIP, returns the response directly to the client after processing, as if it had received the request from the client directly. DR is the best-performing mode.
In DR mode, LVS and the RSs must all be bound to the same VIP (on the RSs, the VIP is bound to the loopback interface).
Advantages: as in TUN (tunnel) mode, the load balancer only distributes requests; response packets return to the client over separate routes. Compared with VS-TUN, VS-DR needs no tunnel structure, so most operating systems can act as the physical servers.
Disadvantages: (less a flaw than a limitation) the load balancer's network card must be on the same physical segment as the RSs' network cards.
3) IP Tunnel mode
Principle: for most Internet services, request packets are short while response packets are large. In tunnel mode the director encapsulates the packet sent by the client in a new IP header (carrying only the destination RS's IP) and sends it to the RS. The RS strips the outer header, restores the original packet, processes it, and returns the response directly to the client without going back through the load balancer. Because the RS must unwrap packets sent by the director, its kernel must support the IP tunneling protocol, i.e. be compiled with the IPTUNNEL option.
Advantages: the load balancer is only responsible for distributing request packets to the backend nodes, while the RSs send responses directly to the users. This greatly reduces the traffic flowing through the load balancer, which is no longer the system bottleneck and can handle enormous request volumes: one load balancer can feed many RSs, and those RSs can even be spread across regions on the public Internet.
Disadvantages: tunnel-mode RS nodes need legal IPs, and every server must support the "IP Tunneling" (IP encapsulation) protocol, which may limit them to certain Linux systems.
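A minimal sketch of TUN mode (illustrative addresses; -i selects tunnel mode, as listed in the ipvsadm options below). On the director:

ipvsadm -A -t 192.168.58.140:80 -s rr
ipvsadm -a -t 192.168.58.140:80 -r 192.168.58.162 -i

On each RS, the IPIP module must be loaded and the VIP bound to the tunnel interface:

modprobe ipip                                  # load the IP tunneling (IPIP) module
ip addr add 192.168.58.140/32 dev tunl0        # bind the VIP to the tunnel interface
ip link set tunl0 up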
4) FULL-NAT mode (for understanding only)
Principle: the client sends a request to the VIP. The Director receives it, sees that it is for a backend service, and performs full NAT on the request message: the source IP is rewritten to the DIP and the destination IP to the RIP of one of the backend RSs, and the packet is forwarded to the backend. The RS receives and answers the request; its response has source IP = RIP and destination IP = DIP, so internal routing carries it back to the Director. The Director then performs full NAT on the response, rewriting the source address to the VIP and the destination address to the CIP.
In short: DNAT on the request, SNAT on the response.

3. The difference between the four working modes

lvs-nat and lvs-fullnat: both request and response messages pass through the Director
  lvs-nat: the RIP's gateway must point to the DIP
  lvs-fullnat: the RIP and DIP need not be on the same IP network, but they must be able to communicate
lvs-dr and lvs-tun: request messages pass through the Director, but response messages go directly from the RS to the client
  lvs-dr: implemented by encapsulating a new MAC header; forwarding happens at the MAC layer, so the Director and RSs must share a LAN
  lvs-tun: implemented by encapsulating a new IP header outside the original IP packet; supports long-distance communication

4. LVS management tool—ipvsadm

Starting from version 2.4, the Linux kernel supports LVS by default. To use the capabilities of LVS, you only need to install an LVS management tool: ipvsadm.
1. Installing the LVS management tool ipvsadm
Install ipvsadm: yum -y install ipvsadm
Package: ipvsadm
Main program: /usr/sbin/ipvsadm
Rule-saving tool: /usr/sbin/ipvsadm --save > /etc/sysconfig/ipvsadm
Configuration file: /etc/sysconfig/ipvsadm-config

2. Command options

-A --add-service    # add a new virtual server record to the virtual server table
-t    # the service is TCP
-u    # the service is UDP
-s --scheduler    # scheduling algorithm to use: rr | wrr | lc | wlc; the default is wlc
Example: ipvsadm -A -t 192.168.1.2:80 -s wrr

-a --add-server    # add a new real server record to the virtual server table
-t --tcp-service    # the virtual server provides a TCP service
-u --udp-service    # the virtual server provides a UDP service
-r --real-server    # real server address
-m --masquerading    # set the LVS working mode to NAT
-w --weight    # real server weight
-g --gatewaying    # set the LVS working mode to direct routing (the LVS default)
-i --ipip    # set the LVS working mode to tunnel mode
-p --persistent    # session persistence time: keep traffic going to the same realserver for this long

Example: ipvsadm -a -t 192.168.1.2:80 -r 192.168.2.10:80 -m -w 1

-E --edit-service    # edit a virtual server record in the kernel virtual server table
-D --delete-service    # delete a virtual server record from the kernel virtual server table
-C --clear    # clear all records in the kernel virtual server table
-R --restore    # restore virtual server rules
-S --save    # save virtual server rules to standard output in a format readable by -R
-d --delete-server    # delete a real server record from a virtual server record
-L|-l --list    # display the kernel virtual server table
-n --numeric    # output addresses and ports in numeric form
--stats    # statistics
--rate    # rate information

-Z --zero    # zero the virtual service table counters (current connection counts, etc.)
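For example, rules built interactively can be saved with -S and loaded back with -R (a short round-trip sketch):

[root@lvs-dr ~]# ipvsadm -S -n > /etc/sysconfig/ipvsadm    # dump the current rules in restorable form
[root@lvs-dr ~]# ipvsadm -C                                # clear the kernel virtual server table
[root@lvs-dr ~]# ipvsadm -R < /etc/sysconfig/ipvsadm       # load the saved rules back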

5. Practical application of LVS load balancing cluster

1. Environment:

Four clean virtual machines: two as web servers, one as the LVS load balancing server, and one as a test client.

Virtual machine      IP address               Role                                      VIP
Virtual machine 1    DIP: 192.168.58.143      lvs-dr, DR-mode load balancing server     192.168.58.140
Virtual machine 2    RIP1: 192.168.58.162     rs-1, real web server 1                   192.168.58.140
Virtual machine 3    RIP2: 192.168.58.155     rs-2, real web server 2                   192.168.58.140
Virtual machine 4    192.168.58.164           test client, no configuration             none

Note:
① All four virtual machines must turn off firewalld and selinux;
② The network adopts NAT mode
③ DR mode requires Director DIP and all RealServer RIPs to be in the same network segment and broadcast domain
④ All node gateways must specify the real gateway

2. Build a web server

Virtual machine 2:
Install and start nginx, and write test content into nginx's default web root

[root@rs-1 ~]# yum -y install nginx && systemctl start nginx
[root@rs-1 ~]# echo lvs-rs1 > /usr/share/nginx/html/index.html
[root@rs-1 ~]# nginx -s reload

Virtual machine 3:
Install and start nginx, and write test content into nginx's default web root

[root@rs-2 ~]# yum -y install nginx && systemctl start nginx
[root@rs-2 ~]# echo lvs-rs2 > /usr/share/nginx/html/index.html
[root@rs-2 ~]# nginx -s reload

Virtual machine 4:
Test two web servers.
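For example, curling each RIP from virtual machine 4 should return the test pages written above (the prompt is illustrative):

[root@localhost ~]# curl 192.168.58.162
lvs-rs1
[root@localhost ~]# curl 192.168.58.155
lvs-rs2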
The two web services are successfully established and can be accessed normally.

3. LVS load balancing configuration

Virtual machine 1
1) Install the LVS management tool ipvsadm

[root@lvs-dr ~]# yum -y install ipvsadm

2) Configure VIP

[root@lvs-dr ~]# ip addr add dev ens33 192.168.58.140/32    # set the VIP; this setting is temporary and disappears after a reboot. The netmask must be /32.
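To make the VIP survive a reboot, it can also be written into the interface configuration. One way (a sketch, assuming the ens33 interface is managed by a NetworkManager connection of the same name):

[root@lvs-dr ~]# nmcli connection modify ens33 +ipv4.addresses 192.168.58.140/32
[root@lvs-dr ~]# nmcli connection up ens33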

3) Start ipvsadm

[root@lvs-dr ~]# systemctl start ipvsadm
Note: if startup reports the error "/bin/bash: /etc/sysconfig/ipvsadm: No such file or directory",
generate the /etc/sysconfig/ipvsadm file manually and then start ipvsadm again.
[root@lvs-dr ~]# ipvsadm --save > /etc/sysconfig/ipvsadm

Define the LVS distribution policy
4) Create a virtual server, specifying IP:port, protocol, and the scheduling algorithm

[root@lvs-dr ~]# ipvsadm -A -t 192.168.58.140:80 -s rr    # rr: round-robin scheduling algorithm

5) Add real server records to the virtual server and specify the working mode

[root@lvs-dr ~]# ipvsadm -a -t 192.168.58.140:80 -r 192.168.58.162 -g
[root@lvs-dr ~]# ipvsadm -a -t 192.168.58.140:80 -r 192.168.58.155 -g
-a: Add a new real host record in the virtual server table
-t: Specify the use of TCP protocol
-r: Specify the real server IP address
-g: Specify the working mode of LVS as direct routing mode, which is the default working mode of LVS

6) Save server rules

[root@lvs-dr ~]# ipvsadm -S > /etc/sysconfig/ipvsadm    # -S is short for --save

This completes the DR-mode director deployment on virtual machine 1.

View the contents of the load balancing configuration

[root@lvs-dr ~]# ipvsadm -ln

Status can also be viewed in the following ways; the fields mean:

[root@lvs-dr ~]# ipvsadm -L -n --stats #Display statistical information
1. Conns (connections scheduled) Number of forwarded connections
2. InPkts (incoming packets) Number of incoming packets
3. OutPkts (outgoing packets) Number of outgoing packets
4. InBytes (incoming bytes) Incoming traffic (bytes)
5. OutBytes (outgoing bytes) Outgoing traffic (bytes)

[root@lvs-dr ~]# ipvsadm -L -n --rate #Display rate information
1. CPS (current connection rate) The number of connections per second
2. InPPS (current in packet rate) The number of packets per second
3. OutPPS (current out packet rate) The number of outgoing packets per second
4. InBPS (current in byte rate) The incoming traffic (bytes) per second
5. OutBPS (current out byte rate) The outgoing traffic (bytes) per second

Virtual machine 2
1) Bind VIP on the lo interface

[root@rs-1 ~]# ip addr add dev lo 192.168.58.140/32
It should be noted that the LVS load balancing server needs to bind the VIP to the ens33 network card, and the two real web servers need to bind the VIP to the lo network card.

2) Ignore ARP broadcasts (ARP silence) so that requests for the VIP reach the Director

[root@rs-1 ~]# echo 1 > /proc/sys/net/ipv4/conf/all/arp_ignore    # temporary change, lost after a reboot. This makes the RS ignore ARP requests for the VIP, ensuring that clients resolving the VIP reach the Director. The default is 0 (do not ignore).

3) Enable routing and forwarding function

[root@rs-1 ~]# echo 1 > /proc/sys/net/ipv4/ip_forward    # temporary change, lost after a reboot

Permanent modification:

[root@rs-1 ~]# vim /etc/sysctl.conf    # add the line: net.ipv4.ip_forward = 1
Then run sysctl -p to make the configuration take effect.

4) Announce the exact local IP address when replying, so return packets use the right source

[root@rs-1 ~]# echo 2 > /proc/sys/net/ipv4/conf/all/arp_announce
This step lets the RS return data directly to the client.
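Both ARP settings can be made permanent together with ip_forward in /etc/sysctl.conf (a sketch of the relevant lines; run sysctl -p afterwards to apply):

net.ipv4.ip_forward = 1
net.ipv4.conf.all.arp_ignore = 1
net.ipv4.conf.all.arp_announce = 2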

5) Start nginx

[root@rs-1 ~]# systemctl start nginx

Virtual machine 3

Same operation as virtual machine 2

4. Verification

Virtual machine 4

Test the LVS load balancing function
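For example, a short curl loop against the VIP should alternate between the two test pages under rr scheduling (output order depends on where the round-robin counter happens to start):

[root@localhost ~]# for i in 1 2 3 4; do curl 192.168.58.140; done
lvs-rs2
lvs-rs1
lvs-rs2
lvs-rs1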
Accessing the VIP reaches the two web servers in turn, showing that the LVS-DR load balancing cluster using the rr scheduling algorithm has been built successfully.

6. Scheduling algorithm of LVS

LVS scheduling algorithms fall into two categories, static and dynamic, with ten algorithms in total. The five most commonly used are described here.

1. Static algorithm

Scheduling is performed purely by the algorithm, without considering the backend servers' actual connection counts or load.
1) RR: Round Robin
The scheduler uses the round-robin algorithm to distribute external requests to the real servers in the cluster in turn, treating every server equally regardless of its actual connection count or system load.
Example:
ipvsadm -A -t 192.168.58.140:80 -s rr
ipvsadm -a -t 192.168.58.140:80 -r 192.168.58.155 -g

2) WRR: Weighted Round Robin
The scheduler distributes access requests according to the real servers' differing processing capacities, so servers with more capacity handle more of the traffic. The scheduler can automatically query the real servers' load and adjust their weights dynamically.
Example:
ipvsadm -A -t 192.168.58.140:80 -s wrr
ipvsadm -a -t 192.168.58.140:80 -r 192.168.58.155 -g -w 2

2. Dynamic algorithm

The front-end scheduler will allocate requests based on the actual connection status of the back-end real server.

1) LC: Least Connections
The scheduler dynamically directs network requests to the server with the fewest established connections. If the cluster's real servers have similar performance, the least-connections algorithm balances load well.
Example:
ipvsadm -A -t 192.168.58.140:80 -s lc
ipvsadm -a -t 192.168.58.140:80 -r 192.168.58.155 -g

2) WLC: Weighted Least Connections (the default)
When server performance within the cluster varies widely, the scheduler uses weighted least connections to optimize load balancing: servers with higher weights bear a larger share of the active connections. The scheduler can automatically query the real servers' load and adjust their weights dynamically.
Example:
ipvsadm -A -t 192.168.58.140:80 -s wlc
ipvsadm -a -t 192.168.58.140:80 -r 192.168.58.155 -g

3) NQ: Never Queue
No queuing: if some realserver has a connection count of 0, the request is assigned to it directly, without the SED calculation, ensuring that no host sits idle.
Example:
ipvsadm -A -t 192.168.58.140:80 -s nq
ipvsadm -a -t 192.168.58.140:80 -r 192.168.58.155 -g

7. LVS health monitoring script

LVS has no health checking by default: when a real server goes down, LVS cannot detect it in time, so user requests may fail. How can we add health checking with a script?

[root@lvs-dr ~]# vim lvs.sh

The script content is as follows:

#!/bin/bash
# Health check for the LVS real servers: remove an RS from the virtual
# service when it fails, and add it back once it recovers.
VIP=192.168.58.140:80

for rs in 192.168.58.162 192.168.58.155; do
        # The RS is healthy only if it answers ping AND its web service answers curl
        if ping -c 1 $rs &> /dev/null && curl -s $rs &> /dev/null; then
                # Healthy: add the RS back if it is missing from the table
                if ! /usr/sbin/ipvsadm -Ln | grep -q $rs; then
                        /usr/sbin/ipvsadm -a -t $VIP -r $rs -g
                fi
        else
                # Unhealthy: remove the RS from the virtual service
                /usr/sbin/ipvsadm -d -t $VIP -r $rs &> /dev/null
        fi
done

Add a scheduled task to run the detection script every minute

[root@lvs-dr ~]# crontab -e
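An entry along these lines runs the check every minute (assuming the script was saved as /root/lvs.sh):

* * * * * /bin/bash /root/lvs.sh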

Source: blog.csdn.net/weixin_44178770/article/details/124502534