Introduction to LVS/Nginx/HAProxy Principles and Application Scenarios

Load balancing has become a core component of network architecture: it eliminates single points of failure on the server side, distributes request traffic, improves redundancy, and keeps services stable.

Among open source load balancing software, the most widely used are LVS, Nginx, and HAProxy; even Alibaba Cloud's SLB is built on LVS and Nginx. This article explains the working principles and application scenarios of all three.

Introduction to LVS

LVS is the abbreviation of Linux Virtual Server, a virtual server cluster system. LVS works at the lower layers (Layer 2/3/4) and only distributes packets, so its CPU and memory consumption is extremely low and its load capacity is very high. It can therefore load-balance almost any application, including HTTP, databases, and online chat services, and it offers 3 working modes and 10 scheduling algorithms, giving flexible strategy options on the load balancing side.
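To make the scheduling idea concrete, here is a minimal Python sketch (not actual LVS code) of two classic algorithms among those ten, round-robin and weighted round-robin. The server names and weights are illustrative only:

```python
from itertools import cycle

# Hypothetical real-server pool; names and weights are illustrative only.
servers = {"rs1": 1, "rs2": 2, "rs3": 1}

def round_robin(pool):
    """Plain round-robin: each server receives requests in turn."""
    return cycle(sorted(pool))

def weighted_round_robin(pool):
    """Naive weighted round-robin: a server with weight w appears w times per cycle."""
    expanded = [name for name in sorted(pool) for _ in range(pool[name])]
    return cycle(expanded)

rr = round_robin(servers)
print([next(rr) for _ in range(4)])   # ['rs1', 'rs2', 'rs3', 'rs1']

wrr = weighted_round_robin(servers)
print([next(wrr) for _ in range(4)])  # ['rs1', 'rs2', 'rs2', 'rs3']
```

Real IPVS schedulers (rr, wrr, lc, wlc, and so on) run in the kernel, but the selection logic follows the same pattern.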

LVS is mainly implemented by two components, IPVS and ipvsadm:

IPVS: the core of the LVS cluster system. It is a kernel module implemented on top of the Linux Netfilter framework and works mainly on the INPUT chain in kernel space; its hook functions are registered at the LOCAL_IN and FORWARD hook points.

IPVS modifies and forwards packets directly in kernel space, whereas Nginx/HAProxy work in user space; this is a key reason LVS is more powerful.

ipvsadm: works in user space and is mainly used to define and manage cluster services. Installing and configuring LVS therefore mostly means installing and configuring ipvsadm.
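As a sketch, defining a cluster service with ipvsadm might look like the following (all IP addresses are placeholders; the commands require root and the ip_vs kernel module):

```shell
# Add a virtual service on VIP 192.168.0.100:80, using round-robin scheduling
ipvsadm -A -t 192.168.0.100:80 -s rr

# Add two real servers to the service in DR mode (-g)
ipvsadm -a -t 192.168.0.100:80 -r 192.168.0.11:80 -g
ipvsadm -a -t 192.168.0.100:80 -r 192.168.0.12:80 -g

# List the current IPVS rule table
ipvsadm -Ln
```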

LVS Principle Architecture

① The access request first reaches the kernel space of the load scheduler via the VIP.

② After the PREROUTING chain receives the user request, it checks the destination IP; since it is a local IP, the packet is passed to the INPUT chain.

③ When the request reaches INPUT, IPVS compares it against the rules defined with ipvsadm. If the request matches a defined cluster service, IPVS forcibly rewrites the packet and sends the new packet to the POSTROUTING chain.

④ After the POSTROUTING chain receives the packet, it finds that the destination IP address is now a back-end server, and finally sends the packet on to that server.

Three working modes of LVS

The following abbreviations are used below: CIP - client IP address; VIP - the virtual IP the load balancer publishes for user requests; DIP - the load balancer IP used to communicate with back-end servers; RIP - the back-end (real) server's IP address.

Principle of LVS Layer 2 DR Mode


LVS DR mode packet flow:

① The source address of the client's request packet is CIP and the destination address is VIP.

② The load balancer changes the source MAC address of the request packet to the MAC address of its DIP interface and the destination MAC to the MAC address of the RIP, then sends the packet to the back-end server. This requires that all back-end servers and the load balancer be in the same VLAN (local network segment); DR mode cannot cross VLANs.

③ The back-end server sees that the destination MAC of the packet is its own and accepts it. Because only the MAC addresses were modified and the destination IP is still the VIP, the back-end server must bind the VIP to its lo interface. After processing the request, it sends the response out through the eth0 NIC directly to the client.
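In practice, the VIP binding on each real server is usually paired with ARP suppression, so that only the director answers ARP queries for the VIP. A sketch of that per-server setup (the VIP 192.168.0.100 is a placeholder; requires root):

```shell
# Bind the VIP to the loopback interface with a /32 mask
ip addr add 192.168.0.100/32 dev lo

# Suppress ARP replies/announcements for the VIP so only the director answers ARP
echo 1 > /proc/sys/net/ipv4/conf/lo/arp_ignore
echo 2 > /proc/sys/net/ipv4/conf/lo/arp_announce
echo 1 > /proc/sys/net/ipv4/conf/all/arp_ignore
echo 2 > /proc/sys/net/ipv4/conf/all/arp_announce
```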

Principle of LVS Layer 3 IP Tunnel Mode

LVS IP Tunnel mode packet flow:

① The source address of the client's request packet is CIP and the destination address is VIP.

② The load balancer encapsulates the client's request packet in an additional outer IP header, with source address DIP and destination address RIP, and sends it to the back-end server. Unlike the Layer 2 DR mode, this mode can cross VLANs. Devices along the path see only the outer tunnel addresses (DIP to RIP); the original packet, including the client's source IP, is carried inside the tunnel.

③ After receiving the packet, the back-end server first strips the outer encapsulation and finds an inner IP header whose destination address is the VIP bound on its own lo interface. It then processes the request and sends the response out through the eth0 NIC directly to the client.
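A sketch of the corresponding director and real-server setup for tunnel mode (all addresses are placeholders; requires root and the ipip module):

```shell
# On the director: tunnel mode adds real servers with -i instead of -g
ipvsadm -A -t 192.168.0.100:80 -s rr
ipvsadm -a -t 192.168.0.100:80 -r 10.0.1.11:80 -i

# On each real server: bring up the IPIP tunnel interface and bind the VIP to it
modprobe ipip
ip addr add 192.168.0.100/32 dev tunl0
ip link set tunl0 up
echo 1 > /proc/sys/net/ipv4/conf/tunl0/arp_ignore
echo 2 > /proc/sys/net/ipv4/conf/tunl0/arp_announce
```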

Principle of LVS Layer 4 NAT Mode

LVS NAT mode packet flow:

① The source address of the client's request packet is CIP and the destination address is VIP.

② The load balancer rewrites the destination address of the request packet to the RIP and sends it to the back-end server. As in DR mode, all back-end servers and the load balancer must be in the same VLAN (local network segment); NAT mode cannot cross VLANs.

③ The back-end server processes the request and returns the response packet to the load balancer. The default gateway of every back-end server must be the intranet address of the load balancer, so that response packets flow back through LVS, which then performs SNAT on them.

④ The load balancer then rewrites the source address of the response packet back to the VIP and sends it to the client.
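A sketch of a NAT-mode setup (addresses are placeholders; `-m` selects masquerading/NAT; requires root):

```shell
# On the director: enable IP forwarding
echo 1 > /proc/sys/net/ipv4/ip_forward

# NAT mode adds real servers with -m (masquerading)
ipvsadm -A -t 192.168.0.100:80 -s wlc
ipvsadm -a -t 192.168.0.100:80 -r 10.0.1.11:80 -m
ipvsadm -a -t 192.168.0.100:80 -r 10.0.1.12:80 -m

# On each real server: the default gateway must be the director's DIP (here 10.0.1.1)
ip route replace default via 10.0.1.1
```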

LVS's DR and NAT modes perform only "one connection" when handling packets; that is, the load balancer does nothing more than forward packets.

The essential reason LVS can achieve this "one connection" is that it works in kernel space. All three LVS modes process packets entirely in kernel space, which is the fundamental reason for LVS's light weight, efficiency, and high performance.

LVS application scenarios


LVS is suitable for large-scale applications but is a poor fit for small and medium-sized ones, especially small and medium websites. This is because the websites we typically deploy need virtual hosts, dynamic/static separation, and regex-based distribution, all of which Nginx provides directly.


Cloud ECS instances do not support deploying LVS, so for Layer 2/3/4 load balancing requirements you can only use the Layer 4 capability of the cloud SLB product, or deploy Nginx/HAProxy yourself.


LVS does not support Layer 7 features such as virtual hosts, regex-based rewrite, or dynamic/static separation. Many web sites today have strong demands in exactly these areas, which is where Nginx/HAProxy shine.


Introduction to Nginx/HAProxy

Nginx is a lightweight web server, reverse proxy, and email (IMAP/POP3) proxy server. It features low memory usage, strong concurrency handling, and a rich set of plug-in modules, and it is currently the first-choice software for web applications in the cloud.

HAProxy is free, open-source software written in C. It is a load balancer that works mainly at Layer 7 (HTTP) and Layer 4 (TCP). Like LVS, it is first and foremost a professional-grade load balancer.

Nginx/HAProxy Layer 4 Mode Principle

Nginx/HAProxy Layer 4 packet flow:

① The source address of the client's request packet is CIP, and the destination address is the load balancer's IP (DIP) plus port.

② The load balancer initiates a new TCP three-way handshake with the back-end server and establishes a new TCP connection; the source address of these packets is DIP and the destination address is RIP.

③ After the packet reaches the back-end server, the server processes the request and returns the response packet to the load balancer.

④ The load balancer then repackages the response content and returns it to the client.

Nginx/HAProxy Layer 4 three-way handshake packet flow

After the client completes the TCP three-way handshake with the load balancer, the load balancer immediately initiates a new TCP connection to the back-end server; this is the "secondary connection".

Nginx/HAProxy Layer 7 Mode Principle

Nginx/HAProxy Layer 7 packet flow:

① The source address of the client's request packet is CIP, and the destination address is the load balancer's IP (DIP) plus port plus URL.

② The load balancer initiates a new TCP three-way handshake with the back-end server and establishes a new TCP connection; the source address is DIP, the destination address is RIP, and the request also carries the URL the client asked for.

③ After the packet reaches the back-end server, the server processes the request and returns the response packet to the load balancer.

④ The load balancer then repackages the response content and returns it to the client.

For the Layer 7 "secondary connection", after the client completes the TCP three-way handshake with the load balancer, the load balancer must still wait for the client to push data (the actual request) before it establishes a new TCP three-way handshake with the back-end server.

Nginx application scenarios

Nginx performs Layer 7 HTTP load balancing and can apply distribution strategies based on HTTP attributes such as domain name, request URL, directory structure, and browser type. Its regex rules are more powerful and flexible than HAProxy's.


Compared with HAProxy, Nginx's Layer 7 session persistence is weaker: by default it supports only ip_hash, although this can be enhanced with the nginx-sticky-module. Its back-end health checks can only probe ports, not URLs.
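For reference, ip_hash persistence is enabled per upstream block. A minimal sketch (all addresses are placeholders):

```nginx
upstream backend {
    ip_hash;                      # same client IP -> same back-end server
    server 10.0.1.11:8080;
    server 10.0.1.12:8080;
}

server {
    listen 80;
    location / {
        proxy_pass http://backend;
    }
}
```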


Starting with version 1.9.0, Nginx supports Layer 4 TCP load balancing via its stream module.
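A minimal sketch of such a Layer 4 configuration using the stream module, here fronting a pair of MySQL servers (addresses and ports are placeholders):

```nginx
stream {
    upstream mysql_pool {
        server 10.0.1.21:3306;
        server 10.0.1.22:3306;
    }
    server {
        listen 3306;
        proxy_pass mysql_pool;
    }
}
```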


Nginx is also a powerful web application server. LNMP (Linux + Nginx + MySQL + PHP) has been a very popular web architecture in recent years, and it is very stable under high traffic.


Nginx has matured as a reverse-proxy static cache accelerator, and it also serves well as a static page and image server. It is faster than the traditional Squid server, and CDNs commonly choose it as their underlying static cache server.


HAProxy application scenarios

HAProxy is a software load balancer focused on Layer 7/Layer 4, but compared with Nginx it lacks a web server, static caching, a rich third-party plug-in ecosystem, and similar functionality.


HAProxy supports session persistence through several strategies, including source-address hashing and cookie identification. For health checks, it can probe a back-end server's status by requesting a specified URL.
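Both features map to a few lines of haproxy.cfg. A minimal sketch (back-end addresses and the health-check URL are placeholders):

```
backend web_servers
    balance roundrobin
    # Session persistence via an inserted cookie
    cookie SRV insert indirect nocache
    # Health check by requesting a specific URL
    option httpchk GET /health
    server web1 10.0.1.11:80 check cookie web1
    server web2 10.0.1.12:80 check cookie web2
```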


In cloud practice HAProxy is rarely used, because Layer 7/Layer 4 load balancing can be handled by the cloud SLB product or Nginx.


Cloud load balancing options


Alibaba Cloud SLB (Server Load Balancer) currently provides Layer 4 (TCP and UDP) and Layer 7 (HTTP and HTTPS) load balancing services.

Layer 4 is implemented with the open-source LVS (Linux Virtual Server) plus Keepalived, customized for cloud computing requirements.

Layer 7 is implemented with Tengine, a web server project initiated by Taobao. Based on Nginx, it adds many advanced features to serve websites with very high traffic.

In cloud practice, prefer the SLB provided by the cloud vendor, or the soft load balancing products that mainstream hardware load balancing vendors offer in the cloud (better performance, richer features).

In practice, if you need Layer 7 rewrite rules or more Layer 4 scheduling algorithms than SLB currently provides, you may need to build your own Layer 4/Layer 7 load balancing with Nginx/HAProxy on ECS.

In cloud practice, however, SLB's Layer 4/Layer 7 capabilities cover roughly 80% of everyday needs, so selecting a cloud load balancer basically means selecting SLB:

For TCP services, only Layer 4 load balancing is an option.


For HTTP-type web services, either Layer 4 or Layer 7 can be used. In roughly 80% of enterprise HTTP load balancing applications, only simple forwarding is needed and there is no virtual-host routing requirement. Layer 4 is preferred here because it performs better, and placing a Layer 4 load balancer at the application entrance for distribution is a standard, mature enterprise architecture.


If the front-end SLB must terminate SSL certificates (HTTPS), only Layer 7 load balancing can be used. For high-traffic, high-concurrency scenarios, it is recommended instead to use Layer 4 load balancing at the front end and configure the certificate on Nginx in the back-end ECS to preserve performance.

Origin blog.csdn.net/chuixue24/article/details/130754554