Linux operation and maintenance engineer interview questions (3)

I wish everyone the best of luck in landing the job you want.
Keep learning and you won't be left behind.
As long as the earth doesn't explode, we don't take a day off.
Opportunities are always reserved for those who are prepared.
Keep going, fellow workers!

1 How many working modes does LVS have, and what are they?

Three (each corresponds to a different packet-forwarding method, sketched after this list):

  • NAT mode: rewrites the destination IP of the request packet; essentially DNAT to multiple target IPs
  • DR mode (the default): rewrites the destination MAC address of the request frame
  • TUN mode: encapsulates the original request packet inside a new IP header (IP tunneling)
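A minimal sketch of how the three modes map to ipvsadm's packet-forwarding flags (the addresses are invented for illustration; ipvsadm must be installed and the VIP configured separately):

# Hypothetical setup: VIP 10.0.0.100, one real server 192.168.1.11
ipvsadm -A -t 10.0.0.100:80 -s rr                    # define the virtual service
# then add the real server with the flag matching the chosen mode (pick one):
ipvsadm -a -t 10.0.0.100:80 -r 192.168.1.11:80 -m    # NAT mode (masquerading)
ipvsadm -a -t 10.0.0.100:80 -r 192.168.1.11:80 -g    # DR mode (gatewaying, the default)
ipvsadm -a -t 10.0.0.100:80 -r 192.168.1.11:80 -i    # TUN mode (IP-IP tunneling)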

2 What parts does LVS consist of?

LVS consists of two parts: ipvs and ipvsadm.

  1. ipvs (IP Virtual Server): the part that runs in kernel space; it is the code that actually performs the scheduling;

  2. ipvsadm: the part that runs in user space; it is responsible for writing rules for the ipvs kernel framework, defining which address is the cluster service and which hosts are the back-end real servers (Real Server). (Example commands follow.)
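A small sketch of this user-space/kernel-space split in practice (package names and paths vary by distribution; the rules shown are whatever is currently loaded):

yum install -y ipvsadm                  # user-space tool (apt install ipvsadm on Debian/Ubuntu)
lsmod | grep ip_vs                      # ip_vs is the in-kernel scheduling code
ipvsadm -Ln                             # list the rules currently loaded in the kernel
ipvsadm -Ln --stats                     # per-service / per-real-server counters
ipvsadm-save > /etc/sysconfig/ipvsadm   # persist the rules (path differs by distro)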

3 What are the common LVS-related terms?

  • DS: Director Server, the front-end load-balancer node.
  • RS: Real Server, a real working server at the back end.
  • VIP: Virtual IP, the externally facing address that client requests are sent to.
  • DIP: Director Server IP, the address the Director uses to communicate with the internal hosts.
  • RIP: Real Server IP, the address of a back-end server.
  • CIP: Client IP, the address of the requesting client.
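To make the terms concrete, a hypothetical NAT-mode request/response flow (all addresses invented for illustration):

# CIP = 203.0.113.5   VIP = 10.0.0.100   DIP = 192.168.1.1   RIP = 192.168.1.11
# request:  CIP -> VIP     (client sends its request to the Director's VIP)
#           CIP -> RIP     (Director rewrites the destination to a chosen Real Server)
# response: RIP -> CIP     (Real Server replies; its gateway is the DIP, so the reply
#                           goes back through the Director, which rewrites the source
#                           address back to the VIP before returning it to the client)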

4 What are the load scheduling algorithms of the LVS cluster?

  • Round-Robin Scheduling (rr)
  • Weighted Round-Robin Scheduling (wrr)
  • Least-Connection Scheduling (lc)
  • Weighted Least-Connection Scheduling (wlc), the default scheduling algorithm
  • Locality-Based Least-Connection Scheduling (lblc)
  • Locality-Based Least-Connection with Replication Scheduling (lblcr)
  • Destination Hashing Scheduling (dh)
  • Source Hashing Scheduling (sh)
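The scheduler is chosen per virtual service with ipvsadm's -s option; a sketch with invented addresses:

ipvsadm -A -t 10.0.0.100:80 -s wlc                        # weighted least-connection (the default)
ipvsadm -a -t 10.0.0.100:80 -r 192.168.1.11:80 -g -w 3    # real server with weight 3
ipvsadm -a -t 10.0.0.100:80 -r 192.168.1.12:80 -g -w 1    # real server with weight 1
ipvsadm -E -t 10.0.0.100:80 -s rr                         # -E edits the service, switching to plain rr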

5 Can iptables be disabled or deleted when using LVS?

Yes. Disabling iptables does not affect LVS. LVS is a load-balancing technology implemented at the Linux kernel level, and it does not rely on iptables for traffic forwarding: it forwards client traffic to the back-end servers using techniques such as NAT, direct routing, or IP tunneling, without depending on iptables rules.
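For example, on a host where LVS rules are already in place, the firewall can simply be stopped and the ipvs rules remain active (assuming a systemd distribution that uses firewalld; adapt to your environment):

systemctl stop firewalld && systemctl disable firewalld   # or: iptables -F to flush the rules
ipvsadm -Ln                                               # the ipvs rules are still present and in effect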

6 What are the haproxy scheduling algorithms?

In the list below, tcp indicates a Layer 4 (TCP) load-balancing mode and http indicates a Layer 7 (HTTP) mode.

Static algorithm:

  • static-rr-------->tcp/http: weight-based round-robin scheduling. It does not support dynamic weight adjustment via socat at runtime (only weights 0 and 1 are accepted; other values are not supported) and does not support slow start of back-end servers. There is no limit on the number of back-end hosts. It is equivalent to wrr in LVS.
  • first------------->tcp/http: servers are used from top to bottom according to their position in the list; new requests are sent to the next server only when the first server's connection count reaches its limit, so server weights are ignored. This method is rarely used. Dynamic weight modification via socat is not supported: 0 and 1 can be set, and other values can be set but have no effect.
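A minimal haproxy.cfg sketch of the two static algorithms (section names, addresses and limits are invented):

listen static_rr_web
    bind *:80
    mode http
    balance static-rr                      # weight-based round robin; weights fixed at runtime
    server web1 192.168.1.11:80 weight 2 check
    server web2 192.168.1.12:80 weight 1 check

listen first_web
    bind *:8080
    mode http
    balance first                          # fill web1 up to maxconn before using web2
    server web1 192.168.1.11:80 maxconn 100 check
    server web2 192.168.1.12:80 maxconn 100 check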

Dynamic algorithm:

  • roundrobin------->tcp/http: weight-based dynamic round-robin scheduling; supports adjusting weights at runtime. Unlike rr in LVS, roundrobin in haproxy supports slow start (a newly added server gradually ramps up the share of requests it receives). Each backend supports up to 4095 real servers. roundrobin is the default scheduling algorithm and is widely used.
  • leastconn--------->tcp/http: weighted least-connection dynamic scheduling; supports runtime weight adjustment and slow start. New client connections are scheduled preferentially to the back-end server with the fewest current connections rather than purely by weight, which suits long-lived connection scenarios such as MySQL.
  • random------------>tcp/http: added in version 1.9; uses a random number as the key for a consistent hash. Random load balancing is useful for large server farms or when servers are frequently added or removed. It supports dynamic weight adjustment, and hosts with a larger weight have a higher probability of receiving new requests.
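A sketch of the dynamic algorithms plus runtime weight adjustment over the stats socket (assumes a "stats socket ... level admin" line in the global section and that socat is installed; all names and addresses are invented):

# in haproxy.cfg
global
    stats socket /var/lib/haproxy/haproxy.sock mode 600 level admin

listen dynamic_web
    bind *:80
    mode http
    balance roundrobin          # or: balance leastconn  (long-lived connections, e.g. MySQL)
    server web1 192.168.1.11:80 weight 1 check
    server web2 192.168.1.12:80 weight 1 check

# at runtime, adjust or inspect a server's weight through the socket
echo "set weight dynamic_web/web1 3" | socat stdio /var/lib/haproxy/haproxy.sock
echo "get weight dynamic_web/web1"   | socat stdio /var/lib/haproxy/haproxy.sock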

**Other algorithms:** whether the following behave as static or dynamic depends on whether hash-type is set to consistent.

  • source---------->tcp/http: source-address hashing; requests are forwarded to a back-end server based on a hash of the client's source address, so subsequent requests from the same source address go to the same back-end web server. With this method, when the set of back-end servers changes, many user requests will be forwarded to a different back-end server. It is static by default, but this can be changed through the options supported by hash-type.
    This algorithm is generally used in TCP mode where cookies cannot be inserted, and it can also provide session stickiness for clients that refuse session cookies. It suits scenarios where sessions must be kept but cookies and caching are not available.
    There are two ways of mapping a client's source address to a back-end server: the modulo method and consistent hashing.
  • uri--------------->http: hashes the left half of the requested URI (or the entire URI), takes the result modulo the total server weight, and forwards the request to the corresponding back-end server. It is suitable when the back ends are cache servers. Static by default; map-based or consistent can be specified via hash-type to choose between the modulo method and consistent hashing.
  • url_param---->http: hashes the value of a given parameter key in the params part of the requested URL, divides by the total server weight, and dispatches the request to the selected server. It is usually used to track users, ensuring that requests from the same user always go to the same real server. If the key is absent, the roundrobin algorithm is used instead.
  • hdr-------------->http: hashes the specified header in each client HTTP request; the header named in the configuration is extracted and hashed, and the result is taken modulo the total server weight to pick a server. If the header has no valid value, the default round-robin scheduling is used.
  • rdp-cookie---->tcp: load balances Windows Remote Desktop, using the RDP cookie to keep the session. Static by default; map-based or consistent can be specified via hash-type to choose between the modulo method and consistent hashing.
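A sketch of the hash-based algorithms and the hash-type switch (section names and addresses are invented; the commented-out lines show the alternatives):

listen src_hash_web
    bind *:80
    mode tcp
    balance source                 # stick clients by source IP
    hash-type consistent           # consistent hashing -> behaves as a dynamic algorithm
    server web1 192.168.1.11:80 check
    server web2 192.168.1.12:80 check

listen uri_hash_cache
    bind *:8080
    mode http
    balance uri                    # the same URI always hits the same cache node
    # balance url_param userid     # alternative: hash on the ?userid= query parameter
    # balance hdr(User-Agent)      # alternative: hash on a request header
    hash-type consistent
    server cache1 192.168.1.21:80 check
    server cache2 192.168.1.22:80 check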

Algorithm usage scenarios

first		# rarely used

static-rr	# web clusters that already share sessions
roundrobin
random

leastconn	# databases
source		# session persistence based on the client's public IP

uri--------->http	# cache servers / CDN providers (e.g. ChinaCache, Baidu, Alibaba Cloud, Tencent)
url_param--->http	# can implement session persistence

hdr			# further processing based on the client request headers
rdp-cookie	# Windows hosts, rarely used

7 What distribution strategies does nginx support for load balancing?

  • Round-robin (default): requests are assigned to the back-end servers one by one in order; if a back-end server goes down, it is automatically removed from rotation.
  • weight: the larger the weight value, the higher the probability of a server being chosen. It is mainly used when back-end servers have uneven performance, or to give different weights to primary and secondary hosts so that resources are used effectively.
  • ip_hash (IP binding): each request is assigned according to a hash of the client IP, so visitors from the same IP always reach the same back-end server; this effectively addresses the session-sharing problem of dynamic sites.
  • url_hash (third-party module): requires Nginx's hash module; requests are assigned according to a hash of the requested URL, so each URL is directed to the same back-end server, which further improves the hit rate of back-end cache servers.
  • fair (third-party module): requires the upstream_fair module. Smarter than weight and ip_hash, the fair algorithm balances load according to page size and loading time, giving priority to servers with short response times.
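A minimal nginx sketch of the built-in strategies (the addresses and pool name are placeholders; url_hash and fair need third-party modules and are not shown):

upstream web_pool {
    # default: weighted round robin; drop "weight=" for plain round robin
    server 192.168.1.11:80 weight=3 max_fails=2 fail_timeout=10s;
    server 192.168.1.12:80 weight=1;
    # ip_hash;            # uncomment to pin each client IP to one back end
}

server {
    listen 80;
    location / {
        proxy_pass http://web_pool;
    }
}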

8 The difference between four-layer load and seven-layer load

  • Layer 4: forwarding based on IP + port
  • Layer 7: forwarding based on protocol + content

Layer 4 load balancing:

A Layer 4 load balancer takes the destination address of the client's packet (originally the IP address of the load-balancing device itself) and, according to the configured server-selection rules, replaces it with the IP of the chosen web server, so that the client effectively establishes the TCP connection and exchanges data with that server directly; the Layer 4 load balancer itself does not take part in establishing the connection. Unlike LVS, HAProxy is a pseudo-Layer-4 load balancer, because HAProxy has to establish separate connections with the front-end client and the back-end server.

Layer 7 load balancing:

A Layer 7 load balancer acts as a reverse proxy. Establishing a TCP connection requires a three-way handshake, so the client first completes a three-way handshake with the Layer 7 load balancer and sends its request to it; the load balancer then selects a specific Web Server according to the configured balancing rules, completes a three-way handshake with that Web Server, and forwards the request. The Web Server returns the response to the load balancer, which sends it back to the client. The Layer 7 load balancer therefore acts as a proxy server and must establish separate connections with both the client and the back-end server.

To put it simply: Layer 4 rewrites the destination of the user's request and forwards it directly to the server, while Layer 7 terminates the user's connection and re-sends the request to the server on the user's behalf. The response likewise goes to the load balancer first, which rewrites it before sending it back to the user. As a result, the client IP we see in the back-end logs is the load balancer's IP, which is why X-Forwarded-For handling is needed.
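For the X-Forwarded-For point, a typical nginx (Layer 7) proxy sketch that passes the real client address on to the back end (web_pool is the hypothetical upstream from the earlier example; the header names are the standard ones):

location / {
    proxy_pass http://web_pool;
    proxy_set_header Host              $host;
    proxy_set_header X-Real-IP         $remote_addr;
    proxy_set_header X-Forwarded-For   $proxy_add_x_forwarded_for;
}

The back-end application can then log X-Forwarded-For or X-Real-IP instead of the balancer's address.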

9 What are the functions of load balancing?

  1. Forwarding: distributes client requests to different application servers according to an algorithm (weight, round-robin), reducing the pressure on any single server and increasing the system's concurrency.
  2. Fault removal: uses heartbeat/health checks to determine whether each application server is currently working; if a server stays down for a period of time, requests are automatically sent to the other application servers.
  3. Restoration and re-addition: if a failed application server is detected to be back in service, it is automatically added back into the pool that handles user requests.
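These three functions map directly onto a proxy's health-check parameters, e.g. in HAProxy (a sketch; names, addresses and thresholds are invented):

listen app_pool
    bind *:80
    mode http
    balance roundrobin
    # check every 3s; mark down after 3 failed checks (fault removal),
    # mark up again after 2 successful checks (restoration and re-addition)
    server app1 192.168.1.11:8080 check inter 3000 fall 3 rise 2
    server app2 192.168.1.12:8080 check inter 3000 fall 3 rise 2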

10 Advantages, disadvantages and differences of LVS, HAProxy, and Nginx load balancing

Advantages of LVS:

  1. Strong load capacity: it works at Layer 4 purely for distribution and generates no application traffic of its own, which is also what makes it the best-performing load-balancing software; carrying no traffic also means the balancer's I/O performance is not affected by heavy traffic;
  2. Stable in operation, with complete active/standby (hot-backup) solutions such as LVS+Keepalived and LVS+Heartbeat;
  3. Wide range of application: it can load balance virtually any application;
  4. Low configurability, which is both a disadvantage and an advantage: because there is little to configure, it needs little hands-on attention, which greatly reduces the chance of human error;

Disadvantages of LVS:

  1. The software itself does not support regular-expression processing and cannot separate dynamic from static content, which is exactly where Nginx/HAProxy+Keepalived have the advantage.
  2. If the site or application is large, LVS/DR+Keepalived becomes complicated, especially with Windows Server machines behind it: implementation, configuration and maintenance are all more troublesome. Nginx/HAProxy+Keepalived is much simpler by comparison.

Advantages of Nginx:

  1. It works at OSI Layer 7 and can apply traffic-splitting policies to HTTP applications, for example by domain name or directory structure. Its regular-expression support is more powerful and flexible than HAProxy's;
  2. Nginx depends very little on the network; in theory, it can perform load balancing as long as it can ping the back ends, which is also an advantage;
  3. Nginx is relatively simple to install and configure, and convenient to test;
  4. It withstands high load and stays stable, generally supporting tens of thousands of concurrent connections;
  5. Nginx can detect internal server failures through the port, such as the status code returned for a processed page or a timeout, and will resubmit the failed request to another node;
  6. Nginx is not only an excellent load balancer / reverse proxy, but also a powerful web application server. LNMP is a very popular web stack and can compete with LAMP. Nginx has the advantage over Apache in serving static pages, especially under high concurrency;
  7. Nginx is increasingly mature as a reverse-proxy web cache, and it is faster than the traditional Squid server; those who need this can consider it as a reverse-proxy accelerator;

Disadvantages of Nginx:

  1. Nginx does not support URL-based health checks.
  2. Nginx can only proxy HTTP and Email, which is its weak point.
  3. Nginx's session persistence and cookie-guided routing capabilities are relatively weak.

Advantages of HAProxy:

  1. HAProxy supports virtual hosts and can work at both Layer 4 and Layer 7 (and supports multiple network segments);
  2. It makes up for some of Nginx's shortcomings, such as session persistence and cookie-guided routing;
  3. It supports URL-based health checks of the back-end servers;
  4. Like LVS, it is purely a piece of load-balancing software; purely in terms of efficiency, HAProxy balances load faster than Nginx and also handles concurrency better;
  5. HAProxy can load balance MySQL reads and health-check the back-end MySQL nodes, but its performance is not as good as LVS when there are more than about 10 back-end MySQL slaves;
  6. HAProxy offers many algorithms, as many as eight kinds;

LVS: forwards at Layer 4 only.
HAProxy: forwards at both Layer 4 and Layer 7; it is a dedicated proxy server.
Nginx: a web server, cache server and reverse proxy server that can forward at Layer 7.

Difference: because LVS forwards at Layer 4, it can only do port-based forwarding; URL-based and directory-based forwarding are things LVS cannot do.

Choosing in practice: HAProxy and Nginx can both forward at Layer 7, so URL-based and directory-based forwarding are both possible. When concurrency is very high, choose LVS. For small and medium-sized companies where concurrency is not that high, HAProxy or Nginx is enough; since HAProxy is a dedicated proxy server with a simple configuration, HAProxy is recommended for small and medium-sized enterprises.

The interview questions above are just my personal summary, written down as they came to mind and in no particular order. If anything is wrong, please leave a comment and I will correct it promptly.



Origin: blog.csdn.net/qq_45520116/article/details/129220109