Latest Linux Operations and Maintenance Interview Questions for 2023 (Part 3)

  • About the author: A cloud computing and network operations engineer who shares networking and operations know-how and practical tips every day.

  • Public account: Netdou Cloud Computing School

  •  Motto: Keep your head down and be respectful

  • Personal homepage:  Internet Bean’s homepage

Table of contents

Preface

16. What is keepalived?

17. How do you understand the VRRP protocol?

18. How does keepalived work?

19. Causes of split brain

20. How to solve the keepalived split-brain problem?


Preface

    Hello everyone, I am Wangdou, a blogger focusing on operations and maintenance. Today I bring you a special topic: operations and maintenance interview questions. As the IT industry continues to develop, interviews for operations positions no longer stop at basic knowledge; they pay more attention to candidates' practical experience, problem-solving ability, and attitude toward continuous learning. This article therefore shares some common operations interview questions to help you prepare for interviews and become more competitive.

With the spread of cloud computing, big data, and related technologies, operations positions are becoming more and more important in the IT field. An excellent operations engineer needs not only a solid technical foundation but also good problem-solving skills, team spirit, and the ability to learn. The interview is therefore a key step in selecting excellent operations engineers.

During the interview, the interviewer usually examines basic knowledge, practical experience, teamwork, and learning ability. Below, I go through interview questions in these areas one by one and give ideas and techniques for answering them. I hope this article helps you prepare for the operations interview and land the position you want.

Please note that these are only some of the common interview questions; an actual interview may cover other areas. When preparing, in addition to mastering these questions, you should also work on improving your technical skills and overall ability.
 


 16. What is keepalived?

Broadly speaking, keepalived provides high availability; narrowly speaking, it handles host redundancy and management.

Keepalived was originally designed for LVS, specifically to monitor the status of each service node in the cluster. It checks each node using the layer 3, 4, and 5 switching mechanisms of the TCP/IP reference model (the network, transport, and application layers). If a server node becomes abnormal or fails, Keepalived detects this and removes the failed node from the cluster. All of this happens automatically, without manual intervention; the only manual work is repairing the failed node.

Later, Keepalived added VRRP support. VRRP (Virtual Router Redundancy Protocol) was created to solve the single point of failure inherent in static routing, and it allows the network to keep running stably without interruption. Keepalived therefore provides server health detection and fault isolation on the one hand, and HA cluster functionality on the other.

Therefore, the core functions of keepalived are health checking and failover.
Health checking uses TCP three-way handshakes, ICMP requests, HTTP requests, UDP echo requests, and similar probes to verify that the real servers behind the load balancer (usually the servers that carry the actual business traffic) are alive.

Failover mainly applies to load balancers deployed in active/standby mode. VRRP maintains the heartbeat between the active and standby load balancers. When the active load balancer has a problem, the standby takes over its services, which minimizes traffic loss and keeps the service stable.
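
To make the health check idea concrete, here is a minimal Python sketch of a TCP check. It is not how keepalived is implemented internally; the function names `tcp_check` and `healthy_servers` are made up for illustration. It only shows the principle: try to complete a TCP handshake with each real server and drop the ones that do not answer.

```python
import socket


def tcp_check(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP three-way handshake with host:port completes."""
    try:
        # create_connection performs the full handshake; we close right away.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable, or timed out: treat the node as failed.
        return False


def healthy_servers(servers, check=tcp_check):
    """Keep only the (host, port) pairs that pass the health check."""
    return [(host, port) for host, port in servers if check(host, port)]
```

A load balancer would call something like `healthy_servers(...)` periodically and forward traffic only to the servers it returns.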


 17. How do you understand the VRRP protocol?

Why use VRRP?

Hosts communicate with other networks through static routes or a default gateway. If the router between the hosts fails, communication fails, so in this setup the router is a single point of failure. The VRRP protocol was introduced to solve this problem.


VRRP is a fault-tolerant master/backup protocol: when a host's next-hop router fails, another router takes over its work. The switchover is transparent to the hosts, so a device failure does not affect data communication between them.

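 Three states of VRRP:
A VRRP router is in one of three states while running:
1. Initialize state: the router enters this state after the system starts. In this state it does not process VRRP messages.
2. Master state: the router forwards traffic for the virtual IP address and periodically sends VRRP advertisements.
3. Backup state: the router listens for the master's advertisements and takes over if they stop arriving.
Generally, the primary router is in the Master state and the backup router is in the Backup state.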
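
The transitions between the three states can be sketched as a small state machine. This is a simplified model based on the behaviour described in the VRRP specification (RFC 3768), with timers and packet handling left out; the class and method names are my own and do not come from any library.

```python
from enum import Enum


class State(Enum):
    INITIALIZE = "Initialize"
    MASTER = "Master"
    BACKUP = "Backup"


class VrrpRouter:
    """Toy model of the three VRRP states; timers and packets are omitted."""

    def __init__(self, priority: int):
        self.priority = priority
        self.state = State.INITIALIZE

    def startup(self) -> None:
        # Priority 255 means the router owns the virtual IP address,
        # so it becomes Master at once; every other router starts as Backup.
        self.state = State.MASTER if self.priority == 255 else State.BACKUP

    def master_down_timer_expired(self) -> None:
        # A Backup that stops hearing advertisements promotes itself.
        if self.state is State.BACKUP:
            self.state = State.MASTER

    def advertisement_received(self, peer_priority: int) -> None:
        # A Master that hears a higher-priority advertisement steps down.
        if self.state is State.MASTER and peer_priority > self.priority:
            self.state = State.BACKUP

    def shutdown(self) -> None:
        self.state = State.INITIALIZE
```

For example, a router created with `VrrpRouter(100)` goes from Initialize to Backup on `startup()`, and to Master once `master_down_timer_expired()` fires.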


18. How does keepalived work?

keepalived uses a modular design in which different modules implement different functions.
It has three main modules: core, check, and vrrp.
core: the core of keepalived, responsible for starting and maintaining the main process and for loading and parsing the global configuration file.
check: the health checker. It implements the various health check methods and parses the corresponding configuration, including the LVS configuration. It can also check the health of IPVS backend servers with scripts.

vrrp: the VRRPD subprocess, which implements the VRRP protocol. The two nodes of a Keepalived high-availability pair communicate through VRRP, which decides master and backup through an election mechanism: the master has a higher priority than the backup. While running, the master node obtains all resources first and the backup node waits. When the master goes down, the backup takes over its resources and provides the service in its place.
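
The election can be illustrated with a few lines of Python. This is only a sketch of the rule "highest priority wins, and the higher primary IP address breaks a tie"; it assumes preemption is enabled, and the node names, priorities, and addresses are invented for the example.

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    priority: int       # higher priority wins the election
    address: str        # primary IP, used only to break ties
    alive: bool = True


def elect_master(nodes):
    """Return the live node with the highest priority, or None if all are down."""
    candidates = [n for n in nodes if n.alive]
    if not candidates:
        return None
    return max(
        candidates,
        key=lambda n: (n.priority, int(ipaddress.ip_address(n.address))),
    )


# Hypothetical pair: lb1 is the preferred master, lb2 the backup.
nodes = [Node("lb1", 150, "192.168.1.11"), Node("lb2", 100, "192.168.1.12")]
print(elect_master(nodes).name)   # lb1
nodes[0].alive = False            # the master goes down
print(elect_master(nodes).name)   # lb2
```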

Within a Keepalived pair, only the master keeps sending VRRP advertisement packets (multicast by default) to tell the backup that it is alive, and while they arrive the backup does not take over from the master. When the master becomes unavailable, that is, when the backup no longer hears the master's advertisements, the backup starts the relevant services and takes over the resources to keep the business running. The takeover is fast, usually a matter of seconds.
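
How long the backup waits before taking over depends on the advertisement interval (keepalived's `advert_int` setting, which defaults to 1 second) and on the backup's priority. As a rough worked example, assuming the VRRPv2 timer rules from RFC 3768, the wait can be computed like this; the function names are my own.

```python
def skew_time(priority: int) -> float:
    """VRRPv2 skew time in seconds: higher priority means a shorter skew."""
    return (256 - priority) / 256


def master_down_interval(priority: int, advert_interval: float = 1.0) -> float:
    """Seconds a Backup waits without advertisements before becoming Master."""
    return 3 * advert_interval + skew_time(priority)


# A backup with priority 100 and the default 1-second interval:
print(master_down_interval(100))   # 3.609375
```

The skew term means that when several backups exist, the one with the highest priority times out first and becomes the new master.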


19. Causes of split brain

What is split-brain?

In a high availability (HA) system, when the "heartbeat line" connecting two nodes is disconnected, the HA system, which used to act as one coordinated whole, splits into two independent nodes.
Having lost contact, each node assumes the other has failed. The HA software on the two nodes then behaves like a "split-brain patient", with both sides competing for the shared resources and application services. The consequences are serious: either the shared resources are carved up and the services cannot start on either side, or the services start on both sides and read and write the shared storage at the same time, corrupting data.
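
The scenario can be shown with a toy simulation. The `HaNode` class and `deliver_heartbeats` function below are hypothetical and ignore real timers; the point is only that a backup cannot tell "the master is dead" from "the heartbeat link is dead".

```python
class HaNode:
    def __init__(self, name: str, is_master: bool):
        self.name = name
        self.is_master = is_master

    def on_heartbeat_timeout(self) -> None:
        # No heartbeat heard: assume the peer is dead and take over.
        self.is_master = True


def deliver_heartbeats(master: HaNode, backup: HaNode, link_up: bool) -> None:
    """One heartbeat round: the backup only hears the master if the link works."""
    if not link_up:
        backup.on_heartbeat_timeout()


a, b = HaNode("a", True), HaNode("b", False)
deliver_heartbeats(a, b, link_up=False)   # the heartbeat line is cut, "a" is still healthy
masters = [n.name for n in (a, b) if n.is_master]
print(masters)   # ['a', 'b']: both nodes now act as master
```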

What are the causes of split brain?

Split-brain generally happens when the heartbeat link between the high-availability server pair fails, so the nodes cannot communicate normally. Common causes:
1. The heartbeat cable is damaged (broken or aged).
2. The network card or its driver is faulty, or there is an IP misconfiguration or address conflict (when the network cards are directly connected).
3. A device on the heartbeat path, such as a network card or switch, has failed.
4. The arbitration machine has a problem (when an arbitration scheme is used).
5. The iptables firewall is enabled on the high-availability servers and blocks heartbeat messages.
6. The heartbeat network card address or related settings are configured incorrectly on the high-availability servers, so heartbeats cannot be sent.
7. Other causes, such as misconfigured services (for example, mismatched heartbeat modes), heartbeat broadcast conflicts, and software bugs.


 20. How to solve the keepalived split-brain problem?

In production environments, split-brain can be prevented in the following ways:

Connect the nodes with both a serial cable and an Ethernet cable so that two heartbeat lines are in use at the same time. If one line breaks, the other still works and heartbeat messages still get through.

When split-brain is detected, forcibly shut down one of the nodes (this requires special equipment support, such as STONITH or fence devices). In effect, when the backup node stops receiving heartbeat messages, it sends a shutdown command over a separate line to power off the master node.

Set up monitoring and alerting for split-brain.
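
The first measure above, two heartbeat lines, comes down to a simple rule: declare the peer dead only when every channel has gone silent. A minimal sketch, with made-up channel names and a hypothetical `peer_alive` helper:

```python
def peer_alive(last_seen: dict, now: float, timeout: float = 3.0) -> bool:
    """The peer counts as alive if ANY heartbeat channel heard it recently.

    last_seen maps a channel name to the time (in seconds) of the last
    heartbeat received on that channel.
    """
    return any(now - seen <= timeout for seen in last_seen.values())


# The Ethernet line has been silent for 5 s, but the serial line still works.
last_seen = {"ethernet": 100.0, "serial": 104.5}
print(peer_alive(last_seen, now=105.0))                  # True
print(peer_alive({"ethernet": 100.0}, now=105.0))        # False
```

With a single channel, the same 5-second silence would trigger a takeover even though the peer is healthy.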

Common solutions:

If the firewall is enabled, make sure heartbeat messages are allowed through, usually by allowing the relevant IP range.

Run an extra Ethernet cable or serial cable between the primary and secondary nodes as a redundant heartbeat line.

Develop a detection program and use monitoring software to detect split-brain.
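
One possible shape for such a detection program is sketched below. It assumes the monitoring system has already collected the output of `ip -o -4 addr show` from each node (how it is collected, for example over SSH, is left out), and it flags split-brain when more than one node holds the virtual IP. The function names, node names, and addresses are hypothetical.

```python
def holds_vip(ip_addr_output: str, vip: str) -> bool:
    """Return True if vip appears as an inet address in `ip -o -4 addr show` output."""
    for line in ip_addr_output.splitlines():
        fields = line.split()
        if "inet" in fields:
            idx = fields.index("inet")
            # The token after "inet" looks like "192.168.1.100/24".
            if idx + 1 < len(fields) and fields[idx + 1].split("/")[0] == vip:
                return True
    return False


def split_brain(outputs_by_node: dict, vip: str) -> bool:
    """Split-brain is suspected when more than one node holds the VIP."""
    holders = [node for node, out in outputs_by_node.items() if holds_vip(out, vip)]
    return len(holders) > 1
```

The monitoring software can run this check on a schedule and raise an alert whenever `split_brain(...)` returns True.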


 

Origin blog.csdn.net/yj11290301/article/details/135213608