Comparison of High-Availability Solutions: Keepalived and Heartbeat

1. Differences

Keepalived and Heartbeat are two popular open-source high-availability solutions. Both quickly transfer resources (such as IP addresses and application services) from a failed machine to another running machine so that service continues. This is generally referred to as a high-availability service.

Heartbeat and Keepalived have a lot in common, but there are also differences:

(1) Keepalived is easier to use: in terms of installation, configuration, use, and maintenance, Keepalived is simpler than Heartbeat.

(2) Heartbeat is more powerful: although Heartbeat is more complicated, it has more features and more complete supporting tools, and it is suitable for large-scale cluster management. Keepalived is mainly used for failover between cluster nodes and has essentially no cluster management functions.

(3) The protocols are different: Keepalived uses VRRP (Virtual Router Redundancy Protocol), the protocol used for router and switch redundancy (for example, dual-device setups on Cisco switches), for communication and election. Heartbeat uses its own heartbeat messages for communication and election (IBM POWER minicomputers likewise use heartbeat links for two-node setups). These messages are sent over the network or a serial port.
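
To make the "election" part of VRRP more concrete, here is a minimal Python sketch of the rule used to pick the master among live routers: the highest priority wins, and a tie is broken by the higher primary IP address. The class name, node names, and addresses are illustrative assumptions, not Keepalived code or configuration.

```python
from dataclasses import dataclass
from ipaddress import IPv4Address


@dataclass(frozen=True)
class VrrpRouter:
    """One participant in a VRRP group (hypothetical model for illustration)."""
    name: str
    priority: int      # higher priority wins the election
    primary_ip: str    # used only to break priority ties


def elect_master(routers):
    """Return the router that becomes master among the live routers.

    Highest priority wins; on a tie, the higher primary IP address wins.
    """
    if not routers:
        raise ValueError("no live routers to elect from")
    return max(routers, key=lambda r: (r.priority, IPv4Address(r.primary_ip)))


nodes = [
    VrrpRouter("lb01", priority=150, primary_ip="10.0.0.11"),
    VrrpRouter("lb02", priority=100, primary_ip="10.0.0.12"),
]
print(elect_master(nodes).name)       # lb01: it has the higher priority
print(elect_master(nodes[1:]).name)   # lb02: lb01 is no longer advertising
```

The second call models a failure: once lb01 stops advertising, it drops out of the list of live routers and lb02 becomes master.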


2. Working principle:


(Active-standby mode) One Heartbeat server acts as the primary server and the other automatically becomes the hot-standby server. The heartbeat daemon on the hot-standby server is configured to listen for heartbeat messages from the primary server. If no heartbeat is received within the specified time, the standby server initiates failover: it takes ownership of the relevant resources from the primary server and continues to provide the service without interruption, thereby achieving high availability of resources and services.

(Active-active mode) Heartbeat also supports active-active mode, in which both servers are active and each acts as the standby for the other. Failover generally takes between 5 and 20 seconds.
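
The timeout logic on the standby side can be sketched as follows. This is a simplified model, not Heartbeat's implementation: the class name and the use of plain numbers as timestamps are assumptions made for illustration, and `deadtime` here simply means "seconds of silence before failover".

```python
class StandbyMonitor:
    """Tracks heartbeats from the primary and decides when to take over."""

    def __init__(self, deadtime):
        self.deadtime = deadtime   # seconds of silence before failover
        self.last_seen = None      # time of the most recent heartbeat
        self.active = False        # True once this node owns the resources

    def on_heartbeat(self, now):
        """Record that a heartbeat from the primary arrived at time `now`."""
        self.last_seen = now

    def check(self, now):
        """Return True if this call triggered a failover."""
        if self.active or self.last_seen is None:
            return False
        if now - self.last_seen > self.deadtime:
            self.active = True     # a real node would take over the VIP here
            return True
        return False


monitor = StandbyMonitor(deadtime=10)
monitor.on_heartbeat(now=0)
print(monitor.check(now=5))    # False: the primary was heard 5 s ago
print(monitor.check(now=11))   # True: 11 s of silence exceeds deadtime
```

A shorter timeout gives faster failover but makes a false takeover more likely when the heartbeat link is merely slow.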

(1) Conditions that trigger failover:
        1. Server downtime
        2. Heartbeat software failure
        3. Heartbeat connection line failure
        Note: a failure of the application service alone does not trigger failover. To force a switch in that case, stop the heartbeat service when the application service goes down.
(2) Communication between the two heartbeat services:
        1. Serial cable: a dedicated serial port card is installed in each server (the distance must be short, usually adjacent positions in the same rack)
        2. Crossover network cable directly connecting a network card on each of the two servers
        3. Network cable through a switch (affected by switch failures)


(3) Heartbeat split brain (split-brain)

       For a certain period of time, the two servers cannot detect each other's heartbeat, so each starts failover on its own and claims ownership of the resources and services. As a result, the same service is started on both nodes at the same time and the same VIP exists on both, causing an IP conflict. This is a serious problem.

(4) Causes of split-brain
       1. The heartbeat link is faulty, so the nodes cannot communicate normally
       2. A firewall is enabled and blocks the heartbeat messages
       3. The address of the heartbeat network card is not configured correctly
       4. The heartbeat mode is misconfigured, heartbeat broadcasts conflict, or there is a software bug
(5) Split-brain prevention schemes:
      1. Connect the nodes with both a serial cable and an Ethernet cable, so that two heartbeat lines are in use at the same time
      2. When a split-brain is detected, forcibly shut down one node
      3. Set up monitoring and early-warning alerts
      4. Use an arbitration mechanism (to determine which node takes over the service)
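
As an illustration of the arbitration idea in item 4, the sketch below models a third party that grants ownership of the resources to only one node at a time. The `Arbiter` class and the node names are hypothetical. In practice the arbiter would be a separate host or shared resource that both nodes can reach independently of the heartbeat link.

```python
class Arbiter:
    """Hypothetical third party that grants resource ownership to one node."""

    def __init__(self):
        self.owner = None

    def request(self, node):
        """Grant ownership if it is free or already held by `node`."""
        if self.owner is None or self.owner == node:
            self.owner = node
            return True
        return False

    def release(self, node):
        """Give up ownership, but only if `node` is the current owner."""
        if self.owner == node:
            self.owner = None


# Both nodes lose each other's heartbeat and both try to take over.
arbiter = Arbiter()
print(arbiter.request("node-a"))   # True: node-a may start the service
print(arbiter.request("node-b"))   # False: node-b must stay on standby
```

Because only one request succeeds, the two nodes never hold the VIP at the same time, even when neither can hear the other's heartbeat.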

(6) Message types:
        1. Heartbeat message (unicast, broadcast, or multicast): a 150-byte data packet
        2. Cluster transition messages: ip-request, ip-request-rsp
        3. Retransmission message: rexmit-request

3. IP address takeover and failover:

Heartbeat fails over via IP address takeover and ARP broadcast.

ARP broadcast: when the primary server fails and the standby node has taken over the resources, the standby node immediately forces all clients to update their local ARP tables (that is, to discard the locally cached mapping between the VIP and the MAC address of the failed server), so that clients talk to the new primary server.
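
The broadcast that refreshes client ARP caches is commonly a gratuitous ARP, in which the sender and target IP are both the VIP. The sketch below only builds such a frame in memory to show its layout. The function name, VIP, and MAC address are illustrative assumptions, and actually sending the frame would require a raw socket and root privileges, which is left out here.

```python
import socket
import struct


def build_gratuitous_arp(vip, mac):
    """Build an Ethernet frame carrying a gratuitous ARP request for `vip`.

    `mac` is the new owner's MAC address, written as 'aa:bb:cc:dd:ee:ff'.
    """
    mac_bytes = bytes.fromhex(mac.replace(":", ""))
    ip_bytes = socket.inet_aton(vip)
    broadcast = b"\xff" * 6

    # Ethernet header: destination, source, EtherType 0x0806 (ARP)
    ethernet = broadcast + mac_bytes + struct.pack("!H", 0x0806)

    # ARP payload: sender IP and target IP are both the VIP,
    # which is what makes the request "gratuitous".
    arp = struct.pack(
        "!HHBBH6s4s6s4s",
        1,             # hardware type: Ethernet
        0x0800,        # protocol type: IPv4
        6,             # hardware address length
        4,             # protocol address length
        1,             # opcode: request
        mac_bytes,     # sender MAC: the node that now owns the VIP
        ip_bytes,      # sender IP: the VIP
        b"\x00" * 6,   # target MAC: left as zeros
        ip_bytes,      # target IP: the VIP again
    )
    return ethernet + arp


frame = build_gratuitous_arp("10.0.0.100", "02:00:00:aa:bb:cc")
print(len(frame))   # 42: 14-byte Ethernet header + 28-byte ARP payload
```

Hosts that receive this broadcast and already have an entry for the VIP replace the old MAC address with the sender MAC carried in the frame.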
 
The real IP, also known as the management IP, generally refers to the IP configured on the physical network card. In a load-balanced high-availability environment, the management IP does not serve external client traffic. It is used only for managing the server, for example for SSH connections.

The VIP is a virtual IP. In practice it is configured as an alias such as eth0:X, where X is any number from 0 to 255, and multiple aliases can be bound to one network card. When the primary server fails, the VIP automatically drifts to the standby server.
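
To show what binding a VIP as an alias looks like, the helper below builds the iproute2 commands that would add a VIP with an eth0:X label and later remove it. It only returns the argument lists and does not run them. The function name, address, prefix length, and device are illustrative assumptions. Heartbeat and Keepalived manage the VIP through their own resource scripts and configuration.

```python
def vip_commands(vip, prefix_len, device, alias_index):
    """Return (add, delete) command argument lists for a labelled VIP."""
    address = f"{vip}/{prefix_len}"
    label = f"{device}:{alias_index}"
    add = ["ip", "addr", "add", address, "dev", device, "label", label]
    delete = ["ip", "addr", "del", address, "dev", device]
    return add, delete


add_cmd, del_cmd = vip_commands("10.0.0.100", 24, "eth0", 0)
print(" ".join(add_cmd))   # ip addr add 10.0.0.100/24 dev eth0 label eth0:0
print(" ".join(del_cmd))   # ip addr del 10.0.0.100/24 dev eth0
```

On failover, the standby node would run the add command and the failed or demoted node the delete command, which is the "drift" of the VIP described above.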
