Enterprise High Availability in Practice: keepalived

What is keepalived
keepalived is cluster management software that keeps the services of a cluster highly available and prevents single points of failure.

How keepalived works
keepalived is built on the VRRP protocol. VRRP stands for Virtual Router Redundancy Protocol.

VRRP can be thought of as a protocol for making routers highly available. N routers that provide the same function are combined into a router group with one master and multiple backups. The master holds the VIP (virtual IP) that serves external traffic, and the other machines on the LAN use this VIP as their default route. The master sends multicast VRRP advertisements. When the backups stop receiving them, they conclude that the master is down, and a new master is elected from the backups according to VRRP priority. This keeps the router highly available.
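The election described above can be sketched in a few lines of Python. This is only an illustrative model, not keepalived's implementation: the Router class, the node names, and the 192.0.2.x addresses are made up for the example. The timeout formula and the tie-break on the higher IP address follow VRRPv2 (RFC 3768), assuming the default 1-second advertisement interval.

```python
from dataclasses import dataclass
from ipaddress import IPv4Address
from typing import List, Optional


@dataclass
class Router:
    """Hypothetical model of one member of a VRRP router group."""
    name: str
    priority: int      # higher priority wins the election
    ip: str            # primary IP address, used only to break ties
    alive: bool = True


def master_down_interval(priority: int, advert_int: float = 1.0) -> float:
    """Seconds a backup waits without advertisements before it
    declares the master down (VRRPv2: 3 * advert_int + skew_time)."""
    skew_time = (256 - priority) / 256
    return 3 * advert_int + skew_time


def elect_master(routers: List[Router]) -> Optional[Router]:
    """Pick the live router with the highest priority.
    Equal priorities are resolved in favor of the higher IP address."""
    candidates = [r for r in routers if r.alive]
    if not candidates:
        return None
    return max(candidates, key=lambda r: (r.priority, IPv4Address(r.ip)))


group = [
    Router("node-a", 150, "192.0.2.11"),
    Router("node-b", 100, "192.0.2.12"),
    Router("node-c", 100, "192.0.2.13"),
]
print(elect_master(group).name)    # node-a holds the VIP
group[0].alive = False             # the master stops sending advertisements
print(elect_master(group).name)    # node-c wins the tie on IP address
print(master_down_interval(100))   # 3.609375
```

Because the skew time shrinks as priority grows, a higher-priority backup times out slightly earlier and so tends to claim the master role first.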

keepalived has three main modules: core, check, and vrrp. The core module is the heart of keepalived: it starts and maintains the main process, and loads and parses the global configuration file. The check module performs health checks and supports a variety of common check methods. The vrrp module implements the VRRP protocol.
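To make the role of the check module more concrete, here is a minimal Python sketch of one common check method, a TCP connection test, combined with a counter that only changes state after several consecutive results. It is not keepalived's code. The names tcp_check and HealthTracker are hypothetical, and the rise/fall thresholds are assumed values chosen for the example.

```python
import socket


def tcp_check(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


class HealthTracker:
    """Marks a service down after `fall` consecutive failed checks
    and up again after `rise` consecutive successful checks."""

    def __init__(self, rise: int = 2, fall: int = 3) -> None:
        self.rise = rise
        self.fall = fall
        self.healthy = True
        self._streak = 0  # consecutive results that contradict the current state

    def record(self, ok: bool) -> bool:
        """Feed one check result and return the resulting health state."""
        if ok == self.healthy:
            self._streak = 0
        else:
            self._streak += 1
            needed = self.rise if ok else self.fall
            if self._streak >= needed:
                self.healthy = ok
                self._streak = 0
        return self.healthy


tracker = HealthTracker(rise=2, fall=3)
for result in (False, False, False, True, True):
    print(result, "->", tracker.record(result))
```

Requiring several consecutive results before changing state keeps a single dropped packet from triggering a failover.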

Split-brain:
When the BACKUP host stops receiving packets from the MASTER host, keepalived switches the BACKUP to master. If the real problem is the communication link between the two hosts, so that neither can receive the other's multicast advertisements while both nodes are actually working normally, then both nodes become master and bind the virtual IP at the same time, with unpredictable consequences. This is split-brain.
Solutions:
1, Add more detection methods, such as redundant heartbeat links (two network cards for health monitoring) and having the nodes ping each other, to minimize the chance of split-brain. (This treats the symptom rather than the cause; it only raises the probability of detecting the problem.)
2, Set up monitoring and alerting for split-brain (e-mail, SMS, or on-call staff) so that a person can step in and arbitrate as soon as the problem occurs, reducing losses. For example, Baidu's monitoring alert SMS messages distinguish between uplink and downlink: an alert is sent to the administrator's phone, and the administrator can reply with a corresponding number or a simple string that is returned to the server, which then handles the fault automatically according to that instruction. This shortens the time needed to resolve the fault.
3, Stop both the hot standby and the master, then check the firewall and the network communication between the machines.
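The monitoring in solution 2 needs some way to recognize split-brain. One simple approach is to collect the addresses bound on each node and raise an alert when more than one node holds the VIP. The Python sketch below assumes the monitoring system has already gathered that information; the function names, node names, and addresses are all hypothetical.

```python
from typing import Dict, List, Set


def vip_holders(reports: Dict[str, Set[str]], vip: str) -> List[str]:
    """Return the sorted names of the nodes that currently have the VIP bound.
    `reports` maps a node name to the set of IP addresses bound on it."""
    return sorted(node for node, addrs in reports.items() if vip in addrs)


def classify(reports: Dict[str, Set[str]], vip: str) -> str:
    """Classify the cluster state by how many nodes hold the VIP."""
    holders = vip_holders(reports, vip)
    if len(holders) > 1:
        return "split-brain"
    if not holders:
        return "no-master"
    return "ok"


def is_split_brain(reports: Dict[str, Set[str]], vip: str) -> bool:
    """True when two or more nodes hold the VIP at the same time."""
    return classify(reports, vip) == "split-brain"


VIP = "192.0.2.100"  # example address, not taken from the article
reports = {
    "node-a": {"192.0.2.11", VIP},
    "node-b": {"192.0.2.12", VIP},
}
if is_split_brain(reports, VIP):
    print("ALERT: VIP", VIP, "is bound on", ", ".join(vip_holders(reports, VIP)))
```

A check like this only detects the condition. Deciding which node should release the VIP is still left to the administrator or to a separate arbitration step.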

2, Nginx + keepalived layer-7 load balancing
3, LVS_Director + keepalived
4, MySQL + keepalived
5, HAProxy + keepalived

Origin blog.csdn.net/wx912820/article/details/104976215