[Linux35] Pacemaker high availability cluster+HAProxy+fence-virtd+nginx

1. Introduction to Pacemaker


1.1 Introduction to Pacemaker


Pacemaker is the most widely used open-source cluster resource manager in the Linux environment. It relies on the messaging and membership functions provided by the cluster infrastructure (Corosync or Heartbeat) to detect failures at the node and resource level and to recover resources, thereby maximizing the availability of cluster services. Logically, Pacemaker is responsible for the full life-cycle management of the software services in the cluster, driven by the resource rules defined by the cluster administrator; this management can even cover an entire software stack and the interactions between its components. In practice Pacemaker can manage clusters of almost any size, and its powerful resource-dependency model lets administrators describe the relationships between cluster resources precisely, including their start order and placement. At the same time, almost any kind of software can be managed by Pacemaker as a resource object by writing custom startup and management scripts (resource agents). Finally, note that Pacemaker is only a resource manager and does not itself provide cluster heartbeat information. Since every highly available cluster needs a heartbeat monitoring mechanism, beginners often mistakenly assume that Pacemaker has its own heartbeat detection; in fact the heartbeat mechanism used by Pacemaker comes from Corosync or Heartbeat.

  • pcsd.service: the name of the pcs daemon service

1.2 Introduction to pcs


pcs: Pacemaker cluster management tool

  • The Pacemaker community provides two commonly used command-line cluster management tools: pcs and crmsh, the tools most frequently used by cluster administrators;
  • Note that the pcs command line has requirements on the pacemaker and corosync versions installed on the system: only Pacemaker 1.1.8 and above together with Corosync 2.0 and above can be managed with the pcs tool (a quick version check is shown below)
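As a quick sanity check (assuming the packages are already installed), the installed versions can be confirmed with rpm:

rpm -q pacemaker corosync pcs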

1.3 pcs commonly used management commands


cluster: configure cluster options and nodes
status: view the current status of cluster resources, nodes and daemons
resource: create and manage cluster resources
constraint: manage cluster resource constraints and restrictions
property: manage cluster node and resource properties
config: display the complete cluster configuration in a human-readable format
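A few illustrative invocations of these subcommands (pcs 0.9 syntax, as shipped with RHEL 7):

pcs status                # current cluster, resource and node status
pcs cluster status        # cluster and daemon status only
pcs resource show         # list configured resources
pcs constraint list       # show resource constraints
pcs property list         # show cluster properties
pcs config                # full configuration in readable form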

2. Installation and configuration


Experimental environment: 1. the firewall is turned off; 2. SELinux is set to disabled


Two servers (with passwordless SSH configured between them):
server1: 192.168.17.1
server4: 192.168.17.4


  • Software repository configuration

vim /etc/yum.repos.d/westos.repo

[dvd]
name=rhel7.6 BaseOS
baseurl=http://192.168.17.1/rhel7.6/
gpgcheck=0
[HighAvailability]
name=rhel7.6
baseurl=http://192.168.17.1/rhel7.6/addons/HighAvailability
gpgcheck=0

  • Install pacemaker, pcs, psmisc and policycoreutils-python
yum install -y pacemaker pcs psmisc policycoreutils-python 
ssh server4 yum install -y pacemaker pcs psmisc policycoreutils-python

  • Enable the service at boot and start it now
systemctl enable --now pcsd.service
ssh server4 systemctl enable --now pcsd.service

  • Set the password of the hacluster user
echo westos | passwd --stdin hacluster 
ssh server4 'echo westos | passwd --stdin hacluster'

  • Authenticate the cluster nodes (log in as the hacluster user with the password set above)

pcs cluster auth server1 server4


  • Create a two-node cluster

pcs cluster setup --name mycluster server1 server4

  • Start the cluster

pcs cluster start --all

pcs cluster enable --all


  • Disable the STONITH component (no fence device is configured yet)

pcs property set stonith-enabled=false

If STONITH is not disabled:
pcs status will show a warning when displaying the cluster status
crm_verify -LV will report errors when validating the cluster configuration


  • Create the VIP resource
pcs resource create vip ocf:heartbeat:IPaddr2 ip=192.168.17.100 op monitor interval=30s
#All available parameters can be viewed with pcs resource create --help

3. Testing


  • Check on node server1: it shows that node server1 is online and the VIP has been added automatically (see the commands below)

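The screenshots are not reproduced here; a minimal way to verify the same thing on server1 (the interface name eth0 is an assumption) is:

pcs status                # both nodes online, vip started on server1
ip addr show eth0         # 192.168.17.100 should appear as a secondary address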

  • When node server1 is stopped, the other node server4 automatically takes over and adds the VIP

pcs cluster stop server1


  • After node server1 is started again, the resources do not switch back; server4 keeps running them

pcs cluster start server1


4. Pacemaker and HAProxy


4.1 Configuration


  • Install haproxy on both server nodes and keep the haproxy service disabled (the cluster will control it); the configuration must be identical on both nodes (a minimal example configuration is sketched below)
yum install -y haproxy
systemctl disable --now haproxy.service
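The original haproxy.cfg is not shown in this post; a minimal sketch that serves a stats page at /status and balances two backend web servers (the backend addresses are assumptions) could look like this:

# /etc/haproxy/haproxy.cfg fragment (identical on both nodes)
listen westos
    bind *:80
    mode http
    stats uri /status                 # stats page used in the test below
    balance roundrobin
    server web1 192.168.17.2:80 check
    server web2 192.168.17.3:80 check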


  • Create resources

pcs resource create haproxy systemd:haproxy op monitor interval=30s


  • Create a resource group (so that the two resources are always managed by the same node)

pcs resource group add hagroup vip haproxy
#The resources start in the order they are listed: vip first, then haproxy
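To confirm the grouping and the start order, something like the following can be used:

pcs resource group list       # hagroup: vip haproxy
pcs resource show --full      # full resource definitions including the group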


  • Visit: http://192.168.17.100/status (the VIP created above)


4.2 Test 1: Put a node into standby


  • When node server4 is put into standby, its resources are automatically taken over by the idle node server1. After server4 is taken out of standby again, the resources do not switch back.

pcs node standby

pcs node unstandby


4.3 Test 2: Close the haproxy service


  • After the haproxy service is stopped manually, the cluster detects the failure through the monitor operation and starts it again automatically; pcs status then shows a failed-action entry recording that haproxy had been stopped.


4.4 Test 3: Delete VIP


  • After the VIP is removed from the interface, the monitor operation detects that it is missing and adds it back automatically


4.5 Test 4: Take the network interface down


ip link set <network interface> down

  • The resources are automatically taken over by the other, idle node server4, but server1 still believes that it owns them; the correct status only reappears after server1 is rebooted.


5. Fence (protection against a hung or unresponsive cluster node)


Fence function: In an HA cluster environment, the backup server B checks over the heartbeat line whether the primary server A is still alive. If server A receives a large number of client requests, its CPU load reaches 100% and its resources are exhausted, it may no longer be able to answer server B's heartbeat packets (or only answer them with a long delay). Server B then concludes that server A is down, takes over the resources and promotes itself to primary. When server A recovers a moment later, it still considers itself the primary, so both servers fight over the resources and write to them at the same time, destroying the security and consistency of the data. This situation is called "split brain". With a fence mechanism in place, when server A is overloaded and stops responding, Fence simply kills server A, preventing split brain from occurring.

Principle: When the primary host hangs or goes down unexpectedly, the backup machine first calls the FENCE device, which restarts the abnormal host or isolates it from the network. Once the FENCE operation has completed successfully, this is reported back to the backup machine, and only after receiving the success message does the backup machine start taking over the services and resources of the host. In this way the FENCE device releases the resources held by the abnormal node and guarantees that resources and services are always running on exactly one node.

Types: Hardware fence: a power fence kicks out the failed server by switching off its power. Software fence: a fence card (smart card) kicks out the failed server via cable and software.


5.1 Test host installation and configuration


fence host: 192.168.17.250


  • Install fence-virtd (the fencing daemon), fence-virtd-multicast (the multicast network listener) and fence-virtd-libvirt (the libvirt backend that controls the virtual machines at the hypervisor level)

yum install -y fence-virtd

yum install -y fence-virtd-libvirt

yum install -y fence-virtd-multicast

  • Write the fence configuration interactively (an example of the resulting /etc/fence_virt.conf is sketched below)

fence_virtd -c
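fence_virtd -c walks through an interactive dialog and writes /etc/fence_virt.conf. A typical result looks roughly like the following (the bridge name br0 is an assumption; it must be the interface the cluster nodes are reachable on):

# /etc/fence_virt.conf (written by fence_virtd -c)
fence_virtd {
        listener = "multicast";
        backend = "libvirt";
        module_path = "/usr/lib64/fence-virt";
}

listeners {
        multicast {
                key_file = "/etc/cluster/fence_xvm.key";
                address = "225.0.0.12";
                family = "ipv4";
                port = "1229";
                interface = "br0";
        }
}

backends {
        libvirt {
                uri = "qemu:///system";
        }
}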


  • Create the directory /etc/cluster, change into it, and generate the key file

mkdir /etc/cluster

cd /etc/cluster

dd if=/dev/urandom of=fence_xvm.key bs=128 count=1: Generate key file

  • Restart the fence_virtd service; it listens on UDP port 1229 (check shown below)

systemctl restart fence_virtd
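One quick way to confirm that the daemon is listening (assuming net-tools is installed):

netstat -anulp | grep 1229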

  • Copy the key file to the monitored nodes (create the /etc/cluster directory on all nodes in advance)

scp fence_xvm.key root@server1:/etc/cluster/

scp fence_xvm.key root@server4:/etc/cluster/


5.2 Configuration of the monitored nodes (the cluster servers)


  • Install fence-virt on all cluster servers

yum install fence-virt.x86_64 -y

ssh server4 yum install fence-virt.x86_64 -y

  • View the available fence agents

stonith_admin -I


  • Add fence

pcs stonith create vmfence fence_xvm pcmk_host_map="server1:vm1;server4:vm4" op monitor interval=60s
All options can be viewed with pcs stonith describe fence_xvm
#The value of pcmk_host_map is written as key:value pairs, i.e.:
#hostname:virtual machine (domain) name
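Once the stonith resource is running, a manual fencing test can be performed from one node against the other (vm4 is the libvirt domain name mapped above):

fence_xvm -H vm4        # should power-cycle the guest vm4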


  • Enable STONITH component function

pcs property set stonith-enabled=true

  • Verify cluster configuration information

crm_verify -LV


5.3 Test 1: Crash the kernel


  • Test on node server4: crash the kernel and observe that server4 is automatically powered off and started again by the fence device, which confirms that the fencing works

echo c > /proc/sysrq-trigger

5.4 Test 2: Take the network interface down


  • Take down the network interface of server4 and observe that server4 is automatically powered off and started again by the fence device, which confirms that the fencing works

ip link set eth0 down

6. lvs and nginx


  • Stop and disable the cluster

pcs cluster stop --all

pcs cluster disable --all

  • Install nginx from source

tar zxf nginx-1.18.0.tar.gz

cd nginx-1.18.0

yum install -y gcc pcre-devel openssl-devel: install the build dependencies gcc, pcre-devel and openssl-devel

vim auto/cc/gcc
#CFLAGS="$CFLAGS -g"
#Commenting out this line (line 127) removes the debug symbols and makes the installed binary smaller

./configure --prefix=/usr/local/nginx --with-http_ssl_module: configure the build, specifying the installation prefix and enabling the SSL module

make && make install
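After installation, the compiled-in options can be double-checked with:

/usr/local/nginx/sbin/nginx -V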

  • Configure environment variables and start the service

cd /usr/local/nginx/sbin/: the directory that needs to be added to PATH

vim ~/.bash_profile: add the variable

PATH=$PATH:$HOME/bin:/usr/local/nginx/sbin

source ~/.bash_profile: re-read the file so that the variable takes effect

nginx: Start service


  • Configuration file

vim /usr/local/nginx/conf/nginx.conf

upstream westos {
    server 172.25.17.2:80;
    server 172.25.17.3:80;
}

server {
    listen 80;
    server_name demo.westos.org;
    location / {
        proxy_pass http://westos;
    }
}

nginx -t: Check for syntax errors

nginx -s reload: Reread the configuration file


  • Test: on the test host, add a name-resolution entry and then visit the site

vim /etc/hosts

192.168.17.1     demo.westos.org
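A simple way to exercise the load balancing from the test host (assuming the backend servers serve distinguishable pages):

for i in $(seq 4); do curl -s http://demo.westos.org; done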



Origin blog.csdn.net/weixin_46069582/article/details/112645366