Notes on troubleshooting corosync + pacemaker + PostgreSQL streaming replication


Situation

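While setting up the corosync + pacemaker cluster, every time pacemaker was started the original primary shut down automatically and the original standby was promoted to primary. The slave virtual IP (vip-slave) also failed to come up, leaving the whole cluster broken.
The status was as follows: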

[root@plat-ecloud01-andfleethe-prd-postgres03 ~]# crm status
Stack: corosync
Current DC: plat-ecloud01-andfleethe-prd-postgres03 (version 1.1.15-11.el7_3.4-e174ec8) - partition with quorum
Last updated: Wed Jan  2 11:22:13 2019		Last change: Wed Jan  2 11:18:30 2019 by root via crm_attribute on plat-ecloud01-andfleethe-prd-postgres04

2 nodes and 8 resources configured

Online: [ plat-ecloud01-andfleethe-prd-postgres03 plat-ecloud01-andfleethe-prd-postgres04 ]

Full list of resources:

 fence-cps01	(ocf::heartbeat:fence_check):	Started plat-ecloud01-andfleethe-prd-postgres03
 fence-cps02	(ocf::heartbeat:fence_check):	Started plat-ecloud01-andfleethe-prd-postgres04
 Master/Slave Set: msPostgresql [pgsql]
     Masters: [ plat-ecloud01-andfleethe-prd-postgres04 ]
     Stopped: [ plat-ecloud01-andfleethe-prd-postgres03 ]
 Resource Group: master-group
     vip-master	(ocf::heartbeat:IPaddr2):	Started plat-ecloud01-andfleethe-prd-postgres04
 Clone Set: clnPingCheck [pingCheck]
     Started: [ plat-ecloud01-andfleethe-prd-postgres03 plat-ecloud01-andfleethe-prd-postgres04 ]
 Resource Group: slave-group
     vip-slave	(ocf::heartbeat:IPaddr2):	Stopped

Resolution

Checking the logs revealed these errors:

Jan 02 14:59:12 [17628] plat-ecloud01-andfleethe-prd-postgres03       lrmd:   notice: operation_finished:       pingCheck_monitor_10000:10617:stderr [ /usr/lib/ocf/resource.d/heartbeat/.ocf-shellfuncs: line 24: $'\r': command not found ]
Jan 02 14:59:12 [17628] plat-ecloud01-andfleethe-prd-postgres03       lrmd:   notice: operation_finished:       pingCheck_monitor_10000:10617:stderr [ /usr/lib/ocf/resource.d/heartbeat/.ocf-shellfuncs: line 26: $'\r': command not found ]
Jan 02 14:59:12 [17628] plat-ecloud01-andfleethe-prd-postgres03       lrmd:   notice: operation_finished:       pingCheck_monitor_10000:10617:stderr [ /usr/lib/ocf/resource.d/heartbeat/.ocf-shellfuncs: line 31: $'\r': command not found ]
Jan 02 14:59:12 [17628] plat-ecloud01-andfleethe-prd-postgres03       lrmd:   notice: operation_finished:       pingCheck_monitor_10000:10617:stderr [ /usr/lib/ocf/resource.d/heartbeat/.ocf-shellfuncs: line 39: $'\r': command not found ]
Jan 02 14:59:12 [17628] plat-ecloud01-andfleethe-prd-postgres03       lrmd:   notice: operation_finished:       pingCheck_monitor_10000:10617:stderr [ /usr/lib/ocf/resource.d/heartbeat/.ocf-shellfuncs: line 41: expot: command not found ]
Jan 02 14:59:12 [17628] plat-ecloud01-andfleethe-prd-postgres03       lrmd:   notice: operation_finished:       pingCheck_monitor_10000:10617:stderr [ /usr/lib/ocf/resource.d/heartbeat/.ocf-shellfuncs: line 42: expot: command not found ]
Jan 02 14:59:12 [17628] plat-ecloud01-andfleethe-prd-postgres03       lrmd:   notice: operation_finished:       pingCheck_monitor_10000:10617:stderr [ /usr/lib/ocf/resource.d/heartbeat/.ocf-shellfuncs: line 43: $'\r': command not found ]
Jan 02 14:59:12 [17628] plat-ecloud01-andfleethe-prd-postgres03       lrmd:   notice: operation_finished:       pingCheck_monitor_10000:10617:stderr [ /usr/lib/ocf/resource.d/heartbeat/.ocf-shellfuncs: line 45: $'\r': command not found ]
Jan 02 14:59:12 [17628] plat-ecloud01-andfleethe-prd-postgres03       lrmd:   notice: operation_finished:       pingCheck_monitor_10000:10617:stderr [ /usr/lib/ocf/resource.d/heartbeat/.ocf-shellfuncs: line 66: syntax error near unexpected token `$'{\r'' ]
Jan 02 14:59:12 [17628] plat-ecloud01-andfleethe-prd-postgres03       lrmd:   notice: operation_finished:       pingCheck_monitor_10000:10617:stderr [ /usr/lib/ocf/resource.d/heartbeat/.ocf-shellfuncs: line 66: `ocf_is_oot() {^M' ]
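
The $'\r': command not found messages above are what the shell prints when a script contains carriage returns, that is, Windows-style CRLF line endings. As a minimal sketch (the has_cr helper and the temporary demo file are hypothetical and not part of the original post), a file can be checked for carriage returns like this:

```shell
# Hypothetical helper, not from the original post: succeed if the file
# named by $1 contains at least one carriage return (CR) character.
has_cr() {
    # printf '\r' produces a literal CR; grep -q only sets the exit status.
    grep -q "$(printf '\r')" -- "$1"
}

# Demo on a throwaway file so nothing on the cluster node is touched.
demo=$(mktemp)
printf 'export A=1\r\nexport B=2\r\n' > "$demo"
if has_cr "$demo"; then
    echo "CR line endings found"
else
    echo "no CR line endings"
fi
rm -f "$demo"
```

On the affected node the same check could be pointed at /usr/lib/ocf/resource.d/heartbeat/.ocf-shellfuncs, the path named in the log.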

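The cause turned out to be the link file: every error points at /usr/lib/ocf/resource.d/heartbeat/.ocf-shellfuncs, and $'\r': command not found is what the shell reports when a script has Windows-style (CRLF) line endings. I had already cleared up the ocf-shellfuncs problem earlier, but did not expect the link file on the primary to still contain errors.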

[root@plat-ecloud01-andfleethe-prd-postgres03 ~]# ls -a /usr/lib/ocf/resource.d/heartbeat/
.           CTDB    Dummy        galera      IPaddr2.bak       MailTo     nfsserver         .ocf-shellfuncs      pgsql.bak  rabbitmq-cluster  slapd          Xinetd
..          db2     ethmonitor   garbd       IPsrcaddr         mysql      nginx             .ocf-shellfuncs.bak  pgsql.bb   redis             Squid
apache      Delay   exportfs     iface-vlan  iSCSILogicalUnit  nagios     .ocf-binaries     oracle               ping       Route             symlink
clvm        dhcpd   fence_check  IPaddr      iSCSITarget       named      .ocf-directories  oralsnr              portblock  rsyncd            tomcat
conntrackd  docker  Filesystem   IPaddr2     LVM               nfsnotify  .ocf-returncodes  pgsql                postfix    SendArp           VirtualDomain
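
The post does not say exactly how the file was corrected. One common way to deal with CRLF line endings is to write a cleaned copy with the carriage returns removed and inspect it before replacing anything. The sketch below assumes that approach; the strip_cr helper and the temporary files are hypothetical. Note that the log also shows expot and ocf_is_oot, which appear to be damaged forms of export and ocf_is_root, and removing carriage returns alone would not repair those.

```shell
# Hypothetical helper, not from the original post: copy the file named by $1
# to $2 with every carriage return removed. The original is left untouched.
strip_cr() {
    tr -d '\r' < "$1" > "$2"
}

# Demo on throwaway files rather than on the real resource agent scripts.
src=$(mktemp)
dst=$(mktemp)
printf 'export A=1\r\nexport B=2\r\n' > "$src"
strip_cr "$src" "$dst"

# Count the lines in the cleaned copy that still contain a CR.
# grep -c exits non-zero when the count is 0, hence the "|| true".
remaining=$(grep -c "$(printf '\r')" "$dst" || true)
echo "lines with CR after cleanup: $remaining"
rm -f "$src" "$dst"
```

Writing to a separate file first keeps a known starting point around, much like the .bak copies visible in the directory listing above.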

After correcting the file and restarting the whole stack, the cluster recovered:

[root@plat-ecloud01-andfleethe-prd-postgres03 ~]# crm status
Stack: corosync
Current DC: plat-ecloud01-andfleethe-prd-postgres03 (version 1.1.15-11.el7_3.4-e174ec8) - partition with quorum
Last updated: Wed Jan  2 15:46:46 2019		Last change: Wed Jan  2 15:22:53 2019 by root via crm_attribute on plat-ecloud01-andfleethe-prd-postgres03

2 nodes and 8 resources configured

Online: [ plat-ecloud01-andfleethe-prd-postgres03 plat-ecloud01-andfleethe-prd-postgres04 ]

Full list of resources:

 fence-cps01	(ocf::heartbeat:fence_check):	Started plat-ecloud01-andfleethe-prd-postgres03
 fence-cps02	(ocf::heartbeat:fence_check):	Started plat-ecloud01-andfleethe-prd-postgres04
 Master/Slave Set: msPostgresql [pgsql]
     Masters: [ plat-ecloud01-andfleethe-prd-postgres03 ]
     Slaves: [ plat-ecloud01-andfleethe-prd-postgres04 ]
 Resource Group: master-group
     vip-master	(ocf::heartbeat:IPaddr2):	Started plat-ecloud01-andfleethe-prd-postgres03
 Clone Set: clnPingCheck [pingCheck]
     Started: [ plat-ecloud01-andfleethe-prd-postgres03 plat-ecloud01-andfleethe-prd-postgres04 ]
 Resource Group: slave-group
     vip-slave	(ocf::heartbeat:IPaddr2):	Started plat-ecloud01-andfleethe-prd-postgres04


Reposted from blog.csdn.net/sunbocong/article/details/85620034