Pgpool-II: High-Availability Middleware for PostgreSQL

1. Introduction to Pgpool-II

  Pgpool-II is proxy software that sits between PostgreSQL servers and PostgreSQL database clients. It provides the following features:

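  • Connection Pooling
      Pgpool-II maintains established connections to the PostgreSQL servers and reuses them whenever a new connection with the same properties (user name, database, protocol version, and other connection parameters) arrives. This reduces connection overhead and improves the system's overall throughput.

  • Load Balancing
      If a database is replicated (because it runs in either replication mode or native replication mode), a SELECT query returns the same result on any server. Pgpool-II takes advantage of replication to reduce the load on each PostgreSQL server by distributing SELECT queries among the available servers, which improves overall throughput. In the ideal case, read performance improves in proportion to the number of PostgreSQL servers. Load balancing works best when many users run many read-only queries at the same time.

  • Automatic Failover
      If one of the database servers goes down or becomes unreachable, Pgpool-II detaches it and continues operating with the remaining servers. Several features support automatic failover, including timeouts and retries.

  • Online Recovery
      Pgpool-II can perform online recovery of a database node by running a single command. When online recovery is combined with automatic failover, a node detached by failover can be re-attached automatically as a standby. A new PostgreSQL server can also be synchronized and attached this way.

  • Replication
      Pgpool-II can manage multiple PostgreSQL servers. Enabling replication creates a real-time backup on two or more PostgreSQL clusters, so that the service continues without interruption if one of them fails. Pgpool-II has built-in replication (native replication), but users may instead use external replication, including PostgreSQL's streaming replication.

  • Limiting Exceeding Connections
      PostgreSQL limits the maximum number of concurrent connections, and new connections are rejected once that number is reached. Raising the limit, however, increases resource consumption and hurts overall system performance. Pgpool-II also limits the maximum number of connections, but extra connections are queued instead of receiving an error immediately. It can alternatively be configured to return an error when the limit is exceeded.

  • Watchdog
      Watchdog coordinates multiple Pgpool-II instances to build a robust cluster that avoids a single point of failure and split-brain. Avoiding split-brain requires at least three Pgpool-II nodes. Watchdog performs a lifecheck against the other Pgpool-II nodes to detect a failed Pgpool-II. If the active Pgpool-II goes down, a standby Pgpool-II can be promoted to active and take over the virtual IP.

  • In Memory Query Cache
      The in-memory query cache stores pairs of a SELECT statement and its result. When an identical SELECT arrives, Pgpool-II returns the value from the cache. Because neither SQL parsing nor access to PostgreSQL is involved, serving from the cache is extremely fast. On the other hand, it can be slower than the normal path in some cases because of the overhead of storing cache data.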

2. PostgreSQL Primary/Standby Deployment

2.1 Disable the firewall (on both primary and standby)

systemctl stop firewalld.service     # stop the firewall
systemctl disable firewalld.service  # do not start it at boot

2.2 Disable SELinux (on both primary and standby)

setenforce 0
sed -i 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/selinux/config
cat /etc/sysconfig/selinux
/usr/sbin/sestatus -v

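2.3 Operating system settings (on both primary and standby)

Configure swap usage

echo "vm.swappiness = 1" >>/etc/sysctl.conf
sysctl -p
# Verify with sysctl -a
Note: swappiness ranges from 0 to 100. At 0 the kernel prefers physical memory as far as possible; at 100 it swaps aggressively.

Resource limits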

cat /etc/security/limits.conf
...
* soft nproc 65536
* hard nproc 65536
* soft nofile 65536
* hard nofile 65536
...

Disable transparent huge pages

echo never >>  /sys/kernel/mm/transparent_hugepage/enabled
echo never >>  /sys/kernel/mm/transparent_hugepage/defrag

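Enable noatime

Every file carries three timestamps:
ctime: change time (last change to the inode metadata)
mtime: modification time (last change to the file contents)
atime: access time
PostgreSQL does not normally rely on any of the three.
atime is the first one to turn off;
mtime and ctime are still occasionally useful.
To set noatime, edit /etc/fstab (vim /etc/fstab):
/dev/vda1 / xfs noatime 0 1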

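Tune read-ahead

Linux block devices usually have read-ahead enabled by default. The following command shows the current read-ahead size:
blockdev --getra /dev/sdf
The value is expressed in sectors of 512 bytes. In this example:
sudo blockdev --getra /dev/sda
the return value is 256, meaning 256 sectors, or 128 KB.
To set the read-ahead:
blockdev --setra 4096 /dev/sdf
This setting is not persistent and is lost on reboot. To make it permanent, put the command in a startup script.
To speed up full table scans, raise the read-ahead, for example to 2 MB as in the command above.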
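
The sector arithmetic is easy to get wrong, so here is a minimal sketch of the conversion. The helper name ra_sectors_to_kb is hypothetical (it is not part of blockdev), and it assumes 512-byte sectors as stated above.

```shell
#!/bin/sh
# Convert a read-ahead value reported by `blockdev --getra` (512-byte sectors)
# into kilobytes. ra_sectors_to_kb is a hypothetical helper name.
ra_sectors_to_kb() {
    echo $(( $1 * 512 / 1024 ))
}

ra_sectors_to_kb 256    # the value returned in the example: 128 (KB)
ra_sectors_to_kb 4096   # the tuned value: 2048 (KB), i.e. 2 MB
```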

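Tune the I/O scheduler

Linux commonly offers the following three I/O schedulers:
1. cfq: Completely Fair Queuing. Tries to give every request a fair share of I/O bandwidth; note that it balances bandwidth, not response time.
2. deadline: balances all requests so that none is starved, optimizing response time.
3. noop: does essentially nothing beyond basic request merging.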

echo deadline > /sys/block/vda/queue/scheduler
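
To confirm the change took effect, read the same sysfs file back: the kernel lists the available schedulers and wraps the active one in square brackets. The sketch below extracts the bracketed entry from such a line. The helper name active_choice is hypothetical.

```shell
#!/bin/sh
# Print the active entry from a sysfs-style option list, where the kernel
# marks the current choice with square brackets, e.g. "noop [deadline] cfq".
# active_choice is a hypothetical helper name.
active_choice() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)\].*/\1/p'
}

active_choice "noop [deadline] cfq"
# On a real host: active_choice "$(cat /sys/block/vda/queue/scheduler)"
```

The transparent_hugepage files written earlier use the same bracket notation, so the helper can be pointed at them as well.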

2.4 Disable NUMA (on both primary and standby)

sed -i 's/GRUB_CMDLINE_LINUX.*/GRUB_CMDLINE_LINUX="crashkernel=auto rd.lvm.lv=centos\/root rd.lvm.lv=centos\/swap rhgb quiet numa=off"/g' /etc/default/grub
grub2-mkconfig -o /etc/grub2.cfg
cat /etc/grub2.cfg
reboot
cat /proc/cmdline
dmesg | grep -i numa
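
The sed command above overwrites the whole GRUB_CMDLINE_LINUX line with values from the author's machine (for example the centos/root and centos/swap LVM names), which may not match other hosts. As a more cautious alternative, the sketch below only appends numa=off to whatever is already there. The helper name add_numa_off is hypothetical, and it assumes GNU sed and a double-quoted GRUB_CMDLINE_LINUX value.

```shell
#!/bin/sh
# Append numa=off to GRUB_CMDLINE_LINUX in the given file unless it is
# already present. add_numa_off is a hypothetical helper; needs GNU sed -i.
add_numa_off() {
    grep -q '^GRUB_CMDLINE_LINUX=.*numa=off' "$1" && return 0
    sed -i 's/^\(GRUB_CMDLINE_LINUX=".*\)"$/\1 numa=off"/' "$1"
}

# Demonstration on a scratch file rather than the real /etc/default/grub.
demo=$(mktemp)
echo 'GRUB_CMDLINE_LINUX="rhgb quiet"' > "$demo"
add_numa_off "$demo"
cat "$demo"
rm -f "$demo"
```

Running it twice leaves the file unchanged the second time; grub2-mkconfig and a reboot are still needed afterwards, as shown above.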

2.5 Build and install from source (on both primary and standby)

[root@db01 ~]# tar xf postgresql-12.4.tar.gz -C /var/soft/
[root@db01 ~]# ln -s /var/soft/postgresql-12.4/ /var/soft/postgresql
[root@db01 ~]# cd /var/soft/postgresql
[root@db01 ~]# ./configure --prefix=/usr/local/pgsql --with-perl --with-tcl --with-python --with-openssl --with-pam --without-ldap --with-libxml --with-libxslt --enable-thread-safety --with-wal-blocksize=16 --with-blocksize=16 --enable-dtrace --enable-debug  
[root@db01 ~]# make && make install

Troubleshooting build errors

Problem 1:
checking for dtrace... no
configure: error: dtrace not found
Fix: yum install -y systemtap-sdt-devel.x86_64

Problem 2:
checking for flags to link embedded Perl... Can't locate ExtUtils/Embed.pm in @INC (@INC contains: /usr/local/lib64/perl5 /usr/local/share/perl5 /usr/lib64/perl5/vendor_perl /usr/share/perl5/vendor_perl /usr/lib64/perl5 /usr/share/perl5 .).
BEGIN failed--compilation aborted.
no
configure: error: could not determine flags for linking embedded Perl.
This probably means that ExtUtils::Embed or ExtUtils::MakeMaker is not
installed.
Fix:
yum install perl-ExtUtils-Embed -y

Problem 3:
checking build system type... x86_64-pc-linux-gnu
checking host system type... x86_64-pc-linux-gnu
checking which template to use... linux
checking whether NLS is wanted... no
checking for default port number... 5432
checking for dtrace... /usr/bin/dtrace
checking for block size... 16kB
checking for segment size... 1GB
checking for WAL block size... 16kB
checking for gcc... no
checking for cc... no
configure: error: in `/var/soft/postgresql':
configure: error: no acceptable C compiler found in $PATH
Fix:
yum install gcc

Problem 4:
configure: error: readline library not found
If you have readline already installed, see config.log for details on the
failure. It is possible the compiler isn't looking in the proper directory.
Use --without-readline to disable readline support.
Fix:
yum install readline readline-devel

Problem 5:
checking for inflate in -lz... no
configure: error: zlib library not found
If you have zlib already installed, see config.log for details on the
failure. It is possible the compiler isn't looking in the proper directory.
Use --without-zlib to disable zlib support.
Fix:
yum install zlib zlib-devel

Problem 6:
checking for CRYPTO_new_ex_data in -lcrypto... no
configure: error: library 'crypto' is required for OpenSSL
Fix:
yum install openssl openssl-devel

Problem 7:
checking for pam_start in -lpam... no
configure: error: library 'pam' is required for PAM
Fix:
yum install pam pam-devel

Problem 8:
checking for xmlSaveToBuffer in -lxml2... no
configure: error: library 'xml2' (version >= 2.6.23) is required for XML support
Fix:
yum install libxml2 libxml2-devel

Problem 9:
checking for xsltCleanupGlobals in -lxslt... no
configure: error: library 'xslt' is required for XSLT support
Fix:
yum install libxslt libxslt-devel

Problem 10:
configure: error: Tcl shell not found
Fix:
yum install tcl tcl-devel

Problem 11:
checking for ldap.h... no
configure: error: header file <ldap.h> is required for LDAP
Fix:
yum install openldap openldap-devel

Problem 12:
checking for Python.h... no
configure: error: header file <Python.h> is required for Python
Fix:
yum install python python-devel
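
The problem/fix pairs above amount to a lookup table from configure error text to yum packages. A minimal sketch of that table as a shell function follows. The helper name suggest_package is hypothetical, and the package names are simply the ones used in the fixes above.

```shell
#!/bin/sh
# Map a configure error message to the packages named in the fixes above.
# suggest_package is a hypothetical helper; package names are the CentOS/yum
# names used in this article.
suggest_package() {
    case "$1" in
        *"dtrace not found"*)            echo "systemtap-sdt-devel" ;;
        *"ExtUtils"*|*"embedded Perl"*)  echo "perl-ExtUtils-Embed" ;;
        *"no acceptable C compiler"*)    echo "gcc" ;;
        *"readline library not found"*)  echo "readline readline-devel" ;;
        *"zlib library not found"*)      echo "zlib zlib-devel" ;;
        *"required for OpenSSL"*)        echo "openssl openssl-devel" ;;
        *"required for PAM"*)            echo "pam pam-devel" ;;
        *"required for XML support"*)    echo "libxml2 libxml2-devel" ;;
        *"required for XSLT support"*)   echo "libxslt libxslt-devel" ;;
        *"Tcl shell not found"*)         echo "tcl tcl-devel" ;;
        *"required for LDAP"*)           echo "openldap openldap-devel" ;;
        *"required for Python"*)         echo "python python-devel" ;;
        *)                               return 1 ;;
    esac
}

suggest_package "configure: error: readline library not found"
```

Installing all of these packages before running ./configure avoids hitting the errors one at a time.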

2.6 Create the user, directories, and permissions (on both primary and standby)

[root@db01 postgresql]# groupadd postgres
[root@db01 postgresql]# useradd postgres -g postgres
[root@db01 postgresql]# passwd postgres
[root@db01 postgresql]# mkdir -p /data/pgsql/data
[root@db01 postgresql]# chown -R postgres:postgres /data/pgsql/
[root@db01 postgresql]# chown -R postgres:postgres /usr/local/pgsql/

2.7 Set environment variables (on both primary and standby)

# Add the following to /etc/profile
export LD_LIBRARY_PATH=/usr/local/pgsql/lib:$LD_LIBRARY_PATH
export PGDATA=/data/pgsql/data
export PGHOST=/tmp
export PGHOME=/usr/local/pgsql
export PATH=$PATH:$PGHOME/bin
# Apply the changes
source /etc/profile

2.8 Initialize the database (on both primary and standby)

[root@db01 postgresql]# su - postgres
Last login: Sun Aug 30 22:11:36 CST 2020 on pts/2
[postgres@db01 ~]$ initdb
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.

The database cluster will be initialized with locale "en_US.UTF-8".
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "english".

Data page checksums are disabled.

fixing permissions on existing directory /data/pgsql/data ... ok
creating subdirectories ... ok
selecting dynamic shared memory implementation ... posix
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting default time zone ... Asia/Shanghai
creating configuration files ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... ok
syncing data to disk ... ok

initdb: warning: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or
--auth-local and --auth-host, the next time you run initdb.

Success. You can now start the database server using:

    pg_ctl -D /data/pgsql/data -l logfile start

2.9 Edit pg_hba.conf (on the primary)

local   all             all                                     trust
# IPv4 local connections:
host    all             all             0.0.0.0/0               md5
# IPv6 local connections:
# Allow replication connections from localhost, by a user with the
# replication privilege.
local   replication     all                                     trust
host    replication     all             192.168.137.0/24     trust

2.10 Edit postgresql.conf (on the primary)

archive_mode = on
archive_command = 'cp "%p" "/data/pgsql/data/arch/%f"'
max_wal_senders = 10
max_replication_slots = 10
wal_level = logical
hot_standby = on
wal_log_hints = on
synchronous_commit = on
synchronous_standby_names = 'walreceiver'
logging_collector = on
log_filename = 'postgresql-%a.log'
log_truncate_on_rotation = on
log_rotation_age = 1d
log_rotation_size = 0
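
The archive_command above copies each WAL segment into /data/pgsql/data/arch, so that directory has to exist and be writable by postgres before archiving can succeed (the article does not show it being created). The sketch below illustrates a slightly more defensive variant that refuses to overwrite an already archived segment, following the test-then-cp pattern from the PostgreSQL documentation on continuous archiving. The wrapper name archive_wal is hypothetical; its arguments mirror %p and %f.

```shell
#!/bin/sh
# Copy a WAL segment into the archive directory, refusing to overwrite an
# existing file. archive_wal is a hypothetical wrapper; $1 mirrors %p (path
# to the segment), $2 mirrors %f (its file name), $3 is the archive directory.
archive_wal() {
    src=$1; name=$2; dir=$3
    test ! -f "$dir/$name" && cp "$src" "$dir/$name"
}

# Demonstration with scratch directories instead of /data/pgsql/data/arch.
work=$(mktemp -d)
mkdir -p "$work/arch"
echo "wal bytes" > "$work/000000010000000000000019"
archive_wal "$work/000000010000000000000019" 000000010000000000000019 "$work/arch" && echo "archived"
archive_wal "$work/000000010000000000000019" 000000010000000000000019 "$work/arch" || echo "refused: already archived"
rm -rf "$work"
```

Written directly in postgresql.conf, the equivalent would be archive_command = 'test ! -f /data/pgsql/data/arch/%f && cp %p /data/pgsql/data/arch/%f'.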

2.11 Start the database (on the primary)

[postgres@db01 ~]$ pg_ctl -D /data/pgsql/data start
[postgres@db01 ~]$ psql
postgres=# CREATE ROLE pgpool WITH LOGIN;
postgres=# CREATE ROLE repl WITH REPLICATION LOGIN;
postgres=# \password pgpool
postgres=# \password repl
postgres=# \password postgres
-- pg_monitor lets Pgpool-II report replication_state in its node status output
postgres=# GRANT pg_monitor TO pgpool;
-- Check the resulting roles
postgres=# \du
postgres=# exit

2.12 Remove the data directory (on the standby)

rm -rf /data/pgsql/data

2.13 Take a base backup (on the standby)

[postgres@db02 ~]$ cd /data/pgsql/
[postgres@db02 ~]$ pg_basebackup -h 192.168.137.129 -U repl -Xs -Fp -R -Pv -D data

2.14 Start the database service (on the standby)

[postgres@db02 ~]$ pg_ctl -D /data/pgsql/data start

2.15 Verify replication

# Primary
[postgres@db01 data]$ ps -ef | grep postgres
postgres  84008      1  0 16:54 ?        00:00:00 /usr/local/pgsql/bin/postgres -D /data/pgsql/data
postgres  84009  84008  0 16:54 ?        00:00:00 postgres: logger   
postgres  84011  84008  0 16:54 ?        00:00:00 postgres: checkpointer   
postgres  84012  84008  0 16:54 ?        00:00:00 postgres: background writer   
postgres  84013  84008  0 16:54 ?        00:00:00 postgres: walwriter   
postgres  84014  84008  0 16:54 ?        00:00:00 postgres: autovacuum launcher   
postgres  84015  84008  0 16:54 ?        00:00:00 postgres: archiver   last was 000000010000000000000019.00000028.backup
postgres  84016  84008  0 16:54 ?        00:00:00 postgres: stats collector   
postgres  84017  84008  0 16:54 ?        00:00:00 postgres: logical replication launcher   
postgres  85904  84008  0 17:09 ?        00:00:00 postgres: walsender repl 192.168.137.130(33968) streaming 0/1A000148


postgres=# select client_addr,sync_state from pg_stat_replication;
   client_addr   | sync_state 
-----------------+------------
 192.168.137.130 | sync
(1 row)

postgres=# 
postgres=# select pg_is_in_recovery();
 pg_is_in_recovery 
-------------------
 f
(1 row)

# Standby
[postgres@db02 pgsql]$ ps -ef | grep postgres
postgres  91113      1  0 17:09 ?        00:00:00 /usr/local/pgsql/bin/postgres
postgres  91114  91113  0 17:09 ?        00:00:00 postgres: logger   
postgres  91115  91113  0 17:09 ?        00:00:00 postgres: startup   recovering 00000001000000000000001A
postgres  91116  91113  0 17:09 ?        00:00:00 postgres: checkpointer   
postgres  91117  91113  0 17:09 ?        00:00:00 postgres: background writer  
postgres  91118  91113  0 17:09 ?        00:00:00 postgres: stats collector   
postgres  91119  91113  0 17:09 ?        00:00:01 postgres: walreceiver   streaming 0/1A000148

postgres=# select pg_is_in_recovery();
 pg_is_in_recovery 
-------------------
 t
(1 row)
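
In the process listings above, the walsender on the primary and the walreceiver on the standby both report streaming 0/1A000148, which means the standby has received everything the primary has sent. To compare two such positions numerically, an LSN can be read as two hexadecimal halves (high 32 bits / low 32 bits). The sketch below does that conversion; the helper names lsn_to_bytes and lsn_lag are hypothetical.

```shell
#!/bin/sh
# Convert an LSN such as 0/1A000148 (two hex halves) into a byte position.
# lsn_to_bytes and lsn_lag are hypothetical helpers.
lsn_to_bytes() {
    hi=${1%/*}
    lo=${1#*/}
    echo $(( (0x$hi << 32) + 0x$lo ))
}

# Bytes by which the second position trails the first (0 means caught up).
lsn_lag() {
    echo $(( $(lsn_to_bytes "$1") - $(lsn_to_bytes "$2") ))
}

lsn_lag 0/1A000148 0/1A000148   # sender and receiver positions shown above
```

Inside the database the same comparison is available through pg_wal_lsn_diff().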

3. Pgpool-II Deployment

3.1 Install the software (on both primary and standby)

yum localinstall pgpool-II-pg12*
chown postgres:postgres /etc/pgpool-II/*

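3.2 Set up passwordless SSH between the nodes

# Run on every node
ssh-keygen -t rsa # press Enter at every prompt
# Append the public key to the authorized keys file
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
# Restrict access to authorized_keys
chmod 600 ~/.ssh/authorized_keys

# Run on one node only
ssh db02 cat ~/.ssh/id_rsa.pub >>~/.ssh/authorized_keys
# Distribute the merged file to the other node
scp ~/.ssh/authorized_keys db02:~/.ssh/

# Run on every node to verify
ssh db01 date
ssh db02 date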

3.3 Configure Pgpool-II (on both primary and standby)

Set passwords

[postgres@db01 pgpool-II]$ pg_md5 -p -m -u postgres postgres
password: 
[postgres@db01 pgpool-II]$ cat pool_passwd 
postgres:md53175bce1d3201d16594cebf9d7eb3f9d
[postgres@db01 pgpool-II]$ pg_md5 -p -m -u pgpool pgpool
password: 
[postgres@db01 pgpool-II]$ cat /etc/pgpool-II/pool_passwd 
postgres:md53175bce1d3201d16594cebf9d7eb3f9d
pgpool:md5f24aeb1c3b7d05d7eaf2cd648c307092
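
Each pool_passwd line written by pg_md5 -m has the form user:md5<hash>, where the hash is the MD5 of the password followed by the user name, the same scheme PostgreSQL uses for md5 authentication. The sketch below reproduces that format by hand, which can help when checking an entry. The helper name pool_passwd_line is hypothetical, it assumes md5sum from GNU coreutils is available, and the password in the example is a placeholder.

```shell
#!/bin/sh
# Build a pool_passwd-style line: user:md5<md5 of password followed by user>.
# pool_passwd_line is a hypothetical helper; requires md5sum (GNU coreutils).
pool_passwd_line() {
    user=$1; password=$2
    hash=$(printf '%s%s' "$password" "$user" | md5sum | cut -d' ' -f1)
    printf '%s:md5%s\n' "$user" "$hash"
}

pool_passwd_line pgpool 'example-password'
```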

pgpool.conf

# [CONNECTIONS]
listen_addresses = '*'
port = 9999
socket_dir = '/data/pgpool'
## - Backend Connection Settings -
backend_hostname0 = 'db01'
backend_data_directory0 = '/data/pgsql/data'
backend_application_name0 = 'server1'
backend_flag0 = 'ALLOW_TO_FAILOVER'
backend_port0 = 5432
backend_weight0 = 1

backend_hostname1 = 'db02'
backend_data_directory1 = '/data/pgsql/data'
backend_application_name1 = 'server2'
backend_flag1 = 'ALLOW_TO_FAILOVER'
backend_port1 = 5432
backend_weight1 = 0

## - Authentication -
enable_pool_hba = on
pool_passwd = 'pool_passwd'

# [LOGS]
logging_collector = on
log_directory = '/data/pgpool'
log_filename = 'pgpool-%Y-%m-%d_%H%M%S.log'
log_file_mode = 0600
log_truncate_on_rotation = on
log_rotation_age = 1d
log_rotation_size = 10MB

# [FILE LOCATIONS]
pid_file_name = '/data/pgpool/pgpool.pid'
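## This directory holds the pgpool_status file, which records the cluster state (if it is not refreshed correctly, show pool_status reports wrong results)
logdir = '/data/pgpool'
# [STREAMING REPLICATION MODE]
sr_check_user = 'pgpool'
## If set to '', the password is looked up in pool_passwd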
sr_check_password = 'pgpool'
follow_primary_command = '/etc/pgpool-II/follow_primary.sh %d %h %p %D %m %H %M %P %r %R'

# [HEALTH CHECK GLOBAL PARAMETERS]
health_check_period = 5
health_check_timeout = 30
health_check_user = 'pgpool'
## If set to '', the password is looked up in pool_passwd
health_check_password = 'pgpool'
health_check_max_retries = 3

# [FAILOVER AND FAILBACK]
failover_command = '/etc/pgpool-II/failover.sh %d %h %p %D %m %H %M %P %r %R %N %S'

# [ONLINE RECOVERY]
recovery_user = 'postgres'
recovery_password = 'postgres'
recovery_1st_stage_command = 'recovery_1st_stage'

# [WATCHDOG]
use_watchdog = on
hostname0 = 'db01'
wd_port0 = 9000
pgpool_port0 = 9999

hostname1 = 'db02'
wd_port1 = 9000
pgpool_port1 = 9999
wd_ipc_socket_dir = '/data/pgpool'

## - Virtual IP control Setting -
delegate_IP = '192.168.137.128'
if_up_cmd = '/sbin/ip addr add 192.168.137.128/24 dev ens32 label ens32:0'
if_down_cmd = '/sbin/ip addr del 192.168.137.128/24 dev ens32'

## - Behavior on escalation Setting -
wd_escalation_command = '/etc/pgpool-II/escalation.sh'

## - Lifecheck Setting -
wd_lifecheck_method = 'heartbeat'
### -- heartbeat mode --
heartbeat_hostname0 = 'db01'
heartbeat_port0 = 9694
heartbeat_device0 = 'ens32'

heartbeat_hostname1 = 'db02'
heartbeat_port1 = 9694
heartbeat_device1 = 'ens32'
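
The watchdog and heartbeat settings repeat the same three lines per node with a numeric suffix. Since the introduction notes that at least three Pgpool-II nodes are needed to avoid split-brain, while this configuration lists two, the sketch below shows how the numbered entries would extend to more hosts. The helper name gen_watchdog_nodes and the host db03 are hypothetical; the ports and device name are the ones used above.

```shell
#!/bin/sh
# Emit the per-node watchdog and heartbeat settings for a list of hosts.
# gen_watchdog_nodes is a hypothetical helper; ports and device are the
# values used in this article.
gen_watchdog_nodes() {
    i=0
    for host in "$@"; do
        printf "hostname%d = '%s'\nwd_port%d = 9000\npgpool_port%d = 9999\n" "$i" "$host" "$i" "$i"
        printf "heartbeat_hostname%d = '%s'\nheartbeat_port%d = 9694\nheartbeat_device%d = 'ens32'\n\n" "$i" "$host" "$i" "$i"
        i=$((i + 1))
    done
}

gen_watchdog_nodes db01 db02 db03   # db03 is a hypothetical third node
```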

3.4 Create pgpool_node_id (a different value on each node)

echo 0 > /etc/pgpool-II/pgpool_node_id   # on db01
echo 1 > /etc/pgpool-II/pgpool_node_id   # on db02
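
The node id has to match the index used for that host in pgpool.conf (hostname0 is db01, hostname1 is db02). To run one identical script on both machines, the id can be derived from the host name. This is a minimal sketch; the helper name node_id_for is hypothetical.

```shell
#!/bin/sh
# Pick the pgpool_node_id for a host, following the hostname0/hostname1
# order in pgpool.conf. node_id_for is a hypothetical helper.
node_id_for() {
    case "$1" in
        db01) echo 0 ;;
        db02) echo 1 ;;
        *)    echo "unknown host: $1" >&2; return 1 ;;
    esac
}

node_id_for db02
# On a real node: node_id_for "$(hostname -s)" > /etc/pgpool-II/pgpool_node_id
```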

3.5 follow_primary_command (on both primary and standby)

[postgres@db01 pgpool-II]$ echo 'pgpool:'`pg_md5 pgpool` >> /etc/pgpool-II/pcp.conf
[postgres@db01 pgpool-II]$ echo 'localhost:9898:pgpool:pgpool' > ~/.pcppass
[postgres@db01 pgpool-II]$ chmod 600 ~/.pcppass

3.6 enable_pool_hba (on both primary and standby)

[postgres@db01 pgpool-II]$ vi /etc/pgpool-II/pool_hba.conf
# The official documentation uses scram-sha-256; md5 is used here instead
host    all         pgpool           0.0.0.0/0          md5
host    all         postgres         0.0.0.0/0          md5

3.7 Failover configuration (on both primary and standby)

[postgres@db01 pgpool-II]$ cd /etc/pgpool-II
# The scripts below must be edited to match your environment
[postgres@db01 pgpool-II]$ mv escalation.sh.sample escalation.sh
[postgres@db01 pgpool-II]$ mv failover.sh.sample failover.sh
[postgres@db01 pgpool-II]$ mv follow_primary.sh.sample follow_primary.sh
[postgres@db01 pgpool-II]$ mv pgpool_remote_start.sample /data/pgsql/data/pgpool_remote_start
[postgres@db01 pgpool-II]$ mv recovery_1st_stage.sample /data/pgsql/data/recovery_1st_stage
[postgres@db01 pgpool-II]$ mv recovery_2nd_stage.sample /data/pgsql/data/recovery_2nd_stage

3.8 Install pgpool_recovery (on the primary)

# Look up the library directory
pg_config --pkglibdir
cp pgpool_recovery* /usr/local/pgsql/share/extension/
cp pgpool-recovery.sql /usr/local/pgsql/share/extension/
cp pgpool-recovery.so /usr/local/pgsql/lib/
psql -f pgpool-recovery.sql template1

3.9 Create the pgpool directory (on both primary and standby)

mkdir /data/pgpool
chown -R postgres:postgres /data/pgpool

3.10 Start pgpool (on both primary and standby)

systemctl start pgpool

4. Administration Commands

4.1 Show the cluster configuration

pcp_pool_status -h 192.168.137.128 -p 9898 -U pgpool -v

4.2 Show watchdog cluster node information

pcp_watchdog_info -h 192.168.137.128 -p 9898 -U pgpool -v

4.3 Show the number of nodes

pcp_node_count -h 192.168.137.128 -p 9898 -U pgpool

4.4 Show information for a specific node

pcp_node_info -h 192.168.137.128 -p 9898 -U pgpool -n 0 -v

4.5 Attach a node to the cluster

pcp_attach_node -h 192.168.137.128 -p 9898 -U pgpool -n 0 -v

4.6 Detach a node from the cluster

pcp_detach_node -h 192.168.137.128 -p 9898 -U pgpool -n 0 -v

4.7 Promote a standby node to primary

pcp_promote_node -h 192.168.137.128 -p 9898 -U pgpool -n 0 -v

4.8 Recover an offline node and re-attach it to the cluster

pcp_recovery_node -h 192.168.137.128 -p 9898 -U pgpool -n 0 -v
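
All of the commands in this section share the same -h, -p, and -U options. A small wrapper can supply them once. This is a sketch under the assumption that the virtual IP, PCP port, and user from this article are in use; the function name pcp and the PCP_* variables are hypothetical, not part of Pgpool-II.

```shell
#!/bin/sh
# Run any pcp_* command with the connection options used in this article.
# pcp is a hypothetical wrapper; override the variables to suit your cluster.
PCP_HOST=${PCP_HOST:-192.168.137.128}
PCP_PORT=${PCP_PORT:-9898}
PCP_USER=${PCP_USER:-pgpool}

pcp() {
    sub=$1
    shift
    "pcp_$sub" -h "$PCP_HOST" -p "$PCP_PORT" -U "$PCP_USER" "$@"
}

# Examples (these need a running Pgpool-II):
#   pcp node_count
#   pcp node_info -n 0 -v
```

With this in place, pcp detach_node -n 1 -v expands to the same command line shown in 4.6, but for node 1.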

5. Functional Verification and Known Issues

To be updated.

Reposted from blog.csdn.net/qq_42979842/article/details/113820842