A PostgreSQL memory configuration problem

Environment

Operating system: CentOS Linux release 7.3.1611 (Core)
Database version: PostgreSQL 10.6

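The environment is a streaming replication cluster with one primary and two standbys, with high availability managed by corosync + pacemaker. One standby uses synchronous replication and takes part of the read load; the other uses asynchronous replication and serves as a real-time backup.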

What happened

In the morning we received a monitoring alert:

Stack: corosync
Current DC: sh01-oscar-cmp-pp-pg03 (version 1.1.19-8.el7_6.4-c3c624ea3d) - partition with quorum
Last updated: Wed Oct 16 10:17:02 2019
Last change: Wed Oct 16 10:15:17 2019 by root via crm_attribute on sh01-oscar-cmp-pp-pg03

3 nodes configured
11 resources configured

Online: [ sh01-oscar-cmp-pp-pg01 sh01-oscar-cmp-pp-pg02 sh01-oscar-cmp-pp-pg03 ]

Full list of resources:

fence-sh01-oscar-cmp-pp-pg01	(ocf::heartbeat:fence_check):	Started sh01-oscar-cmp-pp-pg01
fence-sh01-oscar-cmp-pp-pg02	(ocf::heartbeat:fence_check):	Started sh01-oscar-cmp-pp-pg02
fence-sh01-oscar-cmp-pp-pg03	(ocf::heartbeat:fence_check):	Started sh01-oscar-cmp-pp-pg03
Resource Group: master-group
vip-master	(ocf::heartbeat:IPaddr2):	Started sh01-oscar-cmp-pp-pg03
Resource Group: slave-group
vip-slave	(ocf::heartbeat:IPaddr2):	Started sh01-oscar-cmp-pp-pg02
Master/Slave Set: msPostgresql [pgsql]
Masters: [ sh01-oscar-cmp-pp-pg03 ]
Slaves: [ sh01-oscar-cmp-pp-pg02 ]
Stopped: [ sh01-oscar-cmp-pp-pg01 ]
Clone Set: clnPingCheck [pingCheck]
Started: [ sh01-oscar-cmp-pp-pg01 sh01-oscar-cmp-pp-pg02 sh01-oscar-cmp-pp-pg03 ]

Failed Actions:
* pgsql_start_0 on sh01-oscar-cmp-pp-pg01 'unknown error' (1): call=84, status=complete, exitreason='My data may be inconsistent. You have to remove /var/lib/pgsql/tmp/PGSQL.lock file to force start.',
last-rc-change='Wed Oct 16 10:15:06 2019', queued=0ms, exec=167ms

The cluster status shows that node 01 is in trouble. Prometheus also fired an alert: the service port on node 01 could no longer be reached.

Analysis

After logging in to the host, we found that PostgreSQL on node 01 was down. The database log shows the problem:

2019-10-16 10:15:02.651 CST [55400] LOG:  server process (PID 16342) was terminated by signal 9: Killed

2019-10-16 10:15:02.651 CST [55400] LOG:  terminating any other active server processes
2019-10-16 10:15:02.651 CST [20414] WARNING:  terminating connection because of crash of another server process
2019-10-16 10:15:02.651 CST [20414] DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.

2019-10-16 10:15:02.681 CST [20523] HINT:  In a moment you should be able to reconnect to the database and repeat your command.
2019-10-16 10:15:02.694 CST [55400] LOG:  all server processes terminated; reinitializing
2019-10-16 10:15:02.785 CST [20551] LOG:  database system was interrupted; last known up at 2019-10-16 10:12:42 CST
2019-10-16 10:15:02.787 CST [20552] FATAL:  the database system is in recovery mode
2019-10-16 10:15:02.971 CST [20572] FATAL:  the database system is in recovery mode
2019-10-16 10:15:03.050 CST [20551] LOG:  database system was not properly shut down; automatic recovery in progress
2019-10-16 10:15:03.058 CST [20551] LOG:  redo starts at 0/65CBB4A0
2019-10-16 10:15:03.164 CST [20686] FATAL:  the database system is in recovery mode
2019-10-16 10:15:03.190 CST [20688] FATAL:  the database system is in recovery mode
2019-10-16 10:15:03.195 CST [20689] FATAL:  the database system is in recovery mode
2019-10-16 10:15:03.214 CST [20690] FATAL:  the database system is in recovery mode
2019-10-16 10:15:03.260 CST [20551] LOG:  invalid record length at 0/67D224D8: wanted 24, got 0
2019-10-16 10:15:03.260 CST [20551] LOG:  redo done at 0/67D224B0
2019-10-16 10:15:03.260 CST [20551] LOG:  last completed transaction was at log time 2019-10-16 10:15:02.434297+08
2019-10-16 10:15:03.391 CST [55400] LOG:  database system is ready to accept connections
2019-10-16 10:15:03.419 CST [55400] LOG:  received fast shutdown request
2019-10-16 10:15:03.421 CST [55400] LOG:  aborting any active transactions
2019-10-16 10:15:03.423 CST [55400] LOG:  worker process: logical replication launcher (PID 20811) exited with exit code 1
2019-10-16 10:15:03.425 CST [20804] LOG:  shutting down
2019-10-16 10:15:03.462 CST [20859] FATAL:  the database system is shutting down
2019-10-16 10:15:03.465 CST [20860] FATAL:  the database system is shutting down
2019-10-16 10:15:03.491 CST [55400] LOG:  database system is shut down

As the log shows, at 10:15:02 a PostgreSQL server process (PID 16342) was killed with signal 9 (SIGKILL).

Checking the system log (/var/log/messages):

Oct 16 10:15:02 sh01-oscar-cmp-pp-pg01 kernel: Out of memory: Kill process 16342 (postgres) score 843 or sacrifice child
Oct 16 10:15:02 sh01-oscar-cmp-pp-pg01 kernel: Killed process 16342 (postgres) total-vm:8494044kB, anon-rss:3399704kB, file-rss:400kB, shmem-rss:21080kB
Oct 16 10:15:02 sh01-oscar-cmp-pp-pg01 kernel: postgres: page allocation failure: order:0, mode:0x2015a

So memory had run out and the OOM killer was triggered: the operating system killed the postgres process that was consuming the most memory.
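To put a number on how much memory the killed backend held, the counters in the kernel line can be added up. The following is a minimal sketch in Python; the helper name `parse_oom_kill` is made up for illustration, and the regular expressions assume the CentOS 7 message format shown above (other kernel versions word this line differently).

```python
import re

# Kernel line copied from the /var/log/messages excerpt above.
line = ("Oct 16 10:15:02 sh01-oscar-cmp-pp-pg01 kernel: Killed process 16342 "
        "(postgres) total-vm:8494044kB, anon-rss:3399704kB, file-rss:400kB, "
        "shmem-rss:21080kB")

def parse_oom_kill(text):
    """Hypothetical helper: pull pid, name and kB counters out of a 'Killed process' line."""
    m = re.search(r"Killed process (\d+) \(([^)]+)\)", text)
    if m is None:
        return None
    # Counters look like 'anon-rss:3399704kB'; timestamps do not end in 'kB', so they are skipped.
    counters = {key: int(val) for key, val in re.findall(r"([\w-]+):(\d+)kB", text)}
    return {"pid": int(m.group(1)), "name": m.group(2), **counters}

info = parse_oom_kill(line)
rss_kb = info["anon-rss"] + info["file-rss"] + info["shmem-rss"]
rss_gib = rss_kb / 1024 / 1024
print(f"pid {info['pid']} ({info['name']}) resident memory: {rss_gib:.2f} GiB")
```

For this line the three resident counters add up to roughly 3.26 GiB held by a single backend process.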

Resolution

Check the total memory of the machine:

[root@sh01-oscar-cmp-pp-pg01 ~]# free -h
              total        used        free      shared  buff/cache   available
Mem:           3.7G        194M        2.6G        149M        971M        3.1G
Swap:          2.0G        290M        1.7G

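The machine has less than 4 GB of memory.

However, shared_buffers in postgresql.conf was set to 3GB. Add the memory used by each session on top of that and the system runs out of memory. After changing shared_buffers to 1GB, the problem was resolved.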
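The mismatch is easy to see with a little arithmetic. The sketch below uses only the two figures from this incident (3.7G total from `free -h`, shared_buffers of 3GB before and 1GB after); the function name `shared_buffers_share` is made up for illustration. As a point of comparison, the PostgreSQL documentation suggests about 25% of system memory as a reasonable starting value for shared_buffers on a dedicated server.

```python
# Figures taken from the incident described above.
total_ram_mb = 3.7 * 1024          # `free -h` reports 3.7G total

def shared_buffers_share(shared_buffers_mb, ram_mb):
    """Hypothetical helper: fraction of physical RAM claimed by shared_buffers."""
    return shared_buffers_mb / ram_mb

for label, size_mb in (("before", 3 * 1024), ("after", 1 * 1024)):
    share = shared_buffers_share(size_mb, total_ram_mb)
    left_gb = (total_ram_mb - size_mb) / 1024
    print(f"{label}: shared_buffers = {share:.0%} of RAM, "
          f"{left_gb:.1f} GB left for sessions and the OS")
```

With 3GB, shared_buffers alone claims about four fifths of the machine, leaving well under 1 GB for every session's private memory plus the operating system.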

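For how to estimate the memory usage in detail, see this blog post:
How to calculate the memory consumed by PostgreSQL

The following is reposted from it:

wal_buffers defaults to -1. In that case its size is derived from shared_buffers: 1/32 of shared_buffers, but no less than 64kB and no more than one WAL segment (typically 16MB).
autovacuum_work_mem defaults to -1. In that case the value of maintenance_work_mem is used.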
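The two default rules above can be written down as a short sketch. The function names are made up for illustration, all sizes are in bytes, and the 64kB floor and one-WAL-segment cap follow the PostgreSQL documentation for wal_buffers, with a 16MB segment assumed.

```python
KB = 1024
MB = 1024 * KB
GB = 1024 * MB

def effective_wal_buffers(wal_buffers, shared_buffers, wal_segment_size=16 * MB):
    """Resolve wal_buffers = -1 to its automatic size in bytes (illustrative helper)."""
    if wal_buffers != -1:
        return wal_buffers
    auto = shared_buffers // 32
    # Automatic sizing is clamped to [64kB, one WAL segment].
    return max(64 * KB, min(auto, wal_segment_size))

def effective_autovacuum_work_mem(autovacuum_work_mem, maintenance_work_mem):
    """Resolve autovacuum_work_mem = -1 to maintenance_work_mem (illustrative helper)."""
    if autovacuum_work_mem != -1:
        return autovacuum_work_mem
    return maintenance_work_mem

print(effective_wal_buffers(-1, 19 * GB) // MB)         # 16  (19GB / 32 = 608MB, capped)
print(effective_wal_buffers(-1, 128 * MB) // MB)        # 4
print(effective_autovacuum_work_mem(-1, 1 * GB) // MB)  # 1024
```

Note that with shared_buffers of 19GB the automatic wal_buffers is capped at 16MB, so it contributes very little to the total.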

1 wal_buffers and autovacuum_work_mem not set

The formula is:

max_connections*work_mem 
+ max_connections*temp_buffers 
+ shared_buffers
+ (autovacuum_max_workers * maintenance_work_mem)

Assume PostgreSQL is configured as follows:

max_connections = 100
temp_buffers=32MB
work_mem=32MB
shared_buffers=19GB
autovacuum_max_workers = 3
maintenance_work_mem=1GB # default is 64MB

The estimated memory is then:

select(
	(100*(32*1024*1024)::bigint)
	+ (100*(32*1024*1024)::bigint)
	+ (19*(1024*1024*1024)::bigint)
	+ (3 * (1024*1024*1024)::bigint )
)::float8 / 1024 / 1024 / 1024
--output
28.25

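With this configuration PostgreSQL uses about 28.25GB at full-load peak; with 32GB of physical memory, 3.75GB is left for the operating system. This is a rough estimate rather than a hard cap, because work_mem applies to each sort or hash operation and a single query can run several of them.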
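The same arithmetic can be reproduced outside the database. This is a minimal sketch of the formula in this section; the function name `estimate_peak_bytes` is made up for illustration and the inputs are the example values listed above.

```python
MB = 1024 ** 2
GB = 1024 ** 3

def estimate_peak_bytes(max_connections, work_mem, temp_buffers,
                        shared_buffers, autovacuum_max_workers,
                        maintenance_work_mem):
    """Rough peak estimate in bytes, following the formula in this section."""
    per_session = max_connections * (work_mem + temp_buffers)
    autovacuum = autovacuum_max_workers * maintenance_work_mem
    return per_session + shared_buffers + autovacuum

peak = estimate_peak_bytes(
    max_connections=100,
    work_mem=32 * MB,
    temp_buffers=32 * MB,
    shared_buffers=19 * GB,
    autovacuum_max_workers=3,
    maintenance_work_mem=1 * GB,
)
print(peak / GB)        # 28.25
print(32 - peak / GB)   # 3.75 GB left on a 32GB machine
```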

2 wal_buffers set, autovacuum_work_mem not set

The formula is:

max_connections*work_mem 
+ max_connections*temp_buffers 
+ shared_buffers+wal_buffers
+ (autovacuum_max_workers * maintenance_work_mem)

Assume PostgreSQL is configured as follows:

max_connections = 100
temp_buffers=32MB
work_mem=32MB
shared_buffers=19GB
wal_buffers=16MB # same as the default --with-wal-segsize (16MB)
autovacuum_max_workers = 3
maintenance_work_mem=1GB

The estimated memory is then:

select(
	(100*(32*1024*1024)::bigint)
	+ (100*(32*1024*1024)::bigint)
	+ (19*(1024*1024*1024)::bigint)
	+ (16*1024*1024)::bigint
	+ (3 * (1024*1024*1024)::bigint )
)::float8  / 1024 / 1024 / 1024
--output
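28.265625

With this configuration PostgreSQL uses about 28.27GB at full-load peak; with 32GB of physical memory, about 3.73GB is left for the operating system.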

3 Both wal_buffers and autovacuum_work_mem set [recommended]

The formula is:

max_connections*work_mem 
+ max_connections*temp_buffers 
+ shared_buffers+wal_buffers
+ (autovacuum_max_workers * autovacuum_work_mem)
+ maintenance_work_mem

Assume PostgreSQL is configured as follows:

max_connections = 100
temp_buffers=32MB
work_mem=32MB
shared_buffers=19GB
wal_buffers=262143kB
autovacuum_max_workers = 3
autovacuum_work_mem=256MB
maintenance_work_mem=2GB

The estimated memory is then:

select(
    (100*(32*1024*1024)::bigint)
    + (100*(32*1024*1024)::bigint)
    + (19*(1024*1024*1024)::bigint)
    + (262143*1024)::bigint
    + (3 * (256*1024*1024)::bigint )
    + ( 2 * (1024*1024*1024)::bigint )
)::float8  / 1024 / 1024 / 1024
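--output (rounded to two decimals)
28.25

With this configuration PostgreSQL uses about 28.25GB at full-load peak; with 32GB of physical memory, about 3.75GB is left for the operating system. It is recommended to set every memory parameter explicitly according to the hardware, that is, to use this style of configuration.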
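For a configuration written with units, as in postgresql.conf, the estimate can be computed directly from the setting strings. The sketch below is illustrative only: `to_bytes` and `estimate_peak_bytes` are made-up names, the `conf` dictionary repeats the example values of this section, and bare numbers without a unit are deliberately rejected. PostgreSQL memory units are case-sensitive (`kB`, `MB`, `GB`, `TB`), so the parser is strict about that as well.

```python
import re

UNITS = {"B": 1, "kB": 1024, "MB": 1024 ** 2, "GB": 1024 ** 3, "TB": 1024 ** 4}

def to_bytes(value):
    """Convert a setting such as '32MB' to bytes (units are case-sensitive)."""
    m = re.fullmatch(r"(\d+)\s*(B|kB|MB|GB|TB)", value.strip())
    if m is None:
        raise ValueError(f"unsupported memory value: {value!r}")
    return int(m.group(1)) * UNITS[m.group(2)]

# Example values from this section.
conf = {
    "max_connections": 100,
    "temp_buffers": "32MB",
    "work_mem": "32MB",
    "shared_buffers": "19GB",
    "wal_buffers": "262143kB",
    "autovacuum_max_workers": 3,
    "autovacuum_work_mem": "256MB",
    "maintenance_work_mem": "2GB",
}

def estimate_peak_bytes(c):
    """Rough peak estimate in bytes, following the formula in this section."""
    sessions = c["max_connections"] * (to_bytes(c["work_mem"]) + to_bytes(c["temp_buffers"]))
    shared = to_bytes(c["shared_buffers"]) + to_bytes(c["wal_buffers"])
    autovacuum = c["autovacuum_max_workers"] * to_bytes(c["autovacuum_work_mem"])
    return sessions + shared + autovacuum + to_bytes(c["maintenance_work_mem"])

peak = estimate_peak_bytes(conf)
print(f"{peak / 1024 ** 3:.2f} GB")   # 28.25 GB
```

The exact result is 1kB short of 28.25GB, because wal_buffers is 262143kB rather than a full 256MB.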

aladdin_sun
