PostgreSQL: "FATAL: requested WAL segment 0000000800002A0000000000 has already been removed"


On a PostgreSQL hot standby setup, running a very large transaction on the primary, in particular a single insert of tens of millions of rows (typically produced by repeatedly running insert into t select * from t;), can cause the following error to appear in the standby's log:

Log in CSV format:

2013-07-01 13:25:29.430 CST,,,27738,,51d112c8.6c5a,1,,2013-07-01 13:25:28 CST,,0,LOG,00000,"streaming replication successfully connected to primary",,,,,,,,"libpqrcv_connect, libpqwalreceiver.c:171",""
2013-07-01 13:25:29.430 CST,,,27738,,51d112c8.6c5a,2,,2013-07-01 13:25:28 CST,,0,FATAL,XX000,"could not receive data from WAL stream:FATAL:  requested WAL segment 0000000800002A0000000000 has already been removed
",,,,,,,,"libpqrcv_receive, libpqwalreceiver.c:389",""
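
The segment name in the error can be decoded to see which timeline and WAL position the standby was asking for. The following is a minimal sketch, assuming the default 16 MB segment size and the standard 24-hex-digit file name layout (timeline, log number, segment number); the function name parse_wal_segment_name is made up for this illustration.

```python
import re

# Assumption: default 16 MB WAL segments.
WAL_SEGMENT_BYTES = 16 * 1024 * 1024


def parse_wal_segment_name(name):
    """Split a 24-hex-digit WAL file name into (timeline, start LSN)."""
    if not re.fullmatch(r"[0-9A-Fa-f]{24}", name):
        raise ValueError("not a WAL segment file name: %r" % name)
    timeline = int(name[0:8], 16)    # timeline ID
    log_id = int(name[8:16], 16)     # high 32 bits of the LSN
    seg_id = int(name[16:24], 16)    # segment number within that log
    start_offset = seg_id * WAL_SEGMENT_BYTES
    return timeline, "%X/%X" % (log_id, start_offset)


# Segment name taken from the log above.
print(parse_wal_segment_name("0000000800002A0000000000"))
```

For the segment in the log this gives timeline 8 and a starting position of 2A00/0, which is the point from which the standby tried to resume streaming.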

Remarks: The error message shows that the primary generated a large amount of WAL (xlog) while the big transaction was running. The primary removes or recycles old WAL segments after each checkpoint, whether or not the standby has received them. When the standby falls behind, for example during a long-running bulk insert, the segments it still needs may already have been removed. The usual solutions are:

1. Increase wal_keep_segments
Set the GUC parameter wal_keep_segments to a larger value, such as 2000. Each segment is 16 MB by default, so this keeps up to 32000 MB (about 31 GB) of WAL on the primary as a buffer for the standby.

However, this does not fundamentally solve the problem. In a production environment or a TPC-C test, a transaction that inserts billions of rows can still generate more WAL than is retained, and the error may occur again.
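
The sizing arithmetic for wal_keep_segments can be checked with a few lines of Python. This is only a sketch under the assumption of 16 MB segments; the helper names retained_wal_mb and segments_for are made up for this illustration.

```python
import math

SEGMENT_MB = 16  # assumption: default WAL segment size


def retained_wal_mb(wal_keep_segments, segment_mb=SEGMENT_MB):
    """Upper bound on the WAL kept for standbys, in MB."""
    return wal_keep_segments * segment_mb


def segments_for(wal_mb, segment_mb=SEGMENT_MB):
    """Smallest wal_keep_segments value that covers wal_mb of WAL."""
    return math.ceil(wal_mb / segment_mb)


print(retained_wal_mb(2000))         # 32000 MB
print(retained_wal_mb(2000) / 1024)  # 31.25 (in units of 1024 MB)
print(segments_for(100 * 1024))      # 6400 segments to cover 100 * 1024 MB
```

If one bulk transaction is expected to write more WAL than retained_wal_mb returns, the setting alone is not enough and one of the other two approaches is needed.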

2. Enable archiving
Archiving copies each completed WAL segment to a separate directory. When the standby needs a segment that the primary has already removed, it fetches that segment from the archive with restore_command instead of receiving it over streaming replication.

Examples of GUC parameter settings are as follows:

In postgresql.conf on the primary:
wal_level = hot_standby
archive_mode = on
archive_command = 'rsync -zaq %p postgres@pg-slave:/var/lib/pgsql/wal_restore/%f && test ! -f /var/lib/pgsql/backup/wal_archive/%f && cp %p /var/lib/pgsql/backup/wal_archive/'
archive_timeout = 300
max_wal_senders = 5
wal_keep_segments = 0 # not sure why I've set it to this?

In postgresql.conf on the standby:
wal_level = hot_standby
archive_mode = on
archive_command = 'test ! -f /var/lib/pgsql/backup/wal_archive/%f && cp -i %p /var/lib/pgsql/backup/wal_archive/%f < /dev/null'
hot_standby = on
wal_keep_segments = 1

In recovery.conf on the standby:
standby_mode = 'on'
primary_conninfo = 'host=pg-master port=5432 user=replicator'
restore_command = 'cp /var/lib/pgsql/wal_restore/%f %p'
archive_cleanup_command = 'pg_archivecleanup /var/lib/pgsql/wal_restore/ %r'

3. Enable replication slots (available in PostgreSQL 9.4 and later)
This is the fundamental solution: the primary keeps every WAL segment until the standby attached to the slot has received it, so xlog that the standby still needs is never removed.
To enable it:

(1) Add to postgresql.conf on the primary:

max_replication_slots = 2000

(2) Before starting replication to the standby, create a slot on the primary:


postgres=# SELECT * FROM pg_create_physical_replication_slot('node_a_slot');
  slot_name  | xlog_position
-------------+---------------
 node_a_slot |

postgres=# SELECT * FROM pg_replication_slots;
  slot_name  | slot_type | datoid | database | active | xmin | restart_lsn
-------------+-----------+--------+----------+--------+------+-------------
 node_a_slot | physical  |        |          | f      |      |
(1 row)

(3) Add the primary_slot_name line to the recovery.conf file on the standby:

standby_mode = 'on'
primary_conninfo = 'host=192.168.4.225 port=19000 user=wslu password=xxxx'
primary_slot_name = 'node_a_slot'
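
Because a slot holds WAL until the standby has consumed it, a disconnected standby makes the primary's xlog directory grow, so it is worth watching how far restart_lsn lags behind the primary's current WAL position. The sketch below shows the arithmetic on two LSN strings. It assumes the values were obtained separately (for example restart_lsn from pg_replication_slots and the current position from pg_current_xlog_location() on 9.4), and the helper names are made up for this illustration.

```python
def lsn_to_int(lsn):
    """Convert an LSN string such as '2A00/1000000' to a byte position."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def retained_bytes(current_lsn, restart_lsn):
    """Bytes of WAL the primary must keep for a slot at restart_lsn."""
    return lsn_to_int(current_lsn) - lsn_to_int(restart_lsn)


# Hypothetical values: the slot is exactly one 16 MB segment behind.
print(retained_bytes("2A00/1000000", "2A00/0"))
```

On 9.4 the server-side function pg_xlog_location_diff performs the same subtraction, so the check can also be done directly in SQL.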


For how to configure the standby database, see:

http://blog.csdn.net/prettyshuang/article/details/50898363#t10

References:

https://www.postgresql.org/docs/9.4/static/runtime-config-replication.html

https://www.postgresql.org/docs/9.4/static/warm-standby.html#CASCADING-REPLICATION

http://blog.2ndquadrant.com/postgresql-9-4-slots/

http://grokbase.com/t/postgresql/pgsql-general/13654jchy3/trouble-with-replication

http://stackoverflow.com/questions/28201475/how-do-i-fix-a-postgresql-9-3-slave-that-cannot-keep-up-with-the-master
 


Origin blog.csdn.net/u011250186/article/details/105518258