Preface
pg_probackup is a powerful open-source backup and recovery tool for PostgreSQL, comparable to XtraBackup in the MySQL community. This article walks through its installation, configuration, and commands.
Source code:
https://github.com/postgrespro/pg_probackup
Documentation:
https://postgrespro.github.io/pg_probackup
1. Install backup tools
1.1 Environment introduction
Operating system: CentOS Linux release 7.8.2003 (Core)
Database version: PostgreSQL - 12.2
1.2 RPM installation
# RPM packages for CentOS
rpm -ivh https://repo.postgrespro.ru/pg_probackup/keys/pg_probackup-repo-centos.noarch.rpm
# The suffix after "pg_probackup-" is the PostgreSQL major version; pick the one that matches your server
yum install pg_probackup-{15,14,13,12,11}
yum install pg_probackup-{15,14,13,12,11}-debuginfo
The official GitHub repository has installation instructions for other environments. The commands above are the installation method for CentOS.
1.3 Verification
After the RPM installation completes, the command is available without any further environment configuration. I installed the package for version 12, so the tool is invoked as pg_probackup-12:
pg_probackup-12 --help
2. Configure backup tools
2.1 Initial settings
Initialize the backup directory:
pg_probackup-12 init -B ${backup_dir}
pg_probackup-12 init -B /data/pgsql12/backup
INFO: Backup catalog '/data/pgsql12/backup' successfully initialized
Add a new backup instance:
# Local instance
pg_probackup-12 add-instance -B ${backup_dir} -D ${PGDATA} --instance ${instance_name}
# Remote instance
pg_probackup-12 add-instance -B ${backup_dir} -D ${PGDATA} --instance ${instance_name} --remote-proto=ssh --remote-host=${remote_ip} --remote-port=${remote_ssh_port} --remote-user=${remote_ssh_user} --remote-path=${pg_probackup_dir}
pg_probackup-12 add-instance -B /data/pgsql12/backup/ -D /data/pgsql12/data/ --instance test01
INFO: Instance 'test01' successfully initialized
2.2 Create backup user
For PostgreSQL versions 10 to 14, create the backup user with the following statements:
BEGIN;
CREATE ROLE backup WITH LOGIN;
GRANT USAGE ON SCHEMA pg_catalog TO backup;
GRANT EXECUTE ON FUNCTION pg_catalog.current_setting(text) TO backup;
GRANT EXECUTE ON FUNCTION pg_catalog.set_config(text, text, boolean) TO backup;
GRANT EXECUTE ON FUNCTION pg_catalog.pg_is_in_recovery() TO backup;
GRANT EXECUTE ON FUNCTION pg_catalog.pg_start_backup(text, boolean, boolean) TO backup;
GRANT EXECUTE ON FUNCTION pg_catalog.pg_stop_backup(boolean, boolean) TO backup;
GRANT EXECUTE ON FUNCTION pg_catalog.pg_create_restore_point(text) TO backup;
GRANT EXECUTE ON FUNCTION pg_catalog.pg_switch_wal() TO backup;
GRANT EXECUTE ON FUNCTION pg_catalog.pg_last_wal_replay_lsn() TO backup;
GRANT EXECUTE ON FUNCTION pg_catalog.txid_current() TO backup;
GRANT EXECUTE ON FUNCTION pg_catalog.txid_current_snapshot() TO backup;
GRANT EXECUTE ON FUNCTION pg_catalog.txid_snapshot_xmax(txid_snapshot) TO backup;
GRANT EXECUTE ON FUNCTION pg_catalog.pg_control_checkpoint() TO backup;
COMMIT;
For PostgreSQL version 15, create the backup user with the following statements:
BEGIN;
CREATE ROLE backup WITH LOGIN;
GRANT USAGE ON SCHEMA pg_catalog TO backup;
GRANT EXECUTE ON FUNCTION pg_catalog.current_setting(text) TO backup;
GRANT EXECUTE ON FUNCTION pg_catalog.set_config(text, text, boolean) TO backup;
GRANT EXECUTE ON FUNCTION pg_catalog.pg_is_in_recovery() TO backup;
GRANT EXECUTE ON FUNCTION pg_catalog.pg_backup_start(text, boolean) TO backup;
GRANT EXECUTE ON FUNCTION pg_catalog.pg_backup_stop(boolean) TO backup;
GRANT EXECUTE ON FUNCTION pg_catalog.pg_create_restore_point(text) TO backup;
GRANT EXECUTE ON FUNCTION pg_catalog.pg_switch_wal() TO backup;
GRANT EXECUTE ON FUNCTION pg_catalog.pg_last_wal_replay_lsn() TO backup;
GRANT EXECUTE ON FUNCTION pg_catalog.txid_current() TO backup;
GRANT EXECUTE ON FUNCTION pg_catalog.txid_current_snapshot() TO backup;
GRANT EXECUTE ON FUNCTION pg_catalog.txid_snapshot_xmax(txid_snapshot) TO backup;
GRANT EXECUTE ON FUNCTION pg_catalog.pg_control_checkpoint() TO backup;
COMMIT;
After the user is created, make sure pg_hba.conf allows connections from the backup user.
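What the pg_hba.conf entries look like depends on your network and authentication setup. As a rough sketch, the helper below prints two rules for the backup role: one for ordinary connections and one for replication connections, which pg_probackup uses when a backup is taken with --stream (the role then also needs the REPLICATION attribute). The function name hba_rules, the database name backupdb, the address 192.0.2.10/32, and the md5 method are assumptions for illustration, not values from this setup.

```shell
# hba_rules ROLE ADDRESS METHOD DBNAME
# Prints pg_hba.conf rules that let ROLE connect from ADDRESS.
# All arguments are placeholders; adapt them to your environment.
hba_rules() {
    # Ordinary connection used to call the backup functions
    printf 'host %s %s %s %s\n' "$4" "$1" "$2" "$3"
    # Replication connection, only needed for --stream backups
    printf 'host replication %s %s %s\n' "$1" "$2" "$3"
}

# Example (hypothetical values): review the output, append it to pg_hba.conf,
# then reload the configuration, for instance with "pg_ctl reload".
# hba_rules backup 192.0.2.10/32 md5 backupdb
```

Printing the rules instead of editing pg_hba.conf directly keeps the change reviewable before it is applied.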
2.3 Configure automatic archiving
Automatic WAL archiving requires adjusting the following parameters:
# Maximum time, in seconds, before a WAL segment switch is forced so that WAL is archived regularly; adjust as needed, 1 minute is suggested
archive_timeout = 60
# Enable archiving
archive_mode = 'on'
# WAL level; archiving requires replica or higher
wal_level = 'replica'
# Archive command
archive_command = 'pg_probackup-12 archive-push -B /data/pgsql12/backup --instance test01 --wal-file-path=%p --wal-file-name=%f'
After changing these settings, restart the database. The following command shows the archive information:
pg_probackup-12 show -B /data/pgsql12/backup --instance test01 --archive
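Because the archive_command embeds the binary name, the backup catalog, and the instance name, it is easy to mistype. Below is a minimal sketch that prints the line for postgresql.conf from those three values; archive_command_line is a hypothetical helper name, not part of pg_probackup.

```shell
# archive_command_line VERSION BACKUP_DIR INSTANCE
# Prints an archive_command setting for postgresql.conf.
archive_command_line() {
    # %%p and %%f become the literal %p and %f placeholders that PostgreSQL
    # replaces with the WAL file path and file name.
    printf "archive_command = 'pg_probackup-%s archive-push -B %s --instance %s --wal-file-path=%%p --wal-file-name=%%f'\n" \
        "$1" "$2" "$3"
}

# Example with the values used in this article:
# archive_command_line 12 /data/pgsql12/backup test01
```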
3. Introduction to using tools
Running pg_probackup-12 --help shows that the tool is divided into several commands, which this section describes in detail.
pg_probackup-12 - utility to manage backup/recovery of PostgreSQL database.
pg_probackup-12 help [COMMAND]
pg_probackup-12 version
pg_probackup-12 init -B backup-path
pg_probackup-12 set-backup -B backup-path --instance=instance_name
-i backup-id [--ttl=interval] [--expire-time=timestamp]
[--note=text]
[--help]
pg_probackup-12 show-config -B backup-path --instance=instance_name
[--format=format]
[--help]
....................
3.1 init
pg_probackup-12 init -B backup-path
This command initializes the backup catalog and is the first step after installing pg_probackup. It creates the working directory in which pg_probackup-12 stores and manages backups and WAL archives. One catalog can hold the backups of multiple instances.
For example:
pg_probackup-12 init -B /pg_data/backup
This sets /pg_data/backup as the backup catalog of the tool.
3.2 add-instance
pg_probackup-12 add-instance -B backup-path -D pgdata-path
--instance=instance_name
[--external-dirs=external-directories-paths]
[--remote-proto] [--remote-host]
[--remote-port] [--remote-path] [--remote-user]
[--ssh-options]
This command registers an instance that needs to be backed up. pg_probackup can manage the backups of multiple instances, so one host can serve as a central backup server.
Add a local backup instance, for example:
pg_probackup-12 add-instance -B /pg_data/backup -D /data/pgsql12/data --instance node01
This adds the PostgreSQL instance whose local data directory is /data/pgsql12/data to the backup catalog /pg_data/backup. To add a remote instance, first set up passwordless SSH trust between the two hosts.
## On the host of the remote instance to be backed up
# su - postgres
$ ssh-keygen
$ ssh-copy-id postgres@${backup_host_ip}
## On the backup host
# su - postgres
$ ssh-keygen
$ ssh-copy-id postgres@${instance_host_ip}
## Test the trust relationship
$ ssh postgres@${peer_ip}
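The interactive ssh test above can also be scripted. The following is a small sketch, assuming OpenSSH; check_ssh_trust and the SSH override variable are hypothetical names introduced here for illustration.

```shell
# SSH can be overridden, for example to test the function without a network.
SSH=${SSH:-ssh}

# check_ssh_trust USER HOST [PORT]
# Succeeds only if key-based login works without any password prompt.
check_ssh_trust() {
    # BatchMode=yes makes ssh fail instead of asking for a password
    "$SSH" -o BatchMode=yes -o ConnectTimeout=5 -p "${3:-22}" "$1@$2" true
}

# Example:
# check_ssh_trust postgres 172.16.104.55 && echo "SSH trust OK"
```

Run it in both directions (from the backup host and from the instance host), since the setup above configures trust both ways.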
Add a remote instance:
pg_probackup-12 add-instance -B /data/pgsql12/backup -D /data/pgsql/data --instance test02 --remote-proto=ssh --remote-host=172.16.104.55 --remote-port=22 --remote-user=postgres
INFO: Instance 'test02' successfully initialized
3.3 del-instance
pg_probackup-12 del-instance -B backup-path
--instance=instance_name
[--help]
This command removes an instance, together with its backups, from the backup catalog. For example:
pg_probackup-12 del-instance -B /data/pgsql12/backup/ --instance test02
INFO: Delete: RZXNFS 2023-08-25 14:02:16+08
INFO: Delete: RZXNEX 2023-08-25 14:01:45+08
INFO: Delete: RZXND9 2023-08-25 14:00:45+08
INFO: Delete: RZXNCL 2023-08-25 14:00:21+08
INFO: Delete: RZXNA5 2023-08-25 13:58:53+08
INFO: Delete: RZXN9V 2023-08-25 13:58:43+08
INFO: Instance 'test02' successfully deleted
3.4 set-config
pg_probackup-12 set-config -B backup-path --instance=instance_name
[-D pgdata-path]
[--external-dirs=external-directories-paths]
[--log-level-console=log-level-console]
[--log-level-file=log-level-file]
[--log-format-file=log-format-file]
[--log-filename=log-filename]
[--error-log-filename=error-log-filename]
[--log-directory=log-directory]
[--log-rotation-size=log-rotation-size]
[--log-rotation-age=log-rotation-age]
[--retention-redundancy=retention-redundancy]
[--retention-window=retention-window]
[--wal-depth=wal-depth]
[--compress-algorithm=compress-algorithm]
[--compress-level=compress-level]
[--archive-timeout=timeout]
[-d dbname] [-h host] [-p port] [-U username]
[--remote-proto] [--remote-host]
[--remote-port] [--remote-path] [--remote-user]
[--ssh-options]
[--restore-command=cmdline] [--archive-host=destination]
[--archive-port=port] [--archive-user=username]
[--help]
This command changes the settings of an instance, such as the backup retention policy. For example, to keep 7 full backups and a 7-day recovery window:
pg_probackup-12 set-config -B /pg_data/backup --instance node01 --retention-redundancy 7 --retention-window 7
3.5 show-config
pg_probackup-12 show-config -B backup-path --instance=instance_name
[--format=format]
[--help]
This command displays the configuration parameters of a backup instance.
3.6 set-backup
pg_probackup-12 set-backup -B backup-path --instance=instance_name
-i backup-id [--ttl=interval] [--expire-time=timestamp]
[--note=text]
[--help]
This command modifies the metadata of a backup, such as its retention time (--ttl, --expire-time) or its note (--note).
3.7 backup
pg_probackup-12 backup -B backup-path -b backup-mode --instance=instance_name
[-D pgdata-path] [-C]
[--stream [-S slot-name] [--temp-slot]]
[--backup-pg-log] [-j num-threads] [--progress]
[--no-validate] [--skip-block-validation]
[--external-dirs=external-directories-paths]
[--no-sync]
[--log-level-console=log-level-console]
[--log-level-file=log-level-file]
[--log-format-console=log-format-console]
[--log-format-file=log-format-file]
[--log-filename=log-filename]
[--error-log-filename=error-log-filename]
[--log-directory=log-directory]
[--log-rotation-size=log-rotation-size]
[--log-rotation-age=log-rotation-age] [--no-color]
[--delete-expired] [--delete-wal] [--merge-expired]
[--retention-redundancy=retention-redundancy]
[--retention-window=retention-window]
[--wal-depth=wal-depth]
[--compress]
[--compress-algorithm=compress-algorithm]
[--compress-level=compress-level]
[--archive-timeout=archive-timeout]
[-d dbname] [-h host] [-p port] [-U username]
[-w --no-password] [-W --password]
[--remote-proto] [--remote-host]
[--remote-port] [--remote-path] [--remote-user]
[--ssh-options]
[--ttl=interval] [--expire-time=timestamp] [--note=text]
[--help]
This command creates a backup. Its main parameters are:
- -B, --backup-path=backup-path : The backup catalog created by init.
- -b, --backup-mode=backup-mode : The backup mode, one of FULL, PAGE, DELTA, or PTRACK.
  - FULL: Creates a full backup that contains all the data files of the cluster needed to restore it.
  - PAGE: Creates an incremental backup based on the WAL files generated since the previous full or incremental backup. Only changed blocks are read from the data files.
  - DELTA: Reads all data files in the data directory and creates an incremental backup of the pages that have changed since the previous backup.
  - PTRACK: Creates an incremental backup by tracking page changes on the fly.
- -C, --smooth-checkpoint : Spreads the checkpoint over a period of time. By default, pg_probackup tries to complete the checkpoint as quickly as possible.
- --instance=instance_name : The instance name. pg_probackup uses it to look up the instance's information in the backup catalog.
- --stream : Backs up WAL in STREAM mode, that is, the required WAL files are streamed from the server over the replication protocol and stored with the backup.
- -S, --slot=SLOTNAME : The replication slot to use for WAL streaming. This option can only be used together with --stream.
- -j, --threads=NUM : The number of parallel threads used for the backup.
Example: run a local backup:
pg_probackup-12 backup -B /data/pgsql12/backup/ --instance test01 -b full
INFO: Database backup start
INFO: wait for pg_start_backup()
INFO: Wait for WAL segment /data/pgsql12/backup/wal/test01/00000002000000020000007E to be archived
INFO: PGDATA size: 2397MB
INFO: Current Start LSN: 2/7E000028, TLI: 2
INFO: Start transferring data files
INFO: Data files are transferred, time elapsed: 39s
INFO: wait for pg_stop_backup()
INFO: pg_stop backup() successfully executed
INFO: stop_lsn: 2/7F0000F0
INFO: Getting the Recovery Time from WAL
INFO: Syncing backup files to disk
INFO: Backup files are synced, time elapsed: 1s
INFO: Validating backup RZXNYO
INFO: Backup RZXNYO data files are valid
INFO: Backup RZXNYO resident size: 2400MB
INFO: Backup RZXNYO completed
Example: run a remote backup:
pg_probackup-12 backup -B /data/pg_backup --instance test02 --remote-user='postgres' --remote-host='172.16.104.7' --remote-proto=ssh --stream --remote-port=22 -b full
INFO: Database backup start
INFO: wait for pg_start_backup()
INFO: Wait for WAL segment /data/pg_backup/backups/test02/S04Q23/database/pg_wal/000000020000000200000092 to be streamed
INFO: PGDATA size: 2405MB
INFO: Current Start LSN: 2/92000028, TLI: 2
INFO: Start transferring data files
INFO: Data files are transferred, time elapsed: 40s
INFO: wait for pg_stop_backup()
INFO: pg_stop backup() successfully executed
INFO: stop_lsn: 2/920001A0
INFO: Getting the Recovery Time from WAL
INFO: Syncing backup files to disk
INFO: Backup files are synced, time elapsed: 1s
INFO: Validating backup S04Q23
INFO: Backup S04Q23 data files are valid
INFO: Backup S04Q23 resident size: 2439MB
INFO: Backup S04Q23 completed
Next, test an incremental backup:
# First take a physical full backup
pg_probackup-12 backup -B /pg_data/backup --instance node01 -b full
View backup information:
==================================================================================================================================
Instance  Version  ID      Recovery Time           Mode  WAL Mode  TLI  Time  Data    WAL   Zratio  Start LSN   Stop LSN    Status
==================================================================================================================================
node01    12       S0GD3F  2023-09-04 16:33:57+08  FULL  ARCHIVE   3/0  44s   2446MB  16MB  1.00    4/E9000028  4/EA000128  OK
Generate some data changes for the test:
update pgbench_accounts set bid = 6;
Then take an incremental backup on top of the last full backup:
pg_probackup-12 backup -B /pg_data/backup --instance node01 -b page
==================================================================================================================================
Instance  Version  ID      Recovery Time           Mode  WAL Mode  TLI  Time  Data    WAL   Zratio  Start LSN   Stop LSN    Status
==================================================================================================================================
node01    12       S0GDJI  2023-09-04 16:43:24+08  PAGE  ARCHIVE   3/3  31s   1090MB  16MB  1.00    5/6D000110  5/6E0000F0  OK
node01    12       S0GD3F  2023-09-04 16:33:57+08  FULL  ARCHIVE   3/0  44s   2446MB  16MB  1.00    4/E9000028  4/EA000128  OK
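A common way to combine the two modes is one full backup per week and PAGE backups on the other days. The sketch below only illustrates that idea; the schedule (FULL on Sunday) and the helper name backup_mode_for_day are assumptions, not something pg_probackup prescribes.

```shell
# backup_mode_for_day WEEKDAY
# WEEKDAY is the output of "date +%u": 1 = Monday ... 7 = Sunday.
backup_mode_for_day() {
    case "$1" in
        7) echo FULL ;;
        1|2|3|4|5|6) echo PAGE ;;
        *) echo "unexpected weekday: $1" >&2; return 1 ;;
    esac
}

# Example, suitable for a daily cron job:
# mode=$(backup_mode_for_day "$(date +%u)") &&
#     pg_probackup-12 backup -B /pg_data/backup --instance node01 -b "$mode"
```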
3.8 show
pg_probackup-12 show -B backup-path
[--instance=instance_name [-i backup-id]]
[--format=format] [--archive]
[--no-color] [--help]
This command lists backups and WAL archive information. Example: view the backup list:
pg_probackup-12 show -B /data/pg_backup/
BACKUP INSTANCE 'test02'
==================================================================================================================================
Instance  Version  ID      Recovery Time           Mode  WAL Mode  TLI  Time  Data    WAL   Zratio  Start LSN   Stop LSN    Status
==================================================================================================================================
test02    12       S04Q23  2023-08-29 09:42:50+08  FULL  STREAM    2/0  54s   2407MB  32MB  1.00    2/92000028  2/920001A0  OK
Example: view the archive information:
pg_probackup-12 show -B /data/pgsql12/backup/ --archive
ARCHIVE INSTANCE 'test01'
================================================================================================================================
TLI  Parent TLI  Switchpoint  Min Segno                 Max Segno                 N segments  Size   Zratio  N backups  Status
================================================================================================================================
2    0           0/0          000000020000000200000072  000000020000000200000092  32          512MB  1.00    0          DEGRADED
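Scripts often need the ID of the newest usable backup. As a sketch, the filter below reads the plain-text backup list printed by show and returns the ID of the first row of an instance whose status is OK. It assumes the column layout shown in this article and that rows are listed newest first, as in the sample output; latest_ok_backup_id is a hypothetical helper name.

```shell
# latest_ok_backup_id INSTANCE
# Reads "pg_probackup show" output on stdin and prints the ID (third column)
# of the first row that belongs to INSTANCE and ends with status OK.
# Returns 1 if no such row exists.
latest_ok_backup_id() {
    awk -v inst="$1" '
        $1 == inst && $NF == "OK" { print $3; found = 1; exit }
        END { exit !found }
    '
}

# Example:
# pg_probackup-12 show -B /pg_data/backup --instance node01 | latest_ok_backup_id node01
```

Parsing the text table is fragile if the layout changes between versions; the --format option of show may be a more robust basis for automation.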
3.9 delete
pg_probackup-12 delete -B backup-path --instance=instance_name
[-j num-threads] [--progress]
[--retention-redundancy=retention-redundancy]
[--retention-window=retention-window]
[--wal-depth=wal-depth]
[-i backup-id | --delete-expired | --merge-expired | --status=backup_status]
[--delete-wal]
[--dry-run] [--no-validate] [--no-sync]
[--help]
This command deletes backups or expired archived WAL. For example, delete a single backup by its ID:
pg_probackup-12 delete -B /pg_data/backup/ --instance node01 -i S0G7IN
# INFO: Delete: S0G7IN 2023-09-04 14:32:47+08
Delete expired backups and WAL logs:
pg_probackup-12 delete -B /pg_data/backup --instance node01 --delete-expired --delete-wal
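Since deletion cannot be undone, it can help to preview the effect first with the --dry-run flag listed in the synopsis above. This is a minimal sketch of such a two-step cleanup; purge_expired and the PROBACKUP override variable are hypothetical names for illustration.

```shell
# PROBACKUP can be overridden to point at another binary version.
PROBACKUP=${PROBACKUP:-pg_probackup-12}

# purge_expired BACKUP_DIR INSTANCE [apply]
# Without "apply" the command runs with --dry-run and deletes nothing.
purge_expired() {
    if [ "${3:-preview}" = "apply" ]; then
        "$PROBACKUP" delete -B "$1" --instance "$2" --delete-expired --delete-wal
    else
        "$PROBACKUP" delete -B "$1" --instance "$2" --delete-expired --delete-wal --dry-run
    fi
}

# Example:
# purge_expired /pg_data/backup node01          # preview only
# purge_expired /pg_data/backup node01 apply    # actually delete
```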
3.10 restore
pg_probackup-12 restore -B backup-path --instance=instance_name
[-D pgdata-path] [-i backup-id] [-j num-threads]
[--recovery-target-time=time|--recovery-target-xid=xid
|--recovery-target-lsn=lsn [--recovery-target-inclusive=boolean]]
[--recovery-target-timeline=timeline]
[--recovery-target=immediate|latest]
[--recovery-target-name=target-name]
[--recovery-target-action=pause|promote|shutdown]
[--restore-command=cmdline]
[-R | --restore-as-replica] [--force]
[--primary-conninfo=primary_conninfo]
[-S | --primary-slot-name=slotname]
[--no-validate] [--skip-block-validation]
[-T OLDDIR=NEWDIR] [--progress]
[--external-mapping=OLDDIR=NEWDIR]
[--skip-external-dirs] [--no-sync]
[-X WALDIR | --waldir=WALDIR]
[-I | --incremental-mode=none|checksum|lsn]
[--db-include | --db-exclude]
[--remote-proto] [--remote-host]
[--remote-port] [--remote-path] [--remote-user]
[--ssh-options]
[--archive-host=hostname]
[--archive-port=port] [--archive-user=username]
[--help]
This command restores a backup into a PostgreSQL data directory. If a recovery target option is specified, pg_probackup finds the closest backup and restores it to the specified recovery target. If neither a backup ID nor a recovery target option is given, pg_probackup restores the most recent backup.
Example of restoring a full backup:
# Stop PostgreSQL
pg_ctl -D /data/pgsql12/data/ -l /data/pgsql12/logs/start.log stop
# Remove the data directory
rm -rf /data/pgsql12/data
# Restore from a backup; this variant restores over SSH to a remote host
pg_probackup-12 restore -B /data/pg_backup --instance test02 --remote-user='postgres' --remote-host='172.16.104.7' --remote-proto=ssh --remote-port=22
# Restore from a backup; this variant restores a local backup
pg_probackup-12 restore -B /data/pgsql12/backup/ --instance test01 -i S08V98
# Start PostgreSQL after the restore
pg_ctl -D /data/pgsql12/data/ -l /data/pgsql12/logs/start.log start
Output of the remote restore:
INFO: Validating backup S04Q23
INFO: Backup S04Q23 data files are valid
INFO: Backup S04Q23 WAL segments are valid
INFO: Backup S04Q23 is valid.
INFO: Restoring the database from backup at 2023-08-29 09:42:03+08
INFO: Start restoring backup files. PGDATA size: 2437MB
INFO: Backup files are restored. Transfered bytes: 2437MB, time elapsed: 52s
INFO: Restore incremental ratio (less is better): 100% (2437MB/2437MB)
INFO: Syncing restored files to disk
INFO: Restored backup files are synced, time elapsed: 3s
INFO: Restore of backup S04Q23 completed.
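The recovery target options in the synopsis allow point-in-time recovery. The following is a sketch of a restore to a given timestamp, assuming the WAL needed to reach that time is available in the archive; restore_to_time, the PROBACKUP override variable, and the timestamp in the example are hypothetical.

```shell
# PROBACKUP can be overridden to point at another binary version.
PROBACKUP=${PROBACKUP:-pg_probackup-12}

# restore_to_time BACKUP_DIR INSTANCE PGDATA TARGET_TIME
# Restores into PGDATA and configures recovery to stop at TARGET_TIME,
# then promote the server so that it accepts writes.
restore_to_time() {
    "$PROBACKUP" restore -B "$1" --instance "$2" -D "$3" \
        --recovery-target-time="$4" --recovery-target-action=promote
}

# Example (stop PostgreSQL and empty the data directory first, as above):
# restore_to_time /pg_data/backup node01 /data/pgsql12/data '2023-09-04 16:40:00+08'
```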
3.11 catchup
pg_probackup-12 catchup -b catchup-mode
--source-pgdata=path_to_pgdata_on_remote_server
--destination-pgdata=path_to_local_dir
[--stream [-S slot-name] [--temp-slot | --perm-slot]]
[-j num-threads]
[-T OLDDIR=NEWDIR]
[--exclude-path=path_prefix]
[-d dbname] [-h host] [-p port] [-U username]
[-w --no-password] [-W --password]
[--remote-proto] [--remote-host]
[--remote-port] [--remote-path] [--remote-user]
[--ssh-options]
[--dry-run]
[--help]
This command copies a PostgreSQL instance directly into a destination data directory without going through the backup catalog, for example to create a standby or to bring a lagging standby up to date.
3.12 archive-push
pg_probackup-12 archive-push -B backup-path --instance=instance_name
--wal-file-name=wal-file-name
[--wal-file-path=wal-file-path]
[-j num-threads] [--batch-size=batch_size]
[--archive-timeout=timeout]
[--no-ready-rename] [--no-sync]
[--overwrite] [--compress]
[--compress-algorithm=compress-algorithm]
[--compress-level=compress-level]
[--remote-proto] [--remote-host]
[--remote-port] [--remote-path] [--remote-user]
[--ssh-options]
[--help]
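This command copies WAL files into the archive of the backup catalog. It is meant to be used as the archive_command in the PostgreSQL configuration file:
# Maximum time, in seconds, before a WAL segment switch is forced so that WAL is archived regularly; adjust as needed, 1 minute is suggested
archive_timeout = 60
# Enable archiving
archive_mode = 'on'
# WAL level; archiving requires replica or higher
wal_level = 'replica'
# Archive command
archive_command = 'pg_probackup-12 archive-push -B /data/pgsql12/backup --instance test01 --wal-file-path=%p --wal-file-name=%f'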
3.13 archive-get
pg_probackup-12 archive-get -B backup-path --instance=instance_name
--wal-file-path=wal-file-path
--wal-file-name=wal-file-name
[-j num-threads] [--batch-size=batch_size]
[--no-validate-wal]
[--remote-proto] [--remote-host]
[--remote-port] [--remote-path] [--remote-user]
[--ssh-options]
[--help]
This command copies WAL files from the backup catalog to the PostgreSQL WAL directory for point-in-time recovery (PITR). pg_probackup sets it as the restore_command automatically, so users do not need to configure it by hand.
3.14 checkdb
pg_probackup-12 checkdb [-B backup-path] [--instance=instance_name]
[-D pgdata-path] [--progress] [-j num-threads]
[--amcheck] [--skip-block-validation]
[--heapallindexed] [--checkunique]
[--help]
Verify the correctness of your PostgreSQL database cluster by detecting physical and logical corruption.
3.15 validate
pg_probackup-12 validate -B backup-path [--instance=instance_name]
[-i backup-id] [--progress] [-j num-threads]
[--recovery-target-time=time|--recovery-target-xid=xid
|--recovery-target-lsn=lsn [--recovery-target-inclusive=boolean]]
[--recovery-target-timeline=timeline]
[--recovery-target-name=target-name]
[--skip-block-validation]
[--help]
This command verifies that a backup is intact. For example, validate backup S04Q23 of instance test02:
pg_probackup-12 validate -B /data/pg_backup -i S04Q23 --instance test02
INFO: Validating backup S04Q23
INFO: Backup S04Q23 data files are valid
INFO: Backup S04Q23 WAL segments are valid
INFO: Backup S04Q23 is valid.
INFO: Validate of backup S04Q23 completed.
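Validation is a natural candidate for a scheduled job. The sketch below validates all backups of one instance and reports the result. It assumes that pg_probackup exits with a non-zero status when validation fails; validate_instance and the PROBACKUP override variable are hypothetical names.

```shell
# PROBACKUP can be overridden to point at another binary version.
PROBACKUP=${PROBACKUP:-pg_probackup-12}

# validate_instance BACKUP_DIR INSTANCE
# Validates every backup of INSTANCE and returns 1 if validation fails.
validate_instance() {
    if "$PROBACKUP" validate -B "$1" --instance "$2"; then
        echo "validation passed for $2"
    else
        echo "validation FAILED for $2" >&2
        return 1
    fi
}

# Example:
# validate_instance /data/pg_backup test02 || echo "check the backups of test02"
```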
3.16 merge
pg_probackup-12 merge -B backup-path --instance=instance_name
-i backup-id [--progress] [-j num-threads]
[--no-validate] [--no-sync]
[--help]
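This command merges backups that belong to the same incremental backup chain. If the specified backup is incremental, it is merged into its parent full backup together with the incremental backups between them.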