PostgreSQL Backup with the pg_basebackup Tool


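Suppose a full backup was taken at 1 a.m. and the database was dropped by mistake at 4 p.m. the same day. We need to recover to the point just before the drop.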

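Approach

  1. Restore the full backup
  2. Replay WAL (the WAL included in the backup + the archives from 1 a.m. to 4 p.m. + the online redo)
  3. Key parameters
recovery_target_xid = ''	# recover up to a transaction ID
recovery_target_lsn = ''    # recover up to a log sequence number (LSN)

 Using these two parameters makes the recovery point more precise.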

I. Backup environment

### Backup command
[postgres@postgresql ~]$ pg_basebackup -D /data/backupsets/ -R -Ft -Pv -Upostgres
pg_basebackup: initiating base backup, waiting for checkpoint to complete
pg_basebackup: checkpoint completed
pg_basebackup: write-ahead log start point: 0/2000028 on timeline 1
pg_basebackup: starting background WAL receiver
pg_basebackup: created temporary replication slot "pg_basebackup_8831"
32243/32243 kB (100%), 1/1 tablespace                                         
pg_basebackup: write-ahead log end point: 0/2000100
pg_basebackup: waiting for background process to finish streaming ...
pg_basebackup: syncing data to disk ...
pg_basebackup: renaming backup_manifest.tmp to backup_manifest
pg_basebackup: base backup completed
# Backup set
[postgres@postgresql backupsets]$ ll
total 48872
-rw-------. 1 postgres postgres   178976 Jul 14 17:41 backup_manifest
-rw-------. 1 postgres postgres 33079808 Jul 14 17:41 base.tar
-rw-------. 1 postgres postgres 16778752 Jul 14 17:41 pg_wal.tar

II. Simulating a dropped database

testdb=# \c postgres
You are now connected to database "postgres" as user "postgres".
postgres=# drop database testdb;
DROP DATABASE
postgres=# select pg_walfile_name(pg_current_wal_lsn());
     pg_walfile_name      
--------------------------
 00000001000000000000001E

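The WAL file currently in use is 00000001000000000000001E. Since this query was run right after the drop, the drop should be recorded in this segment. Use the pg_waldump tool to inspect the file and find the transaction ID or LSN just before the drop.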
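The mapping from an LSN to a WAL segment name can also be worked out by hand. The sketch below is a hypothetical helper (wal_file_name is not a PostgreSQL command) and assumes the default 16 MB segment size; for an LSN that falls exactly on a segment boundary, the server's pg_walfile_name() returns the preceding segment, which this helper does not imitate.

```shell
#!/bin/sh
# wal_file_name TIMELINE LSN [SEGMENT_SIZE_BYTES]
# Hypothetical helper: derive the WAL segment file name that contains LSN.
# Assumes the default segment size of 16 MB unless a third argument is given.
wal_file_name() {
    tli=$1
    lsn=$2
    seg_size=${3:-16777216}
    hi=${lsn%%/*}                      # hex digits before the slash
    lo=${lsn##*/}                      # hex digits after the slash
    pos=$(( (0x$hi << 32) + 0x$lo ))   # absolute byte position in the WAL
    segno=$(( pos / seg_size ))        # overall segment number
    per_id=$(( 0x100000000 / seg_size ))
    printf '%08X%08X%08X\n' "$tli" $(( segno / per_id )) $(( segno % per_id ))
}

# LSN returned by pg_switch_wal() later in this walkthrough, timeline 1
wal_file_name 1 0/1E0008D8   # prints 00000001000000000000001E
```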

III. Restoring the database

[postgres@postgresql ~]$ psql
psql (13.6)
Type "help" for help.

postgres=# select pg_switch_wal();
 pg_switch_wal 
---------------
 0/1E0008D8
(1 row)
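-- Before shutting down the database, switch the WAL manually to make sure the segment containing the mistaken operation is in the archive directory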
[postgres@postgresql ~]$ pg_ctl stop 
waiting for server to shut down.... done
server stopped
[postgres@postgresql ~]$ cd $PGDATA
[postgres@postgresql data]$ rm -rf *
--- For convenience, the original data directory is simply deleted here

1. Restore the full backup

[postgres@postgresql backupsets]$ tar xf base.tar -C /data/pg13.6/data/

2. Configuration before WAL replay

Archived WAL files

[postgres@postgresql pgarchive]$ ll
total 131080
-rw-------. 1 postgres postgres 16777216 Jul 14 17:35 000000010000000000000018
-rw-------. 1 postgres postgres 16777216 Jul 14 17:35 000000010000000000000019
-rw-------. 1 postgres postgres 16777216 Jul 14 17:37 00000001000000000000001A
-rw-------. 1 postgres postgres 16777216 Jul 14 17:37 00000001000000000000001B
-rw-------. 1 postgres postgres      340 Jul 14 17:37 00000001000000000000001B.00000028.backup
-rw-------. 1 postgres postgres 16777216 Jul 14 17:41 00000001000000000000001C
-rw-------. 1 postgres postgres 16777216 Jul 14 17:41 00000001000000000000001D
-rw-------. 1 postgres postgres      340 Jul 14 17:41 00000001000000000000001D.00000028.backup
-rw-------. 1 postgres postgres 16777216 Jul 14 18:32 00000001000000000000001E
-rw-------. 1 postgres postgres 16777216 Jul 14 18:38 00000001000000000000001F

Find the recovery point

[postgres@postgresql pg_wal]$ pg_waldump 00000001000000000000001E
rmgr: Standby     len (rec/tot):     50/    50, tx:          0, lsn: 0/1E000028, prev 0/1D000100, desc: RUNNING_XACTS nextXid 511 latestCompletedXid 510 oldestRunningXid 511
rmgr: Heap        len (rec/tot):     59/  1511, tx:        511, lsn: 0/1E000060, prev 0/1E000028, desc: DELETE off 7 flags 0x00 KEYS_UPDATED , blkref #0: rel 1664/0/1262 blk 0 FPW
rmgr: Standby     len (rec/tot):     54/    54, tx:          0, lsn: 0/1E000648, prev 0/1E000060, desc: RUNNING_XACTS nextXid 512 latestCompletedXid 510 oldestRunningXid 511; 1 xacts: 511
rmgr: Standby     len (rec/tot):     54/    54, tx:          0, lsn: 0/1E000680, prev 0/1E000648, desc: RUNNING_XACTS nextXid 512 latestCompletedXid 510 oldestRunningXid 511; 1 xacts: 511
rmgr: XLOG        len (rec/tot):    114/   114, tx:          0, lsn: 0/1E0006B8, prev 0/1E000680, desc: CHECKPOINT_ONLINE redo 0/1E000680; tli 1; prev tli 1; fpw true; xid 0:512; oid 24590; multi 1; offset 0; oldest xid 478 in DB 1; oldest multi 1 in DB 1; oldest/newest commit timestamp xid: 0/0; oldest running xid 511; online
rmgr: Database    len (rec/tot):     38/    38, tx:        511, lsn: 0/1E000730, prev 0/1E0006B8, desc: DROP dir 1663/16398
rmgr: Transaction len (rec/tot):     66/    66, tx:        511, lsn: 0/1E000758, prev 0/1E000730, desc: COMMIT 2022-07-14 17:42:26.981178 CST; inval msgs: catcache 21; sync
rmgr: Standby     len (rec/tot):     50/    50, tx:          0, lsn: 0/1E0007A0, prev 0/1E000758, desc: RUNNING_XACTS nextXid 512 latestCompletedXid 511 oldestRunningXid 512
rmgr: Standby     len (rec/tot):     50/    50, tx:          0, lsn: 0/1E0007D8, prev 0/1E0007A0, desc: RUNNING_XACTS nextXid 512 latestCompletedXid 511 oldestRunningXid 512
rmgr: XLOG        len (rec/tot):    114/   114, tx:          0, lsn: 0/1E000810, prev 0/1E0007D8, desc: CHECKPOINT_ONLINE redo 0/1E0007D8; tli 1; prev tli 1; fpw true; xid 0:512; oid 24590; multi 1; offset 0; oldest xid 478 in DB 1; oldest multi 1 in DB 1; oldest/newest commit timestamp xid: 0/0; oldest running xid 512; online
rmgr: Standby     len (rec/tot):     50/    50, tx:          0, lsn: 0/1E000888, prev 0/1E000810, desc: RUNNING_XACTS nextXid 512 latestCompletedXid 511 oldestRunningXid 512
rmgr: XLOG        len (rec/tot):     24/    24, tx:          0, lsn: 0/1E0008C0, prev 0/1E000888, desc: SWITCH 

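From the dumped WAL, it is enough to recover to just before transaction 511 (a DROP DATABASE appears in the WAL as a "DROP dir" record). If the WAL contains too many records, narrow the search by timestamp. LSN 0/1E000028 is the last record before transaction 511's first record (latestCompletedXid is 510 at that point), so it is used as the recovery target.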
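When the dump is long, the same lookup can be scripted. The following is a minimal sketch, assuming pg_waldump text output in the format shown above; find_drop_target is a hypothetical helper name. For each "DROP dir" record it prints the transaction ID and the LSN of the record immediately before that transaction's first WAL record (empty if the transaction starts at the very top of the input).

```shell
#!/bin/sh
# find_drop_target: read pg_waldump text output on stdin and print
# "<xid> <lsn>" for every DROP DATABASE record, where <lsn> is the LSN of
# the record that precedes the dropping transaction's first WAL record.
find_drop_target() {
    awk '
    {
        tx = ""; lsn = ""
        n = split($0, part, ", ")
        for (i = 1; i <= n; i++) {
            if (tx == ""  && part[i] ~ /^tx:/)  { tx = part[i];  sub(/^tx: */, "", tx) }
            if (lsn == "" && part[i] ~ /^lsn:/) { lsn = part[i]; sub(/^lsn: */, "", lsn) }
        }
        if (lsn == "") next                       # not a WAL record line
        if (tx != "0" && !(tx in before)) before[tx] = last_lsn
        if ($0 ~ /desc: DROP dir/) print tx, before[tx]
        last_lsn = lsn
    }'
}

# Usage (requires pg_waldump and the segment file):
#   pg_waldump 00000001000000000000001E | find_drop_target
# For the dump shown above this prints: 511 0/1E000028
```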

So add the following to postgresql.auto.conf
[postgres@postgresql data]$ vim postgresql.auto.conf 
restore_command='cp /pgarchive/%f %p'
recovery_target_lsn = '0/1E000028'

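Note: because we are on the original machine, the archive directory is outside $PGDATA, and the archived files are complete, there is no need to extract /data/backupsets/pg_wal.tar into the archive directory.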
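The two settings can also be appended from a script instead of an editor. This is a sketch only: append_recovery_settings is a hypothetical helper, and the extra recovery_target_action line merely spells out the default (pause) that this walkthrough relies on. It assumes the restored data directory already contains the standby.signal file that pg_basebackup -R places in the backup.

```shell
#!/bin/sh
# append_recovery_settings CONF_FILE ARCHIVE_DIR TARGET_LSN
# Hypothetical helper: append point-in-time recovery settings to CONF_FILE.
append_recovery_settings() {
    conf_file=$1
    archive_dir=$2
    target_lsn=$3
    cat >> "$conf_file" <<EOF
restore_command = 'cp ${archive_dir}/%f %p'
recovery_target_lsn = '${target_lsn}'
recovery_target_action = 'pause'
EOF
}

# Usage with the values from this walkthrough:
#   append_recovery_settings "$PGDATA/postgresql.auto.conf" /pgarchive 0/1E000028
```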

IV. Starting the database

[postgres@postgresql data]$ pg_ctl start
waiting for server to start....2022-07-14 18:58:23.398 CST [15123] LOG:  redirecting log output to logging collector process
2022-07-14 18:58:23.398 CST [15123] HINT:  Future log output will appear in directory "log".
 done
server started
[postgres@postgresql data]$ psql
psql (13.6)
Type "help" for help.

postgres=# \l
                                  List of databases
   Name    |  Owner   | Encoding |   Collate   |    Ctype    |   Access privileges   
-----------+----------+----------+-------------+-------------+-----------------------
 postgres  | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | 
 template0 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +
           |          |          |             |             | postgres=CTc/postgres
 template1 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +
           |          |          |             |             | postgres=CTc/postgres
 testdb    | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | 
(4 rows)

postgres=# \c testdb
You are now connected to database "testdb" as user "postgres".
testdb=# select * from test;
 id |      name      |     phone      | country  | numberrange 
----+----------------+----------------+----------+-------------
  1 | Wade Sykes     | 1-917-342-3132 | Turkey   |           3
  2 | Barrett Boyer  | 1-264-304-0665 | Germany  |           9
  3 | Alana Kaufman  | (213) 254-4997 | India    |           0
  4 | Emmanuel Lopez | (543) 493-0137 | Germany  |           9
  5 | Timon Bauer    | 1-269-448-2772 | Pakistan |           6
(5 rows)

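At this point the database is still read-only: recovery has paused at the target (recovery_target_action defaults to pause). Run select pg_wal_replay_resume(); to end WAL replay.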

[postgres@postgresql data]$ psql
psql (13.6)
Type "help" for help.

postgres=# create database xx;
ERROR:  cannot execute CREATE DATABASE in a read-only transaction
postgres=# select pg_wal_replay_resume();
 pg_wal_replay_resume 
----------------------
 
(1 row)

postgres=# create database xx;
CREATE DATABASE
postgres=# 

V. Scheduling the backup job (for reference only)

Create the backup.sh backup script

[postgres@postgresql backupsets]$ cd /data/scripts/
[postgres@postgresql scripts]$ cat backup.sh
#!/bin/bash
DATE=$(date +%Y%m%d)
BACKDIR=/data/backupsets/ 
REV_DATE=1

function green_echo(){
	echo -e "\e[40;32;1m$1\e[0m"
}
function red_echo(){
	echo -e "\e[40;31;1m$1\e[0m"
}


# Take a full backup of all data into a directory named after the backup date
function backup(){
 pg_basebackup -D /data/backupsets/bkdata_$DATE/ -R -Ft -Pv -Upostgres
}
# Start the backup
green_echo "Begin backup data At Date:`date`"
backup
green_echo "End backup data At Date:`date`"


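# Delete backup directories more than $REV_DATE day(s) old
if cd "$BACKDIR"; then
  # Only match the bkdata_* directories directly under $BACKDIR, never $BACKDIR itself
  find "$BACKDIR" -mindepth 1 -maxdepth 1 -type d -name 'bkdata_*' -mtime +"$REV_DATE" -exec rm -rvf {} +
  green_echo "delete Success"
else
  red_echo "delete fail please check log!!!"
  exit 1
fi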
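Before old backups are pruned, it can be worth confirming that the newest backup set is complete. The sketch below uses a hypothetical helper, backup_set_complete, and assumes a tar-format backup with streamed WAL, which produces the three files listed in the backup set earlier (backup_manifest, base.tar, pg_wal.tar). It only checks that the files exist and are non-empty; it does not validate their contents.

```shell
#!/bin/sh
# backup_set_complete DIR
# Hypothetical helper: succeed only if DIR holds the three non-empty files
# that a tar-format pg_basebackup with streamed WAL is expected to write.
backup_set_complete() {
    dir=$1
    for f in backup_manifest base.tar pg_wal.tar; do
        if [ ! -s "$dir/$f" ]; then
            echo "missing or empty: $dir/$f" >&2
            return 1
        fi
    done
    return 0
}

# Usage inside the backup script, before the cleanup step:
#   backup_set_complete "/data/backupsets/bkdata_$DATE" || exit 1
```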

Configure the crontab schedule

[postgres@postgresql scripts]$ crontab -l
30 03 * * * /bin/bash /data/scripts/backup.sh >/data/scripts/backup.log 2>&1

Two scripts: one for pg_dump and one for pg_basebackup

-bash-4.2$ cat backupdump.sh
#!/bin/bash
 
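# Operation type: backup or restore
type=$1
# Name of the database (schema) to operate on
dbname=$2
# Backup file name, in the form: registeredName_yyyyMMddHHmmss.sql
backupFileName=$3
# IP address of the database server
dbhost=$4
 
# Fixed storage directory /home/backup/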
if [ ! -d "/home/backup/" ];then
    mkdir "/home/backup/"
fi
 
backupFile="/home/backup/"${backupFileName}
echo ${backupFile}
 
cd /usr/pgsql-14/bin
 
if [ "$type" == "backup" ];then
    PGPASSWORD="postgres" ./pg_dump -h "${dbhost}" -U postgres "${dbname}"  > "${backupFile}"
elif [ "$type" == "restore" ];then
    # Clear the schema first; otherwise data added after the restore point would not be removed by the restore
    PGPASSWORD="postgres" ./psql -h "${dbhost}" -U postgres -d "${dbname}" -c "drop schema public cascade;create schema public;"
    PGPASSWORD='postgres' ./psql -h "${dbhost}" -U postgres -d "${dbname}"  -f "${backupFile}"
else
    echo "No matching operation type"
    exit 1
fi
 
exit 0
-bash-4.2$ ls
14  backupdump.sh
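The script expects its third argument to follow the name_yyyyMMddHHmmss.sql pattern mentioned in its comments. A small sketch for generating such a name is shown here; backup_file_name is a hypothetical helper, and the example name and IP address in the usage line are placeholders rather than values from the original setup.

```shell
#!/bin/sh
# backup_file_name NAME
# Hypothetical helper: build a file name of the form NAME_yyyyMMddHHmmss.sql
# using the current local time.
backup_file_name() {
    printf '%s_%s.sql\n' "$1" "$(date +%Y%m%d%H%M%S)"
}

# Usage (placeholder database name and host):
#   ./backupdump.sh backup testdb "$(backup_file_name testdb)" 127.0.0.1
```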






-bash-4.2$ cat basebackup.sh
#!/bin/bash
DATE=$(date +%Y%m%d)
BACKDIR=/data/backupsets/ 
REV_DATE=1

function green_echo(){
        echo -e "\e[40;32;1m$1\e[0m"
}
function red_echo(){
        echo -e "\e[40;31;1m$1\e[0m"
}


# Take a full backup of all data into a directory named after the backup date
function backup(){
 pg_basebackup -D /data/backupsets/bkdata_$DATE/ -R -Ft -Pv -Upostgres
}
# Start the backup
green_echo "Begin backup data At Date:`date`"
backup
green_echo "End backup data At Date:`date`"


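# Delete backup directories more than $REV_DATE day(s) old
if cd "$BACKDIR"; then
  # Only match the bkdata_* directories directly under $BACKDIR, never $BACKDIR itself
  find "$BACKDIR" -mindepth 1 -maxdepth 1 -type d -name 'bkdata_*' -mtime +"$REV_DATE" -exec rm -rvf {} +
  green_echo "delete Success"
else
  red_echo "delete fail please check log!!!"
  exit 1
fi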
-bash-4.2$

Hot backup with pg_basebackup

[root@mysql1 data]# vim pg_hba.conf
local all all trust

# IPv4 local connections:
host all all 127.0.0.1/32 trust
# IPv6 local connections:
host all all ::1/128 trust
host all all 0.0.0.0/0 md5
# Allow replication connections from localhost, by a user with the
# replication privilege.

local replication all trust
host replication all 0.0.0.0/0 md5
host replication all 127.0.0.1/32 trust
host replication all ::1/128 trust

Turn off the remote archiving that was configured earlier
[root@mysql1 data]# vim /pgsql/data/postgresql.conf

archive_command = 'scp %p [email protected]:/archive/%f'

[root@mysql1 data]# systemctl restart postgresql.service

Run the following on the backup server
The backup directory /pgsql/backup/ must be empty
mkdir -p /pgsql/backup/

Take the backup
pg_basebackup -D /pgsql/backup/ -Ft -Upostgres -h192.168.57.110 -R

root@zhaohuakang:/pgsql/backup# ls /pgsql/backup/
backup_manifest base.tar pg_wal.tar

Restoring from the full backup
Run the following on the backup server to restore
Create the directory that will hold the archived WAL
chown -R postgres. /pgsql/backup/

su - postgres
pg_ctl stop
mkdir /archive/
chown postgres.postgres /archive/
rm -rf /archive/*
rm -rf /pgsql/data/*
Extract the backup files into the data directory to restore
tar xf /pgsql/backup/base.tar -C /pgsql/data/
tar xf /pgsql/backup/pg_wal.tar -C /archive/
Edit the configuration file
postgres@zhaohuakang:~$ vim /pgsql/data/postgresql.conf
restore_command = 'cp /archive/%f %p'
recovery_target = 'immediate'
Start the server
pg_ctl start

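A hands-on case: recovering from an accidental drop with PITR
A backup runs every day at 2:00, and data is dropped by mistake at 10:00 the next day. How do we recover?
Recovery process
Back up the data and the archived WAL
Restore procedure
Restore the full backup
Replay archived WAL: the WAL contained in the backup, the archives between 2:00 and 10:00, and the online redo

Enable archiving on the primary server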
[root@mysql1 data]# vim postgresql.conf
archive_mode = on
archive_command = '[ ! -f /archive/%f ] && cp %p /archive/%f'
[root@mysql1 data]# systemctl restart postgresql.service
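The archive_command above refuses to overwrite a segment that is already in the archive. The same logic is sketched below as a shell function so it can be tried on ordinary files; archive_wal is a hypothetical name. A non-zero exit status is what tells PostgreSQL that archiving the segment failed, in which case it retries later.

```shell
#!/bin/sh
# archive_wal SRC DEST_DIR NAME
# Hypothetical helper mirroring the archive_command logic:
# copy SRC to DEST_DIR/NAME only if that file does not exist yet.
# Returns non-zero when the target already exists or the copy fails.
archive_wal() {
    src=$1
    dest_dir=$2
    name=$3
    test ! -f "$dest_dir/$name" && cp "$src" "$dest_dir/$name"
}

# Equivalent archive_command form, as used above:
#   archive_command = '[ ! -f /archive/%f ] && cp %p /archive/%f'
```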

Create test data
postgres=# create database testdb;
postgres=# \c testdb
testdb=# create table t1(id int);
testdb=# insert into t1 values(1);

Take a remote backup of the database from the backup server
rm -rf /pgsql/backup/*
pg_basebackup -D /pgsql/backup/ -Ft -Upostgres -h192.168.57.110 -R
chown -R postgres. /pgsql/backup/

Generate more test data on the database
testdb=# insert into t1 values(2);
testdb=# insert into t1 values(3);

Simulate dropping the database
testdb=# \c db1;
db1=# drop database testdb;

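The failure is discovered; stop user access
Check the current WAL file
db1=# select pg_walfile_name(pg_current_wal_lsn());
     pg_walfile_name
--------------------------
 000000010000000000000029
Check the current transaction ID
db1=# select txid_current();
 txid_current
--------------
          898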

Recovering from the failure
Switch the WAL on the database server so the current segment gets archived
db1=# select pg_switch_wal();

Stop the service on the server to be restored and prepare for the restore
su - postgres
pg_ctl stop
rm -rf /archive/*
rm -rf /pgsql/data/*
Extract the backup files into the data directory to restore
tar xf /pgsql/backup/base.tar -C /pgsql/data/
tar xf /pgsql/backup/pg_wal.tar -C /archive/
Copy the archived WAL from the database server to the restore (test) server. This is still run on the restore server, but as root.
root@zhaohuakang:/backup# rsync -a 192.168.57.110:/archive/ /archive/

Find the transaction ID at the point of failure
root@zhaohuakang:/backup# pg_waldump /archive/000000010000000000000029 |grep DROP
rmgr: Database len (rec/tot): 38/ 38, tx: 897, lsn: 0/29001128, prev 0/290010B0, desc: DROP dir 1663/16543
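The transaction ID of this record is 897, so the previous transaction is 896. Because recovery_target_inclusive defaults to on, recovery stops right after transaction 896 commits.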

Edit the configuration file
postgres@zhaohuakang:~$ vim /pgsql/data/postgresql.conf
restore_command = 'cp /archive/%f %p'
recovery_target_xid = '896'
Start the server
su - postgres
pg_ctl start

Verify the data
postgres@zhaohuakang:~$ psql
postgres=# \c testdb
testdb=# select * from t1;
 id
----
  1
  2
  3
Writes are not possible at this point
testdb=# insert into t1 values(3);
ERROR: cannot execute INSERT in a read-only transaction

Return to normal (read-write) mode
testdb=# select pg_wal_replay_resume();

Dump the single database, including its CREATE DATABASE command, and send it to the original database server
pg_dump -U postgres -C -f /backup/testdb testdb
scp /backup/testdb 192.168.57.110:/backup/testdb

Load the dump into the database
cd /backup/
psql <testdb

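At this point the database is in archive recovery; the cluster state can be checked with the pg_controldata command:

    8. Switch the database state

    Run the pg_ctl promote command.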
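A sketch of how the state check could be scripted: pg_controldata prints a "Database cluster state:" line, and the hypothetical helper cluster_state below extracts its value from that output. The decision to promote is left to the operator.

```shell
#!/bin/sh
# cluster_state: read pg_controldata output on stdin and print only the
# value of the "Database cluster state:" line.
cluster_state() {
    sed -n 's/^Database cluster state:[[:space:]]*//p'
}

# Usage (requires pg_controldata and a data directory):
#   pg_controldata "$PGDATA" | cluster_state
#   # if it reports "in archive recovery", pg_ctl promote ends recovery
```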


Reposted from blog.csdn.net/jnrjian/article/details/129945206