Table of contents
Three methods and comparison of Postgres data backup
1. Modify the primary configuration file postgresql.conf
2. Create a replication account on the primary
3. Modify the primary configuration file pg_hba.conf
4. Restart the primary service
5. Take an online base backup of the primary from the standby
6. Modify the standby configuration file recovery.conf
8. Check the primary and standby services, WAL log and role identification
10. Synchronous streaming replication
③ Cancel primary-standby synchronous streaming replication
④ Reset primary-standby synchronous streaming replication
Three methods and comparison of Postgres data backup
① Archive-based backup
Backups are produced by the archive command (archive_command), so the backup is usually about one WAL file behind the primary.
② Streaming replication
Streaming replication, also called physical replication, produces a standby that is an exact instance-level copy of the primary. It has two synchronization modes: synchronous and asynchronous.
> Asynchronous replication performs better, but it has a drawback: if the primary goes down, or the standby is promoted to primary, any WAL not yet sent to the standby is missing there, which may cause data loss.
> Synchronous replication ensures that every transaction's changes on the primary are transmitted to the standby before the commit is acknowledged, which makes replication safer at the cost of performance.
③ Logical replication
Physical replication works at the instance level: it can only replicate the entire PostgreSQL instance, not selected databases or tables. Logical replication, available since PostgreSQL 10, replicates at the table level.
By default, streaming replication works in asynchronous mode. The primary writes WAL and its wal sender process sends that WAL to the wal receiver process on the standby. The wal receiver receives the WAL and persists it to storage.
The startup process on the standby replays the WAL that has been written to disk and applies the changes to the data pages, which keeps the standby in step with the primary. Synchronous mode builds on the asynchronous mode and additionally specifies how far a transaction commit must have progressed on the standby: remote_write waits until the standby has received all of the transaction's WAL and written it out with write(), without waiting for it to be flushed to disk; remote_apply waits until the standby has replayed all of the transaction's WAL into its data pages.
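The commit levels described above can be summarised as a small lookup table. The following Python sketch is only an illustration: the names COMMIT_LEVELS and waits_for_standby are made up for this example and are not part of PostgreSQL, and the descriptions paraphrase how the synchronous_commit values behave when a synchronous standby is configured.

```python
# Illustrative summary of synchronous_commit levels, weakest guarantee first.
# COMMIT_LEVELS and waits_for_standby are hypothetical names for this sketch.
COMMIT_LEVELS = [
    ("off", "commit returns before the WAL is flushed on the primary"),
    ("local", "WAL is flushed to disk on the primary only"),
    ("remote_write", "standby has received the WAL and written it, not yet flushed"),
    ("on", "standby has flushed the WAL to durable storage"),
    ("remote_apply", "standby has replayed the WAL, so the change is visible there"),
]


def waits_for_standby(level: str) -> bool:
    """Return True if a commit at this level waits for a synchronous standby."""
    names = [name for name, _ in COMMIT_LEVELS]
    if level not in names:
        raise ValueError(f"unknown synchronous_commit level: {level}")
    return names.index(level) >= names.index("remote_write")


for name, meaning in COMMIT_LEVELS:
    print(f"{name:13} waits for standby: {waits_for_standby(name)!s:5}  ({meaning})")
```

The ordering makes the trade-off visible: each step down the list waits for more work on the standby, so it is safer and slower than the one before it.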
Note: The following three figures are from the Internet.
Figure: overall architecture of a PG primary and standby
Figure: PG streaming replication process
Figure: PG standby mode and WAL apply process
Both the primary and the standby run PostgreSQL 10.11.
1. Modify the primary configuration file postgresql.conf
Note: In addition to the basic parameters, at least the following parameters are required:
listen_addresses = '*'
wal_level = replica
max_connections = 100
archive_mode = on
archive_command = 'test ! -f /mnt/server/archive/%f && cp %p /mnt/server/archive/%f'
max_wal_senders = 10
wal_keep_segments = 60
hot_standby = on
The archive_command above refers to an archive directory, which must be created manually and be writable by the postgres user:
mkdir -p /mnt/server/archive/
Parameter description:
listen_addresses: the local addresses the server listens on. This test uses '*' (all interfaces); a production environment can restrict it as needed.
wal_level: streaming replication requires at least replica.
archive_mode: enables WAL archiving.
archive_command: the command that archives a WAL file. A production environment can copy the archive to a suitable directory or to another machine; this test archives to another directory on the same machine.
max_wal_senders: the maximum number of WAL sender processes. It must be greater than or equal to the number of standbys and smaller than max_connections.
wal_keep_segments: the number of WAL files kept in the pg_wal directory. Each WAL file is 16 MB by default. A larger value is recommended so that a standby that has fallen behind can still catch up with the primary.
hot_standby: controls whether read-only queries are allowed during archive recovery. When set to on, the standby accepts read-only queries.
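The two numeric rules in the parameter description (the max_wal_senders bounds and the disk space implied by wal_keep_segments) can be checked with a few lines of arithmetic. This is a minimal Python sketch; the function names are hypothetical, and the 16 MB segment size is the default mentioned above.

```python
WAL_SEGMENT_MB = 16  # default WAL segment size, as stated in the text


def retained_wal_mb(wal_keep_segments: int) -> int:
    """Approximate minimum amount of WAL kept in pg_wal for lagging standbys."""
    return wal_keep_segments * WAL_SEGMENT_MB


def check_sender_limits(max_wal_senders: int, max_connections: int, standbys: int) -> list:
    """Return a list of problems with the WAL sender settings (empty if fine)."""
    problems = []
    if max_wal_senders < standbys:
        problems.append("max_wal_senders is smaller than the number of standbys")
    if max_wal_senders >= max_connections:
        problems.append("max_wal_senders must be smaller than max_connections")
    return problems


print(retained_wal_mb(60))              # 60 segments * 16 MB = 960
print(check_sender_limits(10, 100, 1))  # []
```

With the values used in this test, pg_wal retains at least about 960 MB of WAL, and the sender limits are consistent for a single standby.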
2. Create a replication account on the primary
postgres=# CREATE ROLE replicauser LOGIN REPLICATION ENCRYPTED PASSWORD 'replicauser';
3. Modify the primary configuration file pg_hba.conf
Primary server: 192.168.20.7/24. Standby server: 192.168.20.5/24.
Allow the user replicauser to make replication connections from the standby.
4. Restart the primary service
# su postgres
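The exact pg_hba.conf entry is not shown in the text. A replication entry follows the usual column order TYPE, DATABASE, USER, ADDRESS, METHOD, with the special database name replication. The Python sketch below builds such a line for the standby address given above. The helper name is hypothetical, and the md5 method is an assumption chosen to match the password-based connection used later in primary_conninfo.

```python
import ipaddress


def hba_replication_line(user: str, standby_ip: str, method: str = "md5") -> str:
    """Build a pg_hba.conf entry: TYPE DATABASE USER ADDRESS METHOD."""
    # Validate the address and express it as a single-host CIDR (/32 for IPv4).
    addr = ipaddress.ip_address(standby_ip)
    cidr = f"{addr}/{addr.max_prefixlen}"
    # The database name "replication" matches physical replication connections.
    return f"host    replication    {user}    {cidr}    {method}"


print(hba_replication_line("replicauser", "192.168.20.5"))
```

Restricting the address to the single standby host, rather than the whole /24 network, keeps the replication role from being usable from other machines.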
/usr/pgsql-10/bin/pg_ctl reload -D /var/lib/pgsql/10/data/
/usr/pgsql-10/bin/pg_ctl restart -D /var/lib/pgsql/10/data/
5. Take an online base backup of the primary from the standby
The backup is written to the specified path, which should preferably match the data directory path on the primary. Before running the command below, empty the standby's data directory /var/lib/pgsql/10/data. After the command completes, that directory holds a copy of the primary's data, and its pg_wal directory contains WAL up to the most recent segment.
[root@centos7min2 bin]# su postgres
bash-4.2$ ./pg_basebackup -h 192.168.20.7 -U replicauser -p 5432 -F p -X s -v -P -R -D /var/lib/pgsql/10/data/ -l postgres32
pg_basebackup: initiating base backup, waiting for checkpoint to complete
pg_basebackup: checkpoint completed
pg_basebackup: write-ahead log start point: 0/13000028 on timeline 2
pg_basebackup: starting background WAL receiver
32368/32368 kB (100%), 1/1 tablespace
pg_basebackup: write-ahead log end point: 0/130000F8
pg_basebackup: waiting for background process to finish streaming ...
pg_basebackup: base backup completed
Parameter description in the pg_basebackup command:
-h specifies the host name or IP address of the database to connect to, here the IP of the primary
-U specifies the user name for the connection, here the replicauser account created above for streaming replication
-p specifies the port of the database to connect to
-F specifies the output format of the backup: p (plain, files written as they are) or t (tar format)
-X selects how the WAL generated during the backup is collected: f (fetch) collects it at the end of the backup, s (stream) opens a second replication connection and streams it from the primary while the backup runs. The s method is recommended
-P displays the approximate percentage of data files and tablespaces transferred, so the backup progress is printed while the backup runs
-v enables verbose mode, which prints a log line for each stage of the command; enabling it is recommended
-R generates the recovery.conf file automatically after the backup, which avoids creating it by hand
-D specifies the directory to write the backup to. Note that the data directory of the standby (/var/lib/pgsql/10/data) must be emptied manually before taking the base backup
-l sets a label for the backup (postgres32 here)
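The options above can be assembled programmatically, which is convenient when the same backup is scripted for several standbys. This is a minimal Python sketch: the function name is hypothetical, and the values passed in are the ones used in this walkthrough.

```python
def basebackup_args(host, user, data_dir, label, port=5432):
    """Return the pg_basebackup argument list used in this walkthrough."""
    return [
        "pg_basebackup",
        "-h", host, "-U", user, "-p", str(port),
        "-F", "p",      # plain format
        "-X", "s",      # stream WAL while the backup runs
        "-v", "-P",     # verbose output and progress display
        "-R",           # write recovery.conf into the backup
        "-D", data_dir,
        "-l", label,
    ]


args = basebackup_args("192.168.20.7", "replicauser",
                       "/var/lib/pgsql/10/data/", "postgres32")
print(" ".join(args))
# To run it for real (as the postgres user, with an empty data directory):
#   import subprocess; subprocess.run(args, check=True)
```

Passing the arguments as a list rather than a single shell string avoids quoting problems if a path or label ever contains spaces.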
6. Modify the standby configuration file recovery.conf
The file is generated automatically by the pg_basebackup command (the -R option). Alternatively, copy recovery.conf.sample from /usr/pgsql-10/share/ and make the following adjustments:
[root@centos7min2 data]# cat recovery.conf
standby_mode = 'on'
primary_conninfo = 'user=replicauser host=192.168.20.7 port=5432 password=replicauser'
recovery_target_timeline = 'latest'
trigger_file = '/tmp/failover'
Parameter description:
standby_mode: whether the database runs as a standby. If set to on, the standby keeps fetching and applying the WAL stream from the primary instead of ending recovery when it reaches the end of the available WAL.
primary_conninfo: the connection information for the primary: IP, port, user name and so on. The password here is in plaintext. In a production environment it is recommended not to put the plaintext password here but to store it in a separate hidden file (for example a .pgpass file).
recovery_target_timeline: the timeline to recover along. By default, recovery follows the timeline that was current when the base backup was taken. Setting it to latest means recovering to the latest timeline. A streaming replication setup usually sets this parameter to latest; complex recovery scenarios may use other values.
trigger_file: when the standby detects that the file named by trigger_file exists, it ends recovery and is promoted to primary, and recovery.conf is renamed to recovery.done.
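The four settings described above can also be rendered from a few inputs, for example when provisioning more than one standby. The Python sketch below is illustrative only: the function name is hypothetical, the optional application_name argument corresponds to the connection keyword of the same name, and values are not escaped, so they must not contain single quotes.

```python
def render_recovery_conf(host, user, password, port=5432,
                         application_name=None, trigger_file="/tmp/failover"):
    """Render recovery.conf text with the four settings used in this walkthrough."""
    conninfo = f"user={user} host={host} port={port} password={password}"
    if application_name:
        conninfo += f" application_name={application_name}"
    settings = {
        "standby_mode": "on",
        "primary_conninfo": conninfo,
        "recovery_target_timeline": "latest",
        "trigger_file": trigger_file,
    }
    # No escaping is done here: values must not contain single quotes.
    return "".join(f"{key} = '{value}'\n" for key, value in settings.items())


print(render_recovery_conf("192.168.20.7", "replicauser", "replicauser"), end="")
```

Called with the addresses from this test, it reproduces the recovery.conf listed above line for line.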
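7. Restart the standby service
pg_ctl restart failed here, so the server was started directly with postmaster. The freshly restored data directory has no postmaster.pid or postmaster.opts yet because no server has run from it, and pg_ctl restart needs postmaster.opts; pg_ctl start should not.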
bash-4.2$ ./pg_ctl restart -D /var/lib/pgsql/10/data/
pg_ctl: PID file "/var/lib/pgsql/10/data/postmaster.pid" does not exist
Is server running?
starting server anyway
pg_ctl: could not read file "/var/lib/pgsql/10/data/postmaster.opts"
/usr/pgsql-10/bin/postmaster -D /var/lib/pgsql/10/data/
8. Check the primary and standby services, WAL log and role identification
Primary:
Standby:
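The text does not spell out the commands for this check. One common way to tell the two roles apart is the "Database cluster state" line printed by pg_controldata, which reads "in production" on a primary and "in archive recovery" on a standby. The Python sketch below classifies a server from that output; the function name and the sample string are made up for illustration.

```python
def cluster_role(controldata_output: str) -> str:
    """Classify a server from the text printed by pg_controldata."""
    for line in controldata_output.splitlines():
        if line.startswith("Database cluster state:"):
            state = line.split(":", 1)[1].strip()
            if state == "in production":
                return "primary"
            if state == "in archive recovery":
                return "standby"
            return f"other ({state})"
    raise ValueError("no 'Database cluster state' line found")


# Hypothetical one-line excerpt of pg_controldata output on a standby.
sample = "Database cluster state:               in archive recovery\n"
print(cluster_role(sample))  # standby
```

On a running server the same question can be answered in SQL with SELECT pg_is_in_recovery();, which returns true on a standby.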
9. Synchronization test
Create the database test on the primary; the new database then appears on the standby as well.
10. Synchronous streaming replication
The configuration above provides asynchronous streaming replication: the sync_state column of the pg_stat_replication view shows async.
1) Set application_name
Update the primary_conninfo parameter in the standby's recovery.conf to set an application_name for this instance; if it is not set, the default value is walreceiver.
[root@centos7min2 data]# cat recovery.conf
standby_mode = 'on'
primary_conninfo = 'user=replicauser host=192.168.20.7 port=5432 password=replicauser application_name=standby01'
recovery_target_timeline = 'latest'
trigger_file = '/tmp/failover'
2) Restart the standby service, then query the status on the primary
3) Modify postgresql.conf on the primary
synchronous_standby_names = 'standby01'
synchronous_commit = remote_apply # synchronization level;
# off, local, remote_write, remote_apply, or on
4) Restart the primary service, then query the synchronization status
sync_state = sync
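The status check in this step can be automated by looking at the rows of the pg_stat_replication view on the primary. The Python sketch below takes rows that have already been fetched (for instance with the query in QUERY) and reports whether a named standby is streaming synchronously. The function name and the sample row are hypothetical; the column names are those of pg_stat_replication.

```python
# Query to run on the primary; the columns exist in pg_stat_replication.
QUERY = ("SELECT application_name, client_addr, state, sync_state "
         "FROM pg_stat_replication;")


def standby_is_synchronous(rows, application_name):
    """rows: dicts with the columns application_name, state and sync_state."""
    for row in rows:
        if row["application_name"] == application_name:
            return row["state"] == "streaming" and row["sync_state"] == "sync"
    return False  # the standby is not connected at all


# Hypothetical row, shaped like the result expected after step 4.
rows = [{"application_name": "standby01", "state": "streaming", "sync_state": "sync"}]
print(standby_is_synchronous(rows, "standby01"))  # True
```

Treating a missing row as "not synchronous" matters in practice: with remote_apply configured, commits on the primary wait for the standby, so a disconnected standby01 is a condition worth alerting on.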
③ Cancel primary-standby synchronous streaming replication
To cancel primary-standby synchronous streaming replication, move away or delete recovery.conf in the standby's data directory, then restart the postgres service on the standby.
Check the postgres processes and role identification on the standby:
Check the postgres processes and role identification on the primary:
④ Reset primary-standby synchronous streaming replication
To set up primary-standby synchronous replication again, repeat the configuration in ② step by step. If only recovery.conf was moved away, put it back and restart the primary and standby services.