Postgres master-slave data synchronization

Table of contents

Three methods and comparison of Postgres data backup

① Archive

② Stream replication

③ Logical replication

Stream replication

① Principle

② Configuration

1. Modify the master database configuration file postgresql.conf

2. Create a replication account on the master database

3. Modify the master database configuration file pg_hba.conf

4. Restart the master database service

5. Take an online base backup of the master database from the slave

6. Modify the slave database configuration file recovery.conf

7. Restart the slave database service

8. Check the master and slave database services, WAL logs and master-slave identification

9. Synchronization test

10. Synchronous stream replication

③ Cancel master-slave synchronous stream replication

④ Reset master-slave synchronous stream replication

Three methods and comparison of Postgres data backup

① Archive

   WAL files are copied by archive_command as each segment is completed, so an archive-based backup usually lags the master database by up to one WAL file.

② Stream replication

   Stream replication, also called physical replication, produces an instance-level slave database that is an exact copy of the master database. It has two synchronization modes: synchronous and asynchronous.

> Asynchronous replication gives better performance, but if the master database goes down and the slave database is promoted to master, any WAL that had not yet been sent to the slave is missing, which may cause data loss.

> Synchronous replication ensures that every transaction committed on the master database has been transmitted to the slave database, which makes replication safer at the cost of performance.

③ Logical replication

   Physical replication works at the instance level: it can only replicate an entire PostgreSQL instance, not selected databases or tables. Logical replication, available since PostgreSQL 10, replicates at the table level.

Stream replication

① Principle

By default, streaming replication works in asynchronous mode. The master database writes WAL, and its wal sender process sends it to the wal receiver process on the slave database, which receives the WAL and writes it to storage.

The startup process on the slave database replays the WAL written to disk and applies the changes to the data pages, which keeps master and slave in sync. Synchronous mode builds on the asynchronous mechanism and adds a synchronization level for transaction commit, set by synchronous_commit: remote_write waits until the slave has received all of the transaction's WAL and written it out with write(), without waiting for it to be flushed to disk; on waits until the slave has flushed it to disk; remote_apply waits until the slave has replayed it onto the data pages.

Note: the three figures referenced below come from the Internet and are not reproduced here.

[Figure: overall architecture of PG master and standby]

[Figure: PG streaming replication process]

[Figure: PG standby mode and WAL apply process]

② Configuration

Both the master database and the slave database run PostgreSQL 10.11.

1. Modify the master database configuration file postgresql.conf

  Note: in addition to the basic parameters, at least the following parameters are required

  listen_addresses = '*'

  wal_level = replica

  max_connections = 100

  archive_mode = on

  archive_command = 'test ! -f /mnt/server/archive/%f && cp %p /mnt/server/archive/%f'

  max_wal_senders = 10

  wal_keep_segments = 60

  hot_standby = on

The archive_command above refers to an archive directory, which must be created manually and be writable by the postgres user:

mkdir -p /mnt/server/archive/
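To see what the archive_command above does, here is a minimal sketch that runs the same "test ! -f ... && cp ..." shape against temporary directories. PostgreSQL substitutes %p with the path of the WAL file to archive and %f with its file name. The temporary directories, the sample segment name and the archive_segment function are stand-ins invented for this illustration.

```shell
# Stand-ins for pg_wal and /mnt/server/archive/ (temporary, illustration only)
wal_dir=$(mktemp -d)
archive_dir=$(mktemp -d)

# A fake WAL segment; the name only mimics the real 24-character format
segment=000000010000000000000001
echo "fake segment" > "$wal_dir/$segment"

# Hypothetical helper with the same shape as the archive_command:
#   $1 plays the role of %p (path to the file), $2 the role of %f (file name)
archive_segment() {
  test ! -f "$archive_dir/$2" && cp "$1" "$archive_dir/$2"
}

# First call: the target does not exist yet, so the copy runs and returns 0
archive_segment "$wal_dir/$segment" "$segment" && first_status=0 || first_status=$?

# Second call: the target already exists, so test fails and nothing is overwritten
archive_segment "$wal_dir/$segment" "$segment" && second_status=0 || second_status=$?

echo "first=$first_status second=$second_status"

# Clean up the temporary directories
rm -rf "$wal_dir" "$archive_dir"
```

A non-zero exit status tells PostgreSQL that archiving failed, so it keeps the segment and retries later; refusing to overwrite an existing file protects segments that are already in the archive.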

Parameter Description:

listen_addresses: set as needed. This test accepts connections on all interfaces; in production, restrict the listening addresses as required.

wal_level: streaming replication requires at least replica.

archive_mode: enables WAL archiving.

archive_command: the command used to archive WAL files. In production it can copy the files to a dedicated directory or to another machine; this test archives to another directory on the same machine.

max_wal_senders: the maximum number of WAL sender processes. It must be greater than or equal to the number of slave databases and smaller than max_connections.

wal_keep_segments: the minimum number of WAL files kept in the pg_wal directory. Each WAL file is 16 MB by default. A larger value is recommended so that a slave database that falls behind can still catch up from the master database.

hot_standby: controls whether read-only queries are allowed during recovery. When set to on, the slave database accepts read-only queries.
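As a rough sizing aid, the disk space that wal_keep_segments reserves in pg_wal can be estimated from the values above. This is a minimal sketch that assumes the default 16 MB segment size; the variable names are made up for the example.

```shell
wal_keep_segments=60   # value from the postgresql.conf above
wal_segment_mb=16      # default WAL segment size in MB (assumed unchanged)

# Minimum amount of WAL retained in pg_wal for slaves that fall behind
retained_mb=$((wal_keep_segments * wal_segment_mb))

echo "pg_wal retains at least ${retained_mb} MB of WAL"
```

With the settings in this test that is 960 MB, so the volume holding pg_wal needs at least that much headroom.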

2. Create a replication account on the master database

  postgres=# CREATE ROLE replicauser LOGIN REPLICATION ENCRYPTED PASSWORD 'replicauser';

3. Modify the master database configuration file pg_hba.conf

  The master database server is 192.168.20.7/24 and the slave database server is 192.168.20.5/24.

  Add an entry that allows user replicauser to make replication connections from the slave database server.
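The pg_hba.conf entry itself is not shown in the text, so the following is only a plausible sketch. It assumes md5 password authentication and a /32 mask that admits just the slave host; hba_replication_line is a hypothetical helper that prints the line for a given role and address.

```shell
# Hypothetical helper: print a pg_hba.conf line that lets one role open
# replication connections from a single host, using md5 authentication
hba_replication_line() {
  printf 'host    replication    %s    %s/32    md5\n' "$1" "$2"
}

# Role and slave address taken from the setup described above
hba_line=$(hba_replication_line replicauser 192.168.20.5)
echo "$hba_line"

# To apply it, append the line to pg_hba.conf on the master, for example:
#   echo "$hba_line" >> /var/lib/pgsql/10/data/pg_hba.conf
```

The database column must be the keyword replication; an entry for all does not cover replication connections.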

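4. Restart the master database service

  Changes to pg_hba.conf take effect on reload, but wal_level, archive_mode, max_wal_senders and listen_addresses only take effect after a restart.

  # su postgres

   /usr/pgsql-10/bin/pg_ctl reload -D /var/lib/pgsql/10/data/

   /usr/pgsql-10/bin/pg_ctl restart -D /var/lib/pgsql/10/data/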

  

5. Take an online base backup of the master database from the slave

Put the data in the specified path, which should preferably match the data path on the master database. Before running the command below, move aside or empty the slave's data directory /var/lib/pgsql/10/data. After the command finishes, that directory holds a copy of the master's data, and its pg_wal directory contains the WAL up to the end of the backup.

[root@centos7min2 bin]# su postgres

bash-4.2$ ./pg_basebackup  -h 192.168.20.7 -U replicauser -p 5432 -F p   -X s  -v -P -R -D /var/lib/pgsql/10/data/ -l postgres32

pg_basebackup: initiating base backup, waiting for checkpoint to complete

pg_basebackup: checkpoint completed

pg_basebackup: write-ahead log start point: 0/13000028 on timeline 2

pg_basebackup: starting background WAL receiver

32368/32368 kB (100%), 1/1 tablespace                                        

pg_basebackup: write-ahead log end point: 0/130000F8

pg_basebackup: waiting for background process to finish streaming ...

pg_basebackup: base backup completed

Parameter description in the pg_basebackup command:

-h specifies the host name or IP address of the database to connect to, here the IP of the master database

-U specifies the user name for the connection, here replicauser, the account created above for streaming replication

-p specifies the port of the master database

-F specifies the output format of the backup: p (plain, files written as they are) or t (tar)

-X specifies how the WAL needed by the backup is collected: f (fetch) collects it at the end of the backup, s (stream) opens a second replication connection and streams it while the backup runs; s is recommended

-P shows the approximate percentage of data transferred, so the backup progress is printed while the backup runs

-v enables verbose mode, which prints a log line for each stage of the command; enabling it is recommended

-R writes a recovery.conf file into the backup directory, which avoids creating it manually

-D specifies the directory to write the backup to. Note that the data directory of the slave database (/var/lib/pgsql/10/data) must be emptied manually before taking the base backup

-l sets a label for the backup, here postgres32
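The write-ahead log start and end points printed by pg_basebackup above are LSNs, written as two hexadecimal numbers separated by a slash. As a minimal sketch, the amount of WAL generated while the backup ran can be computed from them in the shell. lsn_to_bytes is a hypothetical helper, and it assumes a shell with 64-bit arithmetic.

```shell
# Hypothetical helper: convert an LSN of the form HI/LO (two hex numbers)
# into an absolute byte position in the WAL stream
lsn_to_bytes() {
  hi=${1%/*}   # part before the slash
  lo=${1#*/}   # part after the slash
  echo $(( (0x$hi << 32) + 0x$lo ))
}

# Start and end points taken from the pg_basebackup output above
start_lsn=0/13000028
end_lsn=0/130000F8

wal_bytes=$(( $(lsn_to_bytes "$end_lsn") - $(lsn_to_bytes "$start_lsn") ))
echo "WAL generated during the backup: $wal_bytes bytes"
```

For the run shown above this gives 208 bytes, which is what one would expect on an idle test server. Inside the database, the pg_wal_lsn_diff() function performs the same subtraction.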

6. Modify the slave database configuration file recovery.conf

   The file is generated automatically by the pg_basebackup command (option -R). It can also be created by copying recovery.conf.sample from /usr/pgsql-10/share/ and making the following adjustments.

[root@centos7min2 data]# cat recovery.conf

standby_mode = 'on'

primary_conninfo = 'user=replicauser host=192.168.20.7 port=5432 password=replicauser'

recovery_target_timeline = 'latest'

trigger_file = '/tmp/failover'

Parameter Description:

standby_mode: whether to start the database as a standby. When set to on, the slave database keeps fetching WAL from the master database instead of stopping once it reaches the end of the available WAL.

primary_conninfo: the connection information for the master database: IP, port, user name and so on. The password here is in plain text; in production, keep it out of this file and store it in a separate hidden password file instead.

recovery_target_timeline: the timeline to recover along. By default, recovery follows the timeline that was current when the base backup was taken. Setting it to latest follows the newest timeline, which is the usual choice for stream replication; more complex recovery scenarios may use other values.

trigger_file: when the file named by trigger_file appears, the slave database ends recovery and is promoted to master, and recovery.conf is renamed to recovery.done.
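One way to keep the password out of recovery.conf is the standard ~/.pgpass file of the postgres OS user on the slave. The sketch below builds such an entry; pgpass_line is a hypothetical helper, and the values are the ones used in this test. For replication connections the database field is the keyword replication.

```shell
# Hypothetical helper: build a ~/.pgpass line in the format
#   host:port:database:user:password
pgpass_line() {
  printf '%s:%s:replication:%s:%s\n' "$1" "$2" "$3" "$4"
}

# Master address, port, role and password from the setup described above
pgpass_entry=$(pgpass_line 192.168.20.7 5432 replicauser replicauser)
echo "$pgpass_entry"

# To use it on the slave, as the postgres OS user:
#   echo "$pgpass_entry" >> ~/.pgpass && chmod 600 ~/.pgpass
# and then drop password=... from primary_conninfo in recovery.conf.
```

The file must not be readable by group or others, otherwise libpq ignores it.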

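7. Restart the slave database service

In this test pg_ctl restart failed because the freshly copied data directory had no postmaster.pid or postmaster.opts; starting the server directly with postmaster succeeded.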

bash-4.2$ ./pg_ctl restart -D /var/lib/pgsql/10/data/

pg_ctl: PID file "/var/lib/pgsql/10/data/postmaster.pid" does not exist

Is server running?

starting server anyway

pg_ctl: could not read file "/var/lib/pgsql/10/data/postmaster.opts"

/usr/pgsql-10/bin/postmaster -D /var/lib/pgsql/10/data/

8. Check the master-slave database service, WAL log and master-slave identification

On the master database:

On the slave database:
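The check results are not reproduced in the text. One common way to tell the two roles apart is the pg_is_in_recovery() function, which returns false on the master and true on a slave. The sketch below only maps that flag to a role name; role_from_recovery_flag is a hypothetical helper, and the psql call in the comment assumes a local postgres superuser.

```shell
# Hypothetical helper: map the result of SELECT pg_is_in_recovery()
# (printed as t or f by psql -At) to a role name
role_from_recovery_flag() {
  case "$1" in
    t) echo "slave" ;;
    f) echo "master" ;;
    *) echo "unknown" ;;
  esac
}

master_role=$(role_from_recovery_flag f)
slave_role=$(role_from_recovery_flag t)
echo "f -> $master_role, t -> $slave_role"

# Against a running server:
#   flag=$(psql -U postgres -Atc 'SELECT pg_is_in_recovery();')
#   role_from_recovery_flag "$flag"
```

The process list gives a second hint: the master runs a wal sender process and the slave runs a wal receiver process.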

9. Synchronization test

Create a database named test on the master database; it then appears on the slave database as well.

10. Synchronous stream replication

The configuration above provides asynchronous stream replication; the sync_state column of the pg_stat_replication view shows async.

1) Set application_name

Add application_name to the primary_conninfo parameter in the slave's recovery.conf. The default value is walreceiver, so a name that identifies this instance needs to be specified.

[root@centos7min2 data]# cat recovery.conf

standby_mode = 'on'

primary_conninfo = 'user=replicauser host=192.168.20.7 port=5432 password=replicauser application_name=standby01'

recovery_target_timeline = 'latest'

trigger_file = '/tmp/failover'

2) After restarting the slave database service, query the replication status on the master database

3) Modify postgresql.conf on the master database

synchronous_standby_names = 'standby01'

synchronous_commit = remote_apply       # synchronization level;

                                      # off, local, remote_write, remote_apply, or on

4) Restart the master database service, then query the synchronization status

sync_state = sync
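If sync_state stays at async, a frequent cause is that the application_name in the slave's primary_conninfo does not match synchronous_standby_names on the master. The following sketch compares the two values used above. It assumes a single plain standby name without quoting and does not handle the FIRST or ANY syntax.

```shell
# Values from recovery.conf on the slave and postgresql.conf on the master
primary_conninfo='user=replicauser host=192.168.20.7 port=5432 application_name=standby01'
synchronous_standby_names='standby01'

# Extract the application_name keyword from the space-separated conninfo string
app_name=$(printf '%s\n' "$primary_conninfo" | tr ' ' '\n' | sed -n 's/^application_name=//p')

if [ "$app_name" = "$synchronous_standby_names" ]; then
  match=yes
else
  match=no
fi
echo "application_name=$app_name match=$match"

# On a running master, the current state can be queried with:
#   psql -U postgres -Atc 'SELECT application_name, state, sync_state FROM pg_stat_replication;'
```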

③ Cancel master-slave synchronous stream replication

   To cancel master-slave stream replication, move or delete recovery.conf in the data directory of the slave database, then restart the postgres service on the slave.

   Check the postgres processes and replication status on the slave database

   Check the postgres processes and replication status on the master database

④ Reset master-slave synchronous stream replication

   To set replication up again, repeat the steps in ② Configuration. If recovery.conf was only moved aside, put it back and restart the master and slave database services.


Origin blog.csdn.net/Wemesun/article/details/126213248