Summary of several PostgreSQL HA deployments

1 Deployment architecture

[Figure: deployment architecture diagram]

2 Primary configuration

(Primary, ID20)

sed -i "s/#*max_replication_slots.*/max_replication_slots = 10/" $PGDATA/postgresql.conf
sed -i "s/#*max_wal_senders.*/max_wal_senders = 10/" $PGDATA/postgresql.conf
sed -i "s/#*wal_level.*/wal_level = replica/" $PGDATA/postgresql.conf
sed -i "s/#*archive_mode.*/archive_mode = on/" $PGDATA/postgresql.conf
sed -i "s/#*archive_command.*/archive_command = 'test ! -f \${PGHOME}\/archive\/%f \&\& cp %p \${PGHOME}\/archive\/%f'/" $PGDATA/postgresql.conf
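
The archive_command configured here copies a finished WAL segment into the archive only when no file of that name exists there yet; PostgreSQL treats a non-zero exit status as a failure and retries the segment later. The same logic can be sketched as a shell function. archive_wal is a hypothetical helper name, and the copy-to-temporary-name-then-rename step is an added assumption, not part of the original command:

```shell
# archive_wal SRC DEST_DIR NAME
# Hypothetical helper mirroring: test ! -f DEST_DIR/NAME && cp SRC DEST_DIR/NAME
archive_wal() {
  src=$1
  dest=$2/$3
  # Refuse to overwrite a segment that is already archived (non-zero = failure).
  [ ! -f "$dest" ] || return 1
  # Assumption beyond the original command: copy under a temporary name first,
  # so a partially written file never appears under the final name.
  cp "$src" "$dest.tmp" && mv "$dest.tmp" "$dest"
}
```

Saved as a script, such a helper would receive %p and %f from archive_command as its source path and file name.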

3 Archive recovery

(ID21)

3.1 Base backup (run on the primary)

Note 1: If the standby's data directory is created with initdb instead of being copied from the primary with a base backup, archive recovery fails with:
LOG: WAL file is from different database system
Note 2: Why archive?
If you use streaming replication without file-based continuous archiving, the primary may recycle old WAL segments before the standby has received them. When that happens, the standby has to be reinitialized from a new base backup. You can avoid this by setting wal_keep_segments high enough that old WAL segments are not recycled too early, or by configuring a replication slot for the standby. If the standby can access a WAL archive, neither measure is needed: as long as the archive retains enough segments, the standby can always use it to catch up with the primary.

Step 1: Configure replication access in pg_hba.conf

Step 2: pg_basebackup -Fp -P -x -D ~/app/data/pg_root21 -l basebackup21

3.2 Configuring archive recovery

cp $PGHOME/share/recovery.conf.sample $PGDATA/recovery.conf
sed -i "s/#*standby_mode.*/standby_mode = on/" $PGDATA/recovery.conf
sed -i "s/#*restore_command.*/restore_command = 'cp \/home\/gaomingjie\/app\/pgsql20\/archive\/%f %p'/" $PGDATA/recovery.conf

Standby log (in standby mode the server keeps retrying restore_command for the next WAL segment, so the cp errors repeat until that segment has been archived):

cp: cannot stat `/home/gaomingjie/app/pgsql20/archive/000000010000000000000004': No such file or directory
cp: cannot stat `/home/gaomingjie/app/pgsql20/archive/000000010000000000000004': No such file or directory
cp: cannot stat `/home/gaomingjie/app/pgsql20/archive/000000010000000000000004': No such file or directory
cp: cannot stat `/home/gaomingjie/app/pgsql20/archive/000000010000000000000004': No such file or directory
cp: cannot stat `/home/gaomingjie/app/pgsql20/archive/000000010000000000000004': No such file or directory
LOG:  restored log file "000000010000000000000004" from archive
cp: cannot stat `/home/gaomingjie/app/pgsql20/archive/000000010000000000000005': No such file or directory
cp: cannot stat `/home/gaomingjie/app/pgsql20/archive/000000010000000000000005': No such file or directory
cp: cannot stat `/home/gaomingjie/app/pgsql20/archive/000000010000000000000005': No such file or directory

Process Status:

/home/gaomingjie/app/pgsql21/bin/postgres
\_ postgres: startup process   recovering 000000010000000000000005
\_ postgres: checkpointer process           
\_ postgres: writer process

4 Asynchronous streaming replication

(ID22)

Streaming replication is asynchronous by default, so there is a small delay between a transaction committing on the primary and its changes becoming visible on the standby. This delay is much smaller than with file-based log shipping: provided the standby is powerful enough to keep up with the load, it is typically under one second. With streaming replication, archive_timeout is not needed to reduce the data loss window. On systems that support the keepalive socket option, setting tcp_keepalives_idle, tcp_keepalives_interval and tcp_keepalives_count helps the primary notice a broken connection promptly.

4.1 Base backup (run on the primary)

Step 1: Configure replication access in pg_hba.conf

Access privileges for replication must be set up so that only trusted users can read the WAL stream, because privileged information is easy to extract from it. The standby must authenticate to the primary as a superuser or as an account with the REPLICATION privilege. Creating a dedicated account with the REPLICATION and LOGIN privileges is recommended. Although REPLICATION is a powerful privilege, it does not allow the user to modify any data on the primary, which SUPERUSER does.

[Figure: pg_hba.conf and primary_conninfo = 'host=127.0.0.1 port=9420']
Configuration method 1 (not used in this example):

pg_hba.conf:

host    replication     gaomingjie        127.0.0.1/32            trust

Configuration method 2 (create a user, then authenticate by password):

create role foo login replication password 'server@123';

pg_hba.conf:

host    replication     foo               127.0.0.1/32            md5

Step 2: pg_basebackup -Fp -P -x -D ~/app/data/pg_root22 -l basebackup22

4.2 Streaming replication parameters

cp $PGHOME/share/recovery.conf.sample $PGDATA/recovery.conf
sed -i "s/#*standby_mode.*/standby_mode = on/" $PGDATA/recovery.conf
sed -i "s/#*primary_conninfo.*/primary_conninfo = 'host=127.0.0.1 port=9420 user=foo password=server@123'/" $PGDATA/recovery.conf
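
Writing the password into primary_conninfo leaves it readable in recovery.conf. As an alternative, libpq can read it from a ~/.pgpass file on the standby, where the database field is replication for replication connections. The sketch below only formats such an entry; pgpass_escape and pgpass_line are hypothetical helper names:

```shell
# Escape the two characters that are special inside a .pgpass field.
pgpass_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/:/\\:/g'
}

# pgpass_line HOST PORT USER PASSWORD
# Prints one entry in the format hostname:port:database:username:password.
pgpass_line() {
  printf '%s:%s:replication:%s:%s\n' \
    "$(pgpass_escape "$1")" "$(pgpass_escape "$2")" \
    "$(pgpass_escape "$3")" "$(pgpass_escape "$4")"
}

# Prints: 127.0.0.1:9420:replication:foo:server@123
pgpass_line 127.0.0.1 9420 foo 'server@123'
```

Appending that line to ~/.pgpass (which must have mode 0600) for the OS user that runs the standby should allow password=... to be dropped from primary_conninfo.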

Log information

LOG:  database system was shut down in recovery at 2017-04-27 11:46:42 CST
LOG:  entering standby mode
LOG:  redo starts at 0/6000028
LOG:  consistent recovery state reached at 0/7000000
LOG:  started streaming WAL from primary at 0/7000000 on timeline 1

Process State

/home/gaomingjie/app/pgsql22/bin/postgres
\_ postgres: startup process   recovering 000000010000000000000007
\_ postgres: checkpointer process           
\_ postgres: writer process                 
\_ postgres: wal receiver process   streaming 0/7000140
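
The segment name shown by the startup process and the position shown by the WAL receiver describe the same place in the WAL. Assuming the default 16 MB segment size, the file name is the timeline ID, the high half of the LSN, and the low half divided by the segment size, each printed as eight hexadecimal digits. A sketch of that mapping follows; wal_segment_name is a hypothetical helper, and on a primary the built-in pg_xlogfile_name function performs the same conversion:

```shell
# wal_segment_name TIMELINE LSN
# Assumes the default 16 MB (16777216-byte) WAL segment size.
wal_segment_name() {
  hi=${2%%/*}   # hex digits before the slash
  lo=${2##*/}   # hex digits after the slash
  printf '%08X%08X%08X\n' "$1" "$((0x$hi))" "$((0x$lo / 16777216))"
}

# Prints: 000000010000000000000007
wal_segment_name 1 0/7000140
```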

4.3 Monitoring streaming replication

An important health indicator of streaming replication is the amount of WAL generated on the primary but not yet applied on the standby. You can calculate this lag by comparing the current WAL write location on the primary with the last WAL location received by the standby.
  These can be retrieved with pg_current_xlog_location on the primary and pg_last_xlog_receive_location on the standby, respectively. The standby's last receive location also appears in the status of the WAL receiver process as shown by the ps command.
  The pg_stat_replication view lists the WAL sender processes. A large difference between pg_current_xlog_location and the sent_location field may indicate that the primary is under heavy load, while a difference between sent_location and pg_last_xlog_receive_location on the standby may indicate network delay or a heavily loaded standby.

postgres=# select * from pg_stat_replication;
-[ RECORD 1 ]----+-----------------------------
pid              | 11715
usesysid         | 16393
usename          | foo
application_name | walreceiver
client_addr      | 127.0.0.1
client_hostname  | 
client_port      | 51930
backend_start    | 2017-04-27 14:12:57.43909+08
backend_xmin     | 
state            | streaming
sent_location    | 0/8000610
write_location   | 0/8000610
flush_location   | 0/8000610
replay_location  | 0/8000610
sync_priority    | 0
sync_state       | async
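
Replication lag is the byte distance between two WAL positions. An LSN such as 0/8000610 is a 64-bit value written as two hexadecimal halves, so the distance can also be computed outside the database. The sketch below shows one way; lsn_to_bytes and lsn_lag_bytes are hypothetical helpers, and inside the server the built-in pg_xlog_location_diff function does the same subtraction:

```shell
# lsn_to_bytes LSN
# Converts an LSN such as 0/8000610 into an absolute byte position.
lsn_to_bytes() {
  hi=${1%%/*}
  lo=${1##*/}
  echo $(( 0x$hi * 4294967296 + 0x$lo ))
}

# lsn_lag_bytes NEWER_LSN OLDER_LSN
# Prints how many bytes OLDER_LSN is behind NEWER_LSN.
lsn_lag_bytes() {
  echo $(( $(lsn_to_bytes "$1") - $(lsn_to_bytes "$2") ))
}

# Prints: 1552
lsn_lag_bytes 0/8000610 0/8000000
```

In the pg_stat_replication row shown here, sent_location, write_location, flush_location and replay_location are all 0/8000610, so every pairwise lag is 0 bytes.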

5 Hot standby: asynchronous streaming replication with a replication slot (upstream of the cascade)

(ID23)

5.1 Base backup (run on the primary)

Step 1: Configure permissions

create role foo login replication password 'server@123';

pg_hba.conf:

host    replication     foo               127.0.0.1/32            md5

Step 2: pg_basebackup -Fp -P -x -D ~/app/data/pg_root23 -l basebackup23

Step 3: Create a replication slot on the primary

SELECT * FROM pg_create_physical_replication_slot('node_slot_23');

SELECT * FROM pg_replication_slots;
-[ RECORD 1 ]-------+-------------
slot_name           | node_slot_23
plugin              | 
slot_type           | physical
datoid              | 
database            | 
active              | f
active_pid          | 
xmin                | 
catalog_xmin        | 
restart_lsn         | 
confirmed_flush_lsn |

5.2 Streaming replication parameters

sed -i "s/#*hot_standby.*/hot_standby = on/" $PGDATA/postgresql.conf

cp $PGHOME/share/recovery.conf.sample $PGDATA/recovery.conf
sed -i "s/#*standby_mode.*/standby_mode = on/" $PGDATA/recovery.conf
sed -i "s/#*primary_conninfo.*/primary_conninfo = 'host=127.0.0.1 port=9420 user=foo password=server@123'/" $PGDATA/recovery.conf
sed -i "s/#*primary_slot_name.*/primary_slot_name = 'node_slot_23'/" $PGDATA/recovery.conf
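
The sed patterns used in this walkthrough are unanchored: s/#*hot_standby.*/.../ also rewrites any other line containing that text, such as the hot_standby_feedback entry in the sample postgresql.conf, and does nothing if the parameter is missing from the file. The helper below is one possible alternative, not the method of the original setup. set_conf is a hypothetical name; it matches only lines that begin with the (optionally commented) parameter name followed by =, and appends the setting when no such line exists:

```shell
# set_conf FILE KEY VALUE
# Hypothetical helper: make "KEY = VALUE" the single entry for KEY in FILE.
set_conf() {
  conf_file=$1
  tmp_file=$(mktemp) || return 1
  # Key and value are passed through the environment so awk does not
  # interpret backslashes or other special characters in them.
  CONF_KEY="$2" CONF_VALUE="$3" awk '
    BEGIN { key = ENVIRON["CONF_KEY"]; value = ENVIRON["CONF_VALUE"] }
    # Match "key =" at line start, optionally commented out with one "#".
    $0 ~ ("^[ \t]*#?[ \t]*" key "[ \t]*=") {
      if (!seen) { print key " = " value; seen = 1 }
      next
    }
    { print }
    END { if (!seen) print key " = " value }
  ' "$conf_file" > "$tmp_file" && cat "$tmp_file" > "$conf_file"
  rm -f "$tmp_file"
}
```

A call such as set_conf $PGDATA/postgresql.conf hot_standby on would then leave the hot_standby_feedback line untouched.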

Log information

LOG:  entering standby mode
LOG:  redo starts at 0/9000028
LOG:  consistent recovery state reached at 0/A000060
LOG:  invalid record length at 0/A000060: wanted 24, got 0
LOG:  database system is ready to accept read only connections
LOG:  started streaming WAL from primary at 0/A000000 on timeline 1

Process State

/home/gaomingjie/app/pgsql23/bin/postgres
\_ postgres: startup process   recovering 00000001000000000000000A
\_ postgres: checkpointer process           
\_ postgres: writer process                 
\_ postgres: stats collector process        
\_ postgres: wal receiver process

Querying the replication slot status on the primary

psql
psql (9.6.0)
Type "help" for help.

postgres=# \x
Expanded display is on.
postgres=# SELECT * FROM pg_replication_slots;
-[ RECORD 1 ]-------+-------------
slot_name           | node_slot_23
plugin              | 
slot_type           | physical
datoid              | 
database            | 
active              | t
active_pid          | 14799
xmin                | 
catalog_xmin        | 
restart_lsn         | 0/B001148
confirmed_flush_lsn |

psql -U foo -W "dbname=postgres replication=database" 
Password for user foo: 
psql (9.6.0)
Type "help" for help.

postgres=> IDENTIFY_SYSTEM;
      systemid       | timeline |  xlogpos  |  dbname  
---------------------+----------+-----------+----------
 6413518490021561706 |        1 | 0/B001148 | postgres
(1 row)

5.3 Replication slot concepts

Replication slots provide an automated way to ensure that the primary does not remove WAL segments until all standbys have received them, and that the primary does not remove rows that could cause a recovery conflict, even while a standby is disconnected.
  As an alternative to replication slots, old WAL segments can be kept from removal with wal_keep_segments, or stored in an archive with archive_command. However, these methods often retain more WAL segments than necessary, whereas a replication slot retains only the segments known to be needed. An advantage of these methods is that they bound the space required by pg_xlog; replication slots currently offer no way to do this.
  Similarly, hot_standby_feedback and vacuum_defer_cleanup_age protect relevant rows from being removed by vacuum, but the former gives no protection while the standby is disconnected, and the latter often has to be set to a high value to give adequate protection. Replication slots overcome these disadvantages.

5.4 Querying and manipulating replication slots

Each replication slot has a name, which can contain lowercase letters, numbers, and the underscore character. Existing replication slots and their state can be seen in the pg_replication_slots view.
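
A slot name can be checked against this naming rule before pg_create_physical_replication_slot is called. The following is a small sketch; valid_slot_name is a hypothetical helper, and it checks only the character set described here, not any length limit:

```shell
# valid_slot_name NAME
# Succeeds only if NAME is non-empty and consists solely of lowercase
# letters, digits, and underscores.
valid_slot_name() {
  case $1 in
    '' | *[!abcdefghijklmnopqrstuvwxyz0123456789_]*) return 1 ;;
    *) return 0 ;;
  esac
}

valid_slot_name node_slot_23 && echo "node_slot_23 is acceptable"
```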

Replication-related functions

pg_create_physical_replication_slot
pg_drop_replication_slot
pg_create_logical_replication_slot
pg_logical_slot_get_changes
pg_logical_slot_peek_changes
pg_logical_slot_get_binary_changes
pg_logical_slot_peek_binary_changes
pg_replication_origin_create
pg_replication_origin_drop
pg_replication_origin_oid
pg_replication_origin_session_setup
pg_replication_origin_session_reset
pg_replication_origin_session_is_setup
pg_replication_origin_session_progress
pg_replication_origin_xact_setup
pg_replication_origin_xact_reset
pg_replication_origin_advance
pg_replication_origin_progress
pg_logical_emit_message

6 Asynchronous cascading standby

(ID24)

6.1 Base backup (taken from node 23)

pg_basebackup -U foo -W -Fp -P -x -D ~/app/data/pg_root24 -l basebackup24

Note: this base backup is taken by connecting to node 23.

6.2 Streaming replication parameters

sed -i "s/#*standby_mode.*/standby_mode = on/" $PGDATA/recovery.conf
sed -i "s/#*primary_conninfo.*/primary_conninfo = 'host=127.0.0.1 port=9423 user=foo password=server@123'/" $PGDATA/recovery.conf

Log information

LOG:  database system was interrupted while in recovery at log time 2017-04-28 11:13:33 CST
HINT:  If this has occurred more than once some data might be corrupted and you might need to choose an earlier recovery target.
LOG:  entering standby mode
LOG:  redo starts at 0/B00D958
LOG:  consistent recovery state reached at 0/B00DA38
LOG:  invalid record length at 0/B00DA38: wanted 24, got 0
LOG:  database system is ready to accept read only connections
LOG:  started streaming WAL from primary at 0/B000000 on timeline 1

Process State

/home/gaomingjie/app/pgsql23/bin/postgres
\_ postgres: startup process   recovering 00000001000000000000000B
\_ postgres: checkpointer process           
\_ postgres: writer process                 
\_ postgres: stats collector process        
\_ postgres: wal receiver process   streaming 0/B00DBF8
\_ postgres: wal sender process foo 127.0.0.1(58019) streaming 0/B00DBF8

/home/gaomingjie/app/pgsql24/bin/postgres
\_ postgres: startup process   recovering 00000001000000000000000B
\_ postgres: checkpointer process           
\_ postgres: writer process                 
\_ postgres: stats collector process        
\_ postgres: wal receiver process   streaming 0/B00DBF8

7 Hot standby with synchronous streaming replication (archiving enabled)

(ID25)

With synchronous replication, each commit of a write transaction waits until confirmation arrives that the commit has been written to the transaction log on disk on both the primary and the standby. Data can then be lost only if the primary and the standby crash at the same time. This gives a much higher level of durability, but only if the administrator is careful about where the two servers are placed and how they are managed. Waiting for confirmation increases confidence that changes will not be lost, but it necessarily increases the response time of the requesting transaction. The minimum wait is the round-trip time between the primary and the standby.

Read-only transactions and transaction rollbacks do not wait for replies from the standby. Subtransaction commits do not wait either; only top-level commits do. Long-running operations such as data loading or index building do not wait until the final commit message. All two-phase commit actions wait, including both prepare and commit.

7.1 Base backup (taken from the primary, node 20)

Step 1: Configure permissions

create role foo login replication password 'server@123';

pg_hba.conf:

host    replication     foo               127.0.0.1/32            md5

Step 2: pg_basebackup -U foo -W -Fp -P -x -D ~/app/data/pg_root25 -l basebackup25

Note: this base backup is taken by connecting to node 20.

Step 3: Set synchronous_standby_names on the primary.

sed -i "s/#*synchronous_standby_names.*/synchronous_standby_names = '1 (s1)'/" $PGDATA/postgresql.conf

7.2 Streaming replication parameters

sed -i "s/#*hot_standby.*/hot_standby = on/" $PGDATA/postgresql.conf

cp $PGHOME/share/recovery.conf.sample $PGDATA/recovery.conf
sed -i "s/#*standby_mode.*/standby_mode = on/" $PGDATA/recovery.conf
sed -i "s/#*primary_conninfo.*/primary_conninfo = 'application_name=s1 host=127.0.0.1 port=9420 user=foo password=server@123'/" $PGDATA/recovery.conf

sed -i "s/#*archive_mode.*/archive_mode = always/" $PGDATA/postgresql.conf
sed -i "s/#*archive_command.*/archive_command = 'test ! -f \${PGHOME}\/archive\/%f \&\& cp %p \${PGHOME}\/archive\/%f'/" $PGDATA/postgresql.conf

Log on the primary

LOG:  standby "s1" is now a synchronous standby with priority 1

Synchronous standby status as seen from the primary

postgres=# select * from pg_stat_replication where application_name='s1';
-[ RECORD 1 ]----+------------------------------
pid              | 23543
usesysid         | 16393
usename          | foo
application_name | s1
client_addr      | 127.0.0.1
client_hostname  | 
client_port      | 48481
backend_start    | 2017-04-28 14:45:03.051153+08
backend_xmin     | 
state            | streaming
sent_location    | 0/E0000D0
write_location   | 0/E0000D0
flush_location   | 0/E0000D0
replay_location  | 0/E0000D0
sync_priority    | 1
sync_state       | sync

7.3 Related parameters

synchronous_commit

Value: meaning

remote_apply: Each commit waits until the current synchronous standbys report that they have replayed the transaction, which makes it visible to user queries. The standby sends its reply when the commit record has been replayed. In simple cases this allows load balancing with causal consistency.

remote_write: Each commit waits for confirmation that the standby has received the commit record and written it out to its own operating system, but not for the data to be flushed to disk on the standby. This is a weaker durability guarantee than on: the standby could lose the data if its operating system crashes, though not if only PostgreSQL crashes. It is still a useful setting in practice because it can reduce transaction response time. Data loss can occur only if both the primary and the standby crash and the primary's database is corrupted at the same time.

If a standby is chosen as a synchronous standby from the priority list in synchronous_standby_names on the primary, its replies are considered together with those of the other synchronous standbys to decide when to release transactions that are waiting for confirmation that the commit record has been received. These parameters let the administrator specify which standbys should be synchronous. Synchronous replication is configured mainly on the primary. Named standbys must be connected directly to the primary; the primary knows nothing about downstream standbys that use cascaded replication.

If a fast shutdown is requested, users stop waiting. However, as with asynchronous replication, the server does not shut down completely until all outstanding WAL records have been transferred to the currently connected standbys.
Origin blog.csdn.net/jackgo73/article/details/89684258