PostgreSQL Database Management - Chapter II: Architecture

Overview

PostgreSQL is a powerful, open-source, client/server relational database management system (RDBMS). It supports a rich set of data types (such as JSON, JSONB, and array types) as well as user-defined types. The default page size in PostgreSQL is 8 kB.

 

PostgreSQL has the following key features:

1. Good SQL language support, including ACID compliance, referential integrity, database transactions, and Unicode (multilingual) support.

2. A design for highly concurrent reads and writes that do not block each other.

3. Support for several database models: relational, document (such as JSON and JSONB types, array types), and key/value.

 

 

2.1 PostgreSQL process structure

 

 

 

PostgreSQL is a client/server application that uses one server process per user connection. Several processes start when the database starts, including postmaster (the daemon), postgres (the service process), syslogger, checkpointer, bgwriter, walwriter, and other auxiliary processes.

2.1.1 postmaster (Daemon)

The main responsibilities of the postmaster (daemon) are:
1. Starting and stopping the database
2. Listening for client connections
3. Forking a separate postgres server process for each client connection
4. Recovering when a postgres service process fails
5. Managing data files
6. Managing the auxiliary worker processes related to running the database

2.1.2 Postgres service process

The postgres server process accepts and executes the commands (interactive SQL queries) sent by a client, such as psql or a user application that connects through an interface such as JDBC. Its main functional modules (e.g., compiler, optimizer, executor) call the underlying modules (e.g., storage, transaction management, indexes) to carry out the client's database operations, and it returns the results.

 

2.1.3 Syslogger (system log process)

The configuration file postgresql.conf contains many logging-related parameters. The main process starts the syslogger worker process only when the logging_collector parameter is set to on.

- Collects PostgreSQL's runtime status and writes the run log to log files

- logging_collector: enables the collector at startup; turning it off is not recommended

- log_directory: sets the log directory

- log_destination: sets the log output destination and format

- log_filename: sets the log file name

- log_truncate_on_rotation: sets whether log files are reused cyclically, overwriting an existing file of the same name

- log_rotation_age: sets the rotation period (time)

- log_rotation_size: sets the size threshold for rotation
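
As a rough illustration of how the age and size thresholds interact, the sketch below models the rotation decision in Python. It is a simplified model, not PostgreSQL's implementation: the function name `should_rotate` and its arguments are hypothetical. It assumes that rotation happens when either threshold is reached and that a value of 0 disables the corresponding trigger.

```python
def should_rotate(file_age_min, file_size_kb, rotation_age_min, rotation_size_kb):
    """Return True when either the age or the size threshold is reached.

    Hypothetical helper: a threshold of 0 disables that trigger.
    """
    age_hit = rotation_age_min > 0 and file_age_min >= rotation_age_min
    size_hit = rotation_size_kb > 0 and file_size_kb >= rotation_size_kb
    return age_hit or size_hit


# One-day age limit (1440 minutes) and a 10 MB size limit (10240 kB).
print(should_rotate(30, 512, 1440, 10240))    # young and small
print(should_rotate(30, 10240, 1440, 10240))  # size threshold reached
print(should_rotate(1500, 512, 1440, 0))      # age threshold reached, size trigger disabled
```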

2.1.6 Auxiliary process Checkpointer:

- Ensures the consistency of the database

- Its action triggers the bgwriter and WAL writer to act

- Several parameters control the interval at which it runs

 

2.1.7 Auxiliary process Background writer (background writing process):

In PostgreSQL, the bgwriter worker process writes dirty pages from shared memory to disk. When data is inserted or updated, it is not persisted to the data files immediately; this is mainly to improve the performance of inserts, updates, and deletes. The bgwriter worker process periodically flushes dirty data from memory to disk, and it should do so neither too fast nor too slow. If a data block is changed several times and flushing is too fast, every one of those changes is saved to disk, which increases the number of I/O operations. If flushing is too slow, a new query or update that needs memory to hold data blocks read from disk may find no free space; memory must first be vacated by writing some dirty pages to disk, so the query or update waits longer and performance naturally drops. This behavior is controlled by the configuration parameters whose names begin with "bgwriter_".

- Its task is to write dirty data pages in the shared buffer to disk files

- Uses an LRU algorithm to clean up dirty pages

- Spends most of its time asleep and does its work when woken
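
The cost of flushing too often can be sketched with a toy buffer pool in Python. This is only an illustrative model under simplifying assumptions (a single page, a counter instead of real disk I/O); the class `BufferPool` and the function `run` are hypothetical names, not part of PostgreSQL.

```python
class BufferPool:
    """Toy shared buffer: pages are modified in memory and flushed later."""

    def __init__(self):
        self.pages = {}        # page id -> current contents in memory
        self.dirty = set()     # page ids changed since the last flush
        self.disk_writes = 0   # number of page writes issued to "disk"

    def update(self, page_id, value):
        self.pages[page_id] = value
        self.dirty.add(page_id)

    def flush(self):
        """Write every dirty page once, then mark all pages clean."""
        self.disk_writes += len(self.dirty)
        self.dirty.clear()


def run(flush_every):
    """Apply 6 updates to the same page, flushing after every `flush_every` updates."""
    pool = BufferPool()
    for i in range(1, 7):
        pool.update(page_id=1, value=i)
        if i % flush_every == 0:
            pool.flush()
    pool.flush()  # final flush so nothing stays dirty
    return pool.disk_writes


print(run(flush_every=1))  # flush after each update: 6 page writes
print(run(flush_every=6))  # flush once after all updates: 1 page write
```

The same six changes cost six writes when flushed eagerly and one write when batched, which is the trade-off the flush interval has to balance against the risk of running out of clean buffers.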

 

 

2.1.8 Auxiliary process WAL writer (write-ahead log):

WAL is the acronym for Write-Ahead Log. The WAL log is also referred to as xlog.

The WalWriter process is the process that writes the WAL log. The idea of write-ahead logging is that before data is modified, the modifications must first be recorded to disk; subsequent updates to the actual data then do not need to be persisted to the data files in real time. Even if the machine goes down or the database exits abnormally, so that some dirty data in memory was not flushed to the files in time, the database reads the WAL log on restart and replays its last part, which restores the state at the time of the failure.

WAL logs are kept in pg_xlog (named pg_wal from version 10 onward). Each xlog file is 16 MB by default. To meet recovery requirements, multiple WAL log files are generated in the xlog directory; this ensures that after a crash, data that was not yet persisted can be recovered from the WAL log. WAL logs that are no longer needed are overwritten automatically.

 

- Writes the write-ahead log to disk files

- Trigger timing:

• The WAL buffer is full;

• A transaction commits;

• The WAL writer process reaches its interval time;

• A checkpoint occurs;
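
The write-ahead principle and crash recovery can be sketched in a few lines of Python. This is a toy model, not PostgreSQL's implementation: `ToyDatabase` and its methods are hypothetical, the log is a plain list, and recovery replays the whole log, whereas a real server replays only the part written since the last checkpoint.

```python
class ToyDatabase:
    """Toy model: log first, change memory second, flush data files lazily."""

    def __init__(self):
        self.wal = []         # durable log records (survive a crash)
        self.data_file = {}   # durable data file (survives a crash)
        self.buffer = {}      # in-memory pages (lost on crash)

    def set(self, key, value):
        self.wal.append((key, value))  # 1. record the change in the WAL
        self.buffer[key] = value       # 2. only then modify the in-memory copy

    def flush(self):
        """Persist the in-memory pages to the data file."""
        self.data_file.update(self.buffer)

    def crash(self):
        """Lose everything that was only in memory."""
        self.buffer = {}

    def recover(self):
        """Rebuild the in-memory state from the data file, then replay the WAL."""
        self.buffer = dict(self.data_file)
        for key, value in self.wal:
            self.buffer[key] = value


db = ToyDatabase()
db.set("a", 1)
db.flush()
db.set("a", 2)   # logged, but the data file still holds a=1
db.set("b", 3)   # logged, never flushed
db.crash()
db.recover()
print(db.buffer)  # {'a': 2, 'b': 3}
```

Although the data file only ever received a=1, replaying the log restores both later changes.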

 

 

2.1.9 Auxiliary process Archiver (archive):

- Transfers filled WAL log files to the archive directory; this process is enabled only in archive mode

WAL logs are recycled, that is, older WAL logs are overwritten. The PgArch archiving process backs up WAL logs before they are overwritten. Starting with version 8.x, PostgreSQL provides PITR (Point-In-Time Recovery). In plain terms, once a full backup of the database has been taken, the WAL logs generated after the backup point are archived; with the full backup plus the WAL logs generated after it, the database can be rolled forward to any point in time after the full backup.
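
The roll-forward idea behind PITR can be sketched as follows. This is a minimal Python model under stated assumptions: the function `restore_to`, the `(timestamp, key, value)` record layout, and the sample data are all hypothetical and only illustrate "full backup plus archived log, replayed up to a target time".

```python
def restore_to(base_backup, archived_wal, target_time):
    """Start from the full backup and replay archived changes up to target_time.

    archived_wal is a list of (timestamp, key, value) tuples in log order.
    """
    state = dict(base_backup)
    for timestamp, key, value in archived_wal:
        if timestamp > target_time:
            break  # stop rolling forward at the requested point in time
        state[key] = value
    return state


backup = {"balance": 100}        # full backup taken at time 0
wal = [(1, "balance", 120),
       (2, "balance", 90),
       (3, "balance", 0)]        # e.g. a mistaken update at time 3

print(restore_to(backup, wal, target_time=2))  # {'balance': 90}
```

Choosing a target time just before the mistaken change recovers the state that existed at that moment.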

2.1.10 worker process Statistics Collector (statistics collection process):

- The process that collects statistics. It gathers space and tuple information for tables and indexes, and even table access information. Besides being used by the optimizer, the collected information is used by autovacuum and serves as reference information for the database administrator

2.1.11 worker process Autovacuum launcher / workers (automatic system cleaning process):

- The process that automatically cleans up garbage

- Automatic cleanup is enabled when the autovacuum parameter is set to on

- The launcher is a daemon; each time a cleanup starts, it invokes one or more workers

- The workers do the actual cleanup; their number is set by the autovacuum_max_workers parameter

In PostgreSQL, after a DELETE on a table, the old data is not removed immediately. Likewise, an update does not modify the old data in place but generates a new row version. This was introduced earlier in the section on locks and is called multi-versioning. The old data is only marked as deleted; it is cleared only when no concurrent transaction still needs to read it. This cleanup is done by autovacuum.
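
The multi-version behavior described above can be sketched with a toy table in Python. This is a deliberately simplified model: the class `Table` is hypothetical, and it ignores commit status and visibility rules. The field names xmin and xmax mirror the PostgreSQL system columns that record the creating and deleting transaction of a row version.

```python
class Table:
    """Toy multi-version table: rows carry the ids of the creating and deleting transactions."""

    def __init__(self):
        self.versions = []  # each version: {"key", "value", "xmin", "xmax"}

    def insert(self, key, value, xid):
        self.versions.append({"key": key, "value": value, "xmin": xid, "xmax": None})

    def delete(self, key, xid):
        # Mark the live version as deleted instead of removing it.
        for v in self.versions:
            if v["key"] == key and v["xmax"] is None:
                v["xmax"] = xid

    def update(self, key, value, xid):
        # An update is "mark old version deleted" plus "insert new version".
        self.delete(key, xid)
        self.insert(key, value, xid)

    def vacuum(self, oldest_active_xid):
        """Remove versions deleted before the oldest transaction still running."""
        self.versions = [v for v in self.versions
                         if v["xmax"] is None or v["xmax"] >= oldest_active_xid]


t = Table()
t.insert("k", "v1", xid=10)
t.update("k", "v2", xid=20)
print(len(t.versions))          # 2: old and new version are both stored
t.vacuum(oldest_active_xid=15)  # a transaction older than the update is still running
print(len(t.versions))          # 2: the old version must be kept
t.vacuum(oldest_active_xid=30)  # no running transaction can see the old version
print(len(t.versions))          # 1: the old version has been reclaimed
```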

 

 

2.2 PostgreSQL memory structure

After PostgreSQL starts, it allocates a block of shared memory. The shared memory is mainly used as a buffer for data blocks, to improve read and write performance. The WAL log buffer and the CLOG (commit log) buffer also reside in shared memory. In addition, some global information is stored in shared memory, such as process information, lock information, and global statistics.

2.2.1 Shared Memory

Shared memory is the equivalent of Oracle's SGA: a group of shared memory structures shared by all service and background processes. When the database instance starts, the system global area is allocated automatically; when the instance shuts down, the SGA memory is reclaimed. The SGA is the largest consumer of memory and an important factor in database performance.

Shared Buffer:

- Caches table and index data blocks

- Data reads and writes operate directly on the buffer; if a required block is not in the cache, it must be read from disk

- Blocks that have been modified in the buffer but not yet written to the disk files are called dirty blocks

- Its size is controlled by the shared_buffers parameter

WAL (Write-Ahead Log) Buffer:

- The write-ahead log buffer caches the transaction log generated by inserts, deletes, updates, and similar operations

- Its size is controlled by the wal_buffers parameter

Clog Buffer:

- The commit log buffer caches the log that records transaction states

2.2.2 Local Memory

Local memory is the equivalent of Oracle's PGA.

A PGA is a private memory area that an Oracle process uses exclusively to store data and control information. The PGA is created by the Oracle database when the Oracle process starts. When a user process connects to the database and a session is created, the Oracle server process sets up a dedicated PGA area to store that user session's content. When the user session ends, the system automatically releases the memory occupied by the PGA.

Local memory is a memory structure private to each server process. Each postgres child process is allocated its own small region of memory, so total usage grows as the number of connected sessions grows. It is not part of the instance.

work_mem: memory used for sorting

maintenance_work_mem: memory used for internal maintenance operations, such as VACUUM garbage collection and creating or rebuilding indexes

temp_buffers: memory used for storing temporary table data

2.3 PostgreSQL directory structure

PostgreSQL hierarchy
  1. Logical hierarchy
    Database Cluster (instance) -> Database -> Schema -> Objects (Table) -> Tuples
  2. Physical hierarchy
    Database Cluster -> Tablespaces -> Files -> Blocks

 

 

2.3.1 Installation directory structure

PGHOME=/opt/pgsql11.4 is the directory where the PostgreSQL software is installed.

 

 

2.3.1 Data Directory Structure

The PGDATA environment variable generally points to the root of the data directory. This directory is specified at installation time, so choose an appropriate location for it; each database instance needs one such directory. The directory is initialized with initdb. Once initialization completes, three configuration files are generated in the root of the data directory.

postgresql.conf: the main configuration file of the database instance; essentially all configuration parameters are in this file.

pg_hba.conf: the authentication configuration file; it configures which IP hosts may access the database, which authentication method is used, and so on.

pg_ident.conf: the user mapping file for the "ident" authentication method.

1 Overall directory structure

base, global, logfile, pg_clog, pg_multixact, pg_notify, pg_serial, pg_snapshots, pg_stat_tmp, pg_subtrans, pg_tblspc, pg_twophase, PG_VERSION, pg_xlog, postgresql.conf, pg_hba.conf, pg_ident.conf

2 The base directory (database entity files)

 

  1. The base directory stores the files of all database entities.
  2. Its subdirectories are named after database OIDs.
  3. The files under each database subdirectory are named after object OIDs.
  4. PG_VERSION records the version number of the data format used by the corresponding database.
  5. A file whose name ends in _fsm is the FSM (free space map) of the corresponding data file; it identifies which blocks have free space.
  6. A file whose name ends in _vm is the VM (visibility map) of the corresponding data file. In multi-version concurrency control, deletes and updates are implemented by marking the tuple header as "no longer valid"; VACUUM later cleans up the invalid data and reclaims the free space.
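
The naming scheme above can be summarized with a small Python helper. It is a sketch under assumptions: the function `relation_files` and the sample values are hypothetical, the relation is assumed to live in the default tablespace (under base), and the file name is taken from the relation's file number (relfilenode in pg_class), which usually starts out equal to its OID.

```python
import posixpath


def relation_files(pgdata, db_oid, relfilenode):
    """Return the main, FSM and VM file paths for one table or index.

    Assumes the default tablespace, so files live under PGDATA/base.
    """
    main = posixpath.join(pgdata, "base", str(db_oid), str(relfilenode))
    return {"main": main, "fsm": main + "_fsm", "vm": main + "_vm"}


# Hypothetical values: data directory, database OID and file number.
for kind, path in relation_files("/pgdata", 16384, 16385).items():
    print(kind, path)
```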

3 The global directory (shared global objects)

■ pg_control

Stores global control information.

■ pg_filenode.map

A hard-coded mapping from the OIDs of the system tables in the current directory to their actual file names (a file of the same name also exists in each user-created database directory).

■ pg_internal.init

A cache of the system tables that speeds up reading them (a file of the same name also exists in each user-created database directory).

 

■ Global system table files

Files with numeric names store the contents of the system tables. Their relfilenode in pg_class is 0; they rely on the hard-coded OID mapping in the pg_filenode.map file.

4 Other common directories and files

 

 

  1. pg_wal: the WAL log directory; very important.
  2. pg_xact: the commit log directory; very important. In version 9.x this directory was named pg_clog.
  3. pg_hba.conf: the client authentication configuration file; it configures client connection protocols, encryption, ACLs, and so on.
  4. postgresql.conf: the configuration file of the database cluster, in text format.
  5. postgresql.auto.conf: also a parameter configuration file. Parameters changed with the ALTER SYSTEM command are saved in this file, and its values override the values of the same parameters in postgresql.conf.
  6. postmaster.pid: records the operating-system PID of the main postmaster process; the file is generated after the database instance starts normally.

2.3.4 Tablespace directory

 

  After a tablespace is created, a subdirectory whose name contains the "Catalog version" is generated in the root directory of the tablespace, for example:

   CREATE TABLESPACE tbs01 LOCATION '/home/osdba/tbs01';

   This generates a subdirectory named "PG_9.3_201306121":

   osdba@osdba-laptop:~$ ls -l /home/osdba/tbs01
   total 4
   drwx------ 3 osdba osdba 4096 10 May 19 14:29 PG_9.3_201306121

 

   In the subdirectory name "PG_9.3_201306121", "201306121" is the "Catalog version". You can check the "Catalog version" with the pg_controldata command:

   osdba@osdba-laptop:~/pgsql/bin$ pg_controldata
   pg_control version number:      937
   Catalog version number:         201306121
   Database system identifier:     5925271200006401779
   Database cluster state:         in production

   The "PG_9.3_201306121" subdirectory in turn contains subdirectories named after database OIDs, as follows:

   osdba@osdba-laptop:~$ ls -l /home/osdba/tbs01/PG_9.3_201306121/
   total 4
   drwx------ 2 osdba osdba 4096 10 May 19 14:29 16384

   For example, the "16384" subdirectory above is the OID of the "osdba" database, as follows:

   osdba=# select oid, datname from pg_database;
     oid   datname
       1   template1
   12065   template0
   12070   postgres
   16384   osdba
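
The structure of the version subdirectory name can be made explicit with a short Python helper. This is only a sketch: the function `parse_tablespace_dir` is hypothetical, and the pattern is inferred from the single example name shown in this section (PG, then the server version, then the catalog version, joined by underscores).

```python
import re


def parse_tablespace_dir(name):
    """Split a name such as 'PG_9.3_201306121' into (version, catalog_version)."""
    match = re.fullmatch(r"PG_(\d+(?:\.\d+)?)_(\d+)", name)
    if match is None:
        raise ValueError(f"not a tablespace version directory: {name!r}")
    return match.group(1), int(match.group(2))


print(parse_tablespace_dir("PG_9.3_201306121"))  # ('9.3', 201306121)
```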

Origin blog.csdn.net/syjhct/article/details/100813693