PostgreSQL LIST, RANGE table partitioning scheme

Introduction

PostgreSQL partitioning splits one logically large table into several smaller physical pieces.

Advantages of Partitioning

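    1. Certain types of queries run faster.
    2. Updates can also run faster, because the index on a single partition is smaller than an index over the whole data set.
    3. Bulk deletes can be done by simply dropping a partition.
    4. Rarely used data can be moved to cheaper, slower storage media.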

How partitioning is implemented

Before version 10, PostgreSQL implemented table partitioning through table inheritance. An empty main table is created, and each partition inherits from it. The main table must stay empty at all times.

Advice from the official documentation: consider partitioning only when the size of the table exceeds the physical memory of the database server.
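
One way to apply this rule of thumb is to compare the table's on-disk size with the server's RAM. A minimal sketch, assuming a hypothetical table named big_table (it is not part of the examples in this article):

```sql
-- big_table is a hypothetical name used only for illustration.
-- Total on-disk size, including indexes and TOAST data.
SELECT pg_size_pretty(pg_total_relation_size('big_table')) AS total_size;

-- Size of the main data fork alone, without indexes or TOAST data.
SELECT pg_size_pretty(pg_relation_size('big_table')) AS heap_size;
```

If the total is well below the machine's memory, partitioning is unlikely to pay for its added complexity.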

Inheritance-based partitioning (before version 10)

Partitions are implemented as inherited tables:

create table tbl( a int, b varchar(10) );
create table tbl_1 ( check ( a <= 1000 ) ) INHERITS (tbl);
create table tbl_2 ( check ( a <= 10000 and a > 1000 ) ) INHERITS (tbl);
create table tbl_3 ( check ( a <= 100000 and a > 10000 ) ) INHERITS (tbl);
create table tbl_4 ( check ( a <= 1000000 and a > 100000 ) ) INHERITS (tbl);

Data distribution is then implemented with triggers or rules: rows are inserted into the main table and are automatically routed to the matching child table.

CREATE OR REPLACE FUNCTION tbl_part_tg()
RETURNS TRIGGER AS $$
BEGIN
	-- Route the new row to the child table whose range contains NEW.a.
	IF ( NEW.a <= 1000 ) THEN
		INSERT INTO tbl_1 VALUES (NEW.*);
	ELSIF ( NEW.a > 1000 and NEW.a <= 10000 ) THEN
		INSERT INTO tbl_2 VALUES (NEW.*);
	ELSIF ( NEW.a > 10000 and NEW.a <= 100000 ) THEN
		INSERT INTO tbl_3 VALUES (NEW.*);
	ELSIF ( NEW.a > 100000 and NEW.a <= 1000000 ) THEN
		INSERT INTO tbl_4 VALUES (NEW.*);
	ELSE
		RAISE EXCEPTION 'data out of range!';
	END IF;
	-- Returning NULL cancels the insert into the main table itself.
	RETURN NULL;
END;
$$
LANGUAGE plpgsql;

CREATE TRIGGER insert_tbl_part_tg
     BEFORE INSERT ON tbl 
FOR EACH ROW EXECUTE PROCEDURE tbl_part_tg();
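
The routing can be checked with a short session like the one below. This is a sketch that assumes the tables, function, and trigger above have been created; the inserted values are arbitrary.

```sql
-- Rows are inserted into the main table; the trigger redirects them.
INSERT INTO tbl VALUES (500, 'first'), (5000, 'second');

-- tableoid::regclass shows which physical table holds each row.
SELECT tableoid::regclass AS stored_in, a, b FROM tbl ORDER BY a;

-- ONLY restricts the query to the main table, which should stay empty.
SELECT count(*) FROM ONLY tbl;
```

Because the trigger returns NULL, psql reports "INSERT 0 0" for the statement on the main table even though the rows were stored in the child tables.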

Partition created successfully

How to implement partition filtering?

Suppose a table has 50 partitions. If the value in a query condition can be determined, 49 partitions can be skipped outright, which greatly speeds up the scan. Partitions can also be placed on different physical disks to improve I/O throughput.
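
Placing a partition on another disk is usually done with a tablespace. A minimal sketch, assuming a hypothetical tablespace name and directory and the child table tbl_1 from the earlier example:

```sql
-- The tablespace name and directory are hypothetical. The directory must
-- already exist, be empty, and be owned by the PostgreSQL OS user.
CREATE TABLESPACE archive_disk LOCATION '/mnt/slow_disk/pgdata';

-- Move one rarely used partition to that tablespace.
-- This rewrites the table and locks it while the data is copied.
ALTER TABLE tbl_1 SET TABLESPACE archive_disk;
```

Creating a tablespace requires superuser privileges, so this step is normally done by the database administrator.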

How does a query filter partitions?
Through constraint exclusion. Whether it is used is controlled by the constraint_exclusion parameter in postgresql.conf, which accepts three values:

 constraint_exclusion = on
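 on: constraints are examined for all tables
 off: disabled; the planner never uses constraints to exclude tables
 partition: constraints are examined only for partitioned or inherited tables; this is the default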

For example:

select * from tbl where a = 12345;

The planner first finds the main table tbl, then finds its child tables through it, and checks each child's constraints against the predicate a = 12345. A child whose constraints cannot be satisfied by the predicate is removed from the plan and not scanned. This is how the partitions are filtered.
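
The effect can be observed on the inheritance example above. A small sketch, assuming tbl and its child tables exist as defined earlier:

```sql
-- Show the current setting; the default value is 'partition'.
SHOW constraint_exclusion;
SET constraint_exclusion = partition;

-- With exclusion active, the plan is expected to scan only tbl and tbl_3,
-- because the CHECK constraints of the other children contradict a = 12345.
EXPLAIN SELECT * FROM tbl WHERE a = 12345;
```

Running the same EXPLAIN after "SET constraint_exclusion = off" should list every child table again.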

How is data distributed?

With rules, the statement is rewritten during the query rewrite phase into a new INSERT statement. With a trigger, another INSERT is fired before the insert into the main table. Both mechanisms are fairly simple, so the relevant code is not covered here.
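
For comparison with the trigger shown earlier, a rule-based version might look like the sketch below. The rule names are hypothetical, and it assumes the trigger is not installed, since only one of the two mechanisms should be used.

```sql
-- Alternative to the trigger: one rule per child table.
CREATE RULE tbl_insert_1 AS
    ON INSERT TO tbl WHERE (NEW.a <= 1000)
    DO INSTEAD
    INSERT INTO tbl_1 VALUES (NEW.*);

CREATE RULE tbl_insert_2 AS
    ON INSERT TO tbl WHERE (NEW.a > 1000 AND NEW.a <= 10000)
    DO INSTEAD
    INSERT INTO tbl_2 VALUES (NEW.*);
```

Unlike the trigger, rules raise no error for a row that matches none of them; such a row is stored in the main table itself.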

Error description: creating a new partitioned main table fails with an error message.

Reason for the error: search_path = '$user' is configured in the local postgresql.conf, so a schema named after the current user must exist before tables can be created. If it does not exist, an error is reported.

Solution: specify an existing schema explicitly when creating the table, and the statement succeeds.
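
A minimal sketch of that fix follows. The schema and table names are hypothetical and were not part of the original setup.

```sql
-- Show the schema search order for the current session.
SHOW search_path;

-- demo_schema is a hypothetical name used only for illustration.
CREATE SCHEMA demo_schema;

-- Qualifying the table name with the schema avoids relying on search_path.
CREATE TABLE demo_schema.list_parted_demo (
    a int
) PARTITION BY LIST (a);
```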

PostgreSQL 10.x LIST partitioning scheme

postgres=# CREATE TABLE list_parted (
postgres(# a int
postgres(# ) PARTITION BY LIST (a);
CREATE TABLE
postgres=# CREATE TABLE part_1 PARTITION OF list_parted FOR VALUES IN (1);
CREATE TABLE
postgres=# CREATE TABLE part_2 PARTITION OF list_parted FOR VALUES IN (2);
CREATE TABLE
postgres=# CREATE TABLE part_3 PARTITION OF list_parted FOR VALUES IN (3);
CREATE TABLE
postgres=# CREATE TABLE part_4 PARTITION OF list_parted FOR VALUES IN (4);
CREATE TABLE
postgres=# CREATE TABLE part_5 PARTITION OF list_parted FOR VALUES IN (5);
CREATE TABLE
postgres=#
postgres=# insert into list_parted values(32); -- fails
ERROR:  no partition of relation "list_parted" found for row
DETAIL:  Failing row contains (32).
postgres=# insert into part_1 values(1);
INSERT 0 1
postgres=# insert into part_1 values(2); -- fails
ERROR:  new row for relation "part_1" violates partition constraint
DETAIL:  Failing row contains (2).
postgres=# explain select *from list_parted where a =1;
                           QUERY PLAN
-----------------------------------------------------------------
 Append  (cost=0.00..41.88 rows=14 width=4)
   ->  Seq Scan on list_parted  (cost=0.00..0.00 rows=1 width=4)
         Filter: (a = 1)
   ->  Seq Scan on part_1  (cost=0.00..41.88 rows=13 width=4)
         Filter: (a = 1)
(5 rows)

The above is a LIST-partitioned table. Create the main table first, then the partitions. Each partition is tied to the main table with PARTITION OF, and its constraint is given by the IN clause that follows.
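
A list partition may also accept several values, and rows inserted through the main table are routed automatically. A short sketch that continues the session above; the partition name part_6_7 is made up for this example:

```sql
-- A list partition that accepts two values.
CREATE TABLE part_6_7 PARTITION OF list_parted FOR VALUES IN (6, 7);

-- Insert through the main table; each row goes to its matching partition.
INSERT INTO list_parted VALUES (3), (6), (7);

-- tableoid::regclass shows which partition holds each row.
SELECT tableoid::regclass AS stored_in, a FROM list_parted ORDER BY a;
```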

EXPLAIN shows the execution plan of an SQL statement

cost: an abstract unit defined by the database; the cost of the SQL is estimated from statistics. (The plan is built from the statistics gathered by ANALYZE. Once it has been generated, the query runs according to that plan, and the plan does not change during execution. If the estimates differ widely from the real data, the quality of the generated plan suffers.)
rows: the estimated number of rows in the result set, based on statistics.
width: the estimated average width, in bytes, of each row in the result set, calculated from the statistics in the pg_statistic table.
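
To see how far the estimates are from reality, the statistics can be refreshed and the estimated values compared with the measured ones. A small sketch using the list_parted table from above:

```sql
-- Refresh planner statistics for the partitions.
ANALYZE list_parted;

-- EXPLAIN ANALYZE actually runs the query and prints the real row counts
-- and timings next to the estimated cost, rows, and width.
EXPLAIN ANALYZE SELECT * FROM list_parted WHERE a = 1;
```

Because EXPLAIN ANALYZE executes the statement, it should be wrapped in a transaction and rolled back when used on INSERT, UPDATE, or DELETE.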

pgAdmin: LIST partition directory

PostgreSQL 10.x RANGE partitions

Create RANGE partition

postgres=# CREATE TABLE range_parted (
postgres(# a int
postgres(# ) PARTITION BY RANGE (a);
CREATE TABLE
postgres=# CREATE TABLE range_parted1 PARTITION OF range_parted FOR VALUES from (1) TO (1000);
CREATE TABLE
postgres=# CREATE TABLE range_parted2 PARTITION OF range_parted FOR VALUES FROM (1000) TO (10000);
CREATE TABLE
postgres=# CREATE TABLE range_parted3 PARTITION OF range_parted FOR VALUES FROM (10000) TO (100000);
CREATE TABLE
postgres=#
postgres=# insert into range_parted1 values(343);
INSERT 0 1
postgres=#
postgres=# explain select *from range_parted where a=32425;
                             QUERY PLAN
---------------------------------------------------------------------
 Append  (cost=0.00..41.88 rows=14 width=4)
   ->  Seq Scan on range_parted  (cost=0.00..0.00 rows=1 width=4)
         Filter: (a = 32425)
   ->  Seq Scan on range_parted3  (cost=0.00..41.88 rows=13 width=4)
         Filter: (a = 32425)
(5 rows)
postgres=# set constraint_exclusion = off;
SET
postgres=# explain select *from range_parted where a=32425;
                             QUERY PLAN
---------------------------------------------------------------------
 Append  (cost=0.00..125.63 rows=40 width=4)
   ->  Seq Scan on range_parted  (cost=0.00..0.00 rows=1 width=4)
         Filter: (a = 32425)
   ->  Seq Scan on range_parted1  (cost=0.00..41.88 rows=13 width=4)
         Filter: (a = 32425)
   ->  Seq Scan on range_parted2  (cost=0.00..41.88 rows=13 width=4)
         Filter: (a = 32425)
   ->  Seq Scan on range_parted3  (cost=0.00..41.88 rows=13 width=4)
         Filter: (a = 32425)
(9 rows)

In the example above, the first partition covers a in [1, 1000): the lower bound is inclusive and the upper bound is exclusive, so the boundary value 1000 is stored in the second partition. RANGE partitioning is similar to LIST partitioning, with slightly different syntax: a range partition holds a continuous range of values, while a list partition holds a set of one or more discrete values. The example also shows constraint exclusion filtering the partitions: with it enabled only range_parted3 is scanned, and with it turned off all three partitions are scanned.
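
The boundary behaviour can be checked directly. A short sketch that continues the range_parted session above:

```sql
-- 999 is the largest value of the first partition; 1000 is the smallest
-- value of the second one.
INSERT INTO range_parted VALUES (999), (1000);

-- Expected: 999 is reported in range_parted1 and 1000 in range_parted2.
SELECT tableoid::regclass AS stored_in, a FROM range_parted ORDER BY a;
```

A value below the lowest bound, such as 0, has no matching partition, so inserting it through range_parted is expected to fail.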

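constraint_exclusion = on | off | partition (a parameter in postgresql.conf)
    on: constraint exclusion is attempted for all queries
    off: disabled; constraint exclusion is never performed
    partition: constraint exclusion is performed only for partitioned tables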

The type of the partition column must have a btree operator class (almost all types do; a query for checking this is shown later).
If an update would move a row outside the range of its partition, an error is reported.
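
The update restriction can be reproduced with the row inserted earlier. This sketch assumes PostgreSQL 10 and the range_parted table with the row a = 343 in range_parted1:

```sql
-- The row lives in range_parted1 (values 1 to 999). Moving it into the
-- range of range_parted2 is expected to fail on version 10 with a
-- partition constraint violation.
UPDATE range_parted SET a = 5000 WHERE a = 343;

-- An update that keeps the row inside the same partition succeeds.
UPDATE range_parted SET a = 344 WHERE a = 343;
```

From PostgreSQL 11 onward the first statement moves the row to the matching partition instead of raising an error.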

RANGE partition modification data error

PostgreSQL Partitioning Considerations

Syntax

1. Create the main table

[ PARTITION BY { RANGE | LIST } ( { column_name | ( expression ) } [ COLLATE collation ] [ opclass ] [, ... ] ) ] 

2. Create a partition

PARTITION OF parent_table [ (  
  { column_name [ column_constraint [ ... ] ]  
    | table_constraint }  
    [, ... ]  
) ] FOR VALUES partition_bound_spec  
and partition_bound_spec is:  
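{ IN ( expression [, ...] )   -- list partition
  |  
  FROM ( { expression | UNBOUNDED } [, ...] ) TO ( { expression | UNBOUNDED } [, ...] ) }  -- range partition; UNBOUNDED means negative or positive infinity (released versions of PostgreSQL 10 spell this MINVALUE / MAXVALUE)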

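Syntax explanation

PARTITION BY specifies the type of the partitioned table, RANGE or LIST.
It also specifies the partition column, or an expression, to use as the partition key.
RANGE partition key: multiple columns or multiple expressions are supported, and the two can be mixed (key columns that are not part of an expression automatically get a NOT NULL constraint).
LIST partition key: a single column or a single expression is supported.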
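
The two kinds of partition key can be illustrated as follows. All table and column names in this sketch are hypothetical:

```sql
-- RANGE key on two columns. Bounds are compared column by column,
-- like a row comparison.
CREATE TABLE sales (
    sale_year  int,
    sale_month int,
    amount     numeric
) PARTITION BY RANGE (sale_year, sale_month);

-- Covers January through June of 2018 (the upper bound is exclusive).
CREATE TABLE sales_2018_h1 PARTITION OF sales
    FOR VALUES FROM (2018, 1) TO (2018, 7);

-- LIST key on a single expression: the lower-cased first letter of name.
CREATE TABLE cities (
    name text NOT NULL
) PARTITION BY LIST (left(lower(name), 1));

CREATE TABLE cities_ab PARTITION OF cities FOR VALUES IN ('a', 'b');
```

An expression used as a partition key must be immutable, which is why only simple string functions appear here.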

The type of the partition key must have an operator class for the btree index method (the supported types can be listed from the system catalogs):

select typname from pg_type where oid in (select opcintype from pg_opclass where opcmethod = (select oid from pg_am where amname = 'btree'));
  • The main table never holds any data; rows go to the matching partition according to the partitioning rules

  • If no partition matches the partition key value of a row being inserted, an error is reported

  • Global unique, primary key, exclusion, and foreign key constraints are not supported; these constraints can only be created on the individual partitions

  • A partition must have exactly the same columns as the main table (this includes OIDs: either both have them or neither does)

  • You can add default values or constraints to the columns of a partition

  • Users can also add table-level constraints to partitions

  • If a check constraint added to a partition has the same name as a constraint on the main table, its definition must be identical to the one on the main table

  • When the user inserts data into the main table, each record is automatically routed to the matching partition. If there is no suitable partition, an error is reported

  • If an update changes the partition key so that the row would belong to another partition, an error is reported (the partition key can be updated, but the updated row cannot move to another partition)

  • When a column of the main table is renamed or its type is changed, all partitions are modified automatically at the same time

  • TRUNCATE on the main table clears the records of all its partitions (with multiple levels of partitioning, it clears all directly and indirectly inherited partitions)

  • To clear a single partition, run TRUNCATE on that partition

  • To delete a partition, use the DDL statement DROP TABLE. Note that this operation also takes an ACCESS EXCLUSIVE lock on the main table
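
Several of the points above can be sketched against the list_parted example. This assumes PostgreSQL 10 and the partitions created earlier:

```sql
-- Constraints such as a primary key are created on individual partitions.
ALTER TABLE part_1 ADD PRIMARY KEY (a);

-- Clear a single partition without touching the others.
TRUNCATE part_2;

-- Detach a partition: it becomes an ordinary standalone table
-- and keeps its data.
ALTER TABLE list_parted DETACH PARTITION part_5;

-- Drop a partition together with its data.
DROP TABLE part_4;
```

Detaching before dropping is a common choice when the data should be archived or inspected first.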
