PostgreSQL Basics (2): TOAST and Partitioned Tables

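Contents

Introduction to TOAST

Variable-length types

CHUNK

TOAST strategies

Introduction to heap-only tuples (HOT)

Table inheritance

Partitioned tables

Working with partitioned tables

Create the parent table

Create the child tables

Create indexes on the child tables

Create the trigger:

Summary: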


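Introduction to TOAST

The name stands for The Oversized-Attribute Storage Technique.

A PostgreSQL page (block) is normally 8 kB, and a single row is not allowed to span multiple pages. When a row grows past roughly 1/4 of the block size (about 2 kB), its large field values are compressed and/or split into multiple rows that are stored in a separate table associated with the main table: the TOAST table.

Only variable-length data types support TOAST. (How are large fixed-length fields handled? I don't know yet; to be revisited later.)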
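
To see these numbers on a live system, a sketch like the following could be used. It assumes a psql session on the article's testdb database; the catalog columns are standard, but which tables appear depends on what exists in your database.

```sql
-- Page (block) size of this server in bytes; 8192 on a default build.
SHOW block_size;

-- Ordinary tables in the public schema that own a TOAST table,
-- together with the current on-disk size of that TOAST table.
SELECT c.relname,
       c.reltoastrelid::regclass AS toast_table,
       pg_size_pretty(pg_relation_size(c.reltoastrelid)) AS toast_size
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE n.nspname = 'public'
  AND c.relkind = 'r'
  AND c.reltoastrelid <> 0
ORDER BY c.relname;
```

Tables whose columns are all fixed-length (for example only integer and date columns) have no TOAST table and are filtered out by the reltoastrelid condition.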

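Variable-length types

 bit 0            | bit 1            | bits 2~31 | N bytes
------------------+------------------+-----------+------------------
 compression flag | out-of-line flag | length    | value or pointer

The first 2 bits are flags and the remaining 30 bits hold the length of the value as stored, which is why a single field value is capped at 1 GB.

Compression flag: if set to 1, the value is compressed.

Out-of-line flag: if set to 1, what follows the length is not the value itself but a pointer to its location in the TOAST table.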
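
The effect of the compression flag can be observed indirectly by comparing a value's logical length with the space it actually occupies. The following is a minimal sketch; the table toast_demo is hypothetical and is dropped at the end.

```sql
-- Hypothetical scratch table; text columns use the EXTENDED strategy by default.
CREATE TABLE toast_demo (id int, body text);

-- Row 1 is highly repetitive and far above the ~2 kB threshold,
-- so it is a candidate for compression. Row 2 is tiny and is stored as-is.
INSERT INTO toast_demo VALUES (1, repeat('a', 10000));
INSERT INTO toast_demo VALUES (2, 'short');

-- octet_length: size of the value as the user sees it.
-- pg_column_size: bytes used to store it (after any compression).
SELECT id,
       octet_length(body)   AS logical_bytes,
       pg_column_size(body) AS stored_bytes
FROM toast_demo
ORDER BY id;

DROP TABLE toast_demo;
```

A stored size far below the logical size for row 1 indicates that the value was compressed before being written.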

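CHUNK

Out-of-line data is cut into chunks; each chunk is about a quarter of a block (just under 2 kB with 8 kB pages).

A TOAST table holds N chunks, and each chunk is stored as a separate row.

 chunk_id | chunk_seq       | chunk_data
----------+-----------------+-------------
 OID      | sequence number | actual data

A TOAST pointer holds: the OID of the TOAST table, the chunk_id (an OID), the uncompressed data length, and the compressed data length. Together with the header this comes to 20 bytes in the 4-byte-header layout described above; current PostgreSQL versions use a shorter header, which makes the on-disk pointer 18 bytes.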
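
The chunk rows can be inspected directly. The sketch below assumes a hypothetical scratch table chunk_demo and a role that is allowed to read from the pg_toast schema (typically a superuser). The TOAST table's name contains an OID that differs per installation, so the second query is left as a template.

```sql
-- Hypothetical scratch table. EXTERNAL forces out-of-line storage without
-- compression, so the value below is guaranteed to be split into chunks.
CREATE TABLE chunk_demo (id int, body text);
ALTER TABLE chunk_demo ALTER COLUMN body SET STORAGE EXTERNAL;
INSERT INTO chunk_demo VALUES (1, repeat('a', 10000));

-- Step 1: find the TOAST table that belongs to chunk_demo.
SELECT reltoastrelid::regclass AS toast_table
FROM pg_class
WHERE oid = 'chunk_demo'::regclass;

-- Step 2: substitute the name returned by step 1 (it has the form
-- pg_toast.pg_toast_NNNNN, where NNNNN is an OID) and run:
--
--   SELECT chunk_id, chunk_seq, length(chunk_data) AS chunk_bytes
--   FROM pg_toast.pg_toast_NNNNN
--   ORDER BY chunk_id, chunk_seq;
```

All chunks of one value share the same chunk_id, and chunk_seq gives their order.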

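TOAST strategies

PLAIN: no compression and no out-of-line storage
EXTENDED: allows both compression and out-of-line storage; compression is tried first, and if the value is still too large it is moved out of line
EXTERNAL: allows out-of-line storage but not compression
MAIN: allows compression; out-of-line storage is used only as a last resort

Setting the strategy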

testdb=# ALTER TABLE t2 ALTER title SET STORAGE MAIN;
ALTER TABLE
testdb=# \d+ t2 
                          Table "public.t2"
 Column |  Type   | Modifiers | Storage | Stats target | Description 
--------+---------+-----------+---------+--------------+-------------
 id     | integer |           | plain   |              | 
 title  | text    |           | main    |              | 

testdb=# 
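
The same setting can also be read from the system catalog, which is handy in scripts where \d+ is not available. This sketch reuses the article's table t2; the single-letter codes in attstorage are p (PLAIN), e (EXTERNAL), x (EXTENDED) and m (MAIN).

```sql
-- Switch the column to EXTERNAL. This only affects values written from
-- now on; existing rows are not rewritten.
ALTER TABLE t2 ALTER COLUMN title SET STORAGE EXTERNAL;

-- Read the per-column strategy from pg_attribute.
SELECT attname, attstorage
FROM pg_attribute
WHERE attrelid = 't2'::regclass
  AND attnum > 0          -- skip system columns
  AND NOT attisdropped
ORDER BY attnum;

-- Restore the setting used in the article.
ALTER TABLE t2 ALTER COLUMN title SET STORAGE MAIN;
```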

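Introduction to heap-only tuples (HOT)

A table can be given the storage parameter fillfactor. The fill factor is the percentage of each data block that inserts are allowed to fill; the rest of the block is left empty.

What is the remaining space for? Updates.

When a row is updated and its block still has free space, the new row version is written into that same block, and the old and new versions are linked into a chain.

As long as no indexed column changed, the indexes do not need to be updated, and the new version can still be found quickly by following the chain from the old one.

For example, when data1 is updated, the new data is written into the free part of the block and data1->next is set to point to it.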
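
Whether updates actually take the HOT path can be checked from the statistics views. The following is a minimal sketch on a hypothetical table hot_demo; the fillfactor of 70 is an arbitrary illustrative value, not a recommendation.

```sql
-- Leave about 30% of every block free for future row versions.
CREATE TABLE hot_demo (
    id   int PRIMARY KEY,
    note text
) WITH (fillfactor = 70);

INSERT INTO hot_demo
SELECT g, 'v0' FROM generate_series(1, 1000) AS g;

-- Only the non-indexed column changes, so these updates are HOT candidates.
UPDATE hot_demo SET note = 'v1' WHERE id <= 100;

-- n_tup_hot_upd counts the updates that did not need new index entries.
-- The statistics are collected asynchronously, so the counters may lag
-- a moment behind the UPDATE.
SELECT n_tup_upd, n_tup_hot_upd
FROM pg_stat_user_tables
WHERE relname = 'hot_demo';
```

Updating the indexed column id instead would rule out HOT for those rows, because a new index entry would be required.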

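Table inheritance

PostgreSQL supports table inheritance, similar to class inheritance in object-oriented languages such as C++.

A SELECT on the parent table also returns the rows of all its child tables; SELECT ... FROM ONLY restricts the query to the parent itself.

For example, take a parent table computer with two child tables, dell and hasee.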

testdb=# select * from only computer;
 name  |   cpu    | memory 
-------+----------+--------
 oldPC | i5-4570k |     12
(1 row)

testdb=# select * from computer;
     name      |    cpu    | memory 
---------------+-----------+--------
 oldPC         | i5-4570k  |     12
 laptop_ck     | i5-8250u  |     12
 andoverlaptop | i7-7700hq |     16
 gamebook      | i7-8579H  |      8
(4 rows)

testdb=# 
testdb=# 
testdb=# select * from dell;
     name      |    cpu    | memory |    model    
---------------+-----------+--------+-------------
 laptop_ck     | i5-8250u  |     12 | 15E8525
 andoverlaptop | i7-7700hq |     16 | inspire5000
(2 rows)

testdb=# select * from hasee;
   name   |   cpu    | memory |  model   | usedfor | graphisc 
----------+----------+--------+----------+---------+----------
 gamebook | i7-8579H |      8 | z7m-kp7s | Game    | 1050Ti
(1 row)

testdb=# \d+ computer
                       Table "public.computer"
 Column |  Type   | Modifiers | Storage  | Stats target | Description 
--------+---------+-----------+----------+--------------+-------------
 name   | text    |           | extended |              | 
 cpu    | text    |           | extended |              | 
 memory | integer |           | plain    |              | 
Child tables: dell,
              hasee

testdb=# 
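
The article does not show how these tables were created. Definitions along the following lines would produce the structure seen above; the column types of model, usedfor and graphisc are assumptions (text), since only the parent's \d+ output is shown. Run it in a scratch database to avoid clashing with existing tables.

```sql
-- Parent table, matching the \d+ output above.
CREATE TABLE computer (
    name   text,
    cpu    text,
    memory integer
);

-- Child tables list only their extra columns; the parent's columns are inherited.
-- Column types here are assumed, and graphisc keeps the spelling from the output.
CREATE TABLE dell  (model text) INHERITS (computer);
CREATE TABLE hasee (model text, usedfor text, graphisc text) INHERITS (computer);

-- tableoid shows which physical table each row returned via the parent lives in.
SELECT tableoid::regclass AS source_table, name, cpu, memory
FROM computer;
```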

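Partitioned tables

Partitioning here is implemented through inheritance. The parent table defines no check constraints; N child tables have the same structure as the parent, and each child adds its own constraint.

Rows are then stored in different child tables according to a fixed rule.

What are the benefits of partitioning?

1. With time-based partitions, deleting historical data is fast, and old data can be moved to cheaper, slower storage.
2. With time-based partitions, the indexes of the recent, frequently used partitions can fit entirely in memory.
3. The data of one partition is stored together, so operations on a single partition are not scattered the way they are across one big table.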
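
The first benefit can be sketched as follows, using the child-table names created later in this article. Retiring a whole month is a catalog operation rather than a row-by-row DELETE.

```sql
-- Detach the September partition from the parent. Queries against
-- mileage_detail no longer see its rows, but the table itself remains
-- and could be archived or moved to another tablespace.
ALTER TABLE mileage_detail_y2018m09 NO INHERIT mileage_detail;

-- Once the data is no longer needed, drop it in one step.
DROP TABLE mileage_detail_y2018m09;

-- For comparison, the single-table equivalent has to visit every matching
-- row and leaves dead tuples behind for VACUUM:
--   DELETE FROM mileage_detail WHERE mdate < DATE '2018-10-01';
```

If inserts are routed by a trigger function with one branch per month, the branch for the retired month should be removed as well.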

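Steps:
1. Create the parent table without any check constraints.
2. Create the child tables, normally without adding columns.
3. Add a check constraint to each child table.
4. Add indexes on the child tables' columns.
5. Define a rule or a trigger that redirects operations on the parent table to the child tables. This is much like polymorphism in object-oriented programming.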

Working with partitioned tables

Create the parent table


testdb=# create table mileage_detail(
testdb(# plate_number varchar(10) not null,
testdb(# mdate         date not null,
testdb(# mileage     int not null,
testdb(# brand   varchar(10),
testdb(# model      varchar (10),
testdb(# PRIMARY KEY(plate_number, mdate)
testdb(# );
CREATE TABLE
testdb=# 
testdb=# \d
             List of relations
 Schema |      Name      | Type  |  Owner   
--------+----------------+-------+----------
 public | computer       | table | postgres
 public | dell           | table | postgres
 public | hasee          | table | postgres
 public | mileage_detail | table | postgres
 public | t1             | table | postgres
 public | t2             | table | postgres
(6 rows)

testdb=# 
testdb=# \d+ mileage_detail
                              Table "public.mileage_detail"
    Column    |         Type          | Modifiers | Storage  | Stats target | Description 
--------------+-----------------------+-----------+----------+--------------+-------------
 plate_number | character varying(10) | not null  | extended |              | 
 mdate        | date                  | not null  | plain    |              | 
 mileage      | integer               | not null  | plain    |              | 
 brand        | character varying(10) |           | extended |              | 
 model        | character varying(10) |           | extended |              | 
Indexes:
    "mileage_detail_pkey" PRIMARY KEY, btree (plate_number, mdate)

testdb=# 
testdb=# 

Create the child tables

create table mileage_detail_y2018m09(CHECK (mdate >= DATE '2018-09-01' 
AND mdate < DATE '2018-10-01')) INHERITS(mileage_detail);

create table mileage_detail_y2018m10(CHECK (mdate >= DATE '2018-10-01' 
AND mdate < DATE '2018-11-01')) INHERITS(mileage_detail);

create table mileage_detail_y2018m11(CHECK (mdate >= DATE '2018-11-01' 
AND mdate < DATE '2018-12-01')) INHERITS(mileage_detail);

testdb=# \d+ mileage_detail
                              Table "public.mileage_detail"
    Column    |         Type          | Modifiers | Storage  | Stats target | Description 
--------------+-----------------------+-----------+----------+--------------+-------------
 plate_number | character varying(10) | not null  | extended |              | 
 mdate        | date                  | not null  | plain    |              | 
 mileage      | integer               | not null  | plain    |              | 
 brand        | character varying(10) |           | extended |              | 
 model        | character varying(10) |           | extended |              | 
Indexes:
    "mileage_detail_pkey" PRIMARY KEY, btree (plate_number, mdate)
Child tables: mileage_detail_y2018m09,
              mileage_detail_y2018m10,
              mileage_detail_y2018m11
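
The \d+ output of the parent lists the children but not their date ranges. A catalog query such as the sketch below shows each child's CHECK constraint, which makes it easy to spot gaps or overlaps between months.

```sql
-- One row per CHECK constraint on each direct child of mileage_detail.
SELECT c.conrelid::regclass      AS child_table,
       c.conname                 AS constraint_name,
       pg_get_constraintdef(c.oid) AS definition
FROM pg_constraint c
JOIN pg_inherits i ON i.inhrelid = c.conrelid
WHERE i.inhparent = 'mileage_detail'::regclass
  AND c.contype = 'c'            -- 'c' = CHECK constraint
ORDER BY child_table;
```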

Create indexes on the child tables

CREATE INDEX mileage_detail_y2018m09_mdate ON mileage_detail_y2018m09 (mdate);
CREATE INDEX mileage_detail_y2018m10_mdate ON mileage_detail_y2018m10 (mdate);
CREATE INDEX mileage_detail_y2018m11_mdate ON mileage_detail_y2018m11 (mdate);
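
With inheritance, the parent's indexes and its primary key are not inherited by the child tables (NOT NULL and CHECK constraints are). If the (plate_number, mdate) uniqueness of the parent should also hold inside each child, it can be declared per child, as in this sketch:

```sql
-- Each statement also creates the supporting unique index on that child.
ALTER TABLE mileage_detail_y2018m09 ADD PRIMARY KEY (plate_number, mdate);
ALTER TABLE mileage_detail_y2018m10 ADD PRIMARY KEY (plate_number, mdate);
ALTER TABLE mileage_detail_y2018m11 ADD PRIMARY KEY (plate_number, mdate);
```

Uniqueness is then enforced within each child separately. Because mdate is part of the key and every child covers a disjoint date range, that is sufficient for this particular schema.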

Create the trigger:

Trigger function

-- Route each row inserted into mileage_detail to the child table for its month.
CREATE OR REPLACE FUNCTION mileage_detail_insert_trigger()
RETURNS TRIGGER AS $$
BEGIN
    IF (NEW.mdate >= DATE '2018-09-01' AND NEW.mdate < DATE '2018-10-01') THEN
        INSERT INTO mileage_detail_y2018m09 VALUES (NEW.*);
    ELSIF (NEW.mdate >= DATE '2018-10-01' AND NEW.mdate < DATE '2018-11-01') THEN
        INSERT INTO mileage_detail_y2018m10 VALUES (NEW.*);
    ELSIF (NEW.mdate >= DATE '2018-11-01' AND NEW.mdate < DATE '2018-12-01') THEN
        INSERT INTO mileage_detail_y2018m11 VALUES (NEW.*);
    ELSE
        -- No child table covers this date.
        RAISE EXCEPTION 'Date out of range. Fix the mileage_detail_insert_trigger function';
    END IF;
    -- Returning NULL cancels the insert into the parent table itself.
    RETURN NULL;
END;
$$
LANGUAGE plpgsql;

Trigger

CREATE TRIGGER insert_mileage_detail_trigger
    BEFORE INSERT ON mileage_detail
    FOR EACH ROW EXECUTE PROCEDURE mileage_detail_insert_trigger();

testdb=# \d+ mileage_detail
                              Table "public.mileage_detail"
    Column    |         Type          | Modifiers | Storage  | Stats target | Description 
--------------+-----------------------+-----------+----------+--------------+-------------
 plate_number | character varying(10) | not null  | extended |              | 
 mdate        | date                  | not null  | plain    |              | 
 mileage      | integer               | not null  | plain    |              | 
 brand        | character varying(10) |           | extended |              | 
 model        | character varying(10) |           | extended |              | 
Indexes:
    "mileage_detail_pkey" PRIMARY KEY, btree (plate_number, mdate)
Triggers:
    insert_mileage_detail_trigger BEFORE INSERT ON mileage_detail FOR EACH ROW EXECUTE PROCEDURE mileage_detail_insert_trigger()
Child tables: mileage_detail_y2018m09,
              mileage_detail_y2018m10,
              mileage_detail_y2018m11

testdb=# 
testdb=# 
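
The trigger function above needs a new branch for every new month. One common alternative is to compute the child-table name from the date and insert with dynamic SQL. The sketch below is not the article's method; the function name mileage_detail_insert_dynamic is hypothetical, and it assumes that child tables keep the mileage_detail_yYYYYmMM naming pattern and have exactly the parent's columns.

```sql
CREATE OR REPLACE FUNCTION mileage_detail_insert_dynamic()
RETURNS TRIGGER AS $$
DECLARE
    child_name text;
BEGIN
    -- Build the child name from the row's date, e.g. mileage_detail_y2018m09.
    child_name := 'mileage_detail_y' || to_char(NEW.mdate, 'YYYY"m"MM');

    -- %I quotes the name as an identifier; $1 is bound to the NEW row.
    EXECUTE format('INSERT INTO %I SELECT ($1).*', child_name) USING NEW;

    -- As in the static version, suppress the insert into the parent.
    RETURN NULL;
EXCEPTION
    WHEN undefined_table THEN
        RAISE EXCEPTION 'No partition % exists for date %', child_name, NEW.mdate;
END;
$$
LANGUAGE plpgsql;

-- To switch over, the existing trigger would be replaced, for example:
--   DROP TRIGGER insert_mileage_detail_trigger ON mileage_detail;
--   CREATE TRIGGER insert_mileage_detail_trigger
--       BEFORE INSERT ON mileage_detail
--       FOR EACH ROW EXECUTE PROCEDURE mileage_detail_insert_dynamic();
```

The trade-off is that dynamic SQL is planned on every call, so it is somewhat slower per row than the hard-coded IF chain, in exchange for not having to edit the function each month.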

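Verify that the trigger routes rows to the child tables

Each INSERT below reports INSERT 0 0 because the trigger function returns NULL: the row is stored in a child table and nothing is inserted into the parent itself.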

testdb=# 
testdb=# INSERT INTO mileage_detail VALUES('AJ1E08', DATE '2018-09-05', 10000, 'toyota', 'corolla');
INSERT 0 0
testdb=# INSERT INTO mileage_detail VALUES('AJ1E08', DATE '2018-09-07', 10200, 'toyota', 'corolla');
INSERT 0 0
testdb=# 
testdb=# 
testdb=# INSERT INTO mileage_detail VALUES('AJ1E08', DATE '2018-10-22', 11200, 'toyota', 'corolla');
INSERT 0 0
testdb=# 
testdb=# INSERT INTO mileage_detail VALUES('AJ1E08', DATE '2018-11-21', 12200, 'toyota', 'corolla');
INSERT 0 0
testdb=# 
testdb=# select * from mileage_detail
testdb-# ;
 plate_number |   mdate    | mileage | brand  |  model  
--------------+------------+---------+--------+---------
 AJ1E08       | 2018-09-05 |   10000 | toyota | corolla
 AJ1E08       | 2018-09-07 |   10200 | toyota | corolla
 AJ1E08       | 2018-10-22 |   11200 | toyota | corolla
 AJ1E08       | 2018-11-21 |   12200 | toyota | corolla
(4 rows)

testdb=# select * from mileage_detail_y2018m09 
testdb-# ;
 plate_number |   mdate    | mileage | brand  |  model  
--------------+------------+---------+--------+---------
 AJ1E08       | 2018-09-05 |   10000 | toyota | corolla
 AJ1E08       | 2018-09-07 |   10200 | toyota | corolla
(2 rows)

testdb=# select * from mileage_detail_y2018m10;
 plate_number |   mdate    | mileage | brand  |  model  
--------------+------------+---------+--------+---------
 AJ1E08       | 2018-10-22 |   11200 | toyota | corolla
(1 row)

testdb=# select * from mileage_detail_y2018m11;
 plate_number |   mdate    | mileage | brand  |  model  
--------------+------------+---------+--------+---------
 AJ1E08       | 2018-11-21 |   12200 | toyota | corolla
(1 row)

testdb=# 
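
Instead of querying every child separately, the routing can be checked in a single query, and the planner's use of the CHECK constraints can be inspected with EXPLAIN. A minimal sketch against the tables above:

```sql
-- Show which physical table holds each row returned through the parent.
SELECT tableoid::regclass AS stored_in, plate_number, mdate, mileage
FROM mileage_detail
ORDER BY mdate;

-- The parent itself holds no rows, because the trigger returns NULL.
SELECT count(*) AS rows_in_parent
FROM ONLY mileage_detail;

-- With constraint exclusion enabled ('partition' is the default setting),
-- the plan should skip children whose CHECK constraint rules out the date.
SET constraint_exclusion = partition;
EXPLAIN
SELECT * FROM mileage_detail WHERE mdate = DATE '2018-10-22';
```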

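Partitioning seems to achieve a kind of polymorphism: the trigger rules automatically sort rows into the right child table, which makes later operations on the data easier.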

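Summary:

This post covered out-of-line storage with TOAST tables, partitioned tables, and related topics. Next up: event triggers, tablespaces, views, and more.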


Reposted from blog.csdn.net/jacicson1987/article/details/82705283