PostgreSQL / TimescaleDB: scheduled compression of hypertables and deletion of hypertable partitions (chunks)

At work I use PostgreSQL with the TimescaleDB time-series extension to store real-time data. The volume of collected data is very large and the available storage is insufficient, so I had to think about how to keep the data manageable.

Both tasks can be carried out on hypertables using functions that ship with TimescaleDB itself.

One. Compression with compress_chunk()

To avoid losing data as far as possible, I wanted to put off deleting anything, so my first step was to compress the data using the built-in function compress_chunk().

1. Querying chunks by time: show_chunks()

CREATE OR REPLACE FUNCTION "hrmw"."show_chunks"("hypertable" regclass=NULL::regclass, "older_than" any=NULL::unknown, "newer_than" any=NULL::unknown)
  RETURNS SETOF "pg_catalog"."regclass" AS '$libdir/timescaledb-1.7.1', 'ts_chunk_show_chunks'
  LANGUAGE c STABLE
  COST 1
  ROWS 1000

show_chunks() usage

select show_chunks(); -- list all chunks
select show_chunks(hypertable_name); -- list all chunks under a given hypertable
SELECT show_chunks(older_than => INTERVAL '10 days', newer_than => INTERVAL '20 days');
-- list the chunks between 10 and 20 days old

To find the chunks holding data from about 180 days ago:

SELECT show_chunks('hypertable_name', older_than => INTERVAL '180 days', newer_than => INTERVAL '182 days');
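
The call returns fully qualified chunk names, which are exactly what compress_chunk() later takes as its argument. Hypothetical output:

                show_chunks
 ------------------------------------------
  _timescaledb_internal._hyper_4_238_chunk
 (1 row)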

2. The compress_chunk() compression function

CREATE OR REPLACE FUNCTION "hrmw"."compress_chunk"("uncompressed_chunk" regclass, "if_not_compressed" bool=false)
  RETURNS "pg_catalog"."regclass" AS '$libdir/timescaledb-1.7.1', 'ts_compress_chunk'
  LANGUAGE c VOLATILE STRICT
  COST 1

2.1 First make the hypertable compressible

ALTER TABLE hypertable_name SET (
timescaledb.compress,
timescaledb.compress_segmentby = 'primary_key_column',
timescaledb.compress_orderby = 'time_column DESC');
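
As a concrete illustration, for a hypothetical hypertable conditions(time, device_id, temperature) that should be segmented by device and ordered by time, the statement would read:

ALTER TABLE conditions SET (
  timescaledb.compress,
  timescaledb.compress_segmentby = 'device_id',
  timescaledb.compress_orderby = 'time DESC');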

2.2 Compress a partition (chunk)

- Compress with compress_chunk(). For example, to compress a chunk of 180-day-old data:

SELECT compress_chunk( '_timescaledb_internal._hyper_4_238_chunk');
-- SELECT compress_chunk('_timescaledb_internal.chunk_name');
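
Because show_chunks() returns the chunk names as regclass values, the lookup and the compression can also be combined into one statement (a sketch, reusing the placeholder hypertable name from above):

SELECT compress_chunk(c, true)
FROM show_chunks('hypertable_name', older_than => INTERVAL '180 days', newer_than => INTERVAL '182 days') AS c;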

- Check space usage after compression

SELECT * FROM timescaledb_information.compressed_chunk_stats;
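
The view can also be narrowed to a single hypertable; the column names below follow the TimescaleDB 1.7 schema of this view, so treat this as a sketch:

SELECT chunk_name, compression_status, uncompressed_total_bytes, compressed_total_bytes
FROM timescaledb_information.compressed_chunk_stats
WHERE hypertable_name = 'conditions'::regclass;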

- Decompress

SELECT decompress_chunk('_timescaledb_internal._hyper_4_26_chunk');
-- SELECT decompress_chunk('_timescaledb_internal.chunk_name');
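
Note that in TimescaleDB 1.x a compressed chunk cannot be written to, so backfilling old data means decompressing the chunk, writing, and compressing it again (a sketch using the chunk name from above):

SELECT decompress_chunk('_timescaledb_internal._hyper_4_26_chunk');
-- ... INSERT / UPDATE the hypertable for that time range ...
SELECT compress_chunk('_timescaledb_internal._hyper_4_26_chunk');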

3. A function to automatically compress 180-day-old data

CREATE 
	OR REPLACE FUNCTION "hrmw"."target_compress_chunk" ( ) RETURNS "pg_catalog"."void" AS $BODY$ 
DECLARE -- declare variables
	t_chunk regclass; -- each chunk name returned by show_chunks()
BEGIN -- function body
	-- compress every chunk holding data between 180 and 182 days old;
	-- a loop is needed because show_chunks() can return several chunks
	FOR t_chunk IN SELECT show_chunks ( 'hypertable_name', older_than => INTERVAL '180 days', newer_than => INTERVAL '182 days' ) LOOP
		PERFORM compress_chunk ( t_chunk, TRUE ); -- TRUE: skip chunks that are already compressed
	END LOOP;
END; -- end of function
$BODY$ LANGUAGE plpgsql VOLATILE COST 100

4. Add timed tasks

(Automatically compress the 180-day-old partitions (chunks) every day at 2:30.)
Create the scheduled task with the pgAgent job scheduler in the pgAdmin tool.
Code to be added to the job step:

SET search_path TO hrmw;
select hrmw.target_compress_chunk(); -- run the function target_compress_chunk()

Step 1: add a scheduled task.
Step 2: configure the job. (Screenshots omitted.)
Step 3: set the run time for the scheduled task.
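
As an aside, TimescaleDB 1.7 also has a built-in compression policy; if simply compressing everything older than 180 days is acceptable, it can replace the pgAgent job entirely (a minimal sketch with the placeholder hypertable name):

SELECT add_compress_chunks_policy('hypertable_name', INTERVAL '180 days');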

Two. Delete partitions (chunks)

Because there is still a lot of data even after compression, data from more than half a year ago has to be deleted after all.
Chunks can be deleted in batches with the drop_chunks() function, as sketched below.
Being lazy, I used the automated policy
add_drop_chunks_policy() instead.
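
For reference, a one-off manual delete of everything older than six months looks like this in the TimescaleDB 1.x API (a sketch, using the same example hypertable 'conditions' as the policy below):

SELECT drop_chunks(INTERVAL '6 months', 'conditions');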

-- create a policy that keeps only the most recent six months of data (drops the chunks outright)
SELECT add_drop_chunks_policy('conditions', INTERVAL '6 months');

Query the policies:

select * from timescaledb_information.drop_chunks_policies;

Each hypertable can have only one such policy.
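
Consequently, changing the retention interval means dropping the existing policy first (a sketch using the matching 1.x function):

SELECT remove_drop_chunks_policy('conditions');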

Field name: description
hypertable (REGCLASS): the hypertable the policy is applied to
older_than (INTERVAL): chunks older than this interval are dropped when the policy runs
cascade (BOOLEAN): whether to run the policy with the cascade option, which also drops objects that depend on the chunks
job_id (INTEGER): ID of the background job set up to implement the drop_chunks policy
schedule_interval (INTERVAL): the interval at which the job runs
max_runtime (INTERVAL): the maximum time the background job scheduler allows the job to run before stopping it
max_retries (INTEGER): the number of times the job is retried if it fails
retry_period (INTERVAL): the time the scheduler waits between failed retries

Reference materials
https://blog.csdn.net/woai243779594/article/details/107544310?utm_medium=distribute.pc_relevant_t0.none-task-blog-BlogCommendFromBaidu-1.control&depth_1-utm_source=distribute.pc_relevant_t0.none-task-blog-BlogCommendFromBaidu-1.control

That's basically it. If you have any questions, leave a comment or message me privately.
By the way, there is very little material on TimescaleDB out there, so pulling this together was not easy. Thanks.

Original article: blog.csdn.net/yang_z_1/article/details/111560747