PostgreSQL batch import performance (using dblink asynchronous calls)

Labels

PostgreSQL, unlogged table, batch, dblink

Background

When importing data in batches, how do we push the system close to its performance limit?

Where are the usual bottlenecks?

1. WAL lock

2. Index lock

3. Relation extension lock

4. Autovacuum interference
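Before removing a bottleneck, it helps to confirm which one the import is actually hitting. A minimal sketch, assuming PostgreSQL 9.6 or later, where pg_stat_activity exposes wait events (for example, wait_event 'extend' corresponds to the relation extension lock; exact event names vary by version):

select wait_event_type, wait_event, count(*)
from pg_stat_activity
where wait_event is not null
group by 1, 2
order by 3 desc;

Run this from a separate session while the import is in flight; whichever wait dominates points at the bottleneck to attack first.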

So the best approach is to eliminate the problems above, for example (a minimal setup sketch follows this list):

1. Use multiple tables to relieve contention on any single table's relation extension lock

2. Use unlogged tables (data is lost after a crash, so restrict them to scenarios that can tolerate this), again spread across multiple tables, to remove the WAL lock bottleneck

3. Drop (or defer creating) indexes to avoid index lock contention

4. Disable autovacuum on the target tables during the import to avoid autovacuum interference
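For example, a minimal setup sketch combining points 1, 2 and 4 (the table names ut0..ut7 are illustrative, and 8 tables is an arbitrary choice; per point 3, no indexes are created):

do language plpgsql $$
declare
  i int;
begin
  -- one unlogged table per writer, each with autovacuum disabled
  for i in 0..7 loop
    execute format('create unlogged table ut%s (c1 int8) with (autovacuum_enabled=off, toast.autovacuum_enabled=off)', i);
  end loop;
end;
$$;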

With these out of the way, the machine's full potential can basically be tapped, as the following tests demonstrate:

"HTAP database PostgreSQL scenario and performance test 43 - (OLTP+OLAP) unlogged table with index multi-table batch write"

"HTAP database PostgreSQL scenario and performance test 42 - (OLTP+OLAP) unlogged table without index multi-table batch write"

"HTAP database PostgreSQL scenario and performance test 41 - (OLTP+OLAP) multi-table batch write with index"

"HTAP database PostgreSQL scenario and performance test 40 - (OLTP+OLAP) multi-table batch write without index"

"HTAP database PostgreSQL scenario and performance test 39 - (OLTP+OLAP) with index multi-point write"

"HTAP database PostgreSQL scenario and performance test 38 - (OLTP+OLAP) without index multi-point write"

"HTAP database PostgreSQL scenario and performance test 37 - (OLTP+OLAP) batch write with index single table"

"HTAP database PostgreSQL scenario and performance test 36 - (OLTP+OLAP) batch write of single table without index"

"HTAP Database PostgreSQL Scenario and Performance Test 35 - (OLTP+OLAP) Single Point Write with Index"

"HTAP database PostgreSQL scenario and performance test 34 - (OLTP+OLAP) without index single point write"

Single-table test

1. Create a test table

postgres=# create unlogged table ut(c1 int8) with (autovacuum_enabled=off, toast.autovacuum_enabled=off);  
CREATE TABLE  
Time: 12.723 ms  

2. Generate 100 million rows of test data

postgres=# insert into ut select generate_series(1,100000000);  
INSERT 0 100000000  
Time: 43378.465 ms (00:43.378)  
  
postgres=# copy ut to '/data01/pg/ut.csv';  
COPY 100000000  
Time: 20292.684 ms (00:20.293)  
# ll -ht /data01/pg/ut.csv   
-rw-r--r-- 1 digoal digoal 848M Apr 27 22:02 /data01/pg/ut.csv  

3. Create the dblink extension

create extension dblink;  

4. Create a function that establishes a dblink connection without raising an error when called repeatedly
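A minimal sketch of such a function (the name conn is illustrative): it wraps dblink_connect and swallows any error, such as "duplicate connection name", so it is safe to call repeatedly:

create or replace function conn(
  text,   -- dblink connection name
  text    -- connection string
) returns void as $$
begin
  perform dblink_connect($1, $2);
  return;
exception when others then
  return;
end;
$$ language plpgsql strict;

With the helper in place, the import can be parallelized with dblink's asynchronous interface, which is the point of this article: dblink_send_query returns immediately, so one session can keep many backends inserting concurrently. A hedged usage sketch (the connection string and row count are illustrative):

select conn('c1', 'hostaddr=127.0.0.1 port=5432 user=postgres dbname=postgres');
select dblink_send_query('c1', 'insert into ut select generate_series(1, 1000000)');
-- ... issue more dblink_send_query calls on other named connections ...
select dblink_is_busy('c1');                           -- 1 while still running, 0 when done
select * from dblink_get_result('c1') as t(res text);  -- for a command, one row with its status string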
