Reasons for a slow UPDATE in a PostgreSQL database (resolved)

A few days ago I ran into an UPDATE statement (touching about 140,000 rows) that had been running for over an hour without finishing.
Here is how I tracked the problem down.
The UPDATE copies values from a temporary table into a production table.
Because the data is confidential, I won't show screenshots; I'll just describe the general idea and method.
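For reference, the statement had roughly the following shape. The table and column names here (official_table, tmp_table, id, val) are made up for illustration only, since the real ones cannot be shown:

UPDATE official_table o
SET    val = t.val
FROM   tmp_table t
WHERE  o.id = t.id;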

1. Check if there is a problem with the statement

I copied both tables (structure and data) and ran the statement manually against the copies. It finished in less than a minute, which confirmed that the statement itself was correct.
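A minimal sketch of that test, using the same hypothetical names as above: copy structure and data into scratch tables, then run and time the identical UPDATE against the copies.

-- Throwaway copies of both tables (names are hypothetical)
CREATE TABLE official_table_copy AS TABLE official_table;
CREATE TABLE tmp_table_copy      AS TABLE tmp_table;

-- Same UPDATE against the copies; time it (e.g. with \timing in psql)
UPDATE official_table_copy o
SET    val = t.val
FROM   tmp_table_copy t
WHERE  o.id = t.id;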

2. Find the factors affecting the UPDATE

My first thought was locks: was a lock making the statement wait, or was there a deadlock?

Query for locks:

SELECT w1.pid              AS waiting_pid,
       w1.mode             AS waiting_lock_mode,
       w2.usename          AS waiting_user,
       w2.query            AS waiting_query,
       b1.pid              AS blocking_pid,
       b1.mode             AS blocking_lock_mode,
       b2.usename          AS blocking_user,
       b2.query            AS blocking_query,
       b2.application_name AS blocking_application,
       b2.client_addr      AS blocking_client_addr,
       b2.query_start      AS blocking_query_start
FROM pg_locks w1
JOIN pg_stat_activity w2 ON w1.pid = w2.pid
JOIN pg_locks b1 ON w1.transactionid = b1.transactionid AND w1.pid != b1.pid
JOIN pg_stat_activity b2 ON b1.pid = b2.pid
WHERE NOT w1.granted;

-- Terminate the blocking backend (62560 was the blocking PID found above)
SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE pid = 62560;

The query did show a lock, so I killed the blocking process and restarted the service. But as I kept watching, the lock reappeared after about 5 minutes. After several attempts I concluded that the lock was not the root cause.
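As an alternative to the join above, PostgreSQL 9.6 and later provide pg_blocking_pids(), which reports directly which sessions are blocked and by whom:

-- Sessions that are currently blocked, and the PIDs blocking them (PostgreSQL 9.6+)
SELECT pid,
       pg_blocking_pids(pid) AS blocked_by,
       state,
       query
FROM   pg_stat_activity
WHERE  cardinality(pg_blocking_pids(pid)) > 0;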

3. Check the database parameters

The first thing I checked was the shared_buffers parameter, and it looked fine.
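For reference, the current value can be checked like this:

SHOW shared_buffers;

-- or with more detail
SELECT name, setting, unit, source
FROM   pg_settings
WHERE  name = 'shared_buffers';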

4. Shrink the table (VACUUM)

While checking the running processes, I found that an autovacuum on the table had itself already been running for about 10 minutes before the table was actually cleaned up.

The following query is useful for server monitoring: it lists the running processes, how long each has been running, and the locks they are associated with.

SELECT
    c.relname            AS object_name,
    l.locktype           AS lockable_object_type,
    l.pid                AS process_id,
    l.mode               AS lock_mode_held,
    l.granted            AS lock_granted,
    l.fastpath,
    psa.datname          AS database_name,
    psa.usesysid         AS user_id,
    psa.usename          AS user_name,
    psa.application_name AS application_name,
    psa.client_addr      AS client_ip,
    psa.client_port      AS client_tcp_port,
    psa.backend_start    AS backend_start_time,
    psa.xact_start       AS transaction_start_time,
    psa.query_start      AS query_start_time,
    psa.state_change     AS state_change_time,
    psa.wait_event_type  AS wait_event_type,
    psa.wait_event       AS wait_event,
    psa.state            AS query_state,
    psa.backend_xid,     -- set when the transaction has performed writes
    psa.backend_xmin,    -- the backend's transaction snapshot
    psa.query            AS query_text,
    now() - query_start  AS duration
FROM
    pg_locks l
    INNER JOIN pg_stat_activity psa ON ( psa.pid = l.pid )
    LEFT OUTER JOIN pg_class c ON ( l.relation = c.oid )
    -- WHERE l.relation = 'tb_base_apparatus'::regclass
WHERE relkind = 'r'
ORDER BY query_start ASC;

Check whether a table has crossed the autovacuum / autoanalyze thresholds:

SELECT
    c.relname AS table_name,
    (current_setting('autovacuum_analyze_threshold')::NUMERIC(12,4))
      + (current_setting('autovacuum_analyze_scale_factor')::NUMERIC(12,4)) * reltuples AS autoanalyze_threshold,
    (current_setting('autovacuum_vacuum_threshold')::NUMERIC(12,4))
      + (current_setting('autovacuum_vacuum_scale_factor')::NUMERIC(12,4)) * reltuples AS autovacuum_threshold,
    reltuples::DECIMAL(19,0)  AS live_tuples,
    n_dead_tup::DECIMAL(19,0) AS dead_tuples
FROM
    pg_class c
    LEFT JOIN pg_stat_all_tables d ON c.relname = d.relname
WHERE
    c.relname LIKE 'tb%'
    AND reltuples > 0
    AND n_dead_tup > (current_setting('autovacuum_analyze_threshold')::NUMERIC(12,4))
                   + (current_setting('autovacuum_analyze_scale_factor')::NUMERIC(12,4)) * reltuples;

This showed that the table had accumulated far too many dead tuples.
I then vacuumed the table manually, and after that the UPDATE completed quickly.

VACUUM FULL VERBOSE table_name;
VACUUM FULL VERBOSE ANALYZE table_name;
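To confirm that the cleanup actually helped, the dead tuple count can be re-checked afterwards (table_name is a placeholder):

SELECT relname, n_live_tup, n_dead_tup, last_vacuum, last_autovacuum
FROM   pg_stat_user_tables
WHERE  relname = 'table_name';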

5. Summary

In a case like this, first make sure the SQL statement itself is correct, then check whether a lock is blocking it and look at the plan with EXPLAIN, then review the database parameters to rule out a configuration or performance problem, and finally check whether the table needs to be vacuumed.
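If you want to inspect the plan with EXPLAIN without actually committing the slow UPDATE, one common approach (shown with the same hypothetical names as above) is to run EXPLAIN ANALYZE inside a transaction and roll it back:

BEGIN;

EXPLAIN (ANALYZE, BUFFERS)
UPDATE official_table o
SET    val = t.val
FROM   tmp_table t
WHERE  o.id = t.id;

-- EXPLAIN ANALYZE actually executes the UPDATE, so roll it back
ROLLBACK;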

Origin blog.csdn.net/yang_z_1/article/details/113237696