Postgres slow SQL troubleshooting methods

One, slow SQL logging

1.1 Related parameter settings

-- 基本日志采集开启参数
log_destination = 'stderr'              # Valid values are combinations of
logging_collector = on                  # Enable capturing of stderr and csvlog
log_directory = '/data/pgsql12/logs'    # directory where log files are written,
log_filename = 'postgresql-%w.log'      # 最多保存一周的日志,每天一个文件
log_file_mode = 0600                    # creation mode for log files,
log_truncate_on_rotation = on           # If on, an existing log file with the same name as the new log file will be truncated rather than appended to.
log_rotation_age = 1d                   # Automatic rotation of logfiles will
log_rotation_size = 10MB                # Automatic rotation of logfiles will

-- 慢SQL记录相关参数
log_min_duration_statement = 1000       # -1 is disabled, 0 logs all statements,> 0 logs only statements running at least this number of milliseconds
log_statement = 'mod'                   # none, ddl, mod, all 。记录慢SQL可使用all或者mod

-- 对某个数据库进行慢SQL记录设置
alter database db1 set log_min_duration_statement=5000;

1.2 View related slow SQL days

-- 当log_statement=all,会记录所有语句
# tail -100f postgresql-6.log
2020-11-14 15:47:54.246 CST [9669] STATEMENT:  select info from t1 group by info order by crt_time limit 10;
2020-11-14 15:48:11.145 CST [9669] LOG:  statement: select info from t1 group by info  limit 10;
2020-11-14 15:48:11.148 CST [9669] LOG:  duration: 2.865 ms
2020-11-14 15:48:31.490 CST [9669] ERROR:  syntax error at or near "having" at character 30
2020-11-14 15:48:31.490 CST [9669] STATEMENT:  select info from t1 group by having count(*) > 30 limit 10;
2020-11-14 15:48:37.600 CST [9669] LOG:  statement: select info from t1 group by info having count(*) > 30 limit 10;        //记录慢SQL语句
2020-11-14 15:48:37.604 CST [9669] LOG:  duration: 4.161 ms                                 //记录慢SQL执行时间

-- 当 log_statement=mod,仅仅会记录执行时间大于 log_min_duration_statement 的语句
2020-11-14 16:07:31.600 CST [10874] LOG:  duration: 3.817 ms  statement: select info from t1 group by info having count(*) > 30 limit 10;
2020-11-14 16:09:36.680 CST [10874] LOG:  duration: 3.491 ms  statement: select info from t1 group by info having count(*) > 30 limit 10;

Two, pg_stat_statements plugin

2.1 Plug-in installation

1. Compile and install

-- 进入postgresql源码目录的contrib目录下,查看是否有pg_stat_statements子目录
# pwd
/usr/local/postgresql-12.2/contrib/pg_stat_statements
# make && make install

-- 进入postgresql的安装目录,若在share/extension目录下以下pg_stat_statements相关文件,说明该工具安装成功
# pwd
/usr/local/pgsql/share/extension
[postgres@sansi_test extension]$ ll | grep pg_stat_statements
-rw-r--r-- 1 root root  1246 Oct 25 15:46 pg_stat_statements--1.0--1.1.sql
-rw-r--r-- 1 root root  1336 Oct 25 15:46 pg_stat_statements--1.1--1.2.sql
-rw-r--r-- 1 root root  1454 Oct 25 15:46 pg_stat_statements--1.2--1.3.sql
-rw-r--r-- 1 root root   345 Oct 25 15:46 pg_stat_statements--1.3--1.4.sql
-rw-r--r-- 1 root root   305 Oct 25 15:46 pg_stat_statements--1.4--1.5.sql
-rw-r--r-- 1 root root  1427 Oct 25 15:46 pg_stat_statements--1.4.sql
-rw-r--r-- 1 root root   376 Oct 25 15:46 pg_stat_statements--1.5--1.6.sql
-rw-r--r-- 1 root root   806 Oct 25 15:46 pg_stat_statements--1.6--1.7.sql
-rw-r--r-- 1 root root   191 Oct 25 15:46 pg_stat_statements.control
-rw-r--r-- 1 root root   449 Oct 25 15:46 pg_stat_statements--unpackaged--1.0.sql

2. Modify the configuration file

-- 预加载插件
shared_preload_libraries='pg_stat_statements,pg_pathman'        //若数据库同时使用pg_pathman,该顺序不可改变
-- pg_stat_statements记录最大行数,默认为1000
pg_stat_statements.max = 10000
pg_stat_statements.track = all

-- 重启数据库
# su - postgres
$ pg_ctl -D /data/pgsql12/data restart

3. Load pg_stat_statement

-- 登录数据库,载入pg_stat_statement插件
postgres=# SELECT * FROM pg_available_extensions WHERE name = 'pg_stat_statements';               //查看可用模块
-[ RECORD 1 ]-----+----------------------------------------------------------
name              | pg_stat_statements
default_version   | 1.7
installed_version |
comment           | track execution statistics of all SQL statements executed

postgres=# create extension pg_stat_statements;         //载入模块,载入后pg_stat_statements表可正常使用
CREATE EXTENSION

4. Indicator meaning

postgres=# select * from  pg_stat_statements limit 1;
-[ RECORD 1 ]-------+------------------------------------------------------------------------
userid              | 10                        //用户id
dbid                | 13547                     //数据库oid
queryid             | 1194713979                //查询id
query               | SELECT * FROM pg_available_extensions WHERE name = 'pg_stat_statements'   //查询SQL
calls               | 1                         //调用次数
total_time          | 53.363875                 //SQL总共执行时间
min_time            | 53.363875                 //SQL最小执行时间
max_time            | 53.363875                 //SQL最大执行时间
mean_time           | 53.363875                 //SQL平均执行时间
stddev_time         | 0                         //SQL花费时间的表中偏差
rows                | 1                         //SQL返回或者影响的行数
shared_blks_hit     | 1                         //SQL在在shared_buffer中命中的块数
shared_blks_read    | 0                         //SQL从page cache或者磁盘中读取的块数
shared_blks_dirtied | 0                         //SQL语句弄脏的shared_buffer的块数
shared_blks_written | 0                         //SQL语句写入的块数
local_blks_hit      | 0                         //临时表中命中的块数
local_blks_read     | 0                         //临时表需要读的块数
local_blks_dirtied  | 0                         //临时表弄脏的块数
local_blks_written  | 0                         //临时表写入的块数
temp_blks_read      | 0                         //从临时文件读取的块数
temp_blks_written   | 0                         //从临时文件写入的数据块数
blk_read_time       | 0                         //从磁盘或者读取花费的时间
blk_write_time      | 0                         //从磁盘写入花费的时间

Three, investigation methods

3.1 View the execution of the current session

Check whether there is a slow SQL with high concurrency and inefficient execution from the current session

SELECT pgsa.datname AS database_name
    , pgsa.usename AS user_name
    , pgsa.client_addr AS client_addr
    , pgsa.application_name AS application_name
    , pgsa.state AS state
	, pgsa.backend_start AS backend_start
	, pgsa.xact_start AS xact_start
	, extract(epoch FROM now() - pgsa.xact_start) AS xact_time, pgsa.query_start AS query_start
	, extract(epoch FROM now() - pgsa.query_start) AS query_time
	, pgsa.query AS query_sql
FROM pg_stat_activity pgsa
WHERE pgsa.state != 'idle'
	AND pgsa.state != 'idle in transaction'
	AND pgsa.state != 'idle in transaction (aborted)'
ORDER BY query_time DESC
LIMIT 5;

3.2 View TOP SQL through pg_stat_statements

1) Enable and remake the pg_stat_statements plugin

create extension pg_stat_statements;
select pg_stat_reset();
select pg_stat_statements_reset();

2) Query resource consumption TOP SQL

-- 总查询时间TOP
select * from pg_stat_statements order by total_time desc limit 5;
-- 总IO消耗TOP
select * from pg_stat_statements order by blk_read_time+blk_write_time desc limit 5;
-- 总调用次数TOP
select * from pg_stat_statements order by calls desc limit 5;

3.3 Accumulated sessions with inefficient kill execution

1) At the database level

-- 取消进程当前正在执行的查询
select pg_cancel_backend(pid) 

-- 强制终止整个连接进程 
select pg_terminate_backend(pid)

-- SQL层面直接kill
select pg_cancel_backend(pid) from pg_stat_activity where pid != pg_backend_pid() [ and client_addr='xxx' and query like 'xxx%'] ;
select pg_terminate_backend(pid) from pg_stat_activity where  pid != pg_backend_pid() [ and client_addr='xxx' and query like 'xxx%'] ;

2) Operating system level

-- 相当于pg_cancel_backend
kill -2 pid

-- 相当于pg_terminate_backend
kill -9 pid


Document reference: https://blog.csdn.net/horses/article/details/107209023

Guess you like

Origin blog.csdn.net/weixin_37692493/article/details/109706700