PostgreSQL execution plan analysis

1. Basic use of EXPLAIN

1.1 Command explanation

explain [ ( option [,...] ) ] statement
explain [ analyze ] [ verbose ] statement

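The available options are:
analyze [ boolean ]                     //Actually executes the SQL statement and returns statistics from the real execution
verbose [ boolean ]                     //Displays additional information about the execution plan
costs [ boolean ]                       //Enabled by default; shows each plan node's startup cost, total cost, estimated number of rows returned, and estimated average row width
buffers [ boolean ]                     //Displays buffer usage information
format [ text | xml | json | yaml ]     //Output format of the execution plan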

1.2 Interpreting EXPLAIN output

Example of execution plan:

db1=# explain (analyze 1,verbose 1,costs 1,buffers 1) select * from t3 where name='aa';
                                             QUERY PLAN
----------------------------------------------------------------------------------------------------
 Seq Scan on public.t3  (cost=0.00..1.10 rows=1 width=70) (actual time=0.007..0.008 rows=2 loops=1)
   Output: id, name, gmt_create
   Filter: ((t3.name)::text = 'aa'::text)
   Rows Removed by Filter: 6
   Buffers: shared hit=1
 Planning Time: 0.139 ms
 Execution Time: 0.020 ms
(7 rows)

Result output explanation:

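 Seq Scan on public.t3                      //"Seq Scan" means a full table (sequential) scan
 (cost=0.00..1.10 rows=1 width=70)          //The two numbers separated by ".." are the startup cost (the cost incurred before the first row can be returned) and the total cost of returning all rows; rows is the estimated number of rows returned; width is the estimated average row width in bytes
 (actual time=0.007..0.008 rows=2 loops=1)  //Shown when analyze is true: the actual startup time and total time in milliseconds, the actual number of rows returned, and the number of times the node was executed (loops)
 Output: id, name, gmt_create               //Shown when verbose is true: the columns output by the node
 Filter: ((t3.name)::text = 'aa'::text)     //The filter condition
 Buffers: shared hit=1                      //Shown when buffers is true: buffer usage; here one data block was read from shared buffers (a cache hit)
 Planning Time: 0.139 ms                    //Time spent generating the plan
 Execution Time: 0.020 ms                   //Time spent executing the plan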
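
To make the field layout described above concrete, here is a minimal Python sketch that splits a plan node line into its estimated and actual parts. The function name parse_node and the dictionary keys are hypothetical, chosen only for illustration, and the pattern assumes the text format exactly as printed in the example above (other PostgreSQL versions may print some fields differently).

```python
import re

# Matches the estimate group, and the "actual" group when ANALYZE was used.
# Assumes the text layout shown in the example plan above.
NODE_RE = re.compile(
    r"\(cost=(?P<startup_cost>\d+\.\d+)\.\.(?P<total_cost>\d+\.\d+) "
    r"rows=(?P<est_rows>\d+) width=(?P<width>\d+)\)"
    r"(?: \(actual time=(?P<startup_ms>\d+\.\d+)\.\.(?P<total_ms>\d+\.\d+) "
    r"rows=(?P<actual_rows>\d+) loops=(?P<loops>\d+)\))?"
)

def parse_node(line):
    """Return the numeric fields of a plan node line, or None if absent."""
    match = NODE_RE.search(line)
    if match is None:
        return None
    fields = {}
    for name, value in match.groupdict().items():
        if value is None:
            continue  # the "actual" group is missing without ANALYZE
        fields[name] = float(value) if "." in value else int(value)
    return fields

line = (" Seq Scan on public.t3  (cost=0.00..1.10 rows=1 width=70)"
        " (actual time=0.007..0.008 rows=2 loops=1)")
node = parse_node(line)
print(node["est_rows"], node["actual_rows"])
```

Comparing est_rows with actual_rows in this way is a quick check on how far the planner's estimate is from reality; in the example the planner expected 1 row and the scan returned 2.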

2. Data scan methods

2.1 Full table scan

A full table scan, also called a sequential scan, is identified by "Seq Scan" in the execution plan. It reads all the data blocks of the table sequentially from beginning to end.

db1=# explain (analyze 1,verbose 1,costs 1,buffers 1) select * from t3 where name='ff';
                                               QUERY PLAN
---------------------------------------------------------------------------------------------------------
 Seq Scan on public.t3  (cost=0.00..18.30 rows=225 width=15) (actual time=0.011..0.117 rows=225 loops=1)
   Output: id, name, gmt_create
   Filter: ((t3.name)::text = 'ff'::text)
   Rows Removed by Filter: 759
   Buffers: shared hit=6
 Planning Time: 0.057 ms
 Execution Time: 0.137 ms
(7 rows)
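
The total cost of 18.30 in the plan above can be reproduced by hand. The sketch below is a simplified version of the sequential scan estimate, assuming the planner cost parameters are at their default values (seq_page_cost = 1.0, cpu_tuple_cost = 0.01, cpu_operator_cost = 0.0025). It also assumes the table occupies 6 pages (inferred from "Buffers: shared hit=6") and holds 984 rows (225 returned plus 759 removed by the filter), and that the planner's statistics match those numbers. The function name seq_scan_cost is hypothetical.

```python
# Planner cost parameters, assumed to be at their PostgreSQL defaults.
SEQ_PAGE_COST = 1.0
CPU_TUPLE_COST = 0.01
CPU_OPERATOR_COST = 0.0025

def seq_scan_cost(pages, rows, filter_operators=1):
    """Simplified total cost of a sequential scan.

    pages            -- number of table pages read
    rows             -- number of rows examined
    filter_operators -- operators evaluated per row by the Filter
    """
    io_cost = pages * SEQ_PAGE_COST
    cpu_cost = rows * (CPU_TUPLE_COST + filter_operators * CPU_OPERATOR_COST)
    return io_cost + cpu_cost

# 6 pages and 984 rows are inferred from the plan output, not measured.
print(f"{seq_scan_cost(pages=6, rows=984):.2f}")  # prints 18.30
```

With filter_operators=0 the same inputs give 15.84, which is the cost of the unfiltered "Seq Scan on t3" that appears in the hash join plan in section 3.2.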

2.2 Index scan

An index scan is used to speed up data lookups and is marked with "Index Scan using ${index_name}" in the execution plan. It first scans the index to find the physical location of the matching rows, and then goes back to the table to fetch the required columns.

db1=# explain (analyze 1,verbose 1,costs 1,buffers 1) select * from t3 where name='ww';
                                                       QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
 Index Scan using idx_t3_name on public.t3  (cost=0.28..8.29 rows=1 width=15) (actual time=0.043..0.043 rows=0 loops=1)
   Output: id, name, gmt_create
   Index Cond: ((t3.name)::text = 'ww'::text)
   Buffers: shared hit=1 read=1
 Planning Time: 0.065 ms
 Execution Time: 0.056 ms
(6 rows)
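
The two steps of an index scan (look up the key in the index, then go back to the table) can be sketched as follows. This is a toy model, not how PostgreSQL stores data: the heap is a Python list, the index is a sorted list of (key, position) pairs searched with bisect, and all names and rows are hypothetical.

```python
from bisect import bisect_left, bisect_right

# Toy heap: list position stands in for the physical row location.
heap = [(1, "aa"), (2, "ww"), (3, "ff"), (4, "ww"), (5, "gg")]

# Toy index on name: (key, heap position) pairs kept in key order.
index = sorted((name, pos) for pos, (_id, name) in enumerate(heap))
index_keys = [key for key, _pos in index]

def index_scan(key):
    """Find matching entries in the index, then fetch the rows from the heap."""
    lo = bisect_left(index_keys, key)    # first index entry with this key
    hi = bisect_right(index_keys, key)   # one past the last such entry
    return [heap[pos] for _key, pos in index[lo:hi]]

print(index_scan("ww"))
print(index_scan("zz"))
```

A key that is not present, such as "zz" here, is resolved inside the index alone and no heap row is touched, which mirrors the rows=0 result in the plan above.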

2.3 Bitmap scan

A bitmap scan is another way of using an index and is identified by "Bitmap Heap Scan" in the execution plan. The index is scanned first, and a bitmap is built in memory that marks the rows or blocks satisfying the condition; once the index scan has finished, the matching data is read from the table's data file according to the bitmap. If the query can use two indexes, each index scan produces its own bitmap, the bitmaps are combined with "and" and "or" operations into a single bitmap, and the table data is then read from the data file.

db1=# explain (analyze 1,verbose 1,costs 1,buffers 1) select * from t3 where name='gg';
                                                       QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on public.t3  (cost=5.27..12.89 rows=129 width=15) (actual time=0.048..0.064 rows=129 loops=1)
   Output: id, name, gmt_create
   Recheck Cond: ((t3.name)::text = 'gg'::text)
   Heap Blocks: exact=6
   Buffers: shared hit=7 read=1
   ->  Bitmap Index Scan on idx_t3_name  (cost=0.00..5.24 rows=129 width=0) (actual time=0.042..0.042 rows=129 loops=1)
         Index Cond: ((t3.name)::text = 'gg'::text)
         Buffers: shared hit=1 read=1
 Planning Time: 0.059 ms
 Execution Time: 0.085 ms
(10 rows)
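
The bitmap mechanism, including the "and"/"or" combination of two indexes, can be sketched as follows. This is a simplified model under stated assumptions: the bitmap is a Python set of block numbers (block-level only, so every row of a fetched block is rechecked), the heap holds 4 rows per block, and the city column and all names are hypothetical additions used only to have a second index.

```python
ROWS_PER_BLOCK = 4  # hypothetical block size for the toy heap

# Toy heap of (id, name, city) rows; position // ROWS_PER_BLOCK is the block.
heap = [
    (1, "gg", "x"), (2, "aa", "y"), (3, "gg", "y"), (4, "ff", "x"),      # block 0
    (5, "aa", "x"), (6, "ff", "z"), (7, "aa", "y"), (8, "ff", "y"),      # block 1
    (9, "gg", "z"), (10, "aa", "x"), (11, "ff", "x"), (12, "aa", "x"),   # block 2
]

def build_index(column):
    """Map each value of a column to the heap positions that hold it."""
    index = {}
    for pos, row in enumerate(heap):
        index.setdefault(row[column], []).append(pos)
    return index

name_idx = build_index(1)
city_idx = build_index(2)

def bitmap_index_scan(index, value):
    """Return the set of heap blocks containing at least one match."""
    return {pos // ROWS_PER_BLOCK for pos in index.get(value, [])}

def bitmap_heap_scan(blocks, recheck):
    """Read each marked block once, in physical order, rechecking every row."""
    result = []
    for block in sorted(blocks):
        start = block * ROWS_PER_BLOCK
        for row in heap[start:start + ROWS_PER_BLOCK]:
            if recheck(row):
                result.append(row)
    return result

gg_blocks = bitmap_index_scan(name_idx, "gg")
z_blocks = bitmap_index_scan(city_idx, "z")

# "and": only blocks marked in both bitmaps need to be read.
both = bitmap_heap_scan(gg_blocks & z_blocks,
                        lambda r: r[1] == "gg" and r[2] == "z")
# "or": blocks marked in either bitmap are read.
either = bitmap_heap_scan(gg_blocks | z_blocks,
                          lambda r: r[1] == "gg" or r[2] == "z")
print(both)
print(either)
```

Because the blocks are visited in sorted order, each block is read at most once no matter how many matching rows it contains, which is the main advantage over fetching rows one by one in index order.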

2.4 Condition filtering

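When the scanned rows need to be filtered further by a condition, this is marked with "Filter: (conditions)" in the execution plan. Whether a condition is evaluated through an index (shown as "Index Cond") or applied as a filter after the rows are fetched depends on the query and the indexes available.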

db1=# explain (analyze 1,verbose 1,costs 1,buffers 1) select * from t3 where name='ww' and gmt_create<now();
                                                       QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
 Index Scan using idx_t3_name on public.t3  (cost=0.28..8.30 rows=1 width=15) (actual time=0.013..0.013 rows=0 loops=1)
   Output: id, name, gmt_create
   Index Cond: ((t3.name)::text = 'ww'::text)
   Filter: (t3.gmt_create < now())
   Buffers: shared hit=2
 Planning Time: 0.080 ms
 Execution Time: 0.024 ms
(7 rows)

3. Table join methods

3.1 Nested loop join

In a nested loop join, the outer table drives the inner table: for each outer-table row that satisfies the conditions, the inner table is probed for matching rows. For nested loops, the outer table should be as small as possible, and the join column of the inner table should have a usable index.

db1=# explain select * from t3 join t4 on t3.name=t4.name where t4.name='ww';
                                 QUERY PLAN
-----------------------------------------------------------------------------
 Nested Loop  (cost=0.28..17.40 rows=1 width=30)
   ->  Index Scan using idx_t3_name on t3  (cost=0.28..8.29 rows=1 width=15)
         Index Cond: ((name)::text = 'ww'::text)
   ->  Seq Scan on t4  (cost=0.00..9.10 rows=1 width=15)
         Filter: ((name)::text = 'ww'::text)
(5 rows)
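
A minimal sketch of the nested loop idea, with and without an index on the inner side. The rows, the function names, and the use of a Python dict as the "index" are all hypothetical and only illustrate why an index on the inner join column matters.

```python
def nested_loop_join(outer, inner, key):
    """Plain nested loop: rescan the whole inner input for every outer row."""
    return [(o, i) for o in outer for i in inner if key(o) == key(i)]

def indexed_nested_loop_join(outer, inner_index, key):
    """Nested loop where the inner side is probed through an index (a dict)."""
    return [(o, i) for o in outer for i in inner_index.get(key(o), [])]

def by_name(row):
    """Join key: the name column of an (id, name) row."""
    return row[1]

t3 = [(1, "ww"), (2, "aa")]                 # outer (driving) table
t4 = [(10, "ww"), (11, "bb"), (12, "ww")]   # inner table

# Build a toy index on t4.name: name -> list of rows.
t4_name_index = {}
for row in t4:
    t4_name_index.setdefault(by_name(row), []).append(row)

plain = nested_loop_join(t3, t4, by_name)
indexed = indexed_nested_loop_join(t3, t4_name_index, by_name)
print(plain == indexed)
```

Both variants return the same pairs. The plain version compares every outer row with every inner row, while the indexed version touches only the inner rows that match, so its work grows with the size of the outer table rather than the product of both.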

3.2 Hash join

A hash join picks the smaller table of the join, builds an in-memory hash table on its join column, and then scans the larger table, probing the hash table to find the matching rows. When the smaller table is small enough, it is held entirely in memory. If it does not fit, it is split into several partitions, and the partitions that do not fit in memory are written to temporary files on disk. Spilling to disk adds I/O, so a hash join performs best when enough memory (work_mem) is available to hold the hash table.

db1=# explain select * from t3 join t4 on t3.name=t4.name where t4.id>300;
                            QUERY PLAN
------------------------------------------------------------------
 Hash Join  (cost=28.14..360.84 rows=27026 width=30)
   Hash Cond: ((t4.name)::text = (t3.name)::text)
   ->  Seq Scan on t4  (cost=0.00..9.10 rows=188 width=15)
         Filter: (id > 300)
   ->  Hash  (cost=15.84..15.84 rows=984 width=15)
         ->  Seq Scan on t3  (cost=0.00..15.84 rows=984 width=15)
(6 rows)
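
The build and probe phases described above can be sketched as follows. This is an in-memory illustration only; it does not model partitioning or spilling to disk, and the function name and sample rows are hypothetical.

```python
from collections import defaultdict

def hash_join(build_rows, probe_rows, key):
    """Build a hash table on the smaller input, then probe it with the larger."""
    # Build phase: one pass over the smaller input.
    table = defaultdict(list)
    for row in build_rows:
        table[key(row)].append(row)
    # Probe phase: one pass over the larger input.
    result = []
    for row in probe_rows:
        for match in table.get(key(row), []):
            result.append((row, match))
    return result

def by_name(row):
    """Join key: the name column of an (id, name) row."""
    return row[1]

small = [(1, "aa"), (2, "ww")]
large = [(10, "ww"), (11, "bb"), (12, "aa"), (13, "ww")]

joined = hash_join(small, large, by_name)
print(joined)
```

Each input is read exactly once, which is why a hash join tends to do well on large unsorted inputs as long as the hash table fits in memory.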

3.3 Merge join

Under normal circumstances, hash join performs better than merge join. However, when the source data has an index on the join column, or the inputs are already sorted, no extra sort is needed before the merge. In this case, merge join performs better than hash join.

db2=# explain select * from t1 join t2 on t1.id=t2.id;
                                QUERY PLAN
---------------------------------------------------------------------------
 Merge Join  (cost=4.60..9.07 rows=80 width=14)
   Merge Cond: (t1.id = t2.id)
   ->  Index Scan using t1_pkey on t1  (cost=0.28..34.74 rows=898 width=7)
   ->  Sort  (cost=4.33..4.53 rows=80 width=7)
         Sort Key: t2.id
         ->  Seq Scan on t2  (cost=0.00..1.80 rows=80 width=7)
(6 rows)
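
A minimal sketch of the merge step, assuming both inputs arrive already sorted on the join key. As in the plan above, one side is taken to be in order already (for example, from an index scan) and the other is sorted explicitly before the merge. The function name and sample rows are hypothetical.

```python
def merge_join(left, right, key):
    """Join two inputs that are already sorted on the join key."""
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        lk, rk = key(left[i]), key(right[j])
        if lk < rk:
            i += 1          # left key too small, advance left
        elif lk > rk:
            j += 1          # right key too small, advance right
        else:
            # Find the run of right rows that share this key.
            j_end = j
            while j_end < len(right) and key(right[j_end]) == lk:
                j_end += 1
            # Pair every left row with this key with the whole right run.
            while i < len(left) and key(left[i]) == lk:
                for r in right[j:j_end]:
                    result.append((left[i], r))
                i += 1
            j = j_end
    return result

def by_id(row):
    """Join key: the id column of an (id, value) row."""
    return row[0]

t1 = [(1, "a"), (2, "b"), (2, "c"), (4, "d")]       # already in id order
t2 = sorted([(2, "x"), (3, "y"), (4, "z"), (2, "w")], key=by_id)  # explicit sort

merged = merge_join(t1, t2, by_id)
print(merged)
```

The merge itself is a single forward pass over both inputs, so when the sort can be skipped the whole join costs little more than reading the two inputs.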

Origin blog.csdn.net/weixin_37692493/article/details/109232501