FQS: A magical data warehouse query optimization technology

This article is shared from the Huawei Cloud Community post "Optimizing SQL Based on Execution Plans [Bloom! GaussDB(DWS) Cloud-Native Data Warehouse]", author: Xiling Snow Mountain.

Introduction

If you are new to DWS, you are probably curious about what "__REMOTE_FQS_QUERY__" means in an execution plan. According to the official documentation, it means that the CN has sent the statement directly to the DNs, each DN executes it independently, and the results are gathered on the CN. It might seem that such a plan leaves little to tune. Is that really the case?

FQS plan, fully pushed down

Two tables are joined, and the join condition uses the distribution column of each table. With the stream operator turned off, the CN sends the statement directly to each DN for execution, and the final results are gathered on the CN.

SET enable_stream_operator=off;

SET explain_perf_mode=normal;

EXPLAIN (VERBOSE on,COSTS off) SELECT * FROM tt01,tt02 WHERE tt01.c1=tt02.c2;

QUERY PLAN

-------------------------------------------------------------------------------------------------------------------

Data Node Scan on "__REMOTE_FQS_QUERY__"

Output: tt01.c1, tt01.c2, tt02.c1, tt02.c2

Node/s: All datanodes

Remote query: SELECT tt01.c1, tt01.c2, tt02.c1, tt02.c2 FROM dbadmin.tt01, dbadmin.tt02 WHERE tt01.c1 = tt02.c2

(4 rows)

The execution plan above only shows Data Node Scan on "__REMOTE_FQS_QUERY__". This plan is too coarse: it does not show how the statement is executed internally, whether an index is used, or any other detail.

Next, we create a table to verify this:

create table t4 (bh varchar(300),bh2 varchar(300),c_name varchar(300),c_info varchar(300)) distribute by hash(bh);

insert into t4 select uuid_generate_v1(), uuid_generate_v1(),'测试','sdfffffffffffffffsdf' from generate_series(1,50000);

insert into t4 select * from t4;

--1. Without index:

postgres=# explain analyze select * from t4 where bh2 = '652e4e0e-ba60-0400-25b5-4ee5e490fffe';

QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------------

id | operation | A-time | A-rows | E-rows | Peak Memory | A-width | E-width | E-costs

----+----------------------------------------------+---------+--------+--------+-------------+---------+---------+---------

1 | -> Data Node Scan on "__REMOTE_FQS_QUERY__" | 256.364 | 32 | 0 | 56KB | | 0 | 0.00



====== Query Summary =====

-----------------------------------------

Coordinator executor start time: 0.055 ms

Coordinator executor run time: 256.410 ms

Coordinator executor end time: 0.010 ms

Planner runtime: 0.145 ms

Query Id: 73746443917091633

Total runtime: 256.557 ms

(12 rows)

Time: 259.051 ms

--2. Add index and hint indexscan

postgres=# create index i_t4 on t4(bh2);

CREATE INDEX

Time: 3328.258 ms

postgres=# explain analyze select /*+ indexscan(t4 i_t4) */ * from t4 where bh2 = '652e4e0e-ba60-0400-25b5-4ee5e490fffe';

QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------

id | operation | A-time | A-rows | E-rows | Peak Memory | A-width | E-width | E-costs

----+----------------------------------------------+--------+--------+--------+-------------+---------+---------+---------

1 | -> Data Node Scan on "__REMOTE_FQS_QUERY__" | 2.269 | 32 | 0 | 56KB | | 0 | 0.00



====== Query Summary =====

-----------------------------------------

Coordinator executor start time: 0.027 ms

Coordinator executor run time: 2.298 ms

Coordinator executor end time: 0.009 ms

Planner runtime: 0.074 ms

Query Id: 73746443917091930

Total runtime: 2.401 ms

(12 rows)

The execution plan without the index is exactly the same as the plan with the index, yet the total runtime drops from 256.557 ms to 2.401 ms. The difference is obvious. The second execution very likely used the index, but because the two plans look identical, this is not intuitive for anyone tuning the query.

Even with the /*+ indexscan(t4 i_t4) */ hint, the plan does not show whether the index is used. The plan is too terse, and all statistics for the table in pg_stat_all_indexes are 0, so they cannot be used to make the judgment either.

CPU time

To dig into the time difference above, you can also add CPU consumption to the execution plan and compare it:

--Execution plan without index

postgres=# explain (analyze,buffers,verbose,cpu,nodes )select * from t4 where bh2 = '652e4e0e-ba60-0400-25b5-4ee5e490fffe';

QUERY PLAN

---------------------------------------------------------------------------------------------------------------------------

Data Node Scan on "__REMOTE_FQS_QUERY__" (cost=0.00..0.00 rows=0 width=0) (actual time=244.096..244.108 rows=32 loops=1)

Output: t4.bh, t4.bh2, t4.c_name, t4.c_info

Node/s: All datanodes

Remote query: SELECT bh, bh2, c_name, c_info FROM sa.t4 WHERE bh2::text = '652e4e0e-ba60-0400-25b5-4ee5e490fffe'::text

(CPU: ex c/r=762829, ex row=32, ex cyc=24410534, inc cyc=24410534)

Total runtime: 244.306 ms

(6 rows)

--Execution plan after index creation

postgres=# explain (analyze,buffers,verbose,cpu,nodes )select * from t4 where bh2 = '652e4e0e-ba60-0400-25b5-4ee5e490fffe';

QUERY PLAN

--------------------------------------------------------------------------------------------------------------------------

Data Node Scan on "__REMOTE_FQS_QUERY__" (cost=0.00..0.00 rows=0 width=0) (actual time=1.035..2.148 rows=32 loops=1)

Output: t4.bh, t4.bh2, t4.c_name, t4.c_info

Node/s: All datanodes

Remote query: SELECT bh, bh2, c_name, c_info FROM sa.t4 WHERE bh2::text = '652e4e0e-ba60-0400-25b5-4ee5e490fffe'::text

(CPU: ex c/r=6698, ex row=32, ex cyc=214354, inc cyc=214354)

Total runtime: 2.242 ms

(6 rows)

Comparing the two execution plans, we can see that their shape is identical.

Here, cyc is the number of CPU cycles. ex cyc is the number of cycles spent in the current operator, excluding its child nodes; inc cyc is the number of cycles including child nodes; ex row is the number of rows output by the current operator; and ex c/r is the average number of cycles per row, computed as ex cyc / ex row.

Comparing the average CPU cycles per row: 762829 without the index versus 6698 with the index, a difference of more than a hundred times.
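The figures above can be re-derived from the CPU lines of the two plans. The following is only a quick arithmetic check written as plain SQL; the column aliases are made up for illustration, and the constants are copied from the plan output.

```sql
-- Constants copied from the "(CPU: ...)" lines of the two plans above.
-- ex c/r = ex cyc / ex row
SELECT 24410534 / 32.0 AS cycles_per_row_no_index,
       214354 / 32.0   AS cycles_per_row_with_index,
       762829 / 6698.0 AS ratio;
```

The first two columns come out at roughly 762829 and 6698, matching the reported ex c/r values, and the ratio is roughly 114.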

View detailed plan

__REMOTE_FQS_QUERY__ means the statement is sent directly to the DNs, so the CN does not generate a detailed execution plan and we cannot see whether the index is used. If we turn off enable_fast_query_shipping, the CN generates a plan, and we can see whether the index is used.

--Turn off fast query shipping

postgres=# set enable_fast_query_shipping to off;

postgres=# set explain_perf_mode=normal;

--Execution plan with the index

postgres=# explain analyze select * from t4 where bh2 = '652e4e0e-ba60-0400-25b5-4ee5e490fffe';

QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------

Streaming (type: GATHER) (cost=4.95..51.75 rows=31 width=102) (actual time=1.695..2.263 rows=32 loops=1)

Node/s: All datanodes

-> Bitmap Heap Scan on t4 (cost=4.33..43.75 rows=31 width=102) (actual time=[0.040,0.040]..[0.057,0.153], rows=32)

Recheck Cond: ((bh2)::text = '652e4e0e-ba60-0400-25b5-4ee5e490fffe'::text)

-> Bitmap Index Scan on i_t4 (cost=0.00..4.33 rows=31 width=0) (actual time=[0.035,0.035]..[0.042,0.042], rows=32)

Index Cond: ((bh2)::text = '652e4e0e-ba60-0400-25b5-4ee5e490fffe'::text)

Total runtime: 2.569 ms

(7 rows)

Time: 5.226 ms

--Full table scan after dropping the index

postgres=# explain analyze select * from t4 where bh2 = '652e4e0e-ba60-0400-25b5-4ee5e490fffe';

QUERY PLAN

-------------------------------------------------------------------------------------------------------------------------

Streaming (type: GATHER) (cost=0.62..31755.34 rows=31 width=102) (actual time=294.661..294.814 rows=32 loops=1)

Node/s: All datanodes

-> Seq Scan on t4 (cost=0.00..31747.34 rows=31 width=102) (actual time=[0.084,258.294]..[280.141,293.190], rows=32)

Filter: ((bh2)::text = '652e4e0e-ba60-0400-25b5-4ee5e490fffe'::text)

Rows Removed by Filter: 3199968

Total runtime: 295.154 ms

(6 rows)

Time: 297.348 ms

Turning off enable_fast_query_shipping makes the query go through the distributed framework, so the detailed execution plan becomes visible, which is helpful for optimizing SQL.
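A typical inspection session might look like the sketch below. It reuses the t4 query from this article; RESET is assumed to behave as it does in PostgreSQL and restore the parameter's default for the session.

```sql
-- 1. Turn off FQS for this session so the CN produces a detailed plan.
SET enable_fast_query_shipping = off;

-- 2. Inspect the real operators (Seq Scan, Bitmap Index Scan, ...).
EXPLAIN (VERBOSE on, COSTS off)
SELECT * FROM t4 WHERE bh2 = '652e4e0e-ba60-0400-25b5-4ee5e490fffe';

-- 3. Restore the default so later statements can be shipped as FQS again.
-- Assumption: RESET restores the session default, as in PostgreSQL.
RESET enable_fast_query_shipping;
```

Restoring the setting matters because the parameter is only a diagnostic aid here; the FQS plan itself was the faster path in the measurements above.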

"__REMOTE_FQS_QUERY__" alone does not tell you whether an index is used; further verification is required.

Small flaw: even when the SQL uses the index, the index scan counts in the statistics views pg_stat_all_indexes and pg_stat_all_tables both remain 0.
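To see this for yourself, you can check the counter right after running the query. This is a minimal sketch that assumes the view exposes the same column names as in PostgreSQL (schemaname, relname, indexrelname, idx_scan).

```sql
-- Assumption: column names follow PostgreSQL's pg_stat_all_indexes.
-- Run this after the indexed query on t4 and look at idx_scan for i_t4.
SELECT schemaname, relname, indexrelname, idx_scan
FROM pg_stat_all_indexes
WHERE relname = 't4';
```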

Distribution key type impact

Typical FQS cases are simple single-table queries and multi-table joins in which the join keys are distribution keys of the same type.

Functions in the query, join columns of different types, different distribution types, and non-equi joins may all cause full pushdown to fail.

The following examples show what happens when the distribution types differ:

--The table structures of t1 and t2 are exactly the same, and the distribution keys are hash(id)

postgres=# \d+ t1

Table "sa.t1"

Column | Type | Modifiers | Storage | Stats target | Description

--------+------------------------+-----------+----------+--------------+-------------

id | character varying(300) | | extended | |

c_name | character varying(300) | | extended | |

c_info | character varying(300) | | extended | |

Indexes:

"i_t1" btree (id) TABLESPACE pg_default

"i_t1_id" btree (id) TABLESPACE pg_default

Has OIDs: no

Distribute By: HASH(id)

Location Nodes: ALL DATANODES

Options: orientation=row, compression=no

--Can be pushed down, the execution plan displays FQS

postgres=# explain select * from t1,t2 where t1.id=t2.id;

QUERY PLAN

----------------------------------------------------------------------------------

id | operation | E-rows | E-width | E-costs

----+----------------------------------------------+--------+---------+---------

1 | -> Data Node Scan on "__REMOTE_FQS_QUERY__" | 0 | 0 | 0.00

(3 rows)

--Change the distribution of one of the tables to random distribution (roundrobin)

postgres=# alter table t1 distribute by roundrobin;

ALTER TABLE

postgres=# explain select * from t1,t2 where t1.id=t2.id;

QUERY PLAN

------------------------------------------------------------------------------------------------

id | operation | E-rows | E-memory | E-width | E-costs

----+-----------------------------------------+----------+--------------+---------+-----------

1 | -> Streaming (type: GATHER) | 13021186 | | 60 | 159866.51

2 | -> Hash Join (3,5) | 13021186 | 1MB | 60 | 159449.88

3 | -> Streaming(type: REDISTRIBUTE) | 1600000 | 2MB | 30 | 53357.30

4 | -> Seq Scan on t1 | 1600000 | 1MB | 30 | 9357.33

5 | -> Hash | 1599999 | 48MB(4435MB) | 30 | 9355.33

6 | -> Seq Scan on t2 | 1600000 | 1MB | 30 | 9355.33



RunTime Analyze Information

----------------------------------

"sa.t1" runtime: 219.368ms

"sa.t2" runtime: 184.141ms



Predicate Information (identified by plan id)

--------------------------------------------------

2 --Hash Join (3,5)

Hash Cond: ((t1.id)::text = (t2.id)::text)



====== Query Summary =====

-------------------------------

System available mem: 4546560KB

Query Max mem: 4546560KB

Query estimated mem: 131072KB

(24 rows)

--Change t2 to random distribution as well. As a result, both tables need to be redistributed during the query

postgres=# alter table t2 distribute by roundrobin;

ALTER TABLE

postgres=# explain select * from t1,t2 where t1.id=t2.id;

QUERY PLAN

---------------------------------------------------------------------------------------------------

id | operation | E-rows | E-memory | E-width | E-costs

----+--------------------------------------------+----------+--------------+---------+-----------

1 | -> Streaming (type: GATHER) | 12804286 | | 60 | 203041.85

2 | -> Hash Join (3,5) | 12804286 | 1MB | 60 | 202625.22

3 | -> Streaming(type: REDISTRIBUTE) | 1600000 | 2MB | 30 | 53357.30

4 | -> Seq Scan on t2 | 1600000 | 1MB | 30 | 9357.33

5 | -> Hash | 1599999 | 68MB(4433MB) | 30 | 53357.30

6 | -> Streaming(type: REDISTRIBUTE) | 1600000 | 2MB | 30 | 53357.30

7 | -> Seq Scan on t1 | 1600000 | 1MB | 30 | 9357.33



RunTime Analyze Information

----------------------------------

"sa.t2" runtime: 203.933ms



Predicate Information (identified by plan id)

--------------------------------------------------

2 --Hash Join (3,5)

Hash Cond: ((t2.id)::text = (t1.id)::text)



====== Query Summary =====

-------------------------------

System available mem: 4546560KB

Query Max mem: 4546560KB

Query estimated mem: 131072KB

(24 rows)

When t1 is randomly distributed, the join requires redistribution. When t2 is also randomly distributed, both sides of the join require redistribution. With random distribution, the query cannot be fully pushed down.

Replication mode is not demonstrated at this step. Replication means every DN holds a complete copy of the table, so the total data volume is the number of DNs times the table size. Because each node has the complete data, the join can certainly be pushed down.

Next, change both t1 and t2 back to hash distribution and join on a non-distribution column. As expected, the query cannot be fully pushed down:

postgres=# alter table t1 distribute by hash(id);

ALTER TABLE

postgres=# alter table t2 distribute by hash(id);

ALTER TABLE

--Join on the non-distribution column c_name

postgres=# explain select * from t1,t2 where t1.id=t2.c_name;

QUERY PLAN

---------------------------------------------------------------------------------------------------------------------

id | operation | E-rows | E-memory | E-width | E-costs

----+--------------------------------------------------------------+----------+--------------+---------+-----------

1 | -> Streaming (type: GATHER) | 12621020 | | 61 | 182863.95

2 | -> Hash Join (3,5) | 12621020 | 1MB | 61 | 182447.32

3 | -> Streaming(type: PART REDISTRIBUTE PART ROUNDROBIN) | 1600000 | 2MB | 30 | 54688.64

4 | -> Seq Scan on t2 | 1600000 | 1MB | 30 | 9355.33

5 | -> Hash | 1599999 | 48MB(4433MB) | 31 | 32355.32

6 | -> Streaming(type: PART LOCAL PART BROADCAST) | 1600000 | 2MB | 31 | 32355.32

7 | -> Seq Scan on t1 | 1600000 | 1MB | 31 | 9355.33

-- If you change t1 to replication

postgres=# alter table t1 distribute by replication ;

ALTER TABLE

postgres=# explain select * from t1,t2 where t1.id=t2.id;

QUERY PLAN

----------------------------------------------------------------------------------

id | operation | E-rows | E-width | E-costs

----+----------------------------------------------+--------+---------+---------

1 | -> Data Node Scan on "__REMOTE_FQS_QUERY__" | 0 | 0 | 0.00

(3 rows)

--As you can see, with t1 as a replicated table and t2 as a hash table, the query can still be fully pushed down.

--If t2 is also changed to replication, what happens to the join query?

postgres=# alter table t2 distribute by replication;

ALTER TABLE

postgres=# explain select * from t1,t2 where t1.id=t2.id;

QUERY PLAN

----------------------------------------------------------------------------------

id | operation | E-rows | E-width | E-costs

----+----------------------------------------------+--------+---------+---------

1 | -> Data Node Scan on "__REMOTE_FQS_QUERY__" | 0 | 0 | 0.00

(3 rows)

When the join involves a non-distribution key, the query cannot be fully pushed down. If one of the tables is changed to a replicated table (every DN holds its data), the query can be fully pushed down no matter how the other table is distributed. However, replicated tables are only suitable for small tables.
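As an illustration of that last point, a small lookup table could be created as a replicated table so that joins against it stay local to each DN. The table and column names below (dim_region, fact_sales) are hypothetical; only the DISTRIBUTE BY clauses mirror statements used in this article, and the resulting plan should be confirmed with EXPLAIN on your own cluster.

```sql
-- Hypothetical small dimension table: every DN keeps a full copy.
CREATE TABLE dim_region (
    region_id   varchar(300),
    region_name varchar(300)
) DISTRIBUTE BY REPLICATION;

-- Hypothetical large fact table, hash-distributed on its own key.
CREATE TABLE fact_sales (
    sale_id   varchar(300),
    region_id varchar(300),
    amount    numeric
) DISTRIBUTE BY HASH(sale_id);

-- The join column is not the distribution key of fact_sales, but because
-- dim_region is replicated, the plan is expected to show
-- "__REMOTE_FQS_QUERY__", as in the t1/t2 example above.
EXPLAIN SELECT * FROM fact_sales f, dim_region r WHERE f.region_id = r.region_id;
```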

Common non-FQS cases

  1. Aggregation and sorting operations: When a query requires complex aggregation or sorting, the final step usually has to be performed on the coordinator node. FQS is not suitable for these cases, because performing these operations only on the data nodes may degrade performance.
  2. Joins across multiple distribution keys: If a query joins multiple tables and the join conditions involve different distribution keys, FQS may not be the best choice. Such queries may need a plan generated on the coordinator node in order to handle joins across multiple data nodes correctly.
  3. Subqueries and complex logic: Queries that contain complex subqueries or logic often need to be planned on the coordinator node, because they require several steps to be coordinated to produce the result.
  4. External data sources or functions: If the query communicates with external data sources or uses functions outside the database, FQS may not apply, because these operations usually need to be performed on the coordinator node. Functions come in three forms, and whether they can be pushed down depends on the specific case.

Overall, FQS is a performance optimization tool that is suitable for many queries, but not all queries. Database query optimization often involves trade-offs and choosing the right execution method based on your specific query and performance needs. You can determine whether FQS should be used by observing execution plans and performance testing.

Summary

1. In DWS, FQS (Fast Query Shipping) is a query optimization technique in which the CN ships the query to the DNs and it is executed entirely on the DNs, which reduces data transfer and improves query performance.

2. There are currently three main types of plans in DWS:

  • FQS: The CN sends the original statement directly to the DNs, each DN executes it independently, and the results are gathered on the CN.
  • Stream: The CN generates a plan from the original statement and sends the plan to the DNs for execution. During execution, the DNs exchange data through Stream operators.
  • Remote-Query: After generating the plan, the CN sends parts of the original statement to the DNs, and each DN executes them independently. The results are sent to the CN, and the CN executes the rest of the plan.

3. "__REMOTE_FQS_QUERY__" alone does not tell you whether an index is used; further verification is required. Turning off enable_fast_query_shipping makes the query go through the distributed framework, so the detailed execution plan becomes visible, which is helpful for optimizing SQL.

4. With random distribution, the data is spread randomly across the DNs, so tables generally have to be redistributed when they are joined, which is costly.

5. Replication mode: Since each node has a full copy of the data, the query can be fully pushed down. Replication mode is suitable for small tables that are queried frequently.

6. A join between a distribution key and a non-distribution key cannot be fully pushed down, and this is a fairly common situation. When designing tables, it is therefore best to keep the distribution key types consistent and to use the distribution key as the join column.

7. Small flaw: even when the SQL uses the index, the index scan counts in the statistics views pg_stat_all_indexes and pg_stat_all_tables both remain 0.

8. Try to make sure the execution plan is FQS. To optimize further on top of FQS, use enable_fast_query_shipping to turn off full pushdown and inspect the detailed execution plan for targeted tuning.
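The table design advice in point 6 can be sketched as follows. The table and column names are hypothetical; the idea is simply that both tables hash on a column of the same type and the join uses that column.

```sql
-- Hypothetical tables: same distribution column type, join on that column.
CREATE TABLE customers (
    cust_id varchar(300),
    c_name  varchar(300)
) DISTRIBUTE BY HASH(cust_id);

CREATE TABLE orders (
    cust_id  varchar(300),
    order_no varchar(300)
) DISTRIBUTE BY HASH(cust_id);

-- Matching rows are stored on the same DN, so this join is a candidate
-- for a "__REMOTE_FQS_QUERY__" plan; confirm with EXPLAIN.
EXPLAIN SELECT * FROM customers c, orders o WHERE c.cust_id = o.cust_id;
```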

 


Origin my.oschina.net/u/4526289/blog/10321105