MySQL · Performance optimization · Pushing conditions down into materialized subqueries

Copyright notice: Creative Commons license. Attribution is required; others may create derivative works based on this article, and must distribute them under the same license as the original.

Background

MySQL introduced Materialization as a key feature for optimizing subqueries (such as IN / NOT IN subqueries and FROM-clause subqueries).

It works as follows: during SQL execution, the first time the subquery result is needed, the subquery is executed and its result is saved to a temporary table. Later accesses to the subquery result set read directly from that temporary table.

The optimizer can also delay materialization: it first uses the other conditions to check whether the subquery really needs to be executed at all. The key point of materialization is that the subquery only needs to be executed once. The opposite implementation invokes the subquery for every row of the outer table, and its query type in the execution plan is "DEPENDENT SUBQUERY".

Although materialization can improve SQL performance, it is still worth checking whether the statement leaves room for further optimization. Consider the following scenario:
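The difference between the two strategies can be observed with EXPLAIN. The following is a minimal sketch, assuming a hypothetical `student` table with an `id` column next to the `score` table used in this article. The `materialization` flag of `optimizer_switch` exists in MySQL 5.6 and later; which plan is actually chosen depends on the version, the data, and the indexes.

```sql
-- Hypothetical table for illustration only (not from the original article):
--   student(id INT PRIMARY KEY, name VARCHAR(32))

-- Show which optimizer strategies are enabled in this session.
SELECT @@optimizer_switch;

-- With materialization enabled, the optimizer may execute the subquery once,
-- keep the result in a temporary table, and probe it for each outer row.
-- The subquery is then typically reported as SUBQUERY or MATERIALIZED.
SET optimizer_switch = 'materialization=on';
EXPLAIN
SELECT * FROM student
WHERE id NOT IN (SELECT student_id FROM score WHERE score < 60);

-- With materialization disabled, the same statement may fall back to
-- evaluating the subquery per outer row (DEPENDENT SUBQUERY).
SET optimizer_switch = 'materialization=off';
EXPLAIN
SELECT * FROM student
WHERE id NOT IN (SELECT student_id FROM score WHERE score < 60);

-- Restore the server default for this flag.
SET optimizer_switch = 'materialization=default';
```

Comparing the `select_type` column of the two plans is usually enough to tell which strategy was used.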

mysql> explain extended select * from (select * from score where score >= 60) derived1 where class_id = 10;
+----+-------------+------------+-------+---------------+-------------+---------+-------+------+----------+--------------------------+
| id | select_type | table      | type  | possible_keys | key         | key_len | ref   | rows | filtered | Extra                    |
+----+-------------+------------+-------+---------------+-------------+---------+-------+------+----------+--------------------------+
|  1 | PRIMARY     | <derived2> | ref   | <auto_key0>   | <auto_key0> | 4       | const |    0 |      100 |                          |
|  2 | DERIVED     | score      | index | idx_score     | idx_score   | 4       |       |    1 |      100 | Using where; Using index |
+----+-------------+------------+-------+---------------+-------------+---------+-------+------+----------+--------------------------+

As the execution plan shows, MySQL first materializes the subquery (select_type = DERIVED; the plan can also be viewed with format=json) and then filters the result set by the class_id field. Semantically, this SQL can also be written in the following form, which is more efficient if the indexes are reasonable.

select * from score where score >= 60 and class_id= 10 

This example shows a potential problem with materialized subqueries: when the subquery itself is resource-intensive or its result set is large, there is often considerable room for optimization, especially when the outer conditions could be applied inside the subquery. Pushing conditions down reduces the amount of data accessed as early as possible during execution and can significantly improve performance. This article focuses on pushing conditions down into materialized subqueries.
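To make "reasonable indexes" concrete, here is a sketch of a table definition under which the flattened query can use a single index for both conditions. The column names and `idx_score` appear elsewhere in this article; the column types and the composite index `idx_class_score` are assumptions made for illustration.

```sql
-- Assumed schema: only the column names and idx_score come from the article.
CREATE TABLE score (
  class_id   INT NOT NULL,
  student_id INT NOT NULL,
  score      INT NOT NULL,
  KEY idx_score (score),                 -- index visible in the EXPLAIN output
  KEY idx_class_score (class_id, score)  -- hypothetical index for the flattened query
);

-- Equality on class_id plus a range on score can both be resolved
-- through the composite index, without materializing anything first.
EXPLAIN
SELECT * FROM score WHERE score >= 60 AND class_id = 10;
```

Putting the equality column first in the composite index is a common choice because the range condition on `score` can then be evaluated within a single `class_id` prefix.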

Analysis

In fact, the query above can be rewritten automatically in version 5.7. With the optimizer switch derived_merge=on, the rewritten statement looks like this:

select `remall`.`score`.`class_id` AS `class_id`, `remall`.`score`.`student_id` AS `student_id`, `remall`.`score`.`score` AS `score`
from `remall`.`score`
where ((`remall`.`score`.`class_id` = 10) and (`remall`.`score`.`score` >= 60))
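A sketch of how the rewritten statement can be obtained in a MySQL 5.7 session: run EXPLAIN on the query and then read the note it leaves behind. The query is the one from the Background section; nothing else is assumed.

```sql
-- Check whether derived_merge is enabled (the flag exists from MySQL 5.7).
SELECT @@optimizer_switch LIKE '%derived_merge=on%' AS derived_merge_enabled;

SET optimizer_switch = 'derived_merge=on';

EXPLAIN
SELECT * FROM (SELECT * FROM score WHERE score >= 60) derived1
WHERE class_id = 10;

-- The Note with code 1003 returned here contains the statement
-- as the optimizer rewrote it.
SHOW WARNINGS;
```

If the derived table was merged, the plan should contain a single row for `score` and no DERIVED row.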

However, not all subqueries can have conditions pushed down automatically. Take the following statement as an example:

select * from (select class_id, avg(score) from score group by class_id) derived1 where class_id = 10;

The reason is that the MySQL optimizer currently only rewrites mergeable views and subqueries. To understand this concept, we can start with the two view algorithms: merge and temptable.

In general, more complex views or subqueries use the temptable algorithm. These include:

1. Subqueries with aggregation;

2. Subqueries containing LIMIT;

3. UNION or UNION ALL subqueries;

4. Subqueries in the output fields (the select list);
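For reference, the four categories above might look like the following derived tables. These are illustrative statements against the article's `score` table; the aliases `t`, `s`, `avg_score`, and `top_score` are introduced here. On MySQL 5.7, each is expected to show a DERIVED row in EXPLAIN even with derived_merge=on.

```sql
-- 1. Aggregation
EXPLAIN SELECT * FROM (
  SELECT class_id, AVG(score) AS avg_score FROM score GROUP BY class_id
) t;

-- 2. LIMIT
EXPLAIN SELECT * FROM (
  SELECT * FROM score ORDER BY score DESC LIMIT 10
) t;

-- 3. UNION / UNION ALL
EXPLAIN SELECT * FROM (
  SELECT * FROM score WHERE class_id = 1
  UNION ALL
  SELECT * FROM score WHERE class_id = 2
) t;

-- 4. Subquery in the select list
EXPLAIN SELECT * FROM (
  SELECT s.student_id, (SELECT MAX(score) FROM score) AS top_score
  FROM score s
) t;
```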

We can also check whether the merge algorithm can be used by explicitly creating a view from the subquery. For example:

mysql> create algorithm=merge view v as select class_id, avg(score) from score group by class_id;
Executed successfully, took 2.46 ms.

mysql> show warnings;
+---------+------+-------------------------------------------------------------------------------+
| Level   | Code | Message                                                                       |
+---------+------+-------------------------------------------------------------------------------+
| Warning | 1354 | View merge algorithm can't be used here for now (assumed undefined algorithm) |
+---------+------+-------------------------------------------------------------------------------+

We specified merge when creating the view, but the database determined that this algorithm is not applicable, so it fell back to the default, undefined (the temptable algorithm is used during actual execution).
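As a contrast, a view without aggregation, LIMIT, UNION, or select-list subqueries should accept the merge algorithm. The sketch below uses a hypothetical view name, `v_pass`, and then inspects what was stored for the aggregating view `v` created above.

```sql
-- Mergeable definition: expected to be created without warning 1354.
CREATE ALGORITHM = MERGE VIEW v_pass AS
SELECT class_id, student_id, score FROM score WHERE score >= 60;
SHOW WARNINGS;

-- The stored definition of the aggregating view should report
-- ALGORITHM=UNDEFINED rather than MERGE.
SHOW CREATE VIEW v;

-- Querying the aggregating view is expected to show a DERIVED row,
-- meaning the outer condition is applied after materialization.
EXPLAIN SELECT * FROM v WHERE class_id = 10;
```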

/**
  Strategy for how to process a view or derived table (merge or materialization)
*/
enum enum_view_algorithm {
  VIEW_ALGORITHM_UNDEFINED = 0,
  VIEW_ALGORITHM_TEMPTABLE = 1,
  VIEW_ALGORITHM_MERGE = 2
};

For views or subqueries that use the merge algorithm, conditions can be pushed down into the view or subquery. For views or subqueries that use the temptable algorithm, conditions cannot be pushed down and can only filter the result set afterwards. The optimizer's criterion for this decision is:

bool merge_derived(THD *thd, TABLE_LIST *derived_table)
{
  ...
  // Check whether derived table is mergeable, and directives allow merging
  if (!derived_unit->is_mergeable() ||
      derived_table->algorithm == VIEW_ALGORITHM_TEMPTABLE ||
      (!thd->optimizer_switch_flag(OPTIMIZER_SWITCH_DERIVED_MERGE) &&
       derived_table->algorithm != VIEW_ALGORITHM_MERGE))
    DBUG_RETURN(false);
  ...
}
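Read literally, the condition above has three branches, and each can be exercised from SQL. The following sketch is an interpretation of that code, not a verified test run; the view names `v_pass_merge` and `v_pass_temp` are hypothetical.

```sql
-- Branch 3: switch off and algorithm not explicitly MERGE.
-- A plain derived table should then be materialized (DERIVED row).
SET optimizer_switch = 'derived_merge=off';
EXPLAIN
SELECT * FROM (SELECT * FROM score WHERE score >= 60) derived1
WHERE class_id = 10;

-- Still with the switch off: a view declared ALGORITHM = MERGE does not
-- satisfy branch 3, so it should still be merged.
CREATE OR REPLACE ALGORITHM = MERGE VIEW v_pass_merge AS
SELECT class_id, student_id, score FROM score WHERE score >= 60;
EXPLAIN SELECT * FROM v_pass_merge WHERE class_id = 10;

-- Branch 2: ALGORITHM = TEMPTABLE prevents merging even with the switch on.
SET optimizer_switch = 'derived_merge=on';
CREATE OR REPLACE ALGORITHM = TEMPTABLE VIEW v_pass_temp AS
SELECT class_id, student_id, score FROM score WHERE score >= 60;
EXPLAIN SELECT * FROM v_pass_temp WHERE class_id = 10;
```

Branch 1 (`is_mergeable()` returning false) corresponds to the aggregation, LIMIT, UNION, and select-list subquery cases listed earlier.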

Condition pushdown principles

Not every database engine implements subquery condition pushdown completely. For MySQL views that contain aggregate queries, and for FROM subqueries, the suggested principle for pushing conditions down manually is:

 Only WHERE conditions that depend solely on the output fields of the view or FROM subquery can be pushed down safely.

Also pay attention to where the pushed-down condition is placed inside the view or derived-table subquery:

  1. Semantically, a condition pushed down into an aggregate subquery can be placed in the HAVING clause. After pushdown the HAVING clause becomes: HAVING xxx AND NEW_CONDITION operator VALUE;
  2. If the condition is on a grouping field of the subquery and that field is indexed, placing the condition in the subquery's WHERE clause performs better (when the HAVING condition contains no aggregate function, pushing it down into the WHERE clause filters out entire groups before aggregation).
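Applied to the aggregate query from the Analysis section, the two placements might look as follows. This is a sketch; the `avg_score` alias is added here so that the aggregate column can be referenced by name.

```sql
-- Original: the outer condition is applied only after every class
-- has been aggregated and materialized.
SELECT * FROM (
  SELECT class_id, AVG(score) AS avg_score FROM score GROUP BY class_id
) derived1
WHERE class_id = 10;

-- Placement 1: pushed into HAVING. Valid for any condition on output fields.
SELECT * FROM (
  SELECT class_id, AVG(score) AS avg_score FROM score
  GROUP BY class_id
  HAVING class_id = 10
) derived1;

-- Placement 2: the condition is on the grouping column, so it can go into
-- WHERE; rows of other classes are skipped before aggregation.
SELECT * FROM (
  SELECT class_id, AVG(score) AS avg_score FROM score
  WHERE class_id = 10
  GROUP BY class_id
) derived1;

-- A condition on the aggregate itself must stay in HAVING,
-- because it can only be evaluated per group.
SELECT * FROM (
  SELECT class_id, AVG(score) AS avg_score FROM score
  GROUP BY class_id
  HAVING AVG(score) >= 60
) derived1;
```

Placement 2 is the one that lets an index on `class_id` limit the rows read.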

For other types of views or FROM subqueries, conditions can be pushed down manually after checking that the semantics are preserved.
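Two illustrative cases of such a semantic check, written against the article's `score` table (the alias `t` and the score thresholds are made up for the example): with LIMIT the pushdown changes the result, while with UNION ALL the condition can be copied into every branch.

```sql
-- LIMIT: NOT safe to push down.
-- This returns the rows of class 10 among the overall top 10 scores...
SELECT * FROM (
  SELECT * FROM score ORDER BY score DESC LIMIT 10
) t
WHERE class_id = 10;

-- ...whereas this returns the top 10 scores within class 10,
-- which is a different result set.
SELECT * FROM (
  SELECT * FROM score WHERE class_id = 10 ORDER BY score DESC LIMIT 10
) t;

-- UNION ALL: safe when the condition is applied to every branch.
SELECT * FROM (
  SELECT class_id, student_id, score FROM score WHERE score >= 90
  UNION ALL
  SELECT class_id, student_id, score FROM score WHERE score < 60
) t
WHERE class_id = 10;

-- Equivalent form with the condition pushed into both branches.
SELECT * FROM (
  SELECT class_id, student_id, score FROM score
  WHERE score >= 90 AND class_id = 10
  UNION ALL
  SELECT class_id, student_id, score FROM score
  WHERE score < 60 AND class_id = 10
) t;
```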

Summary

No database optimizer is a panacea. Only by understanding the optimizer's characteristics, playing to its strengths and working around its weaknesses, can we write optimal SQL statements.


Origin blog.csdn.net/qq_28254699/article/details/93486183