MySQL8.0 optimizer_switch变化

Optimizer_switch变量是支持对优化器行为的控制。是一组值标志,每个标志都有一个on或off的值,以指示是否启用或禁用相应的行为。

MySQL8.0里除了熟悉的hash join重大变化之外,其他方面也有优化。

mysql> SHOW VARIABLES LIKE 'OPTIMIZER_SWITCH'\G;
*************************** 1. row ***************************
Variable_name: optimizer_switch
        Value: index_merge=on,index_merge_union=on,index_merge_sort_union=on,
index_merge_intersection=on,engine_condition_pushdown=on,index_condition_pushdown=on,
mrr=on,mrr_cost_based=on,block_nested_loop=on,batched_key_access=off,
materialization=on,semijoin=on,loosescan=on,firstmatch=on,
duplicateweedout=on,subquery_materialization_cost_based=on,
use_index_extensions=on,condition_fanout_filter=on,derived_merge=on,

##下面是目前5.7 Vs 8.0差异内容##
use_invisible_indexes=off,skip_scan=on,hash_join=on,
subquery_to_derived=off,prefer_ordering_index=on,
hypergraph_optimizer=off,derived_condition_pushdown=on

use_invisible_indexes

MySQL8.0 开始支持隐藏索引,有时往往因为索引不合理,执行计划太差,需要大动干戈,现在不需要进行破坏性的更改,优化器方面配合是否使用不可见索引来构建查询执行计划.默认是关闭。当然这里主键以外。

#索引age隐藏
mysql> ALTER TABLE members  ALTER INDEX idx_age INVISIBLE;
Query OK, 0 rows affected (0.01 sec)
Records: 0  Duplicates: 0  Warnings: 0

#执行计划全表扫描
mysql> EXPLAIN select *from members where age > 10;
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table   | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra       |
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | members | NULL       | ALL  | NULL          | NULL | NULL    | NULL |    3 |    33.33 | Using where |
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)

#优化器可用隐藏索引打开。
mysql> SET SESSION optimizer_switch='use_invisible_indexes=on';
Query OK, 0 rows affected (0.00 sec)

#执行计划使用age 隐藏索引
mysql> EXPLAIN select *from members where age > 10;
+----+-------------+---------+------------+-------+---------------+---------+---------+------+------+----------+-----------------------+
| id | select_type | table   | partitions | type  | possible_keys | key     | key_len | ref  | rows | filtered | Extra                 |
+----+-------------+---------+------------+-------+---------------+---------+---------+------+------+----------+-----------------------+
|  1 | SIMPLE      | members | NULL       | range | idx_age       | idx_age | 5       | NULL |    3 |   100.00 | Using index condition |
+----+-------------+---------+------------+-------+---------------+---------+---------+------+------+----------+-----------------------+
1 row in set, 1 warning (0.00 sec)

备注: 在一些业务无法及时改变SQL语句的时,可以通过更改索引隐藏属性,有效控制手段。
特别是生产不确定索引性能影响的时候,可以在线操作,算是应急对策。

prefer_ordering_index

MySQL里对任何有LIMIT子句的ORDER BY或GROUP BY查询使用有序索引,覆盖优化器所做的任何其他选择,只要它确定这会导致更快的执行。因此做出这种判断的算法对数据分布和其他条件做出了某些假设,所以它可能并不总是完全正确,而且在某些情况下,此类查询选择不同的优化可能会提供更好的性能。
8.0.21版本中optimizer_switch变量的prefer_ordering_index设置来禁用这种优化算法。

optimizer_switch参数中的prefer_ordering_index进行控制对于order by limit的优化。prefer_ordering_index默认打开,表示MySQL会优先考虑通过order by列索引进行排序优化。

目前验证发现只在范围查询中,有算法干扰。所以使用范围查询,可要小心了。

#模拟表结构和数据
mysql>CREATE table `members`  
( 
id int unsigned NOT NULL AUTO_INCREMENT , 
first_name varchar(100)  DEFAULT  NULL  , 
last_name  varchar(100) DEFAULT NULL ,  
age  INT  DEFAULT '0' ,  
create_time   timestamp  DEFAULT CURRENT_TIMESTAMP,  
update_time timestamp   DEFAULT current_timestamp() ON UPDATE current_timestamp(),
primary KEY (id) ,
KEY idx_first_last(first_name(3),last_name(3)),
KEY idx_age(age)
);
mysql>INSERT INTO `members`(id,first_name,last_name,age)  
      VALUES(1,'AAAAA','BBBBB',12),(2,'AAAAA','CCCCC',20),(3,'DDDDD','EEEEE',30);

#排序优先开启
mysql>SET  optimizer_switch = "prefer_ordering_index=on";

#优先ORDER BY 和 LIMIT 选择主键
mysql> EXPLAIN  SELECT  * FROM  members   WHERE   age> 15  order by `id` DESC LIMIT 1;
+----+-------------+---------+------------+-------+---------------+---------+---------+------+------+----------+----------------------------------+
| id | select_type | table   | partitions | type  | possible_keys | key     | key_len | ref  | rows | filtered | Extra                            |
+----+-------------+---------+------------+-------+---------------+---------+---------+------+------+----------+----------------------------------+
|  1 | SIMPLE      | members | NULL       | index | idx_age       | PRIMARY | 4       | NULL |    1 |    66.67 | Using where; Backward index scan |
+----+-------------+---------+------------+-------+---------------+---------+---------+------+------+----------+----------------------------------+
1 row in set, 1 warning (0.00 sec)

#排序优先关闭
mysql> SET  optimizer_switch = "prefer_ordering_index=off";
Query OK, 0 rows affected (0.00 sec)

#选择合理执行计划
mysql> EXPLAIN  SELECT  * FROM  members   WHERE   age> 15  order by `id` DESC LIMIT 1;
+----+-------------+---------+------------+-------+---------------+---------+---------+------+------+----------+---------------------------------------+
| id | select_type | table   | partitions | type  | possible_keys | key     | key_len | ref  | rows | filtered | Extra                                 |
+----+-------------+---------+------------+-------+---------------+---------+---------+------+------+----------+---------------------------------------+
|  1 | SIMPLE      | members | NULL       | range | idx_age       | idx_age | 5       | NULL |    2 |   100.00 | Using index condition; Using filesort |
+----+-------------+---------+------------+-------+---------------+---------+---------+------+------+----------+---------------------------------------+
1 row in set, 1 warning (0.00 sec)

subquery_to_derived

MySQL 8.0.21开始,优化器增加了subquery_to_derived(默认关闭)选项。优化器在许多情况下能够将SELECT、WHERE、JOIN或HAVING子句中的标量子查询转换为派生表上的左&外部连接(在某些情况下是内连接)。
所为的:

  • subquery: 包含在 SELECT 中的子查询(不在 FROM子句中)。
  • derived : 包含在 from 子句中的子查询。MySQL会将结果存放在一个临时表中,
    也称为派生表(derived).

在大多数情况下,启用此优化不会产生任何明显的性能改进(在许多情况下甚至会使查询运行得更慢),视情况而定,所以选择默认值即可。

SHOW WARNINGS信息下,可以看到语句改写情况:

mysql> SET SESSION optimizer_switch='subquery_to_derived=OFF';
Query OK, 0 rows affected (0.00 sec)

mysql> EXPLAIN SELECT * FROM t1   WHERE t1.a > (SELECT COUNT(a) FROM t2);
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra       |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
|  1 | PRIMARY     | t1    | NULL       | ALL  | NULL          | NULL | NULL    | NULL |    4 |    33.33 | Using where |
|  2 | SUBQUERY    | t2    | NULL       | ALL  | NULL          | NULL | NULL    | NULL |    2 |   100.00 | NULL        |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
2 rows in set, 1 warning (0.00 sec)

mysql> SHOW WARNINGS;
+-------+------+---------------------------------------------------------------------------------------------------------------------------------------------------------+
| Level | Code | Message                                                                                                                                                 |
+-------+------+---------------------------------------------------------------------------------------------------------------------------------------------------------+
| Note  | 1003 | /* select#1 */ select `test`.`t1`.`a` AS `a` 
 from `test`.`t1` 
where (`test`.`t1`.`a` > (/* select#2 */ select count(`test`.`t2`.`a`) from `test`.`t2`)) |
+-------+------+---------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
mysql> SET SESSION optimizer_switch='subquery_to_derived=ON';
Query OK, 0 rows affected (0.00 sec)

mysql> EXPLAIN SELECT * FROM t1   WHERE t1.a > (SELECT COUNT(a) FROM t2);
+----+-------------+------------+------------+------+---------------+------+---------+------+------+----------+--------------------------------------------+
| id | select_type | table      | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra                                      |
+----+-------------+------------+------------+------+---------------+------+---------+------+------+----------+--------------------------------------------+
|  1 | PRIMARY     | <derived2> | NULL       | ALL  | NULL          | NULL | NULL    | NULL |    1 |   100.00 | NULL                                       |
|  1 | PRIMARY     | t1         | NULL       | ALL  | NULL          | NULL | NULL    | NULL |    4 |    33.33 | Using where; Using join buffer (hash join) |
|  2 | DERIVED     | t2         | NULL       | ALL  | NULL          | NULL | NULL    | NULL |    2 |   100.00 | NULL                                       |
+----+-------------+------------+------------+------+---------------+------+---------+------+------+----------+--------------------------------------------+
3 rows in set, 1 warning (0.00 sec)

mysql>  SHOW WARNINGS;;
+-------+------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Level | Code | Message                                                                                                                                                                                                           |
+-------+------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Note  | 1003 | /* select#1 */ select `test`.`t1`.`a` AS `a` 
from `test`.`t1` join (
/* select#2 */ select count(`test`.`t2`.`a`) AS `COUNT(a)` 
from `test`.`t2`) `derived_1_2` 
where (`test`.`t1`.`a` > `derived_1_2`.`COUNT(a)`) |
+-------+------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

derived_condition_pushdown

Mysql 8.0.22之后的版本支持派生条件回移(Derived Condition Pushdown)优化。该优化可以减少derived派生表处理的行数从而提高查询执行的效率。

比方如下查询:

SELECT * FROM 
(SELECT i, j FROM t1) AS dt 
WHERE i > constant

# ↓ ↓ ↓ 通过派生条件回移优化后类似如下形式(WHERE条件拿到派生表的里面)

SELECT * FROM (SELECT i, j FROM t1 WHERE i > constant)

就是说 将外部WHERE条件下推到派生表中应该会减少需要处理的行数,从而加快查询的执行速度。

MySQL 8.0.29及以后版本中,派生表条件下推优化可以用于UNION查询。
不能下推情况:内部表,包含子查询的条件,公共表表达式,LIMIT子句,变量的赋值 等场景。

skip_scan

MySQL从8.0.13版本开始支持一种新的range scan方式。就是说 组合索引(f1,f2)的时候,查询条件里 只有f2的时候 也可以实现索引扫描的功能.范围扫描比全索引扫描更有效,优化器可以执行多个范围扫描,每个值对应一个f1,使用一种称为Skip Scan的方法,类似于松散索引扫描。

从上述描述可以看到使用skip-scan的方式避免了全索引扫描,从而提升了性能,尤其是在索引前缀列区分度比较低的时候。可以说 避开了索引前缀原则。

mysql> ALTER TABLE t1 ADD PRIMARY KEY(f1, f2);
Query OK, 0 rows affected (0.11 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> EXPLAIN SELECT f1, f2 FROM t1 WHERE f2 > 40;;
+----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+----------------------------------------+
| id | select_type | table | partitions | type  | possible_keys | key     | key_len | ref  | rows | filtered | Extra                                  |
+----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+----------------------------------------+
|  1 | SIMPLE      | t1    | NULL       | range | PRIMARY       | PRIMARY | 8       | NULL |   53 |   100.00 
| Using where; Using index for skip scan |
+----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+----------------------------------------+
1 row in set, 1 warning (0.00 sec)

hypergraph_optimizer

MySQL 8.0.23提供了优化器支持hypergraph(超图)的模型。目前实际还不支持,需要在debug下进行验证测试,是不成熟的优化方案。

hypergraph是数学中用的概念。在数据算法中,描述对象之间建立联系而用的。个人理解,就是通过边界算法,把对应的数据关联点,范围等圈出来。

由此多表关联下,从5.7的nest loop 进化到8.0的hash join,在hash join进一步提升效率的的算法。这个算法非常切合B+Tree配合使用。

mysql> SET SESSION optimizer_switch='hypergraph_optimizer=ON';
ERROR 3999 (42000): The hypergraph optimizer does not yet support 'use in non-debug builds'

总结

优化器行为变更,可能会导致不同版本SQL语句性能体现不一样。当然变化是往好的方面延伸的,但对于复杂的情况,可能也会出现退化效果。所以版本升级的时候,需要特别关注下。

在这里特别感兴趣hypergraph(超图)优化方案。因为MySQL对于多表关联是一个致命的问题,关联表数量越多性能越差。对于hypergraph的算法本质上的理解,可以说结合B+Tree性能提升空间很大。

猜你喜欢

转载自blog.csdn.net/dreamyuzhou/article/details/128725844