SQL Optimization Case Study: Optimizing a Correlated Subquery

2018-08-12 07:49:11

Original statement:

SELECT
    t1.*
FROM
    t_payment_bank_account_info t1
WHERE
    EXISTS (
        SELECT
            1
        FROM
            t_payment_account_dtl t2
        WHERE
            t1.account_no = t2.account_no
            AND t2.parent_account_no = '7311810182600115231'
            AND t2.txn_Date >= '2015-12-23'
            AND t2.account_no != t2.opp_acc_no
    );

Execution plan:

+----+--------------------+-------+------+---------------+------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+-------+------+---------------+------+---------+------+------+-------------+
| 1 | PRIMARY | t1 | ALL | NULL | NULL | NULL | NULL | 4552 | Using where |
| 2 | DEPENDENT SUBQUERY | t2 | ALL | NULL | NULL | NULL | NULL | 7924 | Using where |
+----+--------------------+-------+------+---------------+------+---------+------+------+-------------+

Run time in the hotfix environment: 14 rows in set (27.98 sec)

First problem: SELECT * is strictly forbidden in production. The columns to be queried must be listed explicitly.

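Second problem: a correlated subquery is very inefficient, especially when it cannot use an index, because the subquery is evaluated again for each row of the outer query. It can be rewritten as a join: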

select t1.*
from t_payment_bank_account_info t1
join t_payment_account_dtl t2
using(account_no)
where t2.parent_account_no = '7311810182600115231'
AND t2.txn_Date >= '2015-12-23'
AND t2.account_no != t2.opp_acc_no
group by <columns being selected>;

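(In the join, one row of t1 can match several rows of t2, so the result contains duplicate rows that the correlated subquery would not produce. GROUP BY removes them. DISTINCT works as well, and the two operate on the same principle. If duplicates do not matter for the final requirement, drop the clause to avoid the temporary table and filesort that grouping and sorting cause, which makes the query faster.)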
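The DISTINCT variant mentioned above could look like the following sketch. The original post does not give the real column list, so selecting only account_no is an assumption made for illustration; in practice every column the application needs would be listed.

```sql
-- Join rewrite with DISTINCT instead of GROUP BY.
-- Assumption: account_no stands in for the real column list,
-- which the original post does not show.
SELECT DISTINCT
    t1.account_no
FROM
    t_payment_bank_account_info t1
    JOIN t_payment_account_dtl t2 USING (account_no)
WHERE
    t2.parent_account_no = '7311810182600115231'
    AND t2.txn_Date >= '2015-12-23'
    AND t2.account_no != t2.opp_acc_no;
```

If the column list includes a unique key of t1, DISTINCT collapses only the copies produced by multiple t2 matches, which matches what the EXISTS form returns.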

Execution plan:

+----+-------------+-------+------+---------------+------+---------+------+------+---------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+------+---------+------+------+---------------------------------+
| 1 | SIMPLE | t1 | ALL | NULL | NULL | NULL | NULL | 4552 | Using temporary; Using filesort |
| 1 | SIMPLE | t2 | ALL | NULL | NULL | NULL | NULL | 7924 | Using where; Using join buffer |
+----+-------------+-------+------+---------------+------+---------+------+------+---------------------------------+

Run time in the hotfix environment: 14 rows in set (2.67 sec)

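Third problem: the optimization is not finished yet. The plan above shows Using join buffer, an internal MySQL optimization that buffers rows from the first table so that the second table is scanned fewer times, which greatly reduces the I/O cost of the join. A suitable index on the join column would improve performance far more.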

So should the index on account_no go on t_payment_account_dtl or on t_payment_bank_account_info?

Look at the selectivity of the account_no column in each table:

mysql> select count(*) from t_payment_account_dtl;
+----------+
| count(*) |
+----------+
| 7594 |
+----------+
1 row in set (0.04 sec)
mysql> select count(distinct(account_No)) from t_payment_account_dtl;
+-----------------------------+
| count(distinct(account_No)) |
+-----------------------------+
| 75 |
+-----------------------------+
1 row in set (0.00 sec)
mysql> select count(distinct(account_No)) from t_payment_bank_account_info\G
*************************** 1. row ***************************
count(distinct(account_No)): 4753
1 row in set (0.00 sec)
mysql> select count(*) from t_payment_bank_account_info\G
*************************** 1. row ***************************
count(*): 4789
1 row in set (0.01 sec)
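The four counts above can be reduced to one ratio per table, distinct values divided by total rows. The query below is a sketch that is not part of the original post. With the counts shown, the ratio is about 0.01 for t_payment_account_dtl (75 / 7594) and about 0.99 for t_payment_bank_account_info (4753 / 4789).

```sql
-- Selectivity = distinct values / total rows.
-- A value close to 1 means each account_no identifies very few rows,
-- so an index lookup on that column narrows the search sharply.
SELECT
    't_payment_account_dtl' AS table_name,
    COUNT(DISTINCT account_no) / COUNT(*) AS selectivity
FROM
    t_payment_account_dtl
UNION ALL
SELECT
    't_payment_bank_account_info',
    COUNT(DISTINCT account_no) / COUNT(*)
FROM
    t_payment_bank_account_info;
```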

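The account_no column of t_payment_bank_account_info is far more selective (4753 distinct values in 4789 rows, against 75 in 7594), so the index goes on that table. (For an inner join, MySQL decides by itself, based on the available indexes, which table is read first and which one is looked up.)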

mysql> alter table t_payment_bank_account_info add index idx_account_no(account_no);
Query OK, 0 rows affected (0.14 sec)
Records: 0 Duplicates: 0 Warnings: 0
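As an optional check that is not in the original post, SHOW INDEX can confirm that the index exists. Its Cardinality column is the optimizer's estimate of the number of distinct values, so it may differ somewhat from the exact count of 4753 measured earlier.

```sql
-- List only the newly created index on the bank account table.
SHOW INDEX FROM t_payment_bank_account_info
WHERE Key_name = 'idx_account_no';
```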

Check the execution plan again:

mysql> desc select t1.* from t_payment_bank_account_info t1 join t_payment_account_dtl t2 using(account_no) where t2.parent_account_no = '7311810182600115231' AND t2.txn_Date >= '2015-12-23' AND t2.account_no != t2.opp_acc_no group by <columns being selected>;
+----+-------------+-------+------+----------------+----------------+---------+---------------------------+------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+----------------+----------------+---------+---------------------------+------+----------------------------------------------+
| 1 | SIMPLE | t2 | ALL | NULL | NULL | NULL | NULL | 7924 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | t1 | ref | idx_account_no | idx_account_no | 99 | dcf_payment.t2.account_No | 22 | Using where |
+----+-------------+-------+------+----------------+----------------+---------+---------------------------+------+----------------------------------------------+
2 rows in set (0.00 sec)

The new index is used, and the run time in the hotfix environment drops to: 14 rows in set (0.01 sec)

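Here MySQL drives the join from t2 (the table read first): it scans t2 and, for each row that passes the filter, looks up the matching rows in t1 through idx_account_no.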

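The row counts taken earlier show that t2 has about 1.6 times as many rows as t1 (7594 against 4789). If both tables had an index on account_no, driving the join from the smaller table t1 might perform better.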

Let's try it:

mysql> alter table t_payment_account_dtl add index idx_account_no(account_no);
mysql> desc select t1.* from t_payment_bank_account_info t1 join t_payment_account_dtl t2 using(account_no) where t2.parent_account_no = '7311810182600115231' AND t2.txn_Date >= '2015-12-23' AND t2.account_no != t2.opp_acc_no group by <columns being selected>;
+----+-------------+-------+------+----------------+----------------+---------+---------------------------+------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+----------------+----------------+---------+---------------------------+------+----------------------------------------------+
| 1 | SIMPLE | t2 | ALL | NULL | NULL | NULL | NULL | 7924 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | t1 | ref | idx_account_no | idx_account_no | 99 | dcf_payment.t2.account_No | 22 | Using where |
+----+-------------+-------+------+----------------+----------------+---------+---------------------------+------+----------------------------------------------+
2 rows in set (0.00 sec)

MySQL did not pick the smaller table t1 to drive the join as we had expected. It still reads t2 first and probes t1 through its far more selective index.
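One way to inspect the plan MySQL rejected is to force the join order with the STRAIGHT_JOIN modifier, which makes MySQL join the tables in the order they appear in the FROM clause. This is a sketch that is not part of the original post, and account_no again stands in for the real column list. With t1 read first, every t1 row would probe the index on t_payment_account_dtl, where one account_no value covers roughly 100 rows on average (7594 / 75), and that is a plausible reason the optimizer prefers the other order.

```sql
-- Force t1 to be read first, then compare this plan (and the measured
-- run time) with the plan the optimizer chose on its own.
-- Assumption: account_no stands in for the real column list.
EXPLAIN
SELECT DISTINCT STRAIGHT_JOIN
    t1.account_no
FROM
    t_payment_bank_account_info t1
    JOIN t_payment_account_dtl t2 USING (account_no)
WHERE
    t2.parent_account_no = '7311810182600115231'
    AND t2.txn_Date >= '2015-12-23'
    AND t2.account_no != t2.opp_acc_no;
```

STRAIGHT_JOIN is best treated as a diagnostic here: if the forced order is not measurably faster, leaving the choice to the optimizer is the safer default.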

Reposted from blog.csdn.net/eagle89/article/details/81320409
