Using Analytic Functions to Improve Self-Joins - 20191006-01

1 Data preparation

drop table test1;

create table test1 as select * from dba_objects where rownum<10000;

alter table test1 add constraint pk_test1 primary key (object_id);

commit;

2 Requirement analysis

Group the rows by owner and, within each group, return the rows whose object_id is less than that group's maximum object_id.
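To make the requirement concrete, here is a minimal sketch on four made-up rows. The inline view name demo and its values are assumptions for illustration only and are not part of the test1 data.

```sql
-- Hypothetical sample data: owner A has three objects, owner B has one.
WITH demo AS (
  SELECT 'A' AS owner, 1 AS object_id FROM dual UNION ALL
  SELECT 'A', 2 FROM dual UNION ALL
  SELECT 'A', 3 FROM dual UNION ALL
  SELECT 'B', 10 FROM dual
)
SELECT d.owner, d.object_id
FROM demo d
WHERE d.object_id < (SELECT MAX(m.object_id)
                     FROM demo m
                     WHERE m.owner = d.owner)
ORDER BY d.owner, d.object_id;
```

This returns (A, 1) and (A, 2). The row (A, 3) is the maximum for its owner, and owner B has a single row that is its own maximum, so neither appears.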

3 SQL improvements

3.1 Approach 1

SELECT t1.owner, t1.object_name
FROM test1 t1
WHERE t1.object_id < (SELECT MAX(t2.object_id)
                      FROM test1 t2
                      WHERE t1.owner = t2.owner);

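Although this is a correlated subquery compared with an inequality, the optimizer is smart enough to choose a plan that aggregates first and then joins, rather than falling back to a FILTER operation. Overall performance is acceptable.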
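The plan described above can be inspected with EXPLAIN PLAN and DBMS_XPLAN.DISPLAY. This is a sketch of how one might check it, not the author's recorded output. The exact operations depend on the Oracle version and on optimizer statistics.

```sql
-- Generate the plan for the 3.1 query without executing it.
EXPLAIN PLAN FOR
SELECT t1.owner, t1.object_name
FROM test1 t1
WHERE t1.object_id < (SELECT MAX(t2.object_id)
                      FROM test1 t2
                      WHERE t1.owner = t2.owner);

-- Display the plan that was just generated.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

If the subquery was unnested, the plan typically shows a GROUP BY step feeding a join, with no FILTER operation re-running the subquery for each row.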

3.2 Approach 2

SELECT t1.owner, t1.object_name
FROM test1 t1,
     (SELECT t.owner, MAX(t.object_id) AS max_object_id
      FROM test1 t
      GROUP BY t.owner) t2
WHERE t1.owner = t2.owner
AND t1.object_id < t2.max_object_id;

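This form is a manual rewrite of approach 1 that follows its execution plan, so its execution plan is naturally the same as that of approach 1.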
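One way to confirm that the 3.1 and 3.2 queries return the same rows is to compare them with MINUS in both directions. This is a sketch. The names q1 and q2 are placeholders, and object_id is added to the select lists (an assumption beyond the original queries) so that every row is unique and MINUS, which works on distinct rows, does not hide differences.

```sql
WITH q1 AS (  -- the 3.1 form: correlated subquery
  SELECT t1.owner, t1.object_name, t1.object_id
  FROM test1 t1
  WHERE t1.object_id < (SELECT MAX(t2.object_id)
                        FROM test1 t2
                        WHERE t1.owner = t2.owner)
),
q2 AS (       -- the 3.2 form: aggregate first, then join
  SELECT t1.owner, t1.object_name, t1.object_id
  FROM test1 t1,
       (SELECT t.owner, MAX(t.object_id) AS max_object_id
        FROM test1 t
        GROUP BY t.owner) t2
  WHERE t1.owner = t2.owner
  AND t1.object_id < t2.max_object_id
)
SELECT (SELECT COUNT(*) FROM (SELECT * FROM q1 MINUS SELECT * FROM q2)) AS only_in_q1,
       (SELECT COUNT(*) FROM (SELECT * FROM q2 MINUS SELECT * FROM q1)) AS only_in_q2
FROM dual;
```

Both counts should be 0 if the two forms are equivalent.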

3.3 Approach 3

SELECT DISTINCT t1.owner, t1.object_name
FROM test1 t1
LEFT JOIN test1 t2
  ON t1.owner = t2.owner
 AND t1.object_id < t2.object_id
WHERE t2.object_id IS NOT NULL;

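Because the data is unevenly distributed (for example, the SYS user owns a large number of rows), joining on t1.object_id < t2.object_id produces a huge number of intermediate rows, so this form performs very poorly. Note also that the LEFT JOIN combined with the IS NOT NULL filter behaves like an inner join.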

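Because the join returns so many rows, it cannot make effective use of an index. Prefer the join strategy of approach 1 or approach 2: aggregate first to shrink the join input, then join.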
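The size of that intermediate result can be estimated without running the join. Because object_id is the primary key, an owner with n rows yields n(n-1)/2 pairs that satisfy t1.object_id < t2.object_id. The following is a sketch, and the column aliases are illustrative.

```sql
-- Rows per owner and the number of row pairs the 3.3 join would produce.
SELECT owner,
       COUNT(*) AS row_count,
       COUNT(*) * (COUNT(*) - 1) / 2 AS joined_rows
FROM test1
GROUP BY owner
ORDER BY joined_rows DESC;
```

As a hypothetical figure, an owner holding 9,000 of the rows would contribute 9000 * 8999 / 2 = 40,495,500 joined rows, all of which DISTINCT then has to collapse.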

3.4 Approach 4

SELECT t.owner, t.object_name
FROM (SELECT t1.owner, t1.object_name, t1.object_id,
             MAX(t1.object_id) OVER (PARTITION BY t1.owner) AS max_object_id
      FROM test1 t1) t
WHERE t.object_id < t.max_object_id;

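When a query involves a self-join, first check whether an analytic function can meet the requirement, because analytic functions reduce the number of times the table is scanned. Of the four, this is the best-performing form.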
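The reduction in table scans can be checked in SQL*Plus by comparing the "consistent gets" statistic for the 3.1 and 3.4 queries. This is a sketch, assuming a SQL*Plus session whose user has the privileges AUTOTRACE requires. Actual figures depend on the environment and are not shown here.

```sql
-- Show only execution statistics; query rows are fetched but not printed.
SET AUTOTRACE TRACEONLY STATISTICS

-- The 3.1 form: test1 is referenced twice.
SELECT t1.owner, t1.object_name
FROM test1 t1
WHERE t1.object_id < (SELECT MAX(t2.object_id)
                      FROM test1 t2
                      WHERE t1.owner = t2.owner);

-- The 3.4 form: test1 is referenced once.
SELECT t.owner, t.object_name
FROM (SELECT t1.owner, t1.object_name, t1.object_id,
             MAX(t1.object_id) OVER (PARTITION BY t1.owner) AS max_object_id
      FROM test1 t1) t
WHERE t.object_id < t.max_object_id;

SET AUTOTRACE OFF
```

Run each query more than once so that the comparison is not skewed by parsing and a cold buffer cache.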
