1 Data preparation
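-- Drop any earlier copy (raises ORA-00942 on the first run; safe to ignore).
drop table test1;
-- Copy up to 9,999 rows from dba_objects (needs SELECT privilege on that view).
create table test1 as select * from dba_objects where rownum<10000;
alter table test1 add constraint pk_test1 primary key (object_id);
-- DDL commits implicitly in Oracle, so this commit is optional.
commit;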
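2 Requirement analysis
Group the rows by owner and, within each group, return the rows whose object_id is less than that group's maximum object_id.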
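As a quick sanity check on the requirement, the expected result size can be computed directly. This is a minimal sketch that relies on object_id being unique (the primary key added during data preparation), so each owner has exactly one row at its maximum; the alias expected_rows is illustrative.

```sql
-- Every owner contributes all of its rows except the single row
-- that holds its maximum object_id, so:
--   expected rows = total rows - number of distinct owners
SELECT COUNT(*) - COUNT(DISTINCT owner) AS expected_rows
FROM test1;
```

A query that implements the requirement without DISTINCT should return this many rows.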
3 SQL improvement
3.1 Version 1
SELECT t1.owner, t1.object_name
FROM test1 t1
WHERE t1.object_id
< (SELECT MAX(t2.object_id)
FROM test1 t2
WHERE t1.owner = t2.owner);
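Although this is a correlated subquery with an inequality comparison, the optimizer is smart enough to choose a plan that aggregates first and then joins, instead of evaluating the subquery row by row with a Filter operation. Overall performance is acceptable.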
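One way to check the plan shape described here is to ask Oracle for the estimated plan. The following is a minimal sketch using EXPLAIN PLAN and DBMS_XPLAN.DISPLAY; the operations actually shown depend on the Oracle version and the optimizer statistics, so no output is reproduced.

```sql
-- Store the estimated plan for version 1 in the plan table.
EXPLAIN PLAN FOR
SELECT t1.owner, t1.object_name
FROM test1 t1
WHERE t1.object_id
      < (SELECT MAX(t2.object_id)
         FROM test1 t2
         WHERE t1.owner = t2.owner);

-- Print the plan that was just explained.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

If the subquery has been unnested, the plan should show a join fed by a GROUP BY step rather than a FILTER operation with the subquery as its child.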
3.2 Version 2
SELECT t1.owner, t1.object_name
FROM test1 t1
, (select t.owner,max(t.object_id) as max_object_id
from test1 t
group by t.owner) t2
where t1.owner=t2.owner
and t1.object_id<t2.max_object_id;
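This version is a manual rewrite that mirrors the execution plan of version 1 (aggregate first, then join), so its execution plan is naturally the same as that of version 1.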
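A rough way to confirm that the two versions agree is to subtract one result from the other. This is only a sketch: MINUS compares distinct rows, so it checks set equality and ignores how many times a row appears.

```sql
-- Rows returned by version 1 but not by version 2; expect no rows.
SELECT t1.owner, t1.object_name
FROM test1 t1
WHERE t1.object_id
      < (SELECT MAX(t2.object_id)
         FROM test1 t2
         WHERE t1.owner = t2.owner)
MINUS
SELECT t1.owner, t1.object_name
FROM test1 t1
   , (SELECT t.owner, MAX(t.object_id) AS max_object_id
      FROM test1 t
      GROUP BY t.owner) t2
WHERE t1.owner = t2.owner
AND t1.object_id < t2.max_object_id;
```

Swapping the two halves checks the opposite direction.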
3.3 Version 3
SELECT distinct t1.owner, t1.object_name
FROM test1 t1
LEFT JOIN test1 t2
on t1.owner=t2.owner
and t1.object_id<t2.object_id
where t2.object_id is not null;
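The data is unevenly distributed: the SYS user, for example, owns a large share of the rows. Joining on t1.object_id<t2.object_id therefore produces a huge number of intermediate rows, and this version performs very poorly. The DISTINCT is there to collapse those duplicates, but it also merges rows that share the same owner and object_name, so the result can differ from versions 1 and 2.
Because the intermediate result is enormous and the join cannot use the index, prefer the join approach of version 1 or version 2: aggregate first to shrink the join input, then join.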
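To see how quickly the inequality self-join grows, the number of matching pairs can be estimated without running the join. Since object_id is unique, an owner with n rows yields n(n-1)/2 pairs with t1.object_id<t2.object_id. The aliases row_cnt and join_rows below are illustrative.

```sql
-- Intermediate rows the version 3 join produces per owner,
-- computed from the row counts alone.
SELECT owner,
       COUNT(*)                      AS row_cnt,
       COUNT(*) * (COUNT(*) - 1) / 2 AS join_rows
FROM test1
GROUP BY owner
ORDER BY join_rows DESC;
```

The growth is quadratic in the group size, which is why a single large owner dominates the cost.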
3.4 Version 4
select t.owner,t.object_name
from (SELECT t1.owner, t1.object_name,t1.object_id
,max(t1.object_id) over(partition by t1.owner) as max_object_id
FROM test1 t1) t
where t.object_id<max_object_id;
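When a query involves a self-join, first consider whether an analytic function can meet the requirement, because an analytic function reduces the number of times the table is scanned. Of the four versions, this one performs best.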
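The scan-count claim can be checked with runtime plan statistics. The sketch below assumes a SQL*Plus or similar session with SERVEROUTPUT off (otherwise DISPLAY_CURSOR may report a different statement) and read access to the V$ views that DBMS_XPLAN.DISPLAY_CURSOR needs.

```sql
-- Run version 4 while collecting row-source statistics.
SELECT /*+ gather_plan_statistics */ t.owner, t.object_name
FROM (SELECT t1.owner, t1.object_name, t1.object_id
           , MAX(t1.object_id) OVER (PARTITION BY t1.owner) AS max_object_id
      FROM test1 t1) t
WHERE t.object_id < t.max_object_id;

-- Show the actual plan of the last statement executed in this session.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL, NULL, 'ALLSTATS LAST'));
```

Repeating the same two steps for version 1 allows a comparison of how many times TEST1 is accessed and of the Buffers column in each plan.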