Will the order of conditions in a SQL WHERE clause stop an index from being used?

I came across a basic question about index usage in a group chat.

Here is the question. Some people chose B, some chose C, some said the question was not rigorous, and some said there was no correct answer because every option was wrong.

After a long discussion, two recurring questions emerged that are worth covering:

  • Will "a=1 and b=1" and "b=1 and a=1" both make effective use of idx(b,a)?

  • Will b=1 on its own still use the index idx(a,b)?

Real knowledge comes from practice, so I tried it out hands-on.

create database factory
go

use factory
go

create table dbo.workflow ( flowid int, flowamount int, flowcount int )
go

Let's answer the first question first: does the order of the conditions affect whether the index is used?

-- Simulate the idx(b,a) index structure from the question

create index idx_amt_id on dbo.workflow(flowamount,flowid)

-- Simulate the a=1 and b=1 query

select * from dbo.workflow 
where flowid = 1 and flowamount = 1 

-- Simulate the b=1 and a=1 query

select * from dbo.workflow 
where flowamount = 1 and flowid = 1 

As the plan shows, when the table is newly created and empty, the optimizer does not bother with the index and simply scans the whole table. There is at most one data page to read anyway.
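To check this kind of claim yourself, the estimated plan can be printed as text instead of being read from the graphical plan. A minimal sketch, assuming a client such as SSMS or sqlcmd that understands the go batch separator, and the dbo.workflow table and idx_amt_id index created above:

```sql
-- SET SHOWPLAN_TEXT must be the only statement in its batch.
-- While it is on, queries are not executed; their estimated plans are returned.
set showplan_text on
go

select * from dbo.workflow
where flowid = 1 and flowamount = 1

select * from dbo.workflow
where flowamount = 1 and flowid = 1
go

-- Switch back to normal execution.
set showplan_text off
go
```

Each plan comes back as a small tree of operators, so a table scan and an index seek are easy to tell apart.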

Now let's add some data and see how the optimizer reacts.

I have to mention the tally table technique again. I really can't stand generating test data with loops.

-- Note: @BEGIN, @END, and @INC are declared but not used by the insert below.
DECLARE @BEGIN DATETIME = '2010-01-01'
       ,@END DATETIME = '2017-10-30'
DECLARE @INC INT ;
SELECT @INC = DATEDIFF(DAY,@BEGIN,@END)

-- Each CTE level cross joins the previous level with itself, squaring the
-- row count (3, 9, 81, 6561, ...), so no loop is needed to get 50,000 rows.
; WITH
     L0 AS ( SELECT * FROM (VALUES(1),(2),(3)) AS T(C) )
,    L1 AS ( SELECT a.C,b.C AS BC FROM L0 AS a cross join L0 AS b )
,    L2 AS ( SELECT a.C,b.C AS BC FROM L1 AS a cross join L1 AS b )
,    L3 AS ( SELECT a.C,b.C AS BC FROM L2 AS a cross join L2 AS b )
,    L4 AS ( SELECT a.C,b.C AS BC FROM L3 AS a cross join L3 AS b )
,    L5 AS ( SELECT a.C,b.C AS BC FROM L4 AS a cross join L4 AS b )
insert into dbo.workflow (flowid,flowamount,flowcount)
-- RNK is a running row number: flowid = RNK, flowamount = RNK * 10, flowcount = RNK + 20
SELECT TOP 50000 RNK , RNK * 10, RNK + 20
FROM
(
SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS RNK
FROM L5
) M

The table now holds 50,000 rows. Let's look at the execution plans of the two queries above again:

-- The idx_amt_id index on (flowamount, flowid) created earlier simulates idx(b,a).
-- It still exists, so it is not created again here (a second create index would fail).

-- Simulate the a=1 and b=1 query

select * from dbo.workflow 
where flowid = 1 and flowamount = 1 

-- Simulate the b=1 and a=1 query

select * from dbo.workflow 
where flowamount = 1 and flowid = 1 

Clearly, both queries go through the idx(b,a) index, regardless of whether b or a is written first in the WHERE clause. The optimizer is free to reorder these predicates.
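One way to confirm that the two orderings are treated identically is to compare their I/O. A minimal sketch, assuming the 50,000-row dbo.workflow table and the idx_amt_id index from above; the logical reads reported for the two statements should match:

```sql
-- Report page reads per statement (shown in the Messages tab in SSMS).
set statistics io on
go

-- a first, then b
select * from dbo.workflow
where flowid = 1 and flowamount = 1

-- b first, then a
select * from dbo.workflow
where flowamount = 1 and flowid = 1
go

set statistics io off
go
```

If predicate order mattered, one of the two statements would report noticeably more logical reads than the other.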

But does order never matter, whatever the conditions are? Certainly not.

Order is unimportant only when every condition is an equality comparison. Once a condition is a range comparison such as > or <, order becomes very important, although what matters is the order of the columns in the index rather than the order of the conditions in the WHERE clause, as follows:

select * from dbo.workflow 
where flowamount > 39 and flowid = 1 


select * from dbo.workflow 
where flowid = 1 and flowamount > 39  

Here the optimizer suggests (the part in green font) creating an index with the equality column first and the range column after it, that is, an index on (flowid, flowamount). So in essence, the column order of the index is not dictated by the order in which the equality conditions appear in the query, but by the range condition: the range column (flowamount > 39) needs to come after the equality column (flowid = 1).
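The same suggestion can usually be read from the missing-index DMV, which lists equality and inequality columns separately. A minimal sketch, assuming the two queries above have just been run in the factory database, that the login has permission to view server state, and that the optimizer actually recorded a suggestion (the DMV is emptied when the instance restarts):

```sql
-- List missing-index suggestions recorded for dbo.workflow in the current database.
select d.statement,            -- fully qualified table name
       d.equality_columns,     -- columns compared with =
       d.inequality_columns,   -- columns compared with >, <, and so on
       d.included_columns      -- columns suggested as included columns
from sys.dm_db_missing_index_details as d
where d.database_id = db_id()
  and d.object_id = object_id('dbo.workflow')
```

When a row is returned for these queries, flowid should appear under equality_columns and flowamount under inequality_columns, which is the same ordering the green hint proposes.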

create index idx_id_amtr on dbo.workflow(flowid,flowamount)


select * from dbo.workflow 
where flowamount > 39 and flowid = 1 


select * from dbo.workflow 
where flowid = 1 and flowamount > 39  

Look at the two execution plans:

Both plans use the idx_id_amtr index we just created.

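Now the second question: will b=1 still use the index idx(a,b)?

The example above created an index on (flowamount, flowid), so the question maps to: will a query with where flowid = 1 use the index on (flowamount, flowid)? If you have been following along, drop idx_id_amtr first; its leading column is flowid, so the optimizer would otherwise seek on that index instead.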

select * from dbo.workflow 
where  flowid = 1 

As the plan shows, b=1 does not use the index idx(a,b): b is not the leading column of the index, so the optimizer cannot seek on it.
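To see why the optimizer avoids the index here, the plan of the plain query can be compared with one where the index is forced through a table hint. A minimal sketch, assuming idx_id_amtr from the previous step still exists and should be removed, and that idx_amt_id on (flowamount, flowid) is the index standing in for idx(a,b):

```sql
-- Remove the (flowid, flowamount) index so it cannot satisfy the query.
drop index idx_id_amtr on dbo.workflow
go

set showplan_text on
go

-- Plain query: the optimizer picks the access path itself.
select * from dbo.workflow
where flowid = 1

-- Same query with idx_amt_id forced through a table hint.
select * from dbo.workflow with (index(idx_amt_id))
where flowid = 1
go

set showplan_text off
go
```

The forced plan should read the whole index rather than seek into it, because flowid is the second key column, and it then has to look up flowcount in the table for every match.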

Note that other databases such as Oracle, MySQL, and PostgreSQL may behave differently. Try it yourself and share what you find. Each optimizer has its own algorithms and sometimes makes surprising choices, so don't get too hung up on it.

Origin blog.csdn.net/weixin_45784983/article/details/108143484