Preface
While writing SQL recently, I needed to exclude from table A the records whose attribute value also appears in table B.
My first thought was in and exists, but I had never really understood the difference between the two,
so I read some articles online, tried the statements out myself, and wrote up this summary.
Let's start.
1. Create and initialize the tables
-------------------------------------
create table a1(c1 int,c2 int);
create table a2(c1 int,c2 int);
create table a3(c1 int,c2 int);
-------------------------------------
insert into a1 values(1,NULL);
insert into a1 values(1,2);
insert into a1 values(1,3);
--------------------------------------
insert into a2 values(1,NULL);
insert into a2 values(1,2);
insert into a2 values(2,1);
--------------------------------------
insert into a3 values(1,2);
insert into a3 values(2,1);
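2. Execute the statements
select * from a1 where c2 in (NULL);
select * from a1 where c2 not in (NULL);
-- Note: the value list after in should not contain NULL. A comparison with NULL evaluates to unknown rather than true, so both statements return no rows.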
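The behavior of the two statements above can be reproduced with a small sketch. It assumes Python's built-in sqlite3 module and an in-memory database, chosen here only for convenience; the statements in this post were not necessarily run on SQLite, but the NULL comparison rules shown are standard SQL.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the real one.
conn = sqlite3.connect(":memory:")
conn.execute("create table a1(c1 int, c2 int)")
conn.executemany("insert into a1 values(?, ?)", [(1, None), (1, 2), (1, 3)])

# c2 = NULL evaluates to unknown for every row, so nothing is selected.
in_null = conn.execute("select * from a1 where c2 in (NULL)").fetchall()

# c2 <> NULL is unknown as well, so not in (NULL) selects nothing either.
not_in_null = conn.execute("select * from a1 where c2 not in (NULL)").fetchall()

# To find the rows whose c2 is NULL, "is null" is the usual tool.
is_null = conn.execute("select * from a1 where c2 is null").fetchall()

print(in_null)      # []
print(not_in_null)  # []
print(is_null)      # [(1, None)]
```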
3. Time efficiency comparison
Tables of the same size
select * from a1 where c2 in (select c2 from a2);
select * from a1 where exists (select c2 from a2 where a2.c2=a1.c2);
When exists and in are used on two tables of the same size, both return the same rows, and in my test exists was clearly faster.
select * from a1 where c2 not in (select c2 from a2);
select * from a1 where not exists (select c2 from a2 where a2.c2=a1.c2);
When not exists and not in are used on two tables of the same size, the results differ: a2.c2 contains a NULL, so not in returns no rows at all, while not exists returns the rows that have no match. not exists was also clearly faster.
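To check the result sets of the four a1/a2 statements above, here is a minimal sketch. It assumes Python's sqlite3 module with an in-memory database; the order by clause is my addition so the output order is stable. It only compares results and says nothing about speed.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the real one.
conn = sqlite3.connect(":memory:")
conn.executescript("""
create table a1(c1 int, c2 int);
create table a2(c1 int, c2 int);
insert into a1 values(1,NULL),(1,2),(1,3);
insert into a2 values(1,NULL),(1,2),(2,1);
""")

def rows(sql):
    """Run a query and return all rows as a list of tuples."""
    return conn.execute(sql).fetchall()

in_rows = rows("select * from a1 where c2 in (select c2 from a2)")
exists_rows = rows(
    "select * from a1 where exists (select c2 from a2 where a2.c2 = a1.c2)")

# a2.c2 contains NULL, so "c2 not in (...)" is never true for any a1 row.
not_in_rows = rows("select * from a1 where c2 not in (select c2 from a2)")
not_exists_rows = rows(
    "select * from a1 where not exists "
    "(select c2 from a2 where a2.c2 = a1.c2) order by c2")

print(in_rows)          # [(1, 2)]
print(exists_rows)      # [(1, 2)]
print(not_in_rows)      # []
print(not_exists_rows)  # [(1, None), (1, 3)]
```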
One large table and one small table
- The subquery table is the small one
select * from a1 where c2 in (select c2 from a3);
select * from a1 where exists (select c2 from a3 where a3.c2=a1.c2);
When the subquery table is the small one, both return the same rows, and exists is faster.
select * from a1 where c2 not in (select c2 from a3);
select * from a1 where not exists (select c2 from a3 where a3.c2=a1.c2);
When the subquery table is the small one, the results differ: not in silently drops the a1 row whose c2 is NULL, because NULL not in (...) evaluates to unknown, while not exists keeps that row.
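In this case a3 contains no NULL at all, so the difference comes from the NULL in the outer table a1. A small sketch makes that visible; it assumes Python's sqlite3 module with an in-memory database, and the order by clause is added only to keep the output order stable.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the real one.
conn = sqlite3.connect(":memory:")
conn.executescript("""
create table a1(c1 int, c2 int);
create table a3(c1 int, c2 int);
insert into a1 values(1,NULL),(1,2),(1,3);
insert into a3 values(1,2),(2,1);
""")

# For the a1 row with c2 = NULL, "NULL not in (2, 1)" is unknown, so the
# row is filtered out even though a3 holds no matching value.
not_in_rows = conn.execute(
    "select * from a1 where c2 not in (select c2 from a3)"
).fetchall()

# not exists only asks whether a matching a3 row exists; for the NULL row
# there is none, so the row is kept.
not_exists_rows = conn.execute(
    "select * from a1 where not exists "
    "(select c2 from a3 where a3.c2 = a1.c2) order by c2"
).fetchall()

print(not_in_rows)      # [(1, 3)]
print(not_exists_rows)  # [(1, None), (1, 3)]
```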
- The subquery table is the large one
select * from a3 where c2 in (select c2 from a2);
select * from a3 where exists (select c2 from a2 where a2.c2=a3.c2);
When the subquery table is the large one, both return the same rows, and in is faster.
select * from a3 where c2 not in (select c2 from a2);
select * from a3 where not exists (select c2 from a2 where a2.c2=a3.c2);
When the subquery table is the large one, the results are the same here, but only because every a3.c2 value has a match in a2. In general, not in and not exists are guaranteed to agree when neither compared column contains NULL values. not exists is faster.
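How much this comparison depends on the data can be seen by adding one unmatched row. The sketch below assumes Python's sqlite3 module with an in-memory database, and the row (3, 5) is hypothetical: it is not part of the original test data and is inserted only for illustration.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the real one.
conn = sqlite3.connect(":memory:")
conn.executescript("""
create table a2(c1 int, c2 int);
create table a3(c1 int, c2 int);
insert into a2 values(1,NULL),(1,2),(2,1);
insert into a3 values(1,2),(2,1);
""")

NOT_IN = "select * from a3 where c2 not in (select c2 from a2)"
NOT_EXISTS = ("select * from a3 where not exists "
              "(select c2 from a2 where a2.c2 = a3.c2)")

# Original data: every a3.c2 value has a match in a2, so both are empty.
before = (conn.execute(NOT_IN).fetchall(), conn.execute(NOT_EXISTS).fetchall())

# Hypothetical extra row (not in the original data) with no match in a2.
conn.execute("insert into a3 values(3, 5)")
after = (conn.execute(NOT_IN).fetchall(), conn.execute(NOT_EXISTS).fetchall())

print(before)  # ([], [])
print(after)   # ([], [(3, 5)])
```

With the extra row, not in still returns nothing because a2.c2 contains a NULL, while not exists returns the unmatched row.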
To sum up:
1. If the subquery column can contain NULL values, not in cannot be used; use not exists instead.
2. In general, exists is the more efficient choice.
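As an illustration of the first point, the sketch below compares a common workaround, filtering NULL out of the subquery, with not exists. It assumes Python's sqlite3 module with an in-memory database; the "where c2 is not null" filter and the order by clause are my additions and do not appear in the statements above.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the real one.
conn = sqlite3.connect(":memory:")
conn.executescript("""
create table a1(c1 int, c2 int);
create table a2(c1 int, c2 int);
insert into a1 values(1,NULL),(1,2),(1,3);
insert into a2 values(1,NULL),(1,2),(2,1);
""")

# Filtering NULL out of the subquery avoids the empty result ...
not_in_filtered = conn.execute(
    "select * from a1 where c2 not in "
    "(select c2 from a2 where c2 is not null)"
).fetchall()

# ... but the a1 row with c2 = NULL is still dropped; not exists keeps it.
not_exists_rows = conn.execute(
    "select * from a1 where not exists "
    "(select c2 from a2 where a2.c2 = a1.c2) order by c2"
).fetchall()

print(not_in_filtered)  # [(1, 3)]
print(not_exists_rows)  # [(1, None), (1, 3)]
```

Which of the two results is wanted depends on whether rows with a NULL in the outer column should count as "not present in the other table".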