The difference between IN and EXISTS in Oracle

select * from A
where id in(select id from B)

The query above uses in. The subquery inside in() is executed only once: it fetches all id values from table B and caches them. Each row of table A is then checked against the cached ids of table B. If the id of a row in A equals one of them, that row is added to the result set. This continues until all rows of table A have been traversed.
Conceptually, the query process is similar to the following:

List resultSet=[];
Array A=(select * from A);
Array B=(select id from B);   // executed once, result cached

for(int i=0;i<A.length;i++) {
   for(int j=0;j<B.length;j++) {
      if(A[i].id==B[j].id) {
         resultSet.add(A[i]);
         break;   // first match is enough, move on to the next row of A
      }
   }
}
return resultSet;

It can be seen that in() is not suitable when table B holds a lot of data, because the cached ids of table B may have to be traversed in full for every row of table A.
For example: table A has 10,000 records and table B has 1,000,000 records, so up to 10,000*1,000,000 comparisons may be needed, which is very inefficient.
Another example: table A has 10,000 records and table B has 100 records, so at most 10,000*100 comparisons are needed. The number of traversals is greatly reduced and the efficiency is greatly improved.
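
The worst-case counts in these two examples can be checked with a small sketch. It is written in Python and only mirrors the nested-loop pseudocode above; it is not how Oracle actually executes the query. The function name in_filter and the tiny id ranges are made up for illustration.

```python
def in_filter(a_ids, b_ids):
    """Nested-loop model of in(): returns matching ids and the number of comparisons."""
    cached_b = list(b_ids)          # the subquery runs once; its ids are cached
    result, comparisons = [], 0
    for a_id in a_ids:
        for b_id in cached_b:
            comparisons += 1
            if a_id == b_id:
                result.append(a_id)
                break               # stop at the first match, as in the pseudocode
    return result, comparisons

# Tiny data set: ids 0..9 in A, ids 100..104 in B, so no row of A matches
# and every row of A scans all of B (the worst case).
matches, comparisons = in_filter(range(10), range(100, 105))
print(matches, comparisons)         # [] 50

# The same worst-case bound for the table sizes used in the text.
print(10_000 * 1_000_000)           # 10000000000
print(10_000 * 100)                 # 1000000
```

In this model the bound is simply A.length * B.length, so shrinking table B from 1,000,000 to 100 rows cuts the worst case by a factor of 10,000.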

Conclusion: in() is suitable for the case where table B holds less data than table A.

select a.* from A a
where exists(select 1 from B b where a.id=b.id)

The query above uses exists. The exists() subquery is executed A.length times, once for each row of table A, and its result set is not cached. The content of the exists() result set does not matter; what matters is whether it contains any record. If it does, exists() returns true; otherwise it returns false.
Conceptually, the query process is similar to the following:

List resultSet=[];
Array A=(select * from A);

for(int i=0;i<A.length;i++) {
   if(exists(A[i].id)) {   // runs: select 1 from B b where b.id=A[i].id, true if any row is returned
       resultSet.add(A[i]);
   }
}
return resultSet;

When table B holds more data than table A, exists() is suitable, because there is no in-memory traversal of table B's ids; the subquery is simply executed once for each row of table A.
For example: table A has 10,000 records and table B has 1,000,000 records, so exists() is executed 10,000 times to determine whether each id in table A has an equal id in table B.
For example: table A has 10,000 records and table B has 100,000,000 records, so exists() is still executed 10,000 times, because it only executes A.length times. It can be seen that the more data table B holds, the more exists() pays off.
Another example: table A has 10,000 records and table B has 100 records, so exists() is still executed 10,000 times. Here it is better to use in() and traverse 10,000*100 times, because in() traverses and compares in memory, while exists() has to query the database each time. Querying the database is more expensive than comparing values in memory.
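
The point that exists() runs A.length times regardless of the size of table B can be illustrated with a minimal sketch. It is Python, it models the pseudocode above rather than Oracle itself, and the name exists_filter is hypothetical. A set stands in for an indexed lookup on B.id, which is an assumption; the cost of each probe is not modeled, only the number of probes.

```python
def exists_filter(a_ids, b_ids):
    """Model of exists(): one probe of B per row of A; returns matches and probe count."""
    b_index = set(b_ids)            # assumption: stands in for an indexed lookup on B.id
    result, probes = [], 0
    for a_id in a_ids:
        probes += 1                 # one "select 1 from B b where b.id=a.id" per row of A
        if a_id in b_index:
            result.append(a_id)
    return result, probes

a_ids = range(1_000)
_, probes_small_b = exists_filter(a_ids, range(10))
_, probes_large_b = exists_filter(a_ids, range(100_000))
print(probes_small_b, probes_large_b)   # 1000 1000
```

The probe count depends only on the number of rows in A, which is why growing table B does not change it.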

Conclusion: exists() is suitable for the case where the data in table B is larger than that in table A.

When table A and table B hold about the same amount of data, in and exists perform similarly, and either one can be used.
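
Whichever form is chosen, the two queries from this article return the same rows. The sketch below checks this with Python's built-in sqlite3 module. It uses SQLite rather than Oracle, only because it runs without a server, and the table contents are made up for illustration, so it says nothing about Oracle's performance.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table A (id integer, name text);
    create table B (id integer);
    insert into A values (1, 'a1'), (2, 'a2'), (3, 'a3'), (4, 'a4');
    insert into B values (2), (4), (4), (5);
""")

# The in() form from the article, with an order by added for a stable comparison.
in_rows = conn.execute(
    "select * from A where id in (select id from B) order by id"
).fetchall()

# The exists() form from the article.
exists_rows = conn.execute(
    "select a.* from A a where exists (select 1 from B b where a.id = b.id) order by a.id"
).fetchall()

print(in_rows)                  # [(2, 'a2'), (4, 'a4')]
print(in_rows == exists_rows)   # True
conn.close()
```

Note that id 4 appears twice in B but only once in either result, since both forms only test for a match.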

Origin http://10.200.1.11:23101/article/api/json?id=326743264&siteId=291194637