Oracle nested loops, hash join, and sort merge join explained

Table join methods

  • Theoretical execution efficiency: hash join > nested loops > sort merge join. This is only a rule of thumb; the actual ranking depends on data volume, available indexes, and whether the inputs are already sorted.
  • Under the CBO (cost-based optimizer), the join method is not fixed and can be forced with a hint.
  • Optimization principle: minimize I/O.
Join method | How it works | When it applies
nested loops | Two-level nested loop | 1. The driving table has fewer than 10,000 rows; 2. The inner (lookup) table has an index on the join column
hash join | The smaller table is used to build a hash table in memory, and the larger table is then read and probed against it; this is the most efficient method | 1. Equijoins; 2. Performs well on large data volumes, and best when the driving (build) table is the smaller one; 3. The parameter HASH_AREA_SIZE needs to be set (it applies when work areas are sized manually)
sort merge join | Sort the records, then merge them | 1. Non-equijoins, or joins that involve sorting
star join | Several dimension tables are combined with one large fact table, then joined with nested loops | 1. Generally used in data warehouses; 2. The parameter STAR_TRANSFORMATION_ENABLED must be enabled
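The applicability rules above can be read as a rough decision order. The following is a minimal Python sketch of that reading only. The function name `suggest_join_method` and its parameters are hypothetical, and it is not how the CBO actually decides, since the optimizer compares estimated costs rather than applying fixed thresholds.

```python
def suggest_join_method(driving_rows, inner_indexed, equijoin):
    """Restate the article's rules of thumb; NOT the optimizer's real logic."""
    # Small driving table plus an index on the inner table: nested loops.
    if driving_rows < 10_000 and inner_indexed:
        return "nested loops"
    # Hash join only applies to equality join conditions.
    if equijoin:
        return "hash join"
    # Non-equijoins fall back to sort merge join.
    return "sort merge join"


print(suggest_join_method(500, True, True))
print(suggest_join_method(5_000_000, False, True))
print(suggest_join_method(5_000_000, False, False))
```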

nested loops

  • Rows are read one at a time from one table (the driving, or outer, table), and for each of them the other table (the inner table, usually indexed) is accessed. Each row of the driving table is matched against the corresponding column of the inner table.
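-- Pseudocode: n = rows in the driving table, m = rows in the inner table
FOR o IN 1 .. n LOOP      -- driving (outer) table; typically n < 10,000
  FOR i IN 1 .. m LOOP    -- inner table
     -- join on the indexed column
  END LOOP;
END LOOP;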

-- Internal join process
row1 of row source1 -> probe -> row source2
row2 of row source1 -> probe -> row source2
row3 of row source1 -> probe -> row source2
...
rowN of row source1 -> probe -> row source2
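The probe-per-row process above can be sketched in Python. This is an illustration under stated assumptions, not Oracle's implementation: the helper names `build_index` and `nested_loop_join` are hypothetical, rows are plain dicts, and a sorted list searched with `bisect` stands in for a B-tree index on the inner table's join column.

```python
import bisect


def build_index(rows, key):
    """Sorted (key value, row position) pairs: a rough stand-in for an index."""
    return sorted((row[key], pos) for pos, row in enumerate(rows))


def nested_loop_join(outer_rows, inner_rows, inner_index, key):
    """For each driving-table row, probe the inner table through its index."""
    result = []
    for outer in outer_rows:                      # one pass over the driving table
        k = outer[key]
        i = bisect.bisect_left(inner_index, (k,))  # index lookup for this row
        while i < len(inner_index) and inner_index[i][0] == k:
            result.append((outer, inner_rows[inner_index[i][1]]))
            i += 1
    return result


emp = [{"dept_id": 20, "name": "B"}, {"dept_id": 10, "name": "A"},
       {"dept_id": 30, "name": "C"}]
dept = [{"dept_id": 10, "dname": "Sales"}, {"dept_id": 20, "dname": "IT"}]

dept_index = build_index(dept, "dept_id")
joined = nested_loop_join(emp, dept, dept_index, "dept_id")
for outer, inner in joined:
    print(outer["name"], inner["dname"])
```

The work grows with the number of driving-table rows times the cost of one index probe, which is why this method favors a small driving table and an indexed inner table.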

hash join

  • This join method was introduced in Oracle 7.3 and is available only under the CBO.
  • As with nested loops, hash join has the notion of a driving table: the table used to build the hash table and bitmap is the driving table. When the hash table and bitmap fit in memory, this join method is extremely efficient.
-- Internal join process
-- (the hash table and bitmap are built from the rows of row source1; the rows of row source2 are then probed against them)
row1 of row source1 -> build hash table and bitmap -> probe -> row source2
row2 of row source1 -> build hash table and bitmap -> probe -> row source2
row3 of row source1 -> build hash table and bitmap -> probe -> row source2
...
rowN of row source1 -> build hash table and bitmap -> probe -> row source2
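The build and probe phases can be sketched in Python as follows. This is a simplified in-memory illustration, not Oracle's implementation: the function name `hash_join` is hypothetical, a Python dict plays the role of the hash table, and the bitmap and any spilling to disk are left out.

```python
from collections import defaultdict


def hash_join(build_rows, probe_rows, key):
    """Equijoin: hash the smaller (driving) input, then scan the larger one once."""
    # Build phase: every row of the driving table goes into the hash table.
    hash_table = defaultdict(list)
    for row in build_rows:
        hash_table[row[key]].append(row)

    # Probe phase: each row of the larger table looks up its join key.
    result = []
    for row in probe_rows:
        for match in hash_table.get(row[key], []):
            result.append((match, row))
    return result


dept = [{"dept_id": 10, "dname": "Sales"}, {"dept_id": 20, "dname": "IT"}]
emp = [{"dept_id": 20, "name": "B"}, {"dept_id": 10, "name": "A"},
       {"dept_id": 30, "name": "C"}, {"dept_id": 10, "name": "D"}]

joined = hash_join(dept, emp, "dept_id")
for d, e in joined:
    print(d["dname"], e["name"])
```

Each input is read once, but the lookup works only for equality conditions, which matches the equijoin restriction noted earlier.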

sort merge join

  • Both tables are first sorted on their join columns; rows are then read from each sorted table and matched against the other.
  • Because a merge join has to do extra sorting, it consumes more resources. Generally, wherever a merge join can be used, a hash join performs better. However, if the row sources are already sorted, no further sort is needed, and in that case the sort merge join can outperform the hash join.
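The sort and merge steps can be sketched in Python. This is a minimal illustration for an equality condition, not Oracle's implementation: the function name `sort_merge_join` and its `presorted` flag are hypothetical, with the flag modelling the case where the row sources arrive already sorted and the sort step is skipped.

```python
def sort_merge_join(left_rows, right_rows, key, presorted=False):
    """Sort both inputs on the join key (unless presorted), then merge them."""
    if not presorted:
        left_rows = sorted(left_rows, key=lambda r: r[key])
        right_rows = sorted(right_rows, key=lambda r: r[key])

    result = []
    i = j = 0
    while i < len(left_rows) and j < len(right_rows):
        lk, rk = left_rows[i][key], right_rows[j][key]
        if lk < rk:
            i += 1                      # left key too small: advance left
        elif lk > rk:
            j += 1                      # right key too small: advance right
        else:
            # Find the run of right rows sharing this key.
            j_end = j
            while j_end < len(right_rows) and right_rows[j_end][key] == lk:
                j_end += 1
            # Pair every left row with this key against that run.
            while i < len(left_rows) and left_rows[i][key] == lk:
                for right in right_rows[j:j_end]:
                    result.append((left_rows[i], right))
                i += 1
            j = j_end
    return result


left = [{"k": 2, "l": "a"}, {"k": 1, "l": "b"}, {"k": 2, "l": "c"}]
right = [{"k": 2, "r": "x"}, {"k": 3, "r": "y"}, {"k": 2, "r": "z"}]

joined = sort_merge_join(left, right, "k")
for lrow, rrow in joined:
    print(lrow["l"], rrow["r"])
```

The merge loop itself is a single pass over both inputs; the cost is dominated by the two sorts, which is the part that disappears when the inputs are already ordered.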

hint keyword

  • Check the syntax: select /*+ hint */ ... There must be no space between /* and +, and the hint comment must immediately follow the select keyword; otherwise it is invalid.
  • Use table aliases: if a table is given an alias, the hint must use the alias, not the table name.
  • Ignored hints: an incorrect hint is treated as an ordinary comment and has no effect.
Common join-related hints | Description
/*+ LEADING(t) */ | Use table t as the first table in the join order
/*+ USE_NL(t1 t2) */ | Force t1 and t2 to be joined with nested loops
/*+ USE_HASH(t1 t2) */ | Force t1 and t2 to be joined with a hash join
/*+ USE_MERGE(t1 t2) */ | Force t1 and t2 to be joined with a sort merge join
/*+ PARALLEL(t N) */ | Parallel execution on table t with degree of parallelism N
/*+ INDEX(t idx) */ | Use index idx on table t
-- Start the join order with t1 and join t1 and t2 with a hash join
SELECT /*+ leading(t1) use_hash(t1 t2) */
       t1.*
  FROM table_a t1,
       table_b t2
 WHERE t1.a = t2.a;
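The placement rules for hints in this section (no space between /* and +, and the comment directly after select) can be approximated with a small Python check. This is only a rough sketch, not Oracle's parser: the names `HINT_RE` and `extract_hint` are hypothetical, only select statements are handled, and the hint names themselves are not validated.

```python
import re

# A hint comment must open with "/*+" directly after the SELECT keyword.
HINT_RE = re.compile(r"^\s*select\s*/\*\+(.*?)\*/", re.IGNORECASE | re.DOTALL)


def extract_hint(sql):
    """Return the hint text if the comment is well placed, else None."""
    match = HINT_RE.match(sql)
    return match.group(1).strip() if match else None


good = "SELECT /*+ leading(t1) use_hash(t1 t2)*/ t1.* FROM table_a t1, table_b t2"
space_before_plus = "SELECT /* + leading(t1) */ t1.* FROM table_a t1"
wrong_position = "SELECT t1.* /*+ leading(t1) */ FROM table_a t1"

print(extract_hint(good))
print(extract_hint(space_before_plus))
print(extract_hint(wrong_position))
```

A check like this flags the two cases that the database would otherwise silently treat as plain comments.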

Origin blog.csdn.net/qq_34745941/article/details/96476848