Three join algorithms (Nested Loop Join, Hash Join, Sort Merge Join) and how to check and enable block_nested_loop (Using join buffer) in MySQL

Most of us know how to join tables in SQL, but this article is about the algorithms that implement a join. There are three of them: Nested Loop Join, Hash Join, and Sort Merge Join.

The official MySQL documentation states that MySQL supports only the Nested Loop Join algorithm (this holds for the MySQL 5.x releases discussed here; hash joins were added later, in MySQL 8.0.18):

MySQL resolves all joins using a nested-loop join method. This means that MySQL reads a row from the first table, and then finds a matching row in the second table, the third table, and so on.
explain-output

 

So this article only talks about Nested Loop Join.

NLJ uses a two-level loop. The first table drives the outer loop and the second table drives the inner loop. Each row from the outer loop is compared with the rows of the inner loop, and the pairs that satisfy the join condition are output. NLJ has three variants:

1. Simple Nested Loop Join (SNLJ)

 

    // Pseudocode
    for (r in R) {                            // outer loop: every row of R
        for (s in S) {                        // inner loop: full scan of S
            if (r satisfies condition with s) {
                output <r, s>;
            }
        }
    }

 

 

SNLJ joins the two tables with a two-level loop that fully scans both: every row of R is compared with every row of S, and the pairs that satisfy the condition are output. In effect it walks the Cartesian product of the two tables, so the number of comparisons is R * S. It is a brute-force algorithm and comparatively time-consuming.
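
As a rough illustration of the R * S cost, here is a minimal Python sketch of a simple nested loop join over in-memory lists. The tables `users` and `orders`, their column layout, and the function name are made up for this example; this is not how MySQL implements the join internally.

```python
def simple_nested_loop_join(outer, inner, predicate):
    """Join two lists of rows by comparing every outer row with every inner row."""
    result = []
    comparisons = 0
    for r in outer:              # outer loop: one pass over R
        for s in inner:          # inner loop: a full pass over S for each r
            comparisons += 1
            if predicate(r, s):
                result.append((r, s))
    return result, comparisons


# Hypothetical sample data: (id, name) rows and (user_id, amount) rows.
users = [(1, "ann"), (2, "bob"), (3, "cy")]
orders = [(1, 10), (3, 30), (3, 35), (4, 40)]

rows, comparisons = simple_nested_loop_join(
    users, orders, lambda r, s: r[0] == s[0]
)
print(rows)
print(comparisons)  # 3 * 4 = 12
```

The comparison count depends only on the table sizes, not on how many rows actually match.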

2. Index Nested Loop Join (INLJ)

 

    // Pseudocode
    for (r in R) {
        for (si in SIndex) {                  // probe the index on S, not the table
            if (r satisfies condition with si) {
                output <r, s>;                // s is the row of S that si points to
            }
        }
    }

 

 

INLJ is an optimization of SNLJ. The join condition determines which index can be used, and the inner loop probes that index instead of scanning the table data, which makes the inner loop much cheaper.
INLJ has a drawback too: if the index is a non-clustered (secondary) index and the query needs columns that are not in the index, each match requires a lookup back to the table to fetch the row, which adds another random I/O operation.
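
The idea can be sketched in Python as follows. This is only a sketch under stated assumptions: a dict stands in for the index (MySQL's indexes are B-trees, not hash maps), row positions in a list stand in for row pointers, and all names and sample data are hypothetical.

```python
from collections import defaultdict


def build_index(rows, key):
    """Map each join-key value to the positions of the rows that hold it."""
    index = defaultdict(list)
    for pos, row in enumerate(rows):
        index[key(row)].append(pos)
    return index


def index_nested_loop_join(outer, inner, inner_index, outer_key):
    result = []
    table_lookups = 0
    for r in outer:
        # Probe the index instead of scanning every inner row.
        for pos in inner_index.get(outer_key(r), []):
            table_lookups += 1           # lookup back to the table for the full row
            result.append((r, inner[pos]))
    return result, table_lookups


# Hypothetical sample data: (id, name) rows and (user_id, amount) rows.
users = [(1, "ann"), (2, "bob"), (3, "cy")]
orders = [(1, 10), (3, 30), (3, 35), (4, 40)]

orders_index = build_index(orders, key=lambda s: s[0])
rows, table_lookups = index_nested_loop_join(
    users, orders, orders_index, outer_key=lambda r: r[0]
)
print(rows)
print(table_lookups)  # one lookup per matching pair: 3
```

The inner table is never scanned; the only accesses to it are the lookups for matching rows, which is where the extra random I/O comes from.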

3. Block Nested Loop Join (BNLJ)

Normally the MySQL optimizer prefers INLJ when an index is available. When no index is available, or when it judges that a full scan may be faster than using the index, it still does not fall back to the crude SNLJ algorithm.
It uses BNLJ instead. BNLJ adds a join buffer to SNLJ: rows read in the outer loop are collected in the buffer, the inner table is scanned once per buffer-full, and each inner row is compared with all buffered rows. This reduces the number of times the inner table must be read.

 

    // Pseudocode
    for (rbu in RBuffer) {                    // each buffer-full of rows from R
        for (s in S) {                        // one scan of S per buffer-full
            for (r in rbu) {
                if (r satisfies condition with s) {
                    output <r, s>;
                }
            }
        }
    }

 

 

The MySQL system variable that controls the size of the join buffer is join_buffer_size.

We only store the used columns in the join buffer, not the whole rows.
join-buffer-size

In other words, according to the MySQL manual, the join buffer holds only the columns that the query uses, not whole rows.
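
A minimal Python sketch of block nested loop, following the MySQL manual's description of BNL as buffering rows read in the outer loop so that the inner table is read fewer times. The parameter `buffer_rows` is a made-up stand-in for how many outer rows (used columns only) fit in join_buffer_size; the names and sample data are hypothetical.

```python
def block_nested_loop_join(outer, inner, predicate, buffer_rows):
    """Join by buffering outer rows and scanning the inner table once per block."""
    result = []
    inner_scans = 0
    for start in range(0, len(outer), buffer_rows):
        join_buffer = outer[start:start + buffer_rows]   # fill the join buffer
        inner_scans += 1
        for s in inner:                                  # one pass over S per block
            for r in join_buffer:                        # compare with buffered rows
                if predicate(r, s):
                    result.append((r, s))
    return result, inner_scans


# Hypothetical sample data: (id, name) rows and (user_id, amount) rows.
users = [(1, "ann"), (2, "bob"), (3, "cy")]
orders = [(1, 10), (3, 30), (3, 35), (4, 40)]

rows, inner_scans = block_nested_loop_join(
    users, orders, lambda r, s: r[0] == s[0], buffer_rows=2
)
print(rows)
print(inner_scans)  # 3 outer rows in blocks of 2: 2 scans instead of 3
```

The number of comparisons is still R * S; what shrinks is the number of passes over the inner table, and a larger buffer shrinks it further.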

Algorithm comparison (outer table size R, inner table size S):

 

Metric                  Simple Nested Loop Join   Index Nested Loop Join   Block Nested Loop Join
Outer table scans       1                         1                        1
Inner table scans       R                         0
Rows read               R + R * S                 R + RS_Matches
Comparisons             R * S                     R * IndexHeight          R * S
Lookups back to table   0                         RS_Matches               0
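
To get a feel for these numbers, the sketch below plugs sample sizes into the formulas from the comparison. The SNLJ and INLJ entries come straight from the comparison above. The BNLJ values for inner table scans and rows read are my own assumption (one inner scan per buffer-full of outer rows, with only the used columns buffered), and every input value is made up.

```python
import math


def join_costs(R, S, index_height, rs_matches, used_column_size, join_buffer_size):
    """Estimate per-algorithm costs for an outer table of R rows and inner table of S rows."""
    # Assumption: the number of buffer-fulls needed to hold the used columns of all R rows.
    blocks = math.ceil(R * used_column_size / join_buffer_size)
    return {
        "SNLJ": {"inner_scans": R, "rows_read": R + R * S,
                 "comparisons": R * S, "table_lookups": 0},
        "INLJ": {"inner_scans": 0, "rows_read": R + rs_matches,
                 "comparisons": R * index_height, "table_lookups": rs_matches},
        # Assumption: BNLJ reads R once, then S once per buffer-full.
        "BNLJ": {"inner_scans": blocks, "rows_read": R + blocks * S,
                 "comparisons": R * S, "table_lookups": 0},
    }


costs = join_costs(R=1000, S=500, index_height=3, rs_matches=800,
                   used_column_size=64, join_buffer_size=16384)
for name, cost in costs.items():
    print(name, cost)
```

With these sample values the outer rows need 4 buffer-fulls, so the BNLJ estimate scans the inner table 4 times where SNLJ would scan it 1000 times.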

MySQL 5.6 optimized INLJ's lookups back to the table by adding the Batched Key Access (BKA) join and Multi-Range Read (MRR) features. They cache the row IDs of the rows needed by the join and then fetch those rows in batches, turning many scattered I/O operations into fewer batched ones and improving efficiency.
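
The batching idea can be sketched in Python like this. It is a heavily simplified illustration, not MySQL's implementation: a dict keyed by a made-up row ID stands in for the table, another dict stands in for the secondary index, and `batch_size` stands in for how many outer keys fit in the join buffer.

```python
def batched_key_access_join(outer, inner_by_rowid, index, outer_key, batch_size):
    """Probe the index for a batch of outer rows, then fetch inner rows in rowid order."""
    result = []
    fetch_batches = 0
    for start in range(0, len(outer), batch_size):
        batch = outer[start:start + batch_size]
        # 1. Probe the index for the whole batch and collect (rowid, outer row) pairs.
        pending = [(rowid, r) for r in batch for rowid in index.get(outer_key(r), [])]
        # 2. Sort by rowid so the table is read in storage order rather than at random.
        pending.sort(key=lambda pair: pair[0])
        # 3. Fetch the needed rows in one ordered pass.
        fetch_batches += 1
        for rowid, r in pending:
            result.append((r, inner_by_rowid[rowid]))
    return result, fetch_batches


# Hypothetical sample data: outer (id, name) rows, inner rows keyed by rowid.
users = [(1, "ann"), (2, "bob"), (3, "cy")]
orders_by_rowid = {10: (3, 30), 11: (1, 10), 12: (3, 35), 13: (4, 40)}
user_id_index = {1: [11], 3: [10, 12], 4: [13]}   # user_id -> rowids

rows, fetch_batches = batched_key_access_join(
    users, orders_by_rowid, user_id_index, outer_key=lambda r: r[0], batch_size=3
)
print(rows)           # inner rows come back in rowid order: 10, 11, 12
print(fetch_batches)  # 1
```

Compared with the plain index nested loop, the same rows are fetched, but the lookups are grouped and ordered instead of being issued one at a time per outer row.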

 

How to check whether the BNLJ (Block Nested Loop Join) algorithm is enabled in MySQL

SELECT @@optimizer_switch;

index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,engine_condition_pushdown=on,index_condition_pushdown=on,mrr=on,mrr_cost_based=on,block_nested_loop=on,batched_key_access=off,materialization=on,semijoin=on,loosescan=on,firstmatch=on,subquery_materialization_cost_based=on,use_index_extensions=on
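
The value is one long comma-separated string, so the flag of interest is easy to miss. As a convenience, here is a small Python sketch that parses such a string and reports individual flags. The helper name is made up, and the sample string is a shortened copy of the output above; in practice you would pass in whatever your client returns for SELECT @@optimizer_switch.

```python
def parse_optimizer_switch(value):
    """Turn 'a=on,b=off' into {'a': True, 'b': False}."""
    flags = {}
    for item in value.strip().rstrip(".").split(","):
        name, _, state = item.partition("=")
        flags[name.strip()] = state.strip() == "on"
    return flags


# Shortened sample of the @@optimizer_switch value shown above.
switch = "index_merge=on,mrr=on,mrr_cost_based=on,block_nested_loop=on,batched_key_access=off"
flags = parse_optimizer_switch(switch)
print(flags["block_nested_loop"])    # True
print(flags["batched_key_access"])   # False
```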

How to enable BNLJ (Block Nested Loop Join)

Turn it on or off by setting block_nested_loop=on or block_nested_loop=off in optimizer_switch:

set @@optimizer_switch = 'index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,engine_condition_pushdown=on,index_condition_pushdown=on,mrr=on,mrr_cost_based=on,block_nested_loop=on,batched_key_access=off,materialization=on,semijoin=on,loosescan=on,firstmatch=on,duplicateweedout=on,subquery_materialization_cost_based=on,use_index_extensions=on,condition_fanout_filter=on,derived_merge=on';

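Verify that it is enabled. For a join that cannot use an index, the Extra column of the EXPLAIN output should contain Using join buffer (Block Nested Loop):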

EXPLAIN SELECT * FROM appt_appointment t1 JOIN phe_institution t2;

 


Origin blog.csdn.net/zw764987243/article/details/114384346