MySQL's JOIN (5): JOIN optimization in practice (sorting)

This blog post describes how to optimize JOIN queries that include sorting. It covers two cases: sorting on the join attribute and sorting on a non-join attribute. First, the test data: t1 holds 10,000 rows and t2 holds 100 rows.

    CREATE TABLE t1 (
        id INT PRIMARY KEY AUTO_INCREMENT,
        type INT
    );
    SELECT COUNT(*) FROM t1;
    +----------+
    | COUNT(*) |
    +----------+
    |   10000  |
    +----------+
    CREATE TABLE t2 (
        id INT PRIMARY KEY AUTO_INCREMENT,
        type INT
    );
    SELECT COUNT(*) FROM t2;
    +----------+
    | COUNT(*) |
    +----------+
    |      100 |
    +----------+
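
The post does not show how the rows were loaded. Here is a minimal sketch of one way to produce the same row counts, assuming MySQL 8.0 or later (needed for recursive CTEs). The values stored in type are arbitrary placeholders, not the author's data.

```sql
-- Assumption: MySQL 8.0+. The default recursion limit is 1000,
-- so raise it for this session before generating 10,000 rows.
SET SESSION cte_max_recursion_depth = 10001;

-- Fill t1 with ids 1..10000; type = id % 10 is a placeholder value.
INSERT INTO t1 (id, type)
WITH RECURSIVE seq (n) AS (
    SELECT 1
    UNION ALL
    SELECT n + 1 FROM seq WHERE n < 10000
)
SELECT n, n % 10 FROM seq;

-- Fill t2 with ids 1..100, so every t2 row has a match in t1.
INSERT INTO t2 (id, type)
WITH RECURSIVE seq (n) AS (
    SELECT 1
    UNION ALL
    SELECT n + 1 FROM seq WHERE n < 100
)
SELECT n, n % 10 FROM seq;
```

Run the two INSERT statements right after the CREATE TABLE statements; the COUNT(*) queries should then report 10000 and 100.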

 

Sorting on the join attribute

Suppose we need an inner join between t1 and t2 with the join condition t1.id = t2.id, and the result must be sorted by the join attribute id (MySQL already indexes id because it is the primary key).

There are two options: method one [... ORDER BY t1.id] and method two [... ORDER BY t2.id]. Which one should we choose?

First, identify the driving table and the driven table. Following the principle that the small table drives the large table, t1 is the large table and t2 is the small one, so t2 is the driving table and t1 is the driven table. Now compare the two options. With method two, the sort key belongs to the driving table, so MySQL can read t2 in id order (here through its primary key index) and then run the join algorithm; the output is already sorted. With method one, the sort key belongs to the driven table, so MySQL can only run the join algorithm first and then sort the result set (Extra: Using temporary; Using filesort), which is bound to be slower.

Therefore, when sorting on the join attribute, use the driving table's column as the sort key.

    -- Sort by a column of the driven table (t1): Using temporary; Using filesort
    EXPLAIN SELECT * FROM t1 INNER JOIN t2 ON t1.id = t2.id ORDER BY t1.id;
    +----+-------+--------+---------+------+---------------------------------+
    | id | table | type   | key     | rows | Extra                           |
    +----+-------+--------+---------+------+---------------------------------+
    |  1 | t2    | ALL    | NULL    |  100 | Using temporary; Using filesort |
    |  1 | t1    | eq_ref | PRIMARY |    1 | NULL                            |
    +----+-------+--------+---------+------+---------------------------------+


    -- Sort by a column of the driving table (t2): no Using temporary and no Using filesort
    EXPLAIN SELECT * FROM t1 INNER JOIN t2 ON t1.id = t2.id ORDER BY t2.id;
    +----+-------+--------+---------+------+-------+
    | id | table | type   | key     | rows | Extra |
    +----+-------+--------+---------+------+-------+
    |  1 | t2    | index  | PRIMARY |  100 | NULL  |
    |  1 | t1    | eq_ref | PRIMARY |    1 | NULL  |
    +----+-------+--------+---------+------+-------+
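
Because the join condition forces t1.id and t2.id to be equal in every result row, ordering by t2.id returns rows in the same order as ordering by t1.id. A query that wants to expose t1's id can therefore still sort on the driving table's column. A small sketch (the column aliases are illustrative, not from the original post):

```sql
-- t1.id = t2.id holds for every joined row, so ORDER BY t2.id
-- yields the same row order as ORDER BY t1.id would.
SELECT t1.id   AS id,
       t1.type AS t1_type,
       t2.type AS t2_type
FROM t1
INNER JOIN t2 ON t1.id = t2.id
ORDER BY t2.id;
```

Whether the resulting plan matches the EXPLAIN output above can depend on the MySQL version and table statistics, so it is worth re-running EXPLAIN in your own environment.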

 

Sorting on a non-join attribute

Now suppose we need an inner join between t1 and t2 with the join condition t1.id = t2.id, sorted by type, a non-join attribute of t1 [... ORDER BY t1.type].

First, identify the driving table and the driven table. Following the principle that the small table drives the large table, t1 is the large table and t2 is the small one, so the MySQL optimizer uses t2 to drive t1. We now need to sort by t1.type, but t1 is the driven table, so MySQL has to sort the result set after the join, which shows up as Using temporary (more costly than Using filesort alone). Can we override the optimizer and let the large table drive the small one? Yes: use STRAIGHT_JOIN.

    -- Optimizer's choice: t2 drives t1, so sorting by t1.type needs Using temporary; Using filesort
    EXPLAIN SELECT * FROM t1 INNER JOIN t2 ON t1.id = t2.id ORDER BY t1.type;
    +----+-------+--------+---------+------+---------------------------------+
    | id | table | type   | key     | rows | Extra                           |
    +----+-------+--------+---------+------+---------------------------------+
    |  1 | t2    | ALL    | NULL    |  100 | Using temporary; Using filesort |
    |  1 | t1    | eq_ref | PRIMARY |    1 | NULL                            |
    +----+-------+--------+---------+------+---------------------------------+


    -- Using temporary is gone, but now the large table drives the small table, which increases
    -- the number of inner-loop iterations. In real projects, weigh this trade-off against the actual workload.
    EXPLAIN SELECT * FROM t1 STRAIGHT_JOIN t2 ON t1.id = t2.id ORDER BY t1.type;
    +----+-------+--------+---------+-------+----------------+
    | id | table | type   | key     | rows  | Extra          |
    +----+-------+--------+---------+-------+----------------+
    |  1 | t1    | ALL    | NULL    | 10000 | Using filesort |
    |  1 | t2    | eq_ref | PRIMARY |     1 | NULL           |
    +----+-------+--------+---------+-------+----------------+
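
One way to weigh the two plans is to measure them. Here is a sketch, assuming MySQL 8.0.18 or later, where EXPLAIN ANALYZE executes the statement and reports actual timings and row counts for each step. The numbers depend on data and configuration, so no output is shown.

```sql
-- Optimizer-chosen join order (in the plans above: t2 drives t1, then the result is sorted).
EXPLAIN ANALYZE
SELECT * FROM t1 INNER JOIN t2 ON t1.id = t2.id ORDER BY t1.type;

-- Forced join order: t1 is read first and drives t2.
EXPLAIN ANALYZE
SELECT * FROM t1 STRAIGHT_JOIN t2 ON t1.id = t2.id ORDER BY t1.type;
```

Comparing the reported actual times for the two statements gives a concrete basis for deciding whether forcing the join order pays off on a given data set.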

 

Finally, an open question left in MySQL's JOIN (1): Usage can now be answered. INNER JOIN, JOIN, a WHERE equi-join, and STRAIGHT_JOIN can all express an inner join, so which should you normally choose? In general, use INNER JOIN, JOIN, or a WHERE equi-join, because the MySQL optimizer will then pick the join order according to the "small table drives large table" strategy. Consider STRAIGHT_JOIN only when a problem like the one above occurs, that is, when the optimizer's join order forces an expensive sort.
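
For reference, here are the four spellings of an inner join written against the test tables. All return the same rows; only the last one fixes the join order.

```sql
-- The optimizer is free to choose the driving table for these three.
SELECT * FROM t1 INNER JOIN t2 ON t1.id = t2.id;
SELECT * FROM t1 JOIN t2 ON t1.id = t2.id;
SELECT * FROM t1, t2 WHERE t1.id = t2.id;

-- STRAIGHT_JOIN reads the left table (t1) before the right table (t2).
SELECT * FROM t1 STRAIGHT_JOIN t2 ON t1.id = t2.id;
```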

Summary

The "MySQL's JOIN" series ends here.

This series of blog posts covered how to use JOIN, how JOIN works, and how to optimize JOIN queries based on those principles. I hope it helps :)

MySQL's JOIN (1): Usage

MySQL's JOIN (2): JOIN principle

MySQL's JOIN (3): JOIN optimization in practice (number of inner loops)

MySQL's JOIN (4): JOIN optimization in practice (fast matching)

MySQL's JOIN (5): JOIN optimization in practice (sorting)
