Mysql Join syntax analysis and performance analysis

one. Join Syntax Overview

join is used for the connection between fields in multiple tables, the syntax is as follows:

... FROM table1 INNER|LEFT|RIGHT JOIN table2 ON conditiona

 table1: left table; table2: right table.

JOIN is roughly divided into the following three categories according to its function:

INNER JOIN (inner join, or equi-join): Get records that have a join matching relationship in the two tables.

LEFT JOIN (left join): Obtain the complete records of the left table (table1), that is, the right table (table2) has no corresponding matching records.

RIGHT JOIN (right join): Contrary to LEFT JOIN, the complete records of the right table (table2) are obtained, that is, the left table (table1) has no matching corresponding records.

Note: mysql does not support Full join , but you can combine LEFT JOIN and RIGHT JOIN through the UNION keyword to simulate FULL join.

Next, a column is given to explain the following categories. The following two tables (A, B)

mysql> select A.id,A.name,B.name from A,B where A.id=B.id;
+----+-----------+-------------+
| id | name       | name             |
+----+-----------+-------------+
|  1 | Pirate       | Rutabaga      |
|  2 | Monkey    | Pirate            |3 | Ninja | Darth Vader |
|  4 | Spaghetti  | Ninja             |
+----+-----------+-------------+
4 rows in set (0.00 sec)

 

二.Inner join

Inner join, also known as equijoin, inner join produces a set of data that matches both A and B.

mysql> select * from A inner join B on A.name = B.name;
+----+--------+----+--------+
| id | name   | id | name   |
+----+--------+----+--------+
|  1 | Pirate |  2 | Pirate |
| 3 | Ninjas | 4 | Ninjas |
+----+--------+----+--------+

 

三.Left join

mysql> select * from A left join B on A.name = B.name;
#或者:select * from A left outer join B on A.name = B.name;

+----+-----------+------+--------+
| id | name      | id   | name   |
+----+-----------+------+--------+
|  1 | Pirate    |    2 | Pirate |
|  2 | Monkey    | NULL | NULL   |
| 3 | Ninjas | 4 | Ninjas |
|  4 | Spaghetti | NULL | NULL   |
+----+-----------+------+--------+
4 rows in set (0.00 sec)

 Left join, (or left outer join: the two are equivalent in Mysql, it is recommended to use left join. ) Left join produces a complete set of records from the left table (A), and matching records (right table (B)). If there is no match, the right side will contain null.

 

If you want to generate only a set of records from the left table (A), but not the records of the right table (B), you can execute it by setting the where statement, as follows:

mysql> select * from A left join B on A.name=B.name where A.id is null or B.id is null;
+----+-----------+------+------+
| id | name      | id   | name |
+----+-----------+------+------+
|  2 | Monkey    | NULL | NULL |
|  4 | Spaghetti | NULL | NULL |
+----+-----------+------+------+
2 rows in set (0.00 sec)

 

In the same way, you can also simulate inner join. As follows:

mysql> select * from A left join B on A.name=B.name where A.id is not null and B.id is not null;
+----+--------+------+--------+
| id | name   | id   | name   |
+----+--------+------+--------+
|  1 | Pirate |    2 | Pirate |
| 3 | Ninjas | 4 | Ninjas |
+----+--------+------+--------+
2 rows in set (0.00 sec)

 Find the difference:

According to the above example, the difference set can be calculated as follows:

SELECT * FROM A LEFT JOIN B ON A.name = B.name
WHERE B.id IS NULL
union
SELECT * FROM A right JOIN B ON A.name = B.name
WHERE A.id IS NULL;
# result
    +------+-----------+------+-------------+
| id   | name      | id   | name        |
+------+-----------+------+-------------+
|    2 | Monkey    | NULL | NULL        |
|    4 | Spaghetti | NULL | NULL        |
| NULL | NULL      |    1 | Rutabaga    |
| NULL | NULL      |    3 | Darth Vader |
+------+-----------+------+-------------+

 

 

四.Right join

mysql> select * from A right join B on A.name = B.name;
+------+--------+----+-------------+
| id   | name   | id | name        |
+------+--------+----+-------------+
| NULL | NULL   |  1 | Rutabaga    |
|    1 | Pirate |  2 | Pirate      |
| NULL | NULL   |  3 | Darth Vader |
| 3 | Ninjas | 4 | Ninjas |
+------+--------+----+-------------+
4 rows in set (0.00 sec)

 同left join。

五.Cross join

cross join: cross join, the result is the product of the two tables, that is, the Cartesian product

The Descartes product is also called the direct product. Assuming set A={a,b}, set B={0,1,2}, then the Cartesian product of the two sets is {(a,0),(a,1),(a,2),( b,0),(b,1), (b,2)}. Can be extended to the case of multiple collections. A similar example is, if A represents the set of students in a certain school, and B represents the set of all courses in the school, then the Cartesian product of A and B represents all possible course selections.

mysql> select * from A cross join B;
+----+-----------+----+-------------+
| id | name      | id | name        |
+----+-----------+----+-------------+
|  1 | Pirate    |  1 | Rutabaga    |
|  2 | Monkey    |  1 | Rutabaga    |
|  3 | Ninja     |  1 | Rutabaga    |
|  4 | Spaghetti |  1 | Rutabaga    |
|  1 | Pirate    |  2 | Pirate      |
|  2 | Monkey    |  2 | Pirate      |
|  3 | Ninja     |  2 | Pirate      |
|  4 | Spaghetti |  2 | Pirate      |
|  1 | Pirate    |  3 | Darth Vader |
|  2 | Monkey    |  3 | Darth Vader |3 | Ninja | 3 | Darth Vader |
|  4 | Spaghetti |  3 | Darth Vader |
|  1 | Pirate    |  4 | Ninja       |
|  2 | Monkey    |  4 | Ninja       |
| 3 | Ninjas | 4 | Ninjas |
|  4 | Spaghetti |  4 | Ninja       |
+----+-----------+----+-------------+
16 rows in set (0.00 sec)

#Execute again: mysql> select * from A inner join B; try

#When executing mysql> select * from A cross join B on A.name = B.name; try

 In fact, in MySQL (only for MySQL), the performance of CROSS JOIN and INNER JOIN is the same. The result obtained without specifying the ON condition is the Cartesian product, otherwise, the result of the exact match between the two tables is obtained.

INNER JOIN and CROSS JOIN can omit the INNER or CROSS keyword, so the following SQL has the same effect:
... FROM table1 INNER JOIN table2
... FROM table1 CROSS JOIN table2
... FROM table1 JOIN table2

  

六.Full join

mysql> select * from A left join B on B.name = A.name
    -> union
    -> select * from A right join B on B.name = A.name;
+------+-----------+------+-------------+
| id   | name      | id   | name        |
+------+-----------+------+-------------+
|    1 | Pirate    |    2 | Pirate      |
|    2 | Monkey    | NULL | NULL        |
| 3 | Ninjas | 4 | Ninjas |
|    4 | Spaghetti | NULL | NULL        |
| NULL | NULL      |    1 | Rutabaga    |
| NULL | NULL      |    3 | Darth Vader |
+------+-----------+------+-------------+
6 rows in set (0.00 sec)

 All records generated by the full join (matching records on both sides) are in table A and table B. If there is no match, the opposite will contain null.

 

7. Performance optimization

1. Explicit inner join VS implicit inner join

Such as:

select * from
table a inner join table b
on a.id = b.id;

VS

select a.*, b.*
from table a, table b
where a.id = b.id;

 我在数据库中比较(10w数据)得之,它们用时几乎相同,第一个是显示的inner join,后一个是隐式的inner join。

参照:Explicit vs implicit SQL joins

2.left join/right join VS inner join

尽量用inner join.避免 LEFT JOIN 和 NULL.

在使用left join(或right join)时,应该清楚的知道以下几点:

(1). on与 where的执行顺序

ON 条件(“A LEFT JOIN B ON 条件表达式”中的ON)用来决定如何从 B 表中检索数据行。如果 B 表中没有任何一行数据匹配 ON 的条件,将会额外生成一行所有列为 NULL 的数据,在匹配阶段 WHERE 子句的条件都不会被使用。仅在匹配阶段完成以后,WHERE 子句条件才会被使用。它将从匹配阶段产生的数据中检索过滤。

所以我们要注意:在使用Left (right) join的时候,一定要在先给出尽可能多的匹配满足条件,减少Where的执行。如:

PASS:

 

select * from A
inner join B on B.name = A.name
left join C on C.name = B.name
left join D on D.id = C.id
where C.status>1 and D.status=1;

 GREAT:

 

select * from A
inner join B on B.name = A.name
left join C on C.name = B.name and C.status>1
left join D on D.id = C.id and D.status=1

 从上面例子可以看出,尽可能满足ON的条件,而少用Where的条件。从执行性能来看第二个显然更加省时。注意上面GREAT查询结果里仍然查出左表的全部数据,and条件只是过滤掉右表不符合条件的数据,返回记录数量

(2).注意ON 子句和 WHERE 子句的不同

如作者举了一个列子:

mysql> SELECT * FROM product LEFT JOIN product_details
       ON (product.id = product_details.id)
       AND product_details.id=2;
+----+--------+------+--------+-------+
| id | amount | id   | weight | exist |
+----+--------+------+--------+-------+
|  1 |    100 | NULL |   NULL |  NULL |
|  2 |    200 |    2 |     22 |     0 |
|  3 |    300 | NULL |   NULL |  NULL |
|  4 |    400 | NULL |   NULL |  NULL |
+----+--------+------+--------+-------+
4 rows in set (0.00 sec)
 
mysql> SELECT * FROM product LEFT JOIN product_details
       ON (product.id = product_details.id)
       WHERE product_details.id=2;
+----+--------+----+--------+-------+
| id | amount | id | weight | exist |
+----+--------+----+--------+-------+
|  2 |    200 |  2 |     22 |     0 |
+----+--------+----+--------+-------+
1 row in set (0.01 sec)

 从上可知,第一条查询使用 ON 条件决定了从 LEFT JOIN的 product_details表中检索符合的所有数据行。第二条查询做了简单的LEFT JOIN,然后使用 WHERE 子句从 LEFT JOIN的数据中过滤掉不符合条件的数据行。

总结:

1. 对于left join,不管on后面跟什么条件,左表的数据全部查出来,因此要想过滤需把条件放到where后面

2. 对于inner join,满足on后面的条件表的数据才能查出,可以起到过滤作用。也可以把条件放到where后面。

3、 on条件是在生成临时表时使用的条件,它不管on中的条件是否为真,都会返回左边表中的记录。

4、where条件是在临时表生成好后,再对临时表进行过滤的条件。这时已经没有left join的含义(必须返回左边表的记录)了,条件不为真的就全部过滤掉。

 

(3).尽量避免子查询,而用join

往往性能这玩意儿,更多时候体现在数据量比较大的时候,此时,我们应该避免复杂的子查询。如下:

PASS

 

insert into t1(a1) select b1 from t2 where not exists(select 1 from t1 where t1.id = t2.r_id); 

 

GREAT:

insert into t1(a1)  
select b1 from t2  
left join (select distinct t1.id from t1 ) t1 on t1.id = t2.r_id   
where t1.id is null;  

 

 

Guess you like

Origin http://10.200.1.11:23101/article/api/json?id=326641012&siteId=291194637