Prepare the test tables (yxl_test and the five tables derived from it), for example:
select * from yxl_test;
+-------+-------+-------+
| id    | name  | val   |
+-------+-------+-------+
| 3     | au    | 90.0  |
| 6     | pp    | 92.0  |
| 8     | we    | 57.0  |
| 8     | we    | 27.0  |
| 6     |       | 85.0  |
| 3     | tom   | 30.0  |
| 12    |       | 78.0  |
| NULL  | jay   | 49.0  |
| 7     | jy    | 28.0  |
| 9     |       | NULL  |
+-------+-------+-------+
cache table yxl_test1 as select id,name,val from yxl_test order by id limit 2;
select * from yxl_test1;
+-------+-------+-------+
| id | name | val |
+-------+-------+-------+
| NULL | jay | 49.0 |
| 3 | tom | 30.0 |
+-------+-------+-------+
cache table yxl_test2 as select id,name,val from yxl_test order by id desc limit 2;
select * from yxl_test2;
+-----+-------+-------+--+
| id | name | val |
+-----+-------+-------+--+
| 12 | | 78.0 |
| 9 | | NULL |
+-----+-------+-------+--+
cache table yxl_test3 as select id,name,val from yxl_test where id = 6;
select * from yxl_test3;
+-----+-------+-------+--+
| id | name | val |
+-----+-------+-------+--+
| 6 | pp | 92.0 |
| 6 | | 85.0 |
+-----+-------+-------+--+
cache table yxl_test4 as select id,name,val from yxl_test where id=3;
select * from yxl_test4;
+-----+-------+-------+--+
| id | name | val |
+-----+-------+-------+--+
| 3 | au | 90.0 |
| 3 | tom | 30.0 |
+-----+-------+-------+--+
cache table yxl_test5 as select id,name,val from yxl_test where id=3 and name='au';
select * from yxl_test5;
+-----+-------+-------+--+
| id | name | val |
+-----+-------+-------+--+
| 3 | au | 90.0 |
+-----+-------+-------+--+
====================================TEST====================================
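1: a LEFT JOIN b, then a LEFT JOIN c
Conclusion: a is the driving table. The result has the same number of rows as a (here b and c each hold at most one row per matching id) and three times as many columns: a's columns first, then b's, then c's. Columns of a table with no matching row are filled with NULL.
select * from yxl_test a left join yxl_test1 b on a.id=b.id left join yxl_test2 c on a.id=c.id;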
+-------+-------+-------+-------+-------+-------+-------+-------+-------+--+
| id | name | val | id | name | val | id | name | val |
+-------+-------+-------+-------+-------+-------+-------+-------+-------+--+
| 3 | au | 90.0 | 3 | tom | 30.0 | NULL | NULL | NULL |
| 6 | pp | 92.0 | NULL | NULL | NULL | NULL | NULL | NULL |
| 8 | we | 57.0 | NULL | NULL | NULL | NULL | NULL | NULL |
| 8 | we | 27.0 | NULL | NULL | NULL | NULL | NULL | NULL |
| 6 | | 85.0 | NULL | NULL | NULL | NULL | NULL | NULL |
| 3 | tom | 30.0 | 3 | tom | 30.0 | NULL | NULL | NULL |
| 12 | | 78.0 | NULL | NULL | NULL | 12 | | 78.0 |
| NULL | jay | 49.0 | NULL | NULL | NULL | NULL | NULL | NULL |
| 7 | jy | 28.0 | NULL | NULL | NULL | NULL | NULL | NULL |
| 9 | | NULL | NULL | NULL | NULL | 9 | | NULL |
+-------+-------+-------+-------+-------+-------+-------+-------+-------+--+
Key part of the physical plan: 2821 is yxl_test1; 2897 is yxl_test2
== Physical Plan ==
*BroadcastHashJoin [id#3224], [id#2897], LeftOuter, BuildRight
:- *BroadcastHashJoin [id#3224], [id#2821], LeftOuter, BuildRight
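The row counts of test 1 can be checked outside Spark with a small sketch. It assumes SQLite (through Python's sqlite3 module) as a stand-in for Spark SQL and plain tables in place of the CACHE TABLE statements, so it reproduces the join semantics only, not the physical plan.

```python
import sqlite3

# Assumption: SQLite stands in for Spark SQL; plain tables replace CACHE TABLE.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE yxl_test  (id INTEGER, name TEXT, val REAL);
CREATE TABLE yxl_test1 (id INTEGER, name TEXT, val REAL);
CREATE TABLE yxl_test2 (id INTEGER, name TEXT, val REAL);
INSERT INTO yxl_test VALUES
  (3,'au',90.0),(6,'pp',92.0),(8,'we',57.0),(8,'we',27.0),(6,'',85.0),
  (3,'tom',30.0),(12,'',78.0),(NULL,'jay',49.0),(7,'jy',28.0),(9,'',NULL);
INSERT INTO yxl_test1 VALUES (NULL,'jay',49.0),(3,'tom',30.0);
INSERT INTO yxl_test2 VALUES (12,'',78.0),(9,'',NULL);
""")

# Same query shape as test 1, keeping only the three join keys.
rows = conn.execute("""
SELECT a.id, b.id, c.id
FROM yxl_test a
LEFT JOIN yxl_test1 b ON a.id = b.id
LEFT JOIN yxl_test2 c ON a.id = c.id
""").fetchall()

total = len(rows)                                               # one row per row of a
matched_b = sum(1 for _, b_id, _ in rows if b_id is not None)   # rows with a b match
matched_c = sum(1 for _, _, c_id in rows if c_id is not None)   # rows with a c match
print(total, matched_b, matched_c)  # prints: 10 2 2
```

The NULL id in a never matches the NULL id in yxl_test1, because NULL = NULL does not evaluate to true. The row count stays equal to a only while b and c have at most one row per matching id.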
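2: a LEFT JOIN b, then b INNER JOIN c
Conclusion: the execution plan shows that a is inner joined with b first, and b is then inner joined with c. The inner join condition on b.id rejects the NULL-extended rows that the left join would produce, so the left join is equivalent to an inner join here.
Example 1: select * from yxl_test a left join yxl_test1 b on a.id=b.id inner join yxl_test5 c on b.id=c.id;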
+-----+-------+-------+-----+-------+-------+-----+-------+-------+--+
| id | name | val | id | name | val | id | name | val |
+-----+-------+-------+-----+-------+-------+-----+-------+-------+--+
| 3 | au | 90.0 | 3 | tom | 30.0 | 3 | au | 90.0 |
| 3 | tom | 30.0 | 3 | tom | 30.0 | 3 | au | 90.0 |
+-----+-------+-------+-----+-------+-------+-----+-------+-------+--+
Key part of the physical plan: 4371 is yxl_test; 2821 is yxl_test1; 4150 is yxl_test5
== Physical Plan ==
*BroadcastHashJoin [id#2821], [id#4150], Inner, BuildRight
:- *BroadcastHashJoin [id#4371], [id#2821], Inner, BuildRight
Example 2: select * from yxl_test a left join yxl_test1 b on a.id=b.id inner join yxl_test3 c on b.id=c.id;
+-----+-------+------+-----+-------+------+-----+-------+------+--+
| id | name | val | id | name | val | id | name | val |
+-----+-------+------+-----+-------+------+-----+-------+------+--+
+-----+-------+------+-----+-------+------+-----+-------+------+--+
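The equivalence seen in test 2 can be illustrated with a minimal sketch. SQLite via Python's sqlite3 is an assumption standing in for Spark SQL, and `run` is a hypothetical helper introduced only for this illustration. It compares "left join, then inner join on b.id" against "inner join, then inner join on b.id".

```python
import sqlite3

# Assumption: SQLite stands in for Spark SQL; plain tables replace CACHE TABLE.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE yxl_test  (id INTEGER, name TEXT, val REAL);
CREATE TABLE yxl_test1 (id INTEGER, name TEXT, val REAL);
CREATE TABLE yxl_test3 (id INTEGER, name TEXT, val REAL);
CREATE TABLE yxl_test5 (id INTEGER, name TEXT, val REAL);
INSERT INTO yxl_test VALUES
  (3,'au',90.0),(6,'pp',92.0),(8,'we',57.0),(8,'we',27.0),(6,'',85.0),
  (3,'tom',30.0),(12,'',78.0),(NULL,'jay',49.0),(7,'jy',28.0),(9,'',NULL);
INSERT INTO yxl_test1 VALUES (NULL,'jay',49.0),(3,'tom',30.0);
INSERT INTO yxl_test3 VALUES (6,'pp',92.0),(6,'',85.0);
INSERT INTO yxl_test5 VALUES (3,'au',90.0);
""")

def run(first_join, c_table):
    """Hypothetical helper: join a to b with `first_join`, then inner join c on b.id."""
    sql = f"""
    SELECT a.name, b.name, c.name
    FROM yxl_test a
    {first_join} JOIN yxl_test1 b ON a.id = b.id
    INNER JOIN {c_table} c ON b.id = c.id
    """
    return sorted(conn.execute(sql).fetchall())

left_then_inner = run("LEFT", "yxl_test5")    # query from example 1
inner_then_inner = run("INNER", "yxl_test5")  # same query with the left join replaced
no_overlap = run("LEFT", "yxl_test3")         # query from example 2

print(left_then_inner == inner_then_inner)  # prints: True
print(no_overlap)                           # prints: []
```

Example 2 returns no rows because yxl_test1 holds ids NULL and 3 while yxl_test3 only holds id 6, so the condition b.id = c.id is never satisfied.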
3: a INNER JOIN b, then a LEFT JOIN c
Conclusion: a is inner joined with b first, and the resulting dataset is then left joined with c.
Example 1: select * from yxl_test a inner join yxl_test1 b on a.id=b.id left join yxl_test2 c on a.id=c.id;
+-----+-------+-------+-----+-------+-------+-------+-------+-------+--+
| id | name | val | id | name | val | id | name | val |
+-----+-------+-------+-----+-------+-------+-------+-------+-------+--+
| 3 | au | 90.0 | 3 | tom | 30.0 | NULL | NULL | NULL |
| 3 | tom | 30.0 | 3 | tom | 30.0 | NULL | NULL | NULL |
+-----+-------+-------+-----+-------+-------+-------+-------+-------+--+
Key part of the physical plan: 2821 is yxl_test1; 2897 is yxl_test2
== Physical Plan ==
*BroadcastHashJoin [id#3562], [id#2897], LeftOuter, BuildRight
:- *BroadcastHashJoin [id#3562], [id#2821], Inner, BuildRight
Example 2: select * from yxl_test a inner join yxl_test1 b on a.id=b.id left join yxl_test4 c on a.id=c.id;
+-----+-------+-------+-----+-------+-------+-----+-------+-------+--+
| id | name | val | id | name | val | id | name | val |
+-----+-------+-------+-----+-------+-------+-----+-------+-------+--+
| 3 | au | 90.0 | 3 | tom | 30.0 | 3 | tom | 30.0 |
| 3 | au | 90.0 | 3 | tom | 30.0 | 3 | au | 90.0 |
| 3 | tom | 30.0 | 3 | tom | 30.0 | 3 | tom | 30.0 |
| 3 | tom | 30.0 | 3 | tom | 30.0 | 3 | au | 90.0 |
+-----+-------+-------+-----+-------+-------+-----+-------+-------+--+
Key part of the physical plan: 2821 is yxl_test1; 3730 is yxl_test4
== Physical Plan ==
*BroadcastHashJoin [id#3964], [id#3730], LeftOuter, BuildRight
:- *BroadcastHashJoin [id#3964], [id#2821], Inner, BuildRight
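Both examples of test 3 can be reproduced with a short sketch. It again assumes SQLite via Python's sqlite3 in place of Spark SQL, and `run` is a hypothetical helper. It shows why example 2 returns four rows: each of the two rows surviving the inner join matches both id=3 rows of yxl_test4.

```python
import sqlite3

# Assumption: SQLite stands in for Spark SQL; plain tables replace CACHE TABLE.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE yxl_test  (id INTEGER, name TEXT, val REAL);
CREATE TABLE yxl_test1 (id INTEGER, name TEXT, val REAL);
CREATE TABLE yxl_test2 (id INTEGER, name TEXT, val REAL);
CREATE TABLE yxl_test4 (id INTEGER, name TEXT, val REAL);
INSERT INTO yxl_test VALUES
  (3,'au',90.0),(6,'pp',92.0),(8,'we',57.0),(8,'we',27.0),(6,'',85.0),
  (3,'tom',30.0),(12,'',78.0),(NULL,'jay',49.0),(7,'jy',28.0),(9,'',NULL);
INSERT INTO yxl_test1 VALUES (NULL,'jay',49.0),(3,'tom',30.0);
INSERT INTO yxl_test2 VALUES (12,'',78.0),(9,'',NULL);
INSERT INTO yxl_test4 VALUES (3,'au',90.0),(3,'tom',30.0);
""")

def run(c_table):
    """Hypothetical helper: inner join a with b, then left join the result with c."""
    sql = f"""
    SELECT a.name, b.name, c.name
    FROM yxl_test a
    INNER JOIN yxl_test1 b ON a.id = b.id
    LEFT JOIN {c_table} c ON a.id = c.id
    """
    return conn.execute(sql).fetchall()

unmatched = run("yxl_test2")   # example 1: c has no id=3, so c columns are NULL
fanned_out = run("yxl_test4")  # example 2: c has two id=3 rows, so rows double

print(len(unmatched), len(fanned_out))  # prints: 2 4
```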
4: a INNER JOIN b, then a INNER JOIN c
select * from yxl_test a inner join yxl_test1 b on a.id=b.id inner join yxl_test3 c on a.id=c.id;
+-----+-------+------+------------+-----------------+---------+-----+-------+------+-----+-------+------+--+
| id | name | val | bandclass | p_provincecode | p_date | id | name | val | id | name | val |
+-----+-------+------+------------+-----------------+---------+-----+-------+------+-----+-------+------+--+
+-----+-------+------+------------+-----------------+---------+-----+-------+------+-----+-------+------+--+
Key part of the physical plan: 2821 is yxl_test1; 2973 is yxl_test3
== Physical Plan ==
*BroadcastHashJoin [id#5332], [id#2973], Inner, BuildRight
:- *BroadcastHashJoin [id#5332], [id#2821], Inner, BuildRight
5: a LEFT JOIN b, then a INNER JOIN c
Conclusion: a is the driving table; the left join is executed first, then the inner join.
select * from yxl_test a left join yxl_test1 b on a.id=b.id inner join yxl_test2 c on a.id=c.id;
+-----+-------+-------+------------+-----------------+-------------+-------+-------+-------+-----+-------+-------+--+
| id | name | val | bandclass | p_provincecode | p_date | id | name | val | id | name | val |
+-----+-------+-------+------------+-----------------+-------------+-------+-------+-------+-----+-------+-------+--+
| 12 | | 78.0 | NULL | 510000 | 2018-04-10 | NULL | NULL | NULL | 12 | | 78.0 |
| 9 | | NULL | NULL | 510000 | 2018-04-10 | NULL | NULL | NULL | 9 | | NULL |
+-----+-------+-------+------------+-----------------+-------------+-------+-------+-------+-----+-------+-------+--+
Key part of the physical plan: 6488 is yxl_test1; 6540 is yxl_test2
== Physical Plan ==
*BroadcastHashJoin [id#6776], [id#6540], Inner, BuildRight
:- *BroadcastHashJoin [id#6776], [id#6488], LeftOuter, BuildRight
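The result of test 5 can be checked with a minimal sketch, assuming SQLite via Python's sqlite3 as a stand-in for Spark SQL. Only the three join keys are selected, so the extra columns of the real table do not appear.

```python
import sqlite3

# Assumption: SQLite stands in for Spark SQL; plain tables replace CACHE TABLE.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE yxl_test  (id INTEGER, name TEXT, val REAL);
CREATE TABLE yxl_test1 (id INTEGER, name TEXT, val REAL);
CREATE TABLE yxl_test2 (id INTEGER, name TEXT, val REAL);
INSERT INTO yxl_test VALUES
  (3,'au',90.0),(6,'pp',92.0),(8,'we',57.0),(8,'we',27.0),(6,'',85.0),
  (3,'tom',30.0),(12,'',78.0),(NULL,'jay',49.0),(7,'jy',28.0),(9,'',NULL);
INSERT INTO yxl_test1 VALUES (NULL,'jay',49.0),(3,'tom',30.0);
INSERT INTO yxl_test2 VALUES (12,'',78.0),(9,'',NULL);
""")

# Same query shape as test 5: the inner join is on a.id, not b.id.
rows = conn.execute("""
SELECT a.id, b.id, c.id
FROM yxl_test a
LEFT JOIN yxl_test1 b ON a.id = b.id
INNER JOIN yxl_test2 c ON a.id = c.id
""").fetchall()

# Sort by a.id so the comparison does not depend on scan order.
result = sorted(rows, key=lambda r: r[0])
print(result)  # prints: [(9, None, 9), (12, None, 12)]
```

Unlike test 2, the inner join condition here refers to a.id rather than b.id, so rows with NULL b columns survive and the first join stays a LeftOuter join in the plan.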